Difference between revisions of "Main Page"

From cvss
 
(72 intermediate revisions by 3 users not shown)
Line 20: Line 20:
 
* Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.
 
* Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.
  
==Schedule Spring 2014==
+
==Schedule Fall 2015==
  
 
All talks take place on Thursdays at 3:30pm in AVW 3450.
 
All talks take place on Thursdays at 3:30pm in AVW 3450.
Line 30: Line 30:
 
! Title
 
! Title
 
|-
 
|-
| October 16
+
| December 3
| Abhishek Sharma
 
| Recursive Context Propagation Network for Semantic Scene Labeling
 
|-
 
| October 23
 
| Ang Li
 
| Planar Structure Matching Under Projective Uncertainty for Geolocation
 
|-
 
| October 30
 
| Cancelled
 
| Cancelled
 
|-
 
| November 6
 
| Ejaz Ahmed
 
| Knowing a Good HOG Filter When You See It: Efficient Selection of Filters for Detection
 
|-
 
| November 13
 
| ''CVPR deadline, no meeting''
 
|
 
|-
 
| November 20
 
| Kota Hara
 
| Growing Regression Forests by Classification: Applications to Object Pose Estimation
 
|-
 
| November 27
 
| ''Thanksgiving break, no meeting''
 
|
 
|-
 
| December 4
 
 
| Angjoo Kanazawa
 
| Angjoo Kanazawa
| Locally Convolutional Neural Network
+
| Learning 3D Deformation of Animals from 2D Images
 
|-
 
|-
| December 11
+
| December 10
| Aleksandrs
+
| Xintong Han
|  
+
| Automated Event Retrieval using Web Trained Detectors
 
|}
 
|}
  
==Talk Abstracts Fall 2014==
+
==Talk Abstracts Spring 2015==
  
===Recursive Context Propagation Network for Semantic Scene Labeling===
 
Speaker: [https://www.cs.umd.edu/~bhokaal/ Abhishek Sharma] -- Date: October 16, 2014
 
  
Abstract: The talk will briefly touch upon the Multi-scale CNN of LeCun and Farabet to extract pixel-wise features for semantic segmentation and then I will move on to discuss the work we did to enhance the model further to obtain a real-time and accurate pixel-wise labeling pipeline. I will talk about a deep feed-forward neural network architecture for pixel-wise semantic scene labeling. It uses a novel recursive neural network architecture for context propagation, referred to as rCPN. It first maps the local features into a semantic space followed by a bottom-up aggregation of local information into a global feature of the entire image. Then a top-down propagation of the aggregated information takes place that enhances the contextual information of each local feature. Therefore, the information from every location in the image is propagated to every other location. Experimental results on Stanford background and SIFT Flow datasets show that the proposed method outperforms previous approaches in terms of accuracy. It is also orders of magnitude faster than previous methods and takes only 0.07 seconds on a GPU for pixel-wise labeling of a 256 by 256 image starting from raw RGB pixel values, given the super-pixel mask that takes an additional 0.3 seconds using an off-the-shelf implementation.
+
===Learning 3D Deformation of Animals from 2D Images===
 +
Speaker: [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa] -- Date: December 3, 2015
  
===Planar Structure Matching Under Projective Uncertainty for Geolocation===
+
Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.
Speaker: [http://www.cs.umd.edu/~angli/ Ang Li] -- Date: October 23, 2014
 
  
Abstract: Image-based geolocation aims to answer the question: where was this ground photograph taken? We present an approach to geolocating a single image based on matching human-delineated line segments in the ground image to automatically detected line segments in ortho images. Our approach is based on distance transform matching. By observing that the uncertainty of line segments is non-linearly amplified by projective transformations, we develop an uncertainty-based representation and incorporate it into a geometric matching framework. We show that our approach is able to rule out a considerable portion of false candidate regions even in a database composed of geographic areas with similar visual appearances.
+
Link: [http://arxiv.org/pdf/1507.07646v1.pdf paper]
  
===Knowing a Good HOG Filter When You See It: Efficient Selection of Filters for Detection===
+
===Automated Event Retrieval using Web Trained Detectors===
Speaker: [http://www.cs.umd.edu/~ejaz/ Ejaz Ahmed] -- Date: November 6, 2014
 
  
Abstract: Collections of filters based on histograms of oriented gradients (HOG) are common for several detection methods, notably poselets and exemplar SVMs. The main bottleneck in training such systems is the selection of a subset of good filters from a large number of possible choices. We show that one can learn a universal model of part “goodness” based on properties that can be computed from the filter itself. The intuition is that good filters across categories exhibit common traits such as low clutter and gradients that are spatially correlated. This allows us to quickly discard filters that are not promising, thereby speeding up the training procedure. Applied to training the poselet model, our automated selection procedure allows us to improve its detection performance on the PASCAL VOC data sets, while speeding up training by an order of magnitude. Similar results are reported for exemplar SVMs.
+
Speaker: [http://www.umiacs.umd.edu/~xintong/ Xintong Han] -- Date: December 10, 2015
  
===Growing Regression Forests by Classification: Applications to Object Pose Estimation===
+
Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query - some combinations of concepts may be visually compact but irrelevant--and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision based systems on the TRECVID MED 13 dataset.
Speaker: [http://www.kotahara.com/ Kota Hara] -- Date: November 20, 2014
 
  
Abstract: In this work, we propose a novel node splitting method for regression trees and incorporate it into the regression forest framework. Unlike traditional binary splitting, where the splitting rule is selected from a predefined set of binary splitting rules via trial-and-error, the proposed node splitting method first finds clusters of the training data which at least locally minimize the empirical loss without considering the input space. Then splitting rules which preserve the found clusters as much as possible are determined by casting the problem into a classification problem. Consequently, our new node splitting method enjoys more freedom in choosing the splitting rules, resulting in more efficient tree structures. In addition to the Euclidean target space, we present a variant which can naturally deal with a circular target space by the proper use of circular statistics. We apply the regression forest employing our node splitting to head pose estimation (Euclidean target space) and car direction estimation (circular target space) and demonstrate that the proposed method significantly outperforms state-of-the-art methods (38.5% and 22.5% error reduction respectively).
+
Link: [http://arxiv.org/pdf/1509.07845v1.pdf paper]
 
 
 
 
===Locally Convolutional Neural Network===
 
Speaker: [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa] -- Date: December 4, 2014
 
 
 
Abstract: Convolutional Neural Networks (ConvNets) have shown excellent results on many visual classification tasks. With the exception of ImageNet, these datasets are carefully crafted such that objects are well-aligned at similar scales. Naturally, the feature learning problem gets more challenging as the amount of variation in the data increases, as the models have to learn to be invariant to certain changes in appearance. Recent results on the ImageNet dataset show that given enough data, ConvNets can learn such invariances producing very discriminative features [1]. But could we do more: use fewer parameters, less data, learn more discriminative features, if certain invariances were built into the learning process? In this paper we present a simple model that allows ConvNets to learn features in a locally scale-invariant manner without increasing the number of model parameters. We show on a modified MNIST dataset that when faced with scale variation, building in scale-invariance allows ConvNets to learn more discriminative features with reduced chances of over-fitting.
 
  
 
==Past Semesters==
 
==Past Semesters==
 +
* [[Cvss:Spring2015| Spring 2015]]
 +
* [[cvss fall2014|Fall 2014]]
 
* [[cvss_spring2014|Spring 2014]]
 
* [[cvss_spring2014|Spring 2014]]
 
* [[cvss_fall2013|Fall 2013]]
 
* [[cvss_fall2013|Fall 2013]]
Line 120: Line 71:
 
==Funded By==
 
==Funded By==
 
* Computer Vision Faculty
 
* Computer Vision Faculty
* '''[http://www.northropgrumman.com/ Northrop Grumman]'''
+
<!-- * '''[http://www.northropgrumman.com/ Northrop Grumman]''' -->
  
 
==Current Seminar Series Coordinators==
 
==Current Seminar Series Coordinators==
Line 128: Line 79:
 
{| cellpadding="1"
 
{| cellpadding="1"
 
|-
 
|-
| [http://www.umiacs.umd.edu/~jhchoi/ Jonghyun Choi], jhchoi@
+
| [http://sites.google.com/site/austinomyers/ Austin Myers], amyers@
| (student of [http://www.cs.umd.edu/~lsd/ Professor Larry Davis])
+
| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
 +
|-
 +
| [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa], kanazawa@
 +
| (student of [http://cs.umd.edu/~djacobs/ Professor David Jacobs])
 
|-
 
|-
| [https://sites.google.com/site/austinomyers/ Austin Myers], amyers@
+
| [http://sites.google.com/site/yechengxi/ Chenxi Ye] cxy@
 
| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
 
| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
 
|-
 
|-
| [http://ravitejav.weebly.com/ Raviteja Vemulapalli], raviteja @
+
| [http://www.umiacs.umd.edu/~xintong/ Xintong Han], xintong@
| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
+
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 +
|-
 +
| [http://www.cs.umd.edu/~bharat/ Bharat Singh], bharat@
 +
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 +
|-
 +
| [http://bcsiriuschen.github.io/ Bor-Chun (Sirius) Chen], sirius@
 +
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 
|}
 
|}
  
 
Gone but not forgotten.
 
Gone but not forgotten.
 
 
{| cellpadding="1"
 
{| cellpadding="1"
 
|-
 
|-
| [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa]
+
| [http://www.umiacs.umd.edu/~jhchoi/ Jonghyun Choi], jhchoi@
|  
+
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 +
|-
 +
| Ching-Hui Chen, ching@
 +
| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
 +
|
 +
|-
 +
| [http://ravitejav.weebly.com/ Raviteja Vemulapalli], raviteja @
 +
| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
 
|-
 
|-
 
| [http://www.umiacs.umd.edu/~sameh/ Sameh Khamis]
 
| [http://www.umiacs.umd.edu/~sameh/ Sameh Khamis]
Line 155: Line 121:
 
|-
 
|-
 
| [http://www.umiacs.umd.edu/~jni/ Jie Ni]
 
| [http://www.umiacs.umd.edu/~jni/ Jie Ni]
| off this semester
+
| now at Sony
 
|-
 
|-
 
| [http://www.umiacs.umd.edu/~taheri/ Sima Taheri]
 
| [http://www.umiacs.umd.edu/~taheri/ Sima Taheri]

Latest revision as of 23:40, 3 December 2015

Computer Vision Student Seminars

The Computer Vision Student Seminars at the University of Maryland College Park are a student-run series of talks given by current graduate students for current graduate students.

To receive regular information about the Computer Vision Student Seminars, subscribe to our mailing list or our talks list.

Description

The purpose of these talks is to:

  • Encourage interaction between computer vision students;
  • Provide an opportunity for computer vision students to be aware of and possibly get involved in the research their peers are conducting;
  • Provide an opportunity for computer vision students to receive feedback on their current research;
  • Provide speaking opportunities for computer vision students.

The guidelines for the format are:

  • An hour-long weekly meeting, consisting of one 20-40 minute talk followed by discussion and food.
  • The talks are meant to be casual and discussion is encouraged.
  • Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.

Schedule Fall 2015

All talks take place on Thursdays at 3:30pm in AVW 3450.

Date          Speaker           Title
December 3    Angjoo Kanazawa   Learning 3D Deformation of Animals from 2D Images
December 10   Xintong Han       Automated Event Retrieval using Web Trained Detectors

Talk Abstracts Spring 2015

Learning 3D Deformation of Animals from 2D Images

Speaker: Angjoo Kanazawa -- Date: December 3, 2015

Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.

Link: paper

Automated Event Retrieval using Web Trained Detectors

Speaker: Xintong Han -- Date: December 10, 2015

Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query: some combinations of concepts may be visually compact but irrelevant, and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision-based systems on the TRECVID MED 13 dataset.

Link: paper

Past Semesters

Funded By

  • Computer Vision Faculty

Current Seminar Series Coordinators

Emails are at umiacs.umd.edu.

Austin Myers, amyers@ (student of Professor Yiannis Aloimonos)
Angjoo Kanazawa, kanazawa@ (student of Professor David Jacobs)
Chenxi Ye, cxy@ (student of Professor Yiannis Aloimonos)
Xintong Han, xintong@ (student of Professor Larry Davis)
Bharat Singh, bharat@ (student of Professor Larry Davis)
Bor-Chun (Sirius) Chen, sirius@ (student of Professor Larry Davis)

Gone but not forgotten.

Jonghyun Choi, jhchoi@ (student of Professor Larry Davis)
Ching-Hui Chen, ching@ (student of Professor Rama Chellappa)
Raviteja Vemulapalli, raviteja@ (student of Professor Rama Chellappa)
Sameh Khamis
Ejaz Ahmed
Anne Jorstad now at EPFL
Jie Ni now at Sony
Sima Taheri
Ching Lik Teo