Revision as of 21:19, 30 March 2015
Computer Vision Student Seminars
The Computer Vision Student Seminars at the University of Maryland, College Park are a student-run series of talks given by current graduate students for current graduate students.
To receive regular information about the Computer Vision Student Seminars, subscribe to our mailing list or our talks list.
Description
The purpose of these talks is to:
- Encourage interaction between computer vision students;
- Provide an opportunity for computer vision students to be aware of and possibly get involved in the research their peers are conducting;
- Provide an opportunity for computer vision students to receive feedback on their current research;
- Provide speaking opportunities for computer vision students.
The guidelines for the format are:
- An hour-long weekly meeting, consisting of one 20-40 minute talk followed by discussion and food.
- The talks are meant to be casual and discussion is encouraged.
- Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.
Schedule Spring 2015
All talks take place on Thursdays at 3:30pm in AVW 3450.
Date | Speaker | Title |
---|---|---|
February 19 | Bharat Singh | PSPGC: Part-Based Seeds for Parametric Graph-Cuts |
February 26 | Jingjing Zheng | Submodular Attribute Selection for Action Recognition in Video |
March 5 | Snow Break | |
March 13 | Yezhou Yang | Grasp Type Revisited: A Modern Perspective on A Classical Feature for Vision and Robotics |
March 20 | Spring Break, no meeting | |
March 27 | Sravanthi Bondugula and Varun Manjunatha | SHOE: Supervised Hashing with Output Embeddings |
April 3 | Bahadir Ozdemir | A Probabilistic Framework for Multimodal Retrieval using Integrative Indian Buffet Process |
April 10 | Ching-Hui Chen | TBD |
April 17 | ICCV deadline, no meeting | |
April 24 | Ching Lik Teo | TBD |
May 1 | Joe Ng | TBD |
May 8 | Aleksandr (?), Francisco (?) | TBD |
May 15 | Final Exam, no meeting | |
Talk Abstracts Spring 2015
PSPGC: Part-Based Seeds for Parametric Graph-Cuts
Speaker: Bharat Singh -- Date: February 19, 2015
Abstract: PSPGC is a detection-based parametric graph-cut method for accurate image segmentation. Experiments show that seed positioning plays an important role in graph-cut based methods, so we propose three seed generation strategies which incorporate information about the location and color of object parts, along with size and shape. Combined with low-level regular grid seeds, PSPGC can leverage both low-level and high-level cues about objects present in the image. Multiple parametric graph-cuts using these seeding strategies are solved to obtain a pool of segments, which have a high rate of producing the ground truth segments. Experiments on the challenging PASCAL2010 and 2012 segmentation datasets show that the accuracy of the segmentation hypotheses generated by PSPGC outperforms other state-of-the-art methods by up to 3.5% when measured by three different metrics (average overlap, recall and covering). We also obtain the best average overlap score in 15 out of 20 categories on PASCAL2010. Further, we provide a quantitative evaluation of the efficacy of each seed generation strategy introduced.
Submodular Attribute Selection for Action Recognition in Video
Speaker: Jingjing Zheng -- Date: February 26, 2015
Abstract: We present an approach to jointly learn a set of view-specific dictionaries and a common dictionary for cross-view action recognition. The set of view-specific dictionaries is learned for specific views while the common dictionary is shared across different views. Our approach represents videos in each view using both the corresponding view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from different views of the same action to have similar sparse representations. In this way, we can align view-specific features in the sparse feature spaces spanned by the view-specific dictionary set and transfer the view-shared features in the sparse feature space spanned by the common dictionary. Meanwhile, the incoherence between the common dictionary and the view-specific dictionary set enables us to exploit the discrimination information encoded in view-specific features and view-shared features separately. In addition, the learned common dictionary not only has the capability to represent actions from unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labels exist in the target view. Extensive experiments using the multi-view IXMAS dataset demonstrate that our approach outperforms many recent approaches for cross-view action recognition.
Grasp Type Revisited: A Modern Perspective on A Classical Feature for Vision and Robotics
Speaker: Yezhou Yang -- Date: March 13, 2015
Abstract: Our ability to interpret other people's actions hinges crucially on predictions about their intentionality. The grasp type provides crucial information about human action. However, recognizing the grasp type from unconstrained scenes is challenging because of the large variations in appearance, occlusions and geometric distortions. In this paper, we first present a convolutional neural network to classify functional hand grasp types. Experiments on a public static scene hand data set validate the performance of the presented method. Then we present two applications utilizing grasp type classification: (a) inference of human action intention and (b) fine-level manipulation action segmentation. Experiments on both tasks demonstrate the usefulness of grasp type as a cognitive feature for computer vision. Furthermore, we will present a system that learns manipulation action plans by processing YouTube cooking instructional videos with the grasp type feature. Its goal is to robustly generate the sequence of atomic actions that make up the longer actions seen in video, in order to acquire knowledge for robots and to guide them in executing the task.
Related Papers:
- Grasp Type Revisited: A Modern Perspective on A Classical Feature for Vision (To appear in CVPR'15)
- Robot Learning Manipulation Action Plans by “Watching” Unconstrained Videos from the World Wide Web (AAAI'15)
- Does the grasp type reveal action intention? (To appear in VSS'15)
SHOE: Supervised Hashing with Output Embeddings
Speaker: Sravanthi Bondugula and Varun Manjunatha -- Date: March 27, 2015
Abstract: We present a supervised binary encoding scheme for image retrieval that learns projections by taking into account similarity between classes obtained from output embeddings. Our motivation is that binary hash codes learned in this way improve both the visual quality of retrieval results and existing supervised hashing schemes. We employ a sequential greedy optimization that learns relationship-aware projections by minimizing the difference between inner products of binary codes and output embedding vectors. We develop a joint optimization framework to learn projections which improve the accuracy of supervised hashing over the current state of the art with respect to standard and sibling evaluation metrics. We further boost performance by applying the supervised dimensionality reduction technique on kernelized input CNN features. Experiments are performed on three datasets: CUB-2011, SUN-Attribute and ImageNet ILSVRC 2010. As a by-product of our method, we show that using a simple k-NN pooling classifier with our discriminative codes improves over complex classification models on fine-grained datasets like CUB and offers an impressive compression ratio of 1024 on CNN features.
Related paper: SHOE
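As a rough illustration of the objective mentioned in the SHOE abstract (matching inner products of binary codes to inner products of output embeddings), the following is a minimal sketch and not the speakers' implementation. It assumes one reading of the abstract: codes take values in {-1, +1}, code inner products are scaled by the code length, and each item carries a unit-length embedding of its class. The function names and the toy data are hypothetical.

```python
# Illustrative sketch only. All names and the scaling by code length are
# assumptions made for this example, not details taken from the SHOE paper.

def inner(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def inner_product_mismatch(codes, embeddings):
    """Sum, over all ordered pairs of items, of the squared difference
    between the scaled code inner product and the embedding inner product.

    codes:      list of binary codes with entries in {-1, +1}, all of length k
    embeddings: list of unit-length class embedding vectors, one per item
    """
    k = len(codes[0])
    total = 0.0
    for i in range(len(codes)):
        for j in range(len(codes)):
            code_sim = inner(codes[i], codes[j]) / k      # lies in [-1, 1]
            emb_sim = inner(embeddings[i], embeddings[j])  # lies in [-1, 1]
            total += (code_sim - emb_sim) ** 2
    return total


# Toy data: items 0 and 1 share a class; item 2 belongs to an opposite class.
embeddings = [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]

# Codes that mirror the class relationships exactly.
good_codes = [[1, 1, -1, -1], [1, 1, -1, -1], [-1, -1, 1, 1]]

# Codes that ignore the class relationships (all items hashed identically).
bad_codes = [[1, 1, -1, -1], [1, 1, -1, -1], [1, 1, -1, -1]]

print(inner_product_mismatch(good_codes, embeddings))
print(inner_product_mismatch(bad_codes, embeddings))
```

In this toy setting, codes that respect the class relationships give a lower mismatch than codes that collapse all items together. A learning procedure in the spirit of the abstract would search for projections whose resulting codes reduce this kind of mismatch.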
Past Semesters
Funded By
- Computer Vision Faculty
- Northrop Grumman
Current Seminar Series Coordinators
Emails are at umiacs.umd.edu.
Jonghyun Choi, jhchoi@ | (student of Professor Larry Davis) |
Austin Myers, amyers@ | (student of Professor Yiannis Aloimonos) |
Angjoo Kanazawa, kanazawa@ | (student of Professor David Jacobs) |
Ching-Hui Chen, ching@ | (student of Professor Rama Chellappa) |
Gone but not forgotten.
Raviteja Vemulapalli, raviteja@ | (student of Professor Rama Chellappa) |
Sameh Khamis | |
Ejaz Ahmed | |
Anne Jorstad | now at EPFL |
Jie Ni | now at Sony |
Sima Taheri | |
Ching Lik Teo |