Difference between revisions of "Main Page"

From cvss

Latest revision as of 23:40, 3 December 2015

Computer Vision Student Seminars

The Computer Vision Student Seminars at the University of Maryland, College Park, are a student-run series of talks given by current graduate students for current graduate students.

To receive regular information about the Computer Vision Student Seminars, subscribe to our mailing list or our talks list.

Description

The purpose of these talks is to:

Encourage interaction between computer vision students;
Provide an opportunity for computer vision students to learn about, and possibly get involved in, the research their peers are conducting;
Provide an opportunity for computer vision students to receive feedback on their current research;
Provide speaking opportunities for computer vision students.

The guidelines for the format are:

An hour-long weekly meeting, consisting of one 20- to 40-minute talk followed by discussion and food.
The talks are meant to be casual and discussion is encouraged.
Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.

Schedule Fall 2015

All talks take place on Thursdays at 3:30pm in AVW 3450.

Date	Speaker	Title
December 3	Angjoo Kanazawa	Learning 3D Deformation of Animals from 2D Images
December 10	Xintong Han	Automated Event Retrieval using Web Trained Detectors

Talk Abstracts Fall 2015

Learning 3D Deformation of Animals from 2D Images

Speaker: Angjoo Kanazawa -- Date: December 3, 2015

Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.

Link: paper

Automated Event Retrieval using Web Trained Detectors

Speaker: Xintong Han -- Date: December 10, 2015

Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query: some combinations of concepts may be visually compact but irrelevant, and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision-based systems on the TRECVID MED 13 dataset.

Link: paper

Past Semesters

Funded By

Computer Vision Faculty

Current Seminar Series Coordinators

Emails are at umiacs.umd.edu.

Austin Myers, amyers@	(student of Professor Yiannis Aloimonos)
Angjoo Kanazawa, kanazawa@	(student of Professor David Jacobs)
Chenxi Ye, cxy@	(student of Professor Yiannis Aloimonos)
Xintong Han, xintong@	(student of Professor Larry Davis)
Bharat Singh, bharat@	(student of Professor Larry Davis)
Bor-Chun (Sirius) Chen, sirius@	(student of Professor Larry Davis)

Gone but not forgotten.

Jonghyun Choi, jhchoi@	(student of Professor Larry Davis)
Ching-Hui Chen, ching@	(student of Professor Rama Chellappa)
Raviteja Vemulapalli, raviteja@	(student of Professor Rama Chellappa)
Sameh Khamis
Ejaz Ahmed
Anne Jorstad	now at EPFL
Jie Ni	now at Sony
Sima Taheri
Ching Lik Teo

Retrieved from "https://wiki.cs.umd.edu/cvss/w/index.php?title=Main_Page&oldid=1868"

@@ Line 20: / Line 20: @@
 * Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.
-==Schedule Spring 2015==
+==Schedule Fall 2015==
 All talks take place on Thursdays at 3:30pm in AVW 3450.
@@ Line 30: / Line 30: @@
 ! Title
 |-
-| February 19
+| December 3
-| Bharat Singh
+| Angjoo Kanazawa
-| PSPGC: Part-Based Seeds for Parametric Graph-Cuts
+| Learning 3D Deformation of Animals from 2D Images
 |-
-| February 26
+| December 10
-| Jingjing Zheng
+| Xintong Han
-| Submodular Attribute Selection for Action Recognition in Video
+| Automated Event Retrieval using Web Trained Detectors
-|-
-| March 5
-| ''Snow Break''
-|
-|-
-| March 13
-| Yezhou Yang
-| Grasp Type Revisited: A Modern Perspective on A Classical Feature for Vision and Robotics
-|-
-| March 20
-| ''Spring Break, no meeting''
-|
-|-
-| March 27
-| Sravanthi and Varun Manjunatha
-| SHOE: Supervised Hashing with Output Embeddings
-|-
-| April 3
-| Bahadir Ozdemir
-| TBD
-|-
-| April 10
-| Ching-hui Chen
-| TBD
-|-
-| April 17
-| ''ICCV deadline, no meeting''
-|
-|-
-| April 24
-| Ching Lik Teo
-| TBD
-|-
-| May 1
-| Joe Ng
-| TBD
-|-
-| May 8
-| Aleksandr(?), Francisco (?)
-| TBD
-|-
-| May 15
-| ''Final Exam, no meeting''
-|
 |}
 ==Talk Abstracts Spring 2015==
-===PSPGC: Part-Based Seeds for Parametric Graph-Cuts===
-Speaker: [http://www.cs.umd.edu/~bharat/ Bharat Singh] -- Date: February 19, 2015
-Abstract: PSPGC is a detection-based parametric graph-cut method for accurate image segmentation. Experiments show that seed positioning plays an important role in graph-cut based methods, so, we propose three seed generation strategies which incorporate information about location and color of object parts, along with size and shape. Combined with low-level regular grid seeds, PSPGC can leverage both low-level and high-level cues about objects present in the image. Multiple-parametric graph-cuts using these seeding strategies are solved to obtain a pool of segments, which have a high rate of producing the ground truth segments. Experiments on the challenging PASCAL2010 and 2012 segmentation datasets show that the accuracy of the segmentation hypotheses generated by PSPGC outperforms other state-of-the-art methods when measured by three different metrics(average overlap, recall and covering) by up to 3.5%. We also obtain the best average overlap score in 15 out of 20 categories on PASCAL2010. Further, we provide a quantitative evaluation of the efficacy of each seed generation strategy introduced.
+===Learning 3D Deformation of Animals from 2D Images===
+Speaker: [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa] -- Date: December 3, 2015
+Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.
-===Submodular Attribute Selection for Action Recognition in Video===
+Link: [http://arxiv.org/pdf/1507.07646v1.pdf paper]
-Speaker: [https://sites.google.com/site/jingjingzhengumd/ Jingjing Zheng] -- Date: February 26, 2015
-Abstract: We present an approach to jointly learn a set of view-specific dictionaries and a common dictionary for cross-view action recognition. The set of  view-specific dictionaries is learned for specific views while the common dictionary is shared across different views. Our approach represents videos in each view using  both the corresponding view-specific dictionary and the common dictionary. More importantly, it encourages the set of videos taken from different views of the same action to have similar sparse representations. In this way, we can align view-specific features in the sparse feature spaces spanned by the view-specific dictionary set and transfer the view-shared features in the sparse feature space spanned by the common dictionary. Meanwhile, the incoherence between the common dictionary and the view-specific dictionary set enables us to exploit the discrimination information encoded in view-specific features and view-shared features separately. In addition, the learned common dictionary not only has the capability to represent actions from  unseen views, but also makes our approach effective in a semi-supervised setting where no correspondence videos exist and only a few labels exist in the target view. Extensive experiments using the multi-view IXMAS dataset demonstrate that our approach outperforms many recent approaches for cross-view action recognition.
+===Automated Event Retrieval using Web Trained Detectors===
-===Grasp Type Revisited: A Modern Perspective on A Classical Feature for Vision and Robotics===
+Speaker: [http://www.umiacs.umd.edu/~xintong/ Xintong Han] -- Date: December 10, 2015
-Speaker: [http://www.umiacs.umd.edu/~yzyang/ Yezhou Yang] -- Date: March 13, 2015
-Abstract: Our ability to interpret other people's actions hinges crucially on predictions about their intentionality. The grasp type provides crucial information about human action. However, recognizing the grasp type from unconstrained scenes is challenging because of the large variations in appearance, occlusions and  geometric distortions. In this paper, first we present a convolutional neural network to classify functional hand grasp types. Experiments on a public static scene hand data set validate good performance of the presented method. Then we present two applications utilizing grasp type classification: (a) inference of human action intention and (b) fine level manipulation action segmentation.
+Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query - some combinations of concepts may be visually compact but irrelevant--and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision based systems on the TRECVID MED 13 dataset.
-Experiments on both tasks demonstrate the usefulness of grasp type as a cognitive feature for computer vision. Furthermore, we will present a system that learns manipulation action plans by processing Youtube cooking instructional videos with the grasp type feature. Its goal is to robustly generate the  sequence of atomic actions of seen longer actions in video in order to acquire knowledge for robots, and further guide it to execute the task.
-Related Papers:
+Link: [http://arxiv.org/pdf/1509.07845v1.pdf paper]
-* [http://www.umiacs.umd.edu/~yzyang/paper/CVPR2015Grasp_draft.pdf Grasp Type Revisited: A Modern Perspective on A Classical Feature for Vision (To appear in CVPR'15)]
-* [http://www.umiacs.umd.edu/~yzyang/paper/YouCookMani_CameraReady.pdf Robot Learning Manipulation Action Plans by “Watching” Unconstrained Videos from the World Wide Web (AAAI'15)]
-* [http://www.umiacs.umd.edu/~yzyang/paper/VSS_action_intention.pdf Does the grasp type reveal action intention? (To appear in VSS'15)]
 ==Past Semesters==
+* [[Cvss:Spring2015| Spring 2015]]
 * [[cvss fall2014|Fall 2014]]
 * [[cvss_spring2014|Spring 2014]]
@@ Line 119: / Line 71: @@
 ==Funded By==
 * Computer Vision Faculty
-* '''[http://www.northropgrumman.com/ Northrop Grumman]'''
+<!-- * '''[http://www.northropgrumman.com/ Northrop Grumman]''' -->
 ==Current Seminar Series Coordinators==
@@ Line 127: / Line 79: @@
 {| cellpadding="1"
 |-
-| [http://www.umiacs.umd.edu/~jhchoi/ Jonghyun Choi], jhchoi@
+| [http://sites.google.com/site/austinomyers/ Austin Myers], amyers@
-| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
-|-
-| [https://sites.google.com/site/austinomyers/ Austin Myers], amyers@
 | (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
 |-
@@ Line 136: / Line 85: @@
 | (student of [http://cs.umd.edu/~djacobs/ Professor David Jacobs])
 |-
-| Ching-Hui Chen, ching@
+| [http://sites.google.com/site/yechengxi/ Chenxi Ye] cxy@
-| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
+| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
+|-
+| [http://www.umiacs.umd.edu/~xintong/ Xintong Han], xintong@
+| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
+|-
+| [http://www.cs.umd.edu/~bharat/ Bharat Singh], bharat@
+| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
+|-
+| [http://bcsiriuschen.github.io/ Bor-Chun (Sirius) Chen], sirius@
+| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 |}
 Gone but not forgotten.
 {| cellpadding="1"
+|-
+| [http://www.umiacs.umd.edu/~jhchoi/ Jonghyun Choi], jhchoi@
+| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
+|-
+| Ching-Hui Chen, ching@
+| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
+|
 |-
 | [http://ravitejav.weebly.com/ Raviteja Vemulapalli], raviteja @
@@ Line 157: / Line 121: @@
 |-
 | [http://www.umiacs.umd.edu/~jni/ Jie Ni]
-| off this semester
+| now at Sony
 |-
 | [http://www.umiacs.umd.edu/~taheri/ Sima Taheri]
