Difference between revisions of "Main Page"

From cvss
 
(366 intermediate revisions by 9 users not shown)
Line 1: Line 1:
<Big>'''Computer Vision Student Seminar'''</Big>
+
<Big>'''Computer Vision Student Seminars'''</Big>
  
The Computer Vision Student Seminar at the University of Maryland College Park is a student-run series of talks given by current graduate students for [http://www.cfar.umd.edu/cvl/meetthe.html#Graduate current graduate students].
+
The Computer Vision Student Seminars at the University of Maryland College Park are a student-run series of talks given by [http://www.cfar.umd.edu/cvl/meetthe.html#Graduate current graduate students] for [http://www.cfar.umd.edu/cvl/meetthe.html#Graduate current graduate students].
  
 +
To receive regular information about the Computer Vision Student Seminars, subscribe to our [https://mailman.cs.umd.edu/mailman/listinfo/cvss mailing list] or our [http://talks.cs.umd.edu/lists/12 talks list].
  
 
==Description==
 
==Description==
Line 13: Line 14:
 
* Provide speaking opportunities for computer vision students.
 
* Provide speaking opportunities for computer vision students.
  
 
+
The guidelines for the format are:
==Format==
 
  
 
* An hour-long weekly meeting, consisting of one 20-40 minute talk followed by discussion and food.   
 
* An hour-long weekly meeting, consisting of one 20-40 minute talk followed by discussion and food.   
Line 20: Line 20:
 
* Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.
 
* Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.
  
 +
==Schedule Fall 2015==
  
==Subscribe to the Mailing List==
+
All talks take place on Thursdays at 3:30pm in AVW 3450.
 
 
To receive regular information about the Computer Vision Student Seminar, subscribe to the mailing list by following the instructions [https://mailman.cs.umd.edu/mailman/listinfo/cvss here].
 
 
 
 
 
==Schedule Summer 2011==
 
 
 
All talks take place Thursdays at 4pm in AVW 3450.
 
  
{| class="wikitable" cellpadding="10" border="1" cellspacing="0"
+
{| class="wikitable" cellpadding="10" border="1" cellspacing="1"
 
|-
 
|-
 
! Date
 
! Date
Line 36: Line 30:
 
! Title
 
! Title
 
|-
 
|-
| June 9
+
| December 3
| Vlad Morariu
+
| Angjoo Kanazawa
| Multi-Agent Event Recognition in Structured Scenarios
+
| Learning 3D Deformation of Animals from 2D Images
|-
 
| June 16
 
| Ajay Mishra
 
| A Vision System to Extract "Simple" Objects in a Purely Bottom-Up Fashion
 
|-
 
| June 23
 
| (no meeting, CVPR)
 
|
 
 
|-
 
|-
| June 30
+
| December 10
| Dikpal Reddy
+
| Xintong Han
| Fast Imaging with Slow Cameras
+
| Automated Event Retrieval using Web Trained Detectors
|-
 
| July 7
 
| Raghuraman Gopalan
 
| Exploring Context in Unsupervised Object Identification Scenarios
 
|-
 
| July 14
 
| Behjat Siddiquie
 
| Utilizing Contextual Information for Scene Understanding and Image Retrieval
 
|-
 
| July 21
 
| Kaushik Mitra
 
| Robust Regression Using Sparse Learning
 
|-
 
| July 28
 
|
 
|
 
|-
 
| August 4
 
| Carlos Castillo
 
|
 
|-
 
| August 11
 
|
 
|
 
|-
 
| August 18
 
|
 
|
 
|-
 
| August 25
 
|
 
|
 
 
|}
 
|}
  
 
+
==Talk Abstracts Fall 2015==
==Talk Abstracts==
 
 
 
===Multi-Agent Event Recognition in Structured Scenarios===
 
Speaker: [http://www.umiacs.umd.edu/~morariu/ Vlad Morariu] -- Date: June 9, 2011
 
 
 
I will present a framework for the automatic recognition of complex multi-agent events in settings where structure is imposed by rules that agents must follow while performing activities.  Given semantic spatio-temporal descriptions of what generally happens (i.e., rules, event descriptions, physical constraints), and based on video analysis, the framework determines the events that occurred.  Knowledge about spatio-temporal structure is encoded using first-order logic using an approach based on Allen's Interval Logic, and robustness to low-level observation uncertainty is provided by Markov Logic Networks (MLN).  The main contribution is that the framework integrates interval-based temporal reasoning with probabilistic logical inference, relying on an efficient bottom-up grounding scheme to avoid combinatorial explosion. Applied to one-on-one basketball, the framework detects and tracks players, their hands and feet, and the ball, generates event observations from the resulting trajectories, and performs probabilistic logical inference to determine the most consistent sequence of events.
 
 
 
 
 
===A Vision System to Extract "Simple" Objects in a Purely Bottom-Up Fashion===
 
Speaker: [http://www.umiacs.umd.edu/~mishraka/ Ajay Mishra] -- Date: June 16, 2011
 
 
 
Human perception, being active, is inextricably linked to visual fixation. Despite the obvious importance of fixation, it has not become an integral part of computer vision/robotics algorithms so far. To incorporate fixation and attention in a computer vision framework, we have proposed a new segmentation framework that takes a fixation point (i.e., a single point) inside a "simple" object as its input and outputs the region corresponding to that object. We have also designed a new attentional mechanism that utilizes the concept of neural border-ownership to automatically select the fixation points inside different "simple" objects in the scene. All of this together creates a fully automatic system that outputs only the regions corresponding to the "simple" objects without knowing the actual number or the size of the objects in the scene.
 
 
 
Using these regions, instead of rectangular patches of fixed sizes, to analyze the content of a scene will result in better performance (in terms of accuracy and robustness to noise) for high-level vision algorithms such as object recognition, object manipulation, and action analysis. A variety of experimental results will conclude the talk.
 
 
 
Also, to understand the role of fixation in perception, Ajay recommends taking the psychophysical test available at http://www.umiacs.umd.edu/~mishraka/fixationExperiment.php
 
 
 
 
 
===Fast Imaging with Slow Cameras===
 
Speaker: [http://www.umiacs.umd.edu/~dikpal/ Dikpal Reddy] -- Date: June 30, 2011
 
 
 
Over the years, the spatial resolution of cameras has steadily increased but the temporal resolution has remained the same. In this talk, I will present my work on converting a regular slow camera into a faster one. We capture and accurately reconstruct fast events using our slower prototype camera by exploiting the temporal redundancy in videos. First, I will show how, by fluttering the shutter during the exposure duration of a slow 25fps camera, we can capture and reconstruct a fast periodic video at 2000fps. Next, I will present its generalization, where we show that per-pixel modulation during exposure, in combination with brightness constancy constraints, allows us to capture a broad class of motions at 200fps using a 25fps camera. In both these techniques we borrow ideas from compressive sensing theory for acquisition and recovery.
 
  
  
===Exploring Context in Unsupervised Object Identification Scenarios===
+
===Learning 3D Deformation of Animals from 2D Images===
Speaker: [http://www.umiacs.umd.edu/~raghuram/ Raghuraman Gopalan] -- Date: July 7, 2011
+
Speaker: [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa] -- Date: December 3, 2015
  
The utility of context for supervised object recognition has been well acknowledged from the early seventies, and has been practically demonstrated by many systems in the last few years. The goal of this talk is to understand the role of context in unsupervised pattern identification scenarios. We consider two problems of clustering a set of unlabelled data points using maximum margin principles, and adapting a classifier trained on a specific domain to identify instances across novel domain shifting transformations, and propose contextual sources that provide pertinent information on the identity of the unlabelled data.
+
Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.
  
 +
Link: [http://arxiv.org/pdf/1507.07646v1.pdf paper]
  
===Utilizing Contextual Information for Scene Understanding and Image Retrieval===
+
===Automated Event Retrieval using Web Trained Detectors===
Speaker: [http://www.cs.umd.edu/~behjat/ Behjat Siddiquie] -- Date: July 14, 2011
 
  
In many vision tasks, contextual information can often help disambiguate confusions arising from appearance information. In this talk, I will discuss two different works, which deal with effective utilization of contextual information to improve the performance of active learning for scene understanding and multi-attribute based image retrieval.
+
Speaker: [http://www.umiacs.umd.edu/~xintong/ Xintong Han] -- Date: December 10, 2015
  
First, I will propose an active learning framework to simultaneously learn appearance and contextual models for scene understanding tasks (multi-class classification). Current multi-class active learning approaches ignore the contextual interactions between different regions of an image and the fact that knowing the label for one region provides information about the labels of other regions. We explicitly model the contextual interactions between regions and select the question which leads to the maximum reduction in the combined entropy of all the regions in the image (image entropy).
+
Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query: some combinations of concepts may be visually compact but irrelevant, and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision-based systems on the TRECVID MED 13 dataset.
  
Next, I will present a novel approach for ranking and retrieval of images based on multi-attribute queries. Existing image retrieval methods train separate classifiers for each word and heuristically combine their outputs for retrieving multi-word queries. Moreover, these approaches ignore the interdependencies among the query words. In contrast, we propose a principled approach for multi-attribute retrieval which explicitly models the correlations that are present between the attributes. Given a multi-attribute query, we also utilize other attributes in the vocabulary which are not present in the query, for ranking/retrieval.
+
Link: [http://arxiv.org/pdf/1509.07845v1.pdf paper]
  
 +
==Past Semesters==
 +
* [[Cvss:Spring2015| Spring 2015]]
 +
* [[cvss fall2014|Fall 2014]]
 +
* [[cvss_spring2014|Spring 2014]]
 +
* [[cvss_fall2013|Fall 2013]]
 +
* [[cvss_summer2013|Summer 2013]]
 +
* [[cvss_spring2013|Spring 2013]]
 +
* [[cvss_fall2012|Fall 2012]]
 +
* [[cvss_spring2012|Spring 2012]]
 +
* [[cvss_fall2011|Fall 2011]]
 +
* [[cvss_summer2011|Summer 2011]]
  
===Robust Regression Using Sparse Learning===
+
==Funded By==
Speaker: [http://www.umiacs.umd.edu/~kmitra/ Kaushik Mitra] -- Date: July 21, 2011
+
* Computer Vision Faculty
 
+
<!-- * '''[http://www.northropgrumman.com/ Northrop Grumman]''' -->
Robust regression is a combinatorial optimization problem. Hence, algorithms such as RANSAC and least median squares (LMedS), which are successful in solving low-dimensional problems, cannot be used for solving high-dimensional problems. We show that under certain conditions the robust linear regression problem can be solved accurately using polynomial-time algorithms such as a modified version of basis pursuit and a sparse Bayesian algorithm. We then extend our robust formulation to the case of kernel regression, specifically to propose a robust version of relevance vector machine (RVM) regression.
 
 
 
  
 
==Current Seminar Series Coordinators==
 
==Current Seminar Series Coordinators==
Line 136: Line 77:
 
Emails are at umiacs.umd.edu.
 
Emails are at umiacs.umd.edu.
  
{| class="wikitable" cellpadding="5"
+
{| cellpadding="1"
 +
|-
 +
| [http://sites.google.com/site/austinomyers/ Austin Myers], amyers@
 +
| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
 +
|-
 +
| [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa], kanazawa@
 +
| (student of [http://cs.umd.edu/~djacobs/ Professor David Jacobs])
 
|-
 
|-
| Anne Jorstad, jorstad@
+
| [http://sites.google.com/site/yechengxi/ Chenxi Ye], cxy@
| (student of Professor David Jacobs)
+
| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])
 
|-
 
|-
| Sameh Khamis, sameh@
+
| [http://www.umiacs.umd.edu/~xintong/ Xintong Han], xintong@
| (student of Professor Larry Davis)
+
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 
|-
 
|-
| Sima Taheri, taheri@
+
| [http://www.cs.umd.edu/~bharat/ Bharat Singh], bharat@
| (student of Professor Rama Chellappa)
+
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 
|-
 
|-
| Ching Lik Teo, cteo@
+
| [http://bcsiriuschen.github.io/ Bor-Chun (Sirius) Chen], sirius@
| (student of Professor Yiannis Aloimonos)
+
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
 
|}
 
|}
  
 
+
Gone but not forgotten.
== Wiki Editing ==
+
{| cellpadding="1"
 
+
|-
Consult the [http://meta.wikimedia.org/wiki/Help:Contents User's Guide] for information on using the wiki software.
+
| [http://www.umiacs.umd.edu/~jhchoi/ Jonghyun Choi], jhchoi@
 
+
| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])
* [http://www.mediawiki.org/wiki/Help:Configuration_settings Configuration settings list]
+
|-
* [http://www.mediawiki.org/wiki/Help:FAQ MediaWiki FAQ]
+
| Ching-Hui Chen, ching@
* [http://mail.wikimedia.org/mailman/listinfo/mediawiki-announce MediaWiki release mailing list]
+
| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
 +
|
 +
|-
 +
| [http://ravitejav.weebly.com/ Raviteja Vemulapalli], raviteja@
 +
| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])
 +
|-
 +
| [http://www.umiacs.umd.edu/~sameh/ Sameh Khamis]
 +
|
 +
|-
 +
| [http://www.umiacs.umd.edu/~ejaz/ Ejaz Ahmed]
 +
|
 +
|-
 +
| [http://cvlabwww.epfl.ch/~jorstad/ Anne Jorstad]
 +
| now at EPFL
 +
|-
 +
| [http://www.umiacs.umd.edu/~jni/ Jie Ni]
 +
| now at Sony
 +
|-
 +
| [http://www.umiacs.umd.edu/~taheri/ Sima Taheri]
 +
|
 +
|-
 +
| [http://www.umiacs.umd.edu/~cteo/ Ching Lik Teo]
 +
|
 +
|}

Latest revision as of 23:40, 3 December 2015

Computer Vision Student Seminars

The Computer Vision Student Seminars at the University of Maryland College Park are a student-run series of talks given by current graduate students for current graduate students.

To receive regular information about the Computer Vision Student Seminars, subscribe to our mailing list or our talks list.

Description

The purpose of these talks is to:

  • Encourage interaction between computer vision students;
  • Provide an opportunity for computer vision students to be aware of and possibly get involved in the research their peers are conducting;
  • Provide an opportunity for computer vision students to receive feedback on their current research;
  • Provide speaking opportunities for computer vision students.

The guidelines for the format are:

  • An hour-long weekly meeting, consisting of one 20-40 minute talk followed by discussion and food.
  • The talks are meant to be casual and discussion is encouraged.
  • Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.

Schedule Fall 2015

All talks take place on Thursdays at 3:30pm in AVW 3450.

Date | Speaker | Title
December 3 | Angjoo Kanazawa | Learning 3D Deformation of Animals from 2D Images
December 10 | Xintong Han | Automated Event Retrieval using Web Trained Detectors

Talk Abstracts Fall 2015

Learning 3D Deformation of Animals from 2D Images

Speaker: Angjoo Kanazawa -- Date: December 3, 2015

Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.

Link: paper (http://arxiv.org/pdf/1507.07646v1.pdf)

Automated Event Retrieval using Web Trained Detectors

Speaker: Xintong Han -- Date: December 10, 2015

Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query: some combinations of concepts may be visually compact but irrelevant, and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision-based systems on the TRECVID MED 13 dataset.

Link: paper (http://arxiv.org/pdf/1509.07845v1.pdf)

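Past Semesters

  • Spring 2015
  • Fall 2014
  • Spring 2014
  • Fall 2013
  • Summer 2013
  • Spring 2013
  • Fall 2012
  • Spring 2012
  • Fall 2011
  • Summer 2011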

Funded By

  • Computer Vision Faculty

Current Seminar Series Coordinators

Emails are at umiacs.umd.edu.

Austin Myers, amyers@ (student of Professor Yiannis Aloimonos)
Angjoo Kanazawa, kanazawa@ (student of Professor David Jacobs)
Chenxi Ye, cxy@ (student of Professor Yiannis Aloimonos)
Xintong Han, xintong@ (student of Professor Larry Davis)
Bharat Singh, bharat@ (student of Professor Larry Davis)
Bor-Chun (Sirius) Chen, sirius@ (student of Professor Larry Davis)

Gone but not forgotten.

Jonghyun Choi, jhchoi@ (student of Professor Larry Davis)
Ching-Hui Chen, ching@ (student of Professor Rama Chellappa)
Raviteja Vemulapalli, raviteja@ (student of Professor Rama Chellappa)
Sameh Khamis
Ejaz Ahmed
Anne Jorstad (now at EPFL)
Jie Ni (now at Sony)
Sima Taheri
Ching Lik Teo