Revision as of 17:42, 13 September 2012 by Sameh (talk | contribs)

Computer Vision Student Seminars

The Computer Vision Student Seminars at the University of Maryland College Park are a student-run series of talks given by current graduate students for current graduate students.

To receive regular information about the Computer Vision Student Seminars, subscribe to our mailing list or our talks list.

Description

The purpose of these talks is to:

  • Encourage interaction between computer vision students;
  • Provide an opportunity for computer vision students to be aware of and possibly get involved in the research their peers are conducting;
  • Provide an opportunity for computer vision students to receive feedback on their current research;
  • Provide speaking opportunities for computer vision students.

The guidelines for the format are:

  • An hour-long weekly meeting, consisting of one 20-40 minute talk followed by discussion and food.
  • The talks are meant to be casual and discussion is encouraged.
  • Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.


Schedule Fall 2012

All talks take place Thursdays at 4:30pm in AVW 3450.

Date          Speaker               Title
September 6   Angjoo Kanazawa       Face Alignment by Explicit Shape Regression
September 13  Sameh Khamis          Combining Per-Frame and Per-Track Cues for Multi-Person Action Recognition
September 20  Douglas Summers-Stay  Artificial Intelligence and Artificial Creativity Before 1900
September 27  Mohammad Rastegari    Attribute Discovery via Predictable and Discriminative Binary Codes
October 4     Xavier Gibert Serra
October 11    (ECCV week, no meeting)
October 18    Ashish Srivastava
October 25    Yi-Chen Chen
November 1    Raviteja Vemulapalli
November 8    Sumit Shekhar
November 15   Ang Li
November 22   Arijit Biswas
November 29   Fatemeh Mir Rashed
December 6    Ejaz Ahmed
December 13   (Final exams, no meeting)

Talk Abstracts Fall 2012

Face Alignment by Explicit Shape Regression

Speaker: Angjoo Kanazawa -- Date: September 6, 2012

In this talk, I will go over the CVPR 2012 paper "Face Alignment by Explicit Shape Regression". I will review the paper and discuss its key concepts: cascaded regression, random ferns, shape-indexed image features, and correlation-based feature selection. Then I will discuss our hypothesis on why this seemingly simple method works so well, how we can apply it to similar problem domains such as dog and bird part localization, and the challenges involved.

Abstract from the paper: We present a very efficient, highly accurate, “Explicit Shape Regression” approach for face alignment. Unlike previous regression-based approaches, we directly learn a vectorial regression function to infer the whole facial shape (a set of facial landmarks) from the image and explicitly minimize the alignment errors over the training data. The inherent shape constraint is naturally encoded into the regressor in a cascaded learning framework and applied from coarse to fine during the test, without using a fixed parametric shape model as in most previous methods. To make the regression more effective and efficient, we design a two-level boosted regression, shape-indexed features and a correlation-based feature selection method. This combination enables us to learn accurate models from large training data in a short time (20 minutes for 2,000 training images), and run regression extremely fast in test (15 ms for a 87 landmarks shape). Experiments on challenging data show that our approach significantly outperforms the state-of-the-art in terms of both accuracy and efficiency.

Combining Per-Frame and Per-Track Cues for Multi-Person Action Recognition

Speaker: Sameh Khamis -- Date: September 13, 2012

We propose a model to combine per-frame and per-track cues for action recognition. With multiple targets in a scene, our model simultaneously captures the natural harmony of an individual's action in a scene and the flow of actions of an individual in a video sequence, inferring valid tracks in the process. Our motivation is based on the unlikely discordance of an action in a structured scene, both at the track level (e.g., a person jogging then dancing) and the frame level (e.g., a person jogging in a dance studio). While we can utilize sampling approaches for inference in our model, we instead devise a global inference algorithm by decomposing the problem and solving the subproblems exactly and efficiently, recovering a globally optimal joint solution in several cases. Finally, we improve on the state-of-the-art action recognition results for two publicly available datasets.

Artificial Intelligence and Artificial Creativity Before 1900

Speaker: Doug Summers-Stay -- Date: September 20, 2012

I will talk about various inventions such as the Eureka, which generated Latin poetry in hexameter while playing "God Save the Queen"; the Homeoscope, a mechanical search engine invented by a Russian police clerk in 1832; the Componium, an orchestra-in-a-box which composed random variations on a melody; and others along the same lines. I'll also talk about how we could go beyond these techniques to build something really creative. This is a presentation of material I found when I was doing research for the book I published in January, Machinamenta.

Attribute Discovery via Predictable and Discriminative Binary Codes

Speaker: Mohammad Rastegari -- Date: September 27, 2012

We present images with binary codes in a way that balances discrimination and learnability of the codes. In our method, each image claims its own code in a way that maintains discrimination while being predictable from visual data. Category memberships are usually good proxies for visual similarity but should not be enforced as a hard constraint. Our method learns codes that maximize separability of categories unless there is strong visual evidence against it. Simple linear SVMs can achieve state-of-the-art results with our short codes. In fact, our method produces state-of-the-art results on Caltech256 with only 128-dimensional bit vectors and outperforms the state of the art by using longer codes. We also evaluate our method on ImageNet and show that our method outperforms state-of-the-art binary code methods on this large-scale dataset. Lastly, our codes can discover a discriminative set of attributes.


Past Semesters


Current Seminar Series Coordinators

Emails are at umiacs.umd.edu.

Angjoo Kanazawa, kanazawa@ (student of Professor David Jacobs)
Sameh Khamis, sameh@ (student of Professor Larry Davis)
Jie Ni, jni@ (student of Professor Rama Chellappa)
Ching Lik Teo, cteo@ (student of Professor Yiannis Aloimonos)

Gone but not forgotten.

Anne Jorstad, jorstad@ (student of Professor David Jacobs)
Sima Taheri, taheri@ (student of Professor Rama Chellappa)