Changes

7,537 bytes removed , 23:40, 3 December 2015

→‎Learning 3D Deformation of Animals from 2

Line 20: Line 20:

* Topics may include current research, past research, general topic presentations, paper summaries and critiques, or anything else beneficial to the computer vision graduate student community.

+

==Schedule Fall 2015==

−

~~==Schedule Fall 2012==~~

+

All talks take place on Thursdays at 3:30pm in AVW 3450.

−

~~All talks take place Thursdays at 4:30pm in AVW 3450.~~

+

{| class="wikitable" cellpadding="10" border="1" cellspacing="1"

−

{| class="wikitable" cellpadding="10" border="1" cellspacing="0"

|-

! Date

Line 31: Line 30:

! Title

|-

−

| ~~September 6~~

+

| December 3

| Angjoo Kanazawa

−

| ~~Face Alignment by Explicit Shape Regression~~

+

| Learning 3D Deformation of Animals from 2D Images

−

|-

−

~~| September 13~~

−

~~| Sameh Khamis~~

−

~~| Combining Per-Frame and Per-Track Cues for Multi-Person Action Recognition~~

−

|-

−

~~| September 20~~

−

~~| Douglas Summerstay~~

−

~~| Artificial Intelligence and Artificial Creativity Before 1900~~

−

|-

−

~~| September 27~~

−

~~| Mohammad Rastegari~~

−

~~| Attribute Discovery via Predictable and Discriminative Binary Codes~~

−

|-

−

~~| October 4~~

−

~~| Xavier Gibert Serra~~

−

~~| Anomaly Detection on Railway Components using Sparse Representations~~

−

|-

−

~~| October 11~~

−

~~| Kotaro Hara~~

−

~~| Using Google Street View to Identify Street-level Accessibility Problems~~

−

|-

−

~~| October 18~~

−

~~| Ashish Shrivastava~~

−

~~| Dictionary learning methods for computer vision~~

−

|-

−

~~| October 25~~

−

~~| Yi-Chen Chen~~

−

~~| Dictionary-based Face Recognition~~ from ~~Video~~

−

|-

−

~~| November 1~~

−

~~| Raviteja Vemulapalli~~

−

|

−

|-

−

~~| November 8~~

−

~~| Sumit Shekhar~~

−

|

−

|-

−

~~| November 15~~

−

~~| Ang Li~~

−

|

−

|-

−

~~| November 22~~

−

~~| Arijit Biswas~~

−

|

|-

−

~~| November 29~~

+

| December 10

−

~~| Fatemeh Mir Rashed~~

+

| Xintong Han

−

|

+

| Automated Event Retrieval using Web Trained Detectors

−

|-

−

~~| December 6~~

−

~~| Ejaz Ahmed~~

−

|

−

|-

−

| December 13

−

| ~~''(Final exams, no meeting)''~~

−

|

|}

−

==Talk Abstracts ~~Fall 2012~~==

+

==Talk Abstracts Fall 2015==

−

~~===Face Alignment by Explicit Shape Regression===~~

−

~~Speaker: [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa] -- Date: September 6, 2012~~

−

In this talk, we will go over the CVPR 2012 paper "Face Alignment by Explicit Shape Regression". I will review the paper and discuss its key concepts: cascaded regression, random ferns, shape-indexed image features, and correlation-based feature selection. Then I will discuss our hypothesis on why this seemingly simple method works so well, how we can apply it to similar problem domains such as dog and bird part localization, and the challenges those domains pose.

−

~~Abstract from the paper:~~

−

We present a very efficient, highly accurate, “Explicit Shape Regression” approach for face alignment. Unlike previous regression-based approaches, we directly learn a vectorial regression function to infer the whole facial shape (a set of facial landmarks) from the image and explicitly minimize the alignment errors over the training data. The inherent shape constraint is naturally encoded into the regressor in a cascaded learning framework and applied from coarse to fine during the test, without using a fixed parametric shape model as in most previous methods. To make the regression more effective and efficient, we design a two-level boosted regression, shape-indexed features and a correlation-based feature selection method. This combination enables us to learn accurate models from large training data in a short time (20 minutes for 2,000 training images), and run regression extremely fast in test (15 ms for a 87 landmarks shape). Experiments on challenging data show that our approach significantly outperforms the state-of-the-art in terms of both accuracy and efficiency.

−

~~===Combining Per-Frame and Per-Track Cues for Multi-Person Action Recognition===~~

−

~~Speaker: [http://www.umiacs.umd.edu/~sameh/ Sameh Khamis] -- Date: September 13, 2012~~

−

~~We propose a model to combine per-frame and per-track cues for action recognition. With multiple targets in a scene, our model simultaneously captures the natural harmony~~ of an individual's action in a scene and the flow of actions of an individual in a video sequence, inferring valid tracks in the process. Our motivation is based on the unlikely discordance of an action in a structured scene, both at the track level (e.g., a person jogging then dancing) and the frame level (e.g.~~, a person jogging in a dance studio)~~. While we can utilize sampling approaches for inference in our model, we instead devise a global inference algorithm by decomposing the problem and solving the subproblems exactly and efficiently, recovering a globally optimal joint solution in several cases. ~~Finally, we improve on the state~~-of-~~the-art action recognition results for two publicly available datasets.~~

+

===Learning 3D Deformation of Animals from 2D Images===

+

Speaker: [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa] -- Date: December 3, 2015

−

~~===Artificial Intelligence~~ and ~~Artificial Creativity Before 1900===~~

+

Abstract: Understanding how an animal can deform and articulate is essential for a realistic modification of its 3D model. In this paper, we show that such information can be learned from user-clicked 2D images and a template 3D model of the target animal. We present a volumetric deformation framework that produces a set of new 3D models by deforming a template 3D model according to a set of user-clicked images. Our framework is based on a novel locally-bounded deformation energy, where every local region has its own stiffness value that bounds how much distortion is allowed at that location. We jointly learn the local stiffness bounds as we deform the template 3D mesh to match each user-clicked image. We show that this seemingly complex task can be solved as a sequence of convex optimization problems. We demonstrate the effectiveness of our approach on cats and horses, which are highly deformable and articulated animals. Our framework produces new 3D models of animals that are significantly more plausible than methods without learned stiffness.

−

~~Speaker: [http://www~~.cs.~~umd~~.~~edu/~dss/ Doug Summers~~-~~Stay]~~ -~~- Date: September 20~~, ~~2012~~

−

I will talk about various inventions such as the Eureka, which generated Latin poetry in hexameter while playing "God Save the Queen"; the Homeoscope, a mechanical search engine invented by a Russian police clerk in 1832; the Componium, an orchestra-in-a-box which composed random variations on a melody; and others along the same lines. ~~I'll also talk about how we could go beyond these techniques to build something really creative~~. ~~This is a presentation of material I found when I was doing research for the book I published in January, Machinamenta~~.

+

Link: [http://arxiv.org/pdf/1507.07646v1.pdf paper]

−

===~~Attribute Discovery via Predictable and Discriminative Binary Codes~~===

+

===Automated Event Retrieval using Web Trained Detectors===

−

~~Speaker: [http://www.cs.dartmouth.edu/~mrastegari/ Mohammad Rastegari] -- Date: September 27, 2012~~

−

~~We present images with binary codes in a way that balances discrimination and learnability of the codes~~. In our method, each image claims its own code in a way that maintains discrimination while being predictable from visual data. Category memberships are usually good proxies for visual similarity but should not be enforced as a hard constraint. ~~Our method learns codes that maximize separability of categories unless there is strong visual evidence against it~~. ~~Simple linear SVMs can achieve state-of~~-~~the~~-~~art results with our short codes. In fact~~, our method produces state-of-the-art results on Caltech256 with only 128- dimensional bit vectors and outperforms state of the art by using longer codes. We also evaluate our method on ImageNet and show that our method outperforms state-of-the-art binary code methods on this large scale dataset. Lastly, our codes can discover a discriminative set of attributes.

+

Speaker: [http://www.umiacs.umd.edu/~xintong/ Xintong Han] -- Date: December 10, 2015

−

~~===Anomaly Detection on Railway Components using Sparse Representations===~~

+

Abstract: Complex event retrieval is a challenging research problem, especially when no training videos are available. An alternative to collecting training videos is to train a large semantic concept bank a priori. Given a text description of an event, event retrieval is performed by selecting concepts linguistically related to the event description and fusing the concept responses on unseen videos. However, defining an exhaustive concept lexicon and pre-training it requires vast computational resources. Therefore, recent approaches automate concept discovery and training by leveraging large amounts of weakly annotated web data. Compact visually salient concepts are automatically obtained by the use of concept pairs or, more generally, n-grams. However, not all visually salient n-grams are necessarily useful for an event query (some combinations of concepts may be visually compact but irrelevant), and this drastically affects performance. We propose an event retrieval algorithm that constructs pairs of automatically discovered concepts and then prunes those concepts that are unlikely to be helpful for retrieval. Pruning depends both on the query and on the specific video instance being evaluated. Our approach also addresses calibration and domain adaptation issues that arise when applying concept detectors to unseen videos. We demonstrate large improvements over other vision-based systems on the TRECVID MED 13 dataset.

−

~~Speaker: [http://www.umiacs.umd.edu/~gibert/ Xavier Gibert-Serra] -- Date~~: ~~October 4, 2012~~

−

~~High-speed rail (HSR) requires high levels of reliability of the track infrastructure. Automated visual inspection~~ is ~~useful for finding many anomalies such as cracks or chips on joint bars and concrete ties~~, ~~but existing vision-based inspection systems often produce high number of false detections, and~~ are very sensitive to external factors such as changes in environmental conditions. For example, state-of-the-art algorithms used by the railroad industry nominally perform at a detection rate of 85% with a false alarm rate of 3% and performance drops very quickly as image quality degrades. ~~On the tie inspection problem, this false alarm rate would correspond~~ to 2.6 detections per second at 125 MPH, which cannot be handled by an operator. These false detections have many causes, including variations in anomaly appearance, texture, partial occlusion, and noise, which existing algorithms cannot handle very well. To overcome these limitations, it is ~~necessary~~ to ~~reformulate this joint detection and segmentation problem as~~ a ~~Blind Source Separation problem, and use~~ a ~~generative model that is robust to noise and is capable of handling missing data~~.

−

~~In signal and image processing, Sparse Representations (SR) is an efficient way of describing~~ a ~~signal as a linear combination~~ of ~~a small number of atoms (elementary signals) from a dictionary. In natural images, sparsity arises from the statistical dependencies of pixel values across the image. Therefore~~, statistical methods such as Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), and Independent Component Analysis (ICA) have been used for dimensionality reduction in several computer vision problems. Recent advances in SR theory have enabled methods that learn optimal dictionaries directly from training data. For example, K-SVD is ~~a very well known algorithm for automatically designing over-complete dictionaries for sparse representation.~~

−

~~In this detection problem,~~ the ~~anomalies have very well defined structure~~ and ~~therefore, they can be represented sparsely in some subspace. In addition,~~ the image background has very structured texture, so it is sparse with respect to a different frame. Theoretical results in mathematical geometric separation show that it is possible to separate these two image components (regular texture from contours) by minimizing the L1 norm of the coefficients in geometrically complementary frames. ~~More recently~~, ~~it has been shown that this problem can be solved efficiently using thresholding~~ and total variation regularization. Our experiments show that the sparse coefficients extracted from the contour component can be converted into feature vectors that can be used to cluster and detect these anomalies.

−

~~===Using Google Street View to Identify Street~~-~~level Accessibility Problems===~~

−

~~Speaker: [http://kotarohara~~.~~com/ Kotaro Hara] -- Date: October 11, 2012~~

−

~~Poorly maintained sidewalks, missing curb ramps~~, and ~~other obstacles pose considerable accessibility challenges; however, there are currently few, if any, mechanisms to determine accessible areas~~ of ~~a city a priori~~.

−

In the ~~first half~~ of ~~the presentation~~, ~~I will talk about our investigation of the feasibility of using untrained crowd workers from Amazon Mechanical Turk (turkers) to find~~, ~~label, and assess sidewalk accessibility problems in Google Street View imagery~~. ~~Our work effectively demonstrates a promising new~~, ~~highly scalable method~~ for ~~acquiring knowledge about sidewalk accessibility.~~

−

~~In the latter half, I will discuss the future works as well as open research questions related in the field~~ of ~~computer vision.~~

−

~~===Dictionary learning methods for computer vision===~~

−

~~Speaker: [http://www.umiacs.umd.edu/~ashish/ Ashish Shrivastava]~~ -- ~~Date: October 18, 2012~~

−

~~Sparse~~ and ~~redundant signal representations have recently gained much interest in image understanding~~. ~~This is partly due to the fact~~ that ~~signals or images~~ of ~~interest~~ are ~~often sparse in some dictionary. These dictionaries can~~ be either analytic or they can be learned directly from the data. In fact, it has been observed that learning a dictionary directly from data often leads to improved results in many practical applications such as classification and restoration. ~~In this talk I will give a general overview of dictionary learning methods and talk in detail about my recent work~~ on ~~semi-supervised dictionary learning and non-linear supervised dictionary learning methods.~~

−

~~===Dictionary-based Face Recognition from Video===~~

−

~~Speaker: Yi-Chen Chen -- Date: October 25, 2012~~

−

~~The main challenge in recognizing faces in video is effectively exploiting~~ the ~~multiple frames of a face~~ and the ~~accompanying dynamic signature. One prominent method is based on extracting joint appearance and behavioral features. A second method models a person by temporal correlations of features in a~~ video. Our approach ~~introduces the~~ concept ~~of video-dictionaries for face recognition, which generalizes the work in sparse representation and dictionaries for faces in still images. Video-dictionaries are designed~~ to ~~implicitly encode temporal, pose, and illumination information~~. We demonstrate ~~our method~~ on the ~~Face and Ocular Challenge Series (FOCS), which consists of unconstrained video sequences~~. ~~We show that our method is efficient and performs significantly better than many competitive video-based face recognition algorithms.~~

−

~~===TBA===~~

−

~~Speaker: Raviteja Vemulapalli -- Date: November 1, 2012~~

−

~~===TBA===~~

−

~~Speaker: Sumit Shekhar -- Date: November 8, 2012~~

−

~~===TBA===~~

−

~~Speaker: [http://www.cs.umd.edu/~angli/ Ang Li] -- Date: November 15, 2012~~

−

~~===TBA===~~

−

~~Speaker: [http://www.umiacs.umd.edu/~arijit/ Arijit Biswas] -- Date: November 22, 2012~~

−

~~===TBA===~~

−

~~Speaker: Fatemeh Mir Rashed -- Date: November 29, 2012~~

−

~~===TBA===~~

−

~~Speaker: Ejaz Ahmed -- Date: December 6, 2012~~

+

Link: [http://arxiv.org/pdf/1509.07845v1.pdf paper]

==Past Semesters==

−

* [[cvss_spring2012|~~Schedule~~ Spring 2012]]

+

* [[Cvss:Spring2015| Spring 2015]]

−

* [[cvss_fall2011|~~Schedule~~ Fall 2011]]

+

* [[cvss fall2014|Fall 2014]]

−

* [[cvss_summer2011|~~Schedule~~ Summer 2011]]

+

* [[cvss_spring2014|Spring 2014]]

+

* [[cvss_fall2013|Fall 2013]]

+

* [[cvss_summer2013|Summer 2013]]

+

* [[cvss_spring2013|Spring 2013]]

+

* [[cvss_fall2012|Fall 2012]]

+

* [[cvss_spring2012|Spring 2012]]

+

* [[cvss_fall2011|Fall 2011]]

+

* [[cvss_summer2011|Summer 2011]]

+

==Funded By==

+

* Computer Vision Faculty

+

==Current Seminar Series Coordinators==

Line 172: Line 77:

Emails are at umiacs.umd.edu.

−

{| ~~class="wikitable"~~ cellpadding="5"

+

{| cellpadding="1"

+

|-

+

| [http://sites.google.com/site/austinomyers/ Austin Myers], amyers@

+

| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])

|-

| [http://www.umiacs.umd.edu/~kanazawa/ Angjoo Kanazawa], kanazawa@

−

| (student of [http://~~www.~~cs.umd.edu/~djacobs/ Professor David Jacobs])

+

| (student of [http://cs.umd.edu/~djacobs/ Professor David Jacobs])

+

|-

+

| [http://sites.google.com/site/yechengxi/ Chenxi Ye], cxy@

+

| (student of [http://www.cfar.umd.edu/~yiannis/ Professor Yiannis Aloimonos])

|-

−

| [http://www.umiacs.umd.edu/~~~sameh~~/ ~~Sameh Khamis~~], ~~sameh~~@

+

| [http://www.umiacs.umd.edu/~xintong/ Xintong Han], xintong@

| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])

|-

−

| [http://www.~~umiacs~~.umd.edu/~~~jni~~/ ~~Jie Ni~~], ~~jni~~@

+

| [http://www.cs.umd.edu/~bharat/ Bharat Singh], bharat@

−

| (student of [http://www.umiacs.umd.edu/~~~rama~~/ Professor ~~Rama Chellappa~~])

+

| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])

|-

−

| [http://~~www~~.~~umiacs~~.~~umd.edu~~/~~~cteo/ Ching Lik Teo~~], ~~cteo~~@

+

| [http://bcsiriuschen.github.io/ Bor-Chun (Sirius) Chen], sirius@

−

| (student of [http://www.~~cfar~~.umd.edu/~~~yiannis~~/ Professor ~~Yiannis Aloimonos~~])

+

| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])

|}

Gone but not forgotten.

−

+

{| cellpadding="1"

−

{| ~~class="wikitable"~~ cellpadding="5"

+

|-

+

| [http://www.umiacs.umd.edu/~jhchoi/ Jonghyun Choi], jhchoi@

+

| (student of [http://www.umiacs.umd.edu/~lsd/ Professor Larry Davis])

|-

−

| ~~[http://www~~-~~users.math.umd.edu/~jorstad/ Anne Jorstad]~~, ~~jorstad~~@

+

| Ching-Hui Chen, ching@

−

| (student of [http://www.cs.umd.edu/~~~djacobs~~/ Professor ~~David Jacobs~~])

+

| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])

+

|

|-

−

| [http://~~www~~.~~umiacs~~.~~umd.edu/~taheri~~/ ~~Sima Taheri~~], ~~taheri~~@

+

| [http://ravitejav.weebly.com/ Raviteja Vemulapalli], raviteja@

| (student of [http://www.umiacs.umd.edu/~rama/ Professor Rama Chellappa])

+

|-

+

| [http://www.umiacs.umd.edu/~sameh/ Sameh Khamis]

+

|

+

|-

+

| [http://www.umiacs.umd.edu/~ejaz/ Ejaz Ahmed]

+

|

+

|-

+

| [http://cvlabwww.epfl.ch/~jorstad/ Anne Jorstad]

+

| now at EPFL

+

|-

+

| [http://www.umiacs.umd.edu/~jni/ Jie Ni]

+

| now at Sony

+

|-

+

| [http://www.umiacs.umd.edu/~taheri/ Sima Taheri]

+

|

+

|-

+

| [http://www.umiacs.umd.edu/~cteo/ Ching Lik Teo]

+

|

|}

Xintong


Revision as of 23:40, 3 December 2015