Cvss spring2012

Schedule Spring 2012

Date         Speaker            Title
February 2   Ching Lik Teo      The Telluride Neuromorphic Workshop Experience
February 9   Jonghyun Choi      A Complementary Local Feature Descriptor for Face Identification (CCS-POP)
February 16  Sameh Khamis       Energy Minimization with Graph Cuts
February 23  Jay Pujara         Using Classifier Cascades for Scalable E-Mail Classification
March 1      (ECCV week, no meeting)
March 8      Jie Ni             Example-Driven Manifold Priors for Image Deconvolution
March 15     Huimin Guo         Covariance Discriminative Learning: A Natural and Efficient Approach to Image Set Classification
March 22     (Spring Break)
March 29     Daozheng Chen      Group Norms for Learning Latent Structural SVMs
April 5      Jaishanker Pillai  Sparsity Inspired Unconstrained Iris Recognition
April 12     Jun-Cheng Chen     Ambiguities in Camera Self-Calibration
April 19     Sima Taheri        Facial Expression Analysis Systems
April 26     Sujal Bista        Rendering massive virtual world using Clipmaps
May 3        Nazre Batool       Spatial Marked Point Processes in Computer Vision
May 10       (End of Year Social)
May 17       (Final exams, no meeting)


Talk Abstracts Spring 2012

The Telluride Neuromorphic Workshop Experience

Speaker: Ching Lik Teo -- Date: February 2, 2012

In this talk, I will present what we did as a group at the Telluride Neuromorphic Workshop 2011. I will explain the challenges we faced, modules that we have used, and some results from experiments on activity description we have conducted on the robot.

A Complementary Local Feature Descriptor for Face Identification (CCS-POP)

Speaker: Jonghyun Choi -- Date: February 9, 2012

In many descriptors, spatial intensity transforms are packed into a histogram or encoded into binary strings so that the descriptor is compact and insensitive to local misalignment. As a trade-off, however, discriminative information might be lost in the process. To capture the lost pixel-wise local information, we propose a new feature descriptor, Circular Center Symmetric-Pairs of Pixels (CCS-POP). It concatenates the symmetric pixel differences centered at a pixel position along various orientations with various radii; it is a generalized form of Local Binary Patterns, its variants, and Pairs-of-Pixels (POP). Combining CCS-POP with existing descriptors achieves better face identification performance on the FRGC Ver. 1.0 and FERET datasets than state-of-the-art approaches.

Energy Minimization with Graph Cuts

Speaker: Sameh Khamis -- Date: February 16, 2012

In this tutorial we describe how several computer vision problems can be intuitively formulated as Markov Random Fields. Inference in such models can be transformed to an energy minimization problem. Under some conditions, graph cut methods can be used to find the minimum of the energy function and, in turn, the most probable assignment for its variables. In addition, we will briefly cover some of the recent advances in the application of graph cuts to a wider set of energy functions.
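As a rough illustration of the idea in this abstract, the sketch below minimizes a small binary labeling energy (per-variable costs plus a Potts penalty for neighbors with different labels) by computing a minimum s-t cut. The energy form, the function names `min_cut_labels` and `energy`, and the toy numbers are assumptions chosen for illustration; the tutorial itself may cover different constructions. A plain Edmonds-Karp max-flow is used here for brevity, not a specialized vision solver.

```python
from collections import deque

def min_cut_labels(unary, pairwise):
    """Minimize sum_i unary[i][x_i] + sum_(i,j) w_ij * [x_i != x_j] for x_i in {0, 1}.

    unary: list of (cost_of_label_0, cost_of_label_1), non-negative.
    pairwise: dict {(i, j): w} with w >= 0.
    Returns (labels, minimum_energy).
    """
    n = len(unary)
    s, t = n, n + 1
    cap = [[0] * (n + 2) for _ in range(n + 2)]
    for i, (c0, c1) in enumerate(unary):
        cap[s][i] += c1  # cut when node i ends on the sink side (label 1)
        cap[i][t] += c0  # cut when node i stays on the source side (label 0)
    for (i, j), w in pairwise.items():
        cap[i][j] += w   # cut when the two neighbors take different labels
        cap[j][i] += w

    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path in the residual graph.
        parent = [-1] * (n + 2)
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n + 2):
                if parent[v] == -1 and cap[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            break  # no augmenting path left: the flow is maximal
        # Find the bottleneck capacity along the path, then push that much flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck

    # Nodes still reachable from the source in the last search get label 0.
    labels = [0 if parent[i] != -1 else 1 for i in range(n)]
    return labels, flow

def energy(labels, unary, pairwise):
    """Evaluate the same energy directly, for checking a labeling."""
    total = sum(unary[i][x] for i, x in enumerate(labels))
    total += sum(w for (i, j), w in pairwise.items() if labels[i] != labels[j])
    return total

# Toy three-variable chain (hypothetical numbers).
unary = [(1, 5), (4, 3), (6, 1)]
pairwise = {(0, 1): 2, (1, 2): 2}
labels, value = min_cut_labels(unary, pairwise)
print(labels, value, energy(labels, unary, pairwise))
```

The value of the cut equals the energy of the returned labeling, which is why a maximum-flow computation yields the most probable assignment for energies of this restricted form.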

Using Classifier Cascades for Scalable E-Mail Classification

Speaker: Jay Pujara -- Date: February 23, 2012

In many real-world scenarios, we must make judgments in the presence of computational constraints. One common computational constraint arises when the features used to make a judgment each have differing acquisition costs, but there is a fixed total budget for a set of judgments. Particularly when a large number of classifications must be made in real time, an intelligent strategy for optimizing accuracy versus computational cost is essential. E-mail classification is an area where accurate and timely results require such a trade-off. We identify two scenarios where intelligent feature acquisition can improve classifier performance. In granular classification, we seek to classify e-mails with increasingly specific labels structured in a hierarchy, where each level of the hierarchy requires a different trade-off between cost and accuracy. In load-sensitive classification, we classify a set of instances within an arbitrary total budget for acquiring features. Our method, Adaptive Classifier Cascades (ACC), designs a policy to combine a series of base classifiers with increasing computational costs given a desired trade-off between cost and accuracy. Using this method, we learn a relationship between feature costs and label hierarchies for granular classification, and between feature costs and cost budgets for load-sensitive classification. We evaluate our method on real-world e-mail datasets with realistic estimates of feature acquisition cost, and we demonstrate superior results when compared to baseline classifiers that do not have a granular, cost-sensitive feature acquisition policy.

Example-Driven Manifold Priors for Image Deconvolution

Speaker: Jie Ni -- Date: March 8, 2012

Image restoration methods that exploit prior information about images to be estimated have been extensively studied, typically using the Bayesian framework. In this work, we consider the role of prior knowledge of the object class in the form of a patch manifold to address the deconvolution problem. Specifically, we incorporate unlabeled image data of the object class, say natural images, in the form of a patch-manifold prior for the object class. The manifold prior is implicitly estimated from the given unlabeled data. We show how the patch-manifold prior effectively exploits the available sample class data for regularizing the deconvolution problem. Furthermore, we derive a generalized cross-validation (GCV) function to automatically determine the regularization parameter at each iteration without explicitly knowing the noise variance. Extensive experiments show that this method performs better than many competitive image deconvolution methods.

Covariance Discriminative Learning: A Natural and Efficient Approach to Image Set Classification

Speaker: Huimin Guo -- Date: March 15, 2012

We introduce a novel discriminative learning approach to image set classification by modeling the image set with its natural second order statistic, i.e., covariance matrix. Since nonsingular covariance matrices, a.k.a. symmetric positive definite (SPD) matrices, lie on a Riemannian manifold, classical learning algorithms cannot be directly utilized to classify points on the manifold. By exploring an efficient metric for the SPD matrices, i.e., Log-Euclidean Distance (LED), we derive a kernel function that explicitly maps the covariance matrix from the Riemannian manifold to a Euclidean space. With this explicit mapping, any learning method devoted to vector space can be exploited in either linear or kernel formulation. Linear Discriminant Analysis (LDA) and Partial Least Squares (PLS) are considered in this paper for their feasibility for our specific problem. The proposed method is evaluated on two tasks: face recognition and object categorization. Extensive experimental results show not only the superiority of our method over state-of-the-art ones in both accuracy and efficiency, but also its stability to two real challenges: noisy set data and varying set size.
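For readers unfamiliar with the Log-Euclidean Distance mentioned above, the following block spells out the standard definitions for SPD matrices. The symbols C, U, and lambda_i are notation introduced here, and the kernel shown is simply the inner product that induces this distance; it is an assumption that the talk uses exactly this form.

```latex
% Matrix logarithm of an SPD matrix via its eigendecomposition
C = U \,\mathrm{diag}(\lambda_1, \dots, \lambda_d)\, U^{\top}
\quad\Longrightarrow\quad
\log(C) = U \,\mathrm{diag}(\log\lambda_1, \dots, \log\lambda_d)\, U^{\top}

% Log-Euclidean Distance between two SPD matrices
d_{\mathrm{LED}}(C_1, C_2) = \bigl\lVert \log(C_1) - \log(C_2) \bigr\rVert_F

% Inner product (kernel) that induces this distance
k(C_1, C_2) = \mathrm{tr}\bigl( \log(C_1)\, \log(C_2) \bigr),
\qquad
d_{\mathrm{LED}}^2(C_1, C_2) = k(C_1, C_1) - 2\,k(C_1, C_2) + k(C_2, C_2)
```

Because log(C) is an ordinary symmetric matrix, it can be flattened into a vector, which is what allows vector-space methods such as LDA and PLS to be applied.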

Group Norms for Learning Latent Structural SVMs

Speaker: Daozheng Chen -- Date: March 29, 2012

Latent variable models have been widely applied to many problems in machine learning and related fields such as computer vision and information retrieval. However, the complexity of the latent space in such models is typically left as a free design choice. A larger latent space results in a more expressive model, but such models are prone to overfitting and are slower to perform inference with. The goal of this work is to regularize the complexity of the latent space and learn which hidden states are really relevant for the prediction problem. To this end, we propose regularization with a group norm such as L1-L2 to estimate the parameters of a Latent Structural SVM. Our experiments on digit recognition show that our approach is indeed able to control the complexity of the latent space, resulting in significantly faster inference at test time without any loss in accuracy of the learnt model.
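As a rough illustration of the group-norm idea in this abstract, the sketch below computes an L1-L2 norm (the sum of per-group Euclidean norms) and applies group soft-thresholding, a standard proximal step for that penalty. The grouping of weights by hidden state, the function names, and the use of a proximal step are assumptions for illustration, not the training procedure from the talk.

```python
import math

def group_l1_l2_norm(weights, groups):
    """L1-L2 norm: sum over groups of the Euclidean norm of that group's weights."""
    return sum(math.sqrt(sum(weights[i] ** 2 for i in g)) for g in groups)

def group_soft_threshold(weights, groups, lam):
    """Shrink each group toward zero; groups with norm <= lam become exactly zero."""
    out = list(weights)
    for g in groups:
        norm = math.sqrt(sum(weights[i] ** 2 for i in g))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in g:
            out[i] = weights[i] * scale
    return out

# Hypothetical example: each group holds the weights tied to one hidden state.
weights = [3.0, 4.0, 0.3, 0.4, 2.0]
groups = [[0, 1], [2, 3], [4]]
shrunk = group_soft_threshold(weights, groups, lam=1.0)
# A hidden state stays "active" only if its group keeps a nonzero weight.
active = [k for k, g in enumerate(groups) if any(shrunk[i] != 0.0 for i in g)]
print(group_l1_l2_norm(weights, groups), shrunk, active)
```

The penalty zeroes out whole groups at once, which is how a regularizer of this kind can remove entire hidden states and so reduce inference cost.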

Sparsity Inspired Unconstrained Iris Recognition

Speaker: Jaishanker Pillai -- Date: April 5, 2012

Iris recognition is one of the most popular approaches for human authentication, since iris patterns are unique to each person and remain stable for long periods of time. However, existing algorithms for iris recognition require clean iris images, which limits their utility in unconstrained environments like surveillance. In this work, we develop an unconstrained iris recognition algorithm by modeling the inherent structure in clean iris images using sparse representations. The proposed algorithm recognizes the test image and also predicts the quality of acquisition. We further extend the introduced algorithm with a quality-based fusion framework, which combines the recognition results from multiple test images. Extensive evaluation on existing datasets clearly demonstrates the utility of the proposed algorithm for recognition and image quality estimation.

Ambiguities in Camera Self-Calibration

Speaker: Jun-Cheng Chen -- Date: April 12, 2012

Structure from motion (SfM) is the problem of computing the 3D scene and camera parameters from a video or collection of images. SfM problems can be further classified as calibrated and uncalibrated. In calibrated SfM, the internal camera parameters are known. This is a much easier problem than the uncalibrated case, where these parameters are unknown. Solving for the internal camera parameters is known as the camera self-calibration (or auto-calibration) problem. Critical motion sequences (CMS) are those sequences/videos from which the internal parameters cannot be determined uniquely; that is, there are many different settings of the internal parameters that give rise to the same video. In this talk, we will show that three cases of motion, (1) pure translation, (2) single rotation, and (3) single rotation about the X/Y/Z-axis combined with translation, are CMS, and we will give the necessary and sufficient conditions for a sequence not to be a CMS.

Facial Expression Analysis Systems

Speaker: Sima Taheri -- Date: April 19, 2012

The goal of facial expression analysis is to create systems that can automatically analyze and recognize facial feature changes and facial motion due to facial expressions from visual information. This has been an active research topic for several years and has attracted the interest of many computer vision researchers and behavioral scientists, with applications in behavioral science, security, animation, and human-computer interaction. In this talk, I will briefly describe the components of a facial expression analysis system and review some previous work. Then I will talk about my work, View-Invariant Expression Analysis using Analytic Shape Manifolds and Structure-Preserving Sparse Decomposition for Facial Expression Analysis.

Rendering massive virtual world using Clipmaps

Speaker: Sujal Bista -- Date: April 26, 2012

Real-time rendering of a massive virtual world requires efficient management of textures, geometric structures, and a variety of visual effects. Despite recent improvements in Graphics Processing Units (GPUs), the currently available memory and computational power are still not enough to store and process the textures and geometries that are used to represent a high-quality virtual world. One way to overcome this problem is to use a Clipmap, a hardware-accelerated approach that manages levels of detail (LOD) for the objects, textures, and effects used to render a virtual world.

Spatial Marked Point Processes in Computer Vision

Speaker: Nazre Batool -- Date: May 3, 2012

The computer vision community is very familiar with Markov random field (MRF) modeling for numerous applications. In this talk, I will present an overview of the more general, but less popular, Markov point processes (MPP) and will highlight the connection between MRF and MPP and the advantages of MPP modeling for specific applications. Recently, several versions of MPP called ‘Marked’ point processes have been used in remote sensing/aerial imaging applications. I will discuss some of the applications and finally, present my recent work based on MPP.