| We introduce a novel discriminative learning approach to image set classification by modeling the image set with its natural second order statistic, i.e., covariance matrix. Since nonsingular covariance matrices, a.k.a. symmetric positive definite (SPD) matrices, lie on a Riemannian manifold, classical learning algorithms cannot be directly utilized to classify points on the manifold. By exploring an efficient metric for the SPD matrices, i.e., Log-Euclidean Distance (LED), we derive a kernel function that explicitly maps the covariance matrix from the Riemannian manifold to a Euclidean space. With this explicit mapping, any learning method devoted to vector space can be exploited in either linear or kernel formulation. Linear Discriminant Analysis (LDA) and Partial Least Squares (PLS) are considered in this paper for their feasibility for our specific problem. The proposed method is evaluated on two tasks: face recognition and object categorization. Extensive experimental results show not only the superiority of our method over state-of-the-art ones in both accuracy and efficiency, but also its stability to two real challenges: noisy set data and varying set size. | | We introduce a novel discriminative learning approach to image set classification by modeling the image set with its natural second order statistic, i.e., covariance matrix. Since nonsingular covariance matrices, a.k.a. symmetric positive definite (SPD) matrices, lie on a Riemannian manifold, classical learning algorithms cannot be directly utilized to classify points on the manifold. By exploring an efficient metric for the SPD matrices, i.e., Log-Euclidean Distance (LED), we derive a kernel function that explicitly maps the covariance matrix from the Riemannian manifold to a Euclidean space. With this explicit mapping, any learning method devoted to vector space can be exploited in either linear or kernel formulation. Linear Discriminant Analysis (LDA) and Partial Least Squares (PLS) are considered in this paper for their feasibility for our specific problem. The proposed method is evaluated on two tasks: face recognition and object categorization. Extensive experimental results show not only the superiority of our method over state-of-the-art ones in both accuracy and efficiency, but also its stability to two real challenges: noisy set data and varying set size. |
| + | Latent variables models have been widely applied in many problems in machine learning and related fields such as computer vision and information retrieval.However, the complexity of the latent space in such models is typically left as a free design choice. A larger latent space results in a more expressive model, but such models are prone to overfitting and are slower to perform inference with. The goal of this work is to regularize the complexity of the latent space and \emph{learn} which hidden states are really relevant for the prediction problem.To this end, we propose regularization with a group norm such as $\ell_{1}$-$\ell_{2}$ to estimate parameters of a Latent Structural SVM. Our experiments on digit recognition show that our approach is indeed able to control the complexity of latent space, resulting in significantly faster inference at test-time without any loss in accuracy of the learnt model. |