== Title ==
Variable Selection is Hard

== Speaker ==
Howard Karloff

== Abstract ==
Consider the task of a machine-learning system faced with voluminous data on m individuals. While there may be p = 10^6 features describing each individual, how can the algorithm find a small set of features that "best" describes the individuals? People usually seek small feature sets both because models with few features are easier to understand and because simpler models usually generalize better. We study the simple case of linear regression, in which a user has an m x p matrix B and a vector y and seeks a p-vector x ''with as few nonzeroes as possible'' such that Bx is approximately equal to y; we call this problem SPARSE REGRESSION. There are numerous algorithms in the statistical literature for SPARSE REGRESSION, such as Forward Selection, Backward Elimination, LASSO, and Ridge Regression.

We give a general hardness proof that (subject to a complexity assumption) no polynomial-time algorithm can give good worst-case performance for SPARSE REGRESSION, even if it is allowed to include more variables than necessary, and even if it is only required to find an x for which Bx is relatively far from y. This is joint work with Dean Foster and Justin Thaler of Yahoo Labs.
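
== Illustrative sketch ==
The talk concerns hardness, but the underlying problem is easy to state concretely. The sketch below is not from the talk: it is a minimal Python illustration, on a small synthetic B and y, of the SPARSE REGRESSION objective and of Forward Selection, one of the heuristics named in the abstract. The function name forward_selection, the sparsity budget k, and the synthetic instance are illustrative assumptions, not part of the speaker's material.

<syntaxhighlight lang="python">
import numpy as np

def forward_selection(B, y, k):
    """Greedily pick up to k columns of B whose least-squares fit Bx is close to y."""
    m, p = B.shape
    selected = []            # indices of chosen features (the support of x)
    residual = y.copy()
    coef = np.zeros(0)
    for _ in range(k):
        # Score each column by its correlation with the current residual,
        # ruling out columns that were already chosen.
        scores = np.abs(B.T @ residual)
        scores[selected] = -np.inf
        j = int(np.argmax(scores))
        selected.append(j)
        # Refit on the chosen columns and update the residual.
        coef, *_ = np.linalg.lstsq(B[:, selected], y, rcond=None)
        residual = y - B[:, selected] @ coef
    x = np.zeros(p)
    x[selected] = coef       # sparse solution: nonzero only on the chosen columns
    return x, selected

# Tiny synthetic instance: y depends on only 3 of 50 features (plus small noise).
rng = np.random.default_rng(0)
B = rng.standard_normal((100, 50))
true_x = np.zeros(50)
true_x[[3, 17, 42]] = [2.0, -1.5, 0.7]
y = B @ true_x + 0.01 * rng.standard_normal(100)

x_hat, support = forward_selection(B, y, k=3)
print("selected features:", sorted(support))
</syntaxhighlight>

The hardness result described in the abstract says that, in the worst case and under a complexity assumption, no polynomial-time procedure of this kind can be guaranteed to do well, even when it is allowed extra variables and only a weak approximation of y is required.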