CATS-Jan-22-2015
Title
Variable Selection is Hard
Speaker
Howard Karloff
Abstract
Consider the task of a machine-learning system faced with voluminous data on m individuals. While there may be p = 10^6 features describing each individual, how can the algorithm find a small set of features that "best" describes the individuals? People usually seek small feature sets both because models with small feature sets are understandable and because simple models usually generalize better. We study the simple case of linear regression, in which a user has an m x p matrix B and a vector y, and seeks a p-vector x *with as few nonzeros as possible* such that Bx is approximately equal to y; we call this problem SPARSE REGRESSION. There are numerous algorithms in the statistical literature for SPARSE REGRESSION, such as Forward Selection, Backward Elimination, LASSO, and Ridge Regression. We give a general hardness proof that (subject to a complexity assumption) no polynomial-time algorithm can give good performance (in the worst case) for SPARSE REGRESSION, even if it is allowed to include more variables than necessary, and even if it need only find an x such that Bx is relatively far from y. This is joint work with Dean Foster and Justin Thaler of Yahoo Labs.
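For readers unfamiliar with the setup, the following is a minimal sketch of the SPARSE REGRESSION problem described above, solved heuristically by the Forward Selection strategy the abstract mentions. It is not the speaker's construction or hardness proof; the matrix sizes, the stopping rule (a fixed sparsity budget k), and all variable names are illustrative assumptions.

 import numpy as np
 
 def forward_selection(B, y, k):
     """Greedily choose up to k columns of B and fit y by least squares,
     returning a p-vector x with at most k nonzeros such that Bx ~ y."""
     m, p = B.shape
     chosen = []
     x = np.zeros(p)
     for _ in range(k):
         best_j, best_err, best_coef = None, np.inf, None
         for j in range(p):
             if j in chosen:
                 continue
             cols = chosen + [j]
             # Least-squares fit restricted to the candidate column set.
             coef, *_ = np.linalg.lstsq(B[:, cols], y, rcond=None)
             err = np.linalg.norm(B[:, cols] @ coef - y)
             if err < best_err:
                 best_j, best_err, best_coef = j, err, coef
         chosen.append(best_j)
         x = np.zeros(p)
         x[chosen] = best_coef
     return x
 
 # Toy instance (illustrative only): y is generated by 3 of 50 features plus noise.
 rng = np.random.default_rng(0)
 B = rng.standard_normal((100, 50))
 true_x = np.zeros(50)
 true_x[[3, 17, 42]] = [2.0, -1.5, 1.0]
 y = B @ true_x + 0.01 * rng.standard_normal(100)
 x_hat = forward_selection(B, y, k=3)
 print("selected features:", np.flatnonzero(x_hat))

On easy instances such as the toy one above the greedy heuristic recovers the right columns, but the talk's hardness result says that, in the worst case, no polynomial-time algorithm (this one included) can guarantee a good sparse approximation under the stated complexity assumption.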