dc.contributor.advisor | Michael I. Jordan. | en_US |
dc.contributor.author | Ng, Andrew Y., 1976- | en_US |
dc.date.accessioned | 2005-08-19T19:14:59Z | |
dc.date.available | 2005-08-19T19:14:59Z | |
dc.date.copyright | 1998 | en_US |
dc.date.issued | 1998 | en_US |
dc.identifier.uri | http://hdl.handle.net/1721.1/9658 | |
dc.description | Thesis (S.M.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1998. | en_US |
dc.description | Includes bibliographical references (p. 55-57). | en_US |
dc.description.abstract | We consider feature selection for supervised machine learning in the "wrapper" model of feature selection. This typically involves an NP-hard optimization problem that is approximated by heuristic search for a "good" feature subset. First considering the idealization where this optimization is performed exactly, we give a rigorous bound for generalization error under feature selection. The search heuristics typically used are then immediately seen as trying to achieve the error given in our bounds, and succeeding to the extent that they succeed in solving the optimization. The bound suggests that, in the presence of many "irrelevant" features, the main source of error in wrapper model feature selection is from "overfitting" hold-out or cross-validation data. This motivates a new algorithm that, again under the idealization of performing search exactly, has sample complexity (and error) that grows logarithmically in the number of "irrelevant" features - which means it can tolerate having a number of "irrelevant" features exponential in the number of training examples - and search heuristics are again seen to be directly trying to reach this bound. Experimental results on a problem using simulated data show the new algorithm having much higher tolerance to irrelevant features than the standard wrapper model. Lastly, we also discuss ramifications that sample complexity logarithmic in the number of irrelevant features might have for feature design in actual applications of learning. | en_US |
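The abstract describes the standard wrapper model: heuristically search over feature subsets, scoring each candidate subset by the hold-out (or cross-validation) error of a learner trained on it. Below is a minimal sketch of that idea using greedy forward search, offered only as an illustration of the general wrapper approach; the simulated data, the logistic-regression learner, and the `holdout_error` helper are assumptions for this example, not the thesis's own algorithm or experimental setup.

```python
# Minimal sketch of wrapper-model feature selection (greedy forward search,
# subsets scored by hold-out error). Illustrative only; not the thesis algorithm.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Simulated data: a few relevant features plus many irrelevant ones.
n, n_relevant, n_irrelevant = 200, 5, 50
X = rng.normal(size=(n, n_relevant + n_irrelevant))
y = (X[:, :n_relevant].sum(axis=1) > 0).astype(int)

X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.3, random_state=0)

def holdout_error(features):
    """Train on the training split restricted to `features`; return hold-out error."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_train[:, features], y_train)
    return 1.0 - clf.score(X_hold[:, features], y_hold)

selected, best_err = [], np.inf
remaining = list(range(X.shape[1]))
while remaining:
    # Greedy forward step: add the single feature that most reduces hold-out error.
    errs = {f: holdout_error(selected + [f]) for f in remaining}
    f_best = min(errs, key=errs.get)
    if errs[f_best] >= best_err:
        break  # no improvement on hold-out data; stop searching
    selected.append(f_best)
    remaining.remove(f_best)
    best_err = errs[f_best]

print("selected features:", sorted(selected), "hold-out error:", best_err)
```

As the abstract notes, with many irrelevant features the repeated scoring against the same hold-out data in a search like this is itself a source of overfitting, which is the failure mode the thesis's bound and new algorithm address.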
dc.description.statementofresponsibility | by Andrew Y. Ng. | en_US |
dc.format.extent | 57 p. | en_US |
dc.format.extent | 5878396 bytes | |
dc.format.extent | 5878150 bytes | |
dc.format.mimetype | application/pdf | |
dc.format.mimetype | application/pdf | |
dc.language.iso | eng | en_US |
dc.publisher | Massachusetts Institute of Technology | en_US |
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US |
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | |
dc.subject | Electrical Engineering and Computer Science | en_US |
dc.title | On feature selection : learning with exponentially many irrelevant features as training examples | en_US |
dc.type | Thesis | en_US |
dc.description.degree | S.M. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.identifier.oclc | 42427464 | en_US |