Reliable validation : new perspectives on adaptive data analysis and cross-validation

Elder, Samuel Scott

dc.contributor.advisor	Jonathan Kelner and Tamara Broderick.	en_US
dc.contributor.author	Elder, Samuel Scott	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Mathematics.	en_US
dc.date.accessioned	2019-03-01T19:56:01Z
dc.date.available	2019-03-01T19:56:01Z
dc.date.copyright	2018	en_US
dc.date.issued	2018	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/120660
dc.description	Thesis: Ph. D., Massachusetts Institute of Technology, Department of Mathematics, 2018.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 107-109).	en_US
dc.description.abstract	Validation refers to the challenge of assessing how well a learning algorithm performs after it has been trained on a given data set. It forms an important step in machine learning, as such assessments are then used to compare and choose between algorithms and provide reasonable approximations of their accuracy. In this thesis, we provide new approaches for addressing two common problems with validation. In the first half, we assume a simple validation framework, the holdout set, and address an important question of how many algorithms can be accurately assessed using the same holdout set, in the particular case where these algorithms are chosen adaptively. We do so by first critiquing the initial approaches to building a theory of adaptivity, then offering an alternative approach and preliminary results within this approach, all geared towards characterizing the inherent challenge of adaptivity. In the second half, we address the validation framework itself. Most common practice does not just use a single holdout set, but averages results from several, a family of techniques known as cross-validation. In this work, we offer several new cross-validation techniques with the common theme of utilizing training sets of varying sizes. This culminates in hierarchical cross-validation, a meta-technique for using cross-validation to choose the best cross-validation method.	en_US
dc.description.statementofresponsibility	by Samuel Scott Elder.	en_US
dc.format.extent	109 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Mathematics.	en_US
dc.title	Reliable validation : new perspectives on adaptive data analysis and cross-validation	en_US
dc.title.alternative	New perspectives on adaptive data analysis and cross-validation	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph. D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Mathematics
dc.identifier.oclc	1088419995	en_US

Files in this item

Name: 1088419995-MIT.pdf
Size: 6.434Mb
Format: PDF
Description: Full printable version

This item appears in the following Collection(s)

Doctoral Theses
