Show simple item record

dc.contributor.advisor | Dimitris Bertsimas. | en_US
dc.contributor.author | King, Angela, Ph. D. Massachusetts Institute of Technology | en_US
dc.contributor.other | Massachusetts Institute of Technology. Operations Research Center. | en_US
dc.date.accessioned | 2015-09-17T19:07:16Z
dc.date.available | 2015-09-17T19:07:16Z
dc.date.copyright | 2015 | en_US
dc.date.issued | 2015 | en_US
dc.identifier.uri | http://hdl.handle.net/1721.1/98719
dc.description | Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2015. | en_US
dc.description | Cataloged from PDF version of thesis. | en_US
dc.description | Includes bibliographical references (pages 131-139). | en_US
dc.description.abstract | In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200-billion-factor speedup in solving mixed integer optimization (MIO) problems. The common mindset of MIO as theoretically elegant but practically irrelevant is no longer justified. In this thesis, we propose a methodology for regression modeling that is based on optimization techniques and centered around MIO. In Part I we propose a method to select a subset of variables to include in a linear regression model using continuous and integer optimization. Despite the natural formulation of subset selection as an optimization problem with an ℓ0-norm constraint, current methods for subset selection do not attempt to use integer optimization to select the best subset. We show that, although this problem is non-convex and NP-hard, it can be solved practically for large-scale problems. We numerically demonstrate that our approach outperforms other sparse learning procedures. In Part II of the thesis, we build on Part I, modifying the objective function and adding constraints that produce linear regression models with other desirable properties in addition to sparsity. We develop a unified framework based on MIO that aims to algorithmize the process of building a high-quality linear regression model. This is the only methodology we are aware of that imposes statistical properties on the model simultaneously rather than sequentially. Finally, we turn our attention to logistic regression modeling. The goal of Part III of the thesis is to efficiently solve the mixed integer convex optimization problem of logistic regression with cardinality constraints to provable optimality. We develop a tailored algorithm to solve this challenging problem and demonstrate its speed and performance. We then show how this method can be used within the framework of Part II, thereby also creating an algorithmic approach to fitting high-quality logistic regression models. In each part of the thesis, we illustrate the effectiveness of our proposed approach on both real and synthetic datasets. | en_US
dc.description.statementofresponsibility | by Angela King. | en_US
dc.format.extent | 139 pages | en_US
dc.language.iso | eng | en_US
dc.publisher | Massachusetts Institute of Technology | en_US
dc.rights | M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. | en_US
dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US
dc.subject | Operations Research Center. | en_US
dc.title | Regression under a modern optimization lens | en_US
dc.type | Thesis | en_US
dc.description.degree | Ph. D. | en_US
dc.contributor.department | Massachusetts Institute of Technology. Operations Research Center
dc.contributor.department | Sloan School of Management
dc.identifier.oclc | 920858725 | en_US
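The abstract's Part I concerns best subset selection: minimizing least-squares error subject to an ℓ0-norm constraint on the coefficient vector. As a point of reference only, the sketch below solves that same problem by brute-force enumeration of size-k supports, which is feasible only for a handful of predictors; the thesis's contribution is solving it at scale via mixed integer optimization, which this sketch does not attempt. The function name and synthetic data are illustrative, not from the thesis.

```python
import itertools
import numpy as np

def best_subset(X, y, k):
    """Solve min ||y - X b||^2 subject to ||b||_0 <= k by
    enumerating every support of size k (illustration only;
    tractable solely for very small numbers of predictors)."""
    n, p = X.shape
    best_rss, best_support, best_beta = np.inf, None, None
    for support in itertools.combinations(range(p), k):
        Xs = X[:, support]
        beta, _res, _rank, _sv = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ beta
        rss = float(resid @ resid)  # residual sum of squares for this support
        if rss < best_rss:
            best_rss, best_support, best_beta = rss, support, beta
    return best_support, best_beta, best_rss

# Tiny noiseless example: y depends on columns 0 and 2 only,
# so the k=2 search should recover exactly that support.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
y = 3 * X[:, 0] - 2 * X[:, 2]
support, beta, rss = best_subset(X, y, 2)
```

The same ℓ0 constraint is what an MIO formulation encodes with binary indicator variables and big-M (or SOS) constraints, replacing this exponential enumeration with a branch-and-bound search.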

