Regression under a modern optimization lens

King, Angela, Ph. D. Massachusetts Institute of Technology

dc.contributor.advisor	Dimitris Bertsimas.	en_US
dc.contributor.author	King, Angela, Ph. D. Massachusetts Institute of Technology	en_US
dc.contributor.other	Massachusetts Institute of Technology. Operations Research Center.	en_US
dc.date.accessioned	2015-09-17T19:07:16Z
dc.date.available	2015-09-17T19:07:16Z
dc.date.copyright	2015	en_US
dc.date.issued	2015	en_US
dc.identifier.uri	http://hdl.handle.net/1721.1/98719
dc.description	Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2015.	en_US
dc.description	Cataloged from PDF version of thesis.	en_US
dc.description	Includes bibliographical references (pages 131-139).	en_US
dc.description.abstract	In the last twenty-five years (1990-2014), algorithmic advances in integer optimization combined with hardware improvements have resulted in an astonishing 200 billion factor speedup in solving mixed integer optimization (MIO) problems. The common mindset of MIO as theoretically elegant but practically irrelevant is no longer justified. In this thesis, we propose a methodology for regression modeling that is based on optimization techniques and centered around MIO. In Part I we propose a method to select a subset of variables to include in a linear regression model using continuous and integer optimization. Despite the natural formulation of subset selection as an optimization problem with an lo-norm constraint, current methods for subset selection do not attempt to use integer optimization to select the best subset. We show that, although this problem is non-convex and NP-hard, it can be practically solved for large scale problems. We numerically demonstrate that our approach outperforms other sparse learning procedures. In Part II of the thesis, we build off of Part I to modify the objective function and include constraints that will produce linear regression models with other desirable properties, in addition to sparsity. We develop a unified framework based on MIO which aims to algorithmize the process of building a high-quality linear regression model. This is the only methodology we are aware of to construct models that imposes statistical properties simultaneously rather than sequentially. Finally, we turn our attention to logistic regression modeling. It is the goal of Part III of the thesis to efficiently solve the mixed integer convex optimization problem of logistic regression with cardinality constraints to provable optimality. We develop a tailored algorithm to solve this challenging problem and demonstrate its speed and performance. We then show how this method can be used within the framework of Part II, thereby also creating an algorithmic approach to fitting high-quality logistic regression models. In each part of the thesis, we illustrate the effectiveness of our proposed approach on both real and synthetic datasets.	en_US
dc.description.statementofresponsibility	by Angela King.	en_US
dc.format.extent	139 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Operations Research Center.	en_US
dc.title	Regression under a modern optimization lens	en_US
dc.type	Thesis	en_US
dc.description.degree	Ph. D.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Operations Research Center
dc.contributor.department	Sloan School of Management
dc.identifier.oclc	920858725	en_US

Files in this item

Name:: 920858725-MIT.pdf
Size:: 8.847Mb
Format:: PDF
Description:: Full printable version

View/Open

This item appears in the following Collection(s)

Doctoral Theses

Show simple item record