Show simple item record

dc.contributor.advisor    Robert M. Freund.    en_US
dc.contributor.author    Lu, Haihao, Ph.D. Massachusetts Institute of Technology.    en_US
dc.contributor.other    Massachusetts Institute of Technology. Department of Mathematics.    en_US
dc.date.accessioned    2019-09-19T23:17:07Z
dc.date.available    2019-09-19T23:17:07Z
dc.date.copyright    2019    en_US
dc.date.issued    2019    en_US
dc.identifier.uri    https://hdl.handle.net/1721.1/122272
dc.description    Thesis: Ph. D. in Mathematics and Operations Research, Massachusetts Institute of Technology, Department of Mathematics, 2019    en_US
dc.description    Cataloged from PDF version of thesis.    en_US
dc.description    Includes bibliographical references (pages 203-211).    en_US
dc.description.abstract    In this thesis, we present several contributions to large-scale optimization methods, with applications in data science and machine learning. In the first part, we present new computational methods and associated computational guarantees for solving convex optimization problems using first-order methods. We consider the general convex optimization problem in which we presume knowledge of a strict lower bound on the optimal objective value (as arises, for example, in empirical risk minimization in machine learning). We introduce a new functional measure for the convex objective function, called the growth constant, which measures how quickly the level sets grow relative to the function value and which plays a fundamental role in the complexity analysis. Based on this measure, we present new computational guarantees for both smooth and non-smooth convex optimization that improve on existing guarantees in several ways, most notably when the initial iterate is far from the optimal solution set.    en_US
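To illustrate how knowledge of a lower bound on the optimal value can enter a first-order method, here is a minimal sketch using the classical Polyak step size for subgradient descent. This is a standard textbook scheme, not necessarily the method developed in the thesis, and the names f, subgrad, and f_lb are illustrative assumptions; for an empirical-risk objective built from a nonnegative loss, f_lb = 0 is a valid lower bound.

    import numpy as np

    def polyak_subgradient_descent(f, subgrad, x0, f_lb, iters=1000):
        """Subgradient descent with the classical Polyak step size, which uses a
        known lower bound f_lb on the optimal objective value.
        Illustrative sketch only; not the algorithm analyzed in the thesis."""
        x = np.asarray(x0, dtype=float)
        best_x, best_f = x.copy(), f(x)
        for _ in range(iters):
            g = subgrad(x)
            gg = float(np.dot(g, g))
            if gg == 0.0:                    # zero subgradient: x is optimal
                break
            step = (f(x) - f_lb) / gg        # the known lower bound enters here
            x = x - step * g
            if f(x) < best_f:
                best_x, best_f = x.copy(), f(x)
        return best_x, best_f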
dc.description.abstract    The usual approach to developing and analyzing first-order methods for convex optimization assumes that either the gradient of the objective function is Lipschitz continuous (in the smooth setting) or the objective function itself is Lipschitz continuous. However, in many settings, especially in machine learning applications, the convex function satisfies neither assumption; examples include the Poisson Linear Inverse Model, the D-optimal design problem, and the Support Vector Machine problem. In the second part, we develop notions of relative smoothness, relative continuity, and relative strong convexity, each defined relative to a user-specified "reference function" (which should be computationally tractable for algorithms), and we show that many differentiable convex functions are relatively smooth or relatively continuous with respect to a correspondingly fairly simple reference function.    en_US
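For context, the notion of relative smoothness referred to above is, in the relatively smooth optimization literature, defined with respect to a differentiable convex reference function h via the Bregman divergence; the following display is a sketch of that standard definition rather than a verbatim statement from the thesis:

    f(y) \;\le\; f(x) + \langle \nabla f(x),\, y - x \rangle + L \, D_h(y, x)
    \quad \text{for all } x, y, \qquad \text{where} \quad
    D_h(y, x) := h(y) - h(x) - \langle \nabla h(x),\, y - x \rangle .

Relative strong convexity reverses the inequality with a constant \mu in place of L, and the resulting mirror-descent (Bregman proximal) step minimizes the right-hand-side model at each iteration.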
dc.description.abstract    We extend the mirror descent algorithm to this new setting, with associated computational guarantees. The Gradient Boosting Machine (GBM) introduced by Friedman is an extremely powerful supervised learning algorithm that is widely used in practice -- it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In the third part, we propose the Randomized Gradient Boosting Machine (RGBM) and the Accelerated Gradient Boosting Machine (AGBM). RGBM leads to significant computational gains compared to GBM by using a randomization scheme to reduce the search over the space of weak-learners. AGBM incorporates Nesterov's acceleration techniques into the design of GBM, and it is the first GBM-type algorithm with a theoretically justified accelerated convergence rate. We demonstrate the effectiveness of RGBM and AGBM over GBM in obtaining a model with good training and/or testing data fidelity.    en_US
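The randomization idea described for RGBM, reducing the per-iteration search over weak-learners to a random subset, can be sketched as follows for squared loss with single-feature decision stumps; the learner choice, candidate thresholds, and parameter names here are illustrative assumptions and not the thesis implementation.

    import numpy as np

    def rgbm_regression(X, y, n_rounds=100, subset_size=5, learning_rate=0.1, seed=0):
        """Sketch of a Randomized Gradient Boosting Machine for squared loss.
        Weak-learners are single-feature decision stumps; each round searches only
        a random subset of features (the randomization the abstract describes).
        Illustrative sketch only; not the thesis code."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        pred = np.zeros(n)
        ensemble = []                                # (feature, threshold, left value, right value)
        for _ in range(n_rounds):
            residual = y - pred                      # negative gradient of the squared loss
            features = rng.choice(d, size=min(subset_size, d), replace=False)
            best, best_err = None, np.inf
            for j in features:                       # search only the sampled weak-learners
                for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
                    left = X[:, j] <= t
                    lv = residual[left].mean() if left.any() else 0.0
                    rv = residual[~left].mean() if (~left).any() else 0.0
                    err = np.sum((residual - np.where(left, lv, rv)) ** 2)
                    if err < best_err:
                        best_err, best = err, (j, t, lv, rv)
            j, t, lv, rv = best
            pred += learning_rate * np.where(X[:, j] <= t, lv, rv)
            ensemble.append((j, t, lv, rv))
        return ensemble, pred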
dc.description.statementofresponsibility    by Haihao Lu.    en_US
dc.format.extent    211 pages    en_US
dc.language.iso    eng    en_US
dc.publisher    Massachusetts Institute of Technology    en_US
dc.rights    MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.    en_US
dc.rights.uri    http://dspace.mit.edu/handle/1721.1/7582    en_US
dc.subject    Mathematics.    en_US
dc.title    Large-scale optimization methods for data-science applications    en_US
dc.type    Thesis    en_US
dc.description.degree    Ph. D. in Mathematics and Operations Research    en_US
dc.contributor.department    Massachusetts Institute of Technology. Department of Mathematics    en_US
dc.identifier.oclc    1117775104    en_US
dc.description.collection    Ph.D. in Mathematics and Operations Research, Massachusetts Institute of Technology, Department of Mathematics    en_US
dspace.imported    2019-09-19T23:17:03Z    en_US
mit.thesis.degree    Doctoral    en_US
mit.thesis.department    Math    en_US

