Show simple item record

dc.contributor.advisor    Robert M. Freund.    en_US
dc.contributor.author    Lu, Haihao, Ph.D. Massachusetts Institute of Technology.    en_US
dc.contributor.other    Massachusetts Institute of Technology. Department of Mathematics.    en_US
dc.date.accessioned    2019-09-19T23:17:07Z
dc.date.available    2019-09-19T23:17:07Z
dc.date.copyright    2019    en_US
dc.date.issued    2019    en_US
dc.identifier.uri    https://hdl.handle.net/1721.1/122272
dc.description    Thesis: Ph. D. in Mathematics and Operations Research, Massachusetts Institute of Technology, Department of Mathematics, 2019    en_US
dc.description    Cataloged from PDF version of thesis.    en_US
dc.description    Includes bibliographical references (pages 203-211).    en_US
dc.description.abstract    In this thesis, we present several contributions to large-scale optimization methods, with applications in data science and machine learning. In the first part, we present new computational methods and associated computational guarantees for solving convex optimization problems using first-order methods. We consider the general convex optimization problem in which we presume knowledge of a strict lower bound on the optimal objective value (as arises, for example, in empirical risk minimization in machine learning). We introduce a new functional measure for the convex objective function, called the growth constant, which measures how quickly the level sets grow relative to the function value and which plays a fundamental role in the complexity analysis. Based on this measure, we present new computational guarantees for both smooth and non-smooth convex optimization that improve on existing guarantees in several ways, most notably when the initial iterate is far from the optimal solution set.    en_US
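To illustrate how knowledge of a lower bound on the optimal value can enter a first-order method, here is a minimal sketch using the classical Polyak step size for subgradient descent. This is a standard textbook scheme, not necessarily the method developed in the thesis, and the names f, subgrad, and f_lb are illustrative assumptions; for an empirical-risk objective built from a nonnegative loss, f_lb = 0 is a valid lower bound.

    import numpy as np

    def polyak_subgradient_descent(f, subgrad, x0, f_lb, iters=1000):
        """Subgradient descent with the classical Polyak step size, which uses a
        known lower bound f_lb on the optimal objective value.
        Illustrative sketch only; not the algorithm analyzed in the thesis."""
        x = np.asarray(x0, dtype=float)
        best_x, best_f = x.copy(), f(x)
        for _ in range(iters):
            g = subgrad(x)
            gg = float(np.dot(g, g))
            if gg == 0.0:                    # zero subgradient: x is optimal
                break
            step = (f(x) - f_lb) / gg        # the known lower bound enters here
            x = x - step * g
            if f(x) < best_f:
                best_x, best_f = x.copy(), f(x)
        return best_x, best_f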
dc.description.abstract    The usual approach to developing and analyzing first-order methods for convex optimization assumes that either the gradient of the objective function is Lipschitz continuous (in the smooth setting) or the objective function itself is Lipschitz continuous. However, in many settings, especially in machine learning applications, the convex function satisfies neither assumption; examples include the Poisson Linear Inverse Model, the D-optimal design problem, and the Support Vector Machine problem. In the second part, we develop notions of relative smoothness, relative continuity, and relative strong convexity, each defined relative to a user-specified "reference function" (which should be computationally tractable for algorithms), and we show that many differentiable convex functions are relatively smooth or relatively continuous with respect to a correspondingly fairly simple reference function.    en_US
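For context, the notion of relative smoothness referred to above is, in the relatively smooth optimization literature, defined with respect to a differentiable convex reference function h via the Bregman divergence; the following display is a sketch of that standard definition rather than a verbatim statement from the thesis:

    f(y) \;\le\; f(x) + \langle \nabla f(x),\, y - x \rangle + L \, D_h(y, x)
    \quad \text{for all } x, y, \qquad \text{where} \quad
    D_h(y, x) := h(y) - h(x) - \langle \nabla h(x),\, y - x \rangle .

Relative strong convexity reverses the inequality with a constant \mu in place of L, and the resulting mirror-descent (Bregman proximal) step minimizes the right-hand-side model at each iteration.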
dc.description.abstract    We extend the mirror descent algorithm to this new setting, with associated computational guarantees. The Gradient Boosting Machine (GBM) introduced by Friedman is an extremely powerful supervised learning algorithm that is widely used in practice -- it routinely features as a leading algorithm in machine learning competitions such as Kaggle and the KDDCup. In the third part, we propose the Randomized Gradient Boosting Machine (RGBM) and the Accelerated Gradient Boosting Machine (AGBM). RGBM leads to significant computational gains compared to GBM by using a randomization scheme to reduce the search over the space of weak-learners. AGBM incorporates Nesterov's acceleration techniques into the design of GBM, and it is the first GBM-type algorithm with a theoretically justified accelerated convergence rate. We demonstrate the effectiveness of RGBM and AGBM over GBM in obtaining a model with good training and/or testing data fidelity.    en_US
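The randomization idea described for RGBM, reducing the per-iteration search over weak-learners to a random subset, can be sketched as follows for squared loss with single-feature decision stumps; the learner choice, candidate thresholds, and parameter names here are illustrative assumptions and not the thesis implementation.

    import numpy as np

    def rgbm_regression(X, y, n_rounds=100, subset_size=5, learning_rate=0.1, seed=0):
        """Sketch of a Randomized Gradient Boosting Machine for squared loss.
        Weak-learners are single-feature decision stumps; each round searches only
        a random subset of features (the randomization the abstract describes).
        Illustrative sketch only; not the thesis code."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        pred = np.zeros(n)
        ensemble = []                                # (feature, threshold, left value, right value)
        for _ in range(n_rounds):
            residual = y - pred                      # negative gradient of the squared loss
            features = rng.choice(d, size=min(subset_size, d), replace=False)
            best, best_err = None, np.inf
            for j in features:                       # search only the sampled weak-learners
                for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
                    left = X[:, j] <= t
                    lv = residual[left].mean() if left.any() else 0.0
                    rv = residual[~left].mean() if (~left).any() else 0.0
                    err = np.sum((residual - np.where(left, lv, rv)) ** 2)
                    if err < best_err:
                        best_err, best = err, (j, t, lv, rv)
            j, t, lv, rv = best
            pred += learning_rate * np.where(X[:, j] <= t, lv, rv)
            ensemble.append((j, t, lv, rv))
        return ensemble, pred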
dc.description.statementofresponsibility    by Haihao Lu.    en_US
dc.format.extent    211 pages    en_US
dc.language.iso    eng    en_US
dc.publisher    Massachusetts Institute of Technology    en_US
dc.rights    MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.    en_US
dc.rights.uri    http://dspace.mit.edu/handle/1721.1/7582    en_US
dc.subject    Mathematics.    en_US
dc.title    Large-scale optimization methods for data-science applications    en_US
dc.type    Thesis    en_US
dc.description.degree    Ph. D. in Mathematics and Operations Research    en_US
dc.contributor.department    Massachusetts Institute of Technology. Department of Mathematics    en_US
dc.identifier.oclc    1117775104    en_US
dc.description.collection    Ph.D. in Mathematics and Operations Research, Massachusetts Institute of Technology, Department of Mathematics    en_US
dspace.imported    2019-09-19T23:17:03Z    en_US
mit.thesis.degree    Doctoral    en_US
mit.thesis.department    Math    en_US

