Show simple item record

dc.contributor.advisorRahul Mazumder.en_US
dc.contributor.authorDedieu, Antoineen_US
dc.contributor.otherMassachusetts Institute of Technology. Operations Research Center.en_US
dc.date.accessioned2018-11-28T15:44:33Z
dc.date.available2018-11-28T15:44:33Z
dc.date.copyright2018en_US
dc.date.issued2018en_US
dc.identifier.urihttp://hdl.handle.net/1721.1/119354
dc.descriptionThesis: S.M., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, 2018.en_US
dc.descriptionCataloged from PDF version of thesis.en_US
dc.descriptionIncludes bibliographical references (pages 101-109).en_US
dc.description.abstractIn this thesis, we study the computational and statistical aspects of several sparse models when the number of samples and/or features is large. We propose new statistical estimators and build new computational algorithms - borrowing tools and techniques from areas of convex and discrete optimization. First, we explore an Lq-regularized version of the Best Subset selection procedure which mitigates the poor statistical performance of the best-subsets estimator in the low SNR regimes. The statistical and empirical properties of the estimator are explored, especially when compared to best-subsets selection, Lasso and Ridge. Second, we propose new computational algorithms for a family of penalized linear Support Vector Machine (SVM) problem with a hinge loss function and sparsity-inducing regularizations. Our methods bring together techniques from Column (and Constraint) Generation and modern First Order methods for non-smooth convex optimization. These two components complement each others' strengths, leading to improvements of 2 orders of magnitude when compared to commercial LP solvers. Third, we present a novel framework inspired by Hierarchical Bayesian modeling to predict user session-length on on-line streaming services. The time spent by a user on a platform depends upon user-specific latent variables which are learned via hierarchical shrinkage. Our framework incorporates flexible parametric/nonparametric models on the covariates and outperforms state-of- the-art estimators in terms of efficiency and predictive performance on real world datasets from the internet radio company Pandora Media Inc.en_US
dc.description.statementofresponsibilityby Antoine Dedieu.en_US
dc.format.extent121 pagesen_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsMIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectOperations Research Center.en_US
dc.titleSparse learning : statistical and optimization perspectivesen_US
dc.typeThesisen_US
dc.description.degreeS.M.en_US
dc.contributor.departmentMassachusetts Institute of Technology. Operations Research Center
dc.contributor.departmentSloan School of Management
dc.identifier.oclc1065541961en_US


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record