DSpace@MIT


New optimization approaches to matrix factorization problems with connections to natural language processing

Author(s)
Berk, Lauren Elizabeth.
Download: 1191900766-MIT.pdf (5.016 MB)
Other Contributors
Massachusetts Institute of Technology. Operations Research Center.
Advisor
Dimitris Bertsimas.
Terms of use
MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582
Abstract
In this thesis, we propose novel formulations and optimization methods for four matrix factorization problems, which we study in depth: sparse principal component analysis, compressed sensing, discrete component analysis, and latent Dirichlet allocation. For each new formulation, we develop efficient solution algorithms using discrete and robust optimization, and demonstrate tractability and effectiveness in computational experiments. In Chapter 1, we develop a framework for matrix factorization problems and provide a technical introduction to topic modeling with examples. Chapter 2, Certifiably optimal sparse principal component analysis, addresses the sparse principal component analysis (SPCA) problem. We propose a tailored branch-and-bound algorithm, Optimal-SPCA, that enables us to solve SPCA to certifiable optimality.
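As a point of reference, SPCA is commonly posed as maximizing x'Σx subject to ||x||_2 = 1 and at most k nonzero entries in x. The sketch below is not the Optimal-SPCA branch-and-bound algorithm from the thesis; it only illustrates the problem being solved by enumerating every support of size k on a tiny instance. The names `top_eigenpair` and `spca_bruteforce`, the toy matrix, and the use of power iteration (which assumes a positive semidefinite covariance matrix) are all illustrative assumptions.

```python
from itertools import combinations
import math
import random


def top_eigenpair(mat, iters=500, seed=0):
    """Approximate the largest eigenpair of a symmetric PSD matrix
    (list of lists) by power iteration. Illustrative stand-in only."""
    m = len(mat)
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.0) for _ in range(m)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iters):
        w = [sum(mat[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:  # mat maps v to zero; keep the current unit vector
            break
        v = [c / norm for c in w]
    # Rayleigh quotient of the unit vector v
    value = sum(v[i] * mat[i][j] * v[j] for i in range(m) for j in range(m))
    return value, v


def spca_bruteforce(sigma, k):
    """Enumerate all supports of size k and keep the one whose principal
    submatrix has the largest top eigenvalue. Exponential in n: toy use only."""
    n = len(sigma)
    best_val, best_x = -math.inf, None
    for support in combinations(range(n), k):
        sub = [[sigma[i][j] for j in support] for i in support]
        val, vec = top_eigenpair(sub)
        if val > best_val:
            best_val = val
            best_x = [0.0] * n
            for pos, i in enumerate(support):
                best_x[i] = vec[pos]
    return best_val, best_x


# Toy covariance matrix (hypothetical data, for illustration only)
sigma = [[2.0, 1.0, 0.0],
         [1.0, 2.0, 0.0],
         [0.0, 0.0, 2.5]]
val, x = spca_bruteforce(sigma, k=2)
```

Enumerating supports of size exactly k suffices here because the top eigenvalue of a principal submatrix cannot decrease when the support grows. The number of supports grows combinatorially with the dimension, which is the difficulty an exact method with pruning is meant to address.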
 
We apply our methods to real data sets to demonstrate that our approach scales well and provides superior solutions compared to existing methods, explaining a higher proportion of variance and permitting more control over the desired sparsity. Chapter 3, Optimal compressed sensing in submodular settings, presents a novel algorithm for compressed sensing that guarantees optimality under submodularity conditions rather than restricted isometry property (RIP) conditions. The algorithm defines submodularity properties of the loss function, derives lower bounds, and generates these lower bounds as constraints for use in a cutting-plane algorithm. The chapter also develops a local search heuristic based on this exact algorithm. Chapter 4, Robust topic modeling, develops a new form of topic modeling inspired by robust optimization and by discrete component analysis.
 
The new approach builds uncertainty sets using one-sided constraints and two hypothesis tests, and uses alternating optimization and projected gradient methods, including Adam and mirror descent, to find good local optima. In computational experiments, we demonstrate that these models are better able to avoid over-fitting than LDA and PLSA, and result in more accurate reconstruction of the underlying topic matrices. In Chapter 5, we develop modifications to latent Dirichlet allocation to account for differences in the distribution of topics by author. The chapter adds author-specific topic priors to the generative process and allows for co-authorship, providing the model with increased degrees of freedom and enabling it to model a broader set of problems. Julia code for the algorithms developed in each chapter is freely available on GitHub (https://github.com/lauren897).
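The abstract mentions mirror descent among the gradient methods used for topic modeling, where rows of the factor matrices are probability distributions. A minimal sketch of mirror descent with the entropy mirror map (often called exponentiated gradient) on the probability simplex is shown below. It is not the robust topic model from the thesis: the quadratic toy objective, the function name `mirror_descent_simplex`, the step size, and the iteration count are assumptions chosen only to show how each update keeps the iterate a valid distribution.

```python
import math


def mirror_descent_simplex(grad, w0, step=0.5, iters=1000):
    """Entropic mirror descent: multiplicative update followed by
    renormalization, so every iterate stays on the probability simplex."""
    w = list(w0)
    for _ in range(iters):
        g = grad(w)
        # Multiplicative step in the direction of the negative gradient
        w = [wi * math.exp(-step * gi) for wi, gi in zip(w, g)]
        total = sum(w)
        w = [wi / total for wi in w]  # renormalize to sum to one
    return w


# Toy objective (hypothetical): f(w) = 0.5 * ||w - target||^2,
# whose minimizer over the simplex is the target distribution itself.
target = [0.7, 0.2, 0.1]


def grad(w):
    return [wi - ti for wi, ti in zip(w, target)]


w = mirror_descent_simplex(grad, [1.0 / 3.0] * 3)
```

Compared with a Euclidean projected gradient step, the multiplicative update needs no explicit projection onto the simplex, which is one reason it is a natural fit for probability-valued parameters.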
 
Description
Thesis: Ph. D., Massachusetts Institute of Technology, Sloan School of Management, Operations Research Center, May, 2020
 
Cataloged from the official PDF of thesis.
 
Includes bibliographical references (pages 245-260).
 
Date issued
2020
URI
https://hdl.handle.net/1721.1/127291
Department
Massachusetts Institute of Technology. Operations Research Center
Publisher
Massachusetts Institute of Technology
Keywords
Operations Research Center.

Collections
  • Management - Ph.D. / Sc.D.
  • Operations Research - Ph.D. / Sc.D.

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.