Sparsity in Machine Learning: Theory and Applications

Author(s)

Lahlou Kitane, Driss

Advisor

Bertsimas, Dimitris

Terms of use

In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Sparsity plays a key role in machine learning for several reasons, including interpretability. Interpretability is sought by both practitioners and scientists. On one hand, interpretability can be essential in practice, as in healthcare, where black-box models cannot be used to prescribe a treatment for a patient. On the other hand, interpretability is essential to understanding phenomena that are modeled using machine learning, such as plasma electromagnetic emissions. Besides interpretability, sparsity has several other important applications, such as improving the predictive power of models and reducing operational and investment costs. Integer optimization is a highly effective tool for designing methods that tackle sparsity. It offers a rigorous framework for building sparse models and has been shown to provide more accurate and sparser models than other approaches, including those that use sparsity-inducing regularization norms. This thesis focuses on the application of integer optimization to sparsity problems. We provide two applications of sparse modeling. The first applies Mixed Integer Optimization (MIO) sparse regression to Laser Induced Breakdown Spectroscopy (LIBS), a modern and important chemical analysis technique. We build a methodology for sparse and robust models in chemometrics and test it on various types of mineral ore. The MIO approach beats experts’ predictions while offering remarkably sparser models than LASSO. As the R² achieved is higher than 0.99 in some cases, this application is, to the best of our knowledge, the first to bring empirical evidence that a true support exists in nature, whereas the optimization community has questioned the existence of such a concept in real-life applications.

The second application is related to COVID-19 testing and sparse classification. We propose a fast and simple method for the detection of SARS-CoV-2 based on spectroscopy. This novel method builds on machine learning capabilities to deliver a diagnosis in under a minute, without the use of any reagent, achieving a precision close to that of PCR. Sparse methods enable the detection of specific characteristics in the 3D structure of SARS-CoV-2 RNA and proteins. Given the importance of PCA in our research and in machine learning in general, we also provide a new approach to the sparse PCA problem. This approach is the first to generate several sparse principal components in one step, whereas existing techniques rely on deflation to generate principal components iteratively. The proposed method (GeoSPCA) generates high-quality solutions that improve the variance explained by deflation techniques by more than an order of magnitude.

Date issued

2022-02

URI

https://hdl.handle.net/1721.1/143157

Department

Massachusetts Institute of Technology. Operations Research Center

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses