Dimensionality reduction for sparse and structured matrices
Author(s): Musco, Christopher Paul
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor: Martin C. Rinard and Jonathan A. Kelner.
Dimensionality reduction has become a critical tool for quickly solving massive matrix problems. Especially in modern data analysis and machine learning applications, an overabundance of data features or examples can make it impossible to apply standard algorithms efficiently. To address this issue, it is often possible to distill data to a much smaller set of informative features or examples, which can be used to obtain provably accurate approximate solutions to a variety of problems. In this thesis, we focus on the important case of dimensionality reduction for sparse and structured data. In contrast to popular structure-agnostic methods like Johnson-Lindenstrauss projection and PCA, we seek data compression techniques that take advantage of structure to generate smaller or more powerful compressions. Additionally, we aim for methods that can be applied extremely quickly, typically in linear or nearly linear time in the input size. Specifically, we introduce new randomized algorithms for structured dimensionality reduction that are based on importance sampling and sparse-recovery techniques. Our work applies directly to accelerating linear regression and graph sparsification, and we discuss connections and possible extensions to low-rank approximation, k-means clustering, and several other ubiquitous matrix problems.
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. Cataloged from PDF version of thesis. Includes bibliographical references (pages 97-103).
Department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Massachusetts Institute of Technology
Electrical Engineering and Computer Science.