Building efficient algorithms by learning to compress

Blalock, Davis W. (Davis Whitaker)

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

John V. Guttag.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

The amount of data in the world is doubling every two years. Such abundant data offers immense opportunities, but also imposes immense computation, storage, and energy costs. This thesis introduces efficient algorithms for reducing these costs for bottlenecks in real-world data analysis and machine learning pipelines. Concretely, we introduce algorithms for:

-- Lossless compression of time series. This algorithm compresses better than any existing method, despite requiring only the resources available on a low-power edge device.
-- Approximate matrix-vector multiplies. This algorithm accelerates approximate similarity scans by an order of magnitude relative to existing methods.
-- Approximate matrix-matrix multiplies. This algorithm often outperforms existing approximation methods by more than 10x and non-approximate computation by more than 100x.

We provide extensive empirical analyses of all three algorithms using real-world datasets and realistic workloads. We also prove bounds on the errors introduced by the two approximation algorithms.

The theme unifying all of these contributions is learned compression. While compression is typically thought of only as a means to reduce data size, we show that specially designed compression schemes can also dramatically increase computation speed and reduce memory requirements.

Description

Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020

Cataloged from student-submitted PDF of thesis.

Includes bibliographical references (pages 137-152).

Date issued

2020

URI

https://hdl.handle.net/1721.1/129244

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Doctoral Theses