On exploiting structures for deep learning algorithms with matrix estimation

Yang, Yuzhe, S.M. Massachusetts Institute of Technology.

Other Contributors

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.

Advisor

Dina Katabi.

Terms of use

MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. http://dspace.mit.edu/handle/1721.1/7582

Abstract

Despite recent breakthroughs in deep learning, the intrinsic structures within tasks have not yet been fully explored and exploited for better performance. This thesis proposes to harness the structured properties of deep learning tasks using matrix estimation (ME). Motivated by its theoretical guarantees and appealing results, we apply ME to study the following two important learning problems:

1. Adversarial robustness. Deep neural networks are vulnerable to adversarial attacks. This thesis proposes ME-Net, a defense method that leverages ME. In ME-Net, images are preprocessed in two steps: first, pixels are randomly dropped from the image; then, the image is reconstructed using ME. We show that this process destroys the adversarial structure of the noise while reinforcing the global structure of the original image. Comparing ME-Net with state-of-the-art defense mechanisms shows that ME-Net consistently outperforms prior techniques, improving robustness against both black-box and white-box attacks.

2. Value-based planning and deep reinforcement learning (RL). This thesis proposes to exploit the underlying low-rank structure of the state-action value function, i.e., the Q function. We verify empirically the existence of low-rank Q functions in the context of control and deep RL tasks. As our key contribution, we leverage ME to propose a generic framework that exploits the underlying low-rank structure in Q functions. This leads to a more efficient planning procedure for classical control and, additionally, to a simple scheme that can be applied to any value-based RL technique to consistently achieve better performance on "low-rank" tasks.

The results of this thesis demonstrate the value of using matrix estimation to capture the internal structures of deep learning tasks, and highlight the benefits of leveraging structure for analyzing and improving modern learning algorithms.
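The two-step ME-Net preprocessing described in the abstract (randomly drop pixels, then reconstruct with ME) can be illustrated with a minimal sketch. The abstract does not say which ME algorithm the thesis uses, so this sketch substitutes a simple stand-in: rescale the zero-filled image by the observed fraction and keep its top singular components. The function names, the `keep_prob` and `rank` parameters, and the single-channel image with values in [0, 1] are all assumptions made for illustration, not the thesis's implementation.

```python
import numpy as np


def drop_pixels(image, keep_prob, rng):
    """Step 1: keep each pixel independently with probability keep_prob."""
    mask = rng.random(image.shape) < keep_prob
    return image * mask, mask


def reconstruct_low_rank(observed, mask, rank):
    """Step 2 (stand-in ME method, an assumption): truncated SVD.

    Dividing the zero-filled matrix by the observed fraction makes it an
    unbiased estimate of the full image; truncating to the top `rank`
    singular components then keeps the dominant global structure.
    """
    p_hat = max(mask.mean(), 1e-8)  # empirical fraction of observed pixels
    u, s, vt = np.linalg.svd(observed / p_hat, full_matrices=False)
    approx = (u[:, :rank] * s[:rank]) @ vt[:rank]
    return np.clip(approx, 0.0, 1.0)  # keep values in the valid pixel range


def me_preprocess(image, keep_prob=0.5, rank=8, seed=0):
    """Mask a 2-D image at random, then reconstruct it (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    observed, mask = drop_pixels(image, keep_prob, rng)
    return reconstruct_low_rank(observed, mask, rank)
```

For a color image, one option under the same assumptions would be to apply `me_preprocess` to each channel separately. The reconstruction keeps only low-rank global structure, which is the property the abstract credits with suppressing adversarial noise.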

Description

Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020

Cataloged from the official PDF of thesis.

Includes bibliographical references (pages 113-118).

Date issued

2020

URI

https://hdl.handle.net/1721.1/127319

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Keywords

Electrical Engineering and Computer Science.

Collections

Graduate Theses