On exploiting structures for deep learning algorithms with matrix estimation

Yang, Yuzhe, S.M. Massachusetts Institute of Technology.

dc.contributor.advisor	Dina Katabi.	en_US
dc.contributor.author	Yang, Yuzhe, S.M. Massachusetts Institute of Technology.	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2020-09-15T21:51:58Z
dc.date.available	2020-09-15T21:51:58Z
dc.date.copyright	2020	en_US
dc.date.issued	2020	en_US
dc.identifier.uri	https://hdl.handle.net/1721.1/127319
dc.description	Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, May, 2020	en_US
dc.description	Cataloged from the official PDF of thesis.	en_US
dc.description	Includes bibliographical references (pages 113-118).	en_US
dc.description.abstract	Despite recent breakthroughs in deep learning, the intrinsic structures within tasks have not yet been fully explored and exploited for better performance. This thesis proposes to harness the structured properties of deep learning tasks using matrix estimation (ME). Motivated by its theoretical guarantees and appealing results, we apply ME to study the following two important learning problems: 1. Adversarial robustness. Deep neural networks are vulnerable to adversarial attacks. This thesis proposes ME-Net, a defense method that leverages ME. In ME-Net, images are preprocessed in two steps: first, pixels are randomly dropped from the image; then the image is reconstructed using ME. We show that this process destroys the adversarial structure of the noise while reinforcing the global structure of the original image. Comparing ME-Net with state-of-the-art defense mechanisms shows that ME-Net consistently outperforms prior techniques, improving robustness against both black-box and white-box attacks. 2. Value-based planning and deep reinforcement learning (RL). This thesis proposes to exploit the underlying low-rank structures of the state-action value function, i.e., the Q function. We verify empirically the existence of low-rank Q functions in the context of control and deep RL tasks. As our key contribution, by leveraging ME, we propose a generic framework to exploit the underlying low-rank structure in Q functions. This leads to a more efficient planning procedure for classical control and, additionally, a simple scheme that can be applied to any value-based RL technique to consistently achieve better performance on "low-rank" tasks. The results of this thesis demonstrate the value of using matrix estimation to capture the internal structures of deep learning tasks, and highlight the benefits of leveraging structure for analyzing and improving modern learning algorithms.	en_US
dc.description.statementofresponsibility	by Yuzhe Yang.	en_US
dc.format.extent	118 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	On exploiting structures for deep learning algorithms with matrix estimation	en_US
dc.type	Thesis	en_US
dc.description.degree	S.M.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.identifier.oclc	1192299701	en_US
dc.description.collection	S.M. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science	en_US
dspace.imported	2020-09-15T21:51:57Z	en_US
mit.thesis.degree	Master	en_US
mit.thesis.department	EECS	en_US

Files in this item

Name: 1192299701-MIT.pdf
Size: 16.86Mb
Format: PDF

This item appears in the following Collection(s)

Graduate Theses
