
dc.contributor.advisor: Jadbabaie, Ali
dc.contributor.advisor: Azizan, Navid
dc.contributor.author: Sun, Haoyuan
dc.date.accessioned: 2023-07-31T19:41:44Z
dc.date.available: 2023-07-31T19:41:44Z
dc.date.issued: 2023-06
dc.date.submitted: 2023-07-13T14:30:02.345Z
dc.identifier.uri: https://hdl.handle.net/1721.1/151464
dc.description.abstract: Inspired by the remarkable performance of deep neural networks, understanding the generalization performance of overparameterized models, and the effect of optimization algorithms on it, has become an increasingly popular question. In particular, there has been substantial effort to characterize the solutions preferred by optimization algorithms such as gradient descent (GD), a phenomenon referred to as implicit regularization. For instance, it has been argued that GD tends to induce an implicit $\ell_2$-norm regularization in regression and classification problems. Despite significant progress in this space, the known implicit bias of a given algorithm is either specific to a particular geometry or only holds for a particular class of learning problems, and there is no general approach for controlling the implicit regularization. To this end, we present a unified approach via mirror descent (MD), an important generalization of GD, to control implicit regularization in both regression and classification settings. We show that MD with a general class of homogeneous potential functions converges in direction to a generalized maximum-margin solution for linear classification problems, thereby answering an open question in the classification setting. Additionally, we show that under suitable conditions, MD can be efficiently implemented with minimal overhead compared to GD and enjoys fast convergence to the maximum-margin solution induced by its implicit bias. Using comprehensive experiments with both linear and deep neural network models, we demonstrate that MD is a versatile method for producing learned models with different regularizers, which in turn lead to different generalization performance.
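The mirror descent update referenced in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation; the entrywise potential $\psi(w) = \frac{1}{q}\sum_i |w_i|^q$, the step size, and the toy regression problem are all assumptions chosen for illustration. MD takes a gradient step in the dual (mirror) space $z = \nabla\psi(w)$ and maps back via the inverse map $\nabla\psi^*$; with $q = 2$ the mirror map is the identity and the update reduces to plain GD.

```python
import numpy as np

def mirror_descent(X, y, q=3.0, lr=0.01, steps=20000):
    """Mirror descent on squared loss for linear regression y = Xw,
    with entrywise potential psi(w) = (1/q) * sum_i |w_i|^q (assumes q > 1)."""
    n, d = X.shape
    w = np.zeros(d)
    z = np.zeros(d)  # dual variable, z = grad psi(w); zero at w = 0
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / n          # gradient of the squared loss
        z -= lr * grad                         # gradient step in dual space
        # map back to primal: invert z_i = sign(w_i) * |w_i|^(q-1)
        w = np.sign(z) * np.abs(z) ** (1.0 / (q - 1.0))
    return w
```

For an underdetermined problem initialized at zero, MD of this form is known to converge to the interpolating solution that minimizes the potential, which is precisely how the choice of $\psi$ controls the implicit regularization; with $q = 2$ the sketch recovers the minimum-$\ell_2$-norm solution that GD finds.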
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: A Unified Approach to Controlling Implicit Regularization Using Mirror Descent
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid: https://orcid.org/0000-0002-6203-0198
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Electrical Engineering and Computer Science

