DSpace@MIT

Theory of Deep Learning IIb: Optimization Properties of SGD

Author(s)
Zhang, Chiyuan; Liao, Qianli; Rakhlin, Alexander; Miranda, Brando; Golowich, Noah; Poggio, Tomaso
Download: CBMM-Memo-072.pdf (3.66 MB)
Abstract
In Theory IIb we characterize, with a mix of theory and experiments, the optimization of deep convolutional networks by stochastic gradient descent (SGD). The main new result in this paper is theoretical and experimental evidence for the following conjecture about SGD: SGD concentrates in probability, like the classical Langevin equation, on large-volume, "flat" minima, selecting flat minimizers that are, with very high probability, also global minimizers.
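The conjecture lends itself to a small numerical illustration. The sketch below is not from the memo: the toy loss, the helper names (loss, grad, run_langevin), and all parameter values are assumptions chosen for illustration. It runs discretized Langevin dynamics, the classical analogue the abstract compares SGD to, on a one-dimensional loss with two global minima of equal depth, one sharp and one flat. Because the stationary Gibbs density exp(-loss/T) weights each basin by its volume, most chains should settle in the flat basin.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D loss with two global minima of equal depth (both attain 0):
# a sharp minimum at x = -1 (curvature 100) and a flat, large-volume
# minimum at x = +1 (curvature 1). All values are illustrative assumptions.
def loss(x):
    return np.minimum(50.0 * (x + 1.0) ** 2, 0.5 * (x - 1.0) ** 2)

def grad(x):
    sharp = 50.0 * (x + 1.0) ** 2
    flat = 0.5 * (x - 1.0) ** 2
    # Gradient of the active (smaller) branch of the piecewise loss.
    return np.where(sharp < flat, 100.0 * (x + 1.0), x - 1.0)

def run_langevin(n_chains=500, steps=100_000, lr=1e-3, temperature=0.4):
    # Euler-Maruyama discretization of Langevin dynamics: a gradient step
    # plus Gaussian noise whose variance is set by the temperature.
    x = rng.uniform(-2.0, 2.0, size=n_chains)
    for _ in range(steps):
        noise = np.sqrt(2.0 * lr * temperature) * rng.standard_normal(n_chains)
        x = x - lr * grad(x) + noise
    return x

x = run_langevin()
# Classify each chain by which basin its final iterate sits in.
in_flat = 0.5 * (x - 1.0) ** 2 < 50.0 * (x + 1.0) ** 2
print(f"fraction of chains ending in the flat basin: {in_flat.mean():.2f}")
```

With these parameters the Gibbs ratio predicts roughly ninety percent of the chains finishing in the flat basin, mirroring the abstract's claim that noise-driven dynamics concentrate in probability on large-volume minimizers even when a sharp minimum is equally deep.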
Date issued
2017-12-27
URI
http://hdl.handle.net/1721.1/115407
Publisher
Center for Brains, Minds and Machines (CBMM)
Series/Report no.
CBMM Memo Series; 072

Collections
  • CBMM Memo Series
