The lingering of gradients: How to reuse gradients over time
Author(s)
Allen-Zhu, Zeyuan; Simchi-Levi, David; Wang, Xinshang
Publisher Policy
Terms of use
Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.
Abstract
© 2018 Curran Associates Inc. All rights reserved. Classically, the time complexity of a first-order method is estimated by its number of gradient computations. In this paper, we study a more refined complexity by taking into account the “lingering” of gradients: once a gradient is computed at x_k, the additional time to compute gradients at x_{k+1}, x_{k+2}, . . . may be reduced. We show how this improves the running time of gradient descent and SVRG. For instance, if the “additional time” scales linearly with respect to the traveled distance, then the “convergence rate” of gradient descent can be improved from 1/T to exp(−T^{1/3}). On the empirical side, we solve a hypothetical revenue management problem on the Yahoo! Front Page Today Module application with 4.6m users to 10^{-6} error (or 10^{-12} dual error) using 6 passes of the dataset.
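To make the "lingering" idea concrete, the sketch below (Python) shows plain gradient descent on a finite-sum objective that reuses each component gradient for as long as the iterate stays within that component's lingering radius. This is only an illustration of the abstract's premise, not the authors' algorithm (which also covers SVRG and comes with formal guarantees); the oracle names grad_i and radius_i, and all parameter choices, are hypothetical.

import numpy as np

def lingering_gd(grad_i, radius_i, x0, n, lr=0.1, steps=100):
    # Illustrative sketch only: gradient descent on f(x) = (1/n) * sum_i f_i(x),
    # reusing the i-th component gradient while the iterate stays within that
    # component's "lingering" radius of the point where it was last computed.
    x = x0.astype(float).copy()
    d = x.size
    g = np.zeros((n, d))          # cached component gradients
    at = np.tile(x, (n, 1))       # points where each cache entry was computed
    r = np.full(n, -1.0)          # lingering radii; -1 forces a fresh computation first
    fresh = 0                     # number of component gradients actually recomputed
    for _ in range(steps):
        for i in range(n):
            if np.linalg.norm(x - at[i]) > r[i]:  # cache expired: recompute
                g[i] = grad_i(i, x)               # assumed oracle: gradient of f_i at x
                at[i] = x
                r[i] = radius_i(i, x)             # assumed oracle: reuse radius at x
                fresh += 1
        x = x - lr * g.mean(axis=0)  # full-gradient step built from cached entries
    return x, fresh

Here fresh plays the role of the refined complexity measure described in the abstract: it counts the component gradients actually recomputed, rather than charging one gradient computation per component per step.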
Date issued
2018-12
Department
Massachusetts Institute of Technology. Department of Civil and Environmental Engineering; Massachusetts Institute of Technology. Laboratory for Information and Decision Systems; Massachusetts Institute of Technology. Institute for Data, Systems, and Society; Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Journal
Advances in Neural Information Processing Systems
Publisher
Morgan Kaufmann Publishers
Citation
Allen-Zhu, Zeyuan, Simchi-Levi, David and Wang, Xinshang. 2018. "The lingering of gradients: How to reuse gradients over time." Advances in Neural Information Processing Systems, 2018-December.
Version: Final published version
ISSN
1049-5258