Near-optimal no-regret algorithms for zero-sum games

Daskalakis, Constantinos; Deckelbaum, Alan T.; Kim, Anthony

dc.contributor.author	Daskalakis, Constantinos
dc.contributor.author	Deckelbaum, Alan T.
dc.contributor.author	Kim, Anthony
dc.date.accessioned	2012-09-21T15:32:49Z
dc.date.available	2012-09-21T15:32:49Z
dc.date.issued	2011-01
dc.identifier.issn	1071-9040
dc.identifier.uri	http://hdl.handle.net/1721.1/73097
dc.description.abstract	We propose a new no-regret learning algorithm. When used against an adversary, our algorithm achieves average regret that scales as O(1/√T) with the number T of rounds. This regret bound is optimal but not rare, as there are a multitude of learning algorithms with this regret guarantee. However, when our algorithm is used by both players of a zero-sum game, their average regret scales as O(ln T/T), guaranteeing a near-linear rate of convergence to the value of the game. This represents an almost-quadratic improvement on the rate of convergence to the value of a game known to be achieved by any no-regret learning algorithm, and is essentially optimal as we show a lower bound of Ω(1/T). Moreover, the dynamics produced by our algorithm in the game setting are strongly uncoupled in that each player is oblivious to the payoff matrix of the game and the number of strategies of the other player, has limited private storage, and is not allowed funny bit arithmetic that can trivialize the problem; instead he only observes the performance of his strategies against the actions of the other player and can use private storage to remember past played strategies and observed payoffs, or cumulative information thereof. Here, too, our rate of convergence is nearly optimal and represents an almost-quadratic improvement over the best previously known strongly uncoupled dynamics.	en_US
dc.description.sponsorship	National Science Foundation (U.S.) (CAREER Award CCF-0953960)	en_US
dc.language.iso	en_US
dc.publisher	Society for Industrial and Applied Mathematics	en_US
dc.relation.isversionof	http://dl.acm.org/citation.cfm?id=2133057	en_US
dc.rights	Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.	en_US
dc.source	SIAM	en_US
dc.title	Near-optimal no-regret algorithms for zero-sum games	en_US
dc.type	Article	en_US
dc.identifier.citation	Constantinos Daskalakis, Alan Deckelbaum, and Anthony Kim. 2011. Near-optimal no-regret algorithms for zero-sum games. In Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '11). SIAM 235-254. SIAM ©2011	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Mathematics	en_US
dc.contributor.mitauthor	Daskalakis, Constantinos
dc.contributor.mitauthor	Deckelbaum, Alan T.
dc.relation.journal	Proceedings of the Twenty-Second Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '11)	en_US
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
dc.identifier.orcid	https://orcid.org/0000-0002-5451-0490
mit.license	PUBLISHER_POLICY	en_US
mit.metadata.status	Complete
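
For readers unfamiliar with the terms in the abstract, the following is a minimal sketch of the baseline setting it improves on. It is not the authors' algorithm. It runs the standard Hedge (multiplicative weights) rule for both players of a small zero-sum game and reports each player's average regret, which the standard analysis bounds by sqrt(2 ln n / T) for payoffs in [-1, 1], i.e. the O(1/√T) rate the abstract describes as optimal against an adversary but common. The function name `hedge_self_play`, the example payoff matrix, and the fixed learning rates are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn cumulative scores into a probability vector (numerically stable)."""
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def hedge_self_play(A, T):
    """Both players run Hedge for T rounds on payoff matrix A.

    A[i][j] is the row player's payoff (the column player's loss),
    assumed to lie in [-1, 1]. Each player sees only how each of their
    own pure strategies would have performed against the opponent's play.
    Returns (row_avg_regret, col_avg_regret, lower, upper), where
    lower <= value of the game <= upper.
    """
    n, m = len(A), len(A[0])
    # Fixed learning rates tuned for payoffs with range 2 (an assumption).
    eta_row = math.sqrt(2.0 * math.log(n) / T)
    eta_col = math.sqrt(2.0 * math.log(m) / T)
    cum_row = [0.0] * n  # cumulative payoff of each row strategy
    cum_col = [0.0] * m  # cumulative loss of each column strategy
    realized = 0.0       # cumulative expected payoff to the row player

    for _ in range(T):
        x = softmax([eta_row * c for c in cum_row])    # row maximizes payoff
        y = softmax([-eta_col * c for c in cum_col])   # column minimizes loss
        u = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
        v = [sum(x[i] * A[i][j] for i in range(n)) for j in range(m)]
        realized += sum(x[i] * u[i] for i in range(n))
        for i in range(n):
            cum_row[i] += u[i]
        for j in range(m):
            cum_col[j] += v[j]

    avg = realized / T
    upper = max(cum_row) / T  # best fixed row against the column's average play
    lower = min(cum_col) / T  # best fixed column against the row's average play
    return upper - avg, avg - lower, lower, upper


# Hypothetical 2x2 game whose value is 0.
A = [[0.5, -1.0], [-0.5, 1.0]]
row_regret, col_regret, lower, upper = hedge_self_play(A, 1000)
print(row_regret, col_regret, lower, upper)
```

The two average regrets sum exactly to `upper - lower`, the width of the interval bracketing the value of the game, which is why a faster regret rate for both players translates directly into faster convergence to the value.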

Files in this item

Name: Daskalakis-Near-Optimal.pdf
Size: 1018.Kb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles