
External Sampling

Author(s)
Andoni, Alexandr; Indyk, Piotr; Onak, Krzysztof; Rubinfeld, Ronitt
Terms of use
Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/
Abstract
We initiate the study of sublinear-time algorithms in the external memory model [1]. In this model, the data is stored in blocks of a certain size B, and the algorithm is charged a unit cost for each block access. This model is well studied, since it reflects the computational issues that arise when the (massive) input is stored on a disk. Since each block access operates on B data elements in parallel, many problems have external memory algorithms whose number of block accesses is only a small fraction (e.g., 1/B) of their main memory complexity. However, to the best of our knowledge, no such reduction in complexity is known for any sublinear-time algorithm. One plausible explanation is that the vast majority of sublinear-time algorithms use random sampling and thus exhibit no locality of reference. This state of affairs is quite unfortunate, since both sublinear-time algorithms and the external memory model are important approaches to dealing with massive data sets, and ideally they should be combined to achieve the best performance. In this paper we show that such a combination is indeed possible. In particular, we consider three well-studied problems: testing of distinctness, uniformity, and identity of an empirical distribution induced by data. For these problems we show random-sampling-based algorithms whose number of block accesses is up to a factor of 1/√B smaller than the main memory complexity of those problems. We also show that this improvement is optimal for those problems. Since these problems are natural primitives for a number of sampling-based algorithms for other problems, our tools improve the external memory complexity of other problems as well.
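The cost model described in the abstract can be illustrated with a small simulation. The sketch below is not the algorithm from the paper: it is a naive one-sided distinctness check that reads a few whole blocks chosen at random and looks for a repeated value among the elements seen. The names `BlockedArray`, `read_block`, and `has_collision_by_block_sampling` are hypothetical and exist only for this illustration. Note that the elements inside one block are not independent uniform samples, which is one reason reading whole blocks is not automatically equivalent to element-wise random sampling.

```python
import random


class BlockedArray:
    """Hypothetical stand-in for data on disk: elements are grouped into
    blocks of size B, and every block read costs one unit."""

    def __init__(self, data, block_size):
        self.data = list(data)
        self.block_size = block_size
        self.block_reads = 0  # total cost charged so far

    @property
    def num_blocks(self):
        # Ceiling division: the last block may be partially filled.
        return (len(self.data) + self.block_size - 1) // self.block_size

    def read_block(self, i):
        """Return the contents of block i and charge one block access."""
        self.block_reads += 1
        start = i * self.block_size
        return self.data[start:start + self.block_size]


def has_collision_by_block_sampling(arr, num_block_reads, rng):
    """Read distinct random blocks and report True if some value repeats.

    One-sided: on an input with all-distinct values this never returns
    True, because each position is inspected at most once.
    """
    seen = set()
    k = min(num_block_reads, arr.num_blocks)
    for i in rng.sample(range(arr.num_blocks), k):
        for x in arr.read_block(i):
            if x in seen:
                return True
            seen.add(x)
    return False


if __name__ == "__main__":
    rng = random.Random(0)
    distinct = BlockedArray(range(1024), block_size=16)
    print(has_collision_by_block_sampling(distinct, 8, rng), distinct.block_reads)
```

With eight block reads and B = 16, the check inspects 128 elements at a cost of 8, whereas sampling 128 elements one at a time could cost up to 128 block accesses. How many block reads are actually needed for a given error guarantee is the subject of the paper and is not reproduced here.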
Description
36th International Colloquium, ICALP 2009, Rhodes, Greece, July 5-12, 2009, Proceedings, Part I
Date issued
2009-07
URI
http://hdl.handle.net/1721.1/73886
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Automata, Languages and Programming
Publisher
Springer Berlin / Heidelberg
Citation
Andoni, Alexandr et al. “External Sampling.” Automata, Languages and Programming. Ed. Susanne Albers et al. LNCS Vol. 5555. Berlin, Heidelberg: Springer Berlin Heidelberg, 2009. 83–94.
Version: Author's final manuscript
ISBN
978-3-642-02926-4
ISSN
0302-9743
1611-3349
