dc.contributor.advisor: Erik D. Demaine and Piotr Indyk. [en_US]
dc.contributor.author: Nelson, Jelani (Jelani Osei) [en_US]
dc.contributor.other: Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. [en_US]
dc.date.accessioned: 2011-10-17T19:49:37Z
dc.date.available: 2011-10-17T19:49:37Z
dc.date.copyright: 2011 [en_US]
dc.date.issued: 2011 [en_US]
dc.identifier.uri: http://hdl.handle.net/1721.1/66314
dc.description: Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. [en_US]
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. [en_US]
dc.description: Cataloged from student submitted PDF version of thesis. [en_US]
dc.description: Includes bibliographical references (p. 136-145). [en_US]
dc.description.abstract: A sketch of a dataset is a small-space data structure supporting some prespecified set of queries (and possibly updates) while consuming space substantially sublinear in the space required to actually store all the data. Furthermore, it is often desirable, or required by the application, that the sketch itself be computable by a small-space algorithm given just one pass over the data, a so-called streaming algorithm. Sketching and streaming have found numerous applications in network traffic monitoring, data mining, trend detection, sensor networks, and databases. In this thesis, I describe several new contributions in the area of sketching and streaming algorithms: (1) The first space-optimal streaming algorithm for the distinct elements problem. Our algorithm also achieves O(1) update and reporting times. (2) A streaming algorithm for Hamming norm estimation in the turnstile model which achieves the best known space complexity. (3) The first space-optimal algorithm for pth moment estimation in turnstile streams for 0 < p < 2, with matching lower bounds, and another space-optimal algorithm which also has a fast O(log²(1/ε) log log(1/ε)) update time for (1±ε)-approximation. (4) A general reduction from empirical entropy estimation in turnstile streams to moment estimation, providing the only known near-optimal space-complexity upper bound for this problem. (5) A proof of the Johnson-Lindenstrauss lemma where every matrix in the support of the embedding distribution is much sparser than previous known constructions. In particular, to achieve distortion (1±ε) with probability 1-δ, we embed into optimal dimension O(ε⁻² log(1/δ)) such that every matrix in the support of the distribution has O(ε⁻¹ log(1/δ)) non-zero entries per column. [en_US]
dc.description.statementofresponsibility: by Jelani Nelson. [en_US]
dc.format.extent: 145 p. [en_US]
dc.language.iso: eng [en_US]
dc.publisher: Massachusetts Institute of Technology [en_US]
dc.rights: M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. [en_US]
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 [en_US]
dc.subject: Electrical Engineering and Computer Science. [en_US]
dc.title: Sketching and streaming high-dimensional vectors [en_US]
dc.title.alternative: Sketching and streaming algorithms [en_US]
dc.type: Thesis [en_US]
dc.description.degree: Ph.D. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 756041968 [en_US]

