GPSZip : semantic representation and compression system for GPS using coresets
Author(s)
Wu, Cathy, M. Eng. Massachusetts Institute of Technology
Alternative title
Online HMM coresets for scalable learning of sensor streams
Semantic representation and compression system for GPS using coresets
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Daniela Rus.
Abstract
We present a semantic approach for compressing mobile sensor data, with a focus on GPS streams. Unlike popular text-compression methods, our approach takes advantage of the fact that agents (robotic, personal, or vehicular) perform tasks in a physical space, and the resulting sensor stream usually contains repeated observations of the same locations, actions, or scenes. We model this sensor stream as a Markov process with unobserved states, and our goal is to compute the Hidden Markov Model (HMM) that maximizes the likelihood of generating the stream, i.e., the maximum likelihood estimate (MLE). Our semantic representation and compression system comprises two main parts: 1) trajectory mapping and 2) trajectory compression. The trajectory mapping stage extracts a semantic representation (topological map) from raw sensor data. Our trajectory compression stage uses a recursive binary search algorithm to take advantage of the information captured by our constructed map. To improve efficiency and scalability, we use two coresets: we formalize the coreset for 1-segment, and we apply our system to a small k-segment coreset of the data rather than to the original data. The resulting trajectory compresses the original sensor stream and approximates its likelihood up to a provable (1 + ε)-multiplicative factor for any candidate Markov model. We conclude with experimental results on data sets from several robots, personal smartphones, and taxicabs. In a robotics experiment of more than 72K points, we show that the size of our compression is smaller by a factor of 650 when compared to the original signal, and by a factor of 170 when compared to bzip2. We additionally demonstrate the capability of our system to automatically summarize a personal GPS stream, generate a sketch of a city map, and merge trajectories from multiple taxicabs for a more complete map.
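To give a rough intuition for what a coreset for a single segment can look like, the following is a minimal sketch and not the construction from the thesis. It assumes a one-dimensional signal x sampled at times t and a squared-error fitting cost, and it shows that six running sums are enough to evaluate the cost of any candidate line x = a + b*t exactly, without revisiting the raw points. The class name `SegmentSummary`, the helper `direct_cost`, and the sample points are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SegmentSummary:
    """Fixed-size summary of (t, x) points for line-fitting cost queries.

    Illustrative assumption: 1-D signal, squared-error cost.
    """
    n: int = 0
    st: float = 0.0   # sum of t
    stt: float = 0.0  # sum of t^2
    sx: float = 0.0   # sum of x
    stx: float = 0.0  # sum of t*x
    sxx: float = 0.0  # sum of x^2

    def add(self, t: float, x: float) -> None:
        """Fold one streamed point into the running sums."""
        self.n += 1
        self.st += t
        self.stt += t * t
        self.sx += x
        self.stx += t * x
        self.sxx += x * x

    def cost(self, a: float, b: float) -> float:
        """Sum of squared errors of the line x = a + b*t over all added points.

        Obtained by expanding sum((x - a - b*t)^2) term by term.
        """
        return (self.sxx - 2 * a * self.sx - 2 * b * self.stx
                + self.n * a * a + 2 * a * b * self.st + b * b * self.stt)


def direct_cost(points, a, b):
    """Reference cost computed from the raw points, for comparison."""
    return sum((x - a - b * t) ** 2 for t, x in points)


# Hypothetical sample data, for illustration only.
points = [(0.0, 1.0), (1.0, 3.1), (2.0, 4.9), (3.0, 7.2)]
summary = SegmentSummary()
for t, x in points:
    summary.add(t, x)
```

Calling `summary.cost(a, b)` and `direct_cost(points, a, b)` with the same line gives the same value up to floating-point rounding, while the summary stays the same size no matter how many points are streamed in. For GPS data one would keep such sums per coordinate; the k-segment coreset mentioned in the abstract is a more involved construction that this sketch does not attempt.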
Description
Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2013. Cataloged from PDF version of thesis. Includes bibliographical references (pages 76-79).
Date issued
2013
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.