MacroBase: Prioritizing Attention in Fast Data
Author(s)
Bailis, Peter; Gan, Edward; Madden, Samuel; Narayanan, Deepak; Rong, Kexin; Suri, Sahaana; ...
Abstract
As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggregate important and unusual behavior, acting as a search engine for fast data. MacroBase is able to deliver order-of-magnitude speedups over alternatives by optimizing the combination of explanation and classification tasks and by leveraging a new reservoir sampler and heavy-hitters sketch specialized for fast data streams. As a result, MacroBase delivers accurate results at speeds of up to 2M events per second per query on a single core. The system has delivered meaningful results in production, including at a telematics company monitoring hundreds of thousands of vehicles,
Date issued
2017-05
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
Publisher
Association for Computing Machinery (ACM)
Citation
Bailis, Peter, Gan, Edward, Madden, Samuel, Narayanan, Deepak, Rong, Kexin et al. 2017. "MacroBase: Prioritizing Attention in Fast Data."
Version: Author's final manuscript