dc.contributor.author | Abuzaid, Firas | |
dc.contributor.author | Bailis, Peter | |
dc.contributor.author | Ding, Jialin | |
dc.contributor.author | Gan, Edward | |
dc.contributor.author | Madden, Samuel | |
dc.contributor.author | Narayanan, Deepak | |
dc.contributor.author | Rong, Kexin | |
dc.contributor.author | Suri, Sahaana | |
dc.date.accessioned | 2021-10-27T20:10:35Z | |
dc.date.available | 2021-10-27T20:10:35Z | |
dc.date.issued | 2018 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/135069 | |
dc.description.abstract | © 2018 Association for Computing Machinery. As data volumes continue to rise, manual inspection is becoming increasingly untenable. In response, we present MacroBase, a data analytics engine that prioritizes end-user attention in high-volume fast data streams. MacroBase enables efficient, accurate, and modular analyses that highlight and aggregate important and unusual behavior, acting as a search engine for fast data. MacroBase is able to deliver order-of-magnitude speedups over alternatives by optimizing the combination of explanation (i.e., feature selection) and classification tasks and by leveraging a new reservoir sampler and heavy-hitters sketch specialized for fast data streams. As a result, MacroBase delivers accurate results at speeds of up to 2M events per second per query on a single core. The system has delivered meaningful results in production, including at a telematics company monitoring hundreds of thousands of vehicles. | |
dc.language.iso | en | |
dc.publisher | Association for Computing Machinery (ACM) | |
dc.relation.isversionof | 10.1145/3276463 | |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | |
dc.source | Other repository | |
dc.title | MacroBase: Prioritizing Attention in Fast Data | |
dc.type | Article | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | |
dc.relation.journal | ACM Transactions on Database Systems | |
dc.eprint.version | Author's final manuscript | |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | |
eprint.status | http://purl.org/eprint/status/PeerReviewed | |
dc.date.updated | 2019-06-18T17:06:52Z | |
dspace.orderedauthors | Abuzaid, F; Bailis, P; Ding, J; Gan, E; Madden, S; Narayanan, D; Rong, K; Suri, S | |
dspace.date.submission | 2019-06-18T17:06:53Z | |
mit.journal.volume | 43 | |
mit.journal.issue | 4 | |
mit.metadata.status | Authority Work and Publication Information Needed | |