dc.contributor.author | Agarwal, Sameer | |
dc.contributor.author | Iyer, Anand P. | |
dc.contributor.author | Panda, Aurojit | |
dc.contributor.author | Mozafari, Barzan | |
dc.contributor.author | Stoica, Ion | |
dc.contributor.author | Madden, Samuel R. | |
dc.date.accessioned | 2014-09-26T13:11:32Z | |
dc.date.available | 2014-09-26T13:11:32Z | |
dc.date.issued | 2012-08 | |
dc.identifier.issn | 2150-8097 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/90381 | |
dc.description.abstract | In this demonstration, we present BlinkDB, a massively parallel, sampling-based approximate query processing framework for running interactive queries on large volumes of data. The key observation in BlinkDB is that one can make reasonable decisions in the absence of perfect answers. BlinkDB extends the Hive/HDFS stack and can handle the same set of SPJA (selection, projection, join and aggregate) queries as supported by these systems. BlinkDB provides real-time answers along with statistical error guarantees, and can scale to petabytes of data and thousands of machines in a fault-tolerant manner. Our experiments using the TPC-H benchmark and on an anonymized real-world video content distribution workload from Conviva Inc. show that BlinkDB can execute a wide range of queries up to 150x faster than Hive on MapReduce and 10–150x faster than Shark (Hive on Spark) over tens of terabytes of data stored across 100 machines, all with an error of 2–10%. | en_US |
dc.description.sponsorship | National Science Foundation (U.S.) (CISE Expeditions Award CCF-1139158) | en_US |
dc.description.sponsorship | QUALCOMM Inc. | en_US |
dc.description.sponsorship | Amazon.com (Firm) | en_US |
dc.description.sponsorship | Google (Firm) | en_US |
dc.description.sponsorship | SAP Corporation | en_US |
dc.description.sponsorship | Blue Goji | en_US |
dc.description.sponsorship | Cisco Systems, Inc. | en_US |
dc.description.sponsorship | Cloudera, Inc. | en_US |
dc.description.sponsorship | Ericsson, Inc. | en_US |
dc.description.sponsorship | General Electric Company | en_US |
dc.description.sponsorship | Hewlett-Packard Company | en_US |
dc.description.sponsorship | Intel Corporation | en_US |
dc.description.sponsorship | MarkLogic Corporation | en_US |
dc.description.sponsorship | Microsoft Corporation | en_US |
dc.description.sponsorship | NetApp | en_US |
dc.description.sponsorship | Oracle Corporation | en_US |
dc.description.sponsorship | Splunk Inc. | en_US |
dc.description.sponsorship | VMware, Inc. | en_US |
dc.description.sponsorship | United States. Defense Advanced Research Projects Agency (Contract FA8650-11-C-7136) | en_US |
dc.language.iso | en_US | |
dc.publisher | Association for Computing Machinery (ACM) | en_US |
dc.relation.isversionof | http://dx.doi.org/10.14778/2367502.2367533 | en_US |
dc.rights | Creative Commons Attribution-Noncommercial-Share Alike | en_US |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-sa/4.0/ | en_US |
dc.source | Other univ. web domain | en_US |
dc.title | Blink and it's done: Interactive queries on very large data | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Sameer Agarwal, Anand P. Iyer, Aurojit Panda, Samuel Madden, Barzan Mozafari, and Ion Stoica. 2012. Blink and it's done: interactive queries on very large data. Proc. VLDB Endow. 5, 12 (August 2012), 1902-1905. | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory | en_US |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.contributor.mitauthor | Mozafari, Barzan | en_US |
dc.contributor.mitauthor | Madden, Samuel R. | en_US |
dc.relation.journal | Proceedings of the VLDB Endowment | en_US |
dc.eprint.version | Author's final manuscript | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dspace.orderedauthors | Agarwal, Sameer; Iyer, Anand P.; Panda, Aurojit; Madden, Samuel; Mozafari, Barzan; Stoica, Ion | en_US |
dc.identifier.orcid | https://orcid.org/0000-0002-7470-3265 | |
mit.license | OPEN_ACCESS_POLICY | en_US |
mit.metadata.status | Complete | |