
dc.contributor.author: Perron, Matthew
dc.contributor.author: Castro Fernandez, Raul
dc.contributor.author: DeWitt, David
dc.contributor.author: Madden, Samuel
dc.date.accessioned: 2021-10-27T20:36:16Z
dc.date.available: 2021-10-27T20:36:16Z
dc.date.issued: 2020
dc.identifier.uri: https://hdl.handle.net/1721.1/136620
dc.description.abstract: © 2020 Association for Computing Machinery. Much like on-premises systems, the natural choice for running database analytics workloads in the cloud is to provision a cluster of nodes to run a database instance. However, analytics workloads are often bursty or low volume, leaving clusters idle much of the time, so customers pay for compute resources even when they are underutilized. The ability of cloud function services, such as AWS Lambda or Azure Functions, to run small, fine-grained tasks makes them appear to be a natural choice for query processing in such settings. But implementing an analytics system on cloud functions comes with its own set of challenges. These include managing hundreds of tiny, stateless, resource-constrained workers, handling stragglers, and shuffling data through opaque cloud services. In this paper, we present Starling, a query execution engine built on cloud function services that employs a number of techniques to mitigate these challenges, providing interactive query latency at a lower total cost than provisioned systems with low-to-moderate utilization. In particular, on a 1TB TPC-H dataset in cloud storage, Starling is less expensive than the best provisioned systems for workloads in which queries arrive 1 minute or more apart. Starling also has lower latency than competing systems reading from cloud object stores and can scale to larger datasets.
dc.language.iso: en
dc.publisher: ACM
dc.relation.isversionof: 10.1145/3318464.3380609
dc.rights: Creative Commons Attribution-Noncommercial-Share Alike
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/4.0/
dc.source: arXiv
dc.title: Starling: A Scalable Query Engine on Cloud Functions
dc.type: Article
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.relation.journal: Proceedings of the ACM SIGMOD International Conference on Management of Data
dc.eprint.version: Original manuscript
dc.type.uri: http://purl.org/eprint/type/ConferencePaper
eprint.status: http://purl.org/eprint/status/NonPeerReviewed
dc.date.updated: 2021-01-29T19:17:53Z
dspace.orderedauthors: Perron, M; Castro Fernandez, R; DeWitt, D; Madden, S
dspace.date.submission: 2021-01-29T19:17:56Z
mit.journal.volume: abs/1911.11727
mit.license: OPEN_ACCESS_POLICY
mit.metadata.status: Authority Work and Publication Information Needed

