| dc.contributor.advisor | Samuel R. Madden. | en_US |
| dc.contributor.author | Shanbhag, Anil (Anil Atmanand) | en_US |
| dc.contributor.other | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. | en_US |
| dc.date.accessioned | 2021-01-06T20:17:36Z | |
| dc.date.available | 2021-01-06T20:17:36Z | |
| dc.date.copyright | 2020 | en_US |
| dc.date.issued | 2020 | en_US |
| dc.identifier.uri | https://hdl.handle.net/1721.1/129305 | |
| dc.description | Thesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, September, 2020 | en_US |
| dc.description | Cataloged from student-submitted PDF of thesis. | en_US |
| dc.description | Includes bibliographical references (pages 157-163). | en_US |
| dc.description.abstract | Modern GPUs provide an order-of-magnitude greater memory bandwidth compared to CPUs. In theory, this means data processing systems can process O(TB) of data with sub-100 ms latency, thereby enabling interactive query response times on analytical SQL queries. However, the massively parallel architecture of GPUs requires rearchitecting in-memory data analytics systems in order to achieve optimal performance. This thesis describes how we adapted and redesigned in-memory data analytics systems to better exploit the GPU's memory and execution model. We present Crystal, a library of building blocks that can be used for writing high-performance SQL query implementations for GPUs. We use Crystal to implement basic SQL query operators and an analytical benchmark. We present theoretical models based on memory bandwidth as the critical bottleneck for query performance and show that implementations using Crystal are able to achieve these theoretical limits. We also present a study of the fundamental performance characteristics of GPUs and CPUs for database analytics. Our analysis shows that using modern GPUs instead of CPUs can lead to a runtime gain equal to 1.5x the bandwidth ratio of GPU to CPU (25x in our setup) and be 4x more cost effective than CPUs. Finally, we used Crystal's design principles to develop massively parallel variants of two classic sequential algorithms: top-k and bit-packing-based compression. Bitonic Top-K is a top-k algorithm based on bitonic sort that is 4x faster than previous approaches. GPU-FOR is a compression format that can be decompressed efficiently in parallel and can be used to fit more data into the limited GPU memory. In summary, this thesis makes the case for using GPUs as the primary execution engine for interactive data analytics, and shows that implementations are efficient and practical. | en_US |
| dc.description.statementofresponsibility | by Anil Shanbhag. | en_US |
| dc.format.extent | 163 pages | en_US |
| dc.language.iso | eng | en_US |
| dc.publisher | Massachusetts Institute of Technology | en_US |
| dc.rights | MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided. | en_US |
| dc.rights.uri | http://dspace.mit.edu/handle/1721.1/7582 | en_US |
| dc.subject | Electrical Engineering and Computer Science. | en_US |
| dc.title | Interactive data analytics using GPUs | en_US |
| dc.title.alternative | Interactive data analytics using graphics processing units | en_US |
| dc.type | Thesis | en_US |
| dc.description.degree | Ph. D. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
| dc.identifier.oclc | 1227757140 | en_US |
| dc.description.collection | Ph.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science | en_US |
| dspace.imported | 2021-01-06T20:17:36Z | en_US |
| mit.thesis.degree | Doctoral | en_US |
| mit.thesis.department | EECS | en_US |