Exploring learned join algorithm selection in relational database management systems

Nguyen, Long Phi, M. Eng. Massachusetts Institute of Technology.

dc.contributor.advisor	Ryan C. Marcus and Tim Kraska.	en_US
dc.contributor.author	Nguyen, Long Phi, M. Eng. Massachusetts Institute of Technology.	en_US
dc.contributor.other	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.	en_US
dc.date.accessioned	2021-05-24T19:52:30Z
dc.date.available	2021-05-24T19:52:30Z
dc.date.copyright	2021	en_US
dc.date.issued	2021	en_US
dc.identifier.uri	https://hdl.handle.net/1721.1/130706
dc.description	Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, February, 2021	en_US
dc.description	Cataloged from the official PDF of thesis.	en_US
dc.description	Includes bibliographical references (page 81).	en_US
dc.description.abstract	Query optimizers, crucial components of relational database management systems, are responsible for generating efficient query execution plans. Despite many advances in the database community over the last few decades, most popular relational database management systems today still use cost-based optimizers that do not always model the underlying data's characteristics accurately. These cost-based optimizers can severely slow down a query if they make even one gross underestimate of a database table's cardinality. In this work, we improve on native cost-based optimizer performance by identifying the best-suited join algorithms for query execution plans in two popular relational database management systems, PostgreSQL and Microsoft SQL Server. First, we gather baseline query execution times for the entire IMDb Join Order Benchmark (JOB) under different subsets of usable join algorithms to show that no subset yields high performance across all queries. We then show that it is feasible to use deep reinforcement learning to choose one of these subsets for each query seen and achieve far better performance on the intensive JOB queries. Finally, we introduce the idea of k-edits, showing results that indicate that for some queries, isolating just one "bad" join and changing its join algorithm can yield better performance. Our work suggests that reinforcement learning with both coarse and fine decisions shows considerable potential for the future of query optimization and relational database management systems.	en_US
dc.description.statementofresponsibility	by Long Phi Nguyen.	en_US
dc.format.extent	81 pages	en_US
dc.language.iso	eng	en_US
dc.publisher	Massachusetts Institute of Technology	en_US
dc.rights	MIT theses may be protected by copyright. Please reuse MIT thesis content according to the MIT Libraries Permissions Policy, which is available through the URL provided.	en_US
dc.rights.uri	http://dspace.mit.edu/handle/1721.1/7582	en_US
dc.subject	Electrical Engineering and Computer Science.	en_US
dc.title	Exploring learned join algorithm selection in relational database management systems	en_US
dc.type	Thesis	en_US
dc.description.degree	M. Eng.	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science	en_US
dc.identifier.oclc	1251800590	en_US
dc.description.collection	M.Eng. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science	en_US
dspace.imported	2021-05-24T19:52:30Z	en_US
mit.thesis.degree	Master	en_US
mit.thesis.department	EECS	en_US

Files in this item

Name: 1251800590-MIT.pdf
Size: 1.402 MB
Format: PDF


This item appears in the following Collection(s)

Graduate Theses
