MINCE: Dialect-Aware SQL Decomposition for Federated Query Execution
Author(s)
Zhang, Sophie S.
Advisor
Kraska, Tim
Abstract
The increasing adoption of specialized database systems has led to the rise of heterogeneous data environments. While having multiple engines in a data infrastructure enables opportunities for workload optimization, SQL dialect incompatibility makes workload migration difficult. To address this challenge, we develop MINCE (Multi-dialect INtegration and Cross-engine Execution), a technique that decomposes SQL queries into parts to enable federated execution across engines with differing SQL dialects. MINCE uses a rule-based method to partition a query into executable components that are assigned to different database systems. To evaluate different execution strategies, MINCE further implements a cost model that incorporates both on-engine query execution time and inter-system data transfer overhead. We evaluate MINCE on a TPC-H-based workload augmented with PostgreSQL-specific functions unsupported in Amazon Redshift. Experimental results show that MINCE produces the fastest execution strategy among our baselines for 72.1% of queries when using estimated cardinalities, achieving a 2× speedup over single-engine baselines. With perfect cardinality information available to our cost model, this share increases to 88.4%, with an average 2.8× speedup. These results demonstrate that our system not only enables more flexible federated query execution, but also reliably identifies performant execution strategies.
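To make the cost-model idea concrete, the sketch below scores candidate execution strategies by adding estimated on-engine execution time to an estimated transfer time for every hand-off between engines, then picks the cheapest. It is an illustration only, not the thesis's implementation: the `Step` type, the `plan_cost` and `cheapest` functions, the linear bandwidth-based transfer estimate, and all numbers are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One query component assigned to an engine (hypothetical structure)."""
    engine: str
    exec_seconds: float  # estimated on-engine execution time
    output_rows: int     # estimated cardinality of the component's result
    row_bytes: int       # estimated average row width in bytes


def plan_cost(steps, bytes_per_second):
    """Estimated total time: execution time of every step, plus transfer
    time whenever the next step runs on a different engine."""
    total = 0.0
    for i, step in enumerate(steps):
        total += step.exec_seconds
        nxt = steps[i + 1] if i + 1 < len(steps) else None
        if nxt is not None and nxt.engine != step.engine:
            # Assumed transfer model: result size divided by link bandwidth.
            total += step.output_rows * step.row_bytes / bytes_per_second
    return total


def cheapest(plans, bytes_per_second):
    """Return the name of the candidate plan with the lowest estimated cost."""
    return min(plans, key=lambda name: plan_cost(plans[name], bytes_per_second))


# Made-up estimates for illustration; these are not results from the thesis.
BANDWIDTH = 50_000_000  # bytes per second, assumed
plans = {
    "postgres_only": [Step("postgres", 40.0, 1_000, 100)],
    "split": [
        Step("redshift", 5.0, 1_000_000, 100),  # dialect-neutral part
        Step("postgres", 3.0, 1_000, 100),      # PostgreSQL-specific part
    ],
}
best = cheapest(plans, BANDWIDTH)
```

In this toy setup the split plan pays for moving one intermediate result between engines but still comes out ahead of the single-engine plan. Because the transfer term depends on `output_rows`, errors in cardinality estimates feed directly into the ranking, which is consistent with the abstract's observation that perfect cardinality information improves strategy selection.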
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology