Mallet: SQL Dialect Translation with LLM Rule Generation

Ngom, Amadou Latyr; Kraska, Tim

Publisher Policy

Terms of use

Article is made available in accordance with the publisher's policy and may be subject to US copyright law. Please refer to the publisher's site for terms of use.

Abstract

Translating between the SQL dialects of different systems is important for migration and federated query processing. Existing approaches rely on hand-crafted translation rules, which tend to be incomplete and hard to maintain, especially as the number of dialects to translate increases. Thus, dialect translation remains a largely unsolved problem. To address this issue, we introduce Mallet, a system that leverages Large Language Models (LLMs) to automate the generation of SQL-to-SQL translation rules, namely schema conversion, automated UDF generation, extension selection, and expression composition. Once the rules are generated, they can be reused indefinitely on new workloads without putting the LLM on the critical path of query execution. Mallet enhances the accuracy of the LLMs by (1) performing retrieval-augmented generation (RAG) over system documentation and human expertise, (2) subjecting the rules to empirical validation using the actual SQL systems to detect hallucinations, and (3) automatically creating accurate few-shot learning instances. Contributors, without knowing the system's code, can improve Mallet by providing natural-language expertise for RAG.
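
The abstract's idea of validating generated rules against an actual SQL system, then reusing them without the LLM, can be illustrated with a minimal sketch. This is not Mallet's implementation: the `Rule` class, the `validate` helper, the regex-based rule format, and the sample rules are all hypothetical, SQLite stands in for the target system, and the expected rows are hard-coded where a real pipeline would presumably obtain them from the source system.

```python
import re
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rewrite rule: a regex over source SQL plus its replacement."""
    pattern: str
    replacement: str

    def apply(self, sql: str) -> str:
        # Applying a rule is plain string rewriting; no LLM call is involved.
        return re.sub(self.pattern, self.replacement, sql, flags=re.IGNORECASE)


def validate(rule: Rule, source_sql: str, expected_rows: list,
             target: sqlite3.Connection) -> bool:
    """Accept a rule only if the rewritten query runs on the target system
    and returns the expected rows."""
    try:
        rows = target.execute(rule.apply(source_sql)).fetchall()
    except sqlite3.Error:
        # The target rejected the query, e.g. the rule names a function
        # that does not exist there (a hallucinated translation).
        return False
    return rows == expected_rows


# Target system (assumption: an in-memory SQLite database with toy data).
target = sqlite3.connect(":memory:")
target.execute("CREATE TABLE t (a INTEGER)")
target.executemany("INSERT INTO t VALUES (?)", [(1,), (None,)])

# Source-dialect query using NVL, which SQLite does not provide by default.
source_sql = "SELECT NVL(a, 0) FROM t ORDER BY rowid"
expected = [(1,), (0,)]  # assumption: results observed on the source system

# Two candidate rules, as an LLM might propose them (both made up here).
good = Rule(r"\bNVL\(", "COALESCE(")
bad = Rule(r"\bNVL\(", "NULL_TO_VALUE(")  # no such function in SQLite

# Only rules that survive empirical validation are kept for reuse.
validated = [r for r in (good, bad) if validate(r, source_sql, expected, target)]
```

In this sketch, a validated rule can then be applied to unseen queries, for example `good.apply("SELECT NVL(b, 'x') FROM u")`, using only local string rewriting, which mirrors the abstract's point that the LLM stays off the critical path of query execution.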

Description

aiDM ’24, June 14, 2024, Santiago, AA, Chile

Date issued

2024-06-09

URI

https://hdl.handle.net/1721.1/155537

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory

Publisher

ACM

Citation

Ngom, Amadou Latyr and Kraska, Tim. 2024. "Mallet: SQL Dialect Translation with LLM Rule Generation."

Version: Final published version

ISBN

979-8-4007-0680-6

Collections

MIT Open Access Articles

The following license files are associated with this item:

Creative Commons