Detecting Errors in Financial Data: A Multi-Agent LLM and Synthetic Data Approach

Author(s)

Liu, Katherine

Advisor

Gupta, Amar

Terms of use

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Copyright retained by author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/

Abstract

With the high volume of activity flowing through financial institutions, detecting potential errors remains a critical challenge. This paper addresses two key areas where errors may occur: business name registrations and transactions within valid accounts. Traditional string-matching methods struggle to identify incorrectly written business names that closely resemble registered ones, while current error detection models for transaction data often suffer from class imbalance, which reduces their performance on the minority class of incorrect transactions. To address these issues, this paper proposes two novel approaches. First, a hybrid method integrating multi-agent Large Language Models (LLMs) with existing string-matching techniques improves the detection of incorrect business names by capturing subtle variations beyond conventional edit-distance metrics, raising recall from 0.815 for the baseline model to 0.987 for the proposed method. Second, an improved tabular data generation method for credit card transactions is introduced, leveraging LLMs and class balancing to generate high-quality synthetic data. Training error detection systems on this data reduces the false negative rate from 23.47% to 12.84%. Together, these methods improve the performance of error detection systems, enabling financial institutions to enhance the experiences of their clients.
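The hybrid idea in the abstract, a string-matching stage backed by a second review stage for names that plain similarity cannot settle, can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the two thresholds, and the `reviewer` callable (a stand-in for the multi-agent LLM stage) are all assumptions made for illustration, and `difflib.SequenceMatcher` is used here in place of whatever string-matching techniques the thesis evaluates.

```python
from difflib import SequenceMatcher
from typing import Callable, Iterable, Optional


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case and extra spaces."""
    def norm(s: str) -> str:
        return " ".join(s.casefold().split())
    return SequenceMatcher(None, norm(a), norm(b)).ratio()


def flag_name(
    candidate: str,
    registry: Iterable[str],
    reviewer: Callable[[str, str], bool],
    accept: float = 0.90,   # illustrative threshold, not from the thesis
    review: float = 0.60,   # illustrative threshold, not from the thesis
) -> Optional[str]:
    """Return the registered name that `candidate` likely miswrites, else None.

    `reviewer` is a hypothetical second stage (for example, an LLM-based
    check) consulted only when string similarity alone is inconclusive.
    """
    best = max(registry, key=lambda name: similarity(candidate, name), default=None)
    if best is None:
        return None
    score = similarity(candidate, best)
    if score == 1.0:
        return None   # same name after normalization, so nothing to flag
    if score >= accept:
        return best   # near-identical: string matching alone is enough
    if score >= review and reviewer(candidate, best):
        return best   # ambiguous band: defer to the second stage
    return None
```

In this sketch the second stage is called only for candidates in the ambiguous band between the two thresholds, which keeps the more expensive check off the names that string similarity already settles.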

Date issued

2025-05

URI

https://hdl.handle.net/1721.1/162958

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses