
dc.contributor.advisor: Azizan, Navid
dc.contributor.author: Alimohammadi, Kaveh
dc.date.accessioned: 2025-11-05T19:33:18Z
dc.date.available: 2025-11-05T19:33:18Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-07-16T16:02:28.026Z
dc.identifier.uri: https://hdl.handle.net/1721.1/163540
dc.description.abstract: Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships across tables. This study presents a first-of-its-kind algorithm that can be combined with any existing DP mechanism to generate synthetic relational databases. The algorithm iteratively refines the relationship between individual synthetic tables to minimize their approximation errors in terms of low-order marginal distributions while maintaining referential integrity. Consequently, it eliminates the need to flatten a relational database into a master table (saving space), operates efficiently (saving time), and scales effectively to high-dimensional data. We provide both DP and theoretical utility guarantees for our algorithm. Through numerical experiments on real-world datasets, we demonstrate the effectiveness of our method in preserving fidelity to the original data.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Differentially Private Synthetic Data Generation for Relational Databases
dc.type: Thesis
dc.description.degree: S.M.
dc.contributor.department: Massachusetts Institute of Technology. Institute for Data, Systems, and Society
dc.identifier.orcid: https://orcid.org/0009-0008-8013-7926
mit.thesis.degree: Master
thesis.degree.name: Master of Science in Social and Engineering Systems

