| dc.contributor.advisor | Azizan, Navid | |
| dc.contributor.author | Alimohammadi, Kaveh | |
| dc.date.accessioned | 2025-11-05T19:33:18Z | |
| dc.date.available | 2025-11-05T19:33:18Z | |
| dc.date.issued | 2025-05 | |
| dc.date.submitted | 2025-07-16T16:02:28.026Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/163540 | |
| dc.description.abstract | Existing differentially private (DP) synthetic data generation mechanisms typically assume a single-source table. In practice, data is often distributed across multiple tables with relationships among them. This study presents the first-of-its-kind algorithm that can be combined with any existing DP mechanism to generate synthetic relational databases. The algorithm iteratively refines the relationship between individual synthetic tables to minimize their approximation errors in terms of low-order marginal distributions while maintaining referential integrity. Consequently, it eliminates the need to flatten a relational database into a master table (saving space), operates efficiently (saving time), and scales effectively to high-dimensional data. We provide both DP and theoretical utility guarantees for our algorithm. Through numerical experiments on real-world datasets, we demonstrate the effectiveness of our method in preserving fidelity to the original data. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | In Copyright - Educational Use Permitted | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
| dc.title | Differentially Private Synthetic Data Generation for Relational Databases | |
| dc.type | Thesis | |
| dc.description.degree | S.M. | |
| dc.contributor.department | Massachusetts Institute of Technology. Institute for Data, Systems, and Society | |
| dc.identifier.orcid | https://orcid.org/0009-0008-8013-7926 | |
| mit.thesis.degree | Master | |
| thesis.degree.name | Master of Science in Social and Engineering Systems | |