dc.contributor.author | Xiao, Hanshen | |
dc.contributor.author | Suh, G. Edward | |
dc.contributor.author | Devadas, Srinivas | |
dc.date.accessioned | 2025-01-27T22:32:24Z | |
dc.date.available | 2025-01-27T22:32:24Z | |
dc.date.issued | 2024-12-02 | |
dc.identifier.isbn | 979-8-4007-0636-3 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/158081 | |
dc.description | CCS ’24, October 14–18, 2024, Salt Lake City, UT, USA | en_US |
dc.description.abstract | We initiate a formal study of the concept of learnable obfuscation and aim to answer the following question: is there a type of data encoding that maintains the "learnability" of encoded samples, thereby enabling direct model training on transformed data, while ensuring the privacy of both the plaintext and the secret encoding function? This long-standing open problem has prompted many efforts to design such an encryption function, for example NeuraCrypt and TransNet. Nonetheless, all existing constructions are heuristic, without formal privacy guarantees, and many successful reconstruction attacks against them are known, assuming an adversary with substantial prior knowledge.
We present both generic possibility and impossibility results for learnable obfuscation. On one hand, we demonstrate that any non-trivial, property-preserving transformation that enables effective learning over encoded samples cannot offer cryptographic computational security in the worst case. On the other hand, through the lens of information-theoretic security, we devise a series of new tools that, via noise perturbation, yield provable and useful privacy guarantees for a set of heuristic obfuscation methods, including matrix masking, data mixing, and permutation. Under the PAC Privacy framework, we show how to quantify the leakage, against adversarial inference, of learnable obfuscation schemes built from these obfuscation and perturbation methods. Compared to state-of-the-art accounting methods, we achieve significantly sharpened utility-privacy tradeoffs when measuring privacy against data reconstruction and membership inference attacks. | en_US |
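To make the heuristic primitives named in the abstract concrete, the following is a minimal NumPy sketch of a "matrix masking + permutation + noise perturbation" encoder. It is illustrative only and is not the construction analyzed in the paper; the function name mask_and_perturb and the parameter sigma are our own placeholders.

```python
# Illustrative sketch only: a toy learnable-obfuscation encoder combining
# matrix masking, sample permutation (data mixing), and Gaussian noise perturbation.
# Not the paper's construction; names and parameters are hypothetical.
import numpy as np

def mask_and_perturb(X, rng, sigma=0.1):
    """Encode samples X (n x d) with a secret random linear mask plus Gaussian noise.

    The secret mask W and permutation play the role of the encoding function;
    sigma controls the noise perturbation whose leakage a framework such as
    PAC Privacy would account for.
    """
    n, d = X.shape
    W = rng.standard_normal((d, d))            # secret masking matrix (kept by the data owner)
    perm = rng.permutation(n)                  # secret sample permutation (data mixing analogue)
    Z = X[perm] @ W                            # matrix masking of the permuted samples
    Z += sigma * rng.standard_normal(Z.shape)  # noise perturbation
    return Z, (W, perm)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 16))             # toy plaintext features
Z, secret = mask_and_perturb(X, rng)           # a model would then be trained directly on Z
```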
dc.publisher | ACM|Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security | en_US |
dc.relation.isversionof | https://doi.org/10.1145/3658644.3670277 | en_US |
dc.rights | Creative Commons Attribution | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
dc.source | Association for Computing Machinery | en_US |
dc.title | Formal Privacy Proof of Data Encoding: The Possibility and Impossibility of Learnable Encryption | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Xiao, Hanshen, Suh, G. Edward and Devadas, Srinivas. 2024. "Formal Privacy Proof of Data Encoding: The Possibility and Impossibility of Learnable Encryption." | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.identifier.mitlicense | PUBLISHER_CC | |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2025-01-01T08:48:23Z | |
dc.language.rfc3066 | en | |
dc.rights.holder | The author(s) | |
dspace.date.submission | 2025-01-01T08:48:23Z | |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |