dc.contributor.author | Xiao, Hanshen | |
dc.contributor.author | Suh, G. Edward | |
dc.contributor.author | Devadas, Srinivas | |
dc.date.accessioned | 2025-01-27T22:32:24Z | |
dc.date.available | 2025-01-27T22:32:24Z | |
dc.date.issued | 2024-12-02 | |
dc.identifier.isbn | 979-8-4007-0636-3 | |
dc.identifier.uri | https://hdl.handle.net/1721.1/158081 | |
dc.description | CCS ’24, October 14–18, 2024, Salt Lake City, UT, USA | en_US |
dc.description.abstract | We initiate a formal study of the concept of learnable obfuscation and aim to answer the following question: is there a type of data encoding that maintains the "learnability" of encoded samples, thereby enabling direct model training on transformed data, while ensuring the privacy of both the plaintext and the secret encoding function? This long-standing open problem has prompted many efforts to design such an encryption function, for example NeuraCrypt and TransNet. Nonetheless, all existing constructions are heuristic, without formal privacy guarantees, and many successful reconstruction attacks against them are known, assuming an adversary with substantial prior knowledge.
We present both generic possibility and impossibility results for learnable obfuscation. On one hand, we demonstrate that any non-trivial, property-preserving transformation that enables effective learning over encoded samples cannot offer cryptographic computational security in the worst case. On the other hand, through the lens of information-theoretic security, we devise a series of new tools that, via noise perturbation, yield provable and useful privacy guarantees for a set of heuristic obfuscation methods, including matrix masking, data mixing, and permutation. Under the PAC Privacy framework, we show how to quantify the leakage, against adversarial inference, of learnable obfuscation schemes built from these obfuscation and perturbation methods. Compared to state-of-the-art accounting methods, we achieve significantly sharpened utility-privacy tradeoffs when measuring privacy against data reconstruction and membership inference attacks. | en_US |
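To make the heuristic primitives named in the abstract concrete, the following is a minimal NumPy sketch of a "matrix masking + permutation + noise perturbation" encoder. It is illustrative only and is not the construction analyzed in the paper; the function name mask_and_perturb and the parameter sigma are our own placeholders.

```python
# Illustrative sketch only: a toy learnable-obfuscation encoder combining
# matrix masking, sample permutation (data mixing), and Gaussian noise perturbation.
# Not the paper's construction; names and parameters are hypothetical.
import numpy as np

def mask_and_perturb(X, rng, sigma=0.1):
    """Encode samples X (n x d) with a secret random linear mask plus Gaussian noise.

    The secret mask W and permutation play the role of the encoding function;
    sigma controls the noise perturbation whose leakage a framework such as
    PAC Privacy would account for.
    """
    n, d = X.shape
    W = rng.standard_normal((d, d))            # secret masking matrix (kept by the data owner)
    perm = rng.permutation(n)                  # secret sample permutation (data mixing analogue)
    Z = X[perm] @ W                            # matrix masking of the permuted samples
    Z += sigma * rng.standard_normal(Z.shape)  # noise perturbation
    return Z, (W, perm)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 16))             # toy plaintext features
Z, secret = mask_and_perturb(X, rng)           # a model would then be trained directly on Z
```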
dc.publisher | ACM|Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security | en_US |
dc.relation.isversionof | https://doi.org/10.1145/3658644.3670277 | en_US |
dc.rights | Creative Commons Attribution | en_US |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
dc.source | Association for Computing Machinery | en_US |
dc.title | Formal Privacy Proof of Data Encoding: The Possibility and Impossibility of Learnable Encryption | en_US |
dc.type | Article | en_US |
dc.identifier.citation | Xiao, Hanshen, Suh, G. Edward and Devadas, Srinivas. 2024. "Formal Privacy Proof of Data Encoding: The Possibility and Impossibility of Learnable Encryption." | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | en_US |
dc.identifier.mitlicense | PUBLISHER_CC | |
dc.eprint.version | Final published version | en_US |
dc.type.uri | http://purl.org/eprint/type/ConferencePaper | en_US |
eprint.status | http://purl.org/eprint/status/NonPeerReviewed | en_US |
dc.date.updated | 2025-01-01T08:48:23Z | |
dc.language.rfc3066 | en | |
dc.rights.holder | The author(s) | |
dspace.date.submission | 2025-01-01T08:48:23Z | |
mit.license | PUBLISHER_CC | |
mit.metadata.status | Authority Work and Publication Information Needed | en_US |