Black-Box Access is Insufficient for Rigorous AI Audits

Casper, Stephen; Ezell, Carson; Siegmann, Charlotte; Kolt, Noam; Curtis, Taylor Lynn; Bucknall, Benjamin; Haupt, Andreas; Wei, Kevin; Scheurer, Jérémy; Hobbhahn, Marius; Sharkey, Lee; Krishna, Satyapriya; Von Hagen, Marvin; Alberti, Silas; Chan, Alan; Sun, Qinyi; Gerovitch, Michael; Bau, David; Tegmark, Max; Krueger, David; Hadfield-Menell, Dylan

dc.contributor.author	Casper, Stephen
dc.contributor.author	Ezell, Carson
dc.contributor.author	Siegmann, Charlotte
dc.contributor.author	Kolt, Noam
dc.contributor.author	Curtis, Taylor Lynn
dc.contributor.author	Bucknall, Benjamin
dc.contributor.author	Haupt, Andreas
dc.contributor.author	Wei, Kevin
dc.contributor.author	Scheurer, Jérémy
dc.contributor.author	Hobbhahn, Marius
dc.contributor.author	Sharkey, Lee
dc.contributor.author	Krishna, Satyapriya
dc.contributor.author	Von Hagen, Marvin
dc.contributor.author	Alberti, Silas
dc.contributor.author	Chan, Alan
dc.contributor.author	Sun, Qinyi
dc.contributor.author	Gerovitch, Michael
dc.contributor.author	Bau, David
dc.contributor.author	Tegmark, Max
dc.contributor.author	Krueger, David
dc.contributor.author	Hadfield-Menell, Dylan
dc.date.accessioned	2024-07-24T17:19:49Z
dc.date.available	2024-07-24T17:19:49Z
dc.date.issued	2024-06-03
dc.identifier.isbn	979-8-4007-0450-5
dc.identifier.uri	https://hdl.handle.net/1721.1/155783
dc.description	FAccT ’24, June 03–06, 2024, Rio de Janeiro, Brazil	en_US
dc.description.abstract	External audits of AI systems are increasingly recognized as a key mechanism for AI governance. The effectiveness of an audit, however, depends on the degree of access granted to auditors. Recent audits of state-of-the-art AI systems have primarily relied on black-box access, in which auditors can only query the system and observe its outputs. However, white-box access to the system’s inner workings (e.g., weights, activations, gradients) allows an auditor to perform stronger attacks, more thoroughly interpret models, and conduct fine-tuning. Meanwhile, outside-the-box access to training and deployment information (e.g., methodology, code, documentation, data, deployment details, findings from internal evaluations) allows auditors to scrutinize the development process and design more targeted evaluations. In this paper, we examine the limitations of black-box audits and the advantages of white- and outside-the-box audits. We also discuss technical, physical, and legal safeguards for performing these audits with minimal security risks. Given that different forms of access can lead to very different levels of evaluation, we conclude that (1) transparency regarding the access and methods used by auditors is necessary to properly interpret audit results, and (2) white- and outside-the-box access allow for substantially more scrutiny than black-box access alone.	en_US
dc.publisher	ACM|The 2024 ACM Conference on Fairness, Accountability, and Transparency	en_US
dc.relation.isversionof	10.1145/3630106.3659037	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	Association for Computing Machinery	en_US
dc.title	Black-Box Access is Insufficient for Rigorous AI Audits	en_US
dc.type	Article	en_US
dc.identifier.citation	Casper, Stephen, Ezell, Carson, Siegmann, Charlotte, Kolt, Noam, Curtis, Taylor Lynn et al. 2024. "Black-Box Access is Insufficient for Rigorous AI Audits."
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.contributor.department	Massachusetts Institute of Technology. Department of Economics
dc.contributor.department	Massachusetts Institute of Technology. Center for Collective Intelligence
dc.contributor.department	Massachusetts Institute of Technology. Department of Physics
dc.identifier.mitlicense	PUBLISHER_CC
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/ConferencePaper	en_US
eprint.status	http://purl.org/eprint/status/NonPeerReviewed	en_US
dc.date.updated	2024-07-01T07:56:42Z
dc.language.rfc3066	en
dc.rights.holder	The author(s)
dspace.date.submission	2024-07-01T07:56:43Z
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: license_rdf
Size: 40 bytes
Format: application/rdf+xml

Name: 3630106.3659037.pdf
Size: 850.0 KB
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
