Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data

Balagopalan, Aparna; Madras, David; Yang, David H.; Hadfield-Menell, Dylan; Hadfield, Gillian K.; Ghassemi, Marzyeh

dc.contributor.author	Balagopalan, Aparna
dc.contributor.author	Madras, David
dc.contributor.author	Yang, David H.
dc.contributor.author	Hadfield-Menell, Dylan
dc.contributor.author	Hadfield, Gillian K.
dc.contributor.author	Ghassemi, Marzyeh
dc.date.accessioned	2024-02-09T20:48:29Z
dc.date.available	2024-02-09T20:48:29Z
dc.date.issued	2023-05-12
dc.identifier.issn	2375-2548
dc.identifier.uri	https://hdl.handle.net/1721.1/153492
dc.description.abstract	As governments and industry turn to increased use of automated decision systems, it becomes essential to consider how closely such systems can reproduce human judgment. We identify a core potential failure, finding that annotators label objects differently depending on whether they are being asked a factual question or a normative question. This challenges a natural assumption maintained in many standard machine-learning (ML) data acquisition procedures: that there is no difference between predicting the factual classification of an object and an exercise of judgment about whether an object violates a rule premised on those facts. We find that using factual labels to train models intended for normative judgments introduces a notable measurement error. We show that models trained using factual labels yield significantly different judgments than those trained using normative labels and that the impact of this effect on model performance can exceed that of other factors (e.g., dataset size) that routinely attract attention from ML researchers and practitioners.	en_US
dc.language.iso	en_US
dc.publisher	American Association for the Advancement of Science (AAAS)	en_US
dc.relation.isversionof	10.1126/sciadv.abq0701	en_US
dc.rights	Creative Commons Attribution	en_US
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/	en_US
dc.source	American Association for the Advancement of Science	en_US
dc.subject	Multidisciplinary	en_US
dc.title	Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data	en_US
dc.type	Article	en_US
dc.identifier.citation	Aparna Balagopalan et al., Judging facts, judging norms: Training machine learning models to judge humans requires a modified approach to labeling data. Sci. Adv. 9, eabq0701 (2023).	en_US
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.contributor.department	Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory
dc.eprint.version	Final published version	en_US
dc.type.uri	http://purl.org/eprint/type/JournalArticle	en_US
eprint.status	http://purl.org/eprint/status/PeerReviewed	en_US
dspace.date.submission	2024-02-09T20:46:57Z
mit.journal.volume	9	en_US
mit.journal.issue	19	en_US
mit.license	PUBLISHER_CC
mit.metadata.status	Authority Work and Publication Information Needed	en_US

Files in this item

Name: license_rdf
Size: 40 bytes
Format: application/rdf+xml

Name: sciadv.abq0701.pdf
Size: 1.610 Mb
Format: PDF

This item appears in the following Collection(s)

MIT Open Access Articles
