Challenging the Classical View: Recognition of Identity and Expression as Integrated Processes

Schwartz, Emily; O’Nell, Kathryn; Saxe, Rebecca; Anzellotti, Stefano

Terms of use

Creative Commons Attribution https://creativecommons.org/licenses/by/4.0/

Abstract

Recent neuroimaging evidence challenges the classical view that face identity and facial expression are processed by segregated neural pathways, showing that information about identity and expression is encoded within common brain regions. This article tests the hypothesis that integrated representations of identity and expression arise spontaneously within deep neural networks. A subset of the CelebA dataset is used to train a deep convolutional neural network (DCNN) to label face identity (chance = 0.06%, accuracy = 26.5%), and the FER2013 dataset is used to train a DCNN to label facial expression (chance = 14.2%, accuracy = 63.5%). The identity-trained and expression-trained networks each successfully transfer to labeling both face identity and facial expression on the Karolinska Directed Emotional Faces dataset. This study demonstrates that DCNNs trained to recognize face identity and DCNNs trained to recognize facial expression spontaneously develop representations of facial expression and face identity, respectively. Furthermore, a congruence coefficient analysis reveals that features distinguishing between identities and features distinguishing between expressions become increasingly orthogonal from layer to layer, suggesting that deep neural networks disentangle representational subspaces corresponding to different sources.
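
The orthogonality claim in the abstract can be illustrated with a minimal sketch of a congruence coefficient, assuming the coefficient meant is Tucker's (the uncentered cosine between two vectors). The abstract does not specify how the authors computed it, so the function name and the example vectors below are hypothetical and are not data or code from the article. A value near 1 or -1 means two feature directions are nearly collinear; a value near 0 means they are nearly orthogonal.

```python
import math


def congruence_coefficient(x, y):
    """Tucker's congruence coefficient: sum(x*y) / sqrt(sum(x^2) * sum(y^2)).

    Assumption: this is the coefficient referred to in the abstract.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("congruence is undefined for a zero vector")
    return dot / (norm_x * norm_y)


# Hypothetical feature directions, invented for illustration only.
identity_axis = [1.0, 0.0, 2.0]
expression_axis_early = [1.0, 0.5, 2.0]  # nearly collinear with identity_axis
expression_axis_late = [0.0, 3.0, 0.0]   # orthogonal to identity_axis

print(congruence_coefficient(identity_axis, expression_axis_early))
print(congruence_coefficient(identity_axis, expression_axis_late))
```

Under this reading, "increasingly orthogonal from layer to layer" would correspond to the absolute value of the coefficient between identity-discriminating and expression-discriminating features shrinking toward 0 in later layers.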

Date issued

2023-02-10

URI

https://hdl.handle.net/1721.1/148018

Department

Massachusetts Institute of Technology. Department of Brain and Cognitive Sciences

Publisher

Multidisciplinary Digital Publishing Institute

Citation

Brain Sciences 13 (2): 296 (2023)

Version: Final published version

Collections

MIT Open Access Articles