DSpace@MIT

Transformation Tolerance of Facial Recognition Technology and Informative Evaluation Metrics

Author(s)
Nakamura, Haley Marie
Advisor
Sinha, Pawan
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Over the last decade, machine-learning-based facial recognition (FR) systems have continued to grow in popularity while spreading to new deployment settings. Despite the large variance among FR input distributions, popular facial recognition benchmarks continue to characterize system performance with one aggregate score over a single dataset. In many cases, the limitations of this score are unclear to downstream users: if benchmark accuracy is high, how is it expected to change for an image sampled from a different distribution? Which transformations can the model handle robustly, and which cause failure? Meanwhile, a large body of human facial perception research aims to understand the underlying mechanisms of human recognition. This field offers methodological inspiration for more informative evaluation techniques, including the characterization of recognition performance as a function of a quantifiable input transformation. This work performs such an analysis. The performance scores of five state-of-the-art FR models are characterized as a function of Gaussian blur strength, crossed with color variation. The performance-blur relationship is modeled as an s-curve, which gives a highly interpretable format for discussion. Blur strength had a statistically significant effect on performance for every model, whereas color variation did not significantly affect any model. Results are then compared to prior human recognition experiments. The best models outperform humans in low-blur regimes, while humans outperform all models in high-blur regimes. These results motivate the need for modern benchmarks that capture a range of input distributions. The analysis presented can lead to a deeper understanding of FR systems and provide a clearer interpretation of how model performance changes under quantified distribution shifts.
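To make the idea of modeling the performance-blur relationship as an s-curve concrete, here is a minimal sketch in Python. It is not the thesis's method: the decreasing logistic form, the fixed floor and ceiling, the grid-search fit, and every name (`s_curve`, `fit_s_curve`, `midpoint`, `slope`) are assumptions chosen for illustration. The data points are synthetic, generated from a known curve so that the fit can be checked; they are not results from the thesis.

```python
import math


def s_curve(sigma, midpoint, slope, floor=0.0, ceiling=1.0):
    """Decreasing logistic: near `ceiling` at low blur, near `floor` at high blur.

    `midpoint` is the blur strength at which the score is halfway between
    ceiling and floor; `slope` controls how sharply the score falls there.
    """
    return floor + (ceiling - floor) / (1.0 + math.exp(slope * (sigma - midpoint)))


def fit_s_curve(sigmas, scores, floor=0.0, ceiling=1.0):
    """Coarse grid search for the (midpoint, slope) pair with least squared error.

    Assumption: floor and ceiling are fixed in advance rather than fitted.
    """
    lo, hi = min(sigmas), max(sigmas)
    best = None
    for i in range(201):                      # candidate midpoints across the blur range
        midpoint = lo + (hi - lo) * i / 200
        for j in range(1, 101):               # candidate slopes 0.1, 0.2, ..., 10.0
            slope = j * 0.1
            err = sum(
                (s_curve(x, midpoint, slope, floor, ceiling) - y) ** 2
                for x, y in zip(sigmas, scores)
            )
            if best is None or err < best[0]:
                best = (err, midpoint, slope)
    return best[1], best[2]


# Synthetic, illustrative data only: scores sampled from a known curve.
sigmas = [0, 2, 4, 6, 8, 10, 12]              # hypothetical Gaussian blur strengths
scores = [s_curve(x, midpoint=6.0, slope=0.8) for x in sigmas]

midpoint, slope = fit_s_curve(sigmas, scores)
print(f"fitted midpoint={midpoint:.2f}, slope={slope:.2f}")
```

Under these assumptions, the two fitted parameters summarize a model's blur tolerance: the midpoint says how much blur the model withstands before its score drops halfway, and the slope says how abruptly it fails. Comparing these two numbers across models, or against human data, is one way such a curve can be read.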
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/163037
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
