DSpace@MIT


What Do Large Factor Models Learn? Self-Induced Regularization, Cost of Overfitting, and Self-Adaptivity

Author(s)
Xiong, Xin
Thesis PDF (12.08 MB)
Advisor
Chen, Hui
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
This paper studies the out-of-sample performance of large, overparameterized linear factor models for stochastic discount factor (SDF) estimation. Motivated by recent advances in finance and machine learning, we analyze the all-inclusive ridge estimator that incorporates all candidate factors without ex-ante screening or dimension reduction. Our new non-asymptotic pricing error bounds reveal that including many low-variance principal components implicitly increases the effective penalty on high-variance components, shrinking the estimated SDF and biasing performance. We further show that exact interpolation in the low-variance space leads to bounded out-of-sample deterioration, as the fitted coefficients behave effectively like zero and incur little additional error relative to underspecification. In addition, ridge regression self-adapts by emphasizing top principal components without knowing which factors are important, effectively mimicking a data-driven cutoff in principal component regression. These findings extend benign overfitting theory to the inherently misspecified setting of SDF estimation, where classical linear regression assumptions fail. Empirically, we validate these insights using U.S. equity data and large factor sets constructed via Random Fourier Features (RFF). We find that noise factors degrade performance through bias rather than variance inflation, and that adding weak but priced factors can hurt model performance. Lastly, we show that mildly negative ridge penalties can enhance model performance, consistent with our theoretical prediction that they partially offset self-induced regularization.
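The "all-inclusive" ridge estimator described above can be illustrated with a minimal sketch. This is not the thesis code: the dimensions, toy data, and function names below are illustrative assumptions. It builds a large panel of Random Fourier Feature factors (more factors than time periods), fits SDF coefficients by ridge with no ex-ante screening, and shows the basic shrinkage effect that a larger penalty produces smaller coefficients.

```python
# Hypothetical sketch (not the thesis code): an "all-inclusive" ridge
# estimator of SDF coefficients over Random Fourier Feature (RFF) factors,
# with no ex-ante factor screening. All names and sizes are assumptions.
import numpy as np

def ridge_sdf_weights(F, z):
    """w = argmin_w ||1 - F w||^2 + z ||w||^2 = (F'F + z I)^{-1} F' 1."""
    T, P = F.shape
    return np.linalg.solve(F.T @ F + z * np.eye(P), F.T @ np.ones(T))

rng = np.random.default_rng(0)
T, d, P = 240, 5, 500                      # months, raw signals, factors (P >> T)

X = rng.standard_normal((T, d))            # raw characteristics/signals
W = rng.standard_normal((d, P))            # random frequencies
b = rng.uniform(0.0, 2.0 * np.pi, P)       # random phases
F = np.sqrt(2.0 / P) * np.cos(X @ W + b)   # RFF factor panel, T x P

w_lo = ridge_sdf_weights(F, z=1.0)         # light explicit penalty
w_hi = ridge_sdf_weights(F, z=100.0)       # heavier penalty shrinks coefficients
print(np.linalg.norm(w_lo), np.linalg.norm(w_hi))
```

In this overparameterized regime (P greater than T), the abstract's point is that even at a fixed explicit penalty z, the many low-variance principal components of F add an implicit, self-induced penalty on the high-variance components; the explicit shrinkage shown here is only part of the total regularization.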
Date issued
2026-02
URI
https://hdl.handle.net/1721.1/165500
Department
Sloan School of Management
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted.