What Do Large Factor Models Learn? Self-Induced Regularization, Cost of Overfitting, and Self-Adaptivity
Author(s)
Xiong, Xin
Advisor
Chen, Hui
Abstract
This thesis studies the out-of-sample performance of large, overparameterized linear factor models for stochastic discount factor (SDF) estimation. Motivated by recent advances in finance and machine learning, we analyze the all-inclusive ridge estimator, which incorporates all candidate factors without ex-ante screening or dimension reduction. Our new non-asymptotic pricing error bounds reveal that including many low-variance principal components implicitly increases the effective penalty on high-variance components, shrinking the estimated SDF and biasing out-of-sample performance. We further show that exact interpolation in the low-variance space leads to only a bounded deterioration out of sample, as the fitted coefficients behave effectively like zero and incur little additional error relative to underspecification. In addition, ridge regression self-adapts by emphasizing the top principal components without knowing in advance which factors are important, effectively mimicking a data-driven cutoff in principal component regression. These findings extend benign overfitting theory to the inherently misspecified setting of SDF estimation, where classical linear regression assumptions fail. Empirically, we validate these insights using U.S. equity data and large factor sets constructed via Random Fourier Features (RFF). We find that noise factors degrade performance through bias rather than variance inflation, and that adding weak but priced factors can hurt model performance. Lastly, we show that mildly negative ridge penalties can enhance model performance, consistent with our theoretical prediction that they partially offset self-induced regularization.
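To make the estimator concrete, the following is a minimal sketch of an all-inclusive ridge SDF estimator applied to an RFF-expanded factor set. It assumes the standard ridge formulation b_hat = (Sigma_hat + z * I)^{-1} mu_hat, where mu_hat and Sigma_hat are the sample mean and covariance of the factor returns and z is the ridge penalty; the simulated data, the RFF construction details, and all names below (F_base, ridge_sdf_coefficients, and so on) are illustrative assumptions, not the thesis's actual implementation.

import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for a panel of base factor excess returns (monthly).
T, d = 600, 15
F_base = rng.normal(0.0, 0.02, size=(T, d))

# Random Fourier Features: expand the d base factors into a large set of
# P nonlinear factors, so that P >> T (the overparameterized regime).
P = 2000
gamma = 1.0  # bandwidth of the implied Gaussian kernel (illustrative value)
W = rng.normal(0.0, gamma, size=(d, P // 2))
proj = F_base @ W
F = np.hstack([np.sin(proj), np.cos(proj)])  # T x P factor matrix

def ridge_sdf_coefficients(F, z):
    # Ridge SDF coefficients: b_hat = (Sigma_hat + z * I)^{-1} mu_hat,
    # with mu_hat / Sigma_hat the sample mean / covariance of the factors.
    # A mildly negative z is allowed as long as Sigma_hat + z * I stays invertible.
    mu_hat = F.mean(axis=0)
    Sigma_hat = np.cov(F, rowvar=False, bias=True)
    k = F.shape[1]
    return np.linalg.solve(Sigma_hat + z * np.eye(k), mu_hat)

# All-inclusive estimator: every RFF factor enters, with no screening or PC cutoff.
z = 1e-3
b_hat = ridge_sdf_coefficients(F, z)

# Fitted SDF realizations: M_t = 1 - b_hat' (F_t - mu_hat).
M = 1.0 - (F - F.mean(axis=0)) @ b_hat
print("in-sample SDF volatility:", M.std())

In this sketch, the many low-variance principal components of the RFF factor matrix add to the shrinkage felt by the top components, which is the self-induced regularization the abstract describes; setting z slightly below zero, while keeping Sigma_hat + z * I invertible, is one way to probe the mildly negative penalties mentioned in the final sentence.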
Date issued
2026-02
Department
Sloan School of Management
Publisher
Massachusetts Institute of Technology