Certifiably optimal sparse principal component analysis
Author(s)
Berk, Lauren; Bertsimas, Dimitris
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Abstract
This paper addresses the sparse principal component analysis (SPCA) problem for covariance matrices in dimension $n$, aiming to find solutions with sparsity $k$ using mixed integer optimization. We propose a tailored branch-and-bound algorithm, Optimal-SPCA, that enables us to solve SPCA to certifiable optimality in seconds for $n = 100$s, $k = 10$s. This same algorithm can be applied to problems with $n = 10{,}000$s or higher to find high-quality feasible solutions in seconds while taking several hours to prove optimality. We apply our methods to a number of real data sets to demonstrate that our approach scales to the same problem sizes attempted by other methods, while providing superior solutions compared to those methods, explaining a higher portion of variance and permitting complete control over the desired sparsity. The software that was reviewed as part of this submission has been given the DOI (digital object identifier) https://doi.org/10.5281/zenodo.2027898.
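To make the problem statement concrete: SPCA is commonly written as maximizing $x^\top \Sigma x$ subject to $\|x\|_2 = 1$ and $\|x\|_0 \le k$, where $\Sigma$ is the $n \times n$ covariance matrix. The sketch below is not the Optimal-SPCA branch-and-bound algorithm from the paper. It is a brute-force enumeration over all supports of size $k$, usable only for very small $n$, and is meant to illustrate what a certifiably optimal solution is. The function name `brute_force_spca` and the random test data are assumptions introduced for illustration.

```python
from itertools import combinations

import numpy as np


def brute_force_spca(sigma, k):
    """Exhaustively solve max x' Sigma x s.t. ||x||_2 = 1, ||x||_0 <= k.

    Hypothetical helper for illustration only: it enumerates every support
    of size k, so its cost grows combinatorially with n.
    """
    n = sigma.shape[0]
    best_value, best_x = -np.inf, None
    for support in combinations(range(n), k):
        idx = np.array(support)
        # For a fixed support, the best unit vector is the leading
        # eigenvector of the corresponding principal submatrix.
        sub = sigma[np.ix_(idx, idx)]
        eigvals, eigvecs = np.linalg.eigh(sub)  # eigenvalues in ascending order
        if eigvals[-1] > best_value:
            best_value = eigvals[-1]
            best_x = np.zeros(n)
            best_x[idx] = eigvecs[:, -1]
    return best_value, best_x


# Small synthetic example (assumed data, not from the paper).
rng = np.random.default_rng(0)
data = rng.standard_normal((50, 8))
sigma = np.cov(data, rowvar=False)

value, x = brute_force_spca(sigma, k=3)
print("explained variance:", value)
print("support:", np.flatnonzero(x))
```

Because a covariance matrix is positive semidefinite, enlarging a support never lowers the leading eigenvalue, so checking supports of exactly size $k$ is enough here. A branch-and-bound method reaches the same optimum while pruning most of these supports through bounds, which is what makes sizes beyond toy problems tractable.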
Date issued
2019-01-01
Department
Massachusetts Institute of Technology. Operations Research Center
Publisher
Springer Berlin Heidelberg