Abstract:
Can one compute a low-dimensional representation of any given data by looking only at a small sample of it, chosen cleverly on the fly? Motivated by this question, we consider the problem of low-rank matrix approximation: given a matrix A..., one wants to compute a rank-k matrix (where k << min{m, n}) nearest to A in the Frobenius norm (also known as the Hilbert-Schmidt norm). We prove that, using a sample of roughly O(k/[epsilon]) rows of A, one can compute, with high probability, a (1 + [epsilon])-approximation to the nearest rank-k matrix. This gives an algorithm for low-rank approximation with an improved error guarantee (compared to the additive [epsilon]... guarantee known earlier from the work of Frieze, Kannan, and Vempala) and running time O(Mk/[epsilon]), where M is the number of non-zero entries of A. The proof is based on two sampling techniques, called adaptive sampling and volume sampling, together with some linear algebraic tools. Low-rank matrix approximation under the Frobenius norm is equivalent to the problem of finding a low-dimensional subspace that minimizes the sum of squared distances to given points. The general subspace approximation problem asks one to find a low-dimensional subspace that minimizes the sum of p-th powers of distances (for p > 1) to given points. We generalize our sampling techniques and prove similar sampling-based dimension reduction results for subspace approximation. The proof in this case, however, is geometric.
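
To illustrate the row-sampling idea described in the abstract, the following is a minimal sketch, not the thesis's algorithm: it covers only an adaptive-sampling step (rows drawn with probability proportional to their squared residual norm) and omits volume sampling entirely. The function names, the sample size s, the number of rounds, and the tolerances are all hypothetical choices made for this example, and the sketch makes no claim to achieve the (1 + [epsilon]) guarantee or the stated running time.

```python
import numpy as np


def row_space_basis(rows, tol=1e-10):
    """Return an orthonormal basis (as rows) for the span of the given rows."""
    if rows.shape[0] == 0:
        return np.zeros((0, rows.shape[1]))
    _, sv, vt = np.linalg.svd(rows, full_matrices=False)
    if sv[0] == 0.0:
        return np.zeros((0, rows.shape[1]))
    # Keep only directions with a numerically significant singular value.
    return vt[sv > tol * sv[0]]


def adaptive_row_sample(A, s, rounds, rng):
    """Pick row indices of A in rounds (illustrative parameters, not the thesis's).

    In each round, s rows are drawn with probability proportional to the
    squared norm of their residual after projecting onto the span of the
    rows chosen so far. The first round is plain squared-length sampling.
    """
    total = float((A ** 2).sum())
    chosen = np.array([], dtype=int)
    residual = A.copy()
    for _ in range(rounds):
        weights = (residual ** 2).sum(axis=1)
        if weights.sum() <= 1e-12 * total:
            break  # the chosen rows already span (numerically) all of A
        picks = rng.choice(A.shape[0], size=s, replace=True,
                           p=weights / weights.sum())
        chosen = np.union1d(chosen, picks)
        V = row_space_basis(A[chosen])
        residual = A - (A @ V.T) @ V
    return chosen


def low_rank_from_rows(A, chosen, k):
    """Best rank-<=k approximation to A whose rows lie in span(A[chosen])."""
    V = row_space_basis(A[chosen])
    if V.shape[0] == 0:
        return np.zeros_like(A)
    coords = A @ V.T  # coordinates of the projected rows in the basis V
    U, sv, Wt = np.linalg.svd(coords, full_matrices=False)
    k = min(k, sv.size)
    return (U[:, :k] * sv[:k]) @ Wt[:k] @ V


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    m, n, k = 200, 100, 5
    # Synthetic test matrix: rank-k signal plus small noise (an assumption).
    A = rng.standard_normal((m, k)) @ rng.standard_normal((k, n))
    A += 0.01 * rng.standard_normal((m, n))
    rows = adaptive_row_sample(A, s=2 * k, rounds=3, rng=rng)
    approx = low_rank_from_rows(A, rows, k)
    best = np.sqrt((np.linalg.svd(A, compute_uv=False)[k:] ** 2).sum())
    print("rows sampled:", rows.size)
    print("error ratio vs. best rank-k:", np.linalg.norm(A - approx) / best)
```

The printed ratio compares the Frobenius error of the sampled approximation with that of the best rank-k matrix obtained from a full SVD; it is always at least 1, and how close it gets to 1 depends on the sample size and the number of rounds.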

Description:
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Mathematics, 2007.; Includes bibliographical references (p. 51-52).