Blind regression : nonparametric regression for latent variable models via collaborative filtering
Author(s)
Song, Dogyoon
Alternative title
Nonparametric regression for latent variable models via collaborative filtering
Other Contributors
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.
Advisor
Devavrat Shah.
Abstract
Recommender systems are tools that suggest the items most likely to interest a particular user; they have become ubiquitous because they are central to many decision-making processes. We introduce blind regression, a framework motivated by matrix completion for recommender systems: given m users, n items, and a subset of user-item ratings, the goal is to predict the unobserved ratings, i.e., to complete the partially observed matrix. We posit that user u and movie i have features $x_1(u)$ and $x_2(i)$, respectively, and that the corresponding rating $y(u, i)$ is a noisy measurement of $f(x_1(u), x_2(i))$ for some unknown function $f$. In contrast to classical regression, the features $x = (x_1(u), x_2(i))$ are latent (not observed), which makes standard regression methods difficult to apply. We suggest a two-step procedure to overcome this challenge: 1) estimate distances between latent variables, and then 2) apply nonparametric regression. Applying this framework to matrix completion, we provide a prediction algorithm that is consistent for all Lipschitz functions. In fact, the analysis naturally leads to a variant of collaborative filtering, offering insight into the widespread success of collaborative filtering. Assuming each entry is revealed independently with probability $p = \max(m^{-1+\delta}, n^{-1/2+\delta})$ for $\delta > 0$, we prove that the expected fraction of our estimates with error greater than $\epsilon$ is less than $\gamma^2/\epsilon^2$, plus a polynomially decaying term, where $\gamma^2$ is the variance of the noise. Experiments with the MovieLens and Netflix datasets suggest that our algorithm provides principled improvements over basic collaborative filtering and is competitive with matrix factorization methods. The algorithm and analysis naturally extend to higher-order tensor completion by simply flattening the tensor into a matrix. We show that our simple and principled approach is competitive with state-of-the-art tensor completion algorithms when applied to image inpainting data. We conclude the thesis by proposing several related directions for future research.
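The two-step procedure described in the abstract (estimate a distance between latent variables, then apply nonparametric regression) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm as published; it is one plausible instantiation under stated assumptions. The function names `row_distance` and `predict`, the parameters `k` and `min_overlap`, the use of the sample variance of rating differences as a distance proxy, and the mean-offset correction are all choices made here for illustration.

```python
import numpy as np


def row_distance(Y, mask, u, v, min_overlap=2):
    """Step 1 (assumed proxy): compare users u and v on commonly rated items.

    Returns (distance, offset), where distance is the sample variance of the
    rating differences and offset is their mean. Returns (inf, 0.0) when the
    two users share fewer than `min_overlap` rated items.
    """
    common = mask[u] & mask[v]
    if common.sum() < min_overlap:
        return np.inf, 0.0
    diff = Y[u, common] - Y[v, common]
    return float(np.var(diff, ddof=1)), float(np.mean(diff))


def predict(Y, mask, u, i, k=3, min_overlap=2):
    """Step 2 (assumed form): nearest-neighbor regression for entry (u, i).

    Each user v who rated item i contributes the estimate Y[v, i] + offset,
    and the estimates of the k users closest to u are averaged.
    Returns NaN when no usable neighbor exists.
    """
    scored = []
    for v in range(Y.shape[0]):
        if v == u or not mask[v, i]:
            continue
        dist, offset = row_distance(Y, mask, u, v, min_overlap)
        if np.isfinite(dist):
            scored.append((dist, Y[v, i] + offset))
    if not scored:
        return float("nan")
    scored.sort(key=lambda pair: pair[0])
    return float(np.mean([estimate for _, estimate in scored[:k]]))
```

Here `Y` is an m-by-n NumPy array of ratings and `mask` is a boolean array of the same shape marking which entries are observed; values of `Y` at unobserved positions are never read. Because the sketch only looks at other users who rated the same item, it has the shape of user-based collaborative filtering, which is the connection the abstract points to.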
Description
Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2016. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 77-81).
Date issued
2016
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.