dc.contributor.advisor | Rakhlin, Alexander | |
dc.contributor.author | Kur, Gil | |
dc.date.accessioned | 2023-11-02T20:23:27Z | |
dc.date.available | 2023-11-02T20:23:27Z | |
dc.date.issued | 2023-09 | |
dc.date.submitted | 2023-09-21T14:26:07.968Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/152867 | |
dc.description.abstract | This dissertation investigates non-parametric regression over large function classes, specifically, non-Donsker classes. We will present the concept of non-Donsker classes and study the statistical performance of the Least Squares Estimator (LSE), which also serves as the Maximum Likelihood Estimator (MLE) under Gaussian noise, over these classes. (1) We demonstrate the minimax sub-optimality of the LSE in the non-Donsker regime, extending traditional findings of Birgé and Massart 93' and resolving a longstanding conjecture of Gardner, Markus and Milanfar 06'. (2) We reveal that in the non-Donsker regime, the sub-optimality of the LSE arises solely from its elevated bias error term (in terms of the bias and variance decomposition). (3) We introduce the first minimax optimal algorithm for multivariate convex regression with a polynomial runtime in the number of samples, showing that one can overcome the sub-optimality of the LSE in efficient runtime. (4) We study the minimal error of the LSE both in random and fixed design settings. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | In Copyright - Educational Use Permitted | |
dc.rights | Copyright retained by author(s) | |
dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
dc.title | On The Performance Of The Maximum Likelihood Over Large Models | |
dc.type | Thesis | |
dc.description.degree | Ph.D. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
dc.identifier.orcid | 0000-0001-7386-1686 | |
mit.thesis.degree | Doctoral | |
thesis.degree.name | Doctor of Philosophy | |