Methods for Generalization Under Distribution Shift

Netanyahu, Aviv

Author(s)

Netanyahu, Aviv

DownloadThesis PDF (13.49Mb)

Advisor

Agrawal, Pulkit

Terms of use

In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/

Metadata

Show full item record

Abstract

Machine learning systems have achieved remarkable performance in tasks where test data closely resembles the training distribution. However, real-world applications often require systems capable of handling more challenging situations -- specifically, adapting to new tasks and extrapolating to data points outside the distribution of the training set. The current paradigm for handling distribution shifts is collecting and training models on large datasets. This work offers two more principled frameworks that enable machine learning models to generalize effectively to out-of-distribution scenarios without sacrificing the power of modern overparameterized models. The first framework converts an out-of-support zero-shot generalization problem into an out-of-combination problem via a transductive reparameterization, which is possible under low-rank style conditions. We explore how this idea can be applied to domains like robotics, where the environment is changing, and materials and molecular design, where predicting properties of materials or molecules outside of known ranges is crucial to driving more efficient materials discovery. The second framework focuses on few-shot task learning, which involves agents learning new tasks from minimal data and applying them to new environments. We formulate the problem of few-shot task learning as Few-Shot Task Learning through Inverse Generative Modeling, which allows us to leverage the power of neural generative models pretrained on a set of base tasks. We adapt a method for efficient concept learning to few-shot task learning based on our formulation and rapidly learn new tasks with only a few examples, enabling task execution from autonomous driving to real-world robotic manipulation tasks in novel settings without the need for extensive retraining.

Date issued

2025-05

URI

https://hdl.handle.net/1721.1/164031

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses