| dc.contributor.advisor | Sitzmann, Vincent | |
| dc.contributor.author | Fang, David S. | |
| dc.date.accessioned | 2024-09-24T18:25:11Z | |
| dc.date.available | 2024-09-24T18:25:11Z | |
| dc.date.issued | 2024-05 | |
| dc.date.submitted | 2024-07-11T14:37:43.379Z | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/156987 | |
| dc.description.abstract | Unsupervised learning for images and videos is important for many applications in computer vision. While supervised methods usually achieve the best performance, the amount of data curation and labeling that supervised datasets require makes them difficult to scale. Unsupervised learning, on the other hand, is more scalable and generalizable and requires far less data curation, but it is harder because it lacks a clear target objective. In this thesis, we propose two distinct lines of unsupervised learning work with generative applications: 1) BlobGSN and 2) optical flow estimation and flow generation with diffusion models. BlobGSN explores the unsupervised learning of spatially disentangled mid-level latent representations for 3D scenes in a generative context. Within this generative framework, we show that BlobGSN facilitates novel scene generation and editing. In a different vein, current state-of-the-art optical flow models rely on collecting ground truth data for sequences of frames in videos. Unsupervised learning of optical flow, which requires no ground truth data, could in principle leverage any publicly available video data for training. We explore different frameworks for unsupervised optical flow learning that address challenges such as photometric error, occlusion handling, and flow smoothness. Additionally, we propose a generative framework, trainable in an unsupervised manner, for generating optical flow from a single frame. | |
| dc.publisher | Massachusetts Institute of Technology | |
| dc.rights | In Copyright - Educational Use Permitted | |
| dc.rights | Copyright retained by author(s) | |
| dc.rights.uri | https://rightsstatements.org/page/InC-EDU/1.0/ | |
| dc.title | Unsupervised Learning for Generative Scene Editing and Motion | |
| dc.type | Thesis | |
| dc.description.degree | M.Eng. | |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
| mit.thesis.degree | Master | |
| thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |