MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Unsupervised Learning for Generative Scene Editing and Motion

Author(s)
Fang, David S.
Thumbnail
DownloadThesis PDF (13.69Mb)
Advisor
Sitzmann, Vincent
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
Unsupervised learning for images and videos is important for many applications in computer vision. While supervised methods usually have the best performance, the amount of data curation and labeling that supervised datasets require makes it difficult to scale. On the other hand, unsupervised learning is more scalable, generalizable, and requires much less data curation, but is harder because it lacks a clear target objective. In this thesis, we propose two distinct lines of unsupervised learning work with generative applications: 1) BlobGSN and 2) optical flow estimation and flow generation with diffusion models. BlobGSN explores the unsupervised learning of spatially disentangled mid-level latent representations for 3D scenes in a generative context. Within this generative framework, we show that BlobGSN facilitates novel scene generation and editing. In a different vein, current state-of-the-art optical flow learning models rely on ground truth data collection for sequences of frames in videos. Unsupervised learning of optical flow, which would not require ground truth data, could theoretically leverage any publicly available video data for training. We explore different frameworks for unsupervised optical flow learning to tackle different problems such as photometric error, occlusion handling, and flow smoothness. Additionally, we propose a generative framework for generating optical flow from a single frame that can be trained in an unsupervised manner.
Date issued
2024-05
URI
https://hdl.handle.net/1721.1/156987
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.