Towards General-purpose Vision via Multiview Contrastive Learning

Tian, Yonglong

dc.contributor.advisor	Isola, Phillip
dc.contributor.author	Tian, Yonglong
dc.date.accessioned	2023-03-31T14:41:08Z
dc.date.available	2023-03-31T14:41:08Z
dc.date.issued	2023-02
dc.date.submitted	2023-02-28T14:39:16.880Z
dc.identifier.uri	https://hdl.handle.net/1721.1/150229
dc.description.abstract	Representation learning plays a key role in building robust, general-purpose vision learners, and is a long-standing problem. It becomes increasingly interesting with the continuing explosion of data in our era. However, most previous approaches rely on specific design strategies that do not generalize. This thesis instead proposes and studies multiview contrastive learning, which is based on a simple mathematical principle: discriminating between samples from the joint distribution and samples from the product of marginals. We first introduce the general framework of multiview contrastive learning (MCL). We demonstrate that this simple framework can handle a variety of representation learning problems and often advances the state of the art. We then study the role of view selection in multiview contrastive learning from an information-theoretic point of view, and arrive at an "InfoMin" principle, which connects to minimal sufficient statistics and information bottlenecks. This principle is further demonstrated by supervised contrastive learning, which rivals or even beats supervised cross-entropy learning on standard image classification benchmarks. In the last part, we discuss other applications of multiview contrastive learning (such as knowledge distillation) and improvements to it (e.g., how to improve its efficiency on uncurated data).
dc.publisher	Massachusetts Institute of Technology
dc.rights	In Copyright - Educational Use Permitted
dc.rights	Copyright MIT
dc.rights.uri	http://rightsstatements.org/page/InC-EDU/1.0/
dc.title	Towards General-purpose Vision via Multiview Contrastive Learning
dc.type	Thesis
dc.description.degree	Ph.D.
dc.contributor.department	Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree	Doctoral
thesis.degree.name	Doctor of Philosophy

Files in this item

Name: Tian-yonglong-PhD-EECS-2023-th ...
Size: 16.24Mb
Format: PDF
Description: Thesis PDF

This item appears in the following Collection(s)

Doctoral Theses
