DSpace@MIT, MIT Libraries


Scaling contrastive learning batch size by two orders of magnitude

Author(s)
Tian, Betsy
Advisor
Freeman, William
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Contrastive learning has emerged as a powerful framework for unsupervised representation learning, allowing models to learn by maximizing agreement between related samples and distinguishing dissimilar ones. However, contrastive learning frameworks are fundamentally limited by the number of negative pairs a model can observe, and memory-intensive backbones constrain practical batch sizes. We introduce a three-phase, adapter-augmented training framework that scales contrastive batch sizes by two orders of magnitude, surpassing previous state-of-the-art learners in both accuracy and speed. First, we co-train the backbone and adapter on small batches to establish a strong initialization. Next, we freeze the backbone and train the adapter alone with very large batches, exposing it to an enlarged negative pool. Finally, we transfer large-batch adapter gradients back into the backbone via segmented backpropagation. We evaluate our method on the PlacesAudio dataset and show promising results for boosting retrieval performance at each phase. By exposing the model to substantially more negatives per effective batch, we achieve higher accuracy at greater speed than optimizer-stepping baselines. Ultimately, this approach, which scales batch size by hundreds of times, can be integrated into any contrastive learning framework for more robust representation learning and more abundant negative sampling.
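As a rough illustration of the third phase, the following pure-Python sketch backpropagates a large-batch contrastive loss through a backbone one segment at a time. It is a toy under stated assumptions, not the thesis's implementation: the abstract does not define segmented backpropagation, so the two-stage scheme here (embed the whole batch, take gradients with respect to the embeddings, then revisit the backbone in small segments) is only one plausible reading. The one-parameter tanh backbone, the InfoNCE-style loss, the temperature TAU, the omission of the adapter, the sample data, and all function names are hypothetical choices made for illustration.

```python
import math

TAU = 0.5  # softmax temperature (assumed value, not from the thesis)

def backbone(theta, x):
    """Toy one-parameter 'backbone' (hypothetical): elementwise tanh(theta * x)."""
    return [math.tanh(theta * v) for v in x]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def softmax_rows(z, u):
    """Row-wise softmax of the similarity matrix s[i][j] = z_i . u_j / TAU."""
    rows = []
    for zi in z:
        s = [dot(zi, uj) / TAU for uj in u]
        m = max(s)  # subtract the row maximum for numerical stability
        e = [math.exp(v - m) for v in s]
        total = sum(e)
        rows.append([v / total for v in e])
    return rows

def full_loss(theta, xs, ys):
    """InfoNCE-style loss over the whole batch; each anchor sees len(ys) - 1 negatives."""
    z = [backbone(theta, x) for x in xs]
    u = [backbone(theta, y) for y in ys]
    p = softmax_rows(z, u)
    return -sum(math.log(p[i][i]) for i in range(len(z))) / len(z)

def segmented_grad(theta, xs, ys, segment_size):
    """d(full_loss)/d(theta), accumulated one backbone segment at a time."""
    n = len(xs)
    # Stage 1: embed the whole batch; only the embeddings are kept.
    z = [backbone(theta, x) for x in xs]
    u = [backbone(theta, y) for y in ys]
    p = softmax_rows(z, u)
    dim = len(z[0])
    # Gradient of the loss w.r.t. each embedding, using
    # dL/ds[i][j] = (p[i][j] - [i == j]) / n and s[i][j] = z_i . u_j / TAU.
    gz = [[0.0] * dim for _ in range(n)]
    gu = [[0.0] * dim for _ in range(n)]
    for i in range(n):
        for j in range(n):
            c = (p[i][j] - (1.0 if i == j else 0.0)) / (n * TAU)
            for k in range(dim):
                gz[i][k] += c * u[j][k]
                gu[j][k] += c * z[i][k]
    # Stage 2: walk the batch in segments and chain the embedding gradients
    # through the backbone. A real network would re-run the segment's forward
    # pass here; for tanh the derivative follows from the cached output.
    grad = 0.0
    for start in range(0, n, segment_size):
        for i in range(start, min(start + segment_size, n)):
            for inp, emb, g in ((xs[i], z[i], gz[i]), (ys[i], u[i], gu[i])):
                # d tanh(theta * v) / d theta = (1 - tanh(theta * v)^2) * v
                grad += sum(gk * (1.0 - ek * ek) * vk
                            for gk, ek, vk in zip(g, emb, inp))
    return grad

def numeric_grad(theta, xs, ys, eps=1e-6):
    """Central finite difference of the full-batch loss, used only as a check."""
    return (full_loss(theta + eps, xs, ys) - full_loss(theta - eps, xs, ys)) / (2 * eps)

# Made-up paired inputs: six pairs of two-dimensional vectors.
xs = [[0.1 * i + 0.2, 0.3 - 0.05 * i] for i in range(6)]
ys = [[0.15 * i + 0.1, 0.25 - 0.1 * i] for i in range(6)]
theta = 0.7

grad_by_twos = segmented_grad(theta, xs, ys, segment_size=2)
grad_all_at_once = segmented_grad(theta, xs, ys, segment_size=6)
grad_reference = numeric_grad(theta, xs, ys)
print(grad_by_twos, grad_all_at_once, grad_reference)
```

Because the per-segment contributions simply add up, the result in this sketch does not depend on `segment_size`; only the number of samples whose backbone activations would need to be held at once changes, while every anchor is still scored against the full batch of negatives.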
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162953
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.