DSpace@MIT


Low-cost Agents with Language Perception and Dynamic Inference

Author(s)
Pan, Bowen
Advisor
Oliva, Aude
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Designing efficient artificial intelligence agents presents significant challenges, particularly in terms of learning and inference costs. Traditional agents often suffer from high learning expenses due to their limited ability to generalize across diverse tasks and environments. Recent advances in large language models (LLMs) have shown strong generalization capabilities by leveraging high-level abstractions of the world through language. In this thesis, we propose leveraging language as a perceptual representation to enable LLM-based agents to perform vision-language navigation tasks with reduced data collection costs. We demonstrate that language not only facilitates the generation of efficient synthetic data but also serves as a bridge to minimize domain gaps between different environments. However, transformer-based agents are burdened with high inference costs, especially when handling long-horizon visual content. To mitigate this, we introduce two strategies: (1) reducing visual input redundancy through dynamic token selection, and (2) accelerating model inference using a memory-efficient Mixture of Experts (MoE) architecture. Together, these approaches offer a robust framework for enhancing both learning and inference efficiency in LLM agents.
Date issued
2024-09
URI
https://hdl.handle.net/1721.1/158499
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
