MIT Libraries logoDSpace@MIT

MIT
View Item 
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
  • DSpace@MIT Home
  • MIT Libraries
  • MIT Theses
  • Graduate Theses
  • View Item
JavaScript is disabled for your browser. Some features of this site may not work without it.

Last-Meter Delivery: Solving the Unattended Delivery Challenge from Streets to Doorsteps

Author(s)
Xiao, Wen-Xin
Thumbnail
DownloadThesis PDF (5.062Mb)
Advisor
Larson, Kent
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Metadata
Show full item record
Abstract
The rise of e-commerce has led to a surge in package deliveries, resulting in the proliferation of unattended delivery methods to address the "last-meter" problem – the challenge of delivering packages from the roadside or sidewalk to the customer's front door. This thesis proposes a methodology for implementing Large Language Model (LLM), and Vision Language Model (VLM) to enable delivery robots to identify the final delivery target and navigate the complex terrain from the curb to the front door. The proposed solution aims to enhance the autonomy and safety of last-mile delivery systems, addressing the "last-meter" challenge and improving the customer experience. This thesis presents a comprehensive overview of the last-meter delivery concept, aiming to bridge the gap between the roadside/sidewalk and the customer's front door. It begins by introducing the significance of last-meter delivery in the growing e-commerce industry and the challenges posed by unattended deliveries. The thesis then reviews the existing literature on autonomous and unmanned delivery systems, multimodal delivery approaches, and the application of large language models and vision language models in robotics. This research identifies the advancements and gaps in the field that the proposed methodology aims to address. The thesis primarily focuses on leveraging Large Language Models, the Segment Anything Model, and the open-source Florence-2 vision foundation model to enable the transmission of customers' delivery instructions to the final delivery target in the context of last-meter delivery. It outlines the methodology for data preparation, object detection and labeling, as well as the integration of Large Language Models to handle customer instructions and coordinate delivery target. It also describes the experimental design and methodologies employed to validate the effectiveness of the proposed system. This includes the use of a last-meter dataset and the evaluation of last-meter scene and target coordinate identification. The thesis concludes by summarizing the key findings and contributions, discussing the broader implications of the proposed methodology, and suggesting directions for future work, such as enhancing system robustness and scalability. KEYWORDS: Last-Mile Delivery, last-meter Delivery, Large Language Models (LLM), Vision Language Models (VLM), Robotics, Segment Anything Model (SAM), Open-Vocabulary Object Detection (OVD).
Date issued
2024-09
URI
https://hdl.handle.net/1721.1/157723
Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses

Browse

All of DSpaceCommunities & CollectionsBy Issue DateAuthorsTitlesSubjectsThis CollectionBy Issue DateAuthorsTitlesSubjects

My Account

Login

Statistics

OA StatisticsStatistics by CountryStatistics by Department
MIT Libraries
PrivacyPermissionsAccessibilityContact us
MIT
Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.