Show simple item record

dc.contributor.advisorVivienne Sze.en_US
dc.contributor.authorZhang, Zhengdong, Ph. D. Massachusetts Institute of Technology.en_US
dc.contributor.otherMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science.en_US
dc.date.accessioned2019-11-04T19:53:29Z
dc.date.available2019-11-04T19:53:29Z
dc.date.copyright2019en_US
dc.date.issued2019en_US
dc.identifier.urihttps://hdl.handle.net/1721.1/122691
dc.descriptionThis electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections.en_US
dc.descriptionThesis: Ph. D., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2019en_US
dc.descriptionCataloged from student-submitted PDF version of thesis.en_US
dc.descriptionIncludes bibliographical references (pages 211-221).en_US
dc.description.abstractAutonomous navigation algorithms are the backbone of many robotic systems, such as self-driving cars and drones. However, state-of-the-art autonomous navigation algorithms are computationally expensive, requiring powerful CPUs and GPUs to run in real time. As a result, it is prohibitively expensive to deploy them on miniature robots with limited onboard computational resources. To tackle this challenge, this thesis presents an algorithm-and-hardware co-design approach that yields energy-efficient algorithms optimized for dedicated hardware architectures. It covers the design of three essential modules of an autonomous navigation system: perception, localization, and exploration.en_US
dc.description.abstractCompared with previous research that considers either algorithmic improvements or hardware architecture optimizations, our approach leads to algorithms that not only have lower time and space complexity but also map efficiently to specialized hardware architectures, resulting in significantly improved energy efficiency and throughput. First, this thesis studies how to design an energy-efficient visual perception system using the deformable part models (DPM) based object detection algorithm. It describes an algorithm that enforces sparsity in the data stored on a chip, which reduces the memory requirement by 34% and lowers the cost of classification by 43%. Together with other hardware optimizations, this technique leads to an object detection chip that runs at 30 fps on 1920 x 1080 videos while consuming only 58.6 mW of power.en_US
dc.description.abstractSecond, this thesis describes a systematic way to explore algorithm-hardware design choices to build a low-power chip that performs visual inertial odometry (VIO) to localize a vehicle. Each of the components in a VIO pipeline has multiple algorithmic choices with different time and space complexity. However, some algorithms of lower time complexity can be more expensive when implemented on-chip. This thesis examines each of the design choices from both the algorithm's and the hardware's point of view and presents a design that consumes 24 mW of power while running at up to 90 fps and achieving near state-of-the-art localization accuracy. Third, this thesis presents an efficient information-theoretic mapping system for exploration. It features a novel algorithm called Fast computation of Shannon Mutual Information (FSMI) that computes the Shannon mutual information (MI) between perspective range measurements and the environment.en_US
dc.description.abstractThe FSMI algorithm features an analytic solution that avoids the expensive numerical integration required by the previous state-of-the-art algorithms, enabling FSMI to run three orders of magnitude faster in practice. We also present an extension of the FSMI algorithm to 3D mapping; the algorithm leverages the compression of a large 3D map using run-length encoding (RLE) and achieves 8x acceleration in a real-world exploration task. In addition, this thesis presents a hardware architecture designed for the FSMI algorithm. The design includes a novel memory banking method that increases the memory bandwidth so that multiple FSMI cores can run in parallel while maintaining high utilization. A novel arbiter is proposed to resolve the memory read conflicts between multiple cores within one clock cycle. The final design on an FPGA achieves more than 100x higher throughput compared with a CPU while consuming less than 1/10 of the power.en_US
dc.description.statementofresponsibilityby Zhengdong Zhang.en_US
dc.format.extent221 pagesen_US
dc.language.isoengen_US
dc.publisherMassachusetts Institute of Technologyen_US
dc.rightsMIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission.en_US
dc.rights.urihttp://dspace.mit.edu/handle/1721.1/7582en_US
dc.subjectElectrical Engineering and Computer Science.en_US
dc.titleEfficient computing for autonomous navigation using algorithm-and-hardware co-designen_US
dc.typeThesisen_US
dc.description.degreePh. D.en_US
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Scienceen_US
dc.identifier.oclc1124763338en_US
dc.description.collectionPh.D. Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Scienceen_US
dspace.imported2019-11-04T19:53:27Zen_US
mit.thesis.degreeDoctoralen_US
mit.thesis.departmentEECSen_US

