DSpace@MIT (MIT Libraries)

Hardware Acceleration for Real-Time Compression of 3D Gaussians

Author(s)
Kahler, Kailas B.
Advisor
Sze, Vivienne
Terms of use
Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) Copyright retained by author(s) https://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract
3D Gaussian Splatting (3DGS) is a technique for novel view synthesis, in which images of a scene from a target viewpoint are generated from images taken at other viewpoints. It has gained popularity for its reduced computational overhead, which yields faster training and rendering than other methods such as Neural Radiance Fields (NeRFs). Applications beyond novel view synthesis have also been explored, with monocular simultaneous localization and mapping (SLAM) in robotics being an emerging one. However, because of limited on-board battery capacity, the computer hardware in small robots is far less capable than the high-powered GPUs on which the 3DGS algorithm was originally developed: it offers less compute, less memory capacity, and less memory bandwidth. While there has been work on specialized compute for the 3DGS rendering pipeline, memory remains an obstacle to deployment. The Gaussian map can occupy between 1 MB and 700 MB of memory, which is too large to store on-chip within micro-robots and large enough that moving Gaussians from memory to compute can dominate power consumption. Prior work has proposed algorithms for compressing Gaussian representations, but these cannot yet run in real time on the hardware present in these robots, as SLAM would require. This thesis therefore explores the limits of these compression methods on current hardware, resulting in an optimized CUDA implementation that achieves more than 100× the throughput of prior work and real-time operation on workstation-class hardware. After concluding that custom hardware is necessary for further improvement, the thesis also presents a hardware accelerator that approaches real-time compression performance within a reduced power budget. Running at 100 MHz on an AMD UltraScale+ FPGA, it outperforms an NVIDIA Jetson Orin Nano with 64% higher throughput while using 1/16th of the multipliers and drawing 38% of the power.
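The relative figures at the end of the abstract can be combined into rough efficiency ratios. The sketch below is only arithmetic on the numbers quoted above. It assumes the throughput, power, and multiplier figures all refer to the same workload and the same Jetson Orin Nano baseline. The derived ratios are not stated in the abstract, and the function and constant names are hypothetical.

```python
# Relative figures quoted in the abstract (FPGA accelerator vs. Jetson Orin Nano).
# Assumption: all three ratios were measured against the same baseline and workload.
THROUGHPUT_RATIO = 1.64     # "64% higher throughput"
POWER_RATIO = 0.38          # "drawing 38% of the power"
MULTIPLIER_RATIO = 1 / 16   # "1/16th of the multipliers"


def relative_efficiency(throughput_ratio: float, resource_ratio: float) -> float:
    """Throughput per unit of a resource, expressed relative to the baseline."""
    if resource_ratio <= 0:
        raise ValueError("resource_ratio must be positive")
    return throughput_ratio / resource_ratio


# Derived ratios (not quoted in the abstract; computed from the figures above).
throughput_per_watt = relative_efficiency(THROUGHPUT_RATIO, POWER_RATIO)
throughput_per_multiplier = relative_efficiency(THROUGHPUT_RATIO, MULTIPLIER_RATIO)

print(f"throughput per watt:       {throughput_per_watt:.2f}x baseline")
print(f"throughput per multiplier: {throughput_per_multiplier:.2f}x baseline")
```

Under these assumptions the accelerator delivers roughly 4.3 times the baseline throughput per watt. The per-multiplier ratio is only a coarse proxy for area efficiency, since it ignores memory and other logic.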
Date issued
2025-05
URI
https://hdl.handle.net/1721.1/162745
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses