DSpace@MIT


Multimodal Graphical User Interface for 3D Model Fabrication Through Generative AI

Author(s)
Báez Alicea, Isabel
Advisor
Mueller, Stefanie
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
In recent years, three-dimensional model generation and manipulation through generative AI has seen significant developments. Current projects enable the generation of three-dimensional assets from natural language prompts and input images, as well as functionality-aware model manipulation through mesh segmentation and categorization. However, all these workflows lack a coherent, unified platform that caters to users’ needs and each method’s technologies. Programs that rely on terminal-based commands lack the graphics needed for model interactions, and plugin extensions for 3D modeling applications are unintuitive and hard to extend for new functionalities. Additionally, both approaches require users to have prior computer engineering and/or 3D graphics knowledge. For this thesis, I propose the creation of a web-based, multimodal graphical user interface that consolidates all these different technologies in a single platform. By supporting model stylization and model generation (both from text prompts and input images), users can utilize combined workflows and expand the range of output possibilities for 3D asset creation. Other features in our interface include model uploading, saving, and downloading to enable a continuous stream of work on a single 3D asset. Apart from all this, we expand the current capabilities of existing image-to-3D generation programs by enabling users to combine up to six images together and create a merged 3D object. Each of these images corresponds to a view angle from which the outputted mesh will be built.
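As an illustration of the multi-image input described in the abstract, the following minimal sketch shows one way a request pairing images with view angles might be represented. It is not taken from the thesis: the class name `MultiViewRequest`, its methods, and the six view names are hypothetical, and the abstract states only that up to six images can be combined, one per view angle.

```python
from dataclasses import dataclass, field

# Assumption: six axis-aligned views. The abstract says only "up to six
# images", each tied to a view angle; these names are illustrative.
ALLOWED_VIEWS = ("front", "back", "left", "right", "top", "bottom")


@dataclass
class MultiViewRequest:
    """Hypothetical container holding at most one image path per view."""

    images: dict = field(default_factory=dict)

    def add_image(self, view: str, path: str) -> None:
        # Reject views outside the assumed set and duplicate views, which
        # together cap the request at six images.
        if view not in ALLOWED_VIEWS:
            raise ValueError(f"unknown view: {view!r}")
        if view in self.images:
            raise ValueError(f"view {view!r} already has an image")
        self.images[view] = path

    def ordered_views(self) -> list:
        """Return (view, path) pairs in a fixed, reproducible view order."""
        return [(v, self.images[v]) for v in ALLOWED_VIEWS if v in self.images]


request = MultiViewRequest()
request.add_image("top", "chair_top.png")
request.add_image("front", "chair_front.png")
print(request.ordered_views())
```

Keying images by view name, rather than by upload order, keeps each image tied to the angle it is meant to represent regardless of the order in which a user supplies them.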
Date issued
2025-02
URI
https://hdl.handle.net/1721.1/159092
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
