DSpace@MIT (MIT Libraries)

Molecular Graph Representation Learning and Generation for Drug Discovery

Author(s)
Chen, Benson
Download: Thesis PDF (3.842Mb)
Advisor
Barzilay, Regina
Jaakkola, Tommi
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Machine learning methods have become pervasive in drug discovery, enabling more powerful and efficient models. Before deep models, molecular modeling was driven largely by expert knowledge, but such hand-engineered rules prove insufficient to represent the complexities of the molecular landscape. Deep learning models are powerful because they learn the important statistical features of a problem, but only when given the correct inductive biases. We tackle this problem in the context of two molecular tasks: representation and generation. The success of deep learning is rooted in its ability to map the input domain into a meaningful representation space. This is especially pertinent for molecular problems, where the “right” relations between molecules are nuanced and complex. The first part of this thesis focuses on molecular representation, in particular property and reaction prediction. Here, we explore a transformer-style architecture for molecular representation, providing new tools for applying these models to graph-structured objects. Moving away from the traditional graph neural network paradigm, we demonstrate the efficacy of prototype networks for molecular representation, which allow us to reason over learned property prototypes of molecules. Lastly, we examine molecular representations in the context of improving reaction prediction. The second part of this thesis focuses on molecular generation, which is crucial in drug discovery as a means of proposing promising drug candidates. Here, we develop a new method for multi-property molecule generation by first learning a distributional vocabulary over molecular fragments. Then, using this vocabulary, we survey efficient exploration methods over the chemical space.
Date issued
2022-02
URI
https://hdl.handle.net/1721.1/143362
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses
