dc.contributor.advisor: Regina Barzilay. (en_US)
dc.contributor.author: Jin, Wengong (en_US)
dc.contributor.other: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science. (en_US)
dc.date.accessioned: 2018-09-17T14:50:56Z
dc.date.available: 2018-09-17T14:50:56Z
dc.date.copyright: 2018 (en_US)
dc.date.issued: 2018 (en_US)
dc.identifier.uri: http://hdl.handle.net/1721.1/117818
dc.description: Thesis: S.M., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2018. (en_US)
dc.description: This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. (en_US)
dc.description: Cataloged from student-submitted PDF version of thesis. (en_US)
dc.description: Includes bibliographical references (pages 49-53). (en_US)
dc.description.abstract: This thesis focuses on deep learning algorithms for learning continuous representations of molecular graphs, which are much more compact than traditional fingerprints. We demonstrate their better predictive performance on two tasks. First, we seek to automate the prediction of organic reaction outcomes. The previous solution uses reaction templates to limit the search space, but it suffers from coverage and efficiency issues due to its discrete nature. We propose a template-free approach that efficiently explores the space of product molecules by pinpointing the reaction center. The candidate products are scored by a Weisfeiler-Lehman Difference Network that models high-order interactions between changes occurring at nodes across the molecule. Our framework outperforms the top-performing template-based approach by a 10% margin while running orders of magnitude faster. Moreover, we demonstrate that the model's accuracy rivals the performance of domain experts. Second, we seek to automate the design of molecules based on specific chemical properties. Our primary contribution is the direct realization of molecular graphs from a continuous space. Our junction tree variational autoencoder generates molecular graphs in two phases: it first generates a tree-structured scaffold over chemical substructures, and then combines the substructures into a molecule with a graph message passing network. This approach allows us to incrementally expand molecules while maintaining chemical validity at every step. We evaluate our model on multiple tasks ranging from molecular generation to optimization. Across these tasks, our model outperforms previous state-of-the-art baselines by a significant margin. (en_US)
dc.description.statementofresponsibility: by Wengong Jin. (en_US)
dc.format.extent: 63 pages (en_US)
dc.language.iso: eng (en_US)
dc.publisher: Massachusetts Institute of Technology (en_US)
dc.rights: MIT theses are protected by copyright. They may be viewed, downloaded, or printed from this source but further reproduction or distribution in any format is prohibited without written permission. (en_US)
dc.rights.uri: http://dspace.mit.edu/handle/1721.1/7582 (en_US)
dc.subject: Electrical Engineering and Computer Science. (en_US)
dc.title: Neural graph representation learning with application to chemistry (en_US)
dc.type: Thesis (en_US)
dc.description.degree: S.M. (en_US)
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.oclc: 1051460676 (en_US)

