Graph Representation Learning for Drug Discovery

Author(s)

Jin, Wengong

Advisor

Barzilay, Regina

Jaakkola, Tommi S.

Terms of use

In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/

Abstract

Drug discovery is an expensive and labor-intensive process, typically taking 10–15 years. The goal of this thesis is to substantially accelerate this process by developing machine learning (ML) algorithms for three key steps in the drug discovery pipeline. First, we develop better property predictors that enable us to effectively navigate known chemical space. The main challenge is to learn a predictor from a small, biased assay that generalizes to a much broader chemical space. We address this challenge with a new domain generalization method called counterfactual consistency regularization, which seeks to eliminate spurious correlations in biological assays. Second, we extend property prediction to combinations of molecules, enabling us to screen for and discover synergistic drug therapies. Direct experimental data on combinations are extremely limited. To counter this limitation, we build more biological structure (drug-target interaction) into the models, both to leverage heterogeneous single-compound assays and to provide a mechanism for assessing drug combinations through competitive binding to such targets. Third, we extend the search for new drugs beyond known chemical matter by developing deep generative models that can realize novel compounds with better characteristics. To this end, we propose hierarchical graph generative models that make use of larger structural building blocks derived from either tree decomposition of molecular graphs or molecular rationales explaining the outcome of property predictors. Lastly, we demonstrate how these techniques were used to discover novel antibiotics and COVID-19 antiviral drug combinations. These discoveries highlight the significant impact that deep learning can have on drug discovery by decreasing its time and cost.

Date issued

2021-09

URI

https://hdl.handle.net/1721.1/139909

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Doctoral Theses