Graph Representation Learning for Drug Discovery
Author(s)
Jin, Wengong
Advisor
Barzilay, Regina
Jaakkola, Tommi S.
Abstract
Drug discovery is an expensive and labor-intensive process, typically taking 10–15 years. The goal of this thesis is to substantially accelerate this process by developing machine learning (ML) algorithms for three key steps in the drug discovery pipeline. First, we develop better property predictors that enable us to effectively navigate known chemical space. The main challenge is to learn a predictor from a small, biased assay and generalize to a much broader chemical space. We address this challenge with a new domain generalization method called counterfactual consistency regularization, which seeks to eliminate spurious correlations in biological assays. Second, we extend property prediction capabilities to combinations of molecules, enabling us to screen and discover synergistic drug therapies. Direct experimental data about combinations are extremely limited. To counter this limitation, we build more biological structure (drug-target interaction) into the models, both to leverage heterogeneous single-compound assays and to provide a mechanism for assessing drug combinations through competitive binding to such targets. Third, we extend the search for new drugs beyond known chemical matter by developing deep generative models that can realize novel compounds with better characteristics. To this end, we propose hierarchical graph generative models that make use of larger structural building blocks derived from either tree decomposition of molecular graphs or molecular rationales explaining the outcome of property predictors. Lastly, we demonstrate how these techniques led to the discovery of novel antibiotics and COVID-19 antiviral drug combinations. These discoveries highlight the significant impact that deep learning can have on drug discovery by decreasing its time and cost.
Date issued
2021-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology