MIT Libraries, DSpace@MIT

Machine Aided Biological Discovery and Design

Author(s)
Saksena, Sachit Dinesh
Download
Thesis PDF (22.83Mb)
Advisor
Gifford, David K.
Terms of use
In Copyright - Educational Use Permitted. Copyright MIT. http://rightsstatements.org/page/InC-EDU/1.0/
Abstract
Advances in biotechnology and the life sciences are primarily driven by biologists conducting rigorous experimentation. However, biology is often too complex, with intractable combinatorial search spaces and functional landscapes, to comprehensively explore, understand, and engineer via iterative biological experimentation. Next-generation sequencing technologies have made it possible to measure biology at high throughput, giving observational insight into these complexities. Further, in recent years, it has become possible both to manipulate biological systems with fine-grained control and to directly synthesize large libraries of DNA molecules with specified sequences, providing an unprecedented ability to engineer biology.

We explore the thesis that computational methods built with experimental considerations in mind and trained on carefully selected high-throughput experimental data can drive advances in the life sciences by making accurate predictions, which can then be used to iteratively generate hypotheses and design biological sequences for further experimental validation. To test this thesis, we introduce and apply computational approaches for modeling cellular differentiation trajectories, identifying nonspecific antibodies, and designing diverse libraries of biological sequences that reflect desired objectives.

First, we introduce a generative machine learning model for inferring cellular developmental landscapes from cross-sectional sequencing of in vitro differentiation time series. We validate this model against ground-truth lineage tracing experiments, and we show its ability to conduct in silico simulations of cellular differentiation trajectories under perturbations. Next, we present a computational framework that uses sequencing data from therapeutic discovery campaigns to identify nonspecific antibody therapeutics in large candidate pools. We show that this approach bypasses and outperforms costly combinatorial affinity selection experiments and allows pairwise nonspecificity to be identified from single-target selection data alone. Finally, we introduce an algorithm for the rational design of high-diversity synthetic antibody libraries using machine learning models and stochastic optimization. We show how this algorithm can be used to develop large libraries optimized for targets or developability characteristics, leading to more promising candidates from affinity selection.
Date issued
2022-09
URI
https://hdl.handle.net/1721.1/147252
Department
Massachusetts Institute of Technology. Computational and Systems Biology Program
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses