MIT Libraries logoDSpace@MIT

DNA sequence design of non-orthogonal binding networks, and application to DNA data storage

Author(s)
Berleant, Joseph Don
Advisor
Bathe, Mark
Terms of use
In Copyright - Educational Use Permitted Copyright retained by author(s) https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
DNA has proven itself a powerful tool in a diverse array of nanotechnology-related domains, including molecular computation, nanostructure fabrication, and data storage. Most DNA-based systems focus on using sets of DNA sequences that are orthogonal to each other, such that each DNA sequence has a dedicated binding partner, its complementary sequence. This design approach reduces the number of interactions that must be considered when predicting how a system will behave, at the cost of reducing the information-gathering ability of each molecular unit. Relatively little research has attempted to solve the problem of designing promiscuous, or non-orthogonal, DNA sequences, which are characterized by their ability to bind to several distinct partners with variable binding affinities. Yet there are many situations in which this type of dense interaction network can be useful. For example, in neural networks, a node will often take inputs from hundreds or thousands of upstream nodes, allowing it to condense large amounts of information into a single output value. While naturally occurring biological networks often make use of promiscuous binding behavior, the field of molecular computing currently lacks a general-purpose and efficient method for non-orthogonal DNA sequence design. In this thesis, I describe a novel, robust, and broadly applicable method for designing small or large sets of non-orthogonal DNA sequences. This method takes an arbitrary matrix of pairwise binding affinities, and attempts to design DNA sequences such that the differential binding affinity between any two pairs of sequences is proportional to the difference in the corresponding elements of the matrix. The key innovation of this method is the reformulation of the matrix via a binary embedding, which reduces the design specification to a set of binary strings that permit relatively straightforward sequence design. 
Not all matrices permit a binary embedding, so I consider three cases: when a binary embedding is known to exist, when its existence is unknown, and when it does not exist. When it exists, I show through both simulation and experiment that DNA sequences can be designed with high precision. When existence is unknown, I give novel conditions for determining existence via a weighted-graph representation of the matrix. Finally, when an exact binary embedding does not exist, I develop an alternative method using approximate binary embeddings. To demonstrate the power of this method, I apply it to the task of similarity searching in a large, simulated DNA database, where I show that it outperforms the existing state of the art. I hope that this work opens the door to further innovations in designing and applying non-orthogonal DNA sequences to DNA nanotechnology.
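The idea of a binary embedding can be illustrated with a small sketch. The abstract does not give the thesis's formal definition, so the code below assumes one plausible reading: each item is assigned a binary string so that the Hamming distance between two strings equals the corresponding matrix entry. The function names (`hamming`, `is_binary_embedding`, `find_binary_embedding`) and the brute-force search are hypothetical illustrations, not the author's method, and the search is only practical for very small matrices.

```python
from itertools import product


def hamming(a, b):
    """Number of positions at which two equal-length bit tuples differ."""
    return sum(x != y for x, y in zip(a, b))


def is_binary_embedding(strings, matrix):
    """Check that Hamming distances between strings reproduce the matrix.

    Assumption (not from the source): an exact embedding means
    hamming(strings[i], strings[j]) == matrix[i][j] for all i, j.
    """
    n = len(strings)
    return all(
        hamming(strings[i], strings[j]) == matrix[i][j]
        for i in range(n)
        for j in range(n)
    )


def find_binary_embedding(matrix, length):
    """Brute-force search for an exact embedding using bit strings of `length`.

    The first string is fixed to all zeros, which loses no generality:
    flipping the same bit positions in every string leaves all pairwise
    Hamming distances unchanged.
    Returns a list of bit tuples, or None if no embedding of this length exists.
    """
    n = len(matrix)
    first = (0,) * length
    candidates = list(product((0, 1), repeat=length))
    for rest in product(candidates, repeat=n - 1):
        strings = [first, *rest]
        if is_binary_embedding(strings, matrix):
            return strings
    return None


# A matrix that admits an embedding: three items at pairwise distance 2.
embeddable = [[0, 2, 2], [2, 0, 2], [2, 2, 0]]
# A matrix that does not: three items at pairwise distance 1.
not_embeddable = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

found = find_binary_embedding(embeddable, length=3)
missing = find_binary_embedding(not_embeddable, length=3)
```

Under this assumed definition, the second matrix has no exact embedding at any string length, because the three pairwise Hamming distances among any three binary strings always sum to an even number. That is one simple example of why existence has to be checked before sequences can be designed from the binary strings.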
Date issued
2023-06
URI
https://hdl.handle.net/1721.1/153069
Department
Massachusetts Institute of Technology. Department of Biological Engineering
Publisher
Massachusetts Institute of Technology

Collections
  • Doctoral Theses

Content created by the MIT Libraries, CC BY-NC unless otherwise noted. Notify us about copyright concerns.