Towards a Prime Factorization of Proteins
Author(s)
Radev, Simeon
Advisor
Jacobson, Joseph
Abstract
A classical problem in machine learning is the interpretability of a model’s latent information processing. This problem is particularly relevant in the richly complex field of protein analysis, where novel insights into the structural organization of proteins can help illuminate their functional space and, in particular, lead toward a factorization of the structural space into a set of motif building blocks that completely span it. This thesis creates a new inference interface for such analysis, leveraging the sequential learning process of a neural autoencoder to construct a decomposition of proteins as a hierarchical sequence of embedded representation vectors. Further development of this work could lead to a greater understanding of the organizational complexity of natural phenomena, particularly the uniquely complex relationship between protein structure and function.
Date issued
2024-05
Department
Program in Media Arts and Sciences (Massachusetts Institute of Technology)
Publisher
Massachusetts Institute of Technology