dc.contributor.advisor | Bepler, Tristan | |
dc.contributor.author | Ram, Soumya | |
dc.date.accessioned | 2022-01-14T15:04:53Z | |
dc.date.available | 2022-01-14T15:04:53Z | |
dc.date.issued | 2021-06 | |
dc.date.submitted | 2021-06-17T20:14:07.377Z | |
dc.identifier.uri | https://hdl.handle.net/1721.1/139337 | |
dc.description.abstract | Protein engineering has the potential to solve complex global problems in medicine, clean energy, and manufacturing. However, current protein engineering efforts are hampered by a lack of supervised data. We help rectify this issue by developing supervised models that perform well in data-constrained settings by generalizing across protein engineering tasks and better incorporating coevolutionary and structural information. We also develop an unsupervised language model that conditions the target sequence on its multiple sequence alignment, allowing us to better model protein families. | |
dc.publisher | Massachusetts Institute of Technology | |
dc.rights | In Copyright - Educational Use Permitted | |
dc.rights | Copyright MIT | |
dc.rights.uri | http://rightsstatements.org/page/InC-EDU/1.0/ | |
dc.title | Using Co-evolutionary Information to Improve Protein Language Modelling | |
dc.type | Thesis | |
dc.description.degree | M.Eng. | |
dc.contributor.department | Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science | |
mit.thesis.degree | Master | |
thesis.degree.name | Master of Engineering in Electrical Engineering and Computer Science | |