| dc.contributor.author | Gutierrez, Clair S | |
| dc.contributor.author | Kassim, Alia A | |
| dc.contributor.author | Gutierrez, Benjamin D | |
| dc.contributor.author | Raines, Ronald T | |
| dc.date.accessioned | 2025-02-04T16:36:14Z | |
| dc.date.available | 2025-02-04T16:36:14Z | |
| dc.date.issued | 2024-11-01 | |
| dc.identifier.uri | https://hdl.handle.net/1721.1/158166 | |
| dc.description.abstract | Motivation: Post-translational modifications (PTMs) increase the diversity of the proteome and are vital to organismal life and therapeutic strategies. Deep learning has been used to predict PTM locations. Still, limitations in datasets and their analyses compromise success. Results: We evaluated the use of known PTM sites in prediction via sequence-based deep learning algorithms. For each PTM, known locations of that PTM were encoded as a separate amino acid before sequences were encoded via word embedding and passed into a convolutional neural network that predicts the probability of that PTM at a given site. Without labeling known PTMs, our models are on par with others. With labeling, however, we improved significantly upon extant models. Moreover, knowing PTM locations can increase the predictability of a different PTM. Our findings highlight the importance of PTMs for the installation of additional PTMs. We anticipate that including known PTM locations will enhance the performance of other proteomic machine learning algorithms. Availability and implementation: Sitetack is available as a web tool at https://sitetack.net; the source code, representative datasets, instructions for local use, and select models are available at https://github.com/clair-gutierrez/sitetack. | en_US |
| dc.language.iso | en | |
| dc.publisher | Oxford University Press | en_US |
| dc.relation.isversionof | 10.1093/bioinformatics/btae602 | en_US |
| dc.rights | Creative Commons Attribution | en_US |
| dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en_US |
| dc.source | Oxford University Press | en_US |
| dc.title | Sitetack: a deep learning model that improves PTM prediction by using known PTMs | en_US |
| dc.type | Article | en_US |
| dc.identifier.citation | Clair S Gutierrez, Alia A Kassim, Benjamin D Gutierrez, Ronald T Raines, Sitetack: a deep learning model that improves PTM prediction by using known PTMs, Bioinformatics, Volume 40, Issue 11, November 2024. | en_US |
| dc.contributor.department | Massachusetts Institute of Technology. Department of Chemistry | en_US |
| dc.contributor.department | Broad Institute of MIT and Harvard | en_US |
| dc.contributor.department | Koch Institute for Integrative Cancer Research at MIT | en_US |
| dc.relation.journal | Bioinformatics | en_US |
| dc.eprint.version | Final published version | en_US |
| dc.type.uri | http://purl.org/eprint/type/JournalArticle | en_US |
| eprint.status | http://purl.org/eprint/status/PeerReviewed | en_US |
| dc.date.updated | 2025-02-04T16:23:27Z | |
| dspace.orderedauthors | Gutierrez, CS; Kassim, AA; Gutierrez, BD; Raines, RT | en_US |
| dspace.date.submission | 2025-02-04T16:23:32Z | |
| mit.journal.volume | 40 | en_US |
| mit.journal.issue | 11 | en_US |
| mit.license | PUBLISHER_CC | |
| mit.metadata.status | Authority Work and Publication Information Needed | en_US |
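
The abstract describes encoding known PTM locations as a separate amino acid before the sequence is embedded and passed to a convolutional network. The sketch below illustrates only that labeling step, under stated assumptions: the `@` token, the `-` padding symbol, the window size, the integer vocabulary, and the function names are all hypothetical and are not taken from the Sitetack source code. Leaving the candidate site itself unlabeled is likewise an assumption made here so that the model input does not reveal the answer being predicted.

```python
from typing import Iterable, List

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
PAD = "-"        # padding beyond the sequence ends (assumption)
KNOWN_PTM = "@"  # hypothetical extra "amino acid" marking a known PTM site

# Integer vocabulary that an embedding layer could consume; index 0 is padding.
VOCAB = {ch: i for i, ch in enumerate(PAD + AMINO_ACIDS + KNOWN_PTM)}


def label_known_sites(sequence: str, known_sites: Iterable[int]) -> str:
    """Replace residues at known PTM positions (0-based) with KNOWN_PTM."""
    residues = list(sequence)
    for pos in known_sites:
        residues[pos] = KNOWN_PTM
    return "".join(residues)


def encode_window(
    sequence: str, known_sites: Iterable[int], center: int, flank: int = 3
) -> List[int]:
    """Integer-encode the residues within +/- flank of the candidate site.

    The candidate site keeps its original residue (assumption), while any
    other known PTM site inside the window is shown as KNOWN_PTM.
    """
    context_sites = set(known_sites) - {center}
    labeled = label_known_sites(sequence, context_sites)
    padded = PAD * flank + labeled + PAD * flank
    window = padded[center : center + 2 * flank + 1]
    return [VOCAB[ch] for ch in window]
```

For example, `encode_window("MKSTSPK", {4}, center=2, flank=2)` encodes the window `MKST@`, in which the serine at position 4 is replaced by the known-PTM token. The resulting integer list is the kind of input a word-embedding layer followed by a convolutional network could take.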