dc.contributor.author: Wang, Sheng
dc.contributor.author: Zhai, ChengXiang
dc.contributor.author: Peng, Jian
dc.contributor.author: Cho, Hyunghoon
dc.contributor.author: Berger Leighton, Bonnie
dc.date.accessioned: 2016-10-13T17:54:36Z
dc.date.available: 2016-10-13T17:54:36Z
dc.date.issued: 2015-06
dc.identifier.issn: 1367-4803
dc.identifier.issn: 1460-2059
dc.identifier.uri: http://hdl.handle.net/1721.1/104798
dc.description.abstract: Motivation: Systematically predicting gene (or protein) function based on molecular interaction networks has become an important tool in refining and enhancing the existing annotation catalogs, such as the Gene Ontology (GO) database. However, functional labels with only a few (<10) annotated genes, which constitute about half of the GO terms in yeast, mouse and human, pose a unique challenge in that any prediction algorithm that independently considers each label faces a paucity of information and thus is prone to capture non-generalizable patterns in the data, resulting in poor predictive performance. There exist a variety of algorithms for function prediction, but none properly address this ‘overfitting’ issue of sparsely annotated functions, or do so in a manner scalable to tens of thousands of functions in the human catalog. Results: We propose a novel function prediction algorithm, clusDCA, which transfers information between similar functional labels to alleviate the overfitting problem for sparsely annotated functions. Our method is scalable to datasets with a large number of annotations. In a cross-validation experiment in yeast, mouse and human, our method greatly outperformed previous state-of-the-art function prediction algorithms in predicting sparsely annotated functions, without sacrificing the performance on labels with sufficient information. Furthermore, we show that our method can accurately predict genes that will be assigned a functional label that has no known annotations, based only on the ontology graph structure and genes associated with other labels, which further suggests that our method effectively utilizes the similarity between gene functions. [en_US]
dc.description.sponsorship: National Institute of General Medical Sciences (U.S.) (Grant 1U54GM114838) [en_US]
dc.language.iso: en_US
dc.publisher: Oxford University Press [en_US]
dc.relation.isversionof: http://dx.doi.org/10.1093/bioinformatics/btv260 [en_US]
dc.rights: Creative Commons Attribution-NonCommercial 4.0 International [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc/4.0/ [en_US]
dc.source: Oxford University Press [en_US]
dc.title: Exploiting ontology graph for predicting sparsely annotated gene function [en_US]
dc.type: Article [en_US]
dc.identifier.citation: Wang, Sheng, Hyunghoon Cho, ChengXiang Zhai, Bonnie Berger, and Jian Peng. “Exploiting Ontology Graph for Predicting Sparsely Annotated Gene Function.” Bioinformatics 31, no. 12 (June 13, 2015): i357–i364. [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science [en_US]
dc.contributor.department: Massachusetts Institute of Technology. Department of Mathematics [en_US]
dc.contributor.mitauthor: Cho, Hyunghoon
dc.contributor.mitauthor: Berger Leighton, Bonnie
dc.relation.journal: Bioinformatics [en_US]
dc.eprint.version: Final published version [en_US]
dc.type.uri: http://purl.org/eprint/type/JournalArticle [en_US]
eprint.status: http://purl.org/eprint/status/PeerReviewed [en_US]
dspace.orderedauthors: Wang, Sheng; Cho, Hyunghoon; Zhai, ChengXiang; Berger, Bonnie; Peng, Jian [en_US]
dspace.embargo.terms: N [en_US]
dc.identifier.orcid: https://orcid.org/0000-0002-2713-0150
dc.identifier.orcid: https://orcid.org/0000-0002-2724-7228
mit.license: PUBLISHER_CC [en_US]

