Empowering natural product science with AI: leveraging multimodal data and knowledge graphs
Author(s)
Meijer, David; Beniddir, Mehdi A; Coley, Connor W; Mejri, Yassine M; Öztürk, Meltem; van der Hooft, Justin JJ; Medema, Marnix H; Skiredj, Adam; ...
Publisher with Creative Commons License
Creative Commons Attribution
Terms of use
Abstract
Artificial intelligence (AI) is accelerating how we conduct science, from folding proteins with AlphaFold and summarizing literature findings with large language models, to annotating genomes and prioritizing newly generated molecules for screening using specialized software. However, the application of AI to emulate human cognition in natural product research and its subsequent impact has so far been limited. One reason for this limited impact is that available natural product data is multimodal, unbalanced, unstandardized, and scattered across many data repositories. This makes natural product data challenging to use with existing deep learning architectures that consume fairly standardized, often non-relational, data. It also prevents models from learning overarching patterns in natural product science. In this Viewpoint, we address this challenge and support ongoing initiatives aimed at democratizing natural product data by collating our collective knowledge into a knowledge graph. By doing so, we believe there will be an opportunity to use such a knowledge graph to develop AI models that can truly mimic natural product scientists' decision-making.
Date issued
2024-08-16
Department
Massachusetts Institute of Technology. Department of Chemical Engineering
Journal
Natural Product Reports
Publisher
Royal Society of Chemistry
Citation
Meijer, David, Beniddir, Mehdi A, Coley, Connor W, Mejri, Yassine M, Öztürk, Meltem et al. 2024. "Empowering natural product science with AI: leveraging multimodal data and knowledge graphs." Natural Product Reports.
Version: Final published version