ChemWARD : extracting chemical structure from printed diagrams
Author(s)
Moscicki, Angelique (Angelique E.)
Alternative title
Extracting chemical structure from printed diagrams
Other Contributors
Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science.
Advisor
Randall Davis.
Abstract
Over the years, a vast amount of literature in the field of chemistry has accumulated, and searching for documents about specific molecules is a formidable task. To the extent that the literature is textual, services like Google enable relatively easy search. While search indexes like Google are very good at finding such things, it is difficult to describe molecules completely using text because text can't easily indicate molecular structure, and molecular structure defines chemical properties. ChemWARD is a system that extracts the molecular structure from the printed diagrams that are ubiquitous in chemistry literature and converts them to a machine-readable format in order to allow chemists to search the literature by drawing a molecular structure instead of typing a chemical formula. We describe the architecture of the system and report on its performance, demonstrating its ability to achieve an overall accuracy rate of 85.5% on printed diagrams extracted from published chemical literature.
Description
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2009. Cataloged from PDF version of thesis. Includes bibliographical references (p. 117-118).
Date issued
2009
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology
Keywords
Electrical Engineering and Computer Science.