dc.contributor.author | Keller, Mikaela | |
dc.contributor.author | Freifeld, Clark C. | |
dc.contributor.author | Brownstein, John S. | |
dc.date.accessioned | 2010-03-10T16:07:28Z | |
dc.date.available | 2010-03-10T16:07:28Z | |
dc.date.issued | 2009-11 | |
dc.date.submitted | 2009-06 | |
dc.identifier.issn | 1471-2105 | |
dc.identifier.uri | http://hdl.handle.net/1721.1/52463 | |
dc.description.abstract | Background
Automated surveillance of the Internet provides a timely and sensitive method for alerting on global emerging infectious disease threats. HealthMap is part of a new generation of online systems designed to monitor and visualize, on a real-time basis, disease outbreak alerts as reported by online news media and public health sources. HealthMap is of specific interest for national and international public health organizations and international travelers. A particular task that makes such surveillance useful is the automated discovery of the geographic references contained in the retrieved outbreak alerts. This task is sometimes referred to as "geo-parsing". A typical approach to geo-parsing would demand an expensive training corpus of alerts manually tagged by a human.
Results
Given that human readers perform this kind of task by using both their lexical and contextual knowledge, we developed an approach that relies on a relatively small expert-built gazetteer, thus limiting the need for human input, but focuses on learning the context in which geographic references appear. We show in a set of experiments that this approach exhibits a substantial capacity to discover geographic locations outside of its initial lexicon.
Conclusion
The results of this analysis provide a framework for future automated global surveillance efforts that reduce manual input and improve timeliness of reporting. | en |
dc.description.sponsorship | Google.org | en |
dc.description.sponsorship | National Library of Medicine and the National Institutes of Health (grant G08LM009776-01A2) | en |
dc.language.iso | en_US | |
dc.publisher | BioMed Central Ltd. | en |
dc.relation.isversionof | http://dx.doi.org/10.1186/1471-2105-10-385 | en |
dc.rights | Creative Commons Attribution | en |
dc.rights.uri | http://creativecommons.org/licenses/by/2.0/ | en |
dc.source | BioMed Central | en |
dc.title | Automated vocabulary discovery for geo-parsing online epidemic intelligence | en |
dc.type | Article | en |
dc.identifier.citation | Keller, Mikaela, Clark Freifeld, and John Brownstein. “Automated vocabulary discovery for geo-parsing online epidemic intelligence.” BMC Bioinformatics 10.1 (2009): 385. | en |
dc.contributor.department | Harvard University--MIT Division of Health Sciences and Technology | en_US |
dc.contributor.department | Program in Media Arts and Sciences (Massachusetts Institute of Technology) | en_US |
dc.contributor.approver | Freifeld, Clark C. | |
dc.contributor.mitauthor | Freifeld, Clark C. | |
dc.relation.journal | BMC Bioinformatics | en |
dc.eprint.version | Final published version | en |
dc.identifier.pmid | 19930702 | |
dc.type.uri | http://purl.org/eprint/type/JournalArticle | en |
eprint.status | http://purl.org/eprint/status/PeerReviewed | en |
dspace.orderedauthors | Keller, Mikaela; Freifeld, Clark C; Brownstein, John S | en |
dspace.mitauthor.error | true | |
mit.license | PUBLISHER_CC | en |
mit.metadata.status | Complete | |