Incorporating Content Structure into Text Analysis Applications
Author(s)
Sauper, Christina Joan; Haghighi, Aria; Barzilay, Regina
Terms of use
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Abstract
Information about the content structure of a document is largely ignored by current text analysis applications such as information extraction and sentiment analysis. This stands in contrast to the linguistic intuition that rich contextual information should benefit such applications. We present a framework which combines a supervised text analysis application with the induction of latent content structure. Both of these elements are learned jointly using the EM algorithm. The induced content structure is learned from a large unannotated corpus and biased by the underlying text analysis task. We demonstrate that exploiting content structure yields significant improvements over approaches that rely only on local context.
Description
URL to papers listed on conference site
Date issued
2010-10
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
EMNLP 2010 : Conference on Empirical Methods in Natural Language Processing
Publisher
Association for Computational Linguistics
Citation
Sauper, Christina, Aria Haghighi, and Regina Barzilay. "Incorporating Content Structure into Text Analysis Applications." EMNLP 2010: Conference on Empirical Methods in Natural Language Processing, October 9-11, 2010, MIT, Massachusetts, USA.
Version: Author's final manuscript