Simple Type-Level Unsupervised POS Tagging

Lee, Yoong Keok; Haghighi, Aria; Barzilay, Regina

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/

Abstract

Part-of-speech (POS) tag distributions are known to exhibit sparsity: a word is likely to take a single predominant tag in a corpus. Recent research has demonstrated that incorporating this sparsity constraint improves tagging accuracy. However, in existing systems, this expansion comes with a steep increase in model complexity. This paper proposes a simple and effective tagging method that directly models tag sparsity and other distributional properties of valid POS tag assignments. In addition, this formulation results in a dramatic reduction in the number of model parameters, thereby enabling unusually rapid training. Our experiments consistently demonstrate that this model architecture yields substantial performance gains over more complex tagging counterparts. On several languages, we report performance exceeding that of more complex state-of-the-art systems.
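
The sparsity property described in the abstract can be illustrated with a small sketch. This is not the paper's model: it only measures, on a tagged corpus, how many tokens would be labeled correctly if every word type were forced to take its single most frequent tag. The function name `one_tag_per_type_accuracy` and the toy corpus are hypothetical, invented here for illustration.

```python
from collections import Counter, defaultdict


def one_tag_per_type_accuracy(tagged_tokens):
    """Token-level accuracy when each word type always gets its most frequent tag.

    tagged_tokens: iterable of (word, tag) pairs from a gold-tagged corpus.
    A value close to 1.0 indicates that tag distributions are sparse.
    """
    tag_counts = defaultdict(Counter)
    for word, tag in tagged_tokens:
        tag_counts[word][tag] += 1

    total = sum(sum(counts.values()) for counts in tag_counts.values())
    if total == 0:
        raise ValueError("empty corpus")

    # For each word type, count only the tokens carrying its predominant tag.
    covered = sum(counts.most_common(1)[0][1] for counts in tag_counts.values())
    return covered / total


# Hypothetical toy corpus (not data from the paper): only "can" is ambiguous.
toy_corpus = [
    ("the", "DET"), ("dog", "NOUN"), ("can", "AUX"), ("run", "VERB"),
    ("the", "DET"), ("can", "NOUN"), ("is", "VERB"), ("the", "DET"),
    ("dog", "NOUN"), ("can", "AUX"),
]

print(one_tag_per_type_accuracy(toy_corpus))
```

On this toy input, nine of the ten tokens carry the predominant tag of their word type, so restricting each type to one tag loses only the single noun reading of "can".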

Date issued

2010-10

URI

http://hdl.handle.net/1721.1/63117

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory

Journal

Proceedings of the Conference on Empirical Methods in Natural Language Processing, EMNLP 2010

Citation

Lee, Yoong Keok, Aria Haghighi, and Regina Barzilay. "Simple Type-Level Unsupervised POS Tagging." In Proceedings of EMNLP 2010: Conference on Empirical Methods in Natural Language Processing, Oct. 9-11, 2010, MIT, Massachusetts, USA.

Version: Author's final manuscript

Collections

MIT Open Access Articles

DSpace@MIT