Modeling Syntactic Context Improves Morphological Segmentation

Lee, Yoong Keok; Haghighi, Aria; Barzilay, Regina

Author(s)

Lee, Yoong Keok; Haghighi, Aria; Barzilay, Regina

Terms of use

Creative Commons Attribution-Noncommercial-Share Alike 3.0 http://creativecommons.org/licenses/by-nc-sa/3.0/

Abstract

The connection between part-of-speech (POS) categories and morphological properties is well-documented in linguistics but underutilized in text processing systems. This paper proposes a novel model for morphological segmentation that is driven by this connection. Our model learns that words with common affixes are likely to be in the same syntactic category and uses learned syntactic categories to refine the segmentation boundaries of words. Our results demonstrate that incorporating POS categorization yields substantial performance gains on morphological segmentation of Arabic.

Date issued

2011-06

URI

http://hdl.handle.net/1721.1/73140

Department

Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Journal

Proceedings of the Fifteenth Conference on Computational Natural Language Learning, CoNLL '11

Publisher

Association for Computational Linguistics

Citation

Lee, Yoong Keok, Aria Haghighi, and Regina Barzilay. "Modeling syntactic context improves morphological segmentation." In Proceedings of the Fifteenth Conference on Computational Natural Language Learning (CoNLL '11). Association for Computational Linguistics, Portland, Oregon, USA, June 23–24, 2011. pp. 1–9. ©2011 Association for Computational Linguistics.

Version: Author's final manuscript

Collections

MIT Open Access Articles

DSpace@MIT