Modeling Syntactic Context Improves Morphological Segmentation
Author(s)
Lee, Yoong Keok; Haghighi, Aria; Barzilay, Regina
Open Access Policy
Creative Commons Attribution-Noncommercial-Share Alike
Terms of use
Abstract
The connection between part-of-speech (POS) categories and morphological properties is well-documented in linguistics but underutilized in text processing systems. This paper proposes a novel model for morphological segmentation that is driven by this connection. Our model learns that words with common affixes are likely to be in the same syntactic category and uses learned syntactic categories to refine the segmentation boundaries of words. Our results demonstrate that incorporating POS categorization yields substantial performance gains on morphological segmentation of Arabic.
Date issued
2011-06
Department
Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory; Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Journal
Proceedings of the Fifteenth Conference on Computational Natural Language Learning, CoNLL '11
Publisher
Association for Computational Linguistics
Citation
Lee, Yoong Keok, Aria Haghighi, and Regina Barzilay. "Modeling syntactic context improves morphological segmentation." In Proceedings of the Fifteenth Conference on Computational Natural Language Learning (CoNLL '11). Association for Computational Linguistics, Portland, Oregon, USA, June 23–24, 2011. pp. 1-9. ©2011 Association for Computational Linguistics.
Version: Author's final manuscript