DSpace@MIT

Global models of document structure using latent permutations

dc.contributor.author Chen, Harr
dc.contributor.author Branavan, Satchuthanan R.
dc.contributor.author Barzilay, Regina
dc.contributor.author Karger, David R.
dc.date.accessioned 2010-10-14T12:43:57Z
dc.date.available 2010-10-14T12:43:57Z
dc.date.issued 2009-06
dc.date.submitted 2009-05
dc.identifier.isbn 978-1-932432-41-1
dc.identifier.uri http://hdl.handle.net/1721.1/59312
dc.description.abstract We present a novel Bayesian topic model for learning discourse-level document structure. Our model leverages insights from discourse theory to constrain latent topic assignments in a way that reflects the underlying organization of document topics. We propose a global model in which both topic selection and ordering are biased to be similar across a collection of related documents. We show that this space of orderings can be elegantly represented using a distribution over permutations called the generalized Mallows model. Our structure-aware approach substantially outperforms alternative approaches for cross-document comparison and single-document segmentation. en_US
dc.language.iso en_US
dc.publisher Association for Computational Linguistics en_US
dc.rights Attribution-Noncommercial-Share Alike 3.0 Unported en_US
dc.rights.uri http://creativecommons.org/licenses/by-nc-sa/3.0/ en_US
dc.source MIT web domain en_US
dc.subject algorithms en_US
dc.subject design en_US
dc.subject experimentation en_US
dc.subject languages en_US
dc.subject measurement en_US
dc.subject performance en_US
dc.title Global models of document structure using latent permutations en_US
dc.type Article en_US
dc.identifier.citation Chen, Harr, S.R.K. Branavan, Regina Barzilay, and David R. Karger (2009). "Global models of document structure using latent permutations." Proceedings of Human Language Technologies: the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics (Morristown, N.J.: Association for Computational Linguistics): 371-379. © 2009 Association for Computing Machinery. en_US
dc.contributor.department Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science en_US
dc.contributor.department Massachusetts Institute of Technology. Computer Science and Artificial Intelligence Laboratory en_US
dc.contributor.approver Barzilay, Regina
dc.contributor.mitauthor Chen, Harr
dc.contributor.mitauthor Branavan, Satchuthanan R.
dc.contributor.mitauthor Barzilay, Regina
dc.contributor.mitauthor Karger, David R.
dc.relation.journal Proceedings of Human Language Technologies: the 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics en_US
dc.identifier.mitlicense OPEN_ACCESS_POLICY en_US
dc.eprint.version Final published version en_US
dc.type.uri http://purl.org/eprint/type/ConferencePaper en_US
eprint.status http://purl.org/eprint/status/PeerReviewed en_US
eprint.grantNumber National Science Foundation (U.S.) (grant IIS-0448168) en_US
eprint.grantNumber National Science Foundation (U.S.). Graduate fellowship en_US
eprint.grantNumber United States. Office of Naval Research en_US
eprint.grantNumber Microsoft Faculty Fellowship en_US
dspace.orderedauthors Chen, Harr; Branavan, S. R. K.; Barzilay, Regina; Karger, David R.


Except where otherwise noted, this item's license is described as Attribution-Noncommercial-Share Alike 3.0 Unported
Open Access