DSpace@MIT (MIT Libraries)

Zipf’s law holds for phrases, not words

Author(s)
Williams, Jake Ryland; Lessard, Paul R.; Desu, Suma; Clark, Eric M.; Bagrow, James P.; Danforth, Christopher M.; Dodds, Peter Sheridan
Terms of use
Creative Commons Attribution http://creativecommons.org/licenses/by/4.0/
Abstract
With Zipf’s law being originally and most famously observed for word frequency, it is surprisingly limited in its applicability to human language, holding over no more than three to four orders of magnitude before hitting a clear break in scaling. Here, building on the simple observation that phrases of one or more words comprise the most coherent units of meaning in language, we show empirically that Zipf’s law for phrases extends over as many as nine orders of rank magnitude. In doing so, we develop a principled and scalable statistical mechanical method of random text partitioning, which opens up a rich frontier of rigorous text analysis via a rank ordering of mixed length phrases.
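The abstract names random text partitioning and a rank ordering of mixed-length phrases without spelling out the procedure. The sketch below is only an illustration under stated assumptions, not the authors' implementation: it assumes that each gap between adjacent words is cut independently with some probability q, and that the resulting phrases are then counted and ranked by frequency. The function names `random_partition` and `rank_frequency`, the parameter `q`, and the sample sentence are all hypothetical.

```python
import random
from collections import Counter


def random_partition(words, q=0.5, rng=None):
    """Split a word list into phrases of one or more words.

    Assumption (not stated in this record): each gap after a word is
    cut independently with probability q.
    """
    if rng is None:
        rng = random.Random()
    phrases, current = [], []
    for word in words:
        current.append(word)
        # Close the current phrase with probability q.
        if rng.random() < q:
            phrases.append(" ".join(current))
            current = []
    # Flush any words left over after the last cut.
    if current:
        phrases.append(" ".join(current))
    return phrases


def rank_frequency(items):
    """Return (rank, item, count) tuples, most frequent item first."""
    counts = Counter(items)
    return [
        (rank, item, count)
        for rank, (item, count) in enumerate(counts.most_common(), start=1)
    ]


# Hypothetical toy input; a real analysis would use a large corpus.
sample_words = "the cat sat on the mat and the dog sat on the log".split()
sample_phrases = random_partition(sample_words, q=0.5, rng=random.Random(0))
for rank, phrase, count in rank_frequency(sample_phrases)[:5]:
    print(rank, repr(phrase), count)
```

With q = 1 every word becomes its own phrase, which reduces to an ordinary word-frequency ranking, while q = 0 returns the whole text as a single phrase. Intermediate values give the mixed-length phrases that the abstract refers to.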
Date issued
2015-08
URI
http://hdl.handle.net/1721.1/98434
Department
Massachusetts Institute of Technology. Center for Computational Engineering
Journal
Scientific Reports
Publisher
Nature Publishing Group
Citation
Williams, Jake Ryland, Paul R. Lessard, Suma Desu, Eric M. Clark, James P. Bagrow, Christopher M. Danforth, and Peter Sheridan Dodds. “Zipf’s Law Holds for Phrases, Not Words.” Scientific Reports 5 (August 11, 2015): 12209.
Version: Final published version
ISSN
2045-2322

Collections
  • MIT Open Access Articles
