Zipf’s word frequency law in natural language: A critical review and future directions

Overview of attention for article published in Psychonomic Bulletin & Review, March 2014

About this Attention Score

  • In the top 5% of all research outputs scored by Altmetric
  • High Attention Score compared to outputs of the same age (98th percentile)
  • High Attention Score compared to outputs of the same age and source (83rd percentile)

Mentioned by

  • 9 news outlets
  • 4 blogs
  • 33 X users
  • 1 patent
  • 1 Wikipedia page
  • 1 Google+ user

Citations

502 Dimensions

Readers on

423 Mendeley

Title
Zipf’s word frequency law in natural language: A critical review and future directions
Published in
Psychonomic Bulletin & Review, March 2014
DOI 10.3758/s13423-014-0585-6
Authors

Steven T. Piantadosi

Abstract

The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf's law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf's law and are then used to evaluate many of the theoretical explanations of Zipf's law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf's law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data.
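
As a rough illustration of the "simple mathematical form" mentioned in the abstract: Zipf's law is commonly stated as f(r) ∝ 1/r^α, where r is a word's frequency rank and α is close to 1. The sketch below is not the article's method. It assumes a hypothetical toy token list (`toy_tokens`) and uses a plain least-squares fit of log frequency against log rank, which is only one simple way to estimate α.

```python
import math
from collections import Counter


def rank_frequency(tokens):
    """Return [(rank, count), ...], with rank 1 for the most frequent word."""
    counts = sorted(Counter(tokens).values(), reverse=True)
    return list(enumerate(counts, start=1))


def zipf_exponent(tokens):
    """Estimate alpha as the negated least-squares slope of log(count) on log(rank)."""
    pairs = rank_frequency(tokens)
    xs = [math.log(rank) for rank, _ in pairs]
    ys = [math.log(count) for _, count in pairs]
    n = len(pairs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx


# Hypothetical toy corpus: counts are 60 / rank, so the fit should give alpha near 1.
toy_tokens = ["a"] * 60 + ["b"] * 30 + ["c"] * 20 + ["d"] * 15 + ["e"] * 12
alpha = zipf_exponent(toy_tokens)
```

On real text the fitted exponent depends on the corpus and on tokenization, so a single fit of this kind should be read as a descriptive summary, not as evidence for any particular explanation of the law.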

X Demographics

The data shown below were collected from the profiles of 33 X users who shared this research output.

Mendeley readers

The data shown below were compiled from readership statistics for 423 Mendeley readers of this research output.

Geographical breakdown

Country           Count   As %
United States         7     2%
Germany               3    <1%
Italy                 2    <1%
United Kingdom        2    <1%
Portugal              1    <1%
Netherlands           1    <1%
Brazil                1    <1%
Colombia              1    <1%
France                1    <1%
Other                 4    <1%
Unknown             400    95%

Demographic breakdown

Readers by professional status   Count   As %
Student > Ph. D. Student           109    26%
Student > Master                    58    14%
Researcher                          49    12%
Student > Bachelor                  35     8%
Student > Doctoral Student          32     8%
Other                               72    17%
Unknown                             68    16%

Readers by discipline            Count   As %
Computer Science                    72    17%
Linguistics                         68    16%
Psychology                          56    13%
Social Sciences                     21     5%
Engineering                         19     4%
Other                               96    23%
Unknown                             91    22%

Attention Score in Context

This research output has an Altmetric Attention Score of 130. This is our high-level measure of the quality and quantity of online attention that it has received. This Attention Score, as well as the ranking and number of research outputs shown below, was calculated when the research output was last mentioned on 07 May 2024.
  • All research outputs: #330,154 of 25,986,827 outputs
  • Outputs from Psychonomic Bulletin & Review: #4 of 6 outputs
  • Outputs of similar age: #2,682 of 238,758 outputs
  • Outputs of similar age from Psychonomic Bulletin & Review: #1 of 6 outputs
Altmetric has tracked 25,986,827 research outputs across all sources so far. Compared to these, this one has done particularly well and is in the 98th percentile: it's in the top 5% of all research outputs ever tracked by Altmetric.
So far Altmetric has tracked 6 research outputs from this source. They typically receive a lot more attention than average, with a mean Attention Score of 23.8. This one scored the same as or higher than 2 of them.
Older research outputs will score higher simply because they've had more time to accumulate mentions. To account for age we can compare this Altmetric Attention Score to the 238,758 tracked outputs that were published within six weeks on either side of this one in any source. This one has done particularly well, scoring higher than 98% of its contemporaries.
We're also able to compare this research output to 6 others from the same source and published within six weeks on either side of this one. This one has scored higher than all of them.
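
The percentile figures quoted on this page are consistent with a simple rank-to-percentile conversion. The sketch below is an assumption, not Altmetric's documented formula: the hypothetical helper `percentile_from_rank` takes the share of outputs ranked below this one and rounds down to a whole number.

```python
def percentile_from_rank(rank, total):
    """Whole-number percentile for an output ranked `rank` out of `total`.

    Assumed formula (not confirmed by the page): the percentage of outputs
    ranked below this one, rounded down. Integer arithmetic avoids
    floating-point rounding surprises.
    """
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return (total - rank) * 100 // total


# Rank and total pairs taken from the figures listed on this page.
all_outputs = percentile_from_rank(330_154, 25_986_827)
similar_age = percentile_from_rank(2_682, 238_758)
similar_age_same_source = percentile_from_rank(1, 6)
```

Under this assumption the three values come out as 98, 98, and 83, matching the 98th and 83rd percentiles reported above.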