Zipf-Mandelbrot law
Encyclopedia : Z : ZI : ZIP : Zipf-Mandelbrot law
]| cdf =[\frac}}]| mean =[\frac}}-q]| median =N/A| mode =[\frac}]| variance =| skewness =| kurtosis =| entropy =| mgf =| char =| }} In probability theory and statistics, the Zipf-Mandelbrot law is a discrete probability distribution. Also known as the Pareto-Zipf law, it is a power-law distribution on ranked data, named after the Harvard linguistics professor George Kingsley Zipf (1902-1950) who suggested a simpler distribution called Zipf's law, and the mathematician Benoit Mandelbrot (born November 20, 1924), who subsequently generalized it.
The probability mass function is given by:
- [f(k;N,q,s)=\frac}]
- [H_=\sum_^N \frac]
Applications
The distribution of words ranked by their frequency in a random corpus of text is generally a power-law distribution, known as Zipf's law.
If one plots the frequency rank of words contained in a large corpus of text data versus the number of occurrences or actual frequencies, one obtains a power-law distribution, with exponent close to one (but see Gelbukh and Sidoro 2001).
External links
- [Z. K. Silagadze: Citations and the Zipf-Mandelbrot's law]
- [NIST: Zipf's law]
- [W. Li's References on Zipf's law]
From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.
