Perplexity
Perplexity is a measure in information theory. It is closely related to entropy, the expected negative log probability of a distribution. For a discrete probability distribution p over n outcomes, perplexity is defined as
- [\displaystyle 2^{H(p)}=2^{-\sum_{i=1}^{n}p(i)\log_2 p(i)}]
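As a minimal sketch of this definition, the entropy of a discrete distribution can be computed in bits and then exponentiated with base 2. The function name `perplexity` and the two coin distributions are illustrative assumptions, not part of the original article.

```python
import math


def perplexity(probs):
    """Return 2**H(p) for a discrete distribution given as a list of probabilities."""
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be non-negative and sum to 1")
    # Outcomes with p(i) == 0 are skipped: p * log2(p) tends to 0 as p -> 0.
    entropy_bits = -sum(p * math.log2(p) for p in probs if p > 0)
    return 2.0 ** entropy_bits


# Hypothetical example distributions (assumptions for illustration only).
fair_coin = perplexity([0.5, 0.5])    # entropy is exactly 1 bit
biased_coin = perplexity([0.9, 0.1])  # less uncertainty than the fair coin

print(fair_coin, biased_coin)
```

The fair coin has a perplexity of 2, matching its two equally likely outcomes, while the biased coin falls between 1 and 2 because one outcome dominates.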
In natural language processing, perplexity is a common way of evaluating language models. A rough motivation in that context: perplexity equals the number of possible words at position [k] in a text, if those words followed a uniform distribution with the same entropy as the actual model. The lower the perplexity of a language model, the easier it is to predict the next word given the previous words and the model. Domain-specific texts usually have lower perplexity (= less variation) than general language. The lowest perplexity that has been published on the Brown Corpus (1 million words of American English) is about 247 per word, which corresponds to an entropy of about 7.95 bits per word (the base-2 logarithm of 247), or about 1.75 bits per letter.
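The "size of the set of possible words" reading can be checked numerically. The sketch below assumes a hypothetical eight-word vocabulary and a made-up skewed distribution. It also converts a perplexity of 247 into bits per word.

```python
import math


def entropy_bits(probs):
    """Entropy in bits of a discrete distribution (zero-probability terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Hypothetical vocabulary size, chosen only for illustration.
vocabulary_size = 8

# Uniform case: every word equally likely, so perplexity equals the set size.
uniform = [1.0 / vocabulary_size] * vocabulary_size
uniform_perplexity = 2.0 ** entropy_bits(uniform)

# Made-up skewed case over the same 8 words: one word dominates,
# so the effective number of choices is smaller than 8.
skewed = [0.65] + [0.05] * 7
skewed_perplexity = 2.0 ** entropy_bits(skewed)

# Entropy (bits per word) that corresponds to a per-word perplexity of 247.
bits_per_word = math.log2(247)

print(uniform_perplexity, skewed_perplexity, bits_per_word)
```

Under these assumptions the uniform distribution recovers the vocabulary size, while the skewed one behaves like a uniform choice among fewer words.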
From Wikipedia, the Free Encyclopedia.
All text is available under the terms of the GNU Free Documentation License.
