Binomial distribution

Excess kurtosis: [\frac{1-6p(1-p)}{np(1-p)}\!]
Entropy: [ \frac{1}{2} \ln \left( 2 \pi n e p (1-p) \right) + O \left( \frac{1}{n} \right) ]
Moment-generating function: [(1-p + pe^t)^n \!]
Characteristic function: [(1-p + pe^{it})^n \!]

In probability theory and statistics, the binomial distribution is the discrete probability distribution of the number of successes in a sequence of n independent yes/no experiments, each of which yields success with probability p. Such a success/failure experiment is also called a Bernoulli experiment or Bernoulli trial. When n = 1, the binomial distribution reduces to the Bernoulli distribution. The binomial distribution is the basis for the popular binomial test of statistical significance.

Occurrence

A typical example is the following: assume 5% of the population is green-eyed. You pick 500 people randomly. How likely is it that you get 30 or more green-eyed people? The number of green-eyed people you pick is a random variable X which follows a binomial distribution with n = 500 and p = 0.05 (when picking the people with replacement). We are interested in the probability Pr[X ≥ 30].
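
As a rough illustration, the probability in this example can be computed exactly by summing the probability mass function f(k; n, p) = C(n, k) p^k (1 − p)^(n − k) given in the Specification section. The sketch below is in Python; the helper names binomial_pmf and upper_tail are illustrative choices, not part of the article.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def upper_tail(k: int, n: int, p: float) -> float:
    """Pr[X >= k] for X ~ B(n, p), computed as 1 - Pr[X <= k - 1]."""
    return 1.0 - sum(binomial_pmf(j, n, p) for j in range(k))

# Values from the green-eyed example: n = 500 people, p = 0.05.
n, p = 500, 0.05
prob = upper_tail(30, n, p)
print(f"Pr[X >= 30] = {prob:.4f}")
```

Summing the 30 terms below the threshold and subtracting from 1 avoids evaluating the 471 terms from 30 to 500 directly.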

Specification

Probability mass function

In general, if the random variable X follows the binomial distribution with parameters n and p, we write X ~ B(n, p). The probability of getting exactly k successes is given by the probability mass function:

[f(k;n,p)={n\choose k}p^k(1-p)^{n-k}\,]
for [k=0,1,2,\dots,n] and where

[{n\choose k}=\frac{n!}{k!(n-k)!}]
is the binomial coefficient "n choose k" (also denoted C(n, k) or nCk), hence the name of the distribution. The formula can be understood as follows: we want k successes (p^k) and n − k failures ((1 − p)^(n − k)). However, the k successes can occur anywhere among the n trials, and there are C(n, k) different ways of distributing k successes in a sequence of n trials.

Reference tables of binomial probabilities are usually filled in only for values of k up to n/2. This is because for k > n/2, the probability can be calculated from its complement as

[f(k;n,p)=f(n-k;n,1-p).\,]
So one looks up n − k in place of k and 1 − p in place of p (the binomial distribution is not symmetric in general).
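
The complement identity f(k; n, p) = f(n − k; n, 1 − p) can be checked numerically. The following Python sketch uses illustrative values n = 10 and p = 0.3 (assumptions chosen for the demonstration, not taken from the article) and also confirms that the probabilities sum to 1.

```python
from math import comb, isclose

def binomial_pmf(k: int, n: int, p: float) -> float:
    """f(k; n, p) = C(n, k) p^k (1 - p)^(n - k)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3  # illustrative values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
total = sum(pmf)

# Swapping successes and failures: k successes at rate p is the same
# event as n - k "failures counted as successes" at rate 1 - p.
symmetric = all(
    isclose(binomial_pmf(k, n, p), binomial_pmf(n - k, n, 1 - p))
    for k in range(n + 1)
)
print(f"sum of pmf = {total:.12f}, identity holds: {symmetric}")
```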

Cumulative distribution function

The cumulative distribution function can be expressed in terms of the regularized incomplete beta function, as follows:

[ F(k;n,p) = \Pr(X \le k) = I_{1-p}(n-k, k+1) \!]
provided k is an integer and 0 ≤ k ≤ n. If x is not necessarily an integer or not necessarily positive, one can express it thus:

[F(x;n,p) = \Pr(X \le x) = \sum_{j=0}^{\lfloor x\rfloor} {n\choose j} p^j(1-p)^{n-j}]
where [\lfloor x\rfloor] is the greatest integer less than or equal to x.
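
A minimal Python sketch of the summation form of the cumulative distribution function follows. The function name binomial_cdf and the handling of arguments outside [0, n] (returning 0 below the range and capping the sum at n above it) are choices made for this illustration.

```python
from math import comb, floor

def binomial_cdf(x: float, n: int, p: float) -> float:
    """Pr(X <= x) for X ~ B(n, p); x may be any real number."""
    upper = min(floor(x), n)  # no probability mass lies beyond n
    if upper < 0:
        return 0.0            # empty sum when x < 0
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(upper + 1))

n, p = 10, 0.3  # illustrative values
print(binomial_cdf(2.7, n, p))   # same as binomial_cdf(2, n, p)
print(binomial_cdf(-0.5, n, p))  # below the support
print(binomial_cdf(12.3, n, p))  # above the support
```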

For [k \le np], upper bounds for the lower tail of the distribution function can be derived. In particular, Hoeffding's inequality yields the bound

[ F(k;n,p) \leq \exp\left(-2 \frac{(np-k)^2}{n}\right), \!]
and Chernoff's inequality can be used to derive the bound

[ F(k;n,p) \leq \exp\left(-\frac{1}{2p} \frac{(np-k)^2}{n}\right). \!]
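
The two tail bounds can be compared with the exact lower tail numerically. The Python sketch below assumes the bounds take the forms exp(−2(np − k)²/n) for Hoeffding and exp(−(np − k)²/(2pn)) for Chernoff, and uses illustrative values n = 100 and p = 0.3; the function names are hypothetical.

```python
from math import comb, exp

def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact Pr(X <= k) for X ~ B(n, p) and integer k."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k + 1))

def hoeffding_bound(k: int, n: int, p: float) -> float:
    """Assumed form: exp(-2 (np - k)^2 / n), valid for k <= np."""
    return exp(-2 * (n * p - k) ** 2 / n)

def chernoff_bound(k: int, n: int, p: float) -> float:
    """Assumed form: exp(-(np - k)^2 / (2 p n)), valid for k <= np."""
    return exp(-((n * p - k) ** 2) / (2 * p * n))

n, p = 100, 0.3  # illustrative values
rows = []
for k in (10, 20, 30):  # all satisfy k <= np = 30
    rows.append((k, binomial_cdf(k, n, p),
                 hoeffding_bound(k, n, p), chernoff_bound(k, n, p)))
    print("k=%d exact=%.3e hoeffding=%.3e chernoff=%.3e" % rows[-1])
```

Under these forms the Chernoff exponent exceeds the Hoeffding exponent exactly when 1/(2p) > 2, so the Chernoff bound is the tighter of the two only for p < 1/4.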

Mean, standard deviation, and mode

If X ~ B(n, p) (that is, X is a binomially distributed random variate), then the expected value of X is

[E[X]=np\,]
and the variance is

[\mbox{Var}(X)=np(1-p).\,]
This fact is easily proven as follows. Suppose first that we have exactly one Bernoulli trial. We have two possible outcomes, 1 and 0, with the first having probability p and the second having probability 1 − p; the mean for this trial is given by μ = p. Using the definition of variance, we have

[\sigma^2= \left(1 - p\right)^2p + (-p)^2(1 - p) = p(1-p).]
Now suppose that we want the variance for n such trials (i.e. for the general binomial distribution). Since the trials are independent, we may add the variances for each trial, giving

[\sigma^2_n = \sum_{k=1}^n \sigma^2 = np(1 - p). \quad \Box]
The most likely value or mode of X is given by the largest integer less than or equal to (n + 1)p; if m = (n + 1)p is itself an integer, then m − 1 and m are both modes.
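
The formulas for the mean, variance, and mode can be checked directly against the probability mass function. This Python sketch uses illustrative parameters (n = 20, p = 0.3 for the general case and n = 9, p = 0.5 for the two-mode case); these values are assumptions chosen for the demonstration.

```python
from math import comb, floor, isclose

def binomial_pmf(k: int, n: int, p: float) -> float:
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 20, 0.3  # illustrative values
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

mean = sum(k * f for k, f in enumerate(pmf))
variance = sum((k - mean) ** 2 * f for k, f in enumerate(pmf))
mode = max(range(n + 1), key=lambda k: pmf[k])

print(mean, n * p)                 # both close to np
print(variance, n * p * (1 - p))   # both close to np(1 - p)
print(mode, floor((n + 1) * p))    # mode equals floor((n + 1) p)

# Two-mode case: (n + 1) p = 5 is an integer, so 4 and 5 tie.
pmf_tie = [binomial_pmf(k, 9, 0.5) for k in range(10)]
print(pmf_tie[4] == pmf_tie[5])
```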

Relations to other distributions

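If X ~ B(n, p) and Y ~ B(m, p) are independent binomial variables with the same success probability p, then X + Y is again a binomial variable; its distribution is

[X+Y \sim B(n+m, p).\,]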
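
The relation for X + Y, taken here to apply to independent X ~ B(n, p) and Y ~ B(m, p) sharing the same p, can be verified by convolving the two probability mass functions. The Python sketch below uses illustrative values n = 4, m = 6, p = 0.3; the helper names are hypothetical.

```python
from math import comb, isclose

def binomial_pmf_table(n: int, p: float) -> list:
    """List of f(k; n, p) for k = 0..n."""
    return [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]

def convolve(a: list, b: list) -> list:
    """Distribution of the sum of two independent integer-valued variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

n, m, p = 4, 6, 0.3  # illustrative values
sum_pmf = convolve(binomial_pmf_table(n, p), binomial_pmf_table(m, p))
direct = binomial_pmf_table(n + m, p)

matches = all(isclose(s, d, abs_tol=1e-12) for s, d in zip(sum_pmf, direct))
print("X + Y matches B(n + m, p):", matches)
```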
Two other important distributions arise as approximations of binomial distributions:
Figure: Binomial PDF and normal approximation for n = 6 and p = 0.5.

If n is large enough, then a reasonable approximation to B(n, p) is given by the normal distribution

[ N(np, np(1-p)).\,]
Various rules of thumb may be used to decide whether n is large enough. One rule is that both np and n(1 − p) must be greater than 5. However, the specific number varies from source to source, and depends on how good an approximation one wants; some sources give 10. Another commonly used rule holds that the above normal approximation is appropriate only if
[\mu \pm 3 \sigma = np \pm 3 \sqrt{np(1-p)} \in [0,n].]
The following is an example of applying a continuity correction: Suppose one wishes to calculate Pr(X ≤ 8) for a binomial random variable X. If Y has a distribution given by the normal approximation, then Pr(X ≤ 8) is approximated by Pr(Y ≤ 8.5). The addition of 0.5 is the continuity correction. Warning: The normal approximation gives inaccurate results unless a continuity correction is used.
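
The effect of the continuity correction can be seen numerically. The Python sketch below assumes n = 20 and p = 0.5 for the binomial variable X (the example above does not fix these values) and compares the exact Pr(X ≤ 8) with the normal approximation evaluated at 8.5 and at 8.

```python
from math import comb, erf, sqrt

def normal_cdf(x: float, mu: float, sigma: float) -> float:
    """CDF of N(mu, sigma^2), written in terms of the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))

n, p = 20, 0.5  # assumed values for illustration
mu, sigma = n * p, sqrt(n * p * (1 - p))

exact = sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(9))
with_correction = normal_cdf(8.5, mu, sigma)     # Pr(Y <= 8.5)
without_correction = normal_cdf(8.0, mu, sigma)  # Pr(Y <= 8)

print(f"exact              = {exact:.4f}")
print(f"with correction    = {with_correction:.4f}")
print(f"without correction = {without_correction:.4f}")
```
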
This approximation is a huge time-saver (exact calculations with large n are very onerous); historically, it was the first use of the normal distribution, introduced in Abraham de Moivre's book The Doctrine of Chances in 1733. Nowadays, it can be seen as a consequence of the central limit theorem since B(n, p) is a sum of n independent, identically distributed 0-1 indicator variables.
For example, suppose you randomly sample n people out of a large population and ask them whether they agree with a certain statement. The proportion of people who agree will of course depend on the sample. If you sampled groups of n people repeatedly and truly randomly, the proportions would follow an approximate normal distribution with mean equal to the true proportion p of agreement in the population and with standard deviation σ = (p(1 − p)/n)^(1/2). Large sample sizes n are good because the standard deviation gets smaller, which allows a more precise estimate of the unknown parameter p.
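
The sampling experiment described above can be simulated. This Python sketch assumes a true proportion of 0.3, samples of n = 100 people, and 2000 repeated samples, all arbitrary choices for illustration, and compares the observed spread of the sample proportions with (p(1 − p)/n)^(1/2).

```python
import random
from math import sqrt
from statistics import mean, pstdev

rng = random.Random(0)  # fixed seed so the run is repeatable

p_true, n, repeats = 0.3, 100, 2000  # assumed values for illustration

# Each sample: ask n people, record the fraction who agree.
proportions = [
    sum(rng.random() < p_true for _ in range(n)) / n
    for _ in range(repeats)
]

observed_mean = mean(proportions)
observed_sd = pstdev(proportions)
predicted_sd = sqrt(p_true * (1 - p_true) / n)

print(f"mean of proportions = {observed_mean:.4f} (true p = {p_true})")
print(f"sd of proportions   = {observed_sd:.4f} (predicted {predicted_sd:.4f})")
```
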
The formula for Bézier curves was inspired by the binomial distribution.

Limits of binomial distributions

As n approaches ∞ while p remains fixed, the distribution of

[\frac{X-np}{\sqrt{np(1-p)}}]
approaches the normal distribution with expected value 0 and variance 1.
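
The limit can be illustrated numerically, reading it as a statement about the standardized variable (X − np)/(np(1 − p))^(1/2). The Python sketch below fixes p = 0.3 (an arbitrary choice) and measures, for growing n, the largest gap between the binomial cumulative distribution function and the standard normal one at the points k = 0, ..., n; the function names are hypothetical.

```python
from math import comb, erf, sqrt

def standard_normal_cdf(z: float) -> float:
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def max_cdf_gap(n: int, p: float) -> float:
    """Largest |Pr(X <= k) - Phi((k - np)/sigma)| over k = 0..n."""
    mu, sigma = n * p, sqrt(n * p * (1 - p))
    cumulative, gap = 0.0, 0.0
    for k in range(n + 1):
        cumulative += comb(n, k) * p**k * (1 - p)**(n - k)
        z = (k - mu) / sigma  # standardized value of k
        gap = max(gap, abs(cumulative - standard_normal_cdf(z)))
    return gap

p = 0.3  # assumed value for illustration
gaps = {n: max_cdf_gap(n, p) for n in (10, 100, 1000)}
for n, gap in gaps.items():
    print(f"n = {n:5d}  max CDF gap = {gap:.4f}")
```

No continuity correction is applied here, so the gap includes roughly half of the largest single-point probability, which itself shrinks as n grows.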


From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License. See Wikipedia Copyrights for details.
