Errors and residuals in statistics
In statistics, the concepts of error and residual are easily confused with each other.
The term "error" is a misnomer. An error is the amount by which an observation differs from its expected value, the latter being based on the whole population from which the statistical unit was chosen randomly. The expected value, being the average of the entire population, is typically unobservable. If the average height of 21-year-old men is 5 feet 9 inches, and one randomly chosen man is 5 feet 11 inches tall, then the "error" is 2 inches; if the randomly chosen man is 5 feet 7 inches tall, then the "error" is −2 inches. The nomenclature arose from random measurement errors in astronomy. It is as if the measurement of the man's height were an attempt to measure the population average, so that any difference between the man's height and the average would be a measurement error.
A residual, on the other hand, is an observable estimate of the unobservable error. The simplest case involves a random sample of n men whose heights are measured. The sample average is used as an estimate of the population average. Then we have:
- The difference between the height of each man in the sample and the unobservable population average is an error, and
- The difference between the height of each man in the sample and the observable sample average is a residual.
- Residuals are observable; errors are not.
- Errors are often independent of each other; residuals are not independent of each other (at least in the simple situation described above, and in many others).
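The distinction can be illustrated with a small simulation. The following is a minimal sketch rather than part of the original example: the population mean of 69 inches (5 feet 9 inches) is taken from the text, while the standard deviation of 3 inches, the sample size of 10, the random seed, and the helper name errors_and_residuals are assumptions made only for illustration. In a simulation the population mean is known, so the errors can be computed; with real data only the residuals would be available.

```python
import random
import statistics

# Assumed population parameters (illustrative only).
POP_MEAN = 69.0  # 5 ft 9 in, in inches, as in the height example
POP_SD = 3.0     # assumed standard deviation, not given in the text


def errors_and_residuals(sample, pop_mean):
    """Return (errors, residuals) for a list of observations.

    Errors are deviations from the population mean (normally unobservable);
    residuals are deviations from the sample mean (always observable).
    """
    sample_mean = statistics.fmean(sample)
    errors = [x - pop_mean for x in sample]
    residuals = [x - sample_mean for x in sample]
    return errors, residuals


rng = random.Random(0)  # fixed seed so the run is repeatable
heights = [rng.gauss(POP_MEAN, POP_SD) for _ in range(10)]
errors, residuals = errors_and_residuals(heights, POP_MEAN)

print("sum of errors:   ", sum(errors))     # generally not zero
print("sum of residuals:", sum(residuals))  # zero up to rounding
```

The residuals always sum to zero (up to floating-point rounding), so any one of them is determined by the other n − 1. That constraint is what prevents the residuals from being independent, whereas the errors are under no such constraint.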
An example, with some of the mathematical theory
If we assume a normally distributed population with mean μ and standard deviation σ, and choose individuals independently, then we have
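- [X_1, \dots, X_n\sim N(\mu,\sigma^2)\,]
and the sample mean
- [\overline{X}=\frac{X_1+\cdots+X_n}{n}]
is a random variable with distribution
- [\overline{X}\sim N(\mu, \sigma^2/n).]
The errors are then
- [\varepsilon_i=X_i-\mu,\,]
whereas the residuals are
- [\widehat{\varepsilon}_i=X_i-\overline{X}.]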
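The relation between errors and residuals can be made explicit with a short derivation, offered here as a supplementary sketch. Here \overline{X} is the sample mean, \varepsilon_i = X_i - \mu is the i-th error and \widehat{\varepsilon}_i = X_i - \overline{X} is the i-th residual. The symbol \overline{\varepsilon}, the average of the n errors, is introduced only for this illustration.

```latex
\begin{align*}
\overline{\varepsilon} &= \frac{1}{n}\sum_{i=1}^n \varepsilon_i
  = \frac{1}{n}\sum_{i=1}^n (X_i-\mu) = \overline{X}-\mu, \\
\widehat{\varepsilon}_i &= X_i-\overline{X}
  = (X_i-\mu)-(\overline{X}-\mu)
  = \varepsilon_i-\overline{\varepsilon}, \\
\sum_{i=1}^n \widehat{\varepsilon}_i &= \sum_{i=1}^n X_i - n\,\overline{X} = 0.
\end{align*}
```

Each residual is thus the corresponding error minus the average error, and the last line shows that the residuals are tied together by a linear constraint, which is why they cannot be independent of each other even when the errors are.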
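The sum of squares of the errors, divided by σ², has a chi-square distribution with n degrees of freedom:
- [\sum_{i=1}^n \left(X_i-\mu\right)^2/\sigma^2\sim\chi^2_n.]
This quantity, however, is not observable. The sum of squares of the residuals is observable, and its quotient by σ² has a chi-square distribution with only n − 1 degrees of freedom:
- [\sum_{i=1}^n \left(\,X_i-\overline{X}\,\right)^2/\sigma^2\sim\chi^2_{n-1}.]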
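The difference in degrees of freedom can be checked numerically, since a chi-square variable has mean equal to its degrees of freedom. The following is a rough Monte Carlo sketch under assumed values (n = 5, μ = 69, σ = 3, 20000 trials, a fixed seed); the function name mean_scaled_sums_of_squares is hypothetical and chosen only for this illustration. It checks the means only and is not a test of the full distributions.

```python
import random
import statistics


def mean_scaled_sums_of_squares(n, mu, sigma, trials, seed=0):
    """Average SS(errors)/sigma^2 and SS(residuals)/sigma^2 over many samples."""
    rng = random.Random(seed)
    total_err = 0.0
    total_res = 0.0
    for _ in range(trials):
        xs = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = statistics.fmean(xs)
        # Errors use the (known, simulated) population mean mu.
        total_err += sum((x - mu) ** 2 for x in xs) / sigma ** 2
        # Residuals use the sample mean xbar.
        total_res += sum((x - xbar) ** 2 for x in xs) / sigma ** 2
    return total_err / trials, total_res / trials


# Assumed values, chosen only for illustration.
avg_err, avg_res = mean_scaled_sums_of_squares(n=5, mu=69.0, sigma=3.0, trials=20000)
print(avg_err)  # should be close to n = 5
print(avg_res)  # should be close to n - 1 = 4
```

In every individual sample the residual sum of squares is no larger than the error sum of squares, because the sample mean is the value that minimizes the sum of squared deviations; one degree of freedom is used up by estimating μ with the sample mean.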
External links
- [VIAS Science Cartoons] Residuals from the humorous perspective.
From Wikipedia, the Free Encyclopedia.
All text is available under the terms of the GNU Free Documentation License. See Wikipedia Copyrights for details.
