Least squares
Encyclopedia : L : LE : LEA : Least squares
Least squares is a mathematical optimization technique which, when given a series of measured data, attempts to find a function which closely approximates the data (a "best fit"). It attempts to minimize the sum of the squares of the ordinate differences (called residuals) between points generated by the function and corresponding points in the data. Specifically, it is called least mean squares (LMS) when the number of measured data is 1 and the gradient descent method is used to minimize the squared residual. LMS is known to minimize the expectation of the squared residual, with the smallest operations (per iteration). But it requires a large number of iterations to converge.
An implicit requirement for the least squares method to work is that errors in each measurement be randomly distributed. The Gauss-Markov theorem proves that least square estimators are unbiased and that the sample data do not have to comply with, for instance, a normal distribution. It is also important that the collected data be well chosen, so as to allow visibility into the variables to be solved for (for giving more weight to particular data, refer to weighted least squares).
The least squares technique is commonly used in curve fitting. Many other optimization problems can also be expressed in a least squares form, by either minimizing energy or maximizing entropy.
Formulation of the problem
Suppose that the data set consists of the points (xi, yi) with i = 1, 2, ..., n. We want to find a function f such that [f(x_i)\approx y_i.]
To attain this goal, we suppose that the function f is of a particular form containing some parameters which need to be determined. For instance, suppose that it is quadratic, meaning that f(x) = ax2 + bx + c, where a, b and c are not yet known. We now seek the values of a, b and c that minimize the sum of the squares of the residuals:
- [ S = \sum_^n (y_i - f(x_i))^2. ]
Solving the least squares problem
In the above example, f is linear in the parameters a, b and c. The problem simplifies considerably in this case and essentially reduces to a system of linear equations. This is explained in the article on linear least squares.
The problem is more difficult if f is not linear in the parameters to be determined. We then need to solve a general (unconstrained) optimization problem. Any algorithm for such problems, like Newton's method and gradient descent, can be used. Another possibility is to apply an algorithm that is developed especially to tackle least squares problems, for instance the Gauss-Newton algorithm or the Levenberg-Marquardt algorithm.
Least squares and regression analysis
In regression analysis, one replaces the relation
- [f(x_i)\approx y_i]
- [f(x_i) = y_i + \varepsilon_i,]
One frequently estimates the parameters (a, b and c in the above example) by least squares: those values are taken that minimize the sum S. The Gauss–Markov theorem states that the least squares estimates are optimal in a certain sense, if we take f(x) = ax + b with a and b to be determined and the noise terms ε are independent and identically distributed (see the article for a more precise statement and less restrictive conditions on the noise terms).
Least squares estimation for linear models is notoriously non-robust to outliers. If the distribution of the outliers is skewed, the estimates can be biased. In the presence of any outliers, the least squares estimates are inefficient and can be extremely so. When outliers occur in the data, methods of robust regression are more appropriate.
See also
- Isotonic regression
- Least mean squares filter
- Least-squares estimation of linear regression coefficients
- Linear regression
- Regression analysis
- Robust regression
- Root mean square
- Total least squares or errors-in-variables model
- Weighted least squares
External links
- http://www.physics.csbsju.edu/stats/least_squares.html
- [Zunzun.com] - Online curve and surface fitting
- http://www.orbitals.com/self/least/least.htm
- [Least squares] on PlanetMath
From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.
