Opentopia Directory Encyclopedia Tools

Mahalanobis distance

Encyclopedia : M : MA : MAH : Mahalanobis distance


In statistics, Mahalanobis distance is a distance measure introduced by P. C. Mahalanobis in 1936. It is based on correlations between variables by which different patterns can be identified and analysed. It is a useful way of determining similarity of an unknown sample set to a known one. It differs from Euclidean distance in that it takes into account the correlations of the data set and is scale-invariant, i.e. not dependent on the scale of measurements.

Formally, the Mahalanobis distance from a group of values with mean [\mu = ( \mu_1, \mu_2, \mu_3, \dots , \mu_p )] and covariance matrix [\Sigma] for a multivariate vector [x = ( x_1, x_2, x_3, \dots, x_p )] is defined as:

[D_M(x) = \sqrt (x-\mu)}.\, ]
Mahalanobis distance can also be defined as dissimilarity measure between two random vectors [ \vec] and [ \vec] of the same distribution with the covariance matrix [\Sigma] :

[ d(\vec,\vec)=\sqrt-\vec)^T\Sigma^ (\vec-\vec)}.\,]
If the covariance matrix is the identity matrix then it is the same as Euclidean distance. If the covariance matrix is diagonal, then it is called normalized Euclidean distance:

[ d(\vec,\vec)=\sqrt^p },]
where [\sigma_i] is the standard deviation of the [ x_i ] over the sample set.

 


From Wikipedia, the Free Encyclopedia. Original article here. Support Wikipedia by contributing or donating.
All text is available under the terms of the GNU Free Documentation License See Wikipedia Copyrights for details.

Search Titles
0123456789
ABCDEFGHIJ
KLMNOPQRST
UVWXYZ?

E-mail this article to:

Personal Message: