Statistics Toolbox
pdist

Pairwise distance between observations.

Syntax

Y = pdist(X)
Y = pdist(X,'metric')
Y = pdist(X,'minkowski',p)

Description

Y = pdist(X) computes the Euclidean distance between pairs of objects in the data matrix X. X is an m-by-n matrix, treated as m vectors of size n. For a dataset made up of m objects, there are m(m-1)/2 pairs.

The output, Y, is a vector of length m(m-1)/2, containing the distance information. The distances are arranged in the order (1,2), (1,3), ..., (1,m), (2,3), ..., (2,m), ..., (m-1,m). Y is also commonly known as a similarity matrix or dissimilarity matrix.
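
As a small illustration of the length and ordering of Y, the sketch below uses a made-up 3-by-2 matrix. The values are an assumption, chosen only so that the distances come out as whole numbers.

```matlab
% Hypothetical data: m = 3 objects, n = 2 variables.
X = [0 0; 3 4; 6 8];
m = size(X, 1);

Y = pdist(X);            % default Euclidean metric

% Y holds the distances for the pairs (1,2), (1,3), (2,3), in that order:
% 5, 10, and 5.
numel(Y) == m*(m-1)/2    % true: 3 objects give 3 pairs
```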

To save space and computation time, Y is formatted as a vector. However, you can convert this vector into a square matrix using the squareform function so that element (i,j) in the matrix corresponds to the distance between objects i and j in the original dataset.
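
A minimal sketch of the conversion, using the same kind of made-up data (the matrix X here is an assumption, not part of the toolbox):

```matlab
% Hypothetical data: three points on a line through the origin.
X = [0 0; 3 4; 6 8];
Y = pdist(X);            % distances for pairs (1,2), (1,3), (2,3)

S = squareform(Y);       % 3-by-3 symmetric matrix with a zero diagonal

% S(i,j) is the distance between objects i and j, so S(1,3) and S(3,1)
% both equal 10, the distance between rows 1 and 3 of X.
S(1,3)
```

The square form is easier to index by object, at the cost of storing every distance twice.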

Y = pdist(X,'metric') computes the distance between objects in the data matrix, X, using the method specified by 'metric'. 'metric' can be any of the following character strings that identify ways to compute the distance.

String         Meaning
'Euclid'       Euclidean distance (default)
'SEuclid'      Standardized Euclidean distance
'Mahal'        Mahalanobis distance
'CityBlock'    City Block metric
'Minkowski'    Minkowski metric

Y = pdist(X,'minkowski',p) computes the distance between objects in the data matrix, X, using the Minkowski metric. p is the exponent used in the Minkowski computation; its default value is 2.
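
The effect of the exponent can be checked against the other metrics. The sketch below assumes the metric strings listed on this page and a made-up data matrix; with p = 1 the Minkowski metric should match the City Block metric, and with p = 2 it should match the default Euclidean metric.

```matlab
% Hypothetical data: m = 3 objects, n = 2 variables.
X = [0 0; 3 4; 6 8];

Y1 = pdist(X, 'Minkowski', 1);   % exponent p = 1
Yc = pdist(X, 'CityBlock');      % City Block metric
Y2 = pdist(X, 'Minkowski', 2);   % exponent p = 2 (the documented default)
Ye = pdist(X);                   % default Euclidean metric

max(abs(Y1(:) - Yc(:)))          % expected to be 0 up to rounding
max(abs(Y2(:) - Ye(:)))          % expected to be 0 up to rounding
```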

Mathematical Definitions of Methods. Given an m-by-n data matrix X, which is treated as m (1-by-n) row vectors x1, x2, ..., xm, the various distances between the vectors xr and xs are defined as follows:
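
In LaTeX notation, the standard textbook definitions of the five metrics can be sketched as below. The symbol names are assumptions introduced here: d_rs is the distance between xr and xs, D is the diagonal matrix whose j-th diagonal element is the sample variance of column j of X, and V is the sample covariance matrix of X.

```latex
% x_r, x_s: 1-by-n row vectors; x_{rj}: j-th element of x_r.
\begin{align*}
\text{Euclidean:} \quad
  & d_{rs}^{2} = (x_r - x_s)(x_r - x_s)^{\top} \\
\text{Standardized Euclidean:} \quad
  & d_{rs}^{2} = (x_r - x_s)\, D^{-1} (x_r - x_s)^{\top} \\
\text{Mahalanobis:} \quad
  & d_{rs}^{2} = (x_r - x_s)\, V^{-1} (x_r - x_s)^{\top} \\
\text{City Block:} \quad
  & d_{rs} = \sum_{j=1}^{n} \lvert x_{rj} - x_{sj} \rvert \\
\text{Minkowski:} \quad
  & d_{rs} = \Bigl( \sum_{j=1}^{n} \lvert x_{rj} - x_{sj} \rvert^{p} \Bigr)^{1/p}
\end{align*}
```

Setting p = 1 in the Minkowski formula gives the City Block metric, and p = 2 gives the Euclidean distance.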

Examples
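
The following sketch compares pdist with a hand-written loop that computes the Euclidean distance for each pair directly. The data matrix and the variable names Z, k, r, s, and d are assumptions made for illustration.

```matlab
% Hypothetical 4-by-2 data set: four objects, two variables each.
X = [1 2; 1 3; 2 2; 3 1];
m = size(X, 1);

% Built-in result using the default Euclidean metric.
Y = pdist(X);

% Hand-written equivalent: visit pairs in the order (1,2), (1,3), ..., (m-1,m).
Z = zeros(1, m*(m-1)/2);
k = 1;
for r = 1:m-1
    for s = r+1:m
        d = X(r,:) - X(s,:);      % difference of row vectors xr and xs
        Z(k) = sqrt(d * d');      % Euclidean distance between them
        k = k + 1;
    end
end

% The two vectors should agree up to rounding error.
max(abs(Y(:) - Z(:)))
```

Four objects give 4*3/2 = 6 pairs, so both Y and Z have six elements.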

See Also

cluster, clusterdata, cophenet, dendrogram, inconsistent, linkage, squareform


