<< nan_moment Descriptive Statistics nan_percentile >>

nan_pdist

Return the pairwise distances between the rows of x.

Calling Sequence

y = nan_pdist (x)
y = nan_pdist (x, metric)
y = nan_pdist (x, metric_function)
y = nan_pdist (x, metric, metricarg, ...)

Parameters

x:

x is the n x d matrix representing n row vectors of size d

y:

y is the row vector of pairwise distances between the rows of x, (n-1)*n/2 long

metric:

metric is an optional argument specifying how the distance is computed. It can be any of the following, defaulting to "euclidean":

"euclidean":

Euclidean distance (default).

"seuclidean":

Standardized Euclidean distance. Each coordinate in the sum of squares is inversely weighted by the sample variance of that coordinate.

"mahalanobis":

Mahalanobis distance: see also mahalanobis.

"cityblock":

City Block metric, aka Manhattan distance.

"minkowski":

Minkowski metric. Accepts a numeric parameter p: for p=1 this is the same as the cityblock metric; for p=2 (default) it is equal to the euclidean metric.

"correlation":

One minus the sample correlation between points (treated as sequences of values).

"spearman":

One minus the sample Spearman's rank correlation between observations, treated as sequences of values.

"hamming":

Hamming distance: the proportion of coordinates that differ.

"jaccard":

One minus the Jaccard coefficient: the proportion of nonzero coordinates that differ.

"chebychev":

Chebychev distance: the maximum coordinate difference.
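As an illustration of the definitions above, the following sketch computes a few of these metrics by hand for two row vectors, using base Scilab only. The vectors a and b and the exponent p = 3 are arbitrary values chosen for the example; the code does not call nan_pdist and makes no claim about how the toolbox implements these metrics internally.

```scilab
// Hand computation of a few metrics for two row vectors (base Scilab only).
a = [1 2 3 4];
b = [2 2 5 1];
d = a - b;                            // coordinate differences: [-1 0 -2 3]

d_cityblock = sum(abs(d));            // 1 + 0 + 2 + 3 = 6
d_euclidean = sqrt(sum(d.^2));        // sqrt(1 + 0 + 4 + 9) = sqrt(14)
d_chebychev = max(abs(d));            // largest coordinate difference = 3

p = 3;                                // Minkowski exponent (arbitrary choice)
d_minkowski = sum(abs(d).^p)^(1/p);   // (1 + 0 + 8 + 27)^(1/3) = 36^(1/3)

// Hamming: proportion of coordinates that differ (3 of 4 here)
d_hamming = sum(bool2s(a <> b)) / size(a, "*");
```

Setting p = 1 or p = 2 in the Minkowski line reproduces the cityblock and euclidean values, which matches the description of the "minkowski" metric.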

Description

Return the pairwise distances between the rows of x.

The output is a dissimilarity matrix formatted as a row vector y, (n-1)*n/2 long, where the distances are in the order [(1, 2) (1, 3) ... (2, 3) ... (n-1, n)]. You can use the squareform function to display the distances between the vectors arranged into an n x n matrix.
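To make the ordering concrete, here is a minimal sketch that rebuilds the symmetric n x n matrix from a condensed row vector by hand, without relying on squareform. The values in y are placeholders chosen so that the entry "ij" stands for the pair (i, j); the variable names D and pos are illustrative only.

```scilab
// Rebuild the symmetric n x n matrix from the condensed row vector y,
// following the documented order (1,2) (1,3) ... (2,3) ... (n-1,n).
n = 4;
y = [12 13 14 23 24 34];   // placeholder values: "ij" stands for pair (i,j)
D = zeros(n, n);
k = 1;
for i = 1:n-1
    for j = i+1:n
        D(i, j) = y(k);    // distance between rows i and j
        D(j, i) = y(k);    // the matrix is symmetric
        k = k + 1;
    end
end

// Position of a single pair (i,j), with i < j, inside y
i = 2; j = 4;
pos = (i-1)*n - i*(i-1)/2 + (j - i);
```

The diagonal of D stays at zero, since the distance between a row and itself is not stored in y.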

metric is an optional argument specifying how the distance is computed. It can be any of the predefined metrics, defaulting to "euclidean", or a user-defined function that takes two arguments x and y plus any number of optional arguments, where x is a row vector and y is a matrix having the same number of columns as x. metric returns a column vector where row i is the distance between x and row i of y. Any additional arguments after the metric are passed as metric (x, y, metricarg1, metricarg2 ...).
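The contract for a user-defined metric can be sketched as follows. The function name my_cityblock is hypothetical and not part of the toolbox; the sketch only shows a function with the documented shape (row vector x, matrix y, column vector result) and calls it directly rather than through nan_pdist.

```scilab
// Hypothetical user-defined metric following the documented contract:
// x is a 1 x d row vector, y is an m x d matrix, and the result is an
// m x 1 column vector whose row i is the distance between x and y(i,:).
function d = my_cityblock(x, y)
    m = size(y, 1);
    diffs = abs(y - ones(m, 1) * x);   // replicate x over the rows of y
    d = sum(diffs, "c");               // row sums give an m x 1 column
endfunction

x = [0 0 0];
y = [1 2 3; -1 0 1];
d = my_cityblock(x, y);                // distances 6 and 2, as a column
```

According to the calling sequence, such a function would be supplied as the second argument of nan_pdist, with any extra parameters following it.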

The predefined distance functions are those listed under metric in the Parameters section.

Examples

// Compute the Euclidean distance.
X = rand(100, 5, "normal");
D = nan_pdist(X,'euclidean');

// Compute the Euclidean distance with each coordinate difference scaled by the standard deviation.
Dstd = nan_pdist(X,'seuclidean');
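For comparison, the standardized Euclidean distance between two rows can be computed by hand, as in the sketch below. It assumes that the sample variance uses the n-1 normalization and that the data contain no NaN values; how nan_pdist treats missing values is not described here, so the result is only a reference for the NaN-free case. The matrix A is an arbitrary example.

```scilab
// Standardized Euclidean distance between rows 1 and 2, computed by hand.
// Assumption: sample variance with n-1 normalization, no NaN in the data.
A = [1 2; 3 6; 5 4];
v = variance(A, "r");            // 1 x d row vector of per-column variances
d = A(1, :) - A(2, :);           // coordinate differences: [-2 -4]
d12 = sqrt(sum((d.^2) ./ v));    // each squared difference divided by its variance
```

Here both columns have variance 4, so the result is sqrt(4/4 + 16/4) = sqrt(5).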

See also

Authors

