mdscale

Nonclassical multidimensional scaling

Syntax

Y = mdscale(D,p)

[Y,stress] = mdscale(D,p)

[Y,stress,disparities] = mdscale(D,p)

[...] = mdscale(D,p,'Name',value)

Description

Y = mdscale(D,p) performs nonmetric multidimensional

scaling on the n-by-n dissimilarity

matrix D, and returns Y, a configuration

of n points (rows) in p dimensions

(columns). The Euclidean distances between points in Y approximate

a monotonic transformation of the corresponding dissimilarities in D.

By default, mdscale uses Kruskal's normalized stress1

criterion.

You can specify D as either a full n-by-n matrix,

or in upper triangle form such as is output by pdist.

A full dissimilarity matrix must be real and symmetric, and have zeros

along the diagonal and non-negative elements everywhere else. A dissimilarity

matrix in upper triangle form must have real, non-negative entries. mdscale treats NaNs

in D as missing values, and ignores those elements. Inf is

not accepted.
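
The two accepted input forms can be compared directly. The following is a minimal sketch, assuming hypothetical random data (the variable names and data are not part of the original page): it passes the same dissimilarities once as the vector returned by pdist and once as the full matrix built with squareform.

```matlab
% Hypothetical data: 10 points in 3 dimensions.
rng(0);                        % fix the seed so the run is repeatable
X = rand(10,3);

Dvec  = pdist(X);              % upper triangle form, 1-by-45 row vector
Dfull = squareform(Dvec);      % full 10-by-10 symmetric matrix, zero diagonal

Yvec  = mdscale(Dvec,2);       % configuration from the vector form
Yfull = mdscale(Dfull,2);      % configuration from the full matrix

% Both inputs describe the same dissimilarities, so the inter-point
% distances of the two configurations are expected to agree.
maxDiff = max(abs(pdist(Yvec) - pdist(Yfull)));
```

With the default start the fit is deterministic, so maxDiff is expected to be zero up to rounding.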

You can also specify D as a full similarity

matrix, with ones along the diagonal and all other elements less than

one. mdscale transforms a similarity matrix to

a dissimilarity matrix in such a way that distances between the points

returned in Y approximate sqrt(1-D).

To use a different transformation, transform the similarities prior

to calling mdscale.
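
As an illustration of the two routes, the sketch below builds a hypothetical similarity matrix and fits it once directly and once after a user-chosen transformation. The data, the Gaussian similarity, and the choice of 1-S are assumptions made for this example only. A metric criterion is used because nonmetric scaling depends only on the rank order of the dissimilarities, so a monotonic change of transformation would matter little there.

```matlab
% Hypothetical similarities: ones on the diagonal, all other
% elements strictly between zero and one.
rng(0);
X = rand(8,3);
S = exp(-squareform(pdist(X)).^2);

% Route 1: pass the similarity matrix directly. Distances in Y1
% approximate sqrt(1-S).
Y1 = mdscale(S,2,'Criterion','metricstress');

% Route 2: apply a different transformation before the call.
D  = 1 - S;                    % symmetric, zero diagonal, non-negative
Y2 = mdscale(D,2,'Criterion','metricstress');
```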

[Y,stress] = mdscale(D,p) returns the minimized

stress, i.e., the stress evaluated at Y.

[Y,stress,disparities] = mdscale(D,p) returns

the disparities, that is, the monotonic transformation of the dissimilarities D.
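
The relationship between the three outputs can be checked by hand. The sketch below, on hypothetical random data, recomputes Kruskal's stress1 from Y and disparities. It assumes the returned configuration and disparities are on the same scale, in which case stressManual should agree closely with the returned stress.

```matlab
rng(1);
X = rand(12,4);                % hypothetical data
D = pdist(X);

[Y,stress,disparities] = mdscale(D,2);

% Inter-point distances of the fitted configuration, in the same
% upper triangle order as D and disparities.
distances = pdist(Y);

% Stress1: residual sum of squares between distances and
% disparities, normalized by the sum of squared distances.
stressManual = sqrt(sum((distances - disparities).^2) / sum(distances.^2));
```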

[...] = mdscale(D,p,'Name',value) specifies

one or more optional parameter name/value pairs that control further

details of mdscale. Specify Name in

single quotes. Available parameters are

Criterion — The goodness-of-fit

criterion to minimize. This also determines the type of scaling, either

non-metric or metric, that mdscale performs. Choices

for non-metric scaling are:

'stress' — Stress normalized

by the sum of squares of the inter-point distances, also known as

stress1. This is the default.

'sstress' — Squared stress,

normalized with the sum of 4th powers of the inter-point distances.

Choices for metric scaling are:

'metricstress' — Stress,

normalized with the sum of squares of the dissimilarities.

'metricsstress' — Squared

stress, normalized with the sum of 4th powers of the dissimilarities.

'sammon' — Sammon's nonlinear

mapping criterion. Off-diagonal dissimilarities must be strictly positive

with this criterion.

'strain' — A criterion equivalent

to that used in classical multidimensional scaling.
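
The criteria listed above can be tried in a loop. This is a minimal sketch on hypothetical random data. Each criterion is normalized differently, so the returned values are not directly comparable with one another; the loop only shows how each option is selected.

```matlab
rng(2);
X = rand(15,5);                % hypothetical data
D = pdist(X);                  % distinct points, so all entries are
                               % strictly positive (required by 'sammon')

criteria = {'stress','sstress', ...                 % non-metric
            'metricstress','metricsstress', ...     % metric
            'sammon','strain'};                     % metric
fitValue = zeros(size(criteria));

for k = 1:numel(criteria)
    % The second output is the minimized value of the chosen criterion.
    [~,fitValue(k)] = mdscale(D,2,'Criterion',criteria{k});
    fprintf('%-14s %.4f\n',criteria{k},fitValue(k));
end
```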

Weights — A matrix or vector

the same size as D, containing nonnegative dissimilarity

weights. You can use these to weight the contribution of the corresponding

elements of D in computing and minimizing stress.

Elements of D corresponding to zero weights are

effectively ignored.

Note

When you specify weights as a full matrix, its diagonal elements

are ignored and have no effect, since the corresponding diagonal elements

of D do not enter into the stress calculation.
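
A short sketch of weighting, using hypothetical random data: the weight vector has the same size as D, and a few entries are set to zero so that the corresponding dissimilarities are ignored. Because the default 'cmdscale' start is not valid when there are zero weights, the call uses a random start.

```matlab
rng(3);
X = rand(10,3);                % hypothetical data
D = pdist(X);                  % 1-by-45 vector

W = ones(size(D));             % one weight per dissimilarity
W(1:5) = 0;                    % ignore the first five dissimilarities

% Zero weights rule out the 'cmdscale' start, so start randomly.
[Y,stress] = mdscale(D,2,'Weights',W,'Start','random');
```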

Start — Method used to choose

the initial configuration of points for Y. The choices are

'cmdscale' — Use the classical

multidimensional scaling solution. This is the default. 'cmdscale' is

not valid when there are zero weights.

'random' — Choose locations

randomly from an appropriately scaled p-dimensional normal

distribution with uncorrelated coordinates.

An n-by-p matrix

of initial locations, where n is the size of the matrix D and p is

the number of columns of the output matrix Y. In

this case, you can pass in [] for p and mdscale infers p from

the second dimension of the matrix. You can also supply a 3-D array,

implying a value for 'Replicates' from the array's

third dimension.
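
The explicit-start form can be sketched as follows. The data and the choice of initial locations are hypothetical: the first call passes a single n-by-p matrix with [] for p, and the second passes a 3-D array whose three pages act as three starting configurations.

```matlab
rng(4);
X = rand(10,4);                % hypothetical data
D = pdist(X);

% Single start: use the first two columns of X as initial locations.
Y0 = X(:,1:2);                 % n-by-p with n = 10, p = 2
Y  = mdscale(D,[],'Start',Y0); % p is inferred from size(Y0,2)

% Several starts: a 10-by-2-by-3 array implies three replicates.
Y0multi = randn(10,2,3);
Ymulti  = mdscale(D,[],'Start',Y0multi);
```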

Replicates — Number of times

to repeat the scaling, each with a new initial configuration. The

default is 1.
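
A minimal sketch of repeated fitting on hypothetical random data. A random start is chosen here so that each replicate begins from a different configuration; the assumption is that mdscale then returns the best solution found across the replicates.

```matlab
rng(5);
X = rand(12,4);                % hypothetical data
D = pdist(X);

% Five fits, each from a new random initial configuration.
[Y,stress] = mdscale(D,2,'Start','random','Replicates',5);
```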

Options — Options for the

iterative algorithm used to minimize the fitting criterion. Pass in

an options structure created by statset.

For example,

opts = statset(param1,val1,param2,val2, ...);

[...] = mdscale(...,'Options',opts)

The choices of statset parameters are

'Display' — Level of display

output. The choices are 'off' (the default), 'iter',

and 'final'.

'MaxIter' — Maximum number

of iterations allowed. The default is 200.

'TolFun' — Termination tolerance

for the stress criterion and its gradient. The default is 1e-4.

'TolX' — Termination tolerance

for the configuration location step size. The default is 1e-4.
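
A concrete version of the options call might look like the sketch below. The data and the particular values chosen for the parameters are illustrative assumptions, not recommendations.

```matlab
rng(6);
X = rand(12,4);                % hypothetical data
D = pdist(X);

% Report the final result, allow more iterations, and tighten both
% termination tolerances relative to the defaults.
opts = statset('Display','final', ...
               'MaxIter',500, ...
               'TolFun',1e-6, ...
               'TolX',1e-6);

[Y,stress] = mdscale(D,2,'Criterion','metricstress','Options',opts);
```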

Examples

load cereal.mat
X = [Calories Protein Fat Sodium Fiber ...
    Carbo Sugars Shelf Potass Vitamins];

% Take a subset from a single manufacturer.
X = X(strcmp('K',cellstr(Mfg)),:);

% Create a dissimilarity matrix.
dissimilarities = pdist(X);

% Use non-metric scaling to recreate the data in 2D,
% and make a Shepard plot of the results.
[Y,stress,disparities] = mdscale(dissimilarities,2);
distances = pdist(Y);

% Sort by disparity (ties broken by dissimilarity) so that the
% disparities plot as a single monotonic curve.
[~,ord] = sortrows([disparities(:) dissimilarities(:)]);
plot(dissimilarities,distances,'bo', ...
    dissimilarities(ord),disparities(ord),'r.-');
xlabel('Dissimilarities'); ylabel('Distances/Disparities')
legend({'Distances' 'Disparities'},'Location','NW');

% Do metric scaling on the same dissimilarities.
figure
[Y,stress] = ...
    mdscale(dissimilarities,2,'criterion','metricsstress');
distances = pdist(Y);

% Plot distances against dissimilarities, with a 1:1 reference line.
plot(dissimilarities,distances,'bo', ...
    [0 max(dissimilarities)],[0 max(dissimilarities)],'r.-');
xlabel('Dissimilarities'); ylabel('Distances')

Introduced before R2006a