MDS - Index

What is Multidimensional Scaling [MDS]?
• Anthony P.M. Coxon
– Emeritus Professor of Sociological Research Methods, University of Wales
– Honorary Professor, Cardiff University
– Honorary Professorial Research Fellow, University of Edinburgh
– Co-founder & Co-Director of MDS software packages:
• MDSX [OS] (freeware) and
• NewMDSX for Windows (not-for-profit)
• Website: www.newmdsx.com
See my entry on multidimensional scaling in
Lewis-Beck, M.S. et al., eds (2004) The Sage Encyclopaedia of
Social Science Research Methods. London: Sage Publications.
Uni Winchester
09/12/09
What is MDS?
Prof APM Coxon, U Cardiff
ORIGINS / DEVELOPMENT OF MDS
• MDS (aka “Smallest Space Analysis”)
– Has origins in psychometrics, 1920s to 1960s:
• Scale construction and dimensionality reduction
• Underwent a major burst of development in the 1960s, due to the
“non-metric revolution” (Coombs) and computing
developments that allowed iterative estimation
– Originally designed for analysis of the lower triangular matrix (LTM) of
dis/similarities data, taking a range of measures (not just product-moment (PM) correlations):
• “anything which, by an act of faith, can be considered a similarity”
(Shepard)
– Extended rapidly to deal with a wide range of other types of
data
• Rectangular matrices; triads, pair-comparisons, free-sorting
• “stacks” of matrices (3-way scaling – INDSCAL)
CONSTRUCTING A MAP …
– Given a map, it’s easy to calculate the distances between
the points …
– MDS operates the other way round:
• Given the data [interpreted as quasi-“distances”], it
attempts to find the configuration [location of
points] which generated the distances
» This is “Classic MDS”: developed in the 1930s, but
imperfect, not robust, and works only if the data are ratio-level.
• Whereas more recent MDS can work even when only
the ordinal information exists: “Non-metric” =
ordinal MDS (Coombs / Kruskal “non-metric revolution”)
• What?? You can create an accurate map from only
the rank order of the distances???
Yes! And it works!!
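The "other way round" step for ratio-level data can be sketched in a few lines. This is a minimal illustration of one common formulation of classical scaling (double-centring the squared distances, then an eigendecomposition), not the code of any package named in these slides. It assumes NumPy is available; the function name `classical_mds` and the four-point "map" are hypothetical.

```python
import numpy as np

def classical_mds(dist, n_dims=2):
    """Recover coordinates from a symmetric matrix of ratio-level distances."""
    d = np.asarray(dist, dtype=float)
    n = d.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centre the squared distances to obtain scalar products.
    b = -0.5 * centering @ (d ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(b)            # eigenvalues in ascending order
    top = np.argsort(eigvals)[::-1][:n_dims]        # indices of the largest ones
    scale = np.sqrt(np.clip(eigvals[top], 0.0, None))
    return eigvecs[:, top] * scale

# Hypothetical "map": four points at the corners of a 3 x 4 rectangle.
true_xy = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
dist = np.linalg.norm(true_xy[:, None, :] - true_xy[None, :, :], axis=-1)

recovered = classical_mds(dist)
recovered_dist = np.linalg.norm(
    recovered[:, None, :] - recovered[None, :, :], axis=-1
)
```

The recovered coordinates generally differ from `true_xy` by a rotation, reflection or shift, but the inter-point distances are reproduced.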
The RANK of distances can recover the Map…
though not the coastline
NEWMDSX (RUNSCRIPT = SYNTAX)
RUN NAME
COMMENT
Rank of Scottish distances,
1 = smallest; 120 = max; dissimilarity data
F3.3, p48 The User’s Guide to MDS
N OF STIMULI 16
PARAMETERS
DATA TYPE(1)
LABELS
BERWICK
EDINBURGH
GLASGOW
STRANRAER
AYR
PERTH
DUNDEE
ABERDEEN
STIRLING
OBAN
FORT_WM
INVERNESS
KYLE_LOCHALSH
BRAEMAR
ULLAPOOL
THURSO
READ MATRIX
17
53 11
92 68 36
70 30 4 11
34 7 19 83 45
27 8 29 93 58 1        [rank 1 = Perth-Dundee]
63 56 83 115 103 35 24
43 4 2 58 21 3 14 66
99 57 26 72 36 43 63 98 28
96 60 39 89 62 41 49 79 33 6
100 75 75 112 97 45 49 48 60 52 22
111 89 78 107 89 72 81 94 70 26 9 23
67 36 51 106 80 15 10 19 30 65 30 15 54
114 105 101 119 109 85 87 86 88 68 40 16 17 55
117 113 115 120 119 103 102 77 110 107 95 47 62 74 42
COMPUTE
[In the last matrix row, rank 120 = Stranraer-Thurso]
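As a rough illustration of how such rank data arise: 16 stimuli give 16 × 15 / 2 = 120 pairs, so the ranks in the lower triangle run from 1 to 120. The sketch below ranks the lower triangle of a small, made-up distance matrix (not the Scottish data); the helper name `rank_lower_triangle` is hypothetical, and ties are simply left in their original order.

```python
from itertools import combinations

def rank_lower_triangle(dist):
    """Return {(row, col): rank} for row > col, with rank 1 = smallest distance."""
    n = len(dist)
    pairs = [(j, i) for i, j in combinations(range(n), 2)]   # (row, col), row > col
    ordered = sorted(pairs, key=lambda p: dist[p[0]][p[1]])
    return {pair: rank for rank, pair in enumerate(ordered, start=1)}

n_stimuli = 16
n_pairs = n_stimuli * (n_stimuli - 1) // 2   # entries in the lower triangle

# Hypothetical toy distances between three places.
toy = [[0, 5, 9],
       [5, 0, 2],
       [9, 2, 0]]
ranks = rank_lower_triangle(toy)
```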
WHAT IS MULTIDIMENSIONAL SCALING?
A student’s definition:
– “If you are interested in how certain objects relate to each other
… and if you would like to present these relationships in the
form of a map then MDS is the technique you need” (Mr Gawels, KUB)
A good start!
– MDS provides …
• a useful and easily assimilable graphic visualisation of all sorts
of data
– Tukey: “A picture is worth a thousand words”
• in a user-chosen (small) number of dimensions
• providing a graphical representation of the structure
underlying a complex data set
• and a measure of how well / badly the solution distances match the
data dissimilarities (Stress)
MDS is a family of models
differentiated by …
– (DATA) the empirical inter-relationships
between a set of “objects”/variables which
are given in a set of dis/similarity data
» Basically, type of input data, defined by their “Way”
and “Mode” [e.g. 2W1M]. (Cf observations vs data)
– (FUNCTION) data are then optimally rescaled (according to permissible transformations for the data) in terms of …
» Choice of level of measurement [e.g. ordinal ]
– (MODEL) the assumptions of the model
chosen to represent the data
» Usually (Euclidean) Distance model
VARIANTS OF MDS due to type of data
MDS can be used with a wide variety of DATA
• SORTS OF DATA, e.g.:
– direct data (pair comparisons, ratings, rankings,
triads, counts)
– derived data (profiles, co-occurrence matrices,
textual data, aggregated data)
– measures of association etc. derived from simpler
data, and
– tables of data.
• TYPES of DATA
• Described by WAY (2W = matrix; 3W = stack of matrices …)
• And MODE (number of sets of distinct objects, e.g. variables,
subjects)
– E.g. 2W1M; 2W2M; 3W2M … 7W4M
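One informal way to picture WAY and MODE is as the shape of the data array. The sketch below uses made-up numbers and a hypothetical `shape` helper: the number of ways is the number of array dimensions, while the number of modes is how many distinct sets of objects index those dimensions.

```python
def shape(nested):
    """Length of each nesting level of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)

# 2W1M: one square matrix; rows and columns are the same three objects.
two_way_one_mode = [[0, 2, 5],
                    [2, 0, 4],
                    [5, 4, 0]]

# 2W2M: rectangular matrix; rows (two subjects) and columns (three stimuli) differ.
two_way_two_mode = [[1, 3, 2],
                    [3, 1, 2]]

# 3W2M: a stack of square object-by-object matrices, one per subject.
three_way_two_mode = [two_way_one_mode, two_way_one_mode]

ways = {name: len(shape(data)) for name, data in [
    ("2W1M", two_way_one_mode),
    ("2W2M", two_way_two_mode),
    ("3W2M", three_way_two_mode),
]}
```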
VARIANTS OF MDS MODELS due to
TRANSFORMATIONS
MDS can also be used with a wide variety of:
Transformations (“levels of measurement”)
• monotonic (ordinal),
• linear/metric (interval),
… but also
– Splines (SPSS PROXSCAL)
– local preservation of distance
– log-interval (MRSCAL),
– Power (MULTISCALE)
– “smoothness” (PARAMAP)
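The difference between the first two transformations can be sketched as two ways of fitting "disparities" to the solution distances. The monotonic (ordinal) fit below uses the pool-adjacent-violators procedure, which is one standard way to do least-squares monotone regression; the linear (interval) fit is an ordinary least-squares line. The function names and the five data values are hypothetical.

```python
def monotone_fit(values):
    """Pool-adjacent-violators: closest non-decreasing sequence (least squares)."""
    blocks = []                                   # each block is [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        # Merge backwards while neighbouring block means are out of order.
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def linear_fit(x, y):
    """Ordinary least-squares line y = a + b*x, returned as fitted values."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
    a = my - b * mx
    return [a + b * xi for xi in x]

# Hypothetical data (already in rank order) and solution distances for the same pairs.
data = [1, 2, 3, 4, 5]
dists = [1.0, 3.0, 2.0, 4.0, 6.0]

ordinal_disparities = monotone_fit(dists)        # uses only the order of `data`
interval_disparities = linear_fit(data, dists)   # uses the data values themselves
```

Here the ordinal fit only averages the two distances that violate the data order (3.0 and 2.0 become 2.5 each), whereas the linear fit forces all five onto a straight line.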
VARIANTS OF MDS due to type of MODEL
• DISTANCE “Minkowski-r”
– Usually Euclidean (r = 2)
• Less often “City Block”, r = 1
– Sometimes “Dominance”, r = ∞ (approximated by r ≈ 32)
• SCALAR PRODUCTS / Factor
– scalar product: a · b = |a| |b| cos θ
– E.g. Covariance, PM Correlation
– As used in PCA, FA, MDPREF
• COMPOSITION
– Most common: Additive (cf ANOVA),
as in Impression Formation:
X(i,j) = a(i) + b(j) + …
• nb ordinal / non-metric ANOVA
– But also difference, product, mixed
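The Minkowski-r family can be written as d(a, b) = (Σ |a_k − b_k|^r)^(1/r). A minimal sketch follows, with a hypothetical function name and two made-up points, showing the city-block, Euclidean and dominance cases and how closely r = 32 approximates the dominance metric.

```python
import math

def minkowski(a, b, r=2.0):
    """Minkowski-r distance between two points given as equal-length sequences."""
    diffs = [abs(x - y) for x, y in zip(a, b)]
    if math.isinf(r):
        return max(diffs)                        # dominance metric (r = infinity)
    return sum(d ** r for d in diffs) ** (1.0 / r)

p, q = (0.0, 0.0), (3.0, 4.0)
city_block = minkowski(p, q, r=1)            # sum of coordinate differences
euclidean = minkowski(p, q, r=2)             # straight-line distance
dominance = minkowski(p, q, r=math.inf)      # largest coordinate difference
approx_dominance = minkowski(p, q, r=32)     # large finite r, close to dominance
```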
MDS: The Basic [N-M] Model
δ(j,k) = F(d(j,k)) + ε  ->  X
(Data = Trans of Model distances, Plus Error -> Config)
• Data: Dis/similarities (2W1M / LTM, FM)
• Trans: Monotonic F (Wk/Strong)
• Model: Euclid. distances in solution
Other, e.g.
• 2W2M: Linear Trans; Vector-product Model; bi-plot Config (CORRESP)
• 3W2M (stack): Linear Trans; weighted distance Model; Config + Ss wts (INDSCAL)
HOW DOES MDS WORK?
• Iteratively!
• START: Produce an initial “guesstimate” configuration
• (a) FIT
– Calculate distances (d)
– Compare with data (δ) [via ordinal regression]
– Calculate overall badness-of-fit measure
» Stress (d - δ) … well, almost! Actually more complex
» Perfect/Acceptable? -> EXIT
• (b) IMPROVE:
For each point,
– find direction of improvement (don’t ask: calculus! Derivatives!)
– How far to move? Step-size (call it ‘heuristic’; “parachute & mist”)
• (c) MOVE configuration/points
• BACK TO (a)
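The loop above can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm of MINISSA, PROXSCAL or any other named program: it assumes NumPy, uses pool-adjacent-violators for the ordinal regression, reports Kruskal's stress formula 1, sqrt(Σ(d − d̂)² / Σd²), and moves along the gradient of the raw squared error with a fixed step size of 1/(2n) where real programs adapt the step. All function names and the eight-point test configuration are hypothetical.

```python
import numpy as np

def pairwise_distances(x):
    """Euclidean distances between all rows of x."""
    diff = x[:, None, :] - x[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def monotone_regression(values):
    """Pool-adjacent-violators: least-squares non-decreasing fit."""
    blocks = []                                        # each block: [sum, count]
    for v in values:
        blocks.append([float(v), 1])
        while (len(blocks) > 1 and
               blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    return np.concatenate([np.full(c, t / c) for t, c in blocks])

def nonmetric_mds(delta, n_dims=2, n_iter=200, tol=1e-6, seed=0):
    """Sketch of the START / FIT / IMPROVE / MOVE loop for 2W1M dissimilarities."""
    n = delta.shape[0]
    upper = np.triu_indices(n, k=1)
    order = np.argsort(delta[upper], kind="stable")    # only the rank order is used
    step = 1.0 / (2 * n)                               # fixed step size (an assumption)
    x = np.random.default_rng(seed).normal(size=(n, n_dims))   # START: random guess
    history = []
    for _ in range(n_iter):
        dist = pairwise_distances(x)                   # (a) FIT: distances d
        d = dist[upper]
        d_hat = np.empty_like(d)
        d_hat[order] = monotone_regression(d[order])   # ordinal regression on data order
        stress = np.sqrt(((d - d_hat) ** 2).sum() / (d ** 2).sum())
        history.append(stress)
        if stress < tol:                               # acceptable: EXIT
            break
        d_hat_full = np.zeros((n, n))                  # (b) IMPROVE: gradient direction
        d_hat_full[upper] = d_hat
        d_hat_full = d_hat_full + d_hat_full.T
        safe = np.where(dist > 0, dist, 1.0)           # avoid dividing by zero
        w = (dist - d_hat_full) / safe
        grad = 2 * (w.sum(axis=1)[:, None] * x - w @ x)
        x = x - step * grad                            # (c) MOVE the points
        x -= x.mean(axis=0)                            # re-centre
        x /= np.sqrt((x ** 2).sum() / n)               # fix the overall scale
    return x, history

# Hypothetical test: dissimilarities generated from a known 2-D configuration.
true_xy = np.random.default_rng(1).uniform(size=(8, 2))
delta = pairwise_distances(true_xy)
config, history = nonmetric_mds(delta)
```

`history` holds the stress value at each pass through step (a), so plotting it shows how quickly the fit improves. A different `seed` gives a different starting configuration, which is one simple way to check for local minima.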
MDS PROGRAMS:
1. Usually either “General Purpose” Package (SPSS)
– Basic Model for 2W1M data: PROXSCAL, and 3W2M INDSCAL
– Also contains CORRESP, HICLUS and (in >SPSS13) PREFSCAL (2W2M)
2. or “Library”: set of programs, each specific to Datashape, Trans & Model (e.g. NewMDSX for Windows);
includes
– BASIC 2W1M SCALING:
• Non-metric (ordinal) MINISSA, Metric (MRSCAL) linear,
• Clustering (Hierarchical & Non-hierarchical)
– 2W2M (“Rectangular”) SCALING:
• Multidimensional … Preference, Triads, Unfolding, Sorting
– 3W2M (and higher) SCALING:
• Individual Differences (INDSCAL), (Tucker) Points-of-View
• Procrustean IndDiffs (Lingoes’ PINDIS)
3. Or “Interactive” Package (PERMAP via NewMDSX)
• primarily for basic model
• Visually animated
• Superb diagnostic procedures
SITES & SOFTWARE:
SITES
– NEWMDSX AND DOCUMENTATION:
http://www.newmdsx.com
– INTERACTIVE PERMAP (Heady)
» (presently obtained via NewMDSX)
– THREE-WAY SCALING (Kroonenberg)
http://www.leidenuniv.nl/fsw/three-mode/content.htm
– FORREST YOUNG’S VISTA (Visual Statistics)
http://forrest.psych.unc.edu/research/index.html
WHAT IS MDS?
… and now for an example!
APPENDICES
1. Interpretation: Headlines
2. MVA & MDS
MDS: Interpretation: Headlines
• For Euclidean Distance MDS: “What information is
stable/significant?”
• Beware local minima [PERMAP]
• Remember: you may translate, reflect, (rigidly) rotate the
configuration: do so! [e.g. NewMDSX Graphics; PERMAP]
• “CLEARING UP” Configuration: [PERMAP]
– Map Evaluation & Diagnostics; Points and Links; selective removal
and hints of structure via Waern’s Graphic links.
• BASIC STRUCTURES:
– Regional: which points are close to each other and distant from
others? CLUSTERING [(HI)CLUS, SPSS]
– Linear: directions in space where some property is increasing:
External properties [PRO-FIT NewMDSX]
– If you must ... dimensions: remember that changing the origin or
dimensional orientation has no effect on relative distances. Most MDS
solutions are rotated at the end to principal components (PCA) ... Unlike FA,
dimensions may or may not have importance.
• SIMPLE STRUCTURES
– dimensions, yes, but also other simple structures (“horseshoes”,
radex/circumplex).
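The point about translating, reflecting and rotating can be checked numerically: none of these operations changes the inter-point distances, which is why the orientation of a Euclidean MDS solution is arbitrary. A small sketch, with a hypothetical three-point configuration and helper names:

```python
import math

def distances(points):
    """Matrix of Euclidean distances between 2-D points."""
    return [[math.dist(p, q) for q in points] for p in points]

def transform(points, angle=0.0, shift=(0.0, 0.0), reflect=False):
    """Optionally reflect in the x-axis, rotate by `angle` radians, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    out = []
    for x, y in points:
        if reflect:
            y = -y
        out.append((c * x - s * y + shift[0], s * x + c * y + shift[1]))
    return out

config = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
moved = transform(config, angle=math.pi / 3, shift=(10.0, -2.0), reflect=True)

before = distances(config)
after = distances(moved)
max_change = max(
    abs(a - b) for row_a, row_b in zip(before, after) for a, b in zip(row_a, row_b)
)
```

`max_change` is zero up to floating-point rounding, even though every coordinate has changed.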
MDS & other “Dimensional”
Multivariate Analysis models
SOME POSSIBLE WEAKNESSES in MDS
There ARE any??!
• Relative ignorance of the sampling/inferential
properties of stress
• But see simulation studies (Spence) and ML estimation
• Proneness to local-minimum solutions
• but less so now: multiple starts and interactive programs like
PERMAP allow thousands of runs to check
• A few forms of data/models are prone to degeneracies
– especially MD Unfolding (but see the new PREFSCAL in SPSS14)
• Difficulty in representing the asymmetry of causal
models
– though external analysis is very akin to dependent-independent modelling,
– and there are convergences with GLM in hybrid models such as
CLASCAL (INDSCAL with parameterization of latent classes)
Research Methods Festival
2006
What is MDS?
Prof APM Coxon, U Edinburgh