TETRAPLOT - Four Vertices Are Better Than Three
by Leon L. Fedenczuk - Gambit Consulting Ltd.
Mark Bercov - Bercov Computer Consultants Ltd.
normalized as a ratio of their sum and displayed inside of the
diagram in a three dimensional coordinate system. Each apex of
the tetrahedron plot represents a single parameter or a
scientifically valid combination of parameters. A provision for
different shapes and colors to display observation points increases
the dimensionality of the system.
ABSTRACT
Last year we gave you the Ternary Plot - this year we have added
another vertex and present the Tetrahedron Diagram. This diagram
allows the user to display four coordinate system data in a three
dimensional space. Each apex of the tetrahedron represents a single
parameter or a scientifically valid combination of parameters in the
original data under investigation. In addition, we have increased
the dimensionality of the system by providing for user selected
shapes and colors.
The Tetrahedron Diagram is a powerful extension of the previously
presented Ternary Plot. It allows for the classification of data
based on the spatial coherence of data subsets. As a consequence,
this graphical tool increases the ability of the user to easily
display and interpret information for complex systems such as:
chemical analysis, composites of rocks and various multiphase
systems, or the output from multivariate statistical procedures.
INTRODUCTION
Fig. 1
One of the most interesting areas in the visualization of statistical
data is the use of dimensional reduction techniques through
special graphical tools. This is the second paper on the use of
special plots to display multidimensional systems on simple
diagrams. The focus of the first paper (presented in New Orleans)
was on the display of three dimensional data in the form of the
Ternary Plot (1). It presented three dimensional data inside a
triangle where each vertex represented one dimension, or variable.
Fig. 1. Tetrahedron and its faces before folding
PROCEDURAL STEPS IN DRAWING OBSERVATION POINTS
The tetrahedron is bounded by four equilateral triangles and six
edges of equal length. It is placed in such a way that its center
corresponds to the origin of the right-handed X, Y, Z coordinate
system (Fig. 2) with the Z axis facing the user.
This paper describes a set of procedures implemented in SAS® to
draw a Tetrahedron Diagram. The Tetrahedron Diagram is a data
visualization tool which allows the viewing of four dimensional
closed data systems. A tetrahedron is considered the simplest
form of the polyhedra solids. These solids consist of a set of plane
polygons (referred to as faces) which bound a region of space.
The tetrahedron is the first of a five element list of polyhedra that
satisfy three conditions:
1. All faces are identical planar polygons.
2. Each vertex has the same number of incident edges as any
   other.
3. If the edges were flexible, they could be deformed to a map
   on a sphere.
The other members of the Platonic Solids which satisfy these
conditions are the cube, dodecahedron, octahedron, and
icosahedron. The tetrahedron is constructed with four equilateral
triangle faces and four vertices (Fig. 1). All the Platonic Solids
have a kind of "perfect geometric symmetry" in the sense that the
vertices of each solid are equidistant from a common center and
evenly distributed around the center (they lie upon an imaginary
sphere).
As was the case for the Ternary Plot, the Tetrahedron Diagram is
used to visualize observations in which the variables represent a
closed data set. Four variables of each data observation are
Fig. 2. Tetrahedron trigonometry
The choice of the coordinate system is arbitrary and does not
affect the final graphical output. The vertex A has been chosen on
the Y axis at a unit distance R=1 from the origin, where R is the
radius of the circumscribed sphere. The x, y, z coordinates of all
vertices are:
   A = ( 0.0000,  1.0000,  0.0000)
   B = ( 0.8165, -0.3333,  0.4714)
   C = (-0.8165, -0.3333,  0.4714)
   D = ( 0.0000, -0.3333, -0.9428).
3. Finding O2 on the O1C line (therefore on the ABC face) such
   that
      O1O2/O1C = c/(a+b+c)
   where (a+b+c) is not equal to zero.
   If (a+b)=0 then O2=C.
   If c=0 then O2=O1.
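The rounded vertex coordinates listed above can be reproduced from the stated geometry. The following Python sketch is illustrative only (the system described in this paper is written in SAS macros); the names VERTICES and edge_lengths are assumptions introduced here. It writes the four vertices in exact form for a circumscribed radius R=1 with vertex A on the Y axis, and computes the six edge lengths so that their equality can be checked.

```python
import math
from itertools import combinations

# Exact forms of the rounded vertex coordinates (circumscribed radius R = 1,
# vertex A on the Y axis, center of the tetrahedron at the origin).
VERTICES = {
    "A": (0.0, 1.0, 0.0),
    "B": (math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0),
    "C": (-math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0),
    "D": (0.0, -1.0 / 3.0, -2.0 * math.sqrt(2.0) / 3.0),
}


def edge_lengths(vertices):
    """Return the length of each of the six edges, keyed by vertex pair."""
    return {
        p + q: math.dist(vertices[p], vertices[q])
        for p, q in combinations(sorted(vertices), 2)
    }


if __name__ == "__main__":
    # Print the coordinates rounded to four decimals, as in the text.
    for name, xyz in VERTICES.items():
        print(name, tuple(round(v, 4) for v in xyz))
    print(edge_lengths(VERTICES))
```

Every edge comes out with length sqrt(8/3), about 1.633, and every vertex lies at unit distance from the origin.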
Let's assume an observation "O" having four parameter values: A,
B, C, and D.
Placing the observation points inside of the tetrahedron may be
done in a few simple steps (as shown in Fig. 3):
4. Finding O3 on the O2D line such that:
      O2O3/O2D = d/(a+b+c+d) = d
   If (a+b+c)=0 then O3=D.
The position of O3 in the X, Y, and Z coordinate system is a
tetrahedron (four dimensional) representation of the normalized
observation O(a,b,c,d) and annotated as O(x,y,z). These
calculated values of X, Y, and Z are used together with a shape
and color to display the observation using the SCATTER statement
in PROC G3D. Rotation of the coordinate system provides
additional help in finding the best spatial orientation facilitating data
classification or discrimination.
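The four placement steps described in this section (normalization, O1 on the AB edge, O2 on the O1C line, O3 on the O2D line) can be sketched in a few lines. This is a minimal Python illustration, not the SAS macro code of the TetraPlot system; the function names and the input check are assumptions introduced here, and the vertex constants are the exact forms of the rounded coordinates given in the text.

```python
import math

# Tetrahedron vertices (circumscribed radius 1, vertex A on the Y axis).
VA = (0.0, 1.0, 0.0)
VB = (math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0)
VC = (-math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0)
VD = (0.0, -1.0 / 3.0, -2.0 * math.sqrt(2.0) / 3.0)


def point_on_segment(p, q, t):
    """Return the point at fraction t of the way from p to q."""
    return tuple(pi + t * (qi - pi) for pi, qi in zip(p, q))


def place_observation(A, B, C, D):
    """Map four non-negative parameter values to (x, y, z) in the tetrahedron."""
    values = (A, B, C, D)
    total = sum(values)
    # Assumed input check: the text does not say how invalid rows are handled.
    if total <= 0 or min(values) < 0:
        raise ValueError("parameters must be non-negative with a positive sum")

    # Step 1: normalize so that a + b + c + d = 1.
    a, b, c, d = (v / total for v in values)

    # Step 2: O1 on edge AB with AO1/AB = b/(a+b).
    # If a+b = 0, O1 is undefined; C is used so that step 3 yields O2 = C.
    o1 = point_on_segment(VA, VB, b / (a + b)) if a + b > 0 else VC

    # Step 3: O2 on line O1-C with O1O2/O1C = c/(a+b+c).
    # If a+b+c = 0, O2 is undefined; D is used so that step 4 yields O3 = D.
    o2 = point_on_segment(o1, VC, c / (a + b + c)) if a + b + c > 0 else VD

    # Step 4: O3 on line O2-D with O2O3/O2D = d/(a+b+c+d) = d.
    return point_on_segment(o2, VD, d)


if __name__ == "__main__":
    print(place_observation(1, 2, 3, 4))
```

An observation with a single nonzero parameter lands on the corresponding vertex, and four equal parameters land at the center of the tetrahedron. Expanding the three interpolations shows that the result equals the weighted sum a*A + b*B + c*C + d*D of the vertex coordinates, so the order in which the vertices are processed does not change the final position.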
SYSTEM OVERVIEW
The present version of the TetraPlot Plotting System is written as
a SAS/AF® application using SAS Version 6.04 on the PC (DOS).
The bulk of the actual processing is performed by SAS Macros, as
opposed to SCL code. We believe that this design decision would
make it relatively simple to port the nucleus of this system to a
non-SAS/AF environment (if desired). The actual creation of the
Tetrahedron Diagram requires the input of the names of four
numeric variables (one for each vertex of the tetrahedron), a
numeric size variable, an alphabetic shape variable, and an
alphabetic color variable. The TetraPlot System which has been
developed can be utilized as is (right out of the box), and can also
be modified quite easily to meet specific user requirements.
Fig. 3. Finding position of the observation (unfolded tetrahedron)
1. Normalization of parameters into the range 0-1 as proportions
   of the sum of four parameters. This process transforms data
   into new system coordinates a, b, c, and d corresponding to
   tetrahedron vertices A, B, C, and D.
      O(a,b,c,d) = ( A/(A+B+C+D), B/(A+B+C+D),
                     C/(A+B+C+D), D/(A+B+C+D) )
   This forces the sum of the normalized parameters to always
   be equal to 1.
2. Finding point O1 on the AB edge based on a ratio of the A
   and B parameters after normalization such that:
      AO1/AB = b/(a+b)
   where (a+b) is not equal to zero.
   If a=0 then O1=B.
   If b=0 then O1=A.
USER INTERFACE
The user initiates a Tetrahedron Plotting System Session on a PC
by executing a Batch file (TETRAPLT.BAT). This file outputs a
message on the screen, and invokes SAS, specifying that the SAS
AUTOEXEC file TETRAPLT.SAS is to be executed. This
AUTOEXEC file outputs a 'Please be patient' message on the
terminal, allocates all of the required FILENAMEs and
LIBNAMEs, submits all of the required SAS MACRO Programs
for compilation (remember, this is SAS 6.04), and invokes a
SAS/AF Session.
The Primary SAS/AF Session Menu allows the user to select one
of the three submenus, to return to SAS Display Manager, to exit
from SAS completely, or to obtain help.
The first submenu provides a primitive data import facility (which
allows the user to read data from an ASCII file and store this data
in a SAS dataset).
The second submenu allows the user to select a desired graphics
device (from a list which currently includes a VGA terminal, an
EGA terminal, a HP Laserjet printer, a HP Paintjet plotter, a HP
7550 A size plotter, and a HP Draftmaster II (E size) plotter). The
user is also requested to provide the desired plot orientation
(landscape or portrait).
The third submenu allows the user to specify the input SAS
dataset name, the names of the four numeric variables to be
analyzed, and the names of the numeric size variable, the
alphabetic shape variable, and the alphabetic color variable in the
input SAS dataset. The second screen of this submenu allows the
user to (optionally) provide a title for the graph, annotation for each
of the vertices, a rotate expression, and a tilt expression. The
rotate expression and tilt expression may be any valid expression
allowed by PROC G3D. Once the user has specified the desired
options, SAS code consisting primarily of the appropriate MACRO
programs (which perform all of the actual processing) is submitted.
SAS PROCESSING
The SAS Code utilized to create the Tetrahedron Plot is amazingly
simple and straightforward, as outlined in Fig. 4. Processing
commences with the creation of a SAS annotate dataset which
contains all of the required information for the tetrahedron axes
and labels, the tetrahedron outline, and the tetrahedron vertex
annotation text. The numeric data representing the closed system
to be analyzed is normalized, and the X, Y, and Z coordinates at
which each data point is to be plotted are calculated (as described
in the algorithm above). The actual Tetrahedron Plot is created
using the PROC G3D SCATTER statement, and the graph is stored
in a temporary graphics catalog. The process is concluded by
replaying the stored graph on the specified graphics device. This
procedure of creating a graph, storing this graph in a graphics
catalog, and replaying the stored graph allows the use of
GREPLAY color maps which translate the original colors specified
into the appropriate colors for the target graphics device.
Fig. 4. Tetrahedron Plot Creation (flowchart: create SAS annotate
dataset for a) axes and labels, b) tetrahedron outline, c) vertex
annotation; normalize numeric data and calculate X, Y, and Z
coordinates; create Tetrahedron Plot using PROC G3D and SCATTER
statement and store in graphics catalog; replay plot on specified
graphics device after applying appropriate color map)
A brief description of each of the SAS Macro programs utilized in the
TetraPlot Plotting System appears below.
M_TETDIL - determines input list (consisting of variable names
           and possibly informats) to be utilized to read
           data
M_TETGDF - defines selected graphics device GOPTIONS and
           color map
M_TETANO - creates annotation for axes and labels,
           tetrahedron outline, and vertex text
         - creates temporary annotate dataset TETRANNO
M_TETCRD - reads, checks, and normalizes input data
         - calculates X, Y, Z coordinates
         - creates temporary G3D input dataset TETRAG3D
M_TETPLT - creates graph in temporary graphics catalog
         - replays graph on specified graphics device (with
           appropriate color mapping)
         - performs clean up
PROBLEM AREAS
1. Under SAS 6.04, it is not possible to specify a tilt angle or a
   rotate angle of more than 90 degrees. This seriously impairs
   the user's ability to view the graph from the most
   advantageous position. This limitation will no longer apply to
   PROC G3D in SAS Release 6.06.
2. It is currently necessary to generate dummy minimum and
   maximum X, Y, and Z values in the PROC G3D input dataset
   to ensure that annotation will appear within the screen area.
   This will obviously no longer be a requirement once the AXIS
   statement is available for PROC G3D in SAS Release 6.06.
3. It is not possible to utilize the SAS Annotate Macros for this
   project because they do not support specification of a Z
   coordinate.
4. Because it was necessary to specify the axis and label
   coordinates in terms of data values (as opposed to screen
   values), the axes and labels are affected by any tilt and rotate
   values which are utilized. Even if the AXIS statement were
   available for use, it is doubtful that it is powerful enough to
   allow the placement of the origin of the axes at the center of
   the tetrahedron and the placement of the axis labels at the
   desired positions on the graph. In an ideal world, it would be
   desirable for the axes and labels to be unaffected by tilt and
   rotate.
5. With only four exceptions, the shapes available for use in the
   SCATTER statement in PROC G3D are one dimensional.
   Accordingly, when the tetrahedron is rotated, these shapes
   are plotted without rotation, seriously reducing the visual effect
   of the graphs produced and limiting the user's ability to
   discern depth within the plotted image.
6. Rendering (shading) of the sides of the tetrahedron is well
   beyond the capabilities available in PROC G3D.
7. The inability to specify color maps in all of the SAS/GRAPH®
   procedures (with the exception of PROC GREPLAY) remains
   an annoying limitation when colors which are not available on
   the selected output device have been specified. At the
   present time, the user either accepts the default color
   mapping supplied by SAS/GRAPH for all of the non-available
   colors, or outputs the graph produced in a graphics catalog,
   and then replays the graph using PROC GREPLAY with a
   user-supplied color map. Hopefully, this limitation will soon be
   rectified by the Institute.
8. We would love to get our hands on a copy of SAS/NVISION
   to see how it might be utilized for the production of
   tetrahedron plots, but will likely be unable to do so because
   of the non-trivial costs involved.
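Two of the points above lend themselves to a short illustration: the tetrahedron outline built for the annotate dataset, and the dummy minimum and maximum X, Y, and Z values of problem area 2. The Python sketch below is hypothetical and is not the SAS annotate code; the function names and the margin value of 0.2 are assumptions made here, since the paper does not state how the dummy values were chosen.

```python
import math
from itertools import combinations

# Tetrahedron vertices (circumscribed radius 1, vertex A on the Y axis).
VERTICES = {
    "A": (0.0, 1.0, 0.0),
    "B": (math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0),
    "C": (-math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0),
    "D": (0.0, -1.0 / 3.0, -2.0 * math.sqrt(2.0) / 3.0),
}


def outline_segments(vertices):
    """Return the six tetrahedron edges as (start, end) coordinate pairs."""
    return [
        (vertices[p], vertices[q])
        for p, q in combinations(sorted(vertices), 2)
    ]


def dummy_extent_rows(vertices, margin=0.2):
    """Return two dummy (x, y, z) rows: padded minimum and maximum corners.

    The margin is a hypothetical allowance for vertex labels drawn just
    outside the tetrahedron.
    """
    points = list(vertices.values())
    low = tuple(min(p[i] for p in points) - margin for i in range(3))
    high = tuple(max(p[i] for p in points) + margin for i in range(3))
    return [low, high]


if __name__ == "__main__":
    for start, end in outline_segments(VERTICES):
        print("draw from", start, "to", end)
    print(dummy_extent_rows(VERTICES))
```

Appending the two dummy rows to the plotted data widens the data range so that labels placed slightly beyond the vertices still fall inside the plotted volume.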
APPLICATIONS OF THE TETRAHEDRON DIAGRAM
Graphical methods for representing multivariate data may be useful
for the detection and analysis of patterns by facilitating their
visualization. The Tetrahedron Diagram is one of the best tools to
display information such as the results of chemical analysis, rock
compositions, and any multiphase systems (e.g., fluid equilibriums,
formation water analysis, etc.).
The system is especially useful in the direct discrimination of
observations based on spatial separation of points representing
different observations. Also, it allows for classification purposes
based on spatial coherence of data subsets. These features and
increased dimensionality make the tetrahedron plot a desirable tool
in combination with multivariate procedures (e.g., principal
component analysis, Q-mode and R-mode Factor Analysis).
Perhaps we can best illustrate the effectiveness of this tool in the
analysis of multivariate data by displaying the actual results of two
separate studies. In the first case, all of the points being analyzed
were located along one face of the tetrahedron (that is, an actual
data clustering does exist). It is clear from the first view (Fig. 5),
that this clustering is not always readily discerned. However, by
rotating and tilting the tetrahedron (Fig. 6), the actual nature of the
data becomes apparent.
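The effect described in the first case study can be imitated numerically. In the Python sketch below, a plain rotation about the X axis followed by dropping the Z coordinate stands in for the viewing transform; this is an assumption made for illustration and is not the PROC G3D TILT or ROTATE implementation. The sample weights and function names are likewise hypothetical. Points with d=0 lie on face ABC, and after a tilt of about 70.5 degrees that face is seen edge-on, so the points collapse onto a single horizontal line.

```python
import math

# Tetrahedron vertices (circumscribed radius 1, vertex A on the Y axis).
VA = (0.0, 1.0, 0.0)
VB = (math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0)
VC = (-math.sqrt(2.0 / 3.0), -1.0 / 3.0, math.sqrt(2.0) / 3.0)
VD = (0.0, -1.0 / 3.0, -2.0 * math.sqrt(2.0) / 3.0)


def weighted_point(a, b, c, d):
    """Weighted sum of the vertices; the weights are assumed to sum to 1."""
    return tuple(
        a * pa + b * pb + c * pc + d * pd
        for pa, pb, pc, pd in zip(VA, VB, VC, VD)
    )


def tilt_about_x(p, degrees):
    """Rotate point p about the X axis by the given angle."""
    t = math.radians(degrees)
    x, y, z = p
    return (x, y * math.cos(t) - z * math.sin(t),
            y * math.sin(t) + z * math.cos(t))


def demo():
    """Return screen (x, y) positions before and after the edge-on tilt."""
    # Hypothetical observations with d = 0, so all lie on face ABC.
    weights = [(0.2, 0.3), (0.6, 0.1), (0.1, 0.8), (0.4, 0.4)]
    face_points = [weighted_point(a, b, 1.0 - a - b, 0.0) for a, b in weights]
    # The normal of face ABC points away from D; tilt until it lies along Y.
    edge_on = -math.degrees(math.atan2(-VD[2], -VD[1]))
    front = [p[:2] for p in face_points]
    tilted = [tilt_about_x(p, edge_on)[:2] for p in face_points]
    return front, tilted


if __name__ == "__main__":
    front, tilted = demo()
    print("front view :", front)
    print("edge-on    :", tilted)
```

In the front view the points are spread in both screen directions, while in the tilted view they all share the same screen height, which is what makes membership of a single face visible.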
Fig. 5. Initial View of First Case Study
Fig. 6. Final View of First Case Study
The second study illustrates the use of the tetrahedron diagram in
locating clusters of data. The analyzed data represents three
multivariate populations. This is clearly illustrated in the four views
shown in Fig. 7, 8, 9, and 10.
Fig. 7. First View of Second Case Study
Fig. 8. Second View of Second Case Study
Fig. 10. Fourth View of Second Case Study
CONCLUSIONS
A new tool for data analysis of four dimensional data has been
designed and tested with the use of the SAS system. The SAS
PROC G3D, annotate, and macro facilities have been chosen to
build the system, which produces tetrahedron plots. This paper
demonstrates how this plot can be incorporated in data analysis
and can improve statistical interpretation which is difficult or
impossible with other graphical tools.
REFERENCES
1. Leon L. Fedenczuk, Mark Bercov. Ternplot - SAS Creation of
Ternary Plots, SAS Users Group International Sixteenth Annual
Conference, New Orleans, Louisiana, Feb. 1991, pp. 771-778.
If you have any questions or comments, or desire further
information, the authors may be contacted at
Gambit Consulting Ltd.
144 Palisway Dr. S.W.
CALGARY, ALBERTA
T2V 3V6, CANADA
Fig. 9. Third View of Second Case Study
SAS, SAS/AF, and SAS/GRAPH are registered trademarks or
trademarks of SAS Institute Inc. in the USA and other countries.
® indicates USA registration.