Scatter Diagram

Also called: scatterplot, X-Y graph
Description
The scatter diagram graphs pairs of numerical data, one variable on each axis, to look
for a relationship between them. If the variables are correlated, the points will fall along
a line or curve. The better the correlation, the tighter the points will hug the line.
When to Use
• When you have paired numerical data, and . . .
• When the dependent variable may have multiple values for each value of the
  independent variable, and . . .
• When trying to determine whether the two variables are related, such as:
  - When trying to identify potential root causes of problems, or . . .
  - After brainstorming causes and effects using a fishbone diagram, to determine
    objectively whether a particular cause and effect are related, or . . .
  - When determining whether two effects that appear to be related both occur with
    the same cause, or . . .
  - When testing for autocorrelation before constructing a control chart
Procedure
1. Collect pairs of data where a relationship is suspected.
2. Draw a graph with the independent variable on the horizontal axis and the
   dependent variable on the vertical axis. For each pair of data, put a dot or a
   symbol where the x-axis value intersects the y-axis value. If two dots fall
   together, put them side by side, touching, so that you can see both.
3. Look at the pattern of points to see if a relationship is obvious. If the data clearly
   form a line or a curve, you may stop. The variables are correlated. You may wish to
   use regression or correlation analysis now. Otherwise, complete steps 4 through 7.
4. Divide points on the graph into four quadrants. If there are X points on
   the graph,
   • Count X/2 points from top to bottom and draw a horizontal line.
   • Count X/2 points from left to right and draw a vertical line.
   If the number of points is odd, draw the line through the middle point.
5. Count the points in each quadrant. Do not count points on a line.
6. Add the diagonally opposite quadrants. Find the smaller sum and the total of
   points in all quadrants.
   A = points in upper left + points in lower right
   B = points in upper right + points in lower left
   Q = the smaller of A and B
   N = A + B
7. Look up the limit for N on the trend test table (Table 5.18).
   • If Q is less than the limit, the two variables are related.
Table 5.18 Trend test table

N        Limit    N        Limit
1-8      0        51-53    18
9-11     1        54-55    19
12-14    2        56-57    20
15-16    3        58-60    21
17-19    4        61-62    22
20-22    5        63-64    23
23-24    6        65-66    24
25-27    7        67-69    25
28-29    8        70-71    26
30-32    9        72-73    27
33-34    10       74-76    28
35-36    11       77-78    29
37-39    12       79-80    30
40-41    13       81-82    31
42-43    14       83-85    32
44-46    15       86-87    33
47-48    16       88-89    34
49-50    17       90       35
   • If Q is greater than or equal to the limit, the pattern could have occurred from
     random chance.
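Steps 4 through 7 can be sketched in code. The following is a minimal illustration, not a definitive implementation: the function names trend_limit and quadrant_test are hypothetical, the median of each variable stands in for the hand-drawn dividing lines (an assumption that matches the manual method when no values are tied), and the limits are transcribed from Table 5.18 on the assumption that its ranges are contiguous and its limits rise by one per row.

```python
from statistics import median

# Table 5.18 as (largest N in the range, limit) pairs.
# Assumption: ranges are contiguous and limits rise by one per row.
TREND_LIMITS = [
    (8, 0), (11, 1), (14, 2), (16, 3), (19, 4), (22, 5), (24, 6), (27, 7),
    (29, 8), (32, 9), (34, 10), (36, 11), (39, 12), (41, 13), (43, 14),
    (46, 15), (48, 16), (50, 17), (53, 18), (55, 19), (57, 20), (60, 21),
    (62, 22), (64, 23), (66, 24), (69, 25), (71, 26), (73, 27), (76, 28),
    (78, 29), (80, 30), (82, 31), (85, 32), (87, 33), (89, 34), (90, 35),
]


def trend_limit(n):
    """Step 7: look up the limit for N in the trend test table."""
    if n < 1:
        raise ValueError("N must be at least 1")
    for upper, limit in TREND_LIMITS:
        if n <= upper:
            return limit
    raise ValueError("N is beyond the range of the table (maximum 90)")


def quadrant_test(points):
    """Steps 4-7 for a list of (x, y) pairs."""
    # Step 4: the medians split the points in half in each direction.
    x_line = median(x for x, _ in points)
    y_line = median(y for _, y in points)

    upper_left = upper_right = lower_left = lower_right = 0
    for x, y in points:
        # Step 5: points lying on either line are not counted.
        if x == x_line or y == y_line:
            continue
        if y > y_line:
            if x < x_line:
                upper_left += 1
            else:
                upper_right += 1
        else:
            if x < x_line:
                lower_left += 1
            else:
                lower_right += 1

    # Step 6: add diagonally opposite quadrants.
    a = upper_left + lower_right
    b = upper_right + lower_left
    q = min(a, b)
    n = a + b

    # Step 7: the variables are judged related only if Q is below the limit.
    limit = trend_limit(n)
    return {"A": a, "B": b, "Q": q, "N": n, "limit": limit, "related": q < limit}


# Illustrative data only: ten points that rise together perfectly.
result = quadrant_test([(i, i) for i in range(1, 11)])
print(result)
```

With the ten illustrative points, all counted points fall in the lower left and upper right quadrants, so Q is 0 and N is 10, and Q is below the table limit for N = 10.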
Example
This example is part of the ZZ-400 improvement story on page 85. The ZZ-400 manufacturing team suspects a relationship between product purity (percent purity) and the
amount of iron (measured in parts per million or ppm). Purity and iron are plotted against
each other as a scatter diagram in Figure 5.172.
There are 24 data points. Median lines are drawn so that 12 points fall on each side
for both percent purity and ppm iron.
To test for a relationship, they calculate:
A = points in upper left + points in lower right = 8 + 9 = 17
B = points in upper right + points in lower left = 4 + 3 = 7
Q = the smaller of A and B = the smaller of 7 and 17 = 7
N = A + B = 17 + 7 = 24
Then they look up the limit for N on the trend test table. For N = 24, the limit is 6.
Q is greater than the limit. Therefore, the pattern could have occurred from random
chance, and no relationship is demonstrated.
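The arithmetic in this example can be checked with a short sketch. The function name trend_test is hypothetical, the quadrant counts are assigned in the order the example lists them, and the limit of 6 is the value the example reads from the trend test table for N = 24.

```python
def trend_test(upper_left, upper_right, lower_left, lower_right, limit):
    """Combine quadrant counts into A, B, Q and N, then apply the limit."""
    a = upper_left + lower_right   # one diagonal
    b = upper_right + lower_left   # the other diagonal
    q = min(a, b)                  # the smaller sum
    n = a + b                      # total points counted
    related = q < limit            # related only if Q is below the limit
    return a, b, q, n, related


# Counts from the ZZ-400 example; limit for N = 24 taken from the text.
a, b, q, n, related = trend_test(
    upper_left=8, upper_right=4, lower_left=3, lower_right=9, limit=6
)
print(f"A={a} B={b} Q={q} N={n} related={related}")
```

Because Q (7) is not less than the limit (6), the sketch reports no demonstrated relationship, in agreement with the team's conclusion.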
[Figure 5.172 Scatter diagram example. Chart title: Purity vs. Iron. Horizontal axis: iron (parts per million). Vertical axis: percent purity.]
Considerations
• In what kind of situations might you use a scatter diagram? Here are some
  examples:
  - Variable A is the temperature of a reaction after 15 minutes. Variable B
    measures the color of the product. You suspect higher temperature makes
    the product darker. Plot temperature and color on a scatter diagram.
  - Variable A is the number of employees trained on new software, and variable
    B is the number of calls to the computer help line. You suspect that more
    training reduces the number of calls. Plot number of people trained versus
    number of calls.
  - To test for autocorrelation of a measurement being monitored on a control
    chart, plot this pair of variables: Variable A is the measurement at a given
    time. Variable B is the same measurement, but at the previous time. If the
    scatter diagram shows correlation, do another diagram where variable B is the
    measurement two times previously. Keep increasing the separation between
    the two times until the scatter diagram shows no correlation.
• Even if the scatter diagram shows a relationship, do not assume that one variable
  caused the other. Both may be influenced by a third variable.
• When the data are plotted, the more the diagram resembles a straight line,
  the stronger the relationship. See Figures 5.39 through 5.42, page 198 in the
  correlation analysis entry, for examples of the types of graphs you might see
  and their interpretations.
• If a line is not clear, statistics (N and Q) determine whether there is reasonable
  certainty that a relationship exists. If the statistics say that no relationship exists,
  the pattern could have occurred by random chance.
• If the scatter diagram shows no relationship between the variables, consider
  whether the data might be stratified. See stratification for more details.
• If the diagram shows no relationship, consider whether the independent (x-axis)
  variable has been varied widely. Sometimes a relationship is not apparent because
  the data don't cover a wide enough range.
• Think creatively about how to use scatter diagrams to discover a root cause.
• See graph for more information about graphing techniques. Also see the
  graph decision tree (Figure 5.68, page 216) in that section for guidance on
  when to use scatter diagrams and when other graphs may be more useful for
  your situation.
• Drawing a scatter diagram is the first step in looking for a relationship between
  variables. See correlation analysis and regression analysis for statistical methods
  you can use.
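The autocorrelation check described in the considerations above pairs each measurement with an earlier one from the same series. A minimal sketch of building those pairs follows; the function name lagged_pairs and the sample readings are hypothetical, chosen only for illustration.

```python
def lagged_pairs(measurements, lag=1):
    """Pair each measurement (variable A) with the one `lag` steps earlier (variable B)."""
    if lag < 1:
        raise ValueError("lag must be at least 1")
    return [
        (measurements[i], measurements[i - lag])
        for i in range(lag, len(measurements))
    ]


# Made-up readings, listed in time order.
readings = [5, 7, 6, 8]

pairs_lag_1 = lagged_pairs(readings, lag=1)   # previous time
pairs_lag_2 = lagged_pairs(readings, lag=2)   # two times previously
print(pairs_lag_1)
print(pairs_lag_2)
```

Each list of pairs can be plotted as its own scatter diagram, increasing the lag until the diagram shows no correlation.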