Design of Experiments and Data Analysis
Graduate Seminar
29 Oct, 5 Nov
Hal Carter
Agenda
● Analyze and Display Data
  – Simple Statistical Analysis
  – Comparing Results
  – Determining O(n)
● Design Your Experiments
  – 2^k Designs Including Replications
  – Full Factor Designs
A System
[Diagram: Factors (system inputs) → System → Responses (system outputs)]
Experimental Research
[Flow: Define System → Identify Factors and Levels → Identify Response(s) → Design Experiments]
● Define system outputs first
● Then define system inputs
● Finally, define behavior (i.e., transfer function)
● Identify system parameters that vary (many)
● Reduce parameters to important factors (few)
● Identify values (i.e., levels) for each factor
● Identify time or space effects of interest
● Identify factor-level experiments
Create and Execute System; Analyze Data
[Flow: Define Workload → Create System → Execute System → Analyze & Display Data]
● Workloads are inputs that are applied to the system
● Workload can be a factor (but often isn't)
● Create the system so it can be executed:
  – Real prototype
  – Simulation model
  – Empirical equations
● Execute the system for each factor-level binding
● Collect and archive response data
● Analyze data according to the experiment design
● Evaluate raw and analyzed data for errors
● Display raw and analyzed data to draw conclusions
Some Examples
● Analog Simulation
  – Which of three solvers is best?
  – What is the system?
  – Responses
    ● Fastest simulation time
    ● Most accurate result
    ● Most robust to types of circuits being simulated
  – Factors
    ● Solver
    ● Type of circuit model
    ● Matrix data structure
● Epitaxial growth
  – New method using nonlinear temperature profile
  – What is the system?
  – Responses
    ● Total time
    ● Quality of layer
    ● Total energy required
    ● Maximum layer thickness
  – Factors
    ● Temperature profile
    ● Oxygen density
    ● Initial temperature
    ● Ambient temperature
SIMPLE MODELS OF DATA
Evaluation of a new wireless network protocol.
System: wireless network with new protocol
Workload:
  10 messages applied at a single source
  Each message has an identical configuration
Experiment output:
  Roundtrip latency per message (ms)
  Data file "latency.dat"

Latency (ms): 22, 23, 19, 18, 15, 20, 26, 17, 19, 17

Mean: 19.6 ms
Variance: 10.71 ms²
Std Dev: 3.27 ms
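As a sanity check, the summary statistics above can be reproduced in a few lines. The slides use R; this is just an equivalent Python sketch using the standard library:

```python
import statistics

# Roundtrip latencies (ms) from latency.dat
latency = [22, 23, 19, 18, 15, 20, 26, 17, 19, 17]

xbar = statistics.mean(latency)      # sample mean
s2 = statistics.variance(latency)    # sample variance (n-1 denominator)
s = statistics.stdev(latency)        # sample standard deviation

print(xbar, round(s2, 2), round(s, 2))  # 19.6 10.71 3.27
```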
Verify Model Preconditions
Check randomness:
  Use a plot of residuals around the mean
  Residuals appear random
Check normal distribution:
  Use a quantile-quantile plot
  The pattern adheres consistently along the ideal quantile-quantile line
Confidence Intervals
Sample mean vs. population mean
CI, > 30 samples:
  (x̄ − z[1−α/2]·s/√n,  x̄ + z[1−α/2]·s/√n)
CI, < 30 samples:
  (x̄ − t[1−α/2; n−1]·s/√n,  x̄ + t[1−α/2; n−1]·s/√n)
For the latency data, n = 10, α = 0.05:
  (17.26, 21.94)

Raj Jain, "The Art of Computer Systems Performance Analysis," Wiley, 1991.
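A sketch of the small-sample CI for the latency data, with the t quantile t[0.975; 9] = 2.262 taken from a standard table:

```python
import math
import statistics

latency = [22, 23, 19, 18, 15, 20, 26, 17, 19, 17]
n = len(latency)
xbar = statistics.mean(latency)
s = statistics.stdev(latency)

t = 2.262  # t[1-alpha/2; n-1] for alpha = 0.05, n-1 = 9 degrees of freedom
half = t * s / math.sqrt(n)
ci = (round(xbar - half, 2), round(xbar + half, 2))
print(ci)  # (17.26, 21.94)
```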
Scatter and Line Plots
Resistance profile of doped silicon epitaxial layer
Expect a linear resistance increase as depth increases

Depth  Resistance
 1      1.689015
 2      4.486722
 3      7.915209
 4      6.362388
 5     11.830739
 6     12.329104
 7     14.011396
 8     17.600094
 9     19.022146
10     21.513802
Linear Regression Statistics
> model = lm(Resistance ~ Depth)
> summary(model)

Residuals:
     Min       1Q   Median       3Q      Max
-2.11330 -0.40679  0.05759  0.51211  1.57310

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.05863    0.76366  -0.077     0.94
Depth        2.13358    0.12308  17.336 1.25e-07 ***
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 1.118 on 8 degrees of freedom
Multiple R-Squared: 0.9741, Adjusted R-squared: 0.9708
F-statistic: 300.5 on 1 and 8 DF, p-value: 1.249e-07
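The slope, intercept, and R² reported by lm() can be recomputed from the closed-form least-squares expressions; a Python sketch using the depth/resistance data above:

```python
# Least-squares fit of Resistance ~ Depth, cross-checking the R lm() output
depth = list(range(1, 11))
resistance = [1.689015, 4.486722, 7.915209, 6.362388, 11.830739,
              12.329104, 14.011396, 17.600094, 19.022146, 21.513802]

n = len(depth)
xbar = sum(depth) / n
ybar = sum(resistance) / n
sxx = sum((x - xbar) ** 2 for x in depth)
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(depth, resistance))

slope = sxy / sxx                # R reports 2.13358
intercept = ybar - slope * xbar  # R reports -0.05863

sst = sum((y - ybar) ** 2 for y in resistance)
sse = sum((y - (intercept + slope * x)) ** 2
          for x, y in zip(depth, resistance))
r2 = 1 - sse / sst               # R reports 0.9741
print(round(slope, 5), round(intercept, 5), round(r2, 4))
```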
Validating Residuals
Errors are only marginally normally distributed, due to the "tails"
Comparing Two Sets of Data
Example: Consider two different wireless access points. Which one is faster?
Approach: Take the difference of the data and determine the CI of the difference.
Inputs: same set of 10 messages communicated through both access points.
If the CI straddles zero, we cannot tell which access point is faster.

Response (usecs):
Latency1  Latency2
22        19
23        20
19        24
18        20
15        14
20        18
26        21
17        17
19        17
17        18

CI95% = (-1.27, 2.87) usecs
The confidence interval straddles zero; thus we cannot determine which access point is faster with 95% confidence.
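A sketch of the paired-difference CI computation in Python, using the two latency columns above:

```python
import math
import statistics

lat1 = [22, 23, 19, 18, 15, 20, 26, 17, 19, 17]
lat2 = [19, 20, 24, 20, 14, 18, 21, 17, 17, 18]

# Per-message differences; a paired comparison uses their mean and spread
diff = [a - b for a, b in zip(lat1, lat2)]
n = len(diff)
dbar = statistics.mean(diff)
s = statistics.stdev(diff)

t = 2.262  # t[0.975; 9] from a standard table
half = t * s / math.sqrt(n)
ci = (round(dbar - half, 2), round(dbar + half, 2))
print(ci)  # (-1.27, 2.87) -- straddles zero, so no winner at 95%
```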
Plots with Error Bars
Execution time of SuperLU linear system solution (Ax = b) on a parallel computer.
For each p, the problem was run multiple times with the same matrix size but different values.
The mean and CI were determined for each p, giving the curve and its error intervals.
How to Determine O(n)
> model = lm(t ~ poly(p,4))
> summary(model)

Call:
lm(formula = t ~ poly(p, 4))

Residuals:
      1       2       3       4       5       6       7       8       9
-0.4072  0.7790  0.5840 -1.3090 -0.9755  0.8501  2.6749 -3.1528  0.9564

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) 236.9444     0.7908 299.636 7.44e-10 ***
poly(p, 4)1 679.5924     2.3723 286.467 8.91e-10 ***
poly(p, 4)2 268.3677     2.3723 113.124 3.66e-08 ***
poly(p, 4)3  42.8772     2.3723  18.074 5.51e-05 ***
poly(p, 4)4   2.4249     2.3723   1.022    0.364
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Residual standard error: 2.372 on 4 degrees of freedom
Multiple R-Squared: 1, Adjusted R-squared: 0.9999
F-statistic: 2.38e+04 on 4 and 4 DF, p-value: 5.297e-09
R² – Coefficient of Determination
The squared error around the mean is
  SST = Σ(yᵢ − mean(y))² = Σyᵢ² − n·mean(y)² = SSY − SS0
The squared error around the model is
  SSE = Σeᵢ²
SSR = SST − SSE
R² = SSR/SST = (SST − SSE)/SST
R² is a measure of how good the model is; the closer R² is to 1, the better.
Example: Let SST = 1499 and SSE = 97. Then R² = 93.5%.
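The worked example is a one-liner; a quick check:

```python
# Example from the slide: SST = 1499, SSE = 97
sst = 1499
sse = 97
ssr = sst - sse            # variation explained by the model
r2 = ssr / sst             # coefficient of determination
print(round(100 * r2, 1))  # 93.5 (%)
```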
Using the t-test
Consider the following data ("sleep.R"):

   extra group
1    0.7     1
2   -1.6     1
3   -0.2     1
4   -1.2     1
5   -0.1     1
6    3.4     1
7    3.7     1
8    0.8     1
9    0.0     1
10   2.0     1
11   1.9     2
12   0.8     2
13   1.1     2
14   0.1     2
15  -0.1     2
16   4.4     2
17   5.5     2
18   1.6     2
19   4.6     2
20   3.4     2

From "Introduction to R", http://www.R-project.org
t.test Result
> t.test(extra ~ group, data = sleep)

        Welch Two Sample t-test

data:  extra by group
t = -1.8608, df = 17.776, p-value = 0.0794
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.3654832  0.2054832
sample estimates:
mean of x mean of y
     0.75      2.33

The p-value is the smallest significance level (1 − confidence) at which the null hypothesis can be rejected.
Here p = 0.0794, so we can conclude the difference is not 0 at confidence levels up to about 92%, but not at 95%.
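The Welch statistic and its Welch-Satterthwaite degrees of freedom can be recomputed directly from the definitions; a Python sketch on the sleep data:

```python
import math
import statistics

# sleep data: "extra" hours of sleep by group
g1 = [0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0]
g2 = [1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4]

m1, m2 = statistics.mean(g1), statistics.mean(g2)
v1, v2 = statistics.variance(g1), statistics.variance(g2)
n1, n2 = len(g1), len(g2)

se2 = v1 / n1 + v2 / n2
t = (m1 - m2) / math.sqrt(se2)  # R reports -1.8608
# Welch-Satterthwaite approximation for the degrees of freedom
df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
print(round(t, 4), round(df, 3))  # matches the R output above
```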
2^k Factorial Design (k = 2)
y = q0 + qA·xA + qB·xB + qAB·xAB
SST = total variation around the mean
    = Σ(yᵢ − mean(y))²
    = SSA + SSB + SSAB,  where SSA = 2²·qA²
Note: var(y) = SST/(n − 1)
Fraction of variation explained by A = SSA/SST
2^k Design
Factor              Levels
Line Length (L)     32, 512 words
No. Sections (K)    4, 16 sections
Control Method (C)  multiplexed, linear

[Block diagram: Address Trace → Cache → Misses]

Are all factors needed? If a factor has little effect on the variability of the output, why study it further?
Method:
a. Evaluate variation for each factor using only two levels each
b. Must consider interactions as well
Interaction: the effect of one factor depends on the level of another

Experiment Design
  L    K   C    Misses
 32    4   mux
512    4   mux
 32   16   mux
512   16   mux
 32    4   lin
512    4   lin
 32   16   lin
512   16   lin
Encoded Experiment Design
  L   K   C   Misses
 -1  -1  -1
  1  -1  -1
 -1   1  -1
  1   1  -1
 -1  -1   1
  1  -1   1
 -1   1   1
  1   1   1
2^k Design: Analyze Results (Sign Table)
Obtain responses:

  L   K   C   Misses
 -1  -1  -1     14
  1  -1  -1     22
 -1   1  -1     10
  1   1  -1     34
 -1  -1   1     46
  1  -1   1     58
 -1   1   1     50
  1   1   1     86

Sign table, qᵢ = (1/2³)·Σ(signᵢ · responseᵢ):

  I   L   K   C  LK  LC  KC  LKC  Misses
  1  -1  -1  -1   1   1   1   -1    14
  1   1  -1  -1  -1  -1   1    1    22
  1  -1   1  -1  -1   1  -1    1    10
  1   1   1  -1   1  -1  -1   -1    34
  1  -1  -1   1   1  -1  -1    1    46
  1   1  -1   1  -1   1  -1   -1    58
  1  -1   1   1  -1  -1   1   -1    50
  1   1   1   1   1   1   1    1    86
qᵢ: 40  10   5  20   5   2   3    1

SSL = 2³·qL² = 800
SST = SSL + SSK + SSC + SSLK + SSLC + SSKC + SSLKC
    = 800 + 200 + 3200 + 200 + 32 + 72 + 8
    = 4512
%variation(L) = SSL/SST = 800/4512 = 17.7%

Effect  % Variation
L       17.7
K        4.4
C       70.9
LK       4.4
LC       0.7
KC       1.6
LKC      0.2
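The sign-table computation above is mechanical enough to script; a Python sketch that reproduces the effects and the % variation:

```python
# Encoded 2^3 design (L, K, C) with observed misses, as in the sign table
design = [(-1, -1, -1, 14), (1, -1, -1, 22), (-1, 1, -1, 10), (1, 1, -1, 34),
          (-1, -1, 1, 46), (1, -1, 1, 58), (-1, 1, 1, 50), (1, 1, 1, 86)]

def effect(sign_of_row):
    """q_i = (1/2^3) * sum(sign_i * response_i)."""
    return sum(sign_of_row(r) * r[3] for r in design) / 8

q0 = effect(lambda r: 1)                     # 40 (grand mean)
qL = effect(lambda r: r[0])                  # 10
qK = effect(lambda r: r[1])                  # 5
qC = effect(lambda r: r[2])                  # 20
qLK = effect(lambda r: r[0] * r[1])          # 5
qLC = effect(lambda r: r[0] * r[2])          # 2
qKC = effect(lambda r: r[1] * r[2])          # 3
qLKC = effect(lambda r: r[0] * r[1] * r[2])  # 1

# SS_i = 2^3 * q_i^2; SST is their sum (excluding the mean term)
ss = {name: 8 * q * q for name, q in
      [("L", qL), ("K", qK), ("C", qC), ("LK", qLK),
       ("LC", qLC), ("KC", qKC), ("LKC", qLKC)]}
sst = sum(ss.values())  # 4512
pct = {name: round(100 * v / sst, 1) for name, v in ss.items()}
print(pct)  # L: 17.7, K: 4.4, C: 70.9, ...
```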
Full Factorial Design
Model: yᵢⱼ = μ + αᵢ + βⱼ + eᵢⱼ
Effects are computed such that Σαᵢ = 0 and Σβⱼ = 0:
  μ = mean(y..)
  αᵢ = mean(yᵢ.) − μ
  βⱼ = mean(y.ⱼ) − μ
Experimental errors: SSE = Σeᵢⱼ²
SS0 = ab·μ²
SSA = b·Σαᵢ²
SSB = a·Σβⱼ²
SSY = SS0 + SSA + SSB + SSE
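A minimal sketch of these effect and sum-of-squares computations, on a small hypothetical 2×3 response matrix (the numbers are invented and chosen to be exactly additive, so SSE comes out 0; SSY here denotes the raw sum of squares Σy²):

```python
# Hypothetical 2x3 responses: a = 2 levels of factor A (rows),
# b = 3 levels of factor B (columns)
y = [[10.0, 14.0, 12.0],
     [16.0, 20.0, 18.0]]
a, b = len(y), len(y[0])

mu = sum(sum(row) for row in y) / (a * b)        # grand mean
alpha = [sum(row) / b - mu for row in y]         # row effects, sum to 0
beta = [sum(y[i][j] for i in range(a)) / a - mu  # column effects, sum to 0
        for j in range(b)]

# residuals of the additive model y_ij = mu + alpha_i + beta_j + e_ij
sse = sum((y[i][j] - (mu + alpha[i] + beta[j])) ** 2
          for i in range(a) for j in range(b))
ss0 = a * b * mu ** 2
ssa = b * sum(ai ** 2 for ai in alpha)
ssb = a * sum(bj ** 2 for bj in beta)
ssy = sum(y[i][j] ** 2 for i in range(a) for j in range(b))
print(mu, alpha, beta)  # decomposition: SSY = SS0 + SSA + SSB + SSE
```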
Full-Factor Design Example
Determination of the speed of light (Morley experiments)
Factors: Experiment No. (Expt), Run No. (Run)
Levels: Expt – 5 experiments; Run – 20 repeated runs

     Expt  Run  Speed
001     1    1    850
002     1    2    740
003     1    3    900
004     1    4   1070
<more data>
019     1   19    960
020     1   20    960
021     2    1    960
022     2    2    940
023     2    3    960
<more data>
096     5   16    940
097     5   17    950
098     5   18    800
099     5   19    810
100     5   20    870
Box Plots of Factors
Two-Factor Full Factorial
> fm <- aov(Speed ~ Run + Expt, data = mm)  # Determine ANOVA
> summary(fm)                               # Display ANOVA of factors

            Df Sum Sq Mean Sq F value   Pr(>F)
Run         19 113344    5965  1.1053 0.363209
Expt         4  94514   23629  4.3781 0.003071 **
Residuals   76 410166    5397
---
Signif. codes: 0 `***' 0.001 `**' 0.01 `*' 0.05 `.' 0.1 ` ' 1

Conclusion: Variation across runs is acceptably small (Run is not significant), but variation across experiments is significant (Expt, p ≈ 0.003).