Optimal designs for one- and two-colour microarrays using mixed models
A comparative evaluation of their efficiencies
Lima Passos, Winkens, Tan and Berger
DEMA 2008
Maastricht University
Department of Methodology and Statistics
Background
Current situation
One versus two colour comparisons
• Woo et al., 2004:
– We observed good concordance in both estimated expression levels and statistical significance of common genes.
• Smyth, 2005:
– All four platforms reasonably precise (cDNA, oligo, Agilent, Affymetrix);
– Broadly agree;
– Disagreement due to sequence differences, not to noise.
• Johns Hopkins press release, 2005:
– Different microarray systems are more alike than previously thought;
• Patterson et al., 2006:
– The quality of the data stemming from one- and two-colour arrays is equivalent in terms of reproducibility, sensitivity, specificity and accuracy;
– Highly concordant results regarding detection of differentially expressed genes.
Current opinions
One or Two?
• Hardiman, 2004:
– The choice of platform … should be guided by the content on that
platform and the amount of RNA available for experimentation.
• Agilent Technologies:
– Both one- and two-colour arrays have their place in scientific research:
• One-colour provides much quicker analysis and a more efficient method for analysing a large number of samples or samples that span long time frames;
• Two-colour provides the most accurate results, helping to identify small incremental changes in a sample to further specific investigations;
• Patterson et al., 2006:
– The decision to use one or two colours will be determined by cost, experimental design considerations and personal preference;
– Platform type should not be considered a primary factor ‘in decisions regarding experimental microarray design’.
Objective
Optimal designs
One versus two?
• The majority of papers addressing microarray design questions use fixed effects models;
• They are all specifically directed at two-colour microarrays;
• Design papers with mixed models (also two-colour) are less abundant (Cui and Churchill, 2003; Landgrebe et al., 2004; Tempelman, 2005; Bueno Filho et al., 2006; Tsai et al., 2006);
• Is the choice of platform an important design issue?
• Main question:
– What exactly is the impact that the choice of platform can have on the precision of the model parameter estimates?
– If there is an impact, what are the financial implications?
Design
Design issues at stake
• Two colour:
– which sample pairs (the design points) to distribute across the slides, and with which label assignment?
• One colour:
– the design points consist of the groups themselves, not their pairwise combinations;
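The difference between the two sets of design points can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the group labels, the function names and the restriction to pairs of distinct groups are hypothetical and not taken from the slides.

```python
from itertools import permutations


def one_colour_design_points(groups):
    """One colour: each design point is a single experimental group."""
    return list(groups)


def two_colour_design_points(groups):
    """Two colour: each design point is an ordered pair of groups sharing a slide.

    Assumption: pairs combine two distinct groups, and the order of the pair
    encodes the label assignment (first label, second label).
    """
    return list(permutations(groups, 2))


def design_measure(points, weights):
    """Pair every design point with its weight; the weights must sum to 1."""
    if len(points) != len(weights):
        raise ValueError("one weight per design point is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return list(zip(points, weights))


groups = ["A", "B", "C"]  # hypothetical group labels
one = one_colour_design_points(groups)
two = two_colour_design_points(groups)
uniform = design_measure(two, [1.0 / len(two)] * len(two))
```

With three groups the one-colour problem has three design points, while the two-colour problem has six ordered pairs, which is why the two-colour search must also settle the label assignment.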
• ???
\[
\xi = \begin{Bmatrix} x_1 & x_2 & \dots & x_m \\ w_1 & w_2 & \dots & w_m \end{Bmatrix}
\]
with design points x_d and weights w_d.
Premises
Mixed models
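• One colour:
\[
y_{isl_j} = \log(\mathrm{Int}_{isl_j}) = \theta_j + u_{l_j} + \varepsilon_{isl_j}, \qquad u_{l_j} \sim N(0, \sigma_u^2), \quad \varepsilon_{isl_j} \sim N(0, \sigma_e^2)
\]
• Two colour:
\[
y_{isl_j l_k} = \log\!\left(\frac{\mathrm{Int}_{isgl_j}}{\mathrm{Int}_{isrl_k}}\right) = (\delta_g - \delta_r) + (\theta_j - \theta_k) + (u_{l_j} - u_{l_k}) + (\varepsilon_{isgl_j} - \varepsilon_{isrl_k})
\]
\[
(u_{l_j} - u_{l_k}) \sim N(0, 2\sigma_u^2), \qquad (\varepsilon_{isgl_j} - \varepsilon_{isrl_k}) \sim N(0, 2\sigma_e^2)
\]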
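A small simulation can make the variance statements of the two models concrete. The sketch below assumes one observation per subject, independent subjects, and arbitrary illustrative parameter values; the function names and numbers are hypothetical and not taken from the slides.

```python
import random
import statistics


def simulate_one_colour(theta_j, sigma_u, sigma_e, n, rng):
    """Log intensities: group mean + subject effect + residual error."""
    return [theta_j + rng.gauss(0.0, sigma_u) + rng.gauss(0.0, sigma_e)
            for _ in range(n)]


def simulate_two_colour(theta_j, theta_k, dye_effect, sigma_u, sigma_e, n, rng):
    """Log ratios: dye difference + group difference + differences of the
    two subject effects and of the two residual errors."""
    values = []
    for _ in range(n):
        u_diff = rng.gauss(0.0, sigma_u) - rng.gauss(0.0, sigma_u)
        e_diff = rng.gauss(0.0, sigma_e) - rng.gauss(0.0, sigma_e)
        values.append(dye_effect + (theta_j - theta_k) + u_diff + e_diff)
    return values


rng = random.Random(2008)          # fixed seed so the run is repeatable
sigma_u, sigma_e = 1.0, 0.5        # illustrative standard deviations
y1 = simulate_one_colour(2.0, sigma_u, sigma_e, 20000, rng)
y2 = simulate_two_colour(2.0, 1.0, 0.3, sigma_u, sigma_e, 20000, rng)

var_one = statistics.pvariance(y1)  # close to sigma_u^2 + sigma_e^2
var_two = statistics.pvariance(y2)  # close to 2 * (sigma_u^2 + sigma_e^2)
```

Under these assumptions a single log ratio is about twice as variable as a single log intensity, because it carries two subject effects and two residual errors.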
Covariance structure
• Block-diagonal, compound symmetric structure of V:
– The dye swap is made at the level of technical replication, with identical sample pairs. If it is not (i.e. lj is paired with lk′, with k ≠ k′), the block-diagonal structure of the final covariance matrix V is lost.
\[
v_1 = \begin{pmatrix} \sigma_u^2 + \sigma_e^2 & \sigma_u^2 \\ \sigma_u^2 & \sigma_u^2 + \sigma_e^2 \end{pmatrix}, \qquad
v_2 = \begin{pmatrix} 2\sigma_u^2 + 2\sigma_e^2 & 2\sigma_u^2 \\ 2\sigma_u^2 & 2\sigma_u^2 + 2\sigma_e^2 \end{pmatrix}
\]
\[
M(\xi) = (X' V^{-1} X) = \sum_{d=1}^{m} w_d \, x_d' \, v_l^{-1} x_d
\]
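The information matrix as a weighted sum over design points can be sketched as follows. The example is restricted to a one-colour layout with three groups and two technical replicates per subject. The design matrices, weights and variance values are illustrative assumptions, and the helper names are hypothetical.

```python
def inverse_2x2(v):
    """Invert a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = v
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def compound_symmetric(sigma_u2, sigma_e2, scale=1.0):
    """2 x 2 compound symmetric block; scale=2.0 gives the two-colour block."""
    diag = scale * (sigma_u2 + sigma_e2)
    off = scale * sigma_u2
    return [[diag, off], [off, diag]]


def information_matrix(points, weights, v):
    """Weighted sum over design points of x_d' v^{-1} x_d.

    Each x_d is a 2 x p matrix: one row per technical replicate.
    """
    v_inv = inverse_2x2(v)
    p = len(points[0][0])
    m = [[0.0] * p for _ in range(p)]
    for x, w in zip(points, weights):
        for a in range(p):
            for b in range(p):
                quad = sum(x[r][a] * v_inv[r][c] * x[c][b]
                           for r in range(2) for c in range(2))
                m[a][b] += w * quad
    return m


v1 = compound_symmetric(1.0, 0.5)             # illustrative variances
v2 = compound_symmetric(1.0, 0.5, scale=2.0)
# Assumed one-colour design matrices: both replicates of a subject in
# group j have an indicator row for theta_j.
points = [[[1, 0, 0], [1, 0, 0]],
          [[0, 1, 0], [0, 1, 0]],
          [[0, 0, 1], [0, 0, 1]]]
weights = [1 / 3, 1 / 3, 1 / 3]
M = information_matrix(points, weights, v1)
```

In this layout M is diagonal, and each diagonal entry equals the group weight times 2 / (2σu² + σe²), the information contributed by one subject measured twice.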
Further premises
• Contrasts: Θ* = CΘ (first-order interactions or main effects)
• Optimality criteria:
\[
\mathrm{Det}[C\,M(\xi)^{-}\,C'] \qquad \mathrm{Trace}[C\,M(\xi)^{-}\,C']
\]
• Sequential search yields an approximate ξ*
• Exact designs: rounding up/down to the closest integer:
\[
\xi_I^{*} \sim I \times \xi^{*}
\]
• Relative efficiency one versus two:
\[
\mathrm{eff}_D(\xi_1; \xi_2) = \left\{ \frac{\mathrm{Det}[M(\xi_2)^{-}]}{\mathrm{Det}[M(\xi_1)^{-}]} \right\}^{1/p}
\]
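The relative D-efficiency and the step from an approximate to an exact design can be sketched as below. Two assumptions are made and should be read as such: the information matrices are taken to be nonsingular, so the determinant of the inverse is the reciprocal of the determinant, and the rounding uses a largest-remainder rule, which is a stand-in for the rounding up/down mentioned above and not necessarily the rule used by the authors.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    result = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


def d_efficiency(m1, m2):
    """Efficiency of design 1 relative to design 2.

    Assumes nonsingular information matrices, so that
    Det[M2^-1] / Det[M1^-1] equals Det[M1] / Det[M2].
    """
    p = len(m1)
    return (determinant(m1) / determinant(m2)) ** (1.0 / p)


def round_design(weights, total):
    """Turn weights into integer counts that sum to `total`
    (largest-remainder rounding; an assumed rule)."""
    raw = [w * total for w in weights]
    counts = [int(value) for value in raw]  # floor, since values are >= 0
    leftover = total - sum(counts)
    by_remainder = sorted(range(len(raw)),
                          key=lambda i: raw[i] - counts[i], reverse=True)
    for i in by_remainder[:leftover]:
        counts[i] += 1
    return counts


eff = d_efficiency([[2.0, 0.0], [0.0, 2.0]], [[4.0, 0.0], [0.0, 4.0]])
counts = round_design([0.5, 0.3, 0.2], 11)
```

In the toy example design 1 carries half the information of design 2 on each of two parameters, so its relative efficiency is 0.5.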
The cost function
• Given the prohibitive costs, it is advisable to estimate the costs of different microarray designs for comparative purposes:
• cost = njc1 + nkSc2
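A cost comparison of this kind can be sketched in Python. The reading used here is an assumption, not the authors' exact formula: total cost is taken as the number of subjects times a price per subject plus the number of arrays times a price per array. It is further assumed that each one-colour subject uses two arrays (two technical replicates) and that each two-colour pair of subjects uses two arrays (a dye swap). All names and prices are hypothetical.

```python
def design_cost(n_subjects, n_arrays, cost_per_subject, cost_per_array):
    """Assumed linear cost: subjects plus arrays."""
    return n_subjects * cost_per_subject + n_arrays * cost_per_array


def arrays_one_colour(n_subjects, technical_replicates=2):
    """One sample per array: every replicate of every subject needs an array."""
    return n_subjects * technical_replicates


def arrays_two_colour(n_subjects, technical_replicates=2):
    """Two samples per array: every pair of subjects is hybridised
    `technical_replicates` times (dye swap when this equals 2)."""
    if n_subjects % 2 != 0:
        raise ValueError("two-colour arrays need an even number of subjects")
    return (n_subjects // 2) * technical_replicates


n = 12                                   # hypothetical number of subjects
price_subject, price_array = 100, 500    # hypothetical prices
cost_one = design_cost(n, arrays_one_colour(n), price_subject, price_array)
cost_two = design_cost(n, arrays_two_colour(n), price_subject, price_array)
```

With the same number of subjects, the two-colour layout uses half as many arrays under these assumptions, so the gap between the two costs grows with the price per array.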
Ceteris paribus
Assumptions/limitations
• To warrant comparability and a fair assessment between the two platforms:
– model parameters and contrasts (common research questions) for the one- and two-colour arrays are given on the same scale;
– the number of technical replicates was held constant (2), and the search for optimal designs focused on the distribution of biological replicates;
– homogeneity of the biological variances of the experimental groups, as well as independence and homogeneity of the residual error variances, were assumed to hold;
– variance components were restricted to a random intercept model with a compound symmetric, block-diagonal covariance matrix (dye swap with identical sample pairs!);
– the price per subject was constant over all biological groups, and the one- and two-colour arrays cost the same.
Results
3 x 3 factorial experiment
ξ* and ξI* - Two colour
The design measure ξ*
D-optimal design – main effects only
[Figure, two panels. Pmf: weights wd (percent, axis 0 to 20) against design points xd (11, 12, 13, 21, 22, 23, 31, 32, 33). Directed graph: nodes 11, 12, 13, 21, 22, 23, 31, 32, 33.]
One versus two??
Subjects to groups allocation
How many subjects?
[Figure: subjects-to-groups allocation; values legible on the slide: 11, 12, 8, 5.]
Efficiency comparison
[Figure, two panels: (=N, ≠I) and (≠N, =I).]
Cost comparison
[Figure, two panels: Cost 1 – Cost 2 (=N, ≠I) and Cost 1 – Cost 2!!! (≠N, =I).]
Cost comparison
“adjusted for efficiency”
Conclusion
Final remarks
• The optimal allocation of subjects to experimental groups is highly concordant between the two platforms. Hence the choice of platform will not affect the optimal allocation of subjects to groups;
• When the numbers of subjects and arrays are varied while the statistical precision of the parameter estimates is held comparable, the choice of the one-colour over the two-colour platform, or vice versa, will be determined by the subject-to-array cost ratio;
• On the grounds of statistical efficiency, and provided that arrays cost more to acquire than subjects, two-colour arrays should be considered an efficient alternative to one-colour arrays, specifically for studies involving class comparisons.
[Figure, two panels: var1 and ref1; vertical axes 0.0 to 1.0, horizontal axes 1 to 11.]