Pre-ICM Int. Convention - MSCS

A Bayesian decision-theoretic solution to a problem related to microarray data, with consideration of intra-correlation among the affected genes
By
Naveen K. Bansal
Marquette University
Milwaukee, Wisconsin (USA)
Email: [email protected]
http://www.mscs.mu.edu/~naveen/
With
Dr. Klaus Miescke
University of Illinois-Chicago
Illinois (USA)
Pre-ICM Int. Convention, 2008
When genes are altered under an adverse condition, such as cancer, the affected genes show under- or over-expression in a microarray.

Let $X_i$ denote the expression level of gene $i$, with $X_i \sim P(\theta_i, \lambda)$. For each gene we test
$$H_0: \theta_i = 0 \quad \text{vs.} \quad H_{-1}: \theta_i < 0 \ \text{ or } \ H_1: \theta_i > 0.$$

The objective is to find the genes with under-expression and the genes with over-expression.
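As a concrete, purely hypothetical illustration of this data model, the sketch below simulates expression summaries for $N$ genes of which a small fraction are truly under- or over-expressed; the gene count, mixing proportions, effect sizes, and normal noise are illustrative assumptions, not values from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 1000                      # number of genes (illustrative)
# true state of each gene: -1 under-expressed, 0 null, +1 over-expressed
p_true = [0.05, 0.90, 0.05]   # assumed proportions, for illustration only
state = rng.choice([-1, 0, 1], size=N, p=p_true)

# effect sizes theta_i: zero under H_0, signed exponential draws otherwise
theta = np.where(state == 0, 0.0, state * rng.exponential(scale=1.0, size=N))

# observed expression summary X_i, here taken as N(theta_i, 1)
X = rng.normal(loc=theta, scale=1.0)
```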
Probability model: $X \sim P(\theta, \lambda)$
$$H_0: \theta = \theta_0 \ (\text{say } 0) \quad \text{vs.} \quad H_{-1}: \theta < \theta_0, \quad H_1: \theta > \theta_0$$

Directional error (Type III error):
Sarkar and Zhou (2008, JSPI)
Finner (1999, AS)
Shaffer (2002, Psychological Methods)
Lehmann (1952, AMS; 1957, AMS)

The main point of these works is that if the objective is to find the true direction of the alternative after rejecting the null, then a Type III error must be controlled instead of the Type I error. The Type III error is defined as P(false directional error | the null is rejected). The traditional method does not control the directional error: for example, if we reject when $|t| \ge t_{\alpha/2}$ and declare $\theta > \theta_0$ when $t \ge t_{\alpha/2}$, a directional error occurs whenever in fact $\theta < \theta_0$.
Bayesian Decision Theoretic Framework

Probability model: $X \sim P(\theta, \lambda)$
$$H_0: \theta = \theta_0\ (\text{say } 0), \qquad H_{-1}: \theta < \theta_0, \qquad H_1: \theta > \theta_0$$
Prior:
$$\pi(\theta) = p_{-1}\,\pi_{-1}(\theta) + p_0\,\pi_0(\theta) + p_1\,\pi_1(\theta),$$
where
$$\pi_{-1}(\theta) = g_{-1}(\theta)\, I(\theta < 0), \qquad \pi_0(\theta) = I(\theta = 0), \qquad \pi_1(\theta) = g_1(\theta)\, I(\theta > 0),$$
$g_{-1}(\cdot)$ is a density with support contained in $(-\infty, 0)$, and $g_1(\cdot)$ is a density with support contained in $(0, \infty)$.

Action space: $A = \{-1, 0, 1\}$, with $L(\theta, a)$ the loss for taking action $a \in A$.
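A minimal sketch of drawing from this three-component prior, assuming half-normal densities for $g_{-1}$ and $g_1$ (one convenient choice; the framework only requires supports $(-\infty,0)$ and $(0,\infty)$) and illustrative mixing weights:

```python
import numpy as np

rng = np.random.default_rng(1)

p_neg, p_null, p_pos = 0.2, 0.6, 0.2   # illustrative (p_{-1}, p_0, p_1)

def sample_theta(size):
    """Draw theta from p_{-1} g_{-1} + p_0 delta_0 + p_1 g_1,
    with g_{-1}, g_1 taken here as half-normal densities on the two tails."""
    comp = rng.choice([-1, 0, 1], size=size, p=[p_neg, p_null, p_pos])
    half_normal = np.abs(rng.normal(size=size))   # |N(0,1)| draws
    return np.where(comp == 0, 0.0, comp * half_normal)

theta = sample_theta(10_000)
```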
g 1 and g1 could be trucated densities of a density on  .
The skewness in the prior is introduced by (p- 1, p0 , p1).
p1  p1 reflects that the right tail is more likely than the
left tail.
p-1  0 (or p1  0) would yield a one - tail test.
p-1  p1 with g -1 and g1 as truncated of a symmetric
density wo uld yield a two - tail test.
p-1 and p1 can be assigned based on what tail is
more important.
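For instance (a short worked special case, not stated explicitly in the talk): setting $p_{-1} = 0$ collapses the prior to
$$\pi(\theta) = p_0\,\pi_0(\theta) + p_1\,\pi_1(\theta), \qquad p_0 + p_1 = 1,$$
so the Bayes rule only weighs $H_0$ against $H_1$, i.e., a one-tailed test; with $p_{-1} = p_1$ and $g_{\pm 1}$ the two truncations of one symmetric density, the two directions are treated symmetrically, i.e., a two-tailed test.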
To introduce correlation among the under-expressed genes and among the over-expressed genes, we can let the tail priors depend on hyperparameters, say $g_{-1}(\cdot \mid \eta_{-1})$ and $g_1(\cdot \mid \eta_1)$. A prior on $(\eta_{-1}, \eta_1)$ then induces correlation among the under-expressed genes and among the over-expressed genes.
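One way such a hierarchy could be written (the hyperparameter symbols $\eta_{-1}, \eta_1$ and the hyperpriors $h_{-1}, h_1$ are illustrative placeholders, not the talk's notation):
$$\eta_{-1} \sim h_{-1}, \quad \eta_1 \sim h_1, \qquad \theta_i \mid (\eta_{-1}, \eta_1) \ \overset{iid}{\sim}\ p_{-1}\, g_{-1}(\theta_i \mid \eta_{-1}) + p_0\, I(\theta_i = 0) + p_1\, g_1(\theta_i \mid \eta_1).$$
Integrating out $\eta_{-1}$ and $\eta_1$ leaves the $\theta_i$'s exchangeable but dependent within each tail.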
The Bayes risk for a decision rule $\delta$ is given by
$$r_\pi(\delta) = p_{-1}\, r_{-1}(\delta) + p_0\, r_0(\delta) + p_1\, r_1(\delta),$$
where
$$r_{-1}(\delta) = \int_{\theta < 0} R(\theta, \delta)\, \pi_{-1}(\theta)\, d\theta, \qquad r_1(\delta) = \int_{\theta > 0} R(\theta, \delta)\, \pi_1(\theta)\, d\theta, \qquad r_0(\delta) = R(0, \delta).$$

For a fixed prior $\pi$, decision rules can be compared by comparing the risk set
$$S(\pi) = \{(r_{-1}(\delta),\, r_0(\delta),\, r_1(\delta)) : \delta \in D^*\}.$$

[Figure: the set of points $(r_{-1}(\delta), r_1(\delta))$ over the class of all rules $\delta$ for which $R(0,\delta)$ is the same; the Bayes rule sits where a supporting line, whose slope depends on $p_{-1}$ and $p_1$, touches this set.]
Consider the following two priors:
$$\pi(\theta) = p_{-1}\,\pi_{-1}(\theta) + p_0\,\pi_0(\theta) + p_1\,\pi_1(\theta), \qquad \pi'(\theta) = p'_{-1}\,\pi_{-1}(\theta) + p_0\,\pi_0(\theta) + p'_1\,\pi_1(\theta),$$
with $p_1 > p'_1$ (i.e., $H_1$ is more likely under $\pi$ than under $\pi'$).

Theorem 1: If $p_1 > p'_1$, and if $\delta_B$ and $\delta_{B'}$ are the Bayes rules under $\pi$ and $\pi'$ respectively, then
$$r_1^{\delta_B}(\pi) \le r_1^{\delta_{B'}}(\pi') \qquad \text{and} \qquad r_{-1}^{\delta_B}(\pi) \ge r_{-1}^{\delta_{B'}}(\pi').$$

Remark: This theorem implies that if a priori it is known that $H_1$ is more likely than $H_{-1}$ ($p_1 > p_{-1}$), then the average risk of the Bayes rule in the positive direction will be smaller than the average risk in the negative direction.
An Application:
An experiment was conducted to study the effect of a sequence knockdown in non-coding genes. The hypothesis was that this knockdown would cause overexpression of the mRNAs of the coding genes. The implication is that this would show how non-coding genes interact with coding genes; in other words, non-coding genes also play a part in protein synthesis.

Here it can be assumed that the effect of the knockdown on the coding genes (if there is any) would mostly be overexpression of mRNAs rather than underexpression. The objective is to detect as many of the overexpressed genes as possible.
Relationship with the False Discovery Rates
Consider the "0-1" loss:
$$L(\theta,-1) = \begin{cases} 0, & \theta < 0\\ 1, & \theta \ge 0\end{cases} \qquad L(\theta,0) = \begin{cases} 0, & \theta = 0\\ 1, & \theta \ne 0\end{cases} \qquad L(\theta,1) = \begin{cases} 0, & \theta > 0\\ 1, & \theta \le 0\end{cases}$$

The risk is then
$$R(\theta,\delta) = \begin{cases} P_\theta(\delta \ge 0), & \theta < 0\\ P_\theta(\delta \ne 0), & \theta = 0\\ P_\theta(\delta \le 0), & \theta > 0\end{cases}$$

so that
$$r_{-1}(\delta_B) = \int_{\theta<0} P_\theta(\delta_B \ge 0)\,\pi_{-1}(\theta)\,d\theta = E\big[P(\delta_B \ge 0 \mid \theta < 0)\big],$$
which is the expected false non-negative discovery rate in the Bayesian framework. Similarly, $r_1(\delta_B)$ can be interpreted as the expected false non-positive discovery rate, and $r_0(\delta_B)$ as the expected false discovery rate under the null (a non-null decision when $\theta = 0$). The Bayes risk $r_\pi(\delta_B)$ is the weighted average of these false discovery rates.
Now consider $\{X_i,\ i = 1,2,\dots,N\}$, independent data with
$$X_i \sim f(x_i;\, \theta_i, \lambda), \qquad i = 1,2,\dots,N,$$
and the following multiple hypotheses problem:
$$H_0^i: \theta_i = 0, \qquad H_{-1}^i: \theta_i < 0, \qquad H_1^i: \theta_i > 0.$$
Assume $\theta_i \sim p_{-1}\pi_{-1}(\theta) + p_0\pi_0(\theta) + p_1\pi_1(\theta)$, $i = 1,2,\dots,N$, independently, and let $\delta_B^i \in \{-1, 0, 1\}$ denote the non-randomized Bayes rule for gene $i$.
Table 1. Possible outcomes from m hypothesis tests

                  Accept H_0    Accept H_{-1}    Accept H_1    Total
H_0 is true           U              V_1              V_2        m_0
H_{-1} is true        T_1            S_1              W_2        m_1
H_1 is true           T_2            W_1              S_2        m_2
Total                 R_0            R_1              R_2        m

$$FDR_{-1} = E\Big[\frac{V_1 + W_1}{R_1}\Big], \qquad pFDR_{-1} = E\Big[\frac{V_1 + W_1}{R_1}\ \Big|\ R_1 > 0\Big],$$
$$FDR_{1} = E\Big[\frac{V_2 + W_2}{R_2}\Big], \qquad pFDR_{1} = E\Big[\frac{V_2 + W_2}{R_2}\ \Big|\ R_2 > 0\Big]$$
(Storey, 2003, AS).
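A small, hypothetical sketch of how the directional false discovery proportions in Table 1 could be tallied from simulated truths and decisions; the gene counts, effect sizes, and the cutoff rule below are placeholders for illustration only, not the talk's procedure.

```python
import numpy as np

rng = np.random.default_rng(2)

m = 5000
truth = rng.choice([-1, 0, 1], size=m, p=[0.05, 0.90, 0.05])   # true state per gene
theta = np.where(truth == 0, 0.0, truth * (1.0 + rng.exponential(size=m)))
x = rng.normal(theta, 1.0)

# placeholder decision rule: declare a direction when |x| exceeds a cutoff
d = np.where(x < -2.0, -1, np.where(x > 2.0, 1, 0))

R_neg = np.sum(d == -1)                               # R_1: total declared negative
R_pos = np.sum(d == 1)                                # R_2: total declared positive
false_neg_calls = np.sum((d == -1) & (truth >= 0))    # V_1 + W_1
false_pos_calls = np.sum((d == 1) & (truth <= 0))     # V_2 + W_2

fdp_neg = false_neg_calls / max(R_neg, 1)   # realized directional FDP, negative calls
fdp_pos = false_pos_calls / max(R_pos, 1)   # realized directional FDP, positive calls
print(fdp_neg, fdp_pos)
```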
Theorem 2: Suppose that $m$ hypothesis tests are based on test statistics $T_1, T_2, \dots, T_m$. Let $\Gamma_{-1}$ and $\Gamma_1$ be the regions of accepting $H_{-1}^i$ and $H_1^i$, respectively, for each $i = 1, 2, \dots, m$. Assume that
$$T_i \ \overset{iid}{\sim}\ I(H_{-1}^i)\,F_{-1} + I(H_0^i)\,F_0 + I(H_1^i)\,F_1,$$
where $F_{-1}, F_0, F_1$ are the marginal distributions of $T_i$ with respect to $\pi_{-1}, \pi_0$, and $\pi_1$, respectively. If $(H_{-1}^i, H_0^i, H_1^i)$, $i = 1, 2, \dots, m$, are independent and identical trinomial trials, then
$$pFDR_{-1}(\Gamma_{-1}) = P(H_1 \mid T \in \Gamma_{-1}) + P(H_0 \mid T \in \Gamma_{-1})$$
and
$$pFDR_{1}(\Gamma_{1}) = P(H_{-1} \mid T \in \Gamma_{1}) + P(H_0 \mid T \in \Gamma_{1}).$$
This theorem and our previous discussion of the Bayes decision rule imply that, under this independent multiple-testing setting, the Bayes rule obtained from the single-test problem
$$H_0: \theta = 0 \quad \text{vs.} \quad H_{-1}: \theta < 0, \quad H_1: \theta > 0,$$
which minimizes the weighted average of the expected false discoveries, can be used for the multiple hypotheses problem. The false discovery rates of this procedure can be obtained by computing the posterior probabilities:
$$pFDR_{-1}(\delta_B) = P(H_1 \mid \delta_B = -1) + P(H_0 \mid \delta_B = -1),$$
$$pFDR_{1}(\delta_B) = P(H_{-1} \mid \delta_B = 1) + P(H_0 \mid \delta_B = 1).$$
The prior probabilities $p_{-1}$ and $p_1$ determine whether the positive hypothesis $H_1$ or the negative hypothesis $H_{-1}$ will be more frequently correctly detected.
Posterior False Discovery Rate

$$PFDR_{-1} = E\left[\frac{\sum_{i=1}^m I(\theta_i \ge 0)\, I(d_i = -1)}{\max\big\{\sum_{i=1}^m I(d_i = -1),\, 1\big\}} \,\middle|\, \mathbf{X}\right] = \frac{\sum_{i=1}^m P(\theta_i \ge 0 \mid \mathbf{X})\, I(d_i = -1)}{\max\big\{\sum_{i=1}^m I(d_i = -1),\, 1\big\}}$$

Thus if $P(\theta_i \ge 0 \mid \mathbf{X}) \le \alpha_{-1}$ for each selected gene, then $E[PFDR_{-1}] \le \alpha_{-1}$. Similarly, if $P(\theta_i \le 0 \mid \mathbf{X}) \le \alpha_1$ for each selected gene, then $E[PFDR_{1}] \le \alpha_1$.
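A minimal sketch of this computation, assuming the per-gene posterior probabilities $P(\theta_i \ge 0 \mid \mathbf{X})$ and $P(\theta_i \le 0 \mid \mathbf{X})$ have already been obtained (they are taken as given arrays here):

```python
import numpy as np

def posterior_fdr(post_ge0, post_le0, d):
    """Posterior directional false discovery rates.

    post_ge0[i] = P(theta_i >= 0 | X), post_le0[i] = P(theta_i <= 0 | X),
    d[i] in {-1, 0, 1} is the decision for gene i.
    """
    d = np.asarray(d)
    neg_calls, pos_calls = d == -1, d == 1
    pfdr_neg = np.sum(np.asarray(post_ge0)[neg_calls]) / max(neg_calls.sum(), 1)
    pfdr_pos = np.sum(np.asarray(post_le0)[pos_calls]) / max(pos_calls.sum(), 1)
    return pfdr_neg, pfdr_pos
```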
Let
$$q_i^-(\mathbf{X}) = P(\theta_i \ge 0 \mid \mathbf{X}), \qquad q_i^+(\mathbf{X}) = P(\theta_i \le 0 \mid \mathbf{X}),$$
and let
$$q^-_{1,m} \le q^-_{2,m} \le \dots \le q^-_{m,m}, \qquad q^+_{1,m} \le q^+_{2,m} \le \dots \le q^+_{m,m}$$
denote their ordered values. Define
$$Q_j^-(\mathbf{X}) = \frac{1}{j}\sum_{i=1}^{j} q^-_{i,m}, \qquad Q_j^+(\mathbf{X}) = \frac{1}{j}\sum_{i=1}^{j} q^+_{i,m}.$$
Theorem 3: Let
$$k^-(\mathbf{X}) = \begin{cases}\max\{j : Q_j^-(\mathbf{X}) \le \alpha_{-1}\}, & \text{if the maximum exists},\\ 0, & \text{otherwise},\end{cases}$$
$$k^+(\mathbf{X}) = \begin{cases}\max\{j : Q_j^+(\mathbf{X}) \le \alpha_{1}\}, & \text{if the maximum exists},\\ 0, & \text{otherwise}.\end{cases}$$
Select all $H_{-1}^j$ corresponding to $q^-_{1,m} \le \dots \le q^-_{k^-,m}$, and select all $H_{1}^j$ corresponding to $q^+_{1,m} \le \dots \le q^+_{k^+,m}$. Then $E[PFDR_{-1}] \le \alpha_{-1}$ and $E[PFDR_{1}] \le \alpha_{1}$.
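A sketch of this selection rule in the negative direction (the positive direction is symmetric); the posterior probabilities are assumed to be available already, and alpha is the target level:

```python
import numpy as np

def select_negative(post_ge0, alpha):
    """Select genes to declare under-expressed so that the posterior FDR in the
    negative direction is controlled at level alpha.

    post_ge0[i] = P(theta_i >= 0 | X); small values favor declaring theta_i < 0.
    Returns the indices of the selected genes (possibly empty)."""
    q = np.asarray(post_ge0, dtype=float)
    order = np.argsort(q)                                         # q_(1) <= ... <= q_(m)
    running_avg = np.cumsum(q[order]) / np.arange(1, q.size + 1)  # Q_j
    below = np.nonzero(running_avg <= alpha)[0]
    if below.size == 0:
        return np.array([], dtype=int)                            # k = 0: select nothing
    k = below.max() + 1                                           # largest j with Q_j <= alpha
    return order[:k]
```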
Computation of the Bayes Rule:
$$\delta_B(\mathbf{X}) = \Big\{i \in \{-1, 0, 1\} : E[L(\theta, i) \mid \mathbf{X}] = \min_{j \in \{-1,0,1\}} E[L(\theta, j) \mid \mathbf{X}]\Big\}$$
The posterior is
$$\pi(\theta \mid \mathbf{X}) = \pi(H_{-1} \mid \mathbf{X})\,\pi(\theta \mid H_{-1}, \mathbf{X}) + \pi(H_0 \mid \mathbf{X})\, I(\theta = 0) + \pi(H_1 \mid \mathbf{X})\,\pi(\theta \mid H_1, \mathbf{X}).$$
Here $\pi(H_{-1} \mid \mathbf{X})$, $\pi(H_0 \mid \mathbf{X})$, and $\pi(H_1 \mid \mathbf{X})$ are the posterior probabilities of $H_{-1}$, $H_0$, and $H_1$, respectively; $\pi(\theta \mid H_{-1}, \mathbf{X})$ and $\pi(\theta \mid H_1, \mathbf{X})$ are the posterior densities of $\theta$ under $\pi_{-1}(\theta)$ and $\pi_1(\theta)$, respectively.

The computation involves computing $\pi(H_{-1} \mid \mathbf{X})$, $\pi(H_0 \mid \mathbf{X})$, and $\pi(H_1 \mid \mathbf{X})$, and computing $E[L(\theta, 1) \mid H_{-1}, \mathbf{X}]$ and $E[L(\theta, -1) \mid H_1, \mathbf{X}]$.
Choice of the loss function:
$$L(\theta, 0) = \begin{cases} 0, & \theta = 0\\ q_0(\theta), & \theta \ne 0\end{cases} \qquad L(\theta, 1) = \begin{cases} 0, & \theta > 0\\ q_1(\theta), & \theta \le 0\end{cases} \qquad L(\theta, -1) = \begin{cases} 0, & \theta < 0\\ q_{-1}(\theta), & \theta \ge 0\end{cases}$$

For the "0-1" loss, $q_0(\theta) = q_1(\theta) = q_{-1}(\theta) \equiv 1$.

A good choice would be
$$q_0(\theta) = l_0\,\big(1 \wedge q(\theta)\big), \qquad q_1(\theta) = l_1\,\big(1 \wedge q(\theta)\big), \qquad q_{-1}(\theta) = l_{-1}\,\big(1 \wedge q(\theta)\big),$$
where $l_0$, $l_1$, and $l_{-1}$ are some non-negative constants.

Here, $q(\theta)$ reflects the magnitude of departure from the null. One good choice of $q(\theta)$ is
$$q(\theta) = E_0\big[\log\{f(X, 0)/f(X, \theta)\}\big].$$
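For instance (a worked case assuming the $N(\theta,1)$ model used later in the talk), this choice of $q(\theta)$ reduces to a simple quadratic:
$$q(\theta) = E_0\!\left[\log\frac{f(X,0)}{f(X,\theta)}\right] = E_0\!\left[\frac{(X-\theta)^2}{2} - \frac{X^2}{2}\right] = E_0\!\left[\frac{\theta^2}{2} - \theta X\right] = \frac{\theta^2}{2},$$
so the penalty for a wrong call grows with the squared departure from the null.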
If $q_{-1}(\theta) = q_0(\theta) = q_1(\theta) \equiv 1$, then we have the "0-1" loss.

Theorem 3: Under the "0-1" loss, the Bayes rule is given by
$$\delta_B = \Big\{i : p_i\, f(\mathbf{X} \mid H_i) = \max_{j \in \{-1, 0, 1\}} p_j\, f(\mathbf{X} \mid H_j)\Big\}.$$
Here $f(\mathbf{X} \mid H_j)$ is the marginal density of $\mathbf{X}$ under $H_j$.

In other words, reject $H_0$ if
$$\frac{f(\mathbf{X} \mid H_{-1})}{f(\mathbf{X} \mid H_0)} \ge \frac{p_0}{p_{-1}} \quad \text{or} \quad \frac{f(\mathbf{X} \mid H_1)}{f(\mathbf{X} \mid H_0)} \ge \frac{p_0}{p_1}.$$
When $H_0$ is rejected, select $H_{-1}$ (respectively $H_1$) according as $p_{-1}\, f(\mathbf{X} \mid H_{-1}) \ge$ (respectively $<$) $p_1\, f(\mathbf{X} \mid H_1)$.
Example:
Probability model: $N(\theta, 1)$
$$H_0: \theta = 0 \quad \text{vs.} \quad H_{-1}: \theta < 0, \quad H_1: \theta > 0$$
Prior: $\pi(\theta) = p_{-1}\,\pi_{-1}(\theta) + p_0\, I(\theta = 0) + p_1\,\pi_1(\theta)$, where
$\pi_{-1}$: left-half $N(0,1)$ distribution,
$\pi_1$: right-half $N(0,1)$ distribution.
Theorem 4: Under the "0-1" loss, the Bayes rule rejects $H_0$ if
$$T_{-1}(\bar{x}) \ge p_0 / p_{-1} \quad \text{or} \quad T_1(\bar{x}) \ge p_0 / p_1,$$
and, when $H_0$ is rejected, selects $H_{-1}$ (respectively $H_1$) according as $p_{-1}\, T_{-1}(\bar{x}) \ge$ (respectively $<$) $p_1\, T_1(\bar{x})$, where
$$T_{-1}(\bar{x}) = \frac{2}{\sqrt{n+1}}\, \exp\!\left(\frac{n^2 \bar{x}^2}{2(n+1)}\right) \Phi\!\left(-\frac{n \bar{x}}{\sqrt{n+1}}\right)$$
and
$$T_{1}(\bar{x}) = \frac{2}{\sqrt{n+1}}\, \exp\!\left(\frac{n^2 \bar{x}^2}{2(n+1)}\right) \Phi\!\left(\frac{n \bar{x}}{\sqrt{n+1}}\right).$$
Note that $T_{-1}$ and $T_1$ are monotonically decreasing and increasing functions of $\bar{x}$, respectively. Thus the rejection region is equivalent to
$$\bar{x} \le T_{-1}^{-1}(p_0 / p_{-1}) \quad \text{or} \quad \bar{x} \ge T_{1}^{-1}(p_0 / p_1).$$
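A small numerical sketch of this rule; the constants follow the reconstruction of $T_{-1}$ and $T_1$ above, and the prior weights in the usage line are illustrative.

```python
import numpy as np
from scipy.stats import norm

def bayes_decision_normal(xbar, n, p_neg, p_null, p_pos):
    """Directional Bayes decision in {-1, 0, 1} for the N(theta, 1) example
    with half-normal tail priors, under the 0-1 loss."""
    z = n * xbar / np.sqrt(n + 1.0)
    common = (2.0 / np.sqrt(n + 1.0)) * np.exp(n**2 * xbar**2 / (2.0 * (n + 1.0)))
    t_neg = common * norm.cdf(-z)   # T_{-1}(xbar)
    t_pos = common * norm.cdf(z)    # T_{1}(xbar)
    if p_neg * t_neg <= p_null and p_pos * t_pos <= p_null:
        return 0                    # retain H_0
    return -1 if p_neg * t_neg >= p_pos * t_pos else 1

# illustrative usage
print(bayes_decision_normal(xbar=0.8, n=10, p_neg=0.2, p_null=0.6, p_pos=0.2))
```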
Power Comparison:
When $p_{-1} = p_1$, the rejection region is that of the UMP unbiased test.

[Figure: probability of rejection versus mean value (from -0.5 to 0.5) for the Bayes test with $p_{-1} = 0.39$, $p_0 = 0.39$, $p_1 = 0.22$ (blue line) and the UMP unbiased test (red dotted line).]
[Figure: probability of rejection versus mean value (from -0.5 to 0.5) for the Bayes test with $p_{-1} = 0.22$, $p_0 = 0.39$, $p_1 = 0.39$ (blue line) and the UMP unbiased test (red dotted line).]
What if $p_{-1}$, $p_0$, and $p_1$ are unknown?

One can consider a Dirichlet prior. It is easy to see that the only difference would be to replace $p_{-1}$, $p_0$, and $p_1$ by $E(p_{-1} \mid \mathbf{X})$, $E(p_0 \mid \mathbf{X})$, and $E(p_1 \mid \mathbf{X})$, respectively.

For the normal problem, we get the following result.

Theorem 5: Under the "0-1" loss, the Bayes rule rejects $H_0$ if
$$T_{-1}(\bar{x}) \ge \frac{E(p_0 \mid \mathbf{X})}{E(p_{-1} \mid \mathbf{X})} \quad \text{or} \quad T_1(\bar{x}) \ge \frac{E(p_0 \mid \mathbf{X})}{E(p_1 \mid \mathbf{X})},$$
and, when $H_0$ is rejected, selects $H_{-1}$ (respectively $H_1$) according as $E(p_{-1} \mid \mathbf{X})\, T_{-1}(\bar{x}) \ge$ (respectively $<$) $E(p_1 \mid \mathbf{X})\, T_1(\bar{x})$, where $T_{-1}$ and $T_1$ are as in Theorem 4.
Another Approach:
Consider the prior
$$\pi(\theta) = p_0\, I(\theta = 0) + (1 - p_0)\, \pi'(\theta),$$
where $\pi'(\theta)$ is a skewed prior; for example, the skew-normal
$$\pi'(\theta) = \frac{2}{\omega}\, \phi\!\left(\frac{\theta}{\omega}\right) \Phi\!\left(\frac{\alpha \theta}{\omega}\right) I(\theta \ne 0).$$
Here $\alpha$ reflects the skewness in the $\theta$ values, and $\omega$ the dispersion in the $\theta$ values.
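A brief sketch of this skewed prior using SciPy's skew-normal density; the parameter values and the use of scipy.stats.skewnorm (whose a and scale arguments play the roles of the shape and dispersion parameters here) are illustrative assumptions.

```python
import numpy as np
from scipy.stats import skewnorm

shape, disp = 3.0, 1.5     # illustrative skewness and dispersion values

def prior_density(theta, p0=0.5):
    """Continuous part of pi(theta) = p0 * I(theta = 0) + (1 - p0) * skew-normal."""
    return (1.0 - p0) * skewnorm.pdf(theta, shape, loc=0.0, scale=disp)

grid = np.linspace(-4.0, 4.0, 9)
print(prior_density(grid))
```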
Power Comparison of a skew-normal Bayes test with the UMP test

[Figure: probability of rejection versus mean value, for mean values from -0.5 to 0.5.]
?