Sequential Implementation of Monte Carlo Tests with Uniformly

Sequential Implementation of Monte Carlo Tests
with Uniformly Bounded Resampling Risk
Axel Gandy
Department of Mathematics
Imperial College London
[email protected]
useR! 2009, Rennes
July 8-10, 2009
Introduction
I
I
I
I
Test statistic T , reject for large values.
Observation: t.
p-value:
p = P(T ≥ t)
Often not available in closed form.
Monte Carlo Test:
n
p̂naive =
1X
I(Ti ≥ t),
n
i=1
I
where T , T1 , . . . Tn i.i.d.
Examples:
I
I
I
Bootstrap,
Permutation tests.
Goal: Estimate p using few Xi
Mainly interested in deciding if p ≤ α for some α.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
2
Introduction
I
I
I
I
Test statistic T , reject for large values.
Observation: t.
p-value:
p = P(T ≥ t)
Often not available in closed form.
Monte Carlo Test:
n
p̂naive =
1X
I(Ti ≥ t) ,
| {z }
n
i=1
I
I
I
I
=:Xi ∼B(1,p)
where T , T1 , . . . Tn i.i.d.
Examples:
Bootstrap,
Permutation tests.
Goal: Estimate p using few Xi
Mainly interested in deciding if p ≤ α for some α.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
2
Introduction
I
I
I
I
Test statistic T , reject for large values.
Observation: t.
p-value:
p = P(T ≥ t)
Often not available in closed form.
Monte Carlo Test:
n
p̂naive =
1X
I(Ti ≥ t) ,
| {z }
n
i=1
I
I
I
I
=:Xi ∼B(1,p)
where T , T1 , . . . Tn i.i.d.
Examples:
Bootstrap,
Permutation tests.
Goal: Estimate p using few Xi
Mainly interested in deciding if p ≤ α for some α.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
2
Introduction
I
I
I
I
Test statistic T , reject for large values.
Observation: t.
p-value:
p = P(T ≥ t)
Often not available in closed form.
Monte Carlo Test:
n
p̂naive =
1X
I(Ti ≥ t) ,
| {z }
n
i=1
I
I
I
I
=:Xi ∼B(1,p)
where T , T1 , . . . Tn i.i.d.
Examples:
Bootstrap,
Permutation tests.
Goal: Estimate p using few Xi
Mainly interested in deciding if p ≤ α for some α.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
2
Sequential approaches based on Sn =
10
9
8
7
6
5
4
3
2
1
0
Pn
i=1 Xi
I
Stop once Sn ≥ Un or
Sn ≤ Ln
I
τ : hitting time
I
Compute p̂ based on Sτ
and τ .
I
Hit BU : decide p > α,
I
Hit BL : decide p ≤ α,
Sn
0 1 2 3 4 5 6 7 8 9 10
n
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
3
Sequential approaches based on Sn =
10
9
8
7
6
5
4
3
2
1
0
Pn
i=1 Xi
I
Stop once Sn ≥ Un or
Sn ≤ Ln
I
τ : hitting time
I
Compute p̂ based on Sτ
and τ .
I
Hit BU : decide p > α,
I
Hit BL : decide p ≤ α,
Sn
0 1 2 3 4 5 6 7 8 9 10
n
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
3
Sequential approaches based on Sn =
10
9
8
7
6
5
4
3
2
1
0
Pn
i=1 Xi
I
Stop once Sn ≥ Un or
Sn ≤ Ln
I
τ : hitting time
I
Compute p̂ based on Sτ
and τ .
I
Hit BU : decide p > α,
I
Hit BL : decide p ≤ α,
Sn
0 1 2 3 4 5 6 7 8 9 10
n
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
3
Previous Approaches
I
Besag & Clifford (1991):
Sn
h
0
m
0
I
n
(Truncated) Sequential Probability Ratio Test, Fay et al. (2007)
Sn
h
0
m
0
I
n
R-package MChtest.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
4
What do we really want?
Is p ≤ α?
Two individuals using the same statistical method on the same data
should arrive at the same conclusion.
First law of applied statistics, Gleser (1996)
Consider the resampling risk
(
Pp (p̂ > α) if p ≤ α,
RRp (p̂) ≡
Pp (p̂ ≤ α) if p > α.
Want:
sup RRp (p̂) ≤ p∈[0,1]
for some (small) > 0.
For Besag & Clifford (1991), SPRT: supp RRP ≥ 0.5
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
5
What do we really want?
Is p ≤ α?
Two individuals using the same statistical method on the same data
should arrive at the same conclusion.
First law of applied statistics, Gleser (1996)
Consider the resampling risk
(
Pp (p̂ > α) if p ≤ α,
RRp (p̂) ≡
Pp (p̂ ≤ α) if p > α.
Want:
sup RRp (p̂) ≤ p∈[0,1]
for some (small) > 0.
For Besag & Clifford (1991), SPRT: supp RRP ≥ 0.5
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
5
What do we really want?
Is p ≤ α?
Two individuals using the same statistical method on the same data
should arrive at the same conclusion.
First law of applied statistics, Gleser (1996)
Consider the resampling risk
(
Pp (p̂ > α) if p ≤ α,
RRp (p̂) ≡
Pp (p̂ ≤ α) if p > α.
Want:
sup RRp (p̂) ≤ p∈[0,1]
for some (small) > 0.
For Besag & Clifford (1991), SPRT: supp RRP ≥ 0.5
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
5
What do we really want?
Is p ≤ α?
Two individuals using the same statistical method on the same data
should arrive at the same conclusion.
First law of applied statistics, Gleser (1996)
Consider the resampling risk
(
Pp (p̂ > α) if p ≤ α,
RRp (p̂) ≡
Pp (p̂ ≤ α) if p > α.
Want:
sup RRp (p̂) ≤ p∈[0,1]
for some (small) > 0.
For Besag & Clifford (1991), SPRT: supp RRP ≥ 0.5
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
5
What do we really want?
Is p ≤ α?
Two individuals using the same statistical method on the same data
should arrive at the same conclusion.
First law of applied statistics, Gleser (1996)
Consider the resampling risk
(
Pp (p̂ > α) if p ≤ α,
RRp (p̂) ≡
Pp (p̂ ≤ α) if p > α.
Want:
sup RRp (p̂) ≤ p∈[0,1]
for some (small) > 0.
For Besag & Clifford (1991), SPRT: supp RRP ≥ 0.5
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
5
What do we really want?
Is p ≤ α?
Two individuals using the same statistical method on the same data
should arrive at the same conclusion.
First law of applied statistics, Gleser (1996)
Consider the resampling risk
(
Pp (p̂ > α) if p ≤ α,
RRp (p̂) ≡
Pp (p̂ ≤ α) if p > α.
Want:
sup RRp (p̂) ≤ p∈[0,1]
for some (small) > 0.
For Besag & Clifford (1991), SPRT: supp RRP ≥ 0.5
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
5
Recursive Definition of the Boundaries
Want:
sup RRp (p̂) ≤ p
Suffices to ensure
Pα (hit BU ) ≤ Pα (hit BL ) ≤ Recursive definition:
Given U1 , . . . , Un−1 and L1 , . . . , Ln−1 , define
I Un as the minimal value such that
Pα (hit BU until n) ≤ n
I
and Ln as the maximal value such that
Pα (hit BL until n) ≤ n
where n ≥ 0 with n % (spending sequence).
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
6
Recursive Definition of the Boundaries
Want:
sup RRp (p̂) ≤ p
Suffices to ensure
Pα (hit BU ) ≤ Pα (hit BL ) ≤ Recursive definition:
Given U1 , . . . , Un−1 and L1 , . . . , Ln−1 , define
I Un as the minimal value such that
Pα (hit BU until n) ≤ n
I
and Ln as the maximal value such that
Pα (hit BL until n) ≤ n
where n ≥ 0 with n % (spending sequence).
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
6
Recursive Definition of the Boundaries
Want:
sup RRp (p̂) ≤ p
Suffices to ensure
Pα (hit BU ) ≤ Pα (hit BL ) ≤ Recursive definition:
Given U1 , . . . , Un−1 and L1 , . . . , Ln−1 , define
I Un as the minimal value such that
Pα (hit BU until n) ≤ n
I
and Ln as the maximal value such that
Pα (hit BL until n) ≤ n
where n ≥ 0 with n % (spending sequence).
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
6
Recursive Definition of the Boundaries
Want:
sup RRp (p̂) ≤ p
Suffices to ensure
Pα (hit BU ) ≤ Pα (hit BL ) ≤ Recursive definition:
Given U1 , . . . , Un−1 and L1 , . . . , Ln−1 , define
I Un as the minimal value such that
Pα (hit BU until n) ≤ n
I
and Ln as the maximal value such that
Pα (hit BL until n) ≤ n
where n ≥ 0 with n % (spending sequence).
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
6
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
n=
Pα (Sn =k, τ≥ n)
0
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
1
0
1
-1
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
n=
Pα (Sn =k, τ≥ n)
0
1
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
1
0
1
-1
.2
.8
.07
2
-1
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
n=
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
n=
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
3
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
.06
.38
.51
.15
2
-1
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
3
n=
4
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
.06
.38
.51
.15
2
-1
.08
.41
.41
.18
3
-1
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
3
n=
4
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
.06
.38
.51
.15
2
-1
.08
.41
.41
.18
3
-1
5
.02
.14
.41
.33
.20
3
-1
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
3
n=
4
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
.06
.38
.51
.15
2
-1
.08
.41
.41
.18
3
-1
5
.02
.14
.41
.33
.20
3
-1
6
.03
.20
.39
.26
.22
3
-1
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
3
n=
4
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
.06
.38
.51
.15
2
-1
.08
.41
.41
.18
3
-1
5
.02
.14
.41
.33
.20
3
-1
6
.03
.20
.39
.26
.22
3
-1
7
.04
.24
.37
.21
.23
3
0
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Recursive Definition - Example
I
I
n
.
α = 0.2, n = 0.4 5+n
Un =the minimal value such that
Pα (hit BU until n) ≤ n
I
Ln = maximal value such that
Pα (hit BL until n) ≤ n
Pα (Sn =k, τ≥ n)
k= 3
k= 2
k= 1
k= 0
n
Un
Ln
Axel Gandy
0
1
0
1
-1
1
2
3
n=
4
.2
.8
.07
2
-1
.04
.32
.64
.11
2
-1
.06
.38
.51
.15
2
-1
.08
.41
.41
.18
3
-1
5
.02
.14
.41
.33
.20
3
-1
6
.03
.20
.39
.26
.22
3
-1
7
.04
.24
.37
.21
.23
3
0
8
.05
.26
.29
.25
3
0
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
7
Sequential Decision Procedure - Example
n
.
α = 0.2, n = 0.4 5+n
20
10
0
0
10
20
30
40
50
60
70
80
90
100
n
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
8
Influence of on the stopping rule
0
50
100
150
200
250
300
350
n
= 0.1, 0.001, 10−5 , 10−7 ; n = 1000+n
0
1000
2000
3000
4000
5000
n
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
9
Sequential Estimation based on the MLE

 Sτ ,
p̂ =
τ
α,
I
τ <∞
τ = ∞,
One can show:
I
I
hitting the upper boundary implies p̂ > α,
hitting the lower boundary implies p̂ < α.
Hence,
sup RRp (p̂) ≤ p
I
Furthermore, ∃ random interval In s.t.
I
I
In only depends on X1 , . . . , Xn ,
p̂ ∈ In .
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
10
Example - Two-way sparse contingency table
1
2
0
1
0
2
0
1
1
1
2
0
1
2
1
1
2
1
0
1
1
3
2
0
1
0
0
7
0
0
1
0
3
1
0
I
H0 : variables are independent.
I
Reject for large values of the likelihood ratio test statistic T
I
T → χ2(7−1)(5−1) under H0 . Based on this: p = 0.031.
I
Matrix sparse - approximation poor?
I
Use parametric bootstrap based on row and column sums.
I
Naive test statistic p̂naive with n = 1,000 replicates:
p = 0.041 < 0.05.
Probability of reporting p > 0.05: roughly 0.08.
d
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
11
Example - Two-way sparse contingency table
1
2
0
1
0
2
0
1
1
1
2
0
1
2
1
1
2
1
0
1
1
3
2
0
1
0
0
7
0
0
1
0
3
1
0
I
H0 : variables are independent.
I
Reject for large values of the likelihood ratio test statistic T
I
T → χ2(7−1)(5−1) under H0 . Based on this: p = 0.031.
I
Matrix sparse - approximation poor?
I
Use parametric bootstrap based on row and column sums.
I
Naive test statistic p̂naive with n = 1,000 replicates:
p = 0.041 < 0.05.
Probability of reporting p > 0.05: roughly 0.08.
d
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
11
Example - Two-way sparse contingency table
1
2
0
1
0
2
0
1
1
1
2
0
1
2
1
1
2
1
0
1
1
3
2
0
1
0
0
7
0
0
1
0
3
1
0
I
H0 : variables are independent.
I
Reject for large values of the likelihood ratio test statistic T
I
T → χ2(7−1)(5−1) under H0 . Based on this: p = 0.031.
I
Matrix sparse - approximation poor?
I
Use parametric bootstrap based on row and column sums.
I
Naive test statistic p̂naive with n = 1,000 replicates:
p = 0.041 < 0.05.
Probability of reporting p > 0.05: roughly 0.08.
d
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
11
Example - Bootstrap and Sequential Algorithm
> dat <- matrix(c(1,2,2,1,1,0,1, 2,0,0,2,3,0,0, 0,1,1,1,2,7,3, 1,1,2,0,0,0,1,
+
0,1,1,1,1,0,0), nrow=5,ncol=7,byrow=TRUE)
> loglikrat <- function(data){
+
cs <- colSums(data);rs <- rowSums(data); mu <- outer(rs,cs)/sum(rs)
+
2*sum(ifelse(data<=0.5, 0,data*log(data/mu)))
+ }
> resample <- function(data){
+
cs <- colSums(data);rs <- rowSums(data); n <- sum(rs)
+
mu <- outer(rs,cs)/n/n
+
matrix(rmultinom(1,n,c(mu)),nrow=dim(data)[1],ncol=dim(data)[2])
+ }
> t <- loglikrat(dat);
> library(simctest)
> res <- simctest(function(){loglikrat(resample(dat))>=t},maxsteps=1000)
> res
No decision reached.
Final estimate will be in [ 0.02859135 , 0.07965451 ]
Current estimate of the p.value: 0.041
Number of samples: 1000
> cont(res, steps=10000)
p.value: 0.04035456
Number of samples: 8574
Further Uses of the Algorithm
I
Simulation study to evaluate whether a test is
liberal/conservative.
I
Determining the sample size to achieve a certain power.
Iterated Use:
I
I
I
I
Determining the power of a bootstrap test.
Simulation study to evaluate whether a bootstrap test is
liberal/conservative.
Double bootstrap test.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
13
Expected Hitting Time
Result: Ep (τ ) < ∞ ∀p 6= α
n
:
Example with α = 0.05, n = 1000+n
Ep(τ) µp
1.2 1.4
0
Ep(τ)
400
800
ε = 0.001
ε = 1e−05
ε = 1e−07
1.0
p
0.0
0.2
0.4
0.6
0.8
1.0
p
µp = theoretical lower bound on Ep (τ ).
R1
I Note:
0 µp dp = ∞;
I for iterated use: Need to limit the number of steps.
Expected Hitting Time
Result: Ep (τ ) < ∞ ∀p 6= α
n
:
Example with α = 0.05, n = 1000+n
Ep(τ) µp
1.2 1.4
0
Ep(τ)
400
800
ε = 0.001
ε = 1e−05
ε = 1e−07
1.0
p
0.0
0.2
0.4
0.6
0.8
1.0
p
µp = theoretical lower bound on Ep (τ ).
R1
I Note:
0 µp dp = ∞;
I for iterated use: Need to limit the number of steps.
Summary
I
Sequential implementation of Monte Carlo Tests and
computation of p-values.
I
Useful when implementing tests in packages.
After a finite number of steps:
I
I
I
I
p̂ or
interval [p̂nL , p̂nU ] in which p̂ will lie.
Guarantee (up to a very small error probability):
p̂ is on the “correct side” of α.
I
R-package simctest available on CRAN.
(efficient implementation with C-code)
I
For details see Gandy (2009).
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
15
References
Besag, J. & Clifford, P. (1991). Sequential Monte Carlo p-values. Biometrika 78,
301–304.
Davison, A. & Hinkley, D. (1997). Bootstrap methods and their application.
Cambridge University Press.
Fay, M. P., Kim, H.-J. & Hachey, M. (2007). On using truncated sequential
probability ratio test boundaries for Monte Carlo implementation of hypothesis
tests. Journal of Computational & Graphical Statistics 16, 946 – 967.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly
bounded resampling risk. Accepted for publication in JASA.
Gleser, L. J. (1996). Comment on Bootstrap Confidence Intervals by
T. J. DiCiccio and B. Efron. Statistical Science 11, 219–221.
Axel Gandy
Sequential Implementation of Monte Carlo Tests with Uniformly Bounded Resampling Risk
16