Biometrika (1982), 69, 1, pp. 69-74
Printed in Great Britain
P values for tests using a repeated significance test design
BY KENNETH FAIRBANKS
Department of Mathematics, Murray State University, Kentucky, U.S.A.
AND RICHARD MADSEN
Department of Statistics, University of Missouri, Columbia, U.S.A.
SUMMARY
Repeated significance testing can be used as a multistage testing procedure which has
advantages over single sample tests and purely sequential tests. Repeated significance
tests having a constant nominal level of significance have designs which are more easily
understood by researchers and still have good power relative to designs using varying
nominal levels of significance. In this paper we present a method for defining an overall P
value for repeated significance tests with constant significance levels which can aid
researchers in interpreting the observed results of such tests. A table of P values for
various designs is given.
Some key words: Multistage testing; P value; Repeated significance tests.
1. INTRODUCTION
In many experimental situations where data are collected over a long time period, it is
common practice for researchers to perform intermediate analyses of the accumulating
data. Such analyses may be done simply out of curiosity or out of a desire to terminate
testing at an early time if the data so indicate. Early termination can lead to cost
savings. While ideally one might wish to use a purely sequential test in such situations,
there are many practical reasons for not doing so. A compromise measure would be to use
a group multistage design. One such approach would be to choose a number of stages and
then apply repeated significance tests to the data (Armitage, 1975). Because of the
repeated testing, the nominal significance levels used as a criterion for stopping the trial
must be smaller than the overall significance level for the entire test. Pocock (1977)
suggests that a constant nominal significance level used at each stage is a design more
easily understood by researchers and that little power is lost as compared to designs
using varying nominal significance levels.
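The need for a smaller nominal level can be illustrated numerically. The following is a minimal Python sketch, not part of the original note. It simulates the standardized statistic under the null hypothesis as a scaled sum of independent N(0, 1) stage increments and estimates the probability of crossing a constant boundary at any of five stages. The function name, simulation size and seed are illustrative assumptions; the value 2.413 is the five-stage critical value for an overall level of 0.05 listed in the tables of this paper.

```python
import numpy as np

def overall_type1_error(z_c, n_stages, n_sims=200_000, seed=0):
    """Monte Carlo estimate of pr(|Z_i| >= z_c for some i <= n_stages) under H0."""
    rng = np.random.default_rng(seed)
    # Under H0 each stage contributes an independent N(0, 1) increment.
    w = rng.standard_normal((n_sims, n_stages))
    # Cumulative statistic after i stages: (W_1 + ... + W_i) / sqrt(i).
    z = np.cumsum(w, axis=1) / np.sqrt(np.arange(1, n_stages + 1))
    # A path rejects if it crosses the constant boundary at any stage.
    return float(np.mean(np.any(np.abs(z) >= z_c, axis=1)))

naive = overall_type1_error(1.960, 5)     # nominal 0.05 applied at every stage
adjusted = overall_type1_error(2.413, 5)  # constant boundary for N = 5, alpha = 0.05
print(f"overall level with z_c = 1.960: {naive:.3f}")
print(f"overall level with z_c = 2.413: {adjusted:.3f}")
```

In this sketch the unadjusted boundary gives an overall level well above 0.05, while the larger constant boundary brings it back to about 0.05, up to simulation error.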
One drawback to using repeated significance testing with a constant nominal
significance level is the difficulty in interpreting the results of a trial which never reaches
the stopping boundary, but which has a nominal P value at the final stage which is less
than the overall level of significance of $\alpha = 0.05$, say. One possible solution to this
problem is to give an overall P value corresponding to an observed set of data. The
purpose of this note is to give a definition of an overall P value for a class of multistage
tests. This defined P value can then be used for repeated significance level tests using
constant nominal significance levels. In addition, we give tables which can be used for
finding the overall P value for tests having 1,...,5 stages. These tables are based on
tests for a difference in treatment means when the data are taken from independent
normal populations having a common known variance. These are given in §2. In §3
we discuss the formulae used to calculate the P values and show that the probability
distribution of the P values is uniform.
Downloaded from http://biomet.oxfordjournals.org/ at Penn State University (Paterno Lib) on September 19, 2016
2. P VALUES FOR A REPEATED SIGNIFICANCE TEST
Following the test described in §2 of Pocock (1977), suppose there are two treatments
A and B and that these treatments are each randomly assigned to n of the next 2n
available subjects at each stage, for up to N stages. Let the response variable in each case
be normal with means $\mu_A$ and $\mu_B$ and common variance $\sigma^2$. In testing $H_0: \mu_A - \mu_B = 0$
versus $H_a: \mu_A - \mu_B \neq 0$ use as a test statistic at the ith stage
$$ z_i = \frac{i^{-1} \sum_{j=1}^{i} (\bar{x}_{Aj} - \bar{x}_{Bj})}{\sqrt{2\sigma^2/(in)}}, \tag{1} $$
where $\bar{x}_{Aj}$ and $\bar{x}_{Bj}$ represent the observed mean response for treatments A and B in the
jth subsamples. In order to obtain an N-stage test having an overall significance level $\alpha$
one must obtain appropriate critical values. These values may be obtained by numerical
integration (Armitage, McPherson & Rowe, 1969) or from a table given by Pocock. Note
that the choice of n would be determined by the desired power against some specific
alternative. The value of n can be obtained from Table 2 of Pocock (1977). Notationally,
let $Z_i$ denote the random variable whose observed value is $z_i$.
The test is performed as follows: first find the appropriate critical value $z_c$. If for any
i (i = 1, ..., N-1), $|z_i| \ge z_c$ then reject $H_0$; otherwise take the next subsamples and
repeat the process. If the testing reaches the Nth stage, reject $H_0$ if $|z_N| \ge z_c$ and accept,
or fail to reject, $H_0$ otherwise.
Now we define
$$ \alpha_1 = \mathrm{pr}(|Z_1| \ge z_c), \qquad \alpha_i = \mathrm{pr}(|Z_j| < z_c,\ j = 1, \ldots, i-1;\ |Z_i| \ge z_c) $$
for i = 2, ..., N, the probabilities being calculated under $H_0$. Then
$$ \sum_{i=1}^{N} \alpha_i = \alpha, $$
where $\alpha$ is the overall significance level of the test. Given these definitions we define the P
value as follows.
Consider an N-stage repeated significance test having critical value $z_c$ corresponding to
an overall level of significance $\alpha$. If the test terminates at stage k with a terminal
observed value of the test statistic $z_0$, the overall P value for the test will be defined as
$\sum_i \alpha_i + p_k^*$, where the sum is over i = 1, ..., k-1, and where
$$ p_k^* = \mathrm{pr}(|Z_i| < z_c,\ i = 1, \ldots, k-1;\ |Z_k| \ge |z_0|). $$
Note that by the way the P value is defined,
(i) the P value will be smaller if the test terminates at stage k-1 than if it terminates
at stage k;
(ii) the P value is less than or equal to $\alpha$ if and only if $H_0$ is rejected;
(iii) the probability distribution of the P value is uniform over the interval (0,1).
Note that this definition of a P value could be extended to other multistage tests in a
straightforward way.
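The definition can be checked by simulation. The sketch below is an illustration only, not the authors' method of calculation: it estimates the overall P value as the null probability of stopping before stage k, plus the null probability of continuing to stage k and observing a statistic at least as large in absolute value as the terminal one. It assumes that under the null hypothesis the stage-k statistic behaves as a sum of k independent N(0, 1) increments divided by the square root of k. The function name, simulation size and seed are hypothetical.

```python
import numpy as np

def mc_overall_p_value(k, z0, z_c, n_sims=400_000, seed=1):
    """Monte Carlo estimate of the overall P value for stopping at stage k with |z_k| = |z0|."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal((n_sims, k))
    # |Z_1|, ..., |Z_k| for each simulated trial under H0.
    z = np.abs(np.cumsum(w, axis=1) / np.sqrt(np.arange(1, k + 1)))
    # Trials that cross the boundary before stage k (the earlier stopping probabilities).
    stopped_early = (z[:, : k - 1] >= z_c).any(axis=1)
    # Trials that continue to stage k and are at least as extreme as the observed value.
    extreme_at_k = ~stopped_early & (z[:, k - 1] >= abs(z0))
    return float(np.mean(stopped_early | extreme_at_k))

# Two-stage design with overall level 0.05 (critical value 2.178), stopped at stage 2.
p_hat = mc_overall_p_value(k=2, z0=2.4, z_c=2.178)
print(f"estimated overall P value: {p_hat:.4f}")
```

For this two-stage example the estimate should lie close to the value 0.0397 tabulated in Table 1, up to simulation error. Monte Carlo is used here only because it mirrors the definition directly; it is much less precise than numerical integration.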
The distribution of the test statistic at a given stage must be found conditioned on
continuation at previous stages, and so its distribution is quite complicated. No simple
closed form for the distribution which allows easy calculation of the P value can be
given. Consequently values which were found by methods of numerical integration are
given in Tables 1, 2 and 3. These tables can be used for N = 1, ..., 5 stages and for
$\alpha = 0.01$ and $0.05$.
Table 1. P values when $H_0$ is rejected, $\alpha = 0.05$

No. of                                        Final observed |z_k|
stages  z_c    k  Σα_i     2.2     2.4     2.6     2.8     3.0     3.2     3.4     3.6
1       1.960  1  0.0500  0.0278  0.0164  0.0094  0.0052  0.0026  0.0014  0.0007  0.0003
2       2.178  1  0.0294  0.0278  0.0164  0.0094  0.0052  0.0026  0.0014  0.0007  0.0003
               2  0.0500  0.0486  0.0397  0.0345  0.0318  0.0304  0.0298  0.0295  0.0294
3       2.289  1  0.0220    -     0.0164  0.0094  0.0052  0.0026  0.0014  0.0007  0.0003
               2  0.0379    -     0.0332  0.0278  0.0248  0.0233  0.0225  0.0222  0.0220
               3  0.0500    -     0.0460  0.0416  0.0395  0.0385  0.0381  0.0379  0.0379
4       2.361  1  0.0182    -     0.0164  0.0094  0.0052  0.0026  0.0014  0.0007  0.0003
               2  0.0314    -     0.0299  0.0242  0.0211  0.0195  0.0187  0.0183  0.0182
               3  0.0417    -     0.0404  0.0357  0.0333  0.0322  0.0317  0.0315  0.0315
               4  0.0500    -     0.0488  0.0447  0.0429  0.0421  0.0418  0.0417  0.0417
5       2.413  1  0.0158    -       -     0.0094  0.0052  0.0026  0.0014  0.0007  0.0003
               2  0.0274    -       -     0.0221  0.0189  0.0172  0.0163  0.0160  0.0158
               3  0.0364    -       -     0.0319  0.0294  0.0282  0.0277  0.0275  0.0274
               4  0.0438    -       -     0.0398  0.0378  0.0369  0.0366  0.0365  0.0364
               5  0.0500    -       -     0.0464  0.0447  0.0441  0.0438  0.0438  0.0438
Σα_i is over i = 1, ..., k.

Table 2. P values when $H_0$ is rejected, $\alpha = 0.01$

No. of                                    Final observed |z_k|
stages  z_c    k  Σα_i     2.6     2.8     3.0     3.2     3.4     3.6     3.8
1       2.576  1  0.0100          0.0051  0.0027  0.0014  0.0007  0.0003  0.0001
2       2.772  1  0.0056    -     0.0051  0.0027  0.0014  0.0007  0.0003  0.0001
               2  0.0100    -     0.0095  0.0074  0.0064  0.0058  0.0056  0.0056
3       2.873  1  0.0041    -       -     0.0027  0.0014  0.0007  0.0003  0.0001
               2  0.0072    -       -     0.0064  0.0049  0.0044  0.0041  0.0041
               3  0.0100    -       -     0.0088  0.0079  0.0075  0.0073  0.0072
4       2.939  1  0.0033    -       -     0.0027  0.0014  0.0007  0.0003  0.0001
               2  0.0058    -       -     0.0053  0.0042  0.0036  0.0034  0.0033
               3  0.0080    -       -     0.0076  0.0066  0.0061  0.0060  0.0059
               4  0.0100    -       -     0.0094  0.0086  0.0082  0.0081  0.0081
5       2.986  1  0.0028    -       -     0.0027  0.0014  0.0007  0.0003  0.0001
               2  0.0050    -       -     0.0049  0.0037  0.0032  0.0029  0.0028
               3  0.0069    -       -     0.0068  0.0058  0.0053  0.0051  0.0051
               4  0.0085    -       -     0.0084  0.0075  0.0071  0.0070  0.0069
               5  0.0100    -       -     0.0098  0.0090  0.0087  0.0086  0.0085
Σα_i is over i = 1, ..., k.
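As a reading aid for Tables 1 and 2, the short sketch below (an illustration, not part of the original note) computes the ordinary single-stage two-sided normal P value and compares it with a tabulated overall P value. The helper name is hypothetical. The example assumes a three-stage design with overall level 0.05 (critical value 2.289) that stops at stage 2 with a terminal statistic of 2.6 in absolute value; the overall P value 0.0278 is read from Table 1.

```python
from math import erfc, sqrt

def nominal_two_sided_p(z):
    """Single-stage two-sided normal P value, 2 * (1 - Phi(|z|))."""
    return erfc(abs(z) / sqrt(2.0))

# Rows with k = 1 involve no earlier looks, so they should agree with the nominal P value.
for z in (2.4, 3.2, 3.4):
    print(f"|z| = {z}: nominal P = {nominal_two_sided_p(z):.4f}")

# Three-stage design, alpha = 0.05, stopped at stage 2 with |z_2| = 2.6.
nominal = nominal_two_sided_p(2.6)
overall = 0.0278  # entry of Table 1 for N = 3, k = 2, |z_k| = 2.6
print(f"nominal P = {nominal:.4f}, overall P = {overall:.4f}")
```

The overall P value exceeds the nominal one because it also counts the probability of having stopped at the earlier stage.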
Table 3. (a) P values when $H_0$ is accepted, $\alpha = 0.05$

No. of                              Final observed |z_k|
stages  z_c     0.20    0.60    1.0     1.4     1.6     1.8     2.0     2.2     2.4
1       1.960  0.8414  0.5486  0.3174  0.1616  0.1096  0.0718    -       -       -
2       2.178  0.8418  0.5500  0.3214  0.1703  0.1213  0.0867  0.0634    -       -
3       2.289  0.8423  0.5514  0.3236  0.1732  0.1246  0.0906  0.0681  0.0541    -
4       2.361  0.8426  0.5523  0.3251  0.1749  0.1265  0.0926  0.0702  0.0565    -
5       2.413  0.8428  0.5530  0.3262  0.1762  0.1277  0.0938  0.0715  0.0579  0.0502

(b) P values when $H_0$ is accepted, $\alpha = 0.01$

No. of                                  Final observed |z_k|
stages  z_c     0.20    0.60    1.0     1.4     1.8     2.0     2.2     2.4     2.6     2.8
1       2.576  0.8414  0.5486  0.3174  0.1616  0.0718  0.0455  0.0278  0.0164    -       -
2       2.772  0.8415  0.5486  0.3175  0.1622  0.0734  0.0476  0.0306  0.0197  0.0132    -
3       2.873  0.8415  0.5487  0.3179  0.1627  0.0741  0.0483  0.0313  0.0206  0.0143  0.0107
4       2.939  0.8416  0.5489  0.3181  0.1631  0.0745  0.0487  0.0317  0.0211  0.0147  0.0112
5       2.986  0.8416  0.5490  0.3183  0.1633  0.0747  0.0490  0.0320  0.0213  0.0150  0.0116
3. FORMULAE FOR CALCULATION AND THE PROBABILITY DISTRIBUTION OF THE P VALUES
The method of calculation of the P values was based on numerical methods and
followed the work of Armitage et al. (1969). The values of $\alpha_i$ and the P values can be
found from the conditional distributions of the random variables $Z_k$ as follows. If we
define
$$ W_k = (\bar{X}_{Ak} - \bar{X}_{Bk}) \big/ \sqrt{2\sigma^2/n}, $$
then under $H_0$, $W_k \sim N(0, 1)$ and
$$ Z_k = \{\sqrt{k-1}\, Z_{k-1} + W_k\} / \sqrt{k}. \tag{2} $$
Furthermore $Z_{k-1}$ and $W_k$ are independent for k = 2, ..., N. If we let $f_1(.)$ denote the
density function of $Z_1$ and if the observed value of $Z_1$, say $z_0$, is in the critical region,
then the P value would be found from
$$ P = \int_{|z_1| \ge |z_0|} f_1(z_1)\, dz_1. $$
If $|z_0| = z_c$, then the P value will be equal to $\alpha_1$. Denote the conditional density of $Z_1$
given that $|Z_1| < z_c$ by
$$ f_1^*(z_1) = f_1(z_1) \Big/ \int_{|z_1| < z_c} f_1(z_1)\, dz_1 \qquad (|z_1| < z_c). $$
To find the conditional density of $Z_2$, given $|Z_1| < z_c$, use the independence of $Z_1$ and $W_2$
to get
$$ f(z_1, w_2) = f_1^*(z_1)\, \phi(w_2), \tag{3} $$
where $\phi(.)$ is a standard normal density. From the relationship in (2) we can find the
joint density of $Z_1$ and $Z_2$ given $|Z_1| < z_c$. From this joint density the marginal density of
$Z_2$ can be found to be
$$ f_2(z_2) = \sqrt{2} \int_{|z_1| < z_c} f_1^*(z_1)\, \phi(\sqrt{2}\, z_2 - z_1)\, dz_1. $$
Using $f_2(.)$ we can find $\alpha_2$ by using
$$ \alpha_2 = (1 - \alpha_1) \Bigl\{ 1 - \int_{|z_2| < z_c} f_2(z_2)\, dz_2 \Bigr\}. \tag{4} $$
It follows from (4) that
$$ \int_{|z_2| < z_c} f_2(z_2)\, dz_2 = (1 - \alpha_1 - \alpha_2)/(1 - \alpha_1), $$
so that the conditional density of $Z_2$ given $|Z_i| < z_c$, i = 1, 2 is
$$ f_2^*(z_2) = (1 - \alpha_1)\, f_2(z_2) / (1 - \alpha_1 - \alpha_2) \qquad (|z_2| < z_c). $$
In general, if $f_{k-1}^*(.)$ represents the conditional density of $Z_{k-1}$ given $|Z_i| < z_c$ for
i = 1, ..., k-1, then the conditional density of $Z_k$ will be given by
$$ f_k(z_k) = \sqrt{k} \int_{|z_{k-1}| < z_c} f_{k-1}^*(z_{k-1})\, \phi(\sqrt{k}\, z_k - \sqrt{k-1}\, z_{k-1})\, dz_{k-1}. \tag{5} $$
From this we obtain
$$ \alpha_k = (1 - \alpha_1 - \ldots - \alpha_{k-1}) \Bigl\{ 1 - \int_{|z_k| < z_c} f_k(z_k)\, dz_k \Bigr\}. \tag{6} $$
Finally the P value corresponding to a final observed value of the test statistic, say $z_0$,
would be
$$ P = \sum_{i=1}^{k-1} \alpha_i + \Bigl( 1 - \sum_{i=1}^{k-1} \alpha_i \Bigr) \Bigl\{ 1 - \int_{|z_k| < |z_0|} f_k(z_k)\, dz_k \Bigr\}. \tag{7} $$
These values must be found by repeated numerical integration. Equations (5), (6) and (7)
are necessary to find desired P values.
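The repeated integration can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' program. It uses trapezoidal quadrature on a fixed grid, which is my choice. It also carries the unnormalized sub-density of the stage-k statistic on the event that all earlier statistics stayed inside the boundary, rather than the conditional densities of the text; the two differ only by the factor one minus the sum of the earlier stopping probabilities. All function names and the grid size are hypothetical.

```python
import numpy as np

def _phi(x):
    """Standard normal density."""
    return np.exp(-0.5 * x * x) / np.sqrt(2.0 * np.pi)

def _grid(a, m=801):
    """Nodes and trapezoidal weights on [-a, a]."""
    x = np.linspace(-a, a, m)
    w = np.full(m, x[1] - x[0])
    w[[0, -1]] *= 0.5
    return x, w

def _step(k, z, u, wu, g_prev):
    """Sub-density of Z_k at the points z, from the stage k-1 sub-density on |u| < z_c."""
    # Given Z_{k-1} = u, Z_k is normal with mean sqrt((k-1)/k) * u and variance 1/k.
    kernel = np.sqrt(k) * _phi(np.sqrt(k) * z[:, None] - np.sqrt(k - 1) * u[None, :])
    return kernel @ (wu * g_prev)

def overall_p_value(k, z0, z_c, m=801):
    """Overall P value for a test stopping at stage k with terminal statistic z0."""
    u, wu = _grid(z_c, m)        # continuation region |u| < z_c
    z, wz = _grid(abs(z0), m)    # region |z| < |z0| at the final stage
    if k == 1:
        spent, g_k = 0.0, _phi(z)
    else:
        g = _phi(u)                          # stage 1 sub-density
        for j in range(2, k):                # stages 2, ..., k-1 on the continuation region
            g = _step(j, u, u, wu, g)
        spent = 1.0 - wu @ g                 # probability of stopping before stage k
        g_k = _step(k, z, u, wu, g)          # stage k sub-density on |z| < |z0|
    # Earlier stopping probability plus the tail mass beyond |z0| at stage k.
    return float(spent + ((1.0 - spent) - wz @ g_k))

print(f"{overall_p_value(2, 2.4, 2.178):.4f}")   # rejection at stage 2 of a two-stage design
print(f"{overall_p_value(5, 2.2, 2.413):.4f}")   # acceptance at stage 5 of a five-stage design
```

The two calls correspond to entries of Table 1 (0.0397) and Table 3(a) (0.0579), and should agree with them to within rounding and quadrature error.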
Finally, under $H_0$ the distribution of the P value is uniform over the interval (0,1). If
we define $\alpha_0 = 0$ and $\alpha_{N+1} = 1 - (\alpha_1 + \ldots + \alpha_N) = 1 - \alpha$, and choose $p \in (0,1)$, then for some
k, p will satisfy
$$ \sum_{i=0}^{k-1} \alpha_i < p \le \sum_{i=0}^{k} \alpha_i. $$
Such a value of p corresponds to a reject decision being made at stage k if $p \le \alpha$ or an
accept decision being made at stage N if $p > \alpha$. The value p corresponds to some
observed value of the test statistic, say $z_p > 0$. Then we have that the probability that
the P value $\le p$ equals the probability that the decision is made prior to stage k or at
stage k with $|Z_k| \ge z_p$ and this is p.
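The last step can be written out explicitly for a rejection at stage k. This is a restatement of the argument above, with a numerical illustration read from Table 1 for N = 2 and an overall level of 0.05, where the probability of stopping at stage 1 is 0.0294 and a terminal value of 2.4 at stage 2 gives P = 0.0397.

```latex
\begin{align*}
\operatorname{pr}(P \le p)
  &= \operatorname{pr}(\text{decision made before stage } k)
     + \operatorname{pr}\bigl(|Z_i| < z_c,\ i = 1, \ldots, k-1;\ |Z_k| \ge z_p\bigr) \\
  &= \sum_{i=1}^{k-1} \alpha_i + \Bigl( p - \sum_{i=1}^{k-1} \alpha_i \Bigr) = p, \\[4pt]
\text{e.g. } N = 2,\ p = 0.0397:\quad
\operatorname{pr}(P \le 0.0397) &= 0.0294 + (0.0397 - 0.0294) = 0.0397 .
\end{align*}
```

The second term equals p minus the earlier stopping probabilities because the value $z_p$ is, by construction, the terminal value at stage k whose overall P value is exactly p.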
This research was supported in part by U.S. Office of Naval Research. The authors
wish to thank a referee for a comment which led to an improved definition of the P value.
REFERENCES
ARMITAGE, P., MCPHERSON, C. K. & ROWE, B. C. (1969). Repeated significance tests on accumulating data.
J. R. Statist. Soc. A 132, 235-44.
ARMITAGE, P. (1975). Sequential Medical Trials. Oxford: Blackwell.
POCOCK, S. J. (1977). Group sequential methods in the design and analysis of clinical trials. Biometrika 64,
191-9.
[Received August 1980. Revised July 1981]