Tate, R. F. "The Biserial and Point Correlation Coefficients."
THE BISERIAL AND POINT CORRELATION COEFFICIENTS
By
Robert Fleming Tate
Institute of Statistics
Series No. 14
For Limited Distribution
Mimeo.
ERRATA

p. vi, l. 21: a "," should be added between "other" and "continuous".
p. vii, l. 2: "of our" should read "of the joint distribution of our".
p. 1, l. 6: omit "of".
p. 1, l. 8: "(2πr)^-1" should read "(2πσ)^-1".
p. 2, l. 16: "defined to" should read "defined from the sample to".
p. 3, l. 1: ", and" should read ", m = the number of z_i in the sample which are 1, and".
p. 7, l. 23: should read "g'(x) > 0".
p. 23, l. 6: "5±..." should read "1±...".
p. 23, l. 10: ";" should read "{ ... }dx_i".
p. 29, l. 14: should read "φ".
p. 59, l. 2: "for Random" should read "for Dependent Random".
THE BISERIAL AND POINT CORRELATION COEFFICIENTS
By Robert Fleming Tate
Special Report of research at the Institute of Statistics of
the University of North Carolina, Chapel Hill, under Office
of Naval Research Project NR 042031.
FOREWORD
By Harold Hotelling
The fact that biserial correlation was introduced and came into general use before the development of the modern statistical emphasis on exact sampling probabilities and the theory of efficient estimation and testing of hypotheses, which have not yet embraced biserial correlation in their formal treatments, leaves unanswered many questions as to the efficacy, proper place, and possible modifications of this widely used technique. A feeling that all is not well with biserial correlation has led more recently to the introduction of point biserial correlation, which in turn has led to further questions of principle and technique. In this paper Mr. Tate contrasts these techniques with each other and with the theoretically hundred per cent efficient, but computationally difficult, method of maximum likelihood. A paradox is created by the fact that whereas the correlation coefficient in a population cannot exceed unity, the biserial estimate of it has no upper bound and may occasionally be far greater than unity. Mr. Tate shows how this phenomenon is associated with a gradually decreasing efficiency, approaching zero, of the biserial correlation as it increases. A point of particular interest is the variance stabilizing transformation, analogous to R. A. Fisher's transformation of the product-moment correlation coefficient, and capable of being carried out with the same tables, reached on pp. 21-22 for the case of equal frequencies in the two classes. Extension of this to the case of unequal classes is now under consideration. Psychologists and personnel workers as well as others concerned with test construction, item analysis and correlation will find in this memorandum a clarification of numerous troublesome questions that have surrounded biserial correlation.
ABSTRACT

ROBERT FLEMING TATE. The Biserial and Point Biserial Correlation Coefficients. (Under the direction of HAROLD HOTELLING.)

Two solutions to the problem of finding the correlation between a continuous random variable X and a discrete, two-valued random variable Z are discussed, the coefficients of biserial and of point biserial correlation.

It is shown that the biserial correlation coefficient r* has efficiency 0 when the population correlation coefficient ρ tends to ±1. Also, r* has minimum variance for fixed ρ when the cut value of the underlying normal distribution occurs at the population mean; a table is included to illustrate this point. A special case of the limiting distribution of r* is obtained for the cut value at the mean. Further, biserial r* is shown to be unbounded, and a diagram illustrating this point is included.

The equivalence of the use of point biserial r under certain restrictions and "Student's" t is displayed. The relative advantages and disadvantages of r and r* are discussed, and some recommendations are given as to which of the two coefficients should be used in various cases.
INTRODUCTION

An important problem in experimental work is that of determining the correlation between a continuous variable, and a discrete variable which takes on only two values. The need to measure such a quantity is present in all fields of research to some extent, but such correlations are of particular importance in psychological testing.

The fundamental problem of psychological testing is that of measuring some quantity which has a name, for example intelligence or mechanical aptitude. A measurable criterion is selected to represent the quantity under consideration. In the absence of any external criterion the total test score is sometimes used. The technique consists firstly in summoning for consideration all possible questions, or items, as they are called, which could have any bearing on the quantity to be measured, and which can be answered quickly and unambiguously by the subject who is being tested. The item is then scored 1 if the degree of association with the quantity is positive, and 0 if it is not. If the test is to be efficiently carried out, the items should have low correlations among themselves, and if it is to be valid, they should all be correlated highly with the criterion. The methods available for finding the correlation between item and criterion form the subject of this paper. The extension of the mode of approach used in psychological testing to other types of situations will in most cases be evident.
For testing the hypothesis ρ = 0 the product moment r can be used. Its distribution is independent of that of the discrete variable under the null hypothesis. For a tabulation we can use Table V A of Fisher's book [6], David's tables [2], or Fisher's z transformation, which is given in Table V B of Fisher's book [6]. There is no problem involved in testing this hypothesis.

For the case ρ ≠ 0 two solutions have been proposed. In 1909 Karl Pearson [12] proposed a solution to the problem in the form of the so-called biserial coefficient of correlation r*, which is defined in Section 2 of Chapter I. The word biserial refers to the separate sets of values of the continuous variable which are associated with the two values, 0 and 1, of the discrete variable. Many results and techniques of the Karl Pearson school of statistics were supplanted by new and improved methods after the advent of the penetrating studies of R. A. Fisher [5] and J. Neyman [11]. The coefficient of biserial correlation, however, is still a widely used tool in statistical analysis. It is treated in many texts used in psychological statistics, but its mathematical theory has remained in substantially the same incomplete form since 1913.
The basic property of the biserial coefficient is that it presupposes an underlying normal distribution from which the discrete, two-valued, variable can be obtained. That is, it presupposes a dichotomization of the normal distribution at some fixed point ω, after which all observations which fall on the right of this point will be assigned the value 1, and all of those on the left, the value 0. The meaning of this assumption will become clearer if we return for a moment to the notion of a test item. There the postulation of underlying normality is equivalent to the supposition that there exists a normal population of attitudes towards answering the given test question. Most subjects would, if possible, answer in qualified terms, or in degrees of confirmation or rejection. However, one of two answers must be given, and it is hypothesized that there exists a unique point of demarcation of attitudes, and that it occurs at ω, the aforementioned point of dichotomy.

1. Numbers in brackets refer to bibliography.
In 1934 another solution to the problem was proposed by J. Stalnaker and M. W. Richardson [16] in the form of the point biserial coefficient of correlation r. In connection with this coefficient no specification is made of any underlying distribution for the discrete variable.

Pearson's assumptions were that the discrete variable was obtained from an underlying normal variable by dichotomization at some fixed point, and that there exists a linear regression of the other, continuous, variable upon this normal variable. His derivation of biserial r*, based on these assumptions, was carried out by the Method of Consistency, so r* is a consistent estimate of ρ. A discussion of the derivation is given in Chapter I, Section 3.
Unless we specify a more complex model than that used by Pearson, we cannot speak of the joint distribution of our discrete and continuous variables, and hence can draw no further inferences about r*. The model to be considered here specifies an underlying bivariate normal distribution with three parameters, σ the standard deviation of the continuous variable, the cut value ω, and the population correlation coefficient ρ. The model is defined in Chapter I, Section 2.

In addition to the fact that r* is a consistent estimate of ρ, the asymptotic standard error of r* is known. H. E. Soper [15] obtained this result in 1913. In Chapter II the asymptotic standard error of r* will be studied more closely. It will be shown that this quantity, a function of ω and ρ, takes on its minimum value for any fixed ρ at ω = 0. A double-entry table of the asymptotic standard deviation is given in Table I.
In Chapter III, Section 1 the limiting distribution of r* will be derived, subject to the restriction that the point of dichotomy ω be taken at the mean of the underlying distribution associated with the discrete variable. The variance of this distribution offers a partial check on Soper's result. Subject to the same restriction on ω, a transformation will be given analogous to Fisher's z transformation, which stabilizes the variance.
Two important properties of the efficiency of r* are obtained in Chapter IV. By a consideration of the information matrix of the maximum likelihood estimates ρ̂, σ̂, ω̂, μ̂, it is shown that as ρ tends to ±1 the efficiency of r* tends to 0, and also that r* has efficiency 1 when ρ = 0.
In a recent paper J. Lev [10] considered point biserial r under the assumption that the N values of the discrete variable are not random but fixed. That is, if X denotes the continuous variable and Z the discrete variable, he considered only samples (X_i, Z_i), (i = 1, 2, ..., N), for which the Z_i have a fixed partition, N₀ of them being 0, and N₁ of them 1. He further assumed the residuals of the X_i determined from the Z_i to be normally distributed. Lev showed that under these restrictions point biserial r is a maximum likelihood estimate of ρ, and that using it is equivalent to using the two-sample t statistic when ρ = 0, and the two-sample non-central t when ρ ≠ 0.

Neither of these coefficients is entirely adequate; in fact, they both leave much to be desired. A great deal of the literature available on the subject of biserial correlation displays the confusion of the early Pearson school in their failure to distinguish between population parameters and sample estimates of these parameters, and to appreciate the significance of the concept of efficiency of an estimate. An attempt has been made in this paper to put the various notions involved on a firmer mathematical footing in the light of modern statistical methods. A discussion of the advantages and disadvantages of the biserial and point biserial coefficients of correlation may be found in the summary of results given in Chapter V.
CHAPTER I

PROPERTIES OF THE BISERIAL AND POINT BISERIAL CORRELATION COEFFICIENTS

1. Notation

The following notation and conventions will be used throughout.

λ(x) stands for (2π)^{-1/2} exp(-x²/2), the probability density of a standard normal variable.

φ(x,y) stands for (2πσ)^{-1}(1-ρ²)^{-1/2} exp{-2^{-1}(1-ρ²)^{-1}(x²/σ² - 2xyρ/σ + y²)}, the probability density of a bivariate normal vector.

N(a,b) denotes a variable which has a normal distribution with mean a and standard deviation b.

V(X) denotes the variance of a random variable X.

f^{-1}(x) will stand for 1/f(x) for any function f which appears.

plim X_N = X denotes a sequence of random variables X_N (N = 1, 2, ...) tending in probability to a random variable X as N tends to ∞.

dlim X_N = X denotes a sequence of random variables X_N (N = 1, 2, ...) tending in distribution to a random variable X as N tends to ∞. That is to say, if we define F_N(x) = P(X_N ≤ x) and F(x) = P(X ≤ x), then F_N(x) tends to F(x) at every continuity point of F(x).

Greek letters will designate parameters. Lemmas will be assigned Arabic numerals. Theorems will be assigned Roman numerals. Formulae will be assigned Arabic numerals with the number of the chapter appearing first. References will be numbered in square brackets, and may be found in the bibliography.
2. Mathematical Model

(X, Y) is a two-dimensional random vector with the frequency distribution φ(x,y). Define

    Z = 1 if Y ≥ ω,    Z = 0 if Y < ω,

where ω is a fixed constant of the Y distribution in the interval -∞ < ω < ∞. Let

    p = P(Y ≥ ω) = ∫_ω^∞ λ(y) dy,
    q = P(Y < ω) = ∫_{-∞}^ω λ(y) dy = 1 - p.

Then Z has a discrete distribution with probabilities q and p at points 0 and 1 respectively.

The biserial correlation coefficient is now defined from the sample to be

(1.1)    r* = s^{-1}(x̄₁ - x̄₀)(m/N)(1 - m/N)λ^{-1}(t),

or, equivalently,

(1.2)    r* = (N^{-1}Σx_iz_i - x̄z̄)λ^{-1}(t)[N^{-1}Σ(x_i - x̄)²]^{-1/2},

where x̄₀ and x̄₁ are the sample means of the two sets of x variates which are associated with z = 0 and z = 1 respectively, s = [N^{-1}Σ(x_i - x̄)²]^{1/2}, m = the number of z_i in the sample which are 1, and t is defined by the relation

(1.3)    m/N = ∫_t^∞ λ(y) dy.
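The defining formula can be put to work directly. The sketch below is an added illustration, not part of the original (which relied on published tables); it computes r* from (1.1), using Python's statistics.NormalDist to invert the relation m/N = ∫_t^∞ λ(y) dy:

```python
import math
from statistics import NormalDist

def biserial_r_star(x, z):
    """Biserial r* per (1.1): r* = s^-1 (x1bar - x0bar)(m/N)(1 - m/N) / lambda(t)."""
    N = len(x)
    m = sum(z)                                   # number of z_i equal to 1
    x0 = [xi for xi, zi in zip(x, z) if zi == 0]
    x1 = [xi for xi, zi in zip(x, z) if zi == 1]
    xbar = sum(x) / N
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / N)   # divisor N, as in the text
    t = NormalDist().inv_cdf(1 - m / N)          # m/N = integral_t^inf lambda(y) dy
    lam_t = math.exp(-t * t / 2) / math.sqrt(2 * math.pi)  # standard normal density lambda(t)
    x1bar = sum(x1) / len(x1)
    x0bar = sum(x0) / len(x0)
    return (x1bar - x0bar) * (m / N) * (1 - m / N) / (s * lam_t)
```

Note that nothing in (1.1) caps the result at 1; even for x = 1, ..., 6 split evenly between the two groups the value exceeds unity, which anticipates Section 4 below.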
3. Karl Pearson's Derivation of r*

The essence of Pearson's method was that he found a relation between the population means under the assumptions of normality underlying the Z distribution, and the existence of a linear regression of X on Y. More precisely, let

    a₁ = E(X | Y ≥ ω),    b₁ = E(Y | Y ≥ ω),
    a₀ = E(X | Y < ω),    b₀ = E(Y | Y < ω).

It can readily be shown that b₁ = p^{-1}λ(ω) and b₀ = -q^{-1}λ(ω). Hence, by the linearity of the regression of X on Y,

    a₁ - a₀ = ρ(σ_x/σ_y)[p^{-1}λ(ω) + q^{-1}λ(ω)].

Simplifying, we have

    ρ = σ_x^{-1}(a₁ - a₀)pq λ^{-1}(ω).

Now when the parameters of this expression are replaced by a certain set of consistent sample estimates, we have (1.1).

This derivation shows that r* is a consistent estimate of ρ, and as such tends to ρ in probability, but there is the matter of efficiency to be considered. If we are to obtain the maximum amount of information from a sample of fixed size N, we must select the particular consistent estimate from the class of all consistent estimates which has minimum variance. The derivation given above is an example of the Method of Consistency, which was thought, by the Karl Pearson school, to tell the whole story. More will be said of the efficiency of r* in Chapter IV.
4. Bounds for Biserial r*

The ordinary product moment r is bounded between -1 and +1. It is evident on slight investigation that r* does not share this property.

Let us consider for a moment that we are dealing with a fixed set of z_i: namely, N₀ of the z_i are 0 and N₁ of them are 1, where N₀ + N₁ = N. Then r* becomes

(1.4)    r* = s^{-1}(x̄₁ - x̄₀)(N₀N₁/N²)λ^{-1}(c),

where c is determined from N₁/N = ∫_c^∞ λ(y) dy. It can now be shown that, for ρ = 0, (1.4) has the two-sample t distribution with N-2 degrees of freedom. This is a transformation of a result by Lev [10], of which more will be said in Section 6 of this chapter. For the case of fixed proportions N₁/N and N₀/N and fixed sample size N, then, it allows us to compute upper and lower bounds for r*, and to make probability statements about these bounds. A nomograph for the computation of r* by Dunlap [4] is given in Fig. 1 at the end of this paper in order to illustrate the wide range of values which can be obtained. Dunlap uses slightly different notation. He transforms expression (1.1) for r* into

(1.5)    r* = s^{-1}(x̄₁ - x̄)(m/N)λ^{-1}(t)

by using the relation x̄ = (m/N)x̄₁ + (1 - m/N)x̄₀, and then considers x̄₁ to be the larger of the two sample means. In this way he obtains only values of r* ≥ 0.

5. Machine Methods for the Computation of r*

There are several mechanical methods available for the computation of r*. Royer [14] devised a method for computing r* for each item in a test, using punched cards. His method requires for each r* the necessary steps for finding x̄ and s, and then a sorting of all cards, separating those which have z = 1, and a tabulation of part of them. Du Bois [3] improved upon Royer's method by reducing the number of steps to a point where only one complete sorting is required and several r* may be computed simultaneously.
6. Point Biserial Correlation

Model: Let X again be a continuous variable, and z_i (i = 1, ..., N) be a fixed point in N-space. N₀ of the z_i are 0, and N₁ of them are 1. We consider a model in which X has mean μ and standard deviation σ, and the variables X_i - μ - a(z_i - z̄) are N(0, σ(1 - ρ²)^{1/2}), where a = ρσN(N₀N₁)^{-1/2}.

Under this model the ordinary product moment r between x and z is called the point biserial coefficient of correlation, and is defined to be

(1.6)    r = s^{-1}(x̄₁ - x̄₀)N^{-1}(N₀N₁)^{1/2}.

The relation between the expressions for the biserial and point biserial coefficients is

    r* = r N^{-1}(N₀N₁)^{1/2}λ^{-1}(c),

where c is again defined by N₁/N = ∫_c^∞ λ(y) dy, as in (1.3).

It can easily be shown that r is a maximum likelihood estimate of ρ. Lev [10], using the model above, showed that the distribution of r is equivalent to that of Student's t when ρ = 0, and a non-central t when ρ ≠ 0. The parameter of the non-central t distribution is δ = N^{1/2}ρ(1 - ρ²)^{-1/2}, which is independent of N₀ and N₁. Summarizing, we can use t = (N-2)^{1/2}r(1 - r²)^{-1/2} to test the hypothesis ρ = 0. When ρ ≠ 0, we can compute confidence limits for δ, and hence for ρ, or test the hypothesis ρ = ρ₀, by making use of Table IV given by Johnson and Welch [8] for the non-central t distribution.

It should be emphasized that the results hold only for the case of fixed numbers N₀ and N₁ in the two groups, the same in all of the samples with which our particular sample is compared.
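Lev's equivalence can be checked numerically. The sketch below is an added illustration, not the author's computation: it evaluates point biserial r by (1.6) and verifies that t = (N-2)^{1/2}r(1-r²)^{-1/2} coincides with the ordinary pooled two-sample t.

```python
import math

def point_biserial_r(x, z):
    """Point biserial r per (1.6): r = s^-1 (x1bar - x0bar) N^-1 (N0*N1)^(1/2)."""
    N = len(x)
    x0 = [xi for xi, zi in zip(x, z) if zi == 0]
    x1 = [xi for xi, zi in zip(x, z) if zi == 1]
    xbar = sum(x) / N
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / N)  # divisor N, as in the text
    return (sum(x1)/len(x1) - sum(x0)/len(x0)) * math.sqrt(len(x0) * len(x1)) / (N * s)

def t_from_r(r, N):
    """Lev's equivalence: t = (N-2)^(1/2) r (1 - r^2)^(-1/2)."""
    return math.sqrt(N - 2) * r / math.sqrt(1 - r * r)

def pooled_two_sample_t(x0, x1):
    """Ordinary two-sample t with pooled variance, for comparison."""
    n0, n1 = len(x0), len(x1)
    m0, m1 = sum(x0) / n0, sum(x1) / n1
    ss = sum((v - m0) ** 2 for v in x0) + sum((v - m1) ** 2 for v in x1)
    sp2 = ss / (n0 + n1 - 2)                      # pooled variance
    return (m1 - m0) / math.sqrt(sp2 * (1 / n0 + 1 / n1))
```

The identity is exact (algebraically, not just asymptotically), provided s in (1.6) is computed with divisor N.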
CHAPTER II

THE ASYMPTOTIC VARIANCE OF r*

H. E. Soper [15] gives the following expression for the asymptotic variance of r* as far as terms of order N^{-1}:

(2.1)    V(r*) = N^{-1}{ρ⁴ - 2.5ρ² + ρ²[pqω²λ^{-2}(ω) + (2p-1)ωλ^{-1}(ω)] + pqλ^{-2}(ω)}.

This chapter will contain an investigation of the critical values of V(r*) considered as a function of p and ρ. We shall first prove two lemmas.
LEMMA 1. If p = ∫_x^∞ λ(y) dy, where λ(y) = (2π)^{-1/2} exp(-y²/2), then

    (1-2p)λ(x) ≥ p(1-p)x    for 0 ≤ x < ∞,
    (1-2p)λ(x) ≤ p(1-p)x    for -∞ < x ≤ 0,

with equality at x = 0, ±∞.

Proof: Define g(x) = (1-2p)λ(x) - p(1-p)x. It can easily be shown that g(x) is an odd function, since p(-x) = q(x), so it will be sufficient to show that g(x) ≥ 0 for x ≥ 0. Taking derivatives with respect to x, we have

(2.2)    g'(x) = 2λ²(x) - p(1-p),
(2.3)    g''(x) = (1-2p)λ(x) - 4xλ²(x).

It is easily shown that

(2.4)    g(x) is continuous, g(0) = g(∞) = 0, g'(0) = (4-π)/4π > 0,

and hence g'(x) > 0 in the neighborhood of x = 0. Setting g'(x) = 0 gives p² - p + 2λ²(x) = 0, which for the interval 0 < x < ∞ reduces to

(2.5)    p = {1 - [1 - 8λ²(x)]^{1/2}}/2.

Substituting the solution of (2.5) in (2.3), we find that

    g''(x) = λ(x){[1 - 8λ²(x)]^{1/2} - 4xλ(x)} = λ(x)K(x).

Now examine K(x). For K(x) > 0 we must have λ²(x)(16x² + 8) < 1, and g(x) has a minimum at the point x; for K(x) < 0 we must have λ²(x)(16x² + 8) > 1, and g(x) has a maximum at the point x.

Now consider the equation K(x) = 0, or, defining μ(x) = λ²(x)(16x² + 8), the equation μ(x) = 1. μ(x) is unimodal with its maximum at x = 2^{-1/2}, with μ(0) = 1.27 and μ(∞) = 0. Therefore, μ(x) = 1 has a single solution x = x₁ in the interval 0 < x < ∞. This solution occurs in the sub-interval 1.3 < x₁ < 1.4.

From the argument given it is clear that if there are any critical values of g(x) in the interval 0 < x < x₁, they must be maxima, while critical values in the interval x₁ < x < ∞ must be minima. A maximum x₂ occurs in the interval .8 < x₂ < .9. Hence, this must be the only maximum in 0 < x < x₁. Since there can be no maximum in the range x₁ < x < ∞, then in view of (2.4) any critical point x₃ which occurs in this range will force g(x₃) to be < 0.

Assume that such a point x₃ exists. Then, substituting from (2.5) in g(x), we have by hypothesis

(2.6)    g(x₃) = λ(x₃){[1 - 8λ²(x₃)]^{1/2} - 2x₃λ(x₃)} < 0,

or λ²(x₃)(4x₃² + 8) > 1, which means that (4x₃² + 8) > 2π exp(x₃²). The least value which x₃ can take is x₁, but x₁ > 1.3. So, substituting x₃ = 1.3 in the above, we have 14.76 > 2π exp(1.69), which is false, and a fortiori false for all x₃ > 1.3. Therefore, there are no critical values of g(x) in the range 0 < x < ∞ except for a maximum at x₂. Combining this with (2.4), we have g(x) > 0 for all x > 0, which proves the lemma.

LEMMA 2. If p = ∫_x^∞ λ(y) dy, where λ(y) = (2π)^{-1/2} exp(-y²/2), then p(1-p) ≥ λ²(x)π/2 for all x, -∞ < x < ∞, with equality at x = 0, ±∞.

Proof: Define F(x) = ∫_{-∞}^x λ(y) dy. Then p(1-p) = (1-F)F, and we wish to show F(x)[1-F(x)] ≥ [exp(-x²)]/4. Let

    h(x) = F(x) - F²(x) - [exp(-x²)]/4.

h(x) is an even function of x, so we need study only the case 0 ≤ x < ∞. Differentiating once produces

    h'(x) = λ(x) - 2F(x)λ(x) + (x/2)exp(-x²),

which with a little simplification becomes

(2.7)    h'(x) = λ(x)G(x),    G(x) = 1 - 2F(x) + x(π/2)^{1/2}exp(-x²/2).

It is easily shown that

(2.8)    h(x) is continuous, h(0) = h(∞) = 0, h'(0) = 0.

By the law of the mean for integrals F(x) ≤ (1/2) + x(2π)^{-1/2}. Hence, substituting in (2.7), we obtain

(2.9)    h'(x) ≥ λ(x){1 - 2[x(2π)^{-1/2} + 1/2] + x(π/2)^{1/2}exp(-x²/2)}
              = xλ(x){(π/2)^{1/2}exp(-x²/2) - (2/π)^{1/2}}.

From (2.9) it is clear that in some neighborhood of x = 0 we have

(2.10)    h'(x) > 0.

Now examine the function G(x).

(2.11)    G'(x) = -2λ(x) + (π/2)^{1/2}(1 - x²)exp(-x²/2) = λ(x){π(1-x²) - 2}.

In the interval 0 ≤ x < ∞, G'(x) = 0 has the unique solution x₁ = (1 - 2/π)^{1/2}, with G'(x) > 0 in 0 ≤ x < x₁. G(0) = 0, G(x₁) > 0, and G(x) steadily decreases to G(∞) = -1. Therefore, G(x) = 0 has only one solution x₂ in 0 < x < ∞. From (2.7) we see that h'(x) = λ(x)G(x). Hence, h'(x) = 0 has the unique solution x₂ in 0 < x < ∞. But from (2.10) h'(x) > 0 in some neighborhood about 0. In the light of these facts and (2.8) it is evident that h(x) ≥ 0 for all x in 0 < x < ∞; otherwise, h(x) would have at least two critical points in the interval. This proves the lemma.
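Both inequalities can be spot-checked numerically. The following sketch is an added illustration, not part of the original; it uses Python's statistics.NormalDist for the normal integral:

```python
import math
from statistics import NormalDist

nd = NormalDist()

def lam(x):
    """Standard normal density lambda(x)."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def p_of(x):
    """p = integral_x^inf lambda(y) dy, as in Lemmas 1 and 2."""
    return 1.0 - nd.cdf(x)

for k in range(1, 400):                  # grid over 0 < x < 4
    x = k * 0.01
    p = p_of(x)
    assert (1 - 2 * p) * lam(x) >= p * (1 - p) * x      # Lemma 1, for x > 0
    assert p * (1 - p) >= lam(x) ** 2 * math.pi / 2     # Lemma 2
```

The margins become narrow near the tails (Lemma 1) and near x = 0 (Lemma 2), which matches the equality cases x = 0, ±∞.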
THEOREM I. V(r*) has an absolute minimum for any fixed ρ at p = 1/2.

Proof in three parts:

PART 1: It is easily shown that V(r*) has a relative minimum at p = 1/2 for any fixed ρ. Indeed,

(2.12)    V(r*) = N^{-1}{ρ⁴ - 2.5ρ² + ρ²A + B},    where
          A = pqω²λ^{-2}(ω) + (2p-1)ωλ^{-1}(ω),    B = pqλ^{-2}(ω).

Recalling that dω/dp = -λ^{-1}(ω) and dλ(ω)/dp = ω, two relations which are obvious from the model given in Chapter I, Section 2, we get

    dA/dp = λ^{-3}(ω){(1-2p)λ(ω)(2ω² + 1) + 2ωλ²(ω) - 2pqω(ω² + 1)},
    dB/dp = λ^{-3}(ω){λ(ω)(1-2p) - 2pqω},

and hence

    dV/dp = ρ²N^{-1}λ^{-3}(ω){(1-2p)λ(ω)(2ω² + 1) + 2ωλ²(ω) - 2pqω(ω² + 1)}
            + N^{-1}λ^{-3}(ω){λ(ω)(1-2p) - 2pqω}.

dV/dp = 0 has the solution p = 1/2, or ω = 0. Differentiating V again, and setting ω = 0, p = 1/2, λ(ω) = (2π)^{-1/2}, one finds

    d²V/dp² |_{ω=0} = 2πN^{-1}{ρ²(π-4) + π - 2}.

It is easily seen that

    0 < 4π(π-3)N^{-1} ≤ d²V/dp² |_{ω=0} ≤ 2π(π-2)N^{-1}.

Therefore, p = 1/2, or ω = 0, gives a relative minimum of V(r*) for any fixed ρ.

PART 2: V(r*) has an absolute maximum at ρ = 0 for any fixed p. Indeed,

    dV/dρ = N^{-1}ρ(4ρ² - 5 + 2A),

which has solutions ρ = 0 or ρ = ±(5 - 2A)^{1/2}/2, and

    d²V/dρ² = N^{-1}(12ρ² - 5 + 2A).

Hence, ρ = 0 will give an absolute maximum providing 2A - 5 < 0. Recall from (2.12) that

    λ²(ω)A = pqω² + (2p-1)ωλ(ω).

The condition for A ≤ 0 is then

    (1-2p)λ(ω) - pqω ≥ 0    for p ≤ 1/2,
    (1-2p)λ(ω) - pqω ≤ 0    for p ≥ 1/2.

Hence, from Lemma 1 it immediately follows that A ≤ 0, and therefore that V(r*) has an absolute maximum at ρ = 0 for any fixed p.

PART 3: V(r* | ρ = 0) has an absolute minimum at p = 1/2. By Lemma 2, pqλ^{-2}(ω) ≥ π/2 with equality at ω = 0, that is, at p = 1/2; and V(r* | ρ = 0) = N^{-1}pqλ^{-2}(ω).

Combining parts 1 to 3, we see that the absolute minimum over p of V(r*) for any fixed ρ occurs at p = 1/2, which a fortiori proves the theorem.

The points brought up in this chapter are adequately illustrated in Table I, which gives the asymptotic standard deviation of r* as a function of p and ρ.
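Theorem I is easy to visualize numerically. The sketch below is an added illustration, not the author's table: it evaluates the asymptotic variance V(r*) = N^{-1}{ρ⁴ - 2.5ρ² + ρ²A + B} of (2.12) on a grid of p for fixed ρ, and the minimum falls at p = 1/2.

```python
import math
from statistics import NormalDist

def soper_var(rho, p, N):
    """Asymptotic variance (2.12): V(r*) = N^-1 (rho^4 - 2.5 rho^2 + rho^2 A + B)."""
    q = 1.0 - p
    omega = NormalDist().inv_cdf(q)      # p = P(Y >= omega), so omega = Phi^-1(1 - p)
    lam = math.exp(-omega * omega / 2) / math.sqrt(2 * math.pi)
    A = p * q * omega ** 2 / lam ** 2 + (2 * p - 1) * omega / lam
    B = p * q / lam ** 2
    return (rho ** 4 - 2.5 * rho ** 2 + rho ** 2 * A + B) / N
```

For ρ = 0 and p = 1/2 (i.e. ω = 0) this reduces to π/(2N), the value used in Part 3 of the proof.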
CHAPTER III

A SPECIAL CASE OF THE LIMITING DISTRIBUTION OF r*

We shall need the following three theorems.

THEOREM A [13]: Let (V₁N, ..., V_kN) and (V₁, ..., V_k) be random vectors, where N = 1, 2, .... If

(a) dlim (V₁N, ..., V_kN) = (V₁, ..., V_k),
(b) b₁, b₂, ... is a sequence of real positive numbers such that b_N → 0 as N → ∞,
(c) H(v₁, v₂, ..., v_k) is a function of the real variables v₁, ..., v_k which has a total differential at the point (0, ..., 0), and
(d) H_i = ∂H/∂v_i at the point (0, ..., 0),

then, if W_N = b_N^{-1}[H(b_NV₁N, ..., b_NV_kN) - H(0, ..., 0)],

    dlim W_N = H₁V₁ + ... + H_kV_k.

THEOREM B [1]: Let Z_N = X_N + Y_N, where X_N, Y_N, Z_N are sequences of random variables. If

(a) dlim X_N = X,
(b) plim Y_N = c, where c is a constant,

then dlim Z_N = X + c.

THEOREM C [1]: Let Z_N = X_N Y_N, where X_N, Y_N, Z_N are sequences of random variables. If

(a) dlim X_N = X,
(b) plim Y_N = c, where c is a constant,

then dlim Z_N = cX.

As a corollary to Theorem C, we have for c = 0 the result plim Z_N = 0.

Note: No condition of independence is required in Theorems B and C.

LEMMA 3: If

    m/N = ∫_t^∞ λ(y) dy,    p = ∫_ω^∞ λ(y) dy,

where m is the number of successes in N trials with the probability of success, p, a constant for each trial, then the necessary and sufficient condition that

    plim N^{1/2}[λ(ω) - λ(t)] = 0

is ω = 0.

Proof:

    (m/N) - p = ∫_t^ω λ(y) dy,

which by the law of the mean for integrals gives

(3.1)    (m/N) - p = -(t - ω)λ(ξ),    ξ between ω and t.

Now expand λ(t) in a Taylor series about the point ω:

(3.2)    λ(t) = λ(ω) + λ'(ω)(t - ω) + o(t - ω).

m is a binomial variable with parameters N and p, so N^{1/2}[(m/N) - p] has a limiting normal law, and by (3.1) so has N^{1/2}(t - ω). We can write (3.2) in the form

    N^{1/2}[λ(ω) - λ(t)] = -λ'(ω)N^{1/2}(t - ω) - N^{1/2}o(t - ω).

Now as N → ∞, the second term on the right tends in probability to 0 by the corollary to Theorem C, since N^{1/2}(t - ω) has a limiting law. From this result and Theorem B it can be seen that the corollary to Theorem C gives a necessary and sufficient condition in this case: namely,

    plim N^{1/2}[λ(ω) - λ(t)] = 0

if and only if λ'(ω) = 0. But ω is a fixed finite number and λ'(ω) = -ωλ(ω), so ω must be 0. This proves the lemma.
1. Derivation of the Distribution

We start with expression (1.2) for the biserial correlation coefficient,

    r* = (N^{-1}Σx_iz_i - x̄z̄)λ^{-1}(t)[N^{-1}Σ(x_i - x̄)²]^{-1/2}.

This expression may be written in the form

(3.4)    r* = [(xz)‾ - x̄z̄][(x²)‾ - x̄²]^{-1/2}λ^{-1}(t),

where the bar denotes a sample mean. Thus r* is essentially a function of sample means. It is evident that r* is invariant under a transformation of the form y = x/σ. Hence, in the model for r* put σ = 1.

For our purposes we will need essentially a reformulation of Theorem A. Let (T₁ₐ, T₂ₐ, T₃ₐ, T₄ₐ) (a = 1, 2, ..., N) be a set of independent, identically distributed random vectors with finite covariances σ_ij. Now define

(a)    T_iN = T̄_i - ET̄_i    (i = 1, ..., 4),
(b)    b_N = N^{-1/2},
(c)    H(v₁, ..., v₄) = (v₁ - v₂v₃)(v₄ - v₂²)^{-1/2},
(d)    V_iN = b_N^{-1}T_iN.

By the generalization of the Lindeberg-Lévy form of the Central Limit Theorem to the case of random vectors, we have

(3.5)    dlim (V₁N, ..., V₄N) = (V₁, ..., V₄),

where (V₁, ..., V₄) is a normal random vector with mean (0, ..., 0) and covariance matrix (σ_ij), i, j = 1, ..., 4. We have now satisfied the conditions of Theorem A, and so

(3.6)    dlim W_N = dlim N^{1/2}{H(T₁N, ..., T₄N) - H(0, ..., 0)} = N(0, {Σ_iΣ_j H_iH_jσ_ij}^{1/2}),

where H_i = ∂H/∂v_i at the point (0, ..., 0).

Now in order actually to find the variance of the limiting normal distribution, we adopt a mechanical technique due to P. L. Hsu which amounts to the same thing as using the form of Theorem A above, but which saves labor. What Theorem A actually tells us is to use a two-term Taylor expansion of the function of sample means about their expected values.

Define variables of the form T' = N^{1/2}(T - ET). Then,

(3.7)    z̄ = N^{-1/2}z' + p,    (xz)‾ = N^{-1/2}(xz)' + EXZ,
         (x²)‾ = N^{-1/2}(x²)' + 1,    x̄ = N^{-1/2}x'.

Upon substituting expressions (3.7) in (3.4) we obtain

(3.8)    r* = [N^{-1/2}(xz)' + EXZ - (N^{-1/2}x')(N^{-1/2}z' + p)]
              [N^{-1/2}(x²)' + 1 - (N^{-1/2}x')²]^{-1/2}λ^{-1}(t).

Now, the efficiency of Hsu's technique comes in combining the terms in (3.8), removing those which are o(N^{-1/2}). Using a two-term negative binomial expansion for the second factor, we get

(3.9)    r* = (EXZ)λ^{-1}(t) + N^{-1/2}λ^{-1}(t)[(xz)' - px' - (EXZ)(x²)'/2] + o(N^{-1/2}).

Now by Theorem C, N^{1/2}[r* - (EXZ)λ^{-1}(t)] will have the same limiting law as

(3.10)    λ^{-1}(ω)[(xz)' - px' - (EXZ)(x²)'/2].

Applying Theorem A, we have the result

(3.11)    dlim N^{1/2}[r* - (EXZ)λ^{-1}(t)] = N(0, λ^{-1}(ω){V(XZ - pX - (EXZ)X²/2)}^{1/2}).

Unfortunately, this is not quite satisfactory, since what we desire is the limiting distribution of

    N^{1/2}[r* - (EXZ)λ^{-1}(ω)] = N^{1/2}(r* - ρ).

It can easily be seen that the two expressions are not the same. Indeed,

(3.12)    N^{1/2}[r* - (EXZ)λ^{-1}(ω)] = N^{1/2}[r* - (EXZ)λ^{-1}(t)]
              + (EXZ)λ^{-1}(t)λ^{-1}(ω)N^{1/2}[λ(ω) - λ(t)].

The two terms on the right are dependent. However, if plim (second term on the right) = 0, we can easily obtain our limiting distribution. Accordingly, we employ Lemma 3 in order to obtain the distribution for the case ω = 0, or p = 1/2.

We must now evaluate V(XZ - pX - (EXZ)X²/2). First make the following definition:

(3.13)    a_mk = E(X^mZ^k) = ∫_{-∞}^∞ ∫_ω^∞ x^m φ(x,y) dy dx    (k ≥ 1).

Make the transformation (x - ρy)(1 - ρ²)^{-1/2} = v, y = u. This yields

(3.14)    a_mk = (2π)^{-1} ∫_ω^∞ ∫_{-∞}^∞ [v(1 - ρ²)^{1/2} + ρu]^m exp[-(u² + v²)/2] dv du.

Define b_k = ∫_ω^∞ (2π)^{-1/2}u^k exp(-u²/2) du. Integration of b_k by parts gives the recursion relationship

    b_k = ω^{k-1}λ(ω) + (k - 1)b_{k-2}.

The a_mk may now be written down. Note that a_mk is independent of k for k ≥ 1. Making use of the facts that b₀ = p, b₁ = λ(ω), and ω = 0, we compute from (3.13) the moments which enter the variance, and find

(3.15)    V(XZ - pX - (EXZ)X²/2) = λ²(ω){ρ⁴ - 2.5ρ² + π/2}    (ω = 0).

Hence, from (3.15),

(3.16)    dlim N^{1/2}(r* - ρ) = N(0, {ρ⁴ - 2.5ρ² + π/2}^{1/2}).

Among other things (3.16) provides a check for Soper's result (2.1).
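A small simulation checks (3.16) numerically; this is an added illustration, and the sample sizes and seed are arbitrary choices. At ω = 0 the sample variance of N^{1/2}(r* - ρ) should approach ρ⁴ - 2.5ρ² + π/2 (about 1.008 for ρ = 0.5):

```python
import math
import random
from statistics import NormalDist

def r_star(x, z):
    """Biserial r* via (1.2)."""
    N = len(x)
    xbar = sum(x) / N
    zbar = sum(z) / N
    s = math.sqrt(sum((xi - xbar) ** 2 for xi in x) / N)
    t = NormalDist().inv_cdf(1 - zbar)             # m/N = integral_t^inf lambda(y) dy
    lam_t = math.exp(-t * t / 2) / math.sqrt(2 * math.pi)
    return (sum(xi * zi for xi, zi in zip(x, z)) / N - xbar * zbar) / (s * lam_t)

def simulate(rho, N, reps, seed=1):
    """Sample variance of N^(1/2)(r* - rho) under the bivariate model with cut omega = 0."""
    rng = random.Random(seed)
    stats = []
    for _ in range(reps):
        ys = [rng.gauss(0.0, 1.0) for _ in range(N)]
        xs = [rho * y + math.sqrt(1 - rho * rho) * rng.gauss(0.0, 1.0) for y in ys]
        zs = [1 if y >= 0.0 else 0 for y in ys]    # dichotomize Y at omega = 0
        stats.append(math.sqrt(N) * (r_star(xs, zs) - rho))
    mean = sum(stats) / reps
    return sum((w - mean) ** 2 for w in stats) / (reps - 1)
```

The agreement is only approximate at finite N, so any check should allow a generous Monte Carlo tolerance.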
2. Variance Stabilizing Transformation for the Case ω = 0

It is desirable to find a function f such that

    dlim N^{1/2}[f(r*) - f(ρ)] = N(0,1).

This function f(r*) has several advantages:

(a) Its standard error is practically independent of ρ;
(b) f(r*) will tend to normality faster than r* no matter what value ρ takes on;
(c) The form of the distribution will be nearly the same for all ρ for moderate N, while the distribution of r* for large N will have a variance N^{-1}(ρ⁴ - 2.5ρ² + π/2) which is markedly changed, becoming peaked for ρ near ±1, and flat for ρ near 0.

In connection with this type of transformation see [6], Chapter VI.

Consider a Taylor expansion of f(r*) about ρ as far as terms of order N^{-1/2}:

(3.17)    f(r*) = f(ρ) + f'(ρ)(r* - ρ) + o(N^{-1/2}).

By (3.17), aside from terms O(N^{-1}),

    E[f(r*) - f(ρ)]² = [f'(ρ)]²E(r* - ρ)²,

while E[f(r*) - f(ρ)]² = 1/N by hypothesis. Since E(r* - ρ)² = N^{-1}(ρ⁴ - 2.5ρ² + π/2) by (3.16), equating the two previous results gives a differential equation

(3.18)    f'(ρ) = (ρ⁴ - 2.5ρ² + π/2)^{-1/2},

which can be rewritten as

    f'(ρ) = {(ρ² - 5/4)² + (8π - 25)/16}^{-1/2}.

Let ρ = (5/4)^{1/2} sin θ. Then

    f(ρ) = (4/5)^{1/2} ∫₀^{sin⁻¹[(4/5)^{1/2}ρ]} cos θ (cos⁴θ + A)^{-1/2} dθ,

where A = (8π - 25)/25 ≈ .0053. ρ is in the interval -1 ≤ ρ ≤ 1, so θ is in the interval -64° ≤ θ ≤ 64°, and we obtain a very good approximation by ignoring A. Upon neglecting A, we have

    f(ρ) = (4/5)^{1/2} ∫₀^{sin⁻¹[(4/5)^{1/2}ρ]} sec θ dθ,

which gives the following result:

(3.19)    dlim N^{1/2}{5^{-1/2} log[(1 + (4/5)^{1/2}r*)/(1 - (4/5)^{1/2}r*)] - f(ρ)} = N(0,1).

We can enter Fisher's table [6] of z = tanh^{-1}r for r = (4/5)^{1/2}r*, thereby obtaining the tabulated value, multiplying it by 2·5^{-1/2}, and saving some labor.
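The transformation in (3.19) is easy to apply in code. The sketch below is an added illustration: it implements f via atanh, the same function Fisher's z tables tabulate, and checks that its slope approximately satisfies (3.18). The approximation degrades toward ρ = ±1 because the constant A was neglected.

```python
import math

C = math.sqrt(4 / 5)                     # (4/5)^(1/2)

def f_stab(r):
    """f(r*) = 5^(-1/2) log[(1 + C r*)/(1 - C r*)] = 2*5^(-1/2) atanh(C r*), per (3.19)."""
    return 2 * math.atanh(C * r) / math.sqrt(5)

def target_slope(rho):
    """The slope demanded by (3.18): f'(rho) = (rho^4 - 2.5 rho^2 + pi/2)^(-1/2)."""
    return (rho ** 4 - 2.5 * rho ** 2 + math.pi / 2) ** -0.5

# The exact slope of f_stab is f'(r) = (4/5) / (1 - (4/5) r^2); compare it to (3.18).
for rho in (0.0, 0.3, 0.5):
    slope = 0.8 / (1 - 0.8 * rho * rho)
    assert abs(slope / target_slope(rho) - 1) < 0.01     # within 1% for moderate rho
```

To use Fisher's table instead, read z = tanh^{-1}((4/5)^{1/2}r*) and multiply the tabulated value by 2·5^{-1/2}.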
e
CHAPTER
rr
EFFICIENCY OF THE COEFFICIENT OF BISERIAL CORRELATION
1.
·.~el1minary Matters
From the mathematical model 6iven in Chapter I, Section 2,
w" mow that
x
P(x
~ x,
Z
= 0)
==
x
f J"
¢(x,y)dy dx, and
-00 -00
x 00
P(Xi x, Z
= 1)
=J f
-00
¢(x,y)dy dx.
(l)
Therefore, the probability element of the sample (xl' zl)"'·'
~, zN) ma:y be written
The Method of Maximum Likelihood will now be invoked in order to give us estimates ρ̂, ω̂, σ̂ of the population parameters ρ, ω, σ, respectively. These estimates have several nice properties. They are consistent, tend in distribution to normality as the sample size increases, have minimum variance in the limit at least, and provide sufficient statistics where any exist.
The Method of Maximum Likelihood was first introduced by Fisher in 1921 [5], and was substantially improved upon by the same author in 1925 [7]. These papers are both described along with other pertinent material by Kendall in Chapter XVII of his book [9]. For a more rigorous treatment the reader is referred to Chapters XXXII and XXXIII of Cramér [1].
The likelihood function of a sample (x₁, z₁), ..., (x_N, z_N) is defined to be

(4.2)    L = ∏_{i=1}^{N} f(x_i, z_i).

It will, as usual, be convenient to take logarithms and maximize log L.
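For illustration, the maximization of log L can be carried out numerically. The sketch below assumes the model of (4.1) with X having zero mean and standard deviation σ; it factors f(x,z) into the marginal density of x times the conditional probability of z given x, which is algebraically equivalent to (4.1). The data and starting values are hypothetical.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

def neg_log_L(params, x, z):
    # log f(x, z) = log phi(x/sigma) - log sigma
    #             + (1 - z) log Phi(t) + z log[1 - Phi(t)],
    # with t = (omega - rho * x/sigma) / sqrt(1 - rho^2).
    rho, omega, sigma = params
    u = x / sigma
    t = (omega - rho * u) / np.sqrt(1.0 - rho ** 2)
    log_f = (norm.logpdf(u) - np.log(sigma)
             + (1 - z) * norm.logcdf(t) + z * norm.logsf(t))
    return -np.sum(log_f)

# hypothetical sample from the model with rho = 0.6, omega = 0.2, sigma = 1
rng = np.random.default_rng(0)
y = rng.standard_normal(500)
x = 0.6 * y + np.sqrt(1.0 - 0.36) * rng.standard_normal(500)
z = (y >= 0.2).astype(float)
fit = minimize(neg_log_L, x0=[0.0, 0.0, 1.0], args=(x, z),
               bounds=[(-0.99, 0.99), (-3.0, 3.0), (0.1, 10.0)])
rho_hat, omega_hat, sigma_hat = fit.x
```

A bounded quasi-Newton search replaces the intractable closed-form solution of the likelihood equations discussed below.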
Because of the complexities inherent in the calculations of first and second partial derivatives of log L, the work will have to be done in stages. The following notation will be used:
(4.4)

(i)    ∂f₀/∂θ_j = ∂ log f(x,z)/∂θ_j |_{z=0};

(ii)   ∂f₁/∂θ_j = ∂ log f(x,z)/∂θ_j |_{z=1};

(iii)  ∂²f₀/∂θ_j∂θ_k = ∂² log f(x,z)/∂θ_j∂θ_k |_{z=0};

(iv)   ∂²f₁/∂θ_j∂θ_k = ∂² log f(x,z)/∂θ_j∂θ_k |_{z=1};

where θ_j, θ_k run over the parameters ρ, ω, σ, and f(x,z) is defined by (4.1);

(v)    R(a) = ∫_{−∞}^{a} φ(x,y) dy,   S(a) = ∫_{a}^{∞} φ(x,y) dy;

(vi)   E₀(∂f₀/∂θ_j) = q⁻¹ ∫_{−∞}^{∞} ∫_{−∞}^{ω} (∂f₀/∂θ_j) φ(x,y) dy dx,

the conditional expectation of ∂ log f/∂θ_j given z = 0;

(vii)  E₁(∂f₁/∂θ_j) = p⁻¹ ∫_{−∞}^{∞} ∫_{ω}^{∞} (∂f₁/∂θ_j) φ(x,y) dy dx.

E₀ and E₁ of the second derivatives are defined in the same way.
The fundamental relation to be used is

(4.5)    E(∂² log L/∂θ_i∂θ_j) = N q E₀(∂²f₀/∂θ_i∂θ_j) + N p E₁(∂²f₁/∂θ_i∂θ_j).
The likelihood equations

(4.6)    ∂ log L/∂ρ = 0,   ∂ log L/∂ω = 0,   ∂ log L/∂σ = 0

have the form

(4.7)

Σ_{i=1}^{N} [(1−z_i) ∫_{−∞}^{ω} (∂φ(x_i,y)/∂ρ) dy + z_i ∫_{ω}^{∞} (∂φ(x_i,y)/∂ρ) dy] / [(1−z_i)R(ω) + z_i S(ω)] = 0,

Σ_{i=1}^{N} (1 − 2z_i) φ(x_i, ω) / [(1−z_i)R(ω) + z_i S(ω)] = 0,

Σ_{i=1}^{N} [(1−z_i) ∫_{−∞}^{ω} (∂φ(x_i,y)/∂σ) dy + z_i ∫_{ω}^{∞} (∂φ(x_i,y)/∂σ) dy] / [(1−z_i)R(ω) + z_i S(ω)] = 0.
There is no apparent way to solve these equations; however, they need not be solved in order to determine V(ρ̂), which will in the limit be the minimum variance among all variances of estimates of ρ as defined by our model. V(ρ̂) can be obtained once we have the elements of the information matrix.
2. Determination of the Information Matrix.

By definition the information matrix is the matrix of the quantities −E(∂² log L/∂θ_i∂θ_j), and we have

(4.8)    Λ = (σ_{ij}) = [−E(∂² log L/∂θ_i∂θ_j)]⁻¹,

where Λ is the matrix of covariances of ρ̂, ω̂, σ̂. The derivation of the elements of the information matrix will be given in two parts, each consisting of several cases.
Part I, Case 1: E(∂² log L/∂ρ²).

Using the notation of (4.4) we have

∂²f₀/∂ρ² = { R(ω) ∫_{−∞}^{ω} (∂²φ/∂ρ²) dy − [∫_{−∞}^{ω} (∂φ/∂ρ) dy]² } R⁻²(ω),

E₀(∂²f₀/∂ρ²) = q⁻¹ ∫_{−∞}^{∞} ∫_{−∞}^{ω} φ(x,y)(∂²f₀/∂ρ²) dy dx,

which becomes A − B, where

A = q⁻¹ ∫_{−∞}^{∞} ∫_{−∞}^{ω} (∂²φ/∂ρ²) dy dx,   B = q⁻¹ ∫_{−∞}^{∞} [∫_{−∞}^{ω} (∂φ/∂ρ) dy]² R⁻¹(ω) dx.

The well known results given below as (4.10) are to be used extensively in this chapter. It is important to notice, first of all, that A and B are invariant under a change of scale of the X variable. Accordingly, we can replace x/σ by x, or in other words let

(4.9)    σ = 1.

Putting σ = 1 in φ(x,y), and differentiating, we obtain
(4.10)

∂φ/∂x = −(x − ρy) φ(x,y) (1 − ρ²)⁻¹,

∂φ/∂y = −(y − ρx) φ(x,y) (1 − ρ²)⁻¹,

∂³φ(x,y)/∂x²∂y = −φ(x,y) { 2ρ(x − ρy)/(1 − ρ²)² + [ (x − ρy)²/(1 − ρ²)² − 1/(1 − ρ²) ] (y − ρx)/(1 − ρ²) }.

Substituting in A from (4.10), and letting z = (x − ρω)(1 − ρ²)^(-1/2), we obtain

A = −q⁻¹(2π)⁻¹ exp(−ω²/2)(1 − ρ²)^(-3/2) ∫_{−∞}^{∞} { (z² − 1) ω(1 − ρ²)^(1/2) + 3ρz − ρz³ } exp(−z²/2) dz.

z has the frequency function of a variable which is N(0,1), so A = 0.

Now consider B:

[∫_{−∞}^{ω} (∂φ/∂ρ) dy]² = (x − ρω)² φ²(x,ω) (1 − ρ²)⁻².
Hence

B = q⁻¹ ∫_{−∞}^{∞} (x − ρω)² φ²(x,ω) (1 − ρ²)⁻² R⁻¹(ω) dx.

E₁(∂²f₁/∂ρ²) and E₀(∂²f₀/∂ρ²) are the same except for the limits of integration on y and for q, which is now replaced by p, while R⁻²(ω) is replaced by S⁻²(ω). With the aid of the fundamental relation (4.5), the following may be written:

(4.11)    E(∂² log L/∂ρ²) = −N ∫_{−∞}^{∞} φ²(x,ω) (x − ρω)² (1 − ρ²)⁻² [R⁻¹(ω) + S⁻¹(ω)] dx.
Case 2: E(∂² log L/∂ω²).

Again let σ = 1. We have

∂²f₀/∂ω² = { R(ω)(∂φ(x,ω)/∂ω) − φ²(x,ω) } R⁻²(ω),

so that

E₀(∂²f₀/∂ω²) = q⁻¹ ∫_{−∞}^{∞} (∂φ(x,ω)/∂ω) dx − q⁻¹ ∫_{−∞}^{∞} φ²(x,ω) R⁻¹(ω) dx.

From (4.10), the first term is

−q⁻¹ ∫_{−∞}^{∞} (ω − ρx)(1 − ρ²)⁻¹ φ(x,ω) dx.

Let z = (x − ρω)(1 − ρ²)^(-1/2). Then this becomes

−q⁻¹(2π)⁻¹ exp(−ω²/2) ∫_{−∞}^{∞} [ω − ρz(1 − ρ²)^(-1/2)] exp(−z²/2) dz = −q⁻¹(2π)^(-1/2) ω exp(−ω²/2).

Hence,

E₀(∂²f₀/∂ω²) = −q⁻¹(2π)^(-1/2) ω exp(−ω²/2) − q⁻¹ ∫_{−∞}^{∞} φ²(x,ω) R⁻¹(ω) dx.

Again E₁(∂²f₁/∂ω²) is almost the same as E₀(∂²f₀/∂ω²):

E₁(∂²f₁/∂ω²) = p⁻¹(2π)^(-1/2) ω exp(−ω²/2) − p⁻¹ ∫_{−∞}^{∞} φ²(x,ω) S⁻¹(ω) dx.

From (4.5) we obtain

(4.12)    E(∂² log L/∂ω²) = −N ∫_{−∞}^{∞} φ²(x,ω) [R⁻¹(ω) + S⁻¹(ω)] dx.
Case 3: E(∂² log L/∂ω∂ρ).

We have

∂²f₀/∂ω∂ρ = { R(ω)(∂φ(x,ω)/∂ρ) − φ(x,ω) ∫_{−∞}^{ω} (∂φ/∂ρ) dy } R⁻²(ω),

E₀(∂²f₀/∂ω∂ρ) = q⁻¹ ∫_{−∞}^{∞} (∂φ(x,ω)/∂ρ) dx + q⁻¹ ∫_{−∞}^{∞} φ²(x,ω)(x − ρω)(1 − ρ²)⁻¹ R⁻¹(ω) dx.

We see immediately from (4.10) that the first term vanishes. In a similar way we can obtain

E₁(∂²f₁/∂ω∂ρ) = −p⁻¹ ∫_{−∞}^{∞} (∂φ(x,ω)/∂ρ) dx + p⁻¹ ∫_{−∞}^{∞} φ²(x,ω)(x − ρω)(1 − ρ²)⁻¹ S⁻¹(ω) dx,

in which the first term vanishes. Hence, by (4.5),

(4.13)    E(∂² log L/∂ω∂ρ) = N ∫_{−∞}^{∞} (x − ρω) φ²(x,ω) (1 − ρ²)⁻¹ [R⁻¹(ω) + S⁻¹(ω)] dx.
Case 4: E(∂² log L/∂ω∂σ).

σ must now be retained in φ(x,y). Recalling that ∂f₀/∂ω = φ(x,ω)R⁻¹(ω) and ∂f₁/∂ω = −φ(x,ω)S⁻¹(ω), we find that

E₀(∂²f₀/∂ω∂σ) = q⁻¹ ∫_{−∞}^{∞} (∂φ(x,ω)/∂σ) dx − q⁻¹ ∫_{−∞}^{∞} φ(x,ω) R⁻¹(ω) [∫_{−∞}^{ω} (∂φ/∂σ) dy] dx

and

E₁(∂²f₁/∂ω∂σ) = −p⁻¹ ∫_{−∞}^{∞} (∂φ(x,ω)/∂σ) dx + p⁻¹ ∫_{−∞}^{∞} φ(x,ω) S⁻¹(ω) [∫_{ω}^{∞} (∂φ/∂σ) dy] dx.

Differentiating φ(x,y) with respect to σ gives

(4.14)

∂φ/∂σ = φ(x,y) [ x²(1 − ρ²)⁻¹σ⁻³ − ρxy(1 − ρ²)⁻¹σ⁻² − σ⁻¹ ],

∂²φ/∂σ² = φ(x,y) [ x⁴(1 − ρ²)⁻²σ⁻⁶ − 2ρx³y(1 − ρ²)⁻²σ⁻⁵ + ρ²x²y²(1 − ρ²)⁻²σ⁻⁴ − 5x²(1 − ρ²)⁻¹σ⁻⁴ + 4ρxy(1 − ρ²)⁻¹σ⁻³ + 2σ⁻² ].

The integral ∫_{−∞}^{∞} (∂φ(x,ω)/∂σ) dx is taken by the transformation z = (x − ρωσ)(1 − ρ²)^(-1/2)σ⁻¹ into the form

σ⁻¹(2π)⁻¹ exp(−ω²/2) ∫_{−∞}^{∞} [ z² − 1 + ωρz(1 − ρ²)^(-1/2) ] exp(−z²/2) dz,

which vanishes. Therefore,

(4.15)    E(∂² log L/∂ω∂σ) = N ∫_{−∞}^{∞} φ(x,ω) { −R⁻¹(ω) ∫_{−∞}^{ω} (∂φ/∂σ) dy + S⁻¹(ω) ∫_{ω}^{∞} (∂φ/∂σ) dy } dx.
Case 5: E(∂² log L/∂ρ∂σ).

Let Γ = (x − ρωσ) φ(x,ω) σ⁻² (1 − ρ²)⁻¹. Then

∂²f₀/∂ρ∂σ = { −R(ω)(∂Γ/∂σ) + Γ ∫_{−∞}^{ω} (∂φ/∂σ) dy } R⁻²(ω),

∂²f₁/∂ρ∂σ = { S(ω)(∂Γ/∂σ) − Γ ∫_{ω}^{∞} (∂φ/∂σ) dy } S⁻²(ω).

Using (4.5) and substituting for Γ, we obtain

(4.16)    E(∂² log L/∂ρ∂σ) = N ∫_{−∞}^{∞} (x − ρωσ) σ⁻² (1 − ρ²)⁻¹ φ(x,ω) { R⁻¹(ω) ∫_{−∞}^{ω} (∂φ/∂σ) dy − S⁻¹(ω) ∫_{ω}^{∞} (∂φ/∂σ) dy } dx.
Case 6: E(∂² log L/∂σ²).

We have

∂²f₀/∂σ² = { R(ω) ∫_{−∞}^{ω} (∂²φ/∂σ²) dy − [∫_{−∞}^{ω} (∂φ/∂σ) dy]² } R⁻²(ω),

∂²f₁/∂σ² = { S(ω) ∫_{ω}^{∞} (∂²φ/∂σ²) dy − [∫_{ω}^{∞} (∂φ/∂σ) dy]² } S⁻²(ω).

It may be shown after some tedious algebra that

Nq · q⁻¹ ∫_{−∞}^{∞} ∫_{−∞}^{ω} (∂²φ/∂σ²) dy dx + Np · p⁻¹ ∫_{−∞}^{∞} ∫_{ω}^{∞} (∂²φ/∂σ²) dy dx = N ∫_{−∞}^{∞} ∫_{−∞}^{∞} (∂²φ/∂σ²) dy dx = 0.

Therefore,

(4.17)    E(∂² log L/∂σ²) = −N ∫_{−∞}^{∞} { R⁻¹(ω) [∫_{−∞}^{ω} (∂φ/∂σ) dy]² + S⁻¹(ω) [∫_{ω}^{∞} (∂φ/∂σ) dy]² } dx.
Part II.

The integral expressions for the elements of the information matrix appear to be rather formidable. After forming the matrix, we will want its inverse; in particular, we want the element V(ρ̂) in its inverse. In order to examine the efficiency of r* as ρ tends to 1, ρ must be allowed to tend to 1 in the final expression for V(ρ̂). Here bad complications enter, since it can be determined that none of the expected values (4.11), (4.12), (4.13), (4.15), (4.16) exists when ρ tends to 1. They all behave satisfactorily over the whole x range, −∞ < x < ∞, with the exception of the point x = ω.

The difficulty can be overcome in the following way. If we can find a transformation such that each integral is expressible as a product C(1 ∓ ρ)^(-k/2) H, where C is a function of ρ, ω which remains finite as ρ tends to ±1, k is a positive integer, and H is an integral which exists for all ρ, then we can invert the information matrix and see what happens to the element V(ρ̂) as ρ tends to 1. Obviously, lim_{ρ→1} V(ρ̂) will be a finite constant ≥ 0, since the efficiency of a statistic must lie between 0 and 1, and we already know that V(r*) is finite from Chapter II.
Now make the following definitions:

(4.18)    M = φ²(x,ω) { R⁻¹(ω) + S⁻¹(ω) },    G(u) = ∫_{u}^{∞} exp(−t²/2) dt.

Let σ = 1 and make the transformation t = (ω − ρx)(1 − ρ²)^(-1/2) in the integrals of M. Then we can write

(4.19)    M = (2π)⁻¹(1 − ρ²)⁻¹ exp{ −(1 − ρ²)⁻¹(x² − 2ρxω + ω²) } exp(x²/2) { G⁻¹[(ρx − ω)(1 − ρ²)^(-1/2)] + G⁻¹[(ω − ρx)(1 − ρ²)^(-1/2)] }.
The transformation mentioned at the beginning of Part II will be that transformation which changes the product of the two exponential expressions in (4.19) to the form exp(−z²/2)·(a function of ρ, ω). The exponential can be put in the form

exp{ −[(1 + ρ²)/2(1 − ρ²)] [x − 2ρω(1 + ρ²)⁻¹]² − ω²(1 + ρ²)⁻¹ },

whence it is seen that the desired transformation is

(4.20)    T₁:  z = (1 + ρ²)^(1/2)(1 − ρ²)^(-1/2) [x − 2ρω(1 + ρ²)⁻¹],

which produces

x = (1 − ρ²)^(1/2)(1 + ρ²)^(-1/2) z + 2ρω(1 + ρ²)⁻¹,

(ρx − ω)(1 − ρ²)^(-1/2) = ρz(1 + ρ²)^(-1/2) − ω(1 − ρ²)^(1/2)(1 + ρ²)⁻¹ = z₁,

and dx = (1 − ρ²)^(1/2)(1 + ρ²)^(-1/2) dz.

Upon performing these substitutions in (4.19), we obtain

(4.21)    M = (2π)⁻¹(1 − ρ²)^(-1/2)(1 + ρ²)^(-1/2) exp(−ω²/(1 + ρ²)) exp(−z²/2) { G⁻¹(z₁) + G⁻¹(−z₁) }.
Case 1: Consider integrals of the form

I_k = ∫_{−∞}^{∞} M (x − ρω)^k (1 − ρ²)^(-k) dx.

This is the form of (4.11), (4.12), (4.13). Upon applying (4.21) and using the transform of (x − ρω)^k given in (4.20) on I_k, one obtains immediately

(4.22)    I_k = (2π)⁻¹(1 − ρ²)^(-(k+1)/2)(1 + ρ²)^(-(k+1)/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z + ρω(1 − ρ²)^(1/2)(1 + ρ²)^(-1/2)]^k exp(−z²/2) { G⁻¹(z₁) + G⁻¹(−z₁) } dz.

It is easy to establish the fact that the integral in I_k exists for all finite k ≥ 0, even if ρ tends to ±1. z₁ tends to z(2)^(-1/2) as ρ → 1 and to −z(2)^(-1/2) as ρ → −1, while the balance of the integrand tends to z^k exp(−z²/2). It is easily shown by de l'Hospital's rule that

∫_{−∞}^{∞} z^k exp(−z²/2) { G⁻¹(λz) + G⁻¹(−λz) } dz

must exist for all |λ| < 1. In our case λ = 2^(-1/2), so we have existence. The factor multiplying the integral of course becomes infinite like (1 ∓ ρ)^(-(k+1)/2) as ρ → ±1. Note that the integrand in I_k will, in the limit, be an odd function of z if k is odd, and an even function if k is even.
Expression (4.22) then offers us a partial evaluation of three of the elements of the information matrix:

(4.23)    E(∂² log L/∂ρ²) = −N I₂,   E(∂² log L/∂ω²) = −N I₀,   E(∂² log L/∂ρ∂ω) = N I₁.
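As a numerical check on these forms, the integrals I_k can be evaluated directly from their definition by quadrature (with σ = 1). The routine below is a sketch; at ρ = 0 it reproduces the closed form I₀ = (2πpq)⁻¹ exp(−ω²) implicit in (4.32) later in this chapter.

```python
import numpy as np
from scipy.stats import norm
from scipy import integrate

def I_k(k, rho, omega):
    # I_k = integral of M * (x - rho*omega)^k * (1 - rho^2)^(-k) dx, where
    # M = phi^2(x, omega) * [1/R(omega) + 1/S(omega)]  (sigma = 1).
    v = 1.0 - rho ** 2
    def integrand(x):
        # standardized bivariate normal density at (x, omega)
        phi2 = np.exp(-(x * x - 2 * rho * x * omega + omega ** 2) / (2 * v)) \
               / (2 * np.pi * np.sqrt(v))
        c = norm.cdf((omega - rho * x) / np.sqrt(v))   # P(Y < omega | X = x)
        R = norm.pdf(x) * c
        S = norm.pdf(x) * (1.0 - c)
        return phi2 ** 2 * (1.0 / R + 1.0 / S) * ((x - rho * omega) / v) ** k
    val, _ = integrate.quad(integrand, -8.0, 8.0)
    return val

# check against the closed form at rho = 0
omega = 0.5
p = norm.sf(omega)
q = 1.0 - p
exact_I0 = np.exp(-omega ** 2) / (2.0 * np.pi * p * q)
```

Direct quadrature avoids the T₁ change of variable entirely, which makes it a useful independent check on (4.22).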
Case 2: E(∂² log L/∂ω∂σ).

Recall expressions (4.14) and (4.15). Upon substituting (∂φ/∂σ) from (4.14) in (4.15), we have some cancellation, and (4.15) becomes

(4.24)    Nρσ⁻²(1 − ρ²)⁻¹ ∫_{−∞}^{∞} x φ(x,ω) { R⁻¹(ω) ∫_{−∞}^{ω} yφ(x,y) dy − S⁻¹(ω) ∫_{ω}^{∞} yφ(x,y) dy } dx.

Now, when we replace x/σ by x, we have the same expression except that σ⁻² is replaced by σ⁻¹, and the rest of the integrand contains no σ. Performing the transformation t = (y − ρx)(1 − ρ²)^(-1/2) on the expression in brackets, with w = (ω − ρx)(1 − ρ²)^(-1/2), we obtain

Nρσ⁻¹(1 − ρ²)⁻¹ ∫_{−∞}^{∞} x φ(x,ω) { [∫_{−∞}^{w} (t(1 − ρ²)^(1/2) + ρx) e^(−t²/2) dt] / [∫_{−∞}^{w} e^(−t²/2) dt] − [∫_{w}^{∞} (t(1 − ρ²)^(1/2) + ρx) e^(−t²/2) dt] / [∫_{w}^{∞} e^(−t²/2) dt] } dx.

After we cancel the right members (the terms ρx) in the inner integrands and revise the limits on the first inner integrals, and then perform the integration in the numerators of the bracketed expression, we get

(4.25)    −ρNσ⁻¹(1 − ρ²)^(-1/2) ∫_{−∞}^{∞} x φ(x,ω) exp[ −2⁻¹(1 − ρ²)⁻¹(ω − ρx)² ] { G⁻¹[(ρx − ω)(1 − ρ²)^(-1/2)] + G⁻¹[(ω − ρx)(1 − ρ²)^(-1/2)] } dx.

By comparing (4.25) with expression (4.19) for M, one can quickly determine that (4.25) may be written as

−ρNσ⁻¹ ∫_{−∞}^{∞} x M dx.

Now, applying T₁ as defined in (4.20), we arrive at an expression similar to (4.22) in exactly the same way as before. The result is

(4.26)    E(∂² log L/∂ω∂σ) = −Nρ[2πσ(1 + ρ²)]⁻¹(1 − ρ²)^(-1/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z(1 − ρ²)^(1/2) + 2ρω(1 + ρ²)^(-1/2)] exp(−z²/2) { G⁻¹(z₁) + G⁻¹(−z₁) } dz.
Case 3: E(∂² log L/∂ρ∂σ).

This case can be dispatched with rather quickly, since expression (4.15) for E(∂² log L/∂ω∂σ) and expression (4.16) for E(∂² log L/∂ρ∂σ) differ only in the factor −(x − ρωσ)σ⁻²(1 − ρ²)⁻¹ which is present in the integrand of (4.16). After replacing x/σ by x, we can immediately give the answer, using the transform of (x − ρω) given in (4.20):

(4.27)    E(∂² log L/∂ρ∂σ) = −Nρ[2πσ(1 − ρ)(1 + ρ)]⁻¹(1 + ρ²)^(-3/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z(1 − ρ²)^(1/2) + 2ρω(1 + ρ²)^(-1/2)] [z + ρω(1 − ρ²)^(1/2)(1 + ρ²)^(-1/2)] exp(−z²/2) { G⁻¹(z₁) + G⁻¹(−z₁) } dz.
Case 4: E(∂² log L/∂σ²).

Recall expression (4.17). The simplest way to treat this integral is to write it in the form

−N ∫_{−∞}^{∞} { R(ω) [R⁻¹(ω) ∫_{−∞}^{ω} (∂φ/∂σ) dy]² + S(ω) [S⁻¹(ω) ∫_{ω}^{∞} (∂φ/∂σ) dy]² } dx.

After substituting the value of ∂φ/∂σ from (4.14) and replacing x/σ by x, we are left with expressions which contain σ only as a factor. The result may be written as a sum (A) + (B) + (C), in which (A) contains the squared integrals [∫ yφ(x,y) dy]² taken over (−∞, ω) and (ω, ∞), (B) the corresponding cross products, and (C) the remaining terms.

Consider (A). This is of a slightly different type from those which we dealt with before. This time the correct transformation is

(4.28)    T₂:  z = (2 − ρ²)^(1/2)(1 − ρ²)^(-1/2) [x − ρω(2 − ρ²)⁻¹],

which produces

x = (1 − ρ²)^(1/2)(2 − ρ²)^(-1/2) z + ρω(2 − ρ²)⁻¹,

(ρx − ω)(1 − ρ²)^(-1/2) = ρz(2 − ρ²)^(-1/2) − 2ω(1 − ρ²)^(1/2)(2 − ρ²)⁻¹,

and dx = (1 − ρ²)^(1/2)(2 − ρ²)^(-1/2) dz.

Consider (B). In ∫_{−∞}^{∞} yφ(x,y) dy make the transformation t = (y − ρx)(1 − ρ²)^(-1/2), and then apply T₂ in exactly the same way as before; (B) then reduces to integrals of the types already considered, with the factor (2π)⁻¹ exp(−ω²(2 − ρ²)⁻¹) appearing.

Finally, consider (C). Since there are essentially no new ideas involved, and in view of the fact that this calculation is much longer than the others, a sketch of the method will be given, together with the final results.

(i) Make the transformation t = (y − ρx)(1 − ρ²)^(-1/2) in the numerators of the expression in brackets.

(ii) We have (C) = C₁ + C₂ + C₃.

(iii) C₁ requires transformation T₂.

(iv) C₂ can be shown to vanish, with little effort.

(v) C₃ requires a new transformation of the same type as T₁, T₂, namely T₃:

x = z(1 − ρ²)^(1/2)(2 + ρ²)^(-1/2) + 3ρω(2 + ρ²)⁻¹,

(ρx − ω)(1 − ρ²)^(-1/2) = ρz(2 + ρ²)^(-1/2) − 2ω(1 − ρ²)^(1/2)(2 + ρ²)⁻¹ = z₂.

The value of (C) will be included in the Table of Expected Values with the others.
Table of Expected Values.

We will define c_{ij} to represent all of the expected values E(∂² log L/∂θ_i∂θ_j) except the powers of (1 − ρ), (1 + ρ), and σ. i, j will run from 1 to 3; ρ̂ will correspond to 1, ω̂ to 2, and σ̂ to 3. Note that all c_{ij} must exist by the same argument as that used in (4.22) et seq.

E(∂² log L/∂ρ²) = c₁₁(1 − ρ²)^(-3/2) = −N(2π)⁻¹[(1 − ρ)(1 + ρ)(1 + ρ²)]^(-3/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z + ρω(1 − ρ²)^(1/2)(1 + ρ²)^(-1/2)]² exp(−z²/2) [G⁻¹(z₁) + G⁻¹(−z₁)] dz.

E(∂² log L/∂ρ∂ω) = c₁₂(1 − ρ²)⁻¹ = N[2π(1 − ρ)(1 + ρ)(1 + ρ²)]⁻¹ exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z + ρω(1 − ρ²)^(1/2)(1 + ρ²)^(-1/2)] exp(−z²/2) [G⁻¹(z₁) + G⁻¹(−z₁)] dz.

E(∂² log L/∂ω²) = c₂₂(1 − ρ²)^(-1/2) = −N(2π)⁻¹[(1 − ρ)(1 + ρ)(1 + ρ²)]^(-1/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} exp(−z²/2) [G⁻¹(z₁) + G⁻¹(−z₁)] dz.

E(∂² log L/∂ρ∂σ) = σ⁻¹c₁₃(1 − ρ²)⁻¹ = −Nρ[2πσ(1 − ρ)(1 + ρ)]⁻¹(1 + ρ²)^(-3/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z(1 − ρ²)^(1/2) + 2ρω(1 + ρ²)^(-1/2)] [z + ρω(1 − ρ²)^(1/2)(1 + ρ²)^(-1/2)] exp(−z²/2) [G⁻¹(z₁) + G⁻¹(−z₁)] dz.

E(∂² log L/∂ω∂σ) = σ⁻¹c₂₃(1 − ρ²)^(-1/2) = −Nρ[2πσ(1 + ρ²)]⁻¹(1 − ρ²)^(-1/2) exp(−ω²/(1 + ρ²)) ∫_{−∞}^{∞} [z(1 − ρ²)^(1/2) + 2ρω(1 + ρ²)^(-1/2)] exp(−z²/2) [G⁻¹(z₁) + G⁻¹(−z₁)] dz.

E(∂² log L/∂σ²) = σ⁻²c₃₃(1 − ρ²)⁻², a sum of terms of the following forms:

+2Nρ²[σ(1 − ρ)(1 + ρ)]⁻²(2π)^(-3/2)(2 − ρ²)^(-1/2) exp[−ω²(2 − ρ²)⁻¹] ∫_{−∞}^{∞} { [z(1 − ρ²)^(1/2)(2 − ρ²)^(-1/2) + ρω(2 − ρ²)⁻¹]² − (1 − ρ²) } exp(−z²/2) dz;

−Nρ⁴(2π)^(-3/2)[σ(1 − ρ)(1 + ρ)]⁻²(2 − ρ²)^(-1/2) exp[−ω²(2 − ρ²)⁻¹] multiplied by an integral of the same type;

−Nρ²(2π)⁻¹[σ(1 − ρ)(1 + ρ)]⁻²(2 + ρ²)^(-1/2) exp(−3ω²(2 + ρ²)⁻¹) ∫_{−∞}^{∞} [z(1 − ρ²)(2 + ρ²)^(-1/2) + 3ρω(2 + ρ²)⁻¹] exp(−z²/2) [G⁻¹(z₂) + G⁻¹(−z₂)] dz.

In the above,

G(u) = ∫_{u}^{∞} exp(−t²/2) dt,

z₁ = ρz(1 + ρ²)^(-1/2) − ω(1 − ρ²)^(1/2)(1 + ρ²)⁻¹,   and

z₂ = ρz(2 + ρ²)^(-1/2) − 2ω(1 − ρ²)^(1/2)(2 + ρ²)⁻¹.
Now, recalling the definition of the information matrix, we have its elements in terms of the c_{ij}. It will be notationally convenient for a moment to write D for the reciprocal of the (1,1) element of the inverse of this matrix. We know V(r*) as far as terms of order N⁻¹ from Chapter II. V(ρ̂) as far as terms of order N⁻¹ is given by D⁻¹, and ρ̂ is an estimate of minimum variance. Hence, the efficiency of the coefficient of biserial correlation is

(4.30)    Eff(r*) = V(ρ̂)/V(r*) = [D V(r*)]⁻¹.
3. The Efficiency of r* as ρ → ±1.

All of the work of Chapter IV goes through for ρ → −1 in exactly the same manner as for ρ → +1; in result (4.30) we need only replace ρ by −ρ. Note the following facts:

(i) lim_{ρ→1} c₁₂ = lim_{ρ→1} c₁₃ = 0, because we are integrating an odd function of z over symmetric limits;

(ii) lim_{ρ→1} z₁ = 2^(-1/2) z and lim_{ρ→1} z₂ = 3^(-1/2) z.
Using these facts in (4.30), one obtains

(4.31)    lim_{ρ→±1} (1 ∓ ρ)^(-3/2) Eff(r*),

expressed in terms of pq(ω² + 1), exp(ω²/2), and the limiting integrals of the form ∫_{−∞}^{∞} z² exp(−z²/2)[G⁻¹(2^(-1/2)z) + G⁻¹(−2^(-1/2)z)] dz arising from the c_{ij}. Expression (4.31) is positive for all finite ω, so we may state the following theorem:
THEOREM II. The coefficient of biserial correlation has limiting efficiency 0 for estimating ρ when ρ → +1 or ρ → −1.

4. The Efficiency of r* When ρ = 0.
In order to find the efficiency of r* for ρ = 0 it will be easier to return to (4.11), (4.12), (4.13), (4.15), (4.17). Setting ρ = 0 provides a great simplification, and it may be shown rather directly that

(4.32)    E(∂² log L/∂ρ² ; ρ = 0) = −N(2πpq)⁻¹ exp(−ω²),
          E(∂² log L/∂ω² ; ρ = 0) = −N(2πpq)⁻¹ exp(−ω²).

All mixed partials vanish. Recalling the expression for V(r*) from Chapter II, (2.1), we have from (4.32) and (2.1)

(4.33)    V(ρ̂ ; ρ = 0) = 2πpq N⁻¹ exp(ω²) + O(N⁻²).

Hence, we may state the following theorem:
THEOREM III. The efficiency of the coefficient of biserial correlation for estimating ρ is 1 when ρ = 0. This result was to be expected.
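Theorem III can be checked by simulation. The sketch below computes biserial r* in its usual closed form (resting on the population identity E(X | z=1) − E(X | z=0) = ρφ(ω)/(pq)) and compares the Monte Carlo variance at ρ = 0 with 2πpq exp(ω²)/N from (4.33); the sample size, number of replications, and seed are arbitrary.

```python
import numpy as np
from scipy.stats import norm

def biserial_r(x, z):
    # r* = (mean1 - mean0) * p * q / (phi(omega_hat) * s),
    # with omega_hat chosen so that P(Y < omega_hat) = proportion of z = 0.
    p = z.mean()
    q = 1.0 - p
    omega_hat = norm.ppf(q)
    return (x[z == 1].mean() - x[z == 0].mean()) * p * q \
        / (norm.pdf(omega_hat) * x.std())

rng = np.random.default_rng(1)
N, omega, reps = 400, 0.5, 3000
est = []
for _ in range(reps):
    xs = rng.standard_normal(N)            # rho = 0: X independent of Y
    ys = rng.standard_normal(N)
    est.append(biserial_r(xs, (ys >= omega).astype(int)))
var_mc = np.var(est)
p = norm.sf(omega)
var_th = 2.0 * np.pi * p * (1 - p) * np.exp(omega ** 2) / N   # (4.33)
```

The agreement of the empirical variance with (4.33), together with the information-matrix bound, is what efficiency 1 at ρ = 0 amounts to.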
CHAPTER V

SUMMARY AND INTERPRETATION OF RESULTS

Two coefficients of correlation have been proposed for estimating ρ and for testing the hypothesis ρ = 0 against ρ ≠ 0: biserial r* and point biserial r. The assumption of an underlying bivariate normal distribution for r* is about the simplest of those which allow us to specify, and work with, the joint distribution of the continuous and discrete variables. It has been shown that the assumption of an underlying normal population with the point of dichotomy at the mean produces minimum variance in the limit for r*, and that for this case a simple statistic f(r*), defined in Chapter III, Section 2, analogous to Fisher's z transform of product moment r, can be used for moderate to large N. Graphical and mechanical methods are available, so that the computation of r*, for each item in a test for example, can be carried out rapidly and efficiently. However, r* is not restricted to the interval (−1, +1); in fact, it is unbounded, as is illustrated in Figure 1. Also, while the efficiency of r* has been shown to be 1 for ρ = 0, it tends to 0 as ρ tends to ±1. This is a considerable defect, since we are especially interested in large correlations.
In contradistinction to biserial r* we have point biserial r, whose model does not specify any underlying distribution for the discrete variable. Point biserial r is a maximum likelihood estimate of ρ under the conditions of the model (see Chapter I, Section 6), and as such has efficiency 1 for large N no matter what value ρ takes. Corresponding to the property of minimum variance of biserial r* for the point of dichotomy at the mean, that is ω = 0, we have a similar property for point biserial r. Using r for the model given is equivalent to using the t statistic. It is intuitively clear that if we are interested in measuring ρ, and hence essentially the difference between two mean values, we will get better results in the form of greater power for the t test if the means are estimated from samples in which N₀ = N₁ = N/2.
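The equivalence with the t statistic can be illustrated numerically: the ordinary product-moment r computed between x and the 0-1 variable (which is point biserial r) reproduces the equal-variance two-sample t through t = r√(N − 2)/√(1 − r²). The data below are hypothetical.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
x0 = rng.normal(0.0, 1.0, 30)          # group with z = 0
x1 = rng.normal(0.5, 1.0, 30)          # group with z = 1
x = np.concatenate([x0, x1])
z = np.concatenate([np.zeros(30), np.ones(30)])

r = np.corrcoef(x, z)[0, 1]            # point biserial r
N = len(x)
t_from_r = r * np.sqrt(N - 2) / np.sqrt(1.0 - r ** 2)
t_direct = stats.ttest_ind(x1, x0).statistic   # pooled-variance t
```

With N₀ = N₁ the two quantities agree exactly, which is the sense in which using r for this model is the same as using the t test.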
While the assumption of normality of the residuals which is used in connection with point biserial r is a restriction, there is some question as to how serious this restriction is. Indeed, the theory of Least Squares is largely based on the same assumption. There is no question, however, as to the inherent danger in making the assumption of underlying normality in connection with biserial r*.

If the sample is sufficiently large we can get some indication as to whether the model for r*, or that for r, is more applicable. If the model for r is applicable, the sets of variates (x₀₁, ..., x₀N₀) and (x₁₁, ..., x₁N₁), associated with the z_i which are 0 and 1 respectively, should be normally distributed. If the model for r* is applicable, these sets of variates cannot be normally distributed.
In the case of r* an approximation to the moments E(Xⁱ | Y > ω) can be obtained from the quantities c_{k,n} defined in Section 1 of Chapter III. We have from (3.14)

E(X | Y > ω) = ρλ(ω)/p,

E(X² | Y > ω) = (1 − ρ²) + ρ²[1 + ωλ(ω)/p],

E(X³ | Y > ω) = 3ρ(1 − ρ²)λ(ω)/p + ρ³λ(ω)(ω² + 2)/p.

After performing the substitutions ρ = r*, ω = t, λ(ω) = λ(t), p = m/N (see Chapter I, Section 2), if we do not have a fairly close agreement between E(Xⁱ | Y > ω) and N₁⁻¹ Σ_{j=1}^{N₁} x₁ⱼⁱ, then the model for r* is suspect. To test the applicability of r we need only test the hypothesis that the x₀ᵢ (i = 1, 2, ..., N₀) and the x₁ⱼ (j = 1, 2, ..., N₁) are normal, with equal variances. A variety of tests are available for this purpose.
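The conditional moments quoted above can be verified numerically. The sketch below integrates E(Xⁱ | Y > ω) directly for the standardized bivariate normal and compares with the closed forms, writing λ(ω) for the normal ordinate at ω.

```python
import numpy as np
from scipy.stats import norm
from scipy import integrate

def cond_moment(i, rho, omega):
    # E(X^i | Y > omega): average E(X^i | Y = y) over the tail Y > omega,
    # using X | Y = y ~ N(rho*y, 1 - rho^2).
    def integrand(y):
        m, v = rho * y, 1.0 - rho ** 2
        moment = {1: m, 2: m * m + v, 3: m ** 3 + 3.0 * m * v}[i]
        return moment * norm.pdf(y)
    val, _ = integrate.quad(integrand, omega, np.inf)
    return val / norm.sf(omega)

rho, omega = 0.6, 0.4
lam = norm.pdf(omega)          # lambda(omega)
p = norm.sf(omega)
m1 = rho * lam / p
m2 = (1.0 - rho ** 2) + rho ** 2 * (1.0 + omega * lam / p)
m3 = 3.0 * rho * (1.0 - rho ** 2) * lam / p \
     + rho ** 3 * lam * (omega ** 2 + 2.0) / p
```

In practice one would substitute r*, t, and m/N for ρ, ω, and p and compare with the corresponding sample moments of the x₁ⱼ.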
It would seem from the evidence presented that point biserial r is in most cases the better coefficient to use. While the results obtained will be valid only for samples of size N with a fixed partition (N₀, N₁), point biserial r in the fixed sample form is on fairly good ground in view of its being a maximum likelihood estimate of ρ. Furthermore, the concept of two fixed numbers for the discrete variable, instead of an underlying distribution of any kind, is more satisfying to the psychologist in his efforts to objectify the testing situation.

If preliminary information pertaining to ω and ρ is available, then the following tables may be used in order to select that statistic, r or r*, which from overall considerations of efficiency, sample size, and the magnitudes of ω and ρ seems most appropriate. The statistic chosen must of course satisfy the requirement that its model is the more applicable in the sense described above. Two situations are considered: the estimation of ρ, and the test of the hypothesis ρ = ρ₀ or the placing of confidence limits on ρ.
•
j
I Large
~f ...the
*
r*
r*
r
r
r
r
r
r
r
"Hypothesis.f'=J1oor the Placing of Confidence
.
111
small
".
t
confidence
Moderate Large
Limits on/'
Iwl
and the test
-Estimation of .?
tfl
small
Moderate
.'1'est.
J' = 5'0 or the placing of
.
small j
J
Two situa-
Moderate Large i
small
:r(r*)
fer *}
r
Moderate
r
r
r
Lares
r
r
j
r
c,
The recommendation is, then: use r* for estimating ρ when |ω| is small, otherwise use r. Use f(r*) for testing ρ = ρ₀ or placing confidence limits on ρ if |ω| is small and |ρ| is small or moderate, otherwise use r. In connection with f(r*) recall that this statistic may be obtained from Fisher's table of z = tanh⁻¹ r, as indicated at the end of Chapter III.
TABLE I

The Asymptotic Standard Deviation of Biserial r* as a Function of p and ρ.
All values must be divided by √N.

 ρ \ p or 1−p   .05    .10    .15    .20    .25    .30    .35    .40    .45    .50
 0            4.466  2.922  2.345  2.041  1.857  1.737  1.658  1.608  1.580  1.571
 .10          2.104  1.699  1.521  1.419  1.353  1.308  1.278  1.258  1.247  1.243
 .20          2.077  1.668  1.491  1.389  1.323  1.279  1.248  1.228  1.217  1.213
 .30          2.033  1.616  1.440  1.339  1.273  1.229  1.198  1.179  1.167  1.163
 .40          1.971  1.543  1.370  1.269  1.203  1.159  1.128  1.109  1.097  1.093
 .50          1.893  1.449  1.279  1.179  1.114  1.069  1.038  1.019  1.008  1.004
 .60          1.799  1.333  1.167  1.069  1.004  0.960  0.930  0.910  0.898  0.894
 .70          1.691  1.194  1.034  0.939  0.875  0.831  0.801  0.781  0.769  0.766
 .80          1.569  1.031  0.881  0.789  0.727  0.683  0.653  0.632  0.620  0.616
 .90          1.438  0.842  0.705  0.619  0.559  0.517  0.486  0.465  0.453  0.449
 1.00         1.302  0.616  0.503  0.429  0.374  0.335  0.304  0.283  0.270  0.266
[Figure 1. Nomograph for the computation of biserial r*. Here x̄ is the larger sample mean, N_x̄ is the number of values which make up x̄, and r* is the coefficient of biserial correlation. r* is obtained by computing x̄ and √(N⁻¹ Σ(x − x̄)²), laying a straight-edge on the three scales, and reading the result on the r* scale.]
BIBLIOGRAPHY

1.  Cramér, H., Mathematical Methods of Statistics, Princeton University Press, 1946.

2.  David, F. N., Tables of the Correlation Coefficient, Cambridge University Press, 1938.

3.  DuBois, P. H., "A Note on the Computation of Biserial r in Item Validation," Psychometrika, Vol. VII, (1942), pp. 143-146.

4.  Dunlap, J. W., "A Nomograph for Computing Biserial Correlations," Psychometrika, Vol. I, (1936), pp. 59-60.

5.  Fisher, R. A., "On the Mathematical Foundations of Theoretical Statistics," Philosophical Transactions of the Royal Society A, Vol. CCXXII, (1921), pp. 309-368.

6.  Fisher, R. A., Statistical Methods for Research Workers, New York City, Hafner, 10th Edition, 1946.

7.  Fisher, R. A., "Theory of Statistical Estimation," Proceedings of the Cambridge Philosophical Society, Vol. XXII, (1925), pp. 700-725.

8.  Johnson, N. L. and Welch, B. L., "Applications of the Non-central t Distribution," Biometrika, Vol. XXXI, (1940), pp. 362-389.

9.  Kendall, M. G., The Advanced Theory of Statistics, Vol. II, London, Charles Griffin, 1946.

10. Lev, J., "The Point Biserial Coefficient of Correlation," Annals of Mathematical Statistics, Vol. XX, No. 1, (1949), pp. 125-126.

11. Neyman, J., "Outline of a Theory of Statistical Estimation," Philosophical Transactions of the Royal Society A, Vol. CCXXXVI, (1937), p. 333.

12. Pearson, K., "On a New Method for Determining the Correlation between a Measured Character A and a Character B," Biometrika, Vol. VII, (1909), pp. 96-105.

13. Robbins, H. E. and Hoeffding, W., "The Central Limit Theorem for Dependent Random Variables," Duke Mathematical Journal, Vol. XV, No. 3, (1948), pp. 773-780.

14. Royer, E. B., "Punched Card Methods for Determining Biserial Correlations," Psychometrika, Vol. VI, No. 1, (1941), pp. 55-59.

15. Soper, H. E., "On the Probable Error of the Biserial Expression for the Correlation Coefficient," Biometrika, Vol. X, (1913), pp. 384-390.

16. Stalnaker, J. M. and Richardson, M. W., "A Note on the Use of Biserial r in Test Research," Journal of General Psychology, Vol. VIII, (1933), pp. 463-465.