ON NEYMAN'S CONJECTURE: A CHARACTERIZATION
OF THE MULTINOMIALS

Bikas Kumar Sinha and Thomas M. Gerig
Department of Statistics
North Carolina State University
Raleigh, North Carolina 27650

Institute of Statistics Mimeo Series # 1611
September 1982

NORTH CAROLINA STATE UNIVERSITY
Raleigh, North Carolina
ON NEYMAN'S CONJECTURE: A CHARACTERIZATION
OF THE MULTINOMIALS

Bikas Kumar Sinha^1 and Thomas M. Gerig
Department of Statistics
North Carolina State University
Raleigh, North Carolina 27650

ABSTRACT

We provide a complete solution to a problem posed in Neyman (1965) and reformulated in Ghosh, Sinha and Sinha (1977) regarding a characterization of (positive and negative) multinomial distributions based, among other things, on the properties of regression in power series distributions.

AMS 1970 Subject Classification:  Primary 62H05

Key words and phrases:  Multivariate power series distributions, positive and negative multinomials, linear regression.

^1 On leave from the Indian Statistical Institute (Calcutta), India.
1. INTRODUCTION

There is no denying the fact that over the past two decades there has been an increasing interest in characterizations of well-known discrete as well as continuous distributions and in characterization problems in general. Some excellent references, e.g., Kagan, Linnik and Rao (1972), Kotz (1974), Patil (edited-1965) and Patil, Kotz and Ord (edited-1975/edited-1981), lead one to wonder about the vastness of the literature and the diverse research interests on this topic.
Our concern in this article is with multivariate power series distributions (Khatri (1959)) and, more specifically, with a conjecture set forth by the late Professor J. Neyman in 1965 regarding a characterization of positive and negative multinomial distributions. We settle his conjecture completely after properly formulating it in this section below.

The key reference for this paper is Ghosh, Sinha and Sinha (1977) - hereafter abbreviated as GSS - wherein this particular problem has been studied and partially solved. Our work may be regarded as a supplement to that of GSS, and the two together settle the conjecture. The paper by Sinha and Sinha (1976) accounts for the first attempt to attack the problem. (Another not-so-related paper, but of independent interest, is Sinha and Gerig (1982).)
To start with, suppose (X_1, X_2, ..., X_k) follow a k-variate power series distribution with the joint pmf

(1.1)   P[X_1 = x_1, ..., X_k = x_k] = a_{x_1 x_2 ... x_k} θ_1^{x_1} θ_2^{x_2} ... θ_k^{x_k} / ψ(θ),

where a_{x_1 x_2 ... x_k} ≥ 0; x_i ∈ I (the set of non-negative integers), 1 ≤ i ≤ k; θ = (θ_1, θ_2, ..., θ_k) ∈ H_k = {(θ_1, ..., θ_k) | θ_i > 0, 1 ≤ i ≤ k; ψ(θ) = Σ a_{x_1 x_2 ... x_k} θ_1^{x_1} θ_2^{x_2} ... θ_k^{x_k} < ∞}.
A conjecture of Neyman (1965), reformulated in GSS, runs as follows:

Within the class of power series distributions, the multinomials are characterized by the following properties:

Q1   The regression of X_i on the remaining variables is a linear function of the sum of the remaining variables.

Q2   The distribution of X_1 + X_2 + ... + X_k is of the power series type.

In GSS, the conjecture has been settled in the affirmative under the conditions a_{00...0} > 0 and a_{10...0} + a_{01...0} + ... + a_{00...01} > 0. Without these conditions, however, the claim above (as such) turns out to be false. See Counter-Example 1 in Section 4 of the present paper.
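Both properties in the conjecture are classical facts about the positive multinomial and can be confirmed by direct enumeration. The sketch below is our illustration, not part of the paper; the values of n, p1, p2 are arbitrary.

```python
from math import comb

# Exact check of Neyman's properties Q1 and Q2 for a positive multinomial
# with two free cells (X1, X2) and a remainder cell.
n, p1, p2 = 6, 0.2, 0.3
p0 = 1 - p1 - p2

def pmf(x1, x2):
    """Joint pmf of (X1, X2) under Multinomial(n; p1, p2, p0)."""
    if x1 + x2 > n:
        return 0.0
    return (comb(n, x1) * comb(n - x1, x2)
            * p1**x1 * p2**x2 * p0**(n - x1 - x2))

# Q1: the regression of X1 on X2 is linear in x2, since
# X1 | X2 = x2 ~ Binomial(n - x2, p1 / (p1 + p0)).
for x2 in range(n + 1):
    probs = [pmf(x1, x2) for x1 in range(n - x2 + 1)]
    mean = sum(x1 * p for x1, p in enumerate(probs)) / sum(probs)
    assert abs(mean - (n - x2) * p1 / (p1 + p0)) < 1e-12

# Q2: X1 + X2 ~ Binomial(n, p1 + p2), again of the power series type.
for s in range(n + 1):
    total = sum(pmf(x1, s - x1) for x1 in range(s + 1))
    assert abs(total - comb(n, s) * (p1 + p2)**s * p0**(n - s)) < 1e-12
```

The point of the counter-examples below is that these two properties alone, without side conditions on the coefficients, do not force this multinomial structure.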
Consider the following statements:

S1X   The conditional distribution of X_i given {X_j = x_j (1 ≤ j ≤ k, j ≠ i)} is non-degenerate for at least one set of values of the x_j's, and the regression is linear and, moreover, depends on the x_j's only through Σ_{j(≠i)} x_j.* Further, Σ_{j(≠i)} X_j assumes at least three distinct values** (1 ≤ i ≤ k).

S2X   The distribution of X = X_1 + X_2 + ... + X_k is of the power series type.

*We note that, in view of the Proposition in GSS (p. 399), this conditional distribution also depends on the x_j's only through Σ_{j(≠i)} x_j.

**Throughout, we will assume this in order that the linearity of regression carries non-trivial sense. Without this condition, again, Theorem 1.1 is false. See Counter-Example 2 in Section 4.
In this article, we state and prove what we believe to be the correct form of Neyman's conjecture regarding the multinomials. In particular, we prove the following:

Theorem 1.1   Whenever S1X and S2X obtain, there exist integers t_1, t_2, ..., t_k ≥ 0, r ≥ 1, and a set of r.v.'s (Z_1, Z_2, ..., Z_k) having a joint (positive or negative) multinomial distribution such that the representation

X_i = rZ_i + t_i,   1 ≤ i ≤ k,

holds with probability one.
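Theorem 1.1 covers the negative multinomial as well. As an illustration (ours, not from the paper; the index n and cell probabilities are arbitrary values), the regression in a negative multinomial is also linear in the sum of the remaining variables, since Y1 given the rest is negative binomial with index n + m:

```python
from math import gamma, factorial

# Linear regression in a negative multinomial with two free cells.
# P(Y1 = y1, Y2 = y2) = Gamma(n + y1 + y2) / (Gamma(n) y1! y2!)
#                       * p1**y1 * p2**y2 * p0**n.
n, p1, p2 = 3, 0.2, 0.3
p0 = 1 - p1 - p2

def joint(y1, y2):
    return (gamma(n + y1 + y2) / (gamma(n) * factorial(y1) * factorial(y2))
            * p1**y1 * p2**y2 * p0**n)

M = 80  # truncation point for the infinite support of Y1 (tail is negligible)
for m in range(5):
    probs = [joint(y1, m) for y1 in range(M)]
    mean = sum(y1 * p for y1, p in enumerate(probs)) / sum(probs)
    # Y1 | Y2 = m ~ NegativeBinomial(n + m, p1): mean (n + m) p1 / (1 - p1),
    # linear in m, the sum of the remaining variables.
    assert abs(mean - (n + m) * p1 / (1 - p1)) < 1e-6
```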
2. NEYMAN'S PROPERTIES FOR MULTINOMIALS
First of all we intend to develop certain basic results pertaining to this investigation. Let t_i = min{values of X_i}. Transform X_i to Y_i = X_i - t_i, 1 ≤ i ≤ k. It is easy to observe that (Y_1, Y_2, ..., Y_k) follow a k-variate power series distribution with the joint pmf given by, say,

(2.1)   P[Y_1 = y_1, ..., Y_k = y_k] = b_{y_1 y_2 ... y_k} θ_1^{y_1} θ_2^{y_2} ... θ_k^{y_k} / ξ(θ).

Further, the statements (S1Y, S2Y) concerning the Y_i's are equivalent to the analogous statements (S1X, S2X) (and are obtained from S1X and S2X respectively by changing X_i to Y_i, x_j to y_j, 1 ≤ i, j ≤ k). Let us write Y = Y_1 + Y_2 + ... + Y_k = Y_i + Y_i*, 1 ≤ i ≤ k. Then Y_i* has at least three distinct values and Y_i* ≥ 0. Further, for any i, b_{y_1 y_2 ... y_k} > 0 for some (y_1, y_2, ..., y_k) with y_i = 0 and y_i* = y - y_i ≥ 0. Our first result is concerned with the conditional distribution of Y_i given Y_i* = y_i*.
Proof.   Take i = 1 for notational simplicity. In view of the Proposition in GSS (p. 399), we note that the joint marginal distribution of (Y_2, ..., Y_k) is a power series distribution with the pmf

(2.2)   P[Y_2 = y_2, ..., Y_k = y_k] = b*_{y_2 ... y_k} (θ_2*)^{y_2} ... (θ_k*)^{y_k} / ξ*(θ),

where θ_i* = θ_i B(θ_1), 2 ≤ i ≤ k, ξ*(θ) = ξ(θ)/A(θ_1); A(θ_1), B(θ_1) > 0. Again, according to S1Y, the conditional distribution of Y_1 given {Y_i = y_i, 2 ≤ i ≤ k} is (trivially) a power series distribution depending only on y_2 + y_3 + ... + y_k. Hence, the identity

b_{y_1 y_2 ... y_k} θ_1^{y_1} / {b*_{y_2 ... y_k} A(θ_1) [B(θ_1)]^{y_2+y_3+...+y_k}} = c_{y_1}(y_2 + y_3 + ... + y_k; θ_1)

(the left-hand side being the conditional pmf, which by S1Y depends on (y_2, ..., y_k) only through their sum) leads to

(2.3)   b_{y_1 y_2 ... y_k} = d_{y_1}(y_2 + ... + y_k) b*_{y_2 ... y_k}.

Finally, the numerator of P[Y_1 = y_1 | Y_1* = y_1*], given by

Σ_{y_2+...+y_k = y_1*} b_{y_1 y_2 ... y_k} θ_1^{y_1} θ_2^{y_2} ... θ_k^{y_k} / ξ(θ),

simplifies, in view of (2.3), to a factor depending only on (y_1, y_1*, θ_1) times P(Y_1* = y_1*) (using (2.2)). Hence the Lemma.
The next result is interesting in itself and has been of fundamental importance in developing the proof of Theorem 1.1.

Theorem 2.1   (Under the set-up of Lemma 2.1.)   Whenever S1Y and S2Y obtain,

P[Y_i = 0 | Y_i* = y_i*] > 0   for every value y_i* of Y_i*.

The proof is given in the Appendix.

We are now in a position to look at the problem more closely. As a matter of fact, our next lemma, together with Theorem 5.1 in GSS, provides a complete proof of Theorem 1.1 stated in Section 1.
Lemma 2.2   Whenever S1Y and S2Y obtain,

(i)   b_{00...0} > 0;

(ii)  if r is the least positive value of Y, then b_{r0...0} > 0, b_{0r0...0} > 0, ..., b_{00...0r} > 0, while b_{y_1 y_2 ... y_k} = 0 for any other (y_1, ..., y_k) satisfying y_1 + y_2 + ... + y_k = r;

(iii) if s is any other value of Y, r | s.

Proof.
Let us write the power series distribution of Y as

(2.4)   P[Y = t] = u(t) λ^t(θ) / v(θ),

where t ≥ 0, u(t) ≥ 0, v(θ) = Σ_t u(t) λ^t(θ), and λ(θ) is a function of the θ_i's. Clearly, v(θ) < ∞ for θ ∈ H_k.
Referring to (2.1) and (2.3), we then have

(2.5)   u(t) v^{-1}(θ) ξ(θ) λ^t(θ) = Σ_{i=0}^{t} d_i(t-i) θ_1^i η(θ_2, ..., θ_k | t-i),

where

(2.6)   η(θ_2, ..., θ_k | t-i) = Σ_{y_2+...+y_k = t-i} b*_{y_2 ... y_k} θ_2^{y_2} ... θ_k^{y_k}

is a homogeneous polynomial of degree (t-i) in (θ_2, ..., θ_k). Note incidentally that we have taken the representation (2.7).
Next observe that a consequence of Y_1* having at least three distinct values is that the total Y = Y_1 + Y_1* must also have at least three distinct values (since, according to Theorem 2.1, the value 0 of Y_1 must necessarily combine with each value of Y_1*). Let (t_1, t_2, t_3) be any triplet of values of Y satisfying 0 ≤ t_1 < t_2 < t_3. Denoting by P_t(θ) the homogeneous polynomial of degree t (in the θ_i's) on the right-hand side of (2.5), we now deduce, using (2.5),

(2.8)   [P_{t_2}(θ)]^{t_3 - t_1} = A(t_1, t_2, t_3) [P_{t_1}(θ)]^{t_3 - t_2} [P_{t_3}(θ)]^{t_2 - t_1}   identically in θ,

where A(t_1, t_2, t_3) = {u(t_2)}^{t_3 - t_1} / [{u(t_1)}^{t_3 - t_2} {u(t_3)}^{t_2 - t_1}] is a function only of t_1, t_2, t_3.
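For the positive multinomial itself, identity (2.8) can be made concrete: there ψ(θ) = (1 + θ_1 + ... + θ_k)^n, the degree-t homogeneous part is P_t(θ) = C(n,t)(θ_1 + ... + θ_k)^t, and u(t) = C(n,t). A numerical sketch (ours; the values of n, θ and the triplet are arbitrary):

```python
from math import comb

# Check identity (2.8) for the multinomial, where
# P_t(theta) = C(n, t) * S**t with S = theta_1 + ... + theta_k, u(t) = C(n, t).
n = 7
theta = (0.4, 1.3, 0.25)              # an arbitrary point of H_k
S = sum(theta)

def P(t):
    return comb(n, t) * S**t

t1, t2, t3 = 1, 3, 6                  # any triplet 0 <= t1 < t2 < t3
A = comb(n, t2)**(t3 - t1) / (comb(n, t1)**(t3 - t2) * comb(n, t3)**(t2 - t1))

lhs = P(t2)**(t3 - t1)
rhs = A * P(t1)**(t3 - t2) * P(t3)**(t2 - t1)
assert abs(lhs - rhs) <= 1e-9 * abs(lhs)
```

The exponents of S match because t_2(t_3 - t_1) = t_1(t_3 - t_2) + t_3(t_2 - t_1), which is exactly the bookkeeping behind (2.8).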
We now take up demonstration of the results one by one. Let r be the least positive value of Y and let s_i be the least positive value of Y_i* in combination with Y_i = 0, 1 ≤ i ≤ k. Then we have a series of implications, shown below, concerning the values of the Y_i's or of the (Y_i, Y_i*)'s.
[Implication diagram: by Theorem 2.1, the value 0 of each Y_i combines with each value of Y_i*; and since r is the least positive value of Y, any candidate value r_1 < r or r_2 < r is ruled out at each step, forcing y_1 = r, y_2 = 0, and so on.]
Thus, P[Y = r] arises essentially due to all or a (non-null) subset of the following combinations of values (y_1, ..., y_k) of the Y_i's:

{(r,0,...,0), (0,r,0,...,0), ..., (0,0,...,0,r)}.
Specifically, let (0,0,...,0,r) be a contributing term in P(Y = r). Then b_{00...0} > 0 (by Th. 2.1), thereby establishing (i).

If possible, let (r,0,...,0) be missing in the effective subset determining P(Y = r). Then P_r(θ) does not involve any term involving θ_1. If now s (> r) is any other value of Y, we can set t_1 = 0, t_2 = r, t_3 = s in (2.8) and claim

(2.9)   [P_s(θ)]^r = A(r,s) [P_r(θ)]^s   identically in θ.

This shows that P_s(θ) also must not involve any term involving θ_1, for any s whatsoever. But this is equivalent to P[Y_1 = 0] = 1, which contradicts S1Y.
Hence, P_r(θ) is, as a matter of fact, an irreducible homogeneous polynomial of degree r in all of (θ_1, ..., θ_k) and has the form, say, Σ_{i=1}^{k} a_i θ_i^r with a_i > 0 for every i. If, for any s, (r,s) = 1, i.e., r and s are relatively prime to each other, (2.9) becomes impossible unless r = 1. [This can be seen as follows. The polynomial [P_r(θ)]^s contains a term of the form θ_1^{s r_1} θ_2^{s r_2} for 0 < r_1, r_2 < r, r_1 + r_2 = r. Hence, there exists a decomposition of s as s = s_1 + s_2, 0 < s_1, s_2 < s, such that r s_1 = s r_1 and r s_2 = s r_2. Taking r_1 = r - 1, this means that r | s, which is a contradiction unless r = 1.]
If, again, no two positive values of Y are relatively prime to each other, we can set, for some s > r, (r,s) = h > 1, r = ph, s = qh, (p,q) = 1. Then (2.9) reads as

[P_s(θ)]^p = A'(r,s) [P_r(θ)]^q   identically in θ.

The polynomial [P_r(θ)]^q contains a term of the form θ_1^{s p_1} θ_2^{s p_2} for 0 < p_1, p_2 < p, p_1 + p_2 = p. Hence, there exist integers q_1, q_2, 0 < q_1, q_2 < q, q_1 + q_2 = q, such that r q_1 = s p_1 and r q_2 = s p_2, i.e., p q_1 = q p_1 and p q_2 = q p_2. Once again, taking p_1 = p - 1, we get that p | q, so that p = 1 necessarily.

Hence, in any case, we must have, for any s > r, r | s, and b_{r0...0} > 0, b_{0r0...0} > 0, ..., as also b_{00...0r} > 0. This proves the lemma.
Remark 1   It is now enough to define Z_i by X_i = t_i + rZ_i, 1 ≤ i ≤ k, and note that the conditions for applicability of Theorem 5.1 of GSS to the joint distribution of (Z_1, Z_2, ..., Z_k) have all been established. Thus, Theorem 1.1 gets through, thereby establishing Neyman's conjecture.
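The representation X_i = t_i + rZ_i is not vacuous: stretching and shifting a multinomial lattice produces a non-multinomial power series family in which the regression remains linear. A sketch (ours, not from the paper; r = 2, t = (1, 0) and the trinomial parameters are arbitrary choices):

```python
from math import comb

# X1 = r*Z1 + t1, X2 = r*Z2 + t2 with (Z1, Z2) trinomial: the regression of
# X1 on X2 stays linear in x2, as asserted by Theorem 1.1.
n, p1, p2 = 5, 0.3, 0.2
p0 = 1 - p1 - p2
r, (t1, t2) = 2, (1, 0)

def pZ(z1, z2):
    if z1 + z2 > n:
        return 0.0
    return comb(n, z1) * comb(n - z1, z2) * p1**z1 * p2**z2 * p0**(n - z1 - z2)

q = p1 / (p1 + p0)
for x2 in range(t2, r * n + t2 + 1, r):       # support of X2
    z2 = (x2 - t2) // r
    probs = [pZ(z1, z2) for z1 in range(n - z2 + 1)]
    mean = sum((r * z1 + t1) * p for z1, p in enumerate(probs)) / sum(probs)
    # linear in x2:  t1 + r*n*q - q*(x2 - t2)
    assert abs(mean - (t1 + r * (n - z2) * q)) < 1e-12
```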
3. APPENDIX

Proof of Theorem 2.1.   We will use various notations already established through (2.1)-(2.8). We proceed through the following steps.
Step I.   Certainly, P[Y_i = 0 | Y_i* = y_i*] > 0 for some value y_i* of Y_i*. Also, if at least two of the conditional distributions (of Y_i given Y_i*) are degenerate, then linearity of regression would demand all such distributions to be degenerate, thereby contradicting S1Y. Consequently, at most one of them can be degenerate. Again, Y_i* assumes at least three distinct values. Therefore, Y = Y_i + Y_i* will also assume at least three distinct values, except under the following situation:
Y_i*     Y_i           Y
r        s-r, t-r      s, t
s        0, t-s        s, t
t        0             t

Calculations yield (for some a_i, b_i, a_i*, b_i*, c_i*, d_i*)

E(Y_i | Y_i* = r) = (a_i + b_i θ_i^{t-s}) / (a_i* + b_i* θ_i^{t-s}),
E(Y_i | Y_i* = s) = θ_i^{t-s} / (c_i* + d_i* θ_i^{t-s}),
E(Y_i | Y_i* = t) = 0,

and linearity is violated.
Step II.   We will now treat the cases when Y has at least three distinct values, say, r, s, t in the order 0 ≤ r < s < t. Without any loss of generality, Y_i = 0 may be made to correspond to one of Y_i* = r, s, t. No matter to which it corresponds, the important point is whether the underlying conditional distribution is degenerate at 0 or not. We will settle both the cases.
Step III.   Suppose one of the conditional distributions is non-degenerate with Y_i = 0 having a positive conditional probability while, if possible, there is another with Y_i = 0 having zero probability in the conditional distribution. Then a direct analysis, similar to that in Step I, would result in non-linearity of regression. Hence, in such situations, P[Y_i = 0 | Y_i* = y_i*] > 0 becomes a necessity for every value y_i* of Y_i*.
Step IV.   Now suppose that the only degenerate conditional distribution concentrates on Y_i = 0. We will argue that this violates S2Y whenever y_i* is not an extreme value of Y. For notational simplicity we take i = 1 and P[Y_1 = 0 | Y_1* = s] = 1 with P[Y = r] > 0, P[Y = t] > 0 and r < s < t. Referring to (2.8), we must have

(1)   [P_s(θ)]^{t-r} = A(r,s,t) [P_r(θ)]^{t-s} [P_t(θ)]^{s-r}   identically in θ,

where P_r(θ), etc., are to be obtained from (2.5). Now P[Y_1 = 0 | Y_1* = s] = 1 implies, from (2.5), that P_s(θ) involves the term η(θ_2, ..., θ_k | s), which is a homogeneous polynomial of degree s in (θ_2, θ_3, ..., θ_k). Taking the limit on both sides of (1) as θ_1 → 0, the left-hand side tends to a positive quantity (more specifically, to {d_0(s) η(θ_2, ..., θ_k | s)}^{t-r}). Hence, neither of P_r(θ) and P_t(θ) can vanish as θ_1 → 0. Clearly, this is the case when P[Y_1 = 0 | Y_1* = r] > 0, P[Y_1 = 0 | Y_1* = t] > 0. Certainly, this must be true for all values of Y.
Step V.   The highly non-trivial case yet to be settled is: P[Y_1 = 0 | Y_1* = r] = 1, where r is the minimum value of Y (or the maximum value of Y, in case Y is finite with probability one). This is the only degenerate distribution, and no other includes the value 0 for Y_1. Consider the following
Table I

Y_1*   Y_1    Conditional probability                                  Regression
r      0      1                                                        0
s_0    s_1    a_1 θ_1^{s_1} / (a_1 θ_1^{s_1} + a_2 θ_1^{s_2} + ...)    (a_1 s_1 + a_2 s_2 θ_1^{s_2-s_1} + ...) / (a_1 + a_2 θ_1^{s_2-s_1} + ...)
       s_2    a_2 θ_1^{s_2} / (a_1 θ_1^{s_1} + a_2 θ_1^{s_2} + ...)
       ...
t_0    t_1    b_1 θ_1^{t_1} / (b_1 θ_1^{t_1} + b_2 θ_1^{t_2} + ...)    (b_1 t_1 + b_2 t_2 θ_1^{t_2-t_1} + ...) / (b_1 + b_2 θ_1^{t_2-t_1} + ...)
       t_2    b_2 θ_1^{t_2} / (b_1 θ_1^{t_1} + b_2 θ_1^{t_2} + ...)
       ...
Linearity of regression means

(1)   E(Y_1 | Y_1* = s_0) / E(Y_1 | Y_1* = t_0) = (s_0 - r) / (t_0 - r)   identically in θ_1.

Hence,

(2)   s_1 / (s_0 - r) = t_1 / (t_0 - r),

and further,

(3)   s_1 / (s_0 - r) = t_2 / (t_0 - r),

provided s_2 - s_1 ≠ t_2 - t_1. But (2) and (3) contradict each other. Hence, we must have, among other relations (involving the values in the conditional distributions displayed in the above table),
(4)   s_2 - s_1 = t_2 - t_1.

[Note that all values of Y_1* must lie on the same side of r as, otherwise, the regressions (positive quantities) cannot be collinear with 0 in the middle. This justifies the same sign of (s_0 - r) and (t_0 - r).]

We will now apply (4) to get into a contradiction. We take r < s_0 < t_0 and write s = s_0 + s_1 < t = t_0 + t_1. Once again we refer to the above table, and this time we wish to examine S2Y. We now refer to the identity in Step IV involving P_r, P_s, P_t and set θ_i = θ_i^0 (fixed), 2 ≤ i ≤ k, so that it reads as an identity in θ_1. Arrange P_s in increasing order of powers of θ_1 and let θ_1^i be the first member. Similarly, let θ_1^j be the first member in the expression for P_t. Note that 1 ≤ i ≤ s - r - 1 and 1 ≤ j ≤ t - r - 1 (as, otherwise, we will end up with situations discussed in our earlier steps).
But now i(t-r) = j(s-r) is a necessity, and this is not satisfied whenever (s-r, t-r) = 1, i.e., they are relatively prime to each other. So, we are left with the situation s = r + ph, t = r + qh, h > 1, (p,q) = 1, and then, for some ℓ, 1 ≤ ℓ < h, we have i = ℓp and j = ℓq. Hence, we end up with the following set-up (verifying a part of (4) and utilizing the other part, e.g., s_2 - s_1 = t_2 - t_1):
Table II

Y_1*          Y_1         Y
r             0           r
r + (h-ℓ)p    ℓp          r + ph
              ℓp + x      r + ph + x
              ...         ...
r + (h-ℓ)q    ℓq          r + qh
              ℓq + x      r + qh + x
              ...         ...
If in Table I there are only two values under each of the (non-degenerate) conditional distributions, we get, besides (4), s_2 = [(s_0 - r)/(t_0 - r)] t_2 (= (p/q) t_2), and this, applied to the latter table, means that

ℓp + x = (p/q)(ℓq + x)

must hold. But this is not true since p ≠ q. Again, on the other hand, if (in Table I) there are values s_3 and t_3 but (s_3 - s_1) ≠ 2(s_2 - s_1), (t_3 - t_1) ≠ 2(t_2 - t_1), then also we arrive at a contradiction (to linear regression and/or the power series distribution) in the same manner. Hence, in Table II, all successive values must necessarily have an equal increment of x in both the (non-degenerate) distributions. Suppose the last terms are ℓp + nx and ℓq + nx respectively. Then, once again, we must have ℓp + nx = (p/q)(ℓq + nx), implying p = q, which is not the case.

Like this, under any such situation, one can come up with a contradiction to S1Y and/or S2Y.
We will stop here.

4. COUNTER-EXAMPLES

1. That Q1 and Q2 alone do not lead to the multinomials can easily be seen through counter-examples such as
X_1   X_2   ...   X_k   a_{x_1 x_2 ... x_k}
 0     0    ...    0            1
 1     1    ...    1            1
 2     2    ...    2            1
 .     .           .            .
 n     n    ...    n            1
2. If the conditioning variable assumes only two values, linear regression becomes a trivial property of any set of conditional distributions. Then the stated claim is not valid, as the following example illustrates:
[Table of the coefficients a_{x_1 x_2 ... x_k} for Counter-Example 2; the support includes the points (0,0,...,0), (2,0,...,0) and (0,2,0,...,0), so that the conditioning variable assumes only the two values 0 and 2.]
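The triviality has a one-line reason: two regression points always lie on some line, whatever the conditional distributions are. A small sketch (ours; the supports and weights are arbitrary):

```python
# With a conditioning variable taking only two values (here 0 and 2), the two
# regression points always lie on a line, so "linear regression" carries no
# information about the conditional distributions themselves.
support = {0: [0, 1, 2], 2: [0, 3, 7]}               # arbitrary conditional supports
weights = {0: [0.5, 0.3, 0.2], 2: [0.1, 0.6, 0.3]}   # arbitrary probabilities

def cond_mean(x2):
    return sum(v * w for v, w in zip(support[x2], weights[x2]))

m0, m2 = cond_mean(0), cond_mean(2)
line = lambda x2: m0 + (m2 - m0) * x2 / 2            # line through the two points
assert abs(line(0) - m0) < 1e-12 and abs(line(2) - m2) < 1e-12
```

This is why S1X demands that the conditioning sum assume at least three distinct values.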
5. CONCLUDING REMARKS

As is well-known, the most appropriate prior for the multinomial parameters θ_i is the Dirichlet prior given by

D(θ) ∝ Π_{i=1}^{k} θ_i^{α_i - 1} (1 - Σ_{i=1}^{k} θ_i)^{α_{k+1} - 1},   0 < α_i; θ_i > 0, 1 ≤ i ≤ k; Σ_{i=1}^{k} θ_i < 1.
Various characterizations of such distributions are available in the literature. This distribution also possesses all the properties of the multinomials (vide Neyman (1965), Sinha and Sinha (1976)). It would be natural to investigate a similar characterization of these distributions through the properties of regression. We will undertake this investigation in a separate communication.
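The Dirichlet analogue of property Q1 can be made explicit. The display below is a standard consequence of the Dirichlet's aggregation and neutrality properties (our addition, not derived in the paper):

```latex
% Under the Dirichlet prior above, \theta_1/(1-\sum_{j\ge 2}\theta_j) given
% (\theta_2,\dots,\theta_k) is Beta(\alpha_1,\alpha_{k+1}), whence
\mathrm{E}\bigl(\theta_1 \mid \theta_2,\dots,\theta_k\bigr)
  = \frac{\alpha_1}{\alpha_1+\alpha_{k+1}}
    \Bigl(1-\sum_{j=2}^{k}\theta_j\Bigr),
```

which is linear in the sum of the remaining coordinates, mirroring Q1.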
6. ACKNOWLEDGEMENT

We thank Professor John Bishir for drawing our attention to the reference [7].
7. REFERENCES

[1] Ghosh, J. K., Sinha, B. K. and Sinha, B. K. (1977): Multivariate power series distributions and Neyman's properties for multinomials. Jour. Multivariate Analysis 7(3), 397-408.

[2] Kagan, A. M., Linnik, Yu. V. and Rao, C. R. (1972): Characterization problems in mathematical statistics (in Russian). Nauka, Moscow (English edition, 1972, John Wiley, New York).

[3] Khatri, C. G. (1959): On certain properties of power-series distributions. Biometrika 46, 486-490.

[4] Kotz, S. (1974): Characterizations of statistical distributions: a supplement to recent surveys. International Statistical Review 42, 39-65.

[5] Neyman, J. (1965): Certain chance mechanisms involving discrete distributions. In Classical and Contagious Discrete Distributions (Ed. G. P. Patil). Calcutta Statistical Publishing Soc., 4-14.

[6] Classical and Contagious Discrete Distributions. Ed. G. P. Patil. Calcutta Statistical Publishing Soc. (1965).

[7] Statistical Distributions in Scientific Work: Volume 3 - Characterizations and Applications (1975), Ed. Patil, G. P., Kotz, S. and Ord, J. K.; Volume 4 - Models, Structures and Characterizations (1981), Ed. Taillie, C., Patil, G. P. and Baldessari, B. A.

[8] Sinha, Bimal Kumar and Sinha, Bikas Kumar (1976): On a characterization of the dispersion matrix based on the properties of regression. Comm. Statist. A 5(13), 1215-1224.

[9] Sinha, Bikas Kumar and Gerig, Thomas M. (1982): On a further characterization of dispersion matrices based on the properties of regression. Submitted to the Journal of Statistical Planning and Inference.