Polynomial encoding of discrete probability using Gröbner bases

Eva Riccomagno
Department of Statistics
University of Warwick
Coventry CV4 7AL (UK)
[email protected]
Henry P. Wynn
Department of Statistics
University of Warwick
Coventry CV4 7AL (UK)
[email protected]
1. Gröbner bases in Statistics
The approach adopted by the authors started with Pistone and Wynn (1996) and has
continued in a series of works further exploiting the basic idea. It should be noted that the
approach differs from that of Diaconis and Sturmfels (1998).
The starting point is to consider sets of points in $\mathbb{R}^d$ as zero-dimensional varieties and
solutions of systems of algebraic equations. Thus the experimental design consisting of the
three points $\{-1, 0, 1\}$ is considered as the solution set of the equation
$$g(x) = (x + 1)\,x\,(x - 1) = 0 .$$
Note that this equation can be written in "leading term" form $x^3 = x$. Now, any higher order
polynomial $p(x)$ can be written, following division, as
$$p(x) = s(x)\,g(x) + r(x)$$
where $r(x)$ is at most quadratic. If $p(x)$ interpolates the points $(-1, y_1), (0, y_2), (1, y_3)$ then so
does $r(x)$, and $r(x)$ is the unique polynomial of degree at most two through the points. The Gröbner basis method
extends this to produce multi-dimensional interpolators and can be considered as a generalisation of the one-dimensional $g(x)$ to $d$ dimensions. The computations are performed in the
polynomial ring $k[x_1, \ldots, x_d]$, where $k$ is a field including the coordinates of the points.
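The one-dimensional division step can be checked numerically. The following is a minimal sketch in plain Python, assuming polynomials are stored as coefficient lists (lowest degree first); the helper names `poly_rem` and `poly_eval` and the test polynomial `p` are illustrative choices, not part of the original text.

```python
from fractions import Fraction

def poly_rem(p, g):
    """Remainder of p on division by g (coefficient lists, lowest degree first)."""
    r = [Fraction(c) for c in p]
    dg = len(g) - 1
    while len(r) - 1 >= dg:
        factor = r[-1] / Fraction(g[-1])
        shift = len(r) - 1 - dg
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r.pop()  # the leading coefficient is now zero
    return r

def poly_eval(p, x):
    """Evaluate a coefficient list at the point x."""
    return sum(Fraction(c) * x**i for i, c in enumerate(p))

g = [0, -1, 0, 1]        # g(x) = x^3 - x = (x + 1) x (x - 1)
p = [2, 0, 1, 0, 0, 3]   # hypothetical p(x) = 3x^5 + x^2 + 2
r = poly_rem(p, g)       # x^5 reduces to x, so r(x) = x^2 + 3x + 2

design = [-1, 0, 1]
agrees = all(poly_eval(p, x) == poly_eval(r, x) for x in design)
```

Here `r` has degree at most two and takes the same values as `p` on the design, which is the interpolation property described above.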
A term ordering is essential to the theory. It is a total well-ordering $\tau$ on
the terms (monomials) $x^\alpha = x_1^{\alpha_1} \cdots x_d^{\alpha_d}$ ($\alpha_i \ge 0$ for $i = 1, \ldots, d$), written $\succ_\tau$, and it is compatible
with simplification of monomials, that is, if $x^\alpha \succ_\tau x^\beta$ then $x^{\alpha+\gamma} \succ_\tau x^{\beta+\gamma}$ for all $\alpha, \beta, \gamma$. Thus the
largest term in a polynomial, called the leading term, is well-defined with respect to $\tau$.
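To make the definition concrete, the sketch below encodes two standard term orderings (lexicographic and degree-lexicographic) as Python sort keys on exponent vectors and checks the compatibility condition on a small box of exponents. The function names and the choice of two variables are assumptions made for illustration only.

```python
from itertools import product

def lex_key(alpha):
    """Lexicographic ordering with x1 larger than x2."""
    return alpha

def deglex_key(alpha):
    """Compare total degree first, then break ties lexicographically."""
    return (sum(alpha), alpha)

def add(alpha, gamma):
    """Exponent vector of the product of two monomials."""
    return tuple(a + g for a, g in zip(alpha, gamma))

# Exponent vectors (a1, a2) of the monomials x1^a1 * x2^a2 with a1, a2 in {0, 1, 2}.
exponents = list(product(range(3), repeat=2))

def is_compatible(key):
    """Check: x^alpha > x^beta implies x^(alpha+gamma) > x^(beta+gamma)."""
    for alpha in exponents:
        for beta in exponents:
            for gamma in exponents:
                if key(alpha) > key(beta) and not key(add(alpha, gamma)) > key(add(beta, gamma)):
                    return False
    return True

lex_sorted = sorted(exponents, key=lex_key)
deglex_sorted = sorted(exponents, key=deglex_key)
```

The two orderings disagree on some pairs: under `lex_key` the term x1 is larger than x2^2, while under `deglex_key` it is smaller. This is why the leading term of a polynomial depends on the ordering.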
Given a term ordering $\tau$ on $k[x_1, \ldots, x_d]$, a set of polynomials $g_1, \ldots, g_t \in k[x_1, \ldots, x_d]$ is
a Gröbner basis if the ideal generated by the leading terms of the $g_i$ is the ideal generated by
the leading terms of all polynomials $\sum_{i=1}^{t} s_i g_i$ with $s_i \in k[x_1, \ldots, x_d]$. There exist algorithms
to compute a Gröbner basis for a given design and term ordering, that is, a Gröbner basis
of the ideal of all polynomials vanishing at the design points (see Cox, Little and O'Shea, 1997
and Riccomagno, 1997).
The next important object in the theory is the set of all terms that are not divisible by
any of the leading terms of a Gröbner basis. This set is called $\mathrm{Est}_\tau(D)$. From the definition
it follows that $\mathrm{Est}_\tau(D)$ has the structure of an order ideal, that is, if $x^\alpha$ is in $\mathrm{Est}_\tau(D)$ and $x^\beta$
divides $x^\alpha$ then $x^\beta$ is in $\mathrm{Est}_\tau(D)$. The suffix $\tau$ indicates the term ordering considered.
From the theory of Gröbner bases it follows that interpolators are linear combinations of
elements of $\mathrm{Est}_\tau(D)$. Moreover the number of elements in $\mathrm{Est}_\tau(D)$ is the number of distinct
points. In particular the support of an interpolator has at most as many elements as there are
distinct points.
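For a small design the set of estimable terms can be computed without a general Gröbner basis routine. The sketch below is not the algorithm of the cited references: it uses a linear-algebra shortcut in the spirit of the Buchberger-Möller algorithm, scanning monomials in increasing term order and keeping those whose evaluation vectors on the design are linearly independent of the ones already kept. The function names, the degree-lexicographic key and the example design are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product

def evaluate(alpha, point):
    """Value of the monomial x^alpha at a design point."""
    value = Fraction(1)
    for a, x in zip(alpha, point):
        value *= Fraction(x) ** a
    return value

def reduce_vector(vec, basis):
    """Subtract multiples of stored rows so that vec vanishes at their pivots."""
    vec = list(vec)
    for pivot, row in basis:
        if vec[pivot] != 0:
            factor = vec[pivot] / row[pivot]
            vec = [v - factor * r for v, r in zip(vec, row)]
    return vec

def est(design, key):
    """Estimable terms of a design of distinct points for the ordering given by key."""
    n, d = len(design), len(design[0])
    # An order ideal with n elements cannot contain an exponent >= n.
    candidates = sorted(product(range(n), repeat=d), key=key)
    basis, kept = [], []
    for alpha in candidates:
        vec = reduce_vector([evaluate(alpha, pt) for pt in design], basis)
        pivot = next((i for i, v in enumerate(vec) if v != 0), None)
        if pivot is not None:      # independent of all smaller monomials
            basis.append((pivot, vec))
            kept.append(alpha)
        if len(kept) == n:
            break
    return kept

def is_order_ideal(exponents):
    """Closed under lowering any single exponent by one."""
    s = set(exponents)
    return all(alpha[:i] + (a - 1,) + alpha[i + 1:] in s
               for alpha in s for i, a in enumerate(alpha) if a > 0)

def deglex_key(alpha):
    return (sum(alpha), alpha)

design = [(0, 0), (1, 0), (2, 0), (0, 1)]   # hypothetical four-point design
est_deglex = est(design, deglex_key)        # exponents of 1, x2, x1, x1^2
```

For this design the result has four elements, one per design point, and forms an order ideal, in line with the two properties stated above.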
Notice that $\mathrm{Est}_\tau(D)$ is a vector-space basis of the quotient space
$$k[x_1, \ldots, x_d] / \mathrm{Ideal}(D) \qquad (1)$$
where $\mathrm{Ideal}(D)$ is the set of all polynomials vanishing at the (design) points, called the
design ideal. It is this property which makes $\mathrm{Est}_\tau(D)$ the basis for the general interpolator.
2. Developments in experimental design
A summary of the most recent applications to experimental design is needed at this point.
Fan: As the term ordering changes we may obtain different $\mathrm{Est}_\tau(D)$. The set of all $\mathrm{Est}_\tau(D)$
(leaves) is called the algebraic fan and corresponds to all saturated models that can be obtained
by the algebraic method. Of course each leaf has a corresponding $Z$-matrix in the standard
regression set up: $Y = Z\theta + \varepsilon$. A minimal fan design is one for which there is only one leaf.
Product designs (full factorials) are examples but, importantly, so are designs of echelon form.
[Figure: example of an echelon-form design.]
A full analysis is given in Caboara, Pistone, Riccomagno and Wynn (1997). Maximal fan
designs have all possible order ideals of suitable size as $\mathrm{Est}_\tau(D)$ and thus can be considered
model robust.
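The dependence of the leaf on the term ordering can be seen on very small designs. The sketch below samples only three orderings (two lexicographic, one degree-lexicographic), so it illustrates the idea of the fan rather than computing it in full. It uses a linear-algebra shortcut in the spirit of the Buchberger-Möller algorithm instead of a Gröbner basis computation; all names and both example designs are assumptions.

```python
from fractions import Fraction
from itertools import product

def est(design, key):
    """Estimable terms of a design of distinct points for the ordering given by key."""
    n, d = len(design), len(design[0])
    candidates = sorted(product(range(n), repeat=d), key=key)
    basis, kept = [], []
    for alpha in candidates:
        # Evaluation vector of x^alpha on the design.
        vec = []
        for pt in design:
            value = Fraction(1)
            for a, x in zip(alpha, pt):
                value *= Fraction(x) ** a
            vec.append(value)
        # Reduce against the rows already kept.
        for pivot, row in basis:
            if vec[pivot] != 0:
                factor = vec[pivot] / row[pivot]
                vec = [v - factor * r for v, r in zip(vec, row)]
        pivot = next((i for i, v in enumerate(vec) if v != 0), None)
        if pivot is not None:
            basis.append((pivot, vec))
            kept.append(alpha)
        if len(kept) == n:
            break
    return frozenset(kept)

orderings = {
    "lex, x1 > x2": lambda a: a,
    "lex, x2 > x1": lambda a: a[::-1],
    "deglex": lambda a: (sum(a), a),
}

parabola = [(0, 0), (1, 1), (2, 4)]           # three points on x2 = x1^2
factorial = [(0, 0), (0, 1), (1, 0), (1, 1)]  # full 2 x 2 factorial

parabola_leaves = {name: est(parabola, key) for name, key in orderings.items()}
factorial_leaves = {name: est(factorial, key) for name, key in orderings.items()}
```

The three sampled orderings give three different leaves for the points on the parabola, while the full factorial returns the same leaf {1, x1, x2, x1 x2} each time, consistent with product designs being minimal fan designs.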
Other rings: Wherever the basic model setting can be reduced to a polynomial ring the theory
can be used. One example studied is Fourier models, which can be converted using the condition
$c^2 + s^2 - 1 = 0$, where $c$ denotes the cosine and $s$ the sine term (see Caboara and Riccomagno, 1998).
3. Support for discrete probability distributions
The main purpose of the talk is to describe how the method can be used to handle discrete
probability distributions. The approach is simple. Consider a discrete random variable $X$ and
(i) make the conceptual mapping Design $\leftrightarrow$ Support (and we still use $D$ for the support),
(ii) interpolate the distribution of $X$, $\{p(x) > 0,\ x \in D\}$, or its log version $\{\log p(x),\ x \in D\}$.
We summarise the intriguing implications of this "coding" of probability distributions.
4. Interpolation
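If $\mathrm{Est}_\tau(D) = \{x^\alpha : \alpha \in L\}$ then we write
$$p(x) = \sum_{\alpha \in L} \theta_\alpha x^\alpha \qquad (2)$$
$$\log p(x) = \sum_{\alpha \in L} \psi_\alpha x^\alpha \qquad (3)$$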
Note that (3) places the distribution into the exponential form
$$p(x) = \exp\left( \sum_{\alpha \in L} \psi_\alpha x^\alpha \right) \qquad (4)$$
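As an illustration of (2), (3) and (4), the sketch below interpolates a distribution on the support {0, 1, 2}, whose estimable terms are 1, x and x^2. The probabilities are those of a Binomial(2, 1/2) variable, chosen purely as an example; the solver and all variable names are assumptions, not the authors' implementation.

```python
from fractions import Fraction
import math

def solve(matrix, rhs):
    """Solve a square linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

support = [0, 1, 2]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]  # example distribution
exponents = [0, 1, 2]                                     # terms 1, x, x^2

# Rows of Z hold the estimable terms evaluated at each support point.
Z = [[Fraction(x) ** a for a in exponents] for x in support]

# Coefficients of the polynomial interpolating p itself (exact arithmetic).
theta = solve(Z, probs)

# Coefficients of the polynomial interpolating log p (floating point).
psi = solve([[float(z) for z in row] for row in Z], [math.log(p) for p in probs])

def p_from_psi(x):
    """Exponential form: exp of the interpolating polynomial of log p."""
    return math.exp(sum(c * x ** a for c, a in zip(psi, exponents)))
```

In this example `theta` corresponds to p(x) = 1/4 + x/2 - x^2/4, and `p_from_psi` reproduces the same probabilities up to rounding.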
The exponential form (4) is the entry point to classical statistical inference for discrete
distributions. Rewriting (4) with $L_0 = L \setminus \{(0, \ldots, 0)\}$ we can set $\psi_0 = -K(\psi)$, where $K(\psi)$ is the cumulant
generating function of the $x^\alpha$ with respect to the uniform distribution, and obtain
$$p(x) = \exp\left( \sum_{\alpha \in L_0} \psi_\alpha x^\alpha - K(\psi) \right)$$
A small adaptation is to write
$$p(x) = \exp\left( \sum_{\alpha \in L_0} \psi_\alpha x^\alpha - K(\psi) \right) p_0(x)$$
for a specially chosen base distribution $p_0(x)$. In either case we can track the effect of the
support $D$, that is of the term ordering used to determine $L$.
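The role of the normalising term can be checked on a small case. The sketch below takes the support {0, 1, 2} with non-constant terms x and x^2 and a hypothetical parameter vector, and computes the distribution in both forms: normalised by summing over the support, and written against a uniform base distribution. The parameter values and variable names are assumptions.

```python
import math

support = [0, 1, 2]
L0 = [1, 2]                             # exponents of x and x^2; the constant term is excluded
psi = [2 * math.log(2), -math.log(2)]   # hypothetical parameter values

def linear_part(x):
    """Sum over the non-constant terms of psi_alpha * x^alpha."""
    return sum(c * x ** a for c, a in zip(psi, L0))

# Normalising constant obtained by summing over the support.
K_count = math.log(sum(math.exp(linear_part(x)) for x in support))
p = [math.exp(linear_part(x) - K_count) for x in support]

# Same model written against a uniform base distribution p0; here K is the
# cumulant generating function of (x, x^2) under the uniform distribution.
p0 = 1 / len(support)
K_unif = math.log(sum(p0 * math.exp(linear_part(x)) for x in support))
p_alt = [math.exp(linear_part(x) - K_unif) * p0 for x in support]
```

Both forms give the same probabilities; the two normalising terms differ only by the constant log 3, the log of the number of support points.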
5. Moment aliasing
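We can express higher order moments $m_\beta = E(X^\beta)$, $\beta \notin L$, in terms of the moments
$m_\alpha = E(X^\alpha)$ for all $\alpha \in L$ by interpolation
$$e^{\sum_{i=1}^{d} x_i t_i} = \sum_{\alpha \in L} e_\alpha(t)\, x^\alpha$$
so that
$$M_X(t) = E\left( e^{\sum_{i=1}^{d} X_i t_i} \right) = \sum_{\beta \ge 0} \frac{t^\beta}{\beta!}\, m_\beta = \sum_{\alpha \in L} e_\alpha(t)\, m_\alpha$$
where $t^\beta = t_1^{\beta_1} \cdots t_d^{\beta_d}$ and $\beta! = \beta_1! \cdots \beta_d!$,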
and we can recapture $m_\beta$ by differentiation at $t = (t_1, \ldots, t_d) = 0$. This idea was introduced
in Pistone, Riccomagno and Wynn (1999b). Moment aliasing can be extended to cumulant
aliasing by taking the cumulant generating function
$$K_X(t) = \log M_X(t) = \sum_{\beta \ge 0} \frac{t^\beta}{\beta!}\, k_\beta = \log\left( \sum_{\alpha \in L} e_\alpha(t)\, m_\alpha \right)$$
and expressing the $m_\alpha$ in terms of the relevant cumulants $k_\beta$, $\beta \ge 0$.
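In one dimension, moment aliasing reduces to polynomial division. The sketch below works on the support {0, 1, 2}, whose design ideal is generated by g(x) = x(x - 1)(x - 2), reduces each power of x modulo g, and uses the remainder to express a higher moment through the moments of order 0, 1 and 2. It works directly with remainders rather than with the generating-function expansion; the distribution and all names are assumptions for illustration.

```python
from fractions import Fraction

def poly_rem(p, g):
    """Remainder of p on division by g (coefficient lists, lowest degree first)."""
    r = [Fraction(c) for c in p]
    dg = len(g) - 1
    while len(r) - 1 >= dg:
        factor = r[-1] / Fraction(g[-1])
        shift = len(r) - 1 - dg
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r.pop()  # the leading coefficient is now zero
    return r

g = [0, 2, -3, 1]   # g(x) = x^3 - 3x^2 + 2x = x (x - 1) (x - 2)
support = [0, 1, 2]
probs = [Fraction(1, 4), Fraction(1, 2), Fraction(1, 4)]  # example distribution

def moment(k):
    """Moment of order k computed directly from the distribution."""
    return sum(p * x**k for x, p in zip(support, probs))

def aliased_moment(k):
    """Moment of order k >= 3 written through the moments of order 0, 1, 2."""
    monomial = [0] * k + [1]            # the polynomial x^k
    coeffs = poly_rem(monomial, g)      # x^k reduced to degree at most two
    return sum(c * moment(a) for a, c in enumerate(coeffs))

x3_reduced = poly_rem([0, 0, 0, 1], g)  # x^3 = 3x^2 - 2x on the support
```

On this support x^3 reduces to 3x^2 - 2x, so the third moment equals 3 m_2 - 2 m_1 for any distribution on {0, 1, 2}.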
6. The ring of random variables on D
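Any function $Y$ on $D$ has a representation
$$Y = \sum_{\alpha \in L} c_\alpha X^\alpha$$
where $X^\alpha$ maps the support point $(a_1, \ldots, a_d)$ to $a_1^{\alpha_1} \cdots a_d^{\alpha_d}$. All such functions
form a ring.
Multiplication of the two functions $Y$ above and $Z = \sum_{\beta \in L} d_\beta X^\beta$ leads to
$$YZ = \sum_{\alpha \in L,\ \beta \in L} c_\alpha d_\beta \left( \sum_{\gamma \in L} r(\alpha + \beta, \gamma)\, X^\gamma \right)$$
where $X^{\alpha+\beta} = \sum_{\gamma \in L} r(\alpha + \beta, \gamma)\, X^\gamma$. This reduction induces a reduction of expectations to
moments
$$E(YZ) = \sum_{\alpha \in L,\ \beta \in L} c_\alpha d_\beta \sum_{\gamma \in L} r(\alpha + \beta, \gamma)\, m_\gamma$$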
Similar methods can be used to express conditional expectations via moment expressions.
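The ring structure can be illustrated in one dimension on the support {-1, 0, 1}, where x^3 = x. The sketch below multiplies two functions given by their coefficients on 1, X, X^2, reduces the product modulo g(x) = x^3 - x, and compares the expectation computed through the moments m_0, m_1, m_2 with the direct sum over the support. The coefficients, the distribution and the helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def poly_rem(p, g):
    """Remainder of p on division by g (coefficient lists, lowest degree first)."""
    r = [Fraction(c) for c in p]
    dg = len(g) - 1
    while len(r) - 1 >= dg:
        factor = r[-1] / Fraction(g[-1])
        shift = len(r) - 1 - dg
        for i, c in enumerate(g):
            r[shift + i] -= factor * c
        r.pop()  # the leading coefficient is now zero
    return r

def poly_eval(p, x):
    return sum(Fraction(c) * x**i for i, c in enumerate(p))

g = [0, -1, 0, 1]     # g(x) = x^3 - x
support = [-1, 0, 1]
probs = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]  # example distribution

Y = [1, 2, 0]         # Y = 1 + 2X
Z = [0, 1, 3]         # Z = X + 3X^2
YZ = poly_rem(poly_mul(Y, Z), g)   # product rewritten on the terms 1, X, X^2

moments = [sum(p * x**k for x, p in zip(support, probs)) for k in range(3)]
e_via_moments = sum(c * m for c, m in zip(YZ, moments))
e_direct = sum(p * poly_eval(Y, x) * poly_eval(Z, x) for x, p in zip(support, probs))
```

Here the product reduces to 7X + 5X^2, and both routes give the same expectation, 7 m_1 + 5 m_2.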
REFERENCES
Caboara, M, Pistone, G, Riccomagno, E and Wynn, H P (1997). The fan of an experimental design.
SCU Research Report no. 10, May 1997.
Caboara, M and Riccomagno, E (1998). An algebraic computational approach to the identifiability
of Fourier models. Journal of Symbolic Computation, 26:245-260.
Cox, D, Little, J and O'Shea, D (1997). Ideals, Varieties, and Algorithms, Springer-Verlag, New York,
Second Edition.
Diaconis, P and Sturmfels, B (1998). Algebraic algorithms for sampling from conditional distributions,
The Annals of Statistics, 26,1:363-397.
Pistone, G, Riccomagno, E and Wynn, H P (1999a). Algebraic Statistics. Monograph (in progress).
Pistone, G, Riccomagno, E and Wynn, H P (1999b). Gröbner Bases and Factorisation in Discrete
Probability and Bayes. Computing and Statistics (special issue for the workshop Computing and
Statistics, Montreal, September 1997) (Accepted for publication).
Pistone, G and Wynn, H P (1996). Generalised Confounding with Gröbner Bases, Biometrika,
83,3:653-666.
Riccomagno, E (1997). Gröbner bases in experimental design and related fields. PhD thesis, Department of Statistics, University of Warwick.
Riccomagno, E and Wynn, H P (1999). Gröbner bases in experimental design: an overview. Sigsam
Bulletin (In press).
RÉSUMÉ (translated from the French)
This paper presents an application of computational algebraic geometry to probability on finite
spaces. The algebra of events is recoded by means of a polynomial structure whose determination
is based on the theory of Gröbner bases, on polynomial elimination and on factorisation in
polynomial rings.
We give formulae for computing, in this framework, marginal probabilities and conditional
distributions. Exponential families are also reduced to polynomial structures.
The computation of moments and cumulants is presented in the form of algebraic algorithms.
Some statistical examples are given.