fUIE ESTDlATI(l'f PROBLEIIS FOR GIBBS STATES
by
Chuanshu Ji
Department of Statistics
University of North Carolina
Chapel Hill, NC 27514
USA
The
problems
of
estimating
certain
infini te-dimensional
unlmown
parameters (such as interaction potentials, local characteristics, etc.) for
Gibbs states are considered in this paper.
We describe the construction of
consistent
Gibbs
estimators
for
one-dimensional
states
and
discuss
possibility of extending these results to multi-dimensional Gibbs states.
Key Words and Phrases:
Running Title:
Gibbs state, consistent estimator
Estimation for Gibbs States
the
§1.
Introduction
Gibbs states have played an important role in recent developments of
spatial statistics and image processing (cf. [1], [3], [11]).
a
For instance,
two-dimensional picture may be described as a continuous region being
parti tioned
into a
fine
rectangular array of
si tes or picture elements
("pixels"), each having a particular color, lying in a prescribed finite set.
Most researchers assume there is a probabi Ii ty distribution on the set of
pictures, and choose Gibbs states as models for this distribution.
Gibbs
mechanics.
states
were
originally
introduced
as
models
in
statistical
A particle on each site of a two-dimensional integer lattice may
represent the "spins" of a magnet, and there usually are interactions between
particles at different sites.
sites.
Gibbs
configurations.
states are
A configuration consists of particles on all
probability distributions
on
the
set
of
all
In image processing, particles are interpreted as colorings
(pixel variables).
Other intrinsic properties of Gibbs states are preserved.
In statistical context, Gibbs states are parametrized by certain unknown
parameters which affect
the
interactions between pixels.
estimate those parameters based on collected data.
One wants
For our interest,
to
the
data are fini te samples which are larger and larger pieces of an infini te
configuration on a
two-dimensional
lattice from an infini te-volume Gibbs
state (see [12] for its general definition).
The current state of art of parameter estimation for Gibbs states is
reflected by a series of papers of Gidas (cf. [5], [6], [7], [8]).
the
Gibbs
states
are
parametrized
by
elements
in a
Assuming
finite-dimensional
-2-
Euclidean space,
Gidas constructed
the maximum
likelihood estimators and
showed that they are (i) consistent even at points of phase transitions; (ii)
asymptotically
condi tions.
normal
He also
and
asymptotically
investigated
efficient
under
appropriate
the asymptotic bias and variance,
and
studied the estimation problem based on partially observed data (degraded
images) .
As Gidas pointed out in his papers, one interesting open problem is to
estimate certain infini te-dimensional unknown parameters which parametrize
Gibbs states.
In this case,
fails in general.
consistency of maximum likelihood estimators
Therefore new ideas and methods need to be brought in.
In
Section 3 of this paper, we state a new result in estimating some infinitedimensional unknown parameters for one-dimensional Gibbs states.
give
some
heuristic
arguments
and
make a
few
comments
on
We also
the case
of
e·
multi-dimensional Gibbs states in Section 4.
From the point of view of picture restoration,
people prefer simple
models such as Markov random fields with the nearest neighbor interactions,
because
the
complicated.
models
with
many
Nevertheless,
parameters
may
be
computationally
too
it is still worthwhile considering models with
infinite-dimensional parameters as the first step when we do not have much
information about the true model.
The estimators proposed in this paper may
be more robust and easier to compute than estimators of maximum likelihood
type.
. §2.
Further discussion in this respect will be in Section 4 .
Basic model and notation
We adopt the following general. (binary) Ising model. to illustrate the
estimation
problems
under
investigation.
multi-dimensional Gibbs states is given in [12].
The
general
framework
of
-3-
Let
z2
be the two-dimensional integer lattice.
With each pixel i €
z2,
we associate a random variable Xi taking values in the set {-l,l}, i.e. Xi
represents the coloring of the pixel i.
-1 and 1 may stand for "black" and
"white" respectively.
A sequence of non-negative numbers J
= {J(k),
k €
z2}
specifies the pair
interactions so that J(i-j) represents the interaction between the pixel i
and pixel j.
We assume that
~ 0,
(AI)
J(k)
(A2)
J(i-j)
(A3)
k~ J(k)
z2;
= J(j-i),
Apparently,
invariant.
Vk €
< 00.
the
V i, j €
z2;
(symmetry)
(summability)
interaction function
J
so defined
is also
translation-
An interaction J is said to have finite range if J(k) = 0 for all
k wi th sufficiently large Euclidean norm IIkll.
For our interest, J should
have infinite range.
For each configuration x
=
(xi' i €
z2)
and each particular pixel I!,
define the energy
(2.1)
=h
where h €
mis
xI! + ~
I!
to
J(i-I!)xixl!'
called the external field coefficient and
inverse temperature.
pixel
~
i~1!
the
) 0 is called the
HI!(x) can be interpreted as the contribution of the
total
(pair
interaction)
energy
configuration x.
The local characteristic at pixel I! is given by
(2.2)
~
associated
with
the
-4-
where Z =
Z(~;
~
(2.3)
~ ~)
xi' i
P2(x)
is the normalization such that
= 1.
x~
The two-dimensional Gibbs state
o = {-1,1}z2
such that if
~, then for every ~
X= (X.,
z2 and x
€
1
€
~
is a probability measure on the space
i €
z2)
has the probability distribution
0
So P2(x) is the conditional probability (under
~
is x
2
determine
~
J+l
each
€
finite
S) denote
every finite S
(2.5)
€
z2,
x
~
=
~
are x ' j
€
O} need not uniquely
j
~~.
Note
(possibility of phase transition).
subset S of
z2,
subconfiguration of x restricted to S.
(x . . , j
that the coloring at pixel
given that the colorings at pixels other than
that the local characteristic P ~ {P~(x), ~
For
~)
~(~+i
Cz2,
~
translated by i.
i €
z2 and
~
let
(x '
j
And for each i €
~
j
€
z2,
S)
be
the
let ~+i
=
is said to be stationary if for
~
= ~) = ~(~=~).
In this case the local characteristic P is also translation invariant in the
sense that
Note that the translation invariance of P does not imply the stationarity of
~
(possibility of sYmmetry breakdown).
The Gibbs state
~
is parametrized by h, 13 and J.
If h, 13 are unknown
and J is known, that would be a special case of the model Gidas considered.
e·
-5-
In this paper, we assume h,
~
are both known and J is unknown, which is an
infinite-dimensional parameter.
then X
A
contains n
2
Let A
be an nxn symmetric square in
n
observations.
7?,
We consider the following estimation
n
problems:
(1)
Estimate the local characteristic P = P(J);
(II) Estimate the interaction function J.
Since
(2.5)
implies
(2.6),
by assuming
the
true Gibbs
stationary we only need to estimate the function POe -)
observed that under our assumptions, for every finite A C
probabilities I.L(XA=xAlx
=x
AC AC
state I.L
in (I).
7?
is
It is
the conditional
), x E 0 can be derived from P (-).
O
A random function Tn on 0 constructed from X
A
is called a (strongly)
n
consistent estimator of P (-) if for every J
O
(2.7)
sup IT (y) - Po(y)
yEO
n
I ~ 0,
a.s.
under I.L as n
~ 00.
Such an estimator T was constructed in [10] for one-dimensional Gibbs states
n
when J satisfies certain conditions.
The results will be sketched in Section
3.
The problem (II) is more subtle than (I) because J does not have a clear
probabilistic interpretation as P has.
In Section 4 we will give some ideas
how to estimate J based on the estimator Tn for P (-).
O
§3.
EstiDllting local characteristics for one-dimensional Gibbs states
One dimensional Gibbs states themselves are important in topological
dYnamics (cf. [2]).
series.
They may also be used as models in categorical time
In this section some results for the problem (I) are stated in the
context of one-dimensional Gibbs states.
Hopefully, they could shed light on
-6-
the multi-dimensional cases.
All proofs are omitted.
Further details can be
found in [10].
For convenience we let the configuration space be
The results in this section can also be extended to other cases:
(i)
Z = {-1.1}z2. the set of all two-sided sequences;
(ii)
z~ (resp. zA)' the set of all one-sided (resp. two-sided) sequences in
which
transitions
between certain
states
of
their
coordinates
are
not
allowed.
Let X
=
(XO.X 1 .... ) be a
distribution~.
(3.1)
stationary sequence wi th
the probabili ty
Assume that the interaction J satisfies
J(n) ~ C pn. V n € ~
and some
C
> O.
P € (0.1).
Then for every h.P.J there uniquely exists a Gibbs state~.
Moreover. the
following "uniform mixing" condition holds.
LeIllDa 3. 1.
Let ~m-1 be the a-field generated by (XO'···.X 1); ~
be the
mm+n. oo
a-field generated by (Xi' i ~ m+n). Then there exist constants C > 0 and
a € (0.1) such that
(3.2)
uniformly for all A € ~m-1' B € ~
and all m.n € ~.
m+n. OO
The local characteristics at 0 now are written as
PO(e)
is a unlmown function when J
is unlmown.
XO.···.Xn _ 1 . we want to construct Tn such that
(3.4)
sup+ ITn(y) - PO(y)
y€z
I ~ O.
a.s. under ~ as n ~
00.
Given observations
-7-
In (3.3) P (·} may be
O
This is just the one-dimensional version of (2.7).
regarded as an infinite-step backward transition function. which suggests the
folloWing plan to construct T .
n
We first approximate P0(·) by a
sequence of
finite-step
transition functions {~(xOlxl ....• xm_l)' m €~. x € !+}.
(backward)
Then at each stage
m we estimate ~(xO Ix ..... x - ) by "sample transi tion function" which is a
l
ml
ratio of two empirical measures.
Based on convergence rate of estimators. we
argue that the "step-length" m should grow like clog n. where the constant c
is determined by certain bounds for the sequence J and its tail parts.
+
Define the forward shift operator a : };
+
0.1 ..... for x
€};.
+
-+};
by (ax)n = x + ' n =
n l
Then we have
Construction of T
n
Given XO •...• X _ . we define n periodic sequences ajX(n). j=O.l ..... n-l.
n l
where
For each y
+ and m
€};
<n
=
n-l
}; I{(ajX(n}}k = Yk' k=O.l •...• m-l} •
j=O
N(n) (y) =
m-l
where
(ajX(n)k
let
n-l
}; I{(ajX(n»k = Yk' k=l •...• m-l} .
j=O
represents
the
k-th
Finally. define
N(n)(y)
m
(3.5) Tn(Y) -_ {
Nm(~l) (y)
o
•
otherwise
coordinate
of
the
sequence
ajX(n).
-8-
N{n) (y)
m
also wri tten as
frequency" of
Yo
slots 1, ...• m-I.
N{n) (y)
/
m-I
n
is
n
the "sample condi tional
appearing in the slot 0 given that yl ....• ym-I appear in the
The next two theorems show that under certain conditions T
n
satisfies (3.4).
Theorelll 3.2.
Suppose there are two knoun constants K
>
0 and p
€
(O, I)
satisfying
(3.6) sup
n€IN
~
~ K.
n
p
Then there exists c € (O.I) unich on1.y depends on K and p such that T
n
satisfies (3.4) if we choose
(3.7)
m = [c log n] (the integer part of clog n).
Theorelll 3.3.
If we drop the assumption (3.6) and choose a sequence of
constants {cn . n €
rate (e.g. c
n
m} such that c n ! 0 as
satisfies clog n ~
n
00
as n ~
n ~
00).
00
with an arbitrari1.y s1.ow
Set m
=
[c
n
log n].
Then
Tn a1.so satisfies (3.4).
Note that (3.2) is a very strong mixing condition. under which X can be
decomposed into two nearly iid sequences by the standard "blocking" and
"spli tting" technique.
Hence some classical large deviation resul ts hold.
However. (3.2) need not be necessary for constructing consistent estimators.
Future improvement is possible.
§4.
Calments on the two-dimensional case
For the problem (l). we may just imitate the construction of T given in
n
Section 3.
Without loss of generality. assume that the origin 0 is either at the
center of the square A (when n is odd). or near the center of
n
A {when n is
n
-9-
Let B be an mxm symmetric square where
m
even) .
m = the greatest odd integer less than or equal to
(4.1)
c € (0, 1) is a cons tant .
Jc
log n,
We let 0 also be the center of B.
m
A can be
n
partitioned into k subsquares B(l), ... ,B(k), plus some remainders near the
edge of A.
n
Each B(j) is of size mxm, j=l, ... ,k and B(l) = B.
m
Accordingly
X consists of sub-configurations XB(j)' j=l, ... ,k plus the remainders.
An
For every y € 0, let
=
k
~
j=l
= YB
I{X (j)
B
}
m
(n)
N
(YB \.{O})
m
where t. is the center of B(j), j=l, ... k.
J
l·f
(4.2) Tn(Y)
= {
N(n)(
We conjecture that T
n
YB \.{O}
)
>0
m
N(n)(YBm,(O})
o
Define
otherwise.
would satisfy (2.7) if J satisfies certain conditions.
What kind of conditions should it be?
Unlike the one-dimensional case, here
it is not enough to just specify the tail behavior of J.
and f3
> f3c (f3c
,
= inverse of critical temperature),
For example, if h=O
then phase transition
occurs even when J is the nearest neighbor interaction (i.e. J(k) = 0 for all
k satisfying IIkll
because
~
>
1).
In this case,
the consistency of T
n
is not unique and all mixing properties break down.
Gibbs random fields have long-range dependence.
would fail
Actually, the
The similar construction
might sti 11 provide consistent estimators i f we assume that h,
f3 and J
correspond to an extremal Gibbs state and adjust the "grid size" m from
-10-
[v'c log n]
However.
to something smaller (e.g.
all
A
m = [c{log n) ].
A
these speculations seem to be premature for
>
0
is small).
the time being.
Some work related to those open problems is in the process of development.
So far we have not discussed how to solve the problem (II).
do it is to derive the estimator J{k) of J{k)
estimator Tn of PO{·),
(k €
7?
One way to
k #- 0)
from the
Notice that J can be determined by PO{·)
in the
following way.
For every x €
=h
(4.3) -a{x')
n.
+ {3
let x' = (x .• i €
1
7?
i #- 0) and
J{k)~.
};
k#O
xoa{x' )
e
PO{x) = -ea-{7-x-:'~)-+-e --a-:{r""x"";"'' ).
Then
Set x o = -1 and denote the corresponding x by x~1'
e-
We obtain
(4.4)
Furthermore. let y. z €
n
such that Yo
= -1.
Yi
= 1.
i #- 0; and ~
= 1.
zi = -1. i #- k.
By setting x~1 = y and x~1 = z respectively in (4.4) we have for every
k €
7?
k #- 0
If we have
the estimates of PO{y) and PO{z).
then (4.5) will produce the
estimate of J{k).
There could be a more direct way to estimate J based on Grenander' s
sieves method (d.
[9].
[4]).
If each sieve is generated by only fini te
number of parameters J{k). then on each sieve we may try to find the maximum
likelihood estimators and see if their limit yields a consistent estimator of
-11-
J when the sequence of sieves tends to the whole parameter space.
of
this approach
is
A drawback
that usually the maximum likelihood estimators are
computationally intractable if there are many parameters.
method based on counting frequencies looks simpler.
While our previous
Actually, that is also
one way of choosing sieves.
Acknowledge.ent
The author wishes to thank Professors Basilis Gidas and Steven Lalley
for illuminating discussions and suggestions.
References
[1]
Besag. J. (1986).
discussion).
[2]
On the statistical analysis of dirty pictures (with
J. Roy. Stat. Soc .. Series B. 48. 259-302.
Bowen. R. (1975).
Diffeomorphisms.
EquiLibrium States and the Ergodic Theory of Anosov
Lecture Notes in Math. 470.
Springer-Verlag. New
York.
[3]
Geman. S. and Geman. D. (1984).
Stochastic relaxation. Gibbs
distributions. and Bayesian restoration of images.
IEEE Trans .. PAMI-6.
721-741.
[4]
Geman. S. and Hwang. C. (1982).
Nonparametric maximum likelihood
estimation by the method of sieves.
[5]
Gidas. B. (1986).
Annals of Statistics. 10. 401-414.
Consistency of maximum likelihood and pseudo-
likelihood estimators for Gibbs distribution.
Proceedings of the Work-
shop on Stochastic DifferentiaL Systems with AppLications in ElectricaL
/Computer Engineering. ControL Theory. and Operations Research. IMA.
University of Minnesota. 129-145.
[6]
Gidas. B. (1987).
Parameter estimation for Gibbs distributions. I:
fully observed data.
[7]
Gidas. B. (1988).
Tech. Report. Division of Appl. Math .. Brown Univ.
Parameter estimation for Gibbs distributions. II:
partially observed data.
Tech. Report. Division of Appl. Math .. Brown
Univ.
[8]
Gidas. B. (1988).
Asymptotic bias and variance of maximum likelihood
estimators at points of phase transitions.
Appl. Math .. Brown Univ.
Tech. Report. Division of
[9]
Grenander. U. (1981).
[10] Ji. C. (1988).
Abstract Inference.
Wiley. New York.
Estimating potential functions of one-dimensional Gibbs
states under constraints.
Tech. Report 88-13. Dept. of Stat .. Purdue
Univ.
[11] Ripley. B. (1988).
Statistical Inference of Spatial Processes.
Cambridge Univ. Press. New York.
[12] Ruelle. D. (1978).
Massachusetts.
Thermodynamic Formalism.
Addison-Wesley. Reading.
© Copyright 2026 Paperzz