A Maximization Technique Occurring in the Statistical Analysis of

A Maximization Technique Occurring in the Statistical Analysis of Probabilistic Functions of
Markov Chains
Author(s): Leonard E. Baum, Ted Petrie, George Soules and Norman Weiss
Source: The Annals of Mathematical Statistics, Vol. 41, No. 1 (Feb., 1970), pp. 164-171
Published by: Institute of Mathematical Statistics
Stable URL: http://www.jstor.org/stable/2239727
Accessed: 28-04-2015 09:37 UTC
Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at
http://www.jstor.org/page/info/about/policies/terms.jsp
JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content
in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.
For more information about JSTOR, please contact [email protected].
Institute of Mathematical Statistics is collaborating with JSTOR to digitize, preserve and extend access to The Annals of
Mathematical Statistics.
http://www.jstor.org
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
TheAnnalsofMathematicalStatistics
1970, Vol. 41, No. 1, 164-171
A MAXIMIZATION TECHNIQUE OCCURRING IN THE
STATISTICAL ANALYSIS OF PROBABILISTIC
FUNCTIONS OF MARKOV CHAINS
BY LEONARDE. BAUM,TED PETRIE,GEORGESOULES,AND NORMANWEISS
Institute
and
for DefenseAnalyses,CaliforniaInstituteof Technology
ColumbiaUniversity
1. Introduction. This paper lays bare a principle which underlies the effectiveness
of an iterative technique which occurs in employing the maximum likelihood
method in statistical estimation for probabilistic functions of Markov chains. We
exhibit a general technique for maximizing a function P(Q) when P belongs to a
large class of probabilistically defined functions.
In the earlier note [2], an application of an inequality was made to ecology. The
more general approach of this paper has allowed the use of a certain transformation
and inequality in maximizing likelihood function in models for stock market
behavior [3] and sunspot behavior.
problems in weather prediction.
We also expect to apply these techniques to
Let A = (aij) be an s x s stochastic matrix. Let a = (ai), i = 1,
s be a
probability distribution. For each i = 1, -., s let fi(y) be a probability density:
dy = 1. For thetripleA, a, f = {fj} we definea stochasticprocess{ YJ} with
Jfi(y)
density
(1)
P(A,a,f){Yl
= Yl, Y2= Y2
=
Ylo,il,
YF = YT}
* * ',
iT=
1
ai.
aiT
aiOifi1(yj)aili2fi2(Y2).
For convenience we denote thisexpressionby Py, ...
YT(A,
liTfiT(YT).
a, f).
We call tht, process Y = { Y,} a probabilistic function of the Markov process
{X } determined by A. If a is chosen as a stationary distribution for the matrix A
then Y will be a stationary stochastic process.
Let A be an open subset of Euclidean n space. Suppose that to each A)e A, we
have a smooth assignmentA -* (A(i), a(A),f(A)). Specificallyeachfi(A, ) is a density
in y and for each fixed y is a smooth function in A. Under these assumptions, for
=
each fixed Y1, Y2,
, YT, PYI .YT()
a(i), f(A)) is a smooth
YT(A(4),
PY.
function of A. Given a fixed Y-sample y = y 1
, YT we seek a parameter value A'
which maximizes the likelihood
PY(A) =
PY...
YT(A) determined from A(i),
a(i),
f(A) by (1).
One might suspect from the complicated nature of the expression (1) for
a, f) and the difficult analysis of maximizing this function of A for very
PYl ... YT(A,
special choices of f presented in [2], [8] that a simple explicit procedure for
maximization for a general f would be quite difficult; however, this is not the case.
There is an extremely simple feature of this function which under mild hypothesis
on f enable us to define a continuous transformation
mapping A into itself with
Received March 24, 1969.
164
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
165
PROBABILISTIC FUNCTIONS OF MARKOV CHAINS
the propertythat Pyl ..y(.())
YI
>
PY1...
YT()
unless A is a critical point of
' 'YT(A).-
Theorem2.1 is the simplebut importantprinciplewhichis applied in defining
of Sk(i) as k -X o are discussed
ST. The convergence
thetransformation
properties
in Proposition2.2 and following.In Section3, we transform
Theorem2.1 intoa
formsuitablefor applications,viz., Theorem3.1, and show that the methodis
but not to
applicableto theNormal,Poisson,Binomialand Gamma distributions
A featureof the firstthreeexamplesis that the transthe Cauchy distribution.
formation -(A)is explicitlypresentedas a functionof the observationsYl i YT
In each case, thetransformation
and itsiteratesarepractically
and theparameters.
computable.(See [3] fora detailedillustrationof the case in whichfi(A,-) is the
normaldensity.)In Section4, we provethatthe methodis applicableto location
and scale parametersforgeneralstrictly
log concave densityfunctions(Theorem
theMarkovprocess.
4.1) and to thecoordinatesofthestochasticmatrixdefining
Let X = {1, ..., 5}T and x = {x1,
2. An inequality.
of
the
form:
(1) ofA is
P(A) = ZxExp(X, A),with
p(x, A)= a(A) HT= 1 aX-
IX(A)fX(
,XT}
X. Thenthefunction
Yt)
More generallylet X be a totallyfinitemeasurespace withmeasure,u.Letp(x, A)
be a positivereal-valuedfunctionon X x A, whereA is a subsetofEuclideanspace,
whichis measurableand integrablein x forfixedA. Let P(A) = fxp(x,A)dy(x) and
A)logp(x,i') dy(x).
Q(A,i') = f?xp(x,
THEOREM 2.1. If Q(A,A) > Q(Q,A)thenP(Q) > P(A). The inequality
is strictunless
d1i(x).
p(x, A) = p(x, A)almosteverywhere
PROOF.log tis strictly
concavefort >
logP(;)/P(A)
0 sinced2/dt2(log
t)
t-
2
< 0. Hence
Iogfxp(x,;) d1(x)/P(A)
= log fx[p(x, A)dIW(x)/PQI)]p(x,
i)/p(x,A)
=
=
Jx[p(x, x) dii(x)/P(D)] log [p(x, A)/p(x,
A)]
f
I
(P(A) )- [Q(
,) - Q(A,
))] > 0
by hypothesis.For the firstinequalitywe have used Jensen'sinequalityforthe
probabilitymeasure dv,(x) = p(x, A)dp/
(x)/P(2). This inequalityis strictunless
p(x, X)/p(x,A) is constantalmosteverywhere
dvj(x) henceunlessp(x, A) = p(x, )
almosteverywhere
d,i(x).
in i for almostall x.
differentiable
PROPOSITION
2.1. Let p(x, A) be continuously
Let Y(Q) be a continuous
map ofA -* A suchthatforeachfixedi, Y(i) is a critical
pointof Q(Q, 2') as a function
ofi'. ThenallfixedpointsofY are criticalpointsofP
and if moreoverP(Q(i)) > P(2) unlessY(2) = 2, all limitpointsof -n(20) are
fixedpoints
of5- forany)0 e A.
PROOF.aP(2)/I I A-= Q(2, i')/a27 IA . Thus 2 is a criticalpointofP iffit is
a criticalpointof Q(Q, 2') as a functionof 2'. Hence by the hypothesison 5, if
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
166
LEONARD E. BAUM, TED PETRIE, GEORGE SOULES, AND NORMAN WEISS
AthenAis a criticalpointofP. Suppose thatg-i(AO)convergesto A.Then
'(Io) convergesto s(o); s p(yfli(AO)) _ p(gfni + I(AO)) < p(,$ni+ 'G%o))and
P(Q) < P(67-(Q))< P(t). Thus P(A) = P(Y(Q)) and sincethisequalitycan onlyhold
if Y(A) = A,thisshows that: all limitpointsof the sequence {g-n(0)} are fixed
pointsof87.
9(A)
=
fni+
87 and applications.The remainderof this paper will
3. The transfornmation
consist of applicationsof Theorem 2.1 in various situationsof interest.To
a given P(A) = fxp(x,A)dy(x) there is naturally associated the auxiliary
function:Q(A,i') = fxp(x, A)logp(x, i') dl (x). If we defineY: A -* A by 9I(A) =
{ eA Imax>,t'Q(/.,
.A')= Q(R, A)} then Q(Q, (1(A))? Q(Q, A) so Theorem 2.1
A
onsPwe willseethat
guaranteesP(9{(A)) > P(A). Undernaturalhypotheses
(i) existsand is singlevalued;
(ii) Y is continuous;
computable;
(iii) is effectively
(iv) P((1(A)) > P(A) save whenA is a criticalpointofP in whichcase A is a fixed
pointofS.
if P has finitely
manycriticalpoints,forexample,it followsfrom(iv) thatfor
each A not a criticalpoint,Ygn() approachesa criticalpointof P whichwill be a
local maximum(save forstartingpointsA in a lowerdimensionalmanifold).We
on
cannotruleout limitcyclebehaviorof theiteratesSF" withoutsome hypothesis
thecriticalpointsetofP as above.
Let A = RK. For almost all x let log p(x, A) =
1 log pi(x, x;) where for each i
concavein Aiand limp,1,, logpi(x,Ai) =
and almost all x log pi(x, Ai) is strictly
- oo. DefineQ (), Ai')by
Q(, Ai) = fxp(x,A)logpi(x,Ai')dp(x).
oo at
concave functionof Ai' which Then for A fixed,QjQ, Ai') is a strictly
+ oo and hence has a unique global maximumAj, which is a criticalpoint of
A-3 = {Jj}.
QiQ, Ai'). Define9: A
-
3. A P($-(A)) > P(A) with
THEOREM 3.1. Underthe above assumptions,
for all A
P
or
is afixedpointofg.
is
a
and
if
of
equivalently
criticalpoint
onlyif.A
equality
PROOF.
Q(Q,A)
El= 1 Qi(, ii),> El= 1 NA,
i) = QGa,
A)
concavein
so Theorem2.1 impliesP(A) ? P(A). Sinceforeach i Qi(;, ii) is strictly
A) (and hence the inequalityP(l) _ P(A)) will be
)i the inequalityQ(Q, A)_ QQ3,
*
strict
unless)i = A foreachi. Thislatteris thecase iffOQi/O/i' = 0, i
S;
apa
i.e.,re
isA=pli= o, t iha,ns.
Here is a sample of the inequalities thatcan be analyzed via our technique.
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
PROBABILISTIC FUNCTIONS OF MARKOV CHAINS
PROPOSITION 3. 1.
then
Ify, t = 1,
167
and cx =
, T is a sequenceofrealnumbers
CXl .. .XT
P(1) = ExcxH7=lexp(- Ix- YtI)
at a pointA whereeach i is someYt,*
assumesitsglobalmaximum
PROOF.
We have Q(A,i') = TiQi(;, Ai') whereQi(;, Ai') = -1y(i)|Ai'-y,
c as -Ai oo.
yt(i) > 0. Then Qi(G, Ai') -)
Qi(A,ii')
=
E yi(t)sgn(Ai'-y),
with
Ai'#anyYt,
is monotonic
decreasing
fromoo at Ai'=-oo to - oo at Ai'= oo, constantin
intervals
between
Yt,and changessignat someYt= i. We thenhaveQi(A,1i) =
max.'Qi(Al Ai,) Qi(A, i). By Theorem2.1 P(1) ? P(A). Now if Ai is any parameterwhichappearsin P(A) witha non-zero
coefficient
cx thenP(A) -+ - oo as
that
maximum
so
assumes
its
at
some
2. ButY(A) =A is a
oo
P(A)
point
Ail
pointwitheach) a yt,andP(1) ? P(A)= maxP(A).
Applications
of Theorem3.1 to familiarprobability
distributions.
Let X =
)
{1,
*
and x = {x1,
, S}T
, xT}eX.
For i = 1,
, s let fi(r) be a strictlylog
concaveprobability
function.
For eachi we obtaina oneparameter
density
family
of density
functions
by introducing
Aias a locationparameter:
fi(r- i). For a
and P({)
givenfixedreal sequenceYl, ,YT let p(x,)=
HT=
1logpi(x,Ai)where
Ex.xP(x,i)yi(x),i(x) > 0. Thenlogp(x,A)=
fX(yt-AXt)
logpi(x,Ai)= Es"i iog fi(yt-Ai)
withSxidenoting
thesetoft suchthatxt= i. Alsologfi(yt
concave
-Ai) is strictly
in AiforeachYtand -+ - oo as JI-+ ? sincefi is a probability
Hence,
density.
Theorem
3.1isapplicable.
Theorem
3.1 appliesstraight
thedensity
awayto thesethreeexamples:
fi(, y) =
cexp(-kJAi_-yJ),
onthe
E > 1,k > 0 and - ?? < Ai< oo,thePoissondistribution
non-negative
integers
and0 _ Ai< oo andthebinomial
distrifi(A,y) e-AiAiy/y!
_ i)N-Y and 0 < Ai< 1. In
butionon theintegers
0, 1, , N,fi(t,y) = (yN)1Aiy(I
eachcase themethodis thesame.We determine
thetransformation
9Ybysolving
fortheuniqueAi'zeroof a/a)i'Qi(A,
explicitly
i') = 0. For example,in thefirst
case9(A)i = Ajistheuniquezeroof
ExE-xrlt=I exp - Ilxt
-Yt E)%(x)
ES,
i'i-
yt
ForE = 2 wehavetheimportant
caseofthenormal
density.
Ineachofthesethree
cases9(A) hastheexplicit
form
(ii=[XEx =
[Z~
e x 4u(xJ7tL
=Y~1[Z
1 Vi(X)fxt(Axt, Yt)]
(X) fITnf
i( )xt(
*[Ex E_x iix)
=1n()X(XY)
_X_,(X)
fIT=
'Y )]
wherevi(x)= Esxiyt
andni(x)is thenumber
ofelements
inSxi. In themostimpor-
tantcase wherej(x) = ax1aXX2
...
aXT 1XT'
g-(4)i=[T=Tyt(i)yt][t= 1Yt(i)]
I
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
LEONARD E. BAUM, TED PETRIE, GEORGE SOULES, AND NORMAN WEISS
168
whereyt(i) may be obtained throughbackwardand forwardinductivecomputationsin t. We haveyt(i)= act(i)flt(i)
where
cxt(j) =
11 ..
, s, t
= 2
1 at-
1(i)aij bj(Xj,y,),
,,. ~T
fAt(i)=
s=,1Aht+1(j)aijbj(Xj,yt+
1),
as the Alikelii=1, * s, t = T- 1, , l. y,(i) has a probabilisticinterpretation
hood oftheevent{ Y1 =Y, Y2 =Y2
, YT = Y7 Xt = i}. See [6].
The analysisof the case of the gamma densityis somewhatmorecomplicated.
In particularwe do nothave an explicitexpressionfor9(A). NonethelessTheorem
3.1 stillapplies.Hereis a discussion.
The expressionf,,(r)= (F(v))- locvrvle-2r, cx
> 0, v > 0 definesa twoparameter
familyof densitiesforr > 0. Here 1(v) = S tv-le-tdt is thegammafunction.Let
and let
{lci,vj: i = 1, ..., s} be unknownparameters
P(x,
Q(, v, c',v') =
j
8u(X)H[=
lt2X=
V)->
vj)(x) 1= I
x p(x,aX,
-Ei=
p(x, cx,
x vu(x)
and
Sxi logf,,viYt),
I Qi(cx,
v, cxi',
vj').
Let ac,v be fixedthroughout.
Qi(cx,v,cxi',vi') -
oc as (i',vi' -* 0 or oc. We write
Qi(cx, v, LcXi,vi1) = ZT= I Yt(p) log fa8vjYt).
For cx,v fixed,one provesusingtheinfiite
expansionforlogF(v) [1] page 16, that
the Hessian of the Qi functionwithrespectto the primecoordinatesis negative
definite.(The Hessian of a functionF(x1, -.., xn) is the matrix:(a2F/Oxiaxj).)
Hence Qi(c, v,cxi',vi') is strictly
concaveas a functionof thetwo variablescxi',
vi'.
Since Q i(c, v,cxi',vi') -+ - oo as vi',cxi'go to the boundary,we conclude that
Q (Lx, v,cxi',vi') has a uniquecriticalpointai, vi whichis a global maximumand is
obtainedas a solutionof aQi/a0i' = 0, aQilavi' = 0. As usual P({Xi, vi}) > P(cx,v)
save at criticalpoints(ax,v) ofP.
in
The Cauchy densityyields a counterexamplewhich shows the difficulties
applyingTheorem3.1 to a widerclass of densityfunctions.Iff(r) = z -l(1 +r2)-f
thend2f/dr2= -27r-'(l-3r2) (I +r2)-3 is onlyconcavefor3r2? I so we cannot
proceed as in the precedingexamplesfor the one parameterfamilyof Cauchy
densitieswithlocationparameter:f(A,r) = 7i- 1( + (r- A)2) - 1. On the contraryit
is easyto see thatthefunction
a
-,
2T
QN(A,
Ai')Z= L
Y
i
(i)l + (y _Ai,)2
need not have a unique zero forsuitable {Yt}. For A fixed,one of thosecritical
pointsof Qi(A,Ai1)willprovidea globalmaximumof Qi(A,Ai')and hencea suitable
sucha globalmaximum.
procedureis neededforfinding
Ajbutsomefurther
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
169
PROBABILISTIC FUNCTIONS OF MARKOV CHAINS
.- is applicable.Witha given
4. Some generalsituationswherethetransformation
f(u) we can obtaina two parameterfamilyf"g(u) by
probabilitydensityfunction
introducinglocation and scale parametersm and a > 0 respectively:
fmj(u) =
* fs consider
a 'f((u- m)/a).Givendensityfunctionsf1,
P(m,ca)=P({Mi, Ci3)-;
wherep(x) > 0, and again X = {1,
thentheauxiliaryfunctionsQi are
*,
xeX
s}IT.
T
()I-Xt(-
f/ Yt_M\,
t=IXt
Xt
Let vx(m,a) be the summandof P;
a,mi',
a) ES [logf((Yt- mia')
QN(m,
ai()= ExvX(m,
-
loga(]
andQ(m,a, m',a') =El= 1Qi(m,a,mi', ai').
P if
carriesoverto suchfunctions
showsthatourprocedure
The nexttheorem
thateachfi is strictly
log
onlythe naturalhypothesis
we require(essentially)
concave.
function
log concavedensity
THEOREM4.1. For i = 1, , s letfi(u)be a strictly
at u = 0, andassume
foreverystatei thefollowing
withitsuniquelocalmaximum
holds:3 times
condition
non-pathology
t, = tj(i), n = 1,2 suchthaty,,#yt2 and
x:xtn=
i
vX(m,a) > 0, n = 1, 2. Thenforfixed m and a, the systemof equation
is theglobalmaxiaQ/ami'= 0, aQ/@ai'= 0 hasa uniquesolution
{ii', ai'} which
Y(m, a) = {Fii', vi'} we
ofm' anda'. Defining
mumofQ(m,a, m',a') as afunction
point
unless(m,a) is a critical
inequality
thenhaveP(.9(m,a)) _ P(m,a) withstrict
.9(m,a) = (m,a).
ofP orequivalently
= aQi/lmi'
the
and aQ/aaj'= 8Qil/ai',we mayconsider
PROOF. SinceaQ/am1'
For notational
we suppress
thedepenconvenience
functions
Qi independently.
i andtheprimes
denceof Qi on m= {mi}anda = {ai}, thendeletethesubscripts
is provedby showingQ(m,a) -x - cc as
as well.In suchnotationthetheorem
60 ofQK(Iml
< oo,0 < a < so), and thatQ has a
theboundary
(m,a) approaches
Now
pointinQ whichisa localandhenceglobalmaximum.
uniquecritical
Q(m,a) = Ex vX*>sji [logf(zt)- loga]
wherezt
=
(yt -
m)/a.Ifyt =
yt(i)
Q(m,a)
=
=
Esx,vx,
then
Zf= yt[logf(zt) - loga].
- 0, and thatthisis indeedthecaseas
Now Q(m,a) -s*-c iffnHT f(zt)Yj1a'Yt
.3Q
function
on
rests twofacts.Namely,
a strictly
f(t)
(m,a)
logconcavedensity
large,so
smalland Julis sufficiently
is alwayslessthane-a1ulifa > 0 is sufficiently
theabove non-pathology
thatliml,,1tuf(u)= 0 forany n; also it is precisely
of a t = t(m) suchthat
foreverym the existence
conditionwhichguarantees
of Q occurs.The lengthy
zt+ 0 andyt> 0, so thatas a -+0 thedesiredbehavior
butstraightforward
detailsareomitted.
foritis
eachcritical
pointof Q in Q is a localmaximum,
byshowing
We finish
thatQ has
fromMorsetheory)
thenobvioustopologically
(andfollowsrigorously
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
170
LEONARD E. BAUM, TED PETRIE, GEORGE SOULES, AND NORMAN WEISS
only one local maximum,whichis thena unique global maximum.Since thisis
trueforeveryi, an applicationof Theorem2.1 completestheproof.Let g = logf.
Then
a2Q I
am2
a2Q
IlaQ
a2
a
_a
ga+i
1
aE
a2 EY
g (Zt),
1
2
,YtZ
zt (t) + 2Eytztg'(Zt),
and
amOa
a cm + at2E
IN
9(Zt)tZtg
Now 02Q/am2< 0 sinceg" < 0. At a criticalpointaQiam=
determinant
det(H) oftheHessian H of Q(m,a) is givenby
aQ/aa= 0, and the
a4det(H) = St g"(Zt)Et ytZt2g"(Zt) -(ZtYtZtg"(Zt))2 + Et ytg"(Zt)E, ytZtg'(Zt).
Now ug'(u) < 0 since g'(0) = 0, so the thirdtermabove is positive.Since the
functionu2 is convex,thesum of thefirsttwo termsis also positive,whichproves
H is negativedefinite
at a criticalpointand theproofofthetheoremis complete.
It is to be notedthatH is notin generala negativedefinite
form.
COROLLARY 4.1. Let f(u) = exp(-cIuI")a> 1. Then logf(u) = -CIul is strictly
concaveso theprevioustheorem
applies.The specialcase x = 2, the normaldistributionis ofgreatutility.
The uniquecriticalpoint{ij, di} is thensimplycomputed
from:
t(i)]-1
[E=Et(')yt]EZtM
1=
[t
yt(i)Yt2]
[t
yt(i)]
-
ii
Since yt(i)is proportionalto the a posterioriprobabilityof beingin the statei at
timet, the reestimation
{hi, U-}has an obvious probabilisticinterpretation
which
led to itsoriginaluse.
.3 on all the coordinates
Application.We finallycarryout the transformation
mentionedin Section 1; i.e., we will include the stochasticmatrixcoordinates
A = (aij) and starting
a = (ai).
probabilities
We considerthesituationin whicheach YtE (1, 2, **, m) and eachfj is a probain (1, 2, , m) ratherthan a probabilitydensityon the reals
bilitydistribution
sincethiswas thecase originally
dealtwith[2], [6]. The followingmutatismutandis
applies to the case whereeachfj is a centeredlog concave densityfunctionas in
Theorem4.1.
LetP(a, A,f)) =
cxp(x,a, A,f) where
p(x,a,A,f)
andyte{l,2,
,m}.Then
Q(a, A,f,a', A',f')
=
=
axH1
ax.fl
Ex p(x,a, A,f) {log a', + Et loga' ,x +
Dlogf,e(Y,)}
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions
PROBABILISTIC
FUNCTIONS
171
OF MARKOV CHAINS
is an extremely
simplefunctionof thevariablesa', A' andf'. In factan elementary
computationshowsthatthereis a uniquea', A', J'whichmaximizesQ as a function
oftheprimedcoordinates.We find:
a, =
a,A,j)][EXp(x,a,A,J)
LLxo pXx,
_1= [Z,p(x, a, A,f)Zt_
fj'(k)=[Ex p(x, a, A,J) EXt
t
j,y =
I
][Exp(x, a, A,f)Ext
k1 ][X
p(X,a, A,f) Ext=j
? 1 1]-1
1]1
Hence, the transformationY: {a, A,f} -- {a-,A,f} increases the function
P(a, A,f). Strictinequalityholds save at fixedpointsof or criticalpointsof P
suitablyinterpreted.
REFERENCES
andWinston,
NewYork.
Holt,Rinehart
(1964).TheGammaFunction.
[1]
ARTIN, EMIL
[5]
BAUM, LEONARD E.
[2] BAUM,LEONARD E. and EAGON, J. A. (1967). An inequalitywith applications to statistical
estimationforprobabilisticfunctionsof Markov processes and to a model forecology.
Bull. Amer.Math. Soc. 73 360-363.
[3] BAUM, LEONARD E.; GAINES, STOCKTON; PETRIE, TED; and SIMONS, JAMES. Probabilistic
modelsforstockmarketbehavior.To appear.
[4] BAUM, LEONARD E. and PETRIE, TED (1966). Statisticalinferenceforprobabilisticfunctionsof
finitestateMarkov chains.Ann.Math. Statist.37 1554-1563.
and SELL,
on manifolds.
forfunctions
transformation
GEORGE R. Growth
To appear PacificJ. Math.
forprobabilistic
procedure
estimation
[6] BAUM,LEONARD E. and WELCH, LLOYD. A statistical
functionsof finiteMarkov processes. Submittedfor publication Proc. Nat. Acad. Sci.
USA.
[7 ] PETRIE, TED. Theoryof probabilisticfunctionsof finitestateMarkov chains. To appear.
[8]
manuscript.
Unpublished
ROTHAUS, OSCAR. An inequality.
This content downloaded from 193.0.96.15 on Tue, 28 Apr 2015 09:37:06 UTC
All use subject to JSTOR Terms and Conditions