On the Probability Distribution on a Compact Group. I.*
By Yukiyosi KAWADA and Kiyosi ITO.
(Read April 2, 1940.)
In the probability theory real or vectorial random variables have been mainly considered. But in order to study in detail the problems of roulette, "battage des cartes,(1)" or dice, treated by Poincaré as Markoff processes, one must establish a probability theory on the substitution group of n letters, on the rotation group of the circle or on the rotation group of the sphere. The case of the rotation group of the circle has been recently studied by P. Lévy(2). Our aim is to generalize some of his results, and to establish a probability theory on the compact separable topological group in general, so that we can treat the above cited problems of Poincaré as special cases.
(*) Monbusyu-Kagakukenkyu: Tokyo-Teikokudaigaku, Rigakubu, Dai nigo Kenkyu.
(1) H. Poincaré, Calcul des probabilités, (1912), p. 301.
(2) P. Lévy, L'addition des variables aléatoires définies sur une circonférence, Bull. Soc. math. France, 67 (1939), 1-41. Cf. also H. Weyl, Ueber die Gleichverteilung von Zahlen mod. Eins, Math. Ann., 77 (1916), 313-352.
Y. KAWADA and K. ITO. [Vol. 22
We have actually two powerful methods in the probability theory: the method of characteristic functions, and that of the function of concentration of P. Lévy(3). In our general case also we can use these methods with some modifications. To use the first method, we have to replace the characteristic functions by "characteristic matrices" based on the Weyl-v. Neumann representation theory of compact groups. In this paper we treat chiefly the Markoff process with this method. In the appendix we give a method of constructing an invariant measure in a compact separable group by means of a Markoff process(4).
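I. Characteristic matrix.
Let $G$ be a compact separable topological group. By a probability distribution $p$ on $G$ is meant a real function $p(E)$ defined for all Borel sets $E$ of $G$ which satisfies the following conditions:
$$p(E) \ge 0, \qquad p(G) = 1, \qquad p\Big(\sum_{n=1}^{\infty} E_n\Big) = \sum_{n=1}^{\infty} p(E_n) \quad (E_i \cap E_j = \emptyset \ \text{for}\ i \ne j).$$
It is well known that for any Borel set $E$ there exists a $G_\delta$-set $F$ containing $E$ such that
$$p(F) = p(E).$$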
For a random variable $x$ which takes values in $G$ we can determine a probability distribution $p_x(E)$: that is the probability of the event that $x$ takes values in $E$.
An important probability distribution $p$ on $G$ is the uniform distribution on $G$:
$$p(E) = m_G(E),$$
where $m_G(E)$ is an invariant measure:
$$m_G(sE) = m_G(Es) = m_G(E) \qquad (s \in G). \tag{5}$$
By the theory of A. Haar and J. von Neumann(5) there exists always an invariant measure and $m_G(E)$ is uniquely determined by (5) and the additional condition $m_G(G)=1$.
(3) See P. Lévy, Théorie de l'addition des variables aléatoires, (1937).
(4) Cf. also S. Bochner, Average distribution of arbitrary masses under group transformations, Proc. Nat. Acad. U.S.A., 20 (1934), 206-210.
A Borel set $E$ is said to be a continuity set of the probability distribution $p$ when $p(E)=p(\bar{E})=p(E^{\circ})$ is verified, where $\bar{E}$ is the closure of $E$ and $E^{\circ}$ the open kernel of $E$.
It is a well known fact that if $p_1(E)=p_2(E)$ for every continuity set of both probability distributions $p_1$ and $p_2$, then we can conclude $p_1=p_2$, namely $p_1(E)=p_2(E)$ for all Borel sets $E$. Similarly, a necessary and sufficient condition for $p_1=p_2$ is that
$$\int_G f(s)\,p_1(ds) = \int_G f(s)\,p_2(ds)$$
is verified for any real continuous function $f(s)$ on $G$.
Two random variables $x_1$ and $x_2$ are said to be "equal in law" if $p_{x_1}(E)=p_{x_2}(E)$ for any Borel set $E$ on $G$; therefore a necessary and sufficient condition for the equality in law of $x_1$ and $x_2$ is
$$\mathsf{E}(f(x_1)) = \mathsf{E}(f(x_2))$$
for any real continuous function $f$, where $\mathsf{E}$ means the expectation.
There are at most an enumerable infinite number of mutually inequivalent irreducible continuous unitary representations of the compact separable group $G$ as shown in the Weyl-v. Neumann theory of representations(6). Let these be denoted by
$$\mathfrak{D}^{(r)}:\ s \to D^{(r)}(s) = \big(d^{(r)}_{ij}(s)\big) \qquad (r=0,1,2,\ldots).$$
The degree of $\mathfrak{D}^{(r)}$ shall be denoted by $n_r$. Especially $\mathfrak{D}^{(0)}$ shall always mean the principal representation of $G$:
$$D^{(0)}(s) = 1.$$
As $D^{(r)}(s)$ is a unitary matrix, the absolute values of the components are not greater than 1:
$$|d^{(r)}_{ij}(s)| \le 1, \tag{7}$$
and since $\mathfrak{D}^{(r)}(s)$ is continuous on $G$ we can define $D^{(r)}(p)$ as
$$D^{(r)}(p) = \int_G D^{(r)}(s)\,p(ds).$$
These matrices $D^{(r)}(p)$ $(r=0,1,2,\ldots)$ shall be called characteristic matrices of the probability distribution $p$ on $G$. Similarly the characteristic matrices $D^{(r)}(x)$ of a random variable $x$ are defined by
$$D^{(r)}(x) = \int_G D^{(r)}(s)\,p_x(ds).$$
(5) J. von Neumann, Zum Haarschen Mass in topologischen Gruppen, Comp. Math. 1 (1933), 106-114.
(6) J. von Neumann, Almost periodic functions on a group, Trans. Am. Math. Soc. 38 (1934), 445-492.
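The definition of the characteristic matrices can be tried out on the simplest compact group. The sketch below is only an illustration under assumptions that are not in the paper: it takes the finite cyclic group $Z_6$, whose irreducible unitary representations are the one-dimensional characters $k \to e^{2\pi i r k/6}$, so that each "matrix" is a single complex number. The function name `characteristic_values` is hypothetical.

```python
import cmath

def characteristic_values(p):
    """Return [D^(r)(p) for r = 0..n-1] for a distribution p on Z_n.

    p is a list of n non-negative weights summing to 1. The irreducible
    unitary representations of Z_n are the characters
    k -> exp(2*pi*i*r*k/n), so every characteristic matrix is 1 x 1.
    """
    n = len(p)
    return [
        sum(p[k] * cmath.exp(2j * cmath.pi * r * k / n) for k in range(n))
        for r in range(n)
    ]

uniform = [1 / 6] * 6                 # the uniform distribution m_G on Z_6
point_mass = [0, 1, 0, 0, 0, 0]       # random variable fixed at the point s = 1

d_uniform = characteristic_values(uniform)
d_point = characteristic_values(point_mass)
```

Here `d_uniform[0]` is 1 (the principal representation) and the other entries vanish up to rounding, while `d_point[r]` is simply the character value at the fixed point.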
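For the principal representation $\mathfrak{D}^{(0)}$ we have
$$D^{(0)}(p) = 1$$
for any probability distribution $p$. By the inequality (7) we have also
$$|d^{(r)}_{ij}(p)| \le 1; \tag{10}$$
especially for the uniform distribution $p=m_G$ on $G$, we can deduce from the orthogonality relation of the inequivalent irreducible representations
$$D^{(r)}(m_G) = 0 \qquad (r=1,2,\ldots). \tag{11}$$
For a random variable $x_s$ which takes only a fixed point $s$ on $G$ as a value, we have
$$D^{(r)}(x_s) = D^{(r)}(s),$$
and for a random variable $x'=axb$ $(a, b\in G)$
$$D^{(r)}(x') = D^{(r)}(a)\,D^{(r)}(x)\,D^{(r)}(b).$$
Theorem 1. Two probability distributions $p_1$ and $p_2$ are equal if and only if
$$D^{(r)}(p_1) = D^{(r)}(p_2) \qquad (r=0,1,2,\ldots)$$
hold.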
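Proof. In virtue of the approximation theorem of Weyl(6) we can choose for any given complex valued continuous function $f(s)$ on $G$ and for any positive number $\varepsilon$ a finite number of constants $c^{(r)}_{ij}$ such that
$$\Big| f(s) - \sum_{r,i,j} c^{(r)}_{ij}\, d^{(r)}_{ij}(s) \Big| < \varepsilon \qquad (s\in G). \tag{13}$$
Integrating on $G$, we have therefore
$$\Big| \int_G f(s)\,p(ds) - \sum_{r,i,j} c^{(r)}_{ij}\, d^{(r)}_{ij}(p) \Big| < \varepsilon \tag{14}$$
for any probability distribution $p$. Hence, if $D^{(r)}(p_1)=D^{(r)}(p_2)$ for all $r$, the integrals of $f$ with respect to $p_1$ and $p_2$ differ by less than $2\varepsilon$ for every $\varepsilon>0$, so that they coincide for every continuous $f$ and $p_1=p_2$. The converse is evident, Q.E.D.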
Now for two probability distributions $p_1$ and $p_2$ we define the product probability distribution (or convolution) $p=p_1*p_2$ by the following expression:
$$p(E) = \int_G p_1(Es^{-1})\,p_2(ds) = \int_G p_2(s^{-1}E)\,p_1(ds),$$
whereby the integrability results from the fact that both $p_1(Es^{-1})$ and $p_2(s^{-1}E)$ are Baire functions of $s$. Let $x_1$ and $x_2$ be two independent random variables which take values on $G$, then the probability distribution $p_{x_1x_2}$ of the product variable $x_1x_2$ is given clearly by $p_{x_1x_2}=p_{x_1}*p_{x_2}$.
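Theorem 2. For two probability distributions $p_1$ and $p_2$ on $G$, we have the following relation between the characteristic matrices of $p_1$, $p_2$ and $p_1*p_2$:
$$D^{(r)}(p_1*p_2) = D^{(r)}(p_1)\,D^{(r)}(p_2) \qquad (r=0,1,2,\ldots).$$
Proof. From the definitions follows
$$D^{(r)}(p_1*p_2) = \int_G\!\int_G D^{(r)}(st)\,p_1(ds)\,p_2(dt) = \int_G D^{(r)}(s)\,p_1(ds)\int_G D^{(r)}(t)\,p_2(dt) = D^{(r)}(p_1)\,D^{(r)}(p_2).$$
By (11) and theorems 1, 2 we have
$$p*m_G = m_G*p = m_G$$
for any probability distribution $p$ on $G$.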
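The multiplicative relation between convolution and characteristic matrices can be checked by hand on a small non-commutative group. The following sketch is an illustration only, under assumptions not taken from the paper: the group is the symmetric group $S_3$, the two distributions are arbitrary, and instead of an irreducible representation it uses the 3-dimensional permutation representation (which is reducible, but the relation holds for every representation). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))                 # the symmetric group S_3

def mul(a, b):
    """Group product ab: apply b first, then a."""
    return tuple(a[b[i]] for i in range(3))

def rep(a):
    """Permutation matrix of a; rep(mul(a, b)) equals matmul(rep(a), rep(b))."""
    return [[Fraction(int(i == a[j])) for j in range(3)] for i in range(3)]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def convolve(p1, p2):
    """Distribution of the product x1*x2 of independent x1 ~ p1 and x2 ~ p2."""
    out = {g: Fraction(0) for g in G}
    for a in G:
        for b in G:
            out[mul(a, b)] += p1[a] * p2[b]
    return out

def char_matrix(p):
    """Integral of rep(s) with respect to p, i.e. the sum of p(s) * rep(s)."""
    total = [[Fraction(0)] * 3 for _ in range(3)]
    for s in G:
        m = rep(s)
        for i in range(3):
            for j in range(3):
                total[i][j] += p[s] * m[i][j]
    return total

# Two arbitrary (assumed) distributions and the uniform distribution on S_3.
p1 = {g: Fraction(w, 21) for g, w in zip(G, [1, 2, 3, 4, 5, 6])}
p2 = {g: Fraction(w, 21) for g, w in zip(G, [6, 1, 1, 1, 1, 11])}
uniform = {g: Fraction(1, 6) for g in G}
```

With exact rational arithmetic, `char_matrix(convolve(p1, p2))` equals `matmul(char_matrix(p1), char_matrix(p2))`, and convolving any distribution with `uniform` on either side returns `uniform`.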
A probability distribution $p$ on $G$ is said to be stable after P. Lévy(7) if $p*p=p$. Then we have
Theorem 3. A probability distribution $p$ on $G$ is stable when and only when $p$ is a uniform distribution $m_H$ on a closed subgroup $H$.
We can prove this theorem directly with the help of the characteristic matrices, but we shall prove it later as a corollary of theorem 7 in §3.
(7) See P. Lévy, loc. cit., (2).
II. Convergence of probability distributions.
The sequence $\{p_n\}$ of the probability distributions on $G$ is said to be convergent to a probability distribution $p$ on $G$, if for any Borel set $E$ which is a continuity set of $p$ holds
$$\lim_{n\to\infty} p_n(E) = p(E).$$
The limit probability distribution
is then uniquely determined by {pn}.
It is well known that this convergence can be also defined as follows: $p_n\to p$ if for any real (or complex) continuous function $f(s)$ on $G$
$$\lim_{n\to\infty}\int_G f(s)\,p_n(ds) = \int_G f(s)\,p(ds). \tag{18}$$
As usual it is defined that a sequence of random variables {xn} is
convergent in law (or convergent in the sense of Bernoulli) to a random
variable x if {pxn} converges to px.
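Theorem 4. A necessary and sufficient condition for the convergence of the probability distributions $\{p_n\}$ to a probability distribution $p_0$ is
$$\lim_{n\to\infty} D^{(r)}(p_n) = D^{(r)}(p_0) \qquad (r=1,2,\ldots). \tag{19}$$
Proof. We shall first prove that the condition (19) is sufficient. For any continuous function $f(s)$ and for any positive number $\varepsilon$ there exist from (14) a finite number of constants $c^{(r)}_{ij}$ such that
$$\Big| \int_G f(s)\,p_n(ds) - \sum_{r,i,j} c^{(r)}_{ij}\, d^{(r)}_{ij}(p_n) \Big| < \varepsilon \qquad (n=0,1,2,\ldots), \tag{20}$$
where $r$ runs over indices with $c^{(r)}_{ij}\ne 0$, and $C(\varepsilon)=\sum_{r,i,j}|c^{(r)}_{ij}|$. On the other hand, from the hypothesis (19) there exists an integer $n_0$ for which
$$|d^{(r)}_{ij}(p_n) - d^{(r)}_{ij}(p_0)| < \frac{\varepsilon}{C(\varepsilon)} \qquad (n\ge n_0). \tag{21}$$
From (20), (21) we get
$$\Big| \int_G f(s)\,p_n(ds) - \int_G f(s)\,p_0(ds) \Big| < 3\varepsilon \qquad (n\ge n_0),$$
and therefore, as $\varepsilon$ can be arbitrarily chosen,
$$\lim_{n\to\infty}\int_G f(s)\,p_n(ds) = \int_G f(s)\,p_0(ds),$$
that is, $p_n\to p_0$, by (18). To prove the necessity, it suffices to observe that $d^{(r)}_{ij}(s)$ are continuous functions of $s$, Q.E.D.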
Theorem 5. Let $\{p_n\}$ be a sequence of probability distributions on $G$. If the characteristic matrices $\{D^{(r)}(p_n)\}$ converge to a matrix $D^{(r)}=(d^{(r)}_{ij})$:
$$\lim_{n\to\infty} D^{(r)}(p_n) = D^{(r)} \qquad (r=0,1,2,\ldots), \tag{23}$$
then the sequence $\{p_n\}$ converges to a probability distribution $p$, which has these matrices $D^{(r)}$ as characteristic matrices.
Proof. By (13) and the hypothesis (23) we can choose for any continuous function $f(s)$ a finite number of constants $c^{(r)}_{ij}(m)$ such that
$$\Big| f(s) - \sum_{r,i,j} c^{(r)}_{ij}(m)\, d^{(r)}_{ij}(s) \Big| < \frac{1}{m} \qquad (m=1,2,\ldots). \tag{24}$$
Then the sequence $M_m=\sum_{r,i,j} c^{(r)}_{ij}(m)\,d^{(r)}_{ij}$ $(m=1,2,\ldots)$ is a fundamental sequence, as is easily verified. Let its limit be $M(f)$. The correspondence $f(s)\to M(f)$ satisfies the following conditions:
$$M(af+bg) = aM(f)+bM(g), \qquad M(f)\ge 0 \ \text{for}\ f\ge 0, \qquad M(1)=1,$$
where $f(s)$ and $g(s)$ are continuous functions on $G$. By a theorem of F. Riesz(8), therefore, there exists a uniquely determined probability distribution $p$ on $G$ such that
$$M(f) = \int_G f(s)\,p(ds). \tag{25}$$
From (24) and (25) we can conclude $p_n\to p$, and especially for $f(s)=d^{(r)}_{ij}(s)$ we have $D^{(r)}=\lim D^{(r)}(p_n)=D^{(r)}(p)$, Q.E.D.
Let $K_{n_r}$ be the space of all matrices of degree $n_r$ whose components are less than or equal to 1 in their absolute values, and let the topology of $K_{n_r}$ be introduced as usual. We shall define $R$ as the space of the topological product of $\{K_{n_r}\}$. Then $K_{n_r}$ are compact separable Hausdorff spaces, consequently it is also the case for $R$. Now in virtue of (10) we can define a corresponding point $P(p)$ in $R$ to any probability distribution $p$ on $G$ by
$$P(p) = \big(D^{(0)}(p),\, D^{(1)}(p),\, D^{(2)}(p),\, \ldots\big).$$
This correspondence is one to one and continuous; and the subset of all the elements of the form $P(p)$ of $R$ is closed in $R$, as follows from theorems 1, 4, and 5. Therefore we have the following theorem:
(8) See, for example, S. Saks, Integration in abstract spaces, Duke Math. Jour. 4 (1938), 408-411.
Theorem 6. The space of all the probability distributions on $G$ with the topology $p_n\to p$ as defined above is a compact separable Hausdorff space, embeddable in $R$(9).
Remark. The definition of characteristic matrices is a modification of that of the characteristic function of the usual real distribution: we use namely the continuous irreducible representations of $G$ instead of the character function $e^{iux}$ of the additive group of real numbers $\{x\}$ in the usual case. Theorems 1, 2, 5 are quite parallel to the case of the usual distributions; they correspond to theorems 9, 13, 11 in the work of H. Cramér, "Random variables and probability distributions," (1937)(10).
III. Markoff process (homogeneous case).
Now we consider a simple Markoff process on $G$. Let $P_n(s, E)$ denote the transition probability that an element $s\in G$ may be transferred into a Borel set $E\subset G$ by the n-th operation of the process.
(9) Cf. N. Kryloff and N. Bogoliouboff, La théorie générale de la mesure dans son application à l'étude des systèmes dynamiques de la mécanique non linéaire, Annals of Math. 38 (1937), 65-113.
(10) When $G$ is a finite group of order $g$, the characteristic matrices are more closely related to the probability distribution. Let $s\to R(s)=(\delta_{u,vs^{-1}})_{u,v}$ ($\delta_{s,t}=1$ when $s=t$, and $=0$ when $s\ne t$) be the right regular representation of $G$, and $s\to D^{(r)}(s)$ $(r=0, 1, \ldots, h-1)$ be the irreducible representations of $G$. If we reduce the regular representation $R(s)$ into irreducible representations with a non-singular matrix $T$, the matrix $R(p)=\int_G R(s)\,p(ds)=\sum_s p(s)R(s)=(p(u^{-1}v))_{u,v}$ is reduced by the same $T$ into the characteristic matrices $D^{(r)}(p)$. Therefore, if we know $D^{(0)}(p), \ldots, D^{(h-1)}(p)$, or $R(p)$, the probability $p(s)$ is given as the components of $R(p)$. Instead of the regular representation $R(s)$, the representation $s\to D(s)=R(s)-U$ of $G$ is often used in the literature, where $U$ is defined by $g^{-1}\sum_s R(s)$, namely the matrix whose components are all equal to $g^{-1}$. Then $D(p)=R(p)-U$; therefore $D(p)$ is the zero matrix if and only if $p$ is a uniform distribution. See T. Uno and Y. Hasimoto, Sur le problème du battage des cartes, this Proc. Ser. 3, 17 (1935), 224-233.
If we apply to the unit element $e$ of $G$ the operations of the simple Markoff process, we obtain a sequence of random variables $s_1, s_2, s_3, \ldots$ on $G$. The transition probability $P_n(s, E)$ is equal to the conditional probability $P_{s_{n-1}=s}(s_n\in E)$, and the n-th operation corresponds to the multiplication of the random variable $s_{n-1}^{-1}s_n$. We will denote this variable with $x_n$.
An important special case arises, when the n-th operation $(x_n)$ is independent of the last result $(s_{n-1})$ for any $n$: i.e. $x_n$ is independent of $s_{n-1}=x_1x_2\cdots x_{n-1}$. Such a special process is characterized by the property(11) that $P_n(s,E)$ are respectively represented by probability distributions $p_n$ $(n=1, 2, \ldots)$ on $G$ as follows:
$$P_n(s,E) = p_n(s^{-1}E),$$
where $p_n$ corresponds to the probability distribution of $x_n$ on $G$. Let $p^{(n)}$ be the probability distribution of $s_n$, then we have
$$p^{(n)} = p_1*p_2*\cdots*p_n, \qquad P^{(n)}(s,E) = p^{(n)}(s^{-1}E),$$
where $P^{(n)}(s, E)$ is the transition probability that $s$ may be transferred into $E$ after the first $n$ operations.
Another special case is the one, where the Markoff process is temporally homogeneous, namely $P_n(s, E)$ is independent of $n$.
A famous example of a simple homogeneous Markoff process with the property $\mathfrak{G}$ is the problem of "battage des cartes" treated by H. Poincaré. In this case $G$ is the symmetric group of $m$ letters. Let the permutations of $m$ cards be denoted by $\sigma_1, \ldots, \sigma_N$ $(N=m!)$. We denote with the same notation $\sigma_i$ also the substitution which carries $\sigma_1$ into $\sigma_i$, where $\sigma_1=(1, 2, \ldots, m)$. We assume that a man has a certain habit in cutting the cards, that may be represented as a probability distribution on $G$, namely: let $p(\sigma_i)$ be the probability with which he effects the substitution $\sigma_i$ by cutting once the cards. Then the transition probability $p_{ij}$ that the permutation $\sigma_i$ may be transferred into $\sigma_j$ after cutting once the cards is given by
$$p_{ij} = p(\sigma_i^{-1}\sigma_j).$$
(11) This property will be called $\mathfrak{G}$ in this paper.
In the following we treat a simple temporally homogeneous Markoff process with the property $\mathfrak{G}$(11) on a general compact separable group $G$(12). It is, therefore, sufficient only to consider the probability distribution $p^{(n)}$, which is merely the n-th power of $p$:
$$p^{(n)} = p*p*\cdots*p = p^n.$$
As usual we define the spectrum $S(p)$ of a probability distribution $p$ as the aggregate of all the points $s$ of $G$ for which the probability $p(E)$ of any open set $E$ containing $s$ is positive: $p(E)>0$. It can be easily verified that $S(p)$ is a closed set on $G$.
Let $H$ be the closed subgroup generated by $S(p)$. If a closed subgroup $H_0$ has the property that
$$p(H_0) = 1,$$
then holds clearly $H_0\supseteq H$: $H$ is the smallest closed subgroup with this property.
A probability distribution $p(E)$ on $G$ is called proper in $G$, if the closed subgroup $H$ generated by the spectrum $S(p)$ is equal to $G$.
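For a finite group the spectrum is simply the set of points of positive probability, and the closed subgroup it generates can be computed by closing that set under multiplication. The sketch below illustrates the notion "proper in $G$" in this finite setting; it is an assumption-laden example (the group $S_3$, the particular distributions and all function names are illustrative, not from the paper).

```python
from itertools import permutations

def mul(a, b):
    """Group product ab of two permutations given as tuples (b first, then a)."""
    return tuple(a[b[i]] for i in range(len(a)))

def spectrum(p):
    """Points of positive probability of a distribution on a finite group."""
    return {s for s, w in p.items() if w > 0}

def generated_subgroup(gens):
    """Closure of gens under products; in a finite group this is a subgroup."""
    elems = set(gens)
    frontier = list(gens)
    while frontier:
        a = frontier.pop()
        for b in list(elems):
            for c in (mul(a, b), mul(b, a)):
                if c not in elems:
                    elems.add(c)
                    frontier.append(c)
    return elems

def is_proper(p, group):
    """True when the subgroup generated by the spectrum is the whole group."""
    return generated_subgroup(spectrum(p)) == set(group)

G = list(permutations(range(3)))
swap, cycle = (1, 0, 2), (1, 2, 0)
only_swap = {swap: 1.0}                       # spectrum generates a subgroup of order 2
swap_and_cycle = {swap: 0.5, cycle: 0.5}      # spectrum generates all of S_3
```

In this example `is_proper(only_swap, G)` is false and `is_proper(swap_and_cycle, G)` is true.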
Generally $H$ is not equal to $G$, but
$$p(E) = p(E\cap H),$$
and similarly
$$p^{n}(E) = p^{n}(E\cap H).$$
Therefore there is no loss of generality if we consider $p(E)$ only within $H$ instead of within $G$ or, what is the same thing, if we consider only the probability distribution which is proper in $G$.
.... and .... must be equal to $m_G(E)$ from the unicity of the invariant measure of $G$.
We have the following mean ergodic theorem:
Theorem 7. Let $p$ be a probability distribution on $G$ which is proper in $G$. Then for any continuity set $E$ of the uniform distribution $m_G$ holds
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n} p^{k}(E) = m_G(E). \tag{35}$$
More precisely we have
Theorem 8. Let $p$ be proper in $G$.
1) If the spectrum $S(p)$ is contained in a residue class $s_0H$ or $Hs_0$ $(s_0\notin H,\ H\ne G)$ of a closed subgroup $H$, then $H$ is an invariant subgroup of $G$ and $G$ is the smallest closed subgroup containing $s_0H$. Therefore $G/H$ is abelian. For any continuity set $E'\subset H$ of the uniform distribution $m_H$ on $H$ holds
$$\lim_{n\to\infty} p^{n}(s_0^{\,n}E') = m_H(E'). \tag{36}$$
2) If the spectrum $S(p)$ is not contained in any residue class of any closed subgroup $H$, then
$$\lim_{n\to\infty} p^{n}(E) = m_G(E) \tag{37}$$
for every continuity set $E$ of $m_G$.
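Proof of theorem 8. To prove (37) it is sufficient by theorem 4 and (11) to show that
$$\lim_{n\to\infty} D^{(r)}(p)^{n} = 0 \qquad (r=1,2,\ldots),$$
i.e. that any characteristic root $\lambda^{(r)}$ of matrices $D^{(r)}(p)$ $(r=1, 2, \ldots)$ is smaller than 1 in its absolute value:
$$|\lambda^{(r)}| < 1.$$
Let $y^{(r)}$ be a characteristic vector for $\lambda^{(r)}$ with unit length, and let the coordinate vectors be denoted by $x^{(r)}_i$; then there is a unitary transformation $U_r$ which transforms $x^{(r)}_1$ in $y^{(r)}$:
$$U_r\,x^{(r)}_1 = y^{(r)}.$$
For the matrices $\tilde{D}^{(r)}(s)=U_r^{-1}D^{(r)}(s)\,U_r$ the characteristic matrix of the unitary irreducible continuous representation $\tilde{\mathfrak{D}}^{(r)}$ is $\tilde{D}^{(r)}(p)=U_r^{-1}D^{(r)}(p)\,U_r$. The coordinate vector $x^{(r)}_1$ is then a characteristic vector for $\tilde{D}^{(r)}(p)$, namely $\tilde{D}^{(r)}(p)\,x^{(r)}_1=\lambda^{(r)}x^{(r)}_1$. Therefore the matrix $\tilde{D}^{(r)}(p)$ must be of the form
$$\tilde{D}^{(r)}(p) = \begin{pmatrix} \lambda^{(r)} & * \\ 0 & * \end{pmatrix}, \qquad \text{i.e.}\quad \int_G \tilde{d}^{(r)}_{11}(s)\,p(ds) = \lambda^{(r)}. \tag{40}$$
$\tilde{\mathfrak{D}}^{(r)}(s)$ being a unitary representation, we have
$$\sum_{j}|\tilde{d}^{(r)}_{1j}(s)|^{2} = 1, \qquad\text{or}\qquad |\tilde{d}^{(r)}_{11}(s)| \le 1. \tag{41}$$
If $\lambda^{(r)}$ were equal to 1 for some $r$, then by (40) and (41) the spectrum $S(p)$ should be contained in the aggregate $H$ of all elements $\{s\}$ of $G$ for which $\tilde{d}^{(r)}_{11}(s)=1$. As $\tilde{D}^{(r)}(s)$ is a unitary matrix, $H$ should be the set of all the elements $\{s\}$ of $G$ which are represented by $\tilde{\mathfrak{D}}^{(r)}$ in the form
$$\tilde{D}^{(r)}(s) = U_r^{-1}D^{(r)}(s)\,U_r = \begin{pmatrix} 1 & 0 \\ 0 & * \end{pmatrix}.$$
Therefore $H$ should be a closed proper subgroup of $G$, which contradicts our hypothesis that $p$ is proper in $G$; so we must have
$$\lambda^{(r)} \ne 1. \tag{42}$$
We shall prove next that if there is no closed subgroup $H$ mentioned in 1) of theorem 8, then we have
$$\lambda^{(r)} \ne e^{i\theta} \qquad (\theta\ \text{real}). \tag{43}$$
If $\lambda^{(r)}$ were equal to $e^{i\theta}$ ($\theta$ a real number), then by (40) and (41) the spectrum $S(p)$ should be contained in the aggregate $F$ of all the elements $\{s\}$ with $\tilde{d}^{(r)}_{11}(s)=e^{i\theta}$; in other words $F$ is the set of all the elements which are represented by $\tilde{\mathfrak{D}}^{(r)}$ in the form
$$\tilde{D}^{(r)}(s) = \begin{pmatrix} e^{i\theta} & 0 \\ 0 & * \end{pmatrix}.$$
Let $s_0$ be any element of $F$, then holds
$$F = s_0H.$$
The closed subgroup $H$ $(\ne G)$ could be then defined as the aggregate of all the elements $\{s\}$ for which $\tilde{d}^{(r)}_{11}(s)=1$, then we would have
$$S(p) \subseteq F = s_0H;$$
this contradicts our assumption. Thus by (42) and (43) theorem 8, 2) is proved.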
In the general case let $e^{i\theta^{(r)}_1}, \ldots, e^{i\theta^{(r)}_{m_r}}$ $(\theta^{(r)}_j\not\equiv 0 \pmod{2\pi})$ be the characteristic roots of $D^{(r)}(p)$ with absolute value 1 $(r=1, 2, \ldots)$, then we can choose unitary matrices $U_r$ $(r=1, 2, \ldots)$ such that
$$U_r^{-1}D^{(r)}(p)\,U_r = \begin{pmatrix} e^{i\theta^{(r)}_1} & & & \\ & \ddots & & \\ & & e^{i\theta^{(r)}_{m_r}} & \\ & & & A^{(r)} \end{pmatrix}, \tag{44}$$
where the square matrix $A^{(r)}$ has no characteristic root with absolute value 1. Let $F$ (or $H$) be the aggregate of all the elements $\{s\}$, for which $\tilde{d}^{(r)}_{jj}(s)=e^{i\theta^{(r)}_j}$ (or $=1$) $(j=1, \ldots, m_r;\ r=1, 2, \ldots)$, where $\tilde{D}^{(r)}(s)=U_r^{-1}D^{(r)}(s)\,U_r$. Then $H$ is a closed subgroup of $G$ and for any element $s_0\in F$
$$F = s_0H = Hs_0,$$
and, just as in the former case, we can prove that the spectrum $S(p)$ is contained in $F$. The closed subgroup generated by $s_0H=F\supseteq S(p)$ must be equal to $G$ by our assumption that $p$ is proper in $G$. Therefore $H$ must be an invariant subgroup and the factor-group $G/H$ abelian.
Now let $p'(E)$ be defined by $p'(E)=p(s_0E)$, then the spectrum $S(p')$ is contained in $H$ and
$$D^{(r)}(p') = D^{(r)}(s_0)^{-1}D^{(r)}(p).$$
As $\tilde{D}^{(r)}(s)$ is a unitary matrix we can easily see from (44) the relation (45).
We shall prove next that the matrices (*) $(r=1, 2, \ldots)$ are characteristic matrices of the uniform distribution $m_H$ on $H$. Decomposing the representations $\tilde{\mathfrak{D}}^{(r)}(s)$ of $G$, considered as representations of $H$, into the continuous irreducible representations of $H$, we obtain a decomposition (46) into representations $D^{(r_1)}_H(s), \ldots, D^{(r_{q_r})}_H(s)$ $(r=1, 2, \ldots)$ of $H$, where there is no principal representation of $H$ among $D^{(r_1)}_H(s), \ldots, D^{(r_{q_r})}_H(s)$; for if one of them were the principal representation of $H$, this would contradict (45). Thus from (11), applied to $H$, follows that the characteristic matrices $\tilde{D}^{(r)}(m_H)$ are of the form (*) $(r=1, 2, \ldots)$. By (45) we have proved, therefore, the convergence for the continuity set $E$ of $m_H$ in $G$.
But by theorem 4 it is sufficient for the proof of (36) for the continuity set $E\subset H$ of $m_H$ in $H$, to observe that all the continuous irreducible representations of the closed subgroup $H$ of the compact group $G$ are found in the decomposition of the continuous irreducible representations of $G$ in the form (46)(13).
Therefore the proof of theorem 8 is completed.
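Proof of theorem 7. It is sufficient to show that
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n} D^{(r)}(p)^{k} = 0 \qquad (r=1,2,\ldots).$$
By (44) we have
$$U_r^{-1}\Big(\frac{1}{n}\sum_{k=1}^{n} D^{(r)}(p)^{k}\Big)U_r = \begin{pmatrix} \frac{1}{n}\sum_{k} e^{ik\theta^{(r)}_1} & & & \\ & \ddots & & \\ & & \frac{1}{n}\sum_{k} e^{ik\theta^{(r)}_{m_r}} & \\ & & & \frac{1}{n}\sum_{k} (A^{(r)})^{k} \end{pmatrix} \to 0,$$
since $\theta^{(r)}_j\not\equiv 0 \pmod{2\pi}$ and $A^{(r)}$ has no characteristic root with absolute value 1.
Proof of theorem 3. Let $H$ be the closed subgroup generated by the spectrum $S(p)$ of the probability distribution $p$. Then by theorem 7 we have, as $p^{k}=p$ for every $k$,
$$p = \frac{1}{n}\sum_{k=1}^{n} p^{k} \to m_H, \qquad\text{that is}\quad p=m_H.$$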
Corollary. In the problem of "battage des cartes" we distinguish three following sorts of habit.
1) Starting from a permutation, there is at least one permutation which can not be obtained however often we cut the cards, namely the spectrum of the corresponding probability distribution $p$ is contained in a proper subgroup of $G$.
2) All the possible substitutions by cutting once the cards are odd substitutions, namely the spectrum of $p$ consists merely of odd substitutions.
3) The other case.
Only in the third case we have the uniform distribution $m_G$ as the limit probability distribution, but not in cases 1) and 2).
(13) This group-theoretical theorem can be proved easily by means of the concept of modul of representations of a compact group used by E. R. van Kampen, Almost periodic functions and compact groups, Ann. of Math. 37 (1936), 78-91. The irreducible representations of $H$ which are given by decomposing irreducible continuous representations of $G$ form clearly a modul $\mathfrak{M}$ of representations of $H$; that is, conjugate complex representations of such representations, or the representations obtained by decomposing direct products of such representations take also part in $\mathfrak{M}$. On the other hand for any two distinct elements $s, t$ in $G$ there is at least one irreducible representation $\mathfrak{D}_G$ for which $D_G(s)\ne D_G(t)$. Therefore for any two distinct elements $s, t$ in $H$ we can find a representation $\mathfrak{D}_H$ of $\mathfrak{M}$ for which $D_H(s)\ne D_H(t)$. From these two properties we can conclude that $\mathfrak{M}$ contains all the irreducible continuous representations of $H$. (See corollary at p. 82 in the paper of E. R. van Kampen.)
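The distinction between the habits 2) and 3) can be watched on the smallest non-trivial deck. The sketch below is a hedged illustration, not the authors' computation: it assumes three cards (the group $S_3$), one arbitrary habit made only of odd substitutions and one arbitrary habit of the third kind, and it computes the convolution powers exactly. All names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))          # all 6 orderings of 3 cards
E = (0, 1, 2)                             # identity permutation

def mul(a, b):
    return tuple(a[b[i]] for i in range(3))

def convolve(p, q):
    out = {g: Fraction(0) for g in G}
    for a, pa in p.items():
        for b, qb in q.items():
            out[mul(a, b)] += pa * qb
    return out

def powers(p, n):
    """Return [p^1, p^2, ..., p^n] (convolution powers)."""
    result, current = [], None
    for _ in range(n):
        current = p if current is None else convolve(current, p)
        result.append(current)
    return result

def max_gap(p):
    """Largest deviation of p from the uniform distribution on single points."""
    return max(abs(p[g] - Fraction(1, 6)) for g in G)

def is_odd(a):
    inversions = sum(a[i] > a[j] for i in range(3) for j in range(i + 1, 3))
    return inversions % 2 == 1

# Habit 2): only odd substitutions (the three transpositions), equally likely.
odd_habit = {g: (Fraction(1, 3) if is_odd(g) else Fraction(0)) for g in G}
# Habit 3): identity, one transposition and one 3-cycle (assumed weights).
mixed_habit = {g: Fraction(0) for g in G}
mixed_habit[E] = Fraction(1, 2)
mixed_habit[(1, 0, 2)] = Fraction(1, 4)
mixed_habit[(1, 2, 0)] = Fraction(1, 4)

odd_powers = powers(odd_habit, 6)
cesaro_identity = sum(q[E] for q in odd_powers) / 6
mixed_gap = max_gap(powers(mixed_habit, 80)[-1])
```

For the odd habit the probability of the identity alternates between 0 and 1/3, so the powers do not converge, while their arithmetic mean over six steps is exactly 1/6, in agreement with theorem 7. For the mixed habit the 80th power is uniform to within a very small gap.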
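For any continuous function $f(s)$ on $G$ we define $f^{(n)}(s)$ by
$$f^{(n)}(s) = \int_G f(st)\,p^{n}(dt),$$
then from theorem 7 follows
$$\lim_{n\to\infty}\frac{1}{n}\sum_{k=1}^{n} f^{(k)}(s) = \int_G f(t)\,m_G(dt). \tag{50}$$
But we have a stronger result:
Theorem 9. If the probability distribution $p$ is proper in $G$, then (50) holds uniformly for $s$ in $G$. Especially in the case 2) of theorem 8 we have
$$\lim_{n\to\infty} f^{(n)}(s) = \int_G f(t)\,m_G(dt) \tag{51}$$
uniformly for $s$ in $G$.
Proof. We have seen the convergence of (50) and (51), so that it remains only to prove its uniformity. We shall prove it only for (51) in the case 2) of theorem 8; the general case is proved analogously.
For any continuous function $f(s)$ we have by (13) a finite number of constants $c^{(r)}_{ij}$ such that
$$\Big| f(s) - \sum_{r,i,j} c^{(r)}_{ij}\,d^{(r)}_{ij}(s) \Big| < \varepsilon. \tag{52}$$
Let $\mathrm{Max}\,|c^{(r)}_{ij}|$ be denoted by $C$. Then from (52) and $D^{(r)}(st)=D^{(r)}(s)D^{(r)}(t)$ follows
$$\big| f^{(n)}(s) - c^{(0)} \big| \le \varepsilon + C\sum_{r\ge 1}\sum_{i,j,k} |d^{(r)}_{ik}(s)|\,|d^{(r)}_{kj}(p^{n})| \le \varepsilon + C\sum_{r\ge 1}\sum_{i,j,k} |d^{(r)}_{kj}(p^{n})| < 2\varepsilon \qquad (n\ge n_0),$$
where the sums run over the finitely many $r$ occurring in (52) and $n_0$ can be chosen independently of $s$; this shows the uniformity of the convergence of (51), Q.E.D.
IV. Non homogeneous case.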
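We will treat next a Markoff process with the property $\mathfrak{G}$, where the homogeneity is not assumed.
Theorem 10. Consider a simple Markoff process with the property $\mathfrak{G}$. If there is a positive number $\lambda>0$ independent of $E$ such that
$$p_n(E) \ge \lambda\,m_G(E) \qquad (n=1,2,\ldots),$$
then for any Borel set $E$ we have
$$\lim_{n\to\infty} P^{(n)}(s,E) = m_G(E)$$
uniformly for all Borel sets $E$(14).
Proof. Without loss of generality we can assume that $s=1$. Let $p_n'(E)$ be given by
$$p_n' = p_n - m_G,$$
then $p_n'*m_G=(p_n-m_G)*m_G=p_n*m_G-m_G=0$ and similarly $m_G*p_n'=0$. Henceforth follows by mathematical induction
$$p^{(n)} - m_G = p_1'*p_2'*\cdots*p_n'.$$
Let further $q_n=p_n-\lambda m_G$, then we have $q_n(E)\ge 0$ and $q_n*m_G=m_G*q_n=(1-\lambda)m_G$, so that
$$p^{(n)}(E) = \big(1-(1-\lambda)^{n}\big)m_G(E) + (q_1*\cdots*q_n)(E) \ge \big(1-(1-\lambda)^{n}\big)m_G(E). \tag{60}$$
Taking $G-E$ for $E$ in (60) we have $1-p^{(n)}(E)\ge(1-(1-\lambda)^{n})(1-m_G(E))$, namely
$$p^{(n)}(E) \le m_G(E) + (1-\lambda)^{n}\big(1-m_G(E)\big). \tag{61}$$
From (60) and (61) it results that $\lim_{n\to\infty}p^{(n)}(E)=m_G(E)$ uniformly for all Borel sets $E$, Q.E.D.
(14) Cf. T. Uno and Y. Hasimoto, loc. cit. (10).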
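The two-sided estimate obtained in the proof gives the explicit rate $|p^{(n)}(E)-m_G(E)|\le(1-\lambda)^n$ for every set $E$. The sketch below checks this rate numerically on an assumed example that is not in the paper: the cyclic group $Z_5$ with step distributions $p_n=\lambda\,m_G+(1-\lambda)\,\delta_{n \bmod 5}$, which change with $n$ but all dominate $\lambda\,m_G$. The names `LAM`, `step_distribution` and `deviations` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

N = 5                                   # the cyclic group Z_5, written additively
LAM = Fraction(1, 5)                    # assumed value of the constant lambda
UNIFORM = [Fraction(1, N)] * N

def step_distribution(n):
    """p_n = LAM * uniform + (1 - LAM) * (point mass at n mod N)."""
    p = [LAM * u for u in UNIFORM]
    p[n % N] += 1 - LAM
    return p

def convolve(p, q):
    out = [Fraction(0)] * N
    for a in range(N):
        for b in range(N):
            out[(a + b) % N] += p[a] * q[b]
    return out

def worst_deviation(p):
    """Maximum over all subsets E of |p(E) - m_G(E)|."""
    worst = Fraction(0)
    for size in range(N + 1):
        for subset in combinations(range(N), size):
            dev = abs(sum(p[s] for s in subset) - Fraction(size, N))
            worst = max(worst, dev)
    return worst

def deviations(steps):
    """worst_deviation of p^(n) = p_1 * ... * p_n for n = 1..steps."""
    current = [Fraction(1)] + [Fraction(0)] * (N - 1)   # start at the unit element
    result = []
    for n in range(1, steps + 1):
        current = convolve(current, step_distribution(n))
        result.append(worst_deviation(current))
    return result
```

Calling `deviations(12)` returns exact rationals, each bounded by the corresponding power of `1 - LAM`, although no single $p_n$ is repeated.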
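We define now two or more random variables $x_1, \ldots, x_n$ as mutually commutative in law, if
$$p_{x_1x_2\cdots x_n} = p_{x_{i_1}x_{i_2}\cdots x_{i_n}}$$
holds for any permutation $(i_1, \ldots, i_n)$ of $(1, \ldots, n)$, namely $D^{(r)}(x_1\cdots x_n)=D^{(r)}(x_{i_1}\cdots x_{i_n})$.
Theorem 11. Consider a simple Markoff process with the property $\mathfrak{G}$ for which the random variables $x_1, x_2, \ldots$ are mutually commutative in law. If there are positive numbers $\varepsilon$, $\delta$ such that
$$p_{x_n}(E) \le 1-\delta \qquad (n=1,2,\ldots) \tag{62}$$
holds for any Borel set $E$ with $m_G(E)<\varepsilon$, then
$$\lim_{n\to\infty} p^{(n)} = m_G.$$
As special cases of this theorem we can replace the assumption (62) by the condition that $p_{x_n}$ are absolutely continuous with respect to $m_G$ uniformly in $n$, or by a stronger condition on the densities of $p_{x_n}$ with respect to $m_G$.
Proof. By theorem 6 we can choose a subsequence $\{x_{n_r}\}$ so that $p_{x_{n_r}}\to p$ $(r=1, 2, 3, \ldots)$ holds. Then we shall first show that
$$p(E) \le 1-\delta \tag{63}$$
for any Borel set $E$ with $m_G(E)<\varepsilon/2$. If $E$ is a continuity set of $p$, then $p(E)=\lim_{r\to\infty}p_{x_{n_r}}(E)\le 1-\delta$ for $m_G(E)<\varepsilon$. If $E$ is any open set with $m_G(E)<\varepsilon$, then we can represent $E$ as a union of open continuity sets $E_n$: $E=\sum E_n$, $E_n\subset E_{n+1}$ $(n=1, 2, 3, \ldots)$; therefore $p(E)=\lim_{n\to\infty}p(E_n)\le 1-\delta$. If $E$ is any Borel set with $m_G(E)<\varepsilon/2$, we can choose an open set $O$ $(O\supseteq E)$ so that $m_G(O-E)<\varepsilon/2$. Then we have $p(E)\le p(O)\le 1-\delta$, as $m_G(O)<\varepsilon$. Thus (63) is in any case proved.
From (63) we can conclude that the spectrum $S(p)$ of $p$ generates the whole group $G$ and that the condition 1) in theorem 8 can not hold, because any proper subgroup $H$ (and any residue class) has its measure 0. Therefore from theorem 8 we have $\lim_{n\to\infty}p^{n}=m_G$. From $p_{x_{n_r}}\to p$ follows consequently for a suitable subsequence $\{r_m\}$
$$\lim_{m\to\infty} D^{(r)}(x_{n_{r_1}})\,D^{(r)}(x_{n_{r_2}})\cdots D^{(r)}(x_{n_{r_m}}) = 0 \qquad (r=1,2,\ldots). \tag{65}$$
From (65), (10) and the hypothesis that $x_n$ are mutually commutative in law we conclude that
$$\lim_{n\to\infty} D^{(r)}(x_1x_2\cdots x_n) = 0 \qquad (r=1,2,\ldots),$$
that is $p^{(n)}\to m_G$ by theorem 4, Q.E.D.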
Appendix.
We have seen by theorem 9 how we can give concretely the mean value $Mf(s)=\int_G f(s)\,m_G(ds)$ for any continuous function $f(s)$ on a compact separable group by Markoff process: let $p$ be a probability distribution (namely a mass distribution with total mass 1) with the condition 2) in theorem 8, then
$$\lim_{n\to\infty} f^{(n)}(s) = Mf \tag{66}$$
uniformly in $s$, where $f^{(n)}$ is defined recurrently by
$$f^{(0)}(s) = f(s), \qquad f^{(n)}(s) = \int_G f^{(n-1)}(st)\,p(dt). \tag{67}$$
The condition 2) of theorem 8 is satisfied when $p$ has the spectrum $S(p)=G$. For example, let $\{s_n\}$ $(n=1, 2, 3, \ldots)$ be an enumerable set everywhere dense on $G$, and let us distribute the mass $1/2^{n}$ on the point $s_n$. Then the so defined distribution $p$:
$$p(E) = \sum_{s_n\in E}\frac{1}{2^{n}} \tag{68}$$
satisfies clearly the condition $S(p)=G$. Our proof of theorem 8 is based on the existence of a uniform distribution $m_G$ on $G$. But conversely we can prove the uniform convergence of (66) directly in the case of the probability distribution $p$ of (68), whence we can prove the existence of $m_G$ by the method used by J. von Neumann(15).
Theorem 12. Let $p(E)$ be the probability distribution on a compact separable group with $S(p)=G$. Then the sequence
$$f^{(n)}(s) = \int_G f^{(n-1)}(st)\,p(dt) \qquad (n=1,2,\ldots;\ f^{(0)}=f) \tag{69}$$
for a continuous function $f(s)$ on $G$ converges uniformly to a constant $M(f)$.
Proof. The uniform convergence of (69) is proved quite similarly as in a note of S. Iyanaga and K. Kodaira(16).
(15) J. von Neumann, loc. cit. (5).
(16) S. Iyanaga and K. Kodaira, On the Theory of Almost Periodic Functions in a Group, Proc. Imp. Acad. Tokyo, 16 (1940), 136-140.
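Let $M_n$, $m_n$ and $O_n$ be defined by
$$M_n = \max_{s\in G} f^{(n)}(s), \qquad m_n = \min_{s\in G} f^{(n)}(s), \qquad O_n = M_n - m_n,$$
then
$$M_0 \ge M_1 \ge \cdots \ge M_n \ge m_n \ge \cdots \ge m_1 \ge m_0$$
by the definition of $f^{(n)}(s)$. Therefore it is sufficient to prove $\lim_{n\to\infty}O_n=0$ for the proof of (69).
Let $\rho$ be an invariant metric on $G$: $\rho(s, t)=\rho(sa, ta)=\rho(as, at)$, then we can find for any assigned positive number $\varepsilon$ a positive number $\delta$ so that $\rho(s, t)<\delta$ implies $|f(s)-f(t)|<\varepsilon/2$ from the uniform continuity of $f(s)$ on $G$. We can see then
$$|f^{(n)}(s) - f^{(n)}(t)| < \varepsilon/2 \qquad \text{for}\ \rho(s,t)<\delta. \tag{70}$$
We can divide $G$ into a finite number of mutually disjoint Borel sets $E_i$:
$$G = \sum_{i} E_i,$$
where $E_i$ has its diameter smaller than $\delta/2$ and has at least an inner point. Then from $S(p)=G$ we have
$$p(E_i) \ge \lambda > 0 \qquad \text{for all}\ i. \tag{72}$$
For any given pair of points $s, t$ of $G$ we choose $E_j$ so that $sE_1\cap tE_j$ is not empty. But by (70), (72) and the definition of $\{E_i\}$ the oscillation of $f^{(l-1)}(u)$ in $sE_1\cup tE_j$ is less than $\varepsilon/2$, therefore we have
$$f^{(l)}(s) - f^{(l)}(t) \le (1-\lambda)\,O_{l-1} + \lambda\,\varepsilon/2, \qquad\text{hence}\quad O_l \le (1-\lambda)\,O_{l-1} + \lambda\,\varepsilon/2.$$
If we choose $n_0$ so that $(1-\lambda)^{n_0}O_0<\varepsilon/2$, then $O_n<\varepsilon$ for $n>n_0$, namely $\lim_{n\to\infty}O_n=0$ is proved.
The conditions 1)-5) are direct consequences from our definition of $M(f)$, Q.E.D.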
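The construction of the mean value by the recurrence for $f^{(n)}$ can be imitated on a finite group, where every quantity is a finite sum. The sketch below is only an illustration under stated assumptions: the group is the cyclic group $Z_6$, the "everywhere dense" sequence is taken to be $s_k = k \bmod 6$ $(k=1,2,\ldots)$ so that the mass $1/2^k$ accumulates on each residue as a geometric series, and the test function `f0` is arbitrary. Function names are hypothetical.

```python
N = 6                                        # the cyclic group Z_6 (additive)

# Mass 1/2^k on s_k = k mod N for k = 1, 2, 3, ...; summing the geometric
# series over each residue class gives the weights below (an assumption made
# so that the sequence {s_k} visits every element of the finite group).
p = [0.0] * N
for j in range(1, N + 1):
    p[j % N] = 2.0 ** (-j) / (1.0 - 2.0 ** (-N))

def step(f):
    """One step of the recurrence f_new(s) = sum over t of f(s + t) * p(t)."""
    return [sum(f[(s + t) % N] * p[t] for t in range(N)) for s in range(N)]

def oscillations(f, steps):
    """Return ([O_0, ..., O_steps], f^(steps)) with O_n = max f^(n) - min f^(n)."""
    history = [max(f) - min(f)]
    for _ in range(steps):
        f = step(f)
        history.append(max(f) - min(f))
    return history, f

f0 = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]        # arbitrary test function on Z_6
history, f_limit = oscillations(f0, 300)
mean_value = sum(f0) / N                     # mean with respect to the uniform distribution
```

The oscillations in `history` never increase and tend to zero, and every entry of `f_limit` agrees with `mean_value` up to rounding, which is the finite analogue of the constant $M(f)$.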
For an almost periodic function $f(s)$ on a group $G$ we can quite analogously construct the mean value of $f(s)$ by means of the metric introduced in $G$.
Mathematical Institute, Tokyo Imperial University.
(Received Oct. 24, 1940).