CONDITIONAL MEASURES AND CONDITIONAL EXPECTATION;
ROHLIN’S DISINTEGRATION THEOREM
Abstract. The purpose of this paper is to give a clean formulation and proof of Rohlin’s
Disintegration Theorem [Ro52]. Another (possible) proof can be found in [Ma83]. Note
also that our statement of Rohlin’s Disintegration Theorem (Theorem 2.1) is more general
than the statement in either [Ro52] or [Ma83] in that X is allowed to be any universally
measurable space, and Y is allowed to be any subspace of standard Borel space.
Sections 1 - 4 contain the statement and proof of Rohlin’s Theorem. Sections 5 - 7
give a generalization of Rohlin’s Theorem to the category of σ-finite measure spaces with
absolutely continuous morphisms. Section 8 gives a less general but more powerful version
of Rohlin’s Theorem in the category of smooth measures on $C^1$ manifolds. Section 9 is an
appendix which contains proofs of facts used throughout the paper.
1. Notation
We begin with the definition of the standard concept of a system of conditional measures,
also known as a disintegration:
Definition 1.1. Let (X, µ) be a probability space, Y a measurable space, and π : X → Y
a measurable function. A system of conditional measures of µ with respect to (X, π, Y ) is
a collection of measures (µy )y∈Y such that
i) For each y ∈ Y, $\mu_y$ is a measure on $\pi^{-1}(y)$. For $\hat\mu$-almost every y ∈ Y, $\mu_y$ is a
probability measure.
ii) The measures $(\mu_y)_{y \in Y}$ satisfy the law of total probability
(1.1) $\mu(B) = \int \mu_{\pi^{-1}(y)}(B)\, d\hat\mu(y)$
for every event B of X. (Here and throughout this paper $\hat\mu := \mu \circ \pi^{-1}$.) Note that
we are implicitly assuming that the map $y \mapsto \mu_y(B)$ is $\hat\mu$-measurable; we must be
careful to prove this claim.
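As an illustration of Definition 1.1 in the simplest possible setting, the following sketch builds a system of conditional measures on a finite probability space and checks (1.1) with exact rational arithmetic. The finite space, the map `pi`, and the helper names `disintegrate` and `total_probability` are assumptions made for this example only; they are not part of the paper's construction.

```python
from collections import defaultdict
from fractions import Fraction


def disintegrate(mu, pi):
    """Return (mu_hat, cond) for a finite probability space.

    mu   : dict mapping each point x to its mass mu({x})
    pi   : function from points to labels y
    mu_hat is the pushforward mu o pi^{-1}; cond[y] is mu restricted to the
    fiber pi^{-1}(y) and normalized, defined when mu_hat({y}) > 0.
    """
    mu_hat = defaultdict(Fraction)
    for x, mass in mu.items():
        mu_hat[pi(x)] += mass
    cond = {}
    for y, total in mu_hat.items():
        if total > 0:
            cond[y] = {x: m / total for x, m in mu.items() if pi(x) == y}
    return dict(mu_hat), cond


def total_probability(B, mu_hat, cond):
    """Right hand side of (1.1): the sum over y of mu_y(B) * mu_hat({y})."""
    return sum(
        sum(m for x, m in cond[y].items() if x in B) * mu_hat[y] for y in cond
    )


# Hypothetical example: six points with masses 1/21, ..., 6/21, fibered by parity.
mu = {x: Fraction(x + 1, 21) for x in range(6)}
pi = lambda x: x % 2
mu_hat, cond = disintegrate(mu, pi)
B = {0, 1, 2}
lhs = sum(mu[x] for x in B)
rhs = total_probability(B, mu_hat, cond)
```

In this finite setting every fiber of positive mass is an event, so the conditional measures are given directly by Notation 1.2 below; the content of Rohlin's theorem lies in the case where the fibers are null sets.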
The proof that we will give of Rohlin’s disintegration theorem is probabilistic; in particular, we will use the following notations motivated by a probabilistic point of view:
Notation 1.2. Let (X, µ) be a probability space, A a µ-measurable subset of X with
µ(A) > 0. We write
$$P_\mu(X \in B \mid X \in A) := \mu_A(B) := \frac{\mu(B \cap A)}{\mu(A)}$$
$$E_\mu(\psi(X) \mid X \in A) := \int \psi(x)\, d\mu_A(x)$$
To prove the existence of systems of conditional measures, we will use a related concept
which depends on topology:
Definition 1.3. Let (X, µ) be a topological probability space, Y a metric space, and
π : X → Y a measurable function. (π need not be continuous.) Let y ∈ Y . Then the
topological conditional measure of µ with respect to (X, π, y, Y ) is the weak-* limit
(1.2) $\mu_y := \lim_{\varepsilon \to 0} \mu_{\pi^{-1}(B(y,\varepsilon))}$
if it exists and is supported entirely on π −1 (y). (The measures on the right hand side are
defined by Notation 1.2.)
This definition has the advantage of being specific: for each y ∈ Y , there is at most
one measure on π −1 (y) which can be called the conditional probability of µ on π −1 (y). Its
disadvantage is that the context of the definition is less general: X is required to be a
topological space and Y is required to be a metric space.
We recall the following standard definitions:
Definition 1.4. Standard Borel space is the Cantor space $2^{\mathbb{N}}$ with its Borel σ-algebra;
the Borel isomorphism theorem states that any uncountable Polish space with its Borel
σ-algebra is Borel isomorphic to standard Borel space.
Definition 1.5. A universally measurable space is a measurable space X such that there
is an isomorphic embedding iX of X into standard Borel space, such that for every Borel
measure µ on standard Borel space, iX (X) is in the completion of µ.
Definition 1.6. A metric space X is an ultrametric space if it satisfies the ultrametric
triangle inequality
d(x, z) ≤ max(d(x, y), d(y, z))
for all x, y, z ∈ X.
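The ultrametric inequality can be checked by brute force on a finite truncation of the Cantor space. The sketch below assumes the usual metric $d(x, y) = 2^{-n}$, where n is the first index at which x and y differ; the paper does not fix a particular metric on $2^{\mathbb{N}}$, so this choice, and the name `cantor_dist`, are assumptions of the example.

```python
from itertools import product


def cantor_dist(x, y):
    """Distance 2^{-n}, where n is the first index at which x and y differ.

    x and y are equal-length tuples of bits, standing in for truncated
    points of the Cantor space. Equal tuples are at distance 0.
    """
    for n, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return 2.0 ** (-n)
    return 0.0


# All binary strings of length 4, a finite stand-in for the Cantor space.
points = list(product((0, 1), repeat=4))

# Collect every triple that violates d(x, z) <= max(d(x, y), d(y, z)).
violations = [
    (x, y, z)
    for x in points
    for y in points
    for z in points
    if cantor_dist(x, z) > max(cantor_dist(x, y), cantor_dist(y, z))
]
```

The list `violations` comes out empty: if x and z first differ at index n, then y must differ from x or from z at some index at most n.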
2. Statement of Rohlin’s Disintegration Theorem
We will prove two versions of Rohlin’s Theorem; the first, which is a strengthening of the
version given in [Ro52], is an entirely measure-theoretic formulation, whereas the second,
which appears to be new, involves topology. Theorems 2.1 and 2.2 correspond to Definitions
1.1 and 1.3, respectively.
Theorem 2.1 (Rohlin’s Disintegration Theorem). Let X be a universally measurable
space, let Y be a measurable space such that there exists a measurable injective map from Y
into standard Borel space, and let µ be a Borel probability measure on X. Let π : X → Y be
measurable. Then there exists a system of conditional measures (µy )y∈Y of µ with respect to
(X, π, Y ). They are unique in the sense that if (νy )y∈Y is any other system of conditional
measures, then $\mu_y = \nu_y$ for $\hat\mu$-almost every y ∈ Y.
Theorem 2.2. Let (X, µ) be a compact metric probability space, let Y be a locally compact separable ultrametric space or a separable Riemannian manifold. Let π : X → Y be
measurable. Then for $\hat\mu$-almost every y ∈ Y, the topological conditional measure of µ with
respect to (X, π, y, Y ) exists as in Definition 1.3. Furthermore the collection of measures
(µy )y∈Y is a system of conditional measures as in Definition 1.1. (If µy does not exist, set
µy = 0.)
The proof will be divided into 2 parts: deducing Theorem 2.1 from Theorem 2.2, and
proving Theorem 2.2.
3. Proof of Rohlin’s Theorem: Theorem 2.2 → Theorem 2.1
Let $X' = 2^{\mathbb{N}}$ be standard Borel space, and let $i_X : X \to X'$ be the inclusion guaranteed
by the universal measurability of X. Let $\mu' = \mu \circ i_X^{-1}$; $\mu'$ is a probability measure on $X'$.
Then $(X', \mu')$ is a compact metric probability space.
Let $i_Y$ be a measurable injective map from Y into the Cantor space $Y' := 2^{\mathbb{N}}$ equipped
with the Borel σ-algebra. Note that $Y'$ is a locally compact separable ultrametric space.
By [[Sr98] 3.2.3 p.92], the map π admits a Borel measurable extension $\pi' : X' \to Y'$. Note
that there is no reason to suppose that $\pi'$ is continuous.
Thus we have satisfied the hypotheses of Theorem 2.2 for $(X', \mu', \pi', Y')$. (If X and Y
are standard Borel, we are done with existence.) Let $(\mu'_{y'})_{y' \in Y'}$ be a system of conditional
measures of $\mu'$ with respect to $(X', \pi', Y')$. For each y ∈ Y, let $\mu_y = (\mu'_{i_Y(y)} \upharpoonright i_X(X)) \circ (i_X^{-1})^{-1}$
if $\mu'_{i_Y(y)}$ is supported on $i_X(X)$, and $\mu_y = 0$ otherwise. Note that this makes sense since
$i_X(X)$ is universally measurable. We claim that $(\mu_y)_{y \in Y}$ is a system of conditional measures
of µ with respect to (X, π, Y).
First, note that since $\pi'$ is an extension of π, then $i_Y \circ \pi = \pi' \circ i_X$, and thus $\widehat{\mu'} = \hat\mu \circ i_Y^{-1}$.
For all y ∈ Y, $\mu'_{i_Y(y)}$ is a measure on $(\pi')^{-1}(i_Y(y))$. If $\mu'_{i_Y(y)}(X' \setminus i_X(X)) > 0$, then $\mu_y = 0$
is a measure on $\pi^{-1}(y)$. If $\mu'_{i_Y(y)}(X' \setminus i_X(X)) = 0$, then $\mu_y = (\mu'_{i_Y(y)} \upharpoonright i_X(X)) \circ (i_X^{-1})^{-1}$ is
a measure supported on $i_X^{-1}((\pi')^{-1}(i_Y(y)) \cap i_X(X))$, which by the injectivity of $i_Y$ is equal
to $\pi^{-1}(y)$. Furthermore, in this case we have $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$. If additionally $\mu'_{i_Y(y)}$ is a
probability measure, then $\mu_y$ is a probability measure.
Now for $\widehat{\mu'}$-almost every $y' \in Y'$, $\mu'_{y'}$ is a probability measure, and $\mu'_{y'}(X' \setminus i_X(X)) = 0$.
(The second claim follows from (1.1) applied to the formula $\mu'(X' \setminus i_X(X)) = \mu(\emptyset) = 0$.)
Thus for $\hat\mu$-almost every y ∈ Y, $\mu'_{i_Y(y)}$ is a probability measure, and $\mu'_{i_Y(y)}(X' \setminus i_X(X)) = 0$.
By the preceding paragraph, we see that for every y ∈ Y, $\mu_y$ is a measure on $\pi^{-1}(y)$, and
for $\hat\mu$-almost every y ∈ Y, $\mu_y$ is a probability measure and $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$. Thus condition
(i) of Definition 1.1 is satisfied.
To prove condition (ii), fix B ⊆ X measurable. Since $i_X$ is an embedding, there exists
$B' \subseteq X'$ Borel such that $B = i_X^{-1}(B')$. Now for $\hat\mu$-almost every y ∈ Y, $\mu'_{i_Y(y)} = \mu_y \circ i_X^{-1}$
and therefore $\mu'_{i_Y(y)}(B') = \mu_y \circ i_X^{-1}(B') = \mu_y(B)$. Thus the function $y \mapsto \mu_y(B)$ is equal
$\hat\mu$-almost everywhere to the composition of $i_Y$ with the map $y' \mapsto \mu'_{y'}(B')$, and is therefore
$\hat\mu$-measurable.
Finally, note that $\mu'(B') = \mu(B)$. Applying (1.1), we see that
$$\mu(B) = \int \mu'_{y'}(B')\, d\widehat{\mu'}(y') = \int \mu'_{i_Y(y)}(B')\, d\hat\mu(y) = \int \mu_y \circ i_X^{-1}(B')\, d\hat\mu(y) = \int \mu_y(B)\, d\hat\mu(y).$$
Thus (1.1) is satisfied for $(\mu_y)_{y \in Y}$,
which is therefore a system of conditional measures of µ with respect to (X, π, Y).
It remains to show uniqueness. This actually follows from much weaker assumptions:
from now until the end of this proof, rather than assuming that X is universally measurable
and that there exists a measurable injective map from Y into standard Borel space, we will
assume only that X and Y are measurable spaces, and that the σ-algebra of measurable
subsets of X is countably generated. (This follows from the fact that X is universally measurable, since $2^{\mathbb{N}}$ is separable and any subset of a separable measurable space is separable.)
The first step will be to prove uniqueness for each individual event. The separability of
X will allow us to extend to the uniqueness stated in the theorem.
We will need a lemma:
Lemma 3.1. If (µy )y∈Y is a system of conditional measures of µ with respect to (X, π, Y ),
then for all measurable events S ⊆ Y and B ⊆ X,
(3.1) $\mu(\pi^{-1}(S) \cap B) = \int_S \mu_y(B)\, d\hat\mu(y).$
Proof. The result follows directly from (1.1) and is left to the reader. Hint: $\mu_y(\pi^{-1}(S) \cap B) = \chi_S(y)\mu_y(B)$, which follows from the fact that $\mu_y$ is supported entirely on $\pi^{-1}(y)$.
Corollary 3.2. Systems of conditional measures are unique in the sense that if $(\mu_y)_{y \in Y}$
and $(\nu_y)_{y \in Y}$ are two systems of conditional measures for the same measure µ, then for
every event B of X and for $\hat\mu$-almost every y ∈ Y, $\mu_y(B) = \nu_y(B)$. (Note the order of the
quantifiers.)
Proof. By Lemma 3.1,
$$\int_S \mu_y(B)\, d\hat\mu(y) = \int_S \nu_y(B)\, d\hat\mu(y)$$
for every measurable S ⊆ Y. Let $S_1 = \{y \in Y : \mu_y(B) < \nu_y(B)\}$ and $S_2 = \{y \in Y : \mu_y(B) > \nu_y(B)\}$. If $\hat\mu(S_1) > 0$, then taking $S = S_1$ the right hand side would be strictly bigger, thus $\hat\mu(S_1) = 0$.
Similarly, $\hat\mu(S_2) = 0$. Thus $\mu_y(B) = \nu_y(B)$ for $\hat\mu$-almost every y ∈ Y.
This corollary gives us the desired uniqueness for individual events. However, when we
reverse the order of quantifiers, we can only guarantee that $\mu_y(B) = \nu_y(B)$ for a countable collection of events B. Now we use the fact that the σ-algebra of X is countably
generated; let $(B_n)_{n \in \mathbb{N}}$ be a generating sequence. Then the collection of finite intersections
$(\bigcap_{n \in F} B_n)_{F \subseteq \mathbb{N},\ \#(F) < \infty}$ is also countable. Thus for $\hat\mu$-almost every y ∈ Y, $\mu_y(B) = \nu_y(B)$ for each
B in this collection. By [[Co93] 1.6.2 p.45], this implies that for $\hat\mu$-almost every y ∈ Y,
$\mu_y(B) = \nu_y(B)$ for every event B of X, i.e. $\mu_y = \nu_y$ for $\hat\mu$-almost every y ∈ Y.
4. Proof of Rohlin’s Theorem: Theorem 2.2
The heart of the proof is contained in the following lemma:
Lemma 4.1. If ψ : X → R is integrable, then the function
(4.1) $y \mapsto E_\mu(\psi(X) \mid \pi(X) = y) := \lim_{\varepsilon \to 0} E_\mu(\psi(X) \mid \pi(X) \in B(y, \varepsilon))$
is well-defined for $\hat\mu$-almost every y ∈ Y, and is a Radon-Nikodym derivative of $(\psi\mu) \circ \pi^{-1}$
against $\hat\mu$.
For the remainder of this section, we will take (4.1) as a definition of the topological
conditional expected value of ψ with respect to (X, µ, π, y, Y ). The reason that this is not
a good definition in general is that it leads to counterintuitive results. For example, if
$\psi = \chi_{\pi^{-1}(y)}$, then $E_\mu(\psi(X) \mid \pi(X) = y) = 0$, yet for every value of x for which π(x) = y,
ψ(x) = 1. For continuous functions this kind of thing doesn’t happen, which is why it was
necessary to use weak-* convergence in Definition 1.3.
Thus what the lemma really says is that conditional expected values exist almost everywhere, and if you know what their values are when they do exist, you can reconstruct the
expected value of ψ on any event which depends only on π(X ).
Proof of Lemma 4.1: Note that
$$E_\mu(\psi(X) \mid \pi(X) \in B(y, \varepsilon)) = \frac{(\psi\mu) \circ \pi^{-1}(B(y, \varepsilon))}{\hat\mu(B(y, \varepsilon))}.$$
Thus this lemma is really the Lebesgue differentiation theorem (Theorem 9.1) applied to
the space Y and the measures $(\psi\mu) \circ \pi^{-1}$ and $\hat\mu$.
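The ratio in the proof of Lemma 4.1 can be watched converging in a concrete case. The sketch below is a numerical illustration only, under assumptions not made in the paper: X is the unit square with Lebesgue measure, π is projection to the second coordinate, $\psi(x_1, x_2) = x_1 + x_2^2$, and both measures are approximated on a midpoint grid. The function name `conditional_expectation` is hypothetical. For this choice the limit at y should be $1/2 + y^2$.

```python
def conditional_expectation(psi, pi, y, eps, n=200):
    """Approximate E(psi | pi in B(y, eps)) for Lebesgue measure on [0,1]^2.

    The numerator approximates (psi mu) o pi^{-1}(B(y, eps)) and the
    denominator approximates mu_hat(B(y, eps)), both on an n-by-n midpoint grid.
    """
    h = 1.0 / n
    num = 0.0
    den = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        for j in range(n):
            x2 = (j + 0.5) * h
            if abs(pi(x1, x2) - y) < eps:
                num += psi(x1, x2) * h * h
                den += h * h
    if den == 0.0:
        raise ValueError("the ball B(y, eps) has zero approximate mass")
    return num / den


# Assumed example data: psi and pi are illustrative choices, not from the paper.
psi = lambda x1, x2: x1 + x2 ** 2
pi = lambda x1, x2: x2
y = 0.3
radii = (0.2, 0.1, 0.05)
estimates = [conditional_expectation(psi, pi, y, eps) for eps in radii]
limit = 0.5 + y ** 2  # value of the Radon-Nikodym derivative at y
errors = [abs(e - limit) for e in estimates]
```

The errors shrink roughly like $\varepsilon^2/3$ as the radius decreases, which is what one expects for a smooth ψ.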
Only a few technical points stand between this lemma and the full Theorem 2.2.
The first is the distinction between conditional expectation and conditional measure.
Ideally, the conditional measures would be a collection of measures (µy )y∈Y such that for
every integrable ψ : X → R and for every y ∈ Y ,
(4.2) $\int \psi\, d\mu_y = E_\mu(\psi(X) \mid \pi(X) = y).$
However, this is not possible, since the map $A \mapsto E_\mu(\chi_A(X) \mid \pi(X) = y)$ is not countably
additive. (It is finitely additive.) As an example, if $A_n = \pi^{-1}(B(y, \tfrac{1}{n}))$, then
$$\lim_{n \to \infty} E_\mu(\chi_{A_n}(X) \mid \pi(X) = y) = 1 \neq 0 = E_\mu\big(\lim_{n \to \infty} \chi_{A_n}(X) \mid \pi(X) = y\big).$$
The problem is that the expression $E_\mu(\psi(X) \mid \pi(X) = y)$ is itself defined in terms
of a limit, so the inequality above is another way of saying that limits don’t necessarily
commute. This suggests that the solution is to force one of the limits to be uniform.
Another issue that comes up is the issue of null sets. For each integrable ψ, there is a
null set $N_\psi$ outside of which $E_\mu(\psi(X) \mid \pi(X) = y)$ is well defined. If we want (4.2) to hold
for a certain set of points S ⊆ Y and a certain class of functions $\Psi \subseteq \mathbb{R}^X$, then the larger
the set of functions is, the smaller the set of points can be. Specifically, $S \subseteq Y \setminus \bigcup_{\psi \in \Psi} N_\psi$.
Thus if Ψ is uncountable (for example all measurable functions), there is no guarantee that
S is nonempty.
If we think about it a little, these problems really have the same root. The counterexample given above would not be a problem if we were allowed to ignore the point y, since
it is a null set. The problem is that the null sets add up; we need some way to make the
number of terms in our union countable.
What countable set of functions should we use? Based on the comments following Lemma
4.1, it would make sense to use only continuous functions. One nice fact about the class
of continuous functions is that it is separable; i.e. there exists a countable collection of
continuous functions which is dense in the uniform topology. (To see this, note that by
Urysohn’s metrization theorem [[Wi04] 23.1 p.166] X can be embedded in the Hilbert cube
$[0, 1]^{\mathbb{N}}$, which implies that there is a countable collection of continuous functions which
separate points [i.e. the projection maps]. Let Ψ be the Q-algebra generated by these
functions plus the function which is identically one. Then Ψ is countable and separates
points. The uniform closure of Ψ is an R-algebra which contains the constants and separates
points, and so by the Stone-Weierstrass theorem [[Wi04] 44.7 p.292] is equal to C(X). Thus
Ψ is a countable dense subset of C(X).) This is excellent, since it brings in the uniform
limits necessary to make the limits commute.
Now let Ψ be our countable dense set; then $N = \bigcup_{\psi \in \Psi} N_\psi$ is a null set. The first
thing that we want to show is that for all $\psi \in C(X) = \overline{\Psi}$, we have $N_\psi \subseteq N$. (The closure
denoted is the uniform closure.) To see this, pick a sequence $\psi_n \in \Psi$ which tends to ψ
uniformly. Then for all y ∈ Y \ N, $E_\mu(\psi_n(X) \mid \pi(X) = y)$ exists for all n ∈ N. Now
$$E_\mu(\psi(X) \mid \pi(X) = y) = \lim_{\varepsilon \to 0} \lim_{n \to \infty} E_\mu(\psi_n(X) \mid \pi(X) \in B(y, \varepsilon))$$
$$= \lim_{n \to \infty} \lim_{\varepsilon \to 0} E_\mu(\psi_n(X) \mid \pi(X) \in B(y, \varepsilon))$$
$$= \lim_{n \to \infty} E_\mu(\psi_n(X) \mid \pi(X) = y).$$
(The exchange of limits is justified because the convergence as n → ∞ is uniform with respect to ε.)
In particular, $E_\mu(\psi(X) \mid \pi(X) = y)$ exists, so $y \in Y \setminus N_\psi$. By taking the contrapositive we
see that $N_\psi \subseteq N$.
Fixing y ∈ Y \ N , we see that for every ψ ∈ C(X), the limit (4.1) exists. By [[Be82] p.77,
paragraph 1], this implies that the measures $\mu_{\pi^{-1}(B(y,\varepsilon))}$ converge in the weak-* topology
to a limit measure, which we shall call $\mu_y := \lim_{\varepsilon \to 0} \mu_{\pi^{-1}(B(y,\varepsilon))}$, following the notation of
Definition 1.3. Note however that we have not yet proven that $\mu_y$ is supported entirely on
$\pi^{-1}(y)$. (We do know that it is a probability measure, since X is compact.)
The next thing we want to show is that the collection of measures (µy )y∈Y \N satisfies
the Law of Total Probability. Ideally, this should follow directly from the fact that (4.1)
is a Radon-Nikodym derivative of $(\psi\mu) \circ \pi^{-1}$ against $\hat\mu$. As stated earlier, this fact allows
global information to be reconstructed from local information, and that is exactly what the
Law of Total Probability is about. Again, however, we must worry about technicalities.
First, let’s see exactly what Lemma 4.1 buys us. If ψ is any continuous function, then
by integrating both sides of (1.2), we get that $\int \psi\, d\mu_y$ is equal to (4.1). Thus the function
$y \mapsto \int \psi\, d\mu_y$ is well-defined for $\hat\mu$-almost every y ∈ Y, is $\hat\mu$-measurable, and is a Radon-Nikodym
derivative of $(\psi\mu) \circ \pi^{-1}$ against $\hat\mu$. According to the definition of the Radon-Nikodym derivative, this means that for any measurable set S ⊆ Y,
(4.3) $\int_S \int \psi\, d\mu_y\, d\hat\mu(y) = \int_{\pi^{-1}(S)} \psi\, d\mu$
where the left hand side is taken to include the assumption that the map $y \mapsto \int \psi\, d\mu_y$ is
$\hat\mu$-measurable.
The next step requires some care. We want to generalize (4.3) to all bounded
measurable ψ. Let BM(X) be the set of bounded measurable functions from X to R. We
know that C(X) ⊆ BM(X) is a dense subset, if BM(X) is given the topology of monotone
convergence (Lebesgue-Hausdorff Theorem, [[Sr98] 3.1.36 p.91]). Thus it suffices to show
that the set of all ψ ∈ BM(X) which satisfy (4.3) is closed in the topology of monotone
convergence. This follows from three applications of the Monotone Convergence Theorem,
applied to the finite measures µ, $\hat\mu$, and $\mu_y$. Note that if $\psi_n \to \psi$ monotonically as n → ∞, then
$\int \psi_n\, d\mu_y \to \int \psi\, d\mu_y$ monotonically. Thus if $y \mapsto \int \psi_n\, d\mu_y$ is $\hat\mu$-measurable for all n ∈ N, then
$y \mapsto \int \psi\, d\mu_y$ is $\hat\mu$-measurable as well.
Thus, (4.3) is true for all bounded measurable ψ. In particular, if ψ = χB , then (4.3)
simplifies to (3.1). If furthermore S = Y , then (3.1) simplifies to (1.1).
It remains to show that for $\hat\mu$-almost every y ∈ Y, the measure $\mu_y$ is supported entirely
on $\pi^{-1}(y)$. (This is a condition of both Definitions 1.1 and 1.3.)
This is one step in the proof that seems somewhat counterintuitive; it seems like the
fact that $\mu_y$ is supported entirely on $\pi^{-1}(y)$ should follow directly from the fact that the
defining equation for $\mu_y$ converges in the first place. In fact, this would be true if we
knew that π were continuous rather than just measurable.¹ If π is measurable there are
counterexamples. For example suppose that
$$X := [0, 1]^2, \qquad Y := [0, 1], \qquad \pi(x, y) := \begin{cases} y & \text{if } y \neq .5 \\ 0 & \text{if } y = .5 \end{cases}$$
and µ is Lebesgue measure on X. Then $\mu_{.5}$ is supported on $\pi^{-1}(0)$ rather than on $\pi^{-1}(.5)$.
Thus we are forced to resort to a different argument. Note that saying that µy is supported entirely on π −1 (y) is the same as saying that µy ◦ π −1 = δy , where δy is a point mass
at y. We will prove this equality setwise and then reverse the order of quantifiers.
Taking (3.1) and substituting $B = \pi^{-1}(C)$, we get
$$\int_S \chi_C(y)\, d\hat\mu(y) = \int_S \mu_y \circ \pi^{-1}(C)\, d\hat\mu(y)$$
for all measurable sets C, S ⊆ Y. By an argument similar to the proof of Corollary 3.2, we
see that for every measurable set C ⊆ Y and for $\hat\mu$-almost every y ∈ Y, we have
(4.4) $\delta_y(C) = \chi_C(y) = \mu_y \circ \pi^{-1}(C).$
By taking a countable dense collection of sets C we can reverse the order of quantifiers, noting
that the collection of all sets C for which (4.4) holds is closed under monotone convergence.
Finally we plug in C = {y}; thus for $\hat\mu$-almost every y ∈ Y, $\mu_y(\pi^{-1}(y)) = 1$.
5. Generalization to σ-finite measure spaces: Motivation
Now that Rohlin’s Theorem has been proven, we will state and prove a generalization to
the category of σ-finite measure spaces with absolutely continuous morphisms.
The first question to ask: why generalize to σ-finite measure spaces? So far the interpretation of Rohlin’s theorem has been entirely probabilistic, but it is not clear how σ-finite
measures are related to probability. If µ is a nonzero finite measure, it makes sense to
define $P_\mu(X \in B) = \frac{\mu(B)}{\mu(X)}$, and this gives a probability measure. However, if µ is an infinite
σ-finite measure, then the normalization constant µ(X) is ∞, and the preceding formula
makes no sense. Nevertheless, we can still come up with examples of σ-finite measures out
of which probability measures naturally arise:
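¹ If π is continuous, then $\pi_* : M(X) \to M(Y)$ is continuous in the weak-* topology, thus for every y ∈ Y,
$\mu_y \circ \pi^{-1} = \lim_{\varepsilon \to 0} \mu_{\pi^{-1}(B(y,\varepsilon))} \circ \pi^{-1} = \lim_{\varepsilon \to 0} \hat\mu_{B(y,\varepsilon)} = \delta_y$.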
Example 5.1. Let $X = \mathbb{R}^2$, $Y = \mathbb{R}$, $\pi = \pi_y : X \to Y$, and let µ be given by the density
$\rho(x, y)\,dx\,dy := e^{-(x-y)^2/2}\,dx\,dy$. Since $\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \rho(x, y)\,dx\,dy = \infty$, it follows that µ is an
infinite σ-finite measure. However, we will see in Section 8 that $\mu_{\pi^{-1}(0)}$ is given by the
density $e^{-x^2/2}\,dx$, which is finite and in fact normalizes to Gauss measure. Thus it makes
sense to say $P_\mu(X \in B \mid Y = 0) = \frac{1}{\sqrt{2\pi}} \int_B e^{-x^2/2}\,dx$.
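The two claims in Example 5.1 can be sanity-checked numerically. The sketch below is not the argument of Section 8; it only integrates the density along horizontal slices with a trapezoid rule. The name `fiber_mass`, the truncation window, and the step count are assumptions of the example. Every slice y = const turns out to carry the same mass $\sqrt{2\pi}$, so integrating over all y gives infinite total mass, while dividing the slice at y = 0 by $\sqrt{2\pi}$ gives the standard Gaussian density.

```python
import math


def fiber_mass(y, half_width=12.0, steps=24000):
    """Trapezoid-rule approximation of the integral of exp(-(x - y)^2 / 2) dx.

    The integral is truncated to [y - half_width, y + half_width]; the
    neglected tails are of order exp(-half_width^2 / 2).
    """
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = y - half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # endpoints get half weight
        total += weight * math.exp(-((x - y) ** 2) / 2.0)
    return total * h


# Mass of the slices y = -5, 0, 5; each should be close to sqrt(2 * pi).
masses = [fiber_mass(y) for y in (-5.0, 0.0, 5.0)]
gauss_constant = math.sqrt(2.0 * math.pi)
```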
The easiest way to understand this is under the Bayesian understanding of probability,
i.e. probability as degrees of belief. Under this interpretation, we can understand a σ-finite measure as a person who doesn’t have enough information even to say that particular
events have fixed probabilities. Next, the person is told that Y = 0. This does not give
him enough information to say precisely what X is, but he now has a good enough grip on
the situation to be able to symbolize his understanding as a probability measure.
Example 5.2. Let $X = \mathbb{R}^3$, $Z = \mathbb{R}$, $\pi = \pi_z : X \to Z$, and let µ be given by the density
$\rho(x, y, z)\,dx\,dy\,dz := e^{-(x-y+z)^2/2}\,dx\,dy\,dz$. Again µ is an infinite σ-finite measure. The
difference this time is that now $\mu_{\pi^{-1}(0)}$ is also an infinite σ-finite measure; in fact it is
exactly the measure which we called µ in Example 5.1.
The point of this example is that once we admit that σ-finite measures have probabilistic
significance, it makes sense to apply Rohlin’s Theorem in a context where there are no finite
measures to be seen. Going back to the previous discussion, the way we can understand
Example 5.2 is that, as in the previous example, the original measure µ is a person who
doesn’t have enough information even to say that particular events have fixed probabilities.
He is told that Z = 0, but his knowledge of the particle’s location still can’t be represented
as a probability measure. Finally he is additionally told that Y = 0, and then he can say,
for example, that P (X < 0) = .5.
Note further that in this example $\hat\mu$ is not σ-finite. In fact, it is equal to ∞λ, where
λ is Lebesgue measure. Thus, the measure $\hat\mu$ is not really very useful; in particular, the
denominator of (1.2) becomes ∞, making the limit zero. Clearly 0 should not be considered
a conditional measure. The conclusion we can draw is that we need another measure ν
to replace $\hat\mu$ in the denominator of (1.2). ν should have the property that $\hat\mu \ll \nu$. For
example, in this case we could set ν = λ.
6. Generalization to σ-finite measure spaces: Statements of the
generalized Rohlin Theorems
To state the generalized version of Rohlin’s Theorem, we will first need to generalize
definitions 1.1 and 1.3.
Definition 6.1. Let (X, µ) and (Y, ν) be measure spaces, and let π : X → Y be a measurable function. A system of conditional measures of µ with respect to (X, π, Y, ν) is a
collection of measures (νy )y∈Y such that
i) For each y ∈ Y , νy is a (nonnegative) measure on π −1 (y).
ii) For each event B of X, the measures (νy )y∈Y satisfy the generalized law of total
probability:
(6.1) $\mu(B) = \int \nu_y(B)\, d\nu(y)$
Definition 6.2. Let (X, µ) be a topological measure space, (Y, ν) a metric measure space,
and π : X → Y a measurable function. (π need not be continuous.) Let y ∈ Y . Then the
topological conditional measure of µ with respect to (X, π, y, Y, ν) is the weak-* limit
(6.2) $\nu_y := \lim_{\varepsilon \to 0} \frac{\mu \upharpoonright \pi^{-1}(B(y, \varepsilon))}{\nu(B(y, \varepsilon))}$
if it exists, is locally finite, and is supported entirely on π −1 (y).
Note that in both of these definitions we used the symbol νy in place of µy . This allows
us to use these formulas simultaneously with the originals (1.1) and (1.2) in the proof of
Theorem 6.4 without any abuse of notation.
If $\nu = \hat\mu$ and is a probability measure, then these definitions are equivalent to the original
definitions.
Theorem 6.3. Let (X, µ) be a universally measurable σ-finite measure space and let (Y, ν)
be a σ-finite measure space such that there exists a measurable injective map from Y into
standard Borel space. Let π : X → Y be measurable, and suppose that $\hat\mu \ll \nu$. Then
there exists a system of conditional measures (νy )y∈Y of µ with respect to (X, π, Y, ν). For
ν-almost every y ∈ Y , νy is a σ-finite measure. The conditional measures are unique in the
sense that if (γy )y∈Y is any other system of conditional measures, then νy = γy for ν-almost
every y ∈ Y .
Theorem 6.4. Let (X, µ) and (Y, ν) be locally compact locally finite separable metric measure spaces; also assume that Y is either an ultrametric space or a Riemannian manifold.
Let π : X → Y be measurable, and suppose that $\hat\mu \ll \nu$. Then for ν-almost every y ∈ Y,
the topological conditional measure of µ with respect to (X, π, y, Y, ν) exists as in Definition
6.2. Furthermore the collection of measures (νy )y∈Y is a system of conditional measures as
in Definition 6.1.
7. Generalization to σ-finite measure spaces: Proofs of the generalized
Rohlin Theorems
The proofs will follow the same format as the proofs of Theorems 2.1 and 2.2. Theorem
6.3 will be proven first under the assumption that Theorem 6.4 is known, and then Theorem
6.4 will be proven.
Proof of Theorem 6.3 using Theorem 6.4: Since X is σ-finite, let $X = \coprod_{n \in \mathbb{N}} A_n$, where
each $A_n$ has finite µ-measure. Each $A_n$ is a measurable subset of a universally measurable
space, and is therefore also universally measurable. For each n ∈ N, let $i_n : A_n \to 2^{\mathbb{N}}$
be an isomorphic embedding such that $i_n(A_n)$ is universally measurable. By gluing we
obtain an isomorphic embedding $i_X : X \to X' := \mathbb{N} \times 2^{\mathbb{N}}$ such that $i_X(X)$ is universally
measurable. Letting $\mu' = \mu \circ i_X^{-1}$, we see that $(X', \mu')$ is a locally compact locally finite
separable ultrametric measure space.
Similarly, we obtain a measurable injective map $i_Y : Y \to Y' := \mathbb{N} \times 2^{\mathbb{N}}$ with the property
that $\nu' := \nu \circ i_Y^{-1}$ is locally finite. We see that $(Y', \nu')$ is a locally compact locally finite
separable ultrametric measure space. Again by [[Sr98] 3.2.3 p.92], we can extend π to a
Borel measurable map $\pi' : X' \to Y'$. Thus we have satisfied the hypotheses of Theorem 6.4
for $(X', \mu', \pi', Y', \nu')$. Let $(\nu'_{y'})_{y' \in Y'}$ be a system of conditional measures of $\mu'$ with respect
to $(X', \pi', Y', \nu')$.
The remainder of the proof of existence given for Theorem 2.1 is valid, if we replace
each occurrence of $\hat\mu$ by ν, $\widehat{\mu'}$ by $\nu'$, $\mu_y$ by $\nu_y$, and “probability” by “σ-finite”. (Recall
that a locally finite measure on a locally compact separable metric space is σ-finite, so the
measures coming from Definition 6.2 are necessarily σ-finite.)
To prove uniqueness, we shall first assume that µ is finite. In this case, $\int \nu_y(X)\, d\nu(y) = \mu(X) < \infty$,
so $\nu_y(X) < \infty$ for ν-almost every y ∈ Y. Thus the proof of uniqueness given
for Theorem 2.1 still holds, with the same replacements as above.
If µ is a σ-finite measure, again let $X = \coprod_{n \in \mathbb{N}} A_n$, where each $A_n$ has finite µ-measure.
Note that for every system of conditional measures $(\nu_y)_{y \in Y}$ of µ with respect to (X, π, ν)
and for each n ∈ N, $(\nu_y \upharpoonright A_n)_{y \in Y}$ is a system of conditional measures of $\mu \upharpoonright A_n$ with
respect to $(A_n, \pi, \nu)$. Thus if $(\nu_y)_{y \in Y}$ and $(\gamma_y)_{y \in Y}$ are two systems of conditional measures,
then for each n ∈ N, $(\nu_y \upharpoonright A_n)_{y \in Y}$ and $(\gamma_y \upharpoonright A_n)_{y \in Y}$ are both systems of conditional
measures for $(\mu \upharpoonright A_n, \pi, \nu)$. Uniqueness for the finite case implies that for ν-almost every
y ∈ Y, $\nu_y \upharpoonright A_n = \gamma_y \upharpoonright A_n$. Fixing y ∈ Y and letting n ∈ N vary, we see that for ν-almost
every y ∈ Y, $\nu_y \upharpoonright A_n = \gamma_y \upharpoonright A_n$ for all n ∈ N. (The countability of N justifies this
reversal of quantifiers.) For each such y ∈ Y, $\nu_y = \sum_{n \in \mathbb{N}} \nu_y \upharpoonright A_n = \sum_{n \in \mathbb{N}} \gamma_y \upharpoonright A_n = \gamma_y$.
Thus $\nu_y = \gamma_y$ for ν-almost every y ∈ Y.
Now we must prove Theorem 6.4; we will use Theorem 2.2. Note that there are two
different generalizations made from Theorem 2.2 to Theorem 6.4: the generalization from
$\nu = \hat\mu$ to $\nu \gg \hat\mu$, and the generalization from compact spaces to locally compact spaces.
We will deal with the former generalization first; i.e. first we will prove Theorem 6.4 in the
case where X is compact, then we shall generalize.
Proof of Theorem 6.4 in the case where X is compact: Since µ is assumed to be locally finite, it follows that it is finite. Thus the hypotheses of Theorem 2.2 are satisfied for the
normalized measure $\mu_X := \frac{\mu}{\mu(X)}$. It is left to the reader to verify that using µ instead of
$\mu_X$ does not affect (1.1) and (1.2).
The first thing that Theorem 2.2 tells us is that for $\hat\mu$-almost every y ∈ Y, the weak-*
limit (1.2) exists and is entirely supported on $\pi^{-1}(y)$. We compare (1.2) with (6.2), and
see that they differ by a factor of
(7.1) $f(y) := \lim_{\varepsilon \to 0} \frac{\hat\mu(B(y, \varepsilon))}{\nu(B(y, \varepsilon))}.$
By Theorem 9.1 (Lebesgue differentiation theorem), this limit exists and is finite for ν-almost every y ∈ Y. Furthermore the function f thus defined is a Radon-Nikodym derivative of $\hat\mu$ against ν.
Taking products, we see that (6.2) exists and equals $f(y)\mu_y$ for $\hat\mu$-almost every y ∈ Y.
This is of course not enough; we need to show that it exists for ν-almost every y ∈ Y. This
can be remedied by the following argument: Let F be the set of all y ∈ Y such that the
conditional measure $\mu_y$ exists according to Definition 1.3. Then $N := Y \setminus F$ is a $\hat\mu$-nullset,
so $\int_N f(y)\, d\nu(y) = \hat\mu(N) = 0$. Thus f(y) = 0 for ν-almost every y ∈ N. Since (1.2) is
bounded, f (y) = 0 implies that (6.2) is zero. Thus (6.2) is zero for ν-almost every y ∈ N .
(Note that this is true even though N is not necessarily a ν-nullset.)
In conclusion, we have shown that for ν-almost every y ∈ Y , (6.2) exists and is given by
the following formula:
(7.2) $\nu_y = \begin{cases} f(y)\mu_y & \text{if } y \notin N \\ 0 & \text{if } y \in N \end{cases}$
where µy is a conditional measure.
To complete the proof we will need to show that for ν-almost every y ∈ Y , νy is finite
and supported entirely on π −1 (y), and that (6.1) holds for any event B of X.
For the former claim, let y ∈ Y satisfy (7.2). If y ∈ N , then νy = 0 is trivially finite and
supported on π −1 (y). On the other hand, if y ∈ F , then µy exists and satisfies Definition
1.3; i.e. it is a probability measure and is entirely supported on $\pi^{-1}(y)$. Now if f(y) < ∞, then $\nu_y$ is
also finite and supported on π −1 (y). Since f (y) < ∞ for ν-almost every y ∈ Y , we are
done.
For the latter claim, we note that Theorem 2.2 implies (1.1). Thus it suffices to show
that the right hand sides of (1.1) and (6.1) are equal, i.e.
$$\int \mu_y(B)\, d\hat\mu(y) = \int \nu_y(B)\, d\nu(y).$$
This is a direct computation based on (7.2), and is left to the reader.
Proof of Theorem 6.4 in the general case:
Lemma 7.1. If X is a locally compact separable metric space, then there exists an increasing sequence of compact sets Kn ⊆ X such that X = ∪n∈N Kn , and such that for any
compact set K ⊆ X, there exists N such that K ⊆ KN .
Proof. Since X is locally compact, there is an open cover consisting of relatively compact
open sets. By [[Wi04] 16.11 p.112], X is Lindelöf, thus there is a countable subcover
$(U_j)_{j \in \mathbb{N}}$. Let $K_n := \bigcup_{j<n} \overline{U_j}$; as the union of finitely many compact sets, $K_n$ is compact.
Then $X = \bigcup_{j \in \mathbb{N}} U_j \subseteq \bigcup_{n \in \mathbb{N}} K_n \subseteq X$. Finally, suppose that K ⊆ X is compact. Then the
cover $(U_j)_{j \in \mathbb{N}}$ has a finite subcover $(U_j)_{j<N}$. Thus $K \subseteq \bigcup_{j<N} U_j \subseteq \bigcup_{j<N} \overline{U_j} = K_N$. □
Let (Kn )n∈N be as in Lemma 7.1. For each n ∈ N, the compactness of Kn implies that
the quintuple $(K_n, \mu \upharpoonright K_n, \pi, Y, \nu)$ falls into the category of quintuples for which we have
already proven that Theorem 6.4 applies. Denote the topological conditional measure of
$\mu \upharpoonright K_n$ with respect to $(K_n, \pi, y, Y, \nu)$ by $\nu_{n,y}$, if it exists and satisfies the conditions of
Definition 6.2.
Claim 7.2. Fix y ∈ Y , and suppose that for each n ∈ N, the conditional measure νn,y exists
and satisfies the conditions of Definition 6.2. Then the sequence (νn,y )n∈N is monotone
increasing, and the limiting measure νy := limn→∞ νn,y is the topological conditional measure
of µ with respect to (X, π, y, Y, ν).
Proof. The fact that (νn,y )n∈N is monotone increasing follows directly from the fact that
(Kn )n∈N is a monotone increasing sequence of sets. By Lemma 9.4, the map
\[
\nu_y(A) := \lim_{n\to\infty} \nu_{n,y}(A)
\]
is a measure. We need to show that it is a locally finite measure, and that it is equal to
the weak-* limit (6.2). To this end, let $\psi \in C_c^+(X)$; we claim that
\[
\int \psi\, d\nu_y = \lim_{\varepsilon\to 0} \frac{\int \psi\, d\mu{\restriction}\pi^{-1}(B(y,\varepsilon))}{\nu(B(y,\varepsilon))} < \infty. \tag{7.3}
\]
Let N be large enough so that Supp(ψ) ⊆ KN . Then for all n ≥ N , the right hand side
can be calculated entirely on Kn , and is by definition equal to $\int \psi\, d\nu_{n,y} < \infty$. Thus for all
n ≥ N , $\int \psi\, d\nu_{n,y}$ is independent of n. Taking the limit as n tends to infinity yields that the
right hand side of (7.3) is equal to $\lim_{n\to\infty} \int \psi\, d\nu_{n,y}$. But by (9.1), this is equal to $\int \psi\, d\nu_y$,
proving (7.3).
For every x ∈ X, pick an open neighborhood U of x which is relatively compact. By
[[Co93] 7.1.8 p.199], there is a function $\psi \in C_c^+(X)$ with χU ≤ ψ. Then $\nu_y(U) \le \int \psi\, d\nu_y < \infty$.
Thus νy is locally finite, as claimed.
It remains to show that νy is supported entirely on π −1 (y). Fix y ∈ Y and let ψ :=
χX\π−1 (y) ; (9.1) simplifies to
\[
\nu_y(X \setminus \pi^{-1}(y)) = \lim_{n\to\infty} \nu_{n,y}(X \setminus \pi^{-1}(y)) = 0.
\]
□
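Note that the hypotheses of Claim 7.2 are satisfied for ν-almost every y ∈ Y . Thus
the first claim of Theorem 6.4 is proven, and it remains to show that (6.1) is satisfied for
every event B of X. To see this, fix y ∈ Y , and let ψ := χB . Then (9.1) simplifies to
νy (B) = limn→∞ νn,y (B). Integrating both sides with respect to ν, and using the Monotone
Convergence Theorem to exchange the limit and the integral (the sequence νn,y (B) is
increasing in n), yields
\begin{align*}
\int \nu_y(B)\, d\nu(y) &= \int \lim_{n\to\infty} \nu_{n,y}(B)\, d\nu(y) \\
&= \lim_{n\to\infty} \int \nu_{n,y}(B)\, d\nu(y) \\
&= \lim_{n\to\infty} \mu(B \cap K_n) \\
&= \mu(B)
\end{align*}
and we are done.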
8. Application to Differentiable Manifolds
Next, we describe in more detail the specific case where X and Y are manifolds, and π
is a smooth map:
Theorem 8.1. Let X be an (m + n)-dimensional oriented $C^1$ manifold, and let Y be an
n-dimensional oriented $C^1$ Riemannian manifold.² Let π : X → Y be a $C^1$ nonsingular
map. Let µ and ν be nonnegative smooth measures on X and Y , respectively. Suppose that
the densities of µ and ν are given by
\[
\omega := d\mu \in \Gamma(\wedge^{m+n} T^*X), \qquad \lambda := d\nu \in \Gamma(\wedge^{n} T^*Y)
\]
where ω ≥ 0 and λ > 0 are measurable.
Assume additionally that $\sigma \in \Gamma(\wedge^m T^*X)$ is an m-form on X which satisfies
\[
\omega = \sigma \wedge \pi^*\lambda \tag{8.1}
\]
(By Proposition 9.2, we know that such a σ exists and that $i_y^*\sigma$ is defined uniquely by ω
and λ. It should also be clear that if π −1 (y) is given the proper orientation, then $i_y^*\sigma \ge 0$.)
For each y ∈ Y , let νy be the smooth measure on π −1 (y) corresponding to the density
\[
d\nu_y := i_y^*\sigma \in \Gamma(\wedge^m T^*[\pi^{-1}(y)])
\]
where iy : π −1 (y) → X is the inclusion map.
Then the collection of measures (νy )y∈Y is a system of conditional measures according
to Definition 6.1. If additionally σ is continuous and $\lambda \in L^1_{loc}$, then each measure νy is a
conditional measure according to Definition 6.2.
² In place of assuming that X and Y are orientable, it may be assumed that the forms ω, λ, and σ
defined above are forms of odd type in the sense of [[dR84] Section 5, p.19-23].
For example, this theorem can be used to calculate
\[
P_\mu\left( X > .5 \,\middle|\, XY = \tfrac{1}{3} \right) = \frac{\ln(2)}{\ln(3)}.
\]
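The value ln(2)/ln(3) can be sanity-checked numerically. The setup is not restated at this point, so the sketch below assumes that µ is Lebesgue measure on the unit square [0, 1]², that π(x, y) = xy, and that ν is Lebesgue measure on R. Under these assumptions (8.1) is solved by σ = dx/x, whose restriction to the fiber {xy = 1/3} (where 1/3 ≤ x ≤ 1) has total mass ln 3 and gives mass ln 2 to {x > .5}. The helper names band_mass and conditional_probability are hypothetical; they replace conditioning on the fiber by conditioning on the thin band |xy − 1/3| < ε.

```python
import math

def band_mass(t, eps, x_min, n=200_000):
    """Approximate the Lebesgue measure of
    {(x, y) in [0,1]^2 : x > x_min, |x*y - t| < eps}
    with the midpoint rule in x (the y-slice length is known in closed form)."""
    h = (1.0 - x_min) / n
    total = 0.0
    for k in range(n):
        x = x_min + (k + 0.5) * h
        upper = min(1.0, (t + eps) / x)   # top of the band, clipped to the square
        lower = min(1.0, (t - eps) / x)   # bottom of the band, clipped to the square
        total += (upper - lower) * h
    return total

def conditional_probability(t=1.0 / 3.0, eps=1e-4, a=0.5):
    """P(X > a | |XY - t| < eps) under Lebesgue measure on the unit square (assumed setup)."""
    return band_mass(t, eps, a) / band_mass(t, eps, 0.0)

approx = conditional_probability()
exact = math.log(2) / math.log(3)
print(approx, exact)
```

Shrinking eps should move the band estimate toward the fiber value, in line with the limit (6.2).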
We first prove Theorem 8.1 in the exceptionally simple case where X = Rm+n , Y = Rn ,
and π is projection onto the last n coordinates:
Proof of Theorem 8.1, Special Case: Write
\begin{align*}
i_{\vec y}^*\sigma(x_1, \dots, x_m) &:= f(x_1, \dots, x_m, y_1, \dots, y_n)\, dx_1 \wedge \dots \wedge dx_m \\
\lambda(y_1, \dots, y_n) &:= g(y_1, \dots, y_n)\, dy_1 \wedge \dots \wedge dy_n.
\end{align*}
An easy calculation shows that
\[
\omega(x_1, \dots, x_m, y_1, \dots, y_n) = f(x_1, \dots, x_m, y_1, \dots, y_n)\, g(y_1, \dots, y_n)\, dx_1 \wedge \dots \wedge dx_m \wedge dy_1 \wedge \dots \wedge dy_n.
\]
Thus for every event B of X, by Fubini’s theorem
\begin{align*}
\mu(B) &= \int_B f(x_1, \dots, x_m, y_1, \dots, y_n)\, g(y_1, \dots, y_n)\, dV(x, y) \\
&= \int_Y \left[ \int_{i_{\vec y}^{-1}(B)} f(x_1, \dots, x_m, y_1, \dots, y_n)\, g(y_1, \dots, y_n)\, dV(x) \right] dV(y) \\
&= \int_Y \left[ \int_{i_{\vec y}^{-1}(B)} f(x_1, \dots, x_m, y_1, \dots, y_n)\, dV(x) \right] g(y_1, \dots, y_n)\, dV(y) \\
&= \int \nu_y(B)\, d\nu(y).
\end{align*}
Thus the collection of measures (νy )y∈Y is a system of conditional measures according to
Definition 6.1.
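It remains to prove (6.2), assuming that σ is continuous and that $\lambda \in L^1_{loc}$. Since the
convergence is intended to be weak-*, we need to show that for every continuous function
with compact support $\psi : \mathbb{R}^{m+n} \to \mathbb{R}$, (7.3) holds. Since $(\nu_{\vec y})_{\vec y \in \mathbb{R}^n}$ is a system of conditional
measures, the right hand side simplifies to
\[
\lim_{\varepsilon\to 0} E_\nu\left( \int \psi\, d\nu_{\vec Y} \,\middle|\, \vec Y \in B(\vec y, \varepsilon) \right).
\]
Note that this makes sense because $\lambda \in L^1_{loc}$. Writing $\Psi(\vec y) := \int \psi\, d\nu_{\vec y}$, we see that (7.3)
simplifies to
\[
\Psi(\vec y) = \lim_{\varepsilon\to 0} E_\nu\left( \Psi(\vec Y) \,\middle|\, \vec Y \in B(\vec y, \varepsilon) \right) \tag{8.2}
\]
Now the function Ψ is continuous; this follows from the continuity of ψ and of σ, and from
the fact that ψ is compactly supported. By the definition of continuity this means that for
every γ > 0 there exists a δ > 0 such that
\[
\vec Y \in B(\vec y, \delta) \;\Rightarrow\; |\Psi(\vec Y) - \Psi(\vec y)| \le \gamma.
\]
Fixing γ and δ, this implies that
\[
0 < \varepsilon \le \delta \;\Rightarrow\; \left| E_\nu\left( \Psi(\vec Y) \,\middle|\, \vec Y \in B(\vec y, \varepsilon) \right) - \Psi(\vec y) \right| \le \gamma.
\]
Adding the quantifiers back on, this is the same as saying that
\[
\lim_{\varepsilon\to 0} E_\nu\left( \Psi(\vec Y) \,\middle|\, \vec Y \in B(\vec y, \varepsilon) \right) = \Psi(\vec y)
\]
and we are done.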
Proof of Theorem 8.1, General Case. The first thing we will do is to prove (6.1) for every
event B of X. We first claim that it is sufficient to pick an open cover C, and to check that
(6.1) holds whenever B ⊆ U ∈ C.
To see this, note that X is Lindelöf, and thus C has a countable subcover (Un )n∈N . Fix
B ⊆ X measurable, and let Bn := B ∩ (Un \ (∪i<n Ui )). Then B = ∐n∈N Bn , and Bn ⊆ Un .
If we assume that (6.1) holds for each Bn , then it also holds for ∐n≤N Bn for all N ∈ N.
By three applications of the Monotone Convergence Theorem, this implies that (6.1) holds
for B.
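The disjointification step Bn := B ∩ (Un \ ∪i<n Ui) can be sketched on finite sets as follows. This is only a toy illustration with made-up data; the function name disjointify and the sample cover are assumptions, not part of the paper.

```python
def disjointify(cover):
    """Given sets U_0, U_1, ..., return V_n = U_n minus (U_0 ∪ ... ∪ U_{n-1})."""
    seen = set()
    pieces = []
    for u in cover:
        pieces.append(set(u) - seen)   # keep only the points not covered earlier
        seen |= set(u)
    return pieces

# Hypothetical cover of {0, ..., 5} and a set B to be split along it.
cover = [{0, 1, 2}, {2, 3}, {1, 3, 4, 5}]
B = {1, 3, 5}
B_pieces = [B & v for v in disjointify(cover)]
print(B_pieces)
```

Each piece lies inside a single member of the cover, the pieces are pairwise disjoint, and their union is B, which is what the reduction to the case B ⊆ U ∈ C requires.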
Our next step is to construct the cover C. Fix x ∈ X, and let y := π(x) ∈ Y . The fact
that π is nonsingular at x means that the induced map π∗ : Tx (X) → Ty (Y ) is surjective.
By [[Hs81] 4.2.8 p.44], there exist neighborhoods U of x and V of y such that U ⊆ π −1 (V ),
and such that the triple (U, π, V ) is diffeomorphic to a triple (U′, π′, V′), where U′ ⊆ A
and V′ ⊆ B are open subsets of real vector spaces A and B of dimensions (m + n) and n
respectively, and π′ : A → B is a surjective linear map. It is an exercise in linear algebra to
show that any surjective linear map is (linearly) conjugate to a projection map, so without
loss of generality we assume that U′ ⊆ Rm+n , V′ ⊆ Rn , and π′ is projection. Thus by
the special case of Theorem 8.1, (6.1) holds for (U′, π′, V′). But clearly (6.1) is invariant
under diffeomorphism, so (6.1) holds for (U, π, V ). Thus U can be admitted into our cover.
Finally, we define our cover C to be the set of all U defined as above. The above argument
shows that C is a cover, and that (6.1) holds whenever B ⊆ U ∈ C.
Note that each νy is by definition supported entirely on π −1 (y). Thus (νy )y∈Y is a system
of conditional measures.
Now, suppose that σ is continuous, and that λ ∈ L1loc . If y ∈ Y is fixed, then λ(y) is
constant, so by Proposition 9.2, i∗y σ is linearly dependent on ω, which implies that it is
continuous. This implies that νy is locally finite.
Thus all that remains is to show that (6.2) is valid. Since the convergence intended is
weak-*, it suffices to show that (7.3) is valid for all ψ ∈ Cc (X).
First, we consider the case where Supp(ψ) ⊆ U for some U ∈ C. As before, we can move
to local coordinates and take advantage of the special case of Theorem 8.1 proven above.
Finally, let ψ ∈ Cc (X) be a function whose support is not contained in any element of C. Since
Supp(ψ) is compact, there is a finite subcollection (Ui )i<N of C which covers Supp(ψ). By
[[Co93] 7.1.10 p.200], there exist functions (ψi )i<N in Cc (X) such that Supp(ψi ) ⊆ Ui for
all i < N , and such that $\psi = \sum_{i<N} \psi_i$. The above argument shows that (7.3) is valid for
each ψi ; since both sides of (7.3) are linear in ψ, it is valid for ψ.
9. Appendix
The version of the Lebesgue differentiation theorem which we have used is based on the
theory developed in [[Fe69] p.141-169]. Here it is:
Theorem 9.1. (Lebesgue differentiation theorem) Let X be a locally compact separable
ultrametric space or a separable Riemannian manifold. Let µ and ν be locally finite measures
on X. Assume µ << ν. Then the function
\[
f(x) := \lim_{\varepsilon\to 0} \frac{\mu(B(x, \varepsilon))}{\nu(B(x, \varepsilon))}
\]
is well-defined for ν-almost every x ∈ X and is a Radon-Nikodym derivative of µ against
ν.
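As a concrete illustration of the statement (not of the proof), take X = R with ν Lebesgue measure and µ the measure with density 3x² against ν, so that µ((a, b)) = b³ − a³. This choice of measures and the helper names below are assumptions made only for this sketch. The ball ratio at x equals 3x² + ε², which tends to the density 3x² as ε → 0.

```python
def mu_interval(a, b):
    """mu((a, b)) for the assumed measure with density 3*x**2 on R."""
    return b ** 3 - a ** 3

def nu_interval(a, b):
    """nu((a, b)) for Lebesgue measure on R."""
    return b - a

def ball_ratio(x, eps):
    """mu(B(x, eps)) / nu(B(x, eps)) for the ball B(x, eps) = (x - eps, x + eps)."""
    return mu_interval(x - eps, x + eps) / nu_interval(x - eps, x + eps)

# Ratios at x = 2 for eps = 1e-1, ..., 1e-4; the density there is 3 * 2**2 = 12.
ratios = [ball_ratio(2.0, 10.0 ** (-k)) for k in range(1, 5)]
errors = [abs(r - 12.0) for r in ratios]
print(ratios)
```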
Proof. First, note that the case where X is a locally compact separable ultrametric space
can be reduced to the case where X is a compact ultrametric space. To see this, let C be
the cover of X consisting of all compact open balls. (Since X is ultrametric, open balls
are closed, which is why the balls themselves are compact rather than relatively compact.)
Since X is Lindelöf, C has a countable subcover (Un )n∈N . Next, write Vn := Un \ ∪i<n Ui ;
(Vn )n∈N is a partition of X into compact open sets. If the result is true for each of the triples
(Vn , µ↾Vn , ν↾Vn ), then by gluing we obtain the result for (X, µ, ν). Thus in the proof
we shall assume that X is either a compact ultrametric space or a separable Riemannian
manifold.
To make this proof more consistent with the notation found in [[Fe69] p.141-169], we will
write ψ := µ and φ := ν.
In both cases, the first step will be to show that the covering relation
V := {(x, B(x, r)) : x ∈ X, 0 < r < ∞}
is φ-Vitali. (The definitions of covering relation and φ-Vitali relation are found at [[Fe69]
2.8.16 p.151].)
If X is a separable Riemannian manifold, then the discussion in [[Fe69] p.146, paragraph
3] implies that for every compact K ⊆ X, the metric of X is directionally limited at K
according to the definition found in [[Fe69] 2.8.9 p.145]. Thus [[Fe69] 2.8.18 p.152] yields
that our covering relation is φ-Vitali.
If X is a compact ultrametric space, we let R := {r > 0 : ∃x, y ∈ X such that d(x, y) = r}. R
cannot contain any sequence whose elements are all distinct and which does not tend to
zero. To see this, suppose that (rn )n∈N is a sequence in R which does not tend to zero.
Let xn , yn be such that d(xn , yn ) = rn . Then there is a number ε > 0 and a subsequence
of (rn )n∈N which is bounded below by ε. Since X is compact metric, there is a further
subsequence of (xn , yn )n∈N which converges to a point (x, y). From now on (xn , yn )n∈N will
refer to this sub-subsequence. We know that if n ∈ N is large enough, then d(x, xn ) < ε
and d(y, yn ) < ε. Furthermore, d(xn , yn ) = rn ≥ ε. The ultrametric inequality then implies
that rn = d(xn , yn ) = d(x, y) for all sufficiently large n ∈ N. This contradicts the fact that
the (rn )n∈N are all distinct.
Thus R consists of a sequence tending towards zero. (If R is finite then X is also finite,
in which case the proof is trivial.) Index this sequence in decreasing order by (rj )j∈N . We
define a sequence of partitions (Pj )j∈N by letting Pj be the partition of X into open balls
of radius rj . Note that each member of Pj is the union of some subfamily of Pj+1 . Also,
limj→∞ diam(Pj ) = limj→∞ rj+1 = 0. Furthermore P1 is bounded and X is separable, so
we have satisfied the hypotheses of [[Fe69] 2.8.19 p.152]. Thus
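\begin{align*}
V &= \{(x, S) : \exists j \in \mathbb{N} \text{ such that } x \in S \in P_j\} \\
&= \{(x, B(x, r)) : x \in X,\ \exists j \in \mathbb{N} \text{ such that } r = r_j\} \\
&= \{(x, B(x, r)) : x \in X,\ 0 < r < \infty\}
\end{align*}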
is φ-Vitali. (The reason the last equality is true is that for any x ∈ X and for any
0 < r < ∞, there is some j ∈ N such that rj+1 < r ≤ rj . In this case, B(x, r) = B(x, rj ).)
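The structure used here can be seen on a small finite example. The sketch below assumes a toy ultrametric space (binary words of length 3, with d(x, y) = 2^(−k) where k is the first index at which x and y differ); this space and the helper names are illustrative assumptions, and being finite it falls under the trivial case mentioned above. It still shows that open balls of radius rj partition the space, that the partitions refine one another, and that B(x, r) = B(x, rj) whenever rj+1 < r ≤ rj.

```python
from itertools import product

def dist(x, y):
    """Ultrametric on binary words: 2**(-k), k = index of the first differing letter."""
    for k, (a, b) in enumerate(zip(x, y)):
        if a != b:
            return 2.0 ** (-k)
    return 0.0

def ball(points, x, r):
    """Open ball B(x, r) as a frozenset."""
    return frozenset(y for y in points if dist(x, y) < r)

def partition(points, r):
    """The set of distinct open balls of radius r (a partition in an ultrametric space)."""
    return {ball(points, x, r) for x in points}

points = list(product((0, 1), repeat=3))
# The set R of nonzero distances, indexed in decreasing order.
radii = sorted({dist(x, y) for x in points for y in points if x != y}, reverse=True)
sizes = [len(partition(points, r)) for r in radii]
print(radii, sizes)
```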
In either case, we have satisfied the hypotheses of [[Fe69] 2.9.1 p.152 (general assumptions
to be used throughout Section 2.9)], noting that X is separable metric and therefore every
locally finite measure is regular. In particular, the hypotheses of [[Fe69] 2.9.5 p.154] are
satisfied. The conclusion should be interpreted to mean that the derivate
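\begin{align*}
D(\psi, \phi, V, x) :&= (V)\lim_{S\to x} \frac{\psi(S)}{\phi(S)} \\
&= \lim_{\varepsilon\to 0} \left\{ \frac{\psi(S)}{\phi(S)} : (x, S) \in V,\ \operatorname{diam}(S) < \varepsilon,\ \phi(S) \ne 0 \right\} \\
&= \lim_{\varepsilon\to 0} \left\{ \frac{\psi(B(x, r))}{\phi(B(x, r))} : \operatorname{diam}(B(x, r)) < \varepsilon,\ \phi(B(x, r)) \ne 0 \right\}
\end{align*}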
exists and is finite for φ-almost every x ∈ X. (See [[Fe69] p.153] for the definition of
derivate and [[Fe69] p.151] for the definition of the notation (C) lim.)
The statement that a limit of a parameterized collection of sets (Rε )ε>0 exists should be
interpreted to mean that the limits limε→0 sup(Rε ) and limε→0 inf(Rε ) exist and are equal.
By the Sandwich Theorem, this implies that for any map f : (0, ε0 ) → ∪ε>0 Rε such that
f (ε) ∈ Rε , the limit limε→0 f (ε) exists and is equal to this earlier limit. Thus if we take
\[
f(\varepsilon) := \frac{\psi(B(x, \varepsilon/2))}{\phi(B(x, \varepsilon/2))},
\]
then we find that for all x ∈ Supp(φ),
\begin{align*}
D(\psi, \phi, V, x) &= \lim_{\varepsilon\to 0} \frac{\psi(B(x, \varepsilon/2))}{\phi(B(x, \varepsilon/2))} \\
&= \lim_{\varepsilon\to 0} \frac{\psi(B(x, \varepsilon))}{\phi(B(x, \varepsilon))}
\end{align*}
Thus this last line exists and is finite for φ-almost every x ∈ Supp(φ). Since φ(X \Supp φ) =
0, this proves the well-definedness claim of Theorem 9.1.
The proof that the limiting function x ↦ D(ψ, φ, V, x) is a Radon-Nikodym derivative
of ψ against φ is given at [[Fe69] 2.9.7 p.155]. Note that by [[Fe69] 2.9.2 p.153], $\psi_\phi = \psi$
since ψ << φ.
Next, we include a proposition about short exact sequences of vector spaces. The vector
spaces can be over an arbitrary field, but for convenience of notation we assume they are
all real vector spaces. For consistency with the geometric meaning we have considered the
contravariant maps $i^*$ and $\pi^*$; however, corresponding statements could be made about the
covariant maps $i_*$ and $\pi_*$.
Proposition 9.2. Let U , V , and W be finite dimensional vector spaces, and let i : U → V
and π : V → W form a short exact sequence. Suppose that dim(U ) = m and dim(W ) =
n, so that dim(V ) = m + n. If λ ∈ ∧n W ∗ is fixed, then (8.1) defines a law of
proportionality between ω ∈ ∧m+n V ∗ and i∗ σ ∈ ∧m U ∗ . In other words, the set of all
(ω, i∗ σ) which satisfy (8.1) constitutes a one-dimensional linear subspace of ∧m+n V ∗ ⊕
∧m U ∗ .
Proof. Without loss of generality, we may assume that U = Rm , V = Rm+n , W = Rn , that i is
inclusion, and that π is projection. Fixing λ, we can write λ = a dy1 ∧ dy2 ∧ · · · ∧ dyn for some a ∈ R. Next, we
write each $\sigma \in \wedge^m (\mathbb{R}^{m+n})^*$ as
\[
\sigma = c_1\, dx_1 \wedge dx_2 \wedge \dots \wedge dx_m + c_2\, dx_1 \wedge dx_2 \wedge \dots \wedge dx_{m-1} \wedge dy_1 + \dots + c_{\binom{m+n}{m}}\, dy_{n-m+1} \wedge dy_{n-m+2} \wedge \dots \wedge dy_n
\]
where $c_i \in \mathbb{R}$ for each $i \in \{1, 2, \dots, \binom{m+n}{m}\}$. (We have written the above formula as if
n ≥ m; there is no essential difference if n < m.)
Note that all terms except the first will become zero when the operation i∗ is applied,
because each of them contains at least one factor which depends entirely on the y coordinates. Similarly, all terms except the first will become zero when the wedge product is
taken with π ∗ λ, for the same reason.
Thus
i∗ σ = c1 dx1 ∧ dx2 ∧ · · · ∧ dxm
σ ∧ π ∗ λ = ac1 dx1 ∧ dx2 ∧ · · · ∧ dxm ∧ dy1 ∧ dy2 · · · ∧ dyn
and since c1 can vary freely, we have established direct proportionality.
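In the smallest case m = n = 1 the computation can be checked by hand or by machine. The sketch below is an illustration under that assumption only: 1-forms on R² are stored as coefficient pairs (coefficient of dx, coefficient of dy), the wedge of two 1-forms is the 2×2 determinant of their coefficients, and the helper names are hypothetical. With λ = a dy and σ = c1 dx + c2 dy, the coefficient of ω = σ ∧ π∗λ is a·c1 regardless of c2, while i∗σ = c1 dx.

```python
def wedge(alpha, beta):
    """Coefficient of dx^dy in alpha ^ beta, for 1-forms given as (dx-coeff, dy-coeff)."""
    return alpha[0] * beta[1] - alpha[1] * beta[0]

def omega_coefficient(sigma, a):
    """Coefficient of omega = sigma ^ pi^*(a dy), where pi(x, y) = y."""
    pullback_lambda = (0.0, a)   # pi^*(a dy) = a dy on R^2
    return wedge(sigma, pullback_lambda)

def restrict_to_fiber(sigma):
    """Coefficient of i^* sigma on the fiber {y = const}; the dy term is killed."""
    return sigma[0]

a = 3.0
samples = [(2.0, 0.0), (2.0, 5.0), (2.0, -7.0)]   # same c1, different c2
results = [(omega_coefficient(s, a), restrict_to_fiber(s)) for s in samples]
print(results)
```

All three choices of σ give the same pair (ω, i∗σ), and that pair satisfies ω = a · i∗σ, which is the proportionality asserted in Proposition 9.2.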
Corollary 9.3. In the context of Proposition 9.2, there is a natural isomorphism
between ∧m+n V ∗ and ∧m U ∗ ⊗ ∧n W ∗ .
Proof. If λ is fixed, then Proposition 9.2 establishes a linear map ×λ : i∗ σ ↦ σ ∧ π ∗ λ from
∧m U ∗ to ∧m+n V ∗ . Thus the map ×(i∗ σ, λ) := σ ∧ π ∗ λ is well-defined and linear in the
first coordinate. It is clear that × is linear in the second coordinate; thus it is a bilinear
map. By the universal property of tensor products, there is a corresponding map from
∧m U ∗ ⊗ ∧n W ∗ to ∧m+n V ∗ . It can easily be checked that this map is a surjection; since
the domain and codomain have the same dimension (one), this implies that the map is a
bijection.
Finally, a lemma from measure theory is required to complete an earlier argument.
Lemma 9.4. Let X be a measurable space and let (µn )n∈N be a monotonically increasing sequence of measures on X. Then the limiting function µ(A) := limn→∞ µn (A) is a measure;
moreover for every nonnegative measurable function ψ : X → R,
\[
\int \psi\, d\mu = \lim_{n\to\infty} \int \psi\, d\mu_n \tag{9.1}
\]
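Proof. Clearly µ(∅) = 0. Finite additivity is also clear. To show countable subadditivity,
let $A = \coprod_{m\in\mathbb{N}} A_m$. Then
\begin{align*}
\mu(A) &= \lim_{n\to\infty} \mu_n(A) \\
&= \lim_{n\to\infty} \sum_{m\in\mathbb{N}} \mu_n(A_m) \\
&\le \lim_{n\to\infty} \sum_{m\in\mathbb{N}} \mu(A_m) \\
&= \sum_{m\in\mathbb{N}} \mu(A_m)
\end{align*}
Together with finite additivity, this yields countable additivity.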
Thus µ is a measure. To prove (9.1), we first note that if ψ is a characteristic function then
this formula follows directly from the definition. If ψ is simple, then it is a positive linear
combination of characteristic functions, and it is easy to show that (9.1) holds. Finally,
suppose ψm increase monotonically to ψ, where ψm are simple. Then
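\begin{align*}
\int \psi\, d\mu &= \lim_{m\to\infty} \int \psi_m\, d\mu \\
&= \lim_{m\to\infty} \lim_{n\to\infty} \int \psi_m\, d\mu_n \\
&\le \lim_{m\to\infty} \lim_{n\to\infty} \int \psi\, d\mu_n \\
&= \lim_{n\to\infty} \int \psi\, d\mu_n
\end{align*}
To prove the opposite inequality, note that µn ≤ µ, so $\int \psi\, d\mu_n \le \int \psi\, d\mu$. Taking the limit
as n tends to infinity yields the desired result.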
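A small exact computation illustrates (9.1). The sketch below assumes a toy space X = {0, 1, ..., 39} and the increasing sequence µn({k}) = 2^(−k) for k < n and 0 otherwise; both the space and the names are assumptions made for illustration. Exact rational arithmetic is used so that monotonicity and the limit can be compared without rounding.

```python
from fractions import Fraction

POINTS = range(40)   # toy stand-in for the measurable space X

def weight(k):
    """Mass of the point k under the limiting measure mu."""
    return Fraction(1, 2 ** k)

def integral(psi, n):
    """Integral of psi against mu_n, where mu_n({k}) = 2**-k for k < n and 0 otherwise."""
    return sum((psi(k) * weight(k) for k in POINTS if k < n), Fraction(0))

def integral_limit(psi):
    """Integral of psi against the limiting measure mu({k}) = 2**-k."""
    return sum((psi(k) * weight(k) for k in POINTS), Fraction(0))

def psi(k):
    """A nonnegative test function on X."""
    return k

values = [integral(psi, n) for n in range(len(POINTS) + 1)]
print(float(values[5]), float(values[-1]))
```

The integrals increase with n and reach the integral against µ, as (9.1) predicts; on this finite toy space the limit is attained at n = 40.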
References
[Be82] H. Bergström, Weak convergence of measures, Probability and Mathematical Statistics. Academic
Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York-London, 1982. x+245 pp. ISBN:
0-12-091080-2
[Co93] D. Cohn, Measure theory, Reprint of the 1980 original. Birkhäuser Boston, Inc., Boston, MA, 1993.
x+373 pp. ISBN: 0-8176-3003-1
[dR84] G. de Rham, Differentiable manifolds, Forms, currents, harmonic forms. Translated from the French
by F. R. Smith. With an introduction by S. S. Chern. Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], 266. Springer-Verlag, Berlin, 1984.
x+167 pp. ISBN: 3-540-13463-8
[Fe69] H. Federer, Geometric measure theory, Die Grundlehren der mathematischen Wissenschaften, Band
153 Springer-Verlag New York Inc., New York 1969 xiv+676 pp.
[Hs81] C. Hsiung, A first course in differential geometry, Pure and Applied Mathematics. A Wiley-Interscience Publication. John Wiley & Sons, Inc., New York, 1981. xvii+343 pp. ISBN: 0-471-07953-7
[Ma83] D. Maharam, On the planar representation of a measurable subfield, Measure theory, Oberwolfach
1983 (Oberwolfach, 1983), 47-57, Lecture Notes in Math., 1089, Springer, Berlin, 1984.
[Ro52] V. A. Rohlin, On the fundamental ideas of measure theory, Amer. Math. Soc. Translation 1952,
(1952). no. 71, 55 pp.
[Sr98] S. M. Srivastava, A course on Borel sets, Graduate Texts in Mathematics, 180. Springer-Verlag,
New York, 1998. xvi+261 pp. ISBN: 0-387-98412-7
[Wi04] S. Willard, General topology, Reprint of the 1970 original [Addison-Wesley, Reading, MA;
MR0264581]. Dover Publications, Inc., Mineola, NY, 2004. xii+369 pp. ISBN: 0-486-43479-6