2017 ENS lectures on Lorentzian geometry and hyperbolic pdes

Lectures on Lorentzian Geometry and
hyperbolic pdes
Jacques Smulevici*
March 24, 2017
Contents
1 Preliminary remarks
2 Review of differential geometry
2.1 Differential manifolds
2.2 Differential mappings
2.3 Tangent vectors
2.4 The differential of a map
2.5 The inverse function theorem
2.6 Curves
2.7 Vector fields and tangent bundle
2.8 One forms
2.9 Einstein summation convention
2.10 Submanifolds
2.11 Immersions and submersions
2.12 Integral curves
2.13 Tensor fields
2.14 Tensors at a point
2.15 Contraction
2.16 Pull-back
3 Semi-Riemannian manifolds
3.1 Semi-Riemannian metrics
3.2 Examples
3.3 The Levi-Civita connection
3.4 Isometries
3.5 Parallel translation and geodesics
3.6 The physical interpretation of the geodesic equations in the causal case
3.7 The exponential map
4 Curvature
5 Some differential operators
* Laboratoire de Mathématiques, Université Paris-Sud 11, bât. 425, 91405 Orsay, France.
6 The Einstein equations
6.1 Hyperbolicity I: Wave equations satisfied by the Riemann tensor
6.2 Hyperbolicity II: Wave coordinates
7 Semi-Riemannian submanifolds
8 Local geometry
8.1 Two parameters maps
8.2 The Gauss lemma
8.3 Convex neighborhoods
8.4 Arc lengths
8.5 The Riemannian case
8.6 The Lorentzian case
9 Deformation of curves
9.1 From causal to timelike curves
9.2 First variation
10 Causality and global Lorentzian geometry
10.1 Causality and topology
10.2 Quasi-limits
10.3 Causality conditions
10.4 Achronal sets
10.5 Cauchy hypersurfaces
10.6 Global hyperbolicity
10.7 Cauchy developments
10.8 Time and temporal functions
11 The domain of dependence in globally hyperbolic spacetime
11.1 Three extra geometric lemmas
11.2 A local version of the domain of dependence property
11.3 Proof of the domain of dependence
12 Symmetric hyperbolic systems
13 Linear wave equations
A Global volume form and orientable manifolds
B Additional exercises
B.1 Problems
B.2 Solutions
1 Preliminary remarks
These lectures have been designed for a course given at the Ecole Normale of Paris
in 2017.
I used the following books to prepare these lectures:
1. Semi-Riemannian Geometry by O’Neill [O’N83],
2. The Cauchy problem in General Relativity by H. Ringström [Rin09],
3. Mathematical problems of General Relativity I. by D. Christodoulou [].
Most of the beginning is taken from O’Neill. The introduction to the Einstein
equations is more personal. The construction of the time and temporal functions
is taken from [Rin09], which itself followed the papers [BS05], [O’N83], [BS03]. We
claim no originality in these notes, neither in their content, nor in the way they are written (but we do hope the students will find these notes useful!).
The goals of the lectures are to
1. define the basic objects of Riemannian and Lorentzian geometry,
2. give some of the fundamental results in Riemannian and Lorentzian geometry,
3. introduce the student to some analysis (pdes) problems and to the (by now)
standard techniques to tackle them. More specifically, we will focus on hyperbolic pdes arising naturally in the context of Lorentzian geometry.
The working groups that will follow these lectures will cover more advanced topics,
but, to set a target, one of the main goals of these lectures will be to prove the
following theorem.¹
Theorem 1.1. Let $(M, g)$ be a smooth, oriented, time-oriented Lorentzian manifold.
Assume that $(M, g)$ is globally hyperbolic and that $\Sigma$ is a smooth Cauchy hypersurface
with future unit normal $n$. Let $\psi_0$, $\psi_1$ be smooth functions on $\Sigma$ and $F$ a smooth
function on $M$. Then the Cauchy problem
\[ \Box_g \psi = F, \qquad \psi|_\Sigma = \psi_0, \qquad n(\psi)|_\Sigma = \psi_1 \]
admits a unique smooth solution $\psi$.
Obviously, we need to define everything above. In particular, $\Box_g$ will be a wave
type operator generalizing the classical wave operator $-\partial_t^2 + \Delta$ and $\Sigma$ will be a replacement for the hypersurface $t = 0$ in $\mathbb{R}^{n+1}$. The pure pde aspect of the problem will
actually be easy (say easier than solving a general hyperbolic pde in $\mathbb{R}^{n+1}$ as seen in
the PDE class). The hard part will be essentially to split the manifold into small pieces,
on which we can solve the wave equation, and then glue everything together. To allow such a cut and paste argument we need the geometric assumption of "global
hyperbolicity" and a large part of the lectures will prepare us to define and use
that assumption.
2 Review of differential geometry
I expect the students to already be acquainted with most of the content of this section.
Most proofs are left to the reader.
The basic object we will need is that of a differential manifold.
¹ This theorem could be attributed to Leray, following his 1952 lectures on hyperbolic pdes. The notion
of global hyperbolicity used in his lectures is slightly different and in fact he proves much more, since his
main interest was general hyperbolic pdes, not just the wave equations.
2.1 Differential manifolds
Definition 2.1. Let $M$ be a topological space. A coordinate system on $M$ is a homeomorphism $\xi$ from an open set $U \subset M$ onto an open set of $\mathbb{R}^n$.
Two coordinate systems $\xi_1$ and $\xi_2$ overlap smoothly if $\xi_1 \circ \xi_2^{-1}$ and $\xi_2 \circ \xi_1^{-1}$ are
smooth. (Exercise: on which domains of definition?)
Definition 2.2. An atlas of dimension $n$ on $M$ is a collection of coordinate systems with values in $\mathbb{R}^n$ such
that the union of all their domains of definition covers $M$ and such that any two coordinate systems overlap smoothly.
Exercise: what is the dimension of $M$? What if we replace smooth by continuous? (A lot harder.)
An atlas $A$ is called complete if it contains every coordinate system that overlaps
smoothly with every coordinate system of $A$.
Lemma 2.1. Every atlas is contained in a unique complete atlas.
Definition 2.3. A smooth manifold is a Hausdorff space endowed with a complete
atlas.
We will also assume in all of these lectures that all manifolds considered are second countable (i.e. admit a countable basis of open sets). Exercise:
1. what does Hausdorff mean again?
2. what would be a good definition of an open submanifold of $M$?
3. same with the product manifold of $M$ and $N$?
2.2 Differential mappings
Definition 2.4. Let $M$ and $N$ be manifolds. A mapping $\phi : M \to N$ is smooth if for any
coordinate systems $(U_M, \xi_M)$ and $(U_N, \xi_N)$ of $M$ and $N$, the map $\xi_N \circ \phi \circ \xi_M^{-1}$ is smooth.
Notation: We will write $F(M)$ for the set of smooth real valued functions on $M$.
2.3 Tangent vectors
Let $M$ be a manifold and $p \in M$. A tangent vector $X$ at $p$ is a real-valued function
\[ X : F(M) \to \mathbb{R} \]
such that
1. $X$ is $\mathbb{R}$-linear,
2. $X(fg) = f(p)X(g) + g(p)X(f)$.
(Think of $X$ as a derivation at $p$.)
Let $T_pM$ denote the set of all tangent vectors at $p$. $T_pM$ is a real vector space (for
which operations?) called the tangent space to $M$ at $p$.
Definition 2.5. Let $(U, \xi = (x^i))$ be a coordinate system of $M$ and $p \in U$. Let $f \in F(M)$. Then,
\[ \partial_{x^i} f(p) := \frac{\partial (f \circ \xi^{-1})}{\partial x^i}(\xi(p)). \]
Let $\partial_{x^i}|_p : f \mapsto [\partial_{x^i} f](p)$. One then easily checks that $\partial_{x^i}|_p \in T_pM$. Often, we
will drop the $p$ index and simply write $\partial_{x^i}$.
Lemma 2.2. Let $X \in T_pM$ and $f, g \in F(M)$. Then,
1. if $f = g$ on some neighborhood of $p$, then $X(f) = X(g)$,
2. if $f$ is constant on some neighborhood of $p$, then $X(f) = 0$.
This lemma implies that tangent vectors are purely local objects.
Theorem 2.1. If $\xi = (x^\alpha)$ is a coordinate system of $M$ at $p$, then the $(\partial_{x^\alpha}|_p)$ form a basis of $T_pM$.
In particular, $\dim T_pM = \dim M$.
2.4 The differential of a map
Definition 2.6. Let $M$, $N$ be smooth manifolds and $\phi : M \to N$ smooth. For any $X \in T_pM$, define $Y_X : F(N) \to \mathbb{R}$ as $Y_X(g) = X(g \circ \phi)$. Then, $Y_X \in T_{\phi(p)}N$. The map sending
each such $X$ to $Y_X$ is called the differential (or pushforward) of $\phi$ at $p$ and denoted
$d\phi_p$ (so $d\phi_p : T_pM \to T_{\phi(p)}N$).
Exercise:
• Show that $d\phi_p$ is linear and compute it in local coordinates.
• If $N = \mathbb{R}$ and $f \in F(M)$, then $df$ is called the differential of $f$. It defines a mapping
\[ df_p : T_pM \to \mathbb{R}, \]
so it is actually an element of $(T_pM)^*$.
2.5 The inverse function theorem
Theorem 2.2. Let $\phi : M \to N$ be a smooth mapping. The differential map $d\phi_p$ at
some $p \in M$ is a linear isomorphism if and only if there is a neighborhood $U$ of $p$ in
$M$ such that $\phi|_U$ is a diffeomorphism from $U$ onto a neighborhood $\phi(U)$ of $\phi(p)$ in $N$.
Proof. This is a consequence of the usual inverse function theorem for open subsets of
$\mathbb{R}^n$.
2.6 Curves
A curve is a smooth map $\gamma : I \to M$, where $I$ is a non-empty interval. Given a curve
$\gamma$, we write
\[ \dot\gamma(s) = \frac{d\gamma}{ds}(s) = d\gamma_s\left( \frac{d}{ds}\Big|_s \right) \in T_{\gamma(s)}M. \]
A piecewise smooth curve is a continuous map $\gamma : I \to M$, where $I$ is a non-empty interval, such that if $I = [a, b]$, then there exists a finite partition $a = t_0 < t_1 < \dots < t_k = b$ such that each curve segment $\gamma|_{[t_i, t_{i+1}]}$ is a smooth curve, while for an interval
with open endpoints, say $I = [a, b)$, we require that for each non-empty closed interval $J \subset I$, the restriction $\gamma|_J$ is piecewise smooth (this implies that the break points
have no cluster point in $I$).
2.7 Vector fields and tangent bundle
Let $TM = \bigsqcup_{p \in M} T_pM$ be the disjoint union of all the tangent spaces of $M$. We call
$TM$ the tangent bundle of $M$. Associated to $TM$ is a canonical projection defined by
\[ \pi : TM \to M, \qquad (p, X) \mapsto p. \]
$TM$ can be given a natural manifold structure as follows. Let $\xi = (x^\alpha)$ be a coordinate
system of $M$ defined on $U$ and let $p \in U$. For any $v \in T_pM$, we have
\[ v = v^\alpha \partial_{x^\alpha}. \]
The $(v^\alpha)$ then form a global coordinate system on $T_pM$ (called conjugate to the
$(x^\alpha)$). Given $(p, X) \in TM$ with $p \in U$, let $\tilde\xi(p, X) = (\xi(p), X^\alpha)$; then $\tilde\xi$ defines a local coordinate
system of $TM$ and one easily verifies that any two such local coordinate systems
overlap smoothly.
The tangent bundle is in fact the basic example of a vector bundle over $M$.
A vector field on $M$ is by definition a section of $TM$, i.e. a map of the form
\[ X : M \to TM \]
satisfying $\pi \circ X = \mathrm{Id}_M$. We will write $X_p$ to denote the second component of $X(p)$, i.e.
$X_p \in T_pM$ for all $p \in M$.
Given a vector field $X$ and $f \in F(M)$, the function $X(f)$ is by definition $X(f) : p \mapsto X_p(f)$. The set $\Gamma(M)$ of vector fields on $M$ can be naturally given the structure
of a module over the ring $F(M)$.
A vector field $X$ is smooth if $X(f)$ is smooth for any $f \in F(M)$.
Given a local coordinate system $(x^\alpha)$, we associate the vector fields $\partial_{x^\alpha}$, defined
so that $[\partial_{x^\alpha}]_p(f) = \partial_{x^\alpha}|_p(f)$.
For any vector field $X$, we have
\[ X = \sum_\alpha X(x^\alpha)\, \partial_{x^\alpha}. \]
Identification with derivation
A derivation on $F(M)$ is an $\mathbb{R}$-linear function $D : F(M) \to F(M)$ which satisfies the
usual Leibniz property
\[ D(fg) = D(f)\, g + f\, D(g). \]
Exercise:
Every vector field defines a derivation and in fact, every derivation can be realized
as a vector field.
The Lie bracket
Definition 2.7. Let $X$, $Y$ be vector fields. Then, we define the vector field $[X, Y]$ by
\[ [X, Y]_p(f) = X_p(Y(f)) - Y_p(X(f)). \]
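As an illustration, here is the standard coordinate expression of the bracket together with a small example on $\mathbb{R}^2$. It is a sketch obtained by applying the definition to a test function $f$ and using that second derivatives commute; the choice of the fields $\partial_x$ and $x\,\partial_y$ is only for illustration and is not taken from the notes.

```latex
% Coordinate expression, with X = X^\alpha \partial_{x^\alpha}, Y = Y^\alpha \partial_{x^\alpha}
[X, Y] = \left( X^\beta\, \partial_{x^\beta} Y^\alpha - Y^\beta\, \partial_{x^\beta} X^\alpha \right) \partial_{x^\alpha}.
% Example on R^2 with coordinates (x, y): X = \partial_x, Y = x\,\partial_y
[\partial_x, x\,\partial_y](f)
  = \partial_x\bigl( x\, \partial_y f \bigr) - x\, \partial_y \partial_x f
  = \partial_y f,
\qquad\text{i.e.}\qquad [\partial_x, x\,\partial_y] = \partial_y.
```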
Push forward of a vector field by a diffeomorphism
Definition 2.8. Let $\phi : M \to N$ be a diffeomorphism and $X$ a vector field on $M$.
Then, we can define a vector field $d\phi X$ on $N$ by the formula
\[ (d\phi X)(g) = X(g \circ \phi) \circ \phi^{-1}. \]
Note that this definition breaks down if $\phi$ is merely a smooth function.
2.8 One forms
Let $T_pM^* := (T_pM)^*$ be the vector space dual of $T_pM$. Its elements are called covectors. Let $TM^*$ denote the cotangent bundle, obtained as above as the disjoint
union of the $T_pM^*$. A one form is then a smooth section of $TM^*$ (exercise: give $TM^*$ the
structure of a smooth vector bundle). The set of 1-forms is denoted $\Lambda^1(M)$.
Definition 2.9. Let $f \in F(M)$. Then, we define $df \in \Lambda^1(M)$ by $df(X) = X(f)$ for all
tangent vectors $X$ to $M$.
Given a coordinate system $(x^\alpha)$ on $M$, for each $\alpha$, the differential $dx^\alpha$ of the function
$x^\alpha$ is thus a 1-form.
Let $\partial_{x^\alpha}$ denote the coordinate vector fields; then the $(dx^\alpha)$ provide at each
point a dual basis to the $(\partial_{x^\alpha})$. Moreover, we have
\[ df = \sum_\alpha \partial_{x^\alpha}(f)\, dx^\alpha. \]
2.9 Einstein summation convention²
By convention, we will remove the symbol $\sum$ every time we have repeated indices.
For instance, in the previous formula, we write
\[ df = \partial_{x^\alpha}(f)\, dx^\alpha. \]
Again by convention, indices corresponding to the components of vectors are put in upper position, and indices corresponding to components of covectors in lower position (as in $X = X^\alpha \partial_{x^\alpha}$ and $\theta = \theta_\alpha dx^\alpha$). With this convention, a summed index will always appear once up and
once down (and no sum with indices only up should appear).
2.10 Submanifolds
Definition 2.10. A manifold $P$ is a submanifold of $M$ if
• $P \subset M$ and $P$ has the induced topology of $M$.
• The inclusion map $i : P \to M$ is smooth and its differential is injective at every point.
Proposition 2.1. If $P^m$ is a submanifold of $M^n$, then for all $p \in P$, there exists a coordinate system $(x^1, \dots, x^m, x^{m+1}, \dots, x^n)$ defined on $U \subset M$ such that $p \in U$ and
\[ P \cap U = \{ x^{m+1} = 0, \dots, x^n = 0 \}. \]
² According to the usual folklore knowledge, Einstein thought of this as his greatest invention. Let me
know if you know an exact citation on this.
Proof. Use the inverse function theorem.
We call such a coordinate system adapted to $P$.
Proposition 2.2. A subset $P \subset M$ is a submanifold of $M$ if for each point $p$ of $P$, there
is a coordinate system of $M$ adapted to $P$.
Let $P$ be a submanifold of $M$, $p \in P$ and denote the inclusion map by $i$. Then
\[ di_p : T_pP \to T_pM. \]
Since $di_p$ is injective, we will identify $T_pP$ with $di_p(T_pP)$ and therefore view $T_pP$ as
a vector subspace of $T_pM$. A vector field $X$ on $M$ is then tangent to $P$ if for all $p \in P$,
$X_p \in T_pP$.
2.11 Immersions and submersions
Definition 2.11. An immersion is a smooth map $\phi : M \to N$ such that $d\phi_p$ is injective
for all $p \in M$.
Definition 2.12. An embedding of $P$ into $M$ is an immersion $\phi : P \to M$ such that
1. $\phi$ is injective,
2. the induced map $\bar\phi : P \to \phi(P)$ is a homeomorphism, where $\phi(P)$ carries the induced topology.
Clearly, the restriction of any immersion to any sufficiently small open set is an
embedding.
Exercise:
1. If $P$ is a submanifold of $M$ then the inclusion map $i : P \to M$ is automatically
an embedding.
2. Conversely, let $\phi : P \to M$ be an embedding. Give $\phi(P)$ a manifold structure
such that $\bar\phi : P \to \phi(P)$ is a diffeomorphism. Then $\phi(P) \subset M$ and the inclusion
is an immersion, so that $\phi(P)$ is a submanifold of $M$.
Definition 2.13. Let $\psi : M^m \to N^n$ be a smooth map and let $q \in N$. $q$ is said to be
regular provided that $d\psi_p$ is onto for every $p \in \psi^{-1}(q)$.
Lemma 2.3. If $q \in \psi(M)$ is regular, then $\psi^{-1}(q)$ is a submanifold of $M$ and $\dim M = \dim N + \dim \psi^{-1}(q)$.
A hypersurface is a submanifold of codimension 1. We have
Corollary 2.1. Let $c$ be a value of $f : M \to \mathbb{R}$. If $df_p \neq 0$ for every $p \in f^{-1}(c)$, then
$f^{-1}(c)$ is a submanifold of $M$ (called a level set of $f$).
Definition 2.14. A submersion $\psi : M \to B$ is a smooth mapping onto $B$ such that $d\psi_p$
is onto for all $p \in M$.
2.12 Integral curves
A curve $\alpha$ is an integral curve of $X \in \Gamma(M)$ if $\dot\alpha(s) = X_{\alpha(s)}$ for all $s$.
In coordinates $(x^i)$, an integral curve is a solution to
\[ \frac{d(x^i \circ \alpha)}{ds} = X^i(x^1(\alpha), \dots, x^n(\alpha)). \]
By Cauchy-Lipschitz, for any $X \in \Gamma(M)$ and any $p \in M$, there exists a unique maximal
integral curve of $X$ starting at $p$.
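To make the coordinate form of the integral curve equation concrete, here is a minimal numerical sketch. It assumes the rotation field $X = -y\,\partial_x + x\,\partial_y$ on $\mathbb{R}^2$ (an example chosen here, not taken from the notes) and uses a classical fourth-order Runge-Kutta scheme; the names `rotation_field` and `integral_curve` are hypothetical helpers introduced only for this illustration.

```python
import math


def rotation_field(point):
    """Components (X^1, X^2) of X = -y d/dx + x d/dy at the point (x, y)."""
    x, y = point
    return (-y, x)


def integral_curve(field, start, duration, steps=1000):
    """Approximate the endpoint of the integral curve of `field` through `start`.

    Solves d(x^i o alpha)/ds = X^i(alpha(s)) on [0, duration] with the
    classical RK4 scheme and returns alpha(duration).
    """
    h = duration / steps
    point = tuple(start)
    for _ in range(steps):
        k1 = field(point)
        k2 = field(tuple(p + 0.5 * h * k for p, k in zip(point, k1)))
        k3 = field(tuple(p + 0.5 * h * k for p, k in zip(point, k2)))
        k4 = field(tuple(p + h * k for p, k in zip(point, k3)))
        point = tuple(
            p + h / 6.0 * (a + 2 * b + 2 * c + d)
            for p, a, b, c, d in zip(point, k1, k2, k3, k4)
        )
    return point


# The exact integral curve through (1, 0) is s -> (cos s, sin s).
end = integral_curve(rotation_field, (1.0, 0.0), math.pi / 2)
```

For this field the integral curves are circles centred at the origin, so the computed endpoint can be compared with $(\cos s, \sin s)$ at $s = \pi/2$.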
Exercises:
• Give a definition of the flow of a vector field using ode theory.
• For any vector field $X$ such that $X_p \neq 0$, there exists a local coordinate system
$(x^\alpha)$ at $p$ such that $X = \partial_{x^1}$.
2.13 Tensor fields
Definition 2.15. A tensor field $T$ of type $(r, s)$ over a manifold $M$ is a map of the form
\[ T : (\Lambda^1(M))^r \times (\Gamma(M))^s \to F(M) \]
which is $F(M)$-multilinear.
We will denote by $T^r_s$ the set of tensor fields of type $(r, s)$.
Definition 2.16 (Tensor product). Given $T_1$ a tensor field of type $(r_1, s_1)$ and $T_2$ of
type $(r_2, s_2)$, we define $T_1 \otimes T_2$ as
\[ T_1 \otimes T_2 : (\Lambda^1(M))^{r_1 + r_2} \times (\Gamma(M))^{s_1 + s_2} \to F(M), \]
\[ (\theta^1, \dots, \theta^{r_1 + r_2}, X_1, \dots, X_{s_1 + s_2}) \mapsto T_1(\theta^1, \dots, \theta^{r_1}, X_1, \dots, X_{s_1})\, T_2(\theta^{r_1 + 1}, \dots, \theta^{r_1 + r_2}, X_{s_1 + 1}, \dots, X_{s_1 + s_2}). \]
Exercise:
1. Define the components $T^{b_1 \dots b_r}_{a_1 \dots a_s}$ of an $(r, s)$ tensor field $T$.
2. How do the components change under a change of basis?
3. Define an $(r, s)$ tensor bundle and tensor fields as sections of that bundle.
2.14 Tensors at a point
First a classical lemma.
Lemma 2.4. Given any neighborhood $U$ of a point $p \in M$, there is a function $f \in F(M)$, called a bump function at $p$, such that
1. $0 \le f \le 1$ on $M$,
2. $f = 1$ on some neighborhood of $p$,
3. the support of $f$ is included in $U$:
\[ \mathrm{supp}\, f := \overline{\{ q \in M : f(q) \neq 0 \}} \subset U. \]
Proof. Exercise.
Proposition 2.3. Let $p \in M$ and $A \in T^r_s$. Let $(\theta^i)_{1 \le i \le r}$, $(\nu^i)_{1 \le i \le r}$ be $2r$ one forms such
that $\theta^i_p = \nu^i_p$ for all $i$. Similarly, let $(X_i)$, $(Y_i)$ for $1 \le i \le s$ be $2s$ vector fields on $M$ such
that $X_i(p) = Y_i(p)$.
Then,
\[ A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)(p) = A(\nu^1, \dots, \nu^r, Y_1, \dots, Y_s)(p). \]
Proof. We do the proof in the case of tensor fields of type $(1, 1)$. Let $\theta$ be a one form
and $X$ be a vector field. Let $(U, x^\alpha)$ be a coordinate system around $p$ and let $f$ be a
bump function at $p$ with support in $U$. On $U$, we can decompose $X$ and $\theta$ as
\[ X = X^\alpha \partial_{x^\alpha}, \tag{1} \]
\[ \theta = \theta_\alpha dx^\alpha. \tag{2} \]
Here all the components $X^\alpha$, $\theta_\alpha$ are smooth functions on $U$. Moreover, $fX^\alpha$,
$f\theta_\alpha$, $f\partial_{x^\alpha}$, $f dx^\alpha$ are all smooth in $U$ and can be extended trivially to the whole of $M$
by 0. (We denote the resulting objects by the same letters.)
Thus, by $F(M)$-multilinearity,
\[ A(f^2\theta, f^2X) = f\theta_\alpha \cdot fX^\beta\, A(f dx^\alpha, f\partial_{x^\beta}). \]
Evaluated at $p$, this gives
\[ f^4(p) A(\theta, X)(p) = A(f^2\theta, f^2X)(p) = f(p)\theta_\alpha(p)\, f(p)X^\beta(p)\, A(f dx^\alpha, f\partial_{x^\beta})(p), \]
so that, since $f(p) = 1$,
\[ A(\theta, X)(p) = \theta_\alpha(p) X^\beta(p)\, A(f dx^\alpha, f\partial_{x^\beta})(p). \]
The right-hand side depends on $\theta$ and $X$ only through their values at $p$.
This implies that given a tensor field $A$ of type $(r, s)$, we can define a map
\[ A_p : (T_pM^*)^r \times (T_pM)^s \to \mathbb{R} \]
as follows. Given $\omega^1, \dots, \omega^r \in T_pM^*$ and $x_1, \dots, x_s \in T_pM$, let $\theta^i$, $X_i$ be respectively one
forms and vector fields such that $\theta^i(p) = \omega^i$, $X_i(p) = x_i$. Then, define
\[ A_p(\omega^1, \dots, \omega^r, x_1, \dots, x_s) = A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)(p). \]
At each $p$, $A_p$ is $\mathbb{R}$-multilinear (it is thus a tensor over the vector space $T_pM$).
Moreover, the above proposition implies that given any open set $U$ of $M$, the
restriction $A|_U$ of $A$ to $U$ is a well defined tensor field on $U$.
In particular,
Definition 2.17. Let $(U, \xi = (x^\alpha))$ be a coordinate system. Then, for any tensor $A$ of type $(r, s)$, the components of $A$ (relative to $\xi$) are the real-valued functions
\[ A^{i_1 \dots i_r}_{j_1 \dots j_s} = A(dx^{i_1}, \dots, dx^{i_r}, \partial_{j_1}, \dots, \partial_{j_s}). \]
Lemma 2.5. With the above notation,
\[ A = A^{i_1 \dots i_r}_{j_1 \dots j_s}\, \partial_{i_1} \otimes \dots \otimes \partial_{i_r} \otimes dx^{j_1} \otimes \dots \otimes dx^{j_s}. \]
Proof. Show that the left and right-hand sides take the same value on each $(dx^{i_1}, \dots, dx^{i_r}, \partial_{j_1}, \dots, \partial_{j_s})$
and use the $F(U)$-multilinearity.
Tensor transformation rules. We give the transformation rule for tensors of type
$(1, 1)$ and leave the general case as an exercise.
Lemma 2.6. Let $A$ be a tensor field of type $(1, 1)$ and let $(x^\alpha)$, $(y^i)$ be two coordinate
systems defined on a common open set $U$. Denote the components of $A$ with respect
to $(x)$ as $a^\alpha_\beta$ and with respect to $(y)$ as $b^i_j$; then
\[ a^\alpha_\beta = \frac{\partial x^\alpha}{\partial y^i} \frac{\partial y^j}{\partial x^\beta}\, b^i_j. \]
Remark 2.1. In the above formula, the interpretation of, say, $\frac{\partial x^\alpha}{\partial y^i}$ is that for each $\alpha$, $x^\alpha$
is a function on $U$ and for each $i$, $\partial_{y^i}$ is a vector field. Thus, $\partial_{y^i}(x^\alpha)$ $(= \frac{\partial x^\alpha}{\partial y^i})$ is well
defined.
Proof. Just use the previous formula giving $A$ in terms of its components and apply
the change of coordinate rules to each of $\partial_{x^\alpha}$, $dx^\alpha$.
A trivial remark: let $(U, \xi = (x^\alpha))$ be a coordinate system.
Let $X^\alpha(x^1, \dots, x^n)$ be smooth functions defined on $\xi(U)$. Then $X := (X^\alpha \circ \xi)\, \partial_{x^\alpha}$
defines a vector field on $U$. Now let $(X')^\alpha$ be another set of functions and $\xi' = (y^\alpha)$
be another coordinate system. Then $X' := ((X')^\alpha \circ \xi')\, \partial_{y^\alpha} = X$ provided that the usual
rule for changes of coordinates applies to $X^\alpha$ and $(X')^\alpha$. Thus, a “field of components” given relative to a coordinate system defines a good tensor field provided it
satisfies the usual rule for coordinate transformations.
Remark 2.2. The components of a tensor need not be taken only with respect to a coordinate
induced basis. More precisely, a basis of vector fields is a set of vector fields $X_i$ such
that at each $p$, the $X_i(p)$ form a basis of $T_pM$. One can then compute a dual basis of
one forms, and finally decompose all tensors with respect to these bases.
2.15 Contraction
The contraction is an operation on tensor fields taking $(r, s)$ types to $(r-1, s-1)$ types.
It can be viewed as a trace operation. More precisely,
Lemma 2.7. There is a unique $F(M)$-linear function $C$ defined on tensors of type $(1, 1)$
such that $C(X \otimes \theta) = \theta(X)$.
Proof. In coordinates, given $A^\beta_\alpha$ the components of a tensor of type $(1, 1)$, check that
$C(A) = A^\alpha_\alpha$ necessarily and that this defines $C$ properly.
Exercise: define a contraction $C^j_i$ over the indices $i$ and $j$ for tensors of type $(r, s)$
and $1 \le i \le r$, $1 \le j \le s$.
2.16 Pull-back
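Let $\phi : M \to N$ be a smooth map. Let $\omega \in \Lambda^1(N)$. Then, we can define a one-form $\phi^*(\omega)$ on
$M$ by
\[ \forall p \in M,\ v \in T_pM, \quad \phi^*(\omega)(v) = \omega_{\phi(p)}(d\phi_p v). \]
Similarly, if $A$ is a $(0, s)$ tensor field on $N$, then define for each $p$ in $M$ and $v_1, \dots, v_s \in T_pM$
\[ \phi^*(A)(v_1, \dots, v_s) = A_{\phi(p)}(d\phi_p v_1, \dots, d\phi_p v_s). \]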
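In local coordinates the pull-back is computed with the Jacobian of the map. The following sketch assumes coordinates $(x^i)$ on $M$ and $(y^a)$ on $N$ and writes $\phi^a := y^a \circ \phi$; this notation is introduced here for illustration and does not appear in the notes.

```latex
% Pull-back of the coordinate one-forms
\phi^*(dy^a) = d(y^a \circ \phi) = \frac{\partial \phi^a}{\partial x^i}\, dx^i,
% Components of the pull-back of a (0,s) tensor field A on N
\bigl(\phi^* A\bigr)_{i_1 \dots i_s}
  = \frac{\partial \phi^{a_1}}{\partial x^{i_1}} \cdots \frac{\partial \phi^{a_s}}{\partial x^{i_s}}\,
    \bigl( A_{a_1 \dots a_s} \circ \phi \bigr).
```

Both lines follow from the definition applied to the coordinate vector fields, since the differential of $\phi$ sends $\partial_{x^i}$ to $\frac{\partial \phi^a}{\partial x^i}\, \partial_{y^a}$.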
3 Semi-Riemannian manifolds
3.1 Semi-Riemannian metrics
Definition 3.1. A scalar product $g$ on a real vector space $V$ is a non-degenerate symmetric bilinear form on $V$.
Non-degenerate means that, for each $v \neq 0 \in V$, there exists $x \in V$ such that $g(v, x) \neq 0$.
Equivalently, any matrix representing $g$ is invertible.
We say that $v$ and $w$ are orthogonal, denoted $v \perp w$, if $g(v, w) = 0$.
If $W$ is a subspace of $V$, we define as usual $W^\perp$ and we have
Lemma 3.1.
\[ \dim W + \dim W^\perp = \dim V, \qquad (W^\perp)^\perp = W. \]
Because $g(v, v)$ can be negative, the norm $|v|$ of a vector is defined to be
\[ |v| = |g(v, v)|^{1/2} \]
(but it does not define a norm in the sense of normed vector spaces unless $g$ is positive definite). A unit vector is a vector of norm 1, i.e. such that $g(v, v) = \pm 1$. As usual
a set of mutually orthogonal unit vectors is said to be orthonormal.
It follows from the standard theory of quadratic forms that
Lemma 3.2. A scalar product on $V$ $(\neq \{0\})$ has an orthonormal basis.
If $(e_i)$ is an orthonormal basis, the number of $i$ such that $g(e_i, e_i) < 0$ is independent of the choice of orthonormal basis (cf. Sylvester’s law of inertia on the signature
of quadratic forms).
The signature of $g$ is typically written $-+\dots+$ or $+\dots+$ (each minus sign corresponds
to an $i$ such that $g(e_i, e_i) < 0$, etc.).
Definition 3.2. Let $g$ be a $(0, 2)$ tensor field. $g$ is said to be symmetric if for any vector
fields $X$ and $Y$, $g(X, Y) = g(Y, X)$.
If $g_{\alpha\beta}$ are the components of $g$ in some basis of vector fields, then $g$ is symmetric
if and only if $g_{\alpha\beta} = g_{\beta\alpha}$.
If $g$ is a $(0, 2)$ symmetric tensor field, then at each $p$, $g_p$ is a symmetric bilinear
form on $T_pM$. We say that $g$ is non-degenerate if $g_p$ is non-degenerate at each $p$.
Moreover, at each $p$, we can compute the signature of $g_p$.
We can thus define a metric tensor as follows.
Definition 3.3. A metric tensor $g$ on $M$ is a symmetric non-degenerate $(0, 2)$ tensor
field on $M$ of constant signature. $(M, g)$ is then called a semi-Riemannian manifold.
If $g$ is positive definite, then $(M, g)$ is called a Riemannian manifold; if $\mathrm{sign}\, g = -+\dots+$, then $(M, g)$ is called a Lorentzian manifold.
The name pseudo-Riemannian manifold is also used instead of semi-Riemannian
manifold (not to be confused with sub-Riemannian).
Let $(M, g)$ be a semi-Riemannian manifold and $(x^\alpha)$ a local coordinate system.
Since $g$ is non-degenerate, the matrix of its components $(g_{\alpha\beta})$ is invertible and its inverse will be denoted $(g^{\alpha\beta})$.
Exercise: Show that $g^{\alpha\beta}$ defines a $(2, 0)$ tensor. (Hint: let $(x^\alpha)$ and $(y^\alpha)$ be two
coordinate systems. Let $g_{\alpha\beta}$ be the components in the first system and $g'_{\alpha\beta}$
be the components in the second. We have
\[ (g')^{\alpha\beta} g'_{\beta\gamma} = g^{\alpha\beta} g_{\beta\gamma} = \delta^\alpha_\gamma; \]
write $g_{\beta\gamma}$ in terms of $g'_{\beta\gamma}$ and deduce that $g^{\alpha\beta}$ transforms like a $(2, 0)$ tensor.)
If $X$ is a vector field of components $X^\alpha$, then
\[ Y \in \Gamma(M) \mapsto g(X, Y) \]
is $F(M)$-linear and therefore defines a 1-form, denoted ${}^\flat X$. In components, we will
write
\[ {}^\flat X = X^\alpha g_{\alpha\beta}\, dx^\beta =: X_\beta\, dx^\beta, \]
i.e. the components of the corresponding 1-form are denoted by the same letter but
with the indices down.
Lemma 3.3. The map $\flat : \Gamma(M) \to \Lambda^1(M)$ is an $F(M)$-linear isomorphism.
Proof. First note that if $g(V, X) = g(W, X)$ for all $X \in \Gamma(M)$, then $V = W$ ($g$ is non-degenerate), so $\flat$ is injective.
Let $\theta \in \Lambda^1(M)$ and consider $(U, (x^\alpha))$ a coordinate system, $g_{\alpha\beta}$ the components
of $g$ and $g^{\alpha\beta}$ the components of the inverse matrix of $(g_{\alpha\beta})$. The map ${}^\sharp\theta : p \mapsto g^{\alpha\beta}(p)\theta_\alpha(p)\partial_{x^\beta}$
then defines a vector field on $U$. Moreover, given any other coordinate system defined on $V$, constructing ${}^\sharp\theta'$ similarly, one easily checks that ${}^\sharp\theta'$ and ${}^\sharp\theta$ must agree on
$U \cap V$ if non-empty. In other words, ${}^\sharp\theta$ can actually be defined as a global vector field
on $M$.
Now we have $g({}^\sharp\theta, X) = \theta(X)$ by construction (evaluate everything in the $(x^\alpha)$
coordinate system), so $\flat$ is onto.
Thus, ${}^\sharp\nu$ is given by
\[ {}^\sharp\nu : \theta \mapsto g^{-1}(\theta, \nu) \]
and by convention
\[ {}^\sharp\nu^\alpha = \nu_\beta g^{\alpha\beta}. \]
The raising and lowering of indices can then be generalized to arbitrary tensors in
the natural way.
Definition 3.4. Let $(M, g)$ be a semi-Riemannian manifold. Let $X \in T_pM$; then $X$
is said to be spacelike if $g_p(X, X) > 0$, timelike if $g_p(X, X) < 0$ and null if $X \neq 0$ and
$g_p(X, X) = 0$.
The set of all null vectors in $T_pM$ is called the nullcone at $p \in M$ and the union
of all null and timelike vectors is the set of all causal vectors.
If $\alpha$ is a smooth curve, then $\alpha$ is said to be timelike if all its tangent vectors are
timelike, etc.
Exercise: Extend the above definition to submanifolds of $M$.
The line element: $g$ can be written as
\[ g = g_{\alpha\beta}\, dx^\alpha \otimes dx^\beta. \]
By polarization, $g$ can of course be reconstructed from the map $q : v \in T_pM \mapsto g(v, v)$. The map $q$ is called the line element of the metric $g$ and is typically denoted
$ds^2$ with
\[ ds^2 = g_{\alpha\beta}\, dx^\alpha dx^\beta \]
(see for instance the explanations p. 56 in O’Neill [O’N83]).
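For reference, the polarization step can be written out explicitly. The identity below is the standard one for symmetric bilinear forms; the two-dimensional Minkowski example with coordinates $(t, x)$ is chosen here only as an illustration.

```latex
% Polarization: the quadratic form q(v) = g(v, v) determines g
g(v, w) = \tfrac{1}{2}\bigl( q(v + w) - q(v) - q(w) \bigr), \qquad v, w \in T_pM.
% Example: two-dimensional Minkowski space with coordinates (t, x)
ds^2 = -dt^2 + dx^2
\quad\Longleftrightarrow\quad
\eta = -dt \otimes dt + dx \otimes dx.
```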
3.2 Examples
The simplest example of a Riemannian manifold is $(\mathbb{R}^n, \delta)$, where $\delta$ is the Euclidean
metric, which in global Cartesian coordinate systems of $\mathbb{R}^n$ just reduces to the usual
scalar product.
3.2.1 Submanifolds of a Riemannian manifold
Lemma 3.4. Let $\Sigma$ be a submanifold of a Riemannian manifold $(M, g)$ and $j : \Sigma \to M$
the inclusion map. Then, the pull-back of $g$, $j^*(g)$, is a Riemannian metric on $\Sigma$.
Proof. It follows from the definition that $j^*(g)$ is a $(0, 2)$ (symmetric) tensor. Moreover, given vectors $X$ and $Y$ in $T_p\Sigma$, $p \in \Sigma$, we have by definition
\[ j^*(g)_p(X, Y) = g_p(dj_p(X), dj_p(Y)). \]
Since $\Sigma$ is a submanifold, the differential $dj_p$ is injective at each $p$, thus $dj_p(X) = 0$
if and only if $X = 0$. It then follows easily that $j^*(g)_p$ is positive definite.
Example: Consider the sphere $S^2 = \{x^2 + y^2 + z^2 = 1\}$ in $\mathbb{R}^3$.
$S^2$ endowed with the induced topology is a submanifold of $\mathbb{R}^3$.
We consider on $S^2$ a local chart $\xi$ given by the coordinates $(\theta, \phi)$ defined on
$(0, \pi) \times (0, 2\pi)$ by the inverse of the map
\[ (\theta, \phi) \mapsto (\cos\theta, \sin\theta\cos\phi, \sin\theta\sin\phi) \in S^2. \]
We consider on $\mathbb{R}^3$ spherical coordinates $\chi = (r, \theta, \phi)$ defined on $(0, +\infty) \times (0, \pi) \times (0, 2\pi)$ by the inverse of the map (here by a small abuse of language, we are using
again the labels $\phi$ and $\theta$)
\[ (r, \theta, \phi) \mapsto (r\cos\theta, r\sin\theta\cos\phi, r\sin\theta\sin\phi). \]
A computation shows that in the $(r, \theta, \phi)$ coordinate system
\[ \delta = dr \otimes dr + r^2 \left( \sin^2(\theta)\, d\phi \otimes d\phi + d\theta \otimes d\theta \right). \]
The map $\chi \circ j \circ \xi^{-1}$ is then given by
\[ \chi \circ j \circ \xi^{-1} : (\theta, \phi) \mapsto (1, \theta, \phi) \]
and the induced metric on $S^2$ is given by
\[ \sigma_{S^2} = \sin^2(\theta)\, d\phi \otimes d\phi + d\theta \otimes d\theta. \]
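The induced metric can be checked numerically straight from the definition of the pull-back. The sketch below differentiates the parametrization of $S^2$ by central finite differences and takes Euclidean scalar products of the resulting tangent vectors; the function names `embedding` and `pullback_metric`, the step size and the sample point are assumptions made for this illustration.

```python
import math


def embedding(theta, phi):
    """The map (theta, phi) -> (cos theta, sin theta cos phi, sin theta sin phi)."""
    return (
        math.cos(theta),
        math.sin(theta) * math.cos(phi),
        math.sin(theta) * math.sin(phi),
    )


def pullback_metric(theta, phi, eps=1e-6):
    """Components of j^*(delta) in the (theta, phi) chart.

    The two columns of the differential are approximated by central
    differences, then paired with the Euclidean scalar product of R^3.
    """
    def partial(index):
        shift = [0.0, 0.0]
        shift[index] = eps
        plus = embedding(theta + shift[0], phi + shift[1])
        minus = embedding(theta - shift[0], phi - shift[1])
        return [(a - b) / (2 * eps) for a, b in zip(plus, minus)]

    d = [partial(0), partial(1)]
    return [
        [sum(u * v for u, v in zip(d[i], d[j])) for j in range(2)]
        for i in range(2)
    ]


theta0, phi0 = 1.0, 2.0
metric = pullback_metric(theta0, phi0)
```

Up to discretization error, `metric` should agree with the matrix of $d\theta \otimes d\theta + \sin^2(\theta)\, d\phi \otimes d\phi$ at the chosen point.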
3.2.2 Minkowski space
Minkowski space is the Lorentzian manifold $(\mathbb{R}^{n+1}, \eta)$, with $\eta$ the Lorentzian metric
such that if $(x^\alpha) = (x^0, x^i)$ are Cartesian coordinates on $\mathbb{R}^{n+1}$, then the $\partial_{x^\alpha}$ are orthonormal with³ $\eta_{00} = -1$. Minkowski space plays the role of the Euclidean space
in Lorentzian geometry, in the sense that it is the trivial example of a Lorentzian
manifold.
Contrary to the Riemannian case, if $N$ is a submanifold of a Lorentzian manifold
$(M, g)$ (such as the Minkowski space) with inclusion map $j$, the pull-back tensor
$j^*(g)$ does not need to define a Lorentzian metric; in fact, it does not need to define
a metric at all.
Example: In $(\mathbb{R}^{3+1}, \eta)$ with Cartesian coordinates $(t, x, y, z)$, consider the submanifolds given by $t = 0$, $x = 0$ and $t = x$ and compute the pull back of the metric in
each of these cases.
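A possible worked answer, given as a sketch: we assume $n = 3$, write $\eta = -dt \otimes dt + dx \otimes dx + dy \otimes dy + dz \otimes dz$, and use the remaining Cartesian coordinates as a chart on each hypersurface (a choice made here for illustration).

```latex
% t = 0, chart (x, y, z): j^*(dt) = 0, the pull-back is Riemannian
j^*\eta = dx \otimes dx + dy \otimes dy + dz \otimes dz,
% x = 0, chart (t, y, z): j^*(dx) = 0, the pull-back is Lorentzian
j^*\eta = -dt \otimes dt + dy \otimes dy + dz \otimes dz,
% t = x, chart (x, y, z): j^*(dt) = dx, so the dt and dx terms cancel
j^*\eta = -dx \otimes dx + dx \otimes dx + dy \otimes dy + dz \otimes dz
        = dy \otimes dy + dz \otimes dz.
```

In the last case the pull-back vanishes on the tangent vector $\partial_t + \partial_x$, so it is degenerate and does not define a metric.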
3.2.3 Products
Let $M$ and $N$ be semi-Riemannian manifolds with metrics $g_M$ and $g_N$ respectively.
Consider the product manifold $M \times N$ and let $\pi : M \times N \to M$ and $\sigma : M \times N \to N$ be the projections on the first and second components. Then,
\[ g := \pi^*(g_M) + \sigma^*(g_N) \]
is a metric on $M \times N$.
For instance, if $(N, g_N)$ is a Riemannian manifold, then we can construct a Lorentzian
manifold on the product $\mathbb{R}_t \times N$ by considering the metric
\[ g = \pi^*(-dt \otimes dt) + \sigma^*(g_N). \]
(The index $t$ on $\mathbb{R}_t$ simply expresses that $t$ is a global coordinate on $\mathbb{R}$.) Often, we’ll
drop the pull back of the projections in the above formula and simply write
\[ g = -dt \otimes dt + g_N. \]
3.2.4 Uniformly Lorentzian metrics on Rn+1
In this section, we will consider Cartesian coordinates $(x^0, x^i)$ on $\mathbb{R}^{n+1}$. Latin indices
$i, j$ will always refer to the spatial directions, so that for instance, if $T$ is a tensor,
$T_{ij}$ denotes all the components of $T$ associated to the $\partial_{x^i}$, $1 \le i \le n$. Greek indices
will be used to denote all the components of tensors (including the ones associated to $\partial_{x^0}$).
We start with some linear algebra.
Lemma 3.5. Let $g \in M_{n+1}(\mathbb{R})$ be a symmetric matrix such that $g_{00} < 0$ and $h = (g_{ij})$, the submatrix obtained by removing the first row and the first column, is positive definite. Then $g$ has signature $(-, +, \dots, +)$.
Remark 3.1. The point of the lemma is that, in this case, we do not need to know the values of $g$ on the whole first row and the whole first column to determine the signature of $g$, only the sign of $g_{00}$.
3 Here we are considering units (i.e. a specific choice of inertial frame) such that the speed of light in vacuum is 1. In general, we have $g_{00} = -c^2$ in an inertial frame.
Proof. Let $O$ be an orthogonal matrix diagonalizing $h$ and let $k = M_O^t\, g\, M_O$, where
$$M_O = \begin{pmatrix} 1 & 0 \\ 0 & O \end{pmatrix}.$$
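Consider the submatrix $l = (k_{ij})$ obtained similarly by removing the first row and the first column of $k$. Then, $l$ is diagonal and
$$l = O^t h O = \mathrm{diag}(\lambda_1, \dots, \lambda_n),$$
with $\lambda_i > 0$ for all $i$. Furthermore $k_{00} = g_{00} < 0$ and, since $M_O$ is orthogonal, the eigenvalues of $g$ and $k$ coincide. We compute the determinant of $k - \lambda I$ to get
$$p(\lambda) = \det(k - \lambda I) = \left( k_{00} - \lambda - \frac{k_{01}^2}{\lambda_1 - \lambda} - \dots - \frac{k_{0n}^2}{\lambda_n - \lambda} \right) (\lambda_1 - \lambda) \cdots (\lambda_n - \lambda).$$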
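Let us define
$$f(\lambda) = k_{00} - \lambda - \frac{k_{01}^2}{\lambda_1 - \lambda} - \dots - \frac{k_{0n}^2}{\lambda_n - \lambda}$$
and its derivative
$$f'(\lambda) = -1 - \frac{k_{01}^2}{(\lambda_1 - \lambda)^2} - \dots - \frac{k_{0n}^2}{(\lambda_n - \lambda)^2}.$$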
Let $\lambda_m$ be the smallest of the $\lambda_i$. On $(-\infty, \lambda_m)$, $f$ is smooth and $f' < 0$. Moreover, $f(\lambda) \to +\infty$ as $\lambda \to -\infty$ and $f(0) < 0$. Thus, there is a unique $\lambda_0 \in (-\infty, 0)$ for which $f(\lambda_0) = 0$, and $\lambda_0$ has to be an eigenvalue of $g$. Moreover, $p'(\lambda_0) \neq 0$, i.e. it is a root with multiplicity one. Since $f' < 0$ on $(-\infty, \lambda_m)$, there can be no other eigenvalue in this interval. It follows that the remaining eigenvalues must be strictly positive.
Alternatively, from the assumptions, $g$ restricted to $\mathbb{R}(1, 0, \dots, 0)$ is negative definite while $g$ restricted to the span of the $e_i$, $1 \le i \le n$, is positive definite. So it follows from basic linear algebra that $g$ has one negative eigenvalue and $n$ positive ones.
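To see the lemma at work in the smallest case, here is a sketch for $n = 1$; the letters $a$, $b$, $\sigma$ are names chosen here for the entries of the matrix and are not part of the notes.

```latex
\[
g = \begin{pmatrix} -a & \sigma \\ \sigma & b \end{pmatrix}, \quad a, b > 0,
\qquad \det g = -ab - \sigma^2 < 0,
\qquad \lambda_\pm = \frac{(b - a) \pm \sqrt{(a + b)^2 + 4\sigma^2}}{2}.
\]
```

Since $\sqrt{(a + b)^2 + 4\sigma^2} \ge a + b > |b - a|$, one gets $\lambda_- < 0 < \lambda_+$ whatever the value of the off-diagonal entry $\sigma$, in agreement with the lemma.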
Definition 3.5. Let $g$ be a smooth symmetric $(0, 2)$ tensor field on $\mathbb{R}^{n+1}$. We say that $g$ is uniformly Lorentzian if there exist a global Cartesian coordinate system and constants $a, b > 0$ such that $g_{00} < -a$, $(g_{ij}) > b$ (in the sense of symmetric matrices) and the components of $g$ are uniformly bounded.
Of course, it follows from the previous lemma that such a uniformly Lorentzian metric is in particular Lorentzian.
Given a uniformly Lorentzian metric, it will be useful (especially when we want
to solve wave equations) to have estimates controlling the inverse metric as well.
Let $g$ be a matrix as in Lemma 3.5. Let $h$ be the submatrix of $g$ given in components by $g_{ij}$, where we have removed the first row and the first column. From Lemma 3.5, we know in particular that $g$ is invertible. Let $g^{\mu\nu}$ be the components of the inverse matrix of $g$ and let $H$ denote the submatrix given in components by $g^{ij}$, obtained by removing the first row and the first column of $g^{-1}$ (note that in general $H \neq h^{-1}$).
Finally, we will denote the vector of components $g_{0i}$ by $s(g)$ ($s$ is sometimes called the shift vector).
Lemma 3.6. Let $g$ be a matrix as in Lemma 3.5. Then,
$$g^{00} = \frac{1}{g_{00} - d^2},$$
where $d^2 = h^{-1}(s(g), s(g))$, $H$ is positive definite, with
$$\frac{g_{00}}{g_{00} - d^2}\, h^{-1} \le H \le h^{-1},$$
and
$$s(g^{-1}) = \frac{1}{d^2 - g_{00}}\, h^{-1} s(g).$$
Proof. Let $A$ be the square root of $h^{-1}$, i.e. the positive definite symmetric matrix such that $A^2 = h^{-1}$. Then $A^t h A = \mathrm{Id}$. Let $k = M_A^t\, g\, M_A$, where
$$M_A = \begin{pmatrix} 1 & 0 \\ 0 & A \end{pmatrix}.$$
Then, $k_{00} = g_{00}$, the submatrix $(k_{ij})$ obtained by erasing the first row and the first column is simply the identity matrix and $s(k)$, the new shift vector, is given by $A^t s(g)$. Let $B$ be an orthogonal matrix such that $B^t A^t s(g) = |A^t s(g)|\, e_1$, where $|\cdot|$ denotes the Euclidean norm and $e_1 = (1, 0, \dots, 0)$ is the first vector of the canonical basis of $\mathbb{R}^n$. Note that
$$|A^t s(g)| = \left[ (A^t s(g))^t A^t s(g) \right]^{1/2} = \left[ s(g)^t A A^t s(g) \right]^{1/2} = \left[ s(g)^t A^2 s(g) \right]^{1/2} = \left[ h^{-1}(s(g), s(g)) \right]^{1/2} = d.$$
Consider $\rho = M_B^t\, k\, M_B$, where $M_B$ is defined similarly to $M_A$. Then, $\rho_{00} = g_{00}$, the submatrix $(\rho_{ij})$ obtained by removing the first column and the first row is simply the identity and $s(\rho) = d\, e_1$. Note that the inverse of the $2 \times 2$ submatrix with components $\rho_{\mu\nu}$, $0 \le \mu, \nu \le 1$, is given by
$$\frac{1}{g_{00} - d^2} \begin{pmatrix} 1 & -d \\ -d & g_{00} \end{pmatrix}.$$
Since $g^{-1} = M_A M_B\, \rho^{-1} M_B^t M_A^t$ and the matrices $M_A$ and $M_B$ preserve the $00$ component of a matrix, we obtain the first claim.
Moreover, let $H_\rho = (\rho^{ij})$ be the submatrix obtained by removing the first column and the first row of $\rho^{-1}$. Then,
$$H = A B H_\rho B^t A^t.$$
We have, for any $w \neq 0$, denoting by $(\,\cdot\,,\,\cdot\,)$ the usual scalar product on $\mathbb{R}^n$,
$$\frac{H(w, w)}{h^{-1}(w, w)} = \frac{(A B H_\rho B^t A^t w, w)}{(A w, A w)} = \frac{(H_\rho B^t A^t w, B^t A^t w)}{(B^t A^t w, B^t A^t w)},$$
using that $B$ is orthogonal and $A$ symmetric. The second claim then follows since $H_\rho$ is a diagonal matrix with all diagonal elements equal to $1$ except for the first, which is given by $\frac{g_{00}}{g_{00} - d^2} \le 1$. We leave the last claim as an exercise.
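As a sanity check, the three identities of Lemma 3.6 can be verified by hand in the simplest case $n = 1$. The following is only an illustrative sketch: the symbols $a$, $b$ and $\sigma$ are names introduced here for the entries of a $2 \times 2$ matrix and do not appear elsewhere in the notes.

```latex
\[
g = \begin{pmatrix} -a & \sigma \\ \sigma & b \end{pmatrix}, \quad a, b > 0,
\qquad h = (b), \quad s(g) = (\sigma), \quad d^2 = \frac{\sigma^2}{b},
\qquad
g^{-1} = \frac{1}{-ab - \sigma^2} \begin{pmatrix} b & -\sigma \\ -\sigma & -a \end{pmatrix}.
\]
\begin{align*}
g^{00} &= \frac{-b}{ab + \sigma^2} = \frac{1}{-a - \sigma^2/b} = \frac{1}{g_{00} - d^2}, \\
H &= \frac{a}{ab + \sigma^2} = \frac{g_{00}}{g_{00} - d^2}\, h^{-1} \le h^{-1}, \\
s(g^{-1}) &= \frac{\sigma}{ab + \sigma^2} = \frac{1}{d^2 - g_{00}}\, h^{-1} s(g).
\end{align*}
```

For $n = 1$ the lower bound on $H$ is an equality, since the only spatial direction is the direction of the shift.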
3.2.5 A first look at the Schwarzschild spacetime
The Schwarzschild spacetime is one of the most important examples of Lorentzian manifolds (it was discovered in 1916 by Schwarzschild as a solution to the Einstein equations). It is not just one Lorentzian manifold, but a one-parameter family of Lorentzian manifolds, indexed by a positive number $m > 0$ (referred to as the mass). Let thus $m > 0$ and consider the product manifold
$$M_{ext} = \mathbb{R}_t \times (2m, +\infty) \times S^2.$$
The projection on the second variable will be denoted by $r$; it is just a smooth function on our manifold taking values in $(2m, +\infty)$. On $M_{ext}$, we consider the following Lorentzian metric
$$g = -\left(1 - \frac{2m}{r}\right) dt \otimes dt + \left(1 - \frac{2m}{r}\right)^{-1} dr \otimes dr + r^2 \sigma_{S^2},$$
where $\sigma_{S^2}$ denotes the pull-back on $M_{ext}$ of the usual round metric on $S^2$. On $M_{ext}$, we can consider local coordinates of the form $(t, r, \theta, \phi)$.
Note that the above expression for the metric $g$ becomes singular as $r \to 2m$.
Exercise:
1. Prove that there exists a change of coordinates of the form $(t, r, \theta, \phi) \to (t + f(r), r, \theta, \phi)$ such that the metric components in the new coordinate system are regular (i.e. admit continuous (in fact analytic) extensions) up to $r = 2m$. (Hint: consider a function $f(r)$ of the form $f(r) = 2m \ln(r - 2m)$.)
2. Deduce from the above that $M_{ext}$ can be isometrically embedded as an open submanifold of a (strictly larger) manifold $M_{Schw}$.
We will return to Schwarzschild later in the course (or in the working groups).
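A sketch of the computation behind the first item of the exercise, under the assumption that one takes exactly the hinted function $f(r) = 2m \ln(r - 2m)$ and writes $t^* := t + f(r)$ (the name $t^*$ is introduced here for illustration):

```latex
\begin{align*}
f'(r) &= \frac{2m}{r - 2m}, \qquad dt = dt^* - f'(r)\, dr, \qquad \left(1 - \frac{2m}{r}\right) f'(r) = \frac{2m}{r}, \\
g &= -\left(1 - \frac{2m}{r}\right) dt^* \otimes dt^*
   + \frac{2m}{r} \left( dt^* \otimes dr + dr \otimes dt^* \right)
   + \left[ \left(1 - \frac{2m}{r}\right)^{-1} - \left(1 - \frac{2m}{r}\right) f'(r)^2 \right] dr \otimes dr
   + r^2 \sigma_{S^2}, \\
\left(1 - \frac{2m}{r}\right)^{-1} - \left(1 - \frac{2m}{r}\right) f'(r)^2
  &= \frac{r^2 - 4m^2}{r (r - 2m)} = 1 + \frac{2m}{r}.
\end{align*}
```

All the coefficients are then analytic functions of $r$ on $(0, +\infty)$, and the determinant of the $(t^*, r)$ block equals $-(1 - 2m/r)(1 + 2m/r) - (2m/r)^2 = -1$, so the expression still defines a Lorentzian metric across $r = 2m$.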
3.3 The Levi-Civita connection
Given a semi-Riemannian manifold, we are looking for an operation replacing the
usual notion of derivative for tensorial objects. We start by axiomatizing the key
properties that we would like.
Definition 3.6. A connection $D$ on a smooth manifold $M$ is a map of the form
$$D : \Gamma(M) \times \Gamma(M) \to \Gamma(M), \qquad (V, W) \mapsto D_V W,$$
such that $D_V W$ is $\mathcal{F}(M)$-linear in $V$, $\mathbb{R}$-linear in $W$ and the Leibniz rule
$$D_V(fW) = V(f)\, W + f\, D_V W$$
holds for all $f \in \mathcal{F}(M)$.
$D_V W$ is called the covariant derivative of $W$ with respect to $V$ (for the connection $D$).
Let $D$ be a connection on $M$. Let $p \in M$ and consider $v \in T_p M$. Let $V$ be any vector field such that $V_p = v$. Then, for any $W \in \Gamma(M)$, $(D_V W)(p)$ is independent of the choice of $V$ (it depends only on $v$). Thus, for each $p$, we can define a map (still denoted $D$)
$$D : T_p M \times \Gamma(M) \to T_p M, \qquad (v, W) \mapsto D_v W.$$
Given a semi-Riemannian manifold, there can be plenty of connections, but there is
a unique one that satisfies the two extra conditions written below.
Theorem 3.1. Let $(M, g)$ be a semi-Riemannian manifold. Then, there exists a unique connection $D$ such that
1. $D$ is torsion free: $[V, W] = D_V W - D_W V$,
2. $X(g(V, W)) = g(D_X V, W) + g(V, D_X W)$.
$D$ is called the Levi-Civita connection of $(M, g)$ and is characterized by the Koszul formula
$$2 g(D_V W, X) = V g(W, X) + W g(X, V) - X g(V, W) - g(V, [W, X]) + g(W, [X, V]) + g(X, [V, W]).$$
Remark 3.2. The first condition can be interpreted as a twist or torsion free condition. Without it, geodesics (which we are going to define soon) would not completely determine the connection. The second condition means essentially that $Dg = 0$, i.e. the metric is constant with respect to this notion of derivative.
Proof. Let $D$ be a connection satisfying the above two conditions. Inserting the conditions in the right-hand side of the Koszul formula, it follows that the formula must hold. Then the non-degeneracy of the metric implies that $D_V W$ is uniquely determined. Conversely, let $F(V, W, X)$ be given by the right-hand side of the Koszul formula. Let $V, W \in \Gamma(M)$ and check that $X \to F(V, W, X)$ is $\mathcal{F}(M)$-linear. Thus, the formula defines a one-form and we denote its image by $\sharp$ as $2 D_V W$ (i.e. $2 g(D_V W, X) = F(V, W, X)$). Then the Koszul formula holds and from it we can check that all the conditions hold. For instance,
$$g(X, [V, W]) = 2 g(D_V W, X) - V g(W, X) - W g(X, V) + X g(V, W) + g(V, [W, X]) + g(W, [V, X])$$
and similarly (exchanging $V$ and $W$)
$$g(X, [V, W]) = -2 g(D_W V, X) + V g(W, X) + W g(X, V) - X g(V, W) - g(V, [W, X]) - g(W, [V, X]).$$
Adding the equations gives the torsion free property.
Let us also check that with this definition, $D$ is indeed $\mathcal{F}(M)$-linear in the first variable. Let $f \in \mathcal{F}(M)$. Then,
$$\begin{aligned}
2 g(D_{fV} W, X) &= fV g(W, X) + W g(X, fV) - X g(fV, W) - g(fV, [W, X]) + g(W, [X, fV]) + g(X, [fV, W]) \\
&= fV g(W, X) + W(f g(X, V)) - X(f g(V, W)) - f g(V, [W, X]) \\
&\quad + g\big(W, X(f) V + f [X, V]\big) + g\big(X, f [V, W] - W(f) V\big) \\
&= f \big( V g(W, X) + W g(X, V) - X g(V, W) - g(V, [W, X]) + g(W, [X, V]) + g(X, [V, W]) \big) \\
&= 2 f g(D_V W, X).
\end{aligned}$$
Exercise: check the other conditions and finish the proof.
From the Koszul formula, it follows that, for any $p$, $D_V W(p)$ only depends on the values of $W$ in a neighborhood of $p$ (since it only depends on $W(p)$ and its derivative at $p$). Thus, given any open set $U \subset M$, we can consider $D$ defined on $\Gamma(U) \times \Gamma(U)$.
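Definition 3.7. Let $(x^\alpha)$ be a coordinate system. The Christoffel symbols are the real valued functions $\Gamma^\alpha_{\beta\gamma}$ such that
$$D_{\partial_{x^\beta}} \partial_{x^\gamma} = \Gamma^\alpha_{\beta\gamma}\, \partial_{x^\alpha}.$$
Lemma 3.7. We have $\Gamma^\alpha_{\beta\gamma} = \frac{1}{2} g^{\alpha\rho} \left( \partial_{x^\beta} g_{\gamma\rho} + \partial_{x^\gamma} g_{\beta\rho} - \partial_{x^\rho} g_{\beta\gamma} \right)$.
In particular, $\Gamma^\alpha_{\beta\gamma} = \Gamma^\alpha_{\gamma\beta}$.
Proof. We have,
$$g(D_{\partial_{x^\beta}} \partial_{x^\gamma}, \partial_{x^\rho}) = \Gamma^\alpha_{\beta\gamma}\, g_{\rho\alpha}.$$
On the other hand, using the Koszul formula, all the commutators vanish and
$$2 g(D_{\partial_{x^\beta}} \partial_{x^\gamma}, \partial_{x^\rho}) = \partial_{x^\beta} g_{\gamma\rho} + \partial_{x^\gamma} g_{\beta\rho} - \partial_{x^\rho} g_{\beta\gamma}.$$
Conclude by inverting $g_{\rho\alpha}$ in the first equation.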
Exercise: Let $(S^2, \sigma)$ be the usual 2-sphere endowed with the round metric. Compute its Christoffel symbols in the $(\theta, \phi)$ coordinate system.
Of course, for the Euclidean or the Minkowski space in Cartesian coordinates, the covariant derivatives in the direction of $\partial_{x^\alpha}$ just reduce to partial derivatives and the Christoffel symbols vanish.
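As an illustration of Lemma 3.7 in a case where the Christoffel symbols do not vanish, here is a sketch of the computation for the round sphere, assuming the standard expression $\sigma = d\theta \otimes d\theta + \sin^2\theta\, d\phi \otimes d\phi$ of the round metric in the $(\theta, \phi)$ coordinates (this expression is an assumption, as it is not written out in the text above):

```latex
\begin{align*}
\sigma_{\theta\theta} &= 1, \quad \sigma_{\phi\phi} = \sin^2\theta, \quad \sigma_{\theta\phi} = 0,
\qquad \sigma^{\theta\theta} = 1, \quad \sigma^{\phi\phi} = \frac{1}{\sin^2\theta}, \\
\Gamma^\theta_{\phi\phi} &= \tfrac{1}{2} \sigma^{\theta\theta} \left( - \partial_\theta \sigma_{\phi\phi} \right) = -\sin\theta \cos\theta, \\
\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} &= \tfrac{1}{2} \sigma^{\phi\phi}\, \partial_\theta \sigma_{\phi\phi} = \frac{\cos\theta}{\sin\theta}.
\end{align*}
```

All the other Christoffel symbols vanish, since $\sigma_{\phi\phi}$ is the only non-constant component and it depends on $\theta$ only.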
We now seek to extend our connection to the differentiation of arbitrary tensors. First a definition.
Definition 3.8. A tensor derivation $D$ on $M$ is a set of maps $D (= D^r_s) : T^r_s(M) \to T^r_s(M)$, where $T^r_s(M)$ is the set of tensor fields of type $(r, s)$, such that,
1. for any tensor fields $A$ and $B$, $D(A \otimes B) = DA \otimes B + A \otimes DB$,
2. $D(CA) = C(DA)$ for any contraction $C$.
We easily have
Proposition 3.1. Given a vector field $V$ and an $\mathbb{R}$-linear map $\delta : \Gamma(M) \to \Gamma(M)$ such that
$$\delta(fX) = V(f) X + f \delta(X),$$
there exists a unique tensor derivation $D$ on $M$ such that $D^0_0 = V$ and $D^1_0 = \delta$.
Thus, the Levi-Civita connection extends naturally to arbitrary tensor fields.
Proof. $D^0_0$ (derivation of functions) and $D^1_0$ (derivation of vector fields) are given. Assume that such a general $D$ exists. Let $\theta$ be a one-form and $X$ a vector field. Then, $\theta \otimes X$ is a $(1, 1)$ tensor. We must have
$$D(\theta \otimes X) = (D\theta) \otimes X + \theta \otimes DX.$$
Moreover by contraction, we need
$$D(C(\theta \otimes X)) = C(D(\theta \otimes X)).$$
Since $C(\theta \otimes X) = \theta(X)$, this gives
$$D(\theta(X)) = (D\theta)(X) + \theta(DX)$$
and thus
$$(D\theta)(X) = V(\theta(X)) - \theta(\delta X),$$
which determines $D\theta$ uniquely. Moreover, one easily checks that $D\theta$ is $\mathcal{F}(M)$-linear and thus defines a one-form.
For an arbitrary tensor $A$ of type $(r, s)$, we have similarly,
$$(DA)(\theta^1, \dots, \theta^r, X_1, \dots, X_s) = V\big(A(\theta^1, \dots, \theta^r, X_1, \dots, X_s)\big) - \sum_{i=1}^r A(\theta^1, \dots, D\theta^i, \dots, \theta^r, X_1, \dots, X_s) - \sum_{j=1}^s A(\theta^1, \dots, \theta^r, X_1, \dots, \delta X_j, \dots, X_s).$$
Thus $D$ is uniquely determined. One then needs to check that the above formula indeed defines a tensor derivation. The fact that $D$ distributes over tensor products follows easily from the above formula. For the contraction property, let us prove that $D^1_1$ commutes with $C$. We have trivially that $D$ commutes with $C$ for tensor products $\theta \otimes X$. The result then follows since any $(1, 1)$ tensor can be written as a sum of such terms.
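To make the formulas of the proof concrete, here is a sketch of what they give in a coordinate system $(x^\alpha)$ when $D$ is the Levi-Civita connection, $V = \partial_{x^\alpha}$ and $\delta = D_{\partial_{x^\alpha}}$. The component notation $\theta_\beta := \theta(\partial_{x^\beta})$ and $D_\alpha := D_{\partial_{x^\alpha}}$ is assumed here for the illustration.

```latex
\begin{align*}
(D_\alpha \theta)_\beta
  &= \partial_{x^\alpha}\big(\theta(\partial_{x^\beta})\big) - \theta\big(D_{\partial_{x^\alpha}} \partial_{x^\beta}\big)
   = \partial_{x^\alpha} \theta_\beta - \Gamma^\gamma_{\alpha\beta}\, \theta_\gamma, \\
(D_\alpha g)_{\beta\gamma}
  &= \partial_{x^\alpha} g_{\beta\gamma} - \Gamma^\rho_{\alpha\beta}\, g_{\rho\gamma} - \Gamma^\rho_{\alpha\gamma}\, g_{\beta\rho} = 0.
\end{align*}
```

The last equality is the second condition of Theorem 3.1 applied to $X = \partial_{x^\alpha}$, $V = \partial_{x^\beta}$ and $W = \partial_{x^\gamma}$.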
Exercise:
1. Recall that given two vector fields $X, Y \in \Gamma(M)$, $[X, Y]$ is a vector field. Show that it defines a tensor derivation called the Lie derivative (notation $L_X Y = [X, Y]$).
2. Let $(x^\alpha)$ be a coordinate system. Compute $L_{\partial_\alpha} g_{\beta\gamma} := (L_{\partial_\alpha} g)_{\beta\gamma}$.
3. Deduce that $L$ can never coincide with the tensor derivation associated with the Levi-Civita connection of a metric. (Of course, this also follows from the fact that $D_V W$ is $\mathcal{F}(M)$-linear in $V$.)
4. Let $X$ and $Y$ be two vector fields and let $\phi_t$ denote a local flow of $X$ near $p \in M$. Prove that
$$\lim_{t \to 0} \frac{1}{t} \left( d\phi_{-t}(Y_{\phi_t(p)}) - Y_p \right) = [X, Y]_p. \qquad (3)$$
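For the second item, a sketch of the computation, using the general formula for a tensor derivation from the proof of Proposition 3.1 with $V = X$ and $\delta = [X, \cdot\,]$ (and taking the first item of the exercise for granted):

```latex
\begin{align*}
(L_X g)(Y, Z) &= X\big(g(Y, Z)\big) - g([X, Y], Z) - g(Y, [X, Z]), \\
(L_{\partial_\alpha} g)_{\beta\gamma}
  &= \partial_\alpha \big( g(\partial_\beta, \partial_\gamma) \big)
     - g([\partial_\alpha, \partial_\beta], \partial_\gamma) - g(\partial_\beta, [\partial_\alpha, \partial_\gamma])
   = \partial_\alpha g_{\beta\gamma},
\end{align*}
```

since coordinate vector fields commute. In particular $L_{\partial_\alpha} g$ is in general non-zero, whereas $D_{\partial_\alpha} g = 0$ for the Levi-Civita connection.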
Definition 3.9. Let $T$ be any tensor field of type $(r, s)$. Then we define $DT$, the covariant differential of $T$, as the $(r, s + 1)$ tensor field
$$(\theta^1, \dots, \theta^r, X_1, \dots, X_s, V) \to (D_V T)(\theta^1, \dots, \theta^r, X_1, \dots, X_s).$$
Notations. Let $(x^\alpha)$ be a coordinate system and $T$ a tensor field of type $(r, s)$. Then the covariant derivative of $T$ in the direction of $\partial_{x^\alpha}$ is often denoted $D_\alpha$ instead of $D_{\partial_{x^\alpha}}$. Moreover, the components of the covariant derivative $D_\alpha T$ are often written as
$$T^{i_1, \dots, i_r}_{j_1, \dots, j_s ; \alpha}.$$
The ";" should be compared with the usual "," often used, for instance in PDE notation, to denote partial differentiation.
3.4 Isometries
Definition 3.10. Let $(M, g)$ and $(N, h)$ be semi-Riemannian manifolds. An isometry from $M$ to $N$ is a diffeomorphism $\phi : M \to N$ such that
$$\phi^*(h) = g.$$
This means that $h(d\phi(v), d\phi(w)) = g(v, w)$, for all $v, w \in T_p M$, $p \in M$.
To any complete vector field (see footnote 4), we can associate a flow $\phi_s$, which for any $s$ gives a diffeomorphism of the manifold onto itself. Thus a natural question is: which vector fields $X$ have a flow consisting of isometries?
Proposition 3.2. Let $X \in \Gamma(M)$ and $A \in T^0_s(M)$. Consider a domain $\mathcal{D}$ on which the flow of $X$, $\psi_s$, is well defined for $s$ sufficiently small. Then, in $\mathcal{D}$,
$$L_X A = \lim_{t \to 0} \frac{1}{t} \left( \psi_t^*(A) - A \right),$$
where $L_X A$ denotes the Lie derivative of $A$ with respect to the vector field $X$.
We do the proof in the case of a $(0, 2)$ tensor field $A$. Since $L_X$ is a tensor derivation, for vector fields $V$ and $W$,
$$(L_X A)(V, W) = X(A(V, W)) - A([X, V], W) - A(V, [X, W]).$$
Let $p \in \mathcal{D}$. Then,
$$\begin{aligned}
(\psi_t^* A - A)(V_p, W_p) &= A_{\psi_t(p)}(d\psi_t(V_p), d\psi_t(W_p)) - A_p(V_p, W_p) \\
&= A_{\psi_t(p)}(d\psi_t(V_p), d\psi_t(W_p)) - A_{\psi_t(p)}(V_{\psi_t(p)}, W_{\psi_t(p)}) \\
&\quad + A_{\psi_t(p)}(V_{\psi_t(p)}, W_{\psi_t(p)}) - A_p(V_p, W_p) = I + II.
\end{aligned}$$
Let $\alpha$ be the integral curve of $X$ starting at $p$, i.e. $\alpha(t) = \psi_t(p)$. Then, by definition,
$$\lim_{t \to 0} \frac{1}{t} II = \frac{d}{dt} \big( A_\alpha(V_\alpha, W_\alpha) \big)\Big|_{t=0} = X_p(A(V, W)).$$
For $I$, we write first
$$I = A\big( d\psi_t(V_p) - V_{\psi_t(p)}, d\psi_t(W_p) \big) + A\big( V_{\psi_t(p)}, d\psi_t(W_p) - W_{\psi_t(p)} \big).$$
Since $\psi_t \circ \psi_{-t} = \mathrm{Id}$,
$$d\psi_t(V_p) - V_{\psi_t(p)} = d\psi_t\big( V_p - d\psi_{-t} V_{\psi_t(p)} \big),$$
and similarly for the second term. The result then follows from (3) and the fact that $d\psi_t \to \mathrm{Id}$ as $t \to 0$.
Definition 3.11. A Killing vector field on a semi-Riemannian manifold is a vector field $X$ for which the Lie derivative of the metric tensor vanishes: $L_X g = 0$.
4 For a non-complete vector field, we can also define a flow, but the global analysis has to be replaced by a local one.
Proposition 3.3. Let $X$ be a vector field and denote its local flow by $\psi_t$. $X$ is Killing if and only if all its (local) flows are isometries.
Proof. Let $p \in M$ and $U$ a neighborhood of $p$ on which $\psi_t$ is well defined for all $t$ sufficiently small. Then, if all $\psi_t$ are isometries, $\psi_t^*(g) = g$ and thus, by the above, $L_X(g) = 0$. Conversely, let $L_X g = 0$ and let $\psi_t$ be a local flow of $X$. By standard properties of $\psi_t$, we have $\psi_s \psi_t = \psi_{s+t}$ for $s, t$ small enough. The previous proposition then implies that, for $v, w$ tangent vectors at some point in the domain of the flow,
$$\lim_{t \to 0} \frac{1}{t} \Big( g(d\psi_{s+t}(v), d\psi_{s+t}(w)) - g(d\psi_s(v), d\psi_s(w)) \Big) = 0.$$
Thus, the real-valued function $s \to g(d\psi_s(v), d\psi_s(w))$ has zero derivative and is thus constant, i.e.
$$g(d\psi_s(v), d\psi_s(w)) = g(v, w),$$
which implies that $\psi_s$ is an isometry.
Definition 3.12. For any vector field $X$, define its deformation tensor as
$${}^X\pi(V, W) = \frac{1}{2} \big( g(D_V X, W) + g(V, D_W X) \big).$$
From the definition, ${}^X\pi$ is $\mathcal{F}(M)$-linear in its two arguments. Thus, it defines a $(0, 2)$ tensor field. In coordinates
$${}^X\pi_{\alpha\beta} = \frac{1}{2} \big( D_\alpha X_\beta + D_\beta X_\alpha \big).$$
Lemma 3.8. $2\, {}^X\pi = L_X g$.
Proof. Use the tensor derivation property of the Lie derivative and of the connection, as well as the defining properties of the Levi-Civita connection.
Exercise:
1. Consider Minkowski space with Cartesian coordinates $(x^\alpha) = (t, x^i)$. Show that the translations $\partial_{x^\alpha}$, the rotations $x^i \partial_{x^j} - x^j \partial_{x^i}$, and the hyperbolic rotations $t \partial_{x^i} + x^i \partial_t$ are all Killing fields.
2. If $\phi : M \to N$ is an isometry, then $d\phi(D_X Y) = D_{d\phi(X)}(d\phi\, Y)$, for all $X, Y \in \Gamma(M)$. (Recall first what $d\phi(X)$ is for $X$ a smooth vector field and $\phi$ a diffeomorphism.)
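As a sketch of the first item for one hyperbolic rotation, say $X = t\, \partial_{x^1} + x^1 \partial_t$, one can use Lemma 3.8 together with the fact that in Cartesian coordinates on Minkowski space the Christoffel symbols vanish, so that $D_\alpha X_\beta = \partial_\alpha X_\beta$. The lowered components $X_\alpha := \eta_{\alpha\beta} X^\beta$ are notation assumed for this illustration.

```latex
\begin{align*}
X^0 &= x^1, \quad X^1 = t, \quad X^i = 0 \ (i \ge 2)
\quad \Longrightarrow \quad
X_0 = -x^1, \quad X_1 = t, \\
2\, {}^X\pi_{01} &= \partial_t X_1 + \partial_{x^1} X_0 = 1 - 1 = 0,
\qquad
2\, {}^X\pi_{00} = 2\, \partial_t X_0 = 0,
\qquad
2\, {}^X\pi_{11} = 2\, \partial_{x^1} X_1 = 0.
\end{align*}
```

All the remaining components vanish trivially, so that $L_X \eta = 2\, {}^X\pi = 0$.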
3.5 Parallel translation and geodesics
Definition 3.13. Let $N$ and $M$ be smooth manifolds. A vector field on a smooth map $\phi : N \to M$ is a map $Z : N \to TM$ such that $\pi \circ Z = \phi$, where $\pi$ is the canonical projection $\pi : TM \to M$.
(In terms of vector bundles, $Z$ is a section of the pull-back bundle $\phi^*(TM) \to N$.)
Consider now a curve $\alpha : I \to M$. A vector field on $\alpha$ is, according to the above definition, a map that assigns to each $s \in I$ a tangent vector at $\alpha(s)$. Let $\Gamma(\alpha)$ denote the set of all vector fields along $\alpha$.
A particular example is given by the tangent vector $\alpha'$ to $\alpha$ itself. Another one is obtained from any vector field $V \in \Gamma(M)$, as $V_\alpha : t \to V_{\alpha(t)}$.
Proposition 3.4. Let $\alpha : I \to M$ be a curve. Then, there is a unique function $Z \to Z' = \frac{dZ}{ds}$ from $\Gamma(\alpha)$ to $\Gamma(\alpha)$, called the induced covariant derivative, such that
1. $\frac{d}{ds}$ is $\mathbb{R}$-linear,
2. Leibniz rule: $(hZ)' = \frac{dh}{ds} Z + h Z'$, for any $h \in \mathcal{F}(I)$,
3. $(V_\alpha)'(t) = D_{\alpha'(t)}(V)$, $\forall V \in \Gamma(M)$.
Moreover, we then have, for any vector fields $Z_1$ and $Z_2$ on $\alpha$,
$$\frac{d}{dt} g(Z_1, Z_2) = g(Z_1', Z_2) + g(Z_1, Z_2').$$
Proof. Uniqueness: This follows from the first three properties only. Let us first assume that $\alpha$ lies in a single coordinate patch $(U, x^i)$.
For any $Z \in \Gamma(\alpha)$, we can write
$$Z(t) = Z^i(t)\, \partial_{x^i},$$
where $Z^i(t) = Z(t)(x^i)$.
By the first two properties,
$$\frac{dZ}{ds} = (Z^i)'\, \partial_{x^i} + Z^i \left[ (\partial_{x^i})|_\alpha \right]'.$$
Note that by the usual bump function argument, it follows that if $V \in \Gamma(U)$, then the third property still holds, i.e.
$$(V_\alpha)'(t) = D_{\alpha'(t)}(V), \quad \forall V \in \Gamma(U).$$
(Indeed, given $t_0 \in I$, consider a bump function $f$ around $\alpha(t_0)$; then $fV$ can be extended to a global vector field on $M$. Then use property 3, expand using property 2 and evaluate everything at $t_0$.)
Thus by the third property,
$$\left[ (\partial_{x^i})|_\alpha \right]'(t) = D_{\alpha'(t)}\, \partial_{x^i}.$$
In particular, the connection completely determines $\frac{dZ}{ds}$.
In the general case, let $J$ be a non-empty open subinterval on which $\alpha(J)$ lies in a single coordinate patch $(U, x^i)$. Let ${}^I\frac{d}{ds}$ be the operation associated to the original $\alpha$ and ${}^J\frac{d}{ds}$ the one associated to $\alpha|_J$. From the above, ${}^J\frac{d}{ds}$ is already uniquely determined. Let us show that given any $Z \in \Gamma(\alpha)$,
$$\left[ {}^I\frac{d}{ds} Z \right]_{|J} = {}^J\frac{d}{ds} \left[ Z_{|J} \right].$$
Let $t_0 \in J$, $f$ be a bump function around $\alpha(t_0)$ with support in $U$ and $h$ be a bump function (defined on $\mathbb{R}$) around $t_0$ with support in $J$. As before we can write
$$Z(t) = Z^i(t)\, \partial_{x^i}$$
for $t \in J$. We consider $h Z^i$ and $f \partial_{x^i}$ and extend them smoothly by $0$ outside of $J$ and $U$ respectively.
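Then, on one hand,
$$(Z^i h\, f \partial_{x^i})' = (h \cdot f \circ \alpha \cdot Z)' = (h\, f \circ \alpha)'\, Z + h\, f \circ \alpha\, Z',$$
and thus $(Z^i h\, f \partial_{x^i})'(t_0) = Z'(t_0)$. On the other hand,
$$\begin{aligned}
(Z^i h\, f \partial_{x^i})' &= (Z^i h)'\, (f \partial_{x^i})_\alpha + h Z^i \big( (f \partial_{x^i})_\alpha \big)' \\
&= (Z^i h)'\, (f \partial_{x^i})_\alpha + h Z^i D_{\alpha'}(f \partial_{x^i}).
\end{aligned}$$
We then evaluate this at $t_0$ and recognise on the right-hand side the expression previously found for ${}^J\frac{d}{ds} Z_{|J}$.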
For the existence, on any subinterval $J$ of $I$ such that $\alpha(J)$ lies in a coordinate neighborhood, define $Z'$ by the above formula. One can then check that all four properties hold (with $I$ replaced by $J$). Let $J_1$ and $J_2$ be two such subintervals of $I$, such that $J_1 \cap J_2 \neq \emptyset$. The above formula gives two vector fields on $J_1 \cap J_2$; call them $Z_1'$ and $Z_2'$. Since the first three properties hold, we can apply our uniqueness statement and thus we must have $Z_1' = Z_2'$ on $J_1 \cap J_2$. In other words, the above formula defines a single vector field in $\Gamma(\alpha)$.
Exercise: Check that all four properties indeed hold, as claimed in the proof. (For the last property, check first that it holds if $\alpha$ lies in a single coordinate neighborhood and $Z_1$ and $Z_2$ are given by $[\partial_{x^i}]_\alpha$, $[\partial_{x^j}]_\alpha$.)
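From the above proof, we obtain the coordinate expression
$$\begin{aligned}
Z' &= \frac{dZ^\beta}{ds} \big( \partial_{x^\beta} \big)|_\alpha + Z^\beta D_{\alpha'}(\partial_{x^\beta}) \\
&= \frac{dZ^\beta}{ds} \big( \partial_{x^\beta} \big)|_\alpha + (x^\rho(\alpha))' \big( \Gamma^\gamma_{\beta\rho} \circ \alpha \big) Z^\beta\, (\partial_{x^\gamma})|_\alpha,
\end{aligned}$$
where $Z^\beta := Z(x^\beta) : I \to \mathbb{R}$ are the components of $Z$ along $\alpha$.
A special case of a vector field along $\alpha$ is given by the tangent vector field to $\alpha$, $\alpha'$. Its derivative along $\alpha$, $\alpha''$, is sometimes called the acceleration of the curve $\alpha$. (Note that the first $'$ does not depend on the geometry, but the second does.)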
A vector field $Z$ on $\alpha$ such that $Z' = 0$ is said to be parallel. The above computation leads to
Proposition 3.5. Let $\alpha : I \to M$, $a \in I$ and $z \in T_{\alpha(a)} M$. Then, there exists a unique parallel vector field $Z$ on $\alpha$ such that $Z(a) = z$.
Proof. Let $J$ be a non-empty open subinterval of $I$ containing $a$ and such that $\alpha(J)$ is contained in a single coordinate chart. From the above, we must have
$$0 = \frac{dZ^\gamma}{ds} + (x^\rho(\alpha))'\, \Gamma^\gamma_{\beta\rho}\, Z^\beta.$$
This is a system of linear ODEs, which therefore has a unique solution defined on the whole interval $J$ for given initial data.
Now, if $J_1$ is another interval such that $\alpha(J_1)$ is contained in a single coordinate chart and $J_1 \cap J \neq \emptyset$, let $t_1 \in J_1 \cap J$ and $z(t_1) = Z(t_1)$ (where $Z(t_1)$ has been obtained uniquely by solving the equation on $J$). By the same argument, $Z_{|J_1}$ is then given uniquely by solving the equations in coordinates with the data given by $z(t_1)$.
Since $I$ can be written as a union of such subintervals, we are done. (More precisely, given any $t_1 \in I$, $\alpha([a, t_1])$ can be covered by a finite number of coordinate neighborhoods. Using the uniqueness statement and the finiteness of the number of neighborhoods, $Z$ can be defined uniquely on the whole of $[a, t_1]$. Since this holds for any $t_1$, the statement of the proposition holds.)
With the above notation, given $a, b \in I$, the map sending $z \in T_{\alpha(a)} M$ to $Z(b) \in T_{\alpha(b)} M$ is called a parallel translation.
Exercise: Prove that parallel translation is a linear isometry from $(T_{\alpha(a)} M, g_{\alpha(a)})$ to $(T_{\alpha(b)} M, g_{\alpha(b)})$.
We can now define the notion of geodesics.
Definition 3.14. A geodesic is a curve $\alpha$ such that $\alpha'' = 0$.
Lemma 3.9. Let $\alpha(s)$ be given in local coordinates as $x^\alpha(s)$. Then the geodesic equations read
$$(x^\rho)'' + \Gamma^\rho_{\beta\gamma}\, (x^\beta)' (x^\gamma)' = 0.$$
The geodesic equation is therefore a system of second order ODEs. Since it is non-linear (due to the product of the $x'$ in the second term), its solutions need not be defined globally, but given data it always has a unique maximal solution.
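As a sketch of Lemma 3.9 in a concrete case, consider the round sphere in the $(\theta, \phi)$ coordinates. The values $\Gamma^\theta_{\phi\phi} = -\sin\theta \cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cos\theta / \sin\theta$ used below are the standard Christoffel symbols of the round metric $d\theta \otimes d\theta + \sin^2\theta\, d\phi \otimes d\phi$ and are assumed here rather than derived.

```latex
\begin{align*}
\theta'' - \sin\theta \cos\theta\, (\phi')^2 &= 0, \\
\phi'' + 2\, \frac{\cos\theta}{\sin\theta}\, \theta' \phi' &= 0.
\end{align*}
```

For instance, the equator $s \to (\theta(s), \phi(s)) = (\pi/2, s)$ solves this system, since $\cos(\pi/2) = 0$ and $\theta' = \phi'' = 0$, and so does any meridian $s \to (s, \phi_0)$ with $\phi_0$ constant.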
We have by Cauchy-Lipschitz
Lemma 3.10. Let $v \in T_p M$. Then there exists a unique maximal solution $\gamma : I \to M$ to the geodesic equation such that $\gamma'(0) = v$, where $I$ is an open interval and $0 \in I$.
Such a geodesic (what does maximality refer to here?) is called a maximal geodesic or inextendible geodesic. We say that $(M, g)$ is geodesically complete if every inextendible geodesic is defined on the whole of $\mathbb{R}$ (i.e. the geodesic equations always admit global solutions).
Lemma 3.11. Let $a \neq 0$, $\tau : s \to as + b$ be a straight line and $J = \tau^{-1}(I)$. Then, if $\gamma : I \to M$ is a geodesic, so is $\gamma \circ \tau : J \to M$.
Exercise: what about other changes of parametrization?
We will denote by $(\gamma_v, I_v)$ the geodesic associated to the vector $v$ as above.
Note that any constant curve is a geodesic.
Exercise:
• Show that the (non-constant) geodesics of the Euclidean or Minkowski space are the straight lines.
• A curve $\alpha$ is called a pregeodesic if it has a reparametrization as a geodesic. Let now $\alpha$ be a regular curve ($\alpha' \neq 0$) such that $\alpha''$ and $\alpha'$ are collinear, i.e. $\alpha''(s) = f(s)\, \alpha'(s)$ for some $f$.
1. Show that $\beta = \alpha \circ h$ is a geodesic if and only if $h'' + f(h)\, h'^2 = 0$.
2. Assume now that $g(\alpha', \alpha') \neq 0$. Then, any constant speed reparametrization of $\alpha$ is a geodesic.
3. Either $g(\alpha', \alpha') = 0$ identically, or it never vanishes.
4. If $g(\alpha', \alpha') = 0$, then $\alpha$ is also a pregeodesic.
3.6 The physical interpretation of the geodesic equations in the causal case
Recall Newton's equation from classical mechanics. In any Galilean frame, for a particle of mass $m$,
$$m \ddot{x} = \sum F.$$
When the forces are only gravitational, then $\sum F = -m \nabla \phi$, where $\phi$ is the gravitational potential. The equation then reduces to
$$\ddot{x} = -\nabla \phi(x).$$
This is a second order ODE, like the geodesic equations, whose solutions depend only on the initial position and initial velocity of the particle.
In General Relativity, the replacement of Newton's equation is the geodesic equation. More precisely, a particle of mass $m$ which is free falling (no external forces apart from the effect of gravity) moves along a geodesic whose tangent vector is timelike if $m > 0$, null if $m = 0$. These last requirements are the general relativistic requirement that no particle can move faster than the speed of light.
Thus in General Relativity, there is no gravitational force, there is only geometry. To obtain a closed physical theory, it remains to have a prescription for this geometry. More specifically, in Newtonian mechanics, the gravitational potential $\phi$ is obtained through Poisson's equation
$$\Delta \phi = 4 \pi G \rho,$$
where $\rho$ is the local mass density and $G$ a constant (the gravitational constant).
Thus, although we have a replacement for Newton's equation, we still need to find a replacement for Poisson's equation.
3.7 The exponential map
Definition 3.15. Let $p \in M$. Let $U_p$ be the set of vectors $v \in T_p M$ such that $\gamma_v$ is defined on $[0, 1]$. The exponential map is then
$$\exp_p : U_p \to M, \qquad v \mapsto \gamma_v(1).$$
Remark 3.3. By smooth dependence with respect to the initial data (cf. Cauchy-Lipschitz), the exponential map is a smooth function on $U_p$.
Remark 3.4. With the above notation, for any $t \in \mathbb{R}$, the map $s \to \gamma_v(ts)$ is a geodesic. Moreover, it has initial velocity $tv$ and position $p$. Thus, $\gamma_v(ts) = \gamma_{tv}(s)$ by uniqueness. In particular, we have
$$\exp_p(tv) = \gamma_{tv}(1) = \gamma_v(t).$$
The curve $t \to \gamma_v(t)$ will be called a radial geodesic.
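A sketch of the simplest example, taking for granted that the geodesics of the Euclidean or Minkowski space are the straight lines, and using the identification of $T_p \mathbb{R}^{n+1}$ with $\mathbb{R}^{n+1}$ given by Cartesian coordinates (an identification assumed here for the illustration):

```latex
\[
\gamma_v(t) = p + t v, \qquad U_p = T_p \mathbb{R}^{n+1}, \qquad \exp_p(v) = \gamma_v(1) = p + v,
\qquad \exp_p(t v) = p + t v = \gamma_v(t).
\]
```

In this case $\exp_p$ is defined on all of $T_p \mathbb{R}^{n+1}$; in general it is only defined on $U_p$.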
Exercise: prove that $U_p$ contains a neighborhood of $0 \in T_p M$.
Proposition 3.6. For each $p \in M$, there exists a neighborhood $V$ of $0$ in $T_p M$ such that $\exp_p$ is a diffeomorphism from $V$ to some neighborhood $U$ of $p \in M$.
Proof. Let $(x^\alpha)$ be coordinates on $M$. Recall that the $(x^\alpha)$ induce a global coordinate system $(X^\alpha)$ on $T_p M$ given by
$$X = X^\alpha\, \big( \partial_{x^\alpha} \big)|_p,$$
for any $X \in T_p M$. Consider the differential of the exponential map at $0 \in T_p M$. We have
$$d\exp_p(0) : T_0 T_p M \to T_{\exp_p(0)} M = T_p M.$$
Let $V \in T_0 T_p M$. Since $T_p M$ is a vector space, we can identify $T_0 T_p M$ with $T_p M$. More precisely, we can write $V = V^\alpha [\partial_{X^\alpha}]|_{0 \in T_p M}$. Then, $\alpha : s \to s V^\alpha [\partial_{x^\alpha}]|_{p \in M}$ is a curve in $T_p M$ whose tangent vector at $s = 0$ is $V$. Denote by $\tilde{V}$ the vector $V^\alpha [\partial_{x^\alpha}]|_{p \in M}$ (the map $V \to \tilde{V}$ is sometimes called the canonical isomorphism in this context).
Then by definition and the above remark,
$$d\exp_p(0)(V) = \frac{d}{ds} \big( \exp_p(\alpha(s)) \big)\Big|_{s=0} = \frac{d}{ds} \big( \exp_p(s \tilde{V}) \big)\Big|_{s=0} = \left[ \frac{d}{ds} \gamma_{\tilde{V}}(s) \right]_{|s=0} = \tilde{V}.$$
Thus, $d\exp_p(0)$ coincides with the map $V \to \tilde{V}$. Exercise: show that this map is independent of the choice of coordinates. The result then follows by the inverse function theorem.
Definition 3.16. If $\exp_p$ is a diffeomorphism from a neighborhood $U$ of $0$ in $T_p M$ onto a neighborhood $V$ of $p$ in $M$, as in the preceding proposition (note that, from now on, the roles of the letters $U$ and $V$ are exchanged compared to that proposition), and if $U$ is starshaped around $0$, then $V$ is called a normal neighborhood of $p$.
Recall that $U$ starshaped around $0$ means that for any $x \in U$, $sx \in U$ for all $0 \le s \le 1$. (Note that $U \subset T_p M$ is a subset of a vector space, so that $sx$ makes sense.) Note that if $U$ is a neighborhood of $0$, then it contains a starshaped neighborhood of $0$.
Proposition 3.7. If $V = \exp_p(U)$ is a normal neighborhood of $p \in M$, then for each point $q \in V$, there exists a unique geodesic $\sigma : [0, 1] \to V$ from $p$ to $q$. Furthermore $\sigma'(0) = \exp_p^{-1}(q) \in U$.
Proof. Let $q \in V$ and $v = \exp_p^{-1}(q) \in U$. Since $U$ is starshaped, the ray $\rho : t \to tv$, $0 \le t \le 1$, is contained in $U$. Let $\sigma = \exp_p \circ \rho$. Then $\sigma$ is a geodesic from $p$ to $q$.
Now let $\tau : [0, 1] \to V$ be any geodesic joining $p$ to $q$ with values in $V$. Then, if $\tau'(0) = w$, we must have by uniqueness that $\tau(t) = \exp_p(tw)$ at least for $t$ sufficiently small (since $tw \in U$ for $t$ small). Consider the set of $t \in [0, 1]$ such that $tw \in U$. It is a non-empty interval, open in $[0, 1]$. Let $t_0$ be its supremum. We have that $\tau(t_0) \in V$ and so, by continuity,
$$\lim_{t \to t_0,\, t < t_0} \exp_p^{-1} \tau(t) = t_0 w \in U.$$
Thus, $t_0 = 1$ and $w \in U$.
Hence $q = \exp_p(w)$. Since $\exp_p$ is a diffeomorphism on $U$, it follows that $w = v$ and hence that $\tau = \sigma$.
Remark 3.5. Note that if we do not specify the domain of definition of $\sigma$, then uniqueness no longer holds, since we can make a change of parametrization.
Exercise: Let $S^2$ be the unit sphere in $\mathbb{R}^3$ endowed with its round metric.
1. Show that rotations of $S^2$ are isometries.
2. Let $\gamma$ be a geodesic and $\phi : (M, g_M) \to (N, g_N)$ an isometry. Show that $\phi(\gamma)$ is also a geodesic.
3. Show that on $S^2$, the great circles are geodesics.
4. Show that any two distinct points on $S^2$ can be joined by two distinct geodesics (where we consider two geodesics to be equal if they have the same image in $S^2$, to account for the parametrization freedom).
Definition 3.17. A broken geodesic is a continuous, piecewise smooth curve with each piece being a smooth geodesic.
Recall that a piecewise smooth curve is a continuous map $\gamma : I \to M$, where $I$ is a non-empty interval, such that if $I = [a, b]$, then there exists a finite partition $a = t_0 < t_1 < \dots < t_k = b$ such that each curve segment $\gamma|_{[t_i, t_{i+1}]}$ is a smooth curve, while for an interval with open endpoints, say $I = [a, b)$, we require that for each non-empty closed interval $J \subset I$, the restriction $\gamma|_J$ is piecewise smooth.
A classical result in differential geometry asserts that a manifold is connected if and only if it is path connected. Here is a slight variation in the case of semi-Riemannian manifolds.
Lemma 3.12. A semi-Riemannian manifold $M$ is connected if and only if any two points of $M$ can be joined by a broken geodesic.
Proof. Let $M$ be connected and $p \in M$. Let $C$ be the set of points that can be connected to $p$ by broken geodesics.
$C$ is open: indeed if $q \in C$, let $V$ be a normal neighborhood around $q$. Then from the above proposition, $V \subset C$.
$M \setminus C$ is open: Let $q \in M \setminus C$ and again $V$ be a normal neighborhood around $q$. Then $V \subset M \setminus C$. For, if $q' \in V \cap C$, then from the above, there exists a geodesic from $q$ to $q'$ and a broken geodesic from $p$ to $q'$ and thus, a broken geodesic from $q$ to $p$, which is a contradiction.
Since $C \neq \emptyset$, it follows from the connectedness of $M$ that $C = M$. The converse is trivial.
Let $V$ be a normal neighborhood of $p \in M$. Let $(e_\alpha)$ be an orthonormal basis for $T_p M$. For any $q \in V$, let $(x^\alpha(q))$ be determined by
$$\exp_p^{-1}(q) = x^\alpha(q)\, e_\alpha.$$
In other words, given $q \in V$, we first compute $v_q = \exp_p^{-1}(q)$. Then the components of $v_q$ in the orthonormal basis $e_\alpha$ provide the $x^\alpha$. The $x^\alpha$ are smooth functions since they can be written as a composition of smooth functions and it is also immediate that they form a local coordinate system near $p$.
Lemma 3.13. The map $q \to x^\alpha(q)$ determines a local coordinate system on $V$ called normal coordinates.
If $f^\alpha$ is the dual basis to the $e_\alpha$, then $x^\beta \circ \exp_p = f^\beta$ on $U$. Moreover,
Proposition 3.8. Let $(x^\alpha)$ be normal coordinates. Then, $g_{\alpha\beta}(p) = g(p)(e_\alpha, e_\beta)$ and $\Gamma^\alpha_{\beta\gamma}(p) = 0$.
Remark 3.6. Note that $g_{\alpha\beta}(p) = g(p)(e_\alpha, e_\beta)$ means that the matrix $g_{\alpha\beta}(p)$ coincides with the Minkowski matrix (up to an ordering of the $e_\alpha$) in the Lorentzian case, or with the identity matrix in the Riemannian case. In physics, these coordinates are sometimes referred to as "local inertial frames" essentially because the laws of physics written in these coordinates will look similar (up to a first order term) to those written in Minkowski space or the Euclidean space, at least on a sufficiently small neighborhood of $p$. In other words, if your laboratory is sufficiently small (see footnote 5) and if you make measurements very quickly (we need a small region of spacetime, not just of space), then the measurements should converge to those obtained in, say, Minkowski space, i.e. we can neglect all gravitational effects (of course, if your lab becomes too small, then quantum effects might appear, but this is another story).
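Proof. Let $v = v^\alpha e_\alpha \in T_pM$, where the $e_\alpha$ are orthonormal as above. Since $\exp_p(tv) = \gamma_v(t)$ (for t small enough),
\[ x^\alpha(\gamma_v(t)) = x^\alpha(\exp_p(tv)) = t v^\alpha. \]
Hence, $v = \gamma_v'(0) = v^\alpha [\partial_{x^\alpha}]|_p$.
In particular, $e_\alpha = [\partial_{x^\alpha}]|_p$ and hence $g_{\alpha\beta}(p) = g(p)(e_\alpha, e_\beta)$.
Moreover, from the above expression, we have $\frac{d^2}{dt^2}\left(x^\alpha(\gamma_v(t))\right) = 0$, and since $\gamma_v$
is a geodesic, this implies that
\[ \Gamma^\alpha_{\beta\gamma}(\gamma_v(t))\, \frac{d}{dt}\left(x^\beta(\gamma_v(t))\right) \frac{d}{dt}\left(x^\gamma(\gamma_v(t))\right) = 0. \]
Evaluated at t = 0, we get that
\[ \Gamma^\alpha_{\beta\gamma}(p)\, v^\beta v^\gamma = 0, \]
for all v, which implies, by polarization and the symmetry of $\Gamma^\alpha_{\beta\gamma}$ in its lower indices, that $\Gamma^\alpha_{\beta\gamma}(p) = 0$.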
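The polarization step at the end of the proof can be spelled out as follows. This is only a sketch; the auxiliary vectors u, w and the quadratic form S are notation introduced here for illustration and do not appear in the notes.

```latex
% Fix \alpha and set S(v) := \Gamma^\alpha_{\beta\gamma}(p)\, v^\beta v^\gamma, which vanishes for all v.
% For arbitrary u, w \in T_pM, expand S(u+w):
\[
0 = S(u+w) - S(u) - S(w)
  = \Gamma^\alpha_{\beta\gamma}(p)\,\bigl(u^\beta w^\gamma + w^\beta u^\gamma\bigr)
  = 2\,\Gamma^\alpha_{\beta\gamma}(p)\, u^\beta w^\gamma ,
\]
% where the last equality uses \Gamma^\alpha_{\beta\gamma} = \Gamma^\alpha_{\gamma\beta}
% (the Levi-Civita connection is torsion free).
% Choosing u = e_\beta and w = e_\gamma gives \Gamma^\alpha_{\beta\gamma}(p) = 0 for every \beta, \gamma.
```

Note that the symmetry of the Christoffel symbols is essential here: for a connection with torsion, only the symmetric part would be forced to vanish at p.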
Remark 3.7. With the above notations, the Jacobian matrix of the differential of the
exponential map at p at any point v ∈ U with respect to the coordinates associated to
the e α on T p M and the normal coordinates on expp (U ) is just the identity matrix.
4 Curvature
For a surface in $\mathbb{R}^3$, there is a well defined notion of curvature, the Gaussian curvature. This is an intrinsic invariant of the surface, in the sense that Gauss proved
that the Gaussian curvature depends only on the induced metric and not on the way the surface happens to
be embedded in $\mathbb{R}^3$. The generalization of the Gaussian curvature to arbitrary
(semi-)Riemannian manifolds led Riemann to Riemannian geometry.
5 Smallness should always be measured with respect to something. Here, we want the Γ to stay small
on some coordinate patch and since at p we have arranged that they vanish, we need the size of the lab
to be small compared to the derivatives of the Γ. These derivatives are not very geometric, so if we want
something more physical (i.e. less dependent on our choices of gauge), we need a tensorial quantity at
the level of the first derivative of the Γ. This is the curvature, soon to be defined in these lectures.
Lemma 4.1. Let (M , g ) be a semi-Riemannian manifold and D its Levi-Civita connection. The function
\[ R : \Gamma(M)^3 \to \Gamma(M), \qquad (X, Y, Z) \mapsto R_{XY}Z := D_X D_Y Z - D_Y D_X Z - D_{[X,Y]} Z \]
defines a (1, 3) tensor field on M called the Riemann curvature tensor of (M , g ).
Remark 4.1. Even though R only takes 3 vector fields as arguments, it is a (1, 3) tensor
field in the sense that given any one-form ω, the map
\[ (X, Y, Z, \omega) \mapsto \omega(R_{XY}Z) \]
is an F(M)-multilinear map from $\Gamma(M)^3 \times \Lambda^1(M)$ to F(M).
Proof. For R to define a tensor field, one needs R to be F(M)-linear in each of its
arguments. Let f ∈ F(M), then
\[
\begin{aligned}
R_{X, fY} Z &= D_X D_{fY} Z - D_{fY} D_X Z - D_{[X, fY]} Z \\
&= D_X\left(f D_Y Z\right) - f D_Y D_X Z - D_{X(f) Y + f[X,Y]} Z \\
&= X(f) D_Y Z + f D_X D_Y Z - f D_Y D_X Z - X(f) D_Y Z - f D_{[X,Y]} Z \\
&= f R_{XY} Z.
\end{aligned}
\]
The other cases are similar.
Remark 4.2. The sign of R is conventional, i.e. some authors (for instance O'Neill)
define R as above but with an extra minus sign everywhere.
Since R is a tensor field, it can be interpreted as a map defined on $(T_pM)^3$.
Given $x, y \in T_pM$, let $R_{xy}$ be the map $T_pM \to T_pM$ sending z to $R_{xy}z$. We have
Proposition 4.1 (Symmetries of the Riemann tensor in the metric case).
\[
\begin{aligned}
R_{xy} &= -R_{yx}, \\
g(R_{xy}v, w) &= -g(R_{xy}w, v), \\
R_{xy}z + R_{yz}x + R_{zx}y &= 0, \\
g(R_{xy}v, w) &= g(R_{vw}x, y).
\end{aligned}
\]
Remark 4.3. The curvature can be defined for any linear connection. Some symmetries always hold (such as the first), but others (such as the last) are specific to metric
connections.
The third equation is called the first Bianchi identity. The last follows from the
first three.
Proof. The first equation is obvious in view of the definition of R. For the second, we
assume that X, Y, Z, V are local vector fields such that $X_p = x$, $Y_p = y$, $Z_p = z$, $V_p = v$ and such
that all commutators [X, Y] = [Y, Z] = [Z, X] = 0. Since R is a tensor field, it does not
depend on the choice of extensions. (Exercise: why do such extensions exist?) Then,
\[
\begin{aligned}
g(R_{XY}V, V) &= g(D_X D_Y V, V) - g(D_Y D_X V, V) \\
&= X g(D_Y V, V) - g(D_Y V, D_X V) - Y g(D_X V, V) + g(D_X V, D_Y V) \\
&= \tfrac{1}{2} X Y g(V, V) - \tfrac{1}{2} Y X g(V, V) \\
&= 0,
\end{aligned}
\]
since X and Y commute. By polarization, we obtain the second equation. For
a function F := F(X, Y, Z), let C(F) denote the sum over cyclic permutations of
X, Y, Z, i.e.
\[ C(F)(X, Y, Z) = F(X, Y, Z) + F(Z, X, Y) + F(Y, Z, X). \]
Then, since any cyclic permutation leaves C(F)(X, Y, Z) unchanged, we have
\[
\begin{aligned}
C\, R_{XY}Z &= C\, D_X D_Y Z - C\, D_Y D_X Z \\
&= C\, D_Y D_Z X - C\, D_Y D_X Z \\
&= C\, D_Y [Z, X] = 0.
\end{aligned}
\]
We leave the last symmetry as an exercise.
Since R is a tensor field, we can consider its covariant differential $DR : V \in \Gamma(M) \mapsto D_V R$. Recall that since $D_V$ is a tensor derivation, it maps tensor fields into tensor
fields, and since $D_V R$ is F(M)-linear in V, one can thus view DR as a (1, 4) tensor
field which takes 4 vector fields X, Y, Z, V and returns a vector field
\[ DR(X, Y, Z, V) = (D_V R)(X, Y) Z. \]
Again, at any p ∈ M, we get a map from $(T_pM)^4$ to $T_pM$. We have
Proposition 4.2. We have
D z R(x, y) + D x R(y, z) + D y R(z, x) = 0,
called the second Bianchi identity.
Proof. Choose a normal coordinate system $(x^\alpha)$ at p. Let $X_p, Y_p, Z_p \in T_pM$ and
let X, Y, Z be local extensions of $X_p, Y_p, Z_p$ obtained by choosing X, Y, Z to have constant
components in the $\partial_{x^\alpha}$ basis of vector fields. Note that X, Y, Z all commute with each other
and that their covariant derivatives vanish at p.
Then, from the tensor derivation laws,
\[ (D_Z R)(X, Y)V = D_Z\left(R(X, Y)V\right) - R(D_Z X, Y)V - R(X, D_Z Y)V - R(X, Y)(D_Z V). \]
At p, the two middle terms on the RHS vanish. Since [X, Y] = 0, we have $R(X, Y) = [D_X, D_Y]$, hence
\[ (D_Z R)(X, Y) = [D_Z, R(X, Y)] = [D_Z, [D_X, D_Y]], \]
at p.
The result then follows from the Jacobi identity
\[ [D_Z, [D_X, D_Y]] + [D_X, [D_Y, D_Z]] + [D_Y, [D_Z, D_X]] = 0 \]
(just write it out to see the cancellations).
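Let the components of the curvature tensor be defined as
\[ R(X, Y)Z = R^\alpha_{\beta\gamma\delta}\, Z^\beta X^\gamma Y^\delta\, \partial_{x^\alpha}, \]
i.e.
\[ R^\alpha_{\beta\gamma\delta}\, \partial_{x^\alpha} = R(\partial_\gamma, \partial_\delta)(\partial_\beta). \]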
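Exercise:
1. Show that
\[ R^\alpha_{\beta\gamma\delta} = \partial_\gamma \Gamma^\alpha_{\delta\beta} - \partial_\delta \Gamma^\alpha_{\gamma\beta} + \Gamma^\sigma_{\delta\beta} \Gamma^\alpha_{\gamma\sigma} - \Gamma^\sigma_{\gamma\beta} \Gamma^\alpha_{\delta\sigma}. \]
2. Let
\[ R_{\alpha\beta\gamma\delta} = g_{\alpha\rho} R^\rho_{\beta\gamma\delta}. \]
Show that $R_{\alpha\beta\gamma\delta} = R_{\gamma\delta\alpha\beta}$.
3. The Ricci identity: for X ∈ Γ(M), define the second covariant derivative of X, DDX, as the
covariant derivative D(DX) of DX. It is a (1, 2) tensor field and we denote its
components by $X^\alpha_{;\beta\gamma}$. Prove that
\[ X^\alpha_{;\beta\gamma} - X^\alpha_{;\gamma\beta} = R^\alpha_{\delta\beta\gamma} X^\delta. \]
4. Let T be an arbitrary tensor field. Show that
\[ [D_\alpha, D_\beta] T = (R \star T)_{\alpha\beta}, \]
where $R \star T$ denotes a tensor obtained from $R \otimes T$ and contractions.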
Remark 4.4. There are plenty of nice geometric interpretations of the Riemann curvature tensor. Exercise: write something about it (look for instance for the relation between
curvature and holonomy).
Given a semi-Riemannian manifold, we can define locally⁶ an n-form called the volume
form out of the metric, given by
\[ \eta = \sqrt{|g|}\, dx^1 \wedge dx^2 \wedge \dots \wedge dx^n, \]
where $g = \det g_{\alpha\beta}$ is the determinant of the metric in the local coordinate system.
As we will see later, the volume form appears naturally in many formulae involving geometric operators and of course, it is also the natural form to perform
integrations on our manifold.
Choose normal coordinates $(x^\alpha)$ at some p and consider the function $\det g = \pm\left(\eta(\partial_{x^1}, \partial_{x^2}, \dots, \partial_{x^n})\right)^2$ in the $(x^\alpha)$ system of coordinates. Doing a Taylor expansion, we
have
\[ \det g = \det g(0) + \partial_\alpha(\det g)(0)\, x^\alpha + \frac{1}{2} \partial_\alpha \partial_\beta(\det g)(0)\, x^\alpha x^\beta + O(x^3), \]
where $\det g(0) = \pm 1$. The derivative terms on the right hand side can all be rewritten in terms of covariant derivatives. Using the Levi-Civita property of the metric (Dg = 0), the symmetrization obtained thanks to the $x^\alpha x^\beta$, and the total antisymmetry of the determinant, all the second covariant derivatives can be rewritten
using the Riemann curvature tensor. However, only specific combinations⁷ of the
components of the curvature tensor actually appear and they can be rewritten⁸ (up
to a constant) as a tensor: the Ricci tensor, which is defined as follows.
6 It can even be made global if (and only if) M is orientable. See Appendix A.
7 The situation is similar to that of the flow of a vector field. Indeed, let B be a measurable set and
$\mu(\phi_t(B))$ be the volume of the image of B under the flow $\phi_t$ of a vector field X. Then, $\frac{d}{dt}\mu(\phi_t(B))$ is given by an integral of the divergence of X. The analogy is that the flow is the geodesic flow, the vector field is the geodesic spray vector
field. However, all these objects are defined on the tangent bundle and not just on M (essentially because
the geodesic equation is second order) while we are interested in the variation of volumes in M, which is
why this is an analogy only.
8 Those interested in the computations can consult the paper The volume of a small geodesic ball of a
Riemannian manifold by Alfred Gray, or the book Riemannian Geometry by Eisenhart.
Definition 4.1. Let $R^\alpha_{\beta\gamma\delta}$ be the components of the Riemann tensor. Then, Ric(g) is
the tensor obtained by contraction in the first and third positions, i.e. in coordinates
\[ \mathrm{Ric}(g)_{\beta\delta} = R^\alpha_{\beta\alpha\delta}. \]
Lemma 4.2. The Ricci tensor is symmetric: $\mathrm{Ric}(g)_{\alpha\beta} = \mathrm{Ric}(g)_{\beta\alpha}$.
Proof. This follows easily from the symmetries of the Riemann tensor.
Similarly, we define the scalar curvature.
Definition 4.2. The scalar curvature is defined as the trace of Ric(g), i.e.
\[ R(g) = g^{\alpha\beta}\, \mathrm{Ric}(g)_{\alpha\beta}. \]
Remark 4.5. The scalar curvature again has an interpretation in terms of a Taylor expansion. This time, one needs to consider the surface area of a geodesic ball centered at
some point and compare it with the surface area of a ball of flat space of the same radius.
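As an illustration of the definitions of this section (this example is not part of the original notes), consider the unit round sphere $S^2$ with coordinates $(\theta, \varphi)$ and metric $g = d\theta^2 + \sin^2\theta\, d\varphi^2$. Assuming the coordinate expression of $R^\alpha_{\beta\gamma\delta}$ in terms of the Christoffel symbols (stated as an exercise above) and the standard Christoffel symbols of this metric, a direct computation gives the following sketch.

```latex
% Non-vanishing Christoffel symbols of g = d\theta^2 + \sin^2\theta\, d\varphi^2:
\[
\Gamma^\theta_{\varphi\varphi} = -\sin\theta\cos\theta, \qquad
\Gamma^\varphi_{\theta\varphi} = \Gamma^\varphi_{\varphi\theta} = \cot\theta .
\]
% Independent Riemann components, from
% R^\alpha_{\beta\gamma\delta} = \partial_\gamma\Gamma^\alpha_{\delta\beta} - \partial_\delta\Gamma^\alpha_{\gamma\beta}
%   + \Gamma^\sigma_{\delta\beta}\Gamma^\alpha_{\gamma\sigma} - \Gamma^\sigma_{\gamma\beta}\Gamma^\alpha_{\delta\sigma}:
\[
\begin{aligned}
R^\theta_{\varphi\theta\varphi}
  &= \partial_\theta \Gamma^\theta_{\varphi\varphi} - \Gamma^\varphi_{\theta\varphi}\Gamma^\theta_{\varphi\varphi}
   = (\sin^2\theta - \cos^2\theta) + \cos^2\theta = \sin^2\theta, \\
R^\varphi_{\theta\varphi\theta}
  &= -\partial_\theta \Gamma^\varphi_{\varphi\theta} - \Gamma^\varphi_{\varphi\theta}\Gamma^\varphi_{\theta\varphi}
   = \frac{1}{\sin^2\theta} - \cot^2\theta = 1 .
\end{aligned}
\]
% Ricci tensor (contraction in the first and third positions) and scalar curvature:
\[
\mathrm{Ric}(g)_{\theta\theta} = R^\varphi_{\theta\varphi\theta} = 1, \qquad
\mathrm{Ric}(g)_{\varphi\varphi} = R^\theta_{\varphi\theta\varphi} = \sin^2\theta, \qquad
\mathrm{Ric}(g)_{\theta\varphi} = 0,
\]
\[
\mathrm{Ric}(g) = g, \qquad R(g) = g^{\theta\theta}\cdot 1 + g^{\varphi\varphi}\sin^2\theta = 2 .
\]
```

In particular, with the sign conventions used here, the round sphere has positive Ricci and scalar curvature.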
5 Some differential operators
Definition 5.1. Let f ∈ F(M). Then, grad f (the gradient of f) is by definition the
unique vector field such that
\[ g(\mathrm{grad} f, V) = \langle df, V \rangle = V(f), \]
for any V ∈ Γ(M).
In components, $(\mathrm{grad} f)^\alpha = g^{\alpha\beta} \partial_\beta f$.
One often writes Df instead of grad f, which should then not be confused with the map
$Df : V \mapsto D_V(f) = \langle df, V \rangle = df(V)$.
Definition 5.2. Let X be a vector field. We then define its divergence as
\[ \mathrm{div}(X) := D_\alpha X^\alpha = C(DX), \]
where C is the contraction operator.
Exercise:
1. For a tensor field, we define similarly its divergence by first computing its covariant differential and then applying the contraction operator on the first
two indices. For instance, the divergence of a (1, 1) tensor field T is a one-form
given in components by $D_\mu T^\mu_\nu$. Compute its components in terms of those of
$T^\mu_\nu$.
2. Show that $\mathrm{div}(X) = \frac{1}{\sqrt{|\det g|}} \partial_\alpha\left(\sqrt{|\det g|}\, X^\alpha\right)$ in any system of coordinates (note
the absolute value, which is necessary in the Lorentzian case).
3. Let $\eta = \sqrt{|g|}\, dx^1 \wedge dx^2 \wedge \dots \wedge dx^n$ be a local volume form associated to g. Show
that
\[ d(i_X \eta) = \mathrm{div}(X)\, \eta = L_X(\eta), \]
where $i_X \eta$ is the interior product of η by X (an (n − 1)-form). Hint: use a coordinate system such that $\partial_{x^1} = X$.
We end this section with the definition of a natural second order differential operator associated to a Riemannian or Lorentzian metric.
Definition 5.3.
• Assume that g is Riemannian, then the operator
\[ \psi \in F(M) \mapsto \Delta_g(\psi) = g^{\alpha\beta} D_\alpha D_\beta \psi = \mathrm{div}(\mathrm{grad}\, \psi) \]
is called the Laplace-Beltrami operator associated to g.
• For g Lorentzian, we have similarly an operator $\Box_g$,
\[ \psi \in F(M) \mapsto \Box_g(\psi) = g^{\alpha\beta} D_\alpha D_\beta \psi = \mathrm{div}(\mathrm{grad}\, \psi), \]
called the wave (or d'Alembertian) operator associated to g.
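For orientation, here is how these operators look in local coordinates. The sketch combines the component formula $(\mathrm{grad}\,\psi)^\alpha = g^{\alpha\beta}\partial_\beta\psi$ with the standard coordinate expression of the divergence, $\mathrm{div}(X) = |\det g|^{-1/2}\, \partial_\alpha(|\det g|^{1/2} X^\alpha)$. The Minkowski example assumes the signature convention $(-,+,\dots,+)$ and Cartesian coordinates $(t, x^1, \dots, x^n)$.

```latex
% Coordinate expression of div(grad \psi), valid for \Delta_g and \Box_g alike:
\[
\mathrm{div}(\mathrm{grad}\,\psi)
  = \frac{1}{\sqrt{|\det g|}}\, \partial_\alpha\!\left( \sqrt{|\det g|}\; g^{\alpha\beta}\, \partial_\beta \psi \right).
\]
% Euclidean space, g = \delta: \det g = 1 and one recovers the usual Laplacian,
\[
\Delta_\delta \psi = \sum_{i=1}^{n} \partial_{x^i}^2 \psi .
\]
% Minkowski space, g = \eta = \mathrm{diag}(-1, 1, \dots, 1): |\det \eta| = 1 and
\[
\Box_\eta \psi = -\partial_t^2 \psi + \sum_{i=1}^{n} \partial_{x^i}^2 \psi .
\]
```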
Remark 5.1. One can define a similar operator for any semi-Riemannian metric g but
it is only for the Riemannian and Lorentzian cases that there currently exists a good
understanding (and, to the author's knowledge, only these operators appear naturally
in physical contexts).
Remark 5.2. For a tensor field T of arbitrary type, we can define similarly
\[ \Box_g T = g^{\alpha\beta} D_\alpha D_\beta T. \]
Remark 5.3. For a Riemannian metric, the principal symbol of $\Delta_g$ written in an arbitrary system of local coordinates is elliptic, while for a Lorentzian metric the principal symbol of $\Box_g$ is hyperbolic.
Exercise:
• Show that the principal symbol of $\Box_g$ is $g^{\alpha\beta}(p)\, \xi_\alpha \xi_\beta$. It can thus be viewed as
a real function defined on the cotangent bundle $T^\star M$.
• Show that in any local system of coordinates such that $g_{00} < 0$ and $(g_{ij})$ is positive
definite, if $\xi = (\tau, \xi_i)$ (i.e. $\xi_0 = \tau$), then, for each $(\xi_i) \neq 0$, the equation $P(\tau) := g^{\alpha\beta} \xi_\alpha \xi_\beta = 0$ has two distinct real roots.
• Show that in any local system of coordinates such that $g_{00} < 0$ and $(g_{ij})$ is positive
definite, the wave equation can be recast as a symmetric hyperbolic system in
the sense of Friedrichs.
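The simplest instance of the second exercise is Minkowski space, which may help to fix ideas before treating the general case. The sketch below assumes $g = \eta = \mathrm{diag}(-1, 1, \dots, 1)$ in Cartesian coordinates; the notation $|\xi|$ for the Euclidean norm of the spatial part of ξ is introduced here for convenience.

```latex
% Principal symbol of \Box_\eta = -\partial_t^2 + \sum_i \partial_{x^i}^2 at \xi = (\tau, \xi_i):
\[
P(\tau) = \eta^{\alpha\beta}\xi_\alpha\xi_\beta = -\tau^2 + |\xi|^2,
\qquad |\xi|^2 := \sum_{i} \xi_i^2 .
\]
% For (\xi_i) \neq 0, P(\tau) = 0 has the two distinct real roots
\[
\tau_\pm = \pm |\xi| .
\]
% In general, P is the quadratic polynomial
\[
P(\tau) = g^{00}\tau^2 + 2 g^{0i}\xi_i\, \tau + g^{ij}\xi_i\xi_j ,
\]
% so the exercise amounts to showing that its discriminant is positive.
```

The set $\{P = 0\}$ in each cotangent space is the cone of null covectors, which is the reason why the characteristics of the wave equation are governed by the null geometry of g.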
5.0.1 Remarks on conservation laws and Lagrangian theory
Recall now Stokes' theorem. Let M be a smooth orientable manifold and N ⊂ M
a domain with smooth boundary ∂N, $j : \partial N \to M$ the inclusion map, then for all $\omega \in \Lambda^{n-1}(M)$ such that supp(ω) ∩ ∂N is compact, we have
\[ \int_{\partial N} j^\star \omega = \int_N d\omega. \]
Assume for simplicity that N is such that ∂N is compact and let X be a vector field.
Recall that since M is orientable, the volume form η can actually be
defined globally and agrees with the given formula in any local chart. Then, from
the above,
\[ \int_{\partial N} j^\star(i_X \eta) = \int_N \mathrm{div}(X)\, \eta. \]
In particular, for every divergence free vector field, we have a conservation law (in
the sense that the total flux of X through ∂N has to vanish, so the incoming flux=the
outgoing flux).
These conservation laws are important in PDE applications because they will
be the source of many of the estimates needed to control solutions.
The question becomes: how to construct divergence free vector fields?
One answer: out of divergence free, symmetric tensor fields and out of symmetries of the metric.
Let now T be a divergence free tensor, say $T^\mu_\nu$ satisfying $D_\mu T^\mu_\nu = 0$. Let X be a
vector field, and define ${}^X\!J^\mu$ as
\[ {}^X\!J^\mu = T^\mu_\nu X^\nu. \]
Then,
\[ \mathrm{div}({}^X\!J) = T^\mu_\nu D_\mu X^\nu. \]
Thus, we get again a conservation law provided that $T^\mu_\nu D_\mu X^\nu = 0$.
Later in the course, we shall consider symmetric tensors, so that $T^{\mu\nu} = T^{\nu\mu}$. In this
case,
\[ T^\mu_\nu D_\mu X^\nu = T^{\mu\nu}\, \tfrac{1}{2}\left(D_\mu X_\nu + D_\nu X_\mu\right) = T^{\mu\nu}\; {}^X\!\pi_{\mu\nu}, \]
where ${}^X\!\pi$ is the deformation tensor of X. Conclusion: Killing vector fields and
divergence free tensors give conservation laws as explained above.
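The computation behind these statements is short and can be sketched as follows. It uses only the Leibniz rule, Dg = 0 and the symmetry of T; the identification of the deformation tensor with $\frac{1}{2} L_X g$ is the usual one and is assumed here.

```latex
% Divergence of the current J^\mu = T^\mu_\nu X^\nu (Leibniz rule, then D_\mu T^\mu_\nu = 0):
\[
D_\mu\!\left(T^\mu_\nu X^\nu\right)
  = \left(D_\mu T^\mu_\nu\right) X^\nu + T^\mu_\nu D_\mu X^\nu
  = T^\mu_\nu D_\mu X^\nu .
\]
% If T is symmetric, only the symmetric part of DX contributes (indices lowered with g, using Dg = 0):
\[
T^\mu_\nu D_\mu X^\nu = T^{\mu\nu} D_\mu X_\nu
  = T^{\mu\nu}\, \tfrac{1}{2}\left(D_\mu X_\nu + D_\nu X_\mu\right)
  = T^{\mu\nu}\; {}^X\!\pi_{\mu\nu},
\qquad
{}^X\!\pi := \tfrac{1}{2} L_X g .
\]
% If X is Killing, L_X g = 0, so the current is divergence free and Stokes' theorem gives
\[
\int_{\partial N} j^\star\!\left(i_{{}^X\!J}\, \eta\right) = \int_N \mathrm{div}({}^X\!J)\, \eta = 0 .
\]
```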
The question now becomes: how to construct or obtain divergence free symmetric tensors and Killing fields⁹?
Digression:
For all the standard equations of physics (say coming from electromagnetism, fluid
dynamics, scalar fields, etc.) it is not hard to write them down first in tensorial notation in special relativity (i.e. in the special case of a background geometry given by
Minkowski space) and then in an arbitrary Lorentzian manifold by replacing partial
derivatives in Cartesian coordinates by covariant derivatives. Now, in all these cases,
the equations can be written simply as the vanishing of the divergence of a tensor field
constructed out of the unknown. These divergence free tensor fields themselves
can be constructed by variational techniques and have their roots in Lagrangian
theory. Thus, the physics is essentially going to tell us what are the divergence free
symmetric tensors to consider. As for Killing fields, they will also appear naturally¹⁰.
For instance, if we imagine a physical system which is stationary (i.e. there is a sense
in which the system is not evolving), then the physicist will model this by the existence of a Killing field T which has some timelike property (g(T, T) < 0 at least in
some region of the manifold).
9 In fact, arguments based on Stokes' theorem can be applied to symmetric tensor fields and vector
fields even when they are not divergence free or Killing. In that case, we would get error terms from the
divergence term, and the game becomes to control these errors. In particular, the control of solutions for
non-linear problems often falls in that category.
10 There is another (similarly vague) argument for the existence of symmetry in many solutions of problems arising in physics. These solutions can often be constructed by variational arguments, for instance
by constructing minimizers of certain functionals. Departure from symmetry will then often increase the
value of these functionals. A good example to have in mind is the isoperimetric problem in classical
geometry.
Example 1: We consider a smooth function ψ (a scalar field) on our manifold
and assume that it solves the wave equation
\[ \Box_g \psi = F, \]
for some smooth source F.
To any such ψ, we can associate its so-called energy-momentum¹¹ tensor (field)
\[ T[\psi] = d\psi \otimes d\psi - \frac{1}{2}\, g\; g(D\psi, D\psi). \]
Then,
\[ \mathrm{div}\, T = \Box_g \psi \cdot d\psi, \]
or in coordinates,
\[ D_\mu T^\mu_\nu[\psi] = \Box_g \psi \cdot D_\nu \psi. \]
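The divergence identity for T[ψ] can be checked directly in components. The following sketch uses only Dg = 0 and the symmetry of the Hessian of a function, $D_\mu D_\nu \psi = D_\nu D_\mu \psi$.

```latex
% Components: T_{\mu\nu}[\psi] = D_\mu\psi\, D_\nu\psi - \tfrac12 g_{\mu\nu}\, D^\alpha\psi\, D_\alpha\psi.
\[
\begin{aligned}
D^\mu T_{\mu\nu}[\psi]
  &= (D^\mu D_\mu \psi)\, D_\nu \psi + D^\mu\psi\, D_\mu D_\nu \psi
     - \tfrac{1}{2}\, D_\nu\!\left(D^\alpha\psi\, D_\alpha\psi\right) \\
  &= \Box_g\psi\; D_\nu\psi + D^\mu\psi\, D_\mu D_\nu\psi - D^\alpha\psi\, D_\nu D_\alpha\psi \\
  &= \Box_g\psi\; D_\nu\psi ,
\end{aligned}
\]
% the last two terms of the second line cancel because D_\mu D_\nu\psi = D_\nu D_\mu\psi
% (the Levi-Civita connection is torsion free).
```

In particular, T[ψ] is divergence free whenever ψ solves the homogeneous wave equation $\Box_g \psi = 0$.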
Example 2: Maxwell equations. In Minkowski space in the usual coordinates, they
can be written¹² as
\[ \partial_\mu F^{\mu\nu} = j^\nu, \]
where $j^\nu$ is a vector field (the source term) and F (the Faraday tensor) is a closed two-form
on $\mathbb{R}^{n+1}$ (thus, by the Poincaré Lemma, F = dA for some one-form A).
Exercise: Derive the usual form of the Maxwell equations (involving the divergence and curl operators in flat space) in terms of the electric and magnetic
fields $E_i = F(\partial_{x^0}, \partial_{x^i})$ and $B_i = -\frac{1}{2} \epsilon_{iab}\, \delta^{aj} \delta^{bk} F(\partial_{x^j}, \partial_{x^k})$, where ε is the totally antisymmetric symbol such that $\epsilon_{123} = 1$, from the equations satisfied by F above.
We can then introduce a tensor, the energy-momentum tensor of electromagnetism,
\[ T_{\mu\nu}[F] = F_{\mu\sigma} F_\nu{}^\sigma - \frac{1}{4}\, \eta_{\mu\nu} \left(F_{\alpha\beta} F^{\alpha\beta}\right) \]
and check that the equations of motion imply that
\[ \partial^\mu T_{\mu\nu} = F_{\nu\mu}\, j^\mu. \]
(Exercise: check the sign of the RHS.) In particular, in the vacuum j = 0 and we
again have a divergence free symmetric tensor field.
To obtain the equations in an arbitrary Lorentzian manifold (M , g ), just replace
all ∂µ by D µ and the Minkowski metric η by g .
The origin of these tensor fields is Lagrangian field theory. A very nice mathematical treatment of such Lagrangian field theories (and in particular of the hyperbolic character of the equations) is given in "The action principle and partial differential equations" by D. Christodoulou with an overview given in the book by the
same author "Mathematical problems of general relativity I ".
11 Also sometimes referred to as the stress-energy tensor.
12 As for the rest of the equations written in these lectures, all physical constants have been set to 1 for
simplicity.
5.0.2 The Einstein tensor
Let us now try to construct general relativity. In particular, we are looking for an
equation replacing Poisson’s equation
∆φ = 4πρ.
Let T be the tensor associated with the matter fields. It turns out that in the above
examples, T is a symmetric (0, 2) tensor field and T is divergence free. Moreover, for
any timelike vector field X, we have T(X, X) ≥ 0 and this is interpreted in physics as
an energy density.
Exercise: For a scalar field ψ, show that T[ψ](X, X) ≥ 0 for X timelike. What is
T[ψ](X, X) in Minkowski space for $X = \partial_t$?
We interpret T as the correct replacement for ρ. On the other hand, the replacement for φ should be constructed out of the metric. Moreover, it should be second
order (as is ∆φ). Thus, we want an equation of the form
\[ G(g) = T, \]
where G is a symmetric (0, 2) tensor (it has to be independent of the choice of coordinates), constructed out of the first and second derivatives of g, and G is divergence
free. A first candidate would have been the curvature tensor, since it is a tensor and
depends on the second derivatives of g, but it is a 4-tensor. The Ricci tensor then
seems like a good candidate and indeed, in the first draft, Einstein (wrongly) chose
it. The trouble is that the Ricci tensor is not divergence free in general.
The correct choice is given in the next lemma.
Lemma 5.1.
\[ dR(g) = 2\, \mathrm{div}\, \mathrm{Ric}. \]
Equivalently, the Einstein tensor defined as
\[ G = \mathrm{Ric}(g) - \frac{1}{2}\, g\, R(g) \]
is a divergence free tensor field.
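Proof. We first express the second Bianchi identity in coordinates
\[ R_{\alpha\beta\gamma\delta;\rho} + R_{\alpha\beta\rho\gamma;\delta} + R_{\alpha\beta\delta\rho;\gamma} = 0. \]
Contracting in α and ρ gives, using also the symmetries of R and the definition of
the Ricci tensor,
\[
\begin{aligned}
0 &= R^\alpha{}_{\beta\gamma\delta;\alpha} + R^\alpha{}_{\beta\alpha\gamma;\delta} + R^\alpha{}_{\beta\delta\alpha;\gamma} \\
&= R^\alpha{}_{\beta\gamma\delta;\alpha} + R_{\beta\gamma;\delta} - R_{\beta\delta;\gamma}.
\end{aligned}
\]
Contract again in βδ (and use again the symmetries of R) to conclude.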
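For completeness, the final contraction can be written out as follows. This is a sketch using Dg = 0 (so that g commutes with covariant derivatives), the two pair antisymmetries of $R_{\alpha\beta\gamma\delta}$, and the convention $R_{\beta\delta} = \mathrm{Ric}(g)_{\beta\delta} = R^\alpha{}_{\beta\alpha\delta}$.

```latex
% Contract 0 = R^\alpha{}_{\beta\gamma\delta;\alpha} + R_{\beta\gamma;\delta} - R_{\beta\delta;\gamma} with g^{\beta\delta}:
\[
\begin{aligned}
g^{\beta\delta} R^\alpha{}_{\beta\gamma\delta;\alpha}
  &= D^\alpha\!\left(g^{\beta\delta} R_{\alpha\beta\gamma\delta}\right)
   = D^\alpha\!\left(g^{\beta\delta} R_{\beta\alpha\delta\gamma}\right)
   = D^\alpha R_{\alpha\gamma}, \\
g^{\beta\delta} R_{\beta\gamma;\delta} &= D^\beta R_{\beta\gamma}, \\
g^{\beta\delta} R_{\beta\delta;\gamma} &= \partial_\gamma R(g).
\end{aligned}
\]
% Hence
\[
0 = 2\, D^\alpha R_{\alpha\gamma} - \partial_\gamma R(g),
\qquad\text{i.e.}\qquad dR(g) = 2\, \mathrm{div}\, \mathrm{Ric},
\]
% and therefore
\[
D^\alpha G_{\alpha\gamma}
  = D^\alpha R_{\alpha\gamma} - \tfrac{1}{2}\, g_{\alpha\gamma}\, D^\alpha R(g)
  = D^\alpha R_{\alpha\gamma} - \tfrac{1}{2}\, \partial_\gamma R(g) = 0 .
\]
```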
6 The Einstein equations
The Einstein equations for a Lorentzian¹³ manifold (M, g) are given by
\[ \mathrm{Ric}(g) - \frac{1}{2}\, g\, R(g) + \Lambda g = T, \]
where Λ ∈ R is a constant, typically called the cosmological
constant in the physics literature, and T is a symmetric, divergence free 2-tensor (think of T as a source term),
typically called the energy-momentum tensor in the physics literature. Note that
the term Λg is also divergence free, by virtue of the fact that Dg = 0.
For simplicity, let us assume that T = 0 and that dim M = 4. Then, taking the
trace of the Einstein equations, we obtain that R = 4Λ and the Einstein equations
reduce to the vacuum Einstein equations:
\[ \mathrm{Ric}(g) = \Lambda g. \]
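The trace computation leading to the vacuum equations is the following one-line argument, sketched here in arbitrary dimension $n = \dim M > 2$ (the symbol n is introduced for this illustration; the notes fix n = 4).

```latex
% Trace of Ric(g) - \tfrac12 g R(g) + \Lambda g = 0 with respect to g, using g^{\alpha\beta} g_{\alpha\beta} = n:
\[
R(g) - \frac{n}{2}\, R(g) + n \Lambda = 0
\quad\Longrightarrow\quad
R(g) = \frac{2n}{n-2}\, \Lambda .
\]
% Substituting back:
\[
\mathrm{Ric}(g) = \frac{1}{2}\, g\, R(g) - \Lambda g = \frac{2}{n-2}\, \Lambda\, g .
\]
% For n = 4: R(g) = 4\Lambda and \mathrm{Ric}(g) = \Lambda g.
```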
A solution to the Einstein vacuum equations is then a Lorentzian manifold (M , g )
solving the above equations. Some remarks:
1. The manifold M is part of the unknown. We are not just solving for g .
2. A word about the real constant Λ. In the first formulation of the equations,
Einstein did not add the Λ term. Then, trying to construct explicit static solutions (which have a time symmetry) in the presence of matter, he needed
to balance the gravitational force that attracts matter together with another
mechanism that pushes matter apart. Adding Λ > 0 allowed him to construct such solutions. The solution that Einstein constructed is however highly unstable,
thus not quite physical, and moreover does not match the observations that
Hubble made in 1929. He called the introduction of Λ his "biggest mistake"
and decided to remove it from the equations. However, even if the solution of
Einstein did not match the physical observations, the new mechanism coming from Λ > 0 was legitimate. It is responsible for the so-called "accelerated expansion" of the universe and is a standard ingredient of cosmology. In other words, yes
the solution of Einstein is unphysical but that does not necessarily mean the
equations were bad. There could be other solutions with better properties.
The current models use Λ = 0 for an event at the scale of a galaxy or a star, and
Λ > 0 (but very small¹⁴ compared to all other constants, such as G and c)
for a global model of the observed universe. Finally, solutions with Λ < 0 are
extremely popular in High Energy Physics (the most quoted paper in HEP is
about such type of solutions).
3. As we shall see, the Einstein equations are dynamical (hyperbolic) equations.
Thus, solutions are typically constructed by solving initial value problem (but
other constructions are possible such as characteristic initial value problem,
analytic extensions...) in which case we will need to explain what are the data
and in which sense our solutions agree with the data.
13 The study of the Riemannian version of these equations is also an interesting subject. Here, we
will focus on the Lorentzian case.
14 According to the English Wikipedia article, the current proposed value is $1.19 \times 10^{-52}\ \mathrm{m}^{-2}$.
4. In the non-vacuum case (T ≠ 0), although T is a source, the typical case is
that T is not given a priori. Instead, T is also obtained by solving a dynamical equation, which itself depends on (M, g). Thus, one obtains a coupled
system.
Ex: The scalar field. Consider a function ψ ∈ F(M) and recall that T[ψ] is
given by
\[ T[\psi] = d\psi \otimes d\psi - \frac{1}{2}\, g\; g(D\psi, D\psi), \]
where Dψ denotes the gradient of ψ. Recall also that
\[ \mathrm{div}\, T = \Box_g \psi \cdot d\psi. \]
Thus, we consider the coupled problem
\[
\begin{aligned}
\mathrm{Ric}(g) - \frac{1}{2}\, g\, R(g) + \Lambda g &= T[\psi], \\
\Box_g \psi &= 0.
\end{aligned}
\]
Typical other source terms: electromagnetism, Vlasov fields (kinetic theory),
fluids, etc.
Above, we claimed that the Einstein equations are actually hyperbolic, in fact
wave, equations. There are several different notions of hyperbolicity in the PDE literature. Here, we are going to prove first that, from the Einstein equations, one can
extract systems of wave equations, that is to say equations of the form
\[ \Box_g u = F(u, Du) + B, \]
for some tensor fields u related to the metric (these equations may only hold locally,
say in some coordinate patch of the manifold).
The difficulties in identifying the hyperbolic character of the Einstein equations
are actually already present in a much simpler and better known system of PDEs:
the Maxwell equations.
Recall that they can be written (here in Minkowski space for simplicity) as $\partial_\mu F^{\mu\nu} = j^\nu$ for a closed two-form F.
How do we know that these are actually wave equations?
We will give two answers¹⁵.
First, in the (E, B) formulation, it is well known that from the above system we
can write wave equations of the form
\[ \Box E = S_E(\partial j), \qquad \Box B = S_B(\partial j). \]
Second, if A is such that F = dA, then the Maxwell equations written in terms of
A take the form
\[ \Box A_\nu - \partial_\nu \partial_\mu A^\mu = j_\nu. \]
15 A third formulation of hyperbolicity for the Maxwell equations can be obtained directly from the
(E, B) formulation. Discarding the two divergence equations, the two evolution equations for E and B
actually form a first order symmetric hyperbolic system in the sense of Friedrichs. Interestingly, the second
Bianchi equations for the curvature tensor can also be separated into constraints and evolution equations
and the evolution part forms again a symmetric hyperbolic system. That there are so many analogies between the Maxwell equations $D_\mu F^{\mu\nu} = j^\nu$ and the Bianchi equations is actually not that surprising, since the
Maxwell equations can also be viewed as the Bianchi equations for a connection 1-form on a principal
bundle whose curvature is actually given by F.
Now this equation is not a wave equation because of the second term in the left-hand side. However, recall that given any scalar function χ, we have F = d(A + dχ).
Claim: if χ solves $\Box \chi = -\partial_\mu A^\mu$, then $A' = A + d\chi$ solves $\partial_\mu (A')^\mu = 0$. The new A′ then
solves a wave equation. Choosing χ is called making a gauge choice (the one where
$\partial_\mu A^\mu = 0$ is called the Lorenz gauge, often spelled Lorentz gauge; another popular one is the Coulomb gauge,
where only the spatial divergence of A is required to vanish, $\partial_i A^i = 0$).
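The claim can be verified in a few lines. The sketch below works in Minkowski space, where partial derivatives commute and indices are raised with η; it assumes that the inhomogeneous wave equation for χ can be solved, which is itself part of what these lectures set out to prove.

```latex
% Gauge transformation: A'_\mu = A_\mu + \partial_\mu \chi. Since d(d\chi) = 0,
\[
F' = dA' = dA + d(d\chi) = F .
\]
% Divergence of the new potential:
\[
\partial^\mu A'_\mu = \partial^\mu A_\mu + \partial^\mu \partial_\mu \chi = \partial^\mu A_\mu + \Box\chi = 0
\qquad\text{if}\qquad \Box\chi = -\partial^\mu A_\mu .
\]
% The Maxwell equations \Box A'_\nu - \partial_\nu \partial_\mu (A')^\mu = j_\nu then reduce to
\[
\Box A'_\nu = j_\nu ,
\]
% a system of decoupled wave equations for the components of A'.
```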
For the Einstein equations, we claim that similarly, we can write wave equations
directly by essentially differentiating the equations (thus obtaining wave equations
for R) or, that by making a gauge choice, we can write wave equations for g . The
gauge freedom in this case is simply the choice of local coordinates on our manifold.
6.1 Hyperbolicity I: Wave equations satisfied by the Riemann tensor
Proposition 6.1. Let (M, g) be a solution to the vacuum Einstein equations. Then,
the Riemann curvature tensor solves an equation of the form
\[ \Box_g R = R \star R, \]
where $R \star R$ is obtained from $R \otimes R$ by contractions.
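Proof. Taking the trace of the Bianchi equations, we obtain
\[ D^\sigma R_{\alpha\beta\sigma\nu} + D_\beta R_{\alpha\nu} - D_\alpha R_{\beta\nu} = 0. \]
Thanks to the Einstein equations and the fact that Dg = 0, by definition of the Levi-Civita connection, the last two terms vanish and we obtain just a divergence free
property of the Riemann curvature tensor
\[ D^\sigma R_{\alpha\beta\sigma\nu} = 0. \]
Recalling again the Bianchi equation
\[ D_\sigma R_{\alpha\beta\mu\nu} + D_\beta R_{\sigma\alpha\mu\nu} + D_\alpha R_{\beta\sigma\mu\nu} = 0, \]
we have
\[
\begin{aligned}
\Box_g R_{\alpha\beta\mu\nu} &= g^{\rho\sigma} D_\rho D_\sigma R_{\alpha\beta\mu\nu} \\
&= -g^{\rho\sigma} D_\rho D_\beta R_{\sigma\alpha\mu\nu} - g^{\rho\sigma} D_\rho D_\alpha R_{\beta\sigma\mu\nu} \\
&= -g^{\rho\sigma} [D_\rho, D_\beta] R_{\sigma\alpha\mu\nu} - g^{\rho\sigma} [D_\rho, D_\alpha] R_{\beta\sigma\mu\nu} - D_\beta\left(g^{\rho\sigma} D_\rho R_{\sigma\alpha\mu\nu}\right) - D_\alpha\left(g^{\rho\sigma} D_\rho R_{\beta\sigma\mu\nu}\right) \\
&= -g^{\rho\sigma} [D_\rho, D_\beta] R_{\sigma\alpha\mu\nu} - g^{\rho\sigma} [D_\rho, D_\alpha] R_{\beta\sigma\mu\nu} \\
&= R \star R,
\end{aligned}
\]
where the last two terms of the third line vanish by the divergence free property (and the symmetries of R), and the commutators are expressed in terms of the curvature.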
Remark 6.1. Given (M, g) and appropriate data, it is not hard to construct local solutions to a non-linear system of equations of the form
\[ \Box_g \phi = \phi \star \phi. \]
However, this scheme does not really solve the Einstein equations in that R and g are
of course not independent. We could imagine running a scheme of the form:
Given appropriate data for g, we construct first a candidate $g_1$ (for instance extending g locally to simply be a Lorentzian metric, whatever it is, as long as it agrees
with the data).
Then, we compute data for $R_1$ and solve the associated wave equation.
We then try to construct $g_2$, which would have $R_1$ as a curvature tensor (possibly modulo some error terms) and should still agree with the data.
We then run again our iterative scheme. There are many difficulties in doing so and
in practice, the local existence of solutions to the Einstein equations does not use this
formulation.
However, estimating the curvature first and returning to g has been a very successful strategy to address global problems for the Einstein equations (when we try to
understand the long time dynamics of the solutions).
One clear thing: before solving the Einstein equations, we need to know how to
solve the wave equation (say $\Box_g \psi = 0$ for ψ a function). In particular, we need to
know how to prescribe initial data.
6.2 Hyperbolicity II: Wave coordinates
Definition 6.1. Let (M, g) be a Lorentzian manifold. A system of wave¹⁶ coordinates
is by definition a coordinate system $(x^\alpha)$ such that for each α, the function $x^\alpha$ is a
solution of the wave equation
\[ \Box_g x^\alpha = 0. \]
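Proposition 6.2. Let $(x^\alpha)$ be wave coordinates. Then, for all α,
\[ g^{\rho\sigma} \Gamma^\alpha_{\rho\sigma} = 0. \]
Proof. We have for any function ψ
\[ \Box_g \psi = g^{\rho\sigma} D_\rho \partial_\sigma(\psi) \tag{4} \]
\[ \phantom{\Box_g \psi} = g^{\rho\sigma}\left(\partial_\rho \partial_\sigma(\psi) - \Gamma^\gamma_{\rho\sigma} \partial_\gamma(\psi)\right). \tag{5} \]
Thus, since the second derivatives of the coordinate function $x^\alpha$ vanish and $\partial_\gamma x^\alpha = \delta^\alpha_\gamma$,
\[ \Box_g x^\alpha = -g^{\rho\sigma}\left(\Gamma^\gamma_{\rho\sigma} \delta^\alpha_\gamma\right) = -g^{\rho\sigma} \Gamma^\alpha_{\rho\sigma}. \]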
Corollary 6.1. In wave coordinates, the wave operator $\Box_g$ reduces to its principal part
and the vacuum Einstein equations take the form
\[ \Box_g g_{\alpha\beta} = Q_{\alpha\beta}(\partial g, \partial g). \]
Proof. This follows easily by writing explicitly the Ricci tensor first in terms of the Γ
and then in terms of g.
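To indicate what the computation in the proof looks like, here is the standard decomposition of the Ricci tensor in local coordinates. It is quoted as a sketch, not derived: the notation $\Gamma^\gamma := g^{\rho\sigma}\Gamma^\gamma_{\rho\sigma}$ and the symbol $\tilde Q_{\alpha\beta}$ for the terms quadratic in ∂g are introduced here for illustration, and in the corollary $\Box_g g_{\alpha\beta}$ is to be read as the scalar wave operator applied to each component $g_{\alpha\beta}$.

```latex
% Ricci tensor in arbitrary local coordinates, with \Gamma^\gamma := g^{\rho\sigma}\Gamma^\gamma_{\rho\sigma}:
\[
\mathrm{Ric}(g)_{\alpha\beta}
  = -\frac{1}{2}\, g^{\rho\sigma} \partial_\rho \partial_\sigma g_{\alpha\beta}
    + \frac{1}{2}\left( g_{\alpha\gamma}\, \partial_\beta \Gamma^\gamma + g_{\beta\gamma}\, \partial_\alpha \Gamma^\gamma \right)
    + \tilde Q_{\alpha\beta}(\partial g, \partial g).
\]
% In wave coordinates \Gamma^\gamma = 0, so \mathrm{Ric}(g) = 0 becomes
\[
g^{\rho\sigma} \partial_\rho \partial_\sigma g_{\alpha\beta} = 2\, \tilde Q_{\alpha\beta}(\partial g, \partial g),
\]
% a system of quasilinear wave equations for the components g_{\alpha\beta},
% whose principal part g^{\rho\sigma}\partial_\rho\partial_\sigma is that of \Box_g.
```

The terms involving $\partial \Gamma^\gamma$ are exactly the second order terms that prevent the Ricci tensor from being a wave operator on g in arbitrary coordinates, in the same way that $\partial_\nu \partial_\mu A^\mu$ obstructs the Maxwell equations.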
Remark 6.2. The existence of local wave coordinates will be relatively easy once we
know how to solve the wave equation.
16 In the Riemannian case, similar coordinates can be constructed, replacing the wave operator by the
Laplace-Beltrami operator. These are called harmonic coordinates and this name is sometimes also used
in the Lorentzian case.
7 Semi-Riemannian submanifolds
Definition 7.1. Let P be a submanifold of a semi-Riemannian manifold (M, g) and let
$j : P \to M$ denote the inclusion map. If $j^\star(g)$ is a metric tensor on P, we say that P is
a semi-Riemannian submanifold of M.
$j^\star(g)$ is called the induced metric or first fundamental form.
Recall the definition of a vector field on a map. If P is a submanifold of M, then
we call a vector field on the inclusion map $j : P \to M$ a P-vector field. We denote by
Γ(P, M) the set of all such vector fields.
If P is a semi-Riemannian submanifold of M, then we have, for each p ∈ P, a direct sum decomposition
\[ T_pM = T_pP \oplus (T_pP)^\perp. \]
Vectors in $(T_pP)^\perp$ are said to be normal to P while those of $T_pP$ are of course tangent
to P. The resulting orthogonal projections will be denoted as
\[ \tan : T_pM \to T_pP \]
and
\[ \mathrm{nor} : T_pM \to (T_pP)^\perp. \]
A vector field Z in Γ(P, M) is normal to P provided that for each p ∈ P, $Z_p$ is normal
to P. For any Z in Γ(P, M), we can apply tan and nor at each point to construct the
tangential and normal parts of Z.
Definition 7.2. Let P be a semi-Riemannian submanifold of M. Then the induced
connection ∇ on P is a map
\[ \nabla : \Gamma(P) \times \Gamma(P, M) \to \Gamma(P, M) \]
defined as follows. For any V ∈ Γ(P) and Z ∈ Γ(P, M), let V′ and Z′ be smooth local
extensions to M of V and Z. Then we define $\nabla_V Z$ as the restriction of $D_{V'} Z'$ to P, where D is the Levi-Civita connection of M.
Exercise: With the above notation, check that the restriction does not depend on
the choice of extensions, and thus that the induced connection is well defined.
Lemma 7.1. If V, W ∈ Γ(P) and ${}^P\!D$ denotes the Levi-Civita connection associated to
the induced metric on P, then
\[ \tan \nabla_V W = {}^P\!D_V W. \]
Remark 7.1. In particular, we could reconstruct the properties of the Levi-Civita connection for surfaces (or any submanifolds of $\mathbb{R}^n$) in this way. Indeed, let S be a surface in $\mathbb{R}^3$. Then, we know how to define the induced metric on S. Moreover, for
V, W ∈ Γ(S), we can consider extensions Ṽ, W̃ to the whole of $\mathbb{R}^3$. Then, the covariant derivative of W̃ along Ṽ in $\mathbb{R}^3$ is given
in Cartesian coordinates by $\tilde V(\tilde W^\beta)\, \partial_{x^\beta}$ and we can restrict to the surface as explained
above. In other words, the formula just given together with basic calculus in $\mathbb{R}^n$ can
be seen as a justification for the definition of the Levi-Civita connection.
Add second fundamental forms and Gauss-Codazzi equations.
Definition 7.3. A hypersurface of a Riemannian or Lorentzian manifold (i.e. a submanifold of codimension 1) is called spacelike if the induced metric is Riemannian,
timelike if the induced metric is Lorentzian, null if the induced metric¹⁷ is degenerate.
17 This is of course a slight abuse of language, since the induced metric in that case is not a metric.
8 Local geometry
8.1 Two-parameter maps
We start with some preliminaries on two-parameter maps.
Definition 8.1. Let D ⊂ R² be open and such that D intersected with any vertical or
horizontal line is an interval (possibly empty)¹⁸. A two-parameter map is a smooth
map
\[ \kappa : D \to M, \qquad (u, v) \mapsto \kappa(u, v). \]
κ is thus composed of two intertwined families of curves:
• For any fixed v₀ such that {v = v₀} ∩ D ≠ ∅, the u-parameter curve v = v₀ of κ
is u → κ(u, v₀).
• For any fixed u₀ such that {u = u₀} ∩ D ≠ ∅, the v-parameter curve u = u₀ of κ
is v → κ(u₀, v).
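The condition on D in Definition 8.1 is weaker than convexity. As an illustration, the following Python sketch checks on a sample grid that the wedge of footnote 18 meets each horizontal and vertical line in an interval, while failing to be convex. The grid spacing, the helper names and the two test points are assumptions made for this sketch only.

```python
# Sketch: the wedge with vertices (0,0), (10,0), (1,1), (0,10) meets every
# sampled horizontal or vertical line in an interval, but is not convex.
# Grid spacing and helper names are illustrative assumptions.

WEDGE = [(0.0, 0.0), (10.0, 0.0), (1.0, 1.0), (0.0, 10.0)]  # vertices in order


def inside(x, y, poly=WEDGE):
    """Ray-casting point-in-polygon test (boundary points are not handled)."""
    hit = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                hit = not hit
    return hit


def is_interval(flags):
    """True if the True entries of flags form one contiguous run (or none)."""
    idx = [i for i, f in enumerate(flags) if f]
    return not idx or idx[-1] - idx[0] + 1 == len(idx)


# Sample points are offset so that none of them lies on the boundary.
grid = [0.013 + 0.05 * k for k in range(220)]

rows_ok = all(is_interval([inside(x, y) for x in grid]) for y in grid)
cols_ok = all(is_interval([inside(x, y) for y in grid]) for x in grid)

# Non-convexity: two points of D whose midpoint lies outside D.
a, b = (5.0, 0.2), (0.2, 5.0)
mid = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
nonconvex = inside(*a) and inside(*b) and not inside(*mid)
```

A grid check of this kind is of course not a proof; it only makes the shape of such a domain concrete.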
At any (u, v) ∈ D, we can compute d κ(u,v) : T(u,v) D → Tκ(u,v) M . The maps (u, v) →
κu := d κ(∂u ) ∈ Tκ(u,v) M and (u, v) → κv := d κ(∂v ) ∈ Tκ(u,v) M are then vector fields
on the map κ. Moreover, one has of course that κu (u 0 , v 0 ) coincides with the tangent vector of the v = v 0 curve at u 0 and κv (u 0 , v 0 ) is the tangent vector of the u = u 0
curve at v 0 .
If the image of κ lies in the domain of a coordinate system (x α ) then its coordinate functions κα = x α ◦ κ are smooth real functions on D. Moreover,
κu = ∂u (κα )∂x α ,
κv = ∂v (κα )∂x α .
Let Z be a smooth vector field on the map κ, i.e.
Z : D → T M,
such that π ◦ Z = κ, where π is the canonical projection π : T M → M .
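We can then define $Z_u = D_{\partial_u} Z$, the covariant derivative of Z along the u-parameter
curve, given in local coordinates by
\[ Z_u = \partial_u Z^\alpha\, \partial_{x^\alpha} + Z^\alpha\, \Gamma^\beta_{\alpha\delta}\, \partial_u \kappa^\delta\, \partial_{x^\beta}, \tag{6} \]
and similarly $Z_v = D_{\partial_v} Z$, the covariant derivative of Z along the v-parameter curve.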
In the particular case Z = κu , Zu is the acceleration of the u-parameter curve.
Moreover,
Lemma 8.1.
1. If κ is a two parameter map, κuv = κvu .
2. If Z is a vector field on κ, then
Zuv − Z vu = R(κv , κu )Z
where R is the Riemann curvature tensor.
18 A wedge (a kind of boomerang) obtained for instance from the points (10, 0), (0, 0), (0, 10) and (1, 1) is
an example of such a D which is non-convex.
Proof. Writing κuv in coordinates, it follows that it is symmetric in u and v. We leave the
proof of the second point as an exercise.
Remark 8.1. The induced covariant derivative on a curve, the induced covariant
derivative on a submanifold and the two covariant derivatives for vector fields on a
two-parameter map are in fact all special examples of the induced covariant derivative for sections of the pullback bundle. You can try to write a definition for it as an
exercise.
8.2 The Gauss lemma
So far we have seen that given p ∈ M , expp carries rays t → t v into radial geodesics
t → γv (t ). Moreover, we computed the differential of the exponential map at 0 ∈
T p M . The following result, called the Gauss lemma, is concerned with the differential of the exponential map at some radial vector. It implies in particular that
orthogonality to radial directions is also preserved.
Lemma 8.2. Let p ∈ M and $0 \neq x \in T_pM$. If $v_x, w_x \in T_x(T_pM)$ with $v_x$ radial, then
\[ g_{\exp_p(x)}\big(d_x\exp_p(v_x),\, d_x\exp_p(w_x)\big) = g_p(v_x, w_x). \]
Remark 8.2. Here, on the RHS, we evaluate the metric g at p on the vectors v x and
w x . Strictly speaking, we should replace them in this expression with their images under
the canonical isomorphism that identifies T x T p M with T p M .
Remark 8.3. Recall that given Φ : V → V , for V a vector space endowed with a scalar
product g , Φ is an isometry if g (d Φx (v), d Φx (w)) = g (v, w), for all x, v, w. Thus, the
Gauss lemma can be viewed as a statement about the exponential map being a partial isometry (it preserves scalar products with radial directions). Note also that in the statement, the differential of the exponential map is taken at x (hence the notation d x ).
Proof. v x radial means that v = λx (up to the canonical isomorphism) for some λ ∈
R and by linearity, we can assume that λ = 1, i.e. v = x (so we will replace x by v
below).
Consider the two parameter map x̃(t , s) = t (v + sw) in T p M and its image by the
exponential map in M
x(t , s) = expp x̃(t , s).
We have x̃ t (1, 0) = v v and x̃ s (1, 0) = w v , hence
x t (1, 0) = d v expp (v v ),
x s (1, 0) = d v expp (w v ).
The statement of the lemma is then
g (x t (1, 0), x s (1, 0)) = g (v, w).
By definition, the curve t → x(t , s) is a geodesic with initial velocity v + sw. Hence
x t t = 0 and g (x t , x t ) = g (v + sw, v + sw).
By Lemma 8.1, $x_{ts} = x_{st}$. Thus,
\[ \partial_t\, g(x_t, x_s) = g(x_t, x_{st}) = g(x_t, x_{ts}) = \tfrac{1}{2}\, \partial_s\, g(x_t, x_t). \]
Since $g(x_t, x_t) = g(v, v) + s^2 g(w, w) + 2s\, g(w, v)$, we have
\[ \tfrac{1}{2}\, \partial_s\, g(x_t, x_t)(t, 0) = g(w, v). \]
On the other hand, x(0, s) = expp (0) = p, for all s, so x s (0, 0) = 0 and g (x t , x s )(0, 0) =
0. Thus, we have
g (x t , x s )(t , 0) = t g (v, w),
which gives the lemma when evaluated at t = 1.
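The Gauss lemma can be tested numerically in a simple case. The Python sketch below takes M to be the unit sphere in R³ with its induced metric and p the north pole, for which the exponential map has the standard closed form used in `exp_p`. The differential is approximated by central differences, and the step size and sample vectors are arbitrary choices made for this sketch.

```python
import math

# Sketch: finite-difference check of the Gauss lemma on the unit sphere
# S^2 in R^3 at the north pole p. T_p S^2 is identified with R^2 and the
# metric at p with the Euclidean dot product. Step size h is an assumption.

P = (0.0, 0.0, 1.0)


def exp_p(x):
    """Exponential map at p: follow the great circle with initial velocity x."""
    a, b = x
    r = math.hypot(a, b)
    if r == 0.0:
        return P
    s = math.sin(r) / r
    return (s * a, s * b, math.cos(r))


def d_exp(x, w, h=1e-5):
    """Central-difference approximation of d_x exp_p (w)."""
    plus = exp_p((x[0] + h * w[0], x[1] + h * w[1]))
    minus = exp_p((x[0] - h * w[0], x[1] - h * w[1]))
    return tuple((u - v) / (2 * h) for u, v in zip(plus, minus))


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


x = (0.7, -0.4)   # base point in T_p S^2, x != 0
w = (0.3, 1.1)    # arbitrary second vector

# Radial first argument: both sides agree, as the lemma asserts.
lhs_radial = dot(d_exp(x, x), d_exp(x, w))
rhs_radial = dot(x, w)

# Non-radial first argument: the identity fails in general.
n = (0.4, 0.7)    # orthogonal to x
lhs_other = dot(d_exp(x, n), d_exp(x, n))
rhs_other = dot(n, n)
```

The second comparison illustrates why the exponential map is only a partial isometry: lengths of vectors orthogonal to the radial direction are in general not preserved.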
Denote by q̃ the line element at p, i.e. the map
\[ \tilde q : v \in T_pM \mapsto g_p(v, v). \]
Let $q = \tilde q \circ \exp_p^{-1}$. (In the following, we will often use an upper ˜ to denote objects
defined on T p M .)
Let U be a normal neighborhood at p and $\tilde U = \exp_p^{-1}(U)$.
At p, we can consider the level sets of q̃. These are called hyperquadrics. Since
we can choose coordinates such that g = η at p, we can identify the level sets of q̃
with the level sets of the line element of the Minkowski space. Note that since the
metric is non-degenerate, $d\tilde q_v \neq 0$ at any $v \neq 0$, so that the hyperquadrics $\tilde q^{-1}(c)$
are honest hypersurfaces of T p M if c ≠ 0, and if c = 0, then $\tilde q^{-1}(0) \setminus \{0\}$ is a (non-connected) hypersurface of T p M \ {0}.
Similarly, we can consider, for c ≠ 0,
\[ q^{-1}(c) = \{ r \in U : q(r) = c \}, \]
called local hyperquadrics. Now
\[ q(r) = \tilde q \circ \exp_p^{-1}(r) \]
for all r ∈ U , so if $v \in \tilde q^{-1}(c) \cap \tilde U$, then $q(\exp_p(v)) = c$, and $q^{-1}(c)$ is the image by the exponential map of $\tilde q^{-1}(c) \cap \tilde U$. The local null cone at p, denoted Λ(p), is by definition¹⁹
\[ \Lambda(p) = q^{-1}(0) \setminus \{p\}. \]
Since $\exp_p : \tilde q^{-1}(c) \cap \tilde U \to q^{-1}(c)$ by construction, for any v in $\tilde q^{-1}(c) \cap \tilde U$, we have
\[ d_v \exp_p : T_v\big(\tilde q^{-1}(c) \cap \tilde U\big) \to T_{\exp_p(v)}\, q^{-1}(c), \]
and similarly for the null cone and the local null cone.
Recall the definition of the position (also called scaling) vector field $\tilde P$ on T p M .
In coordinates,
\[ \tilde P(v) = v^\alpha \partial_{v^\alpha}, \]
for v ∈ T p M . Note that
\[ \operatorname{grad} \tilde q = 2\tilde P, \]
which implies that the position vector field is orthogonal to every hyperquadric
and is both tangent and orthogonal to the null cone q̃ −1 (0) \ {0}. Moreover, by construction, the position vector field is radial.
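The identity between the gradient of q̃ and twice the position vector field is easy to confirm in coordinates. The Python sketch below assumes that T p M has been identified with 4-dimensional Minkowski space with η = diag(−1, 1, 1, 1), and compares a finite-difference gradient of q̃ with 2v at one sample vector. The signature convention, the dimension and the sample vector are assumptions of the sketch.

```python
# Sketch: in Minkowski space with eta = diag(-1, 1, 1, 1), check by finite
# differences that the eta-gradient of q(v) = eta(v, v) equals 2v, i.e.
# twice the position vector field. Signature and sample are assumptions.

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal entries; eta is its own inverse


def q_tilde(v):
    return sum(e * c * c for e, c in zip(ETA, v))


def grad_q(v, h=1e-6):
    """Raise the index of dq with eta^{-1}: (grad q)^a = eta^{aa} d_a q."""
    grad = []
    for a in range(4):
        v_plus = list(v)
        v_plus[a] += h
        v_minus = list(v)
        v_minus[a] -= h
        dq_a = (q_tilde(v_plus) - q_tilde(v_minus)) / (2 * h)
        grad.append(ETA[a] * dq_a)
    return grad


v = (2.0, 0.5, -1.0, 1.5)          # a timelike vector: q_tilde(v) = -0.5
gradient = grad_q(v)
position = [2 * c for c in v]      # 2 * P(v)
```

Note that raising the index changes the sign of the time component of dq̃, which is exactly what makes the gradient point along v rather than along its time reflection.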
Using the exponential map, we can then define the local position vector field at
p as follows. Let q ∈ U and $v = \exp_p^{-1}(q)$. Then define P as
\[ P(q) = d_v \exp_p(\tilde P). \]
From the Gauss lemma, it follows that
19 The point p has to be removed so that these form regular submanifolds of M .
Corollary 8.1. The local position vector field P at p is orthogonal to every local hyperquadric of M at p. Furthermore P is both orthogonal and tangent to the null cone
q −1 (0) \ {p}. Finally, P is radial, that is tangent to every radial geodesic emanating
from p.
Proof. Let c ≠ 0 and consider a local hyperquadric $q^{-1}(c)$. Let X be tangent to $q^{-1}(c)$
at some point r . Let v ∈ T p M be such that $\exp_p(v) = r$ and let x ∈ T v T p M be such
that $d_v \exp_p(x) = X$. Then, by the Gauss lemma,
\[ g(P, X)(r) = g\big(d_v\exp_p(\tilde P),\, d_v\exp_p(x)\big) = g(\tilde P, x). \]
Now, since we have an isomorphism
\[ d_v \exp_p : T_v\, \tilde q^{-1}(c) \to T_{\exp_p(v)}\, q^{-1}(c), \]
it follows that x is tangent to $\tilde q^{-1}(c)$ and since $\tilde P$ is orthogonal to $\tilde q^{-1}(c)$, we have
g (P, X )(r ) = 0. We leave the rest as an exercise.
Corollary 8.2. Let D q denote the gradient vector field of q. Then, D q = 2P .
Proof. First, we have $D\tilde q = 2\tilde P$. Let v ∈ T U and let ṽ be such that $d\exp_p(\tilde v) = v$.
Then,
\[ g(Dq, v) = v(q) = \big(d\exp_p \tilde v\big)(q) = \tilde v\big(q \circ \exp_p\big) = \tilde v(\tilde q) = g(D\tilde q, \tilde v) = 2 g(\tilde P, \tilde v) = 2 g(P, v), \]
where in the last step we used the Gauss lemma.
8.3 Convex neighborhoods
Recall the concept of convexity in Euclidean geometry: a set S in a Euclidean
space is convex if and only if for all 0 ≤ s ≤ 1 and all points p, q ∈ S , p(1 − s) + sq ∈ S .
Let us try to translate this in an arbitrary semi-Riemannian manifold. Given a set
S and p, q ∈ S , the replacement for the curve s → p(1 − s) + sq should clearly be a
geodesic s → γ(s) from p to q. This motivates the following definition.
Definition 8.2. An open set C is convex provided it is a normal neighborhood of each
of its points.
In particular, for any two points p, q of C , there is a unique geodesic segment
σpq : [0, 1] → M from p to q that lies entirely in C (but contrary to the Euclidean or
Minkowskian case, there could well be others that leave C ).
The aim of this section will be to prove
Proposition 8.1. Let M be a semi-Riemannian manifold. Then for any p ∈ M ,
there exists a convex set C containing p.
Before we prove this proposition, let us introduce some extra material concerning the exponential map, viewed this time as a mapping defined on T M instead of
the individual tangent plane. More precisely, let π : T M → M be the canonical projection. Let D be the set of all vectors v ∈ T M , such that the curve γv defined as
before as the unique maximal geodesic starting at π(v) with initial tangent vector v
is defined on [0, 1] (i.e. its maximal time of existence is larger than 1). Define as well
Dp = D ∩ T p M . (Here we identify T p M with a subset of T M , which if T M is defined
as the disjoint union of the T p M , is the subset {p} × T p M ).
Lemma 8.3. D is open in T M and Dp is an open set of T p M that is starshaped around
0.
Proof. That D is open follows from standard ode theory (in particular continuous
dependence with respect to the initial data). That Dp is starshaped around 0 is a
consequence of the fact that γv (st ) = γsv (t ), for 0 ≤ s, t ≤ 1.
Let us now define E as
\[ E : \mathcal{D} \subset TM \to M \times M, \qquad v = (p, v_p) \mapsto \big(p\,(= \pi(v)),\ \exp_p(v_p)\big). \]
Recall that a smooth map between two manifolds is called non-singular at some
point if its differential at that point is injective²⁰ (one-to-one). We have
Lemma 8.4. If expp : Dp → M is non-singular at x ∈ Dp , then E : D → M × M is
nonsingular at x.
Proof. Suppose d E (v)(= d E x (v)) = 0 for v ∈ T x (T M ). We must show that v = 0. Let
π : T M → M denote the canonical projection and let π1 be the projection of M × M
on its first factor. Then π1 ◦ E = π, so that d πx (v) = d π1 (d E x (v)) = 0. It follows that v
is vertical, that is tangent to T p M , where p = π(x). But E |Tp M differs from expp only
by the obvious diffeomorphism {p} × M to M . Hence the differential of expp at x vanishes on v, and since expp is non-singular at x, we conclude that v = 0.
It then follows from the inverse function theorem and Proposition 3.7 that E
maps some open neighborhood of 0 ∈ T p M in T M diffeomorphically onto an open neighborhood of (p, p) ∈ M × M . We now proceed to the proof of Proposition 8.1.
Proof. Let ξ = (x α ) be a normal coordinate system on a neighborhood of p ∈ M . Let
N (δ) be the set of points q such that
\[ \sum_\alpha |x^\alpha(q)|^2 < \delta. \]
For δ small enough, this set defines a normal neighborhood of p which is diffeomorphic to an open ball. (Note that N (δ) is clearly open and contains p. Thus, it is a
neighborhood of p and clearly, $\exp_p^{-1}(N(\delta))$ is a neighborhood of 0 ∈ T p M . Using the
definition of normal coordinates, it is not hard to check that this neighborhood is
starshaped around 0.)
By the previous lemma, if δ is small enough, then E is a diffeomorphism from a
neighborhood N of 0 ∈ T p M in T M onto N (δ) × N (δ). Moreover, for δ small
enough, the tensor field whose components are given by $\delta_{\alpha\beta} - \Gamma^\gamma_{\alpha\beta}\,\delta_{\gamma\rho}\, x^\rho$ can be
assumed to be positive definite. We claim that it then follows that N (δ) is a normal
neighborhood of each of its points q ∈ N (δ).
20 A regular point on the other hand is one where the differential is onto.
Let N q = N ∩ T q M . By construction, E |Nq is a diffeomorphism onto {q} × N (δ),
hence [expq ]|Nq is a diffeomorphism onto N (δ). It remains to show that N q is starshaped about 0.
Let r ∈ N (δ), r ≠ q and let $v = E^{-1}(q, r)$. Then, v ∈ N q and $\sigma = (\gamma_v)|_{[0,1]}$ is a
geodesic from q to r . If σ lies in N (δ) then t v ∈ N q for all 0 ≤ t ≤ 1 and hence N q is
starshaped around 0. Assume σ leaves N (δ). For any k ∈ N (δ), let $n(k) = \sum_\alpha |x^\alpha(k)|^2$.
Then, since n(q) < δ and n(r ) < δ, the function n ◦ σ must have a local maximum at
some $t_0$ with $0 < t_0 < 1$. Writing $x^\alpha$ for $x^\alpha(\sigma)$ and using the geodesic equation, we
have
\[ \frac{d^2 (n \circ \sigma)}{dt^2} = 2\left( \delta_{\alpha\beta}\, x^\beta \frac{d^2 x^\alpha}{dt^2} + \sum_\alpha \left[ \frac{dx^\alpha}{dt} \right]^2 \right) = 2\left( \delta_{\alpha\beta} - \delta_{\gamma\rho}\, x^\rho\, \Gamma^\gamma_{\beta\alpha} \right) \frac{dx^\alpha}{dt} \frac{dx^\beta}{dt} > 0, \]
where we have used the geodesic equations to reach the last expression. This contradicts
the fact that $t_0$ is a maximum point.
If p and q are points of a convex set C and $\sigma_{pq}$ is the geodesic in C from p to q,
the displacement vector $\vec{pq}$ is by definition $\vec{pq} := \sigma_{pq}'(0) \in T_pM$.
Lemma 8.5. Let C be a convex open set. Then, the map ∆ : C × C → T M sending
(p, q) to $\vec{pq}$ is smooth.
Proof. Recall the map E : D → M × M . We have that D is open and that E is smooth.
In particular, since C × C is open in M × M , it follows that E −1 (C × C ) is open. One
easily checks that E −1 (C × C ) = ∆(C × C ) and then that ∆ is the inverse map of
E : ∆(C × C ) → C × C .
Exercise:
1. Consider S 1 ⊂ R2 . Find two convex sets whose intersection is non connected,
hence in particular non-convex.
2. Prove that if U ,V are two convex sets included into a convex set W , then U ∩V
is a convex set if non-empty.
Definition 8.3. A convex covering is a covering of M by convex open sets such that if
U and V are elements of the covering, then their intersection U ∩ V is convex if not
empty.
Let us end this section with the following lemma, whose proof we only sketch, the details being left as an exercise.
Lemma 8.6. Given any open covering C of M , there exists a convex covering R such
that each element of R is contained in some element of C . In other words, the convex
covering is a refinement of the original cover.
Proof. (sketch) Let C be an open covering of M . Recall that M is metrizable, so let
d be a metric on M which is compatible with the topology of M . For any x ∈ M ,
let N x be a convex neighborhood containing x. Shrinking N x if necessary, we may
assume that N x is bounded (for d ) and has compact closure. Then, let V x be a convex
neighborhood of x, contained in some element of C , such that diam(V x ) < (1/3) d (x, ∂N x ). The set of all such V x , x ∈
M , is then a covering of M , such that each V x is contained in some element of C and
is convex. Moreover, if V x ∩ V y ≠ ∅, assume wlog that diam(V x ) ≥ diam(V y ); then by
construction,
d (x, y) ≤ 2 diam(V x ) ≤ (2/3) d (x, ∂N x ).
Thus, y ∈ N x and then V y ⊂ N x . Thus, V x ∪ V y ⊂ N x . Since V x and V y are convex sets contained in the convex set N x , their intersection is convex by the exercise above.
8.4 Arc lengths
Definition 8.4. Let α be a piecewise smooth curve defined on a closed interval [a, b].
We define its arc length as
\[ L(\alpha) = \int_a^b |g(\alpha', \alpha')|^{1/2}\, ds. \]
A reparametrization function h : [c, d ] → [a, b] is a piecewise smooth function
such that either h(c) = a, h(d ) = b (h is orientation preserving), or h(c) = b, h(d ) = a
(h is orientation-reversing). If its derivative does not change sign, h is monotone.
Lemma 8.7. The length of a piecewise smooth curve segment is invariant under
monotone reparametrization. Moreover, if |g (α′, α′)| > 0, there is a strictly
increasing reparametrization function h such that β = α ◦ h has |g (β′, β′)| = 1.
A curve such that |g (α′, α′)| = 1 is said to have unit speed or arc length parametrization.
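The invariance of the arc length under monotone reparametrization can be observed numerically, also in Lorentzian signature. The Python sketch below assumes 2-dimensional Minkowski space with g = −dt² + dx² and the timelike curve α(t) = (sinh t, cosh t), which has unit speed, and compares its length on [0, 1] with that of β = α ◦ h for h(s) = s². The quadrature rule, the step sizes and the choice of h are assumptions of the sketch.

```python
import math

# Sketch: arc length in 2-dimensional Minkowski space, g = -dt^2 + dx^2,
# of the timelike curve alpha(t) = (sinh t, cosh t) on [0, 1], and of its
# reparametrization beta = alpha o h with h(s) = s^2. Midpoint quadrature
# and finite-difference velocities are illustrative choices.


def g(u, v):
    return -u[0] * v[0] + u[1] * v[1]


def alpha(t):
    return (math.sinh(t), math.cosh(t))


def beta(s):
    return alpha(s * s)  # h(s) = s^2 is monotone on [0, 1]


def length(curve, a, b, n=20000, h=1e-6):
    """Midpoint rule for the integral of |g(c', c')|^(1/2) over [a, b]."""
    total = 0.0
    step = (b - a) / n
    for k in range(n):
        s = a + (k + 0.5) * step
        p, m = curve(s + h), curve(s - h)
        vel = ((p[0] - m[0]) / (2 * h), (p[1] - m[1]) / (2 * h))
        total += math.sqrt(abs(g(vel, vel))) * step
    return total


L_alpha = length(alpha, 0.0, 1.0)
L_beta = length(beta, 0.0, 1.0)
```

Since g (α′, α′) = −1 along α, both values should be close to 1. The curve β does not have unit speed (its speed is 2s), yet its length is the same.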
Let U be a normal neighborhood of p ∈ M . The function
\[ q \mapsto r(q) = \big| g\big(\exp_p^{-1}(q),\, \exp_p^{-1}(q)\big) \big|^{1/2} \]
is the radius function at p.
Exercise: Let (x α ) be normal coordinates at p. Give an expression for r (q) in
terms of the coordinates of q.
Lemma 8.8. Let r be the radius function on a normal neighborhood U of p ∈ M . If σ
is the radial geodesic from p to q ∈ U , then L(σ) = r (q).
Proof. If $v = \sigma'(0)$, then $v = \exp_p^{-1}(q)$. Since $g(\sigma', \sigma')$ is constant along σ,
\[ L(\sigma) = \int_0^1 |g(\sigma', \sigma')|^{1/2} = |g(v, v)|^{1/2} = r(q). \]
8.5 The Riemannian case
We assume here that g is a Riemannian metric. Then, $v \mapsto \tilde r(v) = g(v, v)^{1/2}$ is a norm
on each tangent plane and, given a normal neighborhood U of p, the radius function at p,
\[ r = \tilde r \circ \exp_p^{-1}, \]
is smooth except at p. Let $\tilde q(v) = g(v, v)$ and $q = \tilde q \circ \exp_p^{-1}$ and denote by P the local
position vector field, i.e. $P = d\exp_p \tilde P$, with $\tilde P$ given in local coordinates by
\[ \tilde P = v^\alpha \partial_{v^\alpha}. \]
Then, for all $q = \exp_p(v)$,
\[ |g(P, P)|^{1/2}(q) = g(d\exp_p \tilde P, d\exp_p \tilde P)^{1/2}(q) = g(\tilde P, \tilde P)^{1/2} = g(v, v)^{1/2} = r(q). \]
Hence, U = P /r is the outward unit radial vector field (outward because r increases
along the integral curves of U , radial because P is radial) and U is normal to all the
hyperspheres at p, i.e. the hypersurfaces of constant r . Since $r = q^{1/2}$, it follows
from Corollary 8.2 that
\[ \operatorname{grad} r = \frac{P}{r} = U \]
on U \ {p} (the normal neighborhood with p removed).
Lemma 8.9. Let (M , g ) be a Riemannian manifold and U be a normal neighborhood
of p. If q ∈ U , then the geodesic σ : [0, 1] → U with σ(0) = p, σ(1) = q is the unique
(up to monotone reparametrization) shortest (piecewise smooth) curve in U from p
to q.
Remark 8.4. Note that here everything is constrained to stay in the neighborhood U .
Proof. We must prove that if α : [0, b] → U is another curve joining p to q, then
L(α) ≥ L(σ) with equality if and only if α = σ◦h for some monotone h. We will prove
it in the case of smooth curves for simplicity, but modulo easy modifications the
proof works for piecewise smooth curves.
First, restricting the vector field U = P /r to α, we have
\[ \alpha' = g(\alpha', U)\, U + N, \]
where N is a vector field on α orthogonal to U for all t > 0. At t = 0, the vector field
U is a priori not well defined, but since g (U ,U ) = 1, its components are uniformly
bounded in any local system of coordinates. (In fact, U can be extended continuously to $\alpha'(0) / \tilde r(\alpha'(0))$.)
Then,
\[ g(\alpha', \alpha')^{1/2} = \big[ g(\alpha', U)^2 + g(N, N) \big]^{1/2} \ge |g(\alpha', U)| \ge g(\alpha', U). \]
On the other hand, from grad r = U , we have
\[ g(\alpha', U) = \frac{d (r \circ \alpha)}{dt}. \]
Hence,
\[ L(\alpha) = \int_0^b g(\alpha', \alpha')^{1/2}\, ds \ge \int_0^b g(\alpha', U)\, ds = r(q) - r(p) = r(q) = L(\sigma). \]
Moreover, in the equality case, we have N = 0 for all t > 0 as well as $|g(\alpha', U)| = g(\alpha', U)$, which implies that
\[ \frac{d (r \circ \alpha)}{dt} \ge 0. \]
Thus, $\alpha' = \frac{d (r \circ \alpha)}{dt}\, U$, which implies that α is a monotone reparametrization of integral
curves of U . Since U is radial, the integral curves of U are geodesics passing through
p, so it follows that α is a monotone reparametrization of a geodesic from p to q,
which by uniqueness has to be σ. In fact, α(t ) = σ(r ◦ α(t )/r (q)).
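The local minimizing property can be illustrated on the unit sphere with the metric induced from R³. The Python sketch below compares the length of the radial geodesic from the north pole p to a point q at colatitude 1 with the length of a competing curve with the same endpoints. The competing curve, the value of the colatitude and the approximation of lengths by inscribed polygons are assumptions made for this sketch.

```python
import math

# Sketch: on the unit sphere with the metric induced from R^3, compare the
# length of the radial geodesic from the north pole p to a point q with the
# length of a competing curve with the same endpoints. The competitor and
# the polygonal length approximation are illustrative assumptions.

THETA_Q = 1.0  # colatitude of q; q lies in a normal neighborhood of p


def sphere(theta, phi):
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def sigma(t):
    """Radial geodesic from p (t = 0) to q (t = 1)."""
    return sphere(t * THETA_Q, 0.0)


def alpha(t):
    """Another curve from p to q: same colatitude, wobbling longitude."""
    return sphere(t * THETA_Q, 0.5 * math.sin(math.pi * t))


def polygonal_length(curve, n=4000):
    """Sum of chord lengths in R^3; approximates the arc length from below."""
    pts = [curve(k / n) for k in range(n + 1)]
    return sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))


L_sigma = polygonal_length(sigma)   # close to r(q) = THETA_Q
L_alpha = polygonal_length(alpha)   # strictly larger
```

In the notation of the proof, the excess length of the competing curve comes entirely from the component N of α′ orthogonal to U .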
8.5.1 The Riemannian distance
Definition 8.5. For any points p and q of a connected Riemannian manifold M , the
Riemannian distance from p to q is the infimum of all the lengths of all piecewise
smooth curves joining p to q.
If p ∈ M and ε > 0, we define $B_\varepsilon(p)$ to be the open ball of radius ε for the Riemannian
distance d . (Note that for the moment, we do not know that this is actually open for
the topology of M .)
Proposition 8.2. Let p ∈ M .
1. For any ε > 0 sufficiently small, $B_\varepsilon(p)$ is normal (in particular open).
2. For q in a normal neighborhood $B_\varepsilon(p)$, the radial geodesic from p to q is the
unique shortest curve in M from p to q. In particular,
L(σ) = r (q) = d (p, q).
Remark 8.5. The point of the second item above is that we are considering all possible
curves in M , not only those constrained to stay in the normal neighborhood.
Proof. Let V be a normal neighborhood of p ∈ M and $U = \exp_p^{-1}(V)$. For ε > 0 sufficiently small, U contains the starshaped open set
\[ \tilde B_\varepsilon(0) = \{ v \in T_pM : g(v, v)^{1/2} < \varepsilon \}. \]
Thus, $N := \exp_p(\tilde B_\varepsilon(0))$ is also a normal neighborhood of p and by construction $N \subset B_\varepsilon(p)$. Moreover, if q ∈ N , then by the previous lemma, the radial geodesic σ from p
to q is the unique shortest curve in N from p to q and L(σ) = r (q). Since L(σ) < ε, it
suffices to prove that if α is a curve in M starting at p and leaving N , then L(α) ≥ ε.
In particular, this shows that if q ∉ N , then d (p, q) ≥ ε and hence that $N = B_\varepsilon(p)$. To
prove the claim, note that since α leaves N , it meets every sphere S(a) of constant r
equal to a, for all 0 < a < ε. If $\alpha_a$ is the shortest initial segment of α from p to S(a), then
$\alpha_a$ lies in N . Hence $L(\alpha_a) \ge a$ by the previous lemma. Since this holds for all a < ε,
the claim is proved.
Proposition 8.3. For a connected Riemannian manifold M , the Riemannian distance function d : M × M → R defines a metric on M , i.e. it satisfies
1. for all p, q ∈ M , d (p, q) = 0 if and only if p = q.
2. for all p, q ∈ M , d (p, q) = d (q, p),
3. for all p, q, r ∈ M , d (p, q) ≤ d (p, r ) + d (r, q).
Moreover, d is compatible with the topology of M .
Proof. The symmetry property is trivial, and clearly d (p, p) = 0. Assume that p ≠ q. Since M is Hausdorff, there is a normal neighborhood of p that does not contain q and from the
(proof of the) previous proposition, there exists an open ball around p of radius ε > 0
that does not contain q. Thus, d (p, q) ≥ ε > 0.
For the triangle inequality, let ε > 0 and choose α a curve from p to q and β a
curve from q to r , such that L(α) < d (p, q) + ε and L(β) < d (q, r ) + ε. Joining α and β
gives a curve from p to r of length L(α) + L(β) < d (p, q) + d (q, r ) + 2ε. In other words, the triangle inequality
follows from the definition of infimum of a set and the fact that given curves from
p to q and q to r , we can get a curve from p to r whose length is the sum of the
previous lengths.
Finally, let p ∈ M and consider $q \in B_r(p)$, r > 0. We have d (p, q) < r and so, by the
triangle inequality, for any r′ such that 0 < r′ < r − d (p, q), the ball $B_{r'}(q)$ is contained
in $B_r(p)$. Since for r′ small enough, we know from the previous proposition that
$B_{r'}(q)$ is open, it follows that $B_r(p)$ is open. Conversely, we have already proven that
any open set for the topology of M contains a small enough ball around any point of
that set. Thus, the two topologies coincide.
A minimizing segment from p to q is by definition a curve from p to q that realizes the Riemannian distance, i.e. whose length is d (p, q). Note that there could be several
such segments, or none. One has
Corollary 8.3. A minimizing segment α from p to q is a monotone reparametrization
of a smooth geodesic segment from p to q.
Proof. The domain I of α can be decomposed into subintervals $I_i$ such that each
$\alpha_i = \alpha|_{I_i}$ lies in a convex open set. Since α is minimizing, each $\alpha_i$ is also minimizing
(the restriction of a minimizing segment is a minimizing segment). Since each $\alpha_i$
lies in a convex set, it follows that it is a monotone reparametrization of a geodesic $\sigma_i$,
which we can assume to be of unit speed. Joining these gives a possibly broken
geodesic σ from p to q and joining the reparametrizations gives that α is a monotone
reparametrization of σ. Moreover, we claim that in fact σ is unbroken, i.e. it is a
smooth geodesic. For, consider a geodesic segment $\sigma_i$ ending at r and a geodesic
segment $\sigma_{i+1}$ starting at r . Since α is minimizing, σ is minimizing. Thus, the
restriction of σ to any sufficiently small neighborhood of r is minimizing. But this
implies that this restriction is a monotone reparametrization of a smooth geodesic.
Since we had assumed σ to be parametrized by arc length, it follows that in fact σ is a
smooth geodesic, which concludes the proof.
8.5.2 The Hopf-Rinow Theorem
We prove here
Theorem 8.1 (Hopf-Rinow). For a connected Riemannian manifold M the following
are equivalent
(MC) (M , d ) is a complete metric space, i.e. every Cauchy sequence is convergent.
(C1) There exists a point p ∈ M such that every geodesic starting at p is defined on R.
(C) M is geodesically complete.
(HB) Every closed bounded subset of M is compact.
With the notations of the theorem, we start by proving,
Lemma 8.10. If (C1) holds, then for any q ∈ M , there exists a minimizing geodesic
segment from p to q.
Proof. Consider a normal ball $B_\varepsilon(p)$, where ε > 0 is small enough. If $q \in B_\varepsilon(p)$, then
the claim is trivial, thus we assume that $q \notin B_\varepsilon(p)$. Let r be the radius function at p.
For 0 < δ < ε, the level set r = δ is an (n − 1)-dimensional sphere in $B_\varepsilon(p)$ denoted
$S_\delta$. In particular, $S_\delta$ is compact. The function s → d (s, q) is continuous on $S_\delta$, hence
reaches its minimum at some point $m \in S_\delta$. We claim that d (p, m) + d (m, q) = d (p, q).
Let α : [0, b] → M be any curve from p to q. The function τ : s → d (α(s), p) is
continuous and satisfies τ(0) = 0, τ(b) = d (p, q) ≥ ε > δ. By the intermediate value theorem, there exists an a such that d (α(a), p) = δ. By definition, $\alpha(a) \in B_\varepsilon(p)$ and in
fact $\alpha(a) \in S_\delta$. Moreover, 0 < a < b. Let $\alpha_1$ and $\alpha_2$ be the restrictions of α to [0, a] and
[a, b] respectively. Then,
\[ L(\alpha_1) \ge \delta = d(p, m) \]
by the minimizing property of radial geodesics and
\[ L(\alpha_2) \ge d(\alpha(a), q) \ge d(m, q) \]
by the definition of m.
Thus,
\[ L(\alpha) = L(\alpha_1) + L(\alpha_2) \ge d(p, m) + d(m, q). \]
Hence,
\[ d(p, q) \ge d(p, m) + d(m, q). \]
The reverse inequality is just the triangle inequality, so the claim is proven.
Now, let γ : [0, +∞) → M be the unit speed geodesic whose initial segment runs
radially from p to m (it is defined on [0, +∞) by (C1)). Let d = d (p, q) and let T be the set of t ∈ [0, d ] such that
\[ t + d(\gamma(t), q) = d. \]
It suffices to show that d ∈ T . For, it then follows that d (γ(d ), q) = 0, which implies
that γ(d ) = q. Since γ has unit speed, $L(\gamma|_{[0,d]}) = d$, so that $\gamma|_{[0,d]}$ is indeed a minimizing geodesic from p to q. In fact, $\gamma|_{[0,t]}$ is minimizing for any t ∈ T , since its length is
t and
\[ d \le d(p, \gamma(t)) + d(\gamma(t), q) = d(p, \gamma(t)) + d - t, \]
which implies that d (p, γ(t )) = t . By continuity, T is closed. Moreover T is non-empty
by the first part of the proof. Thus, it contains a largest number $t_0 \le d$. Assuming
$t_0 < d$, we deduce a contradiction as follows. In a normal neighborhood U′ of $\gamma(t_0)$,
repeating the first part of the proof produces a unit speed radial geodesic σ : [0, δ′] →
U′ from $\gamma(t_0)$ to m′ ∈ U′ such that
\[ \delta' + d(m', q) = d(\gamma(t_0), q). \]
Moreover, because $t_0 \in T$, we have
\[ t_0 + \delta' + d(m', q) = d. \]
Since d ≤ d (p, m′) + d (m′, q), we must have $t_0 + \delta' = d(p, m')$. Thus, the sum of
$\gamma|_{[0,t_0]}$ and σ minimizes, so from the above Corollary and the fact that they have unit
speed (so the parametrization is somewhat fixed), the sum is a smooth minimizing
geodesic. This means that $m' = \sigma(\delta') = \gamma(t_0 + \delta')$. This implies that
\[ t_0 + \delta' + d(\gamma(t_0 + \delta'), q) = d(p, m') + d(m', q) = d, \]
hence $t_0 + \delta' \in T$, a contradiction.
In the proof of the Hopf-Rinow theorem, we will need the following notion.
Definition 8.6. A piecewise smooth curve α : [0, B ) → M is extendible provided it has
a continuous extension α̃ : [0, B ] → M . Then q = α̃(B ) is called an endpoint of α.
Equivalently, there exists a q ∈ M such that for every sequence $s_i \to B$, $\alpha(s_i) \to q$.
An extendible curve need not have a piecewise smooth extension. However, it does in
the special case of integral curves, as a consequence of the Cauchy-Lipschitz theorem
and the standard continuation criterion.
Exercises:
1. With the above notation, prove that if α is a geodesic, then it is extendible if
and only if it is extendible as a smooth geodesic.
2. Let X be a vector field on M that never vanishes (equivalently its flow has no
fixed point). Prove that the maximal integral curves of X are inextendible. In
particular, the maximal integral curves of a globally causal vector field are all
inextendible.
Proof of the Hopf-Rinow Theorem 8.1.
(MC) =⇒ (C). Let γ : [0, b) → M be a unit speed geodesic. Assume that b < +∞. If $(t_i)$ is a sequence
in [0, b) converging to b then $(\gamma(t_i))$ is Cauchy, since
\[ d(\gamma(t_i), \gamma(t_j)) \le |t_i - t_j|. \]
Hence, $(\gamma(t_i))$ converges to some q ∈ M . Moreover, if $(s_i)$ is another such sequence, then we still have $d(\gamma(t_i), \gamma(s_j)) \le |t_i - s_j|$, so that the limit point is
unique. It follows that γ is continuously extendible and it follows from the
above exercise that in fact it is extendible as a geodesic. Hence, every maximal
geodesic is defined on (−∞, +∞), i.e. M is geodesically complete.
(C) =⇒ (C1). Trivial.
(C1) =⇒ (HB). Let A be a closed bounded subset of M and let p be as in (C1). By the preceding
lemma, if q ∈ A, there exists a minimizing geodesic $\sigma_q$ defined on [0, 1] from
p to q. Then $g(\sigma_q'(0), \sigma_q'(0))^{1/2} = L(\sigma_q) = d(p, q) < r$, for some r ∈ R independent of q. In other words, $\sigma_q'(0)$ lies in the compact ball $B_r$ of T p M . Since
$\exp_p(B_r)$ is compact and contains the closed set A, A is compact.
(HB) =⇒ (MC). Always true in metric spaces, so left as an exercise.
As a consequence of the Hopf-Rinow theorem (and the lemma), we have
Corollary 8.4. If a connected Riemannian manifold is complete then any two of its
points can be joined by a minimizing geodesic segment.
Corollary 8.5. Any compact Riemannian manifold is complete.
Proof. Property (HB) holds trivially in this case.
Finally, let us recall Whitney’s embedding theorem²¹ and obtain yet another corollary of the Hopf-Rinow theorem (combined with Whitney’s embedding theorem).
21 Actually, Whitney proved that M can be embedded into $R^{2n}$, but the 2n + 1 version is easier to prove
and we do not care here about the dimension of the target space.
Theorem 8.2. Let M be a smooth connected manifold of dimension n. Then, there
exists an embedding of M into R2n+1 such that the image under the embedding is a
closed subset.
Remark 8.6. For a proof, see for instance Bredon, Topology and Geometry.
Corollary 8.6. Let M be a smooth connected manifold. Then, there exists a complete
Riemannian metric on M .
Proof. Let $\varphi : M \to R^{2n+1}$ denote an embedding as in Whitney’s theorem and give
M the pullback metric by the embedding. Let $d_M$ be the corresponding Riemannian
distance. Then, closed and bounded subsets of M with respect to $d_M$ are easily seen to be
compact (use for instance that $d_{R^{2n+1}}(\varphi(p), \varphi(q)) \le d_M(p, q)$ for p, q ∈ M ) and property
(HB) is satisfied.
For an alternative, more constructive proof, see Ringström’s book.
8.6 The Lorentzian case
Many (most) of the previous properties proved in the case of Riemannian manifolds
do not hold for Lorentzian manifolds. As a start, g does not define a norm on each
tangent space, so in particular, for any ε > 0, the set of all v such that |g (v, v)| ≤ ε
is non-compact. The bottom line is that, yes, there are analogies between the Riemannian and Lorentzian cases (for instance, we will soon see a local maximizing
property of timelike geodesics analogous, including the proof, to the minimizing
property of geodesics in Riemannian geometry), but overall Lorentzian manifolds
have their own, more complicated, geometry.
We now assume for the rest of this section that (M , g ) is a smooth Lorentzian
manifold.
8.6.1 Timecone and time-orientability
Recall that 0 ≠ v ∈ T p M is called spacelike, null or timelike depending on whether
g p (v, v) > 0, = 0 or < 0 and that the set of causal vectors is the union of the null and
timelike vectors. Similarly, submanifolds are also called spacelike, null or timelike
depending on whether the induced metric is Riemannian, degenerate or Lorentzian.
In particular, this applies to curves. Moreover, if z ∈ T p M is timelike, then z ⊥ is
spacelike (in particular, by convention, 0 ∈ T p M is spacelike).
At each p ∈ M , the set of timelike vectors forms two disjoint cones, referred to as
timecones.
More precisely, if Tp denotes the set of all timelike vectors at some p ∈ M , we
say for v, w ∈ Tp that v C w if and only if g (v, w) < 0. Clearly, C is a symmetric binary relation since g is symmetric. That C is reflexive follows from the definition
of a timelike vector. Moreover, if g (v, w) < 0 and g (w, x) < 0, then since w is timelike, it follows that w ≠ 0, and we can find an orthonormal basis $(e_i)$ such that $w = |g(w, w)|^{1/2} e_0$ where $g(e_0, e_0) = -1$. Then, g (v, w) < 0 implies that $v = v^0 e_0 + v^i e_i$,
with $v^0 > 0$ and $v^0 = |v^0| > (v^i v^j \delta_{ij})^{1/2}$ since v is timelike and similarly for x. Then,
\[ g(x, v) = -v^0 x^0 + v^i x^j \delta_{ij} < 0, \]
where the inequality follows from the Cauchy-Schwarz inequality, so that C is transitive.
Since for timelike vectors v, w either g (v, w) < 0 or g (v, w) > 0, there are two and
only two equivalence classes, the two disjoint timecones.
Since moreover these sets are clearly open and connected, Tp has two connected
components.
Finally, it is easy to see that the two timecones are convex sets (here convexity is
the usual notion for subsets of vector spaces) and that the map v ∈ Tp → −v maps
one into the other.
For a piecewise smooth curve, we say that it is timelike if its tangent vector at every non-break point is timelike and if all its half-tangents (left and right ones) at break points are
timelike and point in the same timecone. Similarly for a null or causal curve. Note
that by definition, a causal curve is a "regular" curve, in the sense that its tangent
vector at every non-break point is nonzero, and so are all its half-tangents.
The reverse Cauchy-Schwarz inequality:
Lemma 8.11. Let v, w be timelike vectors. Then,
\[ (g(v, w))^2 \geq |g(v, v)| \, |g(w, w)|. \]
Proof left as an exercise.
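As a sanity check (not a proof), the timecone relation and the reverse Cauchy-Schwarz inequality can be tested numerically in Minkowski space. The sketch below assumes the flat metric with signature (−, +, .., +) in an orthonormal basis; the helper names (`g`, `same_timecone`, `random_timelike`, `check_reverse_cauchy_schwarz`) are hypothetical and introduced only for illustration.

```python
import random

def g(v, w):
    """Minkowski inner product, signature (-, +, ..., +), in an orthonormal basis."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def is_timelike(v):
    return g(v, v) < 0

def same_timecone(v, w):
    """The relation of the text: timelike v, w are related iff g(v, w) < 0."""
    return g(v, w) < 0

def random_timelike(rng, dim=4):
    """Draw a timelike vector by letting |v^0| dominate the spatial norm."""
    spatial = [rng.uniform(-1.0, 1.0) for _ in range(dim - 1)]
    norm = sum(s * s for s in spatial) ** 0.5
    sign = rng.choice([-1.0, 1.0])
    return [sign * (norm + rng.uniform(0.1, 1.0))] + spatial

def check_reverse_cauchy_schwarz(samples=1000, seed=0):
    """Test g(v, w)^2 >= |g(v, v)| |g(w, w)| on random timelike pairs."""
    rng = random.Random(seed)
    for _ in range(samples):
        v, w = random_timelike(rng), random_timelike(rng)
        lhs = g(v, w) ** 2
        rhs = abs(g(v, v)) * abs(g(w, w))
        if lhs < rhs * (1 - 1e-12):  # small slack for rounding
            return False
    return True
```

In this flat model, two timelike vectors lie in the same timecone exactly when their time components have the same sign, which is what `same_timecone` detects through the sign of g(v, w).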
A priori, there is no intrinsic way to distinguish one timecone from the other. To
remedy this, we need an extra structure.
Definition 8.7. (M , g ) is said to be time-orientable if there exists a global timelike
vector field T on M . A choice of time-orientation is the prescription of such a global
timelike vector field T . We then say that (M , g ) is time-oriented by T .
Remark 8.7. The standard notion of orientation for manifolds and the notion of
time-orientation for a Lorentzian manifold are unrelated.
Remark 8.8. As for the standard orientation, every Lorentzian manifold admits a
double cover22 that is time-orientable.
In the remainder, all the Lorentzian manifolds that we consider will be time-orientable. In fact, we’ll give a name to those.
Definition 8.8. A spacetime is a triple (M , g , T ) where (M , g ) is a connected Lorentzian
manifold (of dimension n ≥ 2) and T is a global timelike vector field on M .
For simplicity, on top of being time-orientable, we will also assume that our
spacetimes are orientable in the rest of the lectures. (Thus, for us, all spacetimes
will be orientable by assumption, and we may omit to mention it later.)
A causal vector X is called future-oriented if g(X, T) < 0, i.e. X and T point in the
same timecone (if X is null, then X does not belong to the timecone of T but lies on its boundary). Similarly
for past-oriented, and again curves inherit these labels if their tangent vectors at each
point have them.
We have easily,
Proposition 8.4. Let (M , g ) be a spacetime. Then, there exists a globally defined timelike vector field on M whose flow is defined on M × R.
22 i.e. a smooth map k : M̃ → M , where M̃ is a smooth manifold, such that each p ∈ M has a connected
neighborhood U such that k^{-1}(U) has two connected components and k maps each of them diffeomorphically onto U.
Proof. Since M is time-oriented, there exists a global timelike vector field T_1. Let
h be a complete Riemannian metric on M and let T = ||T_1||_h^{-1} T_1, so that T has unit length for h. Then, T is complete.
Indeed, let γ be an integral curve of T defined on (t_−, t_+). If t_+ < ∞, then the integral
curve restricted to [a, t_+) (a ∈ (t_−, t_+)) is contained in a compact set, since it has
bounded length for the Riemannian metric h. Hence, the curve is extendible through
t_+, a contradiction. The same argument applies to t_−.
8.6.2 Local Lorentzian geometry
We have
Lemma 8.12. Let p ∈ M and β : [0, b] → T p M a piecewise smooth curve at p such that
β(0) = 0, α = expp β is well defined on [0, b] and is timelike. Then, for all t ∈]0, b], β(t )
remains in a single timecone of T p M .
Remark 8.9. Recall that, for t ∈ [0, b], expp β(t ) is well defined if the geodesic γβ(t )
starting at p and with initial tangent vector β(t ) is defined on [0, 1].
Proof. Note first that if β(t) is timelike for all t, then β must stay in a single
timecone (since there are only two timecones and they are disconnected from each other).
In the following, we write that property P holds initially for “there exists ε > 0
such that for all t ∈ ]0, ε], property P holds”. Assume first that β is smooth. Since
β'(0) (= α'(0)) is timelike and timecones are convex open sets, initially β is in a single
timecone C. Note that the position vector field P̃ on T_p M is by definition radial
and points in the same direction as β; in particular, it is timelike in C. Thus, initially
g(β', P̃) < 0. Since D q̃ = 2P̃, we have
\[ \frac{d}{dt} (\tilde q \circ \beta) = 2 g(\beta', \tilde P), \]
and initially d/dt (q̃ ∘ β) is strictly negative.
By the Gauss lemma,
\[ g(\alpha', P) = g(\beta', \tilde P), \]
and so initially, g(α', P) < 0.
Consider the set E of all τ ∈ ]0, b] such that, for all 0 < t ≤ τ,
\[ \frac{d}{dt} (\tilde q \circ \beta)(t) \leq 0. \]
By construction, E is closed in ]0, b] and it is non-empty by the above. Let a = sup E. By the above, we know that a > 0. Moreover, in ]0, a[, q̃ ∘ β < 0, so β is
timelike and thus P̃_β (the position vector field at β) is timelike and thus P_α (by the
Gauss lemma) is timelike. Hence, by the above, g(α', P), g(β', P̃) and then d/dt (q̃ ∘ β)
are all strictly negative. Hence, q̃ ∘ β is decreasing and therefore strictly negative. It
follows that β is timelike at a and thus again that d/dt (q̃ ∘ β)(a) < 0. Thus E is open and
hence E = ]0, b]. Hence β is timelike for all t and therefore stays in a single timecone.
Assume now that β is merely piecewise smooth. We know from the first step that on its
first smooth segment β stays in C. Thus at the first break t_1, g(α'(t_1^−), P_1) < 0, where
P_1 = d exp_p(P̃_{β(t_1)}). Since α is timelike, α'(t_1^+) and α'(t_1^−) must point in the same
timecone, and thus g(α'(t_1^+), P_1) < 0. It then follows that d/dt (q̃ ∘ β) does not change
sign at breaks and the rest of the argument follows similarly.
Exercise: prove the analogue of the above lemma, replacing timelike and timecone by causal and causal cone.
Lemma 8.13. Let (M , g ) be a Lorentzian manifold and U be a normal neighborhood
of p. If k ∈ U and there exists a timelike curve from p to k in U , then the radial
geodesic segment σ from p to k is the unique (up to reparametrization) longest timelike curve in U from p to k.
Proof. Let α be a timelike curve in U from p to k. Let r be the radius function of M
at p, which we recall is defined as
\[ r(k) = \left| g\left( \exp_p^{-1}(k), \exp_p^{-1}(k) \right) \right|^{1/2}. \]
In other words, r(k) gives the norm of the tangent vector to the geodesic going from p
to k in an affine time of 1. In yet other words, r(k) gives the norm of the position
vector field at k. Equivalently, r(k) = L(σ).
Note that r is smooth except at p and on the local null cone of p.
By the previous lemma, β = exp_p^{-1} ∘ α lies in a single timecone C. Hence, except at
t = 0, α lies in a region on which U = P/r is a unit timelike vector field. In particular,
U is timelike at k, so σ is timelike at k and hence timelike everywhere. Let now
\[ \alpha' = -g(\alpha', U) U + N, \]
where N is a (spacelike by reverse Cauchy-Schwarz) vector field on α orthogonal to
U. Then,
\[ |g(\alpha', \alpha')|^{1/2} = \left[ (g(\alpha', U))^2 - g(N, N) \right]^{1/2} \leq |g(\alpha', U)|. \]
Since D q = 2P and r = (−q)^{1/2}, we have Dr = −P/r = −U. By the previous lemma,
g(β', P̃), and hence g(α', U), is negative, so that
\[ |g(\alpha', U)| = -g(\alpha', U) = \frac{d}{dt} (r \circ \alpha). \]
Consequently,
\[ L(\alpha) = \int |g(\alpha', \alpha')|^{1/2} \, dt \leq r(k) = L(\sigma). \]
If the lengths are equal, then N = 0, which implies that α is a monotone reparametrization of σ.
Again, this lemma holds for piecewise smooth curves (again assuming that, by
definition, piecewise smooth timelike curves have their half-tangents at break points
pointing in the same timecone).
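The maximizing property can be illustrated in the simplest setting, 1+1 Minkowski space, where radial geodesics are straight lines. The sketch below is only an illustration under that assumption: points are (t, x) pairs, and the helper names `segment_length` and `polyline_length` are hypothetical. It compares the straight timelike segment from p to k with a broken timelike competitor whose half-tangents point in the same timecone.

```python
import math

def segment_length(p, q):
    """Lorentzian length of the straight segment from p to q in 1+1 Minkowski space.

    Points are (t, x) pairs; the segment is required to be timelike.
    """
    dt, dx = q[0] - p[0], q[1] - p[1]
    interval = dt * dt - dx * dx  # equals -g(q - p, q - p)
    if interval <= 0:
        raise ValueError("segment is not timelike")
    return math.sqrt(interval)

def polyline_length(points):
    """Length of the piecewise-linear timelike curve through the given points."""
    return sum(segment_length(a, b) for a, b in zip(points, points[1:]))

p, k = (0.0, 0.0), (2.0, 0.0)
radial = polyline_length([p, k])               # the geodesic sigma
broken = polyline_length([p, (1.0, 0.5), k])   # a competitor with one break
```

Here the broken curve is strictly shorter than the radial segment, in agreement with Lemma 8.13 (this is the computation behind the so-called twin paradox).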
9 Deformation of curves
Definition 9.1. A (smooth) variation of a curve segment α : [a, b] → M is a two-parameter (smooth) mapping
\[ x : [a, b] \times (-\delta, \delta) \to M, \qquad (u, v) \mapsto x(u, v), \]
such that x(u, 0) = α(u) for u ∈ [a, b].
The u-parameter curves of a variation are called longitudinal and the v-parameter curves transverse. The vector field V on α given by V(u) = x_v(u, 0) is called the
variation vector field of α.
If x(a, v) = α(a) and x(b, v) = α(b) for all v ∈ (−δ, δ), we say that x is a variation
with fixed endpoints (in which case V must vanish at the endpoints).
One easily has
Lemma 9.1. Given any curve α : [a, b] → M and V a vector field on α, there exists a
variation of α whose variation vector field is V . Moreover, if V vanishes at the endpoints, there exists a fixed endpoints variation of α whose variation vector field is V .
Proof. Just take x(u, v) = exp_{α(u)}(v V(u)). Since [a, b] is compact, there exists a δ > 0 such
that x is well-defined on [a, b] × (−δ, δ).
9.1 From causal to timelike curves
Lemma 9.2. Let α be a causal curve segment in a Lorentzian manifold and let x be
a variation of α with variation vector field V. If g(V', α') < 0, then for all sufficiently
small v > 0, the longitudinal curve α_v of x is timelike.
Proof. Since α is causal,
\[ g(x_u, x_u)(u, 0) = g(\alpha', \alpha') \leq 0. \]
But α is defined on a closed interval [a, b] and
\[ \partial_v g(x_u, x_u)|_{v=0} = 2 g(x_{uv}, x_u)|_{v=0} = 2 g(x_{vu}, x_u)|_{v=0} = 2 g(V', \alpha') < 0. \]
Thus, by compactness of [a, b], if v > 0 is sufficiently small, then g(x_u, x_u) < 0 for all u.
By gluing two smooth curves, one often ends up with a piecewise smooth curve.
Thus, it seems natural to consider piecewise smooth variations of piecewise smooth
curves.
Definition 9.2. A variation x of a piecewise smooth curve α : [a, b] → M is said to be
piecewise smooth if x is continuous and, for breaks
\[ a = u_0 < u_1 < \dots < u_k < u_{k+1} = b, \]
the restriction of x to each set [u_{i−1}, u_i] × (−δ, δ) is smooth.
A priori, x and α could have different breaks, but we can always add trivial breaks
to x and α (i.e. add points to the partition of [a, b] corresponding to α or x where
actually there is no regularity issue).
Note that x_v is then smooth on each [u_{i−1}, u_i] × (−δ, δ). In particular, the variation vector field V : u → x_v(u, 0) is continuous^23 on each [u_{i−1}, u_i] and hence on
[a, b]. On the other hand, the velocity vector field α' will generally have a discontinuity at each break u_i. The discontinuity is measured by the tangent vector
\[ \Delta\alpha'(u_i) = \alpha'(u_i^+) - \alpha'(u_i^-) \in T_{\alpha(u_i)} M. \]
23 Note that here V is given one value at the break points. This is because, since we require the restriction of x to [u_{i−1}, u_i] × (−δ, δ)
to be smooth, we have V(u_i) = ∂_v x(u_i, v)|_{v=0}, i.e. we can take the partial derivative after evaluation at u_i.
Exercise:
1. Prove that Lemma 9.2 is valid for piecewise smooth curves (with the usual hypothesis on the half-tangents at each break point).
2. Prove that the conclusion of Lemma 9.2 still holds if g(V', α') ≤ 0 everywhere and g(V', α') is strictly
negative whenever α' is null.
This implies the following result.
Lemma 9.3. In a Lorentzian manifold, if α is a causal curve that is not a null pregeodesic (a monotone reparametrization of a geodesic) from p to q, then there exists
a timelike curve from p to q arbitrarily close to α.
Remark 9.1. Here arbitrarily close means "constructed out of a fixed endpoints variation of α" so that the longitudinal curves are all timelike for v > 0. In particular, given
any open set containing α, one can choose such a timelike curve lying in the open set.
Proof. Without loss of generality, assume that α is defined on [0, 1]. Assume first that
α'(1) is timelike. Let W be the vector field obtained by parallel translation of α'(1)
along α. Then, W and α' are always in the same causal cone, and since W is timelike,
g(W, α') < 0. Since α'(1) is timelike, by continuity, there exist δ, ε > 0 such that
g(α', α') < −δ on [1 − ε, 1]. Let f be any smooth function on [0, 1] that vanishes at the
endpoints and such that f' > 0 on [0, 1 − ε]. Set V = f W; then g(V', α') = f' g(W, α')
is negative on [0, 1 − ε]. Let x be a fixed endpoint variation with variation vector field
V. Then for v > 0 small enough, the longitudinal curve is timelike on [0, 1 − ε] by the
previous lemma, and on [1 − ε, 1] simply by continuity. Now if α is timelike at 0, the
above proof can be repeated, and if α is timelike at a non-endpoint s, then we can
apply the above result on [0, s] and [s, 1]. (Note that if α is regular at s, then the above
construction yields a timelike curve that is regular (no break point) at s.)
Thus, we are left with the case where α is a piecewise smooth null curve. Assume first that α is a smooth null curve. Differentiating g(α', α') = 0, we obtain that
g(α'', α') = 0. Moreover, α'' cannot be collinear to α' everywhere on α, otherwise
(cf. Exercise) α could be reparametrized to be a null geodesic. Thus the function
g(α'', α'') is not identically zero, since orthogonal null vectors are necessarily collinear. Let W be a parallel timelike vector field on α in the same causal cone as α' at
each point, i.e. g(W, α') < 0. Let V = f W + h α'', where f and h are functions vanishing at the endpoints that will be determined below so that g(V', α') < 0. From
g(α'', α') = 0, we have
\[ g(\alpha''', \alpha') = -g(\alpha'', \alpha''), \]
and hence
\begin{align*}
g(V', \alpha') &= f' g(W, \alpha') + h' g(\alpha'', \alpha') + h g(\alpha''', \alpha') \\
&= f' g(W, \alpha') - h g(\alpha'', \alpha'') \\
&= g(W, \alpha') (f' - hk),
\end{align*}
where
\[ k := \frac{g(\alpha'', \alpha'')}{g(W, \alpha')}. \]
Since k is not identically zero, there exists a smooth function h vanishing at the
endpoints such that
\[ \int_0^1 k h \, du = -1. \]
Let now f(u) = \int_0^u (kh + 1) du. Then, f vanishes at the endpoints, and f' = kh + 1 > kh.
Consequently g(V', α') < 0.
Now, if α is a piecewise smooth null curve, any non-pregeodesic segment can be
deformed to a timelike curve, and then by the first argument we can deform the
whole curve to a timelike curve. Hence, we are left with the case of a piecewise smooth null
curve for which each segment is a pregeodesic. By reparametrization, we may even
assume that each segment is a null geodesic, and without loss of generality, we can
consider only the case of two segments. Let us thus assume that α is a null geodesic
on [0, s] and [s, 1] with a break at s. Let W be the vector field obtained by parallel
transport along α of ∆α'(s) = α'(s^+) − α'(s^−). We have g(W, α'(s^−)) = g(α'(s^+), α'(s^−)) <
0 because at breaks, we have assumed by definition that half-tangents point in
the same causal cone, and because if g(α'(s^+), α'(s^−)) = 0, then there is no break (in the sense that
null orthogonal vectors must be collinear, so that a change of parametrization then
allows one to remove the break). On the other hand, g(W, α') is constant on [0, s^−], since W is
parallel and α is a geodesic on that segment. Thus, g(W, α') is negative
on [0, s^−]. Similarly, it is positive on [s^+, 1]. Now choose a piecewise smooth function
f that vanishes at the endpoints (i.e. at 0 and 1) and such that f' > 0 on [0, s^−] and
f' < 0 on [s^+, 1]. Then, for V = f W, we have g(V', α') < 0.
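The last case of the proof can be made concrete in 1+1 Minkowski space. The sketch below is an illustration under assumptions of our own choosing, not part of the lectures: α is the broken null geodesic from (0, 0) to (2, 0) with a corner at (1, 1), the break is at s = 1/2, f is a tent function, and the variation is x(u, v) = α(u) + v f(u) W (in flat space W is constant, hence parallel). All function names are hypothetical.

```python
def g(v, w):
    """1+1 Minkowski metric, signature (-, +); vectors are (t, x) pairs."""
    return -v[0] * w[0] + v[1] * w[1]

def alpha_prime(u):
    """Velocity of the broken null curve (0,0) -> (1,1) -> (2,0), break at u = 1/2."""
    return (2.0, 2.0) if u < 0.5 else (2.0, -2.0)

# Jump of the velocity at the break: alpha'(s+) - alpha'(s-).
W = (0.0, -4.0)

def f_prime(u):
    """Derivative of the tent function f(u) = u on [0, 1/2], 1 - u on [1/2, 1]."""
    return 1.0 if u < 0.5 else -1.0

def longitudinal_velocity(u, v):
    """u-derivative of x(u, v) = alpha(u) + v f(u) W, away from the break."""
    a, fp = alpha_prime(u), f_prime(u)
    return (a[0] + v * fp * W[0], a[1] + v * fp * W[1])
```

With these choices g(V', α') = f' g(W, α') = −8 on both segments, and for 0 < v < 1 the longitudinal curve is the broken timelike curve with corner (1, 1 − 2v): the null corner has been pushed inside the cone while the endpoints stay fixed.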
9.2 First variation
Let x : [a, b] × (−δ, δ) → M be a variation of a curve segment α. For each v ∈ (−δ, δ),
let L_x(v) be the length of the longitudinal curve u → x(u, v). Then L_x(v) is a real-valued function with L_x(0) the length of α. If g(α', α') ≠ 0 on α, then for δ small
enough, L_x is a smooth function (cf. below).
Note that g(α', α') > 0 on α means that α is spacelike and g(α', α') < 0 means
that it is timelike. Let ε be the sign of α, i.e. ε = sign(g(α', α')) ∈ {−1, 0, +1}.
Lemma 9.4. Let x be a variation of a curve segment α : [a, b] → M with g(α', α') ≠ 0.
If L_x(v) is the length of u → x(u, v), then
\[ L_x'(0) = \varepsilon \int_a^b \frac{g(\alpha', V')}{|g(\alpha', \alpha')|^{1/2}} \, du, \]
where ε is the sign of α and V is the variation vector field of x.
Proof. We have
\[ L_x(v) = \int_a^b |g(x_u(u, v), x_u(u, v))|^{1/2} \, du. \]
In particular, |g(x_u(u, v), x_u(u, v))|^{1/2} is smooth for δ small enough since it does not
vanish by continuity. Then,
\[ L_x'(0) = \int_a^b \frac{d}{dv} |g(x_u(u, v), x_u(u, v))|^{1/2} \Big|_{v=0} \, du, \]
where |g(x_u(u, v), x_u(u, v))|^{1/2} = (ε g(x_u(u, v), x_u(u, v)))^{1/2}, and we compute
\begin{align*}
\frac{d}{dv} |g(x_u, x_u)|^{1/2} &= \frac{\varepsilon \cdot 2 g(x_u, x_{uv})}{2 (\varepsilon g(x_u, x_u))^{1/2}} \\
&= \frac{\varepsilon g(x_u, x_{vu})}{|g(x_u, x_u)|^{1/2}},
\end{align*}
using x_{uv} = x_{vu}. Just evaluate at v = 0 to conclude.
Exercise: Check that the previous lemma holds for x a piecewise smooth variation of α a piecewise smooth curve.
Proposition 9.1 (First variation formula^24). Let α : [a, b] → M be a piecewise smooth
curve segment with constant speed c = |g(α', α')|^{1/2} ≠ 0 and sign ε. If x is a piecewise
smooth variation of α, with break points at (u_i)_{1≤i≤k} such that u_0 = a < u_1 < .. < u_i <
.. < u_k < u_{k+1} = b, then
\[ L_x'(0) = -\frac{\varepsilon}{c} \int_a^b g(\alpha'', V) \, du - \frac{\varepsilon}{c} \sum_{i=1}^k g(\Delta\alpha'(u_i), V(u_i)) + \left[ \frac{\varepsilon}{c} g(\alpha', V) \right]_a^b. \]
Proof. We simply split the previous integral at the break points and integrate by
parts in u, using that
\[ g(\alpha', V') = \frac{d}{du} g(\alpha', V) - g(\alpha'', V). \]
Remark 9.2. For a fixed endpoint variation x, i.e. x(a, v) = p and x(b, v) = q for all
v ∈ (−δ, δ), the variation vector field V vanishes at a and b and thus the last term also
vanishes.
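The first variation formula can be checked numerically on a simple example. The following sketch makes several assumptions for illustration only: the ambient space is 1+1 Minkowski space, α(u) = (sinh u, cosh u) on [0, 1] (a unit-speed timelike curve, so c = 1 and ε = −1, with α'' = α and g(α'', α'') = 1), and the fixed endpoint variation is x(u, v) = (1 + v f(u)) α(u) with f(u) = sin(πu), so that V = f α''. The formula then predicts L_x'(0) = ∫ f du = 2/π. The names `x` and `length` are hypothetical.

```python
import math

def x(u, v):
    """Variation x(u, v) = (1 + v f(u)) alpha(u), with alpha(u) = (sinh u, cosh u)
    and f(u) = sin(pi u); points are (t, x) pairs in 1+1 Minkowski space."""
    scale = 1.0 + v * math.sin(math.pi * u)
    return (scale * math.sinh(u), scale * math.cosh(u))

def length(v, n=4000):
    """Approximate L_x(v), the length of u -> x(u, v) on [0, 1],
    by the Lorentzian length of a timelike polyline with n segments."""
    total = 0.0
    prev = x(0.0, v)
    for i in range(1, n + 1):
        cur = x(i / n, v)
        dt, dx = cur[0] - prev[0], cur[1] - prev[1]
        total += math.sqrt(dt * dt - dx * dx)
        prev = cur
    return total

h = 1e-4
numeric = (length(h) - length(-h)) / (2 * h)  # central difference for L_x'(0)
predicted = 2.0 / math.pi                     # -(eps/c) * integral of g(alpha'', V) du
```

The derivative is nonzero, as expected since α is not a geodesic; for a straight timelike segment the same computation would return a value close to zero.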
Corollary 9.1. A piecewise smooth curve α of constant speed c > 0 is an (unbroken)
geodesic if and only if the first variation of arc length is zero for every fixed endpoints
variation of α.
Proof. If α is a geodesic, then α'' = 0 and there are no breaks. Conversely, suppose
L_x'(0) = 0 for every fixed endpoint variation x. First, we show that each segment
α|[u_i, u_{i+1}] is geodesic. It suffices to show that α'' = 0 at each t with u_i < t < u_{i+1}. Let y be any
tangent vector to M at α(t) and let f be a bump function on [a, b] with support
included in [t − δ, t + δ] ⊂ [u_i, u_{i+1}]. Let Y be the vector field on α obtained by parallel
translation of y and set V = f Y.
Since V(a) = V(b) = 0, the exponential formula x(u, v) = exp_{α(u)}(v V(u)) produces a fixed endpoint variation of α whose variation vector field is V.
Since L_x'(0) = 0, and given the support assumption on f, we have
\[ 0 = \int_{t-\delta}^{t+\delta} g(\alpha'', f Y) \, du. \]
Since this holds for all y, δ > 0 and bump functions f, we have g(α'', y) = 0 for all y, i.e.
α'' = 0. It remains to show that the breaks are trivial, so that α is indeed unbroken.
As before, let y be an arbitrary tangent vector at α(u_i) and let f be a bump function
at u_i with f(u_i) = 1 and supp f ⊂ [u_{i−1}, u_{i+1}]. For a fixed endpoint variation with vector field f Y, the first
variation formula gives
\[ 0 = -\frac{\varepsilon}{c} g(\Delta\alpha'(u_i), y) \]
for all y, since α'' = 0 on each subsegment. Since g is non-degenerate, this implies
that all the breaks are trivial.
24 The first variation formula will help us to find critical points of the length functional, but to determine
whether these critical points are actual minimizers, one of course needs to go further, for instance by
computing a second variation formula, at the level of two derivatives of L.
10 Causality and global Lorentzian geometry
Recall that to distinguish between the future and the past, we say that M is time-oriented if there exists a globally defined timelike vector field, denoted here T. Then,
at every p ∈ M, for v ∈ T_p M a causal vector, we say that v is future-oriented if
g_p(v, T_p) < 0, which amounts to v and T_p lying in the same timecone if v is timelike.
Recall also that the notions of future-oriented and past-oriented descend to curves.
From now on, we will assume that all the Lorentzian manifolds we consider are
time-oriented. Moreover, we will assume that the manifold is oriented and connected (this ensures that we can define a global volume form and the existence of
a complete Riemannian metric).
Definition 10.1. Let p, q ∈ M . We say that p << q if there is a future pointing timelike
curve in M from p to q and p < q if there is a future pointing causal curve in M from
p to q. Finally, p ≤ q means p = q or p < q.
Definition 10.2. For A ⊂ M, we define
\begin{align*}
I^+(A) &= \{p \in M : \exists q \in A, \ q \ll p\}, \\
J^+(A) &= \{p \in M : \exists q \in A, \ q < p\}, \\
I^-(A) &= \{p \in M : \exists q \in A, \ p \ll q\}, \\
J^-(A) &= \{p \in M : \exists q \in A, \ p < q\}.
\end{align*}
The set I + (A) is called the chronological future of A, J + (A) is the causal future
and similarly for I − and J − , replacing future by past.
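In Minkowski space these sets are explicit: the whole space is convex, so q lies in I^+(p) (resp. in J^+(p), for q ≠ p) exactly when the displacement q − p is a future-oriented timelike (resp. causal) vector. The sketch below encodes this special case only, assuming the flat metric of signature (−, +, .., +) and the time orientation T = ∂_t; the function names are hypothetical.

```python
def displacement(p, q):
    """Vector from p to q in Minkowski coordinates (t, x^1, ..., x^n)."""
    return tuple(b - a for a, b in zip(p, q))

def g(v, w):
    """Minkowski inner product, signature (-, +, ..., +)."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def in_chronological_future(p, q):
    """q in I^+(p): q - p is timelike and future-oriented."""
    d = displacement(p, q)
    return g(d, d) < 0 and d[0] > 0

def in_causal_future(p, q):
    """q in J^+(p) with q != p: q - p is causal (timelike or null) and future-oriented."""
    d = displacement(p, q)
    return g(d, d) <= 0 and d[0] > 0
```

For example, from the origin of 1+1 Minkowski space, the point (2, 1) is in the chronological future, (1, 1) is in the causal but not the chronological future, and (1, 2) is in neither.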
In the above definition, we require the curves to be piecewise smooth. Note
that in view of the previous section, if there is a piecewise smooth causal curve from p
to q that is not a null pregeodesic, then it can be deformed to a piecewise smooth
timelike curve joining p to q. On the other hand, a piecewise smooth timelike curve
from p to q can be smoothed out as follows.
Lemma 10.1. If there exist smooth timelike, future-oriented, curves from p to q and
from q to r , then there exists a smooth timelike future oriented curve from p to r .
Proof. Let p, q, r be as in the statement of the lemma. Let α be a smooth timelike
curve from p to q and β be a smooth timelike curve from q to r . The curve α followed
by β is not necessarily smooth at q, but we claim that we can smooth it out and still
maintain its timelike character. First, note that this is a local problem at q, and since
the condition of being timelike is an open condition, it follows by continuity that
it is sufficient to prove it in the case of the Minkowski metric. Let us thus assume
that g = η. Without loss of generality, we can consider coordinates (t, r, ω) on R^{n+1}
such that q = (0, .., 0) and we can also assume that α : [−1, 0] → M is given in local
coordinates by (s, 0, .., 0) + O(s^2) and that β : [0, 1] → M with β(0) = q. Again, since
this is a local problem near s = 0, we can neglect the O(s^2) and we shall do so in the
following.
From the timelike condition, we have β̇^t > |β̇^i|. In particular, τ := t(β(1)) > R :=
r(β(1)). We consider a curve of the form γ(s) = (s, R f(s), ω(q)) defined on [0, τ]. The
lemma then follows since we can choose f such that f(x) = 0 for x ≤ 0, f(τ) = 1 and
|f'| < 1/R. For instance, rescaling in τ means we are looking for a function f defined
on [0, 1], such that f(x) = 0 for x ≤ 0, f(1) = 1 and |f'| < τ/R. Draw on a picture the
straight line t = x, and then draw the straight line of slope τ/R passing through (1, 1). Then
any convex function lying between the two straight lines and vanishing for x ≤ 0 will
do the job.
We have easily
Lemma 10.2. The following relations hold
\[ I^+(A) = I^+(I^+(A)) = I^+(J^+(A)) = J^+(I^+(A)) \subset J^+(J^+(A)) = J^+(A). \]
10.1 Causality and topology
For an open set U , we denote by I + (A, U ) the chronological future of A ⊂ U in the
open submanifold U ⊂ M . In other words, q ∈ I + (A, U ) if and only if there exists
r ∈ A and a future oriented timelike curve with values in U joining r to q. In particular, q ∈ U , so that I + (A, U ) ⊂ I + (A) ∩ U .
Exercise: give an example where the above inclusion is strict.
Lemma 10.3 (Topological properties of causal sets in convex regions). Let C ⊂ M be
open and convex. Then,
1. For p ≠ q ∈ C, q ∈ J^+(p, C) if and only if the unique geodesic in C from p to q is a
future-oriented causal curve (similarly for I^+ and timelike).
2. I^+(p, C) is open in C (hence in M).
3. J^+(p, C) is the closure in C of I^+(p, C).
4. The relation ≤ is closed on C, i.e. if p_n → p and q_n → q with all points in C,
then q_n ∈ J^+(p_n, C) for all n implies that q ∈ J^+(p, C).
Proof. Let p ≠ q ∈ C. Assume first that q ∈ I^+(p, C) and let α be a future-oriented timelike curve in C joining p to q. Since C is a normal neighborhood of p, we
have α = exp_p ∘ β for some curve β : [0, 1] → T_p M with β(0) = 0. From Lemma 8.12
and the fact that α is timelike, β stays in a single timecone on ]0, 1]. This implies that
\vec{pq} = β(1) is timelike and future-oriented. Thus, the unique geodesic from p to q is
a future-oriented timelike curve.
If now q ∈ J^+(p, C) and γ is a causal curve joining p to q, then either it can be
deformed in C to a timelike curve joining p to q and we run the first argument, or it
is a null (pre)geodesic, in which case we are also done. This proves the first point.
For the second point, let q ∈ I^+(p, C). From the first point, the unique geodesic
from p to q is timelike. Thus g(\vec{pq}, \vec{pq}) < 0, and we use that all the operations here are
continuous.
The third and fourth points follow from the first and the continuity of the map
(p, q) → \vec{pq}.
Recall the definition of a continuously extendible curve, Definition 8.6.
We now prove
Lemma 10.4. A causal curve contained in a compact subset K of a convex open set C is continuously extendible.
Proof. Let α be a causal curve defined on [0, B) whose image is included in K. Since K
is compact, there exists a sequence (s_i) such that s_i → B and α(s_i) → q as i → +∞.
We must show that every such sequence gives the same limit. Let (t_i) be another
sequence with t_i → B, such that α(t_i) → p, and suppose that p ≠ q. By the previous lemma and the fact that α
is causal, it follows that q ∈ J^+(p, C) and p ∈ J^+(q, C). From the previous lemma, \vec{pq}
is thus both future pointing and past pointing, which is impossible. Hence p = q.
We now prove
Lemma 10.5. The relation p << q is open, that is, if p << q, there exist neighborhoods
U of p and V of q such that for all p' ∈ U, q' ∈ V, p' << q'.
Proof. Let σ be a timelike curve from p to q. If C is a convex neighborhood of q,
let q^− be a point of C on σ before q. Dually, let p^+ be a point of σ between p and
q^− and contained in a convex neighborhood C' of p. By Lemma 10.3, V = I^+(q^−, C) and
U = I^−(p^+, C') are open in M and have the required property.
Corollary 10.1. For any subset A ⊂ M , I + (A) and I − (A) are open.
Proof. Follows directly from the previous lemma. (Let p ∈ A and let σ be a timelike
curve from p to some q ∈ I^+(A). Then, by the previous lemma, there exists a neighborhood V of q included in I^+(A).)
In Minkowski space, for any set A, J + (A) is the closure of I + (A), but it is easy to
construct spacetimes where this is no longer the case.
Ex: In 1 + 1 Minkowski space, fix a point p and consider J^+(p). The boundary
of J^+(p) consists of two null curves starting at p. Remove a point on one of these curves
and consider the resulting spacetime to get an example where the closure of I^+(p)
is strictly larger than J^+(p).
In general, we only have
Lemma 10.6. For any subset A,
1. int(J^+(A)) = I^+(A).
2. J^+(A) ⊂ \overline{I^+(A)}, with equality if and only if J^+(A) is closed.
Proof. I^+(A) is open and contained in J^+(A), hence is contained in its interior. If q ∈
int(J^+(A)), then there exists a convex neighborhood C of q with C ⊂ J^+(A). Take a point
r ∈ C such that r ∈ I^−(q, C). Since r ∈ C, r ∈ J^+(A). Hence, q ∈ I^+(J^+(A)) ⊂ I^+(A).
Since I^+(A) ⊂ J^+(A), we have \overline{I^+(A)} ⊂ \overline{J^+(A)}. Hence, if J^+(A) = \overline{I^+(A)}, then J^+(A)
is closed, and if J^+(A) is closed then \overline{I^+(A)} ⊂ \overline{J^+(A)} = J^+(A). The equality case is thus clear
provided that the inclusion assertion in (2) holds. It suffices to prove (2) for one
point, i.e. we will prove J^+(p) ⊂ \overline{I^+(p)}. Let q ∈ J^+(p). If q = p, then clearly q ∈ \overline{I^+(p)}.
Otherwise, let α be a causal curve from p to q. Let C be a convex neighborhood of
q and let q^− be a point of α in J^−(q, C). We have q ∈ J^+(q^−, C) and by Lemma 10.3,
J^+(q^−, C) ⊂ \overline{I^+(q^−, C)}.
The result then follows since I^+(q^−, C) ⊂ I^+(J^+(p)) ⊂ I^+(p).
10.2 Quasi-limits
We are interested in the construction of limits of a sequence of causal curves. Since
we lack good compactness properties on such sets of curves, the limits can be rough.
An approach could be to try to define a weak topology on the set of causal curves, but
this seems to be a bit difficult in this context, and instead we introduce below the
notion of quasi-limits.
Definition 10.3. Let (α_n) be a sequence of future-oriented causal curves in M and
let R be a convex covering of M. A limit sequence for (α_n) relative to R is a finite or
infinite sequence of points p_0 < p_1 < .. in M such that
L1a. For each i, there exists a subsequence (α_{φ(n)}) and, for each n, a
set of i points along the curve α_{φ(n)}, denoted α_{φ(n)}(s_{n,1}), α_{φ(n)}(s_{n,2}), .., α_{φ(n)}(s_{n,i}),
with s_{n,1} < s_{n,2} < .. < s_{n,i}, such that for all j ≤ i, as n → +∞,
α_{φ(n)}(s_{n,j}) → p_j.
L1b. For each j < i, the points p_j, p_{j+1} and, for all n, the segments of α_{φ(n)} restricted
to [s_{n,j}, s_{n,j+1}] are all contained in a single convex neighborhood of the covering
R.
L2. If the sequence (p_i) is infinite, it is nonconvergent. If it is finite, say p_1 < p_2 < .. < p_k,
then it contains more than one point (k > 1) and there exists no p_{k+1} such
that p_1 < p_2 < .. < p_k < p_{k+1} and such that further subsequences of the α_{φ(n)} satisfy L1
for the extended sequence.
Proposition 10.1. Let (α_n) be a sequence of future-oriented causal curves such that
α_n(0) → p ∈ M and (α_n) does not converge to just p (i.e. there exists a neighborhood
of p that contains only finitely many of the α_n)^25. Then relative to any convex covering
R, (α_n) has a limit sequence starting at p.
Proof. i) Construction of the p_i.
M is paracompact^26 (every manifold is paracompact, cf. Prop. 2.3
in Paulin's differential geometry course). Thus, it has a locally finite covering R' by
open sets B such that each closure \overline{B} is compact and contained in some member of R. By
the hypotheses on (α_n), we can arrange for R' to be such that it contains B_0 such
that infinitely many α_n start in B_0 and leave B_0. Relabel these curves as ^1α_n and
for each n, let s_n be such that ^1α_n(s_n) is the first point in bd B_0. Passing to a further
subsequence, we can assume that (^1α_n(s_n)) converges to a point p_1 ∈ bd B_0. Now
choose B_1 ∈ R' containing p_1. If infinitely many of the ^1α_n leave B_1, we obtain as before
a subsequence ^2α_n whose first departure points from B_1 converge to a point p_2 in
bd B_1. Repeat this step as many times as possible, with the additional rule on subsequent choices of B_i: if there is more than one candidate element of R' containing p_i,
pick one that has been used the fewest times before.
25 The case where (α_n) just converges to p can happen by taking just one curve passing through p and
shrinking the interval to 0.
26 Recall that this means that M is Hausdorff (séparé in French) and every open cover admits an open refinement
that is locally finite.
ii) Basic properties and L1.
Clearly, the first two conditions of the definition of limit sequences hold (with C_i
any element of R that contains B_i). Since the relation ≤ is closed on C_i, it follows
that p_{i+1} ≥ p_i. By construction, p_{i+1} ≠ p_i, hence p_{i+1} > p_i.
iii) Checking L2.
If the sequence (p_i) is infinite, we must show that it is non-convergent. Assume that
it converges to some q. Let B ∈ R' be such that q ∈ B; then for all sufficiently large i,
p_i ∈ B. Since \overline{B} is compact and R' is locally finite, only finitely many members of R'
meet \overline{B}. Hence, some member must have been chosen as B_i for infinitely many i. But this
violates the additional rule, for B itself was a candidate infinitely many times, but
was chosen only finitely many times (since it is open and contains all p_i but a finite
number, only a finite number can be in bd B). Finally, suppose that the sequence of
the p_i is finite. Since the construction above cannot continue, there exists a k such
that only a finite number of the ^kα_n leave B_k. Let (α_m) be those trapped in B_k. By
a previous lemma (Lemma 10.4, since the closure of B_k is compact), they are extendible. Assume thus
that they are defined on closed intervals [0, b_m] (by a change of parametrization and
continuous extension, we can always assume that). By compactness, we can take
a further subsequence and assume that α_m(b_m) converges to some q. If q = p_k,
then the sequence p_0 < .. < p_k cannot be extended so as to still satisfy the first two limit
sequence conditions (hence it is a limit sequence). If q ≠ p_k, then q > p_k and, setting
p_{k+1} = q, we have a limit sequence.
Let now (p_i) be a limit sequence for (α_n) and let λ_i be the (future causal) geodesic
from p_i to p_{i+1} in a convex set C_i as in L1. Assembling these segments for all i gives
a broken geodesic λ called a quasi-limit of (α_n) with vertices p_i. λ is a future-oriented broken causal geodesic that starts at p. If the sequence (p_i) is infinite, then by L2, λ is future-inextendible. In the finite case, the curve runs from p to some p_k. When all the
curves (α_n) are future-inextendible, we have
Lemma 10.7. A quasi-limit λ of future-inextendible curves (α_n) is future-inextendible.
Proof. Let (p_i) denote the vertices of λ. If the sequence (p_i) is infinite, then λ is inextendible,
since it must be non-convergent. On the other hand, the limit sequence cannot
be finite. Indeed, assume that the limit sequence is given by p_0 < .. < p_k. Let C
be a convex neighborhood containing p_k and let B_k ⊂ C be open with compact closure.
Let ^kα_n be a subsequence satisfying L1. Now by hypothesis, all the curves
are inextendible, thus by Lemma 10.4, they must leave B_k. But then we can pass to
a subsequence to get an extra point in bd B_k, a contradiction.
10.3 Causality conditions
First, an easy remark.
Lemma 10.8. A compact Lorentzian manifold admits a closed timelike curve.
Proof. By compactness, the open covering (I + (p)) p∈M admits a finite subcover I + (p 1 ), .., I + (p k ).
If p 1 ∈ I + (p i ) for some i ≠ 1, then I + (p 1 ) ⊂ I + (I + (p i )) = I + (p i ). In that case (note that then
k > 1), we can remove I + (p 1 ) and still obtain an open cover. Wlog, assume thus that
p 1 is not contained in any of the other I + (p i ). Then, p 1 ∈ I + (p 1 ), i.e. there exists a
closed timelike curve through p 1 .
Closed timelike curves are bad: from the physics point of view, and from evolution problems point of view. We want to exclude them. We want in fact a slightly
stronger criterion.
Definition 10.4. The strong causality condition is said to hold at p ∈ M provided
that given any neighborhood U of p, there exists a neighborhood V ⊂ U of p such
that every causal curve segment with endpoints in V lies entirely in U .
Exercise: Prove that if there exists a closed timelike curve passing through p, then
M is not strongly causal at p.
Lemma 10.9. Suppose that the strong causality condition holds on a compact K ⊂ M .
If α is a future-inextendible causal curve that starts in K , then α eventually leaves K
never to return; that is there is an s > 0 such that α(t ) ∉ K , for all t ≥ s.
Proof. Assume that α persistently returns to K and that α is defined on [0, B ).
Then, there exists a sequence s i → B , with α(s i ) ∈ K . Taking a subsequence, we can
assume that α(s i ) converges to some p ∈ K . Since α has no future endpoint, there
must exist another sequence t j → B such that α(t j ) does not converge to p. Taking yet another subsequence, we can assume that there exists a neighborhood U of
p such that α(t j ) ∉ U for all j . Since s j and t j both converge to B , they have subsequences that alternate: s 1 < t 1 < s 2 < t 2 < ... The curves α|[s k ,s k+1 ] are almost closed
at p: given any neighborhood V ⊂ U of p, for k large enough the ends α(s k ) and
α(s k+1 ) are contained in V and yet the curve escapes U since α(t k ) ∉ U . This contradicts the strong causality condition at p.
Lemma 10.10. Suppose the strong causality condition holds on a compact subset K .
Let (αn ) be a sequence of future oriented causal curve segments in K such that αn (0) →
p and αn (1) → q 6= p. Then, there exists a future oriented causal broken geodesic λ
from p to q and a subsequence (αm ) whose lengths converge, with the limit satisfying
lim m→+∞ L(αm ) ≤ L(λ).
Proof. From Proposition 10.1, the αn admit a limit sequence (p i ) starting at p. If (p i )
is infinite, the corresponding quasi-limit λ is future inextendible and thus, by the
above lemma, must leave K . In particular, some of the αn must leave K , a contradiction. Thus the limit sequence is finite, starts at p and ends at q (again applying
L2). The quasi-limit with these vertices is a broken geodesic from p to q. Thanks
to L1b, we can consider a subsegment in some convex set C i . On each convex set,
the length of a causal curve joining two points p < q is bounded above by the length of the corresponding geodesic. Thus,
$$L(\alpha_m|_{[s_{m,i},\,s_{m,i+1}]}) \le |\overrightarrow{p_{m,i}\,p_{m,i+1}}|,$$
where $p_{m,i} = \alpha_m(s_{m,i})$. Moreover, we know that the length (in fact the vector $\overrightarrow{pq}$ itself) between
two points p < q in a convex set depends continuously on p and q. Thus, $|\overrightarrow{p_{m,i}\,p_{m,i+1}}| \to |\overrightarrow{p_i\,p_{i+1}}|$. Summing over all the segments and taking the limit m → +∞ then gives the
lemma.
10.4 Achronal sets
Definition 10.5. A ⊂ M is said to be achronal if there exists no p, q ∈ A such that
p << q and acausal if there exists no p, q ∈ A such that p < q. Given A achronal, D + (A)
is by definition, the set of all p such that every past inextendible causal curve through
p meets A. Similarly, we define D − (A).
Definition 10.6. The edge (marge in French, not to be confused with the boundary) of
an achronal set A consists of all points p in the closure of A such that every neighborhood U of p contains a timelike curve from I − (p, U ) to I + (p, U ) that does not meet
A.
Exercise: Let A be the set [0, 1]×{t = 0} in 2-dimensional Minkowski space. What
is the edge of A ?
Recall the definition of a topological hypersurface of M .
Definition 10.7. S is a topological hypersurface of M if for all p ∈ S, there exists a
neighborhood U of p in M and a homeomorphism φ from U onto an open subset of R n such that φ(U ∩ S) =
φ(U ) ∩ Π where Π is a hyperplane in R n .
Lemma 10.11. Let p ∈ M and let U be a neighborhood of p. Then, there exists a coordinate chart (N , ξ) at p such that
1. ξ(N ) has the form (a − ε, b + ε) × N ⊂ R 1 × R n−1 .
2. For any (x i ) ∈ N , the curve s → ξ−1 (s, x i ) is timelike.
3. The slice x 0 = a of N is in I − (p, U ), the slice x 0 = b of N is in I + (p, U ).
Proof. Consider a normal neighborhood U′ of p with U′ ⊂ U . Let ξ be a normal coordinate system at p with g (∂x 0 , ∂x 0 )(p) = −1 and ∂x 0 future pointing. Consider the
region N defined by $|x^0| \le \delta$ and $\sum_i (x^i)^2 \le \delta^2/8$. Let $V = \exp_p^{-1}(N)$
and let $V_\rho$ be the subset of vectors of V which are timelike for the auxiliary Minkowski
metric $\eta_{1/4} := -\tfrac14\, dx^0 \otimes dx^0 + \sum_i dx^i \otimes dx^i$. Now, let β be a curve in $V_\rho$ which is timelike for $\eta_{1/4}$. Let $y^\alpha(t)$ be the coordinates of β,
$$\beta(t) = y^\alpha(t)\,[\partial_{x^\alpha}]|_p ,$$
and let $\gamma = \exp_p \circ \beta$. Then the normal coordinates of γ(t ) are also given by $y^\alpha(t)$ by
construction.
Moreover, by construction, the coordinates of γ′ are given by $(y^\alpha)'$ and
$$-\tfrac14\big((y^0)'\big)^2 + \sum_i \big((y^i)'\big)^2 < 0.$$
Since at p, the metric in normal coordinates coincides with the Minkowski metric, by Taylor expansion, we have that
$$g_{\alpha\beta}(x) = \eta_{\alpha\beta} + O(|x|)$$
and hence, for δ small enough,
$$|g_{\alpha\beta}(x) - \eta_{\alpha\beta}| \le C\delta$$
on N , for some C .
This implies that
$$g(\gamma',\gamma') = \eta(\gamma',\gamma') + O(\delta)\,|y'|^2,$$
where $|y'|^2 = \sum_i ((y^i)')^2 + ((y^0)')^2 \le \tfrac54 ((y^0)')^2$. Since $\eta(\gamma',\gamma') < -\tfrac34 ((y^0)')^2$, we have
that for δ small enough, $g(\gamma',\gamma') < 0$.
In particular, $\exp_p$ maps each $x^i = \text{const}$ curve to a timelike curve in N . We
claim that the points of normal coordinates (−δ, x i ) and (δ, x i ) lie respectively in
I − (p, U ) and I + (p, U ). Indeed, let $v = \delta[\partial_{x^0}]_p + x^i[\partial_{x^i}]_p$ and consider the curve
β : t → t v. Then, β′ = v and $\eta_{1/4}(v, v) < 0$. Thus, β is timelike for the auxiliary metric
and γ = exp p ◦ β is a future timelike (for g ) geodesic from p to the point of normal
coordinates (δ, x i ). Finally, for ε > 0 sufficiently small, we have again that the points of
normal coordinates (−δ + ε, x i ), (δ − ε, x i ) lie respectively in I − (p, U ) and I + (p, U ).
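The smallness condition on δ used in the proof above can be made explicit. The following is only a sketch: it assumes a constant C′ (not named in the notes, depending on C and on the dimension) such that |(g − η)(v, v)| ≤ C′δ|v|² for all tangent vectors v at points of N , where |v|² denotes the Euclidean square norm of the components of v.

```latex
\begin{align*}
g(\gamma',\gamma')
  &\le \eta(\gamma',\gamma') + C'\delta\,|y'|^2 \\
  &< -\tfrac{3}{4}\big((y^0)'\big)^2 + \tfrac{5}{4}\,C'\delta\,\big((y^0)'\big)^2 \\
  &= -\Big(\tfrac{3}{4} - \tfrac{5}{4}\,C'\delta\Big)\big((y^0)'\big)^2 .
\end{align*}
```

Under this assumption, the right-hand side is non-positive as soon as δ ≤ 3/(5C′), and the strict inequality in the second line then gives g(γ′, γ′) < 0.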
Proposition 10.2. An achronal set A is a topological hypersurface if and only if A
contains no edge point (i.e. A and edge A are disjoint).
Proof. First, let A be a topological hypersurface. Let p ∈ A and let U be a neighborhood of p as in Definition 10.7. Without loss of generality, we can assume that U
is connected and that U \A has two connected components. (Consider φ(p) in R n .
There exists a ball B centered at φ(p) such that B ⊂ φ(U ). Then φ−1 (B ) is a neighborhood of p ∈ M that works). Since A is achronal, the open sets I − (p, U ) and I + (p, U )
are disjoint. For otherwise, there exists a closed (piecewise) smooth timelike curve
through p ∈ A, which contradicts that A is achronal. For similar reasons, they do
not meet A. Moreover, I − (p, U ) and I + (p, U ) are clearly connected. Any timelike
curve through p meets both sets, hence they are contained in different components
of U \ A. Thus every timelike curve from I − (p, U ) to I + (p, U ) must go through some
point of A. Thus, p is not in the edge of A.
Now suppose that A and edge A are disjoint. Let p ∈ A. Since p ∉ edge(A),
there exists a neighborhood U of p, such that every timelike curve from I − (p, U ) to
I + (p, U ) intersects A.
From the previous lemma, we consider a coordinate neighborhood (N , ξ) of p such that
1. ξ(N ) has the form (a − ε, b + ε) × N ⊂ R 1 × R n−1 .
2. The slice x 0 = a of N is in I − (p, U ), the slice x 0 = b of N is in I + (p, U ).
If y ∈ N ⊂ R n−1 , the x 0 coordinate curve
s → ξ−1 (s, y)
is timelike by construction, and must then meet A. Since A is achronal, the meeting point is unique. Let h(y) be its x 0 coordinate. It suffices now to show that the
function
h : N → (a, b)
is continuous, for then φ = (x 0 −h(x 1 , .., x n−1 ), x 1 , .., x n−1 ) is a homeomorphism from
N onto its image (easy to check, just write φ−1 explicitly) that carries A ∩ N to the
slice u 0 = 0 of φ(N ) ⊂ R n , where u 0 denotes the first coordinate on R n .
Let y n be a sequence that converges in N to y. Assume that h(y n ) does not
converge to h(y). Then, there must exist (since the values of h are bounded) a subsequence y ψ(n) such that h(y ψ(n) ) converges to some r ≠ h(y). Now since r ≠ h(y), either r > h(y),
in which case one can reach ξ−1 (r, y) from ξ−1 (h(y), y) by following ∂x 0 , a timelike
vector field, forward, or r < h(y) and then we follow ∂x 0 backwards. Thus, ξ−1 (r, y)
is in the set I − (q, N ) ∪ I + (q, N ), where q = ξ−1 (h(y), y). Since this set is open, it
follows that ξ−1 (h(y n ), y n ) ∈ I − (q, N ) ∪ I + (q, N ) for
some n. But these points belong to A, as does q, so they cannot be in I − (q, N ) ∪ I + (q, N ) since A is
achronal.
Remark 10.1. One can in fact show that such an A is a Lipschitz submanifold, i.e. the
function h constructed above satisfies a Lipschitz condition. cf Penrose.
Corollary 10.2. An achronal set A is a closed topological hypersurface if and only if
edge A is empty.
Proof. Assume that A achronal is a topological hypersurface. Then A and edge A are
disjoint. But edge A is included in the closure of A, so if A is closed then edge A
must be empty.
Conversely, suppose edge A is empty. Then A is a topological hypersurface. Moreover, if A is achronal so is Ā (because the relation << is open). Thus, if q ∈ Ā \ A, then
no timelike curve through q can ever meet A (otherwise, since A ⊂ Ā, that would
contradict that Ā is achronal), which implies that q is in edge(A), which we assumed
was empty. Hence A is closed.
Definition 10.8. A subset F of M is a future set (respectively past set) provided I + (F ) ⊂
F (respectively I − (F ) ⊂ F ).
Example: For any set B , J + (B ), I + (B ) are future sets.
Exercise: If F is a future set, then M \ F is a past set.
Corollary 10.3. The topological boundary of a future set is a closed achronal topological hypersurface if non-empty.
Proof. Let p ∈ ∂F . If q ∈ I + (p), then I − (q) is a neighborhood of p and hence contains
a point of F . Thus, q ∈ I + (F ) ⊂ F . This proves that I + (p) ⊂ F and similarly, one has
I − (p) ⊂ M \F . In particular, I + (∂F ) and I − (∂F ) are disjoint and hence ∂F is achronal.
Second, the closed set ∂F has no edge points. Indeed, I + (p) ⊂ intF and I − (p) ⊂
extF for p ∈ ∂F . (Let U be a neighborhood of p and γ be a timelike curve from
I − (p,U ) to I + (p,U ), then since I − (p,U ) and I + (p,U ) are contained in intF and extF ,
γ must intersect the boundary of F ). The conclusion then follows from the previous
corollary.
This last corollary has in fact plenty of applications in the study of possible singularities for non-linear wave equations. For instance, consider a wave equation of
the form
□u = F (u, u′ )
where F is some non-linearity and □ is the usual wave operator. We imagine that we
have initial data given at t = 0 and that the problem is locally well posed, so that
we can construct for any data a local solution. Now, using the domain of dependence/finite speed of propagation, we can consider the largest domain on which the
solution stays regular (see Alinhac's books for more on how to define this properly).
That domain will be a future set, and hence its boundary (where something bad has
to happen to our solution, otherwise we would continue) must be at least a closed
achronal topological hypersurface.
10.5 Cauchy hypersurfaces
Definition 10.9. A Cauchy hypersurface in M is a subset S that is met exactly once by
every inextendible timelike curve in M .
Example: Any t = const in Minkowski space is a Cauchy hypersurface. For any
ρ > 0, the hyperboloid t 2 − x 2 = ρ is however not a Cauchy hypersurface. (The curve
in coordinates (t , r, ω) given by s → (s, √(1 + s 2 ), ω) is for instance timelike, inextendible and does not meet the above hyperboloid, since t 2 − r 2 = −1 along it).
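One curve for which the claims of the example can be checked by hand is s → (s, √(1 + s²), ω) with ω held fixed. The sketch below tests this numerically at sampled parameter values only; the function names, the sample grid and the choice ρ = 1 are illustrative assumptions, and a finite sample is of course not a proof.

```python
import math

def curve(s):
    """Hypothetical example curve in (t, r) coordinates: s -> (s, sqrt(1 + s^2))."""
    return s, math.sqrt(1.0 + s * s)

def tangent(s):
    """Derivative of the curve with respect to s (the angle omega is held fixed)."""
    return 1.0, s / math.sqrt(1.0 + s * s)

def minkowski_norm(dt, dr):
    """Minkowski square norm -dt^2 + dr^2 of a radial tangent vector."""
    return -dt * dt + dr * dr

def hyperboloid_defect(s, rho):
    """Value of t^2 - r^2 - rho along the curve; it vanishes exactly on the hyperboloid."""
    t, r = curve(s)
    return t * t - r * r - rho

# Sample the parameter on a grid of [-50, 50] (an arbitrary illustrative range).
samples = [x / 10.0 for x in range(-500, 501)]
is_timelike = all(minkowski_norm(*tangent(s)) < 0.0 for s in samples)
never_meets = all(hyperboloid_defect(s, rho=1.0) < 0.0 for s in samples)
print(is_timelike, never_meets)
```

Analytically, the tangent vector has square norm −1/(1 + s²) < 0 and t² − r² = −1 along the curve, which is what the two sampled checks reflect.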
We will need the following lemma.
Lemma 10.12. Let α be a past-inextendible causal curve starting at p that does not
meet a closed set C . Then, if p 0 ∈ I + (p, M \ C ), there is a past inextendible timelike
curve through p 0 that does not meet C .
Proof. Since α is past inextendible, we may assume that α has domain [1, +∞) and
that the sequence α(n) does not converge.
Since p 0 ∈ I + (p, M \ C ), we can join p 0 to α(1) = p by a timelike curve. Then the
combined curve γ gives a causal curve starting at γ(0) = p 0 which is inextendible and
has values in M \C . Moreover, γ is not a null pregeodesic since it is initially timelike.
We claim that we can deform it, to be timelike everywhere and still avoiding
C . We will denote β the deformed curve. Indeed, consider the points p 1 = α(1),
p 2 = α(2), .., p n = α(n). First, we can deform γ on [0, 2] to be a timelike curve keeping the ends p 0 and p 2 fixed and being arbitrarily close to γ. Since γ([0, 2]) is compact
and C is closed, if the deformation is small enough then the deformed curve β′ does
not meet C . (For instance, everything can be quantified using a metric d on M ). We
then define β on [0, 1] to be the restriction of the curve just obtained. To construct β on [0, 2], we consider the curve β′ constructed previously on [1, 2] (and agreeing with β on [0, 1] by
construction) and join it to α on [2, 3]. This is a causal curve from β(1) to p 3 which
is initially timelike. Thus, it can be deformed so that the deformed curve β′′ is timelike everywhere and avoids C . Extending β on [0, 1] by β′′ on [1, 2] we obtain β on
[0, 2]. Repeating the process, we construct β on [0, ∞), which is timelike everywhere,
avoids C and moreover β(n) can be taken arbitrarily close to α(n) depending on n.
For instance, we can arrange d (β(n), α(n)) < 1/n, for an arbitrary metric d on M . It
follows that β is inextendible.
We have
Proposition 10.3. A Cauchy surface is a closed achronal topological hypersurface
and is met by every inextendible causal curve.
Proof. Let p ∈ M . Let γ be a maximally defined (thus inextendible) future oriented
timelike geodesic with γ(0) = p. Then, γ
intersects S once and only once by definition, say at t 0 . Depending on whether
t 0 > 0, = 0, or < 0, we have that p ∈ I − (S), S, or I + (S). Thus, M is the disjoint union
of I + (S), S and I − (S). Moreover, for any q ∈ S, any future oriented timelike curve
γ with γ(0) = q satisfies γ(t ) ∈ I + (S) for t > 0 and γ(t ) ∈ I − (S) for t < 0. Thus,
S is the common boundary of the sets I ± (S) and from Corollary 10.3, S is a closed
achronal topological hypersurface. It remains to show that S is met, not just by every
inextendible timelike curve, but by every inextendible causal curve α.
Assume that α is an inextendible causal curve that does not meet S. Wlog, assume that α(0) ∈
I + (S). Then, by the previous lemma, there is a past inextendible timelike curve β
starting in I + (S) that does not meet S. Since any future-pointing timelike curve
starting at β(0) must remain in I + (S), adjoining any such future inextendible curve to β gives an
inextendible timelike curve that avoids S, i.e. a contradiction.
Proposition 10.4. Let S be a smooth spacelike Cauchy hypersurface. Then M is diffeomorphic to R × S. Furthermore, if S′ is another smooth spacelike Cauchy hypersurface, then
S′ and S are diffeomorphic.
Proof. Let T be a globally defined timelike vector field whose flow φ is defined on R ×
M . Its restriction to R × S gives a smooth map
f := φ| R×S : R × S → M ,
which is injective, since if φ(t 1 , x 1 ) = φ(t 2 , x 2 ), then by uniqueness in the flow (or the
semi-group property of t → φ(t , .)) x 1 = φ(t 2 − t 1 , x 2 ). But then since S is achronal,
we must have t 2 = t 1 and x 2 = x 1 .
Let now (t 0 , x 0 ) ∈ R × S. We claim that d f (t 0 ,x 0 ) is surjective. Indeed, let g (p) =
φ(−t 0 , p), then g is a diffeomorphism of M and d f (t 0 ,x 0 ) is surjective if and only if
d (g ◦ f ) (t 0 ,x 0 ) is surjective. Now by construction, the image of d (g ◦ f ) (t 0 ,x 0 ) contains
T (x 0 ) as well as the tangent space of S at x 0 . Since S is spacelike, it follows that d (g ◦ f ) (t 0 ,x 0 )
is surjective.
Since f is a local diffeomorphism and injective, it is in fact a global diffeomorphism onto its image. We claim that it is also surjective. Indeed, let p ∈ M and let γ
be the maximal integral curve of T through p. Then γ is inextendible and thus intersects S. It
then follows that p is in the image of f .
Let now π : R × S → S be the projection on S. If S′ is another smooth spacelike
Cauchy hypersurface, then we get a map r : S′ → S defined by r (x) = π ◦ f −1 (x). Then,
r is smooth and r maps a point x ∈ S′ to the point of intersection between the integral curve of T through x and S. The inverse of r can be defined as the map that
takes y ∈ S to the point of intersection between the integral curve of T through y and
S′ . Thus, S and S′ are diffeomorphic.
We also state, without proof, the following proposition (the proof given in O’Neill requires invariance of domain; it would be nice to have another proof not based on it).
Proposition 10.5. Let S be a Cauchy hypersurface in M and let X be a timelike vector
field on M (which exists, since M is time-oriented). If p ∈ M , the maximal integral
curve of X through p meets S at a unique point ρ(p). Then ρ : M → S is a continuous
open map onto S leaving S pointwise fixed. In particular S is connected.
10.6 Global hyperbolicity
Definition 10.10. A spacetime is said to be globally hyperbolic if it is strongly causal
and if for any p, q, J − (p) ∩ J + (q) is compact.
We have the following results.
Proposition 10.6. If (M , g ) is globally hyperbolic, then the causality relation ≤ is
closed, i.e. if p n → p and q n → q, with p n ≤ q n , then p ≤ q.
Proof. The proof is trivial if p n = q n for infinitely many n, hence we can assume that
p n < q n for all n. Let αn be a causal curve from p n to q n . Take p − << p and q << q + .
Then, for n large enough, p − << p n < q n << q + . Thus, the curves αn are all in the
compact set J + (p − ) ∩ J − (q + ). If p = q there is nothing to prove; otherwise, from Lemma 10.10, there exists a causal curve from p
to q.
Lemma 10.13. If (M , g ) is globally hyperbolic and p < q, then there is a causal curve
from p to q such that no causal curve connecting p to q has greater length.
Proof. Since p < q, the set C (p, q) of causal curves from p to q is non-empty. Let
τ(p, q) = supγ∈C (p,q) L(γ). Note that a priori, τ(p, q) ∈ [0, +∞]. Let αn be causal
curves from p to q such that L(αn ) → τ(p, q) as n → +∞. These curves are all in
J + (p) ∩ J − (q) which is compact and strongly causal by global hyperbolicity. Hence,
by Lemma 10.10, there is a causal broken geodesic λ from p to q with L(λ) = τ(p, q).
Moreover, there can not be breaks by Lemma ??.
Lemma 10.14. Let (M , g ) be a globally hyperbolic Lorentzian manifold and let K 1 , K 2 ⊂
M be compact. Then J − (K 1 ) ∩ J + (K 2 ) is compact.
Proof. Recall that since M is metrizable (since M is a manifold), a subset is compact
iff it is sequentially compact. Let q n ∈ J − (K 1 ) ∩ J + (K 2 ). By assumption, there exist
p n,i ∈ K i , such that p n,2 ≤ q n ≤ p n,1 . Since the K i are compact, up to subsequences,
we can assume that p n,i → p i ∈ K i . Let r i ∈ M be such that r 2 << p 2 and p 1 << r 1 . Then,
r 2 << q n << r 1 for n large enough. Hence, q n ∈ J − (r 1 ) ∩ J + (r 2 ) for n large enough
and since this set is compact by global hyperbolicity, up to a subsequence q n converges to some q. Since the relation ≤ is closed on globally hyperbolic manifolds, we
conclude that p 2 ≤ q ≤ p 1 , i.e. q ∈ J − (K 1 ) ∩ J + (K 2 ).
10.7 Cauchy developments
Definition 10.11. If A is an achronal subset of M , the future Cauchy development of A
is the set D + (A) of all points p ∈ M such that every past-inextendible causal curve
through p meets A. (In particular A ⊂ D + (A).) Similarly, D − (A) is the set of all points
p ∈ M such that every future-inextendible causal curve through p meets A. The Cauchy
development of A is by definition D(A) = D + (A) ∪ D − (A).
In particular, if S is a Cauchy hypersurface, then D(S) = M .
We have easily
Lemma 10.15. D + (A) ⊂ A ∪ I + (A) ⊂ J + (A), D + (A) and I − (A) are disjoint, D + (A) ∩
D − (A) = A and D + (A) \ A = D(A) ∩ I + (A). Moreover, if α is a past-directed causal
curve starting in D + (A) and leaving D + (A), then it first leaves D + (A) through A.
Proof. We only prove the last claim. If α(s) ∉ D + (A), then there is a past-inextendible
causal curve β starting at α(s) that does not meet A. But α|[0,s] +β must meet A, thus
α|[0,s] must intersect A.
We will also need the following lemma, whose proof is left as an exercise.
Lemma 10.16. If A is achronal and p ∈ intD(A), then every inextendible causal curve
through p meets both I − (A) and I + (A).
(To get an idea why p must lie in the interior, just take A a closed horizontal interval in 2d Minkowski space and then take p on the boundary of D(A) in the future
of A. )
Lemma 10.17. Let (M , g ) be a globally hyperbolic spacetime and let S be a Cauchy
surface. Let p ∈ J + (S). Then, J − (p) ∩ J + (S) is compact.
Proof. Let x n be a sequence in J − (p) ∩ J + (S). If x n admits a subsequence converging
to p we are done, so we assume that it is not the case. In particular, we may assume
that x n ≠ p for all n. Let αn be past pointing causal curves from p to x n . Since no
subsequence of x n converges to p, there is a past directed limit sequence {p i } for
αn starting at p. If {p i } is infinite, then if λ is a quasi-limit, λ is a past inextendible
causal curve starting at p. Thus, by Lemma 10.16, it must intersect I − (S), which is a
contradiction since the αn lie in J + (S). Thus, the p i are finite and then some subsequence of the x n (which
are the endpoints of the αn ) must converge to the last p i , which, since all the J sets
are closed in a globally hyperbolic spacetime, must lie in J − (p) ∩ J + (S).
We also state without a proof
Theorem 10.1. If A is an achronal set, then intD(A), if non-empty, is globally hyperbolic.
The last theorem implies in particular
Corollary 10.4. A spacetime that admits a Cauchy hypersurface is globally hyperbolic.
10.8 Time and temporal functions
In the whole section, (M , g ) is a globally hyperbolic Lorentzian manifold.
The aim of the section is to prove the existence of certain special, globally defined, functions on globally hyperbolic Lorentzian manifolds. These functions will
have the property that each of their level sets is a Cauchy hypersurface. Moreover, we’ll require more and more structure on them to eventually construct a smooth
function, whose level sets are smooth spacelike Cauchy hypersurfaces and such that
its gradient is timelike, past directed. This will be our replacement for say, the function t in Minkowski space.
First, the functions that we will construct will lack regularity. We’ll then modify
them locally near any level set to eventually reach our goal.
10.8.1 Time function and Geroch’s theorem
Let M be oriented, so that we can define globally a volume form
$$\eta = \sqrt{|g|}\; dx^1 \wedge \dots \wedge dx^n .$$
Let $(U_i, \xi_i)_{i \ge 1}$ be an atlas of M compatible with its orientation (so that the Jacobian matrices of the changes of coordinates have positive determinant on overlaps)
and let $(\phi_i)$ be a smooth partition of unity subordinate to the $(U_i)$. Finally, let $m_i = \int_M \phi_i\, \eta$ be the integral
of $\phi_i$.
Then, define a volume form
$$\omega = \sum_{i=1}^{+\infty} \frac{1}{m_i\, 2^i}\, \phi_i\, \eta .$$
Let Λ be the linear functional that takes a smooth function f of compact support on
M to $\Lambda(f) = \int_M f\, \omega$.
By the Riesz representation theorem, there exists a positive measure µ and a σ-algebra of subsets of M containing the Borel sets of M , such that µ is complete and such
that for smooth functions of compact support,
$$\int_M f\, d\mu = \Lambda(f).$$
By construction, µ(M ) = 1, µ(U ) > 0 for any open non-empty set U and for any measurable set V , µ(V ) = 0 if and only if for all i , ξi (V ∩ Ui ) is of zero measure for the
Lebesgue measure.
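The normalisation µ(M ) = 1 can be read off from the weights 1/(m i 2^i ) in the definition of ω. The computation below is a sketch, assuming that each m i is the (finite, positive) integral of φ i against η and that sum and integral may be exchanged, which holds by monotone convergence since all terms are non-negative.

```latex
\[
\mu(M) = \int_M \omega
  = \sum_{i=1}^{+\infty} \frac{1}{m_i\,2^i} \int_M \phi_i\,\eta
  = \sum_{i=1}^{+\infty} \frac{m_i}{m_i\,2^i}
  = \sum_{i=1}^{+\infty} 2^{-i} = 1 .
\]
```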
Theorem 10.2 (Geroch’s theorem). There exists a continuous, onto, function τ : M →
R which is strictly increasing along any future oriented causal curve and such that if
γ is an inextendible causal curve defined on (t − , t + ), then τ(γ(t )) → ±∞, as t → t ± . In
particular, for any a ∈ R, the level set τ−1 (a) is an acausal Cauchy hypersurface.
Proof. The fact that if τ is such a function, then its level sets are Cauchy hypersurfaces
follows from the definition of a Cauchy hypersurface. Since the monotonicity of τ
holds for any causal curve (not just timelike curves), the level sets are acausal:
they can not be met more than once by causal curves.
Let µ be the measure constructed above. Let p ∈ M and let C p+ = J + (p) \ I + (p).
We claim that
µ(C p+ ) = 0.
Indeed, let q ∈ C p+ with q ≠ p. Then, there is a causal curve from p to q. Moreover, since M is
globally hyperbolic, there exists a causal geodesic connecting p to q of maximal length
and since q ∉ I + (p), it must be a null geodesic. Thus, there exists a v ∈ T p M which
is null and such that exp p (v) = q. Let N be the set of future directed null vectors in T p M
that lie in the domain of exp p . This is a submanifold of T p M of dimension
n − 1. Moreover, exp p is a smooth map from N to M and C p+ \ {p} ⊂ exp p (N ). An
application of Sard’s theorem then implies that µ(exp p (N )) = 0, hence C p+
has zero measure (a single point having zero measure).
Similarly, with C p− = J − (p) \ I − (p), we have µ(C p− ) = 0.
Let us define now f − (p) = µ[J − (p)](= µ[I − (p)]) and f + (p) = µ[J + (p)](= µ[I + (p)]).
Since I ± (p) are open, f ± (p) > 0, for all p ∈ M .
Step 1: f − and f + are continuous.
Let p j → p.
Assume first that p j << p. Since J − (p j ) ⊂ J − (p), we know that f − (p j ) ≤ f − (p).
Moreover, for all q ∈ M ,
$$f^-(q) = \int_M \chi_{I^-(q)}\, d\mu ,$$
where $\chi_B$ denotes the characteristic function of a set B. Since the relation << is open, it follows that $\chi_{I^-(p_j)}$ converges pointwise to $\chi_{I^-(p)}$ and
the result then follows by Lebesgue’s dominated convergence theorem.
If now p << p j , we have that $\chi_{J^-(p_j)}$ converges pointwise to $\chi_{J^-(p)}$ as a consequence of the fact that the relation ≤ is closed on globally hyperbolic manifolds and
again f − (p j ) → f − (p) as j → +∞.
For a general sequence of p j , let ε > 0 and take points q 1 and q 2 such that q 1 <<
p << q 2 and
f − (p) ≤ f − (q 2 ) ≤ f − (p) + ε,
f − (p) − ε ≤ f − (q 1 ) ≤ f − (p).
For j large enough, q 1 << p j << q 2 , so that
f − (p) − ε ≤ f − (q 1 ) ≤ f − (p j ) ≤ f − (q 2 ) ≤ f − (p) + ε.
This proves the continuity of f − .
Step 2: f ± are strictly monotone along causal curves.
Let p < q. Then q ∉ J − (p). (This follows from the strong causality condition).
Since ≤ is closed on globally hyperbolic spacetimes, J − (p) is closed and thus there
exists an open neighborhood U of q which does not intersect J − (p). It follows that
J − (q) \ J − (p) contains a non-empty open set and thus f − (q) > f − (p).
Step 3: For γ : [0, a) → M future inextendible, f + ◦ γ(t ) → 0 as t → a.
Let K ⊂ M be compact. We claim that for t close enough to a, K t = J + [γ(t )] ∩ J − (K ) is
empty. Note that K t is compact for every t (cf. Lemma 10.14). Let C = K 0 ∪ {γ(0)}.
Then C is a compact set and γ starts in C . From Lemma 10.9, there is a t 0
such that for t > t 0 , γ(t ) ∉ C . Since γ(t ) ∈ J + [γ(0)] for all t ≥ 0, we conclude that
γ(t ) ∉ J − (K ) for t > t 0 . Thus K t is empty for t > t 0 .
Exercise: prove that this implies step 3.
Step 4: The function τ : M → R defined as
τ(p) = ln f − (p) − ln f + (p)
then satisfies all the properties of the theorem.
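To see how Steps 1 to 3 combine, here is a sketch of the argument for Step 4. It uses the time-reversed versions of Steps 2 and 3 (f + strictly decreasing along future directed causal curves, f − ◦ γ → 0 towards the past end), and t 0 denotes an arbitrary fixed parameter value, a notation introduced here only for illustration.

```latex
\begin{align*}
p < q &\implies f^-(p) < f^-(q), \quad f^+(p) > f^+(q)
  \implies \tau(p) < \tau(q), \\
t \to t_+ &\implies f^+(\gamma(t)) \to 0, \quad
  f^-(\gamma(t)) \ge f^-(\gamma(t_0)) > 0 \ \ (t \ge t_0)
  \implies \tau(\gamma(t)) \to +\infty, \\
t \to t_- &\implies f^-(\gamma(t)) \to 0, \quad
  f^+(\gamma(t)) \ge f^+(\gamma(t_0)) > 0 \ \ (t \le t_0)
  \implies \tau(\gamma(t)) \to -\infty .
\end{align*}
```

Continuity of τ follows from Step 1, and since τ is continuous along an inextendible causal curve and tends to ±∞ at its two ends, τ takes every real value, so it is onto.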
A function which is strictly increasing along future directed causal curves will be
called a time function.
10.8.2 Local constructions
The time function we have just constructed still lacks regularity to be used in applications. It is only continuous. Moreover, even though it is strictly increasing along
future directed causal curves, its gradient may not be timelike. We would like to construct a smooth function τ such that Dτ is timelike and past-directed. We will call
such a function a temporal function. Temporal functions are time functions, but there
are examples of time functions which are not temporal.
We will need a few technical constructions.
First, we need the following lemma, whose classical proof is omitted.
Lemma 10.18. Let V ⊂ U ⊂ R n be open sets. Let f ∈ C ∞ (U ) be such that f (x) > 0 for
x ∈ V and f (x) = 0 for x ∈ ∂V ∩ U . Let g (x) = exp[−1/ f (x)], for x ∈ V and 0 elsewhere.
Then g is smooth.
Let now p ∈ M and consider a normal neighborhood U of p and $\tilde U = \exp_p^{-1}(U)$.
Recall the functions $\tilde q : v \mapsto g(v, v)$ and $q = \tilde q \circ \exp_p^{-1}$. Let $T_p^+$ be the future timecone at
p and let $V = \exp_p(T_p^+ \cap \tilde U)$. Note that q vanishes on ∂V ∩ U and q < 0 on V and thus we can
define a function
$$f_p(x) = \exp[1/q(x)]$$
for x ∈ V and f p (x) = 0 elsewhere. f p is smooth by the above lemma (applied to −q). Moreover, on
V,
$$\operatorname{grad} f_p = -\frac{1}{q^2}\, f_p \operatorname{grad} q = -\frac{2}{q^2}\, f_p\, P,$$
where P is the position vector field. Thus, grad f p is a past directed timelike vector
field on V and is zero everywhere else.
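As a sanity check of the gradient formula, one can specialise to flat 1+1 dimensional Minkowski space with p the origin, where exp p is the identity and q(t , x) = −t² + x². The sketch below compares a finite-difference gradient with the closed form −(2/q²) f p P at one sample point; the flat setting, the sample point, the step size and all function names are assumptions made for illustration.

```python
import math

def q(t, x):
    """Minkowski quadratic form q = -t^2 + x^2 of the position vector (1+1 dimensions)."""
    return -t * t + x * x

def f(t, x):
    """exp(1/q) inside the future timecone of the origin, 0 elsewhere."""
    if t > 0.0 and q(t, x) < 0.0:
        return math.exp(1.0 / q(t, x))
    return 0.0

def grad_f_numeric(t, x, h=1e-6):
    """Metric gradient by central differences; the index is raised with eta = diag(-1, 1)."""
    df_dt = (f(t + h, x) - f(t - h, x)) / (2.0 * h)
    df_dx = (f(t, x + h) - f(t, x - h)) / (2.0 * h)
    return -df_dt, df_dx

def grad_f_formula(t, x):
    """Closed form -(2/q^2) f P, with P = (t, x) the position vector field."""
    c = -2.0 * f(t, x) / q(t, x) ** 2
    return c * t, c * x

# Sample point inside the future timecone (arbitrary illustrative choice).
point = (1.0, 0.3)
gn = grad_f_numeric(*point)
gf = grad_f_formula(*point)
agrees = all(abs(a - b) < 1e-6 for a, b in zip(gn, gf))
is_timelike = q(*gf) < 0.0        # the gradient vector has negative Minkowski norm
is_past_directed = gf[0] < 0.0    # its time component is negative
print(agrees, is_timelike, is_past_directed)
```

In this flat model the gradient is a negative multiple of the future directed timelike vector P = (t , x), which is why it comes out timelike and past directed.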
Lemma 10.19. Let S be an acausal Cauchy hypersurface. Let p ∈ S and let U be a
convex neighborhood of p. Then, there is a smooth non-negative function h p such
that h p (p) = 1, h p has compact support contained in U , and if r ∈ J − (S) and h p (r ) ≠ 0,
then grad h p (r ) is past directed timelike.
Proof. Let γ : (−ε, ε) → M be a future directed timelike curve with γ(0) = p. We claim
that for t < 0 close enough to 0, K t = J + (γ(t )) ∩ J − (S) is contained in U . First, the K t are compact by
Lemma 10.17 (with the time orientation reversed). Moreover, for s < t , K t ⊂ K s , i.e. the K t are decreasing in t . Assume
that for all t < 0, K t is not included in U . Then, there exists an increasing sequence
s j < 0, s j → 0, and points p j ∈ K s j ∩ U c for all j . Since p j ∈ K s 1 for all j , we can pass to a
subsequence converging to some r , and since U is open, U c is closed, so that r ∈ K s j ∩ U c for all j .
Since γ(s j ) ≤ p j and ≤ is closed, we have p ≤ r , and p ≠ r (since r ∈ U c and
p ∈ U ), so that p < r . Since r ∈ J − (S) and p ∈ S, this contradicts the acausality of S. We conclude
that for t < 0 close enough to 0, K t ⊂ U . Let t 0 be such that K t 0 ⊂ U and let r 0 = γ(t 0 ). Since
U is convex, we can construct the function f r 0 as introduced above, which has a past
directed timelike gradient wherever it is non-zero. Let K ⊂ U be compact and containing K t 0
in its interior and let φ ∈ C c∞ (U ) be a cutoff function associated to K , i.e. φ = 1 on K
and φ(U ) ⊂ [0, 1]. Let H = φ f r 0 . It is a smooth function with compact support in U and
can therefore be extended by 0 outside of U to a smooth function on M . Note that by
definition of the function f r 0 (see the definition of the set V above), if r ∉ J + (γ(t 0 )),
then f r 0 (r ) = 0. Thus, if r ∈ J − (S) and H (r ) ≠ 0, then r ∈ K t 0 and H = f r 0 on a neighborhood
of r , so that grad H is past pointing and timelike at r . Finally, note that H (p) > 0, since
p >> r 0 and φ(p) = 1. Thus, h p = H /H (p) has all the required properties.
Proposition 10.7. Let S be an acausal Cauchy hypersurface. Given an open neighborhood W of S, there exists a smooth function h : M → [0, +∞) such that
1. The support of h is included in W ,
2. h(p) > 1/2, for p ∈ S,
3. For r ∈ J − (S) with h(r ) ≠ 0, grad h(r ) is past pointing timelike.
Remark 10.2. Compared with the previous lemma, we are constructing a function h
whose support and properties live in a neighborhood of S instead of a neighborhood
of some p ∈ S. Thus, we are gradually moving from local to more global properties.
Proof. Let d be the distance associated to a complete Riemannian metric on M .
Then, B ρ (p), the closed ball of radius ρ centered at p, is compact due to the HopfRinow theorem. Let p ∈ M , and define for l ∈ N,
K l = B l (p) − B l −1 (p),
R l = K l ∩ S.
For each r ∈ S, fix a convex set Ur of diameter strictly less than 1 for d and contained
in W . Let h r be the function constructed in the previous lemma for (U , p) = (Ur , r ).
Let Vr = h r−1 [(1/2, +∞)]. Since R l is compact and the Vr are an open covering of S,
for any l, there exists a finite number of points, denoted r_{l,1}, .., r_{l,k_l} ∈ R_l, such that the
corresponding V_{r_{l,1}}, .., V_{r_{l,k_l}} form an open cover of R_l. Note that if |l − m| ≥ 3, then
U_{r_{l,i}} ∩ U_{r_{m,j}} = ∅, in view of the diameter constraint and the fact that d(r_{l,i}, r_{m,j}) ≥ 2.
Thus, the sum
\[ h = \sum_{l=1}^{+\infty} \sum_{i=1}^{k_l} h_{r_{l,i}} \]
defines a smooth function since each point has a neighborhood in which all but a
finite number of the terms vanish. If x ∈ S, then h(x) > 1/2 since x must lie in some
V_{r_{l,i}}. Further, the support of h is the union of the supports of the h_{r_{l,i}} (i.e. we do not need
to take a further closure), since the covering U_{r_{l,i}} is locally finite. Thus, the support
of h is included in W. If x ∈ J^-(S) and h(x) ≠ 0, then grad h_{r_{l,i}}(x) is timelike and
past directed for every l, i such that h_{r_{l,i}}(x) ≠ 0 and vanishes for every l, i such that
h_{r_{l,i}}(x) = 0 (since h_{r_{l,i}} ≥ 0, so that in that case x must be a minimum), so grad h(x) is
timelike and past directed.
10.8.3 Smooth temporal function
Let τ be the continuous time function guaranteed to exist by Geroch’s theorem. We
will denote its level set by S t := τ−1 (t ).
The idea will be to use the previous constructions to construct a replacement for
τ near some S t , and then glue the replacements to obtain a global temporal function.
Definition 10.12. Let t_- < t_a < t < t_b < t_+ and S_± = S_{t_±}. A function σ : M → R
is called a temporal step function around t (for (t_-, t_a, t, t_b, t_+)) if it satisfies
1. grad σ is timelike and past pointing in V = {p ∈ M : grad σ(p) ≠ 0},
2. σ(M) ⊂ [−1, 1],
3. σ = ±1 on J^±(S_±),
4. S_{t'} ⊂ V for all t' ∈ (t_a, t_b).
Lemma 10.20. Let t_- < t < t_+. Then, there exists an open set U such that
J − (S t ) ⊂ U ⊂ I − (S t + )
and a function h + : M → [0, +∞) whose support is contained in I + (S t − ) and such that
• If p ∈ U and h + (p) > 0, then grad h + (p) is timelike and past pointing,
• h + (p) > 1/2 for p ∈ J + (S t ) ∩U .
Proof. Take h_+ to be the function h constructed in the previous Proposition with S = S_t
and W = I^-(S_{t_+}) ∩ I^+(S_{t_-}). For x ∈ S_t, h(x) > 1/2 and grad h(x) is past pointing and
timelike. By continuity, there exists an open neighborhood V_x of x such that the
same conditions on h and grad h hold in V_x. Then define U as the union of I^-(S_t)
and the sets V_x for all x ∈ S_t. (That J^-(S_t) ⊂ U is a consequence of the fact that S_t is a
Cauchy hypersurface.)
Lemma 10.21. Let t < t + and U ⊂ I − (S t+ ) be an open neighborhood of J − (S t ). Then,
there exists a smooth function h − : M → [−1, 0] such that
• the support of h − is included in U ,
• if grad h_-(p) ≠ 0, then it is timelike and past directed,
• h_-(p) = −1 for p ∈ J^-(S_t).
Proof. Reverse time orientation and construct a function h as in Proposition 10.7.
Then we get a smooth function h : M → [0, +∞) whose support is included in U
and such that if h(p) > 0 for p ∈ J + (S t ), then grad h(p) is timelike, future pointing.
Moreover, h(p) > 1/2 for p ∈ S_t. Let h_1 = −h and let φ : R → [−1, 0] be a smooth
cutoff function such that φ = −1 on (−∞, −1/2], φ' > 0 on (−1/2, 0), φ = 0 on [0, +∞).
Define h_- = φ ∘ h_1 on J^+(S_t) and h_- = −1 on J^-(S_t). Then h_- has all the desired
properties.
Proposition 10.8. Let t_- < t < t_+. Then, there exists a smooth function σ : M → R satisfying the first three properties of Definition 10.12 and such that S_t ⊂ {p ∈ M : grad σ(p) ≠ 0}.
Proof. Let h + and U be as in Lemma 10.20 and given this U , let h − be as in Lemma
10.21. Then, h + − h − > 1/2 on U (since for p ∈ U , either p ∈ J + (S t ) or p ∈ J − (S t )).
Thus, we can define σ as
\[ \sigma = 2\,\frac{h_+}{h_+ - h_-} - 1 \]
on U and σ = 1 on the complement of U . We have
\[ \operatorname{grad} \sigma = 2\,\frac{h_+ \operatorname{grad} h_- - h_- \operatorname{grad} h_+}{(h_+ - h_-)^2} \]
which at any given point is past pointing timelike or zero.
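The sign claim can be checked by writing out the quotient rule. The following sketch uses only the properties stated in Lemmas 10.20 and 10.21 (h_+ ≥ 0, −1 ≤ h_- ≤ 0 and h_+ − h_- > 1/2 on U); it is meant as a verification aid, not as part of the original argument.

```latex
% Quotient rule for \sigma = 2 h_+/(h_+ - h_-) - 1 on U.
\begin{align*}
\operatorname{grad}\sigma
 &= 2\,\frac{(h_+ - h_-)\operatorname{grad} h_+
      - h_+\left(\operatorname{grad} h_+ - \operatorname{grad} h_-\right)}{(h_+ - h_-)^2} \\
 &= \frac{2}{(h_+ - h_-)^2}
    \Big( \underbrace{h_+}_{\ge 0}\,\operatorname{grad} h_-
      \;+\; \underbrace{(-h_-)}_{\ge 0}\,\operatorname{grad} h_+ \Big).
\end{align*}
% On U, each of grad h_+ and grad h_- is either zero or past pointing timelike
% (grad h_+ vanishes where h_+ = 0, since h_+ >= 0 attains a minimum there).
% A combination of such vectors with non-negative coefficients is again zero
% or past pointing timelike, because the past timelike cone is convex.
```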
Corollary 10.5. Let t ∈ R and t_± = t ± 1. Let t_- < t_a < t < t_b < t_+ and let K ⊂ τ^{-1}([t_a, t_b])
be compact. Then, there is a smooth function σ satisfying the first three properties of
Definition 10.12 and such that K ⊂ {p ∈ M : grad σ(p) ≠ 0}.
Proof. For each s ∈ [t_a, t_b], let σ_s be the function constructed in Proposition 10.8 with
t_-, t, t_+ replaced by t − 1, s, t + 1 and let V_s be the set on which grad σ_s is non-zero (i.e.
the interior of the support of grad σ_s). Then, the sets V_s constitute an open covering
of K. Let s_1, .., s_k be such that the V_{s_i} form a finite open covering of K and define
\[ \sigma = \frac{1}{k} \sum_{i=1}^{k} \sigma_{s_i}. \]
Then, σ has the desired properties.
In the proof of the next theorem we will need the following lemma, whose proof
is left as an exercise. Note that the reason why this is not a triviality is that a limit of
future-directed timelike vectors can be 0 or a null vector.
Lemma 10.22. Let v_i be a sequence of timelike vectors with the same time orientation such that Σ_i v_i is convergent.
Then, Σ_{i=1}^∞ v_i is timelike.
Theorem 10.3 (Existence of temporal step function). Let t ∈ R and let t ± = t ± 1.
Given t_- < t_a < t < t_b < t_+, there is a temporal step function around t as in Definition
10.12.
Proof. Let G j be an increasing sequence of open sets of compact closure and such
that M = ∪ j G j . Let
\[ K_j = \overline{G_j} \cap J^+(S_{t_a}) \cap J^-(S_{t_b}). \]
K_j is compact and can be used to get a function σ_j as in Corollary 10.5. We will
take a sum of the σ_j, but in order for the sum to converge we will need to divide
each σ_j by increasing weights. To define those, let (V_i, ⁱξ)_{1≤i} be an open covering by
coordinate charts. (Note that i in ⁱξ labels the coordinate chart, not the individual
coordinate functions.) We may assume that the V_i are locally finite and that there
exists another open cover by open sets U_i such that the closure of U_i lies in V_i for all i and the U_i have
compact closure. (Exercise: construct such open covers.)
For any j, let A_j > 1 be such that for all 1 ≤ i ≤ j, 0 ≤ m ≤ j and for all multi-indices |β| ≤ m,
\[ \sup_{U_i} \left| \frac{\partial^{\beta} \sigma_j}{\partial^{m}\, {}^{i}\xi} \right| \le A_j, \]
i.e. the partial derivatives of order at most m of σ_j are uniformly bounded by A_j on U_i for
the first j coordinate systems.
Let now σ be defined by
\[ \sigma = \sum_{j=1}^{+\infty} \frac{1}{2^j A_j}\, \sigma_j. \]
Let p ∈ M. Then p ∈ U_i for some i. To prove that σ is C^l, split the series into the terms with j > max(i, l) and
those with j ≤ max(i, l). The latter are finite in number, so pose no threat to regularity.
For j > max(i, l), σ_j/(2^j A_j) and all its derivatives of order ≤ l with respect to the coordinates
ⁱξ are bounded by 1/2^j on U_i, so we have uniform convergence. It follows that σ is smooth.
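The uniform convergence is an instance of the Weierstrass M-test. As a sketch, for a fixed multi-index β with |β| ≤ l, and assuming the weights A_j are chosen as described above, the tail of the differentiated series can be bounded as follows.

```latex
% Tail of the series on U_i, for a fixed multi-index \beta with |\beta| \le l.
% For j > \max(i, l), chart i is among the first j charts and |\beta| \le j,
% so the defining bound on A_j applies to \partial^\beta \sigma_j on U_i.
\[
\sum_{j > \max(i,l)} \sup_{U_i}
  \left| \partial^{\beta}\!\left( \frac{\sigma_j}{2^j A_j} \right) \right|
\;\le\; \sum_{j > \max(i,l)} \frac{A_j}{2^j A_j}
\;=\; \sum_{j > \max(i,l)} 2^{-j}
\;\le\; 1 .
\]
```

Since this holds for every l and every chart, term-by-term differentiation is justified to all orders.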
We still need to normalize σ to satisfy all properties of Definition 10.12. For this, note that
σ = σ_- for some constant σ_- < 0 on J^-(S_-) (since all the σ_j = −1 there) and similarly
σ = σ_+ > 0 on J^+(S_+). Let ψ : R → [−1, 1] be smooth and such that ψ(t) = −1 for
t ≤ σ_-, ψ(t) = 1 for t ≥ σ_+ and ψ' > 0 for t ∈ (σ_-, σ_+). Then ψ ∘ σ has all the desired
properties.
Theorem 10.4. There exists a smooth time function T : M → R, whose gradient is
timelike and past pointing and such that T tends to ±∞ along any future/past inextensible
causal curve. In particular, the level sets of T are smooth spacelike Cauchy hypersurfaces.
Proof. Let t_k = k/2, t_{k,±} = t_k ± 1, t_{k,a} = t_k − 1/2, t_{k,b} = t_k + 1/2. Let σ_k be the
function just constructed, with t = t_k, t_± = t_{k,±}, t_a = t_{k,a}, t_b = t_{k,b}. Note that each
p ∈ M is such that τ(p) ∈ (t k,a , t k,b ) for some k. Define
\[ T = \sigma_0 + \sum_{k=1}^{+\infty} (\sigma_{-k} + \sigma_k). \]
Note that if k ≥ 3, then (σ_{-k} + σ_k)(p) = 0 if −k/2 + 1 ≤ τ(p) ≤ k/2 − 1 (because σ_{-k}(p) = 1,
σ_k(p) = −1 on this interval). Thus, for any p ∈ M, there is a neighborhood of p
such that only a finite number of the terms in the above sum are non-zero on that
neighborhood. It follows from the definition and the fact that each p is such that
τ(p) ∈ (t k,a , t k,b ) for some k that the gradient of T is timelike and past directed. Let
now γ : (t_-, t_+) → M be a future directed inextensible causal curve. From the gradient
properties of T, T ∘ γ is strictly increasing. Let m ≥ 1. Since γ intersects
each Cauchy hypersurface S_t, there exists s_m ∈ (t_-, t_+) such that τ(γ(s_m)) = m. Let
l = 2(m + 1). We have (σ_{-k} + σ_k)[γ(s_m)] = 0 for k ≥ l. Thus,
\[ T(\gamma(s_m)) = \left[ \sigma_0 + \sum_{k=1}^{l} (\sigma_{-k} + \sigma_k) \right](\gamma(s_m)). \]
Since m ≥ 1, σ_0[γ(s_m)] = 1. Furthermore, (σ_{-k} + σ_k)(γ(s_m)) ≥ 0 for all k ≥ 1, since
σ_{-k}(γ(s_m)) = 1 and σ_k(γ(s_m)) ≥ −1. Finally, for 1 ≤ k ≤ 2(m − 1), we have
\[ (\sigma_{-k} + \sigma_k)(\gamma(s_m)) = 2 \]
and thus
\[ T(\gamma(s_m)) \ge 4(m - 1) + 1. \]
Combined with the monotonicity of T ∘ γ, we have obtained T(γ(s)) →
+∞, as s → t_+.
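The lower bound is a matter of counting terms. As a sketch, splitting the finite sum at k = 2(m − 1), with l = 2(m + 1) as in the proof:

```latex
\begin{align*}
T(\gamma(s_m))
 &= \underbrace{\sigma_0(\gamma(s_m))}_{=\,1}
  + \sum_{k=1}^{2(m-1)} \underbrace{(\sigma_{-k}+\sigma_k)(\gamma(s_m))}_{=\,2}
  + \sum_{k=2m-1}^{2(m+1)} \underbrace{(\sigma_{-k}+\sigma_k)(\gamma(s_m))}_{\ge\,0} \\
 &\ge 1 + 2\cdot 2(m-1) = 4(m-1)+1 .
\end{align*}
% Example: m = 3 gives T(\gamma(s_3)) \ge 9.
```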
11 The domain of dependence in globally hyperbolic spacetime
The aim of this section is to prove the following statement.
Theorem 11.1 (Domain of dependence). Let (M, g) be a globally hyperbolic spacetime and let t be a temporal function as in Theorem 10.4. Let S = t^{-1}(0) and let Ω ⊂ S
be open. Let u be a solution to □_g u = 0 such that u and Du vanish on Ω. Then
u = 0 on D^+(Ω). In particular, taking Ω = S, this proves uniqueness of
the solution to the linear wave equation.
Before proving the theorem, we will need a few more geometric statements, as
well as a local version of it.
11.1 Three extra geometric lemmas
Lemma 11.1. Let (M , g ) be a smooth globally hyperbolic spacetime with a smooth
spacelike Cauchy surface S. Then, for every p ∈ S, there is a neighborhood U of p
such that for every q ∈ U in J + (S) there is a normal neighborhood Vq of q such that
J − (q) ∩ J + (S) is compact and contained in Vq .
We leave its proof as an exercise, but note that one cannot just take any convex
neighborhood of p, for the condition J^-(q) ∩ J^+(S) ⊂ V_q is not always satisfied. (For
a full proof, consult Ringström [Rin09], p. 135.)
We will also need the following two lemmas (again, for proofs, see Ringström [Rin09]).
Lemma 11.2. Let N_1 and N_2 be two n dimensional spacelike submanifolds. Assume that
for each p ∈ N1 ∩ N2 , the normals to N1 and N2 are linearly independent. Then, N1 ∩
N2 is an n − 1 dimensional spacelike submanifold.
Lemma 11.3. Let N1 and N2 be two n dimensional spacelike submanifolds of (U ⊂
Rn , g ), for some Lorentzian metric g defined on U . Assume N1 and N2 are as in the
previous lemma, let P = N_1 ∩ N_2 and assume P is compact. Let n_1 and n_2 be the
normals to N_1 and N_2. Then, their restrictions to P give two vector fields on P such
that at every p ∈ P, n_1(p), n_2(p) ∈ (T_p P)^⊥. Define
\[ f : P \times \mathbb{R}^2 \to \mathbb{R}^n \]
by
\[ f(p, t, s) = p + t\, n_1(p) + s\, n_2(p). \]
Then f is smooth and there is an ε > 0 such that f restricted to P × B_ε(0) is a diffeomorphism onto a neighborhood of P.
11.2 A local version of the domain of dependence property
Lemma 11.4. Let S be a Cauchy surface and p ∈ J^+(S). Assume that there is a normal
neighborhood V_p of p such that J^-(p) ∩ J^+(S) is compact and contained in V_p. Then, if
u solves the wave equation
\[ \Box_g u = 0 \]
and u and Du vanish on J^-(p) ∩ S, u = 0 on J^-(p) ∩ J^+(S).
Proof. Let Ṽ_p = exp_p^{-1}(V_p). We consider at p the hyperquadric q̃^{-1}(c) ⊂ T_p M, for c <
0. It has two connected components, and we consider the one corresponding to the
past directed timelike vectors, denoted Q̃ c . Let Q c be the image by the exponential
map of Q̃ c ∩ Ṽp . By construction Q c ⊂ I − (p). Similarly, we denote by Q 0 the past
directed local null cone at p. Let D = J − (p) ∩ J + (S) and D c = J − (Q c ) ∩ J + (S). We
leave as an exercise to verify that the following statements hold (draw a picture!):
1. D_c = ∪_{γ≤c} Q_γ ∩ J^+(S),
2. The interior of D_c is I^-(Q_c) ∩ I^+(S) and its boundary is (Q_c ∩ J^+(S)) ∪ (J^-(Q_c) ∩ S).
3. If c_l → 0, c_l < 0 for all l, then int D ⊂ ∪_l D_{c_l} ⊂ D.
Let ρ be a Riemannian metric on V and let d be the associated topological metric. Let ε > 0 and define
\[ R_\varepsilon = \{ r \in S \cap D : d(r, Q_0) < \varepsilon \}. \]
Then, R_ε is an open neighborhood of Q_0 ∩ S in S ∩ D. Let L_ε = (S ∩ J^-(p)) \ R_ε. Since
D is compact and S is closed, D ∩ S is compact and hence L_ε is compact.
Moreover, for every r ∈ L_ε, exp_p^{-1}(r) is timelike (because r does not belong to
Q_0 but lies in J^-(p)). Consequently, exp_p^{-1}(L_ε) is a compact subset of the interior
of the past light cone in T_p M. Hence, for c < 0 small enough, L_ε does not intersect
Q_c, so that Q_c and S must intersect in R_ε. Let T be a smooth unit normal to S. Then,
T is timelike. For ε > 0 small enough, P and T are linearly independent in R_ε. Indeed, since P and T
are non-zero vector fields on R_ε and R_ε has compact closure, ρ(T, T) and ρ(P, P) are uniformly
bounded above and below on R_ε. On the other hand, g(T, T) = −1 and g(P, P) tends
to zero as ε → 0. This implies the linear independence.
Consequently, for c < 0, every point in Q c ∩ S is such that the normal to Q c and
S at that point are linearly independent. Since Q c and S are smooth spacelike n − 1
dimensional manifolds, Q c ∩S is a smooth n−1 dimensional submanifold. Moreover
it is compact (exercise).
Let u be the solution assumed to exist in the statement of the lemma. Let T be
its energy-momentum tensor, i.e.
\[ T_{\alpha\beta} = D_\alpha u\, D_\beta u - \frac{1}{2} g_{\alpha\beta}\, g(Du, Du). \]
Let f = −q/2 and N = −P, where P is the position vector field. Note that grad f = N.
Let J := {}^N J, with J_α = T_{αβ} N^β, be the current associated to N and let
\[ \eta = -e^{kf} |u|^2 N \]
and
\[ \tilde{J} = e^{kf} J \]
be a weighted version of it. (The weights allow us to absorb some error terms; otherwise,
we would need an extra Gronwall argument to conclude.) The constant k will be
chosen later.
Note that in D_c, N is a future directed timelike vector field and that on Q_c ∩ I^+(S),
it is the outward pointing normal relative to D_c. Recall that
\[ D^\alpha T_{\alpha\beta} = \Box_g u \cdot D_\beta u \]
and
\[ \operatorname{div} J = \Box_g u \cdot N(u) + T_{\alpha\beta}\, {}^{N}\pi^{\alpha\beta}. \]
Let
\[ e = \frac{1}{2} |u|^2 + \sum_\alpha |\partial_\alpha u|^2. \]
Then,
\[ |\operatorname{div} J| \le C e, \]
for some C > 0.
We also have
\[ \operatorname{div} \eta = -2 e^{kf} u N(u) - e^{kf} |u|^2 \operatorname{div} N - k e^{kf} |u|^2 g(N, N). \]
Since N is timelike on D_c, there exists a constant c_0 > 0 such that
\[ \operatorname{div} \eta \ge e^{kf} \left( k c_0 |u|^2 - C e \right). \]
Moreover,
\[ \operatorname{div} \tilde{J} = e^{kf} \left( \operatorname{div} J + k\, T(N, N) \right) \]
and again, since N is timelike on D_c (uniformly), there exists a constant c_1 such
that
\[ c_0 |u|^2 + T(N, N) \ge c_1 e. \]
Hence,
\[ \operatorname{div} \eta + \operatorname{div} \tilde{J} \ge e^{kf} (k c_1 - C) e. \]
For k large enough, the left hand side is therefore positive and controls e^{kf} e. Moreover,
\[ \frac{g(N, \tilde{J})}{g(N, N)} = \frac{e^{kf}\, T(N, N)}{g(N, N)}, \qquad \frac{g(N, \eta)}{g(N, N)} = -e^{kf} |u|^2. \]
Both of these quantities are non-positive. Recall that by Stokes' theorem, if (M, g)
is a Lorentzian manifold with spacelike boundary ∂M and if N is the outward unit normal, then for any smooth vector field ξ with compact support,
\[ \int_M \operatorname{div} \xi = \int_{\partial M} \frac{g(\xi, N)}{g(N, N)}, \]
where both integrals are taken with respect to the natural (induced) volume forms
on M and ∂M . If we could apply this directly in our setting we would be done, since
both sides of the equation would have different signs.
The trouble is that our domain D_c has corners at the intersection of Q_c with S,
so we are not exactly in a setting where the above applies.
We will use Lemma 11.3 above. Let T be a unit normal to S and let
\[ R := P - \frac{g(P, T)}{g(T, T)}\, T. \]
Using the lemma, we get a smooth map
\[ h : (Q_c \cap S) \times B_\varepsilon(0) \to V \]
which is a diffeomorphism onto its image, and whose image contains an open neighbourhood of
Q_c ∩ S, assuming that ε > 0 is small enough. Let χ ∈ C_c^∞(R^2) be such that χ(x) = 1 for
|x| ≤ 1/2 and χ(x) = 0 for |x| ≥ 3/4 and let χ_δ(x) = χ(x/δ). For δ ≤ ε, we can consider
the function
\[ \varphi_\delta : (Q_c \cap S) \times B_\varepsilon(0) \to \mathbb{R}, \qquad (p, x) \mapsto \varphi_\delta(p, x) := \chi_\delta(x). \]
Then, ψ_δ = φ_δ ∘ h^{-1} is a smooth function defined on some neighbourhood of Q_c ∩ S and
we can extend it by 0 outside to get a smooth function on V. Moreover, the volume
of the support of ψ_δ can be estimated by C δ^2 for some constant C and
\[ |\partial \psi_\delta| \le C \delta^{-1}. \]
Now D'_c = D_c \ (S ∩ Q_c) is a smooth manifold with boundary and (1 − ψ_δ)X has
compact support on this manifold. Applying the above divergence theorem, we get
\[ \int_{D'_c} \operatorname{div}\big( (1 - \psi_\delta) X \big) = -\int_{D'_c} X^\alpha \partial_\alpha \psi_\delta + \int_{D'_c} \operatorname{div} X - \int_{D'_c} \psi_\delta \operatorname{div} X. \]
The first and the last terms converge to 0 as δ → 0 (the first is bounded by C δ^{-1} times the volume C δ^2 of the support of ψ_δ), so that the left hand side converges to the integral of the
divergence of X on D_c. The boundary integrals similarly converge to the expected limits.
It follows that the above divergence formula is valid even in the presence of
"corners", which concludes the proof.
11.3 Proof of the domain of dependence
We now move on to the proof of the theorem.
Proof. Let p ∈ int(D^+(Ω)). We have K = J^-(p) ∩ D^+(Ω) compact by a variant of
Lemma 10.17 (cf. Lemma 40, p. 423 of O'Neill [O'N83]). For any interval I and t_0 ∈ R, we define
R I = t −1 (I ) ∩ K ,
R t0 = t −1 (t 0 ) ∩ K ,
S t0 = t −1 (t 0 ).
Note that if I is compact then so is R_I, and that R_{t_0} and S_{t_0} are compact. Let I_ε =
[t_0 − ε, t_0 + ε]. For any t_0 such that R_{t_0} is non-empty, pick an open set U containing R_{t_0}. Then (by
compactness of R_{I_ε}), there exists ε > 0 such that R_{I_ε} ⊂ U. Let now 0 ≤ t_0 ≤ T :=
t(p). For any q ∈ R_{t_0}, there is an open neighborhood U_q as in Lemma 11.1
(using S_{t_0} as a Cauchy surface). By compactness of R_{t_0}, there is a finite subcovering
U_{q_1}, .., U_{q_l}. Let U be their union and let ε be small enough so that R_{I_ε} ⊂ U. Assume
that u and Du vanish on R_{t_0}. Let q ∈ R_s for s ∈ [t_0, t_0 + ε]. Then there is a normal
neighborhood of q such that the conditions of Lemma 11.4 hold, which implies that
u and Du vanish on R_s. Thus, the set of s ∈ [0, T) such that u and Du equal zero
on R_{[0,s]} is non-empty, open and closed in [0, T), which proves the theorem.
12 Symmetric hyperbolic systems
There are many ways to solve linear wave equations. Here, we will follow a standard
method which relies essentially on energy estimates and a duality argument.
We consider an equation of the form
\begin{align}
L(u) := A^\mu \partial_\mu u + B u &= f, \tag{7} \\
u(0, \cdot) &= u_0, \tag{8}
\end{align}
where u is an R^N valued function defined on R^{n+1} (or a subset), the initial data u_0
is taken smooth for simplicity, f is smooth and decays fast at infinity in x (say f(t, .)
is Schwartz for all t) and the A^μ are smooth N × N matrix valued functions defined
on R^{n+1} such that A^μ is symmetric for each μ and A^0 is positive definite with a
uniform lower bound, i.e. there exists c > 0 such that
\[ A^0(\xi, \xi) > c |\xi|^2, \qquad \forall \xi \in \mathbb{R}^N. \]
L is then said to be symmetric hyperbolic.
Recall that H^k = H^k(R^n) is the space of tempered distributions u on R^n such that
û is a measurable function (we identify the function and its distribution) satisfying
\[ \|u\|_{H^k} := \big\| (1 + |\xi|^2)^{k/2} |\hat{u}| \big\|_{L^2} < \infty. \]
H k with the norm ||.||H k is a Hilbert space. Recall also that for k ≥ 0, we can identify
H k with the space of L 2 functions which have k weak derivatives in L 2 .
Recall also that (H k )∗ can be identified with H −k in the standard way.
We will need the following lemma, whose proof is left as an exercise (hint: use
duality).
Lemma 12.1. If u is Schwartz and φ is C^∞ with uniform bounds on φ and its derivatives, then
\[ \|\varphi u\|_{H^k} \lesssim \|u\|_{H^k}, \]
even for negative k.
Lemma 12.2 (Estimates for k ≥ 0). Let u be a solution of (7) which is smooth on
[0, T ] × Rn and such that, for k ≥ 0, u(t , .) ∈ H k , for all t ∈ [0, T ]. Let E k be defined by
\[ E_k[u] = \frac{1}{2} \sum_{|\alpha| \le k} \int_{\mathbb{R}^n} A^0(\partial^\alpha u, \partial^\alpha u)\, dx. \]
Then, we have the following estimate
\[ E_k^{1/2}(t) \lesssim E_k^{1/2}(0) + \int_0^t \|f(s, \cdot)\|_{H^k}\, ds. \]
Proof. We do the proof for k = 0, the other follows easily after commuting the equation by ∂α and noting that the resulting equation can be put under the same form.
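We have
\begin{align*}
\frac{d}{dt} \frac{1}{2} \int A^0(u, u)\, dx &= \frac{1}{2} \int \partial_t A^0(u, u)\, dx + \int A^0(\partial_t u, u)\, dx \\
&= \frac{1}{2} \int \partial_t A^0(u, u)\, dx - \int u^t \left( A^i \partial_i u + B u - f \right) dx \\
&= \frac{1}{2} \int \partial_t A^0(u, u)\, dx + \int \left( \frac{1}{2} u^t \partial_i(A^i) u - (B u, u) + (f, u) \right) dx,
\end{align*}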
so that the lemma follows from the lower bounds on A 0 and the upper bounds on A
and f and B .
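The passage from this identity to the stated estimate can be sketched as follows for k = 0. The constants C_1, C_2 are names introduced here for illustration; they depend only on the lower bound c on A^0 and on the upper bounds on ∂A and B. Where E_0 vanishes, one can replace E_0 by E_0 + ε and let ε → 0.

```latex
% Step 1: bound the right-hand side of the identity, using
% c \|u\|_{L^2}^2 \le \int A^0(u,u)\,dx = 2E_0 and Cauchy-Schwarz.
\[
\frac{d}{dt} E_0 \;\le\; C_1\, E_0 + C_2\, \|f(t,\cdot)\|_{L^2}\, E_0^{1/2}.
\]
% Step 2: divide by 2 E_0^{1/2}.
\[
\frac{d}{dt} E_0^{1/2} \;\le\; \frac{C_1}{2}\, E_0^{1/2} + \frac{C_2}{2}\, \|f(t,\cdot)\|_{L^2}.
\]
% Step 3: multiply by the integrating factor e^{-C_1 t/2} and integrate on [0,t].
\[
E_0^{1/2}(t) \;\le\; e^{C_1 t/2} \left( E_0^{1/2}(0) + \frac{C_2}{2} \int_0^t \|f(s,\cdot)\|_{L^2}\, ds \right).
\]
```

On a bounded time interval [0, T] the exponential factor is absorbed in the implicit constant of the lemma.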
Corollary 12.1. We have uniform ||.||H k bounds for k ≥ 0.
Lemma 12.3 (Estimates for k < 0). Let u be a solution of (7) which is smooth on
[0, T] × R^n, satisfying uniform Schwartz bounds
\[ (1 + |x|^i)\, |\partial^\beta u| \le C_{i,\beta}. \]
Let k < 0 be a negative integer. Then, we have the estimate
\[ \|u(t, \cdot)\|_{H^k} \lesssim \|u(0, \cdot)\|_{H^k} + \int_0^t \|f(s, \cdot)\|_{H^k}\, ds. \]
Proof. Let U (t , .) be defined by
U (t , .) = (1 − ∆)k u(t , .),
where (1 − ∆)k for k negative is to be understood as the Fourier multiplier (1 + |ξ|2 )k .
It follows from the assumptions on u that both U and LU satisfy Schwartz bounds,
so that we can apply the results of the previous corollary. Thus
\begin{align*}
\|u(t, \cdot)\|_{H^k} = \|U(t, \cdot)\|_{H^{-k}} &\lesssim \|U(0, \cdot)\|_{H^{-k}} + \int_0^t \|LU(s, \cdot)\|_{H^{-k}}\, ds \\
&\lesssim \|u(0, \cdot)\|_{H^k} + \int_0^t \|LU(s, \cdot)\|_{H^{-k}}\, ds.
\end{align*}
On the other hand, we have
\[ f = Lu = (1 - \Delta)^{-k} LU + [L, (1 - \Delta)^{-k}] U, \]
yielding
\[ \|LU\|_{H^{-k}} \le \|f\|_{H^k} + \big\| [L, (1 - \Delta)^{-k}] U \big\|_{H^k}. \]
Note that
\[ [L, (1 - \Delta)^{-k}] = \sum_{|\alpha| \le 2|k|} a_\alpha \partial^\alpha, \]
where α are multi-indices such that ∂^α contains at most one t derivative and a_α are
smooth bounded coefficients depending only on A and B.
As a consequence of Lemma 12.1,
\[ \big\| [L, (1 - \Delta)^{-k}] U \big\|_{H^k} \lesssim \|U\|_{H^{-k}} + \|\partial_t U\|_{H^{-k-1}}. \]
The last term contains a t derivative of U, which we do not yet control. We will substitute the t derivative using the equation. More specifically, let
\[ L_0 u = (A^0)^{-1} L u. \]
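Then,
\begin{align*}
(A^0)^{-1} f = L_0 u &= \partial_t u + (A^0)^{-1} A^i \partial_i u \\
&= (1 - \Delta)^{-k} \partial_t U + (1 - \Delta)^{-k} (A^0)^{-1} A^i \partial_i U + [(A^0)^{-1} A^i \partial_i, (1 - \Delta)^{-k}] U,
\end{align*}
yielding
\begin{align*}
\|\partial_t U\|_{H^{-k-1}} &\le \|(A^0)^{-1} f\|_{H^{k-1}} + \|(A^0)^{-1} A^i \partial_i U\|_{H^{-k-1}} + \big\| [(A^0)^{-1} A^i \partial_i, (1 - \Delta)^{-k}] U \big\|_{H^{k-1}} \\
&\lesssim \|f\|_{H^{k-1}} + \|U\|_{H^{-k}} + \|U\|_{H^{-k-1}} \\
&\lesssim \|f\|_{H^{k-1}} + \|U\|_{H^{-k}}.
\end{align*}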
Combining the above, we obtain
\[ \|u(t, \cdot)\|_{H^k} \lesssim \|u(0, \cdot)\|_{H^k} + \int_0^t \left( \|u(s, \cdot)\|_{H^k} + \|f(s, \cdot)\|_{H^k} \right) ds. \]
The lemma then follows using Gronwall's inequality.
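For reference, the form of Gronwall's inequality used in this last step can be sketched as follows. The names φ and a are introduced here for illustration, with φ continuous and non-negative, a non-decreasing and C ≥ 0 the implicit constant of the previous estimate.

```latex
% Integral form of Gronwall's inequality (a non-decreasing, C >= 0):
\[
\phi(t) \le a(t) + C\int_0^t \phi(s)\,ds \quad \text{on } [0,T]
\qquad\Longrightarrow\qquad
\phi(t) \le a(t)\,e^{Ct} \quad \text{on } [0,T].
\]
% Applied with
\[
\phi(t) = \|u(t,\cdot)\|_{H^k}, \qquad
a(t) = C\Big( \|u(0,\cdot)\|_{H^k} + \int_0^t \|f(s,\cdot)\|_{H^k}\,ds \Big),
\]
% this gives, on the bounded interval [0,T],
\[
\|u(t,\cdot)\|_{H^k} \le C e^{CT}\Big( \|u(0,\cdot)\|_{H^k} + \int_0^t \|f(s,\cdot)\|_{H^k}\,ds \Big).
\]
```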
The previous analysis already brings, in particular, a global uniqueness statement: if u and v agree at t = 0, they agree for all times (just apply Lemma 12.2 with
k = 0). However, one can prove a stronger result, a local uniqueness statement.
Lemma 12.4 (local uniqueness: Domain of dependence). Let L be as above with
A µ ∈ C 1 , A 0 positive definite and bounded from below, B ∈ C 0 and f = 0 on some slab
[T_1, T_2] × R^n. Let u be a solution to (7) such that u(0, .) = 0 on some ball B_{x_0}(R)
with R > 0. Then, there exists a c > 0 depending only on the bounds on A such that u = 0
on I − (p) ∩ [T1 , T2 ] × Rn , where p = (R/c, x 0 ) ∈ Rn+1 and I − (p) is the interior of a past
null cone for the Minkowski metric with the speed of light c.
Proof. We multiply the equation by e −kt u for some k large enough to get
\[ \partial_\alpha \left[ e^{-kt} A^\alpha(u, u) \right] = e^{-kt} \left( -k A^0(u, u) + \partial_\alpha A^\alpha(u, u) - 2 (B u, u) \right) \]
and integrate this over a domain of the form D = I − (p) ∩ [0, T2 ] × Rn , where p =
(R/c, x 0 ) and c will be chosen large enough so that R/c < T2 . Moreover, here I − (p)
refers to the interior of the past null cone of p associated with the Minkowski metric
η c = −c 2 d t ⊗ d t + δi j d x i ⊗ d x j .
We then use the usual Stokes' theorem in R^{n+1} to relate the bulk terms to the
boundary terms. The integral over ∂D ∩ {t = 0} vanishes in view of the initial data
hypothesis. The other boundary is the truncated cone given by the equation
\[ c \left( \frac{R}{c} - t \right) = |x - x_0|, \qquad t > 0, \]
and the vector
\[ n = (n_\alpha) = \left( c,\ \frac{(x - x_0)^i}{|x - x_0|} \right) \]
is an outgoing normal. (Exercise: rewrite this using
differential geometric language; then n is actually a one-form...)
Thus this boundary term gives a contribution proportional to n_α A^α(u, u), where
n_0 can be made arbitrarily large by choosing c large enough. Choosing c large enough,
it follows that this boundary term is non-negative. Choosing k large, the bulk term
is non-positive, which implies that all terms must vanish, i.e. u = 0.
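The sign argument can be made explicit as follows. This is a sketch under the assumptions of the lemma, where c_A denotes the uniform lower bound on A^0 and C_{A,B} a bound on ∂_α A^α and B (both names are introduced here for illustration), and dS is the Euclidean surface measure on the lateral boundary.

```latex
% Divergence theorem on D; the boundary term on {t = 0} vanishes.
\[
\int_{\partial D \cap \{t > 0\}} e^{-kt}\, \frac{n_\alpha A^\alpha(u,u)}{|n|}\, dS
 = \int_D e^{-kt}\Big( -k A^0(u,u) + (\partial_\alpha A^\alpha)(u,u) - 2(Bu,u) \Big)\, dx\, dt .
\]
% Left-hand side: n_0 = c and |n_i| \le 1, so
\[
n_\alpha A^\alpha(u,u) \;\ge\; \Big( c\, c_A - \sum_i \|A^i\|_{L^\infty} \Big)|u|^2 \;\ge\; 0
\qquad \text{for } c \text{ large enough}.
\]
% Right-hand side:
\[
-kA^0(u,u) + (\partial_\alpha A^\alpha)(u,u) - 2(Bu,u)
 \;\le\; -\big( k\, c_A - C_{A,B} \big)|u|^2 \;\le\; 0
\qquad \text{for } k \text{ large enough}.
\]
```

A non-negative quantity equal to a non-positive one forces both to vanish, and for k c_A > C_{A,B} the vanishing of the bulk integral gives u = 0 on D.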
We are now in a position to prove existence of solutions.
Theorem 12.1 (Existence). Let S_T = [0, T] × R^n and assume that L is as above, with
A^μ and B ∈ C^∞ with bounded derivatives and A^0 positive definite with a uniform lower
bound. Assume for simplicity that u_0 ∈ C_0^∞. Then, there is a unique solution u on S_T
to (7). Moreover, for each t, u(t, .) = 0 outside of a fixed compact set K_T.
Proof. Let L be as above and let L^⋆ be the formal adjoint of L,
\[ L^\star u = -\partial_t (A^0 u) - \partial_j (A^j u) + B^t u. \]
Then, L^⋆ is a symmetric hyperbolic operator, so in particular we have, for any integer k
and for every test function φ ∈ C_0^∞((−∞, T) × R^n),
\[ \|\varphi(t, \cdot)\|_{H^{-k}} \le C \int_t^T \|L^\star \varphi(s, \cdot)\|_{H^{-k}}\, ds, \]
for t ∈ [0, T], by applying the estimates of the previous lemmas with t replaced by
T − t.
Let X be the Banach space
\[ X = L^1\left( [0, T], H^{-k} \right). \]
Note that L^⋆φ can be viewed as a member of X, so we let M be the subspace of X
composed of elements of the form L^⋆φ for φ as above (i.e. M = L^⋆ C_0^∞((−∞, T) × R^n)).
Let f ∈ L^1([0, T], H^k), and define for ψ = L^⋆φ ∈ M
\[ F(\psi) = \langle f, \varphi \rangle = \int_0^T (\varphi(t), f(t))_{L^2}\, dt. \]
Then F is a bounded linear functional on M. By Hahn-Banach, it admits a continuous extension to the whole of X satisfying the same bound. By duality (recall that
(L^1)^⋆ is isomorphic to L^∞, while (L^∞)^⋆ is not isomorphic in general to L^1), there
exists a u ∈ L^∞([0, T], H^k) such that, for all φ as above,
\[ \int_0^T (\varphi(t), f(t))_{L^2}\, dt = \int_0^T (L^\star \varphi(t), u(t))_{L^2}\, dt. \]
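The boundedness of F on M, which is what allows the Hahn-Banach extension, can be sketched as follows. It uses only the H^k and H^{-k} duality pairing and the estimate for L^⋆ stated above, with C the constant of that estimate.

```latex
\begin{align*}
|F(\psi)| = \left| \int_0^T (\varphi(t), f(t))_{L^2}\,dt \right|
 &\le \int_0^T \|\varphi(t)\|_{H^{-k}}\, \|f(t)\|_{H^{k}}\,dt \\
 &\le \sup_{t\in[0,T]} \|\varphi(t)\|_{H^{-k}} \; \|f\|_{L^1([0,T],H^k)} \\
 &\le C\, \|L^\star\varphi\|_{L^1([0,T],H^{-k})}\, \|f\|_{L^1([0,T],H^k)}
  = C\,\|f\|_{L^1([0,T],H^k)}\,\|\psi\|_X .
\end{align*}
```

The same estimate shows that F is well defined: if L^⋆φ = 0, then φ vanishes on [0, T], so F(ψ) depends only on ψ.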
Thus u is a solution in the sense of distributions of the equation (since we have
assumed that u(0, .) = 0). From the equation, the weak time derivative ∂t u solves
∂t u = −(A 0 )−1 A i ∂i u − Bu − f ∈ H k−1 .
Thus, we have ∂_t u, u ∈ L^∞((−∞, T); H^{k−1}), which implies that u ∈ C^0((−∞, T); H^{k−1}).
Repeating the argument, we can obtain that u ∈ C^1((−∞, T); H^{k−2}). In particular, if
f is smooth, this implies (using k arbitrarily large) that u is smooth. To solve the
equation with non-vanishing data, assume first that u(0, x) ∈ C_0^∞ and let u_0(t, x) =
u(0, x) so that u_0 has the right Cauchy data. Then, consider the modified equation
Lv = f − Lu 0 .
Corollary 12.2. Let u_0 ∈ C^∞ and f ∈ C^∞. Assume that the operator L is as above. Then
there exists a unique solution to (7).
Proof. Use the previous existence result and the local uniqueness together.
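13 Linear wave equations
Below, when we manipulate objects such as a Lorentzian metric in coordinates, we will always assume that bounds such as those required in the previous section are valid.
We now consider the linear wave equation
\begin{align}
g^{\mu\nu} \partial_\mu \partial_\nu u + b^\alpha \partial_\alpha u + c u &= f, \tag{9} \\
u(t = 0, \cdot) &= u_0, \tag{10} \\
\partial_t u(t = 0, \cdot) &= u_1, \tag{11}
\end{align}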
where b, c, u_0 and u_1 are C^∞ functions. Without loss of generality, assume that g^{00} =
−1 (otherwise, just divide). Let v = (∂_α u, u) be an R^{n+2} valued function on R^{n+1} and define the symmetric (n+2) × (n+2) matrices A^α by (A^0)_{ij} = g^{ij}, (A^0)_{n+1,n+1} = (A^0)_{n+2,n+2} = 1,
(A^k)_{i,n+1} = (A^k)_{n+1,i} = −g^{ik}, (A^k)_{n+1,n+1} = −2g^{0k}, all other components vanishing. Let moreover d_{n+1,i} = −b^i, d_{n+1,n+1} = −b^0, d_{n+1,n+2} = −c, d_{n+2,n+1} = −1,
h_{n+1} = −f. Then, (9) is equivalent to the equation on v
\[ A^\alpha \partial_\alpha v + d v = h. \tag{12} \]
Exercise: show conversely that if v solves the above equation and the data verifies
∂_i v_{n+2} = v_i, then u = v_{n+2} solves the original wave equation.
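As a consistency check of the reduction, the rows of (12) can be written out. The sketch below assumes the component ordering v_i = ∂_i u for 1 ≤ i ≤ n, v_{n+1} = ∂_t u, v_{n+2} = u, which is the ordering suggested by the definitions of A^α and d, and uses g^{00} = −1.

```latex
\begin{align*}
\text{rows } i \le n:\quad
 & g^{ij}\,\partial_t v_j - g^{ik}\,\partial_k v_{n+1}
   = g^{ij}\left(\partial_t\partial_j u - \partial_j\partial_t u\right) = 0, \\
\text{row } n+1:\quad
 & \partial_t v_{n+1} - g^{ik}\,\partial_k v_i - 2g^{0k}\,\partial_k v_{n+1}
   - b^i v_i - b^0 v_{n+1} - c\, v_{n+2} \\
 &\quad = -\left( g^{\mu\nu}\partial_\mu\partial_\nu u + b^\alpha\partial_\alpha u + c\,u \right), \\
\text{row } n+2:\quad
 & \partial_t v_{n+2} - v_{n+1} = \partial_t u - \partial_t u = 0 .
\end{align*}
```

With these conventions the (n+1)-th row is (9) multiplied by −1, so the sign of the source term h_{n+1} has to be read accordingly, while the other rows are compatibility conditions with vanishing right hand side.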
From our previous analysis, we therefore obtain
Theorem 13.1. There exists a unique C ∞ solution u to (9) satisfying the initial condition.
As well as a domain of dependence property
Proposition 13.1. With the above notation, if f = 0 and if the initial data vanish
on some B_R(x_0), then there exists a c > 0 such that u = 0 on I^-(p) ∩ {t ≥ 0}, where p = (R/c, x_0) and the
set I^-(p) is constructed from the Minkowski metric with speed of light c.
A Global volume form and orientable manifolds
Lemma A.1. A semi-Riemannian manifold M has a global volume form if and only
if M is orientable.
Proof. Recall that a manifold is orientable if it has a coordinate atlas all of whose
transition functions have positive Jacobian determinants. For every such local coordinate system, we have a local n-form defined by √|det(g)| dx^1 ∧ .. ∧ dx^n. Moreover, for
every other coordinate in the atlas (thus positively oriented with the first) one can
check that the local n-form in the new coordinate system agrees on overlap with the
first. Thus, this defines a global volume form.
B Additional exercises
B.1 Problems
B.1.1 A commutation formula for the wave equation
Let □_g denote the d'Alembertian acting on real valued functions ψ : M → R.
1. Let X be a Killing field. Prove that [X, □_g] = 0, in the sense that for any smooth
function ψ,
\[ X(\Box_g \psi) = \Box_g (X(\psi)). \]
2. Let now X be a vector field, not necessarily Killing, and denote by π its deformation tensor, and by tr(π) the trace of π, given in any local coordinate system
by tr(π) = g^{αβ} π_{αβ}. Prove that for any smooth function ψ,
\[ \Box_g (X \psi) = X(\Box_g \psi) + q_X \psi, \qquad \text{with} \qquad q_X \psi = 2 \pi^{\alpha\beta} D_\alpha D_\beta \psi + \left[ 2 D^\alpha \pi_{\alpha\mu} - D_\mu (\operatorname{tr} \pi) \right] D^\mu \psi. \]
B.2 Solutions
B.2.1 Solution to B.1.1 (second question only)
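This is a classical formula. The proof written below is taken from [HS13], Lemma
6.2, p. 242.
Note that on functions f,
\[ L_X D_\alpha f = D_\alpha L_X f = X^\beta D_\beta D_\alpha f + \left( D_\alpha X^\beta \right) D_\beta f, \]
while on 1-forms
\[ L_X D_\alpha V_\beta - D_\alpha L_X V_\beta = X^\gamma [D_\gamma, D_\alpha] V_\beta - V^\gamma [D_\gamma, D_\alpha] X_\beta + \left( D_\gamma D_\alpha X_\beta - 2 D_\alpha \pi_{\gamma\beta} \right) V^\gamma. \]
Contracting with g^{αβ}, we obtain the formula
\[ g^{\alpha\beta} L_X D_\alpha V_\beta = g^{\alpha\beta} D_\alpha L_X V_\beta + V^\gamma D_\gamma (\operatorname{tr} \pi) - 2 V^\gamma D^\alpha \pi_{\gamma\alpha}. \]
With V_β = D_β ψ (i.e. V = dψ), we get
\[ g^{\alpha\beta} L_X D_\alpha D_\beta \psi = g^{\alpha\beta} D_\alpha D_\beta L_X \psi + D_\gamma (\operatorname{tr} \pi) D^\gamma \psi - 2 D^\alpha \pi_{\gamma\alpha} D^\gamma \psi. \]
Finally, the desired formula follows from
\[ X(\Box_g \psi) = L_X \left( g^{\alpha\beta} D_\alpha D_\beta \psi \right) = -2 \pi^{\alpha\beta} D_\alpha D_\beta \psi + g^{\alpha\beta} L_X D_\alpha D_\beta \psi. \]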
References
[BS03] Antonio N. Bernal and Miguel Sánchez. On smooth Cauchy hypersurfaces and Geroch's splitting theorem. Comm. Math. Phys., 243(3):461–470, 2003.
[BS05] Antonio N. Bernal and Miguel Sánchez. Smoothness of time functions and the metric splitting of globally hyperbolic spacetimes. Comm. Math. Phys., 257(1):43–50, 2005.
[HS13] Gustav Holzegel and Jacques Smulevici. Stability of Schwarzschild-AdS for the spherically symmetric Einstein-Klein-Gordon system. Comm. Math. Phys., 317(1):205–251, 2013.
[O'N83] Barrett O'Neill. Semi-Riemannian geometry, volume 103 of Pure and Applied Mathematics. Academic Press, Inc. [Harcourt Brace Jovanovich, Publishers], New York, 1983. With applications to relativity.
[Rin09] Hans Ringström. The Cauchy problem in general relativity. ESI Lectures in Mathematics and Physics. European Mathematical Society (EMS), Zürich, 2009.