Information cohomology
Juan Pablo Vigneaux
IMJ-PRG - Université Paris 7
July 13, 2016
P. Baudot and D. Bennequin, The homological nature of entropy, Entropy, 17 (2015), pp. 3253–3318.
Outline
1. Information
2. Classical information structures
3. Information cohomology
4. Quantum information cohomology
Information theory

Shannon (1948) defined the information content of a random variable X : Ω → {x1, ..., xn} as

    H(X) = − ∑_{i=1}^{n} P(X = x_i) log₂ P(X = x_i),        (1)

where P denotes a probability law on the space Ω. The function H is called the entropy.

Information is related to uncertainty:
1. If P(X = x_i) = 1 for some i, then H(X) = 0.
2. The uniform distribution on {x1, ..., xn} maximizes H(X).

Shannon also recognized an important relation,

    H(X, Y) = H(X) + H(Y|X).
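The entropy (1) and the chain rule H(X, Y) = H(X) + H(Y|X) are easy to check numerically. A minimal sketch in Python; the joint law below is an arbitrary example of ours, not taken from the slides:

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a probability vector, Eq. (1)."""
    return -sum(x * math.log2(x) for x in p if x > 0)

# Joint law of two binary variables X, Y as a dict {(x, y): prob}.
joint = {(0, 0): 0.3, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.4}

p_x = [sum(v for (x, _), v in joint.items() if x == i) for i in (0, 1)]
h_joint = entropy(list(joint.values()))
h_x = entropy(p_x)

# H(Y|X) = sum_x P(X=x) * H(Y | X=x)
h_y_given_x = sum(
    p_x[i] * entropy([joint[(i, j)] / p_x[i] for j in (0, 1)])
    for i in (0, 1)
)

assert abs(h_joint - (h_x + h_y_given_x)) < 1e-12
```

The chain rule holds exactly for any joint law; only floating-point error remains.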
Observables

Fix a finite probability space Ω. Observables are random variables 1, X1, X2, X3, ... (where 1 corresponds to certitude, i.e., a constant variable). We are only interested in the algebra of events defined by each variable (we consider X ≅ Y if σ(X) = σ(Y)).

We write an arrow X → Y if σ(Y) ⊂ σ(X) (i.e., "X refines Y").
O = O(Ω) is the category of "observables": its objects are all the possible partitions of Ω and its arrows are refinements.

O has an initial object, the partition by points, denoted 0, and a final object, the trivial partition 1 := {Ω}.

(X, Y) : Ω → R × R, ω ↦ (X(ω), Y(ω)) is the coarsest partition that refines both X and Y ⇒ it is the categorical product. We write simply XY := (X, Y); this product is commutative and associative, 1 is a unit, and every element is idempotent.
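Up to the equivalence above, observables are just partitions of Ω, so refinement and the product XY can be modelled directly on finite sets. A sketch (the helper names are ours):

```python
# Observables as partitions of a finite Ω: frozensets of blocks.

def partition(*blocks):
    return frozenset(frozenset(b) for b in blocks)

def refines(X, Y):
    """X -> Y in O(Ω): every block of X sits inside some block of Y."""
    return all(any(b <= c for c in Y) for b in X)

def product(X, Y):
    """XY: the coarsest common refinement (intersections of blocks)."""
    return frozenset(b & c for b in X for c in Y if b & c)

omega = {1, 2, 3}
one = partition(omega)              # trivial partition 1
zero = partition({1}, {2}, {3})     # partition by points 0
X1 = partition({1}, {2, 3})

assert refines(zero, X1) and refines(X1, one)
assert product(X1, X1) == X1        # idempotent
assert product(X1, one) == X1       # 1 is a unit
```

The asserts check exactly the monoid laws stated on the slide: idempotency and 1 acting as a unit.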
A concrete information structure, denoted S, is a subcategory of O(Ω) that satisfies the following properties:

1. Given any three objects X, Y and Z in Ob(S) such that X refines Y and Z, the partition YZ also belongs to Ob(S).
2. The category S is a full subcategory of O: given any two objects X, Y in S, Mor_S(X, Y) = Mor_O(X, Y).
3. The partition 1 is in Ob(S).
Information structures: examples

Example 1. Set Ω = {1, 2, 3} and define Xi := {{i}, Ω \ {i}}. Let M be the atomic partition.

[Diagram: 1_Ω at the top; X1, X2, X3 below it; M at the bottom, refining each Xi.]
Example 2. As before, but the observable X2 is not available.

[Diagram: 1_Ω at the top; X1 and X3 below it; M at the bottom.]
Example 3. From quantum physics. Here, Lx, Ly, Lz are the quantum observables that correspond to angular momentum, and L² = Lx² + Ly² + Lz².

[Diagram: 1 at the top; Lx, L², Ly, Lz below it; the joints LxL², LyL², LzL² at the bottom.]

We cannot measure two components of the angular momentum simultaneously, since the operators do not commute.
Probabilities

We denote by ∆(B) the set of all possible laws on (Ω, B). When B has n atoms, we identify ∆(B) with ∆_{n−1} ⊂ R^n to see it as a measure space. We shall consider subsimplicial structures of ∆(B), denoted Q_B, or simply Q.

Operations:
1. Marginalization: for an arrow f : X → Y, there is an induced simplicial map f∗ : ∆(B(X)) → ∆(B(Y)), P ↦ f∗P, such that f∗P and P coincide on B(Y) (we also write Y∗). We denote Q_X := X∗Q.
2. Conditioning: for a random variable X : (Ω, B) → {x1, ..., xn}, a law P in ∆(B), and P(X = x_i) ≠ 0 for some i ∈ [n], the conditioned law is P|_{X=x_i}(A) = P(A ∩ {X = x_i}) / P(X = x_i).
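Both operations are concrete on a finite Ω. A small sketch, continuing the partition encoding above (helper names are ours):

```python
# Marginalization Y∗P and conditioning P|_A for laws on a finite Ω,
# with partitions represented as frozensets of blocks.

def marginalize(Y, P):
    """Push a law P on Ω forward to the partition Y: one mass per block."""
    return {block: sum(P[w] for w in block) for block in Y}

def condition(P, event):
    """P|_A: restrict P to the event A (a set of points) and renormalize."""
    mass = sum(P[w] for w in event)
    assert mass > 0
    return {w: (P[w] / mass if w in event else 0.0) for w in P}

P = {1: 0.2, 2: 0.3, 3: 0.5}
X1 = frozenset({frozenset({1}), frozenset({2, 3})})

m = marginalize(X1, P)
assert abs(m[frozenset({2, 3})] - 0.8) < 1e-12

Pc = condition(P, {2, 3})
assert abs(Pc[2] - 0.375) < 1e-12 and Pc[1] == 0.0
```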
Example.

Set Ω = {1, 2, 3}, Xi := {{i}, Ω \ {i}}, M atomic.
∆k := {(x0, ..., xk) ∈ R^{k+1}_{≥0} : x0 + ... + xk = 1}, the k-simplex.

[Diagram: over M sits the simplex ∆2 of laws (p1, p2, p3); over X1 and X3 sit copies of ∆1; the marginalization (X1)∗ sends (p1, p2, p3) to (p1, p2 + p3).]
Functional module

Similarly, for each observable X, consider the real vector space F_X of measurable functions on Q_X (the entropy H[X] lives here).

If X → Y, a function f ∈ F_Y can be mapped naturally to F_X: just set f^{X|Y}(P) = f(Y∗P).

The set F_X carries a natural action of S_X (the variables refined by X): for an observable Y in S_X (with possible values {y1, ..., yk}) and f ∈ F_X, the new function Y.f ∈ F_X is given by

    (Y.f)(P) = ∑_{i=1}^{k} P(Y = y_i) f(P|_{Y=y_i}).
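The action (Y.f)(P) = Σ_i P(Y = y_i) f(P|_{Y=y_i}) is average conditioning; with f = H it computes a conditional entropy, and the chain rule appears. A sketch with our own helper names:

```python
import math

# The average-conditioning action on functions of laws over a finite Ω.

def condition(P, event):
    mass = sum(P[w] for w in event)
    return {w: (P[w] / mass if w in event else 0.0) for w in P}

def act(Y, f, P):
    """(Y.f)(P), where Y is a partition (frozenset of blocks)."""
    total = 0.0
    for block in Y:
        mass = sum(P[w] for w in block)
        if mass > 0:
            total += mass * f(condition(P, block))
    return total

def H(P):
    """Shannon entropy of a law on Ω."""
    return -sum(p * math.log2(p) for p in P.values() if p > 0)

P = {1: 0.2, 2: 0.3, 3: 0.5}
Y = frozenset({frozenset({1}), frozenset({2, 3})})

h_cond = act(Y, H, P)           # with f = H this is H(M | Y)
HY = H({0: 0.2, 1: 0.8})        # entropy of the marginal under Y
assert abs(HY + h_cond - H(P)) < 1e-12
```

The final assert is exactly H(M) = H(Y) + H(M|Y), Shannon's relation in this notation.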
Example.

Set Ω = {1, 2, 3}, Xi := {{i}, Ω \ {i}}, M atomic, ∆k the k-simplex. Then F_M = {f : ∆2 → R}, etc.

[Diagram: the spaces F_1, F_{X1}, F_{X3}, F_M arranged as the structure above; e.g. a function f(x, y) ∈ F_{X1} maps to f^{M|X1}(x, y, z) = f(x, y + z) ∈ F_M.]
Topos cohomology

Consider a category S as above. Over each X ∈ S there is the monoid S_X of variables coarser than X. Denote by A_X the algebra generated over R by this monoid.

Put the trivial Grothendieck topology on S. The pair (S, A) is a ringed site. We work in the category Mod(A): sheaves of groups with an action of A. (The sheaf F lives here.) The category Mod(A) has enough injective objects.

We call Mod(A) the topos of information.

Define the information cohomology as (cf. Baudot-Bennequin, 2015 [1]):

    H^n(S, Q) = Ext^n(R_S, F).
Classical information cohomology

Objective: a projective resolution of the trivial sheaf of A-modules R_S,

    0 ← R_S ← B_0 ← B_1 ← B_2 ← ...

Let B_n(X) be the free A_X-module generated by the symbols [X1|...|Xn], where {X1, ..., Xn} ⊂ S_X. Remark that B_0(X) is the free module on one generator [ ], hence isomorphic to A_X. The B_i are sheaves in Mod(A).

We now introduce A_X-module morphisms ε_X : B_0(X) → R_S(X) given by ε_X([ ]) = 1, and boundary morphisms ∂ : B_n(X) → B_{n−1}(X) given by

    ∂[X1|...|Xn] = X1.[X2|...|Xn]
                   + ∑_{k=1}^{n−1} (−1)^k [X1|...|X_k X_{k+1}|...|Xn]
                   + (−1)^n [X1|...|X_{n−1}].

These morphisms are natural in X.
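One can verify formally that the boundary above satisfies ∂∘∂ = 0, encoding elements of B_n(X) as formal sums and using that the monoid product is commutative and idempotent. A sketch with our own encoding (variables as frozensets of generators, product = union):

```python
from collections import defaultdict

# An element of B_n(X) is a dict {(action monomial, symbol tuple): coeff}.

def boundary(elem):
    out = defaultdict(int)
    for (a, syms), c in elem.items():
        n = len(syms)
        if n == 0:
            continue
        out[(a | syms[0], syms[1:])] += c                  # X1.[X2|...|Xn]
        for k in range(1, n):                              # merge X_k X_{k+1}
            merged = syms[:k - 1] + (syms[k - 1] | syms[k],) + syms[k + 1:]
            out[(a, merged)] += (-1) ** k * c
        out[(a, syms[:-1])] += (-1) ** n * c               # drop X_n
    return {k: v for k, v in out.items() if v}

X1, X2, X3 = frozenset({"a"}), frozenset({"b"}), frozenset({"c"})
gen = {(frozenset(), (X1, X2, X3)): 1}                     # the symbol [X1|X2|X3]

assert boundary(boundary(gen)) == {}                       # ∂∘∂ = 0
```

For n = 2 this reproduces ∂[X1|X2] = X1.[X2] − [X1X2] + [X1] by hand, and the assert checks the cancellation in degree 3.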
Resolution

Proposition

The complex

    0 ← R_S ←ε B_0 ←∂1 B_1 ←∂2 B_2 ←∂3 ...        (2)

is a resolution of the sheaf R_S.

In fact, the construction corresponds to a relatively free resolution, cf. MacLane, Homology.

Proposition

For each n ≥ 0, the sheaf B_n is a projective object in Mod(A).

This uses the special properties of S.
Back to the observables.

Set Ω = {1, 2, 3}, Xi := {{i}, Ω \ {i}}, M atomic, Q• the full set of probabilities.

[Diagram: 1_Ω at the top; X1 and X2 below it; M at the bottom.]

The general construction says that a 1-cocycle is defined by three functions f[X1] : Q_{X1} → R, f[X2] : Q_{X2} → R, f[M] : Q_M → R, subject to the cocycle equations.
Cocycle equations:

    0 = X1.f[X2] − f[M] + f[X1]
    0 = X2.f[X1] − f[M] + f[X2]
    ...

These are functional equations. They imply X2.f[X1] + f[X2] = X1.f[X2] + f[X1], i.e., for every (p0, p1, p2) ∈ ∆2:

    (1 − p2) f[X1](p0/(1 − p2), p1/(1 − p2)) − f[X1](1 − p1, p1)
        = (1 − p1) f[X2](p0/(1 − p1), p2/(1 − p1)) − f[X2](1 − p2, p2).
For every (p0, p1, p2) ∈ ∆2:

    (1 − p2) f[X1](p0/(1 − p2), p1/(1 − p2)) − f[X1](1 − p1, p1)
        = (1 − p1) f[X2](p0/(1 − p1), p2/(1 − p1)) − f[X2](1 − p2, p2).

Tverberg, Lee, Ng, etc.: the only measurable solutions to this equation are

    f[X1](x, 1 − x) = f[X2](x, 1 − x) = λ(−x log x − (1 − x) log(1 − x)),

where λ is an arbitrary constant.

Moreover: in fairly general situations, the information cohomology H^1(S, Q) is a 1-dimensional vector space, all cocycles being multiples of the entropy function.
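That the entropy does solve this functional equation is a direct consequence of the grouping property of H; a quick numerical sanity check (ours, not from the slides), sampling random points of ∆2:

```python
import math
import random

def f(x, y):
    """The λ = 1 multiple of the entropy on ∆1."""
    return -sum(t * math.log(t) for t in (x, y) if t > 0)

random.seed(0)
for _ in range(100):
    a, b = sorted(random.random() for _ in range(2))
    p0, p1, p2 = a, b - a, 1 - b          # a random point of ∆2
    if min(p0, p1, p2) < 1e-9:
        continue
    lhs = (1 - p2) * f(p0 / (1 - p2), p1 / (1 - p2)) - f(1 - p1, p1)
    rhs = (1 - p1) * f(p0 / (1 - p1), p2 / (1 - p1)) - f(1 - p2, p2)
    assert abs(lhs - rhs) < 1e-9
```

Both sides equal H(p0, p1, p2) − H(1 − p1, p1) − H(1 − p2, p2) by the grouping identity, which is why the check passes identically.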
Universality of the entropy

Proposition

Let S be an information structure and Q an adapted probability family. Suppose that every minimal object can be factorized as a non-degenerate product. Then,

    H^1(S, Q) = ∏_{α ∈ H_0^CW(S_0)} R·H_α,

where α represents a connected component of S_0 = S \ 1 and

    H_α[X] = H[X] if X ∈ Ob(α),   H_α[X] = 0 if X ∉ Ob(α).

If (S1, Q1), ..., (Sn, Qn) satisfy the hypothesis above, then

    H^1(⊕_{i=1}^n S_i, ⊕_{i=1}^n Q_i) = ⊕_{i=1}^n ∏_{α ∈ H_0^CW(S_{i,0})} R·H_α.
Symmetric information cohomology

Consider the group G(Ω, B) of permutations of Ω that respect the algebra B. This group acts on O(Ω): if X = {Ω1, ..., Ωn}, then σ∗X = {σ⁻¹Ω1, ..., σ⁻¹Ωn}.

A classical information structure is symmetric if it is closed under the action of G(Ω, B).

A probability functor Q is symmetric if it is compatible with this action.
Quantum densities

In the quantum case, the role of Ω is played by a fixed complex vector space E of finite dimension N.

A quantum probability law is a hermitian product h, not required to be positive definite. Once a basis of E has been chosen, h can be represented by an N × N hermitian matrix ρ: h(u, v) = u∗ρv.
In the presence of a basis, there is a canonical hermitian form (with ρ = Id_E):

    h0(x, y) ≡ ⟨x, y⟩0 := (1/N) ∑_{j=1}^{N} x̄_j y_j.        (3)

We can proceed in the opposite direction, fixing a positive definite hermitian form h0 and taking as basis of E the set of its eigenvectors.

Usually, ρ is normalized so that Tr(ρ) = 1. This condition on the trace makes no sense unless additional structure has been defined on E, like a basis or a canonical product h0. (From now on, we suppose that a positive definite product h0 on E has been fixed.)
Observables

Real-valued random variables are generalized by hermitian operators; these are called observables.

By the spectral theorem, each hermitian operator Z can be decomposed as a weighted sum of positive hermitian projectors,

    Z = ∑_{j=1}^{K} z_j E_j,

where z1, ..., zK are the (pairwise distinct) real eigenvalues of Z. Each E_j is the projector on the eigenspace spanned by the eigenvectors of z_j. The E_j are mutually orthogonal and their sum equals the identity Id_E.

This decomposition of Z is not necessarily compatible with the chosen basis of E.
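The spectral decomposition is directly computable. A numpy sketch (our own, for illustration) that rebuilds Z = Σ_j z_j E_j from eigenvectors and checks Σ_j E_j = Id:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
Z = (A + A.conj().T) / 2                 # an arbitrary hermitian operator

w, V = np.linalg.eigh(Z)                 # real eigenvalues, eigenvector columns

# Group eigenvectors by (numerically) equal eigenvalues -> one projector each.
projectors = {}
for val, vec in zip(w, V.T):
    key = round(float(val), 8)
    P = np.outer(vec, vec.conj())        # rank-one projector |v><v|
    projectors[key] = projectors.get(key, 0) + P

Z_rebuilt = sum(z * E for z, E in projectors.items())
assert np.allclose(Z_rebuilt, Z)
assert np.allclose(sum(projectors.values()), np.eye(4))
```

Grouping by eigenvalue is what makes the E_j projectors onto full eigenspaces rather than single eigenvectors.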
Refinement, products

We consider two hermitian operators as equivalent when they define the same direct sum decomposition {E_j}_j of E, ignoring the particular eigenvalues.

We denote by E_A both a subspace of E and the orthogonal projector onto it. A decomposition {E_α}_{α∈A} is said to refine {E′_β}_{β∈B} if each E′_β can be expressed as a sum of subspaces {E_α}_{α∈A_β}, for certain A_β ⊆ A. In that case we also say that {E_α}_{α∈A} divides {E′_β}_{β∈B}.

The joint (Y1, ..., Yn), for a set of commuting observables, corresponds to the decomposition of E into the direct orthogonal sum ⊕_{α∈A} E_α, where {E_α}_{α∈A} is the collection of joint eigenspaces of the set {Y_j}_{1≤j≤n}; these are obtained by intersecting the original subspaces. The joint (Y1, ..., Yn) is denoted simply as a product Y1···Yn, and it is the categorical product as before.
Structures

A quantum information structure is a category S whose objects are a set of observables (equivalently, orthogonal decompositions), with a morphism X → Y whenever X divides Y (a partial order). This category is required to satisfy the following properties:

1. If X, Y and Z are observables in Ob(S) such that X → Y and X → Z, then Y and Z commute and YZ ∈ Ob(S).
2. The trivial partition of E, denoted 1 := {E}, belongs to Ob(S).

This category satisfies the axioms of a generalized information structure.
The amplitude or expectation of an observable Z in the state ρ is defined as

    E_ρ(Z) := Tr(Zρ).        (4)

The product h0 plays an important role here, defining the trace. Associated to a choice of h0 there is a corresponding unitary group of transformations U(h0). These are special symmetries of the quantum system.

The role of the classical algebra of sets on Ω could be played here by a fixed orthonormal decomposition B of E such that any decomposition in S is refined by B. However, this choice does not allow us to consider invariance under the group U(h0); a more flexible choice is the set UB of all decompositions obtained from B by unitary transformations.
Probabilities, conditioning

An elementary event associated to S is a subspace E_A of E that is an element of one of the decompositions X ∈ Ob(S).

If ρ is a density and E_A is an elementary event, the probability of the event A is

    P_ρ(A) = Tr(E_A∗ ρ E_A).        (5)

We also define the conditioning of ρ by A as the new hermitian matrix

    ρ|_A = E_A∗ ρ E_A / Tr(E_A∗ ρ E_A).        (6)

We always assume that S and Q1 are mutually adapted; this means that Q1 is closed under conditioning by A, for any E_A belonging to a decomposition X ∈ Ob(S).
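Equations (5)–(6) are one-liners in numpy; a sketch on C² (the density below is an arbitrary example of ours):

```python
import numpy as np

def born(rho, E):
    """P_ρ(A) = Tr(E_A* ρ E_A), Eq. (5)."""
    return float(np.trace(E.conj().T @ rho @ E).real)

def condition(rho, E):
    """ρ|_A = E_A* ρ E_A / Tr(E_A* ρ E_A), Eq. (6)."""
    out = E.conj().T @ rho @ E
    return out / np.trace(out).real

# A density on C^2 and the projectors onto the two basis vectors.
rho = np.array([[0.7, 0.2], [0.2, 0.3]], dtype=complex)
E0 = np.diag([1.0, 0.0]).astype(complex)
E1 = np.diag([0.0, 1.0]).astype(complex)

assert abs(born(rho, E0) - 0.7) < 1e-12
assert abs(born(rho, E0) + born(rho, E1) - 1.0) < 1e-12   # probabilities sum to 1
rho0 = condition(rho, E0)
assert abs(np.trace(rho0).real - 1.0) < 1e-12             # ρ|_A is again a density
```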
Families of densities

If X = {E_α}_{α∈A} is an observable, we can define its density as

    ρ_X = ∑_{α∈A} P_ρ(α) ρ|_α = ∑_{α∈A} E_α∗ ρ E_α.        (7)

Given an observable X, we associate to it a subset Q_X of Q1 that contains at least every density ρ_X for ρ ∈ Q1.

Axiom: for each arrow of division X → Y, the set Q_X is contained in Q_Y. We denote by Y∗ the injection Q_X ↪ Q_Y.

In what follows, we simply denote by Q the functor X ↦ Q_X. The set Q1 appears as the value of Q on 1 ∈ Ob(S).
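The density ρ_X of Eq. (7) is the "pinching" of ρ to the block structure of X: off-diagonal terms between blocks vanish. A numpy sketch (names ours):

```python
import numpy as np

def pinch(rho, projectors):
    """ρ_X = Σ_α E_α* ρ E_α for a complete family of orthogonal projectors."""
    return sum(E.conj().T @ rho @ E for E in projectors)

rho = np.array([[0.6, 0.3], [0.3, 0.4]], dtype=complex)
blocks = [np.diag([1.0, 0.0]).astype(complex),
          np.diag([0.0, 1.0]).astype(complex)]

rho_X = pinch(rho, blocks)
assert np.allclose(rho_X, np.diag([0.6, 0.4]))   # off-diagonal terms vanish
assert abs(np.trace(rho_X).real - 1.0) < 1e-12   # still a density
assert np.allclose(pinch(rho_X, blocks), rho_X)  # pinching twice changes nothing
```

The last assert illustrates why Q_X^min = {ρ_X | ρ ∈ Q1} is consistent: densities already decomposed in blocks with respect to X are fixed by the operation.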
Families of densities: some examples

There are several possible choices for the functor Q once Q1 has been given. The extreme cases are:

1. Q_X^max := Q1 for every X ∈ Ob(S).
2. Q_X^min := {ρ_X | ρ ∈ Q1}. In this case, the elements of Q_X^min are positive hermitian forms on E, decomposed in blocks with respect to X. When Q1 = PH, we call this the canonical density functor, Q^can.

An important difference between the quantum and the classical case: for X → Y in the respective information structure, there exist more quantum probability laws in Q_Y than in Q_X, but there are fewer classical probability laws on (Ω, B(Y)) than on (Ω, B(X)). In particular, the terminal object has the richest structure in terms of quantum probability laws: any hermitian product.
Quantum information cohomology

S (with the trivial topology) is a ringed site, with structure ring A generated on each X ∈ Ob(S) by the monoid S_X.

Sheaf F(Q): over each X ∈ Ob(S), the space F(Q_X) is the real vector space of measurable functions on Q_X with values in R, with an action of A_X defined by average conditioning. By marginalization, one has F(Q_Y) ↪ F(Q_X).

The quantum information cohomology is just the general topos cohomology:

    H^n(S, Q) = Ext^n_A(R_S, F(Q)),  n ≥ 0.        (8)
Resolution

The construction of the bar resolution applies again. The quantum information cohomology is the cohomology of the complex {C^n = Hom_A(B_n, F(Q))}_{n≥0} with coboundary operator

    δ̂f[X1|...|X_{n+1}] = X1.f[X2|...|X_{n+1}]
        + ∑_{k=1}^{n} (−1)^k f[X1|...|X_k X_{k+1}|...|X_{n+1}]
        + (−1)^{n+1} f[X1|...|Xn].

The action of A corresponds to von Neumann's average conditioning,

    Y.f[X1|...|Xn](ρ) = ∑_A Tr(E_A∗ ρ E_A) f[X1|...|Xn](ρ|_A),

where the E_A are the spectral projectors of the observable Y. Here it is not necessary to assume that Y commutes with X1, ..., Xn.
Locality

Naturality of the square

    B_n(Y) → B_n(X)
      f_Y ↓         ↓ f_X
    F(Q_Y) → F(Q_X)

is translated in this context by a condition called locality: given an arrow X → Y in S and observables {X1, ..., Xn} ⊂ S_X, the equation

    f_X[X1|...|Xn](ρ) = f_Y[X1|...|Xn](Y∗ρ)        (9)

is satisfied by all ρ ∈ Q_X. In the quantum case, this does not necessarily mean that f[X1|...|Xn] depends only on the conditional densities E_A∗ρE_A, for E_A a space of the decomposition X1···Xn. In fact, this depends on the choice of Q. A counterexample in the case of Q^max is given by a function F(ρ) independent of X.
The von Neumann entropy is defined by

    S(ρ) = E_ρ(− log₂(ρ)) = Tr(−ρ log₂ ρ).        (10)

For any density functor Q adapted to S, the von Neumann entropy defines a local 0-cochain, which we call S_X: simply the restriction of S to the set Q_X. For X → Y and ρ ∈ Q_X, the law Y∗ρ = ρ ∈ Q_Y by functoriality, thus S_X(ρ) = S_Y(ρ) = S_Y(Y∗ρ). This 0-cochain will also be called the von Neumann entropy. In the case of Q^max, X ↦ S_X gives a constant section.
The classical entropy associated to a partition X = {E_j} and density ρ is

    H[X](ρ) = − ∑_j Tr(E_j∗ ρ E_j) log Tr(E_j∗ ρ E_j).        (11)

This is a 1-cocycle: for X → Y → Z in S, one has

    H_X[Z](ρ_X) = H_Y[Z](Y∗ρ_X).        (12)

The following proposition links both concepts of entropy.

Proposition

Let X and Y be two commuting observables in S. Then,

    S_{XY}(ρ) = H[Y](ρ) + Y.S_X(ρ).        (13)
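Proposition (13) can be checked numerically, reading S_X(σ) as the von Neumann entropy of the pinched density σ_X = Σ_α E_α σ E_α (our reading of the cochain on pinched densities; a sketch, not the authors' code):

```python
import numpy as np

def vn_entropy(rho):
    """Von Neumann entropy S(ρ) = Tr(-ρ log2 ρ), via eigenvalues."""
    w = np.linalg.eigvalsh(rho)
    return float(-sum(x * np.log2(x) for x in w if x > 1e-12))

def pinch(rho, projs):
    return sum(E @ rho @ E for E in projs)

# Two commuting block observables on C^4 = C^2 ⊗ C^2 (an example of ours).
I2 = np.eye(2)
X_projs = [np.kron(np.diag([1.0, 0.0]), I2), np.kron(np.diag([0.0, 1.0]), I2)]
Y_projs = [np.kron(I2, np.diag([1.0, 0.0])), np.kron(I2, np.diag([0.0, 1.0]))]
XY_projs = [E @ F for E in X_projs for F in Y_projs]       # joint decomposition

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
rho = A @ A.conj().T
rho /= np.trace(rho).real                                  # an arbitrary density

lhs = vn_entropy(pinch(rho, XY_projs))                     # S_XY(ρ)

probs = [np.trace(F @ rho @ F).real for F in Y_projs]      # P_ρ via Eq. (5)
H_Y = -sum(p * np.log2(p) for p in probs if p > 1e-12)     # H[Y](ρ), Eq. (11)
avg = sum(p * vn_entropy(pinch(F @ rho @ F / p, X_projs))  # Y.S_X(ρ)
          for p, F in zip(probs, Y_projs) if p > 1e-12)

assert abs(lhs - (H_Y + avg)) < 1e-9
```

This is the quantum analogue of Shannon's H(X, Y) = H(Y) + H(X|Y): the classical entropy of Y plus the averaged von Neumann entropies of the conditioned, X-pinched densities.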
Thank you!