Algebraic Information Theory

Marc Pouly
[email protected]
Apsia Breakfast Seminar
Interdisciplinary Centre for Security, Reliability and Trust
University of Luxembourg
June 2011
Hartley’s Measure (1928)
Given a set S = {s1, . . . , sn}, how can we measure its uncertainty u(S) ?
1. uncertainty is a non-negative value
2. monotone: |S1| ≤ |S2| ⇒ u(S1) ≤ u(S2)
3. additive: u(S1 × S2) = u(S1) + u(S2)
Theorem: Up to the base of the logarithm, there is only one function that satisfies these properties:

    u(S) = log |S|
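A minimal sketch of the measure in Python (my own illustration, not from the slides), with the logarithm taken to base 2 so that uncertainty is reported in bits:

import math

def hartley(S):
    """u(S) = log2(|S|): the uncertainty of a finite set S, in bits."""
    return math.log2(len(S))

S1, S2 = {1, 2}, {"a", "b", "c", "d"}
print(hartley(S2))  # 2.0 bits

# additivity: u(S1 x S2) = u(S1) + u(S2)
product = {(s, t) for s in S1 for t in S2}
assert math.isclose(hartley(product), hartley(S1) + hartley(S2))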
From Uncertainty to Information
The uncertainty of S = {s1, . . . , s8} is log 8 = 3 bits
Assume now that someone gives more precise information, for example that either s3 or s7 has been transmitted
We have S′ = {s3, s7} with a remaining uncertainty of log 2 = 1 bit
The information reduces uncertainty by log 8 − log 2 = 2 bits
Information is the Reduction of Uncertainty !
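The worked example in code (assuming, as on the slide, logarithms to base 2):

import math

u_before = math.log2(8)    # uncertainty of S = {s1, ..., s8}: 3 bits
u_after = math.log2(2)     # remaining uncertainty of S' = {s3, s7}: 1 bit
print(u_before - u_after)  # information gained: 2.0 bits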
Shannon’s Measure (1948)
How much uncertainty is contained in a set S = {s1, . . . , sn} if the probability pi = p(si) of each element is known?

    S(p1, . . . , pn) = − ∑_{i=1}^{n} pi log pi

We have a similar uniqueness result for specific properties
An information theory is derived by the same principle
This is what people call classical or statistical information theory
Shannon generalizes Hartley: S(1/n, . . . , 1/n) = log n
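A short sketch of Shannon's measure (my own illustration), checking the consistency with Hartley's measure on the uniform distribution:

import math

def shannon(p):
    """S(p1, ..., pn) = -sum of pi * log2(pi), skipping zero entries."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

n = 8
assert math.isclose(shannon([1 / n] * n), math.log2(n))  # = log n, as claimed
print(shannon([0.5, 0.25, 0.25]))  # 1.5 bits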
How do we represent Information ?
Thanks to Hartley we have an information theory for sets and
thanks to Shannon an information theory for probabilities
But there are other ways of representing information (on computers):
databases (relations), systems of equations and inequalities,
constraint systems, possibilistic formalisms, formalisms for
imprecise probabilities, Spohn potentials, graphs, logics, ...
Statistical information theory is not enough !
Extending Hartley’s Theory
[Diagram: alphabets give rise to Hartley's information theory and probabilistic sources to Shannon's; via isomorphisms, Hartley's theory carries over to a relational information theory.]
The Fundamental Theorem of Lattice Theory
Hartley’s information theory assumes a finite alphabet S and
assigns values to subsets, i.e. u : P(S) → R≥0 .
P(S) is a distributive lattice with meet ∩ and join ∪
Theorem (Fundamental Theorem of Lattice Theory)
Every distributive lattice is isomorphic to a lattice of subsets
We can carry over Hartley's measure to isomorphic formalisms, for example to the relational algebra used in databases.
Relational Information Theory
We can therefore measure the uncertainty in relations
R =
    Destination   Departure   Gate
    Heathrow      10:00       7
    Heathrow      14:00       9
    Gatwick       08:30       4
    City          11:15       5
    City          15:20       7

and obtain u(R) = log 5 bits
If we agree on the three properties stated by Hartley then
u(S) = log |S| is the only correct way of measuring uncertainty in
subset systems and hence also the right way for isomorphic
formalisms such as the relational algebra.
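Since a relation is just a set of tuples, Hartley's measure applies unchanged; a small sketch (my own illustration):

import math

R = {("Heathrow", "10:00", 7),
     ("Heathrow", "14:00", 9),
     ("Gatwick", "08:30", 4),
     ("City", "11:15", 5),
     ("City", "15:20", 7)}

print(math.log2(len(R)))  # u(R) = log2(5), about 2.32 bits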
Duality of Information
R =
    Destination   Departure   Gate
    Heathrow      10:00       7
    Heathrow      14:00       9
    Gatwick       08:30       4
    City          11:15       5
    City          15:20       7

1. How to get to London ? Here, the more tuples, the more information.
2. I am waiting for my friend, which flight might she have taken ? Here, the fewer tuples, the more information.
Such a dualism is always present in order theory but not for measures
Is measuring information sometimes too restrictive ?
Linear Equation Systems
Solution sets of linear equation systems form affine spaces:

    X1 − 2X2 + 2X3 = −1
    3X1 + 5X2 − 3X3 = 8
    4X1 + 3X2 −  X3 = 7

The null space of the equation matrix A is

    N(A) = {(x1, x2, x3) ∈ R^3 : x1 = −(4/11) x3, x2 = (9/11) x3}
How much uncertainty is contained in an equation system ?
Can we treat this just as another subset system ?
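The slide's numbers can be checked mechanically; a sketch using sympy (my own illustration, not part of the talk):

import sympy as sp

A = sp.Matrix([[1, -2,  2],
               [3,  5, -3],
               [4,  3, -1]])
b = sp.Matrix([-1, 8, 7])

x1, x2, x3 = sp.symbols('x1 x2 x3')
print(sp.linsolve((A, b), x1, x2, x3))  # a one-parameter affine solution set
print(A.nullspace())                    # [Matrix([[-4/11], [9/11], [1]])]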
Equational Information Theory
Linear equation systems can have no, one or infinitely many solutions.
Hence, the uncertainty is either log 0, log 1 or log ∞
Here, a (quantitative) measure of information is not appropriate
A first Summary
A theory of information should explain what information is
Hartley & Shannon: information = reduction of uncertainty
Both rely on the assumption that information can be measured
There are many formalisms for representing information on
computers that are not covered by this theory
Does this theory reflect our daily perception of information ?
What is our perception of information ?
What is Information ?
... information exists in pieces
... information comes from different sources
... information refers to questions
... pieces of information can be combined
... information can be focussed on the questions of interest
Towards a formal Framework
information exists in pieces: φ, ψ ∈ Φ
information refers to questions: there is a universe of questions r and every piece of information φ ∈ Φ refers to a finite set of questions d(φ) ⊆ r
combination of information: φ ⊗ ψ
focussing of information: if d(φ) = x and y ⊆ x then φ↓y ∈ Φ
... and the same again for nerds ...
This is a two-sorted algebra (Φ, r ) with
universe of questions r and information pieces Φ
labeling operator d : Φ → P(r )
combination operator ⊗ : Φ × Φ → Φ
focussing operator ↓: Φ × P(r ) → Φ
But the operations cannot be arbitrary: they must satisfy some rules !
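A minimal sketch of this signature as a Python interface (my own illustration; the class and method names are not from the talk). Questions are modelled as frozensets of variable names:

from abc import ABC, abstractmethod
from typing import FrozenSet

Questions = FrozenSet[str]

class Piece(ABC):
    """One sort of the two-sorted algebra: a piece of information in Φ."""

    @abstractmethod
    def label(self) -> Questions:
        """d(φ): the set of questions this piece refers to."""

    @abstractmethod
    def combine(self, other: "Piece") -> "Piece":
        """φ ⊗ ψ: aggregate this piece with another one."""

    @abstractmethod
    def focus(self, x: Questions) -> "Piece":
        """φ↓x: focus this piece on the questions x ⊆ d(φ)."""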
Axioms of Information
1. it should not matter in which order information is combined:
   φ ⊗ ψ = ψ ⊗ φ  and  (φ ⊗ ψ) ⊗ ν = φ ⊗ (ψ ⊗ ν)
2. a combination refers to the union of the question sets:
   d(φ ⊗ ψ) = d(φ) ∪ d(ψ)
3. focussing information on x ⊆ d(φ) gives information about x:
   d(φ↓x) = x
4. focussing can be done step-wise, i.e. if x ⊆ y ⊆ d(φ):
   φ↓x = (φ↓y)↓x
5. combining a piece of information with a part of itself gives nothing new:
   φ ⊗ φ↓x = φ
The Combination Axiom
How shall ⊗ and ↓ behave with respect to each other ?
6. If φ, ψ ∈ Φ with d(φ) = x and d(ψ) = y then
   (φ ⊗ ψ)↓x = φ ⊗ ψ↓x∩y
Compare with the distributive law: (a × b) + (a × c) = a × (b + c)
Definition (Kohlas, 2003)
A system (Φ, r) satisfying the six axioms is called an information algebra
Relational Databases
Relations are pieces of information
φ =
    Player       Club        Goals
    Ronaldinho   Barcelona   7
    Eto’o        Barcelona   5
    Henry        Arsenal     5
    Pires        Arsenal     2

ψ =
    Player       Nationality
    Ronaldinho   Brazil
    Eto’o        Cameroon
    Henry        France
    Pires        France

Combination is natural join and focussing is projection:

φ ⊗ ψ =
    Player       Club        Goals   Nationality
    Ronaldinho   Barcelona   7       Brazil
    Eto’o        Barcelona   5       Cameroon
    Henry        Arsenal     5       France
    Pires        Arsenal     2       France

(φ ⊗ ψ)↓{Goals, Nationality} =
    Goals   Nationality
    7       Brazil
    5       Cameroon
    5       France
    2       France
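A self-contained sketch of this instance (my own illustration): a relation is a frozenset of rows, a row is a sorted tuple of (attribute, value) pairs, combination is natural join and focussing is projection. The final assertion checks the combination axiom on the example above:

def label(rel):
    """d(φ): the attributes a relation speaks about."""
    return {a for row in rel for a, _ in row}

def combine(phi, psi):
    """Natural join: merge all pairs of rows that agree on shared attributes."""
    shared = label(phi) & label(psi)
    return frozenset(
        tuple(sorted({**dict(r), **dict(s)}.items()))
        for r in phi for s in psi
        if all(dict(r)[a] == dict(s)[a] for a in shared))

def focus(phi, x):
    """Projection onto the attribute set x."""
    return frozenset(tuple((a, v) for a, v in row if a in x) for row in phi)

phi = frozenset({
    (("Club", "Barcelona"), ("Goals", 7), ("Player", "Ronaldinho")),
    (("Club", "Barcelona"), ("Goals", 5), ("Player", "Eto'o")),
    (("Club", "Arsenal"),   ("Goals", 5), ("Player", "Henry")),
    (("Club", "Arsenal"),   ("Goals", 2), ("Player", "Pires")),
})
psi = frozenset({
    (("Nationality", "Brazil"),   ("Player", "Ronaldinho")),
    (("Nationality", "Cameroon"), ("Player", "Eto'o")),
    (("Nationality", "France"),   ("Player", "Henry")),
    (("Nationality", "France"),   ("Player", "Pires")),
})

print(focus(combine(phi, psi), {"Goals", "Nationality"}))

# combination axiom: (φ ⊗ ψ)↓x = φ ⊗ ψ↓(x ∩ y) for x = d(φ), y = d(ψ)
x, y = label(phi), label(psi)
assert focus(combine(phi, psi), x) == combine(phi, focus(psi, x & y))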
Other Examples
Linear Equation Systems:
Systems of linear equations are pieces of information
Combination is union of equation systems
Focussing is Gaussian variable elimination (see the sketch after this list)
Logic:
Logical sentences are pieces of information
Combination is conjunction
Focussing is existential quantification
and many more ...
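A sketch of the linear-equation instance (my own illustration, using sympy): two equations are combined by taking them together, and the combined system is focussed on {x1, x3} by eliminating x2 with one Gaussian elimination step:

import sympy as sp

x1, x2, x3 = sp.symbols('x1 x2 x3')

# two pieces of information over the questions {x1, x2, x3}
phi = sp.Eq(x1 - 2*x2 + 2*x3, -1)
psi = sp.Eq(3*x1 + 5*x2 - 3*x3, 8)

# combination: simply the union of the two equation systems
combined = [phi, psi]

# focussing on {x1, x3}: solve phi for x2 and substitute into psi
x2_expr = sp.solve(phi, x2)[0]
focused = sp.simplify(psi.subs(x2, x2_expr))
print(focused)  # a single equation that mentions only x1 and x3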
Algebraic Information Theory (2003)
Given an information algebra (Φ, r ) we define
φ ≤ ψ if and only if φ ⊗ ψ = ψ
φ is less informative than ψ if it is absorbed by ψ
this relation forms a partial order called the order of information
Algebraic information theory does not measure information but
compares information pieces based on how informative they are
with respect to each other
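A small sketch (my own illustration) on the set instance with a single fixed question, where combination is intersection; the order then says that smaller sets are more informative:

def combine(phi, psi):
    return phi & psi

def leq(phi, psi):
    """φ ≤ ψ iff φ ⊗ ψ = ψ, i.e. φ is absorbed by ψ."""
    return combine(phi, psi) == psi

phi = {3, 5, 7}       # "the value is 3, 5 or 7"
psi = {3, 7}          # strictly more informative
print(leq(phi, psi))  # True
print(leq(psi, phi))  # False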
Neutral Information
Some information pieces are special. Usually, exactly one such element of each type exists for every set of questions x ⊆ r.
Neutral Information: for every x ⊆ r there exists ex ∈ Φ such that
    φ ⊗ ex = φ whenever d(φ) = x, and ex↓y = ey for y ⊆ x
Combination with neutral information has no effect
From neutral information we can only extract neutral information
Contradictory Information
Contradictory Information: for x ⊆ y ⊆ r there exists zx ∈ Φ such that
    φ ⊗ zx = zx, and if d(φ) = y and φ↓x = zx then φ = zy
Contradictory information absorbs everything
Contradictions can only be derived from contradictions
Some interesting Properties
1. if φ ≤ ψ then d(φ) ⊆ d(ψ)
2. if x ⊆ d(φ) then ex ≤ φ
3. if d(φ) ⊆ x then φ ≤ zx
4. φ, ψ ≤ φ ⊗ ψ
5. φ ⊗ ψ = sup{φ, ψ}
6. if x ⊆ d(φ) then φ↓x ≤ φ
7. φ1 ≤ φ2 and ψ1 ≤ ψ2 imply φ1 ⊗ ψ1 ≤ φ2 ⊗ ψ2
8. if x ⊆ d(φ) ∩ d(ψ) then φ↓x ⊗ ψ↓x ≤ (φ ⊗ ψ)↓x
9. if x ⊆ d(φ) then φ ≤ ψ implies φ↓x ≤ ψ↓x
(Properties 4 and 7 are sanity-checked in the sketch below.)
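Such properties can be sanity-checked exhaustively on the small set instance from the previous slide: combination is intersection and φ ≤ ψ iff φ ∩ ψ = ψ. A check, not a proof, and my own illustration:

import itertools

S = frozenset({1, 2, 3, 4})
pieces = [frozenset(c) for r in range(len(S) + 1)
          for c in itertools.combinations(sorted(S), r)]

def leq(a, b):  # a ≤ b iff a ⊗ b = b, with ⊗ taken as intersection
    return (a & b) == b

# property 4: φ, ψ ≤ φ ⊗ ψ
assert all(leq(p, p & q) and leq(q, p & q)
           for p, q in itertools.product(pieces, repeat=2))

# property 7: combination is monotone in both arguments
assert all(leq(p1 & q1, p2 & q2)
           for p1, p2, q1, q2 in itertools.product(pieces, repeat=4)
           if leq(p1, p2) and leq(q1, q2))

print("properties 4 and 7 hold on all subsets of S")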
Conclusion
Quantitative information theory is a success story !
But we cannot measure everything
Quantitative theories do not reflect our perception of information
Algebraic information theory can be defined in a generic way on
every formalism that satisfies the axioms of an information
algebra
The basic concept of algebraic information theory is a partial
order of information