Introduction to some statistics of persistence diagrams

Statistics of Persistence Diagrams
Katharine Turner
University of Chicago
[email protected]
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
1/1
What is a statistic?
A statistic (singular) is a quantity that describes some attribute of a
sample of data. It summarizes some information about the data.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
2/1
What is a statistic?
A statistic (singular) is a quantity that describes some attribute of a
sample of data. It summarizes some information about the data.
A statistic is found using some statistical algorithm with the set of data as
input.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
2/1
What is a statistic?
A statistic (singular) is a quantity that describes some attribute of a
sample of data. It summarizes some information about the data.
A statistic is found using some statistical algorithm with the set of data as
input.
Basic descriptive statistics are often measures of central tendency and
their corresponding measures of variability or dispersion.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
2/1
What is a statistic?
A statistic (singular) is a quantity that describes some attribute of a
sample of data. It summarizes some information about the data.
A statistic is found using some statistical algorithm with the set of data as
input.
Basic descriptive statistics are often measures of central tendency and
their corresponding measures of variability or dispersion.
Measures of central tendency include the mean, median and mode.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
2/1
What is a statistic?
A statistic (singular) is a quantity that describes some attribute of a
sample of data. It summarizes some information about the data.
A statistic is found using some statistical algorithm with the set of data as
input.
Basic descriptive statistics are often measures of central tendency and
their corresponding measures of variability or dispersion.
Measures of central tendency include the mean, median and mode.
Measures of variability include the standard deviation (or variance), the
absolute deviation, and the range of the values (distance between the
minimum and maximum values of the variables).
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
2/1
Central tendencies as function minimisers
Central tendencies (and their corresponding measures of variability) are
solutions for optimizing different cost functions. These cost functions are
based on Lp norms on function spaces. We mainly care about when
p = 1, 2 and ∞.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
3/1
Central tendencies as function minimisers
Central tendencies (and their corresponding measures of variability) are
solutions for optimizing different cost functions. These cost functions are
based on Lp norms on function spaces. We mainly care about when
p = 1, 2 and ∞.
Given a sample a1 , a2 , . . . , an , some familiar statistical quantities are
minimisers of functions of the form
!1/p
N
X
Fp (x) =
|ai − x|p
F∞ (x) = sup |ai − x|
i
i=1
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
3/1
Central tendencies as function minimisers
Central tendencies (and their corresponding measures of variability) are
solutions for optimizing different cost functions. These cost functions are
based on Lp norms on function spaces. We mainly care about when
p = 1, 2 and ∞.
Given a sample a1 , a2 , . . . , an , some familiar statistical quantities are
minimisers of functions of the form
!1/p
N
X
Fp (x) =
|ai − x|p
F∞ (x) = sup |ai − x|
i
i=1
p = 2: the mean minimizes the mean squared error
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
3/1
Central tendencies as function minimisers
Central tendencies (and their corresponding measures of variability) are
solutions for optimizing different cost functions. These cost functions are
based on Lp norms on function spaces. We mainly care about when
p = 1, 2 and ∞.
Given a sample a1 , a2 , . . . , an , some familiar statistical quantities are
minimisers of functions of the form
!1/p
N
X
Fp (x) =
|ai − x|p
F∞ (x) = sup |ai − x|
i
i=1
p = 2: the mean minimizes the mean squared error
p = 1 : the median minimizes absolute deviation
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
3/1
Central tendencies as function minimisers
Central tendencies (and their corresponding measures of variability) are
solutions for optimizing different cost functions. These cost functions are
based on Lp norms on function spaces. We mainly care about when
p = 1, 2 and ∞.
Given a sample a1 , a2 , . . . , an , some familiar statistical quantities are
minimisers of functions of the form
!1/p
N
X
Fp (x) =
|ai − x|p
F∞ (x) = sup |ai − x|
i
i=1
p = 2: the mean minimizes the mean squared error
p = 1 : the median minimizes absolute deviation
p = ∞ : the mid-range point minimizes the maximum absolute
deviation
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
3/1
Mean and variance for data on the real line
The mean of a1 , a2 , . . . aN is the number µ which minimizes the of the
mean squared error
F2 (x) =
N
X
!1/2
|ai − x|2
i=1
The mean is thus
N
1 X
µ=
ai
N
i=1
The standard deviation is the value F2 (µ).
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
4/1
Median and absolute deviation for data on the real line
The median of a1 , a2 , . . . , aN , written in non-decreasing order, is the
number m which minimizes the absolute deviation
F1 (x) =
N
X
|ai − x|.
i=1
For N odd is unique and is a N+1 . For N even can be any number in the
2
interval [a N , a N+2 ]. The total cost of moving everything from the sample
2
2
data to the median is F1 (m)
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
5/1
Midrange point and range for data on the real line
The range of a1 , a2 , . . . , aN , written in non-decreasing order, is aN − a1
N
and its midpoint is a1 +a
2 . Consider the function
F∞ (x) = max{|ai − x|}.
The minimizer of F∞ is the midpoint and its value is half the range. This
represents the maximal cost of moving any point to the midpoint.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
6/1
Persistence diagrams
Persistence diagrams describe how the topology changes with the
progressive inclusion of sublevel sets.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
7/1
Persistence diagrams
Persistence diagrams describe how the topology changes with the
progressive inclusion of sublevel sets.
Definition
2
A persistence diagram is a countable multiset of points in R along with
the diagonal ∆ = {(x, y ) ∈ R2 | x = y }, where each point on the diagonal
has infinite multiplicity. We require some niceness assumptions.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
7/1
Given two diagrams X , Y we can considers bijections φ between the points
in X and the points in Y . Bijections always exist because there are
countably many points at every location on the diagonal. We only need to
consider bijection where off-diagonal points are either paired with
off-diagonal points or with the point on the diagonal that is closest to it.
Example: φ : X (red) → Y (blue)
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
8/1
Distance between diagrams
There are many choices of metrics in the space of persistence diagrams
just like there are different choices of metric on spaces of functions. We
will consider three choices which are analogous to L1 , L2 and L∞ on the
space of real valued functions.
One family of distances are
!1/p
dp (X , Y ) =
inf
φ:X →Y
X
kx −
φ(x)kpp
x∈X
for p ≥ 1 and taking limits for p = ∞ getting
d∞ (X , Y ) =
Katharine Turner (U of Chicago)
inf
φ:X →Y
max{kx − φ(x)k∞ }
Statistics of Persistence Diagrams
9/1
The optimal pairing
optimal pairing
We will call a bijection between points optimal for dp if it achieves the
infimum in the definition of dp .
Returning to our example.
The optimal choice is the first one for all values of p.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
10 / 1
Non-uniqueness away from the diagonal
This example works for every p ∈ [1, ∞]. Matching the points vertically or
horizontally involves the same cost.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
11 / 1
Non-uniqueness involving the diagonal, p = 1
Given diagram (red), there region which distinguishes whether it costs less
to pair both points to the diagonal (purple) than pairing them to each
other (blue). It costs the same on the boundary (green).
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
12 / 1
Non-uniqueness involving the diagonal, p = 2
Given diagram (red), there is a parabola which bounds the region which
distinguishes whether it costs less to pair both points to the diagonal than
pairing them to each other.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
13 / 1
Statistical qualities can thus be defined on the space of persistence
diagrams by analogy using these different distances. Given diagrams
X1 , X2 . . . XN let
Fp (Y ) =
N
X
!1/p
dp (Xi , Y )p
F∞ (Y ) = sup d∞ (Y , Xi ).
i
i=1
The mean µ is the diagram which minimizes F2 and F2 (µ) is the
standard deviation.
The median m is the diagram which minimizes F1 and F1 (m) is the
absolute deviation.
The “midpoint” m̂ is the diagram which minimizes F∞ and F∞ (m̂) is
the maximal absolute deviation.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
14 / 1
“Mean” of points in R2 and copies of the diagonal
Lemma
Let (a1 , b1 ), (a2 , b2 ), . . . , (ak , bk ) be points in the plane. Let x̂ be the
mean of a1 , a2 . . . ak and ŷ be the mean of b1 , b2 , . . . , bk . Then
!
k ŷ + (n − k) x̂+ŷ
k x̂ + (n − k) x̂+ŷ
2
2
,
(x, y ) :=
n
n
is the point in R2 which minimizes
k
X
k(x, y ) − (ai , bi )k22 +
i=1
Katharine Turner (U of Chicago)
n
X
k(x, y ) − ∆k22
i=k+1
Statistics of Persistence Diagrams
15 / 1
“Mean” of points in R2 and copies of the diagonal
Lemma
Let (a1 , b1 ), (a2 , b2 ), . . . , (ak , bk ) be points in the plane. Let x̂ be the
mean of a1 , a2 . . . ak and ŷ be the mean of b1 , b2 , . . . , bk . Then
!
k ŷ + (n − k) x̂+ŷ
k x̂ + (n − k) x̂+ŷ
2
2
,
(x, y ) :=
n
n
is the point in R2 which minimizes
k
X
k(x, y ) − (ai , bi )k22 +
i=1
n
X
k(x, y ) − ∆k22
i=k+1
We call this (x, y ) the “mean” of (a1 , b1 ), (a2 , b2 ), . . . , (ak , bk ) and n − k
copies of the diagonal.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
15 / 1
“Mean” of 3 points in R2 and 2 copies of the diagonal
The red, blue and purple points are the (ai , bi ). The black point is their
arithmetic mean - (x̂, ŷ ).
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
16 / 1
“Mean” of 3 points in R2 and 2 copies of the diagonal
The orange is the point on the diagonal closest to the black.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
17 / 1
“Mean” of 3 points in R2 and 2 copies of the diagonal
The green point is the mean of the red, blue, purple and 2 copies of the
orange. It is the weighted average of the black and orange.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
18 / 1
“Median” of points in R2 and copies of the diagonal
Lemma
Suppose k > n/2. Let (a1 , b1 ), (a2 , b2 ), . . . , (ak , bk ) be points in the
plane. Let (x, y ) be the point in R2 where x is the median of a1 , a2 . . . ak
with n − k copies of −∞ and y is the median of b1 , b2 , . . . , bk with n − k
copies of ∞. Then (x, y ) is the point in R2 which minimizes
k
X
k(x, y ) − (ai , bi )k1 +
i=1
Katharine Turner (U of Chicago)
n
X
k(x, y ) − ∆k1
i=k+1
Statistics of Persistence Diagrams
19 / 1
“Median” of points in R2 and copies of the diagonal
Lemma
Suppose k > n/2. Let (a1 , b1 ), (a2 , b2 ), . . . , (ak , bk ) be points in the
plane. Let (x, y ) be the point in R2 where x is the median of a1 , a2 . . . ak
with n − k copies of −∞ and y is the median of b1 , b2 , . . . , bk with n − k
copies of ∞. Then (x, y ) is the point in R2 which minimizes
k
X
k(x, y ) − (ai , bi )k1 +
i=1
n
X
k(x, y ) − ∆k1
i=k+1
We call this (x, y ) the “median” of (a1 , b1 ), (a2 , b2 ), . . . , (ak , bk ) and
n
P−k k copies of the diagonalPifn
Pk
i=1 k(x, y ) − (ai , bi )k1 +
i=k+1 k(x, y ) − ∆k1 <
i=1 k∆ − (ai , bi )k1 .
Otherwise we say the “median” is the diagonal.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
19 / 1
“Median” of points in R2 and copies of the diagonal
Lemma
If k < n/2 then
k
X
k(x, y ) − (ai , bi )k1 +
i=1
n
X
k(x, y ) − ∆k1 >
i=k+1
k
X
k∆ − (ai , bi )k1
i=1
for every point (x, y ) ∈ R2
When k < n/2 we say that the “median” of (x1 , y1 ), (x2 , y2 ), . . . , (xk , yk )
and n − k copies of the diagonal is the diagonal.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
20 / 1
“Median” of 3 points in R2 and 2 copies of the diagonal
The red, blue and purple points are the (ai , bi ).
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
21 / 1
“Median” of 3 points in R2 and 2 copies of the diagonal
To find the x coordinate we take the median of {a1 , a2 , a3 , ∞, ∞}.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
22 / 1
“Median” of 3 points in R2 and 2 copies of the diagonal
To find the y coordinate we take the median of {b1 , b2 , b3 , −∞, −∞}.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
23 / 1
“Median” of 3 points in R2 and 2 copies of the diagonal
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
24 / 1
Selections and Matchings
Given a set of diagrams X1 , · · · , XN , a selection is a choice of one point
from each diagram, where that point could be ∆.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
25 / 1
Selections and Matchings
Given a set of diagrams X1 , · · · , XN , a selection is a choice of one point
from each diagram, where that point could be ∆.
A matching is a set of selections so that every off-diagonal point of every
diagram is part of exactly one selection.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
25 / 1
Selections and Matchings
Given a set of diagrams X1 , · · · , XN , a selection is a choice of one point
from each diagram, where that point could be ∆.
A matching is a set of selections so that every off-diagonal point of every
diagram is part of exactly one selection.
Each matching Φ gives a candidate µΦ for the mean by taking the diagram
whose points are the means of each of the selections. The mean is one of
these µΦ so we only need to compare the F2 (µΦ ) over all matchings Φ.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
25 / 1
Selections and Matchings
Given a set of diagrams X1 , · · · , XN , a selection is a choice of one point
from each diagram, where that point could be ∆.
A matching is a set of selections so that every off-diagonal point of every
diagram is part of exactly one selection.
Each matching Φ gives a candidate µΦ for the mean by taking the diagram
whose points are the means of each of the selections. The mean is one of
these µΦ so we only need to compare the F2 (µΦ ) over all matchings Φ.
Each matching Φ gives a candidate mΦ for the median by taking the
diagram whose points are the medians of each of the selections. The
median is one of these mΦ so we only need to compare the F1 (mΦ ) over
all matchings Φ.
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
25 / 1
Description of local minimums of F2
Theorem
Let X1 , . . . , Xm be persistence diagrams with only finitely many
off-diagonal points. W = {wj } is a local minimum of
P
2 1/2 if and only if there is a unique optimal
F2 (Y ) = m1 m
i=1 d2 (Xi , Y )
pairing from W to each of the Xi , which we denote as φi , and each wj is
the arithmetic mean of the points {φi (wj )}i=1,2,...,m .
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
26 / 1
Description of local minimums of F1
Theorem
Let X1 , . . . , Xm be persistence diagrams with only finitely many
off-diagonal P
points. W = {wj } is a local minimum of
F1 (Y ) = m1 m
i=1 d1 (Xi , Y ) if and only if there whenever φi : W → Xi for
every optimal pairings each wj is the “median” of the points
{φi (wj )}i=1,2,...,m .
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
27 / 1
Mean (green) of three diagrams
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
28 / 1
Median (green) of three diagrams
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
29 / 1
The End
Katharine Turner (U of Chicago)
Statistics of Persistence Diagrams
30 / 1