Mrs. Gerber’s Lemma with quantum side information

David Reeb, Christoph Hirche
Classical case

• random variables X1, Y1, X2, Y2 binary
• (X1, Y1) indep. from (X2, Y2)
• X1, X2 ∈ {0, 1} binary:
    X1 ∼ (p, 1 − p) ,  X2 ∼ (q, 1 − q)


• consider addition (mod 2); convolution:
    X1 + X2 ∼ ( p ∗ q , 1 − p ∗ q )
    p ∗ q := p(1 − q) + (1 − p)q
    1 − p ∗ q = pq + (1 − p)(1 − q)
• H1 := H(X1|Y1) ,  H2 := H(X2|Y2)
• binary entropy h(x) := −x log x − (1 − x) log(1 − x)
• Mrs. Gerber’s Lemma (classical case, 1-copy, conditioning):

    H(X1 + X2 | Y1Y2) ≥ h( h⁻¹(H1) ∗ h⁻¹(H2) ) =: gc(H1, H2)
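As a numerical sanity check, the lemma can be tested on random side-information channels; a minimal sketch in bits, so that log 2 becomes 1 (all helper names here are mine, not the poster’s):

```python
import math, random

def h(x):
    """Binary entropy in bits."""
    return 0.0 if x <= 0 or x >= 1 else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def h_inv(v):
    """Inverse of h on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < v else (lo, mid)
    return (lo + hi) / 2

def conv(a, b):
    """Binary convolution a * b."""
    return a * (1 - b) + (1 - a) * b

def g_c(H1, H2):
    return h(conv(h_inv(H1), h_inv(H2)))

random.seed(1)
for _ in range(200):
    # random side information: Y1, Y2 ternary, arbitrary Pr[Xi = 1 | Yi = y]
    py1 = [random.random() + 1e-3 for _ in range(3)]
    py2 = [random.random() + 1e-3 for _ in range(3)]
    py1 = [v / sum(py1) for v in py1]
    py2 = [v / sum(py2) for v in py2]
    a = [random.random() for _ in range(3)]
    b = [random.random() for _ in range(3)]
    H1 = sum(p * h(t) for p, t in zip(py1, a))
    H2 = sum(p * h(t) for p, t in zip(py2, b))
    lhs = sum(p1 * p2 * h(conv(t1, t2))
              for p1, t1 in zip(py1, a) for p2, t2 in zip(py2, b))
    assert lhs >= g_c(H1, H2) - 1e-9   # H(X1+X2|Y1Y2) >= g_c(H1, H2)
```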
Proof:

(a) no side information:

    H(X1 + X2) = gc( H(X1), H(X2) )

(b) conditioning; gc(·, ·) convex in each entry:
    H(X1 + X2 | Y1Y2) = Σ_{y1,y2} p(y1) p(y2) H(X1 + X2 | y1, y2)
                      ≥ gc( Σ_{y1} p(y1) H1^{y1} , Σ_{y2} p(y2) H2^{y2} )
                      = gc( H1, H2 ) ,

    where Hi^{yi} := H(Xi | Yi = yi).
[plot: gq(H1, H1) − H1 vs. H1, i.e. for (X1, Y1) ∼ (X2, Y2)]

Applications:

• Wyner & Ziv (IEEE 1973): limits of multi-user communication

[figure: scan of A. D. Wyner and J. Ziv, “A Theorem on the Entropy of Certain Binary Sequences and Applications: Part I”, IEEE Trans. Inform. Theory, Nov. 1973, p. 769]
• X = (X1, . . . , Xn) binary random n-vector
• Z = (Z1, . . . , Zn) binary i.i.d. noise ∼ (p0, 1 − p0), 0 < p0 ≤ ½
• Y := X + Z  (addition mod 2)

Theorem I′ (Wyner & Ziv): with X and Y as defined previously and satisfying H(X) ≥ nν, then

    H(Y) ≥ n h( p0 ∗ h⁻¹(ν) ) ,                              (2)

with equality if and only if the {Xk} are independent with H{Xk} = ν (k = 1, . . . , n).

Note that h(p0 ∗ h⁻¹(ν)) > ν, so (2) expresses the reasonable fact that the noisy output Y has strictly higher entropy rate than the input X.

* This result is known as “Mrs. Gerber’s Lemma” in honor of a certain lady whose presence was keenly felt by the authors at the time this research was done.
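For small n, the Wyner–Ziv bound can be verified by exhaustive enumeration; a sketch with n = 3, in bits (helper names mine):

```python
import math, random
from itertools import product

def h(x):
    """Binary entropy in bits."""
    return 0.0 if x <= 0 or x >= 1 else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def h_inv(v):
    """Inverse of h on [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < v else (lo, mid)
    return (lo + hi) / 2

def conv(a, b):
    return a * (1 - b) + (1 - a) * b

def entropy(dist):
    return -sum(p * math.log2(p) for p in dist if p > 0)

random.seed(3)
n, p0 = 3, 0.11
for _ in range(50):
    px = [random.random() for _ in range(2 ** n)]
    s = sum(px)
    px = [v / s for v in px]
    nu = entropy(px) / n                      # so H(X) = n * nu exactly
    # push X through n independent uses of BSC(p0): Y = X xor Z
    py = [0.0] * (2 ** n)
    for x, y in product(range(2 ** n), repeat=2):
        flips = bin(x ^ y).count("1")
        py[y] += px[x] * (p0 ** flips) * ((1 - p0) ** (n - flips))
    assert entropy(py) >= n * h(conv(p0, h_inv(nu))) - 1e-9
```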
Leibniz Universität Hannover, Germany
Universitat Autònoma de Barcelona, Spain
Proof in restricted cases: H ≈ 0 (i.e. f ≈ 0), and H ≳ 0.2
Quantum setup

• quantum side information:
    (X1, Y1) = p |0⟩⟨0| ⊗ ρ0 + (1 − p) |1⟩⟨1| ⊗ ρ1 ,
    (X2, Y2) = q |0⟩⟨0| ⊗ σ0 + (1 − q) |1⟩⟨1| ⊗ σ1
• (X1 + X2, Y1, Y2) = Tr_{X2} ∘ CNOT_{(X1,X2)→(X1+X2,X2)}
• H1 := H(X1|Y1) ,  H2 := H(X2|Y2)
• relevant to polar coding for the c-q-channel C: 0 ↦ ρ0, 1 ↦ ρ1 :
• τ_{X1Y1} = τ_{X2Y2} = ½ |0⟩⟨0| ⊗ ρ0 + ½ |1⟩⟨1| ⊗ ρ1
• τ_ABC = τ_{(X1+X2)(X2)(Y1Y2)} = CNOT( τ_{X1Y1} ⊗ τ_{X2Y2} )
• notation: f := F(ρ0, ρ1) fidelity
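The construction is easy to reproduce numerically: build τ_ABC for random qubit channel outputs and check the identity H(X1+X2|Y1Y2) − H(X1|Y1) = I(A:B|C) ≥ 0, which holds by strong subadditivity. A numpy sketch (function names mine):

```python
import numpy as np

rng = np.random.default_rng(7)

def rand_state(d=2):
    """Random full-rank density matrix."""
    a = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    rho = a @ a.conj().T
    return rho / np.trace(rho).real

def S(rho):
    """von Neumann entropy in nats."""
    ev = np.linalg.eigvalsh(rho)
    ev = ev[ev > 1e-12]
    return float(-(ev * np.log(ev)).sum())

proj = [np.diag([1.0, 0.0]), np.diag([0.0, 1.0])]

for _ in range(3):
    rho = [rand_state() for _ in range(2)]   # outputs of channel 1 (X1 -> Y1)
    sig = [rand_state() for _ in range(2)]   # outputs of channel 2 (X2 -> Y2)
    # tau_ABC with A = X1+X2, B = X2, C = Y1Y2, for p = q = 1/2
    tau = np.zeros((16, 16), dtype=complex)
    for x1 in (0, 1):
        for x2 in (0, 1):
            tau += 0.25 * np.kron(np.kron(np.kron(proj[x1 ^ x2], proj[x2]),
                                          rho[x1]), sig[x2])
    t = tau.reshape([2] * 8)                 # subsystems (A, B, Y1a, Y1b) twice
    tau_AC = np.einsum('abcdebgh->acdegh', t).reshape(8, 8)  # trace out B
    tau_BC = np.einsum('abcdafgh->bcdfgh', t).reshape(8, 8)  # trace out A
    tau_C = np.einsum('abcdabgh->cdgh', t).reshape(4, 4)     # trace out A, B

    H_minus = S(tau_AC) - S(tau_C)           # H(X1+X2 | Y1Y2)
    rho_bar = 0.5 * (rho[0] + rho[1])
    tau_X1Y1 = 0.5 * (np.kron(proj[0], rho[0]) + np.kron(proj[1], rho[1]))
    H1 = S(tau_X1Y1) - S(rho_bar)            # H(X1 | Y1)
    I_ABC = S(tau_AC) + S(tau_BC) - S(tau) - S(tau_C)

    assert abs((H_minus - H1) - I_ABC) < 1e-8   # H^- - H1 = I(A:B|C)
    assert I_ABC >= -1e-9                       # strong subadditivity
```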
going via f = F(ρ0, ρ1) cannot work:

(a) the best Petz map in the Fawzi-Renner bound
    for pure ρ0 = ψ0, ρ1 = ψ1 yields:

      H⁻ − H ≥ −2 log [ ( √(1 + f) + √(1 − f) ) / 2 ] = f²/4 + O(f⁴)

(b) in general H ≥ f log 2 can occur, e.g. for
      ρ0 = diag(f, 1 − f, 0) ,  ρ1 = diag(f, 0, 1 − f) .
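The commuting pair ρ0 = diag(f, 1−f, 0), ρ1 = diag(f, 0, 1−f) can be checked directly, since both the fidelity and H = S(XY) − S(Y) reduce to classical computations for diagonal states; a sketch in nats (function names mine):

```python
import numpy as np

def S(p):
    """Shannon/von Neumann entropy (nats) of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 1e-15]
    return float(-(p * np.log(p)).sum())

for f in (0.1, 0.3, 0.7, 0.95):
    rho0 = np.array([f, 1 - f, 0.0])   # eigenvalues of diag(f, 1-f, 0)
    rho1 = np.array([f, 0.0, 1 - f])   # eigenvalues of diag(f, 0, 1-f)
    # commuting states: fidelity reduces to the classical Bhattacharyya sum
    fid = float(np.sqrt(rho0 * rho1).sum())
    assert abs(fid - f) < 1e-12
    # H = H(X|Y) = S(XY) - S(Y) for (1/2)|0><0| (x) rho0 + (1/2)|1><1| (x) rho1
    H = S(np.concatenate([rho0 / 2, rho1 / 2])) - S((rho0 + rho1) / 2)
    assert abs(H - f * np.log(2)) < 1e-9   # H equals f * log 2 exactly here
```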
Conjecture

H⁻ := H(X1 + X2 | Y1Y2) is lower-bounded as:

(1) for H1 + H2 ≤ log 2:
      H⁻ ≥ gc( H1, H2 ) ,
    tight if no side information.

(2) for H1 + H2 ≥ log 2:
      H⁻ ≥ gc( log 2 − H1, log 2 − H2 ) + H1 + H2 − log 2 ,
    tight for pure ρ0, ρ1, σ0, σ1 and p = q = 1/2.

notation: gq(H1, H2) and gq(H) := gq(H, H)
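Taking the conjectured bound at face value with H1 = H2 = H, the contraction constant κ can be estimated by maximizing (2H − gq(H))/H over 0 < H ≤ ½ log 2; a sketch in bits, where log 2 becomes 1 (names mine). The grid maximum comes out near 0.57, consistent with κ ≈ 0.6:

```python
import math

def h(x):
    return 0.0 if x <= 0 or x >= 1 else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def h_inv(v):
    lo, hi = 0.0, 0.5
    for _ in range(100):
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if h(mid) < v else (lo, mid)
    return (lo + hi) / 2

def g_q(H):
    """Conjectured lower bound g_q(H, H) on H^-, in bits (log 2 -> 1)."""
    if 2 * H <= 1:
        a = h_inv(H)
        return h(2 * a * (1 - a))           # g_c(H, H), since a*a = 2a(1-a)
    a = h_inv(1 - H)
    return h(2 * a * (1 - a)) + 2 * H - 1

# worst-case contraction of the good channel under the conjecture
kappa = max((2 * H - g_q(H)) / H for H in (k / 2000 for k in range(1, 1001)))
assert 0.55 < kappa < 0.60
```

The bad-channel branch gives the identical constant, since gq(1 − H) = gq(H) + 1 − 2H in these units.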
classical proof fails (must fail!): conditioning on quantum register

quantitative strengthening of SSA:

    H⁻ − H1 = I(X1 + X2 : X2 | Y1Y2) ≥ 0 .

(a) Fawzi-Renner: ∃ R⁰_{C→AC} (and thus ∃ R) s.th.

    H⁻ − H = I(A : B | C)_τ
           ≥ −2 log F( τ_ABC , R⁰_{C→AC}(τ_BC) )
           = −2 log [ ½ F( ω0 ⊗ ρ0 , R(ρ0) ) + ½ F( ω1 ⊗ ρ1 , R(ρ1) ) ] ,

    with ω_{0/1} := ½ |0⟩⟨0| ⊗ ρ_{0/1} + ½ |1⟩⟨1| ⊗ ρ_{1/0} .

Polar coding application
two identical binary-input channels C:
(X1 ↦ Y1), (X2 ↦ Y2) with p = q = 1/2
guessing entropy: H1 = H2 ≡ H

• polar transformation yields guessing entropies:
    bad channel  C⁻ :  H⁻ ≥ gq(H) ≥ H ,
    good channel C⁺ :  H⁺ ≤ 2H − gq(H) ≤ H .
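The pair (C⁻, C⁺) conserves entropy: since (X1, X2) ↦ (X1 + X2, X2) is a bijection, the chain rule gives H⁻ + H⁺ = 2H. A classical spot-check (names mine):

```python
import math, random
from collections import defaultdict

def H_cond(joint):
    """H(U|V) in bits; joint is a dict {(u, v): prob}."""
    pv = defaultdict(float)
    for (u, v), p in joint.items():
        pv[v] += p
    return -sum(p * math.log2(p / pv[v]) for (u, v), p in joint.items() if p > 0)

random.seed(5)
w = [[random.random() for _ in range(3)] for _ in range(2)]   # channel x -> y
w = [[t / sum(row) for t in row] for row in w]

j1, jm, jp = defaultdict(float), defaultdict(float), defaultdict(float)
for x1 in (0, 1):
    for y1 in range(3):
        j1[(x1, y1)] += 0.5 * w[x1][y1]
        for x2 in (0, 1):
            for y2 in range(3):
                p = 0.25 * w[x1][y1] * w[x2][y2]      # p = q = 1/2
                jm[(x1 ^ x2, (y1, y2))] += p          # C-: X1+X2 from Y1Y2
                jp[(x2, (x1 ^ x2, y1, y2))] += p      # C+: X2 from X1+X2, Y1Y2

H, Hm, Hp = H_cond(j1), H_cond(jm), H_cond(jp)
assert abs(Hm + Hp - 2 * H) < 1e-9    # chain rule: entropy conserved
assert Hp <= H + 1e-9                 # good channel at most as uncertain
assert Hm >= H - 1e-9                 # bad channel at least as uncertain
```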
Even if H ≤ f log 2 held, then only

    H⁻ − H ≥ H² / ( 4 (log 2)² ) ,

i.e. only

    H⁺ ≤ H ( 1 − H / ( 4 (log 2)² ) )      (small H) .

[figure: polar transform circuit: inputs X1, X2 through two copies of C give outputs Y1, Y2; combined channels C⁻ (output X1 + X2) and C⁺ (output X2)]

If κ < 1 exists such that
• H⁺ ≤ κ H                       (for H ≤ ½ log 2),
• (log 2 − H⁻) ≤ κ (log 2 − H)   (bigger H),
then polarize to H⁺ ≤ ε after log(1/ε) recursions.

polynomial gap to capacity (1304.4321):
achieve rate C − ε with polynomial blocklength
    L ≈ (1/ε)^{(log 2)/log(1/κ)} .
Conjecture would imply: κ ≈ 0.6 .

Related questions

• Entropy-power ineq for Gaussian X1, X2 ∈ Rⁿ:
    e^{2H(X1+X2)/n} ≥ e^{2H(X1)/n} + e^{2H(X2)/n}
  ⇒ H(X1 + X2) ≥ (n/2) log( e^{2H(X1)/n} + e^{2H(X2)/n} ) =: G( H(X1), H(X2) ) .
  G jointly convex ⇒ class. conditioning OK
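That G is jointly convex (for n = 1 it is half a log-sum-exp of affine maps) can be spot-checked numerically; a sketch (names mine):

```python
import math, random

def G(u, v, n=1):
    """G(H1, H2) = (n/2) * log(e^(2u/n) + e^(2v/n))."""
    return (n / 2) * math.log(math.exp(2 * u / n) + math.exp(2 * v / n))

random.seed(6)
for _ in range(1000):
    u1, v1, u2, v2 = (random.uniform(-5, 5) for _ in range(4))
    lam = random.random()
    mid = G(lam * u1 + (1 - lam) * u2, lam * v1 + (1 - lam) * v2)
    # joint convexity of G
    assert mid <= lam * G(u1, v1) + (1 - lam) * G(u2, v2) + 1e-9
```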
• conditioning on quantum systems?

(b) concavity of arccos; triangle inequality for A(·, ·) := arccos F(·, ·);
    F monotonic under R:

    arccos [ ½ F( ω0 ⊗ ρ0 , R(ρ0) ) + ½ F( ω1 ⊗ ρ1 , R(ρ1) ) ]
      ≥ ½ A( ω0 ⊗ ρ0 , R(ρ0) ) + ½ A( ω1 ⊗ ρ1 , R(ρ1) )
      ≥ ½ A( ω0 ⊗ ρ0 , ω1 ⊗ ρ1 ) − ½ A( R(ρ0) , R(ρ1) )
      ≥ ½ arccos [ F(ω0, ω1) · F(ρ0, ρ1) ] − ½ A( ρ0 , ρ1 )
      = ½ arccos f² − ½ arccos f ,

⇒  H⁻ − H ≥ −2 log cos ( ½ arccos f² − ½ arccos f ) .
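The resulting bound is strictly positive for 0 < f < 1 and behaves like f²/4 as f → 0; a numerical sketch (the function name is mine):

```python
import math

def bound(f):
    """-2 log cos( (1/2) arccos(f^2) - (1/2) arccos(f) ), in nats."""
    return -2.0 * math.log(math.cos(0.5 * math.acos(f * f) - 0.5 * math.acos(f)))

for k in range(1, 100):
    assert bound(k / 100) > 0          # strict improvement for 0 < f < 1
assert abs(bound(0.01) / (0.01 ** 2 / 4) - 1) < 0.05   # ~ f^2/4 as f -> 0
```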
(c) relate f and H (Kim et al., Roga et al.):

    (1 − f)² / 2  ≤  S( (ρ0 + ρ1)/2 ) − ( S(ρ0) + S(ρ1) )/2  ≤  h( (1 − f)/2 ) ,

    where the middle term equals log 2 − H .

• X1, X2 bosonic systems @ beamsplitter:
  König & Smith (arXiv:1205.3409), De Palma et al. (arXiv:1402.0404)
• with conditioning, (X1, Y1), (X2, Y2) Gaussian:
  König (arXiv:1304.7031)
for H ≥ ½ log 2:

    H⁻ ≥ H + ( (log 2 − H)/20 ) / ( 1 − log(log 2 − H) )

⇒  (log 2 − H⁻) ≤ (log 2 − H) ( 1 − (1/20) / ( 1 − log(log 2 − H) ) ) .

• other (prime) alphabets? Abelian groups (G, ⊕)?
• many-copy (multi-mode) case, cf. original work by Wyner & Ziv