Information Complexity: An Overview
Rotem Oshman, Princeton CCI
Based on work by Braverman, Barak, Chen, Rao, and others
Charles River Science of Information Day 2014
Classical Information Theory
β€’ Shannon β€˜48, β€œA Mathematical Theory of Communication”
Motivation: Communication Complexity
β€’ Alice holds 𝑋, Bob holds π‘Œ; they communicate to compute 𝑓(𝑋, π‘Œ)
while exchanging as few bits as possible (a toy protocol is sketched below)
β€’ More generally: solve some task 𝑇(𝑋, π‘Œ)
β€’ Yao β€˜79, β€œSome complexity questions related to distributive computing”
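To make the model concrete, here is a minimal sketch in Python (illustrative; the protocol and names are not from the talk) of the trivial protocol for EQUALITY. The communication cost is the number of bits exchanged.

```python
# Minimal sketch of Yao's two-party model (illustrative, not from the talk).
# A protocol is a conversation: each message depends on the sender's input
# and the transcript so far. Cost = total bits exchanged.

def trivial_equality_protocol(x: str, y: str):
    """Alice holds x, Bob holds y (n-bit strings); decide whether x == y.
    Trivial protocol: Alice sends all of x, Bob replies with the answer."""
    transcript = []
    transcript.extend(x)             # Alice sends her n bits
    answer = '1' if x == y else '0'  # Bob compares and announces the result
    transcript.append(answer)
    return answer == '1', len(transcript)  # (output, communication cost)

eq, cost = trivial_equality_protocol("0110", "0110")
print(eq, cost)  # True, 5 bits; deterministic EQUALITY in fact needs Ξ©(n) bits
```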
Motivation: Communication Complexity
β€’ Applications:
– Circuit complexity
– Streaming algorithms
– Data structures
– Distributed computing
– Property testing
–…
Example: Streaming Lower Bounds
β€’ Streaming algorithm: reads the data in a single pass, keeping limited
state. How much space is required to approximate f(data)?
β€’ Reduction from communication complexity [Alon, Matias, Szegedy β€˜99]:
split the data between Alice and Bob; Alice runs the algorithm on her half
and sends the state of the algorithm to Bob, who finishes the run on his
half. Hence space β‰₯ one-way communication. (See the sketch below.)
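A minimal sketch of this reduction (illustrative; `StreamAlg` is a stand-in for any streaming algorithm whose state can be serialized): if the algorithm uses 𝑠 bits of state, serializing the state mid-stream yields a one-way protocol with 𝑠 bits of communication, so communication lower bounds give space lower bounds.

```python
# Sketch of the [AMS] reduction (illustrative). 'StreamAlg' stands in for
# any streaming algorithm; here, a toy exact distinct-elements counter.

class StreamAlg:
    def __init__(self):
        self.seen = set()
    def process(self, item):
        self.seen.add(item)
    def state(self):
        return self.seen              # in the reduction: serialized to bits
    @staticmethod
    def resume(state):
        alg = StreamAlg()
        alg.seen = set(state)
        return alg
    def output(self):
        return len(self.seen)

def one_way_protocol(alice_data, bob_data):
    alg = StreamAlg()
    for item in alice_data:           # Alice streams her half...
        alg.process(item)
    msg = alg.state()                 # ...and sends the state: |msg| = space
    alg = StreamAlg.resume(msg)       # Bob resumes from Alice's state
    for item in bob_data:
        alg.process(item)
    return alg.output()

print(one_way_protocol([1, 2, 3], [3, 4]))  # 4 distinct elements
```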
Advances in Communication Complexity
β€’ Very successful in proving unconditional lower
bounds, e.g.,
– Ξ©(𝑛) for set disjointness [KS’92, Razborov β€˜92]
– Ξ©(𝑛) for gap Hamming distance [Chakrabarti, Regev β€˜10]
β€’ But stuck on some hard questions
– Multi-party communication complexity
– Karchmer-Wigderson games
β€’ [Chakrabarti, Shi, Wirth, Yao ’01], [Bar-Yossef, Jayram, Kumar,
Sivakumar β€˜04]: use tools from information theory
Extending Information Theory to Interactive Computation
β€’ One-way communication:
– Task: send 𝑋 across the channel
– Cost: 𝐻(𝑋) bits
β€’ Shannon: in the limit over many instances
β€’ Huffman: 𝐻(𝑋) + 1 bits for one instance (see the sketch below)
β€’ Interactive computation:
– Task: e.g., compute 𝑓(𝑋, π‘Œ)
– Cost?
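A quick numeric check of the one-instance bound (a minimal sketch; the distribution is made up for illustration): build a Huffman code and compare its expected codeword length to 𝐻(𝑋).

```python
import heapq
from math import log2

def huffman_expected_length(probs):
    """Expected codeword length of a Huffman code for distribution 'probs'.
    Uses the fact that the expected length equals the sum of the
    probabilities of all internal nodes created during merging."""
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b              # every merge adds one bit to its subtree
        heapq.heappush(heap, a + b)
    return total

def entropy(probs):
    return -sum(p * log2(p) for p in probs if p > 0)

probs = [0.5, 0.25, 0.125, 0.125]    # illustrative distribution
H = entropy(probs)                    # = 1.75
L = huffman_expected_length(probs)    # = 1.75 here (dyadic probabilities)
assert H <= L < H + 1                 # Huffman: H(X) <= E[length] < H(X) + 1
print(H, L)
```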
Information Cost
β€’ Reminder: mutual information
𝐼(𝑋; π‘Œ) = 𝐻(𝑋) βˆ’ 𝐻(𝑋 ∣ π‘Œ) = 𝐻(π‘Œ) βˆ’ 𝐻(π‘Œ ∣ 𝑋)
β€’ Conditional mutual information:
𝐼(𝑋; π‘Œ ∣ 𝑍) = 𝐻(𝑋 ∣ 𝑍) βˆ’ 𝐻(𝑋 ∣ π‘Œ, 𝑍) = 𝐸𝑧 [𝐼(𝑋; π‘Œ ∣ 𝑍 = 𝑧)]
β€’ Basic properties (see the numeric check below):
– 𝐼(𝑋; π‘Œ) β‰₯ 0
– 𝐼(𝑋; π‘Œ) ≀ 𝐻(𝑋) and 𝐼(𝑋; π‘Œ) ≀ 𝐻(π‘Œ)
– Chain rule: 𝐼(π‘‹π‘Œ; 𝑍) = 𝐼(𝑋; 𝑍) + 𝐼(π‘Œ; 𝑍 ∣ 𝑋)
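A minimal numeric sanity check of these identities (the joint distribution below is made up purely for illustration):

```python
from math import log2
from itertools import product

# Arbitrary joint distribution p(x, y, z) over bits, for illustration only.
p = {(x, y, z): 0.0 for x, y, z in product([0, 1], repeat=3)}
p[(0, 0, 0)] = 0.3; p[(0, 1, 1)] = 0.2; p[(1, 0, 1)] = 0.1
p[(1, 1, 0)] = 0.25; p[(1, 1, 1)] = 0.15

def H(idx):
    """Entropy of the marginal on the given coordinate indices."""
    marg = {}
    for outcome, pr in p.items():
        key = tuple(outcome[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + pr
    return -sum(q * log2(q) for q in marg.values() if q > 0)

def I(a, b, c=()):
    """I(A; B | C) = H(AC) + H(BC) - H(ABC) - H(C); drop C for I(A; B)."""
    return H(a + c) + H(b + c) - H(a + b + c) - H(c)

X, Y, Z = (0,), (1,), (2,)
assert I(X, Y) >= -1e-12                     # nonnegativity
assert I(X, Y) <= min(H(X), H(Y)) + 1e-12    # bounded by marginal entropies
# Chain rule: I(XY; Z) = I(X; Z) + I(Y; Z | X)
assert abs(I(X + Y, Z) - (I(X, Z) + I(Y, Z, X))) < 1e-12
print("all identities hold for this distribution")
```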
Information Cost
β€’ Fix a protocol Ξ 
β€’ Notation abuse: let Ξ  also denote the
transcript of the protocol
β€’ Two ways to measure information cost:
– External information cost: 𝐼(Ξ ; π‘‹π‘Œ)
– Internal information cost: 𝐼(Ξ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ ; 𝑋 ∣ π‘Œ)
– Cost of a task: infimum over all protocols
– Which cost is β€œthe right one”?
Information Cost: Basic Properties
External information: 𝐼(Ξ ; π‘‹π‘Œ)
Internal information: 𝐼(Ξ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ ; 𝑋 ∣ π‘Œ)
β€’ Internal ≀ external
β€’ Internal can be much smaller, e.g.:
– 𝑋 = π‘Œ uniform over {0,1}^𝑛
– Ξ : Alice sends 𝑋 to Bob
– Internal information is 0 (Bob already knows 𝑋), external is 𝑛
(computed below)
β€’ But the two are equal if 𝑋, π‘Œ are independent
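A sketch that computes both costs for this example directly from the joint distribution of (𝑋, π‘Œ, Ξ ), with 𝑛 = 3:

```python
from math import log2
from itertools import product

n = 3
# X = Y uniform over {0,1}^n, transcript Pi = X (Alice sends her input).
support = [(x, x, x) for x in product((0, 1), repeat=n)]
p = 1.0 / len(support)               # uniform over the support

def H(idx):
    """Entropy of the marginal on the given coordinates of (X, Y, Pi)."""
    marg = {}
    for outcome in support:
        key = tuple(outcome[i] for i in idx)
        marg[key] = marg.get(key, 0.0) + p
    return -sum(q * log2(q) for q in marg.values() if q > 0)

def I(a, b, c=()):                   # I(A; B | C) via entropies
    return H(a + c) + H(b + c) - H(a + b + c) - H(c)

X, Y, Pi = (0,), (1,), (2,)
print("external:", I(Pi, X + Y))               # I(Pi; XY) = n = 3.0
print("internal:", I(Pi, Y, X) + I(Pi, X, Y))  # 0.0: far below external
```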
Information Cost: Basic Properties
External information: 𝐼(Ξ ; π‘‹π‘Œ)
Internal information: 𝐼(Ξ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ ; 𝑋 ∣ π‘Œ)
β€’ External information ≀ communication:
𝐼(Ξ ; π‘‹π‘Œ) ≀ 𝐻(Ξ ) ≀ |Ξ |.
Information Cost: Basic Properties
β€’ Internal information ≀ communication cost:
𝐼(Ξ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ ; 𝑋 ∣ π‘Œ) ≀ |Ξ |.
β€’ By induction: let Ξ  = Ξ 1 … Π𝑑 . Claim: for all π‘Ÿ ≀ 𝑑,
𝐼(Ξ β‰€π‘Ÿ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ β‰€π‘Ÿ ; 𝑋 ∣ π‘Œ) ≀ π‘Ÿ.
β€’ By the chain rule, what we know after π‘Ÿ rounds decomposes:
𝐼(Ξ β‰€π‘Ÿ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ β‰€π‘Ÿ ; 𝑋 ∣ π‘Œ)
= 𝐼(Ξ <π‘Ÿ ; π‘Œ ∣ 𝑋) + 𝐼(Ξ <π‘Ÿ ; 𝑋 ∣ π‘Œ)   [what we knew after π‘Ÿβˆ’1 rounds: ≀ π‘Ÿβˆ’1 by I.H.]
+ 𝐼(Ξ π‘Ÿ ; π‘Œ ∣ 𝑋, Ξ <π‘Ÿ ) + 𝐼(Ξ π‘Ÿ ; 𝑋 ∣ π‘Œ, Ξ <π‘Ÿ )   [what we learn in round π‘Ÿ, given what we already know]
Information vs. Communication
β€’ Want: 𝐼(Ξ π‘Ÿ ; π‘Œ ∣ 𝑋, Ξ <π‘Ÿ ) + 𝐼(Ξ π‘Ÿ ; 𝑋 ∣ π‘Œ, Ξ <π‘Ÿ ) ≀ 1.
β€’ Suppose Ξ π‘Ÿ is sent by Alice.
β€’ What does Alice learn?
– Ξ π‘Ÿ is a function of Ξ <π‘Ÿ and 𝑋, so
𝐼(Ξ π‘Ÿ ; π‘Œ ∣ 𝑋, Ξ <π‘Ÿ ) = 0.
β€’ What does Bob learn?
– 𝐼(Ξ π‘Ÿ ; 𝑋 ∣ π‘Œ, Ξ <π‘Ÿ ) ≀ 𝐻(Ξ π‘Ÿ ) ≀ |Ξ π‘Ÿ | = 1.
Information vs. Communication
β€’ We have:
Internal information ≀ communication
External information ≀ communication
Internal information ≀ external information
Information vs. Communication
β€’ β€œInformation cost = communication cost”?
– In the limit: internal information! [Braverman, Rao β€˜10]
– For one instance: external information! [Barak, Braverman, Chen, Rao β€˜10]
β€’ Big question: can protocols be compressed down to their internal
information cost?
– [Ganor, Kol, Raz ’14]: no!
– There is a task with internal IC = π‘˜ but CC = 2^π‘˜ .
– … but this remains open for functions with small output.
Information vs. Amortized Communication
β€’ Theorem [Braverman, Rao β€˜10]:
lim_{π‘›β†’βˆž} 𝐢𝐢(𝐹^𝑛 , πœ‡^𝑛 , πœ–) / 𝑛 = 𝐼𝐢(𝐹, πœ‡, πœ–).
β€’ The β€œβ‰€β€ direction: compression
β€’ The β€œβ‰₯” direction: direct sum
β€’ We know: 𝐢𝐢(𝐹^𝑛 , πœ‡^𝑛 , πœ–) β‰₯ 𝐼𝐢(𝐹^𝑛 , πœ‡^𝑛 , πœ–)
β€’ We can show: 𝐼𝐢(𝐹^𝑛 , πœ‡^𝑛 , πœ–) = 𝑛 β‹… 𝐼𝐢(𝐹, πœ‡, πœ–)
Direct Sum Theorem [BRβ€˜10]
𝐼𝐢(𝐹^𝑛 , πœ‡^𝑛 , πœ–) = 𝑛 β‹… 𝐼𝐢(𝐹, πœ‡, πœ–):
β€’ Let Ξ  be a protocol for 𝐹^𝑛 on 𝑛-copy inputs 𝑋, π‘Œ
β€’ Construct Ξ β€² for 𝐹 as follows (see the sketch below):
– Alice and Bob get inputs π‘ˆ, 𝑉
– Choose a random coordinate 𝑖 ∈ [𝑛], set 𝑋𝑖 = π‘ˆ, π‘Œπ‘– = 𝑉
– Bad idea: publicly sample π‘‹βˆ’π‘– , π‘Œβˆ’π‘–
β€’ Why is public sampling bad?
– Suppose in Ξ , Alice sends 𝑋1 βŠ• β‹― βŠ• 𝑋𝑛 .
– In Ξ , Bob learns one bit β‡’ in Ξ β€² he should learn 1/𝑛 bit
– But if π‘‹βˆ’π‘– is public, Bob learns a full bit about π‘ˆ!
β€’ Instead, split the sampling:
– Publicly sample 𝑋1 , … , π‘‹π‘–βˆ’1 and π‘Œπ‘–+1 , … , π‘Œπ‘›
– Alice privately samples 𝑋𝑖+1 , … , 𝑋𝑛 ; Bob privately samples π‘Œ1 , … , π‘Œπ‘–βˆ’1
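A minimal sketch of the embedding, written for a product distribution πœ‡ (illustrative, not the paper's construction; in the non-product case the private samples are drawn conditioned on the public parts). Here `run_Pi`, `sample_x`, and `sample_y` are assumed stand-ins for the 𝑛-copy protocol and for sampling from πœ‡'s marginals.

```python
import random

# Sketch of the [BR'10] embedding for a product distribution mu.
# This is a single-process simulation: in a real execution Alice would
# only ever see X and Bob only Y.

def embed_and_run(u, v, n, sample_x, sample_y, run_Pi, public_coins):
    """Protocol Pi' for one instance of F, built from a protocol Pi for F^n.
    Alice holds u, Bob holds v; public_coins is shared randomness."""
    i = public_coins.randrange(n)        # public: random coordinate i
    X, Y = [None] * n, [None] * n
    X[i], Y[i] = u, v                    # embed the real inputs at coordinate i
    for j in range(i):                   # j < i:
        X[j] = sample_x(public_coins)    #   X_j is public (both parties see it)
        Y[j] = sample_y(random)          #   Y_j is Bob's private sample
    for j in range(i + 1, n):            # j > i:
        Y[j] = sample_y(public_coins)    #   Y_j is public
        X[j] = sample_x(random)          #   X_j is Alice's private sample
    # Run Pi on the assembled n-copy input and read off the i-th answer.
    return run_Pi(X, Y)[i]
```

The point of the split is exactly the counterexample above: no single party gets to see the whole complement of coordinate 𝑖 in the clear, so the information Ξ β€² reveals about (π‘ˆ, 𝑉) is 1/𝑛 of what Ξ  reveals about (𝑋, π‘Œ).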
Compression
β€’ What we know: a protocol with communication 𝐢, internal info 𝐼 and
external info 𝐼𝑒π‘₯𝑑 can be compressed to
– 𝐼𝑒π‘₯𝑑 β‹… polylog(𝐢) [BBCR’10]
– √(𝐼 β‹… 𝐢) β‹… polylog(𝐢) [BBCR’10]
– 2^{𝑂(𝐼)} [Braverman’10]
β€’ Major open question: can we compress to 𝐼 β‹… polylog(𝐢)?
[GKR ’14: partial answer, no]
Using Information Complexity to Prove
Communication Lower Bounds
β€’ Internal/external info ≀ communication
β€’ Essentially the most powerful technique known
– [Kerenidis, Laplante, Lerays, Roland, Xiao ’12]: most lower bound
techniques imply IC lower bounds
β€’ Disadvantage: hard to show incompressibility!
– Must exhibit problem with low IC, high CC
– But proving high CC usually proves high IC…
Extending IC to Multiple Players
β€’ Recent interest in multi-player number-in-hand communication complexity
β€’ Motivated by β€œbig data”:
– Streaming and sketching, e.g., [Woodruff, Zhang
β€˜11,’12,’13]
– Distributed learning, e.g., [Awasthi, Balcan, Long β€˜14]
Extending IC to Multiple Players
β€’ Multi-player computation traditionally hard to
analyze
β€’ [Braverman, Ellen, O., Pitassi, Vaikuntanathan]: Ξ©(π‘›π‘˜) for Set
Disjointness with 𝑛 elements, π‘˜ players, private channels, NIH input
Information Complexity on Private Channels
β€’ First obstacle: secure multi-party computation
β€’ [Ben-Or, Goldwasser, Wigderson ’88]: any function can be computed with
perfect information-theoretic security against coalitions of < π‘˜/2 players
– Solution: redefine information cost to measure both
β€’ the information a player learns, and
β€’ the information a player leaks to all the others.
Extending IC to Multiple Players
β€’ Set disjointness:
– Input: 𝑋1 , … , π‘‹π‘˜ βŠ† [𝑛]
– Output: is 𝑋1 ∩ β‹― ∩ π‘‹π‘˜ = βˆ…? (see the sketch after this list)
β€’ Open problem: can we extend to gap set disjointness?
– First step: β€œpurely info-theoretic” 2-party analysis
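For concreteness, the function being computed (a minimal sketch; each player's set is a subset of the universe {0, …, 𝑛−1}):

```python
def multiparty_disjointness(sets):
    """DISJ_{n,k}: each of the k players holds a subset of {0, ..., n-1};
    output True iff the sets have no element common to all of them."""
    common = set.intersection(*map(set, sets))
    return len(common) == 0

# Three players over the universe {0,...,5}:
print(multiparty_disjointness([{0, 1, 2}, {1, 3}, {2, 4, 5}]))  # True
print(multiparty_disjointness([{0, 1}, {1, 3}, {1, 5}]))        # False
```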
Extending IC to Multiple Players
β€’ In [Braverman, Ellen, O., Pitassi, Vaikuntanathan] we show a direct sum
theorem for the multi-party setting
– Solving 𝑛 instances costs 𝑛 β‹… the cost of solving one instance
β€’ Does direct sum hold β€œacross players”?
– Is solving with π‘˜ players Ξ©(π‘˜) times as costly as solving with 2 players?
– Not always
β€’ Does compression work for multi-party?
Conclusion
β€’ Information complexity extends classical information theory to the
interactive setting
β€’ The picture is much less well understood than in the classical case
β€’ Powerful tool for lower bounds
β€’ Fascinating open problems:
– Compression
– Information complexity for multi-player computation, quantum
communication, …