
Lower Bounds for
Read/Write Streams
Paul Beame
Joint work with Trinh Huynh
(Dang-Trinh Huynh-Ngoc)
University of Washington
Data stream Algorithms
• Many huge successes
– No need to remind people at this workshop!
• Some problems provably hard
– E.g. Frequency moments F_k, k > 2, require
space Ω(n^{1-2/k})
[Bar-Yossef-Jayram-Kumar-Sivakumar 02],
[Chakrabarti-Khot-Sun 03]
Beyond Data Streams
• Disk storage can be huge
– Can stream data to/from disks in real time
• Sequential access hides latency
– Motivates multipass streams
• Analyzed by similar methods to single pass
• Why stop at a single copy?
– Working with more than one copy at once may make
computations easier
• Why stream the data onto disks exactly as read?
– Can make modifications to data while writing
Read/write streams model
[Figure: read/write streams shown as tapes of bits, each scanned sequentially by a head attached to a small memory]
• Disks are modeled as read/write streams
– Key Parameters: space, #passes=reversals
– Assume #streams is constant
• Introduced by [Grohe-Schweikardt 05]
Read/write streams model
• Much more powerful than data-stream model
– Sort with O(log n) passes, O(log n) space, 3 streams
• MergeSort
– Exactly compute any frequency moment
• Data-stream requires passes × space = Ω(n)
– Θ(log n) passes, O(1) space gives all of LOGSPACE
[Hernich-Schweikardt 08]
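The MergeSort bullet can be illustrated with a small simulation. This is only a sketch under stated assumptions: Python lists stand in for the three streams, every loop touches its list strictly left to right, and the name `stream_merge_sort` and the returned round count are choices made for this example, not part of the model's definition. Each round doubles the length of the sorted runs, so about log2(n) rounds suffice, and the only state kept between stream cells is a handful of counters.

```python
def stream_merge_sort(items):
    """Sort by repeated distribute/merge rounds over three list-backed 'streams'.

    Returns (sorted_list, number_of_rounds). Illustrative sketch only.
    """
    a = list(items)   # stream 1: holds the data between rounds
    n = len(a)
    run = 1           # current length of the sorted runs stored on stream 1
    rounds = 0
    while run < n:
        # Scan stream 1 once, sending runs alternately to streams 2 and 3.
        b, c = [], []
        for pos, value in enumerate(a):
            target = b if (pos // run) % 2 == 0 else c
            target.append(value)
        # Scan streams 2 and 3 once, merging one run from each back onto stream 1.
        a = []
        i = j = 0
        while i < len(b) or j < len(c):
            i_end = min(i + run, len(b))
            j_end = min(j + run, len(c))
            while i < i_end and j < j_end:
                if b[i] <= c[j]:
                    a.append(b[i])
                    i += 1
                else:
                    a.append(c[j])
                    j += 1
            while i < i_end:      # leftover of the run on stream 2
                a.append(b[i])
                i += 1
            while j < j_end:      # leftover of the run on stream 3
                a.append(c[j])
                j += 1
        run *= 2
        rounds += 1
    return a, rounds
```

For example, `stream_merge_sort([5, 3, 8, 1, 9, 2, 7, 4])` uses three rounds for eight items. How rounds map to passes (reversals) depends on how rewinding is charged, which this sketch does not model.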
What can be computed in o(log n) passes + small
space?
Previous lower bounds for R/W
streams
• In o(log n) passes need Ω(n^{1-ε}) space to
– Sort n numbers
[Grohe-Schweikardt 05]
– Test set-equality A=B, multiset equality,
XQuery, XPath
[Grohe-Hernich-Schweikardt 06]
• Same lower bounds apply for randomized
algorithms with one-sided error
[Grohe-Hernich-Schweikardt 06]
Previous lower bounds for R/W
streams
• Lower bounds for general randomness and
two-sided error:
– In o(log nlog log n) passes, need Ω(n1-ε) space to:
• Approximate F* within factor 2
• Find Empty-Join, XQuery/XPath-Filtering
etc.
[B-Jayram-Rudra 07]
What about approximating frequency moments
Fk for k  2 ?
Our Main Result
Theorem: Any randomized R/W-stream
algorithm using o(log n) passes needs
Ω(n^{1-4/k-ε}) space to 2-approximate F_k
• Implies polynomial space for k > 4
• Compare with: Θ(n^{1-2/k}) on data streams
R/W streams with o(log n) passes don’t help much
for approximating frequency moments.
Methods
[Alon-Matias-Szegedy 96] approach to lower
bounding F_k in data streams
1. Reduce testing t-party set-disjointness to F_k
(Easy! But set-disjointness is solved easily by R/W streams!)
2. Simulate any data-stream algorithm by a
multi-party number-in-hand communication game
(Trivial for data streams! Fails for R/W streams!)
3. Apply Ω(n/t) communication lower bound on
t-party set-disjointness
[AMS 96, Saks-Sun 02, Bar-Yossef-Jayram-Kumar-Sivakumar 02,
Chakrabarti-Khot-Sun 03, Grönemeier 09] (tight!)
Cannot be applied to R/W streams!
Promise Set-Disjointness (DISJ)
DISJ_{n,t}(x_1,…,x_t) =
  0 if x_1,…,x_t are pairwise disjoint
  1 if ∃a s.t. a ∈ x_i for every i
  undefined otherwise
[Figure: example sets x_1, …, x_5 drawn as 0/1 characteristic vectors]

• t-party NIH communication: Ω(n/t)
• Approximating F_k ⇒ testing DISJ_{n,t} for t ≈ n^{1/k}
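The link between F_k and promise set-disjointness can be checked on a toy instance. The sketch below assumes sets are given as Python `set` objects over a universe of size n, and the names `frequency_moment`, `disj`, and `sets_to_stream` are introduced here for illustration only. If the sets are pairwise disjoint, every frequency is 1 and F_k is at most n; if some element lies in all t sets, F_k is at least t^k. Choosing t with t^k at least 2n therefore separates the two cases by a factor of 2.

```python
from collections import Counter


def frequency_moment(stream, k):
    """F_k = sum over distinct items of (number of occurrences) ** k."""
    return sum(count ** k for count in Counter(stream).values())


def disj(sets):
    """Promise DISJ: 0 if pairwise disjoint, 1 if some element is in every set,
    None when the promise is violated."""
    if set.intersection(*sets):
        return 1
    if sum(len(s) for s in sets) == len(set.union(*sets)):
        return 0
    return None


def sets_to_stream(sets):
    """Concatenate the sets into one data stream (order inside a set is irrelevant)."""
    return [item for s in sets for item in s]


# Toy parameters (assumed for the example): n = 16, k = 3, t = 4, so t**k = 64 >= 2 * n.
n, k, t = 16, 3, 4
disjoint_sets = [{0, 1}, {2, 3, 4}, {5}, {6, 7}]
intersecting_sets = [{0, 9}, {1, 9}, {2, 9}, {3, 9}]

low = frequency_moment(sets_to_stream(disjoint_sets), k)       # at most n
high = frequency_moment(sets_to_stream(intersecting_sets), k)  # at least t**k
```

Here `low` is the number of elements in the union of the disjoint sets, while `high` is dominated by the single element that appears t times.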
R/W streams easily solve DISJn,t
• Testing DISJ_{n,t} with 2 streams, 3 passes, O(log n) space
• Input: x_1, x_2, …, x_t ∈ {0,1}^n
[Figure: two streams, each containing x_1, x_2, …, x_{t-1}, x_t]
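One plausible way to realize such a test is sketched below, assuming the promise holds and t ≥ 2. Under the promise it is enough to decide whether x_1 and x_2 intersect: if the sets are pairwise disjoint they do not, and if some element lies in every set they do. The function name `disj_two_streams` and the list-based streams are assumptions of this example, and the sketch does not try to reproduce the exact pass accounting on the slide.

```python
def disj_two_streams(xs):
    """Decide promise DISJ for t >= 2 characteristic vectors of equal length n.

    Two list-backed streams are used; heads only ever move forward, and the
    only state kept besides the heads is a counter up to n.
    """
    n = len(xs[0])
    stream1 = [bit for x in xs for bit in x]   # the input stream

    # Scan 1: copy the input onto the second stream.
    stream2 = []
    for bit in stream1:
        stream2.append(bit)

    # Position head 1 at the start of x_1 and head 2 at the start of x_2
    # (head 2 has been moved forward past the n cells of x_1).
    head1, head2 = 0, n

    # Scan both streams in lockstep, comparing x_1 with x_2 coordinate by coordinate.
    for _ in range(n):
        if stream1[head1] == 1 and stream2[head2] == 1:
            return 1      # x_1 and x_2 intersect, so by the promise all sets do
        head1 += 1
        head2 += 1
    return 0
```

The second copy is what makes this cheap: a single-stream algorithm would have to remember x_1 while reading x_2.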
How to prove lower bounds
in R/W streams?
• Lower bounds [GS05], [GHS06], [BJR07] for R/W
streams don’t use [AMS96] outline
– Introduce permuted 2-party versions of problems
– Employ ad-hoc combinatorial arguments
We take a more general approach related to
[AMS96] directly using NIH comm. complexity
Our approach to lower bound Fk
R/W streams algorithm for
t-party permuted-DISJ
on input size n
⇒ Number-in-hand communication
protocol for t-party DISJ
on input size ≈ n/t^2
[Alon, Matias, Szegedy 96]'s approach to lower bound F_k in data streams:
1. Reduce testing t-party set-disjointness to F_k (Easy!)
2. Simulate data-stream algorithms by a multi-party number-in-hand communication game
3. Apply communication lower bound on t-party set-disjointness
[AMS96, SS02, B-YJKS02, CKS03, G09] (tight!)
Our approach to lower bound F_k in R/W streams:
1. Reduce testing permuted t-party DISJ to F_k (Easy!)
2. Simulate R/W streams algorithms for permuted DISJ by NIH comm. for DISJ on slightly smaller input size (apply our simulation)
3. Same as step 3 of [AMS96]
Ideas from the proof
Segmenting DISJn,t
Input: x_1, x_2, …, x_t ∈ {0,1}^n
[Figure: each x_i is cut into m segments, numbered 1, 2, …, m, of n/m bits each]
• View DISJ_{n,t} as an OR of m subproblems DISJ_{n/m,t}
Permuted DISJ
Fix permutations π_1, π_2, …, π_t on [m]
[Figure: Permuted-DISJ_{n,m,t}. Each input π_i(x_i) consists of the m segments of x_i, n/m bits each, rearranged according to π_i; the t segments that came from the same position form one DISJ_{n/m,t} subproblem]
• View Permuted-DISJ_{n,m,t} as an OR of m subproblems DISJ_{n/m,t}
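The definition can be made concrete with a short sketch. It assumes 0-based indices, that m divides n, and the convention that segment j of x_i is placed at position perm_i[j] of the i-th input; the slide does not fix this convention, and the names `permute_segments` and `permuted_disj` are introduced only for this example. The evaluator is a plain reference implementation with random access, not a streaming algorithm.

```python
def permute_segments(x, m, perm):
    """Cut x into m equal segments and place segment j at position perm[j]."""
    size = len(x) // m
    placed = [None] * m
    for j in range(m):
        placed[perm[j]] = x[j * size:(j + 1) * size]
    return [bit for segment in placed for bit in segment]


def permuted_disj(inputs, m, perms):
    """OR over the m subproblems: subproblem j gathers, from every input i,
    the segment stored at position perms[i][j] and asks whether some
    coordinate is 1 in all of them (correct under the DISJ promise)."""
    size = len(inputs[0]) // m
    for j in range(m):
        segments = [
            inp[perm[j] * size:(perm[j] + 1) * size]
            for inp, perm in zip(inputs, perms)
        ]
        if any(all(column) for column in zip(*segments)):
            return 1
    return 0
```

For instance, with m = 3 and `perms = [[2, 0, 1], [1, 2, 0]]`, the two segments of subproblem 2 sit at positions 1 and 0 of their respective inputs, so a position-by-position comparison of the two inputs would miss their intersection.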
Why is permuted-DISJ hard?
• Intuitively, to solve a subproblem (e.g. blue),
we need to compare at least two blue segments
[Figure: inputs π_i(x_i), π_j(x_j), π_l(x_l) with the segments of one DISJ_{n/m,t} subproblem highlighted in each]
• Need to compare at least two segments of every color
• If segments are shuffled, many passes are needed
Permuted DISJ
• Good subproblem: computation always depends
only on at most one of its t segments (and the
memory/state)
• If segments are randomly shuffled:
With o(log m) passes, t = o(m^{1/2}) parties,
99% of the m subproblems are good
• Reduction idea: Try to embed an ordinary
DISJn/m,t in one of the good subproblems
Catch: Which subproblems are good depends
on input
Simulation
s-space R/W streams algo A for permuted-DISJ_{n,m,t}
⇒ NIH comm. protocol for DISJ_{n/m,t}
t players on input y_1, y_2, …, y_t:
1. Generate m-1 DISJ_{n/m,t}'s that look like* y_1, y_2, …, y_t
2. Shuffle with π_1, π_2, …, π_t
• (y_1, y_2, …, y_t) is good w.h.p.
3. Run A on π_1(x_1), …, π_t(x_t)
*same sizes but don't intersect
[Figure: y_1 and y_2 embedded as segments of π_1(x_1) and π_2(x_2)]
Generating the extended input
Given y1,y2,…,yt, players
– Exchange the sizes of each of the sets
• O(t log n) bits
– Choose random consistent reordering of the indices
of each y1,y2,…,yt
– Generate m-1 random inputs to DISJn/m,t with same
set sizes as y1,y2,…,yt but that are disjoint
– Place y1,y2,…,yt in random position and then shuffle
Key observation: If y1,y2,…,yt are disjoint then
this resolves the catch
– After shuffling, all the subproblems look the same so
the probability that the subproblem where
y1,y2,…,yt lands is good does not depend on the
input
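The generation steps above can be sketched as follows. This is a centralized illustration, not the protocol itself: in the protocol the players would use shared randomness and each would know only its own y_i, whereas here one function sees everything. The names `random_disjoint_instance` and `extend_input`, the 0-based indexing, and the convention that subproblem j of player i is placed at position perms[i][j] are assumptions of this example. The sketch also assumes the set sizes sum to at most n/m, which always holds when the y_i are disjoint.

```python
import random


def random_disjoint_instance(sizes, width, rng):
    """Random pairwise-disjoint 0/1 vectors of the given set sizes."""
    coords = rng.sample(range(width), sum(sizes))
    instance, start = [], 0
    for size in sizes:
        chosen = set(coords[start:start + size])
        instance.append([1 if c in chosen else 0 for c in range(width)])
        start += size
    return instance


def extend_input(ys, m, perms, rng):
    """Embed (y_1..y_t) among m-1 random disjoint look-alikes and shuffle.

    Returns (streams, slot): streams[i] is the i-th permuted input and
    slot is the subproblem index where the real instance was placed.
    """
    t, width = len(ys), len(ys[0])
    sizes = [sum(y) for y in ys]              # the exchanged set sizes
    if sum(sizes) > width:
        raise ValueError("sketch assumes the set sizes sum to at most n/m")
    # The same random reordering of coordinates is applied to every y_i.
    relabel = rng.sample(range(width), width)
    ys = [[y[relabel[c]] for c in range(width)] for y in ys]
    slot = rng.randrange(m)                   # random position for the real instance
    subproblems = [
        ys if j == slot else random_disjoint_instance(sizes, width, rng)
        for j in range(m)
    ]
    streams = []
    for i in range(t):
        placed = [None] * m
        for j in range(m):
            placed[perms[i][j]] = subproblems[j][i]
        streams.append([bit for segment in placed for bit in segment])
    return streams, slot
```

When the y_i are disjoint, every one of the m subproblems is a disjoint instance with the same set sizes, which is the property the key observation relies on.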
Simulating R/W stream algorithm
A using NIH communication
• As A executes on input v = (π_1(x_1), …, π_t(x_t)), players know
all inputs except y1,…,yt
– each player builds up copy of a dependency graph
σ(v) for the elements of each stream so far
• Using σ(v), at each step all players either
– know the next move, or
– know which one player knows next block of moves
• that player communicates
– know that need two players’ info: simulation “fails”
• If subproblem y1,…,yt is good for v then simulation does
not fail
• If players detect failure they output “not disjoint”
– If input was disjoint then only 1% chance of this
Dependency Graph
Vertices: Elements of each stream in each pass
Edges: From an element to the elements in the previous pass that
contained heads at the same time it did
[Figure: dependency graph layered by pass (pass 0, pass 1, …, pass j-1, pass j, pass j+1), with each stream scanned L to R or R to L in a given pass]
Why most subproblems are good
• Simple case: algorithm just makes copies of the
input stream and compares them
– # of subproblems with > 1 segment read at same
time on single pass through the streams (L-to-R or
R-to-L on each stream)
• ≤ # segments appearing in the same (or reversed)
order
– Almost surely, for random permutations π_1, π_2, …, π_t
no pair has a common subsequence or inverted
subsequence longer than 2e·m^{1/2}
– When t is o(m^{1/2}) the total is o(m).
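The quantity bounded in the simple case can be computed directly. The sketch below assumes permutations are given as Python lists over the same values, and the helper names are introduced here for illustration. It uses the standard fact that the longest common subsequence of two permutations equals the longest increasing subsequence of one permutation rewritten in the other's coordinates; an inverted common subsequence is handled by reversing one of the two.

```python
from bisect import bisect_left
from itertools import combinations


def lcs_of_permutations(p, q):
    """Length of the longest common subsequence of two permutations."""
    position_in_q = {value: idx for idx, value in enumerate(q)}
    seq = [position_in_q[value] for value in p]
    tails = []   # tails[k] = smallest last element of an increasing run of length k+1
    for v in seq:
        k = bisect_left(tails, v)
        if k == len(tails):
            tails.append(v)
        else:
            tails[k] = v
    return len(tails)


def longest_common_or_inverted(p, q):
    """Longest subsequence of p that appears in q in the same or in reversed order."""
    return max(lcs_of_permutations(p, q), lcs_of_permutations(p, q[::-1]))


def max_pairwise(perms):
    """Maximum of longest_common_or_inverted over all pairs of permutations."""
    return max(longest_common_or_inverted(p, q) for p, q in combinations(perms, 2))
```

Running `max_pairwise` on a few random permutations of a large [m] (for example via `random.sample(range(m), m)`) gives an empirical feel for how the value compares with 2e·m^{1/2}.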
Why most subproblems are good
• General case: May combine information
about all streams onto a single stream in
single pass
– What is combined may depend on the input
values
– Each element depends on the segments that
it can reach in the input stream via the
dependency graph
Why most subproblems are good
• For each fixed v, after p=o(log m) passes:
– Each element can depend on only 2^{O(p)} different
input segments
– For any one stream, the sequence of its elements’
dependencies on input segments is the interleaving
of 2^{O(p)} monotone subsequences from π_1, π_2, …, π_t
⇒ Only 2^{O(p)} t m^{1/2} = mo(1) bad subproblems on input v
Communication Cost of Simulation
• For each fixed v, after p=o(log m) passes:
– Only 2^{O(p)} t elements depend on a segment and have
a neighbor that does not depend on it
• Players only need to communicate when
segment dependencies change
– only happens 2^{O(p)} t times at cost of O(ps) bits per
time
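A back-of-the-envelope calculation shows how these counts combine with the Ω(n/t) communication bound to give the space bound in the main theorem. This is a sketch under simplifying assumptions, not the proof: it ignores the failure probability of the simulation and the O(t log n) bits used to exchange set sizes, and it takes m only slightly larger than t^2 (as required by t = o(m^{1/2})), written here as m = t^2 n^{ε'} for an arbitrarily small ε' introduced for this illustration.

```latex
\begin{align*}
  \underbrace{2^{O(p)}\, t}_{\text{communication events}} \cdot
  \underbrace{O(ps)}_{\text{bits per event}}
  &\;\ge\; \Omega\!\left(\frac{n/m}{t}\right)
  && \text{(lower bound for } \mathrm{DISJ}_{n/m,t}\text{)} \\
  \Longrightarrow\quad
  s &\;=\; \Omega\!\left(\frac{n}{m\, t^{2}\, p\, 2^{O(p)}}\right) \\
  &\;=\; \Omega\!\left(\frac{n^{1-\varepsilon'}}{t^{4}\, n^{o(1)}}\right)
  && \text{(} m = t^{2} n^{\varepsilon'},\ p = o(\log n)\text{)} \\
  &\;=\; \Omega\!\left(n^{\,1 - 4/k - \varepsilon}\right)
  && \text{(} t \approx n^{1/k}\text{)}
\end{align*}
```

The loss relative to the data-stream bound n^{1-2/k} comes from the extra factor of roughly t^2 paid for m.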
Limitations and Future Work
Limitation of using permuted-DISJ
R/W streams algo for permuted-DISJ_{n,m,t}
⇒ NIH CC protocol for DISJ_{n/m,t}
• Gap from data stream due to loss in input size
• Most of this loss is necessary
– Need nm  (t2) to use Ω(n/t) CC lower bound for DISJn/m,t
– Efficient R/W algo for permuted-DISJn,m,t unless m ≥ t32
– Implies that n is Ω(mt2) which is Ω(t3.5)
 Since we need t≈n1/k, the lower bound Ω(n/t) is
trivial for k  3.5
A longest-common-subsequence
problem on permutations
• Algorithm for permuted-DISJn,m,t follows from the
following theorem:
In any 3 permutations on [m] there is a pair with
longest common subsequence length ≥ m^{1/3}.
Proof: For each i ∈ [m] define a triple t_i of integers:
For each of the 3 pairs of permutations put length of the
longest common subsequence for that pair that ends with
value i. Can show that all m triples are different.
So some triple must contain a coordinate ≥ m^{1/3}
• Tight even for 4 permutations
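The proof can be checked mechanically on small examples. The sketch below assumes permutations of 0..m-1 given as Python lists; `lcs_ending_at` and `triples` are names introduced for this illustration. The triples are distinct because, for any two values i and j, at least two of the three permutations agree on their relative order, and for that pair the coordinate of the later value is strictly larger. Since m distinct triples of positive integers cannot all have every coordinate below m^{1/3}, some coordinate is at least m^{1/3}.

```python
def lcs_ending_at(p, q):
    """best[v] = length of the longest common subsequence of p and q that ends with v.

    Quadratic-time dynamic program; p and q are permutations of 0..m-1.
    """
    m = len(p)
    pos_q = [0] * m
    for idx, value in enumerate(q):
        pos_q[value] = idx
    best = [0] * m
    for a in range(m):
        v = p[a]
        longest = 1
        for b in range(a):            # u comes before v in p ...
            u = p[b]
            if pos_q[u] < pos_q[v]:   # ... and also before v in q
                longest = max(longest, best[u] + 1)
        best[v] = longest
    return best


def triples(p1, p2, p3):
    """For every value i, the triple of 'LCS ending with i' lengths over the 3 pairs."""
    l12 = lcs_ending_at(p1, p2)
    l13 = lcs_ending_at(p1, p3)
    l23 = lcs_ending_at(p2, p3)
    return [(l12[i], l13[i], l23[i]) for i in range(len(p1))]
```

For any three permutations, `len(set(triples(p1, p2, p3)))` equals m, and the largest coordinate c appearing in any triple satisfies c**3 >= m.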
R/W stream algorithm for
permuted-DISJn,m,t for large t
t ≥ m^{2/3}, any permutations π_1, …, π_t: Testing permuted-DISJ_{n,m,t}
with 2 streams, 3 passes, O(log(nmt)) space
[Figure: two streams, each containing π_1(x_1), π_2(x_2), …, π_6(x_6)]
• Compare m^{1/3} blocks each time
In any three permutations on [m] there is a pair with
longest common subsequence length ≥ m^{1/3}.
Open problems
• Is Ω(n^{1-4/k-ε}) lower bound for R/W streams tight?
– Gap from O(n^{1-2/k}) upper bound in data stream
• Can’t use permuted-DISJ_{n,m,t} to close it
– Polynomial space to compute Fk for 2 < k ≤ 4 ?
• Other problems on R/W streams?
• L(m,k) := maximum LCS length that can be guaranteed
between some pair in any set of k permutations on [m].
– We show L(m,3) ≈ L(m,4) ≈ m^{1/3}
– What is L(m,k) for other values of k?
– [B-Blais-Huynh 08] L(m,k) = m^{1/3+o(1)} for k ≤ m^{O(1)}