New Characterizations in Turnstile Streams with Applications
Yuqing Ai (Tsinghua University)
Wei Hu (Tsinghua University)
Yi Li (Facebook)
David Woodruff (IBM Almaden)
Turnstile Streaming Model
• Underlying $n$-dimensional vector $x$ initialized to 0
• Stream of updates $x \leftarrow x + e_i$ or $x \leftarrow x - e_i$ for standard unit vector $e_i$
• At the end of the stream, $x \in \{-m, \dots, -1, 0, 1, \dots, m\}^n$
• Output an approximation to $f(x)$ w.h.p.
• Goal: use as little space (in bits) as possible
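A minimal sketch of the update model, for illustration only. The helper name `apply_stream` and the encoding of an update as an `(i, sign)` pair with 0-indexed coordinates are assumptions, not part of the slides. It stores $x$ explicitly, which is exactly the space cost a streaming algorithm tries to avoid.

```python
def apply_stream(n, updates):
    """Apply turnstile updates to an n-dimensional vector that starts at 0.

    Each update is a pair (i, sign) with sign in {+1, -1}, meaning
    x <- x + sign * e_i (coordinates are 0-indexed here).
    """
    x = [0] * n
    for i, sign in updates:
        if sign not in (1, -1):
            raise ValueError("sign must be +1 or -1")
        x[i] += sign
    return x


stream = [(0, +1), (2, +1), (0, +1), (2, -1), (1, -1)]
final_x = apply_stream(3, stream)
print(final_x)  # [2, -1, 0]
```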
Example: Estimating the β2 -norm
• Output $Z$ with $(1 - \varepsilon)\|x\|_2 \le Z \le (1 + \varepsilon)\|x\|_2$
• Algorithm:
  1. Let $r = 1/\varepsilon^2$
  2. Choose an $r \times n$ matrix $A$ of i.i.d. sign random variables ($+1$ w.p. $1/2$, $-1$ w.p. $1/2$)
  3. Maintain $Ax$ in the stream
  4. Output $\|Ax\|_2 / \sqrt{r}$
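The four steps above can be sketched as follows. This is an illustrative toy, not the space-efficient algorithm: the class name `SignSketch`, the fixed seed, and the explicit storage of $A$ are assumptions made so the example is runnable (practical algorithms generate the signs from limited-independence hash functions instead of storing the matrix).

```python
import math
import random


class SignSketch:
    """Maintains y = A x for a random r x n sign matrix A (stored explicitly)."""

    def __init__(self, n, eps, seed=0):
        rng = random.Random(seed)  # fixed seed only to make the demo repeatable
        self.r = math.ceil(1 / eps ** 2)
        self.A = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(self.r)]
        self.y = [0] * self.r

    def update(self, i, sign):
        """Process x <- x + sign * e_i by adding sign * (column i of A) to y."""
        for j in range(self.r):
            self.y[j] += sign * self.A[j][i]

    def estimate(self):
        """Return ||A x||_2 / sqrt(r)."""
        return math.sqrt(sum(v * v for v in self.y) / self.r)


sk = SignSketch(n=5, eps=0.25)
for _ in range(3):
    sk.update(2, +1)   # x = 3 * e_2
sk.update(0, +1)
sk.update(0, -1)       # insert then delete: the sketch is unchanged
print(sk.estimate())   # 3.0, since every entry of A x is +3 or -3
```

For a vector supported on one coordinate the estimate is exact; for general $x$ it is only accurate with good probability over the choice of $A$.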
Generic Form
• All known algorithms have the following generic form (linear sketch):
  1. Sample a random matrix $A$
  2. Maintain $Ax$ in the stream
  3. Output a function of $Ax$
Question (?!): does the optimal algorithm for approximating any function in the turnstile model have this form?
The LNW Reduction
• Yes! [Li, Nguyễn, Woodruff'14]
• Theorem: for computing a function $f$ of $x \in \{-m, \dots, m\}^n$ in the turnstile model, there is a randomized algorithm which
  1. samples a matrix $A$ and a vector $q$ uniformly from $O(n \log m)$ instances
  2. maintains $Ax \bmod q$ in the stream
  3. outputs a function of $Ax \bmod q$
• Space complexity is optimal up to a constant factor (not including the $O(\log n + \log\log m)$ bits for randomness)
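A small sketch of step 2, maintaining $Ax \bmod q$ under updates. The class name `ModSketch` and the particular matrix and moduli are made up for illustration; in the theorem they are sampled at random. The demo shows that the state depends only on the final $x$, not on the order or number of updates that produced it.

```python
class ModSketch:
    """Maintains (A x mod q), with row j reduced modulo q[j]."""

    def __init__(self, A, q):
        self.A = A                    # r integer rows, each of length n
        self.q = q                    # r positive moduli
        self.state = [0] * len(q)

    def update(self, i, sign):
        """Process x <- x + sign * e_i."""
        for j, row in enumerate(self.A):
            self.state[j] = (self.state[j] + sign * row[i]) % self.q[j]


A = [[1, 2, 3], [2, 0, 1]]            # hypothetical 2 x 3 matrix
q = [5, 7]                            # hypothetical moduli

first = ModSketch(A, q)
for upd in [(0, +1), (1, +1), (0, -1), (2, +1)]:   # final x = (0, 1, 1)
    first.update(*upd)

second = ModSketch(A, q)
for upd in [(2, +1), (1, +1)]:                      # also final x = (0, 1, 1)
    second.update(*upd)

print(first.state, second.state)  # [0, 1] [0, 1]
```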
Consequence
Lower Bound Technique (streaming algorithm $\mathcal{A}$)
• Alice: input $x$, creates stream $s(x)$
• Bob: input $y$, creates stream $s(y)$
1. Run $\mathcal{A}$ on $s(x)$, send the state of $\mathcal{A}(s(x))$ to Bob
2. Bob computes $\mathcal{A}(s(x), s(y))$
3. If Bob solves $g(x, y)$, the space complexity of $\mathcal{A}$ is at least the 1-way communication complexity of $g$
The LNW reduction implies:
• If the players can solve $g(x, y)$, then the space of $\mathcal{A}$ is at least the simultaneous communication complexity of $g$
• Weaker model in which Alice and Bob simultaneously send a message to a referee, who outputs the answer
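One way to see why a linear sketch yields a simultaneous protocol: each player can sketch the frequency vector of their own stream, and the referee adds the two messages. The following is a minimal sketch of that idea; the function names and the values of `A`, `q`, `u`, and `w` are hypothetical, with `u` and `w` standing for the frequency vectors of Alice's and Bob's streams.

```python
def sketch(A, q, x):
    """Return A x mod q for an explicit vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) % qj for row, qj in zip(A, q)]


def referee(q, msg_alice, msg_bob):
    """Combine the two messages; by linearity this is the sketch of the sum."""
    return [(a + b) % qj for a, b, qj in zip(msg_alice, msg_bob, q)]


A = [[1, 2, 3], [2, 0, 1]]   # hypothetical shared random matrix
q = [5, 7]                   # hypothetical shared moduli
u = [1, 0, 2]                # frequency vector of Alice's stream
w = [0, 3, -1]               # frequency vector of Bob's stream

combined = referee(q, sketch(A, q, u), sketch(A, q, w))
direct = sketch(A, q, [ui + wi for ui, wi in zip(u, w)])
print(combined, direct)  # [0, 3] [0, 3]
```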
Our Result
• Strengthen the LNW reduction in several respects:
  ◦ Remove the "box constraint"
  ◦ Generalize to the strict turnstile model
  ◦ Extend to multi-pass algorithms
• Obtain new tight lower bounds
Strengthen the LNW Reduction
• Remove the "box constraint"
• Generalize to the strict turnstile model
• Extend to multi-pass algorithms
The "Box Constraint"
• The LNW reduction requires the algorithm to be correct as long as $x \in \{-m, \dots, m\}^n$ at the end of the stream.
• While processing the stream, we may have $\|x\|_\infty \gg m$.
• The algorithm is not allowed to abort if this happens. It must still be correct at the end of the stream as long as $x \in \{-m, \dots, m\}^n$.
• More natural requirement: the algorithm only needs to be correct when $x$ belongs to $\{-m, \dots, m\}^n$ at all times in the stream.
Stream Automaton
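[Figure: a stream automaton. Starting from the Start state, each update (for example $+e_1$, $-e_1$, $+e_5$) moves the automaton along an edge to another state.]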
Path-Independent Automaton
• Every $x \in \mathbb{Z}^n$ is in a unique state
[Figure: a stream automaton in which 0 is in two different states]
• Equivalent to $Ax \bmod q$
Zero-Frequency Graph
• For a stream $\sigma$, let $\mathrm{freq}(\sigma) \in \mathbb{Z}^n$ be the "net update" to all coordinates.
• Zero-freq graph: directed graph $G = (V, E)$
  ◦ $V$ = states of the automaton
  ◦ $(u, v) \in E$ if there exists a stream $\sigma$ such that $u \oplus \sigma = v$ and $\mathrm{freq}(\sigma) = 0$
• Terminal equivalence class: strongly connected component in $G$ with no outgoing edge
• A walk in $G$ is a sequence of zero-frequency streams
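To make these definitions concrete, here is a toy computation for $n = 1$. Everything in it is an assumption made for illustration: the three-state automaton given by the tables `plus` and `minus`, the function names, and the cutoff `bound` on the net frequency explored by the search (chosen large enough for this tiny example, not derived from any bound in the slides).

```python
from collections import deque


def zero_freq_edges(plus, minus, bound):
    """Edges (u, v) such that some stream with net frequency 0 leads from u to v.

    Searches over pairs (state, net frequency) with |net| <= bound.
    """
    edges = set()
    for u in range(len(plus)):
        seen = {(u, 0)}
        queue = deque(seen)
        while queue:
            s, net = queue.popleft()
            if net == 0:
                edges.add((u, s))   # includes (u, u) via the empty stream
            for nxt, d in ((plus[s], 1), (minus[s], -1)):
                node = (nxt, net + d)
                if abs(node[1]) <= bound and node not in seen:
                    seen.add(node)
                    queue.append(node)
    return edges


def terminal_classes(num_states, edges):
    """Strongly connected components of the graph with no outgoing edge."""
    reach = {u: {v for (a, v) in edges if a == u} for u in range(num_states)}
    changed = True
    while changed:                  # naive transitive closure
        changed = False
        for u in reach:
            new = set().union(*(reach[v] for v in reach[u])) - reach[u]
            if new:
                reach[u] |= new
                changed = True
    classes = set()
    for u in range(num_states):
        scc = frozenset(v for v in reach[u] if u in reach[v])
        if reach[u] <= scc:         # nothing reachable outside the component
            classes.add(scc)
    return classes


plus = [1, 2, 1]    # state reached on +e_1 from states 0, 1, 2
minus = [2, 2, 1]   # state reached on -e_1 from states 0, 1, 2
edges = zero_freq_edges(plus, minus, bound=2)
classes = terminal_classes(3, edges)
print(sorted(edges))
print(sorted(sorted(c) for c in classes))
```

In this toy automaton the start state 0 has zero-frequency edges to states 1 and 2 but nothing leads back, so the terminal equivalence classes are {1} and {2}.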
The LNW Reduction
$G$: zero-frequency graph of $\mathcal{A}_{\text{old}}$
• States of new automaton $\mathcal{A}_{\text{new}}$ = terminal equivalence classes in $G$
• For a terminal equivalence class $C$ and an update $e_i$, define the transition as:
  ◦ Let $v \in C$ be an arbitrary node
  ◦ Compute $v \oplus e_i$ using the transition function of $\mathcal{A}_{\text{old}}$
  ◦ Walk from $v \oplus e_i$ in $G$ until reaching a terminal equivalence class $C'$
• $C'$ is unique
  ◦ Does not depend on $v$ or the walk
[Figure: from a node $v$ in terminal equivalence class $C$, the update $e_i$ followed by a stream $\sigma$ with $\mathrm{freq}(\sigma) = 0$ leads to terminal equivalence class $C'$]
The Box Constraint
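• For a stream $\sigma$, define $|\sigma|_{\max} = \max_{\text{prefix } \rho \text{ of } \sigma} \|\mathrm{freq}(\rho)\|_\infty$
• $\sigma = (\sigma_1, \sigma_2, \dots, \sigma_k)$ on $\mathcal{A}_{\text{new}}$
• $\sigma' = (\dots, \sigma_1, \dots, \sigma_2, \dots, \sigma_k, \dots)$ on $\mathcal{A}_{\text{old}}$
• The inserted pieces $\tau_1, \tau_2, \dots$ are zero-frequency streams (walks in $G$)
• The length of $\tau_i$ could be very large
• When $|\sigma|_{\max} \le m$, $|\sigma'|_{\max}$ could be very large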
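A short illustration of the prefix maximum, and of how an inserted zero-frequency stream can raise it without changing the final vector. The helper name `stream_max` and the `(i, sign)` update encoding are assumptions made for this example.

```python
def stream_max(n, updates):
    """Return the max over all prefixes of the stream of ||freq(prefix)||_inf."""
    x = [0] * n
    largest = 0                              # the empty prefix has frequency 0
    for i, sign in updates:
        x[i] += sign
        largest = max(largest, abs(x[i]))    # only coordinate i changed
    return largest


sigma = [(0, +1)]
detour = [(0, +1)] * 3 + [(0, -1)] * 3       # zero-frequency stream of length 6
sigma_prime = sigma + detour

print(stream_max(1, sigma))        # 1
print(stream_max(1, sigma_prime))  # 4
```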
Zero-Freq Stream Length
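• $L$: upper bound on the lengths of the $\tau_i$'s
• $|\sigma|_{\max} \le m \Longrightarrow |\sigma'|_{\max} \le m + L/2$
• Want $L \le m$
• Let $s$ = # states in $\mathcal{A}_{\text{old}}$
Lemma: if there is a zero-freq stream from $u$ to $v$, then there exists such a stream with length at most $\mathrm{poly}(ns) \cdot \left(\frac{s}{n} + 1\right)^n$
• $L \le \mathrm{poly}(ns) \cdot \left(\frac{s}{n} + 1\right)^n$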
Tightness of Our Bound
• $L \le \mathrm{poly}(ns) \cdot \left(\frac{s}{n} + 1\right)^n$
• Lower bound: $L \ge \left(\frac{s}{n}\right)^{\Omega(n)}$
Removing the Box Constraint
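• Want $L \le m$
• $L \le \mathrm{poly}(ns) \cdot \left(\frac{s}{n} + 1\right)^n \le s^{cn}$
• $L \le m \Longleftarrow s^{cn} \le m \Longleftarrow \log s \le \frac{\log m}{cn}$, where $\log s$ is the space of $\mathcal{A}_{\text{old}}$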
Application: Counting
• $n = 1$
• Problem: output $|x|$ up to additive error $m/4$, while $x$ varies in $\{-m, \dots, m\}$
• $O(\log m)$ space algorithm
• Is there an $\Omega(\log m)$ lower bound?
  ◦ For insertion streams, no: approximate counting
  ◦ For relative error, yes: but the proof doesn't apply
  ◦ For additive error… yes!
Application: Counting
• Condition for removing box constraint: space $\le \frac{\log m}{cn} = \frac{\log m}{c}$
• Assume space $\le \frac{\log m}{c}$, otherwise done
• $Ax \bmod q = (a_1 x \bmod q_1, a_2 x \bmod q_2, \dots, a_r x \bmod q_r)$
  ◦ Show $\mathrm{lcm}(q_1, \dots, q_r) = \Omega(m)$
    (cannot distinguish $x$, $x + \mathrm{lcm}$, $x + 2 \cdot \mathrm{lcm}$, …)
  ◦ $\Omega(m)$ different states, $\Omega(\log m)$ space
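The indistinguishability step can be checked numerically for $n = 1$. The coefficients `a` and moduli `q` below are made-up values, and `lcm_all` is a helper written for this example.

```python
from functools import reduce
from math import gcd


def lcm_all(values):
    """Least common multiple of a list of positive integers."""
    return reduce(lambda a, b: a * b // gcd(a, b), values, 1)


def sketch(a, q, x):
    """Return (a_1 x mod q_1, ..., a_r x mod q_r) for a single counter x."""
    return tuple((ai * x) % qi for ai, qi in zip(a, q))


a = [1, 3, 2]            # hypothetical coefficients
q = [4, 6, 5]            # hypothetical moduli
period = lcm_all(q)      # 60

# Shifting x by the lcm of the moduli never changes the sketch.
same = all(sketch(a, q, x) == sketch(a, q, x + period) for x in range(-100, 100))
print(period, same)  # 60 True
```

So a sketch whose moduli have a small lcm cannot separate counter values that are far apart, which is the reason the lcm is shown to be $\Omega(m)$.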
Application: Norm Estimation
• Problem: for $x \in \{-m, \dots, m\}^n$, output $\|x\|_p$ up to additive error $\frac{1}{4} n^{1/p} m$
• $\Omega(\log m)$ space lower bound
• $O(\log m + \log\log n)$ space algorithm ($1 \le p \le 2$) [KNW'10]
• Lower bound tight when $\log\log n = O(\log m) \iff n \le \exp(\mathrm{poly}(m))$
Strengthen the LNW Reduction
• Remove the "box constraint"
• Generalize to the strict turnstile model
• Extend to multi-pass algorithms
The Strict Turnstile Model
• The strict turnstile model: no negative coordinates, i.e., $x_i \ge 0$ at all times in the stream
• Dynamic graph streams: insertions and deletions of edges
  ◦ Allow multi-graphs, but no negative edges
• Generalize the LNW reduction to the strict turnstile model
  ◦ $L$: upper bound on the length of zero-freq streams
  ◦ Initialize all coordinates of $x$ to be $L$
  ◦ Now the reduction guarantees $x$ is always nonnegative
  ◦ Subtract $L$ from all coordinates at the end of the stream
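The shifting trick can be illustrated on an explicit vector. This is a toy under stated assumptions: the helper name `run_shifted`, the `(i, sign)` update encoding, and the example detour are hypothetical, and the code tracks $x$ directly instead of a sketch.

```python
def run_shifted(n, updates, L):
    """Start every coordinate at L, apply the updates, then subtract L.

    Returns the recovered vector and the lowest coordinate value seen.
    """
    x = [L] * n
    lowest = L
    for i, sign in updates:
        x[i] += sign
        lowest = min(lowest, x[i])
    return [v - L for v in x], lowest


detour = [(1, -1), (1, -1), (1, +1), (1, +1)]   # zero-frequency stream, length 4
updates = [(0, +1)] + detour

final_x, lowest = run_shifted(2, updates, L=4)
print(final_x, lowest)                           # [1, 0] 2
print(run_shifted(2, updates, L=0)[1])           # -2 without the shift
```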
Application: Maximum Matching
• [AKLY'16]: For outputting an $n^{\varepsilon}$-approximate maximum matching, space is $\Theta(n^{2 - 3\varepsilon})$
  ◦ Lower bound only in the simultaneous communication model
• Can apply our reduction
Strengthen the LNW Reduction
• Remove the "box constraint"
• Generalize to the strict turnstile model
• Extend to multi-pass algorithms
Multi-Pass Algorithms
• $k$-pass automaton
  ◦ After the $i$-th pass ($i < k$), output an automaton $\mathcal{A}_{i+1}$
  ◦ Run $\mathcal{A}_{i+1}$ on the input stream in the $(i+1)$-st pass
  ◦ After the $k$-th pass, output the answer
• Theorem: There is a $k$-pass automaton for which each automaton in each pass is path-independent
  ◦ Space is optimal up to a constant factor
Conclusions
• New progress on characterizing turnstile streaming algorithms as linear sketches
• Applications
  ◦ Optimal lower bounds for counting with additive error and for maximum matching in dynamic graph streams
• Open questions
  ◦ Box constraint
  ◦ After removing the box constraint, still have very long streams
  ◦ Better reduction?
Thank you!