Random Counter

Ariel Rosenfeld

A counter ranging from 0 to M requires $\log_2 M$ bits.

For large data, $\log_2 M$ bits is still a lot.

Using probability, this can be reduced to about $\log_2 \log_2 M$ bits.
◦ Small probability of errors.
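To make the saving concrete, the sketch below compares the two bit counts for a few values of M. The helper name `bits_needed` is hypothetical, and the figures are rounded up to whole bits, so they are slightly larger than the idealized $\log_2 M$ and $\log_2 \log_2 M$.

```python
def bits_needed(max_value: int) -> int:
    """Whole bits needed to store any integer in 0..max_value."""
    return max(1, max_value.bit_length())


for exponent in (16, 32, 64):
    M = 2 ** exponent
    exact_bits = bits_needed(M)            # roughly log2(M)
    approx_bits = bits_needed(exponent)    # roughly log2(log2(M)): store only the exponent
    print(f"M = 2^{exponent}: exact counter {exact_bits} bits, "
          f"exponent-only counter {approx_bits} bits")
```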
• Counting a large number of events using a small amount of memory, by making each increment probabilistic.
• Introduced in 1977 by Robert Morris.
• Analyzed in 1982 by Philippe Flajolet.
• $\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)}$.
• $\mathrm{Var}(X) = E(X^2) - E(X)^2$.

• Gathering statistics on a large number of events
• Streaming data frequency
• Data compression
• Etc.
• Because we give up accuracy, we approximate the count by a power of two, $2^k$, and keep only the exponent $k$.
• If the approximate count is $M = 2^k$, we store only $k$, in binary form.
• This takes about $\log_2 \log_2 M$ bits.
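A minimal sketch of the "keep only the exponent" idea, assuming the count is rounded down to the nearest power of two (the slides do not say how the rounding is done). The function names `to_exponent` and `from_exponent` are hypothetical.

```python
def to_exponent(count: int) -> int:
    """Largest k with 2**k <= count (count must be at least 1)."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return count.bit_length() - 1


def from_exponent(k: int) -> int:
    """Approximate count represented by the stored exponent k."""
    return 2 ** k


M = 1_000_000
k = to_exponent(M)
print(f"count {M} is stored as k = {k}, "
      f"read back as {from_exponent(k)}, using {k.bit_length()} bits")
```

Reading the value back can be off by almost a factor of two, which is the accuracy given up in exchange for the smaller representation.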
How do we know when to increase k?

Generate "c" pseudo-random bits
◦ "c" = current value of the counter

If all are 1, increment the counter.
◦ What is the probability?
◦ How to check it efficiently?

Take the logical AND of the bits and simply add the result (0 or 1) to the counter.
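The increment rule above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the class name `MorrisCounter`, the helper `all_ones`, and the starting value 1 are choices made here. The "all bits are 1" check is done by comparing the drawn bits against a mask of `c` ones, which is equivalent to AND-ing the bits together.

```python
import random


def all_ones(c: int, rng: random.Random) -> int:
    """Draw c pseudo-random bits; return 1 if every bit is 1, else 0."""
    if c == 0:
        return 1                      # no bits drawn: the AND of nothing is 1
    bits = rng.getrandbits(c)         # c random bits packed into one integer
    mask = (1 << c) - 1               # c ones in binary
    return int(bits == mask)          # happens with probability 2**-c


class MorrisCounter:
    def __init__(self, start: int = 1, seed=None):
        self.start = start            # assumed initial counter value
        self.c = start                # the only state kept: the exponent
        self._rng = random.Random(seed)

    def increment(self) -> None:
        # Add the AND result (0 or 1) to the counter.
        self.c += all_ones(self.c, self._rng)

    def estimate(self) -> int:
        # E(2**C) = n + 2**start, so subtract the offset to estimate n.
        return 2 ** self.c - 2 ** self.start


counter = MorrisCounter(seed=2024)
for _ in range(1024):
    counter.increment()
print("stored value:", counter.c, "estimated count:", counter.estimate())
```

A single run can be noticeably off. Averaging the estimates of several independent counters reduces the error, at the cost of more memory.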

What is the probability of an increment?
◦ $2^{-C}$

After n increments
(probabilistic explanation in article)
◦ $E(2^C) = n + 2$
◦ $\mathrm{Var}(2^C) = n(n + 1)/2$
◦ Small chance to be “far off”.
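The two moment formulas can be checked exactly for small n. The key step is that one increment call raises the expected value of $2^C$ by exactly 1: given the current C, the new expectation is $2^C (1 - 2^{-C}) + 2^{C+1} \cdot 2^{-C} = 2^C + 1$. The sketch below assumes the counter starts at C = 1, which is the starting value that makes the expectation come out as n + 2. The function names are hypothetical.

```python
from fractions import Fraction


def distribution(n: int, start: int = 1) -> dict:
    """Exact distribution of the counter value C after n increment calls."""
    dist = {start: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for c, p in dist.items():
            up = Fraction(1, 2 ** c)                     # P(increment) = 2**-c
            nxt[c] = nxt.get(c, 0) + p * (1 - up)        # counter stays at c
            nxt[c + 1] = nxt.get(c + 1, 0) + p * up      # counter moves to c + 1
        dist = nxt
    return dist


def moments(n: int):
    """Exact mean and variance of 2**C after n increment calls."""
    dist = distribution(n)
    mean = sum(p * 2 ** c for c, p in dist.items())
    second = sum(p * 4 ** c for c, p in dist.items())
    return mean, second - mean ** 2


for n in (1, 10, 50):
    mean, var = moments(n)
    print(f"n = {n}: E(2^C) = {mean}, Var(2^C) = {var}")
```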

Example: increment was called 1024 times.
◦ Correct counter value should be 10, since $2^{10} = 1024$.
◦ Chance of being more than 1 off is ~8%.
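The chance of being "far off" in this example can be computed rather than simulated. The sketch below tracks the probability of each counter value through 1024 increment calls. It assumes the counter starts at 1, and the exact figure depends on that assumption. The function name `counter_distribution` is hypothetical.

```python
def counter_distribution(n: int, start: int = 1) -> dict:
    """Distribution of the counter value C after n increment calls (floats)."""
    dist = {start: 1.0}
    for _ in range(n):
        nxt = {}
        for c, p in dist.items():
            if p == 0.0:
                continue                                 # drop states that underflowed
            up = 2.0 ** -c                               # P(increment) = 2**-c
            nxt[c] = nxt.get(c, 0.0) + p * (1.0 - up)
            nxt[c + 1] = nxt.get(c + 1, 0.0) + p * up
        dist = nxt
    return dist


dist = counter_distribution(1024)
p_far = sum(p for c, p in dist.items() if abs(c - 10) > 1)
print(f"P(|C - 10| > 1) after 1024 increments = {p_far:.3f}")
for c in range(7, 14):
    print(f"  P(C = {c}) = {dist.get(c, 0.0):.4f}")
```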