Ariel Rosenfeld

• A counter that ranges from 0 to $M$ requires $\log_2 M$ bits.
• For large data, $\log_2 M$ is still a lot.
• Using probability, this can be reduced to $\log_2 \log_2 M$ bits.
  ◦ The price is a small probability of error.

• Counting a large number of events using a small amount of memory, while accepting some probability of error.
• Introduced in 1977 by Robert Morris.
• Analyzed in 1982 by Philippe Flajolet.

• Reminder: $SD(X) = \sqrt{Var(X)}$ and $Var(X) = E(X^2) - E(X)^2$.

• Applications:
  ◦ Gathering statistics on a large number of events
  ◦ Streaming data frequency
  ◦ Data compression
  ◦ Etc.

• Because we give up accuracy, we approximate the count by a power of two, $2^k$, and keep only the exponent $k$.
• If the approximate count is $M$, we store only $k$, where $2^k = M$, in binary form.
• Storing $k$ takes $\log_2 \log_2 M$ bits.

• How do we know when to increase $k$?
• Generate $c$ pseudo-random bits.
  ◦ $c$ = current value of the counter
• If all of them are 1, increment the counter.
  ◦ What is the probability of that?
  ◦ How can it be checked efficiently? Simply add the result of the check (1 if all bits are 1, otherwise 0) to the counter.

• What is the probability of an increment?
  ◦ $2^{-C}$
• After $n$ increments (probabilistic explanation in the article):
  ◦ $E(2^C) = n + 2$
  ◦ $Var(2^C) = n(n+1)/2$
  ◦ Small chance of being "far off".

• Example: increment was called 1024 times.
  ◦ The correct counter value should be 10, since $2^{10} = 1024$.
  ◦ The chance of being more than 1 off is about 8%.
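The increment rule above can be sketched in Python as follows. This is a minimal illustration, not the original implementation from the article: the names `MorrisCounter`, `start`, and `rng` are assumptions made for this example. Under this rule each call raises the expected value of $2^C$ by exactly 1, so $E(2^C) = 2^{start} + n$; the $n + 2$ on the slide corresponds to a counter that starts at 1, while this sketch starts at 0 by default.

```python
import random


class MorrisCounter:
    """Illustrative approximate counter that stores only an exponent c."""

    def __init__(self, start=0, rng=None):
        # 'start' and 'rng' are hypothetical parameters for this sketch.
        self.start = start
        self.c = start
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Draw c pseudo-random bits; they are all 1 with probability 2**-c.
        # With c == 0 there are no bits to draw, so the check always succeeds.
        bits = self.rng.getrandbits(self.c) if self.c > 0 else 0
        all_ones = int(bits == (1 << self.c) - 1)
        # Add the result of the check (0 or 1) to the counter.
        self.c += all_ones

    def estimate(self):
        # E(2**c) = 2**start + n, so subtracting 2**start removes the offset.
        return 2 ** self.c - 2 ** self.start


if __name__ == "__main__":
    counter = MorrisCounter()
    for _ in range(1024):
        counter.increment()
    print("stored exponent:", counter.c)
    print("estimated count:", counter.estimate())
```

A single run can land noticeably above or below the true count, which is the "small chance of being far off" mentioned above. Averaging many independent counters brings the estimate close to the real number of increments.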