HyperLogLog

HyperLogLog (wikipedia:HyperLogLog) is an algorithm for the count-distinct problem: it approximates the number of distinct elements in a multiset.

Calculating the exact cardinality of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets.

The HyperLogLog algorithm can estimate cardinalities of more than 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
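As a rough check on these figures, the sketch below relates the number of registers to memory use and expected error. It uses the commonly cited approximation that the standard error is about 1.04 / sqrt(m) for m registers. The function name and the 6-bit register width are assumptions made for illustration, not details stated on this page.

```python
import math


def hll_budget(precision_bits, register_bits=6):
    """Return (registers, memory_bytes, standard_error) for a dense register array.

    precision_bits: number of hash bits used to pick a register (m = 2**precision_bits).
    register_bits: assumed width of one register; 6 bits is an illustrative choice.
    """
    m = 1 << precision_bits
    memory_bytes = m * register_bits / 8
    standard_error = 1.04 / math.sqrt(m)
    return m, memory_bytes, standard_error


if __name__ == "__main__":
    m, mem, err = hll_budget(11)
    print(f"{m} registers, {mem:.0f} bytes, ~{err:.1%} standard error")
```

Under these assumptions, 2048 registers occupy 1536 bytes (1.5 kB) with a standard error a little above 2%. Each additional precision bit doubles the memory and reduces the error by a factor of sqrt(2).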

/etc/redis.conf

HyperLogLog++ estimates the count from hashes of the values, and has the following properties:

Configurable precision, which controls the trade-off between memory and accuracy
Excellent accuracy on low-cardinality sets
Fixed memory usage: no matter if there are tens or billions of unique values, memory usage only depends on the configured precision.
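The properties above can be illustrated with a minimal sketch of the basic HyperLogLog estimator. This is not HyperLogLog++ itself: it leaves out the bias correction and sparse representation of that variant and only falls back to linear counting for small sets. The class name, the use of SHA-1 truncated to 64 bits as the hash, and the one-byte-per-register layout are all assumptions chosen for readability.

```python
import hashlib
import math


class HyperLogLogSketch:
    """Illustrative HyperLogLog with 2**p registers (hypothetical helper class)."""

    def __init__(self, p=11):
        if not 4 <= p <= 16:
            raise ValueError("p must be between 4 and 16")
        self.p = p
        self.m = 1 << p
        # One byte per register for simplicity; size depends only on p.
        self.registers = bytearray(self.m)

    def add(self, value):
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")        # 64-bit hash value
        index = x >> (64 - self.p)                   # first p bits pick the register
        rest = x & ((1 << (64 - self.p)) - 1)        # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self):
        m = self.m
        if m == 16:
            alpha = 0.673
        elif m == 32:
            alpha = 0.697
        elif m == 64:
            alpha = 0.709
        else:
            alpha = 0.7213 / (1 + 1.079 / m)
        # Raw estimate: normalized harmonic mean of 2**register.
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: linear counting on the empty registers.
            return m * math.log(m / zeros)
        return raw


if __name__ == "__main__":
    hll = HyperLogLogSketch(p=11)
    for i in range(100_000):
        hll.add(f"user-{i}")
    print(round(hll.count()), "estimated distinct values")
    print(len(hll.registers), "registers in use regardless of input size")
```

Adding the same value twice leaves the registers unchanged, which is why the structure counts distinct values rather than occurrences. Raising `p` adds registers and lowers the expected error at the cost of memory.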

LogLog

HyperLogLog
