HyperLogLog
HyperLogLog (wikipedia:HyperLogLog) is an algorithm for the count-distinct problem: it approximates the number of distinct elements in a multiset.
Calculating the exact cardinality of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets.
The HyperLogLog algorithm can estimate cardinalities of more than 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
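The accuracy and memory figures above can be roughly reproduced from the standard error formula for HyperLogLog, about 1.04 / sqrt(m) for m registers. The sketch below is only an illustration: the function names are hypothetical, and the 6 bits per register is an assumption chosen because 2^11 registers of that size come to 1.5 kB (implementations differ in register width).

```python
import math


def hll_standard_error(num_registers: int) -> float:
    """Approximate relative standard error of HyperLogLog with m registers."""
    return 1.04 / math.sqrt(num_registers)


def hll_memory_bytes(num_registers: int, bits_per_register: int = 6) -> int:
    """Memory for the register array; the 6-bit register width is an assumption."""
    return num_registers * bits_per_register // 8


m = 2 ** 11  # 2048 registers
print(f"standard error: {hll_standard_error(m):.2%}")
print(f"memory: {hll_memory_bytes(m)} bytes")
```

Doubling the number of registers doubles the memory but reduces the error only by a factor of sqrt(2), which is the trade-off behind any precision setting.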
/etc/redis.conf
HyperLogLog++ estimates counts from the hashes of the values and has the following properties:
- Configurable precision, which decides on how to trade memory for accuracy
- Excellent accuracy on low-cardinality sets
- Fixed memory usage: no matter if there are tens or billions of unique values, memory usage only depends on the configured precision.
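The properties above can be illustrated with a minimal sketch of the classic HyperLogLog estimator. This is not HyperLogLog++ itself, which adds empirical bias-correction tables and a sparse representation, and it is not how Redis or any other product implements it. The class name, the use of SHA-1 truncated to 64 bits as the hash, and the default precision of 12 are all assumptions made for the example. Low cardinalities are handled here with the simpler linear-counting fallback.

```python
import hashlib
import math


class HyperLogLog:
    """Illustrative HyperLogLog estimator (hypothetical class, not a library API)."""

    def __init__(self, precision: int = 12) -> None:
        if not 4 <= precision <= 16:
            raise ValueError("precision must be between 4 and 16")
        self.p = precision
        self.m = 1 << precision            # number of registers
        self.registers = bytearray(self.m)  # memory depends only on precision
        # Bias-correction constant alpha_m from the original algorithm.
        if self.m == 16:
            self.alpha = 0.673
        elif self.m == 32:
            self.alpha = 0.697
        elif self.m == 64:
            self.alpha = 0.709
        else:
            self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, value: str) -> None:
        # Assumption: SHA-1 truncated to 64 bits stands in for a fast 64-bit hash.
        digest = hashlib.sha1(value.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h >> (64 - self.p)                 # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)      # remaining 64 - p bits
        # Rank: position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[index]:
            self.registers[index] = rank

    def count(self) -> float:
        raw = self.alpha * self.m * self.m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * self.m and zeros:
            # Low cardinality: fall back to linear counting.
            return self.m * math.log(self.m / zeros)
        return raw


hll = HyperLogLog(precision=12)
for i in range(20000):
    hll.add(f"user-{i}")
print(round(hll.count()))
print(len(hll.registers))
```

The register array has the same length whether 10 or 10 billion values are added, and adding a value that was already seen never changes the estimate.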