Difference between revisions of "HyperLogLog"

Revision as of 14:59, 3 October 2022

wikipedia:HyperLogLog is an algorithm for the count-distinct problem.

Calculating the exact cardinality of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. The HyperLogLog algorithm is able to estimate cardinalities of > 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
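The idea can be illustrated with a minimal sketch in Python. It is not a tuned or reference implementation: the class name HyperLogLog, the parameter p, and the use of SHA-1 truncated to 64 bits as the hash are assumptions made for this example. Each value is hashed, the first p bits of the hash select one of 2^p registers, and the register keeps the largest "rank" (position of the leftmost 1-bit in the remaining bits) seen so far. The estimate is a bias-corrected harmonic mean over the registers, with a fallback to linear counting for small sets.

```python
import hashlib
import math


class HyperLogLog:
    """Minimal HyperLogLog sketch (illustrative only, names are hypothetical)."""

    def __init__(self, p=11):
        # p index bits give m = 2**p registers (p=11 -> 2048 registers).
        # The alpha formula used in count() assumes m >= 128, i.e. p >= 7.
        self.p = p
        self.m = 1 << p
        self.registers = [0] * self.m

    def add(self, value):
        # Assumption: a 64-bit hash cut from SHA-1; any well-mixed hash works.
        digest = hashlib.sha1(str(value).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)                  # first p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)     # remaining 64 - p bits
        # Rank: 1-based position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        self.registers[idx] = max(self.registers[idx], rank)

    def count(self):
        m = self.m
        alpha = 0.7213 / (1 + 1.079 / m)          # bias-correction constant
        raw = alpha * m * m / sum(2.0 ** -r for r in self.registers)
        zeros = self.registers.count(0)
        if raw <= 2.5 * m and zeros:
            # Small-range correction: linear counting on the empty registers.
            return m * math.log(m / zeros)
        return raw
```

Adding the same value twice leaves the registers unchanged, so duplicates do not affect the estimate. The memory used is the register array only, regardless of how many values are added.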

/etc/redis.conf

HyperLogLog++ estimates the count from hashes of the values and has the following properties:

* Configurable precision, which decides on how to trade memory for accuracy
* Excellent accuracy on low-cardinality sets
* Fixed memory usage: no matter if there are tens or billions of unique values, memory usage only depends on the configured precision.
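The trade-off between precision and memory can be put into numbers. The sketch below assumes dense registers of 6 bits each and uses the standard HyperLogLog error estimate of about 1.04 / sqrt(m) for m = 2^p registers. The function name hll_tradeoff and the 6-bit register width are assumptions for this example, not settings taken from any particular implementation.

```python
import math


def hll_tradeoff(p, bits_per_register=6):
    """Return (registers, memory in bytes, relative standard error) for precision p."""
    m = 1 << p                                   # number of registers
    memory_bytes = m * bits_per_register / 8     # dense register array only
    std_error = 1.04 / math.sqrt(m)              # standard HyperLogLog error estimate
    return m, memory_bytes, std_error


if __name__ == "__main__":
    for p in (10, 11, 12, 14, 16):
        m, mem, err = hll_tradeoff(p)
        print(f"p={p:2d}  registers={m:6d}  memory={mem:8.0f} B  error={err:.2%}")
```

Under these assumptions, p = 11 gives 2048 registers, 1536 bytes and an error of roughly 2.3%, which is in line with the 1.5 kB and 2% figures quoted above. Each extra bit of precision doubles the memory and reduces the error by a factor of about 1.41.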

@@ Line 1: / Line 1: @@
-[[wikipedia:HyperLogLog]] is an algorithm for the [[count-distinct]] problem.
+[[wikipedia:HyperLogLog]] is an [[algorithm]] for the [[count-distinct]] problem.
+Calculating the exact [[cardinality]] of a [[multiset]] requires an amount of memory proportional to the [[cardinality]], which is impractical for very large data sets. The HyperLogLog algorithm is able to estimate cardinalities of > 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
   [[/etc/redis.conf]]
+[[HyperLogLog++]] counts based on the hashes of the values with some properties:
+* Configurable precision, which decides on how to trade memory for accuracy
+* Excellent accuracy on low-cardinality sets
+* Fixed memory usage: no matter if there are tens or billions of unique values, memory usage only depends on the configured precision.
+== See also ==
+* [[LogLog]]
+* {{Redis}}
+[[Category:Probabilistic data structures]]
