Cardinality
[1] https://grafana.com/blog/2022/02/15/what-are-cardinality-spikes-and-why-do-they-matter/
Latest revision as of 12:52, 9 December 2022
Cardinality (see wikipedia:Cardinality) is generally defined as the number of elements in a set.
Calculating the exact cardinality of a multiset requires an amount of memory proportional to the cardinality, which is impractical for very large data sets. The HyperLogLog algorithm can estimate cardinalities greater than 10^9 with a typical accuracy (standard error) of 2%, using 1.5 kB of memory.
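The trade-off described above can be illustrated with a minimal HyperLogLog sketch in Python. This is a simplified illustration, not a production implementation: the class name, the choice of SHA-1 as the hash function, and the precision p=11 (2,048 registers, roughly 2.3% standard error) are assumptions made for the example, and a plain Python list uses far more memory than the packed registers of a real implementation.

```python
import hashlib
import math


class HyperLogLog:
    """Simplified HyperLogLog estimator (illustrative sketch only)."""

    def __init__(self, p=11):
        self.p = p                      # number of bits used as register index
        self.m = 1 << p                 # number of registers (2048 for p=11)
        self.registers = [0] * self.m
        # Bias-correction constant; this formula is valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item):
        # Derive a 64-bit hash from the item (SHA-1 is an arbitrary choice here).
        digest = hashlib.sha1(str(item).encode("utf-8")).digest()
        x = int.from_bytes(digest[:8], "big")
        idx = x >> (64 - self.p)                    # first p bits pick a register
        rest = x & ((1 << (64 - self.p)) - 1)       # remaining 64-p bits
        # Rank = position of the leftmost 1-bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self):
        # Harmonic mean of 2^register over all registers.
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return round(estimate)


hll = HyperLogLog()
for i in range(100_000):
    hll.add(f"user-{i}")
    hll.add(f"user-{i}")   # duplicates do not change the estimate
print(hll.count())          # an estimate close to 100000, not an exact count
```

The memory used by the registers is fixed by p and does not grow with the number of items added, whereas an exact count with a Python set would store every distinct item.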
In metrics monitoring, cardinality is often described by the label-to-value ratio: a metric can have low cardinality (1:5 label-value ratio), standard cardinality (1:80 label-value ratio), or high cardinality (1:10,000 label-value ratio). [1]
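As a rough illustration of the label-value ratio, the following Python sketch counts the distinct values seen for each label and the number of unique label combinations (series). The function name and the sample data are hypothetical and not taken from any monitoring tool.

```python
from collections import defaultdict


def label_cardinality(series):
    """Return (distinct values per label, number of unique series)."""
    values = defaultdict(set)
    unique_series = set()
    for labels in series:
        # A series is identified by its full set of label/value pairs.
        unique_series.add(frozenset(labels.items()))
        for name, value in labels.items():
            values[name].add(value)
    per_label = {name: len(vals) for name, vals in values.items()}
    return per_label, len(unique_series)


# Hypothetical samples of an HTTP request metric.
samples = [
    {"method": "GET", "status": "200", "user_id": "u1"},
    {"method": "GET", "status": "500", "user_id": "u2"},
    {"method": "POST", "status": "200", "user_id": "u3"},
    {"method": "GET", "status": "200", "user_id": "u1"},  # repeated series
]

per_label, total = label_cardinality(samples)
print(per_label)  # {'method': 2, 'status': 2, 'user_id': 3}
print(total)      # 3
```

A label such as user_id, whose number of values grows with the number of users, is the kind of label that typically pushes a metric toward the high-cardinality end of the range.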
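Example NRQL query (New Relic) that returns the cardinality of metric names since the start of the current day:
FROM Metric SELECT cardinality(metric.name) SINCE today RAW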
Activities
- Read about cardinality aggregation in Elasticsearch https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-metrics-cardinality-aggregation.html
- Read https://docs.newrelic.com/docs/data-apis/ingest-apis/metric-api/NRQL-high-cardinality-metrics/ to understand "What metric is contributing the most cardinality?" and "What impact does a given attribute(s) have to that total cardinality?".
- Read about cardinality in Prometheus: https://www.robustperception.io/cardinality-is-key/
- Read https://valyala.medium.com/high-cardinality-tsdb-benchmarks-victoriametrics-vs-timescaledb-vs-influxdb-13e6ee64dd6b
Related
- Elasticsearch
- New Relic
- cortex-tools
See also