Prometheus Metrics Explained: Counters, Gauges, Histograms & Summaries
This discussion is part of the basic monitoring series, an effort to clarify monitoring concepts for both beginners and experienced users.
A metric is a snapshot of a measurement or count at a specific moment, for example, the number of HTTP requests served so far or the amount of memory currently in use.
Monitoring systems store metrics, analyze them, send alerts (to Slack, Telegram, etc.), and visualize trends through dashboards. These tools provide insight and help drive informed decisions.
Now, collecting all these metrics is great, but without a clear goal, it’s easy to get lost in the noise. There are a few structured ways to think about which metrics actually matter, such as Google’s Four Golden Signals or the RED and USE methods.
A time series is a sequence of data points indexed by timestamps, with each data point representing a value measured at a specific moment. Each time series is identified by a metric name and a set of labels (key-value pairs) that distinguish it from other time series.
Put simply, a metric consists of four elements: a name (a string), labels (key-value pairs), a value (a floating-point number), and a timestamp (Unix time).
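For instance, a single sample in the Prometheus text exposition format might look like the following (a hypothetical example; the trailing timestamp is optional and given in milliseconds since the Unix epoch):

```
http_requests_total{path="/", code="200"} 1027 1714230000000
```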
Metrics can include extra details using labels, which are just key-value pairs that add context. These help break down the data in a more meaningful way. For example:
`http_requests_total{path="/", code="200"}` represents the total number of requests served at the root path `/` with a `200` status code, while `http_requests_total{path="/admin", code="403"}` tracks requests to `/admin` that resulted in a `403` (forbidden) response.

Labels aren’t always necessary. You might come across metrics without any labels when checking `/metrics`, the common endpoint where monitoring tools pull data from:
```
go_memstats_heap_alloc_bytes 1.37669232e+08
go_memstats_heap_idle_bytes 2.73235968e+08
process_open_fds 59
```
Even when labels aren’t explicitly shown, monitoring tools or agents often attach predefined labels in the background.
Interestingly, the metric name itself (e.g. `http_requests_total`) is considered a label with a special key: `__name__`. So, `http_requests_total{path="/", code="200"}` is actually the same as `{__name__="http_requests_total", path="/", code="200"}`.
A unique combination of a metric name and its labels forms what’s called a timeseries ID. For instance, `requests_total{path="/login", code="200"}` and `requests_total{path="/", code="200"}` are two separate timeseries, even though they share the same metric name, because the `path` label has different values.
Every unique label value creates a new timeseries. High-cardinality labels, like `IP`, `userID`, or `phoneNumber`, can quickly generate an overwhelming number of timeseries, slowing down the monitoring system.
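As a rough, hypothetical illustration: a metric with a `path` label taking 100 distinct values and a `userID` label taking 10,000 distinct values can produce up to 100 × 10,000 = 1,000,000 separate timeseries from a single metric name.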
Each sample is basically a snapshot of a specific timeseries at a given moment: its value paired with a timestamp.
Monitoring tools collect these samples from target servers at regular intervals and store them in a timeseries database. Some systems also allow applications to push metrics directly, instead of waiting for the tool to scrape them.
Monitoring systems typically don’t classify metrics into types—everything is just a measurement. That said, the idea of metric types still helps make sense of what’s being tracked.
A counter does exactly what you’d expect—it counts things. It only increases (or resets to zero) and never goes down.
This makes counters perfect for tracking things that continuously grow, like the number of requests hitting a server, the number of errors logged, or the total transactions processed.
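The article doesn’t show instrumentation code, but as a rough sketch, here’s how such a counter might be defined and incremented with Go’s `prometheus/client_golang` library (the metric and label names are illustrative, not taken from the original):

```go
package main

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// requestsTotal counts every HTTP request, partitioned by path and status code.
// Counters only go up; the client library exposes the current totals on /metrics.
var requestsTotal = promauto.NewCounterVec(
	prometheus.CounterOpts{
		Name: "http_requests_total",
		Help: "Total number of HTTP requests served.",
	},
	[]string{"path", "code"},
)

func main() {
	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		requestsTotal.WithLabelValues("/", "200").Inc() // increment on every request
		w.Write([]byte("ok"))
	})

	// Expose all registered metrics for scraping.
	http.Handle("/metrics", promhttp.Handler())
	http.ListenAndServe(":8080", nil)
}
```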
Now, there’s a common situation where a counter might suddenly drop to zero—when a service restarts or crashes.
Fortunately, rate calculations like `increase()` (total increase over time) and `rate()` (per-second increase on average) handle this by detecting when a counter unexpectedly decreases (for example, from 200 to 0). When this happens, they assume a reset occurred and automatically add the last recorded value (200 in this case) to future counts.
Therefore, even if your server crashes multiple times, the data remains smooth. You probably won’t even notice the resets in the graphs.
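As a rough, hypothetical illustration of that correction: suppose a counter is scraped as 100, 150, 200, then the service restarts and the next samples are 0 and 30. Reset-aware functions treat the drop as a restart, so the series is interpreted as 100, 150, 200, 200, 230, and `increase()` over that window reports 130 instead of a misleading negative value.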
“How does the system know your metric is a counter? Didn’t you just say VictoriaMetrics doesn’t classify metrics into types?”
The counter-like behavior isn’t built into the metric itself—it depends on how you query the data. Certain functions tell VictoriaMetrics to treat the data as if it were a counter:
- `increase`
- `irate`
- `rate`

So, if you just query `my_metric`, you’re getting raw data: no counter reset handling, nothing fancy. But if you query `rate(my_metric[5m])`, VictoriaMetrics will apply counter reset detection and correction, because `rate` is one of those functions that expects counter behavior.
Unlike counters, which only move in one direction, gauges can increase or decrease, reflecting real-time changes. This makes them useful for tracking values that fluctuate.
If you want to see how much memory is available at this moment or how busy the CPU is, a gauge gives you that snapshot. It’s not about cumulative counts—it’s about what’s happening right now.
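For illustration only (the names are hypothetical, and this assumes the same `/metrics` setup as the counter sketch above), a gauge in `prometheus/client_golang` might track in-flight requests, moving up and down as work starts and finishes:

```go
import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// inFlightRequests reflects the number of requests currently in progress.
// Unlike a counter, a gauge is free to go down again.
var inFlightRequests = promauto.NewGauge(prometheus.GaugeOpts{
	Name: "http_in_flight_requests",
	Help: "Number of HTTP requests currently being served.",
})

func trackInFlight(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		inFlightRequests.Inc()       // a request just started
		defer inFlightRequests.Dec() // ...and finishes here
		next.ServeHTTP(w, r)
	})
}
```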
A histogram measures how values are spread across a range. In simple terms, it shows how often different values show up within certain intervals, or buckets.
To get a sense of this, consider a breakdown of HTTP request durations: different response time ranges and how many requests fall into each.
In monitoring, these buckets are usually defined with the `le` (less-or-equal) label, and bucket values are cumulative: each bucket includes all the counts from the previous ones. For example, `http_requests_total{method="GET", le="200"}` represents the total number of GET requests with a response time of 200 milliseconds or less, which also includes everything from the previous buckets (like 100ms and below).
In text format, histograms usually include three different metric suffixes:
- `_bucket`: Counts how many observations fall into each range.
- `_sum`: Tracks the total sum of all recorded values.
- `_count`: Counts the total number of recorded observations.

```
http_request_duration_seconds_bucket{url="/",le="0.005"} 27
http_request_duration_seconds_bucket{url="/",le="0.01"} 60
http_request_duration_seconds_bucket{url="/",le="0.025"} 82
http_request_duration_seconds_bucket{url="/",le="0.05"} 90
http_request_duration_seconds_bucket{url="/",le="10"} 91
http_request_duration_seconds_bucket{url="/",le="+Inf"} 92
http_request_duration_seconds_sum{url="/"} 16.025
http_request_duration_seconds_count{url="/"} 92
```
A histogram is basically a set of counters. Looking at `_count`, we can see that 92 total requests were recorded. `_sum` tells us that the combined response time for all of them was 16.025 seconds. Most requests (90 of the 92) finished in under 0.05s.
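As a sketch of how such a histogram might be produced on the application side (the names and bucket boundaries are illustrative, not from the article), using `prometheus/client_golang`:

```go
import (
	"net/http"
	"time"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// requestDuration records request latencies into predefined buckets.
// Each observation increments every bucket whose upper bound (le) covers the value.
var requestDuration = promauto.NewHistogramVec(
	prometheus.HistogramOpts{
		Name:    "http_request_duration_seconds",
		Help:    "HTTP request duration in seconds.",
		Buckets: []float64{0.005, 0.01, 0.025, 0.05, 10}, // upper bounds; +Inf is added automatically
	},
	[]string{"url"},
)

func timeRequests(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		start := time.Now()
		next.ServeHTTP(w, r)
		requestDuration.WithLabelValues(r.URL.Path).Observe(time.Since(start).Seconds())
	})
}
```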
One of the biggest advantages of histograms is that they allow percentile estimation. If you want to know how long it took to serve 95% of requests in the last 5 minutes, you can use this query:
```
histogram_quantile(0.95, sum(rate(http_request_duration_seconds_bucket{url="/"}[5m])) by (le))
```
This will tell you the response time below which 95% of requests were completed. It’s a useful way to understand performance, especially if you want to make sure that most users are getting fast responses.
Here’s how this query works, step by step:
1. `rate(http_request_duration_seconds_bucket{url="/"}[5m])`: Calculates the average per-second rate of requests in each bucket over the last 5 minutes.
2. `sum(...) by (le)`: Groups all buckets with the same `le` value (ignoring other labels like `status` or `method`) and sums up the rates.
3. `histogram_quantile(0.95, ...)`: Estimates the 95th percentile from the histogram data.

If you’re new to this, don’t worry. We’ll go deeper into functions later in this series.
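As a related example (not from the original text), the same `_sum` and `_count` series can answer a simpler question, the average request duration over the last 5 minutes:

```
sum(rate(http_request_duration_seconds_sum{url="/"}[5m]))
/
sum(rate(http_request_duration_seconds_count{url="/"}[5m]))
```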
In practice, classic histograms come with some challenges: bucket boundaries have to be chosen upfront, and poorly chosen buckets hurt both accuracy and resource usage.
To address these issues, VictoriaMetrics introduces a new histogram format (see Improving Histogram Usability for Prometheus and Grafana by Aliaksandr Valialkin). Additionally, Prometheus introduced native histograms in version v2.40.0.
A summary works similarly to a histogram, but the key difference is where quantiles are estimated.
With histograms, data points are sorted into predefined buckets, stored as cumulative counts, and quantiles are estimated later using `histogram_quantile`. Summaries, on the other hand, calculate quantiles before exporting the data, right on the client side:
```
go_gc_duration_seconds{quantile="0"} 0.000189744
go_gc_duration_seconds{quantile="0.25"} 0.000242796
go_gc_duration_seconds{quantile="0.5"} 0.000271349
go_gc_duration_seconds{quantile="0.75"} 0.000313472
go_gc_duration_seconds{quantile="1"} 0.0021355
go_gc_duration_seconds_sum 1.519748632
go_gc_duration_seconds_count 4695
```
This metric tracks the duration of Go garbage collection (GC) cycles in seconds:
quantile="0"
: The shortest GC duration observed: 0.000189744 sec (189.7µs).quantile="0.25"
: 25% of GC cycles took ≤ 0.000242796 sec (242.8µs).quantile="0.5"
: 50% of GC cycles took ≤ 0.000271349 sec (271.3µs).quantile="0.75"
: 75% of GC cycles took ≤ 0.000313472 sec (313.5µs).quantile="1"
: The longest GC duration recorded: 0.0021355 sec (2.14ms).So, most GC cycles are extremely short—just a fraction of a millisecond.
Since summaries already come with quantile labels, there’s no need to define buckets. This makes them useful when the range of values isn’t predictable.
However, you lose flexibility. Unlike histograms, summaries don’t allow percentile calculations after the fact. If you need to compute the 95th percentile later, you can’t—because it wasn’t precomputed. Summaries also can’t be aggregated across labels, meaning if you have multiple instances of an application, you can’t merge their summary data to get an overall view.
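For completeness, a client-side summary might be declared like this in `prometheus/client_golang` (the metric name and objectives are illustrative, and this assumes the usual `/metrics` exposition setup):

```go
import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

// rpcDuration precomputes quantiles on the client side.
// Objectives map each target quantile to its allowed absolute error.
var rpcDuration = promauto.NewSummary(prometheus.SummaryOpts{
	Name:       "rpc_duration_seconds",
	Help:       "RPC latency distribution.",
	Objectives: map[float64]float64{0.5: 0.05, 0.9: 0.01, 0.99: 0.001},
})

func recordRPC(seconds float64) {
	rpcDuration.Observe(seconds) // quantiles are updated locally, then exposed on /metrics
}
```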
The most important rule in naming metrics is to keep them readable, descriptive, and clear. When you have hundreds or even thousands of metrics, you want to pick them wisely—without wasting time trying to figure out what each one actually represents.
That said, there are a few conventions for naming metrics and labels. These aren’t strict rules, but following them keeps things consistent:
- Use `snake_case`: Metric and label names should be in `snake_case` and free of special characters.
- Avoid spaces in label values; use dashes (`-`) instead. For example, `env="prod-us-west"` is better than `env="prod us west"`.
- Prefix metric names with the application or subsystem they belong to, e.g. `http_`, `go_`, `node_`, `mysql_`, `redis_`, etc.
- Use the `_total` and `_count` suffixes for counters. They can be combined with base units, e.g. `_bytes_total`, `_seconds_total`.
- Add unit suffixes such as `_bytes`, `_errors`, and `_seconds`. It’s best to stick to base units; prefer `seconds` over `milliseconds`.
- Don’t repeat the metric name in its labels. Instead of `http_requests_total{http_status="200"}`, just use `status="200"`.
- Keep label naming consistent across metrics. If one metric uses `service="auth"`, don’t use `app="auth-service"` in another. Stick to the same structure.

For more details, check out Prometheus’s Metric and label naming.
If you spot anything that’s outdated or have questions, don’t hesitate to reach out. You can drop me a DM on X (@func25) or VictoriaMetrics’ Slack.
If you want to monitor your services, track metrics, and see how everything performs, you might want to check out VictoriaMetrics. It’s a fast, open-source, and cost-saving way to keep an eye on your infrastructure.