With rate(http_errors[1m]) and a “1 min sample”, I understand that the data is grouped by 1m, something like (4th sample - 1st sample) / 60 sec = average value per second.
But if we run rate(http_errors[24h]), will it take (the last sample near the 24th hour - the sample at the 0th hour) / 24h (86,400 secs), so that the result is the average value per second?
I am a little confused by the rate function.
I assume this is for a lab or for a course. Would you please include a link to the lab or the course if that’s the case; it will help us figure out how to support you here.
Yes, it's part of the PCA course.
https://learn.kodekloud.com/user/courses/prometheus-certified-associate-pca/module/b4de09eb-de60-4a9d-a193-b6f74f9889a3/lesson/6254635d-a3cb-472d-861a-2e3d699b0751
rate(http_errors[1m]) - Calculates the per-second rate of increase of the http_errors counter over the last 1 minute.
rate(http_errors[24h]) - Calculates the per-second rate of increase of the same metric over the last 24 hours.
The short window measurement is useful for
- Alerting on error spikes within seconds.
- Dashboards focusing on live behavior.
- Identifying sudden regressions or bursts.
The long window is useful for
- Understanding daily behavior and long-term trends.
- SLO/SLA graphs that need stability.
- High-level dashboards that shouldn’t fluctuate rapidly.
The second one means you need more than 24 hours of data retained on the Prometheus instance. In practice, and especially on large clusters with lots of metrics, that can make for massive memory consumption in Prometheus (multiple GB).
Long term metrics are normally gathered by coupling Prometheus with Thanos (not required for the exam, except possibly knowing what Thanos is).
Hi @Alistair_KodeKloud, thanks for explaining. But I would like to see how it works in practice.
Let's assume scrape_interval is 1m and we have scrape samples at 00:01, 00:02, 00:03, 00:04, 00:05 and so on (time).
Now I want to see the rate, and at 00:04 I run rate(http_errors[4m]). This will use the samples (00:01, 00:02, 00:03, 00:04) to give me a per-second rate value.
At 00:05 I run the same query again to see the rate at 00:05. This time it will use the samples (00:02, 00:03, 00:04, 00:05) to give me a per-second rate value.
If my understanding is correct, then I see how rate(http_errors[4m])[5m:1m] works: it is now a range vector and will give the per-second rate values for the last 5m at 1m steps.
Can you please confirm whether my understanding of this concept is correct?
Pretty much, yes
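rate() effectively does this:
- Take all samples in the range vector (e.g., 00:01 → 00:04).
- Work out the increase between the first and last sample in that window, compensating for any counter resets in between.
- Extrapolate that increase towards the window boundaries and divide by the window length in seconds → per-second rate.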
This means:
- It works even with irregular scrapes.
- You never need evenly spaced samples.
- You don’t need exactly “4 samples for 4 minutes” — it uses whatever is present.
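As a rough illustration of the per-second rate calculation, here is a minimal Python sketch. It sums the increases between consecutive samples in the window (treating any drop as a counter reset) and divides by the time between the first and last sample. It is a simplification, not Prometheus's actual code: the real rate() also extrapolates towards the window boundaries, and the function name and counter values below are made up for the example.

```python
def simple_rate(samples):
    """Per-second rate from (timestamp_seconds, counter_value) pairs in one window.

    Simplified sketch: no extrapolation to the window boundaries.
    """
    if len(samples) < 2:
        return None  # a rate needs at least two samples

    increase = 0.0
    prev = samples[0][1]
    for _, value in samples[1:]:
        if value < prev:
            # Counter reset: the counter restarted from zero,
            # so the whole new value counts as increase.
            increase += value
        else:
            increase += value - prev
        prev = value

    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Samples at 00:01..00:04 (as seconds), with hypothetical counter values
window = [(60, 100), (120, 130), (180, 160), (240, 190)]
print(simple_rate(window))  # 90 / 180 = 0.5 errors per second
```

Irregular spacing makes no difference here: `[(60, 100), (95, 110), (240, 190)]` gives the same 0.5, because only the total increase and the elapsed time matter.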
rate(http_errors[4m])[5m:1m] does this:
- rate(http_errors[4m]) gives a single instantaneous rate at each evaluation point.
- [5m:1m] makes an outer range vector, evaluating rate() repeatedly over the last 5 minutes, at 1-minute step intervals.
So that gives
00:01 (uses samples 00:00–00:01 → effectively just 00:01)
00:02 (uses samples 00:00–00:02 → effectively 00:01–00:02)
00:03 (uses 00:01–00:03)
00:04 (uses 00:01–00:04)
00:05 (uses 00:02–00:05)
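The sliding windows can be reproduced with a short Python sketch. It assumes scrapes at exactly 00:01 to 00:05 and treats each 4m window as excluding its left boundary, which is an assumption chosen here to match the listing; the helper name is made up for the example.

```python
def samples_in_window(sample_times, eval_time, window):
    """Sample times that fall inside (eval_time - window, eval_time]."""
    return [t for t in sample_times if eval_time - window < t <= eval_time]


sample_times = [1, 2, 3, 4, 5]  # minutes: scrapes at 00:01 .. 00:05

# One inner 4m window per 1m step of the outer [5m:1m] subquery
windows = {t: samples_in_window(sample_times, t, 4) for t in range(1, 6)}

for eval_time, used in windows.items():
    print(f"00:{eval_time:02d} uses samples at minutes {used}")
```

Note that the 00:01 step only sees one sample, and rate() needs at least two samples in the window to return a value, so that step produces no result.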
Thanks for a good explanation.
You don’t need exactly “4 samples for 4 minutes” - it uses whatever is present. → Right, I took an easy example to keep the calculation simple. Your example below also gives me a good understanding: if we don’t have all the samples, it will pick up as many as are available.
Correcting it.
00:01 (uses samples 00:00–00:01 → effectively just 00:01)
00:02 (uses samples 00:00–00:02 → effectively 00:01–00:02)
00:03 (uses 00:00–00:03) <--
00:04 (uses 00:01–00:04)
00:05 (uses 00:02–00:05)