I am not sure where to put this question, as it is about SRE and metrics. At a high level, how would one set up tooling (Prometheus, etc.) to measure SLOs, SLIs, and SLAs?
Try reading this blog post: Measuring SLO, SLI and SLAs with the help of Prometheus and Grafana
An SLA is not something you measure. It is an agreement with the customer that a set of objectives will be met, such as “We will guarantee 99.95% uptime over a given period X, or we will compensate you $Y”.
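To make the distinction concrete, here is a rough sketch of what you do measure: an SLI (the ratio of good events to total events) checked against an SLO target, plus the remaining error budget. The function names and the request counts are made up for illustration; in practice the counts would come from your metrics backend (for example, Prometheus counters summed over the SLO window).

```python
def availability_sli(good_events: int, total_events: int) -> float:
    """SLI: fraction of events that were good (e.g. non-5xx requests)."""
    if total_events == 0:
        # No traffic in the window; treat as fully available (an assumption).
        return 1.0
    return good_events / total_events


def error_budget_remaining(sli: float, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative if overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    budget = 1.0 - slo_target   # allowed failure ratio, e.g. 0.0005 for 99.95%
    spent = 1.0 - sli           # observed failure ratio
    return 1.0 - spent / budget


if __name__ == "__main__":
    # Hypothetical counts for one SLO window.
    good, total = 999_700, 1_000_000
    slo = 0.9995  # the 99.95% figure from the example SLA above

    sli = availability_sli(good, total)
    print(f"SLI: {sli:.4%}")
    print(f"SLO met: {sli >= slo}")
    print(f"Error budget remaining: {error_budget_remaining(sli, slo):.1%}")
```

The SLA itself never appears in the code: it is the contract that says what happens if the SLO is missed, while the SLI and SLO are what the tooling tracks.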
This is a great link, thank you. I appreciate it.
I also came across these links that may be interesting for others.
- GitHub - slok/sloth: 🦥 Easy and simple Prometheus SLO (service level objectives) generator
- Implementing SLOs using Prometheus and Grafana
- Tutorial - Manage SLOs using Prometheus metrics | Harness Developer Hub
- https://sloth.dev/
- Using Prometheus metrics | Google Cloud Observability
- How We Use Sloth to do SLO Monitoring and Alerting with Prometheus