Reducing CPU 2x: Traefik Labs Uses VictoriaMetrics as Drop-in Prometheus Replacement
- API management platform
- Lyon, France
Main Benefits of Using VictoriaMetrics
Stability & Reliability
Operational Visibility
Open Source
Challenge
Before adopting VictoriaMetrics, Traefik Labs faced several challenges:
- The metrics monitoring pod was the biggest pod in their infrastructure. An observability stack should not occupy more resources than the actual workload.
- Changing labels, dropping metrics, and similar changes meant restarting Prometheus each time. Every restart was a 40-second wait and a poor user experience.
Solution
Traefik Labs chose VictoriaMetrics to replace Prometheus, resulting in:
- 2x reduction in CPU usage.
- Instant feedback loop to make metrics faster and simpler to work with.
- Better user experience with the configuration inside the UI, so Traefik Labs can test relabeling configuration before applying it.
- Prometheus-compatible APIs and query language.
- Simplified architecture for easier deployment and management.
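To make the relabeling point concrete, here is a minimal Python sketch of what a Prometheus-style relabeling rule does to a single time series. It is a simplified model for illustration only, not VictoriaMetrics or Prometheus code. The rule set, the metric names, and the `apply_rules` helper are all hypothetical. It assumes the usual Prometheus semantics: regexes are fully anchored and source label values are joined with `;`. Python's `re` syntax also differs slightly from the RE2 syntax those tools use.

```python
import re

# Hypothetical rules, modeled on Prometheus-style relabel configs.
RULES = [
    # Drop every series whose metric name starts with "go_gc_".
    {"action": "drop", "source_labels": ["__name__"], "regex": "go_gc_.*"},
    # Remove one noisy label from the series that remain.
    {"action": "labeldrop", "regex": "pod_template_hash"},
]


def apply_rules(labels, rules):
    """Return the relabeled label set, or None if the series is dropped."""
    labels = dict(labels)  # work on a copy, leave the input untouched
    for rule in rules:
        pattern = re.compile(rule["regex"])
        if rule["action"] == "drop":
            # Source label values are joined with ";" before matching.
            value = ";".join(labels.get(name, "") for name in rule["source_labels"])
            if pattern.fullmatch(value):  # fullmatch models anchored regexes
                return None
        elif rule["action"] == "labeldrop":
            # Remove every label whose *name* matches the regex.
            labels = {k: v for k, v in labels.items() if not pattern.fullmatch(k)}
        else:
            raise ValueError(f"unsupported action: {rule['action']}")
    return labels


if __name__ == "__main__":
    samples = [
        {"__name__": "go_gc_duration_seconds", "job": "traefik"},
        {"__name__": "http_requests_total", "job": "traefik", "pod_template_hash": "abc123"},
    ]
    for series in samples:
        print(series["__name__"], "->", apply_rules(series, RULES))
```

Running rules against a few sample series like this is the kind of check that can happen before a configuration is applied, instead of finding out the effect after a restart.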
Why VictoriaMetrics Was Chosen Over Other Solutions
- An open source company whose open core model allows a wide range of use cases.
- Simple maintenance with a lightweight, resilient metric collector.
- A focus on user experience and attention to the community to solve real-world issues.
Traefik’s switch to monitoring with VictoriaMetrics was driven by the architecture. Because the architecture is divided into multiple microservices, you can pick some metrics at the source, relabel a few of them, and send them to the monitoring tool via another small storage node.
Technical Stats
- Ingestion rate: 677 dp/s
- Active time series: 4600
- CPU usage: 3%
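The ingestion rate and the active series count can be related with some quick arithmetic. The sketch below assumes that "dp/s" means data points per second and that every active series is sampled at the same steady interval. Neither assumption is stated in the case study, so the results are rough estimates, and the function names are illustrative.

```python
INGESTION_RATE_DPS = 677   # data points per second, from the stats above
ACTIVE_SERIES = 4600       # active time series, from the stats above
SECONDS_PER_DAY = 86_400


def average_sample_interval_seconds(active_series, rate_dps):
    """Average time between two samples of the same series (uniform-sampling assumption)."""
    return active_series / rate_dps


def samples_per_day(rate_dps):
    """Total data points ingested per day at a constant rate."""
    return rate_dps * SECONDS_PER_DAY


if __name__ == "__main__":
    interval = average_sample_interval_seconds(ACTIVE_SERIES, INGESTION_RATE_DPS)
    print(round(interval, 1))                   # 6.8 (seconds)
    print(samples_per_day(INGESTION_RATE_DPS))  # 58492800
```

Under these assumptions, each series receives a new sample roughly every 7 seconds, and the setup ingests about 58 million data points per day.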