Scaling Observability at Spotify with 10x Faster Dashboards & Queries

  • Streaming services
  • Stockholm, Sweden

Spotify chose VictoriaMetrics to replace its legacy in-house time series database. VictoriaMetrics proved to be a robust, efficient, and flexible platform aligned with Spotify’s operational and architectural requirements.

Main Benefits of Using VictoriaMetrics

  • Stability & Reliability
  • High Operational Visibility
  • Scalable & Performant Alerting
  • Predictable & Transparent Pricing
  • Data Ingestion & Query Performance
  • Downsampling: Extra Cost Efficiency

Challenge

As Spotify scaled its observability, its in-house TSDB proved too outdated to handle the large-scale metric ingestion and querying the company needed. The team faced severe technical hurdles:

  • Closed source: Restricted community support and maintainability, and limited compatibility with Prometheus and open standards.
  • Unhappy users: Slow queries and frequent timeouts frustrated end users.
  • High maintenance: Ongoing IT maintenance, support activities, and limited features made the system more expensive to maintain than to replace.

Solution

Replacing the legacy in-house time series database with VictoriaMetrics gave Spotify:

  • Significant improvements in data ingestion and query performance
  • Prometheus-compatible APIs and query language
  • Simplified architecture for easier deployment and management
  • Enhanced data retention and cost efficiency through downsampling and control features
  • Support for both cloud and self-hosted deployments, offering high operational visibility
  • Scalable, performant alerting infrastructure
  • A predictable and transparent licensing model
  • Noticeable improvements in dashboard responsiveness and alert evaluation times
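
To make the Prometheus-compatible API point concrete, here is a minimal sketch of how a client might build an instant-query request. The `/api/v1/query` path and its `query` and `time` parameters come from the Prometheus HTTP API that VictoriaMetrics supports. The host name, the metric, and the `service` label are hypothetical placeholders, not details of Spotify's setup.

```python
from typing import Optional
from urllib.parse import urlencode


def build_instant_query_url(base_url: str, promql: str, time: Optional[str] = None) -> str:
    """Build a URL for the Prometheus-style instant query endpoint (/api/v1/query)."""
    params = {"query": promql}
    if time is not None:
        # Optional evaluation timestamp (RFC 3339 or Unix seconds).
        params["time"] = time
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode(params)}"


# Hypothetical host, metric, and label, for illustration only.
url = build_instant_query_url(
    "http://victoriametrics.example",
    'sum(rate(http_requests_total{service="playback"}[5m]))',
)
print(url)
```

Because the request shape is the standard Prometheus one, existing dashboards and tools that already speak this API can usually be pointed at the new backend by changing only the base URL.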

Why VictoriaMetrics Was Chosen Over Other Solutions

  • High ingestion throughput: VictoriaMetrics handled 78 million datapoints/second, 56% above Spotify’s ingestion target of 50 million.
  • Simple architecture and enterprise features: The team valued VictoriaMetrics’ simple architecture and its enterprise features for downsampling and retention control.
  • High performance with efficient resource usage: VictoriaMetrics stood out for its ability to handle high workloads while maintaining cost-effective resource usage.
  • Consistent alerting at scale: Spotify needed scalable and consistent alerting. Adopting VictoriaMetrics’ vmalert and pairing it with internal tooling allowed Spotify’s Observability team to evaluate alerts much faster and more reliably across thousands of services.

Technical Stats

  • Ingestion rate: ~78M datapoints/second

  • 10x faster dashboards & queries

  • Faster & more reliable alerting

  • Better data accuracy (no pre-aggregations)

  • Access to vast OSS ecosystem (Prometheus, OTel)

  • Significant annual cost savings

  • Flexible deployment models

What Spotify Had to Say

  • “Spotify’s transition to VictoriaMetrics has resulted in significant performance improvements across its monitoring stack, greater efficiency in engineering operations, and enhanced scalability to support future growth.”
    Lauren Roshore, Observability Engineering Manager, Spotify R&D