Running VictoriaMetrics on ARM-based processors

The future is now and it’s ARM

ARM processors are becoming more popular and more cost-effective, according to many benchmarks. One of them was run by Percona for MySQL.

Some of our users reported issues with VictoriaMetrics on AWS Graviton instances. The main concerns were higher CPU and disk IO usage compared to x86 instances of the same size under the same workload.

At that time, we had verified that VictoriaMetrics works fine on Raspberry Pi and IoT devices, but hadn’t made any optimizations for ARM builds.

Benchmarks

The main difference between the x86 and ARM builds is the library we use for data encoding. The x86 build uses the gozstd library, a wrapper over Facebook’s zstd, which is written in C.

The cross-compiled ARM64 build uses the compress library by Klaus Post, written in native Go.

So in many respects, the performance difference between the x86 build and the cross-compiled ARM64 build depends heavily on the performance of these libraries.

That’s why, to improve the performance of the ARM builds, we added a CGO build with some tweaks.

Testing env

For my test, I created three instances on AWS:

  1. m5.4xlarge - Intel x86-based instance with 16 vCPUs and 64 GiB RAM ($0.768/hour), running the x86 build of VictoriaMetrics.
  2. m6g.4xlarge - Graviton2-based instance with 16 vCPUs and 64 GiB RAM ($0.616/hour), running the cross-compiled ARM64 build without CGO.
  3. m6g.4xlarge - Graviton2-based instance with 16 vCPUs and 64 GiB RAM ($0.616/hour), running the CGO build for ARM64.

I then installed vmsingle v1.72.0 on each instance.
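To make the price gap between the instance types concrete, here is a minimal Go sketch of the arithmetic. It assumes hourly prices of $0.768 for m5.4xlarge and $0.616 for m6g.4xlarge; the 730-hour month and the helper name `savingsPercent` are illustrative assumptions, not part of the original benchmark.

```go
package main

import "fmt"

// savingsPercent returns how much cheaper candidate is than baseline,
// expressed as a percentage of the baseline hourly price.
func savingsPercent(baseline, candidate float64) float64 {
	return (baseline - candidate) / baseline * 100
}

func main() {
	const (
		x86Hourly     = 0.768 // m5.4xlarge, USD per hour
		armHourly     = 0.616 // m6g.4xlarge, USD per hour
		hoursPerMonth = 730.0 // assumption: average number of hours in a month
	)

	fmt.Printf("ARM is %.1f%% cheaper per hour\n", savingsPercent(x86Hourly, armHourly))
	fmt.Printf("Monthly difference per instance: $%.2f\n", (x86Hourly-armHourly)*hoursPerMonth)
}
```

With these inputs the hourly saving comes out at roughly 20%. Actual prices vary by region and purchase model, so the constants should be replaced with current ones.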

Workload

For workload generation, I used our benchmark suite and set up a separate vmsingle node for metrics collection.

Results

Initial CPU profiling confirms this theory and shows performance improvements with the gozstd library:

Figure: CPU usage by VictoriaMetrics before optimizations

Figure: CPU usage by VictoriaMetrics after optimizations

The optimized ARM and x86 versions show almost the same disk IO usage:

Figure: Disk writes/reads during the benchmark

In query performance, the x86 version outperforms the optimized ARM build by ~10% and the unoptimized ARM build by ~25%:

Figure: Query latency during the benchmark

Building ARM64 golang with CGO

One of the major challenges was adding the CGO build to our cross-compilation pipeline. We use musl-based builds, and the default musl compiler doesn’t know how to build code for ARM. Instead, a dedicated aarch64-linux-musl-gcc cross-compiler must be used:

CC=/path_to_folder/bin/aarch64-linux-musl-gcc \
GOOS=linux GOARCH=arm64 CGO_ENABLED=1 go build main.go

Important note: your C dependencies must be built with the same compiler. In my case, I had to rebuild the gozstd library.

Conclusions

  • VictoriaMetrics on ARM has a better cost-performance ratio than on x86 machines. ARM instances are ~20% cheaper than x86 instances of the same size, with comparable performance.
  • Read query latency is better on x86: the x86 instance has ~10% lower query duration.
  • VictoriaMetrics has provided production-ready ARM builds, as prebuilt binaries and Docker images, since the v1.73.0 release.