Dig Security is a cloud data security startup of 50+ employees that provides real-time visibility, control, and protection of data assets.
We started with a Prometheus server on EKS. That worked until it didn't.
We then spent time scaling it, maintaining it, and throwing more money at it, until we came across VictoriaMetrics.
What we looked for:
With VictoriaMetrics we found the following solution:
sum(median_over_time(process_resident_memory_bytes[24h]))
sum(rate(process_cpu_seconds_total[24h]))
sum(max_over_time(vm_cache_entries{type="storage/hour_metric_ids"}[24h]))
sum(increase(vm_new_timeseries_created_total[24h]))
sum(rate(vm_rows_inserted_total[24h]))
sum(vm_rows{type=~"storage/.+"})
sum(vm_rows{type="indexdb"})
sum(vm_data_size_bytes{type=~"storage/.+"})
sum(vm_data_size_bytes{type="indexdb"})
sum(vm_data_size_bytes) / sum(vm_rows{type=~"storage/.+"})
sum(rate(vm_http_requests_total{path=~".*/api/v1/query_range"}[24h]))
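The queries above can be run against the Prometheus-compatible instant query endpoint (`/api/v1/query`). The following is a minimal sketch of how the request URLs might be built, not a description of Dig Security's own tooling. The base URL, the dictionary keys, and the helper name `instant_query_url` are assumptions made for illustration, and a cluster setup would need a different URL prefix.

```python
from urllib.parse import urlencode

# Assumption: a single-node VictoriaMetrics reachable locally on its default port.
BASE_URL = "http://localhost:8428"

# A few of the queries listed above. The keys are illustrative labels only.
QUERIES = {
    "rows_inserted_per_second": "sum(rate(vm_rows_inserted_total[24h]))",
    "hour_metric_ids_cache_entries": (
        'sum(max_over_time(vm_cache_entries{type="storage/hour_metric_ids"}[24h]))'
    ),
    "bytes_per_row": 'sum(vm_data_size_bytes) / sum(vm_rows{type=~"storage/.+"})',
}


def instant_query_url(base_url: str, query: str) -> str:
    """Build a URL for the Prometheus-compatible instant query endpoint."""
    # urlencode percent-encodes braces, quotes, and other reserved characters.
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': query})}"


if __name__ == "__main__":
    for name, query in QUERIES.items():
        print(name, instant_query_url(BASE_URL, query))
```

Each printed URL can be fetched with any HTTP client. The response is JSON in the Prometheus query API format, with the computed value under `data.result`.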