k6 Cloud, our managed testing solution, has supported Prometheus for a while now, letting you store and correlate performance testing metrics within your observability stack. As announced at Grafana ObservabilityCON, we have now launched Prometheus support for k6 Open Source, our free, open, and extensible load testing tool.
k6 OSS supports sending k6 metrics to multiple outputs such as InfluxDB, New Relic, StatsD, and more. But until now, Prometheus was absent from that list, despite being the standard in cloud-native metrics monitoring and the only supported format in Kubernetes.
Last September, I joined the k6 team, and the first order of business has been the official Prometheus extension: xk6-output-prometheus-remote. Now with k6 OSS, you can also store k6 metrics in Prometheus to better observe the performance and reliability of your systems when testing.
The easiest way to run a Prometheus instance is to use a Docker image locally:
Prometheus Remote Write (PRW) support is not built into k6 natively. Instead, you use the xk6-output-prometheus-remote extension to build a new binary:
Then run a k6 test script with:
Et voilà! The metrics from the execution of script.js will be sent to a Prometheus instance running at http://localhost:9090:
The extension has options for basic HTTP authentication, K6_PROMETHEUS_USER and K6_PROMETHEUS_PASSWORD, and for TLS, K6_PROMETHEUS_INSECURE_SKIP_TLS_VERIFY and K6_CA_CERT_FILE.
The PRW extension can be used with a local Prometheus instance or other observability solutions (link below) supporting a PRW integration. The following command shows how to send k6 metrics to Grafana Cloud Prometheus:
Let's see what we get by running this command.
k6 + Grafana Cloud
k6 has up to 25 built-in metrics that each test run generates by default, such as metrics for virtual users, iterations and their duration, and measurements of data flow:
Most k6 duration metrics are of the Trend metric type, and each consists of several values, as the iteration duration above shows.
As HTTP is the most common protocol at the moment, k6 generates quite a lot of basic data describing HTTP requests. Depending on the use case and one’s imagination, this data can be explored with different visualizations, thresholds, alerts, and so on.
Here are some examples of how RPS, response rate, total requests and errors from a k6 test run can be represented in Grafana Cloud:
Since the Prometheus Remote Write extension gathers metrics with labels by default, there is also a quick way to add filtering by label values to a Grafana dashboard like this:
And now plots and tables can be filtered by URLs, scenarios and HTTP methods as well:
k6 + any remote write
Prometheus Remote Write is a high-level protocol that is actively used and has a large number of integrations.
According to the remote write specification:
"The remote write protocol is designed to make stateless implementations of the server possible; as such there are little-to-no inter-message references."
The remote write protocol is cleanly separated from the exact details of both how metric samples are generated and how they are stored. From this perspective, it can be seen as a pure transfer protocol that, in its simplified form, takes an array of TimeSeries over HTTP. It therefore does not depend on Prometheus and can be used with any remote agent that supports the remote write protocol, like Cortex, InfluxDB, New Relic, Graphite, and Thanos.
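To make "an array of TimeSeries" more concrete, here is a minimal Python sketch of the data model only. On the wire, the specification uses a snappy-compressed protobuf message sent via HTTP POST; this sketch skips all of that. The series name and tag values are made up for illustration, and `make_series` is a hypothetical helper, not part of any library.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class TimeSeries:
    """One series: a sorted label set plus (value, timestamp_ms) samples."""
    labels: List[Tuple[str, str]]
    samples: List[Tuple[float, int]] = field(default_factory=list)


def make_series(name: str, tags: Dict[str, str]) -> TimeSeries:
    # The metric name travels as the reserved "__name__" label, and the
    # specification asks for labels sorted lexicographically by name.
    labels = dict(tags)
    labels["__name__"] = name
    return TimeSeries(labels=sorted(labels.items()))


# Hypothetical series name and tag values, for illustration only.
series = make_series("k6_http_reqs", {"method": "GET", "scenario": "default"})
series.samples.append((42.0, 1_700_000_000_000))

# Conceptually, a write request is just a list of such series.
write_request = [series]
```

Nothing in this structure refers to how the sample was produced or where it will be stored, which is why the same payload can be accepted by any system that implements the protocol.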
A list of remote agents can be found here.
The mechanics behind metrics crunching
Sending k6 metrics
There are 25 built-in metrics that k6 generates at slightly different rates, but no less frequently than once per second. They are subsequently sent to all configured outputs. The behaviour of each output can differ, but the Prometheus Remote Write extension attempts to send the gathered metrics to the remote write interface every second. If the endpoint does not respond quickly enough and processing begins to lag, the extension can also start dropping samples with a warning.
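The idea of gathering samples between flushes and dropping some when the endpoint cannot keep up can be sketched with a bounded buffer. This is an illustration only: the `SampleBuffer` class, its size limit, and the drop-oldest policy are assumptions made for the example, not a description of how the extension is implemented.

```python
from collections import deque
from typing import Deque, List, Tuple

Sample = Tuple[str, float, int]  # (metric name, value, timestamp in ms)


class SampleBuffer:
    """Holds samples between flushes and drops the oldest when full."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.pending: Deque[Sample] = deque()
        self.dropped = 0

    def add(self, sample: Sample) -> None:
        if len(self.pending) >= self.max_size:
            # The endpoint is not keeping up: discard the oldest sample
            # and count it so that a warning can be reported.
            self.pending.popleft()
            self.dropped += 1
        self.pending.append(sample)

    def flush(self) -> List[Sample]:
        """Return everything gathered since the last flush and reset."""
        batch = list(self.pending)
        self.pending.clear()
        return batch
```

A periodic task would call `flush()` once per second and send the returned batch; a non-zero `dropped` counter is the signal that the test produces samples faster than they can be delivered.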
In addition to the built-in metrics, anyone using k6 can define their own custom metrics, which are sent to the outputs together with the built-in ones.
Mapping k6 metrics types
There is more than one way to process and store metrics, and in the case of the Prometheus Remote Write output, metrics must be minimally aggregated to correctly represent what is happening during the test run.
As the remote write protocol is not aware of anything but values and their timestamps, the trouble stems from different systems having different expectations of metric values. k6 has its own four metric types, and other systems acting as remote agents can have another set of metric types, so there must be a way to map metric types from k6 to metric types in a remote agent system.
Here's what the mapping from k6 to Prometheus looks like:
- Counter -> Counter
- Gauge -> Gauge
- Rate -> Gauge
- Trend -> 6 Gauges
Counter and Gauge behave similarly in both k6 and Prometheus, so the mapping for those two types is trivial. Rate is specific to k6. It measures the percentage of successful events and can take values in the [0, 100] range. Since this measure fluctuates during the test run, it can be safely mapped to a Prometheus Gauge as well.
Trend is a k6 metric that allows us to follow statistical changes of a value during the test run. At the moment, Trend does not have an analogue among the Prometheus metric types, and the best mapping that can be done right now is with six Gauges: average, minimum, maximum, median, 90th percentile, and 95th percentile. The Prometheus team is working on adding sparse histograms, which might help us improve Trend's mapping in the future.
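As a rough illustration of the Rate and Trend mappings described above, the following Python sketch reduces raw observations to gauge values. The function names, the gauge name suffixes (`_avg`, `_p95`, and so on), and the linear-interpolation percentile method are assumptions chosen for the example; the extension may name and compute these values differently.

```python
import math
from typing import Dict, List


def percentile(sorted_values: List[float], pct: float) -> float:
    """Percentile by linear interpolation between the closest ranks."""
    if not sorted_values:
        raise ValueError("no values")
    rank = (pct / 100.0) * (len(sorted_values) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * weight


def trend_to_gauges(name: str, values: List[float]) -> Dict[str, float]:
    """Reduce one Trend's observations to six gauge values."""
    if not values:
        raise ValueError("no values")
    ordered = sorted(values)
    return {
        f"{name}_avg": sum(ordered) / len(ordered),
        f"{name}_min": ordered[0],
        f"{name}_max": ordered[-1],
        f"{name}_med": percentile(ordered, 50),
        f"{name}_p90": percentile(ordered, 90),
        f"{name}_p95": percentile(ordered, 95),
    }


def rate_to_gauge(outcomes: List[bool]) -> float:
    """Percentage of successful events, in the [0, 100] range."""
    return 100.0 * sum(outcomes) / len(outcomes)


# Hypothetical observations, in milliseconds.
gauges = trend_to_gauges("iteration_duration", [10.0, 20.0, 30.0, 40.0, 50.0])
```

Each of the six values becomes its own time series, which is why a single Trend is noticeably more expensive to store than a Counter or a Gauge.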
If another, non-Prometheus remote agent is used, this mapping may need to be adapted, as that agent can have its own metric types. The extension is explicit about the mapping it uses. Though the default mapping is the Prometheus one, it can be switched by passing an environment variable:
Raw mapping is the simplest possible mapping: it sends metric values exactly as they are generated, without any pre-request processing.
Caveats and considerations
Another interesting difference is the handling of time, as the remote write protocol requires timestamps to be sent explicitly and in order. On rare occasions, k6 might send metrics out of time order; in this case, Prometheus will warn with an out-of-order sample message and drop these metrics.
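To see which samples would be affected, here is a small Python sketch that mimics a strict receiver. It is a simplification under stated assumptions: ordering is tracked per series, a sample with a timestamp older than the newest one already seen for its series is rejected, and equal timestamps are kept. The `split_in_order` function and the series identifiers are hypothetical.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, int, float]  # (series id, timestamp in ms, value)


def split_in_order(samples: List[Sample]) -> Tuple[List[Sample], List[Sample]]:
    """Separate samples a strict receiver would keep from those it would drop."""
    last_seen: Dict[str, int] = {}
    kept: List[Sample] = []
    dropped: List[Sample] = []
    for series, ts, value in samples:
        if series in last_seen and ts < last_seen[series]:
            # Older than the newest sample of the same series: out of order.
            dropped.append((series, ts, value))
            continue
        last_seen[series] = ts
        kept.append((series, ts, value))
    return kept, dropped
```

For example, given samples for series `a` at 1000, 3000, and then 2000 ms, only the last one is dropped, while a sample for another series at 1500 ms is unaffected.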
k6 metrics are differentiated not only by name and type but also by key-value pairs called tags. In Prometheus and Prometheus Remote Write, these pairs are called labels. The remote write output extension copies k6 metric tags and sends them as labels. However, that increases label cardinality. Each label adds a dimension to a metric, and since Prometheus stores a time series for each unique combination of label values, the number of time series increases as well. This affects not only the performance of Prometheus but potentially also the cost of a hosted Prometheus solution.
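A quick back-of-the-envelope calculation shows how fast this grows. The sketch below computes an upper bound on the number of time series for a single metric as the product of the distinct values of each label. The tag names and values are made up for illustration, and `max_series_count` is a hypothetical helper.

```python
from math import prod
from typing import Dict, List


def max_series_count(label_values: Dict[str, List[str]]) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(len(set(values)) for values in label_values.values())


# Hypothetical tag values for a single k6 metric.
tags = {
    "method": ["GET", "POST"],
    "scenario": ["smoke", "load"],
    "url": ["/", "/login", "/cart", "/checkout"],
}

with_tags = max_series_count(tags)   # 2 * 2 * 4
without_tags = max_series_count({})  # empty product: a single series
```

With these made-up values, one metric can fan out into as many as 16 series, and a Trend mapped to six Gauges multiplies that again. Not every combination necessarily occurs in practice, so this is an upper bound rather than an exact count.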
The remote write extension allows users to opt out of labels completely by using the K6_KEEP_TAGS configuration option, which simply discards tags before sending metric values to the remote write endpoint.
It should also be noted that adding custom metrics increases the total sample rate sent by the output extension and may therefore cause lag and dropped metric samples sooner. Combining careful use of custom metrics with K6_KEEP_TAGS can help when measuring heavy test runs.
We’d love to hear from you! We are always looking for feedback, so we can better understand the needs of the community. There are "almost" no limits with k6, as one of its greatest features is building custom extensions for your particular testing needs.
What do you think about the xk6-output-prometheus-remote extension? Please file your ideas for improvements in the issues section on the GitHub repository, or reach out to us on the Community Forum or Slack.