Thresholds are pass/fail criteria that specify the performance expectations of the system under test. Often, k6 users use thresholds to codify their SLOs.
For example, you can use thresholds to test that your system meets the following expectations:
- Less than 1% of requests return an error.
- 95% of requests have a response time below 200ms.
- 99% of requests have a response time below 400ms.
- A specific endpoint always responds within 300ms.
- Any conditions for a custom metric.
Thresholds analyze the performance metrics and determine whether the final results passed or failed the test. Thresholds are essential for load-testing automation.
Here is a sample script that specifies two thresholds. One threshold evaluates the rate of HTTP errors (the http_req_failed metric). The other evaluates whether 95 percent of responses happen within a certain duration (the http_req_duration metric).
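A minimal sketch of such a script might look like the following. The URL is an assumed placeholder endpoint, and the limits (1% errors, 200 ms) are illustrative values rather than recommendations.

```javascript
import http from 'k6/http';

export const options = {
  thresholds: {
    // Fewer than 1% of requests may fail.
    http_req_failed: ['rate<0.01'],
    // 95% of requests must complete in under 200 ms.
    http_req_duration: ['p(95)<200'],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with the system under test.
  http.get('https://test-api.k6.io/public/crocodiles/1/');
}
```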
In other words, when you define a threshold, you specify an expression for a pass criterion. If that expression evaluates to false at the end of the test, k6 considers the whole test a failure.
After executing that script, k6 outputs something similar to this:
In this case, the test met the criteria for both thresholds. k6 considers this test a pass and exits with exit code 0.
If any of the thresholds had failed, the green checkmark ✓ next to the threshold name (http_req_failed, http_req_duration) would have been a red cross ✗, and k6 would have exited with a non-zero exit code.
The quickest way to start with thresholds is to use the standard, built-in k6 metrics. Here are a few copy-paste examples that you can start using right away.
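As a first example, this sketch sets a single threshold on the built-in http_req_duration trend metric. The 400 ms limit and the URL are assumptions chosen for illustration.

```javascript
import http from 'k6/http';

export const options = {
  thresholds: {
    // 90% of requests must finish within 400 ms (illustrative limit).
    http_req_duration: ['p(90) < 400'],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with your own endpoint.
  http.get('https://test-api.k6.io/public/crocodiles/1/');
}
```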
You can also apply multiple thresholds for one metric. This threshold has different duration requirements for different request percentiles.
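A sketch of several thresholds on one metric, assuming illustrative limits and a placeholder URL. All expressions for the metric sit in the same array.

```javascript
import http from 'k6/http';

export const options = {
  thresholds: {
    // Each percentile gets its own limit; every expression must pass.
    http_req_duration: ['p(90) < 400', 'p(95) < 800', 'p(99.9) < 2000'],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with your own endpoint.
  http.get('https://test-api.k6.io/public/crocodiles/1/');
}
```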
You can set thresholds per group. This code has groups for individual requests and batch requests. For each group, there are different thresholds.
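A sketch of per-group thresholds, using the built-in group_duration metric filtered by the group tag. The group names, limits, and URLs are assumptions for illustration. Note that k6 prefixes group names with `::` in the tag value, which is why three colons appear in the key.

```javascript
import http from 'k6/http';
import { group, sleep } from 'k6';

// Assumed placeholder endpoint; replace it with your own.
const BASE_URL = 'https://test-api.k6.io/public/crocodiles';

export const options = {
  thresholds: {
    'group_duration{group:::individualRequests}': ['avg < 400'],
    'group_duration{group:::batchRequests}': ['avg < 200'],
  },
  vus: 1,
  duration: '10s',
};

export default function () {
  group('individualRequests', function () {
    // Requests sent one after another.
    http.get(`${BASE_URL}/1/`);
    http.get(`${BASE_URL}/2/`);
  });

  group('batchRequests', function () {
    // Requests sent in parallel.
    http.batch([
      ['GET', `${BASE_URL}/1/`],
      ['GET', `${BASE_URL}/2/`],
    ]);
  });

  sleep(1);
}
```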
To use a threshold, you must define at least one threshold_expression:
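The declaration has roughly the following shape. This is a template rather than a runnable test: metric_name1, metric_name2, and 'threshold_expression' are placeholders to replace with real metric names and expressions.

```javascript
export const options = {
  thresholds: {
    // Placeholder metric names, each mapped to an array of expressions.
    metric_name1: ['threshold_expression' /* , more expressions */],
    metric_name2: ['threshold_expression'],
  },
};
```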
This declaration configures thresholds for the metrics metric_name1 and metric_name2. To determine whether each threshold passes or fails, k6 evaluates its 'threshold_expression'.
The 'threshold_expression' must follow the format:
aggregation_method operator value
- avg < 200 // average duration must be less than 200ms
- count >= 500 // count must be larger than or equal to 500
- p(90) < 300 // 90% of samples must be below 300
A threshold expression evaluates to true or false.
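To make the three-part format concrete, the following plain JavaScript sketch splits an expression into its aggregation method, operator, and value, then compares it against already-computed aggregates. It is an illustration only and not k6's actual parser. The function name `evaluateThreshold`, the sample aggregate values, and the set of operators handled are all assumptions.

```javascript
// Illustrative only: not how k6 implements threshold parsing.
function evaluateThreshold(expression, aggregates) {
  // Capture "aggregation_method operator value", e.g. "p(90) < 300".
  const pattern = /^\s*([a-z]+(?:\([\d.]+\))?)\s*(<=|>=|==|!=|<|>)\s*([\d.]+)\s*$/;
  const match = expression.match(pattern);
  if (!match) {
    throw new Error(`Invalid threshold expression: ${expression}`);
  }

  const [, method, operator, rawValue] = match;
  if (!(method in aggregates)) {
    throw new Error(`No aggregate available for: ${method}`);
  }

  const actual = aggregates[method];
  const limit = Number(rawValue);

  switch (operator) {
    case '<': return actual < limit;
    case '<=': return actual <= limit;
    case '>': return actual > limit;
    case '>=': return actual >= limit;
    case '==': return actual === limit;
    default: return actual !== limit; // '!='
  }
}

// Hypothetical aggregates for one metric.
const sampleAggregates = { avg: 180, count: 520, 'p(90)': 310 };

console.log(evaluateThreshold('avg < 200', sampleAggregates)); // true
console.log(evaluateThreshold('count >= 500', sampleAggregates)); // true
console.log(evaluateThreshold('p(90) < 300', sampleAggregates)); // false
```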
Each of the four metric types included in k6 provides a set of aggregation methods that you can use in threshold expressions.
|Metric type|Aggregation methods|
|---|---|
|Counter|count and rate|
|Gauge|value|
|Rate|rate|
|Trend|avg, min, max, med, and p(N), where N is a number between 0.0 and 100.0 that specifies the percentile to look at. For example, p(99.99) means the 99.99th percentile. The unit for these values is milliseconds.|
Here is a (slightly contrived) sample script that uses all different types of metrics, setting different types of thresholds for each:
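One way such a script could be written is sketched below. The metric names (errors, content_size, content_ok, rtt), the URL, and the use of an HTTP 200 status as the definition of "content OK" are assumptions made for this illustration.

```javascript
import http from 'k6/http';
import { sleep } from 'k6';
import { Counter, Gauge, Rate, Trend } from 'k6/metrics';

// Custom metrics, one of each type (names are hypothetical).
const counterErrors = new Counter('errors');
const gaugeContentSize = new Gauge('content_size');
const rateContentOK = new Rate('content_ok');
const trendRTT = new Trend('rtt');

export const options = {
  thresholds: {
    // Counter: bad content at most 99 times.
    errors: ['count<100'],
    // Gauge: latest returned content smaller than 4000 bytes.
    content_size: ['value<4000'],
    // Rate: content OK more than 95% of the time.
    content_ok: ['rate>0.95'],
    // Trend: percentile, average, median, and minimum limits in ms.
    rtt: ['p(99)<300', 'p(70)<250', 'avg<200', 'med<150', 'min<100'],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with your own endpoint.
  const res = http.get('https://test-api.k6.io/public/crocodiles/1/');
  // Assumption: a 200 status stands in for "content is OK".
  const contentOK = res.status === 200;

  trendRTT.add(res.timings.duration);
  rateContentOK.add(contentOK);
  gaugeContentSize.add(res.body ? res.body.length : 0);
  counterErrors.add(contentOK ? 0 : 1);

  sleep(1);
}
```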
We have these thresholds:
- A counter metric that keeps track of the total number of times that the content response was not OK. The success criterion here is that content cannot be bad more than 99 times.
- A gauge metric that contains the latest size of the returned content. The success criterion for this metric is that the returned content is smaller than 4000 bytes.
- A rate metric that keeps track of how often the content returned was OK. This metric has one success criterion: content must have been OK more than 95% of the time.
- A trend metric that is fed with response time samples and that has the following threshold criteria:
  - 99th percentile response time must be below 300 ms
  - 70th percentile response time must be below 250 ms
  - Average response time must be below 200 ms
  - Median response time must be below 150 ms
  - Minimum response time must be below 100 ms
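⚠️ Common mistake: do not specify multiple thresholds for the same metric by repeating the same object key. Because thresholds are properties of a JavaScript object, a repeated key silently replaces the earlier one, so only the last threshold is evaluated. To set several thresholds on one metric, list them in a single array under that key.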
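The overwrite can be seen with plain JavaScript objects, independent of k6. In this sketch, metric_name and both expressions are placeholders.

```javascript
// A repeated key silently replaces the earlier one.
const overwritten = {
  metric_name: ['count<100'],
  metric_name: ['rate<50'], // only this entry survives
};
console.log(overwritten.metric_name); // only 'rate<50' remains

// Listing both expressions in one array keeps them both.
const combined = {
  metric_name: ['count<100', 'rate<50'],
};
console.log(combined.metric_name.length); // 2
```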
It's often useful to specify thresholds on a single URL or specific tag. In k6, tagged requests create sub-metrics that you can use in thresholds:
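The sub-metric is selected by appending the tag in braces to the metric name. In this template, metric_name, tag_name, tag_value, and the expression are placeholders.

```javascript
export const options = {
  thresholds: {
    // Only samples tagged tag_name=tag_value count toward this threshold.
    'metric_name{tag_name:tag_value}': ['threshold_expression'],
  },
};
```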
And here's a full example.
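A sketch of what a complete script could look like. The tag name `type`, its values, the limits, and the URLs are assumptions chosen for illustration.

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

// Assumed placeholder host; replace it with your own.
const BASE_URL = 'https://test-api.k6.io';

export const options = {
  thresholds: {
    // Separate limits for API calls and static assets.
    'http_req_duration{type:API}': ['p(95)<500'],
    'http_req_duration{type:staticContent}': ['p(95)<200'],
  },
};

export default function () {
  // Requests tagged as API traffic.
  http.get(`${BASE_URL}/public/crocodiles/1/`, { tags: { type: 'API' } });
  http.get(`${BASE_URL}/public/crocodiles/2/`, { tags: { type: 'API' } });

  // Requests tagged as static content, sent in parallel.
  http.batch([
    ['GET', `${BASE_URL}/static/favicon.ico`, null, { tags: { type: 'staticContent' } }],
    ['GET', `${BASE_URL}/static/css/site.css`, null, { tags: { type: 'staticContent' } }],
  ]);

  sleep(1);
}
```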
If you want to abort a test as soon as a threshold is crossed, before the test finishes, there's an extended threshold specification format:
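In outline, the extended format replaces the expression string with an object. In this template, metric_name and the values shown are placeholders.

```javascript
export const options = {
  thresholds: {
    metric_name: [
      {
        threshold: 'p(99) < 10', // string: the expression to evaluate
        abortOnFail: true, // boolean: stop the test when it fails
        delayAbortEval: '10s', // string: wait before evaluating
      },
    ],
  },
};
```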
In this example, the threshold is specified as a JS object instead of a string, with parameters that control the abort behavior. The fields are as follows:

|Name|Type|Description|
|---|---|---|
|threshold|string|The threshold expression string that specifies the condition to evaluate.|
|abortOnFail|boolean|Whether to abort the test if the threshold evaluates to false before the test has completed.|
|delayAbortEval|string|How long to delay evaluation of the threshold so that some metric samples can be collected first. Use relative time strings such as 10s or 1m.|
Here is an example:
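A sketch of a full script that aborts early, assuming an illustrative load profile, limit, and placeholder URL.

```javascript
import http from 'k6/http';

export const options = {
  vus: 30,
  duration: '2m',
  thresholds: {
    http_req_duration: [
      {
        // Abort as soon as the 99th percentile reaches 10 ms or more,
        // but only start evaluating after 10 seconds of samples.
        threshold: 'p(99) < 10',
        abortOnFail: true,
        delayAbortEval: '10s',
      },
    ],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with your own endpoint.
  http.get('https://test-api.k6.io/public/crocodiles/1/');
}
```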
⚠️ Evaluation delay in the cloud
When k6 runs in the cloud, thresholds are evaluated every 60 seconds. As a result, the abortOnFail feature may be delayed by up to 60 seconds.
Checks are nice for codifying assertions, but unlike thresholds, checks do not affect the exit status of k6.
If you use only checks to verify that things work as expected, you can't fail the whole test run based on the check results.
It's often useful to combine checks and thresholds, to get the best of both:
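A minimal sketch of this combination, assuming a placeholder URL and an illustrative load profile. The check records pass or fail per request, and the threshold on the built-in checks metric turns the overall success rate into a pass/fail result.

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 50,
  duration: '10s',
  thresholds: {
    // More than 90% of all checks must succeed.
    checks: ['rate>0.9'],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with your own endpoint.
  const res = http.get('https://test-api.k6.io/public/crocodiles/1/');

  check(res, {
    'status is 200': (r) => r.status === 200,
  });

  sleep(1);
}
```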
In this example, the threshold is configured on the checks metric, establishing that the rate of successful checks is higher than 90%.
Additionally, you can use tags on checks if you want to define a threshold based on a particular check or group of checks. For example:
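A sketch of a threshold scoped to tagged checks. The tag name `myTag`, its value `hola`, and the URL are hypothetical. The third argument to check() attaches the tag, and the threshold key filters on it, so the untagged check does not count toward the threshold.

```javascript
import http from 'k6/http';
import { check, sleep } from 'k6';

export const options = {
  vus: 50,
  duration: '10s',
  thresholds: {
    // Applies only to checks tagged myTag=hola.
    'checks{myTag:hola}': ['rate>0.9'],
  },
};

export default function () {
  // Assumed placeholder URL; replace it with your own endpoint.
  const res = http.get('https://test-api.k6.io/public/crocodiles/1/');

  // Untagged check: not covered by the threshold above.
  check(res, {
    'status is 200': (r) => r.status === 200,
  });

  // Tagged check: counted by the threshold above.
  check(
    res,
    { 'body is not empty': (r) => r.body && r.body.length > 0 },
    { myTag: 'hola' }
  );

  sleep(1);
}
```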
In k6 Cloud Results, you can also see how the underlying metric compares to a specific threshold throughout the test. You can add the threshold to the analysis tab to compare it against other metrics.
Learn more about analyzing results in the k6 Cloud Results docs.