Thresholds are the pass/fail criteria that you define for your test metrics. If the performance of the system under test (SUT) does not meet the conditions of your threshold, the test will finish with a failed status.
Often, testers use thresholds to codify their SLOs. For example, you can create thresholds for any combination of the following expectations:
- Less than 1% of requests return an error.
- 95% of requests have a response time below 200ms.
- 99% of requests have a response time below 400ms.
- A specific endpoint always responds within 300ms.
- Any conditions for a custom metric.
Thresholds are also essential for load-testing automation:
- Give your test a threshold.
- Automate your test execution.
- Set up alerts for test failures.
After that, you need to pay attention to the test only when your SUT fails to meet its performance expectations.
This sample script specifies two thresholds. One threshold evaluates the rate of HTTP errors (the http_req_failed metric). The other evaluates whether 95 percent of responses arrive within a certain duration (the http_req_duration metric).
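A minimal sketch of such a script might look like the following (the target URL is illustrative):

```javascript
import http from 'k6/http';

export const options = {
  thresholds: {
    // Less than 1% of requests may return an error
    http_req_failed: ['rate<0.01'],
    // 95% of requests must respond below 200ms
    http_req_duration: ['p(95)<200'],
  },
};

export default function () {
  http.get('https://example.com/'); // illustrative URL
}
```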
In other words, when you define a threshold, you specify an expression that defines the pass criteria. If that expression evaluates to false at the end of the test, k6 considers the whole test a fail.
After executing that script, k6 outputs something similar to this:
In this case, the test met the criteria for both thresholds. k6 considers the test a pass and exits with exit code 0.
If any of the thresholds had failed, the little green checkmark ✓ next to the threshold name (http_req_failed, http_req_duration) would have been a red cross ✗, and k6 would have generated a non-zero exit code.
The quickest way to start with thresholds is to use the standard, built-in k6 metrics. Here are a few copy-paste examples that you can start using right away.
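For instance, a thresholds block over built-in metrics might look like this (the specific values are illustrative):

```javascript
export const options = {
  thresholds: {
    // Less than 1% of requests may fail
    http_req_failed: ['rate<0.01'],
    // 95% of requests must respond below 200ms, 99% below 400ms
    http_req_duration: ['p(95)<200', 'p(99)<400'],
    // The test must complete more than 500 iterations in total
    iterations: ['count>500'],
  },
};
```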
You can also apply multiple thresholds for one metric. This threshold has different duration requirements for different request percentiles.
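A sketch of such a multi-threshold configuration, with illustrative limits:

```javascript
export const options = {
  thresholds: {
    // 90% of requests below 400ms, 95% below 800ms, 99.9% below 2s
    http_req_duration: ['p(90) < 400', 'p(95) < 800', 'p(99.9) < 2000'],
  },
};
```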
You can set thresholds per Group. This code has groups for individual requests and batch requests. For each group, there are different thresholds.
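A sketch of per-group thresholds, using the group_duration sub-metric that k6 creates for each group (group names and URLs are illustrative; note the extra `::` prefix in the group tag value):

```javascript
import http from 'k6/http';
import { group, sleep } from 'k6';

export const options = {
  thresholds: {
    'group_duration{group:::individualRequests}': ['avg < 400'],
    'group_duration{group:::batchRequests}': ['avg < 200'],
  },
};

export default function () {
  group('individualRequests', function () {
    http.get('https://example.com/one'); // illustrative URLs
    http.get('https://example.com/two');
  });
  group('batchRequests', function () {
    http.batch([
      ['GET', 'https://example.com/one'],
      ['GET', 'https://example.com/two'],
    ]);
  });
  sleep(1);
}
```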
To use a threshold, you must define at least one threshold_expression:
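In the options object of a script, that takes roughly this shape (the metric names are placeholders):

```javascript
export const options = {
  thresholds: {
    metric_name1: ['threshold_expression'],
    // one metric can also have multiple expressions:
    metric_name2: ['threshold_expression', 'threshold_expression'],
  },
};
```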
This declaration configures thresholds for the metrics metric_name1 and metric_name2. To determine whether the threshold passes or fails, the script evaluates the threshold_expression.
The threshold_expression must follow the format:
aggregation_method operator value
- avg < 200 // average duration must be less than 200ms
- count >= 500 // count must be larger than or equal to 500
- p(90) < 300 // 90% of samples must be below 300
A threshold expression evaluates to true or false.
Each of the four metric types included in k6 provides a set of aggregation methods that you can use in threshold expressions.
| Metric type | Aggregation methods |
|-------------|---------------------|
| Counter | count and rate |
| Gauge | value |
| Rate | rate |
| Trend | avg, min, max, med and p(N), where N is a number between 0.0 and 100.0 meaning the percentile value to look at, e.g. p(99.99) means the 99.99th percentile. The unit for these values is milliseconds. |
Here is a (slightly contrived) sample script that uses all different types of metrics, setting different types of thresholds for each:
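A sketch along those lines, with illustrative metric names and URL, might be:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';
import { Counter, Gauge, Rate, Trend } from 'k6/metrics';

const contentErrors = new Counter('content_errors'); // times content was not OK
const contentSize = new Gauge('content_size');       // latest response size in bytes
const contentOK = new Rate('content_ok');            // how often content was OK
const responseTime = new Trend('response_time', true); // response-time samples (ms)

export const options = {
  thresholds: {
    content_errors: ['count < 100'], // content cannot be bad more than 99 times
    content_size: ['value < 4000'],  // latest content must be smaller than 4000 bytes
    content_ok: ['rate > 0.95'],     // content must be OK more than 95% of the time
    response_time: [
      'p(99) < 300', // 99th percentile below 300ms
      'p(70) < 250', // 70th percentile below 250ms
      'avg < 200',   // average below 200ms
      'med < 150',   // median below 150ms
      'min < 100',   // minimum below 100ms
    ],
  },
};

export default function () {
  const res = http.get('https://example.com/'); // illustrative URL
  const ok = res.status === 200;
  contentErrors.add(!ok);
  contentSize.add(res.body.length);
  contentOK.add(ok);
  responseTime.add(res.timings.duration);
  sleep(1);
}
```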
We have these thresholds:
- A counter metric that keeps track of the total number of times that the content response was not OK. The success criterion here is that content cannot be bad more than 99 times.
- A gauge metric that contains the latest size of the returned content. The success criteria for this metric is that the returned content is smaller than 4000 bytes.
- A rate metric that keeps track of how often the content returned was OK. This metric has one success criterion: content must have been OK more than 95% of the time.
- A trend metric that is fed response-time samples and has the following threshold criteria:
- 99th percentile response time must be below 300 ms
- 70th percentile response time must be below 250 ms
- Average response time must be below 200 ms
- Median response time must be below 150 ms
- Minimum response time must be below 100 ms
⚠️ Common mistake: Do not specify multiple thresholds for the same metric by repeating the same object key:
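For example (the metric name is a placeholder), in a JavaScript object a repeated key silently overwrites the earlier one, so only the last expression would be evaluated:

```javascript
export const options = {
  thresholds: {
    // Wrong: the second 'metric_name' key overwrites the first,
    // so only 'rate < 50' would be evaluated.
    metric_name: ['count < 100'],
    metric_name: ['rate < 50'],
  },
};
```

The fix is to put all expressions for the metric into a single array: `metric_name: ['count < 100', 'rate < 50']`.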
It's often useful to specify thresholds on a single URL or specific tag. In k6, tagged requests create sub-metrics that you can use in thresholds:
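For instance, assuming requests are tagged with a custom `type` tag, thresholds on the resulting sub-metrics might look like:

```javascript
export const options = {
  thresholds: {
    // different duration requirements for differently tagged requests
    'http_req_duration{type:API}': ['p(95) < 500'],
    'http_req_duration{type:staticContent}': ['p(95) < 200'],
  },
};
```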
And here's a full example.
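A full runnable sketch, with illustrative URLs and tag values, might be:

```javascript
import http from 'k6/http';
import { sleep } from 'k6';

export const options = {
  thresholds: {
    'http_req_duration{type:API}': ['p(95) < 500'],
    'http_req_duration{type:staticContent}': ['p(95) < 200'],
  },
};

export default function () {
  // the 'type' tag creates the sub-metrics that the thresholds refer to
  http.get('https://example.com/api/items', { tags: { type: 'API' } });
  http.get('https://example.com/style.css', { tags: { type: 'staticContent' } });
  sleep(1);
}
```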
If you want to abort a test as soon as a threshold is crossed, set the abortOnFail property to true. When you set abortOnFail, the test run stops as soon as the threshold fails.
Sometimes, though, a test might fail a threshold early and abort before the test generates significant data. To prevent these cases, you can delay abortOnFail with delayAbortEval. In this script, abortOnFail is delayed ten seconds. After ten seconds, the test aborts if it fails the p(99) < 10 threshold.
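A sketch of that configuration (the metric under the threshold is assumed here to be http_req_duration):

```javascript
export const options = {
  thresholds: {
    http_req_duration: [
      {
        threshold: 'p(99) < 10',
        abortOnFail: true,
        // wait ten seconds before the threshold can abort the test
        delayAbortEval: '10s',
      },
    ],
  },
};
```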
The fields are as follows:
| Name | Type | Description |
|------|------|-------------|
| threshold | string | The threshold expression string specifying the condition to evaluate. |
| abortOnFail | boolean | Whether to abort the test if the threshold evaluates to false before the test has completed. |
| delayAbortEval | string | How long to delay threshold evaluation, to let some metric samples be collected first. Use relative time strings such as 10s or 1m. |
Here is an example:
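A minimal sketch using the object form with abortOnFail alone (metric and limit are illustrative):

```javascript
export const options = {
  thresholds: {
    http_req_failed: [
      // abort the test as soon as the error rate exceeds 5%
      { threshold: 'rate <= 0.05', abortOnFail: true },
    ],
  },
};
```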
Evaluation delay in the cloud
When k6 runs in the cloud, thresholds are evaluated every 60 seconds. Therefore, the abortOnFail feature may be delayed by up to 60 seconds.
Checks are nice for codifying assertions, but unlike thresholds, checks do not affect the exit status of k6.
If you use only checks to verify that things work as expected, you can't fail the whole test run based on the check results.
It's often useful to combine checks and thresholds, to get the best of both:
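A sketch of that combination (the URL is illustrative):

```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  thresholds: {
    // the rate of successful checks must be higher than 90%
    checks: ['rate > 0.9'],
  },
};

export default function () {
  const res = http.get('https://example.com/'); // illustrative URL
  check(res, {
    'status is 200': (r) => r.status === 200,
  });
}
```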
In this example, the threshold is configured on the checks metric, establishing that the rate of successful checks is higher than 90%.
Additionally, you can use tags on checks if you want to define a threshold based on a particular check or group of checks. For example:
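A sketch of a tagged check and a threshold on its sub-metric (the tag name, tag value, and URL are illustrative):

```javascript
import http from 'k6/http';
import { check } from 'k6';

export const options = {
  thresholds: {
    // threshold on the sub-metric created by the 'myTag' check tag
    'checks{myTag:login}': ['rate > 0.9'],
  },
};

export default function () {
  const res = http.get('https://example.com/login'); // illustrative URL
  // the third argument to check() attaches tags to the check samples
  check(res, { 'status is 200': (r) => r.status === 200 }, { myTag: 'login' });
}
```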
You can also see how the underlying metric compares to a specific threshold throughout the test. The threshold can be added to the analysis tab for further comparison against other metrics.
Learn more about analyzing results in the k6 Cloud Results docs.