Data Parameterization

Scripting examples on how to parameterize data in a test script. Parameterization is typically necessary when Virtual Users (VUs) make POST, PUT, or PATCH requests in a test.

Parameterization helps prevent server-side caching from impacting your load test, which in turn makes your test more realistic.
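For example, sending a slightly different body with every request is often enough to keep responses from being served out of a cache. A minimal sketch of the idea (the endpoint below is a placeholder, not part of this guide) could look like this:

import http from 'k6/http';

export default function () {
  // __VU and __ITER make every request body unique, so it cannot be cached
  const payload = JSON.stringify({ username: `user_${__VU}_${__ITER}`, password: 'qwerty' });
  // https://example.com is a placeholder; replace it with your own endpoint
  http.post('https://example.com/api/login', payload, {
    headers: { 'Content-Type': 'application/json' },
  });
}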

warning!

Each VU in k6 is a separate JS VM, so to avoid keeping multiple copies of the whole data file in memory, SharedArray was added. It has some CPU overhead when accessing elements compared to a normal, non-shared array, but the difference is negligible compared to the time it takes to make requests. With large files this matters even more: without SharedArray, k6 may use so much memory that your script cannot run at all, or aborts mid-test when system resources are exhausted.

For example, the Cloud service allocates 8GB of memory for every 300 VUs. If your files are large enough and you are not using SharedArray, your script may run out of memory at some point. Even if there is enough memory, k6 has a garbage collector (as it's written in Go) that walks through all accessible objects (including JS ones) to figure out which can be collected. For big JS arrays copied hundreds of times, this adds a lot of additional work.

A note on performance characteristics of SharedArray can be found within its API documentation.

From a JSON file

data.json
{
  "users": [
    { "username": "test", "password": "qwerty" },
    { "username": "test", "password": "qwerty" }
  ]
}
parse-json.js
import { SharedArray } from 'k6/data';

// Not using SharedArray here would mean that the code in the function call (which loads and
// parses the JSON) is executed for every VU, which also means there is a complete copy
// per VU.
const data = new SharedArray('some data name', function () {
  return JSON.parse(open('./data.json')).users;
});

export default function () {
  const user = data[0];
  console.log(user.username);
}
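The example above always uses the first entry. More often, you'll want each VU (or iteration) to pick a different one; a minimal sketch using the built-in __VU and __ITER variables (the indexing strategy here is just one option) could look like this:

import { SharedArray } from 'k6/data';

const data = new SharedArray('some data name', function () {
  return JSON.parse(open('./data.json')).users;
});

export default function () {
  // __VU starts at 1; spread VUs and their iterations across the data set
  const user = data[(__VU - 1 + __ITER) % data.length];
  console.log(user.username);
}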

From a CSV file

As k6 doesn't support parsing CSV files natively, we'll use a library called Papa Parse.

You can download the library and import it locally like this:

papaparse-local-import.js
import papaparse from './papaparse.js';
import { SharedArray } from 'k6/data';

// Not using SharedArray here would mean that the code in the function call (which loads and
// parses the CSV) is executed for every VU, which also means there is a complete copy
// per VU.
const csvData = new SharedArray('another data name', function () {
  // Load CSV file and parse it using Papa Parse
  return papaparse.parse(open('./data.csv'), { header: true }).data;
});

export default function () {
  // ...
}

Or you can grab it directly from jslib.k6.io like this:

papaparse-remote-import.js
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js';
import { SharedArray } from 'k6/data';

// Not using SharedArray here would mean that the code in the function call (which loads and
// parses the CSV) is executed for every VU, which also means there is a complete copy
// per VU.
const csvData = new SharedArray('another data name', function () {
  // Load CSV file and parse it using Papa Parse
  return papaparse.parse(open('./data.csv'), { header: true }).data;
});

export default function () {
  // ...
}

Here's an example that uses Papa Parse to parse a CSV file of username/password pairs and uses that data to log in to the test.k6.io test site:

parse-csv.js
/* Where the contents of data.csv is:
username,password
admin,123
test_user,1234
*/
import http from 'k6/http';
import { check, sleep } from 'k6';
import { SharedArray } from 'k6/data';
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js';

// Not using SharedArray here would mean that the code in the function call (which loads and
// parses the CSV) is executed for every VU, which also means there is a complete copy
// per VU.
const csvData = new SharedArray('another data name', function () {
  // Load CSV file and parse it using Papa Parse
  return papaparse.parse(open('./data.csv'), { header: true }).data;
});

export default function () {
  // Now you can use the CSV data in your test logic below.
  // Below are some examples of how you can access the CSV data.

  // Loop through all username/password pairs
  for (const userPwdPair of csvData) {
    console.log(JSON.stringify(userPwdPair));
  }

  // Pick a random username/password pair
  const randomUser = csvData[Math.floor(Math.random() * csvData.length)];
  console.log('Random user: ', JSON.stringify(randomUser));

  const params = {
    login: randomUser.username,
    password: randomUser.password,
  };
  console.log('Random user: ', JSON.stringify(params));

  const res = http.post('https://test.k6.io/login.php', params);
  check(res, {
    'login succeeded': (r) =>
      r.status === 200 && r.body.indexOf('successfully authorized') !== -1,
  });

  sleep(1);
}
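If you'd rather have each VU stick to one row instead of picking a random one on every iteration, you could replace the random pick in the default function above with something like this (a sketch; the wrap-around indexing assumes reusing rows is acceptable when there are more VUs than rows):

// __VU starts at 1; wrap around if there are more VUs than rows
const user = csvData[(__VU - 1) % csvData.length];
const params = { login: user.username, password: user.password };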

Generating data

See this example project on GitHub showing how to use faker.js to generate realistic data at runtime.
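As a rough, dependency-free sketch of the same idea (the linked project uses faker.js via a bundler; the helper and endpoint below are made up for illustration), you could generate unique payloads at runtime like this:

import http from 'k6/http';

// Hypothetical helper: builds a unique-looking user for every VU/iteration,
// so requests don't hit server-side caches.
function generateUser() {
  const id = `${__VU}-${__ITER}-${Math.random().toString(36).slice(2, 8)}`;
  return { username: `user_${id}`, email: `user_${id}@example.com` };
}

export default function () {
  const payload = JSON.stringify(generateUser());
  // https://example.com is a placeholder endpoint
  http.post('https://example.com/api/users', payload, {
    headers: { 'Content-Type': 'application/json' },
  });
}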

Old workarounds

The following section is kept for historical reasons: between v0.27.0 and v0.30.0, these were the only ways to lower k6's memory usage while still having access to a lot of parameterization data, with some caveats. You should probably not use the approaches below, as SharedArray should be sufficient.

After k6 v0.27.0, while there was still no way to share memory between VUs, the __VU variable became available in the init context. This meant we could split the data between VUs during initialization instead of keeping multiple full copies of it during the test run. This is not useful now that SharedArray exists, and combining both will likely not bring any more performance benefit than using just the SharedArray.

parse-json-big.js
const splits = 100; // in how many parts are we going to split the data

let data;
if (__VU == 0) {
  open('./data.json'); // we just open it so it is available in the cloud or if we do `k6 archive`
} else {
  data = (function () {
    // a separate function so we don't leak all the data into the main scope
    const allData = JSON.parse(open('./data.json')); // load and parse the data in one go, no need for temp variables
    const partSize = allData.length / splits;
    const index = partSize * (__VU % splits);
    return allData.slice(index, index + partSize);
  })();
}

export default function () {
  console.log(`VU=${__VU} has ${data.length} data items`);
}

With 100k lines like:

{ "username": "test", "password": "qwerty" },

and a total size of 4.8MB, the script above uses about 3.5GB to start 300 VUs. Without the splitting, 100 VUs (each holding all the data) require nearly 10GB. For direct comparison, 100 VUs with splitting used around 2GB of memory.

Adjusting the value of splits gives a different balance between memory usage and the amount of data each VU has.

A second approach is to pre-split the data into different files and have each VU load and parse only its own file.

parse-csv-many.js
import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js';
import { sleep } from 'k6';

const dataFiles = [
  './data1.csv',
  './data2.csv',
  './data3.csv',
  './data4.csv',
  './data5.csv',
  './data6.csv',
  './data7.csv',
  './data8.csv',
  './data9.csv',
  './data10.csv',
];

let csvData;
if (__VU == 0) {
  // workaround to collect all files for the cloud execution
  for (let i = 0; i < dataFiles.length; i++) {
    open(dataFiles[i]);
  }
} else {
  csvData = papaparse.parse(open(dataFiles[__VU % dataFiles.length]), {
    header: true,
  }).data;
}

export default function () {
  sleep(10);
}

The files have 10k lines and are 128KB in total. Running 100 VUs with this script takes around 2GB of memory, while running the same test with a single file takes upwards of 15GB.

Either approach works for both JSON and CSV files, and the two can be combined, which will probably reduce the memory pressure during initialization even further.
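For completeness, a rough sketch of combining the two workarounds (per-VU files plus slicing by __VU) might look like this; the file names and split count are made up, and for new scripts SharedArray remains the better choice:

import papaparse from 'https://jslib.k6.io/papaparse/5.1.1/index.js';
import { sleep } from 'k6';

const dataFiles = ['./data1.csv', './data2.csv']; // hypothetical pre-split files
const splits = 10; // how many slices to cut each file into

let csvData;
if (__VU == 0) {
  // make sure all files are bundled for cloud execution / `k6 archive`
  dataFiles.forEach((file) => open(file));
} else {
  // each VU parses only one file and keeps only one slice of it
  const rows = papaparse.parse(open(dataFiles[__VU % dataFiles.length]), { header: true }).data;
  const partSize = Math.floor(rows.length / splits);
  const index = partSize * (Math.floor(__VU / dataFiles.length) % splits);
  csvData = rows.slice(index, index + partSize);
}

export default function () {
  sleep(10);
}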