When running test scripts locally, many users hit operating system limits that prevent them from making the number of requests needed to complete the test. This usually manifests itself as a Too Many Open Files error. Left unchanged, these limits can be a severe bottleneck if you choose to run a larger or more complicated test locally on your machine.
In this article we will show you how to inspect the OS-imposed limits of your system, tweak them, and scale for larger tests.
Note that everything covered in this article should be approached with a healthy dose of caution. As with any change you introduce to your OS, we recommend not blindly changing your system settings to a specific value. Document a way of testing that shows a clear before/after relation. For example, before changing the MSL / TIME_WAIT period, confirm that you're experiencing the issue (error messages, netstat, ss, etc.), change settings conservatively, re-run the test, and note any improvement. This way we can gauge the impact of the optimization and any negative side-effects, and come up with a range of recommended values.
Modifications below have been tested for macOS Sierra 10.12 and above, so if you are running an older version than that, the process for changing these settings may be different.
Unix operating system derivatives such as GNU/Linux, the BSDs, and macOS can limit the amount of system resources available to a process to safeguard system stability. This includes the total amount of memory, CPU time, or number of open files a single process is allowed to manage.
Since in Unix everything is a file, including network connections, application testing tools that use the network heavily, such as k6, might reach the configured limit of allowed open files, depending on the number of network connections used in a particular test.
As mentioned in our opening section, this results in a "too many open files" error message being shown during a test.
This message means that the network resource limit has been reached, which will prevent k6 from creating new connections, thus altering the test result. In some cases this may be desired, for example to measure overall system performance, but in most cases it will be a bottleneck when testing the HTTP server and web application itself.
Below we will look at ways to increase this resource limit, and allow k6 to run tests with hundreds or thousands of concurrent VUs from a single system.
Unix systems have two types of resource limits:
- hard limits: these are the absolute maximum allowed for each user, and can only be configured by the root user.
- soft limits: these can be configured by each user, but cannot be above the hard limit setting.
On GNU/Linux, you can see the configured limits with the ulimit command.
ulimit -Sa will show all soft limits for the current user:
ulimit -Ha, in turn, will show all hard limits for the current user:
Note the difference in open files: a maximum of 1024 for the soft limit, versus 1048576 for the hard limit.
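The same two values can also be read programmatically. The following is a minimal Python sketch, assuming a Unix-like system where the standard library resource module is available; the helper names open_file_limits and describe are hypothetical and exist only for this illustration.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-files limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


soft, hard = open_file_limits()
print(f"open files: soft={describe(soft)} hard={describe(hard)}")
```

The values printed depend on your system and on the shell that started the interpreter, since limits are inherited from the parent process.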
On macOS, however, there are a couple of different system-imposed limits to take into consideration.
The first one is launchctl limit maxfiles, which prints the per-process limits, also specified as a soft limit and a hard limit. When a soft limit is exceeded a process may receive a signal (for example, if the CPU time or file size is exceeded), but it will be allowed to continue execution until it reaches the hard limit (or modifies its resource limit). The second one is kern.maxfiles, the limit on total file descriptors for the entire system: the sum of all the open files for all processes plus all the files the kernel has open for its own purposes.
So, to reiterate, running the commands above will show you the system limits on open files and running processes.
The first thing you should consider before changing the configuration is the number of network connections you expect your test to require. The http_reqs metric in the k6 result summary can hint at this, but a baseline calculation of maximum number of VUs * number of HTTP requests in a single VU iteration will deliver a fair approximation. Note that k6 also deals with text files and other resources that count towards the "open files" quota, but network connections are the biggest consumers.
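As a rough illustration of that baseline calculation, here is a small Python sketch. The function name estimate_open_files and the headroom default of 1000 extra descriptors for non-network files are assumptions made for this example, not values prescribed by k6.

```python
def estimate_open_files(max_vus, requests_per_iteration, headroom=1000):
    """Approximate the open-files limit a test needs.

    max_vus * requests_per_iteration approximates the network connections;
    headroom is an assumed allowance for non-network file usage.
    """
    connections = max_vus * requests_per_iteration
    return connections + headroom


# A 1000 VU test making 4 HTTP requests per iteration.
print(estimate_open_files(1000, 4))  # prints 5000
```

Treat the result as a starting point and compare it with what the test actually uses before raising any limit further.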
Before we can change any system-imposed limits on macOS, we need to disable a security feature that prevents us from doing so: System Integrity Protection, which was introduced in OS X El Capitan to prevent certain system-owned files and directories from being modified by processes without the proper privileges.
To disable it you will need to restart your Mac and hold down Command + R while it boots. This will boot it into Recovery Mode.
There, navigate to Utilities in the menu bar at the top of the screen, then open Terminal. Once it is open, enter the command csrutil disable.
Once you press enter and close the Terminal, you can reboot your Mac normally and log into your account.
Changing soft limits
So, let's say that we want to run a 1000 VU test which makes 4 HTTP requests per iteration. In this case we could increase the open files limit to 5000, to account for additional non-network file usage. This can be done with the command ulimit -n 5000.
This changes the limit only for the current shell session.
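The session-scoped nature of a soft limit can also be seen from inside a single process. The sketch below, assuming a Unix-like system with Python's resource module, raises the soft open-files limit of the current process only, never above the hard limit and never lowering an existing value. The function name raise_soft_nofile is hypothetical, and some systems may reject values that are otherwise below the reported hard limit.

```python
import resource


def raise_soft_nofile(target):
    """Raise the soft open-files limit of this process towards target.

    The value is capped at the hard limit, and an existing higher
    (or unlimited) soft limit is left untouched.
    Returns the soft limit in effect afterwards.
    """
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft != resource.RLIM_INFINITY and target > soft:
        resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


print(raise_soft_nofile(5000))
```

The change disappears when the process exits, and other processes are unaffected, which mirrors how a limit set in one shell session does not carry over to another.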
If we want to persist this change for future sessions, we can add it to a shell startup file. For Bash, that means appending the line ulimit -n 5000 to ~/.bashrc.
If the soft limit is too low, set the limit for the current session (the values used here are usually close to the default ones):
Since sudo is needed, you will be prompted for a password.
Changing hard limits
If the above command results in an error like cannot modify limit: Operation not permitted or value exceeds hard limit, that means that the hard limit is too low, which as mentioned before, can only be changed by the root user.
This can be done by modifying the /etc/security/limits.conf file.
For example, to set both soft and hard limits on the number of open files per process for the alice account, open /etc/security/limits.conf as root in your text editor of choice and add the following lines:
The new limits will be in place after logging out and back in.
Alternatively, * hard nofile 1048576 would apply the setting for all non-root user accounts, and root hard nofile 1048576 for the root user. See the documentation in that file, or man bash for documentation of the ulimit command.
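Entries in limits.conf follow the pattern domain, type, item, value, as in the * hard nofile 1048576 example above. The following Python sketch builds a soft/hard pair in that format and rejects a soft value above the hard one. The function name limits_conf_entries is hypothetical, and the soft value of 5000 is only an assumption borrowed from the earlier 1000 VU example.

```python
def limits_conf_entries(domain, soft, hard, item="nofile"):
    """Build soft and hard limits.conf lines for one domain (user or *)."""
    if soft > hard:
        raise ValueError("the soft limit cannot be above the hard limit")
    return [
        f"{domain} soft {item} {soft}",
        f"{domain} hard {item} {hard}",
    ]


# Assumed values for illustration only.
for line in limits_conf_entries("alice", 5000, 1048576):
    print(line)
```

This only formats the lines; writing them to /etc/security/limits.conf still has to be done as root.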
The next step is to configure your new file limits. Open Terminal and paste the following command:
This will open a text editor inside your terminal window, where you will be prompted to provide your user password and can then paste the following:
Pressing Control + X will save the changes and exit the editor. By pasting and saving this we have set two different values for your maxfiles limit. The first one (64000) is a soft limit; once it is reached, your Mac will prepare to stop allowing new files to be opened, but will still let them open. The second one (524288) is a hard limit; once it is reached, you will again start seeing your old friend, the "too many open files" error message.
We will use the same procedure to increase the processes limit next.
While in Terminal, create a similar file with this command:
Again, after being prompted for your password, you can paste the following, then save and close with Control + X:
All that is left after this is to reboot your Mac into Recovery Mode, open Terminal, turn SIP back on with csrutil enable, and check whether the limits were changed using the commands from the beginning.
In most cases these limits should be enough to run simple tests locally for some time, but you can modify the files above to whatever values your testing needs.
Please be aware that all of these limits are put in place to protect your operating system from poorly written files and applications that might leak memory in huge quantities. We suggest not going overboard with the values, or you might find your system slowing down to a crawl if or when it runs out of RAM.
When creating an outgoing network connection the kernel allocates a local (source) port for the connection from a range of available ports.
On GNU/Linux you can see this range with:
While the 28,232 ports of the usual default range (32768 to 60999) might be sufficient for most use cases, they can become a limiting factor if you're testing with thousands of connections. You can increase the range with, for example:
Be aware that this range applies to both TCP and UDP, so be conservative with the values you choose and increase as needed.
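To see how many ports a given range actually provides, the bounds can be parsed and counted. This is a minimal Python sketch that assumes a Linux system exposing the range at /proc/sys/net/ipv4/ip_local_port_range; the helper names parse_port_range and port_count are hypothetical.

```python
from pathlib import Path

# Linux-specific location of the ephemeral port range (assumed to exist).
PORT_RANGE_FILE = Path("/proc/sys/net/ipv4/ip_local_port_range")


def parse_port_range(text):
    """Parse 'low high' (whitespace separated) into two integers."""
    low, high = (int(part) for part in text.split())
    return low, high


def port_count(low, high):
    """Number of ports in the range, counting both bounds."""
    return high - low + 1


# Count the ports in the wider 16384 to 65000 range.
print(port_count(16384, 65000))  # prints 48617

if PORT_RANGE_FILE.exists():
    low, high = parse_port_range(PORT_RANGE_FILE.read_text())
    print(f"current range {low}-{high}: {port_count(low, high)} ports")
```

Comparing the count before and after a change is a simple way to confirm that the new range took effect.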
To make the changes permanent, add net.ipv4.ip_local_port_range=16384 65000 to /etc/sysctl.conf.
Last resort tweaks
If you still experience network issues with the above changes, consider enabling net.ipv4.tcp_tw_reuse:
This will enable a feature to quickly reuse connections in TIME_WAIT state, potentially yielding higher throughput.
On macOS the default ephemeral port range is 49152 to 65535, for a total of 16384 ports. You can check this with the sysctl command:
Once you run out of ephemeral ports, you will normally need to wait until the TIME_WAIT state expires (2 * maximum segment lifetime) before you can reuse a particular port number. You can double the number of ports by changing the range to start at 32768, which is the default on Linux and Solaris. (The maximum port number is 65535, so you cannot increase the high end.)
Note that the official range designated by IANA is 49152 to 65535, and some firewalls may assume that dynamically assigned ports fall within that range. You may need to reconfigure your firewall in order to make use of a larger range outside of your local network.
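The interaction between the ephemeral port range and the TIME_WAIT period can be approximated with simple arithmetic. The Python sketch below is a simplification: it assumes every connection goes to the same server address and port, and that each closed connection holds its local port for the full TIME_WAIT period. The 30 second period and the function name max_new_connections_per_second are assumptions for illustration only.

```python
def max_new_connections_per_second(port_count, time_wait_seconds):
    """Rough ceiling on the sustained rate of new connections.

    Each port is assumed to be unusable for time_wait_seconds after
    its connection closes, so at most port_count ports can be recycled
    per TIME_WAIT period.
    """
    return port_count / time_wait_seconds


# Default macOS range (16384 ports) with an assumed 30 s TIME_WAIT.
print(round(max_new_connections_per_second(16384, 30), 1))  # prints 546.1

# Range widened to start at 32768 (32768 ports), same assumption.
print(round(max_new_connections_per_second(32768, 30), 1))  # prints 1092.3
```

Under these assumptions, doubling the range doubles the ceiling, while connection reuse (keep-alive) lowers the number of ports a test needs in the first place.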
In this section we will go over some of the optimisations that are not necessarily dependent on your OS, but may impact your testing.
As a baseline, expect each VU instance to require between 1MB and 5MB of RAM, depending on your script complexity and dependencies. This is roughly between 1GB and 5GB of required system RAM for a 1,000 VU test, so make sure that sufficient physical RAM is available to meet your test demands.
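The baseline above translates into a quick estimate. This Python sketch uses the 1MB to 5MB per-VU figures from the text; the function name estimate_ram_mb is hypothetical.

```python
def estimate_ram_mb(vus, mb_per_vu_low=1, mb_per_vu_high=5):
    """Return the (low, high) RAM estimate in MB for a number of VUs."""
    return vus * mb_per_vu_low, vus * mb_per_vu_high


low_mb, high_mb = estimate_ram_mb(1000)
print(f"1,000 VUs: {low_mb} MB to {high_mb} MB")  # prints 1,000 VUs: 1000 MB to 5000 MB
```

Actual usage depends on the script, so it is worth measuring a small run and scaling from there.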
In addition to physical RAM, ensure that the system is configured with an appropriate amount of virtual memory, or swap space, in case higher memory usage bursts are required.
You can see the status and amount of available swap space on your system with the commands swapon or free.
We won't go into swap configuration details here, but you can find several guides online.
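For checking swap from a script rather than with swapon or free, the values can be read from /proc/meminfo on Linux. The following Python sketch assumes that file exists and uses its usual "Key: value kB" layout; the function name parse_swap_kb is hypothetical.

```python
from pathlib import Path


def parse_swap_kb(meminfo_text):
    """Extract SwapTotal and SwapFree (in kB) from /proc/meminfo-style text."""
    swap = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("SwapTotal", "SwapFree"):
            swap[key] = int(rest.split()[0])
    return swap


meminfo = Path("/proc/meminfo")  # Linux-specific, assumed to be readable
if meminfo.exists():
    print(parse_swap_kb(meminfo.read_text()))
```

A SwapTotal of 0 means no swap space is configured on the system.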
Because k6 can generate and sustain large amounts of network traffic, it also stresses the network stack of modern operating systems. Under certain loads or network conditions it's possible to achieve higher throughput and better performance by tweaking some network settings of the operating system or restructuring the network conditions of the test.
TCP network applications, such as web clients and servers, are assigned a network socket pair (a unique combination of local address, local port, remote address, and remote port) for each incoming or outgoing connection. Typically this socket pair is used for a single HTTP request/response session, and closed soon after. However, even after a connection is successfully closed by the application, the kernel might still reserve resources for quickly reopening the same socket if a new matching TCP segment arrives. This also occurs during network congestion where some packets get lost in transmission. This places the socket in a TIME_WAIT state, and the socket is released once the TIME_WAIT period expires. This period is typically configured between 15 seconds and 2 minutes.
The problem some applications like k6 might run into is causing a high number of connections to end up in the TIME_WAIT state, which can prevent new network connections from being created.
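To confirm that TIME_WAIT sockets are actually piling up, they can be counted before and after a test run. The Python sketch below assumes a Linux system, where /proc/net/tcp lists IPv4 TCP sockets with a hexadecimal state column in which 06 denotes TIME_WAIT. The function name count_time_wait and the two sample rows are made up for illustration.

```python
from pathlib import Path

TIME_WAIT_STATE = "06"  # hexadecimal TCP state code used by /proc/net/tcp


def count_time_wait(proc_net_tcp_text):
    """Count sockets in TIME_WAIT in text formatted like /proc/net/tcp."""
    count = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) > 3 and fields[3] == TIME_WAIT_STATE:
            count += 1
    return count


# Illustrative sample: one listening socket (0A) and one in TIME_WAIT (06).
SAMPLE = """\
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
   0: 0100007F:1F90 00000000:0000 0A 00000000:00000000 00:00000000 00000000     0        0 1 1
   1: 0100007F:C350 0100007F:1F90 06 00000000:00000000 03:00000A28 00000000     0        0 0 3
"""

print(count_time_wait(SAMPLE))  # prints 1

proc_file = Path("/proc/net/tcp")
if proc_file.exists():
    print(count_time_wait(proc_file.read_text()))
```

This covers IPv4 sockets only; IPv6 sockets are listed separately in /proc/net/tcp6 on Linux.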
In these scenarios, before making changes to the system network configuration, which might have adverse side-effects for other applications, it's preferable to first take some common testing precautions.
Use different server ports or IPs
Since sockets are uniquely created for a combination of local address, local port, remote address and remote port, a safe workaround for avoiding TIME_WAIT congestion is using different server ports or IP addresses.
For example, you can configure your application to run on ports :8080, :8081, :8082, etc. and spread out your HTTP requests across these endpoints.
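The spreading of requests across endpoints can be as simple as a round-robin rotation. The sketch below shows the idea in Python; in an actual k6 script the same rotation would be written in JavaScript. The function name endpoint_cycle and the localhost host are assumptions for illustration.

```python
from itertools import cycle, islice


def endpoint_cycle(host, ports, path="/"):
    """Yield URLs for the given ports in an endless round-robin order."""
    return cycle(f"http://{host}:{port}{path}" for port in ports)


endpoints = endpoint_cycle("localhost", [8080, 8081, 8082])

# The first four targets: the rotation wraps around after the last port.
for url in islice(endpoints, 4):
    print(url)
```

Because each server port forms a different socket pair with the same local port, rotating targets this way spreads the TIME_WAIT entries across several remote endpoints.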