Expose highWaterMark in http
**Is your feature request related to a problem? Please describe.**
Recently I spent quite some time debugging a severe network performance problem in a Node.js application. The application receives data from multiple HTTP streams (using Axios); most of them are quite low bandwidth, but a few can occasionally go over 20 MB/s. I traced the problem down to Node.js not reading from the socket often enough, causing the kernel buffer to overflow, which in turn makes the kernel set the TCP window to 0, with dramatic consequences for throughput. The root cause was that libuv was serving only 64k per event loop iteration. This means that an application trying to read at 20 MB/s has to do at least 320 event loop iterations per second, leaving it a mere ~3ms of allowed processing time per iteration, including GC and everything else.
There are two problems here. The first is the occasional spike in processing time, which should (and can) be absorbed by the kernel buffer. The default buffer on Linux, 262144 bytes, can absorb 12.5ms of data when receiving at 20 MB/s. This can be adjusted with sysctl and does not concern Node.js.
The second problem is the average throughput: if the application cannot sustain an average of 320 loop iterations per second, then it's over. Data keeps piling up, and setting the TCP window to 0 is an awful way to control the flow.
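As a sanity check, the arithmetic above can be sketched directly (numbers taken from this scenario):

```js
// Back-of-envelope numbers for the scenario described above.
const rateBytesPerSec = 20 * 1024 * 1024; // 20 MB/s incoming stream
const kernelBuf = 262144;                 // default Linux receive buffer
const readPerIteration = 64 * 1024;       // libuv's 64k read size

// How long the kernel buffer can absorb a processing spike:
const absorbMs = (kernelBuf / rateBytesPerSec) * 1000;
console.log(absorbMs.toFixed(1)); // 12.5 (ms)

// Event loop iterations per second needed just to keep up:
const iterationsPerSec = rateBytesPerSec / readPerIteration;
console.log(iterationsPerSec); // 320
```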
Now, why the 64k limit?
**libuv**
The first thing I noticed is that libuv reads in 64k buffers. This is merely a "suggestion" and can be adjusted by Node in `EmitToJSStreamListener::OnStreamAlloc`.
There is a stale PR in libuv that discusses changing the "suggested size" on their side:
https://github.com/libuv/libuv/pull/1279
When submitting `read_cb`s to the upper layers on Linux (why this logic is implemented in an OS-specific layer is beyond me), libuv will do up to 32 reads (a protection against starvation). This happens in the UNIX-specific `uv__read`. With 64k buffers this would have allowed me to achieve 20 MB/s with only 10 event loop iterations per second, if only the HTTP client were not calling `uv_read_stop` at every teaspoon of data.
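The per-iteration budget that libuv allows, if reads are not stopped early, works out as follows (a quick check of the claim above):

```js
// libuv's per-iteration read budget on Linux (uv__read):
const readSize = 64 * 1024;      // the "suggested" allocation size
const maxReadsPerIteration = 32; // starvation guard in uv__read

const bytesPerIteration = readSize * maxReadsPerIteration; // 2 MiB
const rate = 20 * 1024 * 1024;                             // 20 MB/s

// Iterations per second needed if the client never stopped reading:
const neededIterations = rate / bytesPerIteration;
console.log(neededIterations); // 10
```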
**Node's HTTP client**
Node's HTTP client reads the data and stops when it reaches the `highWaterMark`. The default `highWaterMark` is a meager 16k: it is 64k for files, but 16k for network sockets. Thankfully, `Readable` is capable of shoving an entire chunk of data into its buffer (`Readable.prototype.push`) before realizing it was too much. Older versions of Node (14) will get a first chunk of 16k and then a second one of 64k; newer ones will get 64k and will immediately realize they are full and emit a `readStop`.
**highWaterMark**
So, how do we set the `highWaterMark` of `http.request`? I would like to open an issue in `axios`, but I am afraid that if I present the solution I have found, they will tell me that they are not interested in supporting undocumented Node.js internals:
```js
options.createConnection = (opts) => {
  const net = require('net');
  opts.highWaterMark = 1024 * 1024;
  // Note: `new require('net').Socket(opts)` would construct `require`
  // itself due to `new` precedence, hence the intermediate variable.
  const socket = new net.Socket(opts);
  if (opts.timeout) {
    socket.setTimeout(opts.timeout);
  }
  return socket.connect({
    host: opts.host,
    port: opts.port
  });
};
http.request(options, cb);
```
**Describe the solution you'd like**
```js
options.highWaterMark = 1024 * 1024;
http.request(options, cb);
```
Also, please consider raising the default value of 16k.