worker: initial implementation
Hi everyone!
This PR adds threading support to Node.js. I realize that this is not exactly a small PR and that it is going to take a while to review, so: I appreciate comments and questions of any kind, as long as they're somewhat related.
The super-high-level description of the implementation here is that Workers can share and transfer memory, but not JS objects (those are cloned when transferred), and not yet handles such as network sockets.
## FAQ
See https://gist.github.com/benjamingr/3d5e86e2fb8ae4abe2ab98ffe4758665
## Example usage
```js
const { Worker, isMainThread, parentPort, workerData } = require('worker_threads');

if (isMainThread) {
  module.exports = async function parseJSAsync(script) {
    return new Promise((resolve, reject) => {
      const worker = new Worker(__filename, {
        workerData: script
      });
      worker.on('message', resolve);
      worker.on('error', reject);
      worker.on('exit', (code) => {
        if (code !== 0)
          reject(new Error(`Worker stopped with exit code ${code}`));
      });
    });
  };
} else {
  const { parse } = require('some-js-parsing-library');
  const script = workerData;
  parentPort.postMessage(parse(script));
}
```
## Feature set
The communication between threads largely builds on the `MessageChannel` Web API. Transferring `ArrayBuffer`s and sharing memory through `SharedArrayBuffer`s is supported.
Almost the entire Node.js core API is `require()`able or `import`able.
Some notable differences:

- stdio streams may be captured by the parent thread.
- Some functions, e.g. `process.chdir()`, don't exist in worker threads.
- Native addons are not loadable from worker threads (yet).
- No inspector support (yet).
(Keep in mind that PRs can change significantly based on reviews.)
## Comparison with `child_process` and `cluster`

Workers are conceptually very similar to `child_process` and `cluster`.
Some of the key differences are:
- Communication between Workers is different: unlike `child_process` IPC, we don't use JSON, but rather do the same thing that `postMessage()` does in browsers.
  - This isn't necessarily faster, although it can be, and there might be more room for optimization. (Keep in mind how long JSON has been around and how much work has therefore been put into making it fast.)
  - The serialized data doesn't actually need to leave the process, so overall there's less overhead in communication involved.
- Memory in the form of typed arrays can be transferred or shared between Workers and/or the main thread, which enables really fast communication for specific use cases.
- Handles, like network sockets, cannot be transferred or shared (yet).
- There are some limitations on the API usable within workers, since parts of it (e.g. `process.chdir()`) affect per-process state, loading native addons, etc.
- Each worker has its own event loop, but some resources are shared between workers (e.g. the libuv thread pool for file system work).
## Benchmarks
```
$ ./node benchmark/cluster/echo.js
cluster/echo.js n=100000 sendsPerBroadcast=1 payload="string" workers=1: 33,647.30473442063
cluster/echo.js n=100000 sendsPerBroadcast=10 payload="string" workers=1: 12,927.907405288383
cluster/echo.js n=100000 sendsPerBroadcast=1 payload="object" workers=1: 28,496.37373941151
cluster/echo.js n=100000 sendsPerBroadcast=10 payload="object" workers=1: 8,975.53747186485
$ ./node --experimental-worker benchmark/worker/echo.js
worker/echo.js n=100000 sendsPerBroadcast=1 payload="string" workers=1: 88,044.32902365089
worker/echo.js n=100000 sendsPerBroadcast=10 payload="string" workers=1: 39,873.33697018837
worker/echo.js n=100000 sendsPerBroadcast=1 payload="object" workers=1: 64,451.29132425621
worker/echo.js n=100000 sendsPerBroadcast=10 payload="object" workers=1: 22,325.635443739284
```
A caveat here is that startup performance for Workers using this model is still relatively slow (I don’t have exact numbers, but there’s definitely overhead).
Regarding semverness:
The only breaking change here is the introduction of a new top-level module. The name is currently `worker`; this is not under a scope as suggested in https://github.com/nodejs/TSC/issues/389. It seems like the most natural name for this by far.
I've reached out to the owner of the `worker` module on npm, who declined to provide the name for this purpose. The package has 57 downloads/week, so whether we consider this semver-major because of that is probably a judgement call.
Alternatively, I'd suggest using `workers` – it's not quite what we're used to in core (e.g. `child_process`), but the corresponding npm package is essentially just a placeholder.
## Acknowledgements
People I’d like to thank for their code, comments and reviews for this work in its original form, in no particular order:
- @TimothyGu
- @Qard
- @aqrln
- @oe
- @benjamingr
… and finally @petkaantonov for a lot of inspiration and the ability to compare with previous work on this topic.
## Individual commits
src: cleanup per-isolate state on platform on isolate unregister
Clean up once all references to an `Isolate*` are gone from the `NodePlatform`, rather than waiting for the `PerIsolatePlatformData` struct to be deleted, since there may be cyclic references between that struct and the individual tasks.
src: fix MallocedBuffer move assignment operator

src: break out of timers loop if `!can_call_into_js()`

Otherwise, this turns into an infinite loop.
src: simplify handle closing
Remove one extra closing state and use a smart pointer for deleting `HandleWrap`s.
worker: implement `MessagePort` and `MessageChannel`

Implement `MessagePort` and `MessageChannel` along the lines of the DOM classes of the same names. `MessagePort`s initially support transferring only `ArrayBuffer`s.
worker: support MessagePort passing in messages
Support passing `MessagePort` instances through other `MessagePort`s, as expected by the `MessagePort` spec.
worker: add `SharedArrayBuffer` sharing

Logic is added to the `MessagePort` mechanism that attaches hidden objects to those instances when they are transferred; these track their lifetime and maintain a reference count, to make sure that memory is freed at the appropriate times.
src: add `Env::profiler_idle_notifier_started()`
src: move `DeleteFnPtr` into `util.h`
This is more generally useful than just in a crypto context.
worker: initial implementation
Implement multi-threading support for most of the API.
test: add test against unsupported worker features
worker: restrict supported extensions
Only allow `.js` and `.mjs` extensions, to provide future-proofing for file type detection.
src: enable stdio for workers
Provide `stdin`, `stdout` and `stderr` options for the `Worker` constructor, and make these available to the worker thread under their usual names.
The default for `stdin` is an empty stream; the default for `stdout` and `stderr` is redirecting to the parent thread's corresponding stdio streams.
benchmark: port cluster/echo to worker
worker: improve error (de)serialization
Rather than passing errors in some sort of string representation, make a best effort at faithful serialization/deserialization of uncaught exception objects.
test,tools: enable running tests under workers
Enable running tests inside workers by passing `--worker` to `tools/test.py`. A number of tests are marked as skipped, or have been slightly altered to fit the different environment.
## Other work
I know that teams from Microsoft (/cc @fs-eire @helloshuangzi) and Alibaba (/cc @aaronleeatali) have been working on forms of multithreading that have higher degrees of interaction between threads, such as sharing code and JS objects. I’d love if you could take a look at this PR and see how well it aligns with your own work, and what conflicts there might be. (From what I’ve seen of the other code, I’m actually quite optimistic that this PR is just going to help everybody.)
## Checklist
- `make -j4 test` (UNIX), or `vcbuild test` (Windows) passes
- tests and/or benchmarks are included
- documentation is changed or added
- commit message follows commit guidelines