esm: move hooks handling into separate class
This PR is meant to enable #44710 to get over the finish line, and could be viewed as splitting off a big chunk of the work for that PR into a smaller one.
This PR splits apart the ESMLoader
class into two classes, ESMLoader
and Hooks
, where the latter’s file is only required (and the class instantiated) when --loader
is used to register custom user loaders. As part of this, a few other lib/internal/module/esm
files were changed to become lazy-loaded (cc @joyeecheung, please tell me if I’m doing this right).
This should result in a minor performance boost for the default case of no user loaders registered, as we don’t need to do as much validation for the return values of Node’s internal defaultResolve
and defaultLoad
hooks; those functions are also now called directly, rather than as part of a “chain” of one each. cc @mcollina
The case where user loaders are registered will now consume slightly more memory than before. Both previously and in this PR there is a check where the url
property passed between hooks (as a return value from resolve
and an input value for load
) is validated that it can be instantiated via new URL(url)
without throwing. Before this PR, there’s a check that the url
we’re about to validate hasn’t already been validated by virtue of it existing in ESMLoader.moduleMap
. Per this PR, we cache already-validated url
values in a new SafeSet
that’s attached to the Hooks
class. This new data structure means that we store all of the URLs for all loaded modules twice, in ESMLoader.moduleMap
and in Hooks.#validatedUrls
; but I think this is unavoidable if we want this to become the basis for #44710 since we wouldn’t be able to share ESMLoader.moduleMap
across the worker boundary. It’s arguable whether we need to do this validation check at all, or at least to do it between every registered hook; if we did the check only once on the final result of the resolve
chain, it could stay in ESMLoader
. This would be a breaking change to the current Loaders API, however, so I’m avoiding making such a change as part of this PR. (I think we should consider if it’s a change worth making, and the tradeoffs of whether the check is worth the performance cost, but I think that can come in a later PR.)
This should set the stage for another attempt at wrapping the user loaders processing into a worker thread. In #44710 the idea was to put a second ESMLoader
class inside the worker and try to send instantiated Module
out from the worker back to the main thread, but those proved impossible to serialize. After this PR lands, the hooks.js
that this PR adds would be what gets wrapped in a worker, and all of its inputs and outputs should be serializable. (They’re not quite JSON serializable, as the load
hook returns an object with a source
property that could be a buffer, but these return values are much simpler than Module
objects and should be within the capabilities of v8.serialize
/deserialize
.)
The bulk of hooks.js
was just moved from ESMLoader
. I avoided making unrelated changes other than adding/editing some comments and JSDocs.
cc @nodejs/loaders