Skip to content

esm: move hooks handling into separate class

This PR is meant to enable #44710 to get over the finish line, and could be viewed as splitting off a big chunk of the work for that PR into a smaller one.

This PR splits apart the ESMLoader class into two classes, ESMLoader and Hooks, where the latter’s file is only required (and the class instantiated) when --loader is used to register custom user loaders. As part of this, a few other lib/internal/module/esm files were changed to become lazy-loaded (cc @joyeecheung, please tell me if I’m doing this right).

This should result in a minor performance boost for the default case of no user loaders registered, as we don’t need to do as much validation for the return values of Node’s internal defaultResolve and defaultLoad hooks; those functions are also now called directly, rather than as part of a “chain” of one each. cc @mcollina

The case where user loaders are registered will now consume slightly more memory than before. Both previously and in this PR there is a check where the url property passed between hooks (as a return value from resolve and an input value for load) is validated that it can be instantiated via new URL(url) without throwing. Before this PR, there’s a check that the url we’re about to validate hasn’t already been validated by virtue of it existing in ESMLoader.moduleMap. Per this PR, we cache already-validated url values in a new SafeSet that’s attached to the Hooks class. This new data structure means that we store all of the URLs for all loaded modules twice, in ESMLoader.moduleMap and in Hooks.#validatedUrls; but I think this is unavoidable if we want this to become the basis for #44710 since we wouldn’t be able to share ESMLoader.moduleMap across the worker boundary. It’s arguable whether we need to do this validation check at all, or at least to do it between every registered hook; if we did the check only once on the final result of the resolve chain, it could stay in ESMLoader. This would be a breaking change to the current Loaders API, however, so I’m avoiding making such a change as part of this PR. (I think we should consider if it’s a change worth making, and the tradeoffs of whether the check is worth the performance cost, but I think that can come in a later PR.)

This should set the stage for another attempt at wrapping the user loaders processing into a worker thread. In #44710 the idea was to put a second ESMLoader class inside the worker and try to send instantiated Module out from the worker back to the main thread, but those proved impossible to serialize. After this PR lands, the hooks.js that this PR adds would be what gets wrapped in a worker, and all of its inputs and outputs should be serializable. (They’re not quite JSON serializable, as the load hook returns an object with a source property that could be a buffer, but these return values are much simpler than Module objects and should be within the capabilities of v8.serialize/deserialize.)

The bulk of hooks.js was just moved from ESMLoader. I avoided making unrelated changes other than adding/editing some comments and JSDocs.

cc @nodejs/loaders

Merge request reports

Loading