module: CJS exports detection for modules with __esModule export (!33416)

Rodrigo Muino Tomonari requested to merge github/fork/guybedford/cjs-export-detection into master May 15, 2020

This PR adds named exports support for transpiled CommonJS modules. Based on feedback from the discussion at https://github.com/nodejs/node/pull/33416#issuecomment-665993427, detection is restricted to sources that define __esModule. A very fast Wasm-based analysis runs before CJS execution, allowing import { name } from 'transpiled-esm-to-cjs' to work as most JS users would expect, without incurring any significant performance penalty.

This analysis is cached in a WeakMap keyed by the module object. The same entry also stores the source text that was read early for the analysis, so that it can be reused on the actual load.
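The caching idea can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: analysisCache, analyzeOnce, the record shape, and the stubbed readSource / analyze functions are all hypothetical names chosen for the example.

```javascript
// Hypothetical names throughout: analysisCache, analyzeOnce and the shape of
// the cached record are illustrative, not Node.js internals.
const analysisCache = new WeakMap();

// `readSource` stands in for reading the file; `analyze` stands in for the
// lexer and returns an array of export names for a source string.
function analyzeOnce(moduleObject, readSource, analyze) {
  let entry = analysisCache.get(moduleObject);
  if (entry === undefined) {
    const source = readSource(moduleObject.filename);
    // Keep the source next to the result so the later load can reuse it
    // instead of reading the file a second time.
    entry = { source, exportNames: analyze(source) };
    analysisCache.set(moduleObject, entry);
  }
  return entry;
}

// Demo with stubbed I/O so the sketch runs on its own.
let reads = 0;
const fakeModule = { filename: '/virtual/dep.js' };
const readSource = () => { reads++; return 'exports.a = 1;'; };
const analyze = (src) => (src.includes('exports.a') ? ['a'] : []);

const first = analyzeOnce(fakeModule, readSource, analyze);
const second = analyzeOnce(fakeModule, readSource, analyze);
console.log(reads, first === second, first.exportNames);
```

Because the map is weak, an entry does not keep its module object alive: once the module object is unreachable, the cached source and export list can be collected with it.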

This is very similar to the approach TypeScript uses to detect CommonJS named exports during the binding phase, as we discussed in the last modules meeting with @weswigham.

The major concern with a parsing approach is the performance overhead of running a full parse of every CommonJS module that is imported from an ES module. Acorn parse time usually ranges from tens of milliseconds for standard-size source files to hundreds of milliseconds for large sources. Scaling this up to multiple dependencies could introduce some loading overhead.

This PR incorporates cjs-module-lexer (https://github.com/guybedford/cjs-module-lexer), a fork of es-module-lexer, to handle this exports analysis in WebAssembly, so that the list of named exports for a CJS module is known at link time via a very fast Wasm lexer pass over the source (sub-millisecond for most sources).

es-module-lexer is used in many current ES module tooling projects, and has been battle tested against a wide range of real-world JS sources over the past two years. It supports the full JS grammar, including comments, strings, template strings and regex / division operator ambiguity.

cjs-module-lexer forks this to extract CommonJS named exports based on the rules outlined at https://github.com/guybedford/cjs-module-lexer#supported. It is a lexing matcher for common exports patterns (exports.name and similar), and it includes a Webpack heuristic as well.

One limitation is that some minified sources that obscure the exports binding cannot be analyzed: the analysis does no value tracking at all and only detects common token patterns for exports and similar.

In testing, the approach has been shown to work well for standard TypeScript and Babel transpilations. Detected names are always filtered to valid JS identifiers.
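A filter of this kind might look like the sketch below. The exact rule used by the lexer is not stated here, so this example assumes an approximation based on the Unicode ID_Start and ID_Continue properties and does not exclude reserved words. IDENTIFIER and filterIdentifiers are hypothetical names.

```javascript
// Assumption of this sketch: "valid JS identifier" is approximated with the
// Unicode ID_Start / ID_Continue properties; reserved words are not excluded.
const IDENTIFIER = /^[\p{ID_Start}$_][\p{ID_Continue}$\u200c\u200d]*$/u;

function filterIdentifiers(names) {
  return names.filter((name) => IDENTIFIER.test(name));
}

// Names that cannot appear in `import { name }` are dropped.
const candidates = ['greet', '$helper', 'not-valid', '1st', 'caf\u00e9', ''];
const kept = filterIdentifiers(candidates);
console.log(kept);
```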

The major edge case where this differs from current bundler semantics is that the default export is retained as the full module.exports instance, for backwards compatibility and consistency. I think this is important for Node.js semantic consistency.
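The resulting namespace shape can be sketched with a plain object. sketchNamespace is a hypothetical helper written for this example, not an API of Node.js; it copies values once and says nothing about how or when the real bindings are populated.

```javascript
// Hypothetical helper: builds a plain object shaped like the namespace an
// ES module importer would observe under the semantics described above.
function sketchNamespace(moduleExports, detectedNames) {
  // `default` is always the whole module.exports object.
  const namespace = { default: moduleExports };
  for (const name of detectedNames) {
    // A detected `default` name never replaces module.exports.
    if (name !== 'default') namespace[name] = moduleExports[name];
  }
  return namespace;
}

// Roughly what a transpiler emits for a default export plus a named export.
const moduleExports = {};
Object.defineProperty(moduleExports, '__esModule', { value: true });
moduleExports.default = function main() { return 'main'; };
moduleExports.helper = function helper() { return 'helper'; };

const ns = sketchNamespace(moduleExports, ['default', 'helper']);

console.log(ns.default === moduleExports); // default import is module.exports
console.log(ns.default.default());         // transpiled default is one level down
console.log(ns.helper());                  // named export is available directly
```

Under this shape, a default import of such a module yields the module.exports object, and the transpiled default export is reached through its default property, whereas the named export helper can be imported by name.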

//cc @nodejs/modules-active-members

Checklist

make -j4 test (UNIX), or vcbuild test (Windows) passes
tests and/or benchmarks are included
documentation is changed or added
commit message follows commit guidelines
