
Adds zip support to the `zlib` module

Rodrigo Muino Tomonari requested to merge github/fork/arcanis/mael/zip into main

Ref #45434

This PR adds a new ZipArchive class to the zlib module, which can be used to read and write zip archives. Its current API looks like this:

const fs = require(`fs`);
const {ZipArchive} = require(`zlib`);

// Creates a new in-memory archive
const zip = new ZipArchive();
zip.addFile(`hello`, fs.readFileSync(__filename));
const data = zip.digest();

fs.writeFileSync(`./archive.zip`, data);

// The data obtained from `digest` can also be reopened
const zip2 = new ZipArchive(data);
console.log(zip2.getEntries());
console.log(zip2.getEntries({withFileTypes: true}));
const content = zip2.readEntry(0);

console.log(content);

Maintenance cost

I kept the feature scope narrow: broad enough to cover most use cases, but without noticeably increasing the maintenance cost or the build cost. A few things that libzip supports have been left out:

  • Opening files directly from the filesystem isn't supported, because it would bypass node:fs. The current API only works with in-memory buffers. According to my tests, this has no negative impact, even compared to the wasm API, which went through file descriptors.

  • Encryption isn't supported, because it's unclear how it should integrate with node:crypto. There's room for a follow-up, but it didn't seem required for the first iteration.
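Since the API only accepts buffers, opening an archive stored on disk means reading it through node:fs first. The sketch below assumes the ZipArchive class proposed in this PR, which may not exist in a given Node.js build; `openArchiveFromDisk` is a hypothetical helper name, not part of the PR.

```javascript
const fs = require('fs');
const zlib = require('zlib');

// Hypothetical helper: read the archive bytes through node:fs, then hand the
// buffer to the ZipArchive constructor proposed in this PR. The constructor
// can be injected, which helps on builds that don't ship ZipArchive.
function openArchiveFromDisk(archivePath, ZipArchive = zlib.ZipArchive) {
  if (typeof ZipArchive !== 'function') {
    throw new Error('ZipArchive is not available in this build of Node.js');
  }
  const bytes = fs.readFileSync(archivePath);
  return new ZipArchive(bytes);
}
```

With the PR applied, `openArchiveFromDisk('./archive.zip').getEntries()` would list the entries written by the earlier example.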

Performance

Keep in mind that raw performance isn't the main reason why zip support is important to have as a native feature. The speedup is nice, the simplified garbage collection is very nice, but the real benefit is having a stable, cross-platform way to bundle files. It will be useful for cache mechanisms, transfer algorithms, user CLI generation, and more.

Still, I ran some basic checks to make sure that no use case regressed. Size of the binary, in bytes, before / after:

before 89604801 85.45 MiB
after  89780257 85.62 MiB (+171 KiB)
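As a quick sanity check of those figures, the delta can be recomputed from the two byte counts. This is only arithmetic on the numbers quoted above; it shows that the reported sizes use 1024-based units.

```javascript
// Byte counts copied from the before/after figures above.
const beforeBytes = 89604801;
const afterBytes = 89780257;

const KIB = 1024;
const MIB = 1024 * 1024;

// Absolute and relative growth of the binary.
const deltaBytes = afterBytes - beforeBytes;
const deltaPercent = (deltaBytes / beforeBytes) * 100;

console.log(`before: ${(beforeBytes / MIB).toFixed(2)} MiB`);
console.log(`after:  ${(afterBytes / MIB).toFixed(2)} MiB`);
console.log(`delta:  ${deltaBytes} bytes (~${Math.round(deltaBytes / KIB)} KiB, ${deltaPercent.toFixed(2)}%)`);
```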

Performance-wise, using Yarn as a benchmark, the results show native being roughly 1.7x to 2.6x faster than wasm, depending on the package (keep in mind the wasm implementation isn't the most popular zip library; projects using jszip will see significantly larger differences):

YARN_EXPERIMENT_NATIVE_ZIPFS=0 PKG=gatsby
➤ YN0000: └ Completed in 15s 806ms
YARN_EXPERIMENT_NATIVE_ZIPFS=1 PKG=gatsby
➤ YN0000: └ Completed in 7s 404ms

YARN_EXPERIMENT_NATIVE_ZIPFS=0 PKG=typescript
➤ YN0000: └ Completed in 13s 676ms
YARN_EXPERIMENT_NATIVE_ZIPFS=1 PKG=typescript
➤ YN0000: └ Completed in 5s 351ms

YARN_EXPERIMENT_NATIVE_ZIPFS=0 PKG=next
➤ YN0000: └ Completed in 5s 923ms
YARN_EXPERIMENT_NATIVE_ZIPFS=1 PKG=next
➤ YN0000: └ Completed in 3s 512ms
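The per-package speedup can be derived from the timings above. The sketch assumes that `YARN_EXPERIMENT_NATIVE_ZIPFS=0` is the wasm run and `=1` is the native run; `toMs` is a small helper written for this example.

```javascript
// Timings copied from the Yarn runs above: [wasm, native] per package.
const runs = {
  gatsby: ['15s 806ms', '7s 404ms'],
  typescript: ['13s 676ms', '5s 351ms'],
  next: ['5s 923ms', '3s 512ms'],
};

// Converts a Yarn-style duration such as "15s 806ms" to milliseconds.
function toMs(duration) {
  const match = /^(\d+)s (\d+)ms$/.exec(duration);
  if (!match) throw new Error(`Unrecognized duration: ${duration}`);
  return Number(match[1]) * 1000 + Number(match[2]);
}

// Speedup = wasm time / native time, so values above 1 favor native.
const speedups = Object.fromEntries(
  Object.entries(runs).map(([pkg, [wasm, native]]) => [pkg, toMs(wasm) / toMs(native)]),
);

for (const [pkg, ratio] of Object.entries(speedups)) {
  console.log(`${pkg}: ${ratio.toFixed(2)}x`);
}
```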

To Do

  • Improve the documentation
  • Add more regression tests
  • Benchmark against the WASM libzip
  • API Bikeshedding
