Skip to content

fs: add read(buffer[, options]) versions

Refs: https://github.com/nodejs/node/pull/42601 (writing counterparts)

This PR does:

  • Adding fs.read(fd, buffer[, options], callback)
  • Adding filehandle.read(buffer[, options])
  • Aligning fs.readSync code (changes in UB corner cases) 1 (offloaded due to potential breakage)

These changes would allow to:

  • Align writing methods with uniform buffer[, options] signature 2
  • Align signatures of reading methods
  • Which also means not only read-write interoperability within each of three APIs, but also between any of them
  • Achieve a bit more consistent UB behaviour, when documentation implies similarity ("For detailed information, see the documentation of the asynchronous version of this API")
  • Have it fully completed without waiting for DEP0162 EOL
  • Harden fs.readSync typechecking 3
  • (optional) Introduce params4 signatures later, if needed 5
  • (optional) Deprecate and remove params signatures later, if strongly needed 6
  • (opinionated) Maybe make usage of fs.read and fh.read less error-prone 6
Why current behaviour is questionable

Default buffer in fs.read and filehandle.read is Buffer.alloc(16384). It's a pretty reasonable value for positioned argument (i.e. if it's used only when all further arguments are unset as well), but it's a footgun when length is defined.

#!/usr/bin/env node
import {open} from 'fs/promises';
async function readSubstring(filepath, len = 0, pos = 0) {
	const fh = await open(filepath, 'r');
	const res = await fh.read({ length: Number(len), position: Number(pos) });
	const str = String(res.buffer);
	return str;
}
const substr = await readSubstring(...process.argv.slice(2));
console.log(substr);
$ echo 123456789 > /tmp/tst
$ readsubstr.mjs /tmp/tst 3 5
678
$ readsubstr.mjs /tmp/tst 3 5 | wc -c
16385

Without reading the docs properly, I'd totes assume that it allocates length + offset||0 number of bytes by default. Because of that, buffer has to be allocated explicitly. Which makes [buffer[, options]] signature more reasonable and less misleading.

  1. Changes in UB are about interpreting false/null/undefined values as invalid or default. I think, the best way here is to interpret typeof offsetOrOptions === 'object' as indicator of attempt to use "named params" version. null means default values. false and undefined means offset === 0. Everything else goes to offset's integer validation. Exception: Callback API methods will still check the number of arguments as well, to keep callback strictly positioned.

  2. Refs: https://github.com/nodejs/node/pull/42601#discussion_r846095440

  3. Current implementation of fs.readSync interprets any value as options object. For example, passing a primitive string will use its length property without any warning.

  4. I use the following terminology here: options == { offset, length, position }, params == { buffer, ...options } == { buffer, offset, length, position }

  5. Theoretical reason to include buffer is its strong connection to offset and length. If we specify nonzero offset, we usually imply a non-arbitrary buffer. However, I don't have strong arguments nor good examples; and it can be worked around anyway.

  6. See "Why current behaviour is questionable" foldablie 2

Merge request reports

Loading