
fs: partition readFile to avoid threadpool exhaustion

Problem

Node implements fs.readFile as a call to stat, followed by a C++ -> libuv request to read the entire file based on the size reported by stat.

Why is this bad? It places a potentially large read request on the libuv threadpool, occupying a threadpool thread until the entire read completes. While readFile must ultimately buffer the whole file contents, it can partition the read into smaller chunks along the way (as other read paths already do) and avoid squatting on the threadpool.

If the file is relatively large or stored on a slow medium, reading the entire file in one shot seems particularly harmful, and presents a possible DoS vector.
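
For reference, the current one-shot behaviour is roughly equivalent to the following userland sketch (illustrative only; readFileOneShot is my name, not the actual lib/fs.js internals):

const fs = require('fs');

// Roughly what fs.readFile does today: stat the file, then issue a single
// read request for the entire size. That one libuv work item holds a
// threadpool thread until every byte has been read.
function readFileOneShot(path, cb) {
  fs.open(path, 'r', (err, fd) => {
    if (err) return cb(err);
    fs.fstat(fd, (err, st) => {
      if (err) return fs.close(fd, () => cb(err));
      const buf = Buffer.allocUnsafe(st.size);
      fs.read(fd, buf, 0, st.size, 0, (err, bytesRead) => {
        fs.close(fd, () => cb(err, err ? undefined : buf.subarray(0, bytesRead)));
      });
    });
  });
}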

Downsides to partitioning?

  1. Correctness: I don't think partitioning the read like this raises any additional risk of read-write races on the FS. If the application is concurrently readFile'ing and modifying the file, it will already see funny behavior. Though libuv uses preadv where available, this doesn't guarantee read atomicity in the presence of concurrent writes.

  2. Performance implications:
     a. Downside: Partitioning means that a single large readFile is broken into many "out and back" requests to libuv, introducing overhead.
     b. Upside: In between each "out and back", other work pending on the threadpool can take a turn.
     In short, although partitioning slows down a single large request, it leads to better throughput when the threadpool is handling more than one type of request.

Related

It might be that writeFile has similar behavior. The writeFile path is a bit more complex and I didn't investigate it carefully.

Fix approach

Simple: instead of reading the file in one shot, partition the read into chunks of at most kReadFileBufferLength bytes.
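
A rough userland sketch of the partitioned approach (the real change lives in lib/fs.js; the chunk size here is an assumed value for illustration):

const fs = require('fs');
const kReadFileBufferLength = 8 * 1024;  // assumed value for illustration

function readFilePartitioned(path, cb) {
  fs.open(path, 'r', (err, fd) => {
    if (err) return cb(err);
    fs.fstat(fd, (err, st) => {
      if (err) return fs.close(fd, () => cb(err));
      const buf = Buffer.allocUnsafe(st.size);
      let pos = 0;
      (function readChunk() {
        const len = Math.min(kReadFileBufferLength, st.size - pos);
        if (len === 0)
          return fs.close(fd, (err) => cb(err, buf));
        // Each chunk is a separate libuv request, so other threadpool work
        // can interleave between chunks instead of waiting for the whole file.
        fs.read(fd, buf, pos, len, pos, (err, bytesRead) => {
          if (err) return fs.close(fd, () => cb(err));
          if (bytesRead === 0)
            return fs.close(fd, (err) => cb(err, buf.subarray(0, pos)));
          pos += bytesRead;
          readChunk();
        });
      })();
    });
  });
}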

Test

I introduced a new test to ensure that fs.readFile works for files both smaller and larger than kReadFileBufferLength. It passes.
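
The shape of the added coverage is roughly the following (a sketch, not the exact test file; the sizes assume kReadFileBufferLength is well below 4 MB):

'use strict';
const assert = require('assert');
const fs = require('fs');
const os = require('os');
const path = require('path');

const tmpdir = fs.mkdtempSync(path.join(os.tmpdir(), 'readfile-partition-'));

// One file below the partition size, one well above it.
for (const size of [1024, 4 * 1024 * 1024]) {
  const file = path.join(tmpdir, `file-${size}`);
  const expected = Buffer.alloc(size, 'x');
  fs.writeFileSync(file, expected);
  fs.readFile(file, (err, actual) => {
    assert.ifError(err);
    assert.deepStrictEqual(actual, expected);
  });
}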

Performance:

  1. Machine details: $ uname -a Linux jamie-Lenovo-K450e 4.8.0-56-generic #61~16.04.1-Ubuntu SMP Wed Jun 14 11:58:22 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

  2. Excerpts from lscpu:
     Architecture:        x86_64
     CPU(s):              8
     Thread(s) per core:  2
     Core(s) per socket:  4
     Socket(s):           1
     Model name:          Intel(R) Core(TM) i7-4790 CPU @ 3.60GHz
     CPU MHz:             1499.194

  3. benchmark/fs/readfile.js

Summary

Benchmarks using benchmark/fs/readfile.js are unfavorable for this change. I ran three iterations with my change and three with an unmodified build. Performance within a build was similar across the three iterations, so I report only the third iteration of each:

  • comparable performance on the 1KB file
  • significant performance degradation on the 16MB file (4-5x decrease)

With partitioned read:

$ for i in `seq 1 3`; do /tmp/node-part/node benchmark/fs/readfile.js; done
...
fs/readfile.js concurrent=1 len=1024 dur=5: 42,836.45194074361
fs/readfile.js concurrent=10 len=1024 dur=5: 94,170.12611909183
fs/readfile.js concurrent=1 len=16777216 dur=5: 71.79583090225451
fs/readfile.js concurrent=10 len=16777216 dur=5: 163.98033223174818

Without change:

$ for i in `seq 1 3`; do /tmp/node-orig/node benchmark/fs/readfile.js; done
...
fs/readfile.js concurrent=1 len=1024 dur=5: 43,815.347866646596
fs/readfile.js concurrent=10 len=1024 dur=5: 93,783.59180605657
fs/readfile.js concurrent=1 len=16777216 dur=5: 339.77196820103387
fs/readfile.js concurrent=10 len=16777216 dur=5: 592.325183524534
  4. benchmark/fs/readfile-clogging.js

As discussed above, the readfile.js benchmark doesn't tell the whole story. The contention of this PR is that the 16MB reads clog the threadpool, starving other work that shares it. I've introduced a new benchmark to characterize this.

Benchmark summary: I copied readfile.js and added a small asynchronous zlib operation to compete for the threadpool. If a non-partitioned readFile is clogging the threadpool, there will be a relatively small number of zips.
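
The gist of the contention being measured, as a minimal standalone sketch (not the actual benchmark/fs/readfile-clogging.js; the file path and sizes are placeholders):

const fs = require('fs');
const zlib = require('zlib');

// fs.readFile and zlib.deflate both run on the libuv threadpool, so a
// threadpool clogged by large un-partitioned reads starves the zips.
let reads = 0;
let zips = 0;
const durationMs = 5000;
const end = Date.now() + durationMs;

function readLoop(file) {
  fs.readFile(file, (err) => {
    if (err) throw err;
    reads++;
    if (Date.now() < end) readLoop(file);
  });
}

function zipLoop(payload) {
  zlib.deflate(payload, (err) => {
    if (err) throw err;
    zips++;
    if (Date.now() < end) zipLoop(payload);
  });
}

readLoop('/tmp/benchfile');          // placeholder path to the test file
zipLoop(Buffer.alloc(1024, 'z'));    // small async zlib op competing for threads

setTimeout(() => console.log(`bench ended, reads ${reads} zips ${zips}`),
           durationMs + 100);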

Performance summary:

  • Small file: No difference whether 1 read or 10
  • Large file: With 1 read, some effect (1 thread is always reading, but 3 threads remain for zips). With 10 reads, huge effect (zips get a fair share of the threadpool when reads are partitioned): 61K zips with partitioning, 700 zips without.

Partitioned:

$ for i in `seq 1 3`; do /tmp/node-part/node benchmark/fs/readfile-clogging.js; done
...
bench ended, reads 96464 zips 154582
fs/readfile-clogging.js concurrent=1 len=1024 dur=5: 19,289.8420223229
fs/readfile-clogging.js concurrent=1 len=1024 dur=5: 30,909.421907455828
bench ended, reads 332932 zips 62896
fs/readfile-clogging.js concurrent=10 len=1024 dur=5: 66,572.28049862666
fs/readfile-clogging.js concurrent=10 len=1024 dur=5: 12,575.639939453387
bench ended, reads 149 zips 149574
fs/readfile-clogging.js concurrent=1 len=16777216 dur=5: 29.793230608569676
fs/readfile-clogging.js concurrent=1 len=16777216 dur=5: 29,905.935378334147
bench ended, reads 623 zips 61745
fs/readfile-clogging.js concurrent=10 len=16777216 dur=5: 124.57446300744513
fs/readfile-clogging.js concurrent=10 len=16777216 dur=5: 12,345.553950958118

Non-partitioned:

$ for i in `seq 1 3`; do /tmp/node-orig/node benchmark/fs/readfile-clogging.js; done
...
bench ended, reads 92559 zips 153226
fs/readfile-clogging.js concurrent=1 len=1024 dur=5: 18,510.65052192176
fs/readfile-clogging.js concurrent=1 len=1024 dur=5: 30,641.12621937156
bench ended, reads 332066 zips 62739
fs/readfile-clogging.js concurrent=10 len=1024 dur=5: 66,396.6979771542
fs/readfile-clogging.js concurrent=10 len=1024 dur=5: 12,543.801322137173
bench ended, reads 1554 zips 98886
fs/readfile-clogging.js concurrent=1 len=16777216 dur=5: 310.708121371412
fs/readfile-clogging.js concurrent=1 len=16777216 dur=5: 19,769.932924561737
bench ended, reads 2759 zips 703
fs/readfile-clogging.js concurrent=10 len=16777216 dur=5: 550.9968714783075
fs/readfile-clogging.js concurrent=10 len=16777216 dur=5: 140.38479443398438

Issue: This commit addresses #17047.

Checklist
  • make -j4 test (UNIX), or vcbuild test (Windows) passes
  • tests and/or benchmarks are included
  • commit message follows commit guidelines
Affected core subsystem(s)

fs
