http_client: prevent domain fronting (Host mismatch) when using proxy
Purpose
This PR is to prevent domain fronting warning/misjudgment (Host mismatch with the one in header) issue when using proxy
Explanation
We are informed by the security team in our company that our http requests contain mismatched Destination Host and Header Host.
According to the tutorial, a Host header field must be sent in all HTTP/1.1 request messages
.
That's why Node.js do the setHeader('Host', hostHeader)
with the given host
(options.hostname/host or localhost) by default.
However, the Host
in the header field should be changed as the destination (actual) host
instead of using the proxy (original) host
unless user force overwrite it with options.headers['Host']
.
Implementation
Since this.path = options.path || '/'
set the path as /
by default, I decide to check if the path has its own host (i.e., NOT start with a slash) with !this.path.startsWith('/')
.
Next, use the built-in new URL(path).host
to get the host, while the is used to prevent invalid URL, such as http://${this.path.replace(/^.*:\/\//, '')}
❌github.com
(vs. ✔️https://github.com
) that raises an exception.
UPDATE (Jan 9): MOVE the implementation UP before the IPv6 address check and separate the
hostHeader
andhostHeaderPort
to fit the original implementation. Also, usethis.path.match('^.+://') ? this.path : `http://${this.path}`
to replacein order to keep thehttp://${this.path.replace(/^.*:\/\//, '')}
proxyURL.protocol
used forproxyURL.port
UPDATE(Jan 17): REPLACE
StringPrototypeMatch
withRegExpPrototypeTest
due to better performance as suggested@aduh95:
RegExpPrototypeTest
is way more efficient thanStringPrototypeMatch
IIRC. It might be worth to store the RegEx in global variable to avoid re-building it at each call.
Further discussion
Origin
This issue was found when running the Azure DevOps task with API calls.
The sample call stack would like: microsoft/azure-devops-node-api@8.1.1
→ microsoft/typed-rest-client/RestClient@1.2.0
→ microsoft/typed-rest-client/HttpClient (tunnelAgent)
→ request/tunnel-agent#32
koichik/node-tunnel@0.0.4
.
UPDATE (Jan 6): The
microsoft/typed-rest-client@1.2.0
usedtunnel@0.0.4
, which also met this issue, and it was fiexd intunnel@0.0.6
andmicrosoft/typed-rest-client@1.7.2
Actually, the request/tunnel-agent#32 and request/tunnel-agent#40 address this issue (host mismatch instead of domain fronting though) but that library has been out of maintenance since Mar. 2017.
Besides, this issue occurs mainly because the way Node.js generating the default Host header field; therefore, it influences not only the request/tunnel-agent
and koichik/node-tunnel
but MUCH more widely.
UPDATE (Jan 6): The koichik/node-tunnel#29 also addressed this issue in
tunnel@v0.0.6
by force adding a header Update dependency tomicrosoft/azure-devops-node-api@^10.1.0
fix the Host header mismatch issue for API calls
curl
and requests
Compare to the Since Node.js doesn't have a built-in proxy support for HttpClient (or I just missed one ...?), the simplest workaround would be the one mentioned on Stack Overflow (put the proxy in the host
and original target in the path
).
However, unlike the curl
and requests
(Python) that have built-in proxy support, Node.js doesn't handle the Host field in header correctly as expected.
UPDATE (Jan 6): Seems
tunnel
is a reasonable alternative choice with FOUR MILLION downloads per week (althoughtunnel-agent
has fifteen million...)
I'm not sure if making this change goes against the design concept of the http_client
or not, but I think it's important to get correct Host in header even though it doesn't matter or even be noticed in most cases.
The sample code and the result screenshot for GET requests using curl
, requests
, and http_client
with proxy are shown below respectively.
# shell script
curl -v http://www.google.com -x http://XXX.XXX.XXX.XXX
# Python
import requests
if __name__ == '__main__':
proxies = {'http': 'http://XXX.XXX.XXX.XXX'}
req = requests.get(url='http://www.google.com', proxies=proxies)
// Node.js
(() => {
const http = require('http');
const options = {
host: 'XXX.XXX.XXX.XXX', // proxy
port: 80,
path: 'http://www.google.com',
};
http.get(options, (res) => {
console.log(`header: ${res.req._header}`);
});
})();