Skip to content

http_client: prevent domain fronting (Host mismatch) when using proxy

Rodrigo Muino Tomonari requested to merge github/fork/LouisSung/patch-1 into master

Purpose

This PR is to prevent domain fronting warning/misjudgment (Host mismatch with the one in header) issue when using proxy

Explanation

We are informed by the security team in our company that our http requests contain mismatched Destination Host and Header Host.

According to the tutorial, a Host header field must be sent in all HTTP/1.1 request messages. That's why Node.js do the setHeader('Host', hostHeader) with the given host (options.hostname/host or localhost) by default.

However, the Host in the header field should be changed as the destination (actual) host instead of using the proxy (original) host unless user force overwrite it with options.headers['Host'].

Implementation

Since this.path = options.path || '/' set the path as / by default, I decide to check if the path has its own host (i.e., NOT start with a slash) with !this.path.startsWith('/').

Next, use the built-in new URL(path).host to get the host, while the http://${this.path.replace(/^.*:\/\//, '')} is used to prevent invalid URL, such as ❌github.com (vs. ✔️https://github.com) that raises an exception.

UPDATE (Jan 9): MOVE the implementation UP before the IPv6 address check and separate the hostHeader and hostHeaderPort to fit the original implementation. Also, use this.path.match('^.+://') ? this.path : `http://${this.path}` to replace http://${this.path.replace(/^.*:\/\//, '')} in order to keep the proxyURL.protocol used for proxyURL.port

UPDATE(Jan 17): REPLACE StringPrototypeMatch with RegExpPrototypeTest due to better performance as suggested

@aduh95: RegExpPrototypeTest is way more efficient than StringPrototypeMatch IIRC. It might be worth to store the RegEx in global variable to avoid re-building it at each call.

Further discussion

Origin

This issue was found when running the Azure DevOps task with API calls. The sample call stack would like: microsoft/azure-devops-node-api@8.1.1microsoft/typed-rest-client/RestClient@1.2.0microsoft/typed-rest-client/HttpClient (tunnelAgent)request/tunnel-agent#32 koichik/node-tunnel@0.0.4.

UPDATE (Jan 6): The microsoft/typed-rest-client@1.2.0 used tunnel@0.0.4, which also met this issue, and it was fiexd in tunnel@0.0.6 andmicrosoft/typed-rest-client@1.7.2

Actually, the request/tunnel-agent#32 and request/tunnel-agent#40 address this issue (host mismatch instead of domain fronting though) but that library has been out of maintenance since Mar. 2017. Besides, this issue occurs mainly because the way Node.js generating the default Host header field; therefore, it influences not only the request/tunnel-agent and koichik/node-tunnel but MUCH more widely.

UPDATE (Jan 6): The koichik/node-tunnel#29 also addressed this issue in tunnel@v0.0.6 by force adding a header Update dependency to microsoft/azure-devops-node-api@^10.1.0 fix the Host header mismatch issue for API calls

Compare to the curl and requests

Since Node.js doesn't have a built-in proxy support for HttpClient (or I just missed one ...?), the simplest workaround would be the one mentioned on Stack Overflow (put the proxy in the host and original target in the path). However, unlike the curl and requests (Python) that have built-in proxy support, Node.js doesn't handle the Host field in header correctly as expected.

UPDATE (Jan 6): Seems tunnel is a reasonable alternative choice with FOUR MILLION downloads per week (although tunnel-agent has fifteen million...)

I'm not sure if making this change goes against the design concept of the http_client or not, but I think it's important to get correct Host in header even though it doesn't matter or even be noticed in most cases.

The sample code and the result screenshot for GET requests using curl, requests, and http_client with proxy are shown below respectively.

# shell script
curl -v http://www.google.com -x http://XXX.XXX.XXX.XXX
# Python
import requests

if __name__ == '__main__':
    proxies = {'http': 'http://XXX.XXX.XXX.XXX'}
    req = requests.get(url='http://www.google.com', proxies=proxies)
// Node.js
(() => {
  const http = require('http');
  const options = {
    host: 'XXX.XXX.XXX.XXX',  // proxy
    port: 80,
    path: 'http://www.google.com',
  };

  http.get(options, (res) => {
    console.log(`header: ${res.req._header}`);
  });
})();

Merge request reports

Loading