The previous parts of this series looked at firewalls and browsers as black boxes which just behave the way they do for unknown reasons. For this part I took a closer look at the source code of Chromium and Firefox. This way I found even more ways to construct HTTP which is insanely broken but still gets accepted by the browsers. And, not surprisingly, lots of firewalls simply pass such a broken protocol through without being able to analyze the payload properly. Just as an example of how broken HTTP can be while still being accepted by Chrome:
    httP/1A.1B 200 ok
    Content\000-Encoding: defl\000ate

    first part of malware compressed with zlib
    followed by the rest compressed with raw deflate
The previous article in this series was Part 9 - How to Fix the Inspection Bypasses
A normal status line looks like this:
    HTTP/major.minor status-code description
    e.g. HTTP/1.1 200 ok
Firefox treats the HTTP version as two numbers which get parsed with atoi. Chrome instead takes only the first digit each of major and minor. Thus 'HTTP/1.010' gets treated like 'HTTP/1.0' by Chrome (chunking not possible) while Firefox sees a version 1.10 which gets treated like 'HTTP/1.1' (chunking possible), because the minor version number 10 is greater than 1. Similarly, Chrome sees a version of 0.1 (no chunking) in 'HTTP/01.1' while Firefox sees 1.1 (chunking).
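The two parsing strategies can be sketched like this. This is an illustration only, not the browsers' real code; it assumes a Unix system where ctypes can reach libc, so the same C atoi that Firefox relies on is called:

```python
import ctypes

# Assumption: ctypes.CDLL(None) exposes the libc of the running process (Unix)
libc = ctypes.CDLL(None)
libc.atoi.restype = ctypes.c_int
libc.atoi.argtypes = [ctypes.c_char_p]

major, minor = b"1", b"010"        # from the version string "HTTP/1.010"

# Firefox-style: both numbers go through atoi
ff = (libc.atoi(major), libc.atoi(minor))   # (1, 10) -> handled like HTTP/1.1

# Chrome-style: only the first digit of each number counts
cr = (int(major[:1]), int(minor[:1]))       # (1, 0)  -> handled like HTTP/1.0

print(ff, cr)
```

The same input thus yields two different protocol versions, and with them two different framing rules.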
Firefox also uses the atoi function when parsing the status code. The returned integer is then cast to an unsigned short (16 bit), which means that a status code of -65436 gets treated the same as status code 100, i.e. as a preliminary response after which the real response follows.
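The effect of that cast can be reproduced with plain integer arithmetic (a sketch of the described behavior, not Firefox's code):

```python
code = -65436                 # what atoi returns for "-65436"
as_u16 = code & 0xFFFF        # effect of the cast to unsigned short (16 bit)
print(as_u16)                 # 100 -> treated as a preliminary response
```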
Apart from that both Firefox and Chrome accept 'http' or even 'hTTp' instead of 'HTTP' and Firefox accepts 'ICY' in place of 'HTTP/1.0' which is a reference to the Shoutcast protocol.
Chrome replaces each end of a (possibly folded) header line with \000 for easier parsing later. To be on the safe side it first eliminates any existing \000 in the header, which should never occur there anyway in proper HTTP. This way the following header lines are all the same for Chrome:
    Transfer-Encoding: chunked
    Transfer-Encoding: chu\000nked
    Transfer-\000Encoding: chunked
    \000Transfer-Encoding: chunked
    \000Transfer-\000Encoding\000:\000chun\000ked\000
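A quick sketch of this collapsing effect: once embedded NUL bytes are stripped, such variants become the very same header line:

```python
# Variants of the header line with NUL bytes embedded at different positions
variants = [
    b"Transfer-Encoding: chunked",
    b"Transfer-Encoding: chu\x00nked",
    b"Transfer-\x00Encoding: chunked",
    b"\x00Transfer-Encoding: chunked",
]
# Stripping the NUL bytes, as Chrome is described to do, leaves one canonical line
stripped = {v.replace(b"\x00", b"") for v in variants}
print(stripped)
```

The last variant from the list above additionally loses the space after the colon, but header parsing tolerates a missing space there anyway.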
As line end Chrome accepts any combination of \r and \n as long as it is not the end of the HTTP header part. Thus \n and \r\n are accepted, but also \r or even \n\r\r\n. Interestingly, the last one is considered the end of the HTTP header part by Safari.
With chunked transfer encoding each chunk is prefixed by its size, given as a hexadecimal number. Firefox uses the strtoul function to parse these hex values. This way it also accepts any amount of white space in front of the size, including \t, \r, \n, \v or \f. It also means that the size can be prefixed with '0x' and that even negative numbers are accepted, which then get cast to unsigned long. Thus these size specifications all mean the same to Firefox:
    f
    0xf
    \v\f\t\r\n 0xf
    \v\f\t\r\n +0xf
    \v\f\t\t\n -0xfffffff0              (32 bit Firefox)
    \v\f\t\t\n -0xfffffffffffffff0      (64 bit Firefox)
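These quirks are all standard C strtoul behavior and can be checked directly, again assuming a Unix system where ctypes can load libc:

```python
import ctypes

# Assumption: ctypes.CDLL(None) exposes the libc of the running process (Unix)
libc = ctypes.CDLL(None)
libc.strtoul.restype = ctypes.c_ulong
libc.strtoul.argtypes = [ctypes.c_char_p, ctypes.c_void_p, ctypes.c_int]

# Leading whitespace and an optional 0x prefix are accepted in base 16
print(libc.strtoul(b" \v\f\t\r\n 0xf", None, 16))   # 15

# A minus sign is accepted too; the result wraps around as unsigned long
bits = 8 * ctypes.sizeof(ctypes.c_ulong)
print(libc.strtoul(b"-0x10", None, 16) == (1 << bits) - 16)   # True
```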
As described in Part 2 - Deflate Compression there are two kinds of compression schemes commonly accepted as 'deflate': zlib (RFC 1950) and raw deflate (RFC 1951). Both Chrome and Firefox try to decompress with zlib first and if this fails they retry with raw deflate. But they simply assume that the failure will happen within the initial input buffer and don't verify this assumption. By triggering a zlib error through an invalid checksum after the first part has already been decompressed, one can make both browsers restart the decompression with raw deflate. This needs the right timing but can be triggered reliably like this:
    HTTP/1.1 200 ok
    Content-Encoding: deflate

    first part compressed with zlib but missing the final checksum
    .... sleep(1) so that the first packets get processed ...
    remaining data compressed with raw deflate
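The fallback itself can be sketched in a few lines. Note that this sketch always restarts from the beginning of the data; the browsers do the same thing on a streaming buffer, which is exactly why the mid-stream restart trick above works:

```python
import zlib

def inflate_like_a_browser(data: bytes) -> bytes:
    # Sketch of the described fallback, not the browsers' actual code:
    # try the zlib format first, then restart with raw deflate on error.
    try:
        return zlib.decompress(data)                    # zlib, RFC 1950
    except zlib.error:
        return zlib.decompress(data, -zlib.MAX_WBITS)   # raw deflate, RFC 1951

payload = b"some response body"
co = zlib.compressobj(wbits=-zlib.MAX_WBITS)            # produce raw deflate
raw = co.compress(payload) + co.flush()

print(inflate_like_a_browser(zlib.compress(payload)) == payload)  # True
print(inflate_like_a_browser(raw) == payload)                     # True
```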
If the content is compressed with gzip, Chrome will consider any body data after the end of the compressed stream as uncompressed data and treat it as a valid part of the response:
    HTTP/1.1 200 ok
    Content-Encoding: gzip

    first part compressed with gzip
    remaining data not compressed
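This works because the gzip format itself marks where the compressed stream ends, so a decompressor can tell exactly which trailing bytes were left over. A small sketch of the effect:

```python
import gzip
import zlib

# A body consisting of a gzip stream followed by plain, uncompressed bytes
body = gzip.compress(b"first part ") + b"plain trailing data"

d = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)   # 16+ selects the gzip format
out = d.decompress(body)
print(out)             # b'first part '
print(d.unused_data)   # b'plain trailing data' -- what Chrome, as described
                       # above, would append to the response as-is
```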
This was only a small part of the laziness shown by the browsers when parsing the content. In my opinion most of these examples can no longer be considered robustness. While robustness aims to deal with malformed data which might occur in practice, in these cases the browsers accept nonsensical data which should never occur in practice. From the perspective of the browsers this does no harm, because they still treat valid data in a valid way. But any firewall or IDS in between which tries to analyze the traffic will probably interpret the data differently in lots of cases. This makes a bypass possible unless such data are either blocked by the inspection device or sanitized before being analyzed and forwarded.
If you are curious how badly your browser or your IPS behaves with uncommon or invalid responses from the web server, head over to the HTTP Evader test site. To read more about the details of specific bypasses go to the main description of HTTP Evader.