Please note that republishing this article in full or in part is only allowed under the conditions described here.
This is the second article in a series which will explain the evasions done by HTTP Evader. It covers the failure of several firewalls to support some of the content compressions supported by all or most browsers, notably the deflate compression. In short, it is possible to bypass the malware inspection of several firewalls by simply sending a response compressed with deflate:
    HTTP/1.1 200 ok
    Content-Encoding: deflate

    insert compressed malware here
Since decompression is usually cheaper than bandwidth, lots of content gets compressed to optimize delivery. The server tells the browser which compression it used with the Content-Encoding header in the response, e.g.
    HTTP/1.1 200 ok
    Content-Encoding: gzip
    Content-type: text/html

    ... HTML content compressed with gzip
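To make the role of the header concrete, here is a minimal Python sketch of both sides: a server building a gzip-compressed response and a client that decides how to decode the body purely from the Content-Encoding header. The function names `build_response` and `read_body` are made up for this illustration, and the parsing is deliberately simplistic compared to a real HTTP stack.

```python
import gzip


def build_response(html: bytes) -> bytes:
    """Build a raw HTTP response whose body is compressed with gzip."""
    body = gzip.compress(html)
    head = (
        b"HTTP/1.1 200 ok\r\n"
        b"Content-Encoding: gzip\r\n"
        b"Content-Type: text/html\r\n"
        b"Content-Length: " + str(len(body)).encode("ascii") + b"\r\n"
        b"\r\n"
    )
    return head + body


def read_body(response: bytes) -> bytes:
    """Split header and body, then decode the body as the header says."""
    head, _, body = response.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip().lower()
    if headers.get(b"content-encoding") == b"gzip":
        return gzip.decompress(body)
    return body  # no (known) encoding: take the body as-is


page = b"<html><body>hello</body></html>"
assert read_body(build_response(page)) == page
```

Note that the body alone does not say how it has to be decoded. That information lives in the header, which matters for everything that follows.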
The browser tells the server which compressions it supports in the request. Typically it looks like this:
    GET / HTTP/1.0
    Accept-Encoding: gzip, deflate
Practically all current browsers support the compression methods 'gzip' and 'deflate' which are also specified in the HTTP standard. Chrome also has support for 'sdch' and Opera includes support for 'lzma'. But in this article we care only about 'gzip' and 'deflate'.
Lots of readers are probably familiar with the 'gzip' compression scheme since they used the 'gzip' command line program or worked with *.tar.gz files, which are Tar-files compressed with gzip. Gzip is by far the most commonly used encoding for web content and no significant firewall fails to support this encoding. But, with the other major encoding 'deflate' this is different.
Deflate is very similar to gzip. Slightly simplified, one can say that gzip is just deflate with a header at the beginning and a trailer at the end.
And then there is yet another wrapper around the Deflate compression: zlib, which is also used in PNG images. The zlib format is actually the one which the browsers should have implemented, but confusingly the content encoding name for zlib was defined as 'deflate'. This confusion of course caused a lot of misinterpretations, and thus most browsers accept both zlib and raw deflate under the name 'deflate'. Internet Explorer alone implements only raw deflate, which is actually the scheme the standard did not mean.
To summarize, we have the following major compression schemes, which are all based on the same algorithm (DEFLATE):
- gzip - deflate with gzip header and trailer, RFC1952: all browsers
- deflate - deflate with zlib header and trailer, RFC1950: all browsers except IE
- deflate - raw deflate without header and trailer, RFC1951: all browsers
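The relationship between the three formats can be checked with Python's `zlib` module, which selects the wrapper through the `wbits` parameter. This is only a sketch with a made-up, harmless payload. The byte offsets assume the minimal 10-byte gzip header and 2-byte zlib header that zlib writes by default.

```python
import zlib


def compress(data: bytes, wbits: int) -> bytes:
    """Compress with DEFLATE; wbits selects the wrapper around the stream."""
    c = zlib.compressobj(level=9, method=zlib.DEFLATED, wbits=wbits)
    return c.compress(data) + c.flush()


payload = b"hello hello hello hello " * 20  # harmless stand-in content

gzip_body = compress(payload, 31)   # RFC1952: gzip header and trailer
zlib_body = compress(payload, 15)   # RFC1950: zlib header and trailer
raw_body = compress(payload, -15)   # RFC1951: no header, no trailer

# gzip starts with an easy to recognize magic number ...
assert gzip_body[:2] == b"\x1f\x8b"
# ... zlib only has a 2-byte header (0x78 for a 32K window) ...
assert zlib_body[0] == 0x78
# ... and inside both wrappers sits exactly the same raw deflate stream.
assert gzip_body[10:-8] == raw_body   # 10-byte header, CRC32 + size trailer
assert zlib_body[2:-4] == raw_body    # 2-byte header, Adler-32 trailer
```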
Even though the actual compression scheme is the same, there are several security products which support gzip but not deflate. I described this issue in 2014 for the ZScaler URL checker and Comodo Web Inspector in "Hiding Malware in Plain Sight From Online Scanners", and for the Sophos UTM firewall in "Bypassing Malware Scanning in Sophos UTM Web Protection". But while Sophos fixed the issue before I published it, Comodo and ZScaler still fail to support deflate to this date, and I have received no feedback on my bug reports.
Additionally some firewalls are able to deal with Deflate according to RFC1951 as supported by all browsers but fail to deal with Deflate based on RFC1950, which is supported by all browsers except IE.
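A scanner that wants to see what most browsers see has to accept both variants under the name 'deflate'. The following Python sketch shows one simple way to do that: try the zlib format first and fall back to raw deflate. The function name `inflate_like_browser` is hypothetical, and real browsers may decide between the two formats differently.

```python
import zlib


def inflate_like_browser(body: bytes) -> bytes:
    """Decode a 'deflate' body that may be zlib-wrapped or raw."""
    try:
        # RFC1950: zlib header and trailer, what the standard meant
        return zlib.decompress(body, wbits=15)
    except zlib.error:
        # RFC1951: raw deflate, what IE expects
        return zlib.decompress(body, wbits=-15)


data = b"some harmless test content " * 10
raw = zlib.compressobj(wbits=-15)
raw_body = raw.compress(data) + raw.flush()

assert inflate_like_browser(zlib.compress(data)) == data  # RFC1950
assert inflate_like_browser(raw_body) == data             # RFC1951
```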
If a firewall does not support a compression, the sanest reaction would of course be to block the content, because it might be dangerous. But most users would probably not like it if access to innocent content gets denied only because the firewall is too stupid to support the relevant compression. Thus most firewalls take the seemingly user-friendly but mainly attacker-friendly attitude: when in doubt, let the content pass. As a result, simply using a Content-Encoding of deflate makes the malware invisible to several firewalls and security systems.
Even worse, sometimes one does not even need to compress the content. Simply specifying some Content-Encoding the firewall does not understand makes some firewalls stop scanning altogether, on the assumption that they would not be able to decompress the content anyway. Thus this works against several products:
    HTTP/1.0 200 ok
    Content-Encoding: foobar

    ... plain uncompressed malware ...
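The alternative to "when in doubt let the content pass" is a fail-closed decision. As a rough Python sketch, for a hypothetical firewall that can only decode gzip, it could look like this. The names `SUPPORTED_ENCODINGS` and `inspection_verdict` are invented for the example and do not describe any specific product.

```python
from typing import Optional

# Assumption for this sketch: the scanner can decode these and nothing else.
SUPPORTED_ENCODINGS = {"identity", "gzip", "x-gzip"}


def inspection_verdict(content_encoding: Optional[str]) -> str:
    """Return 'scan' if the body can be decoded for analysis, else 'block'."""
    if not content_encoding:
        return "scan"  # no encoding given: scan the body as it is
    tokens = [t.strip().lower() for t in content_encoding.split(",") if t.strip()]
    if all(t in SUPPORTED_ENCODINGS for t in tokens):
        return "scan"
    # Unknown or unsupported encoding: the content cannot be inspected,
    # so it must not be passed through unscanned.
    return "block"


assert inspection_verdict("gzip") == "scan"
assert inspection_verdict("deflate") == "block"
assert inspection_verdict("foobar") == "block"
```

The important design choice is the last branch. Anything the scanner cannot decode is treated as not inspected instead of as harmless.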
Reports I've received from the HTTP Evader tool suggest that several different products are affected, among them products which claim to have "Advanced Threat Protection" capabilities. As far as I could attribute the evasion reports to specific products I've contacted the vendors over this and other evasion problems but will not release specific product names for now.
But what I can say is that Sophos UTM, which claims to be "The Ultimate Security Package" featuring "Advanced Threat Protection" and which is included in the Gartner Top 10 List of Next Generation Firewalls, was affected by this and other problems until I reported them, see "Bypassing Malware Scanning in Sophos UTM Web Protection".
An HTTP response consists of a header and a body. The information about which compression is used is contained in the header, while the body contains the compressed content. A typical firewall implementation just sends the body to the antivirus for analysis, the same way one would send a file to the antivirus. Because gzip contains an easy to recognize header, the antivirus can successfully detect the compression scheme and unpack the content for analysis.
Deflate does not have such a typical header; instead the compressed content looks mostly random. Thus the antivirus cannot detect by itself that the content is compressed. Instead the antivirus would need to be informed about this, or the content would need to be decompressed before it is sent to the antivirus.
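What an antivirus can learn from the body alone can be sketched in a few lines of Python. The function `sniff` is a hypothetical illustration, not the detection logic of any real product. It recognizes gzip by its magic number and applies the weak 2-byte header check RFC1950 defines for zlib, while raw deflate offers nothing to match on.

```python
import gzip
import zlib


def sniff(body: bytes) -> str:
    """Guess the compression from the first bytes of the body only."""
    # gzip: magic number 1f 8b, followed by compression method 8 (deflate)
    if body[:3] == b"\x1f\x8b\x08":
        return "gzip"
    # zlib: low nibble of byte 0 is method 8, window size at most 32K,
    # and the first two bytes read as a 16-bit number are divisible by 31.
    # This is a much weaker signature than the gzip magic.
    if (
        len(body) >= 2
        and body[0] & 0x0F == 8
        and body[0] >> 4 <= 7
        and ((body[0] << 8) | body[1]) % 31 == 0
    ):
        return "zlib"
    # Raw deflate has no header at all, so nothing can be recognized.
    return "unknown"


data = b"harmless example content " * 20
raw = zlib.compressobj(wbits=-15)
raw_body = raw.compress(data) + raw.flush()

assert sniff(gzip.compress(data)) == "gzip"
assert sniff(zlib.compress(data)) == "zlib"
assert sniff(raw_body) == "unknown"
```

For the raw deflate body the only reliable source of information is the Content-Encoding header, which a scanner that gets just the body never sees.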
As for the firewalls which handle raw deflate but not the zlib variant, I can only guess that their authors are either simply not aware that there are two deflate schemes, or that they think it is only worth implementing the one supported by all browsers, because no sane web developer would use a compression which is not available in all browsers. Unfortunately attackers don't follow this idea and might happily use an attack which works with most but not all browsers.
Some products try to avoid the problem in a different way: they strip any compression schemes which they don't support from the Accept-Encoding header in the browser's request, in the hope that a standards-conforming server will then not use these compression schemes. But hope is not enough, because the attacker can either run its own server or provide compressed content by simply using some PHP, as shown in "Hiding Malware in Plain Sight From Online Scanners".
Thus a sane firewall should not just hope and trust the server but actually check that the compression was not used. But all of these firewalls seem to assume that a bad, bad hacker would still adhere to noble standards, and they don't verify that no compression was used. This is actually the same error the ZScaler URL checker and Comodo Web Inspector make, since they blindly trust the server to not deliver compressed content just because they did not announce compression support in the request, see "Hiding Malware in Plain Sight From Online Scanners".
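Stripping the request header and verifying the response belong together. A minimal Python sketch of both halves, again for a hypothetical firewall that only supports gzip, might look like this. `SUPPORTED`, `rewrite_accept_encoding` and `response_is_acceptable` are names invented for the illustration.

```python
from typing import Optional

# Assumption for this sketch: gzip is the only compression the scanner decodes.
SUPPORTED = ("gzip",)


def rewrite_accept_encoding(value: str) -> str:
    """Drop every encoding from Accept-Encoding the scanner cannot decode."""
    offered = [t.split(";")[0].strip().lower() for t in value.split(",")]
    return ", ".join(t for t in offered if t in SUPPORTED)


def response_is_acceptable(content_encoding: Optional[str]) -> bool:
    """Check that the server really stayed within the offered encodings."""
    if not content_encoding:
        return True
    used = [t.strip().lower() for t in content_encoding.split(",") if t.strip()]
    return all(t in SUPPORTED or t == "identity" for t in used)


# Step 1: the request only offers what the scanner understands.
assert rewrite_accept_encoding("gzip, deflate") == "gzip"

# Step 2: the response is checked anyway, because the attacker's server
# is free to ignore the request header.
assert response_is_acceptable("gzip")
assert not response_is_acceptable("deflate")
```

Only the second step protects against a malicious server. The first one merely reduces the number of innocent sites that would otherwise get blocked.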
While gzip is by far the most used compression method, tests with the Alexa top 10,000 servers show that there are actually servers out there which use the deflate method for innocent data. Thus simply concluding that a site is malicious because it uses deflate is not possible.
Since it should be obvious that you cannot trust the marketing of the vendor, you had better check for yourself whether these Advanced Threat Protection claims are just bragging. If you are behind some firewall claiming to detect malware, then all you need is a browser: follow the instructions to test against the HTTP Evader tool.