Please note that republishing this article in full or in part is only allowed under the conditions described here.
There are several sites which offer to scan a URL for malware. One would expect these sites to emulate a real browser well enough that their rating can be trusted. Unfortunately this is not the case.
Building on research about unusual Content-Encoding headers which I published about 17 months ago, I had a closer look at the following major online scanners:
- Virustotal
- ZScaler
- Comodo Web Inspector
For testing I compressed the content in the following ways and announced the compression with the Content-Encoding header:
- gzip, which is supported by all major browsers
- raw deflate as defined in RFC 1951, which is supported by all major browsers
- deflate in the zlib format as defined in RFC 1950, which is supported by at least Google Chrome and Firefox, but not by Microsoft Internet Explorer
- content compressed twice with deflate, which is supported by at least Google Chrome and Firefox, but not by Microsoft Internet Explorer
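The four variants listed above could be generated roughly like this. This is only a sketch, assuming PHP 7.1 or later with the zlib extension; `$payload` and `$variants` are names made up for the illustration, and the payload is a harmless stand-in rather than the EICAR file used in the tests.

```php
<?php
// Hypothetical stand-in payload (the tests in this article deliver EICAR instead).
$payload = "harmless test payload\n";

// Each entry: label, Content-Encoding header value, encoded body.
$variants = [
    ['gzip',          'gzip',             gzencode($payload)],            // gzip container (RFC 1952)
    ['raw deflate',   'deflate',          gzdeflate($payload)],           // raw deflate (RFC 1951)
    ['zlib',          'deflate',          gzcompress($payload)],          // zlib format (RFC 1950)
    ['deflate twice', 'deflate, deflate', gzdeflate(gzdeflate($payload))] // compress(compress(content))
];

// Print each body base64 encoded so it can be pasted into a small PHP page.
foreach ($variants as [$label, $header, $body]) {
    printf("%-14s Content-Encoding: %-17s %s\n", $label, $header, base64_encode($body));
}
```

Note that the raw deflate and the zlib variant are both announced with the same header value, `deflate`; only the bytes of the body differ.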
To simulate an attacker who tries to be as anonymous as possible I used one of the many sites offering free PHP hosting, because that is all that is needed to add custom HTTP headers. All of the tests deliver the harmless EICAR test virus, which should be detected by all virus scanners.
The following small PHP page delivers the EICAR test virus compressed with gzip. All major browsers understand this format. The good news is that all of the tested online malware scanners also understand it and detect the virus. Unfortunately this is most of the good news from this research.
<?php
header('HTTP/1.0 200 ok');
header('Content-type: text/plain');
header('Content-Encoding: gzip');
// EICAR compressed with gzip and base64 encoded
echo base64_decode('H4sIAPVklFQAA4sw9VcMUHVwDIg2iQmIijA10QiI0zR3dtY0r1Vx9XR2DNINDnH0c3EMctF19AvxDPMMCg3WDXENDtF18/RxVVTx0PbQAgA8z1FoRAAAAA==');
// exit explicitly so that the free PHP hoster has no chance to append its own content
exit(0);
?>
The next two PHP pages use either raw deflate (RFC 1951), as supported by all major browsers, or zlib (RFC 1950), as supported by at least Google Chrome and Firefox. Surprisingly, only Virustotal understands this compression scheme, while both ZScaler and Comodo Web Inspector fail to detect the malware.
A possible reason is that the scanners look only at the content (the HTTP body) and ignore the Content-Encoding information in the HTTP header. But while gzip compression can be detected from a few magic bytes at the beginning of the data (the gzip header), deflate compression cannot be detected this way.
<?php
header('HTTP/1.0 200 ok');
header('Content-type: text/plain');
header('Content-Encoding: deflate');
// EICAR compressed with RFC 1951 (raw deflate)
echo base64_decode('izD1VwxQdXAMiDaJCYiKMDXRCIjTNHd21jSvVXH1dHYM0g0OcfRzcQxy0XX0C/EM8wwKDdYNcQ0O0XXz9HFVVPHQ9tACAA==');
exit(0);
?>

<?php
header('HTTP/1.0 200 ok');
header('Content-type: text/plain');
header('Content-Encoding: deflate');
// EICAR compressed with RFC 1950 (zlib)
echo base64_decode('eJyLMPVXDFB1cAyINokJiIowNdEIiNM0d3bWNK9VcfV0dgzSDQ5x9HNxDHLRdfQL8QzzDAoN1g1xDQ7RdfP0cVVU8dD20AIAdFQSDw==');
exit(0);
?>
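The point about magic bytes can be illustrated with a small content sniffer. This is a sketch of what guessing from the body alone might look like, not the code of any of the tested scanners; `sniffCompression` is a hypothetical helper. gzip has a fixed two-byte signature, the zlib format has a two-byte header that gives only a weak hint, and raw deflate has no header at all.

```php
<?php
/**
 * Guess the compression format from the first bytes of a body.
 * Hypothetical helper for illustration only.
 */
function sniffCompression(string $body): string
{
    if (strlen($body) < 2) {
        return 'unknown';
    }
    $b0 = ord($body[0]);
    $b1 = ord($body[1]);

    // gzip (RFC 1952) always starts with the magic bytes 0x1f 0x8b.
    if ($b0 === 0x1f && $b1 === 0x8b) {
        return 'gzip';
    }
    // zlib (RFC 1950): the low nibble of byte 0 is 8 (deflate) and the first
    // two bytes, read as a big-endian number, are a multiple of 31.
    // This is a weak check which unrelated data can pass by chance.
    if (($b0 & 0x0f) === 8 && (($b0 << 8) | $b1) % 31 === 0) {
        return 'zlib';
    }
    // Raw deflate (RFC 1951) has no header, so there is nothing to recognize.
    return 'unknown';
}

$payload = "harmless test payload\n"; // stand-in, not EICAR
echo sniffCompression(gzencode($payload)), "\n";   // gzip
echo sniffCompression(gzcompress($payload)), "\n"; // zlib
echo sniffCompression(gzdeflate($payload)), "\n";  // unknown
```

A scanner which relies on this kind of guess instead of the Content-Encoding header has no way to tell a raw deflate body from arbitrary binary data.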
In the last test the content is compressed twice, i.e. compress(compress(content)). While this looks like a strange feature (it can actually make sense when different types of compression are combined), it is part of the standard and is supported by at least Google Chrome and Firefox. Microsoft Internet Explorer does not support it and assumes no compression at all in this case. Many Intrusion Detection Systems do not support it either (see previous research): they assume only a single compression and ignore the rest. Virustotal, ZScaler and Comodo Web Inspector all fail to detect the malware here as well.
<?php
header('HTTP/1.0 200 ok');
header('Content-type: text/plain');
header('Content-Encoding: deflate, deflate');
// EICAR compressed twice with raw deflate
echo base64_decode('AUYAuf+LMPVXDFB1cAyINokJiIowNdEIiNM0d3bWNK9VcfV0dgzSDQ5x9HNxDHLRdfQL8QzzDAoN1g1xDQ7RdfP0cVVU8dD20AIACg==');
exit(0);
?>
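For comparison, a client which honors the Content-Encoding header has to undo every listed coding, starting with the last one. The following is a minimal sketch of that idea, assuming PHP with the zlib extension; `decodeBody` is a hypothetical helper and does not describe how any browser or scanner is actually implemented.

```php
<?php
/**
 * Undo all codings named in a Content-Encoding header value.
 * Hypothetical helper for illustration; returns null if a layer cannot be decoded.
 */
function decodeBody(string $body, string $contentEncoding): ?string
{
    // Codings are listed in the order they were applied, so undo them last to first.
    $codings = preg_split('/\s*,\s*/', strtolower(trim($contentEncoding)), -1, PREG_SPLIT_NO_EMPTY);

    foreach (array_reverse($codings) as $coding) {
        if ($coding === 'gzip' || $coding === 'x-gzip') {
            $decoded = @gzdecode($body);
        } elseif ($coding === 'deflate') {
            // Both the zlib format (RFC 1950) and raw deflate (RFC 1951) are
            // sent as "deflate", so try zlib first and fall back to raw deflate.
            $decoded = @gzuncompress($body);
            if ($decoded === false) {
                $decoded = @gzinflate($body);
            }
        } else {
            return null; // unknown coding: better to flag the response than to guess
        }
        if ($decoded === false) {
            return null; // announced coding does not match the body
        }
        $body = $decoded;
    }
    return $body;
}

$payload = "harmless test payload\n"; // stand-in, not EICAR
$twice   = gzdeflate(gzdeflate($payload));
var_dump(decodeBody($twice, 'deflate, deflate') === $payload); // bool(true)
```

Returning null instead of the undecoded body is a deliberate choice in this sketch: a response which cannot be decoded as announced is better treated as suspicious than scanned as if it were plain content.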
While file-based malware scanners already have enough problems of their own in reliably detecting malware, it gets much worse once you add the seemingly simple task of retrieving the content from a web site. Analysis of the logs of my test site indicates that Google and other bots have similar problems. And the requirements for a black hat to mount such an attack are trivial: any of the free PHP hosters or similar sites is enough to create the necessary HTTP responses.
This means that you can trust neither the online scanners nor Google Safe Browsing nor other technologies which are based on scanning the internet for malware.
- The PHP online help describes how to compress content with the various compression schemes.
- If you just want to check these and more compression problems with your own browser or firewall, you can visit my test site or set up your own test server with the App::DubiousHTTP Perl module, which is available from GitHub or MetaCPAN.