The Semantic Gap

Please note that republishing this article in full or in part is only allowed under the conditions described here.

I'm currently researching security problems and how perimeter firewalls might help mitigate them. During this research I've found several ways to evade existing IDS, firewalls and other security systems at the application level by exploiting differences in interpretation between the protecting and the protected system (i.e. firewall and client). This Semantic Gap is caused by incomplete, unclear or contradictory specifications, or by buggy or incomplete implementations, which fail especially in rare use cases. Additionally, in adherence to the famous robustness principle, implementations usually accept various kinds of malformed data. But because there is no defined behavior for such data, different implementations handle it in different ways. Especially scary is that most systems which should protect the client against malware blindly assume that the malware is sent in a standards-conformant way, which makes evasion easy and stealthy.

Evasions using the HTTP protocol

While HTTP looks like a simple standard when judged by a few examples, there are plenty of places where the standard is more flexible than needed or where the protocol allows one to specify inherently contradictory information. In such cases implementations often differ, which makes it possible to bypass the protecting system:

by using variations on the Transfer-Encoding: chunked response header (07/2013).
by using rarely used values for the Content-Encoding response header (07/2013).
by playing with conflicting information about the length of the response body (07/2013).
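
As a rough illustration of the last point, the following Python sketch builds a response that declares its body length in two conflicting ways and then reads it with two simplified parsers. All function names and the placeholder payload are hypothetical and chosen for this example only; real browsers and firewalls are far more complex, and which interpretation a given product follows is not claimed here.

```python
def build_ambiguous_response(payload: bytes) -> bytes:
    """Build a response that carries both Transfer-Encoding: chunked
    and a Content-Length header (illustrative only)."""
    # One chunk with the payload, followed by the terminating zero-size chunk.
    chunked_body = b"%x\r\n%s\r\n0\r\n\r\n" % (len(payload), payload)
    header_lines = [
        b"HTTP/1.1 200 OK",
        b"Transfer-Encoding: chunked",
        b"Content-Length: %d" % len(chunked_body),
    ]
    return b"\r\n".join(header_lines) + b"\r\n\r\n" + chunked_body


def split_message(raw: bytes):
    """Split a raw response into a header dict (lower-cased names) and the body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def body_by_content_length(raw: bytes) -> bytes:
    """Hypothetical parser that trusts Content-Length and ignores chunking."""
    headers, body = split_message(raw)
    return body[: int(headers[b"content-length"].decode("ascii"))]


def body_by_chunked(raw: bytes) -> bytes:
    """Hypothetical parser that decodes the chunked framing instead."""
    _, body = split_message(raw)
    decoded = b""
    while True:
        size_line, _, body = body.partition(b"\r\n")
        size = int(size_line.decode("ascii"), 16)
        if size == 0:
            return decoded
        decoded += body[:size]
        body = body[size + 2:]  # skip the chunk data and its trailing CRLF


if __name__ == "__main__":
    raw = build_ambiguous_response(b"MALWARE-PLACEHOLDER")
    print("chunked view:       ", body_by_chunked(raw))
    print("content-length view:", body_by_content_length(raw))
```

The two parsers return different bytes for the same response: one sees the payload, the other sees the raw chunk framing around it. If the protecting system and the client disagree in this way, the scanner analyzes something different from what the client finally receives.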

Several commercial systems can be bypassed this way, as described in

Is this URL safe? Hiding malware in plain sight from online malware scanners (12/2014).
Bypassing Malware Scanning in Sophos UTM Web Protection in 12/2014 and again in 07/2015.

To check how your browser behaves and how well your perimeter firewall protects against these and many more HTTP-based evasions, use HTTP evader.

Evasions by misusing MIME

Historically, e-mails were only plain ASCII text. To support different character encodings and attachments while still keeping compatibility with old systems, the MIME standard was developed, which maps all these features back to plain ASCII text. To display a mail and to extract its attachments, the mail thus has to be decoded. Unfortunately, the standard makes it possible to construct invalid MIME. Since such data cannot be decoded in a clearly defined way, the results differ between implementations, which again makes evasions possible:

Using multiple conflicting Content-Transfer-Encoding headers inside a mail makes it possible to hide malware from IDS/IPS or virus scanners while keeping it accessible to mail applications or web-based mail (11/2014).
Mail clients behave differently when a mail contains conflicting declarations of the multipart boundary, which again makes evasions possible (07/2015).
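
To make the first of these evasions more concrete, here is a minimal Python sketch that builds a mail with two conflicting Content-Transfer-Encoding headers and decodes the body under two simple strategies. The helper names, the addresses and the "first header wins" / "last header wins" strategies are assumptions made for this illustration; which header a particular scanner or mail client actually honors is not stated here.

```python
import base64
from email import message_from_string


def build_conflicting_mail(payload: bytes) -> str:
    """Build a single-part mail whose body is base64 encoded but which
    declares two different transfer encodings (illustrative only)."""
    encoded = base64.b64encode(payload).decode("ascii")
    return (
        "From: sender@example.com\n"
        "To: recipient@example.com\n"
        "Subject: conflicting encodings\n"
        "MIME-Version: 1.0\n"
        "Content-Type: application/octet-stream\n"
        "Content-Transfer-Encoding: base64\n"
        "Content-Transfer-Encoding: 7bit\n"
        "\n"
        + encoded + "\n"
    )


def decode_with(encoding: str, body: str) -> bytes:
    """Decode the body according to one declared transfer encoding."""
    if encoding.strip().lower() == "base64":
        return base64.b64decode(body)
    # Treat anything else (here: 7bit) as "use the body as it is".
    return body.encode("ascii")


def interpretations(raw_mail: str) -> dict:
    """Return the decoded body under two hypothetical header-selection strategies."""
    msg = message_from_string(raw_mail)
    declared = msg.get_all("Content-Transfer-Encoding")  # all values, in order
    body = msg.get_payload().strip()
    return {
        "first header wins": decode_with(declared[0], body),
        "last header wins": decode_with(declared[-1], body),
    }


if __name__ == "__main__":
    mail = build_conflicting_mail(b"ATTACHMENT-PLACEHOLDER")
    for strategy, content in interpretations(mail).items():
        print(strategy, "->", content)
```

Under one strategy the attachment is decoded to its real content, under the other it stays an apparently harmless base64 string. A scanner and a mail client that pick different strategies therefore see different attachments.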

Again, real systems can be bypassed this way:

Yahoo! Mail virus scanning could be bypassed by using conflicting Content-Transfer-Encoding headers. The issue was reported in 10/2014 but was marked as an already known problem. It was fixed some months after the report.
GMX Webmail virus scanning could be bypassed by using conflicting MIME boundaries. The issue was reported in 06/2015 and fixed shortly after the report.
AOL Mail virus scanning could be bypassed using conflicting Content-Transfer-Encoding headers (same issue as reported with Yahoo! Mail in 10/2014). The issue could not be reported because AOL did not provide any way to report security issues, even when asked. Thus it was published here in 07/2015.

Breaking DKIM - on Purpose and by Chance

DKIM is, together with SPF and DMARC, one of the major technologies in fighting sender spoofing. Breaking DKIM - on Purpose and by Chance shows how fragile the protocol is: how easily it can be broken by an attacker to spoof mails, and how easily it breaks by itself so that non-spoofed mails are treated as spoofed.

Five Easy Steps to Bypass Antivirus using manipulated MIME

More than 3 years after I started to publish MIME-related problems, and many more years after others detected similar problems, the vendors of analysis systems still don't seem to be really aware of them. Therefore I demonstrate in this post how trivial it is to bypass the mail analysis in current antivirus products.