November 2014

Please note that republishing this article in full or in part is only allowed under the conditions described here.

Dubious MIME - Conflicting Content-Transfer-Encoding Headers

Summary

Because of different interpretations of standards in mail clients, IDS/IPS and antivirus products, it is possible to pass malware undetected to the end user. This is especially funny and dangerous if different interpretations happen inside a single product, like in Yahoo! Web Mail.

What is this about?

MIME describes the common transfer format for anything but trivial e-mails, that is, e-mails which contain attachments, embedded images etc. This format dates back to the days when the internet was still young and slow and you could actually hear the bytes traveling.

Because the protocols used for mail transfer at that time, that is UUCP and the still-used SMTP, were mostly ASCII-text based, anything binary or non-ASCII, like umlauts, had to be encoded for transport. And because a way was needed to specify the encoding, the Content-Transfer-Encoding header was born. This header can have different values. Especially popular are base64 for real binary data and quoted-printable for text with few non-ASCII characters.
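To make the difference between the two popular encodings concrete, here is a minimal sketch using Python's standard base64 and quopri modules. The sample inputs (a few GIF header bytes and a short German phrase) are assumptions chosen purely for illustration:

```python
import base64
import quopri

# Illustrative inputs (assumptions): "GIF89a" plus two raw bytes,
# and a short text with a few non-ASCII characters.
binary_data = bytes([0x47, 0x49, 0x46, 0x38, 0x39, 0x61, 0x00, 0xFF])
text_data = "Grüße aus Köln".encode("utf-8")

# base64 turns every 3 input bytes into 4 ASCII characters.
b64 = base64.b64encode(binary_data)

# quoted-printable leaves ASCII readable and escapes other bytes as =XX.
qp = quopri.encodestring(text_data)

print(b64)
print(qp)

# Both encodings are reversible and yield pure ASCII for transport.
assert base64.b64decode(b64) == binary_data
assert quopri.decodestring(qp) == text_data
```

For mostly-ASCII text, quoted-printable stays human-readable, while base64 has a fixed overhead of about one third regardless of content.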

A typical mail with a binary attachment looks like this:

     From: foo
     To: bar
     Subject: foobar
     Mime-Version: 1.0
     Content-type: multipart/mixed; boundary=barfoot
   
     --barfoot
     Content-type: text/plain
   
     This is only ASCII text, but the attachment contains an image.
     --barfoot
     Content-type: image/gif
     Content-transfer-encoding: base64
   
     R0lGODlhHgAUAOMJAAAAAAgICAkJCRUVFSEhIfDw8PLy8vX19fj4+P//////////////////////
     /////yH5BAEKAA8ALAAAAAAeABQAAARaMMlJq7046827/2DoFQBQcKRJpWfCSm8Vw2VrzROOr/Xd
     57/YzvWjqYjHYarEZLaERWBz+vwZAgKD72isHg+DwWFrQ3pbCAIBQeYloxhdEH6Rv+/lrvusF3ki
     ADs=
     --barfoot--

Since each of the possible transfer encodings already results in data suitable for transport, there is no need to stack multiple encodings on top of each other, and therefore the specification allows only a single header.

Violating the specification

But what happens if we use multiple different Content-Transfer-Encoding headers anyway? As an example, we take the following mail, which contains a single attachment with the Eicar test virus. And while the attachment is encoded in base64, we specify two Content-Transfer-Encoding headers, one for base64 and the other for quoted-printable:

     From: foo
     To: bar
     Subject: eicar - base64 cte header preceding quoted-printable
     Mime-Version: 1.0
     Content-type: multipart/mixed; boundary=barfoot

     --barfoot
     Content-Transfer-Encoding: base64
     Content-Transfer-Encoding: quoted-printable
     Content-Disposition: attachment; name=eicar.txt

     WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNU
     QU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=
     --barfoot--

It turns out that there is no single interpretation. Most mail clients and web interfaces use the first Content-Transfer-Encoding header, while IDS like Bro and Snort and lots of antivirus products only look at the last header. All of the following detailed results are based on tests done on 2014/10/09.
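The two interpretations can be reproduced with a short sketch built on Python's standard email, base64 and quopri modules. It rebuilds the test mail shown earlier, then decodes the attachment body once according to the first header and once according to the last. The helper decode_as and the variable names are hypothetical and only illustrate the idea; they do not reflect how any of the tested products is implemented:

```python
import base64
import email
import quopri

# The test mail from the article, rebuilt without the display indentation.
raw_mail = "\n".join([
    "From: foo",
    "To: bar",
    "Subject: eicar - base64 cte header preceding quoted-printable",
    "Mime-Version: 1.0",
    "Content-type: multipart/mixed; boundary=barfoot",
    "",
    "--barfoot",
    "Content-Transfer-Encoding: base64",
    "Content-Transfer-Encoding: quoted-printable",
    "Content-Disposition: attachment; name=eicar.txt",
    "",
    "WDVPIVAlQEFQWzRcUFpYNTQoUF4pN0NDKTd9JEVJQ0FSLVNU",
    "QU5EQVJELUFOVElWSVJVUy1URVNULUZJTEUhJEgrSCo=",
    "--barfoot--",
    "",
])

part = email.message_from_string(raw_mail).get_payload()[0]
# get_all() returns every header of that name, in order of appearance.
cte_headers = part.get_all("Content-Transfer-Encoding")
body = part.get_payload().encode("ascii")


def decode_as(encoding: str, data: bytes) -> bytes:
    """Hypothetical helper: decode data as a consumer honoring `encoding` would."""
    if encoding.lower() == "base64":
        return base64.b64decode(data)
    if encoding.lower() == "quoted-printable":
        return quopri.decodestring(data)
    return data


first_view = decode_as(cte_headers[0], body)   # interpretation: first header wins
last_view = decode_as(cte_headers[-1], body)   # interpretation: last header wins

marker = b"EICAR-STANDARD-ANTIVIRUS-TEST-FILE"
print("first header sees test virus:", marker in first_view)
print("last header sees test virus: ", marker in last_view)
```

A consumer that honors the last header treats the body as quoted-printable text, so it only ever sees the still base64-encoded characters and never the decoded test virus.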

Observed Behavior: Mail clients, Web mail, MTA, IDS

The first Content-Transfer-Encoding header is used by:

The last Content-Transfer-Encoding header is used by:

Yet another behavior is shown by others:

Observed Behavior: Antivirus Products

Antivirus products show a variety of behavior. I tested these products on 2014/10/09 using virustotal.com with files in mbox or RFC822 format. Scanners which did not find the test virus in any of these tests were ignored, because they don't seem to understand these formats. Results:

Most Funny Behavior: Yahoo! Web Mail

The Yahoo MTA itself seems to have no built-in virus scanning, but the Web Mail interface uses Norton by Symantec to scan attachments before download. However, the interaction between antivirus and download is totally broken, which leads to immediate evasion of the built-in virus scanner:

The issue was reported to Yahoo via HackerOne on 2014/10/09, but closed as "Won't fix" because "We are already aware of this functionality on our site and are working towards a fix.". No reply was received when I asked whether it is OK to publish the bug. The last time I checked (2014/10/28), the bug was still there.

Conclusion

Using different interpretations of the standard makes evading security systems easy. Lots of current security products assume that the attacker will behave in a sane way and adhere to the standards, which is probably not what you should expect from an attacker.
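One way a security product could stop assuming sane input is to treat the ambiguity itself as suspicious. The following is only a rough sketch of that idea using Python's standard email module; the function name parts_with_conflicting_cte and the sample message are hypothetical, and a real product would need to handle many more malformed cases:

```python
import email
from email.message import Message


def parts_with_conflicting_cte(msg: Message) -> list:
    """Return (part index, header values) for every part that carries
    more than one Content-Transfer-Encoding header."""
    findings = []
    # walk() yields the message itself first, then all nested parts.
    for index, part in enumerate(msg.walk()):
        values = part.get_all("Content-Transfer-Encoding", [])
        if len(values) > 1:
            findings.append((index, values))
    return findings


# Hypothetical sample: one part with two conflicting headers.
sample = "\n".join([
    "Mime-Version: 1.0",
    "Content-type: multipart/mixed; boundary=x",
    "",
    "--x",
    "Content-Transfer-Encoding: base64",
    "Content-Transfer-Encoding: quoted-printable",
    "",
    "QUJD",
    "--x--",
    "",
])

findings = parts_with_conflicting_cte(email.message_from_string(sample))
print(findings)
```

Flagging or rejecting such a mail avoids having to guess which of the headers the recipient's mail client will honor.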

Update