====== The Anomy Mail Sanitizer ======

The **[[http://mailtools.anomy.net/|Anomy Sanitizer]]** is what most people would call "an email virus scanner". That description is not totally accurate, but it does cover one of the more important jobs that the sanitizer can do for you: **it can scan email attachments for viruses**. It is a rather old piece of software (the last release, **1.76**, is dated 2006), but it is still included in **Debian 11 Bullseye** and it can perform rather sophisticated email processing as a simple **filter** operation.

I use it as a personal mail filter on GNU/Linux mail servers, because it can be activated on a per-user basis by the **Local Delivery Agent** (LDA) called by **Postfix**. The LDA can be as simple as **procmail** or as complex as the **Dovecot LDA with Pigeonhole Sieve Interpreter**.

===== Perl unescaped left brace warning =====

The Sanitizer version included in Debian Bullseye uses a deprecated syntax in its Perl code, which triggers the warning message:

  Unescaped left brace in regex is passed through in regex;

The offending code turned out to be in the file **/usr/share/perl5/Anomy/Sanitizer/MacroScanner.pm**, at lines 120 and 127. Here is the fix, with the braces escaped:

  $score += 4 while ($buff =~ s/\000(ID="\{[-0-9A-F]+)$/x$1/i);
  $score += 1 while ($buff =~ s/\000(ID="\{[-0-9A-F]+\}"|ThisWorkbook\000|PrivateProfileString)/x$1/i);

===== The HTML MIME multipart problem =====

Several mail user agents nowadays compose email messages in HTML format, sometimes without including a text-only copy of the same message. Some agents include the HTML as a part of a multipart [[wp>MIME]] message, correctly marked as text/html. Other agents compose the message body directly in HTML, without using the MIME multipart system. In some circumstances Sanitizer defangs the HTML message or the HTML part (changing its content type), so a modern email reader does not display it correctly. In the best case an **anonymous attachment** is shown; in the worst case **an empty message** is shown.
The Anomy Sanitizer uses several methods to detect the HTML parts in a message, relying on the **Content-Type: text/html** header or on the **filename** of the MIME part (if specified). Once it detects an HTML part, it performs some operations on it; one of them is a match against a **regular expression** to confirm that the part actually contains HTML text. If that regex test fails, the Sanitizer neutralizes (defangs) the part by changing its content type from **text/html** to something like **application/DEFANGED-14789** (the type name is composed using the **msg_defanged** configuration option). That behaviour is triggered by the **feat_files = 1** configuration option (enable filename-based policy decisions). Unfortunately the regex used by Sanitizer to detect an HTML part is very naive: the text simply must contain this expression: |||
|
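The check described above can be sketched in a few lines of Perl. This is only a simplified model of the behaviour, not the Sanitizer code: the function name ''checked_content_type'', its parameters and the pattern in ''$html_regexp'' are all hypothetical stand-ins (in particular, the pattern is not the one shipped in FileTypes.pm).

```perl
use strict;
use warnings;

# Hypothetical stand-in for the content test; NOT the pattern used by Sanitizer.
my $html_regexp = qr/<html|<body/i;

# Simplified model: return the content type a part would end up with.
# $defang_prefix plays the role of the msg_defanged option, $serial of the number.
sub checked_content_type {
    my ($content_type, $body, $defang_prefix, $serial) = @_;

    # Only parts declared as HTML are subject to the content test.
    return $content_type unless lc($content_type) eq 'text/html';

    # The body "looks like HTML" if it matches the pattern: keep the type.
    return $content_type if $body =~ $html_regexp;

    # Otherwise the part is defanged by rewriting its content type.
    return "application/$defang_prefix-$serial";
}

# A full HTML document passes the test.
print checked_content_type('text/html', '<html><body>Hi</body></html>', 'DEFANGED', 14789), "\n";

# A fragment with none of the expected tags is defanged, although it is valid HTML.
print checked_content_type('text/html', '<p>Hi</p>', 'DEFANGED', 14789), "\n";
```

Under these assumptions the first call keeps ''text/html'', while the second one yields ''application/DEFANGED-14789'': any HTML body that lacks the few tags named in the pattern is treated as suspicious.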
Notably, the **Gmail** application nowadays (Jan 2023) composes mail messages using only a **%%
%%** tag, thus fooling Sanitizer into //defanging// that part. I fixed the Perl code in **/usr/share/perl5/Anomy/Sanitizer/FileTypes.pm**, changing the regular expression in this way:

  my $HTML = {
      id => "html",
      risk => $low,
      name => "HTML text file",
      extensions => [ "html", "htm", "shtml" ],
      mime_types => [ 'text/html' ],
      magic => [ ],
      regexp => '|||
  |',
  };
It is also possible to remove the ''regexp'' element from the hash; in that case Sanitizer will recognize an HTML part only by its content type or filename. The customized Perl module can be installed as **/etc/perl/Anomy/Sanitizer/FileTypes.pm**, leaving the file installed by the Debian package untouched.
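The override works because Perl searches the directories listed in ''@INC'' in order and loads the first matching file. The following sketch shows how to check which copy of the module would be picked up; the helper ''first_in_path'' is a hypothetical name introduced here, and the assumption that **/etc/perl** precedes **/usr/share/perl5** in ''@INC'' should be verified on the actual system.

```perl
use strict;
use warnings;

# Return the first existing file named $relpath under the given directories,
# scanning them in order, as Perl does with @INC when loading a module.
sub first_in_path {
    my ($relpath, @dirs) = @_;
    for my $dir (@dirs) {
        next if ref $dir;    # @INC may contain hooks (references); skip them
        my $candidate = "$dir/$relpath";
        return $candidate if -f $candidate;
    }
    return;                  # not found in any directory
}

my $module = 'Anomy/Sanitizer/FileTypes.pm';

# Show the search order, so the position of /etc/perl can be verified.
print "Search order:\n";
print "  $_\n" for grep { !ref } @INC;

my $found = first_in_path($module, @INC);
print defined $found ? "Would load: $found\n" : "$module not found in \@INC\n";
```

If the reported path is not the customized copy, the directory order printed by the script tells where the file should be placed instead.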