doc:appunti:linux:sa:sanitizer
Next revision | Previous revision | ||
doc:appunti:linux:sa:sanitizer [2023/01/19 10:16] – created niccolo | doc:appunti:linux:sa:sanitizer [2023/01/19 12:11] (current) – [Perl Unescaped left brace warning] niccolo | ||
---|---|---|---|
Line 6: | Line 6: | ||
I use it as a personal mail filter in GNU/Linux mail servers, because it can be activated on a per-user basis, by the **Local Delivery Agent** called by **Postfix**. The LDA can be as simple as **procmail** or the more complex **Dovecot LDA with Pigeonhole Sieve Interpreter**. | I use it as a personal mail filter in GNU/Linux mail servers, because it can be activated on a per-user basis, by the **Local Delivery Agent** called by **Postfix**. The LDA can be as simple as **procmail** or the more complex **Dovecot LDA with Pigeonhole Sieve Interpreter**. | ||
+ | |||
+ | ===== Perl unescaped left brace warning ===== | ||
+ | |||
+ | The Sanitizer version included in Debian Bullseye uses deprecated syntax in its Perl code, which triggers this warning message: | ||
+ | |||
+ | < | ||
+ | Unescaped left brace in regex is passed through in regex; | ||
+ | </ | ||
+ | |||
+ | It turned out to be in the file **/ | ||
+ | |||
+ | <code perl> | ||
+ | $score += 4 while ($buff =~ s/ | ||
+ | </ | ||
+ | |||
+ | <code perl> | ||
+ | $score += 1 while ($buff =~ s/ | ||
+ | </ | ||
+ | |||
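As a minimal sketch of the kind of fix involved, the snippet below escapes a literal left brace inside a substitution pattern. The input string and the pattern are hypothetical and chosen only for illustration; they are not the ones from the Sanitizer source.

```perl
use strict;
use warnings;

# Hypothetical buffer and pattern, NOT taken from the Sanitizer code.
my $buff  = 'a{b} c{d}';
my $score = 0;

# Writing "\{" makes the brace an explicit literal, so Perl has no reason
# to emit the "Unescaped left brace in regex" warning for this pattern.
# Each successful substitution removes one brace and adds 1 to the score.
$score += 1 while ($buff =~ s/\{//);

print "score=$score buff=$buff\n";
```

The same approach (adding a backslash before each literal ''{'' in the pattern) should apply to any regex that triggers the warning, assuming the brace is really meant as a literal character and not as a quantifier.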
+ | |||
+ | ===== The HTML MIME multipart problem ===== | ||
+ | |||
+ | Several mail user agents nowadays compose email messages in HTML format, sometimes without including a text-only copy of the same message. Some agents include the HTML as one part of a multipart [[wp> | ||
+ | |||
+ | In some circumstances Sanitizer defangs the HTML message or the HTML part (changing its content type), so a modern email reader does not display it correctly. In the best case an **anonymous attachment** is shown; in the worst case **an empty message** is shown. | ||
+ | |||
+ | The Anomy Sanitizer uses several methods to detect the HTML parts of a message, relying on the **Content-Type: | ||
+ | |||
+ | That behaviour is triggered by the **feat_files = 1** configuration option (enable filename-based policy decisions). | ||
+ | |||
+ | Unfortunately, the regex that Sanitizer uses to detect an HTML part is very naive: the content simply must contain this expression: | ||
+ | |||
+ | < | ||
+ | < | ||
+ | </ | ||
+ | |||
+ | Notably, as of January 2023 the **Gmail** application composes mail messages using only a **%%< | ||
+ | |||
+ | I fixed the Perl code in **/ | ||
+ | |||
+ | <code perl> | ||
+ | my $HTML = { | ||
+ | id => " | ||
+ | risk => $low, | ||
+ | name => "HTML text file", | ||
+ | extensions => [ " | ||
+ | mime_types => [ ' | ||
+ | magic => [ ], | ||
+ | regexp | ||
+ | }; | ||
+ | </ | ||
+ | |||
+ | It is also possible to remove the '' | ||
+ | |||
+ | The customized Perl module can be installed into **/ | ||
doc/appunti/linux/sa/sanitizer.1674119782.txt.gz · Last modified: 2023/01/19 10:16 by niccolo