milter-regex(8)
September 24, 2003
milter-regex 1.5
Name
milter-regex - sendmail milter plugin for regular expression filtering
Synopsis
milter-regex [-d] [-c config] [-n] [-p pipe] [-u user]
Description
The milter-regex plugin can be used with the milter API of sendmail(8)
to filter mails using regular expressions matching SMTP envelope
parameters, mail headers and the mail body.
Options
-d
Enable verbose debug output, which will be logged via syslog(3) at the
debug level for the mail facility. If you are logging to a file, make sure
the partition in question has plenty of free space.
-c config
Use the specified configuration file instead of the default, @CONF@.
-n
Usually milter-regex adds a header to messages that are scanned. The
header is of the form "X-Milter: version". This option instructs
milter-regex to refrain from adding this header.
-p pipe
Use the specified pipe to interface with sendmail(8). The default is
@OCONN@.
-u user
Run as the specified user instead of the default, smmsp. When
milter-regex is started as root, it calls setuid(2) to drop privileges.
The non-privileged user should have read access to the configuration file
and read-write access to the pipe.
Sendmail Configuration
sendmail(8) needs to have milter support. To check whether your sendmail
has milter support, you may run:
/usr/sbin/sendmail -d0.1 -bp | grep MILTER
If MILTER appears in the output, sendmail supports Mail Filter.
The milter (plugin) needs to be registered in the sendmail(8)
configuration by adding the following line to your sendmail m4
configuration file
INPUT_MAIL_FILTER(`milter-regex', `S=@OCONN@, T=S:30s;R:2m')
and rebuilding /etc/mail/sendmail.cf (e.g. cd /etc/mail;
./mknewcf -c server.mc) and restarting sendmail(8).
Plugin Configuration
The configuration file consists of rules that, when matched, cause
sendmail(8) to reject mails. Empty lines and lines starting with # are
ignored, as is leading whitespace (blanks, tabs). Trailing backslashes can
be used to wrap long rules across multiple lines. Each rule starts with one
of the following commands:
reject "message"
Subsequent matching rules cause the mail to be rejected with a permanent
error consisting of the specified text part. The SMTP reply consists of the
three-digit code 550 (RFC 2821 "command rejected for policy reasons"),
the extended reply code 5.7.1 (RFC 1893 "Permanent Failure", "Security
or Policy Status", "Delivery not authorized, message refused") and the
text part (which defaults to "Command rejected" if not specified).
This is a permanent failure, which causes the sender to remove the message
from its queue without trying to retransmit, commonly generating a bounce
message to the sender.
tempfail "message"
Subsequent matching rules cause the mail to be rejected with a temporary
error consisting of the specified text part. The SMTP reply consists of
the three-digit code 451 (RFC 2821 "Requested action aborted: local error
in processing"), the extended reply code 4.7.1 (RFC 1893 "Persistent
Transient Failure", "Security or Policy Status", "Delivery not authorized,
message refused") and the text part (which defaults to
"Please try again later" if not specified).
This is a temporary failure, which causes the sender to keep the message
in its queue and try to retransmit it, commonly for several days.
discard
Subsequent matching rules cause the mail to be accepted but then discarded
silently. Note that connect and helo
rules should not use discard.
accept
Subsequent matching rules cause the mail to be accepted without further rule
evaluation. Can be used for whitelist criteria.
A command is followed by one or more expressions, each causing the previous
command to be executed when matched. The following expressions can be
used:
connect hostname address
Reject the connection if both the sender's hostname and address match the
specified regular expressions. The numerical address is either dotted-quad
(IPv4) or coloned-hex (IPv6). The hostname is the result of a DNS reverse
resolution of the numerical address (which sendmail(8) performs
independently of the milter plugin). When resolution fails, the hostname
contains the numerical address in square brackets.
helo name
Reject the connection if the sender-supplied HELO name matches the
specified regular expression. Commonly, the sender supplies its
fully-qualified hostname as HELO name.
envfrom address
Reject the mail if the sender-supplied envelope MAIL FROM address matches
the specified regular expression. Addresses commonly have the form
user@host.doma.in.
envrcpt address
Reject the mail if the sender-supplied envelope RCPT TO address matches
the specified regular expression.
header name value
Reject the mail if a header matches the specified name and value.
For instance, the header "Subject: Test"
matches name Subject and value Test.
body line
Reject the mail if a body line matches the specified regular expression.
Regular Expressions
The regular expressions used in the configuration rules are enclosed in
arbitrary delimiters; no further escaping is needed.
The first character of an argument is taken as the delimiter, and all
subsequent characters up to the next occurrence of the same delimiter are
taken literally as the regular expression. Since the delimiter itself cannot
be part of the regular expression (no escaping is supported), a delimiter must
be chosen that doesn't occur in the regular expression itself. Each argument
can use a different delimiter. All characters except spaces and tabs are
valid delimiters.
Two immediately adjacent delimiters form an empty regular expression, which
always matches and requires no regexec(3) call. This can be used in rules
requiring multiple arguments, to match only some arguments.
See re_format(7) or regex(7) for a detailed description of basic and
extended regular expressions.
Optionally, the following flags can be used after the closing delimiter:
e
Extended regular expression. This sets REG_EXTENDED for regcomp(3).
i
Ignore upper/lower case. This sets REG_ICASE.
n
Not matching. Reverses the matching result, i.e. the mail is rejected if the
regular expression does not match.
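The delimiter and flag rules above can be illustrated with a short Python sketch. It is not the plugin's code: the names Arg, parse_arg and arg_matches are hypothetical, Python's re module stands in for regcomp(3) and regexec(3) (so the e flag is accepted but has no effect, and basic regular expressions are not modeled), and flags are accepted in any order. How the n flag interacts with an empty expression is also an assumption here.

```python
import re
from typing import NamedTuple, Optional, Pattern


class Arg(NamedTuple):
    pattern: Optional[Pattern[str]]  # None stands for the empty, always-matching expression
    negate: bool                     # True when the 'n' flag was given


def parse_arg(text: str) -> Arg:
    """Split one argument such as ',^text/html,i' into pattern and flags."""
    if not text or text[0] in " \t":
        raise ValueError("an argument must start with a non-blank delimiter")
    delimiter = text[0]
    # Everything up to the next occurrence of the delimiter is the expression;
    # there is no escaping, so the first occurrence always ends it.
    end = text.find(delimiter, 1)
    if end == -1:
        raise ValueError("missing closing delimiter")
    source, flags = text[1:end], text[end + 1:]
    if set(flags) - set("ein"):
        raise ValueError("unknown flag in " + repr(flags))
    if source == "":
        return Arg(None, "n" in flags)
    # Assumption: 'e' is ignored because Python's re syntax is already close
    # to POSIX extended syntax; 'i' maps to re.IGNORECASE.
    compiled = re.compile(source, re.IGNORECASE if "i" in flags else 0)
    return Arg(compiled, "n" in flags)


def arg_matches(arg: Arg, value: str) -> bool:
    """Apply the pattern, then reverse the result if 'n' was set."""
    hit = True if arg.pattern is None else arg.pattern.search(value) is not None
    return hit != arg.negate
```

With these definitions, parse_arg(",^text/html,i") yields a case-insensitive pattern whose source is ^text/html, and parse_arg("//") yields the empty expression that matches any value without a regular expression call.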
Boolean Expressions
A rule can consist of either a simple term or more complex expressions. A
term has the form
header /From/ /domain/i
and expressions can be built combining terms with operators
and, or, not
and ( ), as in
header /From/ /domain/i and body /money/ \
( not header /From/ /domain/ ) and ( body /sex/ or body /fast/ )
Operator precedence should not be relied on. Instead, parentheses should be
used to resolve any ambiguities (ambiguous expressions usually produce syntax
errors from the parser).
Macros
Macros allow terms or expressions to be stored under a name, and
$name can then be used as a term within other rules, expressions or
macro definitions. Example:
friends = header /^Received$/ /^from [^ ]*(ork.net|home.com)/e
attachments = header ,^Content-Type$, ,multipart/mixed, and \
body ,^Content-Type: application/,
executables = $attachments and body ,name=".*.(pif|exe|scr)"$,e
reject "executable attachment from non-friends"
$executables and not $friends
Macro names must begin with a letter and may contain alphanumeric characters and
punctuation characters. Reserved keywords (like reject or
header) cannot be used as macro names. Macros must be defined
before use: the definition must precede the use in the configuration file,
which is read from top to bottom.
Evaluation
Rules are evaluated in the order specified in the configuration file, from top
to bottom. When a rule matches, the corresponding action is taken, that is, the
last action specified before the matching rule.
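The pairing of each rule with the last action before it can be sketched as follows. This is a simplified, line-oriented model in Python, not the plugin's parser: it assumes one action or one expression per logical line (as in the Examples section), it does not handle macro definitions, and the names logical_lines and pair_rules_with_actions are hypothetical.

```python
from typing import Iterator, List, Optional, Tuple

ACTIONS = ("reject", "tempfail", "discard", "accept")


def logical_lines(text: str) -> Iterator[str]:
    """Yield rules with comments, blank lines and line wraps removed."""
    pending = ""
    for raw in text.splitlines():
        line = raw.lstrip(" \t")          # leading blanks and tabs are ignored
        if not pending and (line == "" or line.startswith("#")):
            continue                      # empty line or comment
        if line.endswith("\\"):
            pending += line[:-1]          # trailing backslash wraps the rule
            continue
        yield pending + line
        pending = ""
    if pending:
        yield pending


def pair_rules_with_actions(text: str) -> List[Tuple[str, str]]:
    """Return (action line, expression line) pairs in file order."""
    current: Optional[str] = None
    pairs = []
    for line in logical_lines(text):
        if line.split(None, 1)[0] in ACTIONS:
            current = line                # applies to all following expressions
        elif current is None:
            raise ValueError("expression before any action: " + line)
        else:
            pairs.append((current, line))
    return pairs
```

Because the pairs are returned in file order, picking the first pair whose expression matches models the top-to-bottom evaluation described above.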
The plugin evaluates the rules every time a line of mail (or envelope) is
received. As soon as a rule matches, the action is taken immediately, possibly
before the entire mail is received, even if further lines might make
other rules match, too. This means the first rule matching chronologically has
precedence.
If evaluation for a line of mail makes two (or more) rules match, the rule that
comes first in the configuration file has precedence.
Boolean expressions are short-circuit evaluated. That is, "a or b" becomes
true as soon as one of the terms is true and "a and b" becomes false as soon as
one of the terms is false, even if the other term is not known, possibly because
the relevant mail line has not been received yet.
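One way to picture this behavior is three-valued logic, where a term is true, false or not yet known. The Python sketch below models only that idea and is not taken from the plugin's source. The names Tri, tri_not, tri_and and tri_or are hypothetical, and None stands for "not known yet".

```python
from typing import Optional

# None stands for "not known yet", e.g. a body term before any body line arrived.
Tri = Optional[bool]


def tri_not(a: Tri) -> Tri:
    """Negation of an unknown value stays unknown."""
    return None if a is None else not a


def tri_and(a: Tri, b: Tri) -> Tri:
    """False as soon as either side is false, even if the other is unknown."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tri, b: Tri) -> Tri:
    """True as soon as either side is true, even if the other is unknown."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False
```

In this model a rule such as "header ... and body ..." is already false once the header term is false, while it stays undecided if the header term is true and no body line has been seen.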
Examples
# /etc/mail/milter-regex.conf example
tempfail "Sender IP address not resolving"
connect /\[.*\]/ //
reject "Malformed HELO (not a domain, no dot)"
helo /\./n
reject "Malformed RCPT TO (not an email address, not <.*@.*>)"
envrcpt /<(.*@.*|Postmaster)>/ein
reject "HTML mail not accepted"
# use comma as delimiter here, as / occurs within RE
header /^Content-type$/i ,^text/html,i
body ,^Content-type: text/html,i
# Swen worm
discard
header /^(TO|FROM|SUBJECT)$/e //
header /^Content-type$/i /boundary="Boundary_(ID_/i
header /^Content-type$/i /boundary="[a-z]*"/
body ,^Content-type: audio/x-wav; name="[a-z]*\.[a-z]*",i
# Some nasty spammer
reject "Business Corp spam, get lost"
body /^Business Corp. for W.& L. AG/i and \
( body /043.*317.*0285/ or body /0041.43.317.02.85/ )
Logging
milter-regex sends log messages to syslogd(8) using facility mail and, with
increasing verbosity, levels err, notice, info and debug. The following
syslog.conf(5) section can be used to log messages to a dedicated file:
mail.err;mail.notice /var/log/milter-regex.log
mail.debug /var/aLotOfSpaceAvailable/mail-debug.log
Grammar
Syntax for milter-regex in BNF:
file = ( rule | macro ) file
rule = action expr-list
action = "reject" msg | "tempfail" msg | "discard" | "accept"
msg = ( '"' | "'" ) string ( '"' | "'" )
expr-list = expr [ expr-list ]
expr = term | term "and" expr | term "or" expr | "not" term
term = '(' expr ')' | "connect" arg arg | "helo" arg |
"envfrom" arg | "envrcpt" arg | "header" arg arg |
"body" arg | '$' name
arg = del regex del flags
del = '/' | ',' | '-' | ...
flags = [ 'e' ] [ 'i' ] [ 'n' ]
macro = name '=' expr
Files
@CONF@
default configuration file
See Also
regcomp(3), syslog.conf(5), re_format(7) or regex(7), sendmail(8), syslogd(8)
The Internet Society (2001), RFC 2821 - Simple Mail Transfer Protocol, AT&T Laboratories, April 2001
G. Vaudreuil, RFC 1893 - Enhanced Mail System Status Codes, Octel Network Services, January 1996
History
The first version of milter-regex was written in 2003. Boolean expression
evaluation was added in 2004.
Authors
Daniel Hartmeier daniel@benzedrine.cx