$Id: spamass-howto.txt,v 1.8 2007/12/18 19:00:08 malin Exp $
$Author: malin $
$Date: 2007/12/18 19:00:08 $

=============================================================================
How to block Spam at the BIC
=============================================================================

This file can be found at:

  http://www.bic.mni.mcgill.ca/~malin/bicsystems/spamass-howto.txt

=============================================================================

The email servers at the BIC are configured to automatically filter all
incoming and outgoing messages, looking for possible spam using a mail
filter called SpamAssassin. The instructions and recipes are given below.
You should read this document carefully before enabling them to filter
your incoming emails. It is important to be aware that filtering emails
is a dangerous business (you might end up throwing away valid emails!)
but if you follow the rules given below all should be good.

We have also configured the BIC email servers to use a feature called
'Grey Listing'. Read further below to learn about it, as it is a really
effective way of getting rid of the pesky spam.

This document contains the following sections:

* Introduction
* Basic Filter Recipe
* More Aggressive Filter Recipe
* How to Whitelist/Blacklist an Email Address
* Headers Added by SpamAssassin
* Training SpamAssassin
* Grey Listing: How it Works

=============================================================================

************
Introduction
************

SpamAssassin computes a score by applying a suite of tests to the headers
and body of each email. When this score reaches a (configurable) threshold,
a special 'tag' is inserted in the email headers. By using a filter you can
then 'detect' the message as spam and decide what to do with it: trash it,
move it to a 'trash spam' mailbox, etc.
Other specific tags are inserted by SpamAssassin; they are described below,
but if you as a user want to detect spam you have to search for that tag:
see the *Basic Filter Recipe* shown below.

A very good source of info is the SpamAssassin site. Check its FAQ and Wiki
frequently.

Note that the BIC email servers only tag emails as possible spam; they don't
discard them. They are configured very conservatively (X-Spam-Status will be
set to 'Yes' only if SpamAssassin calculated a score >= 5) and it's up to
you, as a user, to decide if you want more aggressive filtering. If you do,
have a look at the *More Aggressive Filter Recipe* below.

Once you use spamassassin we suggest that you 'train' it so that it gets
better at differentiating spam from ham, and also because spammers are nasty
creatures and they devolve in curious ways. Bayesian training is explained
in the SpamAssassin documentation, and excerpts are given below under the
section *Training SpamAssassin*.

****************************************************************************
* VERY IMPORTANT NOTE: WE WILL NOT BE HELD RESPONSIBLE FOR ANY VALID EMAIL *
* THAT _YOU_ DISCARDED AS SPAM -- A FALSE-POSITIVE -- SHOULD YOU DECIDE TO *
* USE SPAMASSASSIN.                                                        *
****************************************************************************

=============================================================================

*******************
Basic Filter Recipe
*******************

If you don't have a file called .procmailrc, create it at the top of your
home directory with your preferred file editor (say 'emacs ~/.procmailrc')
and stick the following lines into it:

#----------------Do Not Copy This Line!----------------------------
SPAMBOX=$HOME/.spambox

:0:
* ^X-Spam-Flag: YES
$SPAMBOX
#----------------Do Not Copy This Line!----------------------------

If you already have a ~/.procmailrc make sure you insert the above recipe
*before* any other existing recipes.
This will redirect all spam to a file called '.spambox' at the top of your
home directory. If you don't even want to see the nice juicy spam, set
SPAMBOX to '/dev/null' instead of '$HOME/.spambox'.

If you do redirect the tagged spam to a file MAKE SURE TO CLEAN THE CRAP
once in a while, because that file will grow fast and you might go beyond
the quotas set for users' home directories. A cronjob is perfect for that,
i.e., log in on yorick and type the following in an xterm/shell:
'crontab -e'. This will start an editor; just stick the following in it
(very important: all on one line!):

----------------Do Not Copy This Line!----------------------------
6 0 * * 1 umask 077; cd $HOME; if test -s .spambox -a "`wc -c .spambox | awk '{print $1}'`" -ge 10240; then mv -f .spambox OLD.spambox; touch .spambox; fi
----------------Do Not Copy This Line!----------------------------

This will rotate your spambox file (~/.spambox) every Monday at 00:06, that
is, just after Sunday midnight: if the file is 10240 bytes or larger it is
moved to ~/OLD.spambox (overwriting the previous copy) and a fresh, empty
~/.spambox is created. I suggest reviewing this file once in a while for any
false positives that might have been redirected in there.

Make sure that your ~/.procmailrc belongs to you and that no other group or
user can write to it (reading bits are not enforced), as otherwise the mail
server will refuse to acknowledge it for obvious security reasons:

  chmod 600 ~/.procmailrc

*****************************************************************************
* WARNING: If you do use SpamAssassin, you're on your own: don't come       *
* complaining to us that someone sent you an email and you didn't receive   *
* it because it was detected as spam and you threw it away!                 *
*****************************************************************************

=============================================================================

*****************************
More Aggressive Filter Recipe
*****************************

If you want to modify spamassassin's behaviour, you can do your own
filtering by using the following procmail recipes (do as above and stick the
following in your ~/.procmailrc ***before*** any other existing recipes!):

----------------Do Not Copy This Line!----------------------------
SPAMBOX=$HOME/.spambox

:0fw: spamassassin.lock
| /usr/bin/spamassassin

:0:
* ^X-Spam-Flag: Yes
$SPAMBOX
----------------Do Not Copy This Line!----------------------------

This does the same thing as the previous recipe, except that before looking
for tagged spam each message is first passed through your own run of
spamassassin. The nice thing about this is that as a user you can then
modify spamassassin's behaviour by editing ~/.spamassassin/user_prefs:
you can customize the score needed for a message to be considered spam, or
modify the score that spamassassin associates with a particular test.

For instance, in my case, I've bumped the scores associated with email
originating from well-known spam sites along with HTML email (used a lot by
phishers) and I lowered the threshold for spam from 5 (default value) to
4.5. With that I have literally no more spam in my mailbox, with very very
few false-positives. I also regularly train spamassassin (see section below
'Training SpamAssassin').

###########################################################################
# SpamAssassin user preferences file.  See 'man Mail::SpamAssassin::Conf'
# for details of what can be tweaked.
###########################################################################

# How many hits before a mail is considered spam.
required_hits 4.5

# Whitelist and blacklist addresses are now file-glob-style patterns, so
# "friend@somewhere.com", "*@isp.com", or "*.domain.net" will all work.
# whitelist_from someone@somewhere.com

# Add your own customised scores for some tests below.  The default scores are
# read from the installed spamassassin rules files, but you can override them
# here.  To see the list of tests and their default scores, go to
# http://spamassassin.org/tests.html .
#
# score SYMBOLIC_TEST_NAME n.nn

score AMATEUR_PORN 4.5
score BAYES_99 4.5
score BIZ_TLD 4.5
score DNS_FROM_AHBL_RHSBL 4.5
score DRUGS_MUSCLE 4.5
score DRUGS_DIET 4.5
score EARN_PER_WEEK 4.5
score HTML_MESSAGE 4.5
score HTML_80_90 4.5
score MICROSOFT_EXECUTABLE 4.5
score HARDCORE_PORN 4.5
score HG_HORMONE 4.5
score HOT_NASTY 4.5
score INCREASE_SEX 4.5
score IMPOTENCE 4.5
score MILLION_USD 4.5
score MORTGAGE_PITCH 4.5
score NIGERIAN_BODY1 4.5
score NIGERIAN_BODY2 4.5
score NIGERIAN_BODY3 4.5
score OFFSHORE_SCAM 4.5
score PENIS_ENLARGE 4.5
score PENIS_ENLARGE2 4.5
score RCVD_IN_BL_SPAMCOP_NET 4.5
score RCVD_IN_DSBL 4.5
score RCVD_IN_NJABL_PROXY 4.5
score RCVD_IN_NJABL_DUL 4.5
score RCVD_IN_SBL 4.5
score RCVD_IN_SORBS_HTTP 4.5
score RCVD_IN_SORBS_WEB 4.5
score RCVD_IN_SORBS_DUL 4.5
score SUBJECT_SEXUAL 4.5
score STOCK_ALERT 4.5
score US_DOLLARS_3 4.5
score URIBL_AB_SURBL 4.5
score URIBL_SBL 4.5
score URIBL_OB_SURBL 4.5
score URIBL_WS_SURBL 4.5
###########################################################################

See http://spamassassin.apache.org/tests.html to know all the different
tests that spamassassin uses and how to modify them to your liking.

=============================================================================

*******************************************
How to Whitelist/Blacklist an Email Address
*******************************************

Often you have emails that you know are not spam but nevertheless
spamassassin incorrectly tagged them as spam (false-positives).
All you have to do is 'whitelist' the sender by adding a line to your
~/.spamassassin/user_prefs:

  whitelist_from joe@hotmail.com

Messages from joe@hotmail.com will then no longer be redirected to your
spambox, even though other tests might have scored them as spam. Multiple
addresses per line, separated by spaces, are OK. Multiple "whitelist_from"
lines are also OK.

The same principle applies to 'blacklisting' an email address:

  blacklist_from add@ress.com

This is used to specify addresses that are often tagged (incorrectly) as
non-spam, but which the user doesn't want. Same format as "whitelist_from".

=============================================================================

*****************************
Headers Added by SpamAssassin
*****************************

SpamAssassin will add a few headers to an email after it has processed it.
These are usually hidden but can easily be shown if you don't use a
brain-dead mailer like PutLook or other utter pieces of crapware. Here is an
example from a spam I received:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.0.1 (2004-10-22) on
        yorick.bic.mni.mcgill.ca
X-Spam-Level: **********
X-Spam-Status: Yes, score=10.9 required=4.5 tests=BAYES_00,HTML_MESSAGE,
        RCVD_IN_SORBS_WEB,STOCK_ALERT autolearn=no version=3.0.1

Four extra headers have been inserted: the first one tells you that
SpamAssassin has computed a score higher than its configured threshold, the
second one tells you which version of SpamAssassin was used, and the last
two tell you about the spam level of the message and which tests were used
to calculate the final score.

You can edit your user preference file in ~/.spamassassin/user_prefs and
modify the default score values of all these tests. The default rules are
located in /usr/local/unstable/share/spamassassin on the BIC systems.
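
As a rough illustration of how the X-Spam-Status header shown above can be
read by a script (for instance to review what landed in your spambox), here
is a minimal Python sketch. The function name parse_spam_status and the keys
of the dictionary it returns are made up for this example; they are not part
of SpamAssassin, and the sketch only handles the header layout shown above.

```python
import re

# Header value copied from the example above (folded over two lines).
EXAMPLE = (
    "Yes, score=10.9 required=4.5 tests=BAYES_00,HTML_MESSAGE,\n"
    "\tRCVD_IN_SORBS_WEB,STOCK_ALERT autolearn=no version=3.0.1"
)


def parse_spam_status(value):
    """Parse the value of an X-Spam-Status header into a dict.

    Hypothetical helper for illustration only; assumes the
    'Yes|No, key=value key=value ...' layout shown in this document.
    """
    # Unfold the header: collapse line breaks and runs of whitespace.
    flat = re.sub(r"\s+", " ", value.strip())
    # The test list may have been folded after a comma; re-join it.
    flat = re.sub(r",\s+", ",", flat)
    # Everything before the first comma is the verdict (Yes/No).
    verdict, _, rest = flat.partition(",")
    fields = dict(re.findall(r"(\w+)=(\S*)", rest))
    return {
        "is_spam": verdict.strip().lower() == "yes",
        "score": float(fields["score"]),
        "required": float(fields["required"]),
        "tests": [t for t in fields.get("tests", "").split(",") if t],
    }


if __name__ == "__main__":
    status = parse_spam_status(EXAMPLE)
    print(status["is_spam"], status["score"], status["tests"])
```

The list under "tests" holds the symbolic test names that you can re-score
in ~/.spamassassin/user_prefs.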
=============================================================================

*********************
Training SpamAssassin
*********************

To train SpamAssassin, you get a mailbox full of messages that you know are
spam and use the sa-learn program to pull out the tokens and remember them
for later:

* sa-learn --showdots --mbox --spam spam-file

Then you get a mailbox full of messages you're sure are ham and teach Bayes
about those:

* sa-learn --showdots --mbox --ham ham-file

It is important to do both.

=============================================================================

**************************
Grey Listing: How it Works
**************************

Grey listing works by assuming that, unlike legitimate MTAs (Mail Transfer
Agents), spam engines will not retry sending their junk mail after a
temporary error. The filter will always temporarily reject mail on a first
attempt, and accept it after some time has elapsed.

If spammers ever try to resend rejected messages, we can assume they will
not stay idle between the two sends. Odds are good that the spammer will
send a mail to a honey pot address and get blacklisted in an
Internet-distributed black list before the second attempt.

Grey listing can be enabled on a per user, domain, and IP range basis.
Essentially it delays delivery to a greylisted email address by a small
amount of time. I've tested it with a (small) delay of 10 minutes and it
seems to be really effective; used in conjunction with spamassassin it
catches all the 250+ spams I receive a day :)

Greylist allows whitelisting (do nothing, the default), greylisting (delay)
and blacklisting (plonk! you're not welcome). If you are harassed by too
much spam just get in touch with bicadmin@bic and make a request to add your
bic email address to the mail server's greylist. The only drawback from a
user's perspective is the small delay incurred for the final delivery in her
mailbox.
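
The reject-then-accept logic described above can be sketched in a few lines
of Python. This is only an illustration and not the code running on the BIC
mail servers: the Greylist class, its method and parameter names, and the
choice of remembering each (client IP, sender, recipient) triplet are
assumptions made for this example. The 10 minute delay and the "not
greylisted by default" behaviour are the ones mentioned above.

```python
import time


class Greylist:
    """Toy greylisting decision logic (hypothetical, for illustration)."""

    def __init__(self, greylisted_recipients, delay_seconds=600, clock=time.time):
        # Only these recipients are greylisted; everyone else is accepted
        # immediately, which mirrors "whitelisting is the default".
        self.greylisted = {r.lower() for r in greylisted_recipients}
        self.delay = delay_seconds      # 600 s = the 10 minute delay
        self.clock = clock              # injectable so it can be tested
        self.first_seen = {}            # triplet -> time of first attempt

    def check(self, client_ip, sender, recipient):
        """Return 'accept' or 'tempfail' for one delivery attempt."""
        if recipient.lower() not in self.greylisted:
            return "accept"
        key = (client_ip, sender.lower(), recipient.lower())
        now = self.clock()
        # Remember when this triplet was first seen (no-op on retries).
        first = self.first_seen.setdefault(key, now)
        if now - first >= self.delay:
            return "accept"             # a real MTA retried after the delay
        return "tempfail"               # temporary error: "try again later"


if __name__ == "__main__":
    gl = Greylist({"user@bic"})
    # First attempt from an unknown triplet is always deferred.
    print(gl.check("192.0.2.1", "someone@example.com", "user@bic"))
```

A legitimate mail server simply retries later and gets through; a spam
engine that never retries is never accepted.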
I have whitelisted mcgill.ca so emails originating from there are not
affected by greylisting, i.e., no delay is incurred for the final delivery.
Also, broken MTAs like gmail.com (that's for Andrew :) are whitelisted along
with others like yahoo, amazon, aol and a bunch of sites with weird if not
broken smtp setups. Obviously the whole BIC domain is whitelisted too :)

You want to be greylisted? Make the request by sending an email to
bicadmin@bic.

Have fun.

jf

--
"The Zen nature of a spammer resembles a cockroach, except that
the cockroach is higher up on the evolutionary chain."
        --Peter Olson, Delphi Postmaster