Spam. I get a lot of it. I’ve battled it with home-brewed procmail recipes, Mozilla message filters, and a variety of other means. However, creating my own filters became too cumbersome given the huge amount of ever-varying spam I get. So I installed SpamAssassin so that I could leverage its huge community-developed rule base. All mail sent to “@fuzzybelly.org” is now sent through SpamAssassin. It performs header, text, blacklist, and HTML analysis of each mail message and tags mail that is suspected of being spam with special mail headers, which can then be used to direct the mail straight to the trash or to a holding folder using procmail or your favorite mail client.
SpamAssassin performs hundreds of tests against each incoming mail message. Each test carries a certain point value. If a mail message accumulates a point value in excess of a user-specified threshold, the message is marked as spam. SpamAssassin prepends its analysis of the message to the message itself so that you can see why it was flagged as spam. This analysis is often very interesting and shows the myriad ways in which spam tries to sneak into your mailbox and track you. Here’s an example from a very spammy message.
SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM: Content analysis details: (25.9 hits, 5 required)
SPAM: Hit! (2.7 points) Subject contains lots of white space
SPAM: Hit! (2.4 points) ‘Message-Id’ was added by a relay (2)
SPAM: Hit! (0.5 points) Subject has an exclamation mark
SPAM: Hit! (1.0 point) BODY: No such thing as a free lunch (3)
SPAM: Hit! (0.8 points) BODY: Uses words and phrases which indicate porn (11)
SPAM: Hit! (0.1 points) BODY: Uses words and phrases which indicate porn (10)
SPAM: Hit! (1.5 points) BODY: Asks you to click below
SPAM: Hit! (0.1 points) BODY: List removal information
SPAM: Hit! (1.8 points) URI: Uses %-escapes inside a URL’s hostname
SPAM: Hit! (-0.3 points) URI: Includes a link to send a mail with a subject
SPAM: Hit! (1.9 points) URI: Includes a URL link to send an email with the subject ‘remove’
SPAM: Hit! (1.3 points) URI: Includes a ‘remove’ email address
SPAM: Hit! (2.1 points) BODY: FONT Size +2 and up or 3 and up
SPAM: Hit! (0.0 points) BODY: Includes a URL link to send an email
SPAM: Hit! (1.8 points) BODY: Tells you to click on a URL
SPAM: Hit! (3.2 points) HTML-only mail, with no text version
SPAM: -------------------- End of SpamAssassin results ---------------------
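The point-and-threshold idea is easy to play with yourself. Here is a rough Python sketch that pulls the point values out of "Hit!" lines like the ones above and compares the total to a threshold. The function names and the regular expression are my own illustration, not SpamAssassin's code, and the sample uses only three of the lines from the report, so its total is not the 25.9 shown there.

```python
import re

# Matches report lines such as "SPAM: Hit! (2.7 points) Subject contains ..."
# and also the singular form "(1.0 point)".
HIT_LINE = re.compile(r"Hit! \((-?\d+(?:\.\d+)?) points?\)\s*(.*)")


def parse_hits(report):
    """Return a (points, description) pair for each 'Hit!' line in a report."""
    hits = []
    for line in report.splitlines():
        match = HIT_LINE.search(line)
        if match:
            hits.append((float(match.group(1)), match.group(2)))
    return hits


def is_spam(hits, threshold=5.0):
    """Flag the message when its total points exceed the threshold."""
    return sum(points for points, _ in hits) > threshold


# Three lines copied from the report above; note that a test can score negative.
sample = """\
SPAM: Hit! (2.7 points) Subject contains lots of white space
SPAM: Hit! (-0.3 points) URI: Includes a link to send a mail with a subject
SPAM: Hit! (3.2 points) HTML-only mail, with no text version
"""

hits = parse_hits(sample)
total = sum(points for points, _ in hits)
print(f"{total:.1f} points, spam: {is_spam(hits)}")
```

Even this three-line subset lands above the default threshold of 5 used in the report, which gives a feel for how quickly a spammy message piles up points.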
As you can see, SpamAssassin found several things in this message indicative of spam. A common tactic of spammers is to include a link that will supposedly allow you to remove yourself from the spammer’s mailing list. Of course, they have no intention of removing you from their list; they want confirmation that their mail was received. Further, they want to track you. If you look at the HTML for the “remove me” links, you will see that the URLs contain long strings that are meant to be unique identifiers. Take a look at this example.
<a href="http://emza.net/r/r0.4?o2HIYLo1kINbxkiJ3XEVqGcetHK-KbMPnsDXjuqn7IwKt8Y3zE58926">click here to unsubscribe.</a>
If you “click here to unsubscribe” you serve only to confirm that you got the spammer’s message and to provide them with a unique tracking number. A more insidious form of tracking is the web bug. A web bug is an invisible image that, when loaded by your web browser or HTML-enabled mail client, provides hit confirmation and a tracking number to the spammer. Here is an example from a spam message I recently received.
<IMG SRC="http://emza.net/r/r0.4?EgGOvcEumOojhmerX-iTrd9EAu4N4ngJG2FmZU2c" HEIGHT=1 WIDTH=1 BORDER=0>
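If you want to hunt for web bugs in a message yourself, something like the following Python sketch would do as a first pass. It uses the standard library's HTMLParser to collect every image declared as 1x1 pixels or smaller. The class name and the "tiny image" heuristic are my own assumptions for illustration. I am not claiming this is how SpamAssassin detects them.

```python
from html.parser import HTMLParser


class WebBugFinder(HTMLParser):
    """Collect the src of every <img> declared as 1x1 pixels or smaller."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names for us.
        if tag != "img":
            return
        attributes = dict(attrs)
        if self._tiny(attributes.get("width")) and self._tiny(attributes.get("height")):
            self.suspects.append(attributes.get("src"))

    @staticmethod
    def _tiny(value):
        # A missing or non-numeric dimension is treated as "not tiny".
        return value is not None and value.strip().isdigit() and int(value) <= 1


finder = WebBugFinder()
finder.feed(
    '<IMG SRC="http://emza.net/r/r0.4?EgGOvcEumOojhmerX-iTrd9EAu4N4ngJG2FmZU2c" '
    "HEIGHT=1 WIDTH=1 BORDER=0>"
)
print(finder.suspects)
```

A real filter would need more than this, since a spammer can just as easily hide an image with styling or omit the dimensions entirely.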
SpamAssassin also checks to see if the mail message has been relayed through a network listed in one of the Realtime Blackhole Lists (RBLs). Networks that relay spam, either purposefully or through misconfiguration, are placed on these blackhole lists so that those who choose to do so can reject or tag all mail that goes through one of these relays. When SpamAssassin receives a message that has passed through a relay listed in the RBLs, it will tag it as such.
SPAM: Hit! (2.0 points) Received via a relay in relays.osirusoft.com
SPAM: [RBL check: found 220.127.116.11.relays.osirusoft.com., type: 127.0.0.9]
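The lookup behind that report line is just a DNS query: the relay's IP address is written with its octets reversed, the list's zone name is appended, and an answer in the 127.0.0.x range means the address is listed. Here is a small Python sketch of that convention. The function names are mine, the IP address is a reserved documentation address used as a placeholder, and I only print the query name here rather than performing a live lookup.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets followed by the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip, zone):
    """Return the list's answer (e.g. a 127.0.0.x address) or None if there is none."""
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such record (or the lookup failed): treat the relay as not listed.
        return None


# 192.0.2.1 is a reserved documentation address, used here only as a placeholder.
print(dnsbl_query_name("192.0.2.1", "relays.osirusoft.com"))
```

The zone is a parameter, so the same helper works with whichever blackhole list you choose to consult.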
Another spam trick is to perform Base64 Content-Transfer-Encoding on mail messages so as to thwart content analysis by less capable spam analyzers. If your mail client is so equipped, whenever it receives a message sporting the tag “Content-Transfer-Encoding: base64” it will decode the message and present it in a readable format. If the message were not decoded, it would look something like this.
This is actually a spam message trying to sell me a fake diploma. SpamAssassin knows how to deal with the base64 encoding so that it can do its job of analyzing the message. Typically, base64 is used for sending binary files over the net. In this case it was used to obscure the transport of spam.
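To see why base64 stops a naive keyword filter but not a filter that decodes first, here is a short Python sketch using the standard email and base64 modules. The message is a harmless made-up one that I build on the spot, not the diploma spam, and the variable names are only for illustration.

```python
import base64
from email import message_from_string

plain = "This is a test message."
# What the body looks like on the wire once it has been base64-encoded.
encoded = base64.b64encode(plain.encode("ascii")).decode("ascii")

# A minimal made-up message carrying the encoded body.
raw = (
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    'Content-Type: text/plain; charset="us-ascii"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + encoded + "\n"
)

message = message_from_string(raw)
# get_payload(decode=True) undoes the Content-Transfer-Encoding and returns bytes.
body = message.get_payload(decode=True).decode("ascii")

print(encoded)  # unreadable to a filter that only scans the raw text
print(body)     # readable again once decoded
```

Any analyzer that decodes the body before scanning it, as SpamAssassin does, sees the second form, so the encoding hides nothing from it.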
I’ve been using SpamAssassin for a week now and it has caught 99% of the spam I receive without generating a single false positive. During that week, it prevented over 100 spam messages from reaching my inbox. I can now get rid of my collection of hodge-podge recipes and instead use and contribute to the well-maintained and constantly updated SpamAssassin rule base.