Spam. I get a lot of it. I’ve battled it with home-brewed procmail recipes, Mozilla message filters, and a variety of other means. However, creating my own filters became too cumbersome given the huge amount of ever-varying spam I get. So I installed SpamAssassin so that I could leverage its huge community-developed rule base. All mail sent to “@fuzzybelly.org” is now sent through SpamAssassin. It performs header, text, blacklist, and HTML analysis of each mail message and tags mail that is suspected of being spam with special mail headers, which can then be used to direct the mail straight to the trash or to a holding folder using procmail or your favorite mail client.
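To make the header tagging concrete, here is a minimal Python sketch of the kind of decision a filter or mail client could make once SpamAssassin has tagged a message. It assumes the tag is the X-Spam-Flag header with the value YES; the folder names and the choose_folder function are made up for illustration and are not part of SpamAssassin or procmail.

```python
from email import message_from_string


def choose_folder(raw_message: str) -> str:
    """Pick a destination folder based on the spam tag header.

    Assumption: tagged mail carries "X-Spam-Flag: YES".
    The folder names are hypothetical.
    """
    msg = message_from_string(raw_message)
    flag = msg.get("X-Spam-Flag", "")
    if flag.strip().upper() == "YES":
        return "spam-holding"
    return "inbox"


# Two tiny hand-written messages, one tagged and one not.
tagged = "X-Spam-Flag: YES\nSubject: FREE!!!\n\nClick below\n"
clean = "Subject: lunch?\n\nSee you at noon.\n"

print(choose_folder(tagged))
print(choose_folder(clean))
```

A holding folder is the safer choice while you are still getting a feel for the threshold, since nothing is deleted outright.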
SpamAssassin performs hundreds of tests against each incoming mail message. Each test carries a certain point value. If a mail message accumulates a point total in excess of a user-specified threshold, the message is marked as spam. SpamAssassin prepends its analysis of the message to the message itself so that you can see why it was flagged as spam. This analysis is often very interesting and shows the myriad ways in which spam tries to sneak into your mailbox and track you. Here’s an example from a very spammy message.
SPAM: -------------------- Start SpamAssassin results ----------------------
SPAM: This mail is probably spam. The original message has been altered
SPAM: so you can recognise or block similar unwanted mail in future.
SPAM: See http://spamassassin.org/tag/ for more details.
SPAM:
SPAM: Content analysis details: (25.9 hits, 5 required)
SPAM: Hit! (2.7 points) Subject contains lots of white space
SPAM: Hit! (2.4 points) ‘Message-Id’ was added by a relay (2)
SPAM: Hit! (0.5 points) Subject has an exclamation mark
SPAM: Hit! (1.0 point) BODY: No such thing as a free lunch (3)
SPAM: Hit! (0.8 points) BODY: Uses words and phrases which indicate porn (11)
SPAM: Hit! (0.1 points) BODY: Uses words and phrases which indicate porn (10)
SPAM: Hit! (1.5 points) BODY: Asks you to click below
SPAM: Hit! (0.1 points) BODY: List removal information
SPAM: Hit! (1.8 points) URI: Uses %-escapes inside a URL’s hostname
SPAM: Hit! (-0.3 points) URI: Includes a link to send a mail with a subject
SPAM: Hit! (1.9 points) URI: Includes a URL link to send an email with the subject ‘remove’
SPAM: Hit! (1.3 points) URI: Includes a ‘remove’ email address
SPAM: Hit! (1.7 points) BODY: JavaScript code
SPAM: Hit! (3.3 points) BODY: Auto-executing JavaScript code
SPAM: Hit! (2.1 points) BODY: FONT Size +2 and up or 3 and up
SPAM: Hit! (0.0 points) BODY: Includes a URL link to send an email
SPAM: Hit! (1.8 points) BODY: Tells you to click on a URL
SPAM: Hit! (3.2 points) HTML-only mail, with no text version
SPAM:
SPAM: -------------------- End of SpamAssassin results ---------------------
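The arithmetic behind a report like this one is easy to check. The following is a rough Python sketch, not SpamAssassin's own code: it pulls the point value out of each "Hit!" line, adds them up, and compares the total with the required score. The regular expression, the function names, and the strict "greater than" comparison are my assumptions based on the description above. Adding up all eighteen hits in the report gives the 25.9 shown in its header.

```python
import re
from decimal import Decimal

# Matches "Hit! (2.7 points)", "Hit! (1.0 point)" and "Hit! (-0.3 points)".
HIT = re.compile(r"Hit! \((-?\d+(?:\.\d+)?) points?\)")


def total_points(report_lines):
    """Sum the point values of every 'Hit!' line in a report."""
    total = Decimal("0")
    for line in report_lines:
        match = HIT.search(line)
        if match:
            total += Decimal(match.group(1))
    return total


def is_spam(report_lines, required=Decimal("5")):
    """Assumption: a message is spam when its total exceeds the threshold."""
    return total_points(report_lines) > required


# Four lines copied from the report above.
sample = [
    "SPAM: Hit! (2.7 points) Subject contains lots of white space",
    "SPAM: Hit! (2.4 points) 'Message-Id' was added by a relay (2)",
    "SPAM: Hit! (-0.3 points) URI: Includes a link to send a mail with a subject",
    "SPAM: Hit! (3.3 points) BODY: Auto-executing JavaScript code",
]

print(total_points(sample))
print(is_spam(sample))
```

Decimal is used instead of float so that tenths add up exactly. Note that a test can carry a negative value, as the -0.3 line does, which pulls the total back down.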
As you can see, SpamAssassin found several things in this message indicative of spam. A common tactic of spammers is to include a link that will supposedly allow you to remove yourself from the spammer’s mailing list. Of course, they have no intention of removing you from their list; they want confirmation that their mail was received. Further, they want to track you. If you look at the HTML for the “remove me” links, you will see that the URLs contain long strings that are meant to be unique identifiers. Take a look at this example.
<a href="http://emza.net/r/r0.4?o2HIYLo1kINbxkiJ3XEVqGcetHK-KbMPnsDXjuqn7IwKt8Y3zE58926">click here to unsubscribe.</a>
If you “click here to unsubscribe” you serve only to confirm that you got the spammer’s message and to provide them with a unique tracking number. A more insidious form of tracking is the web bug. A web bug is an invisible image that, when loaded by your web browser or HTML-enabled mail client, provides hit confirmation and a tracking number to the spammer. Here is an example from a spam message I recently received.
<IMG SRC="http://emza.net/r/r0.4?EgGOvcEumOojhmerX-iTrd9EAu4N4ngJG2FmZU2c" HEIGHT=1 WIDTH=1 BORDER=0>
Notice that the image has a height and width of 1 pixel. This is so that you can’t see it. Notice also the long string in the image URL. This is your new tracking number. SpamAssassin detects these web bugs and other HTML abuses and defuses them by changing the Content-Type of the message from “text/html” to “text/plain”. This prevents the message from being loaded and displayed as a web page. Instead, the underlying HTML source is displayed as plain text. This makes the message difficult to read (why would you want to read spam anyway?), but it defuses web bugs and abuses of JavaScript. Speaking of JavaScript, you should always disable JavaScript in your mail client. JavaScript is sometimes necessary during regular web browsing, but its use in a mail message pretty much guarantees that the message is spam or similarly useless material.
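If you are curious how a web bug might be spotted, here is a small Python sketch that scans HTML for images declared as 1 by 1 pixel. This is only an illustration using Python's standard html.parser module; I am not claiming SpamAssassin does it this way, and the WebBugFinder class, the find_web_bugs function, and the tracker.example address are invented for the example.

```python
from html.parser import HTMLParser


class WebBugFinder(HTMLParser):
    """Collect the src of every <img> declared as 1 pixel by 1 pixel."""

    def __init__(self):
        super().__init__()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("width") == "1" and attributes.get("height") == "1":
            self.suspects.append(attributes.get("src"))


def find_web_bugs(html):
    finder = WebBugFinder()
    finder.feed(html)
    finder.close()
    return finder.suspects


# A made-up image tag in the same style as the one quoted above.
sample = '<IMG SRC="http://tracker.example/r?id=abc123" HEIGHT=1 WIDTH=1 BORDER=0>'
print(find_web_bugs(sample))
```

A real detector would need to handle more cases (sizes set through styles, zero-pixel images, and so on), so treat this as a starting point only.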
SpamAssassin also checks to see if the mail message has been relayed through a network listed in one of the Realtime Blackhole Lists (RBLs). Networks that relay spam, either purposefully or through misconfiguration, are placed on these blackhole lists so that those who choose to do so can reject or tag all mail that goes through one of these relays. When SpamAssassin receives a message that has passed through a relay listed in the RBLs, it will tag it as such.
SPAM: Hit! (2.0 points) Received via a relay in relays.osirusoft.com
SPAM: [RBL check: found 54.195.251.205.relays.osirusoft.com., type: 127.0.0.9]
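The name in that RBL check is the relay's IP address with its octets reversed, followed by the list's DNS zone. Here is a minimal Python sketch of how such a query name is built and looked up. The function names are mine, the lookup simply uses the system resolver, and the zone from the report above is used only as a sample string since I cannot promise that list is still in service.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def dnsbl_lookup(ip, zone):
    """Return the list's answer (an address like 127.0.0.x) or None.

    A failed lookup is treated as "not listed"; this needs network access.
    """
    try:
        return socket.gethostbyname(dnsbl_query_name(ip, zone))
    except OSError:
        return None


# Reconstructs the name shown in the report above from the relay's address.
print(dnsbl_query_name("205.251.195.54", "relays.osirusoft.com"))
```

The address returned by the list (127.0.0.9 in the report) is a code chosen by the list operator, so what it means depends on the list you query.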
Another spam trick is to apply Base64 Content-Transfer-Encoding to mail messages to thwart content analysis by less capable spam analyzers. If your mail client is so equipped, whenever it receives a message carrying the header “Content-Transfer-Encoding: base64” it will decode the message and present it in a readable format. If the message were not decoded, it would look something like this.
bW9uZXkgDQo0MjM4V0hSVzctMjI4Y3RmdDg2NzlYdExXOS0xMTZEclBjMTU3
MVlqY2wwLTY4OWN3aWg5MDgxWW9CRzYtMzE5RWtsZTYxNDNRRWw3MA0KDQpV
TklWRVJTSVRZIERJUExPTUENCg0KT2J0YWluIGEgcHJvc3Blcm91cyBmdXR1
cmUsIG1vbmV5IGVhcm5pbmcgcG93ZXIsDQphbmQgdGhlIGFkbWlyYXRpb24g
b2YgYWxsLg0KDQpEaXBsb21hcyBmcm9tIHByZXN0aWdpb3VzIG5vbi1hY2Ny
ZWRpdGVkIHVuaXZlcnNpdGllcw0KYmFzZWQgb24geW91ciBwcmVzZW50IGtu
b3dsZWRnZWFuZCBsaWZlIGV4cGVyaWVuY2UuDQoNCk5vIHJlcXVpcmVkIHRl
c3RzLCBjbGFzc2VzLCBib29rcywgb3IgaW50ZXJ2aWV3cy4NCg0KQmFjaGVs
b3JzLCBtYXN0ZXJzLCBNQkEsIGFuZCBkb2N0b3JhdGUgKFBoRCkgZGlwbG9t
YXMNCmF2YWlsYWJsZSBpbiB0aGUgZmllbGQgb2YgeW91ciBjaG9pY2UuTm8g
b25lIGlzIHR1cm5lZA0KZG93bi4NCg0KQ29uZmlkZW50aWFsaXR5IGFzc3Vy
ZWQuIENBTEwgTk9XIHRvIHJlY2VpdmUgeW91cg0KZGlwbG9tYSB3aXRoaW4g
ZGF5cyEhIQ0KDQo3MTMtODY2LTg4NjkNCg0KQ2FsbCAyNCBob3VycyBhIGRh
eSwgNyBkYXlzIGEgd2VlaywgaW5jbHVkaW5nIFN1bmRheXMNCmFuZCBob2xp
ZGF5cy4NCg0KbW9uZXkgDQo0MjM4V0hSVzctMjI4Y3RmdDg2NzlYdExXOS0x
MTZEclBjMTU3MVlqY2wwLTY4OWN3aWg5MDgxWW9CRzYtMzE5RWtsZTYxNDNR
RWw3MA0KNDIzOFdIUlc3LTIyOGN0ZnQ4Njc5WHRMVzktMTE2RHJQYzE1NzFZ
amNsMC02ODljd2loOTA4MVlvQkc2LTMxOUVrbGU2MTQzUUVsNzANCjU4NTVt
end2My0zNThtTm51MTQyOFV1ZFcwLTA2OHFkd2g0NDQ5WWZCRTItODE1bGx0
TzI3NzdnRFNVOC0zODhKd090MDg2MXFjbDcw
This is actually a spam message trying to sell me a fake diploma. SpamAssassin knows how to deal with the base64 encoding so that it can do its job of analyzing the message. Typically, base64 is used for sending binary files over the net. In this case it was used to obscure the transport of spam.
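Decoding base64 yourself takes only a couple of lines. This Python sketch decodes the first line of the block above with the standard base64 module; decode_body is just a name I picked, and decoding as ASCII with replacement characters is an assumption that happens to suit this message.

```python
import base64


def decode_body(encoded):
    """Decode a base64 body into text, replacing any non-ASCII bytes."""
    raw = base64.b64decode(encoded)
    return raw.decode("ascii", errors="replace")


# The first line of the encoded message shown above.
first_line = "bW9uZXkgDQo0MjM4V0hSVzctMjI4Y3RmdDg2NzlYdExXOS0xMTZEclBjMTU3"
text = decode_body(first_line)

# repr() makes the carriage return and line feed visible.
print(repr(text))
```

The decoded text starts with the word "money" followed by what looks like a random tracking string, which is about what you would expect.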
I’ve been using SpamAssassin for a week now and it has caught 99% of the spam I receive without generating a single false positive. During that week, it prevented over 100 spam messages from reaching my inbox. I can now get rid of my collection of hodge-podge recipes and instead use and contribute to the well-maintained and constantly updated SpamAssassin rule base.
Lots of good information. I will have to try it. Thanks.
Well…it is more than a MONTH later after this post, but there is an article in today’s (just for the record, today is 18July2002) Dallas Morning News Online:
http://www.dallasnews.com/business/technology/stories/071802dnbusspam.3a5cb.html