Spam filtering rules
I was poking around in my spam folders tonight, comparing the results of SpamAssassin against a simple Procmail recipe that tags messages containing URLs. I run both filters all the time, and they catch almost all spam. I need to add a recipe that tags base64-encoded messages to clean up the rest.
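The base64 check mentioned above could look something like the following. This is only a rough Python sketch, not the Procmail recipe itself; the function name `has_base64_part` is made up for illustration, and it assumes that "base64 encoded" means any MIME part declaring `Content-Transfer-Encoding: base64`.

```python
from email import message_from_string


def has_base64_part(raw_message: str) -> bool:
    """Return True if any MIME part declares base64 transfer encoding.

    Hypothetical helper: it only inspects the declared header of each part
    and does not try to decode or validate the payload.
    """
    msg = message_from_string(raw_message)
    # walk() yields the message itself and then every nested MIME part.
    for part in msg.walk():
        encoding = part.get("Content-Transfer-Encoding", "")
        if encoding.strip().lower() == "base64":
            return True
    return False
```

A check like this would also tag legitimate attachments, which are normally base64 encoded, so it probably only makes sense combined with a whitelist.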
Interestingly enough, simply throwing out all messages with URLs that don’t come from people on a whitelist catches more spam than SpamAssassin does. Since almost all spam contains a URL, it’s easy to filter for, unless you receive plenty of legitimate email with URLs and can’t easily maintain a whitelist. The whitelist method has its drawbacks, primarily with web transactions that involve email verification. Even then, it’s a trivial task to find the verification message at the end of the spam folder, note the sender’s address, and add it to the whitelist.
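For anyone who wants to try the idea outside Procmail, here is a minimal Python sketch of the URL-plus-whitelist rule. It is not my actual recipe: the names `is_probable_spam`, `URL_PATTERN`, and `WHITELIST`, the example addresses, and the decision to treat `http://`, `https://`, or `www.` as "a URL" are all assumptions made for illustration.

```python
import re
from email import message_from_string
from email.utils import parseaddr

# Assumption: anything starting with http://, https:// or www. counts as a URL.
URL_PATTERN = re.compile(r"https?://|www\.", re.IGNORECASE)

# Hypothetical whitelist; in practice this would be loaded from a file.
WHITELIST = {"friend@example.com"}


def is_probable_spam(raw_message: str, whitelist=WHITELIST) -> bool:
    """Tag a message as spam if its body has a URL and the sender is unknown."""
    msg = message_from_string(raw_message)
    # parseaddr splits 'Name <addr>' into (name, addr).
    _, sender = parseaddr(msg.get("From", ""))
    if sender.lower() in whitelist:
        return False
    # Only look at the body: everything after the first blank line.
    parts = re.split(r"\r?\n\r?\n", raw_message, maxsplit=1)
    body = parts[1] if len(parts) > 1 else ""
    return URL_PATTERN.search(body) is not None
```

Note that this looks at the body as sent, so a URL hidden inside a base64-encoded part slips through, which is exactly why a separate base64 rule is needed.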
| Folder | SpamAssassin hits | SpamAssassin misses | URL tagged |
| --- | --- | --- | --- |
| SRT.spam | 1376 | 606 | 1700 |
| SRT.spam.160404 | 5381 | 1652 | 6657 |
| SRT.spam.190504 | 5755 | 2365 | 6838 |
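To put the counts on a common footing, the snippet below turns them into catch rates. It assumes that each folder holds only spam and that the folder total is SpamAssassin hits plus misses; that is my reading of the table rather than something measured separately, and the names `FOLDERS` and `catch_rates` are just for illustration.

```python
# (SpamAssassin hits, SpamAssassin misses, URL tagged) per folder, from the table.
FOLDERS = {
    "SRT.spam": (1376, 606, 1700),
    "SRT.spam.160404": (5381, 1652, 6657),
    "SRT.spam.190504": (5755, 2365, 6838),
}


def catch_rates(folders):
    """Return {folder: (SpamAssassin rate, URL-tag rate)} as fractions of the total."""
    rates = {}
    for name, (hits, misses, url_tagged) in folders.items():
        total = hits + misses  # assumption: every message is either a hit or a miss
        rates[name] = (hits / total, url_tagged / total)
    return rates


if __name__ == "__main__":
    for name, (sa_rate, url_rate) in catch_rates(FOLDERS).items():
        print(f"{name}: SpamAssassin {sa_rate:.1%}, URL tagging {url_rate:.1%}")
```

Under those assumptions SpamAssassin catches roughly 69 to 77 percent of each folder, while URL tagging catches roughly 84 to 95 percent.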