Here you'll find a brief description of the measures you can use to protect yourself from spam. The intended audience for this page is mainly mail system administrators (with a bias towards Sendmail on Unix-like systems, particularly Linux), but I've listed a couple of items that will be useful for end-users (marked "[end-users too!]") - things you can do with very little effort and technical knowledge to reduce the amount of junk e-mail you get (admins, consider encouraging your users to do these things).
Just so you know, I run mail services for this domain, my work domain, for a number of our clients' domains, and several domains for family and friends, and use these techniques on all of them. This page is my attempt to consolidate the things I've learned over the course of a few years into one place, and I hope you find it useful. To get an idea of effectiveness, I receive mail across 5 separate accounts (some with multiple aliases), for which the addresses are well and truly on spammers' lists, and get on average less than one spam a day (and half of those are caught as 'probable'), and (other than a handful of teething problems) have had no reports of any legitimate e-mail being bounced. How does your ISP stack up against that? ;-)
Of course, there are many other techniques, programs and systems you can use; I'm just listing the ones with which I have experience. If you would like any more info, please e-mail me.
The best place to stop spam (assuming that people are going to try to send it to you) is at the point of delivery - before it makes it as far as your mailbox. This is because it saves your resources - bandwidth, disk space and CPU. Ideally, you would like to identify that the message is spam and refuse the connection as soon as the spammer's server connects to yours, before any data has been sent.
Whilst it is possible to identify IP addresses that spam you and add them to a local block list, this is time-consuming to manage and is likely to be only minimally effective - spammers have control of thousands of IP addresses (many of them hijacked by trojans, etc.) and it is unlikely that you'll get much junk from the same IP address. It is far more effective to use shared blacklists that are maintained centrally, and DNS is a convenient way of distributing these lists.
Using Spamhaus' XBL and SBL DNS blacklists with Sendmail enables you to reject incoming connections from known spam sources. The newer PBL from Spamhaus is a list of IPs that the ISP/admin concerned has asserted should never send mail directly, so any SMTP connections from these indicate probable "spam zombies". Spamhaus makes these DNSBLs available separately, or combined under zen.spamhaus.org - I find that using this list alone blocks more than 50% of spam. There are loads of DNS blacklists available from various sources; the three I've chosen are well respected and fairly conservative (because I don't want to lose legitimate mail) but are very effective.
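To illustrate how a DNSBL lookup works, here is a minimal Python sketch. It has nothing to do with how Sendmail implements the check internally; the function names are made up for this example. The IPv4 address is reversed octet by octet, the list's zone is appended, and the resulting name is looked up in DNS. An answer means the address is listed; "no such name" means it is not.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str = "zen.spamhaus.org") -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str = "zen.spamhaus.org") -> bool:
    """Return True if the DNSBL answers with an A record for this IP.

    Needs working DNS. A failed lookup is treated as "not listed".
    Simplification: any answer counts as a listing.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False
```

For example, dnsbl_query_name("192.0.2.1") gives "1.2.0.192.zen.spamhaus.org". A production check should also look at which address comes back, because lists use different return values to signal which sub-list matched or that the query itself was refused; check the list operator's documentation for the details.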
Evan Harris' site has much more info on this, but in outline: this technique consists of not immediately accepting messages from IP addresses that have not previously successfully connected to your mail server - instead, they are added to a greylist. If a connection is made from an IP that's on the greylist, instead of accepting the mail, the server generates a 4xx ("retry later") response. After a configured time limit, the IP is moved to a whitelist, and next time it retries mail is accepted as usual. The theory behind this is that spammers usually use lightweight SMTP implementations and do not generally retry after 4xx errors, unlike real mail servers. In addition, the delay imposed means that by the time the time limit has expired, it's more likely that a spammer's IP will have been listed in another blacklist on which you can then block. The down-side is that legitimate mail is also delayed, though this can be mitigated by setting a long timeout before a previously-accepted IP address is dropped from your whitelist. Also, a very small number of sender setups don't play nicely so need special treatment (admins, have you checked your setup works for recipients who use greylisting?).
I have found this technique to be very effective, and it stops the vast majority of spam (90%+) that had previously been getting past my other filters. I use milter-greylist with Sendmail, which is very configurable, and supports syncing of white/grey lists between all your MXs. Evan suggests 1 hour as the greylist time limit; I currently use 7 minutes, which gets all the spammers who don't retry at all, without delaying legitimate mail for too long. I keep addresses on the whitelist for 36 days after the last connection received, which means that various monthly mailing lists refresh often enough never to drop off the whitelist.
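The greylisting logic described above can be sketched in a few lines of Python. This is an illustration only, not how milter-greylist is implemented: the class and method names are invented, the lists live in memory, and entries are keyed on the IP address alone as in the outline above (real implementations typically track more than just the IP). The 7 minute delay and 36 day whitelist lifetime are the values mentioned above.

```python
import time
from typing import Dict, Optional

GREYLIST_DELAY = 7 * 60              # seconds a new IP must wait (7 minutes)
WHITELIST_LIFETIME = 36 * 24 * 3600  # seconds since last connection (36 days)


class Greylist:
    def __init__(self) -> None:
        self.first_seen: Dict[str, float] = {}  # greylisted IP -> time of first attempt
        self.last_seen: Dict[str, float] = {}   # whitelisted IP -> time of latest connection

    def check(self, ip: str, now: Optional[float] = None) -> str:
        """Return "250" to accept the connection or "451" to ask for a retry."""
        now = time.time() if now is None else now

        # Known-good IP: accept and refresh, unless the entry has expired.
        if ip in self.last_seen:
            if now - self.last_seen[ip] <= WHITELIST_LIFETIME:
                self.last_seen[ip] = now
                return "250"
            del self.last_seen[ip]

        # First contact: remember when we saw it and ask the sender to retry.
        if ip not in self.first_seen:
            self.first_seen[ip] = now
            return "451"

        # Retry before the delay has passed: still deferred.
        if now - self.first_seen[ip] < GREYLIST_DELAY:
            return "451"

        # Retry after the delay: promote to the whitelist and accept.
        del self.first_seen[ip]
        self.last_seen[ip] = now
        return "250"
```

A sender that never retries simply stays on the greylist, which is the whole point. A sender that retries after the delay is accepted from then on, provided it connects at least once every 36 days.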
Sender Policy Framework is a means by which domain names specify which IP addresses may originate mail for that domain. This a) allows you to detect and reject e-mail with a forged sender, b) increases the amount of spam you reject as many spammers forge the sender to be your own domain, and c) allows other sites to reject mail that has a forged sender in your own domain (which also reduces the amount of bounces you may have to deal with). I find this rejects around 10% of the junk that makes it through the rest of my filters, and this will probably improve as more domains implement it. There are various implementations; I use SMF-SPF.
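To give a feel for what an SPF check does, here is a deliberately tiny Python sketch. It is not SMF-SPF's code and covers only a small subset of SPF: it takes the text of a domain's SPF record (which a real checker would fetch from DNS) and the connecting IP address, and evaluates only the ip4, ip6 and all mechanisms. Mechanisms that need further DNS lookups (a, mx, include and so on) are silently skipped, and the function name is invented for the example.

```python
import ipaddress


def spf_check(record: str, client_ip: str) -> str:
    """Evaluate a small subset of SPF: ip4/ip6 mechanisms and 'all'.

    Returns "pass", "fail", "softfail" or "neutral".
    """
    qualifiers = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}
    ip = ipaddress.ip_address(client_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "neutral"  # not a record this sketch understands

    for term in terms[1:]:
        result = "pass"  # a mechanism without a qualifier means "+"
        if term[0] in qualifiers:
            result, term = qualifiers[term[0]], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name == "all":
            return result  # 'all' matches every address
        if name in ("ip4", "ip6") and value:
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return result
    return "neutral"  # nothing matched
```

With the record "v=spf1 ip4:192.0.2.0/24 -all", a connection from 192.0.2.55 passes, while one from any other address fails and can be rejected as a forgery.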
In the past I've used milter-sender (was free, now requires payment) to perform MX callbacks - that is, to verify the purported sender of a message with their MX server. This simply asks the mail server of the sender's domain if it accepts mail for the sender's address; if not, we know that the sender's address doesn't exist, so the Milter instructs Sendmail to reject the message.
This is effective at rejecting spam, because many spammers forge the sender address. It does suffer from the problem that some legitimate mail is sent with return addresses that don't exist (some automated order responses, mailing lists, and so on), so this is rejected by default. However, milter-sender allows whitelisting using Sendmail's access database.
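The callback itself is just the start of an ordinary SMTP conversation that is abandoned before any message is sent. The following Python sketch shows the idea using the standard smtplib module. It is not milter-sender's implementation: the function names, the HELO name and the choice of an empty return path for the probe are assumptions made for the example, and finding the MX host for the sender's domain (which needs a DNS library) is left to the caller.

```python
import smtplib


def classify_rcpt_reply(code: int) -> str:
    """Map the reply to RCPT TO onto a decision about the sender address."""
    if 200 <= code < 300:
        return "exists"
    if 500 <= code < 600:
        return "rejected"   # the sender's own MX refuses the address
    return "unknown"        # 4xx or anything odd: don't reject on this basis


def sender_callback(sender: str, mx_host: str,
                    helo_name: str = "mx.example.org") -> str:
    """Ask mx_host whether it would accept mail addressed to `sender`.

    Needs network access. No message is sent: the session ends after RCPT TO.
    """
    try:
        with smtplib.SMTP(mx_host, 25, timeout=30) as smtp:
            smtp.helo(helo_name)
            smtp.mail("")                # empty return path for the probe
            code, _ = smtp.rcpt(sender)  # would the MX accept this address?
    except (OSError, smtplib.SMTPException):
        return "unknown"                 # can't tell, so don't reject
    return classify_rcpt_reply(code)
```

Only a definite 5xx refusal is treated as grounds for rejecting the original message; temporary failures and unreachable servers are given the benefit of the doubt.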
Another useful feature of milter-sender for secondary MX servers, also available in the lighter-weight milter-ahead (was free, now requires payment), is to call ahead to the primary MX (if it's up; if not it will just accept the message anyway) and verify that the recipient address exists, and again reject the message if not. This is effective if your secondary MX just relays on to the primary based on domain name, without checking usernames (e.g. with LDAP).
Teergrubing (German for "tar pit") sounds like a great idea and is something I may investigate in future. The basic idea is to respond v-e-r-y--s-l-o-w-l-y to connecting mailservers, using multi-line SMTP responses with the lines sent just frequently enough to prevent the sending server from timing out and disconnecting. This consumes little bandwidth or resources on your server, but prevents the spamming server from closing the connection and sending spam to the next victim.
There are a number of variations on the theme. Some apply a delay to all servers before accepting the mail (e.g. applying a 60 second delay is unlikely to impact a legitimate server, but would cost a spammer several tens of message deliveries), some apply a variable slowdown based on various factors (e.g. how many recent connections have been made, trust level, etc.), but the one I prefer is to only apply the slowdown once you've positively identified the spammer by some other means (based on IP address, message content or whatever), and then to hold the connection open for as long as possible (up to some limit based on the confidence of your identification of the spammer) before rejecting the message.
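The mechanics of the slow response are simple enough to sketch in Python. In a multi-line SMTP reply every line except the last has a hyphen after the reply code, and the last has a space, so the sending server has to keep waiting until that final line arrives. The function names and parameters below are invented for illustration; this is not taken from any particular teergrube program.

```python
import time
from typing import Callable, Iterator


def tarpit_reply(code: int, text: str, lines: int) -> Iterator[str]:
    """Yield a multi-line SMTP reply: continuation lines, then the final line."""
    for _ in range(lines - 1):
        yield f"{code}-{text}\r\n"   # hyphen after the code: more lines follow
    yield f"{code} {text}\r\n"       # space after the code: this is the last line


def drip(send: Callable[[bytes], None], code: int, text: str,
         lines: int, pause_seconds: float) -> None:
    """Send the reply one line at a time, sleeping between lines."""
    for i, line in enumerate(tarpit_reply(code, text, lines)):
        if i:
            time.sleep(pause_seconds)
        send(line.encode("ascii"))
```

With a connected socket conn, something like drip(conn.sendall, 550, "message refused", 60, 30.0) would hold a connection open for roughly half an hour before delivering the final rejection. The pause needs to be shorter than the sender's timeout, which you can only guess at.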
Another variation is to install a teergrube program (e.g. this perl script) on machines that aren't used as mailservers - the idea is that no legitimate server would ever connect to these, so any connections are almost certainly from spammers who've found the server with a port scan, and can thus safely be teergrubed. Taking this one step further, LaBrea will simulate entire machines in your unused IP address space, and tarpit any connections to them.
You should be aware that spammers often ignore the primary MX when sending spam, as secondary MX servers often have less filtering in place. You should as far as possible replicate all of the protection you have on your primary MX on your secondary. If you can't do this, it's worth considering not using a secondary MX at all - instead rely on sending mail servers to retry any failures.
Paul Graham's article A Plan For Spam has introduced a great step forward in the effective and reliable filtering of spam, through the use of Bayesian statistics. This method calculates the characteristics of spam and non-spam messages from inputs you give it (it's based on the premise that words commonly used in spam are different from the words used in your day-to-day mail), and filters messages based on the characteristics it "learns". Various sources quote >99% of spam filtered with <0.1% false positives for a filter that has been "trained" with a few hundred messages, and my experience used to agree with this, although spammers have caught up somewhat in the last couple of years (primarily by using images, PDFs, and even MP3s rather than text, and padding the message with random text) and I'm currently getting very roughly 50% of spam filtered.
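The core of the idea fits in a short Python sketch. This is a simplified take on the approach in the article, not Bogofilter's algorithm or anyone's production code: the class name and tokeniser are invented, and several refinements from the article (such as weighting good mail more heavily) are left out. Each word gets a probability based on how often it has appeared in spam versus non-spam, and the most extreme word probabilities in a new message are combined into one score.

```python
import re
from collections import Counter
from typing import List


def tokens(text: str) -> List[str]:
    """Split text into lower-case word-like tokens."""
    return re.findall(r"[a-z0-9$'-]+", text.lower())


class TinyBayes:
    def __init__(self) -> None:
        self.spam_counts: Counter = Counter()
        self.ham_counts: Counter = Counter()
        self.n_spam = 0
        self.n_ham = 0

    def train(self, text: str, is_spam: bool) -> None:
        counts = self.spam_counts if is_spam else self.ham_counts
        counts.update(set(tokens(text)))   # count each word once per message
        if is_spam:
            self.n_spam += 1
        else:
            self.n_ham += 1

    def word_prob(self, word: str) -> float:
        """Estimated probability that a message containing `word` is spam."""
        spam_rate = self.spam_counts[word] / max(self.n_spam, 1)
        ham_rate = self.ham_counts[word] / max(self.n_ham, 1)
        if spam_rate + ham_rate == 0:
            return 0.4                     # never-seen word: treated as mildly innocent
        return min(0.99, max(0.01, spam_rate / (spam_rate + ham_rate)))

    def spam_prob(self, text: str, top: int = 15) -> float:
        """Combine the `top` most extreme word probabilities into one score."""
        probs = sorted((self.word_prob(w) for w in set(tokens(text))),
                       key=lambda p: abs(p - 0.5), reverse=True)[:top]
        spam = ham = 1.0
        for p in probs:
            spam *= p
            ham *= 1.0 - p
        return spam / (spam + ham)
```

After calling train() on a few hundred messages of each kind, spam_prob() returns a value close to 1 for mail that looks like your spam and close to 0 for mail that looks like your normal correspondence; anything above a threshold of your choosing gets filed as junk.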
On my mail server I use Bogofilter in conjunction with procmail to filter out incoming spam to a separate folder which I periodically check.
There are various end-user products which contain Bayesian filters. Mozilla Thunderbird is my mailreader of choice and I strongly recommend that you (yes, you personally!) ditch Outlook/Outlook Express in favour of that if you haven't already done so, for this feature and many other security and usability benefits - the switch is dead easy, and it imports all your Outlook settings and address books too. Note that the allegedly state-of-the-art spam filtering introduced in Outlook 2003 is worse than useless.
The only drawback to this approach (on servers) is that it is relatively computationally expensive, so is more costly to implement on systems that have large numbers of users or otherwise deal with large volumes of mail. The load can be reduced somewhat (at the cost of slightly more errors) by only updating the database when it makes a mistake, rather than updating it each time a message is received.
SpamAssassin is another server-based filtering method with an excellent reputation (although I haven't used it myself). It is highly configurable with hundreds of rules (including a Bayesian rule) against which an incoming mail is checked, and each one has a score associated with it. Thus, each incoming mail gets a score for its likely 'spamminess' and can be deleted/filtered, accepted, etc. according to this score.
If you get a spam, you should try to complain. Spamming is against almost all ISPs' terms of service, and most will terminate the accounts of anyone who sends spam from their systems. Whilst most professional spammers (about 200 of them, worldwide, are responsible for 80% of the spam you get!) accept this as an occupational hazard and move from ISP to ISP as their accounts are terminated, it does make it much more difficult and expensive for them. Additionally, many of the servers that are used to send spam are used without the owner's knowledge, so you're doing them a favour by letting them know. Finally, large numbers of complaints make life more difficult for the few ISPs who do tolerate spammers, and hopefully will get them to change their minds (or at least raise the prices they charge to spammers!).
For most people, finding out where a spam actually came from (and therefore where the complaint should be sent) is very difficult, and to make matters worse spammers almost always try to forge the information about where the mail originated (hint - it's almost never from the person shown in the 'From' address - this information is easily forged).
So, the solution to this is SpamCop. Once you've registered (free), you can report spam either by e-mail or a web-based form, and they will complain on your behalf to the correct people - the spammer's ISP, server administrators, company hosting their web site, etc. Highly recommended, and quite satisfying!
In my home setup, I have a script to make reporting even easier - I just drop spam into a folder on the server; the script picks it up from there and sends it on to SpamCop and other places as appropriate.
One way to waste spammers' resources is to poison their lists - try to fill them with non-working addresses, and otherwise generally interfere with their programs which crawl the web looking for new addresses. I have a script, wpoison-gt, which does this - just install it on your web site and add a link to it.
An approach which goes much further is Project Honey Pot - you can install their script (or just a link) on your web site and it also issues random e-mail addresses, except that they really work - they are tracked from address harvesting to spam delivery, and the information used to build legal cases against spammers, amongst other things. If you're a little more technically-minded, you can donate use of a subdomain for the random e-mail addresses - this doesn't impact your existing use of the domain. Well worth a visit - they really do make it easy for you to contribute, and there are plenty of stats on how well you're doing if that floats your boat.
Firstly, you should get a virus scanner. There are a number of free ones; the ones I've used include GriSoft AVG (free for personal use) and ClamAV. You can filter viruses at the e-mail server level in conjunction with any number of programs (I've used amavisd-new, MailScanner and clamav-milter in various different configurations) and most end-user anti-virus programs can be configured to scan incoming and outgoing mail too.
Make sure your mail server is configured so that only legitimate users can send outgoing e-mail through it. Check with an open relay tester.
Make sure your machine is secure, so it doesn't become infected with a mail proxy which spammers then use (or any other nastyware!). Keep your operating system up to date - especially Windows users (use Windows Update). Don't use Internet Explorer - it's very prone to allowing web sites to install all sorts of bad software on your computer (I can't emphasise this enough - switch to Firefox immediately! It's dead easy!). Make sure your machines are firewalled (Personal firewall day is an excellent intro for the non-technical).