Combatting Spam using Certificates of Approval - Draft v1.0 of 03/14/03

Many solutions to the problem of spam have been proposed, including filtering (for keywords, by Bayesian statistical analysis), blacklists (of offending mailservers), and even putting digital “stamps” on email.

Each proposal has its merits, but also has drawbacks. For example, the infrastructure requirements for digital stamps make it difficult to deploy them broadly, while filtering/blacklist systems suffer from the dreaded false positive (your boss forwards you a Nigerian scam email saying he's going to invest unless you think it is a bad idea, and the mail gets deleted as spam! Oops. Your Xmas bonus is in Lagos…)

After considering the various methods (and building my own spamfilter), I've come up with another method of dealing with spam. While not perfect, it is very complementary to the other popular systems; in particular, it would work well with aggressive Bayesian filtering while reducing false positives to almost nothing.

The basic concept is a “Good Emailing Seal of Approval” using an encryption-based reputation system.

Email senders (end users, ISPs, whatever) go to a central registration organization and buy a Seal of Approval. Maybe it costs them $50. The certificate can be anonymous if that's desired, though legit emailers will want to provide information about themselves. The money goes to pay for the registrar's services, with any extra being used to hound spammers in court.

OK, let's say Alice wants to email Bob. Every machine along the mail transport path that is part of the scheme (not all have to be) tags the email with an X-header that uses its certificate to encrypt its own IP address, the IP address it received the email from, the date/time, the message ID and the original sending email address. Effectively, it's a certified version of the Received: line. It might look like this:

X-CERTIFIED-EMAIL: [email protected] [string of characters]

A mailserver along the path could also certify the email:

X-CERTIFIED-EMAIL: [string of characters]

Or an ISP could certify all of the mail coming out of their domain using a global certificate:

X-CERTIFIED-EMAIL: [string of characters]
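As a rough illustration of how a participating machine might produce and check such a header, here is a minimal sketch. It makes several assumptions that are not part of the proposal: the certificate is stood in for by a shared secret and an HMAC tag (a real deployment would more likely use a public-key signature, so that anyone could verify without holding a secret), the fields are carried as readable JSON rather than encrypted, and the names `certify`, `verify` and the field names are hypothetical.

```python
import base64
import hashlib
import hmac
import json

HEADER_NAME = "X-CERTIFIED-EMAIL"


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).decode("ascii")


def _tag(cert_id: str, secret: bytes, payload: bytes) -> bytes:
    # Bind the tag to the certificate ID as well as to the certified fields.
    return hmac.new(secret, cert_id.encode("utf-8") + b"\n" + payload,
                    hashlib.sha256).digest()


def certify(cert_id: str, secret: bytes, fields: dict) -> str:
    """Build a header line: name, certificate ID, then an opaque token."""
    payload = json.dumps(fields, sort_keys=True).encode("utf-8")
    return f"{HEADER_NAME}: {cert_id} {_b64(payload)}.{_b64(_tag(cert_id, secret, payload))}"


def verify(header_line: str, secret: bytes):
    """Return the certified fields if the tag checks out, else None."""
    name, _, value = header_line.partition(": ")
    cert_id, _, token = value.partition(" ")
    payload_b64, _, tag_b64 = token.partition(".")
    if name != HEADER_NAME or not tag_b64:
        return None
    try:
        payload = base64.urlsafe_b64decode(payload_b64.encode("ascii"))
        tag = base64.urlsafe_b64decode(tag_b64.encode("ascii"))
    except ValueError:  # bad base64 or non-ASCII input
        return None
    if not hmac.compare_digest(tag, _tag(cert_id, secret, payload)):
        return None
    return json.loads(payload)


if __name__ == "__main__":
    fields = {
        "server_ip": "192.0.2.10",          # this machine
        "received_from_ip": "192.0.2.99",   # previous hop
        "timestamp": "2003-03-14T12:00:00Z",
        "message_id": "<1234@example.com>",
        "sender": "alice@example.com",
    }
    print(certify("cert-alice", b"demo-key", fields))
```

Because the fields travel inside the token, the registrar (or the certificate's owner) can recover them later when a message is reported.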

When Bob gets the email, his mailreader can now look at these headers (and, indeed, intermediate mailservers can do the same if they want to). There might be one from Alice's mailreader and one from her ISP's mailserver that she contacted to send the mail. Or perhaps Alice doesn't have a certificate yet, but her ISP does (or vice-versa). It doesn't really matter.

Bob can now use those certificates to do a lookup on the central registration org (perhaps using a DNSBL-type mechanism) to see if they are valid, and get an idea of the reputation of the sender (and the path to him). He can use that, in conjunction with other spam filters, to decide what to do with the email. It's just another data point.
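The DNSBL-type lookup could be sketched roughly as follows. Conventional DNSBLs answer with an address in 127.0.0.x when an entry is listed and with "no such name" otherwise; everything else here is an assumption made for illustration: the zone `certs.registrar.example`, hashing the certificate ID into a DNS label, and the meaning assigned to each final octet.

```python
import hashlib
import socket

REGISTRAR_ZONE = "certs.registrar.example"  # hypothetical registrar zone

# Assumed meaning of the last octet of a 127.0.0.x answer.
VERDICTS = {1: "unknown", 2: "good", 3: "spammer"}


def query_name(cert_id: str, zone: str = REGISTRAR_ZONE) -> str:
    # A DNS label is limited to 63 bytes, so use a truncated hex digest.
    label = hashlib.sha256(cert_id.encode("utf-8")).hexdigest()[:32]
    return f"{label}.{zone}"


def interpret(answer):
    """Map the registrar's answer (an IPv4 string or None) to a verdict."""
    if answer is None:
        return "not-registered"
    octets = answer.split(".")
    if len(octets) != 4 or octets[:3] != ["127", "0", "0"] or not octets[3].isdigit():
        return "invalid-answer"
    return VERDICTS.get(int(octets[3]), "invalid-answer")


def lookup(cert_id: str, resolve=socket.gethostbyname) -> str:
    """Ask the registrar about a certificate; `resolve` is injectable for testing."""
    try:
        answer = resolve(query_name(cert_id))
    except OSError:  # socket.gaierror (e.g. no such name) is a subclass of OSError
        answer = None
    return interpret(answer)


if __name__ == "__main__":
    # Fake resolver so the example runs without a real registrar.
    print(lookup("cert-alice", resolve=lambda name: "127.0.0.2"))
```

Riding on DNS means the answers are cached by ordinary resolvers, which is a large part of why the DNSBL pattern is attractive for this kind of check.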

How is the reputation generated? Easy. Bob helps do that. If he puts an email into a special spam folder (or perhaps there's a special “report these emails as spam” command), his mailreader will tell the registrar he thinks Alice is a spammer. If he reads the email but doesn't report it as spam, his mailreader will tell the registrar he thinks Alice is OK.

Very quickly, the registrar will be able to tell that Alice is spamming. At which point, the word is out on Alice. Sure, she can buy another certificate. Great. Let her waste her money and support the legal harassment of her fellow spammers (because 99% of the time, by the time Bob downloads one of her spams using the latest certificate she bought, she'll already be tagged as a spammer).
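One way the registrar's bookkeeping might look is sketched below. The proposal does not say how reports are turned into a reputation, so the details are assumptions: each rater certificate gets one vote per sender (the latest report wins), and the score is the smoothed fraction of "OK" votes, so a certificate nobody has rated yet sits at 0.5. The class and method names are hypothetical.

```python
from collections import defaultdict


class ReputationLedger:
    """Tallies spam / not-spam reports per sender certificate."""

    def __init__(self):
        # sender certificate -> {rater certificate -> True if reported as spam}
        self._votes = defaultdict(dict)

    def report(self, sender_cert: str, rater_cert: str, is_spam: bool) -> None:
        # One vote per rater: a repeat report replaces the earlier one.
        self._votes[sender_cert][rater_cert] = is_spam

    def score(self, sender_cert: str) -> float:
        """Smoothed fraction of raters who consider the sender OK (0.0 to 1.0)."""
        votes = self._votes.get(sender_cert, {})
        spam = sum(1 for is_spam in votes.values() if is_spam)
        ok = len(votes) - spam
        return (ok + 1) / (ok + spam + 2)


if __name__ == "__main__":
    ledger = ReputationLedger()
    for rater in ("bob", "carol", "dave"):
        ledger.report("cert-alice", rater, is_spam=True)
    print(ledger.score("cert-alice"))    # three spam reports, no OK reports
    print(ledger.score("cert-unrated"))  # no reports at all
```

Keying votes by the rater's certificate is what keeps a single angry recipient from reporting the same sender a thousand times.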

Note some of the interesting side effects of this system:

  • Mailing Lists are not affected.
  • No infrastructure costs inflicted on those who don't want to use it.
  • Doesn't depend on everyone rating their emails.
  • Hard to abuse. If you're a legit mailer (like, say, Amazon) or mailing list operator, even a big group of people who decide they don't like your politics can't overwhelm the even bigger group of people who think you're legit. And since every rater has their own certificate, they'll self-select for good people. Very few people will pay $ to be a jerk.
  • ISPs can vouch for their customers, so the customers don't have to buy certificates themselves. ISPs won't want their certificates to get a bad rap, so they'll police their users better. It won't cost good ISPs anything; their good users will overwhelm the bad ones. But note all that info in the header. That all gets reported to the registrar when an email is reported as spam. And if the registrar turns it over to the owner of the certificate, now the ISP knows which one of his customers is being a jerk, and can cut him off at the knees.
  • It permits – if the sender so desires – the sender's identity to be a bit more verifiable than currently possible, in particular if certificates were available (for more $) that had some documentation on the owner (such as is done with SSL certs [yeah, I know that's not perfect]). It also would provide, as a side effect, a way to do digital change-of-email-address lookups.
  • It provides a method of doing whitelisting.
  • It provides a method for detecting spoofed headers.
  • It provides an extra data point useful when aggressively filtering spam.
  • To get the ball rolling, it only requires a couple of the major email application vendors to get on the bandwagon.
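To make the "extra data point" idea concrete, here is a small sketch of how a mailreader might combine a certificate's reputation with an aggressive Bayesian filter. The thresholds, the 0-to-1 reputation scale and the function name `classify` are all assumptions chosen for illustration; the proposal does not prescribe a particular rule.

```python
def classify(bayes_spam_prob: float, reputation=None,
             trusted: float = 0.9, distrusted: float = 0.2,
             aggressive: float = 0.5) -> str:
    """Return "deliver" or "spam".

    bayes_spam_prob: the content filter's spam probability (0.0 to 1.0).
    reputation: the registrar's score for the best certificate on the
        message (0.0 to 1.0), or None if the message carries none.
    """
    if reputation is not None:
        if reputation >= trusted:
            return "deliver"   # whitelisted by reputation, whatever the filter says
        if reputation <= distrusted:
            return "spam"      # a known-bad certificate overrides clean-looking content
    # No certificate or a middling one: fall back to the aggressive content filter.
    return "spam" if bayes_spam_prob >= aggressive else "deliver"


if __name__ == "__main__":
    # The boss forwarding a scam: filter screams, certificate is trusted.
    print(classify(0.99, reputation=0.95))
    # Uncertified mail that merely looks a bit spammy.
    print(classify(0.60))
```

Since trusted certificates bypass the content filter, the filter's threshold for everything else can be set far lower than would be safe on its own.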

So when that email from your boss comes in asking whether he should invest in an exciting Nigerian banking opportunity, even though your Bayesian filter and your keyword filter are going TILT!, the fact that it appears to be coming from a trusted source means you'll probably read it, and save the company. Which means you can fly to sunny Lagos using your bonus!

This proposal was inspired by reading the Slashdot article ISP Operator Barry Shein Answers Spam Questions, in which Barry mentions the idea of stamps on email. My thanks to Barry and all the participants in that discussion.