Spamfighting: Keeping spammers from using my domain

We all dislike email spam. It clogs up our inboxes, and causes good
engineers to spend way too much time creatively blocking or filtering
it out, while spammers creatively work to get around the blocks. In
my personal life, the spammers are winning. (My employer, Dell, makes
several security and spam-fighting products. I’m not using them for
my personal domains, so this series is not related to Dell products in
any way.)

I recently came across DMARC, the Domain-based Message Authentication,
Reporting & Conformance specification. One feature of DMARC is that it
allows mail receivers, after processing a given piece of mail, to inform
an address at the sending domain of that mail’s disposition: passed,
quarantined, or rejected. This is the first such feedback method I’ve
come across, and it seems to be gaining traction. Furthermore,
services such as dmarcian.com have popped up to act as DMARC report
receivers, which then display your aggregate results in a clear
manner.

A DMARC-compliant sending domain does three things. 1) It publishes a
valid Sender Policy Framework (SPF) record. 2) It signs mail using
DomainKeys Identified Mail (DKIM). These two are best practices now,
in place at millions of domains. 3) It publishes its DMARC policy
recommendation: what an inbound mail server should do if a message
purporting to be from the domain fails both SPF and DKIM checks. The
policies today are “none” – do nothing special; “quarantine” – treat
the message as suspect, perhaps applying additional filtering or
sending it to a spam folder; and “reject” – refuse the message
immediately, typically during the SMTP transaction, so the sending
server sees the failure.
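
To make the policy recommendation concrete: it is published as a DNS TXT record of semicolon-separated tag=value pairs. The following is a minimal sketch of reading one, assuming a hypothetical record for example.com; the record contents and the function name are illustrative and not taken from any real domain.

```python
def parse_dmarc_record(txt: str) -> dict:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags = {}
    for part in txt.split(";"):
        part = part.strip()
        if not part:
            continue  # tolerate a trailing semicolon
        name, _, value = part.partition("=")
        tags[name.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


# Hypothetical record, as it might be published at _dmarc.example.com
record = "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"
policy = parse_dmarc_record(record)
print(policy["p"])    # the requested policy
print(policy["rua"])  # where aggregate reports should be sent
```

The `p` tag carries the none/quarantine/reject choice, and `rua` names the address that receives aggregate reports.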

A DMARC-compliant inbound mail server validates each incoming message
in two ways: it checks the message against the domain’s SPF record,
and it verifies the DKIM signature. The server then follows the policy
suggested by the sending domain (none, quarantine, or reject) and
reports the results of its actions back to the purported sending
domain daily.
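
The receiver's decision can be sketched in a few lines. This is a simplification, not a real mail filter: the two booleans are assumed to already include DMARC's requirement that the SPF or DKIM domain align with the From: domain, and the function name and the "deliver" label are made up for illustration.

```python
def action_for_message(spf_pass: bool, dkim_pass: bool, policy: str) -> str:
    """Return what a receiver would do with one message.

    spf_pass and dkim_pass are assumed to already include the alignment
    check against the From: domain; policy is the sender's published
    "none", "quarantine", or "reject".
    """
    if policy not in ("none", "quarantine", "reject"):
        raise ValueError(f"unknown policy: {policy!r}")
    # One passing mechanism is enough; only a double failure triggers policy.
    if spf_pass or dkim_pass:
        return "deliver"
    return "deliver" if policy == "none" else policy


print(action_for_message(False, True, "reject"))       # DKIM passed
print(action_for_message(False, False, "quarantine"))  # both failed
```

Either check passing is sufficient, which is why a domain can survive losing one of the two along the way but not both.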

I’ve been publishing SPF records for my personal and community
organization domains for several years, in hopes this would cut down
on spammers pretending to be from my domains. I recently added DKIM
signing, the next step in the process. With these two in place,
publishing a DMARC policy is very straightforward. So I did this,
publishing a “none” policy – just send me reports. And within a few
days, I started getting reports back, which I sent to dmarcian.com for
analysis.
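
Aggregate reports arrive as XML, so they can also be tallied by hand. Below is a minimal sketch, assuming a heavily trimmed report with only the row fields needed here; the sample data (documentation IP addresses and counts) is invented, and real reports carry more metadata and may use XML namespaces that this sketch does not handle.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Invented sample, trimmed to the fields used below.
SAMPLE_REPORT = """<feedback>
  <record>
    <row>
      <source_ip>192.0.2.10</source_ip>
      <count>12</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>pass</dkim>
        <spf>pass</spf>
      </policy_evaluated>
    </row>
  </record>
  <record>
    <row>
      <source_ip>198.51.100.7</source_ip>
      <count>340</count>
      <policy_evaluated>
        <disposition>none</disposition>
        <dkim>fail</dkim>
        <spf>fail</spf>
      </policy_evaluated>
    </row>
  </record>
</feedback>
"""


def tally(report_xml: str) -> Counter:
    """Count messages that passed at least one check versus neither."""
    totals = Counter()
    root = ET.fromstring(report_xml)
    for row in root.iter("row"):
        count = int(row.findtext("count", default="0"))
        dkim = row.findtext("policy_evaluated/dkim")
        spf = row.findtext("policy_evaluated/spf")
        passed = dkim == "pass" or spf == "pass"
        totals["authenticated" if passed else "unauthenticated"] += count
    return totals


totals = tally(SAMPLE_REPORT)
print(totals["authenticated"], totals["unauthenticated"])
```

Grouping the same rows by `source_ip` instead would give a per-sender view of who is claiming to be the domain.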

What did I find?

On a usual day, my personal domain, used by myself and family members,
sends maybe a hundred total emails, as reported by DMARC-reporting
inbound servers. My community org domains may send 1000-2000 emails a
day, particularly if we have announcements to everyone on our lists.
That seems about right.

In addition, spammers, mostly in Russia and other parts of Asia, are
sending 20,000 to 40,000 or more spam messages pretending to come from
my personal domain, again as reported by DMARC-reporting inbound
servers. Hotmail’s servers are kindly sending me a report for each
failed message they process that claims to be from me – a steady
stream of ~3600/day. No other DMARC servers have sent me such forensic
data yet.

[Figure: Spam source by country for the last week]

For several days, I experimented with a DMARC policy of “quarantine”,
with various small percentages from 10 to 50 percent. And sure
enough, dmarcian reports that the threat/spam mails were in fact
quarantined. It was really cool to wake up in the morning, check the
overnight results, and see the threat/spam graphs show half of the
messages being quarantined. It’s working!
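
The percentage is set with the `pct` tag in the DMARC record. A minimal simulation of how a receiver might sample failing messages is sketched below; the function name and the seeded random generator are illustrative choices, and the fallback for unsampled messages follows the DMARC specification (RFC 7489), where the next less strict policy applies.

```python
import random


def apply_policy(policy: str, pct: int, rng: random.Random) -> str:
    """Decide what happens to one message that failed DMARC."""
    if rng.random() * 100 < pct:
        return policy  # sampled: the published policy applies
    # Not sampled: fall back to the next less strict policy.
    return {"reject": "quarantine", "quarantine": "none"}.get(policy, "none")


rng = random.Random(0)  # fixed seed so the run is repeatable
results = [apply_policy("quarantine", 50, rng) for _ in range(10_000)]
print(results.count("quarantine") / len(results))  # roughly half
```

With `pct=50`, about half of the failing messages are quarantined and the rest are treated as under “none”, which matches the graphs described above.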

However, dmarcian also reported that some of my legitimate emails,
originating from my servers and DKIM-signed, were also getting
quarantined. What? That wasn’t what I hoped for.

It turns out that authentic messages were in fact being forwarded –
some by mailing lists, some by individuals setting up forwarding from
one inbound mail address to another. I can’t do anything about
either.

This isn’t a new problem – it’s the Achilles heel of SPF, which DMARC then inherits. Forwarding email through a mailing list typically makes subtle yet valid changes while keeping the From: line the same.

The Subject: line may get a [listname] prepended to it. The body may
get a “click here to unsubscribe” footer added to it. These
invalidate the original DKIM signature. The list may strip out the
original DKIM signature. And of course, it remails the message
outbound using its own server name and IP, which then causes it to
fail SPF tests.
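
Why a footer is enough to break DKIM: the signature covers a hash of the message body, so any added text changes the hash. The sketch below illustrates this with a simplified version of DKIM's “simple” body canonicalization (trailing empty lines ignored, lines ending in CRLF). It is not a DKIM verifier, and the message text and function name are invented.

```python
import base64
import hashlib


def body_hash(body: str) -> str:
    """SHA-256 body hash in the style of DKIM 'simple' canonicalization."""
    lines = body.replace("\r\n", "\n").split("\n")
    # Empty lines at the end of the body are ignored.
    while lines and lines[-1] == "":
        lines.pop()
    # Every remaining line ends in CRLF; an empty body becomes one CRLF.
    canonical = "".join(line + "\r\n" for line in lines) or "\r\n"
    digest = hashlib.sha256(canonical.encode("utf-8")).digest()
    return base64.b64encode(digest).decode("ascii")


original = "See you at practice on Saturday.\r\n"
forwarded = original + "--\r\nClick here to unsubscribe\r\n"

print(body_hash(original) == body_hash(original + "\r\n\r\n"))  # extra blank lines are harmless
print(body_hash(original) == body_hash(forwarded))              # the footer changes the hash
```

A changed Subject: line breaks the signature in the same way when that header is among the signed ones.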

Sure, there are suggested solutions, like getting everyone to use
Sender Rewriting Scheme (SRS) when remailing, and fixing Mailman and
every other mailing list manager. Wake me when all the world’s email
servers have added that; I will have been dead a very, very long time.

So I switched back to policy “none” and keep getting the reports,
aggravated that there’s nothing I can directly do to protect the good
name of my domains. It’s hard both knowing the size of the problem and
knowing we have no technological method of solving it today. Food for
thought.

In part 2 of this series, I will describe my system setup for using
the above techniques.

Do you use SPF? Do you use DKIM? Do you publish a DMARC policy? If so, what has your experience been? Leave comments below.

F/OSS for querying publicdata.com?

Has anyone written free/open source software to use the publicdata.com query API? I’ve got several dozen coaches we need to do background checks on, and we use publicdata.com. Turns out our folks have been doing this manually for years. Seems like something that could be easily automated, but I haven’t found any software, open or otherwise, to do so.

s3tools / s3cmd needs a new maintainer

As posted to the s3tools-general mailing list, s3tools maintainer Michal Ludvig is looking for new maintainers to step up to continue the care and feeding of the s3tools / s3cmd application.  s3cmd is widely used, on both Linux and Windows, to publish and maintain content in the Amazon Web Services S3 storage system and CloudFront content distribution network.

I use s3cmd for two primary purposes:

  1. as Fedora Mirror Wrangler, I use it within Fedora Infrastructure to maintain mirrors within S3 in each region for the benefit of EC2 users running Fedora or using the EPEL repository on top of RHEL or a derivative.  Fedora has mirrors in us-east-1, us-west-1 and -2, and eu-west-1 right now, and may add the other regions over time.
  2. for my own personal web site, I offload storage of static historical pictures and movies so that they are served from economical storage and not consuming space on my primary web server.

I congratulate Michal for recognizing when he no longer has the time to commit to regular maintenance of such an important project, and for beginning to look for contributors who can carry out that responsibility more effectively.  While I’ve submitted a few patches in support of the Fedora Infrastructure mirror needs, I know that I don’t have the time to take on that added responsibility right now either.

If you use s3cmd, or have contributed to s3cmd, and feel you could make the time commitment to be the next maintainer, you’ll find an active contributor base and dedicated user base to help you move the project forward.

SELinux on a Rackspace Cloud Server

After a long time hosting my personal web site at WestHost, I finally decided to move it to another cloud provider – a Rackspace Cloud Server.  This move gives me a chance to run Fedora 16, as I do everywhere at home, and which is more than capable of serving a few light-traffic domains, personal mail and mailing lists, and email for our neighborhood youth basketball league.

One thing that surprised me, though, was that the default Fedora 16 image provided by Rackspace in their smallest configuration (256MB RAM, 10GB storage) had SELinux disabled, and no selinux-policy package installed.  Being a big fan of Mark Cox’s work reporting on vulnerabilities in RHEL, and Josh Bressers’ work leading the Fedora Security Response Team, it just didn’t feel right running an internet-facing Fedora server without having SELinux enabled.

This was easily enough resolved by installing the selinux-policy-targeted package, editing /etc/grub.conf to remove selinux=0 from the kernel command line, enabling the configuration in /etc/selinux/config, and restarting the server.  After a few minutes of autorelabeling, all is well and good.
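
As a quick sanity check after those edits, the two settings can be read back together. The sketch below works on sample text; the sample kernel command line and config contents are invented for illustration, and on a real server the inputs would come from /proc/cmdline and /etc/selinux/config.

```python
def selinux_mode(config_text: str, kernel_cmdline: str) -> str:
    """Report the SELinux mode implied by the config file and kernel args."""
    # selinux=0 on the kernel command line disables SELinux regardless
    # of what the config file says.
    if "selinux=0" in kernel_cmdline.split():
        return "disabled"
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith("SELINUX="):  # does not match SELINUXTYPE=
            return line.split("=", 1)[1].strip()
    return "unknown"


# Invented sample inputs
config = "# SELinux configuration\nSELINUX=enforcing\nSELINUXTYPE=targeted\n"
before = "ro root=/dev/xvda1 selinux=0"
after = "ro root=/dev/xvda1"

print(selinux_mode(config, before))
print(selinux_mode(config, after))
```

The check on the kernel arguments comes first because a leftover selinux=0 would silently override an otherwise correct config file.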

I’m sure SELinux can get in the way of some application deployments.  It’s easiest for Rackspace to keep it disabled, letting experienced folks like myself enable it if they want.  I would have preferred it to be enabled by default, as there’s always the option to disable it later or run in permissive mode.

Because I run a few mailing lists using mailman, across multiple domains, I of course wanted to run several separate instances of mailman, one for each domain.  Fedora has a SELinux-aware mailman package just a quick yum install away.  The problem is, the SELinux file context rules are written expecting only one instance of mailman per server.  That’s when I remembered a recent blog post by Dutch where he had patched the mailman spec and config files to build separate mailman-${sitename} RPMs, each with their own correct SELinux contexts.  Very cool, and exactly what I needed.  Well, almost – he did his work on EL6, I’m running Fedora 16, but close enough (see his blog comments for the few changes necessary on F16).  Thanks to Dutch, I’ve got a fully SELinux-secured web and mail server with separate mailman instances for each domain.

Next time you build a Rackspace Cloud Server running Fedora, take an extra couple minutes and enable SELinux.  The site you save may be your own!

FUDCon Blacksburg videos

I shot videos of several of the presentations at the Fedora User and Developer Conference yesterday.  For your viewing pleasure:

  • “State of Fedora” from the Fedora Project Leader, Jared Smith [ogg]
  • Mike McGrath, team lead for OpenShift, demoing OpenShift [ogg]
  • Jon Masters and Chris Tyler, on the ARM architecture in Fedora [ogg]. ARM is a secondary architecture today.  By Fedora 18, with your help, it needs to become a primary architecture.
  • David Nalley presented on CloudStack, which is aiming for Fedora 17 inclusion. [ogg]
  • Dan Prince and Russell Bryant giving an introduction to OpenStack [ogg]
  • Mo Morsi presenting the Aeolus cloud management project [ogg]

[Update 1/18/2012] I was able to upload all the videos to YouTube.  http://www.youtube.com/playlist?list=PL2BAA7FF83E6482C2
is a playlist with all 6.

Northwest Austin Youth Basketball registration

Northwest Austin Youth Basketball Association (NWAYBA) registration deadline is only 3 weeks away.  Register your 1st grader through High School player, and join us on the courts this Fall.  Registration forms must be postmarked by October 16, but I’d appreciate it if you’d mail them sooner.  Somehow I got roped onto the NWAYBA Board, as the Registrar.  We’re expecting 400 players again this year. I’d prefer not to deal with 300 applications in the last week.

Central Texas Wildfire Relief Food and Goods Drive

On Saturday, September 24th, the boys of Cub Scout Pack 2 will be in the Doss Teachers’ Parking lot from 8am until 12pm for final collections before delivering the food and goods to those in need. Firefighters with a Fire Engine from the City of Austin Fire Department will join us at Doss from 9-11am, work permitting.

Please demonstrate the generosity and caring of our neighborhood by joining Cub Scout Pack 2 in collecting the following needed items for those impacted by the wildfires of the past several weeks:

  • Canned food items
  • Dishwashing soap
  • Diapers and wipes (all sizes)
  • Hand sanitizer
  • Eye drops
  • Bandages
  • Neosporin/triple antibiotic cream
  • Gift cards or cash, which will be converted to HEB gift cards

Boys from Pack 2 have been and will be in uniform at Doss each day September 19th through the 23rd to collect your donations. So far several truckloads of items have been collected.

Google Voice: Why do I need a home phone?

For the past 3 months I’ve been using Google Voice, and I must say, I like it.  But I’m not exactly using it as intended.

I’ve had the same home phone number for 10 years.  A lot of people have that number.  Not a lot of people call it (what that says about my popularity I don’t really want to know), and we don’t make that many outgoing calls a month, but the thought of changing it everywhere is daunting.  More so for anyone with a number for even longer.  I’ve started doing so, but only opportunistically.

What to do?  I don’t want to give up my home number, and I can’t yet transfer my number to Google Voice.  And in theory, I get a discount on my phone/cable/internet by having all three; they’d charge even more for having just two.

My trick?  Time Warner offers unlimited free call forwarding.  So, my home number forwards to GV.  GV then forwards to my cell phone, email, Celeste’s cell phone, etc.  I dropped the voicemail from TW, as now GV takes care of that.  And I can drop the long distance with TW and use GV for that too.  Everything works great.

At some point, when I can transfer my home number to GV and have two numbers for the account (old home number and new GV number I’ve been giving out), and if TW’s rates change again so it’s cheaper to drop their phone service, I will.  Or they will get enough competition to realize that for a couple dozen calls a month, charging $$ for phone service won’t work and they just throw it in for free.  Here’s to hoping.

How many social networking sites do you use?

Daily I look at Facebook, Twitter, Identi.ca, LinkedIn, Yammer, and probably a few more I use far less often.  But what do you do when a friend or colleague invites you to use YASNS (yet another social networking site)?

I’ve been sitting on an invite to Namyz for ages.  It was sent by a friend of mine, and I don’t want to snuff his enthusiasm for this particular site, but I don’t really want yet another of these to keep track of.  Same goes with Plaxo.  I’m sure they’re wonderful, but really, how many can one person be expected to use?  What’s the etiquette for saying “no thanks to Namyz, but if you’d care to send me a LinkedIn invite, I’d accept that?”