Spamfighting: Keeping spammers from using my domain

We all dislike email spam. It clogs up our inboxes, and causes good
engineers to spend way too much time creatively blocking or filtering
it out, while spammers creatively work to get around the blocks. In
my personal life, the spammers are winning. (My employer, Dell, makes
several security and spam-fighting products. I’m not using them for
my personal domains, so this series is not related to Dell products in
any way.)

I recently came across DMARC, the Domain-based Message Authentication,
Reporting & Conformance specification. One feature of DMARC is that it
allows mail receivers, after processing a given piece of mail, to inform
an address at the sending domain of that mail’s disposition: passed,
quarantined, or rejected. This is the first such feedback method I’ve
come across, and it seems to be gaining traction. Furthermore, services
such as dmarcian have popped up to act as DMARC report receivers, which
then display your aggregate results in a clear dashboard.

For outbound mail, a DMARC-compliant domain does several useful things:
1) it publishes a valid Sender Policy Framework (SPF) record, and 2) it
signs its mail using DomainKeys Identified Mail (DKIM). Both are best
practices already in place at millions of domains. In addition, the
domain publishes its DMARC policy recommendation: what an inbound mail
server should do if a message purporting to be from the domain fails
both the SPF and DKIM checks. The policies today are “none” (do nothing
special), “quarantine” (treat the message as suspect, perhaps applying
additional filtering or sending it to a spam folder), and “reject”
(reject the message immediately, sending a bounce back to the sender).
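
For illustration, here is roughly what the SPF and DKIM halves look like in DNS for a hypothetical example.com domain (the selector name, IP address, and truncated key below are made up), queried with dig:

# SPF record: lists the hosts allowed to send mail for the domain
dig +short TXT example.com
# "v=spf1 mx ip4:192.0.2.10 -all"

# DKIM record: the public key for a given selector (here "mail")
dig +short TXT mail._domainkey.example.com
# "v=DKIM1; k=rsa; p=MIGfMA0GCSq...AQAB"

The DMARC policy itself lives in a third TXT record, at _dmarc.<domain>; I show the one I publish further below.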

A DMARC-compliant inbound mail server validates each incoming message
in two ways: it checks the sending server against the purported
sender’s SPF record, and it verifies the DKIM signature. The server
then follows the policy suggested by the sending domain (none,
quarantine, or reject) and, furthermore, reports the results of its
actions back to the purported sending domain daily.
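
Those results usually also show up in an Authentication-Results header on the delivered message. A rough example of what a receiver might add (the host and domain names are made up):

Authentication-Results: mx.receiver.example; spf=pass smtp.mailfrom=example.com;
    dkim=pass header.d=example.com; dmarc=pass header.from=example.com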

I’ve been publishing SPF records for my personal and community
organization domains for several years, in the hope that this would cut
down on spammers pretending to be from my domains. I recently added
DKIM signing, the next step in the process. With these two in place,
publishing a DMARC policy is very straightforward. So I did this,
publishing a “none” policy – just send me reports. And within a few
days, I started getting reports back, which I sent to dmarcian for analysis.
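
For reference, a “none” policy with aggregate reporting is just one more TXT record. A minimal sketch for a hypothetical example.com (the report address is made up); the rua tag is where receivers send their daily aggregate reports:

dig +short TXT _dmarc.example.com
# "v=DMARC1; p=none; rua=mailto:dmarc-reports@example.com"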

What did I find?

On a usual day, my personal domain, used by myself and family members,
sends maybe a hundred total emails, as reported by DMARC-reporting
inbound servers. My community org domains may send 1000-2000 emails a
day, particularly if we have announcements to everyone on our lists.
That seems about right.

In addition, spammers, mostly in Russia and elsewhere in Asia, are
sending upwards of 20,000-40,000+ spam messages pretending to come from my
personal domain, again as reported by DMARC-reporting inbound
servers. Hotmail’s servers are kindly sending me a report for each
failed message they process that claims to be from me – a steady
stream of ~3600/day. No other DMARC servers have sent me such forensic
data yet.


[Chart: spam source by country for the last week]

For several days, I experimented with a DMARC policy of “quarantine”,
at various percentages from 10 to 50 percent. And sure enough,
dmarcian reported that the threat/spam mails were in fact being
quarantined. It was really cool to wake up in the morning, check the
overnight results, and see the threat/spam graphs show half of the
messages being quarantined. It’s working!
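
The percentage experiments are driven by the pct tag in the DMARC record. A sketch of the sort of record I was publishing during that period, again for a hypothetical domain and report address:

dig +short TXT _dmarc.example.com
# "v=DMARC1; p=quarantine; pct=25; rua=mailto:dmarc-reports@example.com"

Receivers apply the requested policy to roughly that percentage of the messages that fail both checks, and treat the rest under the next weaker policy (“none”, in this case).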

However, dmarcian also reported that some of my legitimate emails,
originating from my servers and being DKIM-signed, were also getting
quarantined. What? That wasn’t what I hoped for.

It turns out that authentic messages were in fact being forwarded –
some by mailing lists, some by individuals who had set up forwarding
from one inbound mail address to another. Neither of these is something
I can do anything about.

This isn’t a new problem – it’s the Achilles heel of SPF, which DMARC then inherits. Forwarding email through a mailing list typically makes subtle yet valid changes while keeping the From: line the same.

The Subject: line may get a [listname] prepended to it. The body may
get a “click here to unsubscribe” footer added to it. These
invalidate the original DKIM signature. The list may strip out the
original DKIM signature. And of course, the list remails the message
outbound using its own server name and IP address, which causes it to
then fail SPF tests.
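
To see why, look at a typical DKIM-Signature header (the values here are abbreviated and made up); the h= tag lists the signed headers and bh= is a hash of the body, so touching the Subject: line or the body invalidates the signature:

DKIM-Signature: v=1; a=rsa-sha256; d=example.com; s=mail; c=relaxed/relaxed;
    h=from:to:subject:date; bh=frcCV1k9oG9oKj3dpUqdJg1PxRT2RSN/XKdLCPjaYaY=; b=Hk7s...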

Sure, there are suggested solutions, like getting everyone to use the
Sender Rewriting Scheme (SRS) when remailing, and fixing Mailman and
every other mailing list manager. Wake me when all the world’s email
servers have added that; I will have been dead a very, very long time.

So, I switched back to a policy of “none”, and now just get the
reports, aggravated that there’s nothing I can directly do to protect
the good name of my domains. It’s hard knowing both the size of the
problem and that we have no technological method of solving it today.
Food for thought.

In part 2 of this series, I will describe my system setup for using
the above techniques.

Do you use SPF? Do you use DKIM? Do you publish a DMARC policy? If so, what has your experience been? Leave comments below.

Dasein Cloud at OSCON

While working on Dell’s acquisition of Enstratius, one of the highlights for me was the work George Reese and team have done on the open source (Apache-licensed) cloud abstraction layer, Dasein Cloud.  I’m pleased Enstratius joined Dell, and that the work on building Dasein and making it available for other uses has only accelerated.

Please see George’s blog post, “The Open Source Soul of Dell Multi-Cloud Manager”, for his views on Dasein’s progress in just the last few months, and if you’re at OSCON, stop by the Dell booth or the Dasein session and talk to George.

F/OSS for querying

Has anyone written free/open source software to use the query API? I’ve got several dozen coaches we need to do background checks on through the service we use. It turns out our folks have been doing this manually for years. It seems like something that could be easily automated, but I haven’t found any software, open or otherwise, to do so.

Why MirrorManager Matters

MirrorManager’s primary aim is to make sure end users get directed to the “best” mirror for them.  “Best” is defined in terms of network scopes, based on the concept that a mirror that is network-wise “close” to you is going to provide you a better download experience than a mirror that is “far” from you.

In a pure DNS-based round robin mirror system, you would expect all requests to be sent to a “global” mirror, with no preference for where you are on the network.  In a country-based DNS round robin system, perhaps where the user has specified what country they are in, or perhaps it was automatically determined, you’d expect most hits in countries where you know you have mirrors.

MirrorManager’s scopes include clients and mirrors on the same network blocks; the same Autonomous System Number; both on Internet2 or its related regional high-speed research and education networks in the same country; then falling back to GeoIP to find mirrors in the same country, and finally the same continent.  Only in the rarest of cases does the GeoIP lookup fail, leaving us with no idea where you are, in which case you get sent to some random mirror somewhere.
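
All of this is driven by the mirrorlist request that yum makes on a client’s behalf (via the mirrorlist/metalink URL in the .repo files), and you can poke at it yourself with curl. A rough example; the repo and arch values here are just illustrative:

# ask MirrorManager which mirrors it would hand to this client
curl -s 'https://mirrors.fedoraproject.org/mirrorlist?repo=fedora-19&arch=x86_64'

MirrorManager uses the source IP of the request to decide which scope the client falls into.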

But, how well does this work in practice?  MM 1.4 added logging, so we can create statistics on how often we get a hit for each scope.  Raw statistics:


Scope                  Percentage   Cumulative Percentage
netblocks                  16.10%                  16.10%
Autonomous System           5.61%                  21.71%
Internet2                   8.95%                  30.66%
GeoIP country              57.50%                  88.16%
GeoIP continent            10.34%                  98.51%
Global (any mirror)         1.38%                  99.88%


In the case of MirrorManager, we take it three steps further than pure DNS round robin or GeoIP lookups.  By using Internet2 routing tables, ASN routing tables, and letting mirror admins specify their Peer ASNs and their own netblocks, we are able to, in nearly 22% of all requests, keep the client traffic completely local to the organization or upstream ISP, and when adding in Internet2 lookups, a whopping 30% of client traffic never hits the commodity Internet at all.  In 88% of all cases, you’re sent to a mirror within your own country – never having to deal with congested inter-country links.

MirrorManager 1.4 now in production in Fedora Infrastructure

After nearly 3 years in on-again/off-again development, MirrorManager 1.4 is now live in the Fedora Infrastructure, happily serving mirrorlists to yum, and directing Fedora users to their favorite ISOs – just in time for the Fedora 19 freeze.

Kudos go out to Kevin Fenzi, Seth Vidal, Stephen Smoogen, Toshio Kuratomi, Pierre-Yves Chibon, Patrick Uiterwijk, Adrian Reber, and Johan Cwiklinski for their assistance in making this happen.  Special thanks to Seth for moving the mirrorlist-serving processes to their own servers where they can’t harm other FI applications, and to Smooge, Kevin and Patrick, who gave up a lot of their Father’s Day weekend (both days and nights) to help find and fix latent bugs uncovered in production.

What does this bring the average Fedora user?  Not a lot… a bit more stability (fewer failures when yum retrieves the mirror lists – not that there were many, but the number was nonzero), and a public mirror list whose versions are sorted in numerical order.

What does this bring to a Fedora mirror administrator?  A few new tricks:

  • Mirror admins have been able to specify their own Autonomous System Number for several years.  Clients on the same AS get directed to that mirror.  MM 1.4 adds the ability for mirror admins to request additional “peer ASNs” – particularly helpful for mirrors located at a peering point (say, Hawaii), where listing lots of netblocks instead is unwieldy.  As this has the potential to be slightly dangerous (no, you can’t request ALL ASNs be sent your way), ask a Fedora sysadmin if you want to use this new feature – we can help you.
  • Multiple mirrors claiming the same netblock, or overlapping netblocks, used to be returned to clients in random order.  Now they will be returned in ascending netblock-size order.  This lets an organization with its own private mirror, and its upstream ISP with a mirror of its own, both be listed, with most requests sent to the private mirror first and falling back to the ISP’s mirror.  This should save some bandwidth for the organization.
  • If you provide rsync URLs, you’ll see reduced load from the MM crawler, as it will now use rsync to retrieve your content listing rather than a ton of HTTP or FTP requests (see the sketch just below this list).
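
For the rsync case, the crawler only needs a recursive file listing, which a single rsync invocation can provide. A rough sketch of the kind of command involved; the host and module name here are hypothetical:

# retrieve a full content listing in one connection, without transferring any files
rsync --no-motd --list-only -r rsync://mirror.example.com/fedora-linux/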

What does this bring Fedora Infrastructure (or anyone else running MirrorManager)?

  • Reduced memory usage in the mirrorlist servers.  Especially given how bad Python is at memory management on x86_64 (e.g. reading in a 12MB pickle file blows memory usage out from 4MB to 120MB), this is critical.  It directly impacts the number of simultaneous users that can be served, the response latency, and the CPU overhead too – it’s a win-win-win-win.
  • An improved admin interface – replacing hand-coded pages that looked like they could have been served by BBS software on my Commodore 64 with something modern, more usable, and less error-prone.
  • Code specifically intended for use by Debian/Ubuntu and CentOS communities, should they decide to use MM in the future.
  • A new method to upgrade database schemas – saner than SQLObject’s method.  This should make me less scared to make schema changes in the future to support new features.  (yes, we’re still using SQLObject – if it’s not completely broken, don’t fix it…)
  • Map generation moved to a separate subpackage, to avoid the dependency on 165MB of  python-basemap and python-basemap-data packages on all servers.

MM 1.4 is a good step forward, and hopefully I’ve laid the groundwork to make it easier to improve in the future.  I’m excited that more of the Fedora Infrastructure team has learned (the hard way) the internals of MM, so I’ll have additional help going forward too.

MirrorManager at FUDCon Lawrence

Two weeks ago I once again had the opportunity to attend the Fedora User and Developer Conference (FUDCon), this time in Lawrence, KS.  My primary purpose in going was to work with the Fedora Infrastructure team, develop a plan for MirrorManager maintenance going forward, and learn about some of the faster-paced projects Fedora is driving.

MirrorManager began as a labor of love immediately after the Fedora 6 launch, when our collection of mirrors was both significantly smaller and less well wrangled.  That led to unacceptable download times for the release, and to impacts on the Fedora and Red Hat networks and our few functional mirrors, that we swore never to suffer, or inflict, again.  The Fedora 18 launch, six years later, was downloaded just as heavily, but with nearly 300 public mirrors and hundreds of private mirrors the release was nary a blip on the bandwidth charts, as “many mirrors make for light traffic”.  To that end, MirrorManager continues to do its job well.

However, over the past 2 years, with changes in my job and outside responsibilities, I haven’t had as much time to devote to MirrorManager maintenance as I would have liked.  The MirrorManager 1.4 (future) branch has languished, with an occasional late-night prod, but no significant effort. This has prevented MirrorManager from being more widely adopted by other non-Fedora distributions.  The list of hotfixes sitting in Fedora Infrastructure’s tree was getting untenable.  And I hadn’t really taken advantage of numerous offers of help from potential new maintainers.

FUDCon gave me the opportunity to sit down with the Infrastructure team, including Kevin, Seth, Toshio, Pierre, Stephen, Ricky, Ian and now Ralph, to think through our goals for this year, specifically with MM.  Here’s what we came up with.

  1.  I need to get MM 1.4 “finished” and into production.  This falls squarely on my shoulders, so I have spent time both at FUDCon and since moving in that direction.  The backlog of hotfixes needed to get into the 1.4 branch.  The schema upgrade from 1.3 to 1.4 needed testing on a production database (PostgreSQL), not just my local database (MySQL) – that revealed additional work to be done.  Thanks to Toshio for getting me going on the staging environment again.  Now it’s just down to bug fixes.
  2. I need not to be the single point of knowledge about how the system works.  To that end, I talked through the MM architecture, which components did what, and how they interacted.  Hopefully the whole FI team has a better understanding of how it all fits together.
  3. I need to be more accepting of offers of assistance.  Stephen, Toshio, and Pierre have all offered, and I’m saying “yes”.  Stephen and I sat down, figured out a capability he wanted to see (better logging for mirrorlist requests to more easily root cause failure reports), he wrote the patch, and I accepted it.  +1 to the AUTHORS list.
  4. Ralph has been hard at work on fedmsg, the Fedora Infrastructure Message Bus.  This is starting to be really cool, and I hope to see it used to replace a lot of the cronjob-based backend work, and the cronjob-based rsyncs that all our mirrors do.  One step closer to a “push mirror” system.  Wouldn’t it be cool if Tier 2 mirrors listened on the message bus for their Tier 1 mirror to report “I have new content in this directory tree, now is a good time to come get it!”, and started their syncs, rather than the “we sync 2-6 times a day whenever we feel like it” approach that mirrors use today?  I think so.

Now, to get off (or really, on) the couch and make it happen!

A few other cool things I saw at FUDCon I wanted to share (snagged mostly from my twitter stream):

  1. OpenLMI = Open Linux Management Infrastructure software to manage systems based on DMTF standards.
  2. Mark Langsdorf from @calxeda is demonstrating the ECX1000 #armserver SoC based build hardware going in PHX at #fudcon
  3. @ralphbean talking about fedmsg at #fudcon . I need to think about how @mirrormanager can leverage this.
  4. Hyperkitty is a new Mailman mailing list graphical front end, bringing email lists into the 21st century.

I look forward to next year’s FUDCon, wherever it happens to be.

s3tools / s3cmd needs a new maintainer

As posted to the s3tools-general mailing list, s3tools maintainer Michal Ludvig is looking for new maintainers to step up to continue the care and feeding of the s3tools / s3cmd application.  s3cmd is widely used, on both Linux and Windows, to publish and maintain content in the Amazon Web Services S3 storage system and CloudFront content distribution network.

I use s3cmd for two primary purposes:

  1. as Fedora Mirror Wrangler, I use it within Fedora Infrastructure to maintain mirrors within S3 in each region for the benefit of EC2 users running Fedora or using the EPEL repository on top of RHEL or a derivative.  Fedora has mirrors in us-east-1, us-west-1 and -2, and eu-west-1 right now, and may add the other regions over time.
  2. for my own personal web site, I offload storage of static historical pictures and movies so that they are served from economical storage rather than consuming space on my primary web server (a sketch of the kind of command involved is just below).
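
Both uses boil down to keeping a local directory tree and an S3 bucket in sync. A minimal sketch, with a made-up bucket name and path:

# push new or changed static files to S3, publicly readable, and remove ones deleted locally
s3cmd sync --acl-public --delete-removed ./static/ s3://example-bucket/static/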

I congratulate Michal for recognizing that he no longer has the time to commit to regular maintenance of such an important project, and for beginning to look for contributors who can carry out that responsibility more effectively.  While I’ve submitted a few patches in support of Fedora Infrastructure’s mirror needs, I know that I don’t have the time to take on that added responsibility right now either.

If you use s3cmd, or have contributed to s3cmd, and feel you could make the time commitment to be the next maintainer, you’ll find an active contributor base and dedicated user base to help you move the project forward.


logrotate and bash

It took me a while (longer than I should admit) to figure out how to make daemon processes written in bash work properly with logrotate, so that the output from bash gets properly rotated, compressed, closed, and re-opened.

Say, you’re doing this in bash:

logfile=somelog.txt

while :; do
    echo -n "Today's date is " >> ${logfile}
    date >> ${logfile}
    sleep 60
done

This will run forever, appending a line to the logfile every minute.  Easy enough, and if logrotate is asked to rotate the somelog.txt file, it will do so happily.

But what if bash has started a process that itself takes a long time to complete:

find / -type f -exec cat \{\} \; >>  ${logfile}

which, I think we’d agree, will take a long time.  During this time, it keeps the logfile open for writing.  If logrotate then fires to rotate it, we will lose all data written to the logfile after the rotate occurs.  The find continues to run, but the results are lost.  This isn’t really what we want.

The solution is to change how the logs are written.  Instead of redirecting each command with >> ${logfile}, we’re going to let bash itself do the writing.

exec 1>>${logfile} 2>&1
find / -type f -exec cat \{\} \;

Now, the output from the find command is written to its stdout, which winds up on bash’s stdout, which, because of the exec command there, ends up in the logfile.  If logrotate fires here, we’ll still lose any data written after the rotate.  To solve this, we need to have bash close and re-open its logfile.

Logrotate can send a signal, say SIGHUP, to a process (via a postrotate script) when it rotates the logfile out from underneath it.  On receipt of that signal, the process should close its logfile and reopen it.  Here’s how that looks in bash:


logfile=somelog.txt
pidfile=pidfile.txt

function sighup_handler() {
    exec 1>>${logfile} 2>&1
}

trap sighup_handler HUP
trap "rm -f ${pidfile}" QUIT EXIT INT TERM
echo "$$" > ${pidfile}
# fire the sighup handler once to redirect stdout/stderr to the logfile
sighup_handler
find / -type f -exec cat \{\} \;

and we add to our logrotate snippet:

somelog.txt {
    rotate 7
    copytruncate
    compresscmd /usr/bin/bzip2
    uncompresscmd /usr/bin/bunzip2
    compressext .bz2
    postrotate
        /bin/kill -HUP `cat pidfile.txt 2>/dev/null` 2>/dev/null || true
    endscript
}

Now, when logrotate fires, it sends a SIGHUP signal to our long-running bash process.  Bash catches the SIGHUP, closes and re-opens its logfile (via the exec in the handler), and continues writing.  There is a brief window between when logrotate fires and when bash can re-open the logfile during which messages may be lost, but that window is usually quite small.

There you have it.  Effective log rotation of bash-generated log files.

(Update 7/5: missed the ‘copytruncate’ option in the logrotate config before, added it now.)


Dell Linux Engineers work over 5000 bugs with Red Hat

A post today by Dell’s Linux Engineering team announcing support for RHEL 5.8 on PowerEdge 12G servers made me stop and think.  In the post, they included a link to a list of fixes and enhancements worked in preparing RHEL 5.8 for our new servers.  The list was pretty short. But that list doesn’t tell the whole story.

A quick search in Bugzilla for issues which Dell has been involved in since 1999 yields 5420 bugs, 4959 of which are CLOSED, and only 380 of which are still in NEW or ASSIGNED state, many of which look like they’re pretty close to being closed as well.  This is a testament to the hard work Dell puts into ensuring Linux “Just Works” on our servers, straight out of the box, with few to no extra driver disks or post-install updates needed to make your server fully functional.  You want a working new 12G server?  Simply grab the latest RHEL or SLES DVD image and go.  Want a different flavor of Linux?  Just be sure you’re running a recent upstream kernel – we push updates and fixes there regularly too.

Sure, we could make it harder for you, but why?

Congratulations to the Linux Engineering team for launching 12G PowerEdge with full support baked into Linux!  Keep up the good work!