MirrorManager automatic local mirror selection

MirrorManager 1.3.2 (plus a hotfix) is now running on all Fedora Infrastructure application servers.  This brings one interesting new feature: automatic local mirror selection.  How’s that, you say?

As you know, Internet routing uses BGP (Border Gateway Protocol) and Autonomous System Numbers (ASNs) to exchange IP prefixes (aa.bb.cc.dd/nn) and routing tables.  By grabbing a copy of the global BGP table a few times a day, MM (MirrorManager) can determine the ASN of each incoming client request.  Hosts in the MM database have also grown two new fields: ASN and “ASN Clients?”.  MM then checks whether there is a mirror on the same ASN as the client and, if so, offers it earlier in the list.
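To make the idea concrete, here is a minimal sketch of mapping a client address to an ASN and then finding mirrors on that ASN. It is not MirrorManager's actual code: the table contents, host records, field names, and function names are all hypothetical, and the prefixes and ASNs come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical excerpt of a prefix-to-ASN table built from a BGP dump.
# Documentation prefixes and documentation ASNs only, not real routing data.
PREFIX_TO_ASN = {
    ipaddress.ip_network("192.0.2.0/24"): 64500,
    ipaddress.ip_network("198.51.100.0/24"): 64501,
    ipaddress.ip_network("198.51.100.128/25"): 64502,
}

# Hypothetical host records carrying the two new fields.
HOSTS = [
    {"name": "mirror.example.edu", "asn": 64502, "asn_clients": True},
    {"name": "mirror.example.net", "asn": 64500, "asn_clients": False},
]


def asn_for_client(ip_string):
    """Return the ASN of the most specific prefix containing the address."""
    ip = ipaddress.ip_address(ip_string)
    matches = [net for net in PREFIX_TO_ASN if ip in net]
    if not matches:
        return None
    # Longest prefix wins, as it would in routing.
    best = max(matches, key=lambda net: net.prefixlen)
    return PREFIX_TO_ASN[best]


def same_asn_mirrors(ip_string):
    """List hosts on the client's ASN that have opted in to serve it."""
    asn = asn_for_client(ip_string)
    if asn is None:
        return []
    return [h["name"] for h in HOSTS if h["asn"] == asn and h["asn_clients"]]
```

A linear scan like this is only for illustration; with a full global table a faster lookup structure is needed.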

I’ve pre-populated the MM database, for public servers only, with ASNs, and set “ASN Clients?” = True, meaning each such server will offer to serve all clients on the same ASN.  If you have a private server and wish to do likewise (remember, this doesn’t work for home systems or those behind NATs), you can fill in those fields yourself.  The Fedora wiki page on mirroring gives an example of how to look up your ASN.  I recommend this for all schools, research organizations, companies, and ISPs.

The mirrorlist lookup code now goes in preferential order:

  • same netblock
  • same ASN
  • both on Internet2
  • same country
  • same continent
  • global
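The preference order above can be sketched as a sequence of tiers, each of which adds the mirrors that match and have not already been listed. This is an illustration under assumed data shapes, not the real mirrorlist code; the record fields and the function name are hypothetical.

```python
import ipaddress


def order_mirrors(client, mirrors):
    """Return mirror names ordered by the tiers in the list above."""
    tiers = [
        # same netblock
        lambda m: any(client["ip"] in nb for nb in m["netblocks"]),
        # same ASN
        lambda m: client["asn"] is not None and m["asn"] == client["asn"],
        # both on Internet2
        lambda m: client["internet2"] and m["internet2"],
        # same country
        lambda m: m["country"] == client["country"],
        # same continent
        lambda m: m["continent"] == client["continent"],
        # global: everything that is left
        lambda m: True,
    ]
    ordered, seen = [], set()
    for matches in tiers:
        for m in mirrors:
            if m["name"] not in seen and matches(m):
                ordered.append(m["name"])
                seen.add(m["name"])
    return ordered


def make_mirror(name, asn, country, continent, netblocks=(), internet2=False):
    """Build a hypothetical mirror record for the example below."""
    return {
        "name": name, "asn": asn, "country": country, "continent": continent,
        "netblocks": [ipaddress.ip_network(n) for n in netblocks],
        "internet2": internet2,
    }


client = {
    "ip": ipaddress.ip_address("192.0.2.10"), "asn": 64500,
    "internet2": False, "country": "US", "continent": "NA",
}
mirrors = [
    make_mirror("far", 64510, "DE", "EU"),
    make_mirror("continent", 64511, "CA", "NA"),
    make_mirror("country", 64509, "US", "NA"),
    make_mirror("asn", 64500, "MX", "NA"),
    make_mirror("netblock", 64508, "US", "NA", netblocks=["192.0.2.0/24"]),
]
result = order_mirrors(client, mirrors)
```

Within a tier this sketch simply keeps the input order; it makes no attempt to rank mirrors that fall into the same tier.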

For ISPs and schools, this should mean that as much Fedora traffic as possible stays within your network, with no transit costs.  And as netblocks change, MM will keep up with them automatically.

To see this in action, try a query like the following, and look for ‘Using ASN ####’ in the comment line at the top of the result.

$ wget -O - 'http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-11&arch=i386'

# Using preferred netblock Using ASN XXXX country = US country = MX country = CA

your-local-mirror-here
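If you would rather check the result from a script, the following sketch pulls the ASN out of the leading comment line. The function names are hypothetical, and the parsing only assumes the ‘Using ASN ####’ text shown above; the rest of the header format is not relied on.

```python
import re
import urllib.request

MIRRORLIST_URL = (
    "http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-11&arch=i386"
)


def fetch_mirrorlist(url=MIRRORLIST_URL):
    """Download the mirrorlist text (needs network access)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def asn_from_mirrorlist(text):
    """Return the ASN named in the first '#' comment line, or None."""
    for line in text.splitlines():
        if line.startswith("#"):
            match = re.search(r"Using ASN (\d+)", line)
            return int(match.group(1)) if match else None
    return None


# Sample responses; the mirror URL is a placeholder.
with_asn = (
    "# repo = fedora-11 arch = i386 Using ASN 12322 country = FR\n"
    "http://mirror.example.org/fedora/\n"
)
without_asn = "# repo = fedora-11 arch = i386 country = US country = MX\n"
```

Calling `asn_from_mirrorlist(fetch_mirrorlist())` returns None when no mirror is registered on your ASN.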

I hope you enjoy this new feature.

14 comments on this post.
  1. Roger:

    For those copying and pasting, and then getting bewildering error messages, be aware that the blogging software has changed the quotes and dashes from ASCII into more decorative ones.

    I didn’t get any ASN listed, presumably because there isn’t a mirror on it.

    # repo = fedora-11 arch = i386 country = US country = MX country = CA

  2. Matt Domsch: MirrorManager automatic local mirror selection | TuxWire : The Linux Blog:

    [...] Matt Domsch: MirrorManager automatic local mirror selection Share and [...]

  3. Pascal Terjan:

    For me it works at the office

    repo = fedora-11 arch = i386 Using ASN 12322 Using ASN 12322 country = FR

    but not at home, even though my home connection is also on AS12322

    Home is 82.224.208.X

    $ whois 82.229.208.1 | grep AS12
    % Information related to '82.224.0.0/11AS12322'
    origin: AS12322

  4. mdomsch:

    @Pascal: the lookup code for this is working for me.
    repo = fedora-11 arch = i386 Using ASN 12322 Using ASN 12322

    Perhaps you’re using IPv6 at home but not at work? I don’t have IPv6 lookups working in this method yet.

  5. mdomsch:

    IPv6 lookups are working now.

  6. Jon Masters:

    Very cool stuff. Thanks for all the infrastructure work you do!

  7. MirrorManager automatic local mirror selection | Full-Linux.com:

    [...] Domsch takes a look at MirrorManager in Fedora. “As you know, Internet routing uses BGP (Border Gateway [...]

  8. Robert:

    Perhaps I am doing something wrong? I am using Telenet in Belgium (BE); here is what I get:
    repo = fedora-11 arch = x86_64 country = BG country = BY country = RS country = RO country = GR country = GB country = HU country = PT country = PL country = EE country = IT country = ES country = MD country = IL country = FR country = FI country = NL country = NO country = CH country = CZ country = SK country = SE country = DK country = DE country = LV country = IS country = AT country = IE country = UA

    # whois 94.225.20.1|grep origin
    origin: AS6848

    I am behind a home NAT router (dlink), and my public IP is in the above range

  9. Mark:

    I’d consider adding *.com;*.net;*.org before global (other continents/countries).

  10. mdomsch:

    @Robert: Telenet does not currently offer a public mirror, so you won’t see one appear on this list. You are getting the continent list for Europe because the only mirror in Belgium does not carry the Fedora 11 Everything tree, which is the repo/arch you requested.

    @Mark: why would that help? The list being returned is sorted in the order noted, so you’ll get same-country, and same-continent choices first, and only off-continent choices last. Most folks never fall into this category.

  11. Robert:

    @mdomsch Yes, I figured no local mirror was available. It is just the order that I do not understand: it is not alphabetical, and not by distance either. I doubt BG (Bulgaria) would give me the best performance.

  12. mdomsch:

    @Robert: What you got was a weighted (by server bandwidth) randomization of the per-continent list. If you hit that same URL again, you’ll get a different list. It turns out that metrics based on geographic distance mean little in the networking topology, so aside from same-country or same-continent, MM doesn’t even try. The weighted randomization does a good job of spreading the load out so even if you don’t get the best possible mirror, you are likely not to get an overloaded one.

  13. Pascal Bleser:

    Quite a different approach from http://mirrorbrain.org/ (which we’re using on download.opensuse.org)

  14. mdomsch:

    Pascal, it has been a few months since I looked at mirrorbrain’s functionality, so I hadn’t seen the Apache mod_asn code Peter wrote. That’s a very similar approach to the same problem, and I really like how mod_asn could be more generally useful than how I implemented ASN lookups in MM.

    Architecturally, the mod_asn code requires a database to be available to each httpd instance. For MM, the app servers each have a private cache of data from the database, so every app server can operate even when the database isn’t reachable (as has happened on occasion). This could be solved by putting the mod_asn database on each app server I suppose, but we don’t presently do that in the Fedora infrastructure.

    Peter notes the use of Patricia Trie indexing, which provides fast lookups. MM uses plain bisection of an ordered list (separate lists for IPv4 and IPv6). The longest list is the IPv4 global table at ~350k entries, so a lookup takes about 19 probes for IPv4 and 12 for IPv6. That has proven fast enough, but I wonder if using Patricia Tries might be even faster.

    Thanks for the reminder to look at mirrorbrain again!
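The bisection lookup described in the last comment can be sketched roughly as follows. This is not MM's code: the function names are hypothetical, the table holds documentation prefixes only, and the sketch assumes the prefixes do not overlap (nested prefixes would need extra handling). It covers IPv4 only; an IPv6 table would be a second list built the same way.

```python
import bisect
import ipaddress
import math


def build_table(prefix_asn_pairs):
    """Sort (network, asn) pairs by start address; return (starts, entries)."""
    entries = sorted(prefix_asn_pairs, key=lambda p: int(p[0].network_address))
    starts = [int(net.network_address) for net, _ in entries]
    return starts, entries


def lookup_asn(starts, entries, ip_string):
    """Bisect for the last prefix starting at or before the address."""
    ip = int(ipaddress.ip_address(ip_string))
    i = bisect.bisect_right(starts, ip) - 1
    if i < 0:
        return None
    net, asn = entries[i]
    # The address must also fall before the end of that prefix.
    return asn if ip <= int(net.broadcast_address) else None


def worst_case_probes(table_size):
    """Upper bound on bisection steps for a sorted list of this size."""
    return math.ceil(math.log2(table_size))


starts, entries = build_table([
    (ipaddress.ip_network("203.0.113.0/24"), 64502),
    (ipaddress.ip_network("192.0.2.0/24"), 64500),
    (ipaddress.ip_network("198.51.100.0/24"), 64501),
])
```

For a table of about 350,000 entries, `worst_case_probes(350_000)` gives 19, matching the figure quoted for the IPv4 table.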