Mirroring the Brisbane City Council web site


This post references UberGlobal. As of June 2011 I am no longer employed by UberGlobal.

The floods in Queensland and Brisbane over the past few weeks have shocked and saddened us all. I would think that many Australians, if not all, have either been affected themselves or know somebody who has.

While I’m based in Canberra, I still have family and friends in the Brisbane area and have been watching the news and other online resources including the Brisbane City Council web site to make sure everyone is okay.

The Brisbane City Council posted regularly updated maps of flood projections to their web site, but alas the site began to experience severe difficulties. Feeling helpless, I decided to use the resources I had at my disposal and put a mirror (copy) of the web site online.

It’s important to note that I did not request permission before creating the mirror. You should always ask first. In this case I believed that it would be fine to go ahead, which the council confirmed in a tweet.

I have my own dedicated server hosted in Canberra, Australia through my employer (UberGlobal), which runs this web site and some other sites of mine. The server would have been able to cope with the load that the mirror ended up receiving, but at the time I wasn’t sure how much traffic there would be. I decided to put the master mirror on this server and offload end user requests to the EdgeCast content delivery network (CDN), where I already had an account.

The EdgeCast CDN has points of presence (POPs) around the world such as:

  • Asia: Hong Kong, Singapore and Tokyo
  • Australia: Sydney
  • Europe: Amsterdam, Frankfurt, London (2) and Paris
  • North America: Ashburn, Atlanta, Chicago, Dallas, Los Angeles (2), Miami, New York, San Jose and Seattle

This type of configuration, with a master server, is known as “origin pull”. In this scenario, I have a copy on my server and when a file is requested from the EdgeCast network for the first time, it is automatically retrieved from my server and cached in the EdgeCast location that received the request. Future requests to that location then use the cached version, rather than retrieving it from the master.
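To make the idea concrete, here is a toy sketch of origin pull that uses two local directories in place of real servers. It is only an illustration: the names ORIGIN_DIR, EDGE_DIR and edge_fetch are made up for this example, and a real CDN does the same thing over HTTP with expiry rules that are not modelled here.

```shell
#!/bin/sh
# Toy model of origin pull (hypothetical names, not EdgeCast's implementation).
# ORIGIN_DIR stands in for the master server, EDGE_DIR for one CDN location.
ORIGIN_DIR="${ORIGIN_DIR:-./origin}"
EDGE_DIR="${EDGE_DIR:-./edge}"

# edge_fetch PATH: report whether PATH was already cached at the edge.
# On a miss the file is first copied ("pulled") from the origin.
edge_fetch() {
    path="$1"
    if [ -f "$EDGE_DIR/$path" ]; then
        echo "HIT $path"
    else
        # Cache miss: pull the file from the origin, then keep it at the edge.
        mkdir -p "$(dirname "$EDGE_DIR/$path")"
        cp "$ORIGIN_DIR/$path" "$EDGE_DIR/$path" || return 1
        echo "MISS $path"
    fi
}
```

The first request for a file reports a miss and touches the origin. Every later request for the same file is answered from the edge copy, which is why the master server sees so little of the end user traffic.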

To get started, first I needed to fetch a copy of the Brisbane City Council web site. I used the ‘wget’ command to retrieve the content:

[root@server1 mirrors]# wget --mirror http://www.brisbane.qld.gov.au/

The --mirror option on wget is essential when mirroring a web site, as on subsequent runs it will only check for changes to the site content. Files that have not changed will not be copied again.
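For reference, the wget manual describes --mirror as shorthand for a handful of other options. The sketch below spells them out; the function name mirror_site is just a label for this example, not something I used at the time.

```shell
# mirror_site URL: long-hand equivalent of `wget --mirror URL`
# (mirror_site is a hypothetical wrapper name).
mirror_site() {
    # -r                   recurse through links found in each page
    # -N                   time-stamping: skip files that are not newer
    #                      than the local copy
    # -l inf               no limit on recursion depth
    # --no-remove-listing  keep FTP directory listings (no effect over HTTP)
    wget -r -N -l inf --no-remove-listing "$1"
}
```

It is the -N (time-stamping) part that makes repeat runs cheap: wget still has to ask the server about each file, but it only downloads the ones that have changed.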

Normally I would not use wget to mirror a web site as there are more efficient ways when the act of mirroring a site is sanctioned. In this case, wget was the best way available to me.

The initial run took a long time (more than an hour) due to the performance issues that the primary site was experiencing. However, once I had a copy, subsequent runs were quicker.

Once I had a copy of the content, I placed it online. First it was available at http://cdn.se.id.au/mirrors/www.brisbane.qld.gov.au/ as that was the quickest way to get the link out there. However, I soon set up a subdomain at http://bnefloods.se.id.au/ which referenced the same content.

When the copy was online and tested, I then set up a script to automatically update the mirror periodically. Note that when using a CDN, this isn’t enough: you also need to ensure that you remove changed files from the caching locations around the world.
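One way to work out which files need removing from the CDN is to compare a checksum listing of the mirror taken before and after each update. The sketch below is not the script I actually ran; the function names snapshot and changed_files are invented for this example, and it assumes file names do not contain newlines.

```shell
# snapshot DIR: print one "checksum size path" line per file under DIR.
# Output is sorted in the C locale so that comm can compare two snapshots.
snapshot() {
    find "$1" -type f -exec cksum {} + | LC_ALL=C sort
}

# changed_files OLD NEW: print the paths that are new or whose contents
# changed between two snapshot files.
changed_files() {
    # comm -13 keeps only the lines that appear in NEW but not in OLD;
    # cut then drops the checksum and size columns, leaving the path.
    LC_ALL=C comm -13 "$1" "$2" | cut -d' ' -f3-
}

# Example update run (commented out because it contacts the live site):
#   snapshot www.brisbane.qld.gov.au > before.txt
#   wget --mirror http://www.brisbane.qld.gov.au/
#   snapshot www.brisbane.qld.gov.au > after.txt
#   changed_files before.txt after.txt    # candidates for a CDN purge
```

Each path printed by changed_files is a file whose cached copies are now stale. A job scheduler such as cron could run the update at whatever interval suits the site.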

EdgeCast provide APIs to allow you to do this, but that’s beyond the scope of this journal post.

As soon as I was satisfied that the mirror was working successfully, I began posting the URL to various social networks such as Twitter and Facebook as well as online forums. After my initial post, members on those networks and forums took over and continued propagating the link.

The initial traffic was an order of magnitude greater than what I usually receive and was still growing.

Over the next few days the traffic continued to grow, until decreasing again once the performance issues with the main Brisbane City Council site were fixed.

Overall the project was a success, and it would seem that the Brisbane City Council caught on to the idea and moved their main site onto a CDN as well.

In the short time that my mirror was online it served over 50,000 copies of the flood maps to people. I believe that it achieved its goal, and hopefully helped a few people gain access to important information during this disaster.

For now the mirror has been taken offline, but please contact me if there is another faltering web site that might need a bit of love to keep providing information to people affected by the floods.