A Mirror Is Not A Backup


Photo Credits: Tim Sheerman-Chase

Any sysadmin worth their salt knows backups are one of their major priorities. Maintaining a good backup is crucial for ensuring business continuity when the inevitable unforeseen catastrophe occurs.

But, not all backup plans are equal. It’s a complex subject, especially when we’re dealing with production environments that are in a constant state of flux.

In an embarrassingly well-publicized mishap, KDE, the project behind the popular open source desktop environment, learned how serious the consequences of not thinking hard enough about backing up data can be.

The KDE Git repository is kept on a pair of virtual machines. That master repo is mirrored to a large number of secondary repositories around the world. When people commit or clone, they use the secondary repositories to spread the load.

Recently, the main repo was taken down for security updates, and when it was brought back up, the team noticed filesystem corruption. Unfortunately, because the repo was mirrored out to the secondary repos, every one of them contained a copy of the corrupted files. There were no clean repos anywhere, so it was impossible to sync a clean version back from the secondaries onto the main server.

The KDE folks were able to solve the problem, but, as they acknowledged, they were very lucky. They narrowly avoided a catastrophe.

The KDE backup solution was eminently scalable, but it was neither reliable nor secure. They had failed to properly account for all potential failure scenarios. For data to be backed up, it is not sufficient that it exist in many different places; it must also exist in sufficient historical versions. A hundred copies of broken data are no better than one copy.

There are various solutions that KDE could and should have implemented to make sure that their backups were reliable. One, which they are considering, is to use the ZFS file system, which is capable of making snapshots. But a more common solution would be to take regular backups and set up a cron job to copy them to an external storage device.
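As a rough illustration of that second approach, here is a minimal sketch of a cron-driven snapshot script using rsync's `--link-dest` option, so each dated directory is a full restore point while unchanged files are hard-linked and take no extra space. All paths and the schedule here are hypothetical; adapt them to your own layout:

```shell
#!/bin/sh
# snapshot SRC ROOT: create a dated, browsable snapshot of SRC under ROOT.
# Files unchanged since the previous snapshot are hard-linked rather than
# copied, so every snapshot looks complete but only deltas consume space.
snapshot() {
    src=$1
    root=$2
    today=$(date +%Y-%m-%d)
    mkdir -p "$root"

    # Find the most recent existing snapshot (dated names sort correctly).
    latest=$(ls -1 "$root" | tail -n 1)

    if [ -n "$latest" ] && [ "$latest" != "$today" ]; then
        rsync -a --link-dest="$root/$latest" "$src/" "$root/$today/"
    else
        rsync -a "$src/" "$root/$today/"
    fi
}

# Hypothetical example: snapshot the git repositories to external storage.
# Run it nightly from cron, e.g.:
#   30 2 * * * /usr/local/sbin/git-snapshot.sh
# snapshot /srv/git/repositories /mnt/backup/git
```

Because each snapshot is an ordinary directory, restoring is just an rsync back in the other direction; no special tooling is needed, which matters most in exactly the kind of emergency KDE faced.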

If they had backups available, it would have been fairly easy to take one of the production servers offline, sync it to the backup, and then put it back online.

How do you manage backups? If you were designing KDE's new backup protocols, how would you go about it? Let us know what you think in the comments.

Tags: backup, Disaster Recovery
Apr 9, 2013, 10:45 am | By: InterWorx | 3 Comments
  1. Luvenia Purwin: Data backup online is not too hard. My computer documents are all encrypted at Their cloud is the fastest and also free.
    May 1, 2013 at 5:48 am
  2. Marc P: I came to learn about R1Soft through InterWorx. R1Soft is a great backup utility that keeps historical backups but also copies only the bits of data that change. It keeps the backups quick yet very small, and you can keep many more historical changes. Plus it has a MySQL module that ensures table consistency across the database. That's great for small things, but when a server goes down, restoring it can take a lot of time. That's why a mirror server is also important, so you have a hot standby of the data. This can be accomplished with rsync and a MySQL slave server. Time consuming, but for a proper backup, that is what's required at a minimum.
    April 10, 2013 at 4:22 pm
  3. Richard Whitney: At phpmydev, we have nightly, weekly, and monthly off-site backups separate from one another, as well as a daily backup locally. These run automatically, rsyncing incremental changes to a remote server via cron. Oh yeah, and we use Nodeworx!
    April 10, 2013 at 3:02 pm
