Remote backups with rsnapshot

What type of backup strategy do you employ for your solution? Do you have backups within your datacenter, or are you utilizing your hosting provider’s backup infrastructure if one is available? These are both good starting points for preparing your solution for disaster.

Now, what do you have in place for remote backups? Remote backups are critical in the event something were to happen to your primary datacenter. What if there was a fire, or a major natural disaster that took out the datacenter?

A more common scenario: your existing backup solution has been having problems and you weren’t aware of it. When the time comes to restore your backups, you find that they are corrupted and unusable. This happens more often than people think.

When you deploy a new solution, you make sure it’s redundant and highly available. It is important to do the same with your backup architecture. Having an on-site backup allows you to perform a speedy recovery should something go wrong. Including an off-site backup allows you to plan for the worst-case scenario, and also gives you the peace of mind that your data is stored outside of that datacenter, under your control.

When having solution architecture discussions with clients, I strongly encourage them to:
– Use all available backup solutions offered by the hosting provider
– Have an off-site backup solution that is managed by yourself or a different provider

You can never have enough backups. Your data took weeks, months, or sometimes even years to develop and fine-tune. If there are concerns about how much it will cost to have a remote backup solution, here is a more important cost consideration: How much will it cost your business and reputation to rebuild all your website and database content from scratch?

As you can probably tell, I am very paranoid about my clients’ data. So now that I have hopefully given you some food for thought, I’ll show you one inexpensive way I like to perform remote backups for smaller solutions (under 500G). Please keep in mind that there are many backup solutions available; this is just one of the options I present to my clients.

Enter rsnapshot. Taken from their website, http://www.rsnapshot.org:

"rsnapshot is a filesystem snapshot utility for making backups of local and remote systems. Using rsync and hard links, it is possible to keep multiple, full backups instantly available. The disk space required is just a little more than the space of one full backup, plus incrementals.

Depending on your configuration, it is quite possible to set up in just a few minutes. Files can be restored by the users who own them, without the root user getting involved.

There are no tapes to change, so once it's set up, your backups can happen automatically untouched by human hands. And because rsnapshot only keeps a fixed (but configurable) number of snapshots, the amount of disk space used will not continuously grow."
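The hard-link idea in that description can be seen in a few lines of shell. This is only a sketch in a throwaway temp directory, assuming GNU cp and stat (as found on CentOS). The hourly.0 and hourly.1 names mimic rsnapshot’s snapshot directories, and site.conf is a made-up file.

```shell
#!/bin/sh
# Sketch: why a second "full" snapshot costs almost no extra disk space.
# Assumes GNU coreutils (cp -l, stat -c); all paths are throwaway.
set -eu
demo=$(mktemp -d)
mkdir "$demo/hourly.0"
echo "unchanged content" > "$demo/hourly.0/site.conf"

# Copy the snapshot as hard links instead of duplicating the data
cp -al "$demo/hourly.0" "$demo/hourly.1"

# Both names point at the same inode, so the file is stored only once
inode_new=$(stat -c %i "$demo/hourly.0/site.conf")
inode_old=$(stat -c %i "$demo/hourly.1/site.conf")
link_count=$(stat -c %h "$demo/hourly.0/site.conf")
echo "inodes: $inode_new / $inode_old, link count: $link_count"

rm -rf "$demo"
```

When a file changes on the source, rsync writes a new copy into the newest snapshot while the older snapshots keep the old inode. That is the "plus incrementals" part of the disk usage.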

Many of the more common questions, such as “How do I restore a backup?”, are answered in their FAQ, which is located here:
http://www.rsnapshot.org/faq.html

I strongly encourage you to review their documentation so you can decide if this software is a good fit for your solution. I like this solution because it essentially allows you to simply rsync or SCP the needed information from your remote backup server back to your production servers when you need it. There are no complicated tools required to get your critical data back onto your solution.
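To make the rsync restore concrete, here is a minimal sketch. It assumes snapshots are stored as <interval>.<n>/<host directory>/<original path> under the snapshot root, which is what rsnapshot’s default --relative rsync option produces, and that the backup server can SSH to the target as root. The restore_path helper and the SNAPSHOT_ROOT variable are names made up for this example.

```shell
#!/bin/sh
# Sketch: push one path from a snapshot back to the server it came from.
# restore_path is a hypothetical helper; root SSH access is an assumption.
SNAPSHOT_ROOT=${SNAPSHOT_ROOT:-/opt/storage02/snapshots}

restore_path() {
    snapshot=$1    # e.g. daily.0
    label=$2       # directory name used for the host inside the snapshot
    target=$3      # SSH destination, e.g. 192.168.2.3
    path=$4        # absolute path to restore; end directories with /
    # -a preserves permissions, owners and timestamps; -v lists what is copied
    rsync -av "$SNAPSHOT_ROOT/$snapshot/$label$path" "root@$target:$path"
}

# Usage (hypothetical): restore the latest daily copy of /var/www/ to web01
#   restore_path daily.0 web01.example.com 192.168.2.3 /var/www/
```

Adding -n to the rsync options gives a dry run, which is a cheap way to see what would be overwritten before restoring for real.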

So, what do you need to set this up? You simply need a Linux/UNIX-based computer that is running off-site, maybe even at your office if it is in a secure location, and enough hard drive space to store your backups. Installation is quick and easy, as I’ll outline below. For this example, I am using a Rackspace Cloud CentOS 6 server with 2x 200G Cloud Block Storage volumes set up in RAID 1, encrypted using LUKS, mounted under /opt/storage01. I outlined how to set this up in an older article: http://www.stephenlang.net/2012/12/encryption-block-storage-in-the-cloud/.

My setup is a bit more elaborate, but again, I am just paranoid about data. A simple server with enough free hard drive space will work just as well. Just make sure it is in a secured location.

Procedure

Without further ado, here is how I personally set up rsnapshot. Please note that you have to enable the EPEL repo on your server before you can install rsnapshot with yum. You can enable the EPEL repo as follows:

CentOS 5

wget http://dl.fedoraproject.org/pub/epel/5/x86_64/epel-release-5-4.noarch.rpm
sudo rpm -Uvh epel-release-5*.rpm

CentOS 6

wget http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
sudo rpm -Uvh epel-release-6*.rpm

Now, install rsnapshot:

yum install rsnapshot

The rest of our setup will take place in /etc/rsnapshot.conf. Make a quick backup of the configuration:

cp /etc/rsnapshot.conf /etc/rsnapshot.conf.orig

Modify the configuration to meet our needs:

vi /etc/rsnapshot.conf

Set the following to specify where you want your backups to be stored. I put in my preference, but you can change this to anything you like. Just be sure it’s in a location that is only accessible to root:

snapshot_root	/opt/storage02/snapshots/

Now uncomment cmd_ssh as we’ll be rsyncing over SSH:

cmd_ssh	/usr/bin/ssh

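Define the backup intervals. Each number is how many snapshots of that type rsnapshot keeps, and the fields must be separated by tabs, not spaces. Here is what I use:

interval	hourly	6
interval	daily	7
interval	weekly	4
interval	monthly	3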

All that is left is to configure which remote servers you will be backing up. You will have to be sure that you set up SSH keys so rsnapshot can SSH into the remote servers without a passphrase.

As a side note, when backing up your databases, be sure to back up your MySQL dumps (or the dumps from whatever database software you are using). If you try to back up the live database files, you will likely have severe corruption if you ever need to restore them.
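One way to set up those keys is sketched below. It assumes the backups connect as root and that the key lives at root’s default key path on the backup server. Both are assumptions to adjust, and setup_backup_key is a helper name made up for this example.

```shell
#!/bin/sh
# Sketch: create a passphrase-less key on the backup server and install it
# on each server to be backed up. The key path and the root user are assumptions.
KEY=${KEY:-/root/.ssh/id_rsa}

setup_backup_key() {
    # Generate the key pair only if it does not exist yet
    [ -f "$KEY" ] || ssh-keygen -t rsa -b 4096 -N "" -f "$KEY"
    for host in "$@"; do
        # Appends the public key to ~/.ssh/authorized_keys on the remote side
        ssh-copy-id -i "$KEY.pub" "root@$host"
    done
}

# Usage (run as root on the backup server):
#   setup_backup_key 192.168.2.2 192.168.2.3
```

Afterwards, something like ssh root@192.168.2.2 true should return without prompting for a password. If it still prompts, rsnapshot will not be able to run unattended.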
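A dump job on the database server could look roughly like the sketch below. It assumes mysqldump can log in without prompting (for example through a ~/.my.cnf file), and the all-databases.sql file name and dump_databases helper are made up for this example.

```shell
#!/bin/sh
# Sketch for the database server: write a dump where rsnapshot will collect it.
# Assumes mysqldump can authenticate non-interactively; file name is hypothetical.
BACKUP_DIR=${BACKUP_DIR:-/var/lib/mysqlbackup}

dump_databases() {
    out="$BACKUP_DIR/all-databases.sql"
    mkdir -p "$BACKUP_DIR" || return 1
    # Dump to a temporary name first so a failed run never replaces the last good dump.
    # --single-transaction gives a consistent view of InnoDB tables without locking them.
    if mysqldump --all-databases --single-transaction > "$out.tmp"; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}

# Usage: call dump_databases from cron on the database server,
# scheduled to finish before the rsnapshot run that collects the dump.
```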

For our example, I am backing up 2 servers:
– db01.example.com (192.168.2.2) : /etc, /var/lib/mysqlbackup
– web01.example.com (192.168.2.3) : /etc, /var/www, and excluding /var/www/example.com/file/big_log_file.log

# db01.example.com (192.168.2.2)
backup	root@192.168.2.2:/etc/	db01.example.com/
backup	root@192.168.2.2:/var/lib/mysqlbackup/	db01.example.com/

# web01.example.com (192.168.2.3)
backup	root@192.168.2.3:/etc/	web01.example.com/
backup	root@192.168.2.3:/var/www/	web01.example.com/	exclude=file/big_log_file.log
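Because rsnapshot only accepts tabs between fields, a copy-and-paste that turns tabs into spaces is an easy mistake to make in this file. The sketch below is a rough check for that. check_tabs is a helper made up for this example, and it only reports lines with no tab at all, so rsnapshot’s own configtest command remains the real syntax check.

```shell
#!/bin/sh
# Sketch: report config lines that have no tab between fields.
# check_tabs is a hypothetical helper, not part of rsnapshot.
check_tabs() {
    awk '
        /^[ \t]*(#|$)/ { next }    # skip comments and blank lines
        !/\t/ { printf "line %d has no tab: %s\n", NR, $0; bad = 1 }
        END { exit bad + 0 }       # non-zero exit if any line was reported
    ' "$1"
}

# Usage:
#   check_tabs /etc/rsnapshot.conf && rsnapshot configtest
```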

Finally, set up the cron jobs:

crontab -e
0 */4 * * * /usr/bin/rsnapshot hourly
30 8 * * * /usr/bin/rsnapshot daily
55 8 * * 1 /usr/bin/rsnapshot weekly
15 9 1 * * /usr/bin/rsnapshot monthly

Test to ensure everything works as expected:

/usr/bin/rsnapshot hourly

– Check the directory to ensure your content was saved:

ls /opt/storage02/snapshots/

– Check the log file to ensure there are no errors:

less /var/log/rsnapshot
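Reading the log by hand does not scale, so a small freshness check can help between manual reviews. The sketch below assumes the modification time of the newest snapshot directory reflects the last completed run. snapshot_is_fresh is a helper made up for this example, and the 300-minute threshold is only a suggestion for a job that runs every 4 hours.

```shell
#!/bin/sh
# Sketch: succeed only if the newest snapshot directory was modified recently.
# snapshot_is_fresh is a hypothetical helper; -maxdepth and -mmin assume GNU or BSD find.
snapshot_is_fresh() {
    dir=$1          # e.g. /opt/storage02/snapshots/hourly.0
    max_age_min=$2  # e.g. 300 for a job that runs every 4 hours
    [ -d "$dir" ] && [ -n "$(find "$dir" -maxdepth 0 -mmin "-$max_age_min")" ]
}

# Usage (hypothetical): alert from cron when the check fails
#   snapshot_is_fresh /opt/storage02/snapshots/hourly.0 300 || echo "rsnapshot is stale"
```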

Most importantly, you must check often to ensure that your backup system is functioning properly. You will want to test your backups periodically, at least every 90 days, to ensure that your team is familiar with the process and that everything is okay with your backups. Backups are not ‘set it and forget it’. Always verify your data’s integrity, otherwise you may have a really bad time the day you find you need to restore from backups!
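Between full restore drills, one check that is easy to automate is whether a database dump inside a snapshot looks complete. The sketch below assumes a plain-text mysqldump file written with comments left enabled (the default), in which case the last line is a completion comment. dump_looks_complete is a helper made up for this example.

```shell
#!/bin/sh
# Sketch: sanity-check a plain-text mysqldump file stored in a snapshot.
# dump_looks_complete is a hypothetical helper.
dump_looks_complete() {
    dump=$1
    # Non-empty file whose last line is mysqldump's completion comment
    [ -s "$dump" ] && tail -n 1 "$dump" | grep -q '^-- Dump completed'
}

# Usage (hypothetical path inside the newest snapshot):
#   dump_looks_complete /opt/storage02/snapshots/hourly.0/db01.example.com/var/lib/mysqlbackup/dump.sql
```

This only catches truncated or empty dumps. It is not a substitute for loading the dump into a scratch database and looking at the data.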