Keeping multiple web servers in sync with rsync

People building a load balanced web server solution often ask how they can keep their web servers in sync with each other. There are many ways to go about this: NFS, lsyncd, rsync, etc. This guide will discuss a technique using rsync that runs from a cron job every 10 minutes.

Two options will be presented: pulling the updates from the master web server to the slaves, and pushing the updates from the master web server down to the slave web servers.

Our example will consist of the following servers:

web01.example.com (192.168.1.1) # Master Web Server
web02.example.com (192.168.1.2) # Slave Web Server
web03.example.com (192.168.1.3) # Slave Web Server

Our master web server is going to be the single source of truth for the web content of our domain. Therefore, the web developers will only modify content on the master web server, and rsync will handle keeping all the slave nodes in sync.

There are a few prerequisites that must be in place:
1. Confirm that rsync is installed.
2. If pulling updates from the master web server, all slave servers must be able to SSH to the master server using an SSH key with no passphrase.
3. If pushing updates from the master down to the slave servers, the master server must be able to SSH to the slave web servers using an SSH key with no passphrase.

To be proactive about monitoring the status of the rsync job, both scripts posted below allow you to perform an HTTP content check against a status file to see if the string "SUCCESS" exists. If something other than SUCCESS is found, the rsync script may have failed and should be investigated. An example of a URL to monitor would be: 192.168.1.1/datasync.status

Please note that the assumption is being made that your web server serves files placed in /var/www/html/. If not, please update the $status variable accordingly.
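
For example, a minimal monitoring check (just a sketch; it assumes curl is available on the monitoring host):

curl -s http://192.168.1.1/datasync.status | grep -q "SUCCESS" && echo "rsync OK" || echo "rsync FAILED"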

Using rsync to pull changes from the master web server:

This is especially useful if you are in a cloud environment and scale by snapshotting an existing slave web server to provision a new one. When the new slave web server comes online, and assuming it already has the SSH key in place, it will automatically grab the latest content from the master server with no interaction needed on your part other than testing it and then enabling it in your load balancer.

The disadvantage of the pull method comes into play when you have multiple slave web servers all running the rsync job at the same time. This can put a strain on the master web server's CPU, which can cause performance degradation. However, if you have fewer than 10 servers, or if your site does not have a lot of content, the pull method should work fine.

Here is the procedure for setting this up:

1. Create SSH keys on each slave web server:

ssh-keygen -t dsa

2. Now copy the public key generated on the slave web server (/root/.ssh/id_dsa.pub) and append it to the master web server's /root/.ssh/authorized_keys2 file.
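
For example, one hypothetical way to append the key (run from each slave web server; it assumes root password authentication is still enabled on the master at this point):

cat /root/.ssh/id_dsa.pub | ssh root@web01.example.com 'cat >> /root/.ssh/authorized_keys2'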

3. Test SSH'ing in as root from the slave web server to the master web server:
# On web02

ssh root@web01.example.com

4. Assuming you were able to log in to the master web server cleanly, it's time to create the rsync script on each slave web server. Please note that I am assuming your sites' document roots are stored in /var/www/vhosts. If not, please change the script accordingly and test!

mkdir -p /opt/scripts/
vi /opt/scripts/pull-datasync.sh

#!/bin/bash
# pull-datasync.sh : Pull site updates down from the master to front end web servers via rsync

status="/var/www/html/datasync.status"

# Bail out if a previous run is still in progress
if [ -d /tmp/.rsync.lock ]; then
    echo "FAILURE : rsync lock exists : Perhaps there is a lot of new data to pull from the master server. Will retry shortly" > $status
    exit 1
fi

# mkdir is atomic, so it doubles as our lock
if ! /bin/mkdir /tmp/.rsync.lock; then
    echo "FAILURE : can not create lock" > $status
    exit 1
else
    echo "SUCCESS : created lock" > $status
fi

echo "===== Beginning rsync ====="

# nice 19 is the lowest CPU priority; check for any nonzero rsync exit code, not just 1
nice -n 19 /usr/bin/rsync -axvz --delete -e ssh root@web01.example.com:/var/www/vhosts/ /var/www/vhosts/

if [ $? -ne 0 ]; then
    echo "FAILURE : rsync failed. Please refer to the solution documentation" > $status
    /bin/rm -rf /tmp/.rsync.lock
    exit 1
fi

echo "===== Completed rsync ====="

/bin/rm -rf /tmp/.rsync.lock
echo "SUCCESS : rsync completed successfully" > $status

Be sure to set executable permissions on this script so cron can run it:

chmod 755 /opt/scripts/pull-datasync.sh

Using rsync to push changes from the master web server down to slave web servers:

Using rsync to push changes from the master down to the slaves also has some important advantages. First, the slave web servers will not have SSH access to the master server. This could become critical if one of the slave servers is ever compromised and tries to gain access to the master web server. The next advantage is that the push method does not cause a serious CPU strain, because the master runs rsync against the slave servers one at a time.

The disadvantage here shows up when you have a lot of web servers syncing content that changes often. It's possible that your updates will not be pushed down to the web servers as quickly as expected, since the master server is syncing the servers one at a time. So be sure to test this out to see if the results work for your solution. Also, if you are cloning your servers to create additional web servers, you will need to update the rsync configuration accordingly to include the new node.

Here is the procedure for setting this up:

1. To make administration easier, it's recommended to set up the /etc/hosts file on the master web server to include a list of all the servers' hostnames and internal IPs.

vi /etc/hosts
192.168.1.1 web01 web01.example.com
192.168.1.2 web02 web02.example.com
192.168.1.3 web03 web03.example.com

2. Create SSH keys on the master web server:

ssh-keygen -t dsa

3. Now copy the public key generated on the master web server (/root/.ssh/id_dsa.pub) and append it to each slave web server's /root/.ssh/authorized_keys2 file.
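
For example, a hypothetical loop run from the master web server (it assumes root password authentication is still enabled on the slaves at this point):

for host in web02 web03; do
    cat /root/.ssh/id_dsa.pub | ssh root@$host 'cat >> /root/.ssh/authorized_keys2'
done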

4. Test SSH'ing in as root from the master web server to each slave web server:
# On web01

ssh root@web02

5. Assuming you were able to log in to the slave web servers cleanly, it's time to create the rsync script on the master web server. Please note that I am assuming your sites' document roots are stored in /var/www/vhosts. If not, please change the script accordingly and test!

mkdir -p /opt/scripts/
vi /opt/scripts/push-datasync.sh

#!/bin/bash
# push-datasync.sh - Push site updates from the master server to front end web servers via rsync

# Slave web servers to push to; add new nodes here as you clone additional slaves
webservers=(web02 web03)
status="/var/www/html/datasync.status"

# Bail out if a previous run is still in progress
if [ -d /tmp/.rsync.lock ]; then
    echo "FAILURE : rsync lock exists : Perhaps there is a lot of new data to push to front end web servers. Will retry soon." > $status
    exit 1
fi

# mkdir is atomic, so it doubles as our lock
if ! /bin/mkdir /tmp/.rsync.lock; then
    echo "FAILURE : can not create lock" > $status
    exit 1
else
    echo "SUCCESS : created lock" > $status
fi

for i in "${webservers[@]}"; do

    echo "===== Beginning rsync of $i ====="

    # nice 19 is the lowest CPU priority; check for any nonzero rsync exit code, not just 1
    nice -n 19 /usr/bin/rsync -avzx --delete -e ssh /var/www/vhosts/ root@$i:/var/www/vhosts/

    if [ $? -ne 0 ]; then
        echo "FAILURE : rsync to $i failed. Please refer to the solution documentation" > $status
        /bin/rm -rf /tmp/.rsync.lock
        exit 1
    fi

    echo "===== Completed rsync of $i ====="
done

/bin/rm -rf /tmp/.rsync.lock
echo "SUCCESS : rsync completed successfully" > $status

Be sure to set executable permissions on this script so cron can run it:

chmod 755 /opt/scripts/push-datasync.sh

Now that you have the script in place and tested, it's time to set it up to run automatically via cron. For the example here, I am setting up cron to run the script every 10 minutes.

If using the push method, put the following into the master web server's crontab:

crontab -e
# Datasync script
*/10 * * * * /opt/scripts/push-datasync.sh

If using the pull method, put the following into each slave web server's crontab:

crontab -e
# Datasync script
*/10 * * * * /opt/scripts/pull-datasync.sh
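
If you are using the pull method with more than a couple of slave servers, you may also want to stagger the schedules so the slaves do not all hit the master at once. A hypothetical example using offset minute fields:

# On web02
*/10 * * * * /opt/scripts/pull-datasync.sh
# On web03
5-59/10 * * * * /opt/scripts/pull-datasync.sh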

Using gmirror, ggated, and freevrrpd to create a high availability environment

Technologies used:
– FreeBSD 7
– GEOM gmirror
– GEOM gate
– freevrrpd

Intended Uses:
– High-Availability Apache Setup
– High-Availability NFS Setup
– High-Availability MySQL Setup

Example Scenario:

We want borkcorp's web server to be able to fail over to their dedicated MySQL server. This will be done by mirroring a 500MB file-backed filesystem across the network: ggated will export the disk over the network, and gmirror will handle the replication across the ggated device.

If you are unfamiliar with the GEOM framework, you may want to read the following documentation before continuing:
– http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom.html
– http://phaq.phunsites.net/2006/10/28/convert-single-disk-to-geom-mirror/
– Absolute FreeBSD 2nd Edition p529 – p567

Concept:

The GEOM framework introduces another level of high availability to FreeBSD. The different GEOM classes are extremely flexible and allow for stackable setups. In the following example, we are going to build a setup that allows for high availability without having to use shared storage. For the sake of simplicity, we will identify the two servers as follows:

Server_A –> Master server (192.168.1.100)
Server_B –> Slave server (192.168.1.101)

Both servers will share a single disk image that is mounted as /home/sites. Think of this like an NFS server on steroids (in the most basic sense), the primary difference being that GEOM gate only allows one server's image to be actively in use at a time. However, if Server_A fails or otherwise goes offline, we will allow Server_B to mount /home/sites, and Server_B's Apache will be able to serve the sites until Server_A is back online.

Configuration:

First, we need to get GEOM gate set up on both servers. Remember, this is similar to an NFS server setup. So with that in mind, log onto the slave server (Server_B) and create the disk image that will be exported to the master server (Server_A).

On Server_B –> Slave server (192.168.1.101):

mkdir /root/images
truncate -s500m /root/images/sitesbackup.img

Then set up /etc/gg.exports to allow our master server (Server_A) to access this disk image:

vi /etc/gg.exports
192.168.1.100 RW /root/images/sitesbackup.img

Now turn on ggated so our image can be served across the network:

ggated
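
To confirm ggated is accepting connections, you can check for its TCP listener (it listens on port 3080 by default; this is just a sanity check):

sockstat -4 -l | grep ggated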

There are more steps involved in setting up the primary server. To start, let's attach the ggate device to the remote file. On Server_A –> Master server (192.168.1.100), the image we made on the slave server will become available on the master server as /dev/ggate0 (here, backupserver is a hostname that resolves to the slave server, 192.168.1.101):

ggatec create backupserver /root/images/sitesbackup.img

Next, let's set up a disk image on the master server (Server_A) that will be mirrored against the slave server's disk image:

mkdir /root/images
truncate -s500m /root/images/sitesprimary.img

Time to set up gmirror to mirror both of the disk images from Server_A and Server_B in a RAID 1. First, create a mirror and pull the remote disk image (/dev/ggate0) into a mirror labeled remotemirror:

gmirror label remotemirror /dev/ggate0

Now, let's attach the disk image we just created on the primary server. This makes the image available as /dev/md0:

mdconfig -a -t vnode -f /root/images/sitesprimary.img

Finally, let's insert sitesprimary.img into the mirror:

gmirror insert remotemirror /dev/md0

Now that we have the two images set up in a RAID 1, let's see what gmirror has to say about this:
– md0 is the image on the master server
– ggate0 is the image on the slave server
– /dev/mirror/remotemirror is the RAID 1 that will be mounted as /home/sites

[root@dev01 ~]# gmirror status
               Name    Status  Components
mirror/remotemirror  COMPLETE  md0
                               ggate0

Now that we have our mirror set up, we need to put a filesystem on it and mount it (create the mount point first if it does not exist):

newfs -U /dev/mirror/remotemirror
mkdir -p /home/sites
mount /dev/mirror/remotemirror /home/sites

You now have a distributed mirror mounted on /home/sites on the primary server.

Freevrrpd:

A quick primer is in order so you have a basic idea of how this works. freevrrpd is similar to Heartbeat on Linux. With freevrrpd installed on both servers, each can detect when the other is up or down. An example may explain this better:

The master and slave servers are both running freevrrpd. If the master server goes down, we want the IPs and the /home/sites directory to be transferred to the slave server. Once this is complete, the slave server will be able to serve the websites in the same fashion the master did. When the master server finally comes back online, freevrrpd will automatically transfer the IPs and /home/sites back to the master server.

Here is what makes freevrrpd interesting. Let's say the master server goes offline for some reason; the slave server will detect this and then execute a custom script called the masterscript. The masterscript is the script you can use to turn services on or off, or perform any other tasks needed when making the slave server the primary server. Once the master server comes back up, freevrrpd on the slave server will run its exit script, called the backupscript. This custom script allows you to perform any necessary tasks to have the server become a slave once more.

Configuration:

On Server_A –> Master server (192.168.1.100), install freevrrpd:

pkg_add -r freevrrpd

Time to configure it:

vi /usr/local/etc/freevrrpd.conf

[VRID]
serverid = 1
interface = lnc1
priority = 255
addr = 192.168.1.102/32
password = borkborkborkborkbork
useVMAC = no
sendgratuitousarp = yes
masterscript = /usr/local/bin/master_script.sh
backupscript = /usr/local/bin/backup_script.sh
# Note: addr is the floating IP that will be transferred between both servers.
# Note2: Please refer to the freevrrpd man page for what each of these directives does.

On Server_B –> Slave server (192.168.1.101), install freevrrpd:

pkg_add -r freevrrpd

Time to configure it:

vi /usr/local/etc/freevrrpd.conf

[VRID]
serverid = 2
interface = lnc1
priority = 200
addr = 192.168.1.102/32
password = borkborkborkborkbork
useVMAC = no
sendgratuitousarp = yes
masterscript = /usr/local/bin/master_script.sh
backupscript = /usr/local/bin/backup_script.sh

OK, now that we have this installed, we need to set up the scripts that will be executed when one server fails or comes back up.

Here is a summary of what each script does:

1. Master server: /usr/local/bin/master_script.sh
This script applies when the server is first booted up, and assumes that it is going to regain control of /home/sites from the slave server.

The following tasks are executed:
– Removes any hard drive images from the mirror if they exist
– Uses ggatec to establish a connection to the slave server's shared disk image
– Sets up the mirror between the disk image on the master server and the one on the slave server
– Syncs the slave server's data onto the primary server's image, since the script assumes the slave has the most up-to-date content
– Finally, mounts the mirror as /home/sites

2. Master server: /usr/local/bin/backup_script.sh
This script is really only provided for us to test failing over from the master to the slave server. freevrrpd will most likely never use it, since as far as freevrrpd is concerned the master server is expected to stay up 100% of the time.

3. Slave server: /usr/local/bin/master_script.sh
This script is executed when the master server fails. It allows the slave server to become the sole server in charge of providing the /home/sites directory. The following tasks are executed:
– Kills the ggated daemon
– Fscks the disk image to ensure there is no corruption
– Mounts the slave server's disk image as /home/sites

4. Slave server: /usr/local/bin/backup_script.sh
This script puts the slave server back into a state where it is simply a slave and no longer serving /home/sites. The following tasks are executed:
– Unmounts /home/sites
– Starts the ggated daemon so the master server can connect and mount the disk image

Please note: the following scripts do not have any error checking. They are provided only to give you a basic idea of what they need to contain. Your scripts will be different depending on the exact type of setup you have, hence why these are only barebones scripts. In short, you are better off writing your own and just using these as a basic template.

Server_A: Master Server: 192.168.1.100
master_script.sh

#!/usr/local/bin/bash

# Drop any stale components from the mirror before rebuilding it
gmirror deactivate remotemirror md0
gmirror deactivate remotemirror ggate0

# Reconnect to the slave server's exported disk image until /dev/ggate0 appears
while [ ! -c /dev/ggate0 ]; do
	ggatec create -d 5 192.168.1.102 /root/images/sitesbackup.img
	sleep 1
done

# Attach the local disk image as /dev/md0 if it is not already attached
mdconfig=`mdconfig -l`

if [ "$mdconfig" = "md0" ]; then
	echo "md0 already exists"
	sleep 1
else
	mdconfig -a -t vnode -f /root/images/sitesprimary.img
	echo "had to create md0"
	sleep 1
fi

gmirror deactivate remotemirror ggate0
gmirror deactivate remotemirror md0

# Rebuild the mirror, preferring ggate0 (the slave's image) since it holds the newest data
gmirror label -v -n -b prefer remotemirror ggate0
echo "loaded remotemirror ggate0"
sleep 1
gmirror insert remotemirror md0
echo "loaded local md0"
sleep 1
gmirror rebuild remotemirror md0

# Wait for the rebuild to finish before mounting
while [ "`gmirror status | grep remotemirror | awk '{print $2}'`" != "COMPLETE" ]; do
	sleep 1
done

/sbin/mount /dev/mirror/remotemirror /home/sites/

backup_script.sh

#!/usr/local/bin/bash

# Release /home/sites and tear down the mirror so the slave can take over
umount /home/sites
gmirror deactivate remotemirror ggate0
gmirror deactivate remotemirror md0

# Disconnect from the slave server's exported image
ggatec destroy -u 0

Server_B: Slave Server: 192.168.1.101
master_script.sh

#!/usr/local/bin/bash

# Stop exporting the disk image; this server will now use it directly
pkill ggated

# Attach the disk image as /dev/md0 if it is not already attached
if [ ! -c /dev/md0 ]; then
        mdconfig -a -t vnode -f /root/images/sitesbackup.img
fi

# The master may have died mid-write, so check the filesystem before mounting
fsck -t ufs -y /dev/md0

sleep 2
# Mount via the mirror if it is still present; otherwise mount the raw image
if [ -c /dev/mirror/remotemirror ]; then
        mount /dev/mirror/remotemirror /home/sites
else
        mount /dev/md0 /home/sites
fi

backup_script.sh

#!/usr/local/bin/bash

# Give up /home/sites and resume exporting the disk image to the master
umount /home/sites
/sbin/ggated

Failover Testing:

So how do you test this out to ensure it's working?

1. Fail the master server:

Log onto the master server (Server_A) and first verify that everything is working between the master and slave. This will show us that the ggate device is working:

[root@dev01 ~]# ggatec list
ggate0

This will show us whether both images are set up in the RAID:

[root@dev01 ~]# gmirror status
               Name    Status  Components
mirror/remotemirror  COMPLETE  md0
                               ggate0

Remember, md0 is the master server's image and ggate0 is the slave server's image.

Fail the master server by running the following:

/usr/local/bin/backup_script.sh

Verify the master server is no longer able to handle requests, using the checks shown below:
– gmirror status should return no output
– ggatec list should return no output
– /home/sites should no longer be mounted on the master server
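
For example, run the following on the master; each should now come back empty:

gmirror status
ggatec list
df | grep /home/sites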

Now log into the slave server and run the following:

/usr/local/bin/master_script.sh

Verify the slave server can mount the /home/sites directory:

df
touch /home/sites/file_from_slave_server.txt

2. Fail the slave server back to the master server:

We are now assuming that the master server came back online. Log onto the slave server (Server_B) and run the freevrrpd exit script to release control back to the master server:

/usr/local/bin/backup_script.sh

Verify /home/sites is no longer mounted on this server:

df

Confirm ggated is now accepting connections:

ps -waux | grep ggated

Log onto the master server (Server_A), and make this server the master server again:

/usr/local/bin/master_script.sh

Verify everything is working by doing the following:

[root@dev01 ~]# gmirror status
               Name    Status  Components
mirror/remotemirror  DEGRADED  ggate0
                               md0 (6%)

* This shows that the local md0 image is receiving the changes from the slave server's image.
During this time, /home/sites is available to be written to.
** Verify the following file exists if you followed the first step:
/home/sites/file_from_slave_server.txt

3. Fail the slave server:

Just pull the ethernet cable or shut it down. Log onto the master server and verify that /home/sites can still be written to:

touch /home/sites/masterfiletest.txt

Now turn the slave server back on and run the following to get everything working again:

mdconfig -a -t vnode -f /root/images/sitesbackup.img (can be put into an rc script)
ggated (can also be put into an rc script)
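
If you would rather have these run automatically at boot, one minimal approach (a sketch; adjust the paths for your setup) is to append them to /etc/rc.local on the slave server:

/sbin/mdconfig -a -t vnode -f /root/images/sitesbackup.img
/sbin/ggated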

Then log into the master server and run the following:

ggatec create -d 5 192.168.1.102 /root/images/sitesbackup.img
gmirror rebuild remotemirror ggate0

Once the mirror is done rebuilding, ensure the file you touched is still there! The way I set this up, if the slave fails, manual intervention is required to have the master server connect back to the slave; you just have to run the two commands above. It is set up like this because I don't think the client wants to risk data loss. So be sure to set up monitoring to detect when this happens.

So why are we using disk images instead of partitions?

1. This allows for on-the-fly setups on existing servers. If you used partitions to achieve this, you would have to either reformat the hard drive, or work some voodoo with moving data around to free up a partition (if that's even possible without data loss).

2. Ease of adding disk space to the image

3. Emergency migrations. If the solution were to totally fail, you can bring the server up in single user mode and cp the disk image to a new server quickly.