Duplicity manager

Coming up with a secure and cost-effective backup solution can be a daunting task, as there are many considerations that must be taken into account. Some of the more basic items to think about are:

- Where to store your backups?
- Is the storage medium redundant?
- How will data retention be handled?
- How will the data at rest be encrypted?

A tool that I prefer for performing encrypted, bandwidth-efficient backups to a variety of remote backends, such as Rackspace Cloud Files, Amazon S3, and many others, is Duplicity.

Taken from Duplicity’s site (http://duplicity.nongnu.org): Duplicity backs up directories by producing encrypted tar-format volumes and uploading them to a remote or local file server. Because duplicity uses librsync, the incremental archives are space efficient and only record the parts of files that have changed since the last backup. Because duplicity uses GnuPG to encrypt and/or sign these archives, they will be safe from spying and/or modification by the server.
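
To see what the wrapper is automating, here is roughly what a bare Duplicity session looks like. A local file:// target is used purely for illustration; duplicity-manager would point at your cloud backend instead.

# Bare Duplicity usage for a single directory (illustrative only)
export PASSPHRASE=YOUR_PASSPHRASE

duplicity full /etc file:///backups/etc          # first run: full backup
duplicity incremental /etc file:///backups/etc   # later runs: only changed data
duplicity verify file:///backups/etc /etc        # compare archive against source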

Duplicity-manager was created to act as a wrapper script for the tasks I commonly perform with Duplicity.

Features

- Simple invocation from cron for nightly backups.
- All-in-one script for performing backups, restores, and searches for content from a specific time period.
- Provides an optional menu driven interface to make backups as painless as possible.

Configuration

The currently configurable options are listed below:

# Configuring either Rackspace Cloud Files or Amazon S3 backends

# List of directories to backup
INCLUDE_LIST=( /etc /var/www /var/lib/mysqlbackup )

# GPG Passphrase for encrypting data at rest
# You can use the following to generate a decent GPG passphrase, just be sure
# to store it somewhere secure off this server.
# < /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c64
export PASSPHRASE=YOUR_PASSPHRASE

# Backup Retention 
retention_type=remove-older-than
retention_max=14D
 
# Number of full backups to keep (alternative to above)
# retention_type=remove-all-but-n-full
# retention_max=3

# Force Full Backup Every XX Days
full_backup_days=7D

# Restore Directory
restore=/tmp
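
For reference, the retention settings map directly onto Duplicity’s own cleanup commands; with the values above, the script ends up running the equivalent of the following (the target URL is just an example):

# Equivalent Duplicity cleanup calls
duplicity remove-older-than 14D --force file:///backups/etc

# Or, with the alternative settings uncommented:
duplicity remove-all-but-n-full 3 --force file:///backups/etc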

Usage

./duplicity-manager.sh 

Options:

--backup:                      runs a normal backup based off retention settings
--backup-force-full:           forces a full backup
--list-files [age]:            lists the files currently stored in backups
--restore-all [age]:           restores everything to restore directory
--restore-single [age] [path]: restores a specific file/dir to restore directory
--show-backups:                lists full and incremental backups in the archive
--menu:                        user friendly menu driven interface

Examples:

duplicity-manager.sh --list-files 0D              Lists the most recent files in archive
duplicity-manager.sh --restore-all 2D             Restores everything from 2 days ago
duplicity-manager.sh --restore-single 0D var/www/ Restores /var/www from latest backup
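
For the curious, the single restore maps onto Duplicity’s --file-to-restore and -t (time) options. A hand-rolled equivalent would look something like this (the target URL and archive layout are assumptions):

# Hand-rolled equivalent of --restore-single (target URL is an example)
duplicity restore --file-to-restore var/www -t 0D \
    file:///backups/server /tmp/var/www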

Implementation

Download the script to the desired directory and set it to be executable:

# Linux based systems
cd /root
git clone https://github.com/stephenlang/duplicity-manager
chmod 755 duplicity-manager/duplicity-manager.sh

After configuring the tunables in the script (see above), create a cron job to execute the script once a day:

# Linux based systems
crontab -e
10 3 * * * /root/duplicity-manager/duplicity-manager.sh
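
Cron mails any output to the local user by default; if you would rather keep a log instead, redirect it (the log path is just an example):

10 3 * * * /root/duplicity-manager/duplicity-manager.sh >> /var/log/duplicity-manager.log 2>&1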

As with any backup solution, it is critical that you test your backups often to ensure your data is recoverable in the event a restore is needed.
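
A simple restore drill using the options above is to restore the most recent backup into the restore directory and spot-check it against the live data. This assumes /etc is in your INCLUDE_LIST and restore=/tmp:

# Restore everything from the latest backup, then compare one tree
/root/duplicity-manager/duplicity-manager.sh --restore-all 0D
diff -r /etc /tmp/etc | head -20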

Scrutiny

Being asked at 9AM to determine what caused a system to have problems at 2:30AM can be a wearying task. If the normal system logs do not give us any real hints about what may have caused the issue, we oftentimes get trapped having to give the really poor answer of “We cannot replicate the issue that you experienced overnight, and the logs are not giving us enough information to go on, so we’ll have to watch for it tonight to see if it re-occurs.” Times like that make a sysadmin feel completely helpless.

What if you could see what processes were running on the system at prescribed intervals? And not just processes: what queries were running, how many people were hitting Apache, what types of network connections you were getting, on top of all the other information that can be gathered from tools like vmstat, iostat, etc. Now you can draw better conclusions because you will know what was happening at that single point in time.

Welcome Scrutiny, located over on github! It is a tool based on recap, rewritten to suit my own needs for portability between Red Hat, Debian, and FreeBSD based systems, as well as to allow simple modification of the metrics gathered so they best suit your own environment.
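
To give a sense of what each snapshot contains, here is a minimal sketch of the idea. This is not Scrutiny itself, just the standard tools mentioned above run in sequence and filed by timestamp:

#!/bin/sh
# Minimal point-in-time snapshot sketch (illustrative, not the real script)
basedir=/var/log/scrutiny
mkdir -p "$basedir"
out="$basedir/$(date +%Y%m%d-%H%M%S).log"
{
    echo "=== processes ===";  ps auxww
    echo "=== memory/cpu ==="; vmstat 1 5
    echo "=== storage ===";    df -h; iostat 1 3
    echo "=== network ===";    netstat -an
    echo "=== mysql ===";      mysqladmin processlist
    echo "=== apache ===";     curl -s http://localhost/server-status?auto
} > "$out" 2>&1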

Features

- Simple code base for quick customizations
- Ability to enable/disable groups of checks
- Easy to add/modify/remove individual metric gathering
- Uses tools such as ps, top, df, vmstat, iostat, netstat, mysqladmin, and Apache’s server-status module to help create a point-in-time snapshot of the system’s events.

Configuration

The currently configurable options and thresholds are listed below:

# Enable / Disable Statistics
process_log=on
resource_log=on
network_log=on
mysql_log=on
apache_log=on

# Retention Days
retention=2

# Logs
basedir=/var/log/scrutiny
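
The retention setting suggests a simple find-based prune; a plausible sketch of how old snapshots could be expired (illustrative, not necessarily how the script does it):

# Expire snapshot logs older than the retention window (illustrative)
find "$basedir" -type f -mtime +"$retention" -delete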

Implementation

Download the script to the desired directory and set it to be executable:

# Linux based systems
cd /root
git clone https://github.com/stephenlang/scrutiny
chmod 755 scrutiny/linux/scrutiny.sh

# FreeBSD based systems
cd /root
git clone https://github.com/stephenlang/scrutiny
chmod 755 scrutiny/freebsd/scrutiny.sh

After configuring the tunables in the script (see above), create a cron job to execute the script every 10 minutes (the path below assumes the Linux version; adjust for FreeBSD):

crontab -e
*/10 * * * * /root/scrutiny/linux/scrutiny.sh

Now, days later, if a problem was reported during the overnight hours and you were able to narrow it down to a specific timeframe, you will be able to look at the point-in-time snapshots of the system events that occurred:

ls /var/log/scrutiny

Server monitoring script

Without an agent-based monitoring system, monitoring your server’s internals for items such as CPU, memory, storage, and processes becomes very difficult without manually checking. There are many reputable monitoring services on the web, such as Pingdom (www.pingdom.com), and most hosting providers offer a monitoring system, but they do not provide an agent. Therefore, you can only do basic external checks such as ping, port, and HTTP content checks. There is no way to report if your MySQL replication has failed, if some critical process has stopped running, or if you’re about to max out your / partition.

This simple bash script, located on github, is meant to complement these types of monitoring services. Just drop the script into a web-accessible directory, configure a few options and thresholds, set up a URL content check that looks at the status page searching for the string ‘OK’, and then you can rest easy at night knowing that your monitoring service will alert you if any of the script’s conditions are triggered.

Security note: To avoid revealing information about your system, it is strongly recommended that you place this and all web-based monitoring scripts behind an htaccess file that requires authentication, whitelisting your monitoring servers’ IP addresses if they are known.
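
As a sketch, an Apache 2.2-style .htaccess that satisfies either a password or a whitelisted source IP could look like this (the paths and IP are placeholders):

# Example .htaccess: require either basic auth or a whitelisted IP
AuthType Basic
AuthName "Monitoring"
AuthUserFile /etc/httpd/monitoring.htpasswd
Require valid-user
Order Deny,Allow
Deny from all
Allow from 203.0.113.10
Satisfy Any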

Features

- Memory Check
- Swap Check
- Load Check
- Storage Check
- Process Check
- Replication Check

Configuration

The currently configurable options and thresholds are listed below:

# Status page
status_page=/var/www/system-health-check.html

# Enable / Disable Checks
memory_check=off
swap_check=on
load_check=on
storage_check=on
process_check=on
replication_check=off

# Configure partitions for storage check
partitions=( / )

# Configure process(es) to check
process_names=( httpd mysqld postfix )

# Configure Thresholds
memory_threshold=99
swap_threshold=80
load_threshold=10
storage_threshold=80
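
To illustrate how these options fit together, here is a hypothetical sketch of a single check feeding the status page; the variable names follow the config above, but the logic is illustrative only:

# Hypothetical load check writing to the status page (illustrative)
load=$(awk '{ print int($1) }' /proc/loadavg)
if [ "$load" -ge "$load_threshold" ]; then
    echo "ALARM: load average $load is at or above $load_threshold" >> "$status_page"
else
    echo "OK: load average $load" >> "$status_page"
fi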

Implementation

Download the script to the desired directory and set it to be executable:

cd /root
git clone https://github.com/stephenlang/system-health-check
chmod 755 system-health-check/system-health-check.sh

After configuring the tunables in the script (see above), create a cron job to execute the script every 5 minutes:

crontab -e
*/5 * * * * /root/system-health-check/system-health-check.sh

Now configure a URL content check with your monitoring provider’s tools to check the status page, searching for the string “OK”. Below are two examples:

http://1.1.1.1/system-health-check.html
http://www.example.com/system-health-check.html
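
You can also emulate the provider’s content check by hand with curl to confirm the page is serving what you expect (the URL is an example):

curl -s http://www.example.com/system-health-check.html | grep OK \
    || echo "status page is reporting a problem"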

Testing

It is critical that you test this monitoring script before you rely on it. Bugs always exist somewhere, so test this before you implement it on your production systems! Here are some basic ways to test:

1. Configure all the thresholds really low so they will create an alarm. Manually run the script or wait for the cronjob to fire it off, then check the status page to see if it reports your checks are now in alarm.

2. To test out the process monitoring (assuming the system is not in production), configure the processes you want the script to check, then stop the process you are testing, and check the status page after the script runs to see if it reports your process is not running.

3. To test out the replication monitoring (assuming the system is not in production), log onto your MySQL slave server and run ‘stop slave;’. Then check the status page after the script runs to see if it reports an error on replication.
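
For reference, the stop/start sequence from a shell (again, only on a non-production slave):

mysql -e 'STOP SLAVE;'     # the next script run should flag replication
# ...wait for the cron run and check the status page...
mysql -e 'START SLAVE;'    # resume replication once verified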