Stephen R Lang – Page 15 – Random notes and musings

SSH Tips and Tricks

SSH is like the Swiss army knife of the unix and linux world. It is a tool that helps solve a variety of issues when it comes to passing traffic over port 22.

This post is going to be a collection of guides showing some of the various solutions that can be accomplished with SSH. This guide assumes you are running Linux on both your office and home workstations.

SSH proxies for web traffic

Suppose you are at a hotel on a public WIFI, and don’t have access to a VPN. How can you still surf the web securely? Assuming you have a server setup at home, or some other secure location, you can simply proxy all your traffic through your home server using an encrypted SSH tunnel.

On your local workstation, run the following:

[user@workstation ~]# ssh -D 7070 [email protected]

Now with the proxy setup, you can route your traffic through a SOCKS 5 proxy in your browsers configuration, or using a plugin such as FoxyProxy. Just be sure that under the SOCKS host, you use:

Host:  127.0.0.1
Port:  7070
Type:  SOCKS v5

SSH proxies for SSH traffic

Using the same scenario, we’re still at the hotel using public WIFI, and don’t have access to a VPN. How can we pass our SSH traffic through a secure proxy server?

On your local workstation, run the following:

[user@workstation ~]# ssh -D 7070 [email protected]

Now configure your .ssh/config to pass all SSH traffic through the SOCKS proxy:

[user@workstation ~]# vim .ssh/config
...
# Filter through socks5 proxy
Host *
ProxyCommand /usr/bin/nc -X 5 -x 127.0.0.1:7070 %h %p
IdentityFile ~/.ssh/id_dsa
ForwardAgent yes
GSSAPIAuthentication no
VerifyHostKeyDNS no
HashKnownHosts no
TCPKeepAlive yes
ServerAliveInterval 300
...

How to X forward applications located on remote servers

Lets suppose you want to access a specific application from your office computer not installed on your home workstation maybe, or want to be able to pick up where you left off on something. For this example, we’ll be accessing Firefox from the office computer, on the home computer.

This would be done on your home computer by:

[user@workstation ~]# ssh -X user@remoteserver firefox

What if you needed to access a internal Windows server from your home computer, but it was only accessible from your office computer? Assuming you can SSH to your office computer, you can run:

[user@workstation ~]# ssh -X user@remoteserver rdesktop -a 24 -u user -f 192.168.1.100

How to create a reverse SSH tunnel

Reverse SSH tunnels are useful when the firewall on the server you need to connect to is blocking inbound SSH connections, but allows for outbound SSH connections. So this is where we simply have the remote server behind the firewall establish the SSH tunnel.

Please understand the security implications of doing this! If you have the ability to use a VPN to access the remote server, use that instead!

On the remote server behind the firewall, create a SSH tunnel to your workstation by:

[root@remoteserver ~]# ssh -R 7022:localhost:22 user@yourworkstation

Now from your workstation, connect to the remote server by:

[user@workstation ~]# ssh -p 7022 localhost

If you want to keep this tunnel up so you don’t have to keep re-establishing the connection when/if it drops, you can script it on the remote server by placing the following script into a cronjob that runs every 5 minutes or something. You will need to have SSH keys in place prior to doing this:

[root@remoteserver ~]# vim /root/ssh-check-tunnel.sh
#!/bin/bash

# Create tunnel if its not established
COMMAND='ssh -N -R 7022:localhost:22 user@yourworkstation'

# Check to see if tunnel is currently up
CHECK_TUNNEL=`ps -eo args | grep "$COMMAND" | grep -v grep`

# If the tunnel is down, create it
if [ -z "$CHECK_TUNNEL" ] ; then
$COMMAND
fi

Then make it executable

[root@remoteserver ~]# chmod 755 /root/ssh-check-tunnel.sh

Finally, drop it into cron:

[root@remoteserver ~]# crontab -e
*/5 * * * * /root/ssh-check-tunnel.sh

How to access servers behind a NAT transparently

Here we will be creating an SSH VPN. However if a regular VPN is available, use that instead. Also understand the security implications of running this before you follow this guide!

To make this a bit easier to understand, here are our devices and IP’s:

workstation:  Ubuntu based workstation
remoterouter:  FreeBSD router running NAT
remoteserver:  Server behind router with a private IP

Assuming you have a workstation running Ubuntu, and you want to be able to access servers that are behind a FreeBSD router running NAT, how can you go about accessing the private IP’s of those remote servers from your workstation simply by running:

[user@workstation ~]# ssh [email protected]

To do this, you need to create a tunnel from your workstation to the FreeBSD router that is providing the private IP’s to the remote servers.

On your workstation, create a SSH key pair without a passphrase (Just take the defaults), and copy that over to the FreeBSD router running NAT:

[root@workstation ~]# ssh-keygen -t dsa
[root@workstation ~]# scp /root/.ssh/id_dsa.pub root@freebsdrouter:/root/.ssh/authorized_keys

Then on your workstation, allow SSHD to create a tunnel:

[root@workstation ~]# vim /etc/ssh/sshd_config
...
PermitTunnel yes
...

Then restart SSH:

[root@workstation ~]# service ssh restart

Now comes the hard part, configure the tunnel interface on your Ubuntu workstation. This guide is assuming 10.1.0.100 and .200 are not in use already on your network. Please be sure to replace ‘remoterouter’ with the IP of your remote server:

[root@workstation ~]# vim /etc/network/interfaces
...
iface tun0 inet static
pre-up ssh -f -w 0:0 root@remoterouter 'ifconfig tun0 10.1.0.200 10.1.0.100'
pre-up sleep 5
address 10.1.0.100
pointopoint 10.1.0.200
netmask 255.255.255.0
up route add -net 10.10.10.0 netmask 255.255.255.0 gw 10.0.0.100 tun0
...

Note, if the remote router is running CentOS 5, you may need to run:

[root@remoterouter ~]# modprobe -av tun
[root@remoterouter ~]# echo modprobe tun >> /etc/rc.modules
[root@remoterouter ~]# chmod -x /etc/rc.modules

On your workstation, bring up the tunnel:

[root@workstation ~]# ifup tun0

Finally, wave a dead chicken over the alter and try to ping one of the servers behind the NAT:

[root@workstation ~]# ping 10.1.0.50

If that works, great! You just created an SSH VPN! You will be able to access any of your servers that are behind this NAT.

If the tunnel gets disconnected, you will need to run the following to re-establish the tunnel:

[root@workstation ~]# ifdown tun0
[root@workstation ~]# ifup tun0

Checking the modulus of a SSL key and certificate

When you get a new SSL certificate to install, how can you be sure the key matches the certificate? If they do not match, the web server may fail to start or SSL in general for your website may not work.

Fortunately openssl allows us to compare the modulus of the SSL key and certificate easily enough by:

[root@web01 ~]# openssl rsa -noout -modulus -in yourdomain.key | openssl md5
[root@web01 ~]# openssl x509 -noout -modulus -in yourdomain.crt | openssl md5

If the resulting MD5 checksums match, then the key matches the certificate. If they do not match for some reason, that typically indicates that the key used to generate the original CSR is different from the key you are currently testing against.

Rsync Migration Guidelines

There are numerous gotcha’s and things that you must be aware of before you can confidently perform a migration. Migrations go bad all the time. There is no guranteed way of knowing when it will fail. But there are ways to minimize the potential for some problem creeping up a month after the migration. Outlined below are steps that should be taken before proceeding with the migration.

1. Evaluate the server to be migrated. This involves:
– How large are the drives?
– Are there any directories that have hundreds of thousands of files?
– Are there any directories that contain thousands of other directories?
– Is the server extremely busy?

2. Check your backups to ensure everything is in place
You must first determine how you are going to do the migration. Are you build a new server, then use the backups to do a server restore to it? Or are you just going to perform a straight rsync migration. If your going to attempt to utilize your system’s backups:
– Check to see when the last known good backup was. You need confirmation that its good, and also attempt to get a rough eta on how long a server restore will take.
– Setup a new server with the same EXACT specs as the orginal server.

3. Ask questions
Below are some basic questions that should be asked before performing a migration. It may help shed insight on the server’s day to day tasks to ensure a smooth migration. Things to ask:
– Are their any known quirks experienced from time to time on the server?
– What are the key critical services that need special attention when migrating?
– How can you test the server to ensure the migration was successful? ie. websites that can check that utilizies both apache and mysql
– When is a good time the server can be shutdown the services on the production box so the final rsync can be performed?

4. Perform phase 1 of 2 of the rsync migration
The goal here is to create a base system on the new server. You want to be able to get the majority of the data copied over. This is to minimize the downtime the public will have during the final rsync phase. There is no need to schedule this, this is safe to do whenever.
– If you are utilizing your backups, get the new server jumped, throw on a temp ip, and do a full server restore.
– If you are just going to use rsync for everything, be aware that the server may seem sluggish as rsync may eat up the system resources.

To perform the rsync, log onto the old server (the one currently in production), and start a screen session:

screen -S migrations

Create a shared key between the old server and the new server

cd / && exec bash
for i in `ls |grep -v 'proc|etc'` do rsync -axvz --delete-after -e ssh $i [email protected]:/; done

To disconnect from the screen session, just hit:

ctrl-a then hit d

Depending on how much data and what the data structure is, at best your looking at 3-5G per hour. To speed up, use a cross connect on the gig ethernet ports.

5. Final rsync

At specified time stop all services on the production machine except sshd. (its better to just drop to single user mode with networking). Then using the same screen session, type:

for i in s | grep -v proc' do rsync -axvz --delete-after -e ssh $i [email protected]:/; done

Once complete, reboot and wave dead chicken over alter. You will want to swap the ips after you verified the new server at least boots.

6. Testing

This involves:
– Confirm websites work properly
– Confirm you can send and receive email
– Confirm mysql is functioning
– Go through error logs and correct any problems.

7. Troubleshooting

If machine doesn’t boot, you may have to fix grub (redhat). Also make sure /etc/fstab and /boot/grub/grub.conf have the labels setup right, or just specify the device: ex. /dev/hda1

Using gmirror, ggated, and freevrrpd to create a high availability environment

Technologies used:
– FreeBSD 7
– GEOM gmirror
– GEOM gate
– freevrrpd

Intended Uses:
– High-Availability Apache Setup
– High-Availability NFS Setup
– High-Availability Mysql Setup

Example Scenario:

We want borkcorp’s web server to be able to fail over to their dedicated mysql server. This will be done by mirroring a 500MB file-backed filesystem across the network. ggated is going to provide disk exporting and gmirror for the replication across the ggated device.

If you are unfamiliar with the GEOM framework, you may want to read the following documenation before continuing:
– http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/geom.html
– http://phaq.phunsites.net/2006/10/28/convert-single-disk-to-geom-mirror/
– Absolute FreeBSD 2nd Edition p529 – p567

Concept:

The GEOM frameworks introduces another level of high availability to FreeBSD. The different classes of GEOM are extremely flexible and allows for stackable setups. So in the following example, we are going to be building a setup that allows for high availability without having to use shared storage. For sake of ease, we are going to identify the 2 servers as follows:

Server_A –> Master server (192.168.1.100)
Server_B –> Slave server (192.168.1.101)

Both servers will be sharing a single disk image that is mounted as /home/sites. Think of this like a NFS server on steroids (in the most basic sense), the primary difference being that geom gate only allows one server’s image be actively in use at a time. However, if Server_A fails or otherwise goes offline, we will allow Server_B to mount /home/sites and Server_B’s Apache will be able to serve the sites until Server_A is back online.

Configuration:

First, we need to get GEOM gate setup on both servers. Remember, this is similar to a NFS server setup. So with that in mind, log onto the slave server (Server_B) and create the disk image that will be copied over to the master server (Server_A)

On Server_B –> Slave server (192.168.1.101):

mkdir /root/images
truncate -s500m /root/images/sitesbackup.img

Then setup /etc/gg.exports to allow our master server (Server_A) to access this disk image:

vi /etc/gg.exports
192.168.1.100 RW /root/images/sitebackup.img

Now turn on ggated so our image can be served across the network:

ggated

There are more steps involved in setting up the primary server. So to start, lets attach the ggate device to the remote file. On Server_A –> Master server (192.168.1.100), the image that we made on the slave server will now be mountable on the master server as /dev/ggate0

ggatec create backupserver /root/images/sitebackup.img

Next, lets setup a disk image on the master server (Server_A), that will be mirrored against the slave servers disk image:

mkdir /root/images
truncate -s500m /root/images/sitesprimary.img

Time to setup gmirror to mirror both of the disk images from Server_A and Server_B to have them setup in Raid 1. First, create a mirror and pull in the remote disk image (/dev/ggate0) into a mirror labeled remotemirror.

gmirror label remotemirror /dev/ggate0

Now, lets mount our disk image that we just created on the primary server. This sets the image in /dev/md0:

mdconfig -a -t vnode -f /root/images/sitesprimary.img

Finally, lets insert the sitesprimary.img into the mirror:

gmirror insert remotemirror /dev/md0

So now that we have the 2 images setup in a Raid 1, lets see what gmirror has to say about this:
– md0 is the image on the master server
– ggate0 is the image on the slave server
– /dev/mirror/remotemirror is the raid 1 that will be mounted as /home/sites

[root@dev01 ~]gmirror status
Name Status Components
mirror/remotemirror COMPLETE md0
ggate0

Now that we have our mirror setup, we need to put a file system on it:

newfs -U /dev/mirror/remotemirror
mount /dev/mirror/remotemirror /home/sites

You now have a distributed mirror mounted on /home/sites on the primary server.

Freevrrpd:

A quick primer is in order so you have a basic idea how this works. Freevrrpd is similar to heartbeat for Linux. With freevrrpd installed on both servers, it can detect when one server is up or down. An example may better explain this:

The master server and slave servers are both running freevrrpd. If the master server goes down, we want the ip’s and the /home/sites directory to be transferred to the slave server. Once this is complete, the slave server will be able to serve up the websites in the same fashion that the master was able to. When the master server finally comes back online, Freevrrpd will automatically transfer the ip’s and /home/sites back to the master server.

Here is what makes freevrrpd interesting. Lets say the master server goes offline for some reason, the slave server will detect this and then execute a custom script called masterscript. This masterscript is the script that you can use to turn on/off services, or perform any other tasks that may be needed when making the slave server the primary server. Once the master server comes back up, freevrrpd on the slave server will run its exit script called backupscript. This custom script allows for you to perform any nessacery tasks to have this server become a slave once more.

Configuration:

On Server_A –> Master server (192.168.1.100), install freevrrpd:

pkg_add -r freevrrpd

Time to configure it:

vi /usr/local/etc/freevrrpd.conf
serverid = 1
interface = lnc1
priority = 255
addr = 192.168.1.102/32
password = borkborkborkborkbork
useVMAC = no
sendgratuitousarp = yes
masterscript = /usr/local/bin/master_script.sh
backupscript = /usr/local/bin/backup_script.sh
# Note: addr is the floating ip that will be transferred between both servers.
# Note2: Please refer to the freevrrpd man page to find what each of these directives do.

On Server_B –> Slave server (192.168.1.101), install freevrrpd:

pkg_add -r freevrrpd

Time to configure it:

vi /usr/local/etc/freevrrpd.conf

[VRID]
serverid = 2
interface = lnc1
priority = 200
addr = 192.168.1.102/32
password = borkborkborkborkbork
useVMAC = no
sendgratuitousarp = yes
masterscript = /usr/local/bin/master_script.sh
backupscript = /usr/local/bin/backup_script.sh

Ok, now we have this installed. We now need to setup the scripts that will be excuted when one server fails or comes back up:

Here is a summary of what each script does:

1. Master server: /usr/local/bin/master_script.sh
This script applies when the server is first booted up, and assumes that it is going to regain control of /home/sites from the slave server.

The following tasks are executed:
– Removes any hard drive images from the mirror if they exist
– Uses ggatec to establish a connection to the slave servers shared disk image.
– Sets up the mirror between the disk image on the master server and the one on the slave server
– The script assumes that the slave server has the most updated content, so it’ll copy its new data onto the primary servers image.
– Finally it mounts the mirror as /home/sites

2. Master server: /usr/local/bin/backup_script.sh
This script is really only provided for us to test failing over the master to the slave server. Freevrrpd will most likely never use this as the master server is expected to stay up 100% of the time as far as freevrrpd is concerned.

3. Slave server: /usr/local/bin/master_script.sh
This script is executed when the master server fails. This allows the slave server to become the sole server in charge of providing the /home/sites directory. The following tasks are executed:
– Destroy the ggated daemon
– Fsck the drive to ensure there was no corruption
– Mount the slave servers disk image as /home/sites

4. Slave server: /usr/local/bin/backup_script.sh
This script puts the slave server back into a state when its simply a slave and no longer serving /home/sites. The following tasks are executed:
-Unmount /home/sites
-Start the ggated daemon so the master server can connect and mount the disk image.

Please note: The following scripts do not have any error checking. They are provided only to give you a basic idea of what they need to contain. Your scripts will be different depending on the exact type of setup you are doing, hense why these are only barebone and basic scripts. So in short, you are better off writing your own and just using these are a basic template.

Server_A: Master Server: 192.168.1.100
master_script.sh

#!/usr/local/bin/bash 

gmirror deactivate remotemirror md0
gmirror deactivate remotemirror ggate0

while [ ! -c /dev/ggate0 ]; do
	ggatec create -d 5 192.168.1.102 /root/images/sitesbackup.img
	sleep 1
done;

mdconfig=`mdconfig -l`

if [ "$mdconfig" = "md0" ]; then
	echo "md0 already exists"
	sleep 1
else
	mdconfig -a -t vnode -f /root/images/sitesprimary.img
	echo "had to create md0"
	sleep 1
fi

gmirror deactivate remotemirror ggate0
gmirror deactivate remotemirror md0

gmirror label -v -n -b prefer remotemirror ggate0
echo "loaded remoteimrror ggate0"
sleep 1
gmirror insert remotemirror md0
echo "loaded local md0"
sleep 1
gmirror rebuild remotemirror md0


while [ `gmirror status |grep remotemirror |awk '{print $2}'` != "COMPLETE" ]; do
	sleep 1
done;

/sbin/mount /dev/mirror/remotemirror /home/sites/

backup_script.sh

#!/usr/local/bin/bash

umount /home/sites
gmirror deactivate remotemirror ggate0
gmirror deactivate remotemirror md0

ggatec destroy -u 0

Server_B: Slave Server: 192.168.1.101
master_script.sh

#!/usr/local/bin/bash

pkill ggated

if [ ! -c /dev/md0 ]; then
        mdconfig -a -t vnode -f /root/images/sitesbackup.img
fi

fsck -t ufs -y /dev/md0

sleep 2
if [  -c /dev/mirror/remotemirror ]; then
        mount /dev/mirror/remotemirror /home/sites
else
        mount /dev/md0 /home/sites
fi

backup_script.sh

#!/usr/local/bin/bash

umount /home/sites
/sbin/ggated

Failover Testing:

So how do you test this out to ensure its working?

1. Fail the master server:

Log onto the master server (Server_A) and verify first that everything is working between the master and slave. This will show us that the ggate device is working:

[root@dev01 ~]# ggatec list
ggate0

This will show us if both images are setup in the raid:

[root@dev01 ~]# gmirror status
Name Status Components
mirror/remotemirror COMPLETE md0
ggate0

Remember, md0 is the master server’s image and ggate0 is the slave server’s image

Fail the master server by running the following:

/usr/local/bin/backup_script.sh

Verify the master server is no longer able to handle requests:
– gmirror status should return no output
– ggatec list should return no output
– /home/sites should no longer be mounted on the master server.

Now log into the slave server and run the following:

/usr/local/bin/master_script.sh

Verify the slave server can mount the /home/sites directory

df
touch /home/sites/file_from_slave_server.txt

2. Fail the slave server back to the master server:

We are now assuming that the master server came back online. Log onto the slaver server (Server_B), and run the freevrrpd exit script to release control back to the master server:

/usr/local/bin/backup_script.sh

Verify /home/sites is no longer mounted on this server

df

Confirm ggated is now accepting connections

ps -waux |grep ggated

Log onto the master server (Server_A), and make this server the master server again:

/usr/local/bin/master_script.sh

Verify everything is working by doing the following:

[root@dev01 ~]# gmirror status
Name Status Components
mirror/remotemirror DEGRADED ggate0
md0 (6%)

*This shows that the local md0 image is getting the changes from the slave servers image.
During this time, /home/sites is availabe to be written to
**Verify the following file exists if you followed the first step:
/home/sites/file_from_slave_server.txt

3. Fail the slave server:

Just pull the ethernet cable or shut it down. Log onto the master server and verify that /home/sites can still be written to.

touch /home/sites/masterfiletest.txt

Now turn the slave server back on and run the following to get everything working again:

mdconfig -a -t vnode -f /root/images/sitesbackup.img (can be put into a rc script)
ggated (can also be put into a rc script)

Then log into the master server and run the following:

ggatec create -d 5 192.168.1.102 /root/images/sitebackup.img
gmirror rebuild remotemirror ggate0

Once the mirror is done rebuilding, ensure the file you touched is still there!!! The way I set this up, if the slave fails, manual intervention is required to have the master server connect back to the slave. You just have to run the 2 commands above. This is setup like this because I don’t think client wants to risk data loss. So be sure to setup monitoring to detect when this happens.

So why are we using disk images instead of partitions?

1. This allows for on the fly setups on existing server setups. If you used partitons to achieve this, you would have to either format the hard drive, or work some voodoo with moving data around to free up a partition. (if thats even possible without data loss)

2. Ease of adding disk space to the image

3. Emergency Migrations. If the solution were to totally fail, you can bring the server up in single user, and cp the disk image to a new server quickly.

FreeBSD dump/restore migration

Utilizing dump/restore to migrate data to a larger drive is one of the easiest and fool proof methods to perform a hard drive upgrade in FreeBSD.

This guide is assuming that you have a vanilla FreeBSD 4.11 install on the primary drive, and a secondary drive setup with the label as /backdrive. This logic should apply with FreeBSD 4, 5, 6, 7.

Here is how we do this:

First, lets just do this step by step, so lets dump the / partition. This is to be used per partition, so when you do df, and you see multiple partition, do this for each partition. But as this server only has 1 partition, simply do:

dump 0f - / > root.dmp

Transfer over the file to the new server:

scp root.dmp [email protected]:/tmp

Next, on the new server, cd into the directory we’ll be restoring, and begin the restore:

cd /backdrive
restore -rf /tmp/root.dmp

Fix the disk label and make /backdrive the / partition to boot:

# Notice we're modifying this for the second disk
disklabel -e -f ad1s1

# Change:
disk: ad1s1 to ad0s1

# And change:
e: 154199170 2097152 4.2BSD 2048 16384 89 # (Cyl. 130*- 9728*)

# To:
a: 154199170 2097152 4.2BSD 2048 16384 89 # (Cyl. 130*- 9728*)

Then fix fstab to point to /dev/ad0s1a for /.

Extended notes on disklabel:
Ok, there is a small science with disklabel. But just watch to 2 things:

1. The e: is the root partition on the second drive. So in disklabel:

change e: to a: (as a: is the default root partition on freebsd).

2. Now you have to match up whats showing in disk label to what is showing in fstab:

Before:
8 partitions: <-- (figure out this nightmare)
# size offset fstype [fsize bsize bps/cpg]
b: 4194304 0 swap # (Cyl. 0 - 261*)
c: 71119692 0 unused 0 0 # (Cyl. 0 - 4426*)
e: 262144 4194304 4.2BSD 2048 16384 94 # (Cyl. 261*- 277*)
f: 524288 4456448 4.2BSD 2048 16384 94 # (Cyl. 277*- 310*)
g: 524288 4980736 4.2BSD 2048 16384 94 # (Cyl. 310*- 342*)
h: 65614668 5505024 4.2BSD 2048 16384 89 # (Cyl. 342*- 4426*)

After:
8 partitions:
# size offset fstype [fsize bsize bps/cpg]
b: 4194304 0 swap # (Cyl. 0 - 261*)
c: 71119692 0 unused 0 0 # (Cyl. 0 - 4426*) 
a: 262144 4194304 4.2BSD 2048 16384 94 # (Cyl. 261*- 277*)
f: 524288 4456448 4.2BSD 2048 16384 94 # (Cyl. 277*- 310*)
g: 524288 4980736 4.2BSD 2048 16384 94 # (Cyl. 310*- 342*)
h: 65614668 5505024 4.2BSD 2048 16384 89 # (Cyl. 342*- 4426*)

And with fstab, I just have to verify each label by testing it and figuring out which is which.

How Qmail Works

Qmail is a very compartmentalized program. Its broken down into multiple tiny programs that govern a very specific piece of the MTA process. This guide documents how Qmail handles email in a nutshell.

Below is a rough diagram of what Qmail looks like:

Messages can enter the mail server in one of 2 ways, either the message came from a remote mailserver like hotmail.com, or the message is being sent from the local server (ie. imap, webmail, mail functions, etc) The 2 daemons that are responsible for this are:

1. qmail-smtpd –> handles mail coming from an outside mail server. ie. hotmail.com

2. qmail-inject –> handles any messages generated locally by the server. ie. imap, webmail, or php mail functions, etc. This service injects the messages directly into the mail queue.

The primary objective of qmail-smtpd and qmail-inject is to pass the message along to qmail-queue.

3. qmail-queue –> This is a complicated program. This writes all the messages to the central queue directory: /var/qmail/queue/. The qmail-queue program can be invoked by qmail-inject for locally generated messages, qmail-smtpd for messages received through SMTP, qmail-local for forwarded messages, or qmail-send for bounced messages. If this is confusing, just remember that this is the program that acually writes the messages to the mail queue. Now, if you are curious like me and want to know the nitty gritty, here it is. /var/qmail/queue is comprised of 5 directories. pid/, mess/, intd/, todo/, info/, and remote/.

Below is a diagram that shows how the message gets handled by qmail-queue during the various message “stages”. Next to each folder I also noted which program controls the message at that particular point in time.

pid/111 --  (S1)  # qmail-queue
          \_ mess/111 (S2)  # qmail-queue
                          |
                          |
                      _ intd/111 (S3)  # qmail-queue
                     /
          todo/111 -- (S4)  # qmail-queue
              |
              |
          info/111 -- local/111  (S4 - S5)  # qmail-send
              |
              |
          remote/111 (S4 - S5)  # qmail-send

Key:
# qmail-send --> responsible for this part of queue
# qmail-queue --> responsible for this part of queue
S1 -->  -mess -intd -todo -info -local -remote -bounce
S2 --> +mess -intd -todo -info -local -remote -bounce
S3 --> +mess +intd -todo -info -local -remote -bounce
S4 --> +mess ?intd +todo ?info ?local ?remote -bounce (queued)
S5 --> +mess -intd -todo +info ?local ?remote ?bounce (preprocessed)

Here are all possible states for a message. + means a file exists; - means it does not exist; ? means it may or may not exist in that folder

It is also well documented in the qmail src file called INTERNALS which explains it better than I can!

Short and sweet overview of qmail-queue: It is responsible for writing the message to the queue.

4. qmail-send –> This takes the message from qmail-queue and passes the message either to qmail-rspawn (for remote delivery) or it sends the message to qmail-lspawn (for local delivery)

5a. qmail-rspawn (remote delivery) –> This sends the message to the remote mail server (ie. yahoo.com)
– qmail-remote –> This transmits the message to the remote mail server.

5b. qmail-lspawn (local delivery) –> This sends the message for local delivery.
– qmail-local –> This passes the message off to a local delivery agent. It reads the users .qmail-default first, which basically just tells qmail-local that vdelivermail is going to handle delivery.

– vdelivermail -> This delivers the mail to the local users. It locates the users Maildir and passes the message off to preline. (preline passes teh mail to other filters or commands) In the users maildir, search for a .qmail file that relates to the user and cat the file. If a cat .qmail-USERNAME doesn’t exist, then it defaults to .qmail-default, which tells it to send the message to procmail.

– procmail -> Procmail performs the mail filtering and local delivery to the mailbox. In procmail, this is where you can send the message to spamassassin or another filtering agent for processing before delivery.

– spamassassin -> Spam filtering. Take special note of the spamassassin versions and permissions, and the spamd vs spamc methods.

Below are qmail’s configuration files, found within /var/qmail/control:

Control file                    Purpose

badmailfrom             blacklisted From addresses
bouncefrom              username of bounce sender
bouncehost              hostname of bounce sender
concurrencyincoming     max simultaneous incoming SMTP connections
concurrencylocal        max simultaneous local deliveries
concurrencyremote       max simultaneous remote deliveries
defaultdomain           domain name
defaulthost             host name
databytes               max number of bytes in message (0=no limit)
doublebouncehost        host name of double bounce sender
doublebounceto          user to receive double bounces
locals                  domains that we deliver locally
morercpthosts           secondary rcpthosts database
queuelifetime           seconds a message can remain in queue
rcpthosts               domains that we accept mail for
smtproutes              artificial SMTP routes
timeoutconnect          how long, in seconds, to wait for SMTP connection
timeoutremote           how long, in seconds, to wait for remote server
timeoutsmtpd            how long, in seconds, to wait for SMTP client
virtualdomains          virtual domains and users

How to read the Qmail logs

This is a quick guide on how to read the Qmail logs.

All message activity is written to /var/log/qmail/current. There can be a lot of information here, so lets break it down line by line. Below is a snippet from /var/log/qmail/current. I added numbers on the left hand side for the sake of learning.

1.  @40000000461d81581ec60f34 new msg 5915497
2.  @40000000461d81581eda8194 info msg 5915497: bytes 22122 from <[email protected]> qp 99024 uid 89
3.  @40000000461d8158214e737c starting delivery 4088258: msg 5915497 to local myuser@localhost
4.  @40000000461d81582155ca64 status: local 2/10 remote 0/60
5.  @40000000461d815824de127c delivery 4088258: success: did_1+0+0/
6.  @40000000461d815824f572dc end msg 5915497

Holy @#$%!, what does all this mean? Here it is, line by line:

1. This indicates that a new message has entered the queue. It is denoted by number: 5915497
2. This tells us where the message was from. In this case: [email protected]
3. Here, we see that the message is trying to deliver to a local user: myuser@localhost. Note the delivery sub id number: 4088258
4. This tells us what the queue volume is like. Not important at this moment.
5. This lets us know the message was delivered successfully to myuser@localhost. Note again the delivery sub id number: 4088258
6. Now qmail says, okay, the message denoted by the number: 5915497 is complete.

So when looking at the logs, first locate a from address or destination address in the logs. Once that is found, find the message id number (should be up 1 or 2 lines) and once you find that, you can discover what the message is doing when in the queue.