IO Scheduler tuning

What is an I/O scheduler? The I/O scheduler is a kernel level tunable whose purpose is to optimize disk access requests. Traditionally this is critical for spinning disks as I/O requests can be grouped together to avoid “seeking”.

Different I/O schedulers have their pro’s and con’s, so choosing which one to use depends on the type of environment and workload. There is no one right I/O scheduler to use, it all simply ‘depends’. Benchmarking your application before and after the I/O scheduler change is usually your best indicator. The good news is, the I/O scheduler can be changed at run time and can be configured to persist after reboots.

The three common I/O schedulers are:
– noop
– deadline
– cfq

noop

The noop I/O scheduler is optimized for systems that don’t need an I/O scheduler such as VMware, AWS EC2, Google Cloud, Rackspace public cloud, etc. Since the hypervisor already controls the I/O scheduling, it doesn’t make sense for the VM to waste CPU cycles on it. The noop I/O scheduler simply works as a FIFO (First In First Out) queue.

You can update the I/O scheduler to noop by:

## CentOS 6

# Change at runtime
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq] 
[root@db01 ~]# echo 'noop' > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
[noop] anticipatory deadline cfq

# Change at boot time by appending 'elevator=noop' to end of kernel line:
[root@db01 ~]# vim /boot/grub/grub.conf
kernel /vmlinuz-2.6.9-67.EL ro root=/dev/vg0/lv0 elevator=noop


## CentOS 7

# Change at run time
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq] 
[root@db01 ~]# echo 'noop' > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
[noop] anticipatory deadline cfq

# Change at boot time by appending 'elevator=noop' end of the following line, then rebuild the grub config:
[root@db01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel00/root rd.lvm.lv=rhel00/swap elevator=noop"
...
[root@db01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg


## Ubuntu 14.04

# Change at runtime
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
[root@db01 ~]# echo noop > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
[noop] deadline cfq

# Change at boot time by appending 'elevator=noop' end of the following line, then rebuild the grub config:
[root@db01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="elevator=noop"
...
[root@db01 ~]# grub-mkconfig -o /boot/grub/grub.cfg


## Ubuntu 16.04

# Change at runtime
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
[root@db01 ~]# echo noop > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
[noop] deadline cfq

# Change at boot time by appending 'elevator=noop' end of the following line, then rebuild the grub config:
[root@db01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="elevator=noop"
...
[root@db01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg

deadline

The deadline I/O scheduler is optimized by default for read heavy workloads like MySQL. It attempts to optimize I/O request by putting it in a read queue or write queue and assigning a timestamp to the request. For requests in the read queue, they have 500ms (by default) to execute before they are given the highest priority to run. Requests entering the write queue have 5000ms to execute before they are given the highest priority to run.

This deadline assigned to each I/O request is what makes deadline I/O scheduler optimal for read heavy workloads like MySQL.

You can update the I/O scheduler to deadline by:

## CentOS 6

# Change at runtime
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq] 
[root@db01 ~]# echo 'deadline' > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq

# Change at boot time by appending 'elevator=deadline' to end of kernel line apply the changes to grub:
[root@db01 ~]# vim /boot/grub/grub.conf
kernel /vmlinuz-2.6.9-67.EL ro root=/dev/vg0/lv0 elevator=deadline


## CentOS 7

# Change at run time
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq] 
[root@db01 ~]# echo 'deadline' > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq

# Change at boot time by appending 'elevator=deadline' end of the following line and apply the changes to grub:
[root@db01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel00/root rd.lvm.lv=rhel00/swap elevator=deadline"
...
[root@db01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg


# Ubuntu 14.04

# Change at runtime
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
[root@db01 ~]# echo deadline > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq

# Change at boot time by appending 'elevator=deadline' end of the following line apply the changes to grub:
[root@db01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="elevator=deadline"
...
[root@db01 ~]# grub-mkconfig -o /boot/grub/grub.cfg


# Ubuntu 16.04

# Change at runtime
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]
[root@db01 ~]# echo deadline > /sys/block/sda/queue/scheduler
[root@db01 ~]# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq

# Change at boot time by appending 'elevator=deadline' end of the following line apply the changes to grub:
[root@db01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="elevator=deadline"
...
[root@db01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg

cfg

The cfg I/O scheduler is probably best geared towards things running GUIs (like a desktop) where each process needs a fast response. The goal of the cfq I/O scheduler (Complete Fairness Queueing) is to give a fair allocation of disk I/O bandwidth for all the processes which requests an I/O operation.

You can update the I/O scheduler to cfq by:

## CentOS 6

# Change at runtime
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq 
[root@server01 ~]# echo 'cfq' > /sys/block/sda/queue/scheduler
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]

# Change at boot time by appending 'elevator=cfq' to end of kernel line apply the changes to grub:
[root@server01 ~]# vim /boot/grub/grub.conf
kernel /vmlinuz-2.6.9-67.EL ro root=/dev/vg0/lv0 elevator=cfq


## CentOS 7

# Change at run time
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory [deadline] cfq 
[root@server01 ~]# echo 'cfg' > /sys/block/sda/queue/scheduler
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop anticipatory deadline [cfq]

# Change at boot time by appending 'elevator=cfq' end of the following line and apply the changes to grub:
[root@server01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=rhel00/root rd.lvm.lv=rhel00/swap elevator=cfq"
...
[root@server01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg


# Ubuntu 14.04

# Change at runtime
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
[root@server01 ~]# echo cfq > /sys/block/sda/queue/scheduler
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

# Change at boot time by appending 'elevator=cfq' end of the following line apply the changes to grub:
[root@server01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="elevator=cfq"
...
[root@server01 ~]# grub-mkconfig -o /boot/grub/grub.cfg


# Ubuntu 16.04

# Change at runtime
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
[root@server01 ~]# echo cfq > /sys/block/sda/queue/scheduler
[root@server01 ~]# cat /sys/block/sda/queue/scheduler
noop deadline [cfq]

# Change at boot time by appending 'elevator=cfq' end of the following line apply the changes to grub:
[root@server01 ~]# vim /etc/default/grub
...
GRUB_CMDLINE_LINUX="elevator=cfq"
...
[root@server01 ~]# grub2-mkconfig -o /boot/grub2/grub.cfg

As with any performance tuning recommendations, there is never a one size fits all solution! Always benchmark your application to establish a baseline before you make the change. After the performance changes have been made, run the same benchmark and compare the results to ensure that they had the desired outcomes.

Disabling Transparent Huge Pages in Linux

Transparent Huge Pages (THP) is a Linux memory management system that reduces the overhead of Translation Lookaside Buffer (TLB) lookups on machines with large amounts of memory by using larger memory pages.

However, database workloads often perform poorly with THP, because they tend to have sparse rather than contiguous memory access patterns. The overall recommendation for MySQL, MongoDB, Oracle, etc is to disable THP on Linux machines to ensure best performance.

You can check to see if THP is enabled or not by running:

[root@db01 ~]# cat /sys/kernel/mm/transparent_hugepage/enabled
[always] madvise never
[root@db01 ~]# cat /sys/kernel/mm/transparent_hugepage/defrag
[always] madvise never

If the result shows [never], then THP is disabled. However if the result shows [always], then THP is enabled.

You can disable THP at runtime on CentOS 6/7 and Ubuntu 14.04/16.04 by running:

[root@db01 ~]# echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled
[root@db01 ~]# echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag

However once the system reboots, it will go back to its default value again. To make the setting persistent on CentOS 7 and Ubuntu 16.04, you can disable THP on system startup by making a systemd unit file:

# CentOS 7 / Ubuntu 16.04:
[root@db01 ~]# vim /etc/systemd/system/disable-thp.service
[Unit]
Description=Disable Transparent Huge Pages (THP)

[Service]
Type=simple
ExecStart=/bin/sh -c "echo 'never' > /sys/kernel/mm/transparent_hugepage/enabled && echo 'never' > /sys/kernel/mm/transparent_hugepage/defrag"

[Install]
WantedBy=multi-user.target

[root@db01 ~]# systemctl daemon-reload
[root@db01 ~]# systemctl start disable-thp
[root@db01 ~]# systemctl enable disable-thp

On CentOS 6 and Ubuntu 14.04, you can disable THP on system startup by adding the following to /etc/rc.local. If this is on Ubuntu 14.04, make sure its added before the ‘exit 0’:

# CentOS 6 / Ubuntu 14.04
[root@db01 ~]# vim /etc/rc.local
...
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
   echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
   echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
...

Creating table indexes in MySQL

You may ask, what is a table index and how will it help performance? Table indexes provide MySQL a more efficient way to retrieve records. I often like to use the following example to explain it:

Imagine you have a phone book in front of you, and there are no letters in the top right corner that you can reference if you are looking up a last name. Therefore, you have to search page by page through hundreds of pages that have tens of thousands of results. Very inefficient and intensive. Think of this as a full table scan.

Now picture the phone book having the letter references in the top right corner. You can flip right to section “La – Lf” and only have to search through a smaller result set. The time to find the results is must faster and easier.

Common symptoms where this logic can be applied is when you log onto a server and see MySQL frequently chewing up a lot of CPU time, either constantly, or in spikes. The slow-query-log is also a great indicator cause if the query is taking a long time to execute, chances are it was because the query was making MySQL work too hard performing full table scans.

The information below will provide you with the tools to help identify these inefficient queries and how to help speed them up.

There are 2 common ways to identify queries that are very inefficient and may be creating CPU contention issues:

View MySQL’s process list:

When entering into MySQL CLI, you will want to look for any queries that you see that are often running to evaluate. You can see the queries by:

mysql
show processlist;

View slow queries log:

To view this, first check to ensure the slow-query-log variables are enabled in the my.cnf:

log-slow-queries=/var/lib/mysqllogs/slow-log
long_query_time=5

Now, lets look at an example of a slow query that got logged. Please note, these queries got logged here cause they took longer to run then the max seconds defined on long_query_time:

# Time: 110404 22:45:25
# User@Host: wpadmin[wordpressdb] @ localhost []
# Query_time: 14.609104  Lock_time: 0.000054 Rows_sent: 4  Rows_examined: 83532
SET timestamp=1301957125;
SELECT * FROM wp_table WHERE `key`='5544dDSDFjjghhd2544xGFDE' AND `carrier`='13';

Here is a query that we know is know runs often, and takes over 5 seconds to execute:

SELECT * FROM wp_table WHERE `key`='5544dDSDFjjghhd2544xGFDE' AND `carrier`='13';

Within the MySQL cli, run the following to view some more details about this query:

explain SELECT * FROM wp_table WHERE `key`='5544dDSDFjjghhd2544xGFDE' AND `carrier`='13';
+----+-------------+------------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | wp_table   | ALL  | NULL          | NULL |    NULL | NULL | 83532 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+-------+-------------+

The 2 important fields here are:

- Type: When you see "ALL", MySQL is performing a full table scan which is a very CPU intensive operation.
- Row: This is the total amount of rows returned in the table, so 83,000 results is a lot to sort through.

In general, when you are creating an index, you want to choose a field that has the highest amount of unique characters. In our case, we are going to use the field ‘key’ as shown below:

create index key_idx on wp_table(`key`);

Now, lets rerun our explain to see if the query is now returning less rows:

explain SELECT * FROM wp_table WHERE `key`='5544dDSDFjjghhd2544xGFDE' AND `carrier`='13';
+----+-------------+------------+------+---------------+------+---------+------+-------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows  | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+-------+-------------+
|  1 | SIMPLE      | wp_table   | ALL  | NULL          | NULL |    NULL | NULL | 13    | Using where |
+----+-------------+------------+------+---------------+------+---------+------+-------+-------------+

This is much better. Now each time that common query runs, MySQL will only have to go through 13 rows, instead of it having to check through 83,000.

Important note: Each time a table is updated, MySQL has to update the indexes, which could create some performance issues. Therefore, its recommended to keep the amount of indexes per table low, perhaps in the 4-6 range.

How to see what indexes already exist on a table and their cardinality:

show indexes from wp_table;

How to remove a table index:

delete index key_idx from wp_table;