Today's practical use-case for Performance Schema

Today's blog post starts with a question:

"In my continuous integration environment, running tests currently takes 30 minutes. I
believe that a large percentage of time is spent in MySQL, and I want to know if putting MySQL
on tmpfs will cut that time down?".

I have previously written about how to configure MySQL to be less durable in testing environments, but today I wanted to write about how we can really test the assumption that the question is based around. That is to say:

  • We know that total time is 30 minutes.
  • We assume MySQL is a large percentage of this.
  • We know that this optimization will potentially allow us to avoid IO.
  • We then want to know what amount of time MySQL spends waiting on IO during our test suite?

And the answer to this is performance schema.

Performance Schema instruments File IO with statistics such as bytes read/written, and wait time. So if we see that IO wait time was only a small percentage of the 30 minutes of test time, we automatically know if this optimization is worth investigating or not.

Here is a simple query that will work on 5.6 out of the box:

mysql> SELECT
 `a`.`EVENT_NAME`,
 `a`.`SUM_TIMER_WAIT`/1024/1024/1024 AS `total_latency_ms`,
 `b`.`SUM_NUMBER_OF_BYTES_READ` AS `total_bytes_read`,
 `b`.`SUM_NUMBER_OF_BYTES_WRITE` AS `total_bytes_written`
FROM
 `performance_schema`.`events_waits_summary_global_by_event_name` `a`
 INNER JOIN `performance_schema`.`file_summary_by_event_name` `b` USING (event_name)
WHERE
 `a`.`EVENT_NAME` like 'wait/io/file/%'
 AND `a`.`COUNT_STAR` > 0\G
*************************** 1. row ***************************
         EVENT_NAME: wait/io/file/sql/casetest
   total_latency_ms: 0.245716745034
   total_bytes_read: 0
total_bytes_written: 0
*************************** 2. row ***************************
         EVENT_NAME: wait/io/file/sql/ERRMSG
   total_latency_ms: 0.911862457171
   total_bytes_read: 58982
total_bytes_written: 0
*************************** 3. row ***************************
         EVENT_NAME: wait/io/file/sql/FRM
   total_latency_ms: 50.292402482592
   total_bytes_read: 137771
total_bytes_written: 0
*************************** 4. row ***************************
         EVENT_NAME: wait/io/file/sql/global_ddl_log
   total_latency_ms: 0.012274521403
   total_bytes_read: 0
total_bytes_written: 0
*************************** 5. row ***************************
         EVENT_NAME: wait/io/file/sql/pid
   total_latency_ms: 0.218277866021
   total_bytes_read: 0
total_bytes_written: 5
*************************** 6. row ***************************
         EVENT_NAME: wait/io/file/mysys/charset
   total_latency_ms: 0.450179565698
   total_bytes_read: 18316
total_bytes_written: 0
*************************** 7. row ***************************
         EVENT_NAME: wait/io/file/mysys/cnf
   total_latency_ms: 0.315982563421
   total_bytes_read: 56
total_bytes_written: 0
*************************** 8. row ***************************
         EVENT_NAME: wait/io/file/myisam/dfile
   total_latency_ms: 0.885083482601
   total_bytes_read: 2322
total_bytes_written: 0
*************************** 9. row ***************************
         EVENT_NAME: wait/io/file/myisam/kfile
   total_latency_ms: 3.881758853793
   total_bytes_read: 7058
total_bytes_written: 0
*************************** 10. row ***************************
         EVENT_NAME: wait/io/file/innodb/innodb_data_file
   total_latency_ms: 114.101030502468
   total_bytes_read: 10223616
total_bytes_written: 49152
*************************** 11. row ***************************
         EVENT_NAME: wait/io/file/innodb/innodb_log_file
   total_latency_ms: 2.865770795383
   total_bytes_read: 69632
total_bytes_written: 3072
11 rows in set (0.01 sec)

Taking it one step further and combining with ps_helper, I think there is a real opportunity to build this metrics collection into the test suite:

/* start by reseting all metrics */
ps_helper.truncate_all();

/* run test suite */

/* Collect test suite metrics from MySQL */
SELECT * FROM ps_helper.wait_classes_global_by_latency;

Here is an example for wait_classes_global_by_latency from the GitHub project:

mysql> select * from wait_classes_global_by_latency;
+-------------------+--------------+---------------+-------------+-------------+-------------+
| event_class       | total_events | total_latency | min_latency | avg_latency | max_latency |
+-------------------+--------------+---------------+-------------+-------------+-------------+
| wait/io/file      |       550470 | 46.01 s       | 19.44 ns    | 83.58 µs    | 4.21 s      |
| wait/io/socket    |       228833 | 2.71 s        | 0 ps        | 11.86 µs    | 29.93 ms    |
| wait/io/table     |        64063 | 1.89 s        | 99.79 ns    | 29.43 µs    | 68.07 ms    |
| wait/lock/table   |        76029 | 47.19 ms      | 65.45 ns    | 620.74 ns   | 969.88 µs   |
| wait/synch/mutex  |       635925 | 34.93 ms      | 19.44 ns    | 54.93 ns    | 107.70 µs   |
| wait/synch/rwlock |        61287 | 7.62 ms       | 21.38 ns    | 124.37 ns   | 34.65 µs    |
+-------------------+--------------+---------------+-------------+-------------+-------------+

Neat, huh?

The MySQL error log in MySQL 5.7

The MySQL error log has received some attention in MySQL 5.7, with a new setting called log_error_verbosity.

There are three possible values, as documented in the manual:

Verbosity Value Message Types Logged
1 Errors only
2 Errors and warnings
3 Errors, warnings, and notes (default)

As Giuseppe has written about previously, writing notes or "informational events" can create debugging problems because they reduce the signal to noise ratio. There is now an easy way to reduce the logging to be only warnings and errors!

What is a mutex anyway?

While I was working on the MySQL Workbench booth at MySQL Connect this year, an attendee came up and asked what a mutex was and how important it is to them in their function as a DBA.

It's a good question. The short answer is:

Mutexes are mutually exclusive locks and are designed to prevent concurrent access to resources that in doing so may result in unsafe conditions.

They are required as part of virtually all multi-threaded applications (not just MySQL), and are not something that you can specifically prevent from happening. This is a different concept to row-locks and table locks.

Some examples

If the InnoDB buffer pool is full, internally free space needs to be created in order to be able to load your desired page into memory. What you don't want, is in between freeing and loading somebody else to take your desired free-slot.

When a page becomes too full, it may need to split data into other pages. It is possible that there could be a thread deadlock if two pages left and right of each-other reach this condition at once and can't decide who should proceed first.

Should you care as a DBA?

Yes, and no.

No in the sense that you are not expected to fix these problems, they are usually very internal to the operations of the database and require a MySQL developer to fix them.

Yes in the sense that newer MySQL versions often change the way internal locking works if a mutex becomes hot for a large number of users. Hot is the terminology used to describe a mutex which has other threads blocked waiting for it for a large amount of time. This contention reduces scalability on multi-core machines because some threads sit idle, essentially waiting for the ablity to run.

Hot Mutex History

Here is an incomplete list of some of the mutexes worked on in newer versions of MySQL:

MySQL 5.7:

  • Index lock contention is reduced via introduction of a new SX lock mode. This improves performance of large tables.

  • The LOCK_thread_count mutex protected several independent internal server structures and variables, and was a bottleneck, particularly affecting server performance in the circumstance when many clients were connecting and disconnecting at once.

MySQL 5.6:

  • The kernel mutex is split into smaller, localized locks. This mutex previous covered many different internal tasks to InnoDB.
  • LOCK_open mutex is split - this mutex used to be a mutex to protect the metadata of a table within the MySQL Server.
  • The default configuration was changed, so that InnoDB has 8 buffer pool instances when the buffer pool is > 1GB.

MySQL 5.5:

How can you see if a mutex is hot?

Various metadata commands can expose mutexes that are hot. For example: SHOW ENGINE INNODB STATUS, SHOW ENGINE INNODB MUTEX and PERFORMANCE_SCHEMA. Operating specific tools can also be used, such as oprofile or Linux Perf.

Here is an example from SHOW ENGINE INNODB MUTEX (via dimitrik):

mysql> show engine innodb mutex;
+--------+----------------------------+-------------------+
| Type   | Name                       | Status            |
+--------+----------------------------+-------------------+
| InnoDB | log/log0log.c:775          | os_waits=26       |
| InnoDB | log/log0log.c:771          | os_waits=1        |
| InnoDB | buf/buf0buf.c:1208         | os_waits=3219     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=6990     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=4619     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=5627     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=7873     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=4466     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=16929    |
| InnoDB | buf/buf0buf.c:1208         | os_waits=19305    |
| InnoDB | buf/buf0buf.c:1208         | os_waits=16301962 |
| InnoDB | buf/buf0buf.c:1208         | os_waits=11649    |
| InnoDB | buf/buf0buf.c:1208         | os_waits=950471   |
| InnoDB | buf/buf0buf.c:1208         | os_waits=6545     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=4262     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=5642     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=7878     |
| InnoDB | buf/buf0buf.c:1208         | os_waits=387166   |
| InnoDB | fil/fil0fil.c:1559         | os_waits=1265     |
| InnoDB | srv/srv0srv.c:987          | os_waits=460452   |
| InnoDB | combined buf/buf0buf.c:900 | os_waits=38503    |
| InnoDB | log/log0log.c:832          | os_waits=184      |
| InnoDB | combined buf/buf0buf.c:901 | os_waits=77       |
+--------+----------------------------+-------------------+
23 rows in set (0.56 sec)

Here is an example of SHOW ENGINE INNODB STATUS (via the MySQL forums):

..
---------- 
SEMAPHORES 
---------- 
OS WAIT ARRAY INFO: reservation count 334026866, signal count 322155474 
--Thread 139787989395216 has waited at ../../../storage/innobase/trx/trx0undo.c line 1684 for 165.00 seconds the semaphore: 
Mutex at 0x7f23518408c0 created file ../../../storage/innobase/trx/trx0rseg.c line 147, lock var 1 
waiters flag 1 
--Thread 139788009264912 has waited at ../../../storage/innobase/btr/btr0cur.c line 463 for 165.00 seconds the semaphore: 
S-lock on RW-latch at 0x7f230fbd2b48 created in file ../../../storage/innobase/buf/buf0buf.c line 550 
a writer (thread id 139787989395216) has reserved it in mode exclusive 
number of readers 0, waiters flag 1 
Last time read locked in file ../../../storage/innobase/btr/btr0cur.c line 463 
Last time write locked in file ../../../storage/innobase/btr/btr0cur.c line 463 
--Thread 139787997423376 has waited at ../../../storage/innobase/trx/trx0roll.c line 921 for 15.00 seconds the semaphore: 
Mutex at 0x7f23518408c0 created file ../../../storage/innobase/trx/trx0rseg.c line 147, lock var 1 
waiters flag 1 
--Thread 139788384470800 has waited at ../../../storage/innobase/btr/btr0cur.c line 463 for 15.00 seconds the semaphore: 
X-lock on RW-latch at 0x7f230f558038 created in file ../../../storage/innobase/buf/buf0buf.c line 550 
a writer (thread id 139788006655760) has reserved it in mode exclusive 
number of readers 0, waiters flag 1 
Last time read locked in file ../../../storage/innobase/btr/btr0cur.c line 463 
Last time write locked in file ../../../storage/innobase/btr/btr0cur.c line 463 
..

And one for PERFORMANCE_SCHEMA (via Mark Leith):

SELECT EVENT_NAME,
 SUM_TIMER_WAIT/1000000000 WAIT_MS,
 COUNT_STAR
FROM performance_schema.EVENTS_WAITS_SUMMARY_GLOBAL_BY_EVENT_NAME 
ORDER BY SUM_TIMER_WAIT DESC, COUNT_STAR DESC LIMIT 30;
+---------------------------------------------------------+-------------+------------+
| EVENT_NAME                                              | WAIT_MS     | COUNT_STAR |
+---------------------------------------------------------+-------------+------------+
| wait/io/file/innodb/innodb_data_file                    | 150562.0345 |     182750 |
| wait/io/file/innodb/innodb_log_file                     |  77795.5425 |       8913 |
| wait/synch/rwlock/innodb/checkpoint_lock                |  11636.7894 |       1284 |
| wait/synch/rwlock/innodb/btr_search_latch               |   7429.2204 |   39677465 |
| wait/io/file/myisam/kfile                               |   7353.2737 |       2011 |
| wait/synch/mutex/innodb/kernel_mutex                    |   4115.0534 |   59198510 |
| wait/synch/mutex/innodb/fil_system_mutex                |   3473.8341 |    1263809 |
| wait/io/file/myisam/dfile                               |   2940.5701 |        641 |
| wait/synch/mutex/innodb/buf_pool_mutex                  |   2650.8666 |   33982979 |
| wait/synch/mutex/innodb/rw_lock_mutex                   |   1261.8025 |   18877546 |
| wait/io/file/sql/FRM                                    |    116.6419 |        863 |
| wait/synch/cond/sql/COND_thread_count                   |      1.4073 |          1 |
| wait/io/file/sql/pid                                    |      1.2654 |          3 |
| wait/synch/mutex/innodb/mutex_list_mutex                |      0.7675 |      16727 |
| wait/synch/mutex/innodb/rw_lock_list_mutex              |      0.4023 |       8331 |
| wait/io/file/sql/dbopt                                  |      0.2745 |         12 |
| wait/io/file/sql/casetest                               |      0.2041 |          5 |
| wait/synch/mutex/innodb/thr_local_mutex                 |      0.2009 |       2050 |
| wait/synch/mutex/mysys/THR_LOCK_open                    |      0.1993 |        989 |
| wait/synch/rwlock/innodb/trx_purge_latch                |      0.1436 |        255 |
| wait/io/file/sql/ERRMSG                                 |      0.1432 |          5 |
| wait/synch/rwlock/sql/LOCK_grant                        |      0.1375 |        188 |
| wait/synch/rwlock/sql/MDL_lock::rwlock                  |      0.1013 |        481 |
| wait/synch/mutex/sql/Query_cache::structure_guard_mutex |      0.0923 |        628 |
| wait/synch/mutex/mysys/THR_LOCK_myisam                  |      0.0781 |        283 |
| wait/synch/rwlock/sql/Query_cache_query::lock           |      0.0676 |        198 |
| wait/io/file/mysys/charset                              |      0.0561 |          3 |
| wait/synch/mutex/sql/LOCK_open                          |      0.0529 |        543 |
| wait/synch/mutex/mysys/THR_LOCK_lock                    |      0.0504 |        244 |
| wait/synch/mutex/innodb/srv_innodb_monitor_mutex        |      0.0424 |         75 |
+---------------------------------------------------------+-------------+------------+
30 rows in set (0.00 sec)

Estimating MySQL's Working Set with information_schema

In an earlier post, I wrote about when MySQL loads data in and out of cache. At the end of this post, I talked about the concept of page churn, and having enough memory for your working set:

"If you do not have enough memory to hold all of your working set in memory, what will happen is that the buffer pool will start juggling as it churns pages out of memory only to load them back in soon after. A small amount of this is probably okay, but it can start to degrade performance. The traditional rule of thumb is called "the 5 minute rule". This means that if you load a page into memory and are going to need it again in the next five minutes - it should not need to be churned out."

To be able to calculate this in a precise way we would need to collect:
- The current number of pages in the buffer pool.
- The unique pages loaded in over a 5 minute period.

MySQL doesn't have a hook to be able to tell when a page is loaded in memory, but as of MySQL 5.6 it does have an information_schema table called INNODB_BUFFER_PAGE. So while we will not be able to get the exacty number, polling this table on a frequent enough interval and storing the unique results (space_id + page_id) to a temporary table should get close.

So I wrote a stored procedure estimate_working_set.sql to do exactly this. It accepts two arguments; sleep time and iterations. Here is an example ~5 minute observation on my laptop:

mysql> call test.estimate_working_set(10, 30);
+----------+
| progress |
+----------+
| 1/30     |
+----------+
1 row in set (10.13 sec)

+----------+
| progress |
+----------+
| 2/30     |
+----------+
1 row in set (21.19 sec)

.. lines ommitted for brevity ..

+----------+
| progress |
+----------+
| 29/30    |
+----------+
1 row in set (5 min 41.97 sec)

+----------+
| progress |
+----------+
| 30/30    |
+----------+
1 row in set (5 min 54.72 sec)

+----------------------+
| pages_in_working_set |
+----------------------+
|               100679 |
+----------------------+
1 row in set (5 min 55.61 sec)

Query OK, 0 rows affected (5 min 55.71 sec)

So in my case my working set is 100679 pages. I am not using compressed pages so each page is 16 KiB, or divide by 64 to convert to MiB. 100679/64 = 1573MB working set. I have a 128M buffer pool - so I desperately need to increase it / add more memory.

Warning! This stored procedure will be expensive on a server with a large buffer pool. Use at your own risk!