Percona Live Highlights

Last week marked my first April-MySQL-Conf since 2009, and now that I’m back home I wanted to reflect on some of my personal highlights.

(Percona Live keynote. Photo credit: @miguel2angel)

  • Tomas’ keynote on Wednesday morning was awesome. It felt great to be part of the team at Oracle that announced a release 2x faster than MySQL 5.6 and 3x faster than MySQL 5.5. The Parser+Optimizer+InnoDB GIS labs release is also a great sign of things to come.
  • Hosting the “Meet the MySQL Team @ Oracle” BOF on Wednesday night, and answering community questions with my colleagues on our engineering team. It was a pleasant surprise to have Mark Callaghan (Facebook) and Peter Zaitsev (Percona) come and join in as well.
  • Attending some of my colleagues’ sessions and collating feedback from attendees. In Luis Soares’ “What’s New in Replication” session, we received a couple of great suggestions which we will certainly investigate in more detail.
  • Learning how to use new tools. For me, this will be the R programming language (something I had heard of many times before, but never taken a closer look at). It was also great to see MySQL Workbench’s Visual Explain in Bill Karwin’s query optimization talk.
  • Seeing our ecosystem expand. The two new technologies that I am most excited about are VividCortex and WebScaleSQL. It is a great endorsement of MySQL 5.6 to see WebScaleSQL based on it.

I’m back home this week, then on to vacation next week!

Configuring MySQL to use minimal memory

I’ve been experimenting with MySQL Fabric – the official tool in development to provide HA and sharding. Part of this experimentation has meant that I have been running 7 copies of the MySQL Server on my laptop at once, which can consume a lot of memory 🙂

So today I wanted to write about how to configure MySQL to use less memory, which may also be beneficial in situations where peak performance is not important. For example: a developer VM environment.

Previous configuration guides

Before I get started, let me point to some of my previous configuration guides:

Configuring MySQL to use minimal memory

The MySQL defaults have to balance performance with what is considered reasonable for what may be a development system with other applications needing to run alongside MySQL. In many cases this means 4-8GB of RAM, but on virtual machines (or in my case, with 7 copies of mysqld running), there is a lot less available.

Obligatory warning: If you are running these settings on a machine with 1GB+ RAM, you should expect worse performance when compared to the defaults.

Setting                      Default        Minimum
innodb_buffer_pool_size      128M           5M
innodb_log_buffer_size       1M             256K
query_cache_size             1M             0
max_connections              151            1 (although 10 might be more reasonable)
key_buffer_size              8388608        8
thread_cache_size            (autosized)    0
host_cache_size              (autosized)    0
innodb_ft_cache_size         8000000        1600000
innodb_ft_total_cache_size   640000000      32000000
thread_stack                 262144         131072
sort_buffer_size             262144         32K
read_buffer_size             131072         8200
read_rnd_buffer_size         262144         8200
max_heap_table_size          16777216       16K
tmp_table_size               16777216       1K
bulk_insert_buffer_size      8388608        0
join_buffer_size             262144         128
net_buffer_length            16384          1K
innodb_sort_buffer_size      1M             64K
binlog_cache_size            32K            4K
binlog_stmt_cache_size       32K            4K

(Note: you may be able to save a little more memory by disabling InnoDB or performance schema, but I don’t recommend these two steps.)

To summarize these changes:

# /etc/my.cnf:
innodb_buffer_pool_size=5M
innodb_log_buffer_size=256K
query_cache_size=0
max_connections=10
key_buffer_size=8
thread_cache_size=0
host_cache_size=0
innodb_ft_cache_size=1600000
innodb_ft_total_cache_size=32000000

# per thread or per operation settings
thread_stack=131072
sort_buffer_size=32K
read_buffer_size=8200
read_rnd_buffer_size=8200
max_heap_table_size=16K
tmp_table_size=1K
bulk_insert_buffer_size=0
join_buffer_size=128
net_buffer_length=1K
innodb_sort_buffer_size=64K

# settings that relate to the binary log (if enabled)
binlog_cache_size=4K
binlog_stmt_cache_size=4K
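
As a quick sanity check after restarting with this configuration, you can query a sample of the values back from the running server (these are standard server variables; any client will do):

-- confirm the running server picked up the new settings
SELECT @@innodb_buffer_pool_size AS buffer_pool_bytes,
       @@max_connections AS max_connections,
       @@tmp_table_size AS tmp_table_size;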

Plan to remove unused command line programs (10 out of 43)

Have you ever noticed the vast number of command line programs that ship with MySQL? Here is an ls from MySQL 5.6.16’s bin directory:

morgo@Rbook:~/Documents/mysql-installs/5.6.16/bin$ ls
innochecksum
msql2mysql
my_print_defaults
myisam_ftdump
myisamchk
myisamlog
myisampack
mysql
mysql_client_test
mysql_client_test_embedded
mysql_config
mysql_config_editor
mysql_convert_table_format
mysql_embedded
mysql_find_rows
mysql_fix_extensions
mysql_plugin
mysql_secure_installation
mysql_setpermission
mysql_tzinfo_to_sql
mysql_upgrade
mysql_waitpid
mysql_zap
mysqlaccess
mysqlaccess.conf
mysqladmin
mysqlbinlog
mysqlbug
mysqlcheck
mysqld
mysqld-debug
mysqld_multi
mysqld_safe
mysqldump
mysqldumpslow
mysqlhotcopy
mysqlimport
mysqlshow
mysqlslap
mysqltest
mysqltest_embedded
perror
replace
resolve_stack_dump
resolveip

For MySQL 5.7 we are planning to clean up and remove programs that appear to have no current use case. This helps us improve the supportability of MySQL, as users will not accidentally find themselves using a utility that has not historically been well maintained, or has outlived its original purpose. In the case of shell scripts, removal also helps us offer a better cross-platform experience.

The list scheduled for removal

(Hat tip to Todd Farmer, who first wrote about removing these programs last year.)

Update March 12th: resolve_stack_dump has been removed from the list.

Feedback Welcome

Are you currently using one of these utilities scheduled for removal?
Did we get a decision wrong?
We value your feedback. Please leave a comment, or get in touch!

A followup on SHOW ENGINE INNODB MUTEX

EDIT: This post has been updated here. SHOW ENGINE INNODB MUTEX will return in MySQL 5.7.8.


In November, I wrote about the command SHOW ENGINE INNODB MUTEX and sought feedback on how it continues to be used following the introduction of performance_schema.

The outcome from this feedback is that the MySQL team has decided to remove this command from MySQL 5.7.

These decisions are not light ones for the team to make. I wanted to thank everyone for their feedback, and also share why this decision was an important one for the future of MySQL:

  1. By design, data collection for SHOW ENGINE INNODB MUTEX was always enabled, with no switch to disable it. We believe that for some workloads being able to disable the collection is useful, and in the case of performance_schema this is possible.
  2. performance_schema overlaps with this functionality but provides a superset of features, including timing information. The noted exception to this is that spinning information is not yet available in performance_schema.
  3. Comparable information to SHOW ENGINE INNODB MUTEX can be regenerated by creating views on performance_schema tables, offering a migration path for affected users and for those desiring an interface that requires fewer keystrokes (see the sketch after the example output below).
  4. The InnoDB mutex code was recently refactored as part of our server cleanup. After refactoring, the InnoDB code can now mix several mutex types internally, including spin locks, system mutexes (POSIX), and InnoDB home-brewed ones. The display output for SHOW ENGINE INNODB MUTEX doesn’t really account for these differences in its status column. Grouping all mutex types together becomes less intuitive, and would require either some change to the output, or a coarse (inaccurate) mixing of the types:
mysql> show engine innodb mutex;
+--------+----------------------------+-------------------+
| Type   | Name                       | Status            |
+--------+----------------------------+-------------------+
| InnoDB | log/log0log.c:775          | os_waits=26       |
| InnoDB | log/log0log.c:771          | os_waits=1        |
| InnoDB | buf/buf0buf.c:1208         | os_waits=3219     |
..
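
To illustrate point 3 above, here is a minimal sketch of such a view. The view name and column selection are my own, and it assumes the InnoDB mutex instruments and wait timers are enabled in performance_schema:

-- hypothetical migration-path view: approximate SHOW ENGINE INNODB MUTEX
-- create it in a schema of your choice (views cannot live in performance_schema)
CREATE VIEW innodb_mutex_summary AS
SELECT EVENT_NAME, COUNT_STAR AS waits, SUM_TIMER_WAIT AS total_wait_ps
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME LIKE 'wait/synch/mutex/innodb/%'
  AND COUNT_STAR > 0;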

Optimizing InnoDB Transactions

(This is a cross-post from the Engine Yard blog. The original article appears here.)

Here is a question I’ve actually been asked a few times:

“I am writing a batch processing script which modifies data as part of an ongoing process that is scheduled by cron. I have the ability to group a number of modifications together into a transaction, but I’m not sure what the correct number is?”

First off, I think this question is interesting not just in the context of batch processing; it applies equally to all parts of the application. If you are designing a high-throughput system for MySQL, there are some potential pain points that you can design your way around.

Potential Pain Points

Here are the situations where the size of the transaction could impact performance:

Very Small Transactions

Using MySQL’s autocommit mode (or modifying just one row at a time) can cause some internal pressure on InnoDB’s redo log system, since it has to make sure that its in-memory buffer is safely flushed to disk. This flushing is required to provide durability, and to safely recover all changes that have been committed if the server crashes.

Note that this durability requirement can be relaxed by setting innodb_flush_log_at_trx_commit=2 (default: 1), or the impact can be reduced by installing a RAID controller with a battery-backed write cache, or a fast SSD.
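
For example, the relaxed setting can be applied at runtime without a restart:

-- flush the redo log once per second instead of at every commit
-- (trades up to ~1 second of durability for throughput)
SET GLOBAL innodb_flush_log_at_trx_commit = 2;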

Very Long Transactions

To understand what I mean by “long”, let’s use this hypothetical example:

START TRANSACTION;
/* Modify Data */
SELECT SLEEP(60); /* the application pauses mid-transaction */
COMMIT;

The problem with these transactions is that, even if they do not modify much data, InnoDB internally needs to keep around older versions of rows that were modified in other transactions, because of a feature called Multiversion Concurrency Control (MVCC). There is an internal clean-up thread that constantly tries to purge older versions, but in the case of a very long transaction it will sit and wait until a COMMIT/ROLLBACK event occurs.

In the example above, I used SLEEP(60) to illustrate a point, but in a very high throughput system it may not even need to be this long to start impacting performance.
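
If you suspect a long-running transaction is holding up purge, one way to spot it is via INFORMATION_SCHEMA.INNODB_TRX (columns shown here are from 5.6):

-- list open transactions, oldest first
SELECT trx_id, trx_started, trx_mysql_thread_id, trx_rows_modified
FROM INFORMATION_SCHEMA.INNODB_TRX
ORDER BY trx_started;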

Very Large Transactions

By very large transactions, I simply mean transactions which modify very large amounts of data. There are a couple of risks that you need to be aware of in this case:

  • Locks will be held for the duration of a transaction. By modifying more rows at once, you may increase the number of deadlocks or introduce lock wait contention.
  • Rolling back a transaction can cost 30 times as much as the corresponding apply. It is important to note that a rollback may be something you requested, but it will also happen in the case of a disconnect or a deadlock.

Closing Advice

I think it’s easy to get lost in the details and sub-optimize to eliminate any potential problems. Fear not! I have a two-step process for avoiding the vast majority of issues:

  • Move intensive application processing logic to before or after the transaction runs. While you have a transaction open, try to make any changes that you need to make and COMMIT as soon as possible. i.e.
/* do intensive processing */
START TRANSACTION;
/* Modify Data */
COMMIT;
/* more intensive processing */
  • You usually can’t go wrong by using transactions as they were originally intended: commit logical units of work together. As mentioned above, you can actually run into problems using autocommit with row-at-a-time processing, as well as with batches that are too large. My advice would be to keep transactions to modifying fewer than 10K rows at a time (a sketch of this batching pattern follows this list).
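
Here is a sketch of that batching pattern. The table and predicate are hypothetical; the application re-runs the statement until it affects zero rows:

-- purge old rows in chunks of at most 10K, committing between chunks
DELETE FROM audit_log
 WHERE created_at < '2014-01-01'
 LIMIT 10000;
-- loop while ROW_COUNT() > 0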

Good luck!

The MySQL Team is Hiring

As mentioned in my recent MySQL Performance Blog interview, Oracle has a number of vacancies in the MySQL team. By my count, there are 21 current vacancies:

  • IRC2431754
    Software Developer 4
    MySQL Server Development (Windows)
  • IRC2433642
    Software Developer 4
    MySQL Server Development (Replication)
  • IRC2435607
    Internet Sales Representative II
    MySQL Corporate Sales
  • IRC2437590
    Software Developer 3
    MySQL Enterprise Tools
  • IRC2423566
    Technology Sales Representative III
    MySQL Sales
  • IRC2421573
    Technical Analyst 3-Support
    MySQL Support
  • IRC2409719
    Sales Consultant
    MySQL Global Business Unit
  • IRC2409720
    Sales Consultant
    MySQL Global Business Unit
  • IRC2303935
    Sales Consultant
    MySQL Global Business Unit
  • IRC2411711
    Telesales Business Development Representative II
    MySQL Sales
  • IRC2401744
    Internet Sales Representative II
    MySQL Sales
  • IRC2335551
    Business Analyst 4-Ops
    Professional Services
  • IRC2376534
    Technical Analyst 3-Support
    MySQL Support
  • IRC2378540
    Technical Analyst 3-Support
    MySQL Support
  • IRC2378080
    Internet Sales Representative II
    MySQL Sales Americas
  • IRC2379740
    Applications Sales Representative III
    MySQL Field Sales
  • IRC2359667
    Software Developer 4
    MySQL Server Development Team
  • IRC2351560
    Software Developer 3
    MySQL Engineering
  • IRC1720632
    Software Developer 4
    MySQL Server General Team
  • IRC2139563
    Technical Analyst 3-Support
    MySQL Support
  • IRC2131703
    Technical Analyst 3-Support
    MySQL Support

To apply, go to irecruitment.oracle.com and enter the IRC code in the search box.

Today’s practical use-case for Performance Schema

Today’s blog post starts with a question:

“In my continuous integration environment, running tests currently takes 30 minutes. I believe that a large percentage of time is spent in MySQL, and I want to know if putting MySQL on tmpfs will cut that time down?”

I have previously written about how to configure MySQL to be less durable in testing environments, but today I wanted to write about how we can really test the assumption this question is based on. That is to say:

  • We know that total time is 30 minutes.
  • We assume MySQL is a large percentage of this.
  • We know that this optimization will potentially allow us to avoid IO.
  • We then want to know how much time MySQL spends waiting on IO during our test suite.

And the answer to this is Performance Schema.

Performance Schema instruments file IO with statistics such as bytes read/written and wait time. So if we see that IO wait time was only a small percentage of the 30 minutes of test time, we automatically know whether this optimization is worth investigating.

Here is a simple query that will work on 5.6 out of the box:

mysql> SELECT
 `a`.`EVENT_NAME`,
 `a`.`SUM_TIMER_WAIT`/1000000000 AS `total_latency_ms`,
 `b`.`SUM_NUMBER_OF_BYTES_READ` AS `total_bytes_read`,
 `b`.`SUM_NUMBER_OF_BYTES_WRITE` AS `total_bytes_written`
FROM
 `performance_schema`.`events_waits_summary_global_by_event_name` `a`
 INNER JOIN `performance_schema`.`file_summary_by_event_name` `b` USING (event_name)
WHERE
 `a`.`EVENT_NAME` like 'wait/io/file/%'
 AND `a`.`COUNT_STAR` > 0\G
*************************** 1. row ***************************
         EVENT_NAME: wait/io/file/sql/casetest
   total_latency_ms: 0.245716745034
   total_bytes_read: 0
total_bytes_written: 0
*************************** 2. row ***************************
         EVENT_NAME: wait/io/file/sql/ERRMSG
   total_latency_ms: 0.911862457171
   total_bytes_read: 58982
total_bytes_written: 0
*************************** 3. row ***************************
         EVENT_NAME: wait/io/file/sql/FRM
   total_latency_ms: 50.292402482592
   total_bytes_read: 137771
total_bytes_written: 0
*************************** 4. row ***************************
         EVENT_NAME: wait/io/file/sql/global_ddl_log
   total_latency_ms: 0.012274521403
   total_bytes_read: 0
total_bytes_written: 0
*************************** 5. row ***************************
         EVENT_NAME: wait/io/file/sql/pid
   total_latency_ms: 0.218277866021
   total_bytes_read: 0
total_bytes_written: 5
*************************** 6. row ***************************
         EVENT_NAME: wait/io/file/mysys/charset
   total_latency_ms: 0.450179565698
   total_bytes_read: 18316
total_bytes_written: 0
*************************** 7. row ***************************
         EVENT_NAME: wait/io/file/mysys/cnf
   total_latency_ms: 0.315982563421
   total_bytes_read: 56
total_bytes_written: 0
*************************** 8. row ***************************
         EVENT_NAME: wait/io/file/myisam/dfile
   total_latency_ms: 0.885083482601
   total_bytes_read: 2322
total_bytes_written: 0
*************************** 9. row ***************************
         EVENT_NAME: wait/io/file/myisam/kfile
   total_latency_ms: 3.881758853793
   total_bytes_read: 7058
total_bytes_written: 0
*************************** 10. row ***************************
         EVENT_NAME: wait/io/file/innodb/innodb_data_file
   total_latency_ms: 114.101030502468
   total_bytes_read: 10223616
total_bytes_written: 49152
*************************** 11. row ***************************
         EVENT_NAME: wait/io/file/innodb/innodb_log_file
   total_latency_ms: 2.865770795383
   total_bytes_read: 69632
total_bytes_written: 3072
11 rows in set (0.01 sec)
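
To reduce this to the single number the question asks for, you can sum the file IO wait time and compare it against the 30 minutes of wall-clock test time (Performance Schema timers are in picoseconds, so divide by 10^12 for seconds):

SELECT SUM(SUM_TIMER_WAIT)/1000000000000 AS total_io_wait_seconds
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME LIKE 'wait/io/file/%';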

Taking it one step further and combining it with ps_helper, I think there is a real opportunity to build this metrics collection into the test suite:

/* start by resetting all metrics */
CALL ps_helper.truncate_all();

/* run test suite */

/* Collect test suite metrics from MySQL */
SELECT * FROM ps_helper.wait_classes_global_by_latency;

Here is example output for wait_classes_global_by_latency from the GitHub project:

mysql> select * from wait_classes_global_by_latency;
+-------------------+--------------+---------------+-------------+-------------+-------------+
| event_class       | total_events | total_latency | min_latency | avg_latency | max_latency |
+-------------------+--------------+---------------+-------------+-------------+-------------+
| wait/io/file      |       550470 | 46.01 s       | 19.44 ns    | 83.58 µs    | 4.21 s      |
| wait/io/socket    |       228833 | 2.71 s        | 0 ps        | 11.86 µs    | 29.93 ms    |
| wait/io/table     |        64063 | 1.89 s        | 99.79 ns    | 29.43 µs    | 68.07 ms    |
| wait/lock/table   |        76029 | 47.19 ms      | 65.45 ns    | 620.74 ns   | 969.88 µs   |
| wait/synch/mutex  |       635925 | 34.93 ms      | 19.44 ns    | 54.93 ns    | 107.70 µs   |
| wait/synch/rwlock |        61287 | 7.62 ms       | 21.38 ns    | 124.37 ns   | 34.65 µs    |
+-------------------+--------------+---------------+-------------+-------------+-------------+

Neat, huh?

MySQL 5.6.16 Community Release Notes

Thank you to the MySQL Community, on behalf of the MySQL team @ Oracle. Your bug reports, test cases, and patches have helped create a better MySQL 5.6.16.

In particular:

  • Thanks to Honza Horak for suggesting we make tmpdir configurable at build-time, and for providing a patch. Bug #68338.
  • Thanks to Hartmut Holzgraefe for reporting a memory leak when using the InnoDB memcached API and replication. Bug #70757.
  • Thanks to Justin Swanhart for reporting that InnoDB reported an incorrect operating system error code when it failed to initialize. Bug #70867.
  • Thanks to Yoshinori Matsunobu who reported that MySQL 5.6’s persistent InnoDB statistics caused stalls due to latch contention. Bug #70768.
  • Thanks to Laurynas Biveinis for discovering that InnoDB could needlessly call os_thread_get_curr_id(). Bug #70417.
  • Thanks to Nizameddin Ordulu for identifying a case where InnoDB’s doublewrite buffer would not restore corrupted pages but could have. Bug #70087.
  • Thanks to Hartmut Holzgraefe for reporting that the README file for InnoDB memcached incorrectly specified the version of libevent statically linked. Bug #70034.
  • Thanks to Shahriyar Rzayev for reporting that the error message when resetting a slave with innodb_force_recovery set was cryptic. We have since improved the error message. Bug #69907.
  • Thanks to Keith Dechant for reporting that AUTO_INCREMENT values could not be reset when the INPLACE algorithm was used for ALTER TABLE. Bug #69882.
  • Thanks to Laurynas Biveinis for reporting an incorrect comment in the InnoDB source code. Bug #69847.
  • Thanks to Christian Rabe for reporting a problem when using an InnoDB tablespace stored on a raw device. Bug #69424.
  • Thanks to Justin Swanhart for reporting that an online ALTER TABLE operation on a partitioned table can consume a significant amount of memory. Bug #69325.
  • Thanks to Valeriy Kravchuk for reporting a specific workload where InnoDB did not scale well on multi-core machines. In response to Valeriy’s bug report, we added a new feature to use atomic reference counting to track page use instead of mutexes. Bug #68079.
  • Thanks to Stewart Smith for reporting that table renaming operations showed up under foreign key errors in SHOW ENGINE INNODB STATUS. Stewart also generously provided a patch. Bug #61746.
  • Thanks to Laurynas Biveinis for reporting that UNIV_SYNC_DEBUG was disabled erroneously as part of an earlier bug fix. We’ve since reenabled it. Bug #69617.
  • Thanks to Joffrey Michaie for reporting a situation where queries on tables partitioned by hash could return wrong results. Bug #70588.
  • Thanks to zhai weixiang for reporting an issue where binlog dump information could incorrectly be written to the error log. Bug #70685.
  • Thanks to Justin Swanhart for reporting that mysqlbinlog did not properly decode DECIMAL values. Bug #65812.
  • Thanks to Santosh Praneeth Banda for reporting that the semisynchronous replication plugin was called twice for a DDL statement. Bug #70410.
  • Thanks to Yang Dingning from NCNIPC, Graduate University of Chinese Academy of Sciences for reporting a security issue. Bug #61065.
  • Thanks to Bryan Turner for reporting a situation where tables with utf8_bin collation could return results in the wrong order. Bug #69005.
  • Thanks to Honza Horak for reporting code improvement suggestions as the result of a coverity analysis. Bug #70830.
  • Thanks to Ed Reeder for reporting that the option --local-service did not work on Windows. Bug #69637.
  • Thanks to David Coyle for reporting a limitation when using views with an ORDER BY in their definition. We have since worked around this limitation. Bug #69678.
  • Thanks to Davi Arnaut for reporting an issue with the performance_schema instrumentation interface. Bug #70628.
  • Thanks to Laurynas Biveinis for pointing out a situation where make_atomic_cas_body64 might be miscompiled. Thanks also to Davi Arnaut for his comments on the bug. Bug #63451.
  • Thanks to Davi Arnaut for reporting that the mysql_plugin client inadvertently attempts to remove files. Bug #69752.
  • Thanks to Igor Babaev and xiaobin lin, who independently discovered a case where COUNT(DISTINCT) could return wrong results. xiaobin lin also provided a patch; while we did not end up using it, we valued the contribution. Bug #68749, Bug #71028.
  • Thanks to Patrick Middleton for pointing out an issue when trying to compile MySQL without partitioning. While we didn’t end up using it, we thank Patrick for also providing a patch. Bug #71010.
  • Thanks to Anthony Tso for reporting a bug where wrong results could be returned. Bug #70608.
  • Thanks to Elena Stepanova for reporting a security issue. Bug #68751.
  • Thanks to Daniel van Eeden for letting us know that some scripts provided out of date information regarding where to file bug reports. Bug #68742.
  • Thanks to Don Coffin for reporting that some file removal operations were not instrumented by performance_schema. Bug #69782.
  • Thanks to Eric Bergen for reporting that mysqldump would not release metadata locks until after the dump operation had finished. Bug #71017.
  • Thanks to Michael Ezzell for reporting a security issue. Bug #68354.
  • Thanks to Giuseppe Maxia for raising awareness of the obsolete mysqlbug utility. We have since removed mention of it from mysql_install_db. Bug #29716.

Thank you again to all the community contributors listed above. In particular, the MySQL team would like to call out the names that appear more than once in this release:

Honza Horak (2), Hartmut Holzgraefe (2), Justin Swanhart (3), Davi Arnaut (2) and Laurynas Biveinis (4).

If I missed a name here, please let me know!

- Morgan

Eating your own dogfood

I’ve just finished converting my blog from Jekyll to WordPress.

There were a couple of features I was looking for (such as being able to schedule upcoming posts), but the real reason is that I want to be able to use MySQL 5.7 DMR3 against a tool that I interact with almost every day.

Or, as this is more informally known, I wanted to eat my own dogfood.

I now have:

  • WordPress 3.8.1
  • MySQL 5.7 DMR3
  • sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
  • All InnoDB Storage Engine
  • Disqus comment engine (comment if you think this is cheating)
  • “Markdown on Save Improved” plugin.

I apologize for broken images and URLs changing slightly (which will break comments). I plan to fix them as I spot them.

Do you have any other suggestions I should try?