Configuring MySQL to use minimal memory

I’ve been experimenting with MySQL Fabric – the official tool under development to provide HA and sharding. Part of this experimentation has meant running 7 copies of the MySQL Server on my laptop at once, which can consume a lot of memory 🙂
So today I wanted to write about how to configure MySQL to use less memory, which may also be beneficial in situations where peak performance is not important. For example: a developer VM environment.

Previous configuration guides

Before I get started, let me point to some of my previous configuration guides:

Configuring MySQL to use minimal memory

The MySQL defaults have to balance performance with what is considered reasonable for what may be a development system with other applications needing to run alongside MySQL. In many cases, this will mean a machine with 4-8GB of RAM, but on virtual machines (or in my case, with 7 copies of mysqld running) there is a lot less available.
Obligatory warning: If you are running these settings on a machine with 1GB+ RAM, you should expect worse performance when compared to the defaults.

Setting                       Default       Minimum
innodb_buffer_pool_size       128M          5M
innodb_log_buffer_size        1M            256K
query_cache_size              1M            0
max_connections               151           1 (although 10 might be more reasonable)
key_buffer_size               8388608       8
thread_cache_size             (autosized)   0
host_cache_size               (autosized)   0
innodb_ft_cache_size          8000000       1600000
innodb_ft_total_cache_size    640000000     32000000
thread_stack                  262144        131072
sort_buffer_size              262144        32K
read_buffer_size              131072        8200
read_rnd_buffer_size          262144        8200
max_heap_table_size           16777216      16K
tmp_table_size                16777216      1K
bulk_insert_buffer_size       8388608       0
join_buffer_size              262144        128
net_buffer_length             16384         1K
innodb_sort_buffer_size       1M            64K
binlog_cache_size             32K           4K
binlog_stmt_cache_size        32K           4K

(Note: you may be able to save a little more memory by disabling InnoDB or performance schema, but I don’t recommend these two steps.)
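For reference only, those two not-recommended steps would look roughly like this in a MySQL 5.6 my.cnf; disabling InnoDB also means switching the default storage engines away from it:

# not recommended: disable performance_schema and InnoDB
performance_schema=OFF
skip-innodb
default_storage_engine=MyISAM
default_tmp_storage_engine=MyISAM
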
To summarize these changes:

# /etc/my.cnf:
innodb_buffer_pool_size=5M
innodb_log_buffer_size=256K
query_cache_size=0
max_connections=10
key_buffer_size=8
thread_cache_size=0
host_cache_size=0
innodb_ft_cache_size=1600000
innodb_ft_total_cache_size=32000000
# per thread or per operation settings
thread_stack=131072
sort_buffer_size=32K
read_buffer_size=8200
read_rnd_buffer_size=8200
max_heap_table_size=16K
tmp_table_size=1K
bulk_insert_buffer_size=0
join_buffer_size=128
net_buffer_length=1K
innodb_sort_buffer_size=64K
#settings that relate to the binary log (if enabled)
binlog_cache_size=4K
binlog_stmt_cache_size=4K
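
After restarting mysqld with this file in place, it is worth spot-checking that the values took effect. A quick way to do that from the mysql client, using any of the variable names above:

mysql> SHOW GLOBAL VARIABLES WHERE Variable_name IN ('innodb_buffer_pool_size', 'max_connections', 'tmp_table_size');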

Plan to remove unused command line programs (10 out of 43)

Have you ever noticed the vast number of command line programs that ship with MySQL? Here is an ls from MySQL 5.6.16’s bin directory:

morgo@Rbook:~/Documents/mysql-installs/5.6.16/bin$ ls
innochecksum
msql2mysql
my_print_defaults
myisam_ftdump
myisamchk
myisamlog
myisampack
mysql
mysql_client_test
mysql_client_test_embedded
mysql_config
mysql_config_editor
mysql_convert_table_format
mysql_embedded
mysql_find_rows
mysql_fix_extensions
mysql_plugin
mysql_secure_installation
mysql_setpermission
mysql_tzinfo_to_sql
mysql_upgrade
mysql_waitpid
mysql_zap
mysqlaccess
mysqlaccess.conf
mysqladmin
mysqlbinlog
mysqlbug
mysqlcheck
mysqld
mysqld-debug
mysqld_multi
mysqld_safe
mysqldump
mysqldumpslow
mysqlhotcopy
mysqlimport
mysqlshow
mysqlslap
mysqltest
mysqltest_embedded
perror
replace
resolve_stack_dump
resolveip

For MySQL 5.7 we are planning to clean up and remove programs that appear to have no current use case. This helps us improve the supportability of MySQL, as users will no longer accidentally find themselves using a utility that has not historically been well maintained or has outlived its original purpose. In the case of shell scripts, removal also helps us offer a better cross-platform experience.

The list scheduled for removal

(Hat tip to Todd Farmer, who first wrote about removing these programs last year.)
Update March 12th: resolve_stack_dump has been removed from the list.

Feedback Welcome

Are you currently using one of these utilities scheduled for removal?
Did we get a decision wrong?
We value your feedback. Please leave a comment, or get in touch!

A followup on SHOW ENGINE INNODB MUTEX

EDIT: This post has since been updated. SHOW ENGINE INNODB MUTEX will return in MySQL 5.7.8.


In November, I wrote about the command SHOW ENGINE INNODB MUTEX and sought feedback on how it continues to be used now that performance_schema has been introduced.
The outcome from this feedback is that the MySQL team has decided to remove this command from MySQL 5.7.
These decisions are not light ones for the team to make. I wanted to thank everyone for their feedback, and also share why this decision was an important one for the future of MySQL:

  1. By design, data collection for SHOW ENGINE INNODB MUTEX was always enabled, with no switch to disable it. We believe that for some workloads being able to disable the collection is useful, and in the case of performance_schema this is possible.
  2. performance_schema overlaps with this functionality but provides a superset of features, including timing information. The noted exception to this is that spinning information is not yet available in performance_schema.
  3. Comparable information to SHOW ENGINE INNODB MUTEX can be re-generated by creating views on performance_schema tables, offering a migration path to affected users and to those who prefer an interface that requires fewer keystrokes (a rough sketch of such a view appears after the example output below).
  4. The InnoDB mutex code was recently refactored as part of our server cleanup. After refactoring, the InnoDB code can now mix several mutex types internally, including spin locks, system (POSIX) mutexes, and InnoDB’s own home-brewed ones. The display output for SHOW ENGINE INNODB MUTEX doesn’t really account for these differences in its Status column. Grouping all mutex types together becomes less intuitive, and would require either some change to the output, or a coarse (inaccurate) mixing of the types:
mysql> show engine innodb mutex;
+--------+----------------------------+-------------------+
| Type   | Name                       | Status            |
+--------+----------------------------+-------------------+
| InnoDB | log/log0log.c:775          | os_waits=26       |
| InnoDB | log/log0log.c:771          | os_waits=1        |
| InnoDB | buf/buf0buf.c:1208         | os_waits=3219     |
..
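
To illustrate point 3, here is a rough sketch of the kind of view that can approximate the old output from performance_schema. The view name and column aliases are my own invention; the wait/synch/mutex/innodb/ event names are what performance_schema uses for InnoDB mutexes, and the relevant instruments must be enabled in setup_instruments for the counters to be populated:

/* Create this in a database of your choosing; performance_schema itself is read-only.
   SUM_TIMER_WAIT is reported in picoseconds. */
CREATE VIEW innodb_mutex_waits AS
SELECT EVENT_NAME     AS Name,
       COUNT_STAR     AS Waits,
       SUM_TIMER_WAIT AS Total_Wait_Time
FROM performance_schema.events_waits_summary_global_by_event_name
WHERE EVENT_NAME LIKE 'wait/synch/mutex/innodb/%';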

Optimizing InnoDB Transactions

(This is a cross-post from the Engine Yard blog. The original article appears here.)
Here is a question I’ve actually been asked a few times:

“I am writing a batch processing script which modifies data as part of an ongoing process that is scheduled by cron. I have the ability to group a number of modifications together into a transaction, but I’m not sure what the correct number is?”

First off, I think that this question is interesting not just in the context of batch processing, but it equally applies to all parts of the application. If you are designing a high throughput system for MySQL, there are actually some potential pain points that you can design your way around.

Potential Pain Points

Here are the situations where the size of the transaction could impact performance:
Very Small Transactions
Using MySQL’s autocommit mode (or modifying just one row at a time) can cause some internal pressure on InnoDB’s redo log system, since it has to make sure that its in-memory log buffer is safely flushed to disk at every commit. This flushing is required to provide durability, so that all committed changes can be recovered if the server crashes.

Note that this durability requirement can be relaxed by setting innodb_flush_log_at_trx_commit=2 (default: 1), or the impact can be reduced by installing a RAID controller with a battery-backed write cache or a fast SSD.
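
To make the difference concrete, here is a small sketch; the queue_items table is hypothetical, and the point is simply that the grouped form pays the redo-log flush once per COMMIT rather than once per row:

/* With autocommit enabled, each statement is its own transaction and its own flush */
INSERT INTO queue_items (payload) VALUES ('a');
INSERT INTO queue_items (payload) VALUES ('b');

/* Grouped: both rows are covered by a single flush at COMMIT */
START TRANSACTION;
INSERT INTO queue_items (payload) VALUES ('a');
INSERT INTO queue_items (payload) VALUES ('b');
COMMIT;
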
Very Long Transactions
To understand what I mean by “long”, let’s use this hypothetical example:

START TRANSACTION;
/* Modify Data */
SELECT SLEEP(60); /* the application keeps the transaction open while doing other work */
COMMIT;

The problem with these transactions is that, even if they do not modify much data, a feature called Multiversion Concurrency Control means InnoDB internally needs to keep around older versions of rows that were modified by other transactions. An internal clean-up (purge) thread constantly tries to remove these older versions, but while a very long transaction is open it has to sit and wait until a COMMIT/ROLLBACK occurs.
In the example above, I used SLEEP(60) to illustrate a point, but in a very high throughput system it may not even need to be this long to start impacting performance.
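
If you suspect long-open transactions are an issue, InnoDB exposes currently open transactions in information_schema. A query along these lines (the 60-second threshold is arbitrary) shows transactions that have been open for a while and how many rows they have modified:

SELECT trx_id, trx_started, trx_rows_modified, trx_mysql_thread_id
FROM information_schema.INNODB_TRX
WHERE trx_started < NOW() - INTERVAL 60 SECOND;
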
Very Large Transactions
By very large transactions, I simply mean transactions which modify very large amounts of data. There are a couple of risks that you need to be aware of in this case:

  • Locks will be held for the duration of a transaction. By modifying more rows at once, you may increase the number of deadlocks or introduce lock wait contention.
  • Rolling back transactions can be 30 times the cost of a corresponding apply transaction. It is important to note that rolling back may be something you requested, but it will also happen in the case of a disconnect or a deadlock.

Closing Advice

I think it’s easy to get lost in the details and sub-optimize to eliminate any potential problems. Fear not! I have a two-step process for avoiding the vast majority of issues:

  • Move intensive application processing logic to before or after the transaction runs. While you have a transaction open, try to make the changes you need and COMMIT as soon as possible, i.e.:
/* do intensive processing */
START TRANSACTION;
/* Modify Data */
COMMIT;
/* more intensive processing */
  • You usually can’t go wrong by using transactions as they were originally intended: commit logical units of work together. As mentioned above, you can actually run into problems both with autocommit and row-at-a-time processing, and with batches that are too large. My advice would be to keep transactions to modifying fewer than 10K rows at a time (a sketch of this batched approach follows below).
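
As a sketch of that batching advice, here is one way to delete a large amount of old data in chunks; the audit_log table, column, and cutoff date are made up for illustration:

START TRANSACTION;
DELETE FROM audit_log WHERE created_at < '2014-01-01' LIMIT 10000;
COMMIT;
/* the application repeats this block until the DELETE affects zero rows */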

Good luck!