Deciding whether or not to make MySQL durable

Today I had intended to talk about my approach and thought process in configuring a fresh MySQL server running 5.6 with the InnoDB storage engine. However, it didn’t quite work out that way. I managed to get side-tracked by decision #0 – whether or not durability is important.

Introduction

By default, MySQL is configured to lie to you. That is that when you modify data there are no guarantees that your changes will be able to survive a catastrophic failure, or to put in simple terms: a power loss. The advantage of doing this, is MySQL is able to perform additional buffering and merging and consequently achieve better performance. However, the downside is also very clear: losing data is problematic, so you need to decide what is appropriate for your use-case.
I tend to argue that there are only two options of durability (yes or no), since you do not get to decide which transaction you lose – it could be a $10 order or a $10 million dollar order. Once you have decided you can potentially lose one transaction, in reality you can probably lose a few more.
I also warn strongly that there is no point in making MySQL durable if you can not make durability guarantees with the layers below MySQL. MySQL is not alone in lying by default – consumer hard drives have buffers which are not battery backed, as do virtualization platforms.

Durable

Frequently you will hear the example:

“We don’t process credit cards so we don’t need to be durable”.

I tend to disagree with this, as it is too simplistic to be useful. Order processing and user management are two equally valuable examples which require durability. For example: I don’t want to revoke someone’s credentials only to find with a power loss my change did not take effect! I also do not like that this example omits the truth that lost DBA/developer productivity can be more expensive than hardware. Many businesses do not want to be in a situation where a failure requires post-failure investigation to handle any discrepancies.
I would also mention that the cost of providing the added IOPS durable workloads create is much lower now that we have the ability to install SSDs on our database servers. If the required performance can be achieved with durable or not durable, losing durability is a sub-optimization, not a trade-off.
So having discussed it, here are the settings to change MySQL to be durable:

  • sync_binlog=1

For slaves:
sync_relay_log=1
relay-log-info-repository=TABLE
master-info-repository=TABLE
or:
sync_relay_log=1
sync_relay_log_info=1
sync_master_info=1
I recommend using the first option for slaves, which will use an InnoDB table internally. This is new to MySQL 5.6.

Non-durable

It’s best to describe this by case study. In the case of inPowered, we have multiple database servers, but by enlarge we are non-durable. I argue that this is the right choice for a few reasons:

  • We do not have much in the way of user data
  • Most of the data can be regenerated by re-scraping web articles
  • Most of the data is valuable in aggregate (tracking likes/shares of articles) so losing a portion is not critical.
  • I have no reason to believe that Amazon EC2 servers provide durable writes (see introduction), nor do I have the ability to do a true pull the plug test.

Settings to change to be non-durable:

  • innodb_flush_log_at_trx_commit=2

There are fewer settings to change when deciding to be non durable, since this is the default.

Conclusion

I confessed that I am currently managing a non-durable installation, but you may detect from my tone that I am by and large a durability proponent. I do not necessarily agree with ‘lie by default’, but nor do I hold it against MySQL – it is a common choice amongst databases vendors. I also think that users are often heavily swayed by naive benchmarks, so it is hard to take a high moral ground otherwise.
Did I miss anything? Please leave a comment.