The slow decline of using MySQL read slaves

Once upon a time, it was a very common MySQL choice to have an application split its queries between two connections, sending reads to a slave and writes directly to the master.

I won’t say this technique is no longer used (it certainly is) but I have been observing its popularity decline slowly over the years. Today I wanted to try and provide insight into some of the possible reasons:

  1. InnoDB as default. MyISAM table locks are very high impact for high throughput systems – worse still, MyISAM does not have MVCC, an algorithm that allows InnoDB reads to be non blocking. Running with multiple replication slaves was very important to distribute read-locks across an array of servers. With InnoDB, for most people this should now be a non-issue.
  2. Single-threaded replication. When hardware has multiple CPUs and multiple disks in RAID (and many SSDs in singular) it often requires multiple concurrent threads to be able to exploit all of the capacity that it can offer. Master servers can exploit this capacity very well, but they then replicate to slaves through a single thread. This creates a situation where the problem is not read scaling (which splitting reads/writes solves very well) but write scaling. The slaves very quickly become the bottleneck as they can not keep up.
  3. Increased desire for ACID behaviour. MySQL replication slaves are asynchronous[1], which means that if the application does an immediate read-after-write, then there is a good chance stale data might be served from a slave. Actually, it’s worse than that: with single-threaded slaves, if there is a 10 second update on the master, there is at least a 10 second delay on all slaves.

    Since the data is in an inconsistent state, mixing reads and writes between masters and slaves like this is not ACID compliant and introduces what I would call “asynchronous failures”. This leads to more time spent debugging errors.

    MySQL has also been reducing the performance hit to run ACID (including durability) in recent versions. I recommend it for most people.

    [1] Semi-sync introduced in MySQL 5.5 only ensures one slave is up to date.

  4. Increased performance-per-server. From MySQL 5.5 onwards there has been a huge focus on performance. Add to that increased memory in new servers, fast SSDs, it is possible that for some users their lifetime performance needs can be answered by a single server without the need for architectural changes.

Of these points, I suspect 1, 2, & 3 are the most likely, and to some degree 2 (single-threaded replication) makes 3 (consistency) worse because of the single threaded blips. There has been a lot of progress made to improve this in new versions:

  • MySQL 5.6 introduced parallel replication slaves across schemas. This increases the write throughput of slaves significantly, you just have to split tables across multiple databases then set –slave-parallel-workers to take advantage.
  • MySQL 5.7 DMR2 takes this one step further and offers intra-schema parallel replication. This offers parallel slaves and can keep all tables in the same schema.

So it is possible that we may actually see a return of read/write split in coming years. But it is also possible that as MySQL Fabric matures we may see users just rely on sharding instead, even though these two solutions should be orthogonal. It is so hard to predict the future.

I should also point out: I still see slaves immensely valuable for HA/DR as well as a place to send very heavy reporting queries to. This hasn’t changed.

What do you think about read/write split?