Five reasons why vertical scalability matters

The latest benchmarks show that MySQL 5.7 can now scale to 60 cores, which is quite an incredible feat when you compare it to the 4-8 core scaling of MySQL 5.1 just a few years ago. These improvements are the result of a lot of heavy lifting to reorganize internal locking structures, and I have an earlier blog post, "What is a mutex anyway?", which may serve as an introduction.

While I consider horizontal scaling and projects like MySQL Fabric to be very important, it should be stated that horizontal and vertical scaling are really orthogonal choices. That is to say that a given database technology should ideally support both options, and today I wanted to zoom in on some of the advantages I see with being able to scale vertically:

  1. Having more cores offers more consistent performance. Think of a single CPU as being like a convenience store with one person at the checkout. The experience is very good when there is nobody else in line, but it degrades very quickly when just a couple of shoppers are lined up ahead of you.

    To add to that, even if you know that there is an average of 60 customers per hour, you cannot expect them to arrive evenly spaced at one customer per minute. What usually happens is a more random pattern of arrivals (see Poisson distribution).

    The solution to this problem is to have more staff available at the checkout. The more staff available, the less variance in the time that it takes to serve a customer. Similarly, having multiple CPUs means that query times will degrade much more gracefully when there are subtle spikes in load.

  2. Simplified debugging and performance characteristics. For some applications, where the lifetime growth requirements can be answered by a single server (or single master, multiple HA slaves), having a single primary server can be beneficial.

    The number of transactions/second that a single server can now handle is also much higher than it used to be. Dimitri’s 5.7 Sysbench OLTP_RW benchmark shows 15K transactions/second (or over 500K point select queries/second). Prematurely introducing architectural complexity through horizontal scaling may increase the effort required to troubleshoot problems. In some cases it may also artificially prevent desirable features such as strong consistency (ACID).

  3. Good insurance for the unknown. Some applications grow in unexpected ways, and being able to scale up offers a great upgrade path that is less likely to change performance characteristics than horizontal scaling, and requires fewer application changes.

    Often the cost of higher-end commodity hardware is less than that of custom development time. I once experienced a case where a legacy application that was scheduled for decommissioning started having performance problems. Paying for the biggest EC2 instance type was worth it for a few months, and we probably would have paid more if other options had been available (they were not at the time, but are now).

  4. Increased efficiency at scale. Even with automation, it is easier to manage 1000 16-core instances than it is to manage 16000 single-core instances.

  5. An alternative consolidation strategy to virtualization. For some organizations, backing up and maintaining many small database servers presents operational complexity, even when the underlying servers lie on virtualized hardware. Having a larger single database instance with many database schemas can offer an alternative that may be easier to manage.

    I concede that there is some functionality missing on the MySQL side to truly realize this potential, since it would be nice to be able to set more fine-grained quotas per application and limit the ability of one application to accidentally cause a denial of service for the others. However, some headway has been made, with performance_schema now able to instrument things that it previously could not. Most notably, in MySQL 5.7, memory can now be instrumented per user.
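
To put rough numbers on the checkout analogy from the first point, here is a minimal sketch using the textbook M/M/c queueing model (Poisson arrivals, exponential service times, Erlang C formula). The model choice, the function names, and the service rate of 75 customers per hour per cashier are my own assumptions for illustration, and a real database workload will not follow them exactly.

```python
from math import factorial


def erlang_c(arrival_rate, service_rate, servers):
    """Probability that an arrival has to queue in an M/M/c system (Erlang C)."""
    offered_load = arrival_rate / service_rate
    utilization = offered_load / servers
    if utilization >= 1:
        raise ValueError("unstable queue: utilization must be below 1")
    # Weight of the states where every server is busy.
    tail = offered_load ** servers / factorial(servers) / (1 - utilization)
    # Weight of the states where at least one server is idle.
    head = sum(offered_load ** k / factorial(k) for k in range(servers))
    return tail / (head + tail)


def mean_wait(arrival_rate, service_rate, servers):
    """Mean time spent waiting in line, in the time unit used by the rates."""
    p_wait = erlang_c(arrival_rate, service_rate, servers)
    return p_wait / (servers * service_rate - arrival_rate)


# Hypothetical figures: each cashier serves 75 customers/hour, and the
# arrival rate grows with the number of cashiers so that each one stays
# 80% busy in every scenario.
for cashiers in (1, 4, 16):
    wait_minutes = mean_wait(60 * cashiers, 75, cashiers) * 60
    print(f"{cashiers:2d} cashier(s): mean wait {wait_minutes:.2f} minutes")
```

Under these assumptions a single cashier gives a mean wait of 3.2 minutes, while four cashiers at the same 80% utilization bring it down to roughly 0.6 minutes. The point is not the exact figures but the shape: pooling capacity smooths out bursts of arrivals, even when each individual server is just as busy.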


I wanted to close with an example of how horizontal and vertical scaling should work together. Let's say that you operate a SaaS application with millions of users, and have the option to either massively shard or use a single larger server. Both options may be sub-optimal:

Vertical Scaling

PRO: Having vertical scalability allows each user to have some burstable performance that can be absorbed by larger hardware.

CON: It is possible that an extremely busy user will impact all other users, creating an all-eggs-in-one-basket scenario.

Horizontal Scaling

PRO: Multiple horizontal shards allow some natural fencing, where extreme spikes can be contained to a group of users. This assumes that individual users do not need cross-shard queries; otherwise it comes back to my point above under “Simplified debugging and performance characteristics”.

CON: If the application is excessively horizontally scaled, some of the busier users may have a bad experience as their shard becomes overloaded too quickly. Quite often these busy users are the ones with the most revenue associated with them.

By horizontally scaling across many vertically scalable servers, you can get closer to the best of both worlds.
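
The trade-off can be illustrated with a toy calculation. Everything here is hypothetical: the user loads (one busy user needing 0.9 cores, 999 light users needing 0.01 cores each), the modulo-based placement, and the function name are assumptions made for the sketch, not a description of any real deployment.

```python
def peak_utilization(loads, shard_count, cores_per_shard):
    """Place users on shards by user_id % shard_count and return the
    utilization (demand / cores) of the busiest shard."""
    shard_load = [0.0] * shard_count
    for user_id, load in enumerate(loads):
        shard_load[user_id % shard_count] += load
    return max(shard_load) / cores_per_shard


# Hypothetical demand, measured in cores: user 0 is the busy customer.
loads = [0.9] + [0.01] * 999

# The same 16 cores in total, split two different ways.
many_small = peak_utilization(loads, shard_count=16, cores_per_shard=1)
few_large = peak_utilization(loads, shard_count=4, cores_per_shard=4)

print(f"16 x 1-core shards: busiest shard at {many_small:.0%}")
print(f" 4 x 4-core shards: busiest shard at {few_large:.0%}")
```

With these made-up numbers, the busy user's single-core shard is asked for more than it can deliver, while the same user fits comfortably on a four-core shard. The larger shards still fence that user off from three quarters of the user base.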