Converting MyISAM to InnoDB and a lesson on variance

I’m about to start working through converting a large MySQL installation to InnoDB. For the particular use case durability is desired, and with MyISAM a loss of power almost certainly guarantees data loss. Durability is not available as an option.

So the normal question people ask is.. okay, what do I pay for this feature?

I think if you are looking at the workload in question I actually believe InnoDB will perform better overall. Row-level locking, multiversion concurrency control, and the correct selection of hardware (fast-disks with a RAID controller + batter backed write cache) will really help it shine.

But my problem is that InnoDB could be better in 9 situations, and worse in 1. What matters more than anything else is performance regressions. That is…

“Our Customers Feel the Variance, Not the Mean” – a key concept in Six Sigma.

The way I like to explain this, is that if we were to go to a restaurant and order the same meal – or go to Toyota and order the same car, we should receive the same product. I can’t get the equivalent to a race car and you get a Trabant. Nor can we allow some queries that were already fast enough get faster, and other queries become intolerably slower.

So what I am actively trying to do is identify situations where MyISAM will be faster than InnoDB. And so far I have on my watch list:

  1. Table Scans – MyISAM data is far more compact, and typically table scans need to examine far less data.
  2. Index Scans – In InnoDB, indexes will contain multiple versions and typically be much larger.
  3. Inserts – MyISAM has a special optimization where it can just insert into the end of a table. I think even with the battery backed write cache this will be hard to beat.
  4. Single-threaded workloads – maybe there’s an hourly cron that in InnoDB will take significantly longer. InnoDB really performs better in concurrent workloads.

Facebook and Percona have previously written about the importance of variance reduction as well.

I should also note that durability is not the only reason to use InnoDB. Operational tasks like backups become much easier. But that’s the subject for a different post 🙂