A cold cache, or a poorly tuned cache can be responsible for a number of performance problems. If we look at the data and indexes of InnoDB, the cache responsible is called the InnoDB buffer pool.
In earlier versions of MySQL the default value for the setting
innodb_buffer_pool_size was 8M – which is a little out of date when you consider the recommended value to be 50-80% of system memory. In MySQL 5.5 the default value was increased to 128M, which is a comprimise made for users that may install MySQL on a development machine and not have it running as a dedicated server. In production, it is unlikely that you will buy a new server with less than 64GB of RAM, so it is quite typical that this setting is 50GB+
So lets talk about the behaviour of the InnoDB buffer pool –
Up until and including MySQL 5.5
When MySQL starts up, the buffer pool is completely empty. As queries are executed, MySQL will determine which pages are required – and if they are not in memory, they will be loaded from disk.
Initially, performance will be very poor – since there will be 100% cache misses, but over time as the buffer pool fills this should reduce.
Provided there is enough memory – MySQL will just keep loading pages into cache, and warm to a state where it will not have to perform any reads to the disk at all. Only writes will go to disk.
From MySQL 5.6 onwards
Because server warm time is becoming a more serious problem (it always existed, but it is exacerbated by us now having an abundance of RAM, but if we are using hard drives, they are not really any faster), Oracle introduced a feature in MySQL 5.6 – buffer pool warming.
Upon startup, these pages can then automatically be loaded back into memory straight away so that we can have cache misses for less time.
This feature is not turned on by default, but I suspect in a future version it may be. There are a number of other advantages to pre-warming such as being able to sort and merge read requests and load the data in much faster. Also, since it is only the addresses of pages being saved it only takes 8 bytes to point to each 16KB page, and there are no risks if there have been modifications since the last address saving operation ran.
So we’ve described the “best case” so far, which is that data only gets loaded into cache and performance keeps getting better and better. However in practice, our cache is never unlimited. When our buffer pool gets to the point of being completely full it will need to free space in order to load new pages into memory. The behavior of this ‘eviction’ is version specific.
Up until MySQL 5.5
InnoDB previously used a classic Least Recently Used (LRU) algorithm. What this means is that each page has an order in a list. I like to think of it as a “Hot or not” score. Each time a page is accessed it gets promoted up the list to be considered “more hot”. When it comes to freeing space, InnoDB just picks the least hot page and frees it from memory to make space for the next page.
In typical operation MySQL will not free pages from memory unless it needs to make space for another page. I say ‘typical’, because pages will be freed if a table is dropped for example.
MySQL 5.5 onwards
There are a number of operational limitations with using a classic LRU implementation on a database which has many users, and typically more than one workload. For example lets say that everything is working fine and we’ve figured out what is “hot or not”, noting that we do not need as much memory as we do data because in most cases there will be pages that are not accessed frequently. Now imagine that a series of queries come in from mysqldump that want to run tablescans to export the data. What can happen, is the pages loaded in from the tablescans push out some of the good pages in our buffer pool, and even worse, as soon as the mysqldump operation is complete, they will no longer be required. So now post-mysqldump we have random IO as we try and re-settle cache-contents again.
So in MySQL 5.5 the LRU was split into two sublists:
– A ‘young’ sublist for our frequently accessed pages (default: 63%)
– An ‘old’ sublist for our table scan page accesses (default: 37%)
The configuration variable
innodb_old_blocks_pct was introduced to configure the old/new split, and a variable
innodb_old_blocks_time (default in 5.5: 0, in 5.6: 1000) was introduced to specify a minimum amount of milliseconds that a freshly loaded page must stay in the old sublist before it is ellible to be promoted to the young sublist.
A note on page churn
If you do not have enough memory to hold all of your working set in memory, what will happen is that the buffer pool will start juggling as it churns pages out of memory only to load them back in soon after. A small amount of this is probably okay, but it can start to degrade performance. The traditional rule of thumb is called “the 5 minute rule“. This means that if you load a page into memory and are going to need it again in the next five minutes – it should not need to be churned out.