How do you use the Query Cache?

We are looking for community feedback on the use-cases for the Query Cache in MySQL.

As astute followers will notice, the query cache is now off by default in 5.6, but realistically it has always been effectively disabled: the previous default was ON with zero memory allocated to it.

The reason it was disabled is that the query cache does not scale on high-throughput workloads running on multi-core machines. This is due to an internal global lock, which can often be seen as a hotspot in performance_schema.
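If you suspect this on your own server, performance_schema can show time spent waiting on the query cache mutex. A minimal sketch (assuming the 5.6 instrument name, and that mutex instrumentation is enabled in setup_instruments):

```sql
-- Aggregate waits on the query cache's global lock.
-- The instrument name below is the 5.6 one; adjust the LIKE pattern if needed.
SELECT EVENT_NAME, COUNT_STAR, SUM_TIMER_WAIT
  FROM performance_schema.events_waits_summary_global_by_event_name
 WHERE EVENT_NAME LIKE 'wait/synch/mutex/sql/Query_cache%'
 ORDER BY SUM_TIMER_WAIT DESC;
```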

The ideal scenario for the query cache tends to be largely read-only, where there are a number of very expensive queries which examine millions of rows only to return a few. A hypothetical example might be a complex query to build a list of values for a drop-down list that always appears on a webpage form. In a situation like this, the query cache can mask performance problems caused by missing indexes, which makes it helpful for novice users.
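For this narrow scenario there is a middle ground: run the cache in DEMAND mode, so that only queries which explicitly opt in are cached. A sketch (the table and columns are invented for illustration):

```sql
-- In my.cnf: cache only queries that ask for it, with some memory allocated.
-- query_cache_type = DEMAND
-- query_cache_size = 16M

-- Only this query is cached; everything else bypasses the cache entirely.
SELECT SQL_CACHE category_id, category_name
  FROM categories          -- hypothetical drop-down source table
 ORDER BY category_name;
```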

My opinion, however (and it does not necessarily reflect that of the MySQL team), is that this scenario is becoming less frequent:

  • Many users are building queries via ORMs, which can often offer caching directly.
  • Tooling is getting better, and novice users can discover missing indexes via things like MySQL Workbench's visual explain.
  • Any update can invalidate the cache, and good performance practice teaches us not to focus on the average, but to try to tighten the variance between user requests.

Have I missed anything? Please leave a comment, or get in touch. I would love to hear how you are using the Query Cache.

Published by


I joined MySQL AB in 2006, left, and am now back at Oracle working on the MySQL team. I've also worked at Percona and InPowered.

  • Bill Karwin

    I agree, the best use case for the query cache is expensive queries against infrequently updated data, similar to your description of populating drop-down lists, or calculating aggregates like SUM, COUNT, or MAX.

    The problem is that the query cache is a one-size-fits-all solution, in that it adds overhead (including a mutex contention point) for *all* queries, not just the few queries that it would benefit the most.

    One-size-fits-all solutions are convenient for novices, but they are unlikely to be the best solution for any given situation. For example, PHP's magic quotes turned out to be a bad idea, and the feature was eventually deprecated.

    I think that Materialized Views would be a better solution than the query cache to solve the same use cases. That would give the developer more control over which queries are desirable to cache.

    One could refresh the content of the view manually with syntax like: CREATE OR REPLACE MATERIALIZED VIEW ... AS SELECT ... Or create an Event to do the same on a schedule.
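    Since MySQL has no materialized views today, a similar effect can be approximated with a summary table refreshed by the Event Scheduler. A rough sketch (all names invented, and event_scheduler must be ON):

    ```sql
    -- Summary table standing in for a materialized view.
    CREATE TABLE daily_sales_mv (
      sale_day DATE PRIMARY KEY,
      total    DECIMAL(12,2) NOT NULL
    );

    -- Refresh on a schedule; REPLACE re-derives the rows from the base table.
    CREATE EVENT refresh_daily_sales_mv
      ON SCHEDULE EVERY 5 MINUTE
      DO REPLACE INTO daily_sales_mv
         SELECT DATE(sold_at), SUM(amount)
           FROM sales
          GROUP BY DATE(sold_at);
    ```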

    • Morgan Tocker

      Interesting... I've always liked materialized views as a feature (one which MySQL currently does not support), but I never thought about them as a query cache replacement. It makes sense.

      • Justin Swanhart

        I have always thought of it as a query cache replacement, and have talked openly about it. Flexviews is open source.

        The Flexviews API just needs to be hooked into the SQL parser. It would be straightforward to implement Flexviews directly in the database.

        I wonder if Oracle would include such a contribution, but MariaDB might. Hmm..

        Basically all it needs is either a replication server plugin or a daemon plugin to read the binary log. I'm not sure the first interface provides enough information.

  • Justin Swanhart

    Keep in mind that Oracle also has a query cache, so there are use cases for it. I am sure they didn't add it simply to copy MySQL :)

    There are a few downsides to materialized views:
    a) you have to define the views
    b) you have to change your queries to use the views

    So Oracle supports materialized view rewrite, which can take regular queries and automatically rewrite them to use the materialized view. But this is heavyweight, and it works best for OLAP queries. In general, OLAP materialized views take long enough to refresh that it makes sense to take the time to define the views, and you are also likely to know in advance which views you will need: in a fact table, for example, there are various measures that you are likely to sum.
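    For comparison, in Oracle the rewrite behaviour is opt-in per view. A sketch of the Oracle-side syntax (view and table names invented):

    ```sql
    -- Oracle syntax: with QUERY REWRITE enabled, the optimizer may transparently
    -- redirect matching aggregate queries on sales_fact to this view.
    CREATE MATERIALIZED VIEW sales_by_region_mv
      REFRESH ON DEMAND
      ENABLE QUERY REWRITE
    AS
      SELECT region_id, SUM(amount) AS total_amount
        FROM sales_fact
       GROUP BY region_id;
    ```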

    OLTP isn't so easy. Often queries are generated by ORMs or on-the-fly or they frequently change in the codebase. This makes using materialized views difficult unless you want to implement a rewrite mechanism yourself.

    So a system that dynamically creates short-lived result-set caching for queries that are likely to be executed more than once is a good idea. I'd like to see something like a SQL_CACHE_TIME hint for OLTP queries, which makes only the other queries waiting for that specific result set wait.

    A mechanism similar to GET_LOCK(MD5(query_text)) could be used for the lock; in any case, a hash-backed locking mechanism with a short-lived mutex. Each structure can have an attached pointer to the memory for the query result, so there is no query cache fragmentation to worry about.

    Instead of invalidating on each table change, the query would invalidate after a specific amount of time. Of course, you could also have coarse invalidation when a table changes. For best performance, keep an index into the query cache structure that links table names to hash buckets; you can then empty the buckets quickly, and in parallel, reducing the time spent on invalidation.
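    An application-side approximation of this idea is possible today with GET_LOCK() and a cache table keyed by the query hash, so only sessions waiting for the same result block each other. A rough sketch (the cache table and 60-second TTL are invented; note that before 5.7 a session can hold only one GET_LOCK() lock at a time):

    ```sql
    SET @qry = 'SELECT ... the expensive query text ...';

    -- Serialise regeneration of this one result; unrelated queries never wait.
    SELECT GET_LOCK(MD5(@qry), 5);

    -- Hit if a fresh copy exists (time-based rather than table-based invalidation).
    SELECT result_blob
      FROM query_result_cache
     WHERE query_hash = MD5(@qry)
       AND cached_at > NOW() - INTERVAL 60 SECOND;

    -- On a miss: run the real query, store it with something like
    --   REPLACE INTO query_result_cache VALUES (MD5(@qry), <result>, NOW());
    SELECT RELEASE_LOCK(MD5(@qry));
    ```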

    • gggeek

      From my own (admittedly very limited) testing, the query cache of the Oracle engine did not give a significant speed boost compared to the MySQL one. That is probably because, for our usage pattern, the big difference in performance seems to lie in row fetch time. I sincerely hope that with 12c and the optimized network protocol the difference will be reduced.

  • gggeek

    As far as ORM systems go, I can offer some insight from six years of working with an open source CMS, which uses an ORM pattern with EAV for content fields.

    The queries generated by the system are quite variable, and the indexes in the db give good performance on the more common use cases; optimizing for all usages is impossible by definition, because the set of queries is open-ended.
    Plugins can add custom tables to the base db, and abuse of the base schema is possible by writing custom SQL.

    There is of course a (quite vast) set of caches internal to the application, but it is quite hard to set them up optimally (high TTL plus expire-immediately-on-data-change). In fact, most developers get it quite wrong, which means the app spends time executing the same queries more frequently than needed, and often creates cache files which are used just once or not at all.

    All of this is to say that the query cache within mysql is something we generally recommend keeping on. The fact that it is transparent to the app (no need to rewrite queries, and the db never serves stale data) is its main advantage over other solutions. For customers with lots of data and/or web traffic, a round of load testing is also recommended.

    What I'd like to see is a more flexible query cache: define tables not to be cached at all (without rewriting the SQL of the query), per-database configuration, more flexible purges, nicer usage stats, etc.

  • JamesDayAtOracle

    Query cache off is best for high-concurrency, high-throughput workloads, but for the smaller boxes that are often quite lightly loaded in concurrency terms it still has a place. Those cases are also the ones least likely to have an ORM or fancy caching architecture that reduces the benefit of the QC. What we have is a low/high box split, with the high end not needing it.

    So I'm not in favour of removing it. I'd prefer to see something like connections hashed to multiple query caches, accepting duplication of cached content because optimal QC sizes tend to be small enough for that duplication not to matter. I still have reservations about invalidation cost, though.