Estimating MySQL's Working Set with information_schema

In an earlier post, I wrote about when MySQL loads data in and out of cache. At the end of this post, I talked about the concept of page churn, and having enough memory for your working set:

“If you do not have enough memory to hold all of your working set in memory, what will happen is that the buffer pool will start juggling as it churns pages out of memory only to load them back in soon after. A small amount of this is probably okay, but it can start to degrade performance. The traditional rule of thumb is called “the 5 minute rule“. This means that if you load a page into memory and are going to need it again in the next five minutes – it should not need to be churned out.”

To be able to calculate this in a precise way we would need to collect:
– The current number of pages in the buffer pool.
– The unique pages loaded in over a 5 minute period.
MySQL doesn’t have a hook to be able to tell when a page is loaded in memory, but as of MySQL 5.6 it does have an information_schema table called INNODB_BUFFER_PAGE. So while we will not be able to get the exacty number, polling this table on a frequent enough interval and storing the unique results (space_id + page_id) to a temporary table should get close.
So I wrote a stored procedure estimate_working_set.sql to do exactly this. It accepts two arguments; sleep time and iterations. Here is an example ~5 minute observation on my laptop:

mysql> call test.estimate_working_set(10, 30);
+----------+
| progress |
+----------+
| 1/30     |
+----------+
1 row in set (10.13 sec)
+----------+
| progress |
+----------+
| 2/30     |
+----------+
1 row in set (21.19 sec)
.. lines ommitted for brevity ..
+----------+
| progress |
+----------+
| 29/30    |
+----------+
1 row in set (5 min 41.97 sec)
+----------+
| progress |
+----------+
| 30/30    |
+----------+
1 row in set (5 min 54.72 sec)
+----------------------+
| pages_in_working_set |
+----------------------+
|               100679 |
+----------------------+
1 row in set (5 min 55.61 sec)
Query OK, 0 rows affected (5 min 55.71 sec)

So in my case my working set is 100679 pages. I am not using compressed pages so each page is 16 KiB, or divide by 64 to convert to MiB. 100679/64 = 1573MB working set. I have a 128M buffer pool – so I desperately need to increase it / add more memory.
Warning! This stored procedure will be expensive on a server with a large buffer pool. Use at your own risk!