Beyond the FRM: ideas for a native MySQL Data Dictionary

The frm file has provided long service since the earliest days of MySQL. Now, it is time to replace it with a native InnoDB-based Data Dictionary.

This is a change that has been on our wish list for a long time, as well as others in the MySQL development community:

For historical context:

  • Every table in MySQL has at least a corresponding .frm file. For example in the case of MyISAM these three files constitute a complete backup:

    mytable.MYI
    mytable.MYD
    mytable.frm
    

  • The .frm file stores information such as column names and data-types. It is a binary format, which as Stewart explains predates MySQL and spans back to 1979 with UNIREG. In addition to the .frm file, there are .trn, .trg and .par files which have been added over time to support triggers and partitioning.

Our motivation to develop a native dictionary spans from a number of issues with the current filesystem-based formats:

  1. The current .frm file predates MySQL’s support for transactions, and has a lot of complexity to handle various failure states in replication and crash recovery. For example: Bug#69444. Using a native data dictionary simplifies code and makes handling failure states very simple.

  2. Our information_schema implementation currently suffers, and has been subject to years of complaints. By using a native dictionary, we will be able implement information_schema as views over real tables, significantly improving the speed of queries.

  3. On a closely related point, we currently build information_schema on top of a series of differing filesystem properties, while attempting to provide the same cross-platform experience. The code to account for filesystem case insensitivity increases code complexity, and ties up developer resources that could be better spent elsewhere.

  4. Aside from the MySQL server’s data dictionary, storage engines may also store their own data dictionary. In the case of the InnoDB storage engine, this redundant storage has led to complexity in troubleshooting an out-of-sync data dictionary.

  5. The current non-comformity of the data dictionary (using .frm, .par, .trn and .trg files) spans from a lack of extensibility from the original .frm format. Not having a centralized extensible repository makes it difficult to incorporate feature requests that require additional meta data stored, or to offer new relational objects in the future.

  6. The current format does not support versioning meta-data in such a way that we can use it to assist in the upgrade experience between MySQL versions.

This change is of course in addition to my recent post about storing system tables in InnoDB.

While this change will be transparent for many of our users, we are inviting feedback from the community. Please let us know:

  • If you have a use-case where you interact with the file-based formats directly.
  • What features you want to see in a native data dictionary!

You can either leave a comment, or email me.