The MySQL 5.7 Optimizer Challenge

In the MySQL team, we have been working really hard on refactoring the optimizer and improving the cost model. The hacks of storage engines lying to the optimizer are being rolled back, and your chances of getting an optimal query plan should now be much higher than in prior releases of MySQL.

The optimizer team has also allowed cost constants to be configurable on both a server and a storage engine basis, and we are confident that the default InnoDB engine will always work "as good as MyISAM" (which has a natural advantage, in that the optimizer was originally largely built around it.)

Today, I want to issue a challenge:

Find an example where the optimizer picks the wrong execution plan for InnoDB tables but is correct for MyISAM. If you can demonstrate a reproducible testcase, I have a polo with MySQL 5.7 Community Contributor on it waiting for you.

The supplies of this special edition t-shirt are limited, but I will ship it to you anywhere in the world :)

The MySQL 5.7 Community Contributor Polo, as modeled by Daniël van Eeden.
The MySQL 5.7 Community Contributor Polo, as modeled by Daniël van Eeden. I'm the guy on the left.

Published by

morgo

I joined MySQL AB in 2006, left, and am now back at Oracle working on the MySQL team. I've also worked at Percona and InPowered.

  • http://rzayevsehriyar.wordpress.com/ Shahriyar Rzayev

    :) Great one. So here is simple reproducible test case. My table structure:

    CREATE TABLE `sales` (
    `SALES_ID` int(8) NOT NULL AUTO_INCREMENT,
    `CUSTOMER_ID` decimal(8,0) NOT NULL,
    `PRODUCT_ID` decimal(8,0) NOT NULL,
    `SALE_DATE` datetime NOT NULL,
    `QUANTITY` decimal(8,0) NOT NULL,
    `SALE_VALUE` decimal(8,0) NOT NULL,
    `DEPARTMENT_ID` decimal(8,0) DEFAULT '0',
    `SALES_REP_ID` decimal(8,0) DEFAULT '0',
    `GST_FLAG` decimal(8,0) DEFAULT NULL,
    `sale_status` char(1) DEFAULT NULL,
    `FREE_SHIPPING` char(1) DEFAULT '',
    `DISCOUNT` decimal(8,0) unsigned DEFAULT '0',
    PRIMARY KEY (`SALES_ID`),
    KEY `sales_cust_idx` (`CUSTOMER_ID`)
    ) ENGINE=InnoDB AUTO_INCREMENT=2500004 DEFAULT CHARSET=latin1

    Same structure for MyISAM table named sales2.
    And answer to challenge follows.
    Take a look at estimated rows count:

    mysql> explain select count(*) from sales where sales_id>0G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: sales
    partitions: NULL
    type: range
    possible_keys: PRIMARY
    key: PRIMARY
    key_len: 4
    ref: NULL
    rows: 1158379
    filtered: 100.00
    Extra: Using where; Using index
    1 row in set, 1 warning (0.29 sec)

    mysql> explain select count(*) from sales2 where sales_id>0G
    *************************** 1. row ***************************
    id: 1
    select_type: SIMPLE
    table: sales2
    partitions: NULL
    type: ALL
    possible_keys: NULL
    key: NULL
    key_len: NULL
    ref: NULL
    rows: 2500003
    filtered: 33.33
    Extra: Using where
    1 row in set, 1 warning (0.00 sec)

    • http://www.tocker.ca/ Morgan Tocker

      Shahriyar, I'm going to followup via email. I'd like to make sure I have the same data as a starting point.

      (This is one of those cases where InnoDB has different features. The primary key is a clustered index.. so I want to make sure I judge it fairly.)

  • まえあつ

    Hi, I translated this article into Japanese

    Translated one is as follows.
    http://yakst.com/ja/posts/2690

    If there is any problem, please contact me.
    Thanks.