We’re looking to improve PlanetMySQL

We are looking for community feedback on how improve the Signal-to-Noise ratio of PlanetMySQL, and reduce the amount of irrelevant and off-topic posts.

To give you a sense of where we are at today, we currently approve most [all?] blogs for aggregation, provided they have some existing technical content. We then apply a naive regular expression to make sure that all posts are on topic (MySQL).

The sorts of ideas we are looking for are:

What content are you interested in seeing more or less of?
What posts do you deem as acceptable or unacceptable?
Do we need to change the balance between technical, marketing, event, or business-related posts?
Are there any tools or features we could introduce to better serve readers and authors?

If you have any suggestions – please send them my way!

Proposal to deprecate mysqlhotcopy

In the MySQL team, we are considering deprecating the mysqlhotcopy utility. To provide some background, here is an excerpt from the MySQL manual:

Continue reading Proposal to deprecate mysqlhotcopy

MySQL 5.6.17 Community Release Notes

Thank you to the MySQL Community, on behalf of the MySQL team @ Oracle. Your bug reports, testcases and patches have helped create a better MySQL 5.6.17.

In particular:

Thanks to Anthony Pong for reporting a confusing error message when mysql_install_db could not locate the required Perl modules. Bug #69844.
Thanks to Jervin Real for reporting a recently introduced performance regression with compressed InnoDB tables. Bug #71436.
Thanks to Laurynas Biveinis for reporting a race condition in InnoDB on shutdown. Bug #70430.
Thanks to Jervin Real for reporting that innodb_data_file_path could not be specified in kilobytes, as was indicated in the manual. Bug #68282.
Thanks to Elena Stepanova for reporting that InnoDB miscalculates auto-increment after changing auto_increment_increment. Bug #65225.
Thanks to qinglin zhang for reporting a memory leak in the slave sql thread, plus providing a suggested patch. While we did not end up using qinglin’s patch, we value the contribution. Bug #71197.
Thanks to 徹赤松 for reporting that SHOW SLAVE STATUS incorrectly reported the master’s SSL information. Bug #70866.
Thanks to Yoshinori Matsunobu for reporting an issue where a slave can’t continue replication after the master’s crash recovery Bug #70669
Thanks to Erik Hoekstra for pointing out that using slave-parallel-workers may result in the error log being spammed with warnings. Bug #68429.
Thanks to Ovais Tariq for identifying that slaves with an additional auto-increment were not updated correctly. Bug #69680.
Thanks to Simon Mudd for suggesting that modifications to performance_schema tables should not be replicated. Bug #67159.
Thanks to Miguel Angel Nieto for identifying that temporary files created by binlog cache are not cleaned up. Bug #66237.
Thanks to hickey liu for reporting a race-condition in semi-sync replication. Bug #66411.
Thanks to Yoshinori Matsunobu for reporting excessive mutex contention in semi-sync replication. Bug #70218.
Thanks to honza horak for suggesting that mysqld –help should exit with a zero error code. Bug #70058.
Thanks to Jonathan Weaver for identifying that MySQL community edition client programs could not connect to MySQL enterprise servers with SSL enabled. Bug #68788.
Thanks to Jørgen Thomsen for reporting that mysqldump did not support the secure-auth parameter. Bug #69051.
Thanks to Raghavendra Prabhu for reporting an assertion when running Random Query Generator (RQG). Bug #69969.
Thanks to Tim McLaughlin for reporting that wrong results could be returned in a SELECT DISTINCT…GROUP BY query. We also thank Tim for reducing the bug to a minimal testcase. Bug #70657.
Thanks to Hartmut Holzgraefe for suggesting improvements to warnings written to the error log for invalid collations. Bug #68144.
Thanks to Ralf Adams for reporting that wrong results could be returned when using ALL() and GROUP BY. Ralf also provided a minimal testcase, which was useful in reproducing the issue. Bug #71244.
Thanks to Elena Stepanova for identifying that unquoted file names for variable values are accepted but parsed incorrectly. Bug #69703.
Thanks to Valeriy Kravchuk for suggesting that innodb_ft_result_search_limit should have a predictable maximum value. Bug #71554.

Thank you again to all the community contributors listed above. In particular, the MySQL team would like to call out the names that appear more than once in this release:

Jervin Real (2), Yoshinori Matsunobu (2), Elena Stepanova (2)

If I missed a name here, please let me know!

- Morgan

Plan to remove unused command line programs (10 out of 43)

Have you ever noticed the vast number of command line programs that ship with MySQL? Here is an ls from MySQL 5.6.16’s bin directory:

morgo@Rbook:~/Documents/mysql-installs/5.6.16/bin$ ls
innochecksum
msql2mysql
my_print_defaults
myisam_ftdump
myisamchk
myisamlog
myisampack
mysql
mysql_client_test
mysql_client_test_embedded
mysql_config
mysql_config_editor
mysql_convert_table_format
mysql_embedded
mysql_find_rows
mysql_fix_extensions
mysql_plugin
mysql_secure_installation
mysql_setpermission
mysql_tzinfo_to_sql
mysql_upgrade
mysql_waitpid
mysql_zap
mysqlaccess
mysqlaccess.conf
mysqladmin
mysqlbinlog
mysqlbug
mysqlcheck
mysqld
mysqld-debug
mysqld_multi
mysqld_safe
mysqldump
mysqldumpslow
mysqlhotcopy
mysqlimport
mysqlshow
mysqlslap
mysqltest
mysqltest_embedded
perror
replace
resolve_stack_dump
resolveip

For MySQL 5.7 we are planning to cleanup and remove programs that appear to have no current use-case. This helps us to improve supportability of MySQL as users will not be able to accidentally find themselves using a utility that has not historically been well maintained, or has outlived its original purpose. In the case of shell scripts, removal also helps us offer a better cross platform experience.

The list scheduled for removal

msql2mysql [Shell Script]
mysqlaccess [Perl program]
mysqlbug [Shell script]
mysql_zap [Shell script]
mysql_waitpid [C program]
mysql_convert_table_format [Perl Program]
mysql_fix_extensions [Perl Program]
mysql_find_rows [Perl Program]
mysql_setpermission [Perl Program]
~~resolve_stack_dump [C program]~~

(Hat tip to Todd Farmer, who first wrote about removing these programs last year.)

Update March 12th: resolve_stack_dump has been removed from the list.

Feedback Welcome

Are you currently using one of these utilities scheduled for removal?
Did we get a decision wrong?
We value your feedback. Please leave a comment, or get in touch!

A followup on SHOW ENGINE INNODB MUTEX

EDIT: This post has been updated here. SHOW ENGINE INNODB MUTEX will return in MySQL 5.7.8.

In November, I wrote about the command SHOW ENGINE INNODB MUTEX and sought feedback as to how it continues to be used post introduction of performance_schema.

The outcome from this feedback is that the MySQL team has decided to remove this command from MySQL 5.7.

These decisions are not light ones for the team to make. I wanted to thank everyone for their feedback, and also share why this decision was an important one for the future of MySQL:

By design, data collection for SHOW ENGINE INNODB MUTEX was always enabled, with no switch to disable it. We believe that for some workloads being able to disable the collection is useful, and in the case of performance_schema this is possible.
performance_schema overlaps with this functionality but provides a superset of features, including timing information. The noted exception to this is that spinning information is not yet available in performance_schema.
Comparable information to SHOW ENGINE INNODB MUTEX can be re-generated by creating views on performance_schema tables, offering a migration path to users affected and those desiring an interface that requires fewer key-strokes.
The InnoDB mutex code was recently refactored as part of our server cleanup. After refactoring, the InnoDB code can now mix several mutex types internally including spin locks, system mutexes (POSIX) and InnoDB home brewed ones. The display output for SHOW ENGINE INNODB MUTEX doesn’t really account for these differences in its status column. Grouping all mutex types together becomes less intuitive, and would require either some change to the output, or a coarse (inaccurate) mixing of the types:

mysql> show engine innodb mutex;
+--------+----------------------------+-------------------+
| Type   | Name                       | Status            |
+--------+----------------------------+-------------------+
| InnoDB | log/log0log.c:775          | os_waits=26       |
| InnoDB | log/log0log.c:771          | os_waits=1        |
| InnoDB | buf/buf0buf.c:1208         | os_waits=3219     |
..

MySQL 5.6.16 Community Release Notes

Thank you to the MySQL Community, on behalf of the MySQL team @ Oracle. Your bug reports, testcases and patches have helped create a better MySQL 5.6.16.

In particular:

Thanks to Honza Horak for suggesting we make tmpdir configurable at build-time, and for providing a patch. Bug #68338.
Thanks to Hartmut Holzgraefe for reporting a memory leak when using the InnoDB memcached API and replication. Bug #70757.
Thanks to Justin Swanhart for reporting that InnoDB reported an incorrect operating system error code when it failed to initialize. Bug #70867.
Thanks to Yoshinori Matsunobu who reported that MySQL 5.6’s persistent InnoDB statistics caused stalls due to latch contention. Bug #70768.
Thanks to Laurynas Biveinis for discovering that InnoDB could needlessly call os_thread_get_curr_id(). Bug #70417.
Thanks to Nizameddin Ordulu for identifying a case where InnoDB’s doublewrite buffer would not restore corrupted pages but could have. Bug #70087.
Thanks to Hartmut Holzgraefe for reporting that the README file for InnoDB memcached incorrectly specified the version of libevent statically linked. Bug #70034.
Thanks to Shahriyar Rzayev for reporting that the error message when reseting a slave with innodb_force_recovery set was cryptic. We have since improved the error message. Bug #69907.
Thanks to Keith Dechant for reporting that AUTO_INCREMENT values could not be reset when the INPLACE algorithm was used for ALTER TABLE. Bug #69882.
Thanks to Laurynas Biveinis for reporting an incorrect comment in the InnoDB source code. Bug #69847.
Thanks to Christian Rabe for reporting a problem when using an InnoDB tablespace stored on a raw device. Bug #69424.
Thanks to Justin Swanhart for reporting that an online ALTER TABLE operation on a partitioned table can consume a significant amount of memory. Bug #69325.
Thanks to Valeriy Kravchuk for reporting a specific workload where InnoDB did not scale well on multi-core machines. In response to Valeriy’s bug report, we added a new feature to use atomic reference counting to track page use instead of mutexes. Bug #68079.
Thanks to Stewart Smith for reporting that table renaming operations showed up under foreign key errors in SHOW ENGINE INNODB STATUS. Stewart also generously provided a patch. Bug #61746.
Thanks to Laurynas Biveinis for reporting that UNIV_SYNC_DEBUG was disabled erroneously as part of an earlier bug fix. We’ve since reenabled it. Bug #69617.
Thanks to Joffrey Michaie for reporting a situation where queries on tables partitioned by hash could return wrong results. Bug #70588.
Thanks to zhai weixiang for reporting an issue where binlog dump information could incorrectly be written to the error log. Bug #70685.
Thanks to Justin Swanhart for reporting that mysqlbinlog did not properly decode DECIMAL values. Bug #65812.
Thanks to Santosh Praneeth Banda for reporting that the semisynchronous replication plugin was called twice for a DDL statement. Bug #70410.
Thanks to Yang Dingning from NCNIPC, Graduate University of Chinese Academy of Sciences for reporting a security issue. Bug #61065.
Thanks to Bryan Turner for reporting a situation where tables with utf8_bin collation could return results in the wrong order. Bug #69005.
Thanks to Honza Horak for reporting code improvement suggestions as the result of a coverity analysis. Bug #70830.
Thanks to Ed Reeder for reporting that the option --local-service did not work on Windows. Bug #69637.
Thanks to David Coyle for reporting a limitation when using views with an ORDER BY in their definition. We have since worked around this limitation. Bug #69678.
Thanks to Davi Arnaut for reporting an issue with the performance_schema instrumentation interface. Bug #70628.
Thanks to Laurynas Biveinis for pointing out a situation where make_atomic_cas_body64 might be miscompiled. Thanks also to Davi Arnaut for his comments on the bug. Bug #63451.
Thanks to Davi Arnaut for reporting that the mysql_plugin client inadvertently attempts to remove files. Bug #69752.
Thanks to Igor Babaev and xiaobin lin who independently discovered a case when COUNT(DISTINCT) could return wrong results. xiaobin lin also provided a patch, which while we did not end up using it, we valued the contribution. Bug #68749, Bug #71028.
Thanks to Patrick Middleton for pointing out an issue when trying to compile MySQL without partitioning. While we didn’t end up using it, we thank Patrick for also providing a patch. Bug #71010.
Thanks to Anthony Tso for reporting a bug where wrong results could be returned. Bug #70608.
Thanks to Elena Stepanova for reporting a security issue. Bug #68751.
Thanks to Daniel van Eeden for letting us know that some scripts provided out of date information regarding where to file bug reports. Bug #68742.
Thanks to Don Coffin for reporting that some file removal operations were not instrumented by performance_schema. Bug #69782.
Thanks to Eric Bergen for reporting that mysqldump would not release metadata locks until after the dump operation had finished. Bug #71017.
Thanks to Michael Ezzell for reporting a security issue. Bug #68354.
Thanks to Giuseppe Maxia for raising awareness of the obsolete mysqlbug utility. We have since removed mention of it from mysql_install_db. Bug #29716.

Thank you again to all the community contributors listed above. In particular, the MySQL team would like to call out the names that appear more than once in this release:

Honza Horak (2), Hartmut Holzgraefe (2), Justin Swanhart (3), Davi Arnaut (2) and Laurynas Biveinis (4).

If I missed a name here, please let me know!

- Morgan

Proposal to enable sql mode ONLY_FULL_GROUP_BY by default

We are considering enabling the SQL mode ONLY_FULL_GROUP_BY by default in MySQL 5.7. Here is a quick example to demonstrate how this change will impact you:

Sample Data:

CREATE TABLE invoice_line_items (id INT NOT NULL PRIMARY KEY auto_increment, 
invoice_id INT NOT NULL,
description varchar(100)
);

INSERT INTO invoice_line_items VALUES 
 (NULL, 1, 'New socks'),
 (NULL, 1, 'A hat'),
 (NULL, 2, 'Shoes'),
 (NULL, 2, 'T shirt'),
 (NULL, 3, 'Tie');

Default behaviour in MySQL 5.6 and below:

mysql> SELECT id, invoice_id, description 
FROM invoice_line_items GROUP BY invoice_id;
+----+------------+-------------+
| id | invoice_id | description |
+----+------------+-------------+
|  1 |          1 | New socks   |
|  3 |          2 | Shoes       |
|  5 |          3 | Tie         |
+----+------------+-------------+
3 rows in set (0.00 sec)

Proposed default behaviour in MySQL 5.7:

mysql> SELECT id, invoice_id, description 
FROM invoice_line_items GROUP BY invoice_id;
ERROR 1055 (42000): 'test.invoice_line_items.id' isn't in GROUP BY

That is to say that in 5.7 this statement will produce an error.

Notice that columns id and description are not actually unique? Roland Bouman has a blog post from 2007 that describes what MySQL is doing here, and how it is different in behaviour to other databases.

To summarize: MySQL is picking one of the possible values for id and description and the query executes in a non deterministic way. Here’s some more information to prove this:

mysql> SELECT id, invoice_id, description, max(description) 
FROM invoice_line_items GROUP BY invoice_id;
+----+------------+-------------+------------------+
| id | invoice_id | description | max(description) |
+----+------------+-------------+------------------+
|  1 |          1 | New socks   | New socks        |
|  3 |          2 | Shoes       | T shirt          |
|  5 |          3 | Tie         | Tie              |
+----+------------+-------------+------------------+
3 rows in set (0.00 sec)

mysql> SELECT GROUP_CONCAT(id), invoice_id, GROUP_CONCAT(description) 
FROM invoice_line_items GROUP BY invoice_id;
+------------------+------------+---------------------------+
| GROUP_CONCAT(id) | invoice_id | GROUP_CONCAT(description) |
+------------------+------------+---------------------------+
| 1,2              |          1 | New socks,A hat           |
| 3,4              |          2 | Shoes,T shirt             |
| 5                |          3 | Tie                       |
+------------------+------------+---------------------------+
3 rows in set (0.00 sec)

What is problematic about this behaviour is that it can hide bugs in application code. To highlight two blog posts:

Ike Walker @ Flite explains why he only uses full group by.
Keyur Govande @ Etsy strongly recommends STRICT_ALL_TABLES and ONLY_FULL_GROUP_BY to avoid data corruption.

The Proposal

We are proposing to change ONLY_FULL_GROUP_BY to be enabled by default as of MySQL 5.7. The downside in doing this, is that some users upgrading will have statements return errors. These users will be left with two options:

Set sql_mode in their my.cnf (or my.ini) file explicitly, or as part of their application code since sql_mode is configurable per session.
Modify the offending GROUP BY statements to be deterministic.

On the second point, we are also proposing introducing an ANY_VALUE() SQL function to allow statements that are ‘acceptable as non-deterministic’ to run easy retrofit in this less permissive configuration. For example:

mysql> SELECT ANY_VALUE(id) AS id, invoice_id, 
ANY_VALUE(description) AS description 
FROM invoice_line_items GROUP BY invoice_id;
+------+------------+-------------+
| id   | invoice_id | description |
+------+------------+-------------+
| 1    |          1 | New socks   |
| 3    |          2 | Shoes       |
| 5    |          3 | Tie         |
+------+------------+-------------+
3 rows in set (0.01 sec)

Why I personally like this proposal, is that I compare it to buying an undercooked burger in a North American restaurant. We are proposing to switch our policy from “we may serve it to you without you having realized you ordered it” to “sign a waiver, then we will serve it”.

i.e. We are not removing options, but want you to have to opt-in to what can be an unsafe choice, rather than opt-out.

Fans of standard SQL will rejoice at compatibility with SQL-2003, a standard defined after this behavior was first implemented.

Conclusion

As with other changes that have been proposed for MySQL 5.7, we are soliciting your feedback!

Have you accidentally returned incorrect data, and could have benefited from having ONLY_FULL_GROUP_BY enabled?
Do you maintain an application that relies on the non deterministic behaviour?
Do you agree or disagree that this change is the better default for users?

Please leave a comment, or get in touch!

Update: This proposal has gone ahead. MySQL 5.7 DMR5 has ONLY_FULL_GROUP_BY enabled by default!

Spring cleaning MySQL bugs

The MySQL team has been busy fixing bugs and sorting through older bugs. To give you a couple of recent stats:

Norvald blogged about 14 bugs fixed at the request of Linux Distributions.
Tomas Ulin’s keynote at Percona Live mentioned 1991 Bugs Fixed in 5.6, with 3763 Bugs Fixed in Total Since MySQL 5.5 GA 🙂

As part of our spring cleaning efforts, we have also decided to retire the bug status to be fixed later. That is to say that:

Some bugs that were marked as to be fixed later have actually been fixed. Keeping a status of items that won’t be fixed for now has proven difficult to diligently maintain as accurate.
We want to prevent a half-way state where historically a fix may not have been technically feasible but we did not always set user expectations appropriately. We have decided that in some of these cases it is better to ‘break the bad news early’ and set the status to won’t fix.
All remaining bugs that were to be fixed later will now be changed to be verified, and will be re-evaluated in priority on a regular interval moving forward.

If you are affected by a particular bug, I recommend clicking the “Affects Me” button.

Making strict sql_mode the default

MySQL has sometimes faced criticism for being too relaxed at allowing invalid values, or inserting but truncating values that are out of range. For example:

mysql> CREATE TABLE unsigned_int (a int unsigned);
Query OK, 0 rows affected (0.03 sec)

mysql> INSERT INTO unsigned_int (a) VALUES (-1);
Query OK, 1 row affected, 1 warning (0.01 sec)

mysql> SELECT * FROM unsigned_int;
+------+
| a    |
+------+
|    0 |
+------+
1 row in set (0.00 sec)

In MySQL 5.0 two strict sql_mode options were introduced to be able to change this behaviour:

STRICT_ALL_TABLES – Behave more like the SQL standard and produce errors when data is out of range.
STRICT_TRANS_TABLES – Behave more like the SQL standard and produce errors when data is out of range, but only on transactional storage engines like InnoDB.

However, these options were disabled by default.

Changes in MySQL 5.6

MySQL 5.6 made the first important step by setting STRICT_TRANS_TABLES “by default” for new installations of MySQL. That is to say that when using one of the MySQL installation packages, the included configuration file includes the line:

# Recommended in standard MySQL setup
sql_mode=NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES

Being in a configuration file only, it has given users an easy way to be able to revert to the old behaviour if their application relies on MySQL’s relaxed handling.

(Small Note: The included “default configuration file” could be /etc/my.cnf or BASEDIR/my.cnf which may end up being /usr/my.cnf).

Upcoming Changes in MySQL 5.7

In MySQL 5.7 we are planning to simplify the SQL modes, and potentially even introduce additional strictness to bring MySQL even closer to the SQL standard.

I have previously written about this topic here.

Proposed Changes in MySQL 5.x

We want to make STRICT_TRANS_TABLES the default for all installations. That is to say, whether or not you have specified an sql-mode in configuration, MySQL will reject invalid or out of range values unless you chose to set it otherwise.

However, we also realize that this is one of the changes that impact users the most, so we are soliciting feedback on how we can minimize the impact.

Do you think this change can be included in MySQL 5.7?
Should it wait until MySQL 5.8?
If you discovered that you needed to remove the SQL mode STRICT_TRANS_TABLES from your configuration file after installing MySQL 5.6, what did you think of the experience?

Please leave a comment or get in touch!

Proposal to change MySQL replication defaults

In the MySQL team @ Oracle, we’ve been discussing changing the default values for MySQL replication so that they are safer to use.

Specifically:

Setting	Current Value	Proposed New Value
sync_binlog	0	1
master-info-repository	FILE	TABLE
relay-log-info-repository	FILE	TABLE

I have written about all of these settings before in my post on deciding whether or not to make MySQL durable. The short version is that:

By default, if MySQL crashes you could lose data.
You don’t get to chose what you lose, it could be a $10 order, or a $10M order.
Independent of losing orders, DBA time is expensive. Having to perform post-failure investigation to handle discrepancies has a high cost.
The cost of durability is much lower with SSDs. In a comment, Daniël van Eeden also correctly said that the cost is much lower in 5.6 with group commit (thanks Daniël!).
I recommend most users should be running with MySQL as fully durable.

Now to explain each of these proposed changes:

Sync_binlog

By setting sync_binlog=1 you are ensuring that any changes written to the binary log will not be lost in the event of a crash. This is a problem because with the current default of sync_binlog=0, a replication master crashing may lose events and require all slaves to be re-imaged with a fresh copy of data to be consistent.

It’s important to note that changing this setting has a downside: syncing the binary log causes a performance impact. As I mentioned above, this was reduced significantly in MySQL 5.6 with group commit, but it will still be evident in some workloads. The most unfortunate may be in single-threaded performance on systems with hard-drives and no RAID controller with a battery backed write cache.

Master-info-repository/relay-log-info-repository

These two options were first introduced in MySQL 5.6 as part of a feature called transactional replication. When both set to TABLE, they store the internal replication coordinates in an InnoDB table, and COMMIT changes as part of the same transaction as the replication events being applied.

What this means in practical terms, is that slaves are in a consistent state when they crash. Replication can just resume processing events from the point of failure, and data does not need to be re-imaged from the master or another slave.

Conclusion

As with previous proposals, we are seeking feedback on how these changes will impact you.

Do you currently use sync_binlog and transactional replication?
Have you considered these features, but decided not to switch? (This is a case where we’d really love to hear from you).
Do you agree that these changes will make MySQL safer, and easier to administer?

Please leave a comment, or get in touch!