Suggestions for transitioning into Strict SQL-Mode

As MySQL 5.6 now enables the SQL mode STRICT_TRANS_TABLES for new installations, I have been running into users who are upgrading from previous versions and are not quite sure how to test their applications for compatibility with this new default.

For some applications, converting warnings to errors presents an unknown-unknown, in that the operators of the system cannot easily detect what may break. Certainly many deployments will have QA environments, but there is always some fear that production queries are just a little bit different.

Since sql_mode is configurable on a per-session basis, there are some strategies that I have been recommending to ease transition:

  • Whitelist: Have all new application components enable strict mode by default. For example, if you are building a set of cron jobs to rebuild caches of data, have these scripts set the strict SQL mode as soon as they connect to MySQL, but initially leave existing applications without strict mode.
  • Blacklist: Modify existing application components that have not been extensively tested with strict mode to explicitly unset this SQL mode when connecting to MySQL. This is an important change, since all new components will then default to being strict.
  • Staged Rollout: Have the ability to turn strict SQL mode on or off on a per-user basis, with perhaps internal users or beta users being the first to have strict mode enabled. This offers a more gradual transition, where you can contain any errors to a small number of users. This strategy was suggested by @geodbz.
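As a sketch of how a new component might opt in (the whitelist approach), here is a minimal Python helper. The function name is mine; it appends STRICT_TRANS_TABLES to whatever sql_mode the session already has, so existing modes are preserved, and the result would then be applied with SET SESSION sql_mode = ... right after connecting:

```python
def with_strict_mode(current_sql_mode):
    """Return a sql_mode string with STRICT_TRANS_TABLES added,
    preserving any modes the session already has set."""
    modes = [m for m in current_sql_mode.split(",") if m]
    if "STRICT_TRANS_TABLES" not in modes:
        modes.append("STRICT_TRANS_TABLES")
    return ",".join(modes)

# A new cron job would SELECT @@SESSION.sql_mode, then execute:
#   SET SESSION sql_mode = '<result of with_strict_mode(...)>'
print(with_strict_mode("NO_ENGINE_SUBSTITUTION"))
# NO_ENGINE_SUBSTITUTION,STRICT_TRANS_TABLES
```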

That’s my list to date. I would love to hear if anyone has any suggestions on how to better manage the transition process!

SET GLOBAL sql_log_bin – We need your feedback!

Jeremy Cole recently blogged about the feature SET GLOBAL sql_log_bin. In his blog post, Jeremy suggested that there is no strong use-case for this feature, and that it is too easy to run accidentally (i.e. SET GLOBAL instead of SET [SESSION]). The result of this incorrect usage is that slaves will drift out of sync from masters.

We discussed Jeremy’s request in the MySQL team, and we agree that the syntax should produce an error in MySQL 5.7. However, we have not yet determined which steps should be taken for MySQL 5.5 and MySQL 5.6. The two proposals we would like to float are:

  1. The statement SET GLOBAL sql_log_bin=N produces a warning.
  2. The statement SET GLOBAL sql_log_bin=N produces an error.

Option #1 does not specifically solve Jeremy’s reported issue, because even though the command provides a warning, it will still execute. Thus, the data-drift from accidental usage will still occur; the operator will simply be aware of the problem sooner.

Option #2 does not follow our standard policy regarding functionality changes in GA releases. That is to say that we are discussing making a special exception since this behaviour is regarded as dangerous and almost certainly unintended.

We are seeking input from the community as to which is the better of these two options:

  • Which is your preference – option #1 or option #2?
  • Do you agree that this specific situation warrants a behaviour change in GA releases?

Please leave a comment, or get in touch!

MySQL 5.6.20 Community Release Notes

Thank you to the MySQL Community, on behalf of the MySQL team @ Oracle. Your bug reports, testcases and patches have helped create a better MySQL 5.6.20.

In particular:

  • Thank you to Jeremy Cole for reporting a case where InnoDB’s redo log could become corrupt with externally stored blobs. Jeremy also provided a testcase to demonstrate the problem. Bug #69477.
  • Thank you to Ronald Bradford for identifying that MySQL would continue to operate even though the binary log could not be written. Our team picked up Ronald’s blog post, and filed a bug in response. Bug #51014.
  • Thank you to Daniël van Eeden, Andrew Garner, Detlef Ulherr and Bjoern Boschman for reporting that mysql_install_db would always create a configuration file in /usr. We have added a new option --keep-my-cnf to preserve any existing configuration files, and not create a new file in /usr. Bug #71600, Bug #68318, Bug #68117, Bug #68416.
  • Thank you to David Newgas for reporting that loading InnoDB full-text search as a plugin would fail. David also provided a suggested patch. Bug #70178.
  • Thank you to Inaam Rana for reporting that the fix for a previous bug added superfluous buf_flush_list() logic to InnoDB startup code. Bug #70899.
  • Thank you to Stewart Smith for reporting that some InnoDB internal locks did not account for memory ordering differences on non-Intel platforms. While we had discovered this issue earlier in Bug #47213, Stewart’s report and sample patches helped considerably. Bug #72539, Bug #47213.
  • Thank you to Valeriy Kravchuk for reporting a case where queries on partitioned tables could return wrong results. Bug #71095.
  • Thank you to Domas Mituzas for reporting that mysqlbinlog’s --raw feature did not correctly handle errors caused by failed writes, leading to potential corruption. Bug #72597.
  • Thank you to Raolh Rao for reporting that updating or deleting a row on a master that did not exist on a slave can create problems using row-based replication. Bug #72085.
  • Thank you to Vlad Lesin for reporting that quotation marks were not always handled correctly by LOAD DATA INFILE when writing to the binary log. Bug #71603.
  • Thank you to Simon Mudd for reporting that manually specifying an AUTO_INCREMENT value could cause a replication error in certain conditions. Bug #70583.
  • Thank you to Julien Duponchelle for reporting that client applications could no longer set the BINLOG_DUMP_NON_BLOCK flag in the initial handshake to a MySQL 5.6 server. This functionality was removed by mistake and has now been restored. Bug #71178.
  • Thank you to Arthur O’Dwyer for reporting a wrong results bug when using MIN(), MAX() and an aggregate function. Bug #69833.
  • Thank you to Shlomi Noach and David Vaughan for reporting that a view defined on a UNION could create an invalid view definition. Bug #65388, Bug #72018.
  • Thank you to Lynn Garren for reporting that configuring with cmake -DWITHOUT_SERVER to build clients without the server failed for builds outside of the source tree. Bug #66000.
  • Thank you to Dan Kloke for reporting a case when a query using COUNT(DISTINCT) could return wrong results. Bug #52582.
  • Thank you to Laurynas Biveinis and Sergey Vojtovich for reporting a situation where a thread deadlock could occur. Bug #71236.
  • Thank you to Raghavendra Prabhu for reporting that MySQL did not compile with Bison 3. Bug #71250.
  • Thank you to Santosh Praneeth Banda for reporting that uninstalling and reinstalling semisynchronous replication plugins while semisynchronous replication was active caused replication failures. Bug #70391.
  • Thank you to Valeriy Kravchuk for reporting a situation when wrong results could be returned for GROUP BY queries. Bug #71097.
  • Thank you to Yoshinori Matsunobu for reporting that client auto-reconnect did not work for clients linked against libmysqlclient. Bug #70026.
  • Thank you to Daniël van Eeden for reporting that upgrades using RPM packages could change the ownership of an installation directory. Bug #71715.
  • Thank you to Valeriy Kravchuk for requesting that MySQL provide option to link against libCstd instead of stlport4 on Solaris. We agree, and have added this functionality. Bug #72352.
  • Thank you to Stewart Smith for reporting that various test-suite file permissions were not correct. Bug #71112, Bug #71113.

In addition, we would like to thank the MySQL Community for their feedback on our earlier proposal to deprecate mysqlhotcopy. Based on your feedback we have decided to officially deprecate mysqlhotcopy in 5.6.20, and it will be removed in MySQL 5.7.

Thank you again to all the community contributors listed above. If I missed a name here, please let me know!

– Morgan

The InnoDB Team is looking for your feedback!

Restarting production database servers with gigabytes of memory is difficult. It can lead to cold caches and other operational complexities.

The InnoDB team is looking to make improvements so that restarting MySQL is required as little as possible. Please help them out by answering which configuration settings you are most eager to see made dynamic.

Thanks!

Beyond the FRM: ideas for a native MySQL Data Dictionary

The frm file has provided long service since the earliest days of MySQL. Now, it is time to replace it with a native InnoDB-based Data Dictionary.

This is a change that has been on our wish list for a long time, as well as on the wish lists of others in the MySQL development community.

For historical context:

  • Every table in MySQL has at least a corresponding .frm file. For example in the case of MyISAM these three files constitute a complete backup:

    mytable.MYI
    mytable.MYD
    mytable.frm
    

  • The .frm file stores information such as column names and data types. It is a binary format which, as Stewart explains, predates MySQL and dates back to 1979 with UNIREG. In addition to the .frm file, there are .trn, .trg and .par files, which have been added over time to support triggers and partitioning.

Our motivation to develop a native dictionary stems from a number of issues with the current filesystem-based formats:

  1. The current .frm file predates MySQL’s support for transactions, and has a lot of complexity to handle various failure states in replication and crash recovery. For example: Bug#69444. Using a native data dictionary simplifies code and makes handling failure states very simple.

  2. Our information_schema implementation currently suffers from poor performance, and has been subject to years of complaints. By using a native dictionary, we will be able to implement information_schema as views over real tables, significantly improving the speed of queries.

  3. On a closely related point, we currently build information_schema on top of a series of differing filesystem properties, while attempting to provide the same cross-platform experience. The code to account for filesystem case insensitivity increases code complexity, and ties up developer resources that could be better spent elsewhere.

  4. Aside from the MySQL server’s data dictionary, storage engines may also store their own data dictionary. In the case of the InnoDB storage engine, this redundant storage has led to complexity in troubleshooting an out-of-sync data dictionary.

  5. The current non-conformity of the data dictionary (using .frm, .par, .trn and .trg files) stems from a lack of extensibility in the original .frm format. Not having a centralized, extensible repository makes it difficult to incorporate feature requests that require additional metadata to be stored, or to offer new relational objects in the future.

  6. The current format does not support versioning metadata in such a way that we can use it to assist in the upgrade experience between MySQL versions.

This change is of course in addition to my recent post about storing system tables in InnoDB.

While this change will be transparent for many of our users, we are inviting feedback from the community. Please let us know:

  • If you have a use-case where you interact with the file-based formats directly.
  • What features you want to see in a native data dictionary!

You can either leave a comment, or email me.

MySQL 5.6.19 Community Release Notes

Thank you to the MySQL Community, on behalf of the MySQL team @ Oracle. Your bug reports, testcases and patches have helped create a better MySQL 5.6.19.

In particular:

  • Thank you to Daniël van Eeden for reporting that the utility charset2html was unmaintained and obsolete. We followed Daniël’s suggestion and removed it from MySQL distributions. Bug #71897.
  • Thank you to Dario Kampkaspar for reporting that upgrading from 5.6.10 to a more recent 5.6 release would fail. Bug #72079.
  • Thank you to Fangxin Flou for reporting that InnoDB would call memset too often. Fangxin also provided a suggested patch to fix the issue. Bug #71014.
  • Thank you to Daniël van Eeden for reporting an assertion when enabling the innodb_table_monitor. Bug #69641.
  • Thank you to Domas Mituzas for reporting an off by one error in the handling of innodb_max_dirty_pages_pct. Thanks also to Valeriy Kravchuk for suggesting that the value be a float, and Mark Callaghan for providing a sample patch. Bug #62534.
  • Thank you to Guangpu Feng for reporting an issue where an aborted TRUNCATE TABLE command could still be written to the binary log. Guangpu also provided detailed code-analysis, which helped with our investigation. Bug #71070.
  • Thank you to Justin Swanhart for reporting an issue where the server did not always correctly handle file permissions on the auto.cnf file. Bug #70891.
  • Thank you to Frédéric Condette for reporting an issue when using replication with GTIDs and replicate-ignore-db. Bug #71376.
  • Thank you to Michael Gmelin for reporting compilation errors when building in C++11 mode. Michael also provided a patch with a suggested fix. Bug #66803.
  • Thank you to Hartmut Holzgraefe for reporting that CMake produced not-useful warnings. Bug #71089.
  • Thank you to John Curtusan for reporting that on Windows REPAIR TABLE and OPTIMIZE TABLE failed on MyISAM tables with .MYD files larger than 4GB. Bug #69683.
  • Thank you to Elena Stepanova for reporting an issue that could break row-based replication. Elena also managed to reduce the issue down to a simple test case. Bug #71179.
  • Thank you to Yu Hongyu for reporting that an UPDATE statement could fail to update all applicable rows. Bug #69684.
  • Thank you to Tom Lane for his comments on why the test “outfile_loaddata” could be failing. His suggestion proved to be correct, and helped tremendously in tracking down the root cause. Bug #46895.

In Addition:

Thank you again to all the community contributors listed above. If I missed a name here, please let me know!

Please also note:
There was no MySQL Community Server 5.6.18. That version number was used for an out-of-schedule release of the Enterprise Edition to address the OpenSSL “Heartbleed” issue. This issue did not affect the Community Edition because it uses yaSSL, not OpenSSL, so a new release of the Community Server was not needed, and 5.6.17 is followed by 5.6.19.

– Morgan

Semi-sync replication is not slow!

If you read Yoshinori’s post about Semi-sync at Facebook, he lists the objective of using semi-sync as an alternative to running full durability on a master. That is to say that once you can guarantee writes have safely been shipped across the network, you may not strictly need to guarantee that they are safe locally.

This is something that I have been wanting to benchmark for a long time, and reading Jay’s post about Semi-sync replication performance in MySQL 5.7 DMR4 and a conversation in last week’s #dbhangops inspired me to explore this in more detail. For my tests, I will be using Master-Slave replication and three alternative definitions of durability:

  • Local Durability. By ensuring that sync_binlog=1 and innodb_flush_log_at_trx_commit=1, any changes that are committed locally will be crash-safe (provided that the underlying hardware does not lie). A slave will be connected, but using asynchronous replication.
  • Network Durability. By enabling semi-sync replication, each change will be guaranteed on more than one machine. However, each machine will take a relaxed view of how safe the changes are locally (sync_binlog=0, innodb_flush_log_at_trx_commit=0). This is the MySQL Cluster definition of durability, and also that of some NoSQL systems. It requires that each replica has physical isolation (in the case of AWS: an availability zone).
  • Local + Network Durability. This is a strict definition where both local and network durability are employed. While it might seem excessively strict, what I like about this option is that it has the potential to reduce administrative costs during failure events.
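As a rough my.cnf sketch of the first two profiles (for the second, this assumes the semi-sync master plugin is already installed on the master):

```ini
# Local durability: fsync the binary log and the InnoDB redo log
# on every commit.
sync_binlog = 1
innodb_flush_log_at_trx_commit = 1

# Network durability: rely on the semi-sync acknowledgement from a
# replica instead of local fsyncs.
rpl_semi_sync_master_enabled = 1
sync_binlog = 0
innodb_flush_log_at_trx_commit = 0
```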

Testing

Basic setup details:

  • AWS EC2 m3.medium instances
  • Master in us-east-1b, slave in us-east-1d
  • Latest Amazon Linux AMI
  • MySQL 5.7 DMR4
  • Datadir stored on 40GB Provisioned IOPS EBS volume (900 IOPS)
  • sysbench 0.5 update_index.lua test

I measured both the network latency (ping) and disk latency (dd bs=512 count=1000 if=/dev/zero of=testfile oflag=dsync) prior to starting the test. Interestingly, both were the same at about 2ms.

My first test was to compare single threaded performance:

[Chart: single-threaded sysbench results]

I would say that the results are somewhat as expected:

  • Performance when not durable is awesome! (but probably not useful)
  • Local durability is quite expensive in single-threaded workloads. There is not much opportunity for group commit, and the 2ms latency I have for an fsync hurts, since both the InnoDB redo logs and the binary logs must be written safely.
  • Network durability actually performs better than local durability.
  • The combination of Local+Network durability is worse than network only or local only.

Now to repeat the same test, but in 32-threads:

[Chart: 32-thread sysbench results]

What is interesting to note is that:

  • No durability still performs the best, but by a much smaller margin.
  • The improved group-commit in MySQL 5.6+ is really helping the local durability throughput increase.
  • Network durability is marginally ahead of local durability.
  • Network+Local is about 15% worse than just local durability.

Comparing Local versus Network Durability

So network durability scored better than local in both tests!?

I think at this point it’s important to explain the difference between local and network durability in the event of a crash, using the tools currently available. If we are using asynchronous replication with a durable local master and crash-safe slaves, then resuming replication is actually possible in the event of a crash. This is not true in the case of network durability, as writes could be lost on the failing machine. This is an important loss of functionality that is difficult to justify.
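As an aside, the crash-safe slave behaviour mentioned here is configured in MySQL 5.6 with settings along these lines:

```ini
# Store the replication position transactionally inside InnoDB,
# and recover the relay log automatically after a crash.
relay_log_info_repository = TABLE
relay_log_recovery = ON
```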

I would also point out that I think it is easier to hardware-accelerate local durability than it is network durability. The local fsync speed can be improved with an SSD or a RAID Controller with a battery-backed write cache. Network latency can be lowered too, but it might be harder to do so while keeping physical isolation.

Conclusion

For the reasons I listed, I still think local durability is more useful when choosing just one option for durability. But at the same time, semi-sync is no slouch, and it’s great to see new developments in this area in MySQL 5.7.

What I did not test, is configuring MySQL to be mostly-durable locally, and fully network durable. There is a lot of potential to meet somewhere in the middle and very selectively disable features such as the InnoDB doublewrite buffer.

I do not think it is fair to label semi-sync as “slow”, and I am hoping to convince more users that they should take a look 🙂

Build a MySQL Fabric Farm in one step using AWS CloudFormation

I have been building a CloudFormation template for MySQL Fabric as an experiment to kick the tyres and further my understanding of how it all works.

For those not familiar, CloudFormation is a way of representing a collection of Amazon Cloud resources (a “stack”) in a static JSON file (a “template”). In my previous company, we managed our AWS account almost exclusively via CloudFormation, and there are two key advantages I see with managing resources via templates:

  • It allows for all environments to be synchronized (dev/qa/production).
  • It allows for very fast disaster recovery (you can very quickly pick up and restore in another region).

CloudFormation is also a great fit for MySQL Fabric, since it can remove many of the essential bootstrap steps that come with building a distributed system. It took a bit of work, but I managed to build a template that brings Fabric installation down to three simple questions.

Here are some of the interesting parts:

  • I am using the latest Amazon Linux AMI + the official MySQL yum repositories.
  • The fabric HA group GLOBAL1 is created in an AutoScalingGroup.
  • When the instance in the AutoScalingGroup starts up, it talks back to the fabric backing store to add itself to the HA group.
  • If the AutoScalingGroup is expanded to 2, a new instance will talk to the fabric backing store, find a peer in the same group and create a backup. Once the backup is restored on the new server, it will add itself to the Fabric HA group.

The less interesting part of course is that this is a proof-of-concept. It needs a bit of work to improve error checking etc. You can find the resulting code in my mysql-fabric-scripts project. The file stack.json can be uploaded directly into the Amazon Web Console.

How important is it to use 2-byte and 3-byte integers?

One interesting feature of MySQL is that it supports a very large number of integer data types. From the MySQL manual:

Type       Storage (Bytes)   Minimum Value (Signed/Unsigned)   Maximum Value (Signed/Unsigned)
TINYINT    1                 -128 / 0                          127 / 255
SMALLINT   2                 -32768 / 0                        32767 / 65535
MEDIUMINT  3                 -8388608 / 0                      8388607 / 16777215
INT        4                 -2147483648 / 0                   2147483647 / 4294967295
BIGINT     8                 -9223372036854775808 / 0          9223372036854775807 / 18446744073709551615
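The ranges in the table follow directly from the storage size. As a quick sketch (the helper name is mine):

```python
def int_range(storage_bytes, signed=True):
    """(min, max) for a MySQL integer type stored in the given bytes."""
    bits = 8 * storage_bytes
    if signed:
        return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    return 0, 2 ** bits - 1

assert int_range(2) == (-32768, 32767)              # SMALLINT
assert int_range(3, signed=False) == (0, 16777215)  # MEDIUMINT UNSIGNED
assert int_range(8)[1] == 9223372036854775807       # BIGINT
```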

I have previously written on how important it is to use the correct datatype in MySQL, but I find that between integer types I don’t always follow my own advice. I predominantly use TINYINT, INT, and BIGINT, usually with the default of SIGNED rather than UNSIGNED.

Am I lazy? I think it is best to run through examples…

Primary Key Integer

In this example, let’s say that we know we will never have more than 65535 customers, so we have the choice of using a SMALLINT UNSIGNED at 2 bytes, or, lazily, an INT, which will accommodate 2147483647 customers at 4 bytes:

Option #1:

CREATE TABLE customers (
  id SMALLINT UNSIGNED PRIMARY KEY auto_increment,
  company_name VARCHAR(20),
  ..
  ..
  last_order_date DATETIME
);

Option #2:

CREATE TABLE customers (
  id INT PRIMARY KEY auto_increment,
  company_name VARCHAR(20),
  ..
  ..
  last_order_date DATETIME
);

I will project that there are 50000 rows in the table, which is about as close as you would want to get to the maximum possible range of a SMALLINT. Using SMALLINT provides a 2 byte saving per row:

2 bytes * 50000 rows = 100000 bytes = a saving of ~97KB

In InnoDB the primary key is included in every other index on the table, so technically we will save an additional ~97KB per-index, but I would say that this saving is too small to matter for most cases.
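For reference, the arithmetic behind that figure:

```python
rows = 50000
bytes_saved = rows * (4 - 2)           # INT is 4 bytes, SMALLINT is 2
print(bytes_saved)                     # 100000
print(round(bytes_saved / 1024.0, 1))  # 97.7 (KB)
```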

Attribute on a large table

Option #1:

CREATE TABLE page_views (
  id BIGINT PRIMARY KEY auto_increment,
  site_id SMALLINT UNSIGNED NOT NULL,
  ..
  ..
);

Option #2:

CREATE TABLE page_views (
  id BIGINT PRIMARY KEY auto_increment,
  site_id INT NOT NULL,
  ..
  ..
);

In this example, I will say that there are 1 billion rows in the page_views table:

1 billion * 2 bytes saving per row = 1.8GB

So the choice of using a SMALLINT over an INT has more measurable savings in this case. On a per-column basis (for unindexed columns), I could be saving around 2GB. So with five SMALLINT columns in the same table, I could save almost 10GB.

Is 10GB worth it? It is perhaps easier to answer this question when expressed as a percentage saving for a table. To use some optimistic overhead numbers (that assume very little fragmentation) for InnoDB:

  • ~13 bytes per row (MVCC features)
  • ~1KB per 16K page (page fill factor)

I’ll look at two table definitions. One with the best case of all SMALLINT columns in place of INT, and one with a VARCHAR column that is assumed to be 50 bytes on average:

Option #1:

CREATE TABLE page_views (
  id BIGINT PRIMARY KEY auto_increment,
  site_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  page_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  category_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  region_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  time_id SMALLINT UNSIGNED NOT NULL /* or INT */
);

Option #2:

CREATE TABLE page_views (
  id BIGINT PRIMARY KEY auto_increment,
  site_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  page_uri VARCHAR(2048) NOT NULL, /* assume average length of 50 bytes */
  category_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  region_id SMALLINT UNSIGNED NOT NULL, /* or INT */
  time_id SMALLINT UNSIGNED NOT NULL /* or INT */
);
# INT/SMALLINT only:
(13B (overhead) + 8B (pkey) + 5*2) * 1.06 * 1 billion = 30.60 GB
(13B (overhead) + 8B (pkey) + 5*4) * 1.06 * 1 billion = 40.47 GB

# One VARCHAR column:
(13B (overhead) + 8B (pkey) + 4*2 + 50 (varchar)) * 1.06 * 1 billion = 77.9GB
(13B (overhead) + 8B (pkey) + 4*4 + 50 (varchar)) * 1.06 * 1 billion = 85.9GB

So in the INT/SMALLINT table there is about a 25% saving by using smaller data types, and about a 10% saving in the case of the single VARCHAR column.
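A quick sketch to reproduce the estimates above (note the GB figures are computed as GiB, i.e. 1024^3 bytes; the constant names are mine):

```python
GIB = 1024.0 ** 3
ROWS = 10 ** 9        # 1 billion rows
ROW_OVERHEAD = 13     # ~13 bytes per row (MVCC features)
PAGE_FILL = 1.06      # ~1KB lost per 16KB page
PKEY = 8              # BIGINT primary key

def table_size_gib(attribute_bytes):
    """Estimated on-disk size of the page_views table."""
    return (ROW_OVERHEAD + PKEY + attribute_bytes) * PAGE_FILL * ROWS / GIB

smallints = table_size_gib(5 * 2)            # ~30.6 GB
ints = table_size_gib(5 * 4)                 # ~40.5 GB
print(round(1 - smallints / ints, 3))        # ~25% saving

varchar_small = table_size_gib(4 * 2 + 50)   # ~77.9 GB
varchar_int = table_size_gib(4 * 4 + 50)     # ~85.9 GB
print(round(1 - varchar_small / varchar_int, 3))  # ~10% saving
```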

I do think that 25% is worth caring about, so maybe in this case I could follow my own advice a little closer 🙂 However, I would also consider this a close to best-case situation that I have used to illustrate a point. I would expect typical space savings to be less than 10%.

One other point to consider is that changing the datatype of a column is not supported by Online DDL. It is important not to sub-optimize data type choices, and then have to make changes later.

Did I miss anything? Please leave a comment.

Using MySQL Sandbox to setup a MySQL Fabric cluster in development

With Fabric’s official GA release this week, I thought I would post a spin on how to set up a development environment loosely based on the Fabric Quick Start guide in the manual.

The notable change, is the use of MySQL Sandbox for bootstrapping each of the MySQL instances.

Step 1: Install the MySQL Utilities + Python Connector

In my case, I downloaded:

The GUI install in both cases works as expected. Next, Next, Done!

Step 2: Install MySQL Sandbox

MySQL Sandbox is available via Launchpad or CPAN. There isn’t a major difference between the two, but I happened to choose CPAN.

The advantage of installing Sandbox over a utility such as mysqld_multi is that it’s much easier to set up and tear down replication.

Step 3: Download a 5.6 tarball

In my case, I downloaded mysql-5.6.17-osx10.7-x86.tar.gz Mac OS X 10.7 (x86, 32-bit), Compressed TAR Archive. On Mac OS X it is useful to use Google Chrome rather than Safari, as you do not want the file automatically decompressed.

Important: Download only! MySQL Sandbox does the install in the next step.

Step 4: Setup the Fabric Backing Store

The first copy of MySQL to setup is the MySQL Fabric backing store. This will house the fabric schema which stores configuration data. Here is the setup line I used:

$ make_sandbox mysql-5.6.17-osx10.7-x86.tar.gz
unpacking /Users/morgo/Downloads/mysql-5.6.17-osx10.7-x86.tar.gz
Executing low_level_make_sandbox --basedir=/Users/morgo/Downloads/5.6.17 \
    --sandbox_directory=msb_5_6_17 \
    --install_version=5.6 \
    --sandbox_port=5617 \
    --no_ver_after_name \
    --my_clause=log-error=msandbox.err
    The MySQL Sandbox,  version 3.0.43
    (C) 2006-2013 Giuseppe Maxia
installing with the following parameters:
upper_directory                = /Users/morgo/sandboxes
sandbox_directory              = msb_5_6_17
sandbox_port                   = 5617
check_port                     =
no_check_port                  =
datadir_from                   = script
install_version                = 5.6
basedir                        = /Users/morgo/Downloads/5.6.17
tmpdir                         =
my_file                        =
operating_system_user          = morgo
db_user                        = msandbox
remote_access                  = 127.%
bind_address                   = 127.0.0.1
ro_user                        = msandbox_ro
rw_user                        = msandbox_rw
repl_user                      = rsandbox
db_password                    = msandbox
repl_password                  = rsandbox
my_clause                      = log-error=msandbox.err
master                         =
slaveof                        =
high_performance               =
prompt_prefix                  = mysql
prompt_body                    =  [\h] {\u} (\d) >
force                          =
no_ver_after_name              = 1
verbose                        =
load_grants                    = 1
no_load_grants                 =
no_run                         =
no_show                        =
do you agree? ([Y],n) Y
loading grants
.. sandbox server started
Your sandbox server was installed in $HOME/sandboxes/msb_5_6_17

Using a configuration similar to the MySQL manual, with the password simply being secret, I saved the following to /etc/mysql/fabric.cfg:

[DEFAULT]
prefix =
sysconfdir = /etc
logdir = /var/log

[storage]
address = localhost:5617
user = fabric
password = secret
database = fabric
auth_plugin = mysql_native_password
connection_timeout = 6
connection_attempts = 6
connection_delay = 1

[servers]
user = fabric
password = secret

[protocol.xmlrpc]
address = localhost:32274
threads = 5
user = admin
password = secret
disable_authentication = no
realm = MySQL Fabric
ssl_ca =
ssl_cert =
ssl_key =

[executor]
executors = 5

[logging]
level = INFO
url = file:///var/log/fabric.log

[sharding]
mysqldump_program = /usr/bin/mysqldump
mysqlclient_program = /usr/bin/mysql

[statistics]
prune_time = 3600

[failure_tracking]
notifications = 300
notification_clients = 50
notification_interval = 60
failover_interval = 0
detections = 3
detection_interval = 6
detection_timeout = 1
prune_time = 3600

[connector]
ttl = 1

[client]
password =

I also need to create the user in MySQL for Fabric to connect as:

$ $HOME/sandboxes/msb_5_6_17/use -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.6.17 MySQL Community Server (GPL)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql [localhost] {root} ((none)) > CREATE USER 'fabric'@'localhost' IDENTIFIED BY 'secret';
Query OK, 0 rows affected (0.01 sec)

mysql [localhost] {root} ((none)) > GRANT ALL ON fabric.* TO 'fabric'@'localhost';
Query OK, 0 rows affected (0.00 sec)

Fabric can then install the system tables, and be started with:

$ mysqlfabric manage setup
[INFO] 1401222720.664261 - MainThread - Initializing persister: user (fabric), server (localhost:5617), database (fabric).
[INFO] 1401222721.245034 - MainThread - Initial password for admin/xmlrpc set
Password set for admin/xmlrpc from configuration file.
[INFO] 1401222721.247700 - MainThread - Password set for admin/xmlrpc from configuration file.
$ mysqlfabric manage start
[INFO] 1401222753.862997 - MainThread - Initializing persister: user (fabric), server (localhost:5617), database (fabric).
[INFO] 1401222753.867357 - MainThread - Loading Services.
[INFO] 1401222753.902501 - MainThread - Fabric node starting.
[INFO] 1401222753.910188 - MainThread - Starting Executor.
[INFO] 1401222753.910300 - MainThread - Setting 5 executor(s).
[INFO] 1401222753.910602 - Executor-0 - Started.
[INFO] 1401222753.911354 - Executor-1 - Started.
[INFO] 1401222753.912450 - Executor-2 - Started.
[INFO] 1401222753.913185 - Executor-3 - Started.
[INFO] 1401222753.914456 - Executor-4 - Started.
[INFO] 1401222753.914701 - MainThread - Executor started.
[INFO] 1401222753.927123 - MainThread - Starting failure detector.
[INFO] 1401222753.928365 - XML-RPC-Server - XML-RPC protocol server ('127.0.0.1', 32274) started.
[INFO] 1401222753.928731 - XML-RPC-Server - Setting 5 XML-RPC session(s).
[INFO] 1401222753.928991 - XML-RPC-Session-0 - Started XML-RPC-Session.
[INFO] 1401222753.929455 - XML-RPC-Session-1 - Started XML-RPC-Session.
[INFO] 1401222753.930168 - XML-RPC-Session-2 - Started XML-RPC-Session.
[INFO] 1401222753.930747 - XML-RPC-Session-3 - Started XML-RPC-Session.
[INFO] 1401222753.931320 - XML-RPC-Session-4 - Started XML-RPC-Session.
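The XML-RPC listener started above (127.0.0.1:32274) is the same endpoint the mysqlfabric CLI talks to, so Fabric can also be driven from scripts. A minimal Python sketch — the dotted method name mirrors the CLI subcommand layout and is an assumption, as are the admin credentials:

```python
import xmlrpc.client  # xmlrpclib on the Python 2 installs Fabric shipped with

# Endpoint taken from the startup log above; admin user/password are assumptions.
proxy = xmlrpc.client.ServerProxy("http://admin:secret@127.0.0.1:32274")

# ServerProxy builds method stubs lazily, so no connection is made until a call:
# result = proxy.group.lookup_servers("GLOBAL1")
```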

Step 5: Build out an HA group

The next step is to bootstrap a three-node replication sandbox and then add it to Fabric as an HA group.

First with MySQL Sandbox:

$ make_replication_sandbox mysql-5.6.17-osx10.7-x86.tar.gz
installing and starting master
installing slave 1
installing slave 2
starting slave 1
.. sandbox server started
starting slave 2
.. sandbox server started
initializing slave 1
initializing slave 2
replication directory installed in $HOME/sandboxes/rsandbox_mysql-5_6_17
$ $HOME/sandboxes/rsandbox_mysql-5_6_17/m -uroot
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 5
Server version: 5.6.17-log MySQL Community Server (GPL)

Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

master [localhost] {root} ((none)) > GRANT ALL ON *.* TO 'fabric'@'localhost' IDENTIFIED BY 'secret';
Query OK, 0 rows affected (0.01 sec)

master [localhost] {root} ((none)) > SHOW GLOBAL VARIABLES LIKE 'port';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| port          | 20082 |
+---------------+-------+
1 row in set (0.00 sec)

master [localhost] {root} ((none)) > exit
Bye

Then with MySQL Fabric:

$ mysqlfabric group create GLOBAL1
Procedure :
{ uuid        = 23593cf1-c516-46ef-beaa-516b333353ca,
  finished    = True,
  success     = True,
  return      = True,
  activities  =
}
$ mysqlfabric group add GLOBAL1 localhost:20082
Procedure :
{ uuid        = e72decdf-6524-45b4-9eb6-6d757c0daf3a,
  finished    = True,
  success     = False,
  return      = ServerError: Server (e208ca4a-e5df-11e3-9dd5-4cd7500a4330) does not have the binary log or gtid enabled.,
  activities  =
}

An error! What this error is saying is that Fabric is designed specifically to take advantage of Replication with GTIDs, which allows for smoother failover when the topology changes. Thankfully, MySQL Sandbox has an easy way to switch to GTIDs:

$ $HOME/sandboxes/rsandbox_mysql-5_6_17/enable_gtid
# option 'master-info-repository=table' added to master configuration file
# option 'relay-log-info-repository=table' added to master configuration file
# option 'gtid_mode=ON' added to master configuration file
# option 'log-slave-updates' added to master configuration file
# option 'enforce-gtid-consistency' added to master configuration file
# option 'master-info-repository=table' added to node1 configuration file
# option 'relay-log-info-repository=table' added to node1 configuration file
# option 'gtid_mode=ON' added to node1 configuration file
# option 'log-slave-updates' added to node1 configuration file
# option 'enforce-gtid-consistency' added to node1 configuration file
# option 'master-info-repository=table' added to node2 configuration file
# option 'relay-log-info-repository=table' added to node2 configuration file
# option 'gtid_mode=ON' added to node2 configuration file
# option 'log-slave-updates' added to node2 configuration file
# option 'enforce-gtid-consistency' added to node2 configuration file
# executing "stop" on /Users/morgo/sandboxes/rsandbox_mysql-5_6_17
executing "stop" on slave 1
executing "stop" on slave 2
executing "stop" on master
# executing "start" on /Users/morgo/sandboxes/rsandbox_mysql-5_6_17
executing "start" on master
. sandbox server started
executing "start" on slave 1
. sandbox server started
executing "start" on slave 2
. sandbox server started

With this change, adding servers to Fabric will now work:

$ mysqlfabric group add GLOBAL1 localhost:20082
Procedure :
{ uuid        = c9da0536-0f3f-4d75-9f29-1ac6c12ff686,
  finished    = True,
  success     = True,
  return      = True,
  activities  =
}
$ mysqlfabric group add GLOBAL1 localhost:20083
Procedure :
{ uuid        = 92b42645-311f-423b-86b4-3a56870db244,
  finished    = True,
  success     = True,
  return      = True,
  activities  =
}
$ mysqlfabric group add GLOBAL1 localhost:20084
Procedure :
{ uuid        = 1f16a586-4a60-4b7d-a339-be3830614984,
  finished    = True,
  success     = True,
  return      = True,
  activities  =
}

We can list the available group commands with:

$ mysqlfabric group

Commands available in group 'group' are:
    group activate group_id  [--synchronous]
    group description group_id  [--description=NONE] [--synchronous]
    group deactivate group_id  [--synchronous]
    group create group_id  [--description=NONE] [--synchronous]
    group remove group_id server_id  [--synchronous]
    group add group_id address  [--timeout=5] [--update_only] [--synchronous]
    group health group_id
    group lookup_servers group_id  [--server_id=NONE] [--status=NONE] [--mode=NONE]
    group destroy group_id  [--force] [--synchronous]
    group demote group_id  [--update_only] [--synchronous]
    group promote group_id  [--slave_id=NONE] [--update_only] [--synchronous]
    group lookup_groups  [--group_id=NONE]

The most useful of these is listing the servers in our group GLOBAL1:

$ mysqlfabric group lookup_servers GLOBAL1
Command :
{ success     = True
  return      = [{'status': 'SECONDARY', 'server_uuid': 'e208ca4a-e5df-11e3-9dd5-4cd7500a4330', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': 'localhost:20082'}, {'status': 'SECONDARY', 'server_uuid': 'e931694e-e5df-11e3-9dd5-4cd64dfe115b', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': 'localhost:20083'}, {'status': 'SECONDARY', 'server_uuid': 'ea61dfba-e5df-11e3-9dd5-4bd23bb2cbe3', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': 'localhost:20084'}]
  activities  =
}

The UUID in use here is the same one used for Replication with GTIDs. To promote the first server to master:

$ mysqlfabric group promote GLOBAL1 --slave_id=e208ca4a-e5df-11e3-9dd5-4cd7500a4330
Procedure :
{ uuid        = be791f70-be3e-4ee9-9872-e1cedce36791,
  finished    = True,
  success     = True,
  return      = True,
  activities  =
}
$ mysqlfabric group lookup_servers GLOBAL1
Command :
{ success     = True
  return      = [{'status': 'PRIMARY', 'server_uuid': 'e208ca4a-e5df-11e3-9dd5-4cd7500a4330', 'mode': 'READ_WRITE', 'weight': 1.0, 'address': 'localhost:20082'}, {'status': 'SECONDARY', 'server_uuid': 'e931694e-e5df-11e3-9dd5-4cd64dfe115b', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': 'localhost:20083'}, {'status': 'SECONDARY', 'server_uuid': 'ea61dfba-e5df-11e3-9dd5-4bd23bb2cbe3', 'mode': 'READ_ONLY', 'weight': 1.0, 'address': 'localhost:20084'}]
  activities  =
}
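Client-side code usually just needs the PRIMARY's address out of that structure. Treating the returned list of server dicts as plain Python data (field names taken from the output above):

```python
def find_primary(servers):
    """Return the address of the PRIMARY server, or None if the group has none."""
    for s in servers:
        if s['status'] == 'PRIMARY':
            return s['address']
    return None

# Trimmed-down version of the lookup_servers return shown above:
servers = [
    {'status': 'PRIMARY',   'mode': 'READ_WRITE', 'address': 'localhost:20082'},
    {'status': 'SECONDARY', 'mode': 'READ_ONLY',  'address': 'localhost:20083'},
    {'status': 'SECONDARY', 'mode': 'READ_ONLY',  'address': 'localhost:20084'},
]
print(find_primary(servers))  # localhost:20082
```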

Next steps

This how-to can easily be extended to set up additional HA groups for shards – MySQL Sandbox really simplifies the process by bootstrapping replication.

One caveat to be aware of: each running copy of mysqld can consume quite a bit of RAM. It is possible to tune this down using the steps I mentioned in an earlier post.
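For reference, the kind of settings involved look like the fragment below — the values are illustrative assumptions for throwaway sandbox instances, not tuned recommendations:

```ini
# my.cnf fragment to shrink a test-only mysqld's footprint (illustrative values)
[mysqld]
innodb_buffer_pool_size = 32M
performance_schema      = OFF
```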