Saturday, August 20, 2016

A discussion about sync-master-info and other replication parameters

Some time ago, feedback was requested on new replication default after MySQL 5.7.  Some of the suggested default are:
I agree on the suggestions for relay-log-info-repository and relay-log-recovery: they are needed for crash safe replication, and having crash safe replication enabled by default is a good thing.  I have doubts about master-info-repository: I do not see what benefits are introduced by this change.  I have much bigger doubts about sync-master-info and sync-relay-log: those changes bring an illusion of safety, which is bad.  Let's dive in more details about those parameters.

I have already covered replication crash safety on the Booking.com blog.  I am happy that the suggested changes enable crash safe replication by default.  I would also like to see work done on improving relay-log recovery to reduce its side effects, but this is another discussion (more details about those side effects are given in the section "The problems with MySQL 5.6 Crash-safe replication" of a previous post).

About changing the default for master-info-repository from FILE to TABLE, I do not see what benefit this change is bringing.  Without a clear goal for that change, this looks like a "change for the sake of change", something that should be avoided.  I see good and bad in both values, so I would be tempted not to change it:
  • having the master-info in a file makes this information readable even if MySQL is stopped,
  • having the master-info in a table reduces the amount of external files needed by MySQL and hides the credential used to connect to the master.
One thing I think master-info-repository in a TABLE does not bring is more safety for replication.  It is a common misunderstanding that slipped even in the MySQL documentation (I filed bug #82667 about that).  Crash safe replication does not need master-info-repository in a TABLE.  However, there are no bad things introduced by setting master-info-repository to TABLE (and maybe some good things that I cannot think of), so I have no big disagreement with a change in default here.

I have bigger disagreements with the changes in default for sync-master-info and sync-relay-log.  The sync-master-info parameter controls after how many events the master-info data is updated (FILE or TABLE).  But as explained previously, the content of the relay-logs and of master-info are most of the time out of sync.  Lowering sync-master-info from 10,000 to 1,000 brings a false sense of security by being 10 times more in sync, but we are still 999 times out of 1,000 not in sync (before it was 9,999 times out or 10,000).  False impressions of security are bad !  The only change that I think would be good for this parameter is to set it to zero (0 = no sync): this way it becomes clear that master-info cannot be trusted.

Using the same logic as above, I think the only change that would be good for sync-relay-log is to set it to 0.  Another parameter has not been discussed (sync_relay_log_info), no change in default are suggested for this one, but the same logic applies to it.

Note: I could change my mind about sync-master-info and sync-relay-log if a better relay-log recovery was implemented (see section "Our wishes for MySQL Crash-safe replication" of a previous post for more information about a better relay-log recovery).

What do you think ?  Feel free to add a comment below.

1 comment: