Saturday, October 17, 2015

Binlog Servers for Simplifying Point in Time Recovery

A common way to implement point in time recovery capability is:
  1. to regularly do a full backup of a database,
  2. and to save the binary logs of that database (or from its master if doing backups on a slave).
When point in time recovery is required you need to:
  1. restore a backup,
  2. and apply the binary logs up to the point of recovery.
(Step 2 of each list above is the one that will be simplified by using Binlog Servers.)

There are many ways of doing a backup of a database:
  1. mysqldump (or mysqlpump) on the master: no need for slaves but taking a dump of a large database without blocking writes and with a consistent binlog position is not trivial,
  2. XtraBackup on the master: no need for slaves, but we only have a physical backup,
  3. mysqldump (or mysqlpump) on a slave: needs extra hardware but very simple and we now have a logical backup:
    • stop slave sql_thread;
    • do a backup
    • save show slave status\G output
    • save show master status\G output
    • start slave sql_thread;
  4. XtraBackup on the slave: this could be useful but I do not have an example in mind.
but saving the binary logs is tricky:
  • you can use a script to copy the binary logs (many corner cases to take into account),
  • or you can use mysqlbinlog to stream the binary logs directly from a MySQL master (you will still need a wrapper script to restart mysqlbinlog in the case of disconnection from the master).
and applying the binary logs is complicated:
  • you need to pipe the output of a correctly executed mysqlbinlog to a mysql client (good luck in managing all error cases),
  • or you need to copy the binary logs to your restored database and use CHANGE MASTER TO with RELAY_LOG_FILE and RELAY_LOG_POS (I have never tried this myself, and I am sure there are some pitfalls).
Or you can use a Binlog Server.
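
To give an idea of what the mysqlbinlog wrapper mentioned above involves, here is a minimal sketch in Python. It assumes mysqlbinlog 5.6 or later (for --read-from-remote-server, --raw and --stop-never) and credentials coming from an option file. The function names, the host, the user and the binlog base name are hypothetical. On restart it resumes from the last file found on disk, which downloads that file again from its beginning. This is only one of the corner cases, not a complete solution.

```python
import os
import re
import subprocess
import time

# Binary log files are named <basename>.<sequence>, e.g. binlog.000042.
BINLOG_NAME = re.compile(r"^(?P<base>.+)\.(?P<seq>\d{6,})$")


def last_binlog(directory, basename):
    """Return the highest-numbered binlog file name in directory, or None."""
    found = []
    for name in os.listdir(directory):
        match = BINLOG_NAME.match(name)
        if match and match.group("base") == basename:
            found.append((int(match.group("seq")), name))
    return max(found)[1] if found else None


def mysqlbinlog_command(host, user, start_file, directory):
    """Build the mysqlbinlog command line that streams raw binlogs from a master."""
    return [
        "mysqlbinlog",
        "--read-from-remote-server",
        "--raw",
        "--stop-never",
        "--host=" + host,
        "--user=" + user,
        # With --raw, --result-file is a prefix; the trailing separator makes it a directory.
        "--result-file=" + os.path.join(directory, ""),
        start_file,
    ]


def stream_forever(host, user, first_file, directory, basename, retry_delay=5):
    """Restart mysqlbinlog each time it exits (for example after a disconnection)."""
    while True:
        # Resume from the newest file already on disk (it will be fetched again in full).
        start = last_binlog(directory, basename) or first_file
        subprocess.call(mysqlbinlog_command(host, user, start, directory))
        time.sleep(retry_delay)
```

Calling stream_forever("master.example", "repl", "binlog.000001", "/backup/binlogs", "binlog") would keep a copy of the binary logs up to date, but monitoring, purging and detecting a master that has lost binary logs are still left to you.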

Binlog Servers download and store an exact copy of the binary logs of a master and serve them to slaves:
  • from the point of view of the master, a Binlog Server is a standard slave,
  • from the point of view of the slave, the Binlog Server is an exact copy of the master.
You can find more information about Binlog Servers and their use-cases on the Booking.com dev blog.
So, Binlog Servers allow keeping an exact copy of the binary logs of a database in a location external to the master: this is exactly what is needed for achieving point in time recovery.

When point in time recovery is needed, after restoring a backup, the restored MySQL is configured as a slave of a Binlog Server.  That will bring the database to the required state (point in time) without copying files or needing to run commands on an external server.  Moreover, all the error conditions and the corresponding retry logic are already implemented in replication, so you do not need to care about those.
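
As an illustration of the paragraph above, the following Python sketch builds the two statements that could be run on the restored server. It assumes that the backup position and the recovery point are both known as a binary log file and position, and that the stop is done with START SLAVE UNTIL. The function names, host, user and coordinates are hypothetical, and MASTER_PASSWORD is left out on purpose.

```python
def quote(value):
    """Quote a string as a MySQL string literal (escapes backslash and single quote)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "\\'") + "'"


def recovery_statements(host, port, user, backup_file, backup_pos, stop_file, stop_pos):
    """Return the SQL to slave a restored server to a Binlog Server up to a stop point.

    backup_file/backup_pos: master binlog coordinates saved with the backup.
    stop_file/stop_pos: coordinates of the point of recovery (assumed known).
    """
    change_master = (
        "CHANGE MASTER TO MASTER_HOST={}, MASTER_PORT={:d}, MASTER_USER={}, "
        "MASTER_LOG_FILE={}, MASTER_LOG_POS={:d}"
    ).format(quote(host), port, quote(user), quote(backup_file), backup_pos)
    # The SQL thread stops by itself once it reaches the given coordinates.
    start_until = (
        "START SLAVE UNTIL MASTER_LOG_FILE={}, MASTER_LOG_POS={:d}"
    ).format(quote(stop_file), stop_pos)
    return [change_master, start_until]


if __name__ == "__main__":
    for statement in recovery_statements(
        "binlog-server.example", 3306, "repl",
        "binlog.000042", 4, "binlog.000045", 1234,
    ):
        print(statement + ";")
```

Because the Binlog Server serves the same files and positions as the master, the coordinates saved at backup time can be used unchanged.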

A deployment with no slave (XtraBackup, mysqldump or mysqlpump from the master) would be the following (M is the master and X is a Binlog Server):
+---+      / \
| M | --> / X \
+---+     -----
If backups are taken on a slave (S), the deployment would be the following:
+---+
| M |
+---+
  |
  +--------+
  |        |
 / \     +---+
/ X \    | S |
-----    +---+
After restoring a backup (on R below), the database would be brought to the required state by slaving R to X:
 / \      +---+
/ X \ --> | R |
-----     +---+
If more than one slave is needed and/or more than one copy of the binary logs needs to be kept, one can deploy more than one Binlog Server (the Binlog Server Layer below).  This also allows taking advantage of easy master promotion as described in the following post on the Booking.com dev blog: Abstracting Binlog Servers and MySQL Master Promotion without Reconfiguring all Slaves.  The deployment would be the following:
  +---+
  | M |
  +---+
    |
+---+-----------------------+
|    Binlog Server Layer    |
+---+-------+-----------+---+
    |       |           |
  +---+   +---+       +---+
  | S1|   | S2|  ...  | Sn|
  +---+   +---+       +---+
This use-case of Binlog Servers was suggested to me by Alessandro Fustini.  Alessandro attended my Binlog Server session at Percona Live Amsterdam and came up with this idea: thanks Alessandro for this new and original Binlog Server use-case.

If you are in the San Francisco Bay Area at the end of October and want to know more, I will be speaking about Binlog Servers and Replication at Oracle Open World (from October 25 to 29).  My two sessions are on October 27.

2 comments:

  1. I've read a lot of your posts about binlog servers and failed to find instructions on how to set one up. Did you open-source them?

  2. Hi, some details here:
    http://jfg-mysql.blogspot.nl/2015/04/even-easier-master-promotion-and-high-availability.html

    MaxScale 1.1 does not yet allow you to safely do what is described in this post. We are working on the latest version, which will allow it. It should be available soon. Stay tuned on the MaxScale website and mailing list.
