Migrating to a Cloud SQL MySQL instance
This guide describes a way of migrating an existing MySQL database to a Brightbox Cloud SQL instance with minimal downtime. Even a 100GB database can be migrated with less than a minute of downtime using this method.
The aim is to migrate over an initial full db dump from your source server to your Cloud SQL instance, which can take a long time, and then migrate over only the changes since then, which is quick. Basically, it’s like an incremental backup and restore. This is made possible with MySQL’s binary logging feature, which you’ll need to have enabled on your source server. You’ll also really need to be using InnoDB tables, not MyISAM, but almost everyone should be using InnoDB nowadays.
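You can quickly check both prerequisites on the source server - log_bin should report ON, and a query against information_schema should turn up no MyISAM tables outside the internal schemas:
mysql> SHOW VARIABLES LIKE 'log_bin';
mysql> SELECT table_schema, table_name FROM information_schema.tables
    -> WHERE engine = 'MyISAM' AND table_schema NOT IN ('mysql', 'information_schema', 'performance_schema');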
Enable MySQL binary logging on your source server
If you don’t already have binary logging enabled, you can configure it easily. Just add the following section to your mysql config (often in /etc/mysql/my.cnf) and restart. Choose an appropriate directory for log_bin (the following example will work fine on Ubuntu; other distros may differ):
[mysqld]
binlog_format=STATEMENT
server-id = 1
log_bin = /var/log/mysql/mysql-bin.log
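Once mysql has been restarted, you can confirm logging is active - SHOW MASTER STATUS should report the current binlog filename and position rather than an empty result:
mysql> SHOW MASTER STATUS;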
Build your Cloud SQL instance
Create your Cloud SQL instance and map a Cloud IP to it. You’ll need the Cloud IP address and the admin password to proceed.
This whole process is much easier if you grant direct access from your source server to your Cloud SQL instance, so ensure you’ve configured the Cloud SQL access control to allow access from it. If your source server is offsite, you can grant access to its external IP address - just be sure to use the --ssl option to the mysql client to ensure data is encrypted in transit.
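As a quick sanity check before starting the dump (substituting your own Cloud IP address), you can confirm both connectivity and that the connection really is encrypted - Ssl_cipher will be empty if it isn’t:
$ mysql --ssl -h 109.107.x.x -u admin -p -e "SHOW STATUS LIKE 'Ssl_cipher';"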
Take an initial mysqldump
Use the mysqldump tool to take a copy of the source databases - specify all databases on your source server except the mysql database, as you don’t want to overwrite the internal mysql privilege tables.
Use the --single-transaction option to avoid locking any tables and the --master-data option to record the current binlog file and position, which you’ll need later.
To avoid unnecessary writes to disk, you can stream the mysqldump output directly to a mysql client connected to the Cloud SQL instance. The following handy command will pick out the master data for you and pass everything else on to the Cloud SQL instance. Make a note of the MASTER_LOG_FILE and MASTER_LOG_POS values that get displayed:
$ mysqldump --master-data --single-transaction --databases database1 database2 | awk '/^CHANGE MASTER/ {print > "/dev/stderr"; next} {print}' | mysql -h 109.107.x.x -u admin -p
Enter password:
CHANGE MASTER TO MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=3107103;
If you’re using entirely InnoDB tables as stipulated, this command will not lock any tables and won’t interrupt service. It can take as long as necessary.
If this is a very large database, then this command may take several hours, so be sure to run it from somewhere that you can leave indefinitely (or run it in a detachable screen session or similar).
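For example, screen lets you run the dump in a named session that survives your SSH connection - detach with Ctrl-a d and reattach whenever you like:
$ screen -S migration    # start a named session and run the dump pipeline inside it
$ screen -r migration    # reattach later to check on progress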
If you’re using stored procedures or triggers, be sure to use the appropriate arguments to mysqldump to copy those over too.
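For example, --routines adds stored procedures and functions to the dump and --events covers scheduled events (triggers are included by default), and these slot straight into the same pipeline:
$ mysqldump --master-data --single-transaction --routines --events --databases database1 database2 | awk '/^CHANGE MASTER/ {print > "/dev/stderr"; next} {print}' | mysql -h 109.107.x.x -u admin -p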
Apply first incremental update
After the initial dump and load is complete, you can extract all the changes since then from the binlog and apply them to the Cloud SQL instance.
The first step is to run a FLUSH LOGS mysql command on the source server, which closes the current binlog file and opens a new one, making things a bit easier to keep track of.
mysql> FLUSH LOGS;
Query OK, 0 rows affected (0.00 sec)
Then use the mysqlbinlog command to read the mysql binlogs, specifying the log filename and master log position that you got when you ran the initial dump. You’ll also need to filter out the pseudo_thread_id settings, which can be troublesome without SUPER privileges and shouldn’t be necessary under these circumstances:
$ mysqlbinlog --base64-output=NEVER --start-position 3107103 /var/log/mysql/mysql-bin.000001 | grep -vE "^SET @@session.pseudo_thread_id" | mysql -h 109.107.x.x -u admin -p
The amount of time this process takes will depend on the rate at which your database receives new writes and how long it took to take the initial dump.
Now you’re up to date with the majority of writes since the initial dump, but you’re behind on the writes made since you ran the FLUSH LOGS command.
When you ran FLUSH LOGS, mysql incremented the log number, so you’ll now see a log named /var/log/mysql/mysql-bin.000002 containing all the writes since then. So run FLUSH LOGS again and catch up using mysqlbinlog with mysql-bin.000002 this time.
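The second pass looks much like the first - flush first so that mysql-bin.000002 is complete, then replay it without a --start-position as you want the whole file this time:
$ mysql -u root -p -e 'FLUSH LOGS'
$ mysqlbinlog --base64-output=NEVER /var/log/mysql/mysql-bin.000002 | grep -vE "^SET @@session.pseudo_thread_id" | mysql -h 109.107.x.x -u admin -p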
Each time you do this, the time it takes to catch up should get shorter and shorter. Be sure to keep track of the files appearing in /var/log/mysql/ so you don’t miss any logs (mysql may choose to open new binlog files at any time).
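If you’d rather ask the server than watch the directory, SHOW BINARY LOGS lists every binlog file it currently knows about:
mysql> SHOW BINARY LOGS;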
Apply final update and switch over
By this point you’ve caught up on all but the most recent writes, so now you just stop your app, stop mysql on the source server, run a final incremental update using the latest log file and then switch your app over to using your Cloud SQL instance.
Your app only needs to be down as long as it takes to apply the last batch of changes. If you’ve not left a long gap since you applied the last log, this could be as little as a few seconds.
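As a sketch, the final pass might look like this - mysql-bin.000003 here stands in for whichever log file is actually the latest, and mysqlbinlog reads log files straight from disk, so it works fine with mysqld stopped:
$ sudo service mysql stop    # after stopping your app; closes the binlog cleanly
$ mysqlbinlog --base64-output=NEVER /var/log/mysql/mysql-bin.000003 | grep -vE "^SET @@session.pseudo_thread_id" | mysql -h 109.107.x.x -u admin -p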
Limitations
This process works well under most of the same circumstances that MySQL replication works - it’s basically a manually managed replication stream. Review the MySQL replication documentation to learn more about its limitations.
No writes to other databases
As noted, you can’t migrate over or even write to the mysql database during the migration. Furthermore, you can’t write to any database that wasn’t part of the initial dump, as it’s not possible to reliably ignore those writes in the binlog.
If you’re certain you don’t use any single write statements that span multiple databases, then you can use the --database option with mysqlbinlog to select just the writes to one single database - but even then, you can’t use it to specify multiple databases.
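For example, this would replay only the writes made to database1 from the initial dump position onwards:
$ mysqlbinlog --base64-output=NEVER --database=database1 --start-position 3107103 /var/log/mysql/mysql-bin.000001 | grep -vE "^SET @@session.pseudo_thread_id" | mysql -h 109.107.x.x -u admin -p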
Can’t switch back the same way
Cloud SQL instances don’t currently have binary logging enabled themselves, so you can’t keep track of writes once you’ve switched over - which means you can’t easily switch back after going live if you’re not happy. This will change in the near future though, when we’ll allow binary logging to be enabled.
MyISAM tables
As noted, you can’t use mysqldump with MyISAM tables without locking them, which may block your app. You can minimize MyISAM lock times during the initial dump by using a lower-level snapshot system such as LVM, but that is beyond the scope of this article.