We all knew we were taking a risk with MMM: we were placing availability above consistency. None of us can afford to lose data forever, yet we keep using it, thinking, as our conversations went: "I can fix my data later on, but I can't turn back time and prevent the downtime." (Pascal Hofmann@xaprb.com)
I wrote before about staying online; now let me write about how to stay consistent.
We all know MMM is not the key to salvation, but it gets close. As long as MySQL doesn't support multi-master environments natively, we will sleep badly, wondering about the safety of our precious databases.
But it's not just about MMM. A few days ago we ran into a well-known InnoDB "feature": the way the auto-increment counter is determined on restart. InnoDB recalculates the next auto-increment value when MySQL restarts, which can break replication as well as your data integrity (http://dev.mysql.com/doc/refman/5.1/en/innodb-auto-increment-handling.html). This is especially risky when you use MMM, because you can restart your masters whenever you want.
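To make the risk concrete, here is a sketch of how that counter reset can bite. The table and database names are made up; the behavior is the one described at the manual page above (on restart, InnoDB re-seeds the counter from the current maximum of the column):

```shell
# Create a small InnoDB table and delete its highest row:
mysql mytestdb -e "CREATE TABLE t (id INT AUTO_INCREMENT PRIMARY KEY) ENGINE=InnoDB;
                   INSERT INTO t VALUES (NULL),(NULL),(NULL);  -- ids 1, 2, 3
                   DELETE FROM t WHERE id = 3;"

# ... restart mysqld here; InnoDB re-seeds the counter roughly as MAX(id)+1 ...

# The next insert can get id 3 again:
mysql mytestdb -e "INSERT INTO t VALUES (NULL); SELECT id FROM t;"
```

If the other master has meanwhile created a different row with id 3, replication and your data are now inconsistent.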
In this post I'm using maatkit to verify and restore the rows without locking or downtime.
This is not a success of MMM, but a success of maatkit.
I don't want to introduce maatkit itself here; everybody who can use MMM has already met maatkit. It's great stuff, and if you haven't met it yet, this is the right time: http://www.maatkit.org
Verification between your masters:
This is the hardest part: you CAN'T, and SHOULD NOT, avoid manual review.
I'm using mk-table-sync for this scenario. As its description says: "mk-table-sync finds and fixes data differences between MySQL tables."
This is exactly what we are going to do, but let's consider the possibilities first. If we go through the help output, we can see there are a lot of options; it's worth reading through them.
Here is the command I executed:
mk-table-sync --chunksize 100 --databases mytestdb --lock 1 --transaction --skipbinlog --print db02 db01
And here is the explanation:
--chunksize 100: The chunk size is the number of rows maatkit compares at a time. With this command, maatkit compares the tables in chunks of 100 rows each. It's an affordable size: it won't make the sync faster, and as I benchmarked, it doesn't make it slower either. But as you will see later, each chunk starts a transaction and locks the table. More about locking and transactions below; for now, to understand why I recommend a small chunk size, you have to know that it keeps each table lock short.
--databases mytestdb: Lets you give a comma-separated list of databases to take care of. In this example I use only one database, but as I mentioned, you can pass a list of databases you want to scan, or filter out the ones you don't.
--lock 1: As I mentioned before, maatkit provides two kinds of "locking": the first is a real table lock, the second is using transactions. With InnoDB tables, transactions are a good choice (more on that below), but with MyISAM tables I think --lock 1 is the best fit for what I want to do here: syncing live masters to each other. mk-table-sync provides four locking levels: "0=none, 1=per sync cycle, 2=per table, or 3=globally". As far as I know, 0 means no locks at all, which is not a good idea if you are syncing live masters under traffic. Locking each whole table can be a wrong decision too in case of a larger table, and a global lock is just as bad under load. So I picked --lock 1, which holds the lock only while the current chunk is being processed (this is why I lowered the chunk size to 100).
--transaction: This is what I was talking about above. Instead of locking, mk-table-sync can use transactions. Starting a transaction over a larger chunk can cause lock-wait-timeout errors, just as locking can; this is another reason for the small chunk size. You also have to pay attention to your isolation level: as you know, this can be unsafe under READ COMMITTED.
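If you want to check (or pin) the isolation level before running the sync, something like this works on MySQL 5.x; whether you set it globally or per session depends on how you run things:

```shell
# Check the current isolation level on both masters
# (REPEATABLE-READ is the InnoDB default and the safe choice here):
mysql -e "SELECT @@global.tx_isolation, @@session.tx_isolation;"

# If needed, set it server-wide (new connections only):
mysql -e "SET GLOBAL TRANSACTION ISOLATION LEVEL REPEATABLE READ;"
```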
--skipbinlog: This is the most important part. In a multi-master environment, syncing without this option is suicide. With it, mk-table-sync executes the fixing queries without writing them to the binlog. I shouldn't even have to mention that logging the fixes to the binlog could really mess up your databases, as the other master would replicate and execute them.
--print: mk-table-sync provides three different ways to run, some of which can be combined. The first one I would recommend is --print. It just prints the queries to stdout (which you can redirect to a file with 1>/your/file/path), but when you redirect stdout, don't forget that if you use --ask-pass, the password prompt won't show up either. You can use --execute instead, which runs the queries immediately. That always scares me a little, so I'd rather capture the output, go over it, and execute what I think I should.
db02/db01: In my example these two hosts are the writable (active) master first and the passive master second. You can use whatever hosts you want; the fixing queries will be executed on the second host.
What to do with your output: When it's done and it has found differences, you will find them wherever your stdout went. In that file, or on your screen, you can see the queries mk-table-sync would execute to make your passive master consistent. Now it's up to you what to do with this information: you can execute each of them yourself, or re-run the sync with --execute instead of --print.
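A possible workflow for the manual route (the file path is just a placeholder): capture the output, review it, then apply the approved queries on the passive master with binary logging disabled for that session, so the fixes don't replicate to the other master. Note that SET SQL_LOG_BIN=0 requires the SUPER privilege:

```shell
# Capture the would-be fixes instead of executing them:
mk-table-sync --chunksize 100 --databases mytestdb --lock 1 --transaction \
    --skipbinlog --print db02 db01 1>/tmp/sync-queries.sql

# Review them by hand:
less /tmp/sync-queries.sql

# Apply them on the passive master, keeping them out of its binlog:
( echo "SET SQL_LOG_BIN=0;"; cat /tmp/sync-queries.sql ) | mysql -h db01 mytestdb
```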
Once you are done with your masters, you can sync the slaves the same way, except that you can use your already-consistent passive master to sync from. Both maatkit and MMM provide some very useful scripts to rebuild slaves, and you always have a passive master to do it from. With this you are free to rebuild your cluster whenever you want.
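As far as I know, mk-table-sync also has a --sync-to-master option for exactly this case: it treats the given host as a slave and makes its data match its master. A sketch, with a hypothetical slave host name, and --print first as always:

```shell
# Make slave01 consistent with whichever master it replicates from
# (here: the already-verified passive master). Review before --execute:
mk-table-sync --sync-to-master --databases mytestdb --print slave01
```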