In this blog, I would like to mark myself as a power-user of MMM. For more then 3 years now I'm engineering high-traffic websites and none of them was a small project. In the beginning I worked behind the livejasmin.com project (warning: NSFW!) which is an Alexa top100 site and now I'm working at ustream.tv which is in the top500 (so far). In this area, your greatest enemy is downtime. Thousands of users want to see your dynamic content by executing thousand of queries against your MySQL servers and what they don't have is the patience and they don't even tolerate critical security-upgrades. In both sites I mentioned above there is no point in time during the day at which less then 10k visitors are hitting the sites. Now one question comes up: How can you do an upgrade on your MySQL servers or how can you alter a larger table?
So you have a new feature and that means another index on a table or you just want to upgrade your application, there's no chance that your downtime will be hidden. We all know, even if you run your site with no downtime for months, 1 minute can make your bosses forget about that and ask you THE question: "What has happened?". If you would say: "There was no problem, I just took down the site to upgrade MySQL", you won't be the employee of the month.
So let's see how can MMM help:
Modifications in your database, backup or maintenance:
For a lot of people these words could mean different things, so what I would like to mention is how MMM can help with that.
Just imagine a typical issue: a new column should be added to a large table.
If you have DRBD, NDB or actually nothing at all, this will make you cry if that table is large enough. This could mean different size for everyone, but lets say 25GB with 1,5 million rows. You can see each row size is not small at all. Adding a column even with a pretty fast, but standalone (not clustered like ndb or different sharding techniques) this could take up 4-5 minutes. Taking your site down for minutes is just not a possibility.
What would you or your developers do to avoid this scenario? A few times, I have seen them create a new table and later execute joins or multiple queries on that. That's really not good for us. (here you can find more answers on performance impacts). So lets say you have a chance to alter the table without downtime. MMM can do it for you.
All you have to do is:
- Define the new column (at least in the beginning) with a possible NULL (but that isn't really good for us too) or with a default value to make it possible that your table will work with the current application and replication together without failed replication or failed queries. What I mean is: let's say you have a table with the following columns: id int(8) auto_increment, sometext varchar(32) NOT NULL and you want to add a column to the end say insertdate datetime default NOW() at the end.
So lets see our possibilities: At first, if the developers would handle it, you would have another table, lets say date_of_join which would contain the same id that you can find in the table above and every time when they want to insert in to the first table, they would write the first and our new table too. And when they want to select something that would mean at least two queries which is really not so good for us. So the first thing you have to take care is: Always made changes what would work perfectly with your current architecture/application.
- Start executing alter command on each server of yours one-by-one (except your current active master) and on those servers which serve as a master (too), use sql_log_bin = 0 for this query! And this is important. With this option, you can alter each server and finally move the writer role with MMM to another server and alter the last one too.
I already went trough this for a few times. Pretty smooth.
Another problem with MySQL that is its a single-point of failure. In the most cases (except ndb) you cant afford to loose your master. So every single modification will cost a lot. You can't upgrade. Now this problem is solved with MMM. All you have to take care is the correct, planned and exact ORDER of upgrading.
What I mean is you have to start upgrading from the bottom. First upgrade your slaves and when they done, upgrade the passive master and finally, move the writer role with MMM by the command set_offline. And set_offline is (could be) very important. Because if you move the writer rule to your passive, but upgraded master, maybe its execute queries in a different way what your lower version MySQL don't understand. Moving roles with set_offline will stop replication and you have time to upgrade safely. But I have to mention that you are risking downtime again but on a different way. Moving to a newer version (5.0>5.1) could be VERY risky without proper testing. Take care about that.
I hope I gave you some good advices.