Today's enterprise depends on the availability of its mail and web services. Failure is never far away, whether it comes from hardware or from human error, so we have to make our infrastructure as highly available as possible.
When building highly available clusters, people often choose one extra physical machine per service, creating an A-B failover scheme. With static websites, there is no problem making the application highly available; you can just store the data in two places. However, the moment you add a database to your environment, things become more difficult. The easy way out is to move the database to a different machine and push that server into a SEP ("somebody else's problem") field.
That's not how we do it. In the old days, when sites became too heavily loaded, we used MySQL replication to create multiple read-only copies of the database, which we load balanced with LVS. This, however, meant that we had to modify the application itself so that it would write only to the master node.
Later, many people tried to build a replication environment that implemented high availability. All of them struggled with the same problem: they couldn't determine exactly where a node had failed, so it was possible to lose records. Recovering from a failover also turned out to be a difficult task.
In late 2004 and early 2005, MySQL announced MySQL Cluster, the NDB storage engine that plugs into the existing MySQL environment. With this cluster environment, I built a highly redundant setup. This article discusses how to do it yourself.
Hold on before you decide to implement MySQL Cluster, as it certainly doesn't suit all jobs yet. The MySQL NDB engine currently runs its database completely in memory, which means your dataset has to fit in RAM. If you have a 1GB dataset and you want your data spread over two nodes, you need 1GB of memory per node; with four nodes, you can get by with 512MB per node. So the first thing to do is look at your dataset: giant data-warehousing databases won't fit into the MySQL NDB engine yet. Today, you might need to look at commercial alternatives such as Continuent, the former Emic Networks.
Secondly, this article is about building a truly available MySQL cluster, which means it focuses on high availability, not on high throughput. If your database suffers from heavy load, you still might want to look at replication in combination with LVS instead.
The MySQL NDB storage engine consists of several parts:

- ndb_mgmd is the NDB management daemon. This daemon manages the cluster and should start first so that it can monitor the state of the other parts. The management daemon arbitrates which node becomes master and which nodes have to be disconnected from the cluster. It is also capable of (re)starting different nodes and starting backups. The other nodes ask the management node for their configuration details, but from then on they don't really need it anymore; you can easily stop and start the management node without disturbing the cluster, as long as no other fault happens during the restart. The management node listens on port tcp/1186 (tcp/2200 in older versions).
- ndb_mgm is the management client. It sends commands to ndb_mgmd.
- ndbd is the actual network database engine. You need at least as many data nodes as the number of replicas you want; to spread the data over more machines, increase the number of nodes.
- mysqld is the standard SQL node that connects to ndbd for NDB engine type storage. It can still use MyISAM or InnoDB tables as well.

Standard MySQL clients connect to the SQL node and won't notice whether a query touches a MyISAM, InnoDB, or NDB table, so there is no need to change the API.
In order to achieve high availability, you need at least three nodes: one management node, and two different replica nodes for ndbd.
Initially, MySQL didn't provide the NDB build in RPM form, but they did include source RPMs so you could build your own. These days, MySQL has kindly split the binaries into separate packages: MySQL-client-4.1.12-1.i386.rpm, MySQL-devel-4.1.12-1.i386.rpm, MySQL-ndb-extra-4.1.12-1.i386.rpm, MySQL-ndb-management-4.1.12-1.i386.rpm, MySQL-ndb-storage-4.1.12-1.i386.rpm, MySQL-ndb-tools-4.1.12-1.i386.rpm, and MySQL-server-4.1.12-1.i386.rpm.
Install MySQL-ndb-management on the management node. Install MySQL-ndb-storage, MySQL-ndb-tools, MySQL-ndb-extra, and the usual MySQL-client and MySQL-server on the database nodes (here, DB-A and DB-B).
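As a sketch, the installation boils down to a few rpm commands; the file names come from the list above, so adjust the version to whatever you downloaded:

# On the management node
rpm -ivh MySQL-ndb-management-4.1.12-1.i386.rpm

# On each database node (DB-A and DB-B)
rpm -ivh MySQL-server-4.1.12-1.i386.rpm MySQL-client-4.1.12-1.i386.rpm \
    MySQL-ndb-storage-4.1.12-1.i386.rpm MySQL-ndb-tools-4.1.12-1.i386.rpm \
    MySQL-ndb-extra-4.1.12-1.i386.rpm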
On the management node, the configuration file (config.ini) resides in the directory /var/lib/mysql/mysql-cluster:
[ndbd default]
NoOfReplicas= 2
[mgm]
HostName= 172.16.32.1
[ndbd]
HeartbeatIntervalDbDb=600
HeartbeatIntervalDbApi=600
HostName= 172.16.32.7
DataDir: /var/lib/mysql/mysql-cluster/
DataMemory = 100M
IndexMemory = 50M
MaxNoOfConcurrentTransactions = 500
MaxNoOfConcurrentOperations = 250000
TimeBetweenWatchDogCheck = 40000
MaxNoOfOrderedIndexes = 27000
MaxNoOfTables = 9000
MaxNoOfAttributes = 25000
[ndbd]
HeartbeatIntervalDbDb=600
HeartbeatIntervalDbApi=600
HostName = 172.16.32.107
DataDir: /var/lib/mysql/mysql-cluster/
DataMemory = 100M
IndexMemory = 50M
MaxNoOfConcurrentTransactions = 500
MaxNoOfConcurrentOperations = 250000
TimeBetweenWatchDogCheck = 40000
MaxNoOfOrderedIndexes = 27000
MaxNoOfTables = 9000
MaxNoOfAttributes = 25000
[mysqld]
[mysqld]
[mysqld]
This is the main config file you need to get the MySQL NDB cluster up and running. In this setup, ndbd keeps two replicas of the data. The management server listens on 172.16.32.1. In my work project, our current needs don't exceed 100MB of data memory and 50MB of index memory per node.
The three empty [mysqld] sections refer to the two mysqld processes on the database nodes, plus room for a possible third SQL node.
With this config in place, you can start the ndb_mgmd for other nodes to connect to.
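Starting the management daemon is then a matter of running ndb_mgmd from the directory that holds config.ini; a minimal sketch, where the -f (--config-file) flag points it explicitly at the file:

cd /var/lib/mysql/mysql-cluster
ndb_mgmd -f config.ini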
Both data nodes have a file called Ndb.cfg in /var/lib/mysql/mysql-cluster (also symlinked to /var/lib/mysql/):
DB-A:/var/lib/mysql # cat Ndb.cfg
host=172.16.32.1:1186
The first time you start ndbd, run it with the --initial parameter.
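On each data node, that looks roughly like the sketch below; --initial wipes and re-creates the node's local data files, so use it only for the very first start. Do the same on DB-B.

DB-A:/var/lib/mysql # ndbd --initial     # first start only
DB-A:/var/lib/mysql # ndbd               # every later start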
Running ndb_mgm now should show two active ndb nodes:
DB-A:/var/lib/mysql # ndb_mgm
-- NDB Cluster -- Management Client --
ndb_mgm> show
Connected to Management Server at: 172.16.32.1:1186
Cluster Configuration
---------------------
[ndbd(NDB)]     2 node(s)
id=2    @172.16.32.7  (Version: 4.1.12, Nodegroup: 0)
id=3    @172.16.32.107  (Version: 4.1.12, Nodegroup: 0, Master)

[ndb_mgmd(MGM)] 1 node(s)
id=1    @172.16.32.1  (Version: 4.1.12)
If you can see this, you've accomplished the first part of the mission. Congratulations! You now have a working NDB engine cluster.
Now you need to tell the MySQL SQL node that it can use the NDB engine. Modify the MySQL config file my.cnf in /etc/ and add the parameter ndbcluster to the [mysqld] section.
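A minimal fragment might look like the following sketch; the ndb-connectstring line is optional here, because the Ndb.cfg file in /var/lib/mysql already tells mysqld where to find the management server:

[mysqld]
ndbcluster
# optional when Ndb.cfg already provides the connectstring
ndb-connectstring=172.16.32.1:1186

After restarting mysqld, the output of ndb_mgm's show command should contain some additional lines: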
[mysqld(API)] 3 node(s)
id=4 @172.16.32.7 (Version: 4.1.12)
id=5 @172.16.32.107 (Version: 4.1.12)
id=6 (not connected, accepting connect from any host)
Now test the actual cluster database by creating a table with the engine=NDB option:
mysql> create table blah (i int, a char) engine NDB;
Query OK, 0 rows affected (0.46 sec)
Then on node DB-B, watch the table show up:
mysql> show tables;
+----------------+
| Tables_in_test |
+----------------+
| blah           |
+----------------+
1 row in set (0.03 sec)
On node DB-A, insert a value into the table:
mysql> insert into blah values (1,"x");
Query OK, 1 row affected (0.01 sec)

mysql> select * from blah;
+------+------+
| i    | a    |
+------+------+
|    1 | x    |
+------+------+
1 row in set (0.01 sec)
... and watch it appear in the same table on the second node (DB-B):
mysql> select * from blah;
Empty set (0.08 sec)

mysql> select * from blah;
+------+------+
| i    | a    |
+------+------+
|    1 | x    |
+------+------+
1 row in set (0.00 sec)
As you can see, a record modified on one node automagically appears on the other node as well. This works in both directions, so it really doesn't matter which node you connect to.
When querying the database, the SQL node contacts the management node to find out which storage node it should query. If one of the NDB storage nodes disappears, the management node will direct the SQL node to another one. That's why you need to run the management node on a separate machine: if the machine hosting both your active NDB storage node and your MGM node crashes, your SQL node has nowhere to go. You can restart or take down the management node for maintenance; as long as none of the other nodes goes down, your applications will keep working. Of course, you can always configure two management nodes, as sketched below.
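As a sketch of that last option, config.ini can simply list a second [mgm] section; 172.16.32.2 is a made-up address here, and the explicit Id values are my assumption, used to keep the node IDs stable. Each data node's Ndb.cfg would then point at both management hosts.

[mgm]
Id= 1
HostName= 172.16.32.1

[mgm]
Id= 2
HostName= 172.16.32.2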
MySQL Cluster makes sure your database stays up, but what if the mysqld you are talking to crashes? What if the physical machine you are talking to crashes? What if your network connection goes down? In these cases, your application, which used to talk to the mysql daemon on DB-A, doesn't know what to do anymore. That's why my team added the mon-plus-Heartbeat combination to our setup.
Heartbeat takes care of monitoring both nodes and making sure that the active node carries the service IP address your applications (for example, your Java applications) connect to.
Setting up Heartbeat on cluster nodes means configuring three files. In /etc/ha.d:
$ cat authkeys
auth 1
1 md5 WorldDominationFast!

$ more haresources
DB-A 172.17.0.103/24/eth0:1

DB-A:/etc/ha.d # more ha.cf
logfacility local0
keepalive 2
deadtime 10
warntime 5
initdead 120
nice_failback off
udpport 694
bcast eth0
node DB-A
node DB-B
Make sure these files exist and are identical on both nodes.
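With those files in place, start Heartbeat on both nodes; this sketch assumes the standard init script shipped with the heartbeat package (use your distribution's tool of choice to enable it at boot):

DB-A:/etc/ha.d # /etc/init.d/heartbeat start
DB-B:/etc/ha.d # /etc/init.d/heartbeat start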
Heartbeat does not monitor the state of the application. You might still be able to ping a DB node while its mysqld is completely dysfunctional. That's why we added mon to our setup.
The next change is to mon.cf in /etc/mon/:
hostgroup mysql-node 127.0.0.1
#
# watch definitions
watch mysql-node
    service mysql
        interval 10s
        monitor msql-mysql.monitor --mode mysql --database=yourdatabase \
            --password=yourpasswordgoeshere
        period wd {Mon-Sun}
            alert stop-heartbeat.alert
As this config snippet shows, mon uses its msql-mysql.monitor script to see if it can actually still connect to the database. It also periodically checks that it can still do a show tables on the database on the node on which it is running. The moment that fails, it will run stop-heartbeat.alert.
When mon decides it can't connect to MySQL anymore, it triggers Heartbeat to move the active IP to the second node, where another MySQL instance is happily running and can access identical data to that of the prior node.
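That trigger can be as simple as an alert script that shuts down the local Heartbeat, so the surviving node takes over the service IP. Here is a minimal sketch of what such a stop-heartbeat.alert could look like; the init-script path is an assumption, and the arguments mon passes to alert scripts are simply ignored:

#!/bin/sh
# stop-heartbeat.alert (sketch): invoked by mon when the mysql service check fails.
# Stopping Heartbeat on this node causes the peer node to take over the service IP.
/etc/init.d/heartbeat stop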
With this setup, we created a highly available MySQL cluster with no single point of failure.
Kris Buytaert is a Linux and open source consultant operating in the Benelux. He currently maintains the openMosix HOWTO.