Advanced MySQL Replication Techniques
Pages: 1, 2, 3, 4, 5, 6
Measuring circular replication performance
That usage is transparent, but how good is the performance? To answer this question, I had to make some measurements. Rather than clobbering this article unnecessarily, I refer to a recent entry in my blog dealing with this specific issue (Measuring replication speed, with its companion measuring replication speed source code). This tool shows that replicating 1,000 bytes along the whole replication array takes 0.001150 second. To get a reasonable comparison, consider that measuring on the same machine (that is, without replication at all) takes 0.000015 second, while moving the same data between two nodes takes only 0.000065 second.
I would also like to show some more tangible experience, something that makes sense to the average programmer. The method I used for measuring aims at getting a precise figure. However, replication speed can be so fast that it can accommodate even the most demanding programmer needs. Thus, consider an extreme test. Insert three records containing more than 1,000 bytes in one node, and immediately afterward fetch those records from that node's master (Figure 3). Those records will need to navigate through the whole array of nodes before being available.

Figure 3. Testing replication speed
The testing code is a Perl script.
#!/usr/bin/perl
use strict;
use warnings;
use English qw( -no_match_vars );
use DBI;
use Time::HiRes qw/ usleep /;
my @configuration_files = ();
my $max_config_index = 3;
my $current_config = 0;
for ( 'A' .. 'D' ) {
push @configuration_files, "$ENV{HOME}/circular_replica/my.node$_.cnf";
}
sub get_connection {
my ($config_files) = @_;
my $config_file = $config_files->[$current_config];
$current_config++;
if ($current_config > $max_config_index) {
$current_config = 0;
}
my $dbh;
eval {
$dbh=DBI->connect("dbi:mysql:test"
. ";mysql_read_default_file=$config_file",
undef,
undef,
{RaiseError => 1})
or die "Can't connect: $DBI::errstr\n";
};
if ( $EVAL_ERROR ) {
print STDERR $EVAL_ERROR;
return;
}
return $dbh;
}
my $dbh = get_connection(\@configuration_files);
$dbh->do(qq{truncate x});
my $bigtext = 'a' x 1000;
for my $loop (1 .. 10)
{
my $dbh1 = get_connection(\@configuration_files);
my $dbh2 = get_connection(\@configuration_files);
my ($server1) = $dbh1->selectrow_array(qq{select \@\@server_id});
my ($server2) = $dbh2->selectrow_array(qq{select \@\@server_id});
for (1 .. 3) {
$dbh2->do( qq{insert into x values (null, concat("server ", ? ), ?) } , undef , $server1, $bigtext );
}
usleep(1);
my $count = $dbh1->selectrow_array(qq{ select count(*) from x });
print "inserted a record from server $server2\n ",
"retrieved $count records from $server1\n";
}
After removing all records from the test table, for which there is now a third column (MEDIUMTEXT), the code loops through the nodes, getting at each step a node ($dbh2) and its master ($dbh1). Immediately after inserting three records in the forward node, it calls the master to fetch a count of records, after a stop of one microsecond (usleep). A sample session follows:
inserted a record from server 30
retrieved 3 records from 20
inserted a record from server 10
retrieved 6 records from 40
inserted a record from server 30
retrieved 9 records from 20
inserted a record from server 10
retrieved 12 records from 40
inserted a record from server 30
retrieved 15 records from 20
inserted a record from server 10
retrieved 18 records from 40
inserted a record from server 30
retrieved 21 records from 20
inserted a record from server 10
retrieved 24 records from 40
inserted a record from server 30
retrieved 27 records from 20
inserted a record from server 10
retrieved 30 records from 40
With this simple demonstration, I believe I have shown that circular replication arrays in MySQL 5 are a viable alternative to expensive clusters. However, circular replication does not scale well. When the number of nodes grows to more than 10 nodes, the speed of replication may not be up to the expectations of normal business operations.
Keep this in mind, though. If a small number of servers are enough for your business, an array of replication nodes could be just what you need.