Improving Network Reliability with Keepalived
The virtual_router_id value matches up master and backup VRRP
servers. All servers in a particular VRRP group (one master and one or more
backup servers) should have the same virtual_router_id.
VRRP uses an election mechanism to determine who is the master in a VRRP
group, and the highest priority wins. The master should have a priority at
least 50 higher than that of any of the backup servers, because the priority
contributes to the failover time. In this example, if you set the master
priority to 100, none of the backup servers should have a priority higher than
50. Remember that the state setting controls only the state in which the
VRRP server starts up. Immediately after startup, all VRRP routers in the
same group (on the same network and with the same
virtual_router_id) hold an election. The server with the
highest priority wins and becomes master, even if a lower-priority
system started in the MASTER state.
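To make the election rule concrete, here is a minimal Python sketch. It only models the "highest priority wins, startup state is ignored" rule described above; the router names are hypothetical, and tie-breaking between equal priorities is not modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Router:
    name: str      # hypothetical label, not a Keepalived setting
    state: str     # configured startup state: "MASTER" or "BACKUP"
    priority: int  # configured VRRP priority


def elect_master(routers):
    """Return the router with the highest priority; startup state is ignored."""
    return max(routers, key=lambda r: r.priority)


group = [
    Router("gw-a", state="BACKUP", priority=100),
    Router("gw-b", state="MASTER", priority=50),
]

winner = elect_master(group)
print(winner.name)  # gw-a, although it was configured to start as BACKUP
```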
The VRRP specification describes several authentication mechanisms.
Obviously there should be some way for the VRRP servers to communicate
securely, because a rogue system could create a denial of service attack on
your network by overriding your real VRRP master server. Keepalived supports
both the password and IPSEC Authentication Header authentication methods, but
password authentication is easier for normal use due to some implementation
problems with IPSEC-AH authentication. Thus in my example I set the auth_type
to PASS (plain password) and I specify a password on the auth_pass line.
Note that this is a plain-text password that goes over the network very often
(at least once per second at the standard advertisement interval), so it is
not a strong security measure.
Finally, the file sets the virtual router address in the
virtual_ipaddress section. As I mentioned earlier, it probably
makes sense to set your virtual IP address (VIP) to whatever your existing
gateway was using, to minimize client configuration changes. You also have to
specify the device this address is on. This should match the value of the
interface setting above.
You can specify multiple addresses in the virtual_ipaddress
section. This is useful if your VRRP server is on several different VLANs. In
that case, each VIP goes on a separate line and the device entry corresponds to
the VLAN of the virtual IP address.
That's an entire minimal master keepalived.conf file. See the
keepalived.conf(5) man page for the other, optional settings.
The backup server keepalived.conf is almost identical. First,
change the state setting to BACKUP, as you want this server to come up in the
backup state. Then, change the priority to a lower number than that of the
master. Remember that it should be at least 50 lower than the master priority,
so 50 is a good choice in this case. Everything else in the configuration file
(including, most importantly, the auth_pass) should stay the
same.
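As an illustration of how little differs between the two files, the following Python sketch renders both from one template and lists the lines that change. The template is an assumed reconstruction, not the exact file from this article: the virtual_router_id value (51), the advert_int of 1, and the placeholder password are assumptions, while VI_1, eth0, and 192.168.1.1/24 come from the examples in this article.

```python
# Assumed minimal keepalived.conf layout; adjust the values to your own setup.
TEMPLATE = """\
vrrp_instance VI_1 {{
    state {state}
    interface eth0
    virtual_router_id 51
    priority {priority}
    advert_int 1
    authentication {{
        auth_type PASS
        auth_pass {auth_pass}
    }}
    virtual_ipaddress {{
        192.168.1.1/24 dev eth0
    }}
}}
"""


def render(state, priority, auth_pass="example"):
    """Fill in the template; auth_pass is a placeholder, not a real secret."""
    return TEMPLATE.format(state=state, priority=priority, auth_pass=auth_pass)


master = render("MASTER", 100)
backup = render("BACKUP", 50)

# Collect the (master, backup) line pairs that are not identical.
diff = [
    (m.strip(), b.strip())
    for m, b in zip(master.splitlines(), backup.splitlines())
    if m != b
]
print(diff)
```

Only the state and priority lines should show up in the difference; everything else, including the auth_pass, is shared.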
Time to Start Keepalived
Now that you have Keepalived configured on both the master and backup servers, start VRRP by running the Keepalived init script that came in the Keepalived source tarball (assuming you are on a Red Hat or Fedora system; adjust for other distros):
# /sbin/service keepalived start
Do this on both the master and backup servers. Then, check the syslog
(/var/log/messages) on each machine. You should see messages
indicating that Keepalived has started in mode MASTER on the master server and
mode BACKUP on the backup. How can you tell if the master server is answering
on the virtual IP address? The best way to check this is with the
ip command. Run:
# ip addr show
on the master. Assuming that you're running Keepalived on eth0, you should see something like this:
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
    link/ether 00:e0:81:2b:aa:b5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.253/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.1/24 brd 192.168.1.255 scope global secondary eth0
The output on the backup should be:
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
    link/ether 00:e0:81:2b:aa:c3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.254/24 brd 192.168.1.255 scope global eth0
This shows that the master is answering both on its 192.168.1.253 address
and the virtual IP of 192.168.1.1. This information is not available in the
traditional ifconfig command output, so this is a good reason to
bite the bullet and start using the ip command to view or
change your network settings if you haven't already been using it.
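If you want to script this check, a small Python sketch along the following lines could scan the ip addr show output for the VIP. The function names are my own, and the sample text is the output shown above; the sketch only looks at IPv4 "inet" lines.

```python
import re


def addresses(ip_output):
    """Return the IPv4 addresses found on 'inet' lines of `ip addr show` output."""
    return re.findall(
        r"^\s*inet\s+(\d+\.\d+\.\d+\.\d+)/\d+", ip_output, flags=re.MULTILINE
    )


def holds_vip(ip_output, vip):
    """True if the given virtual IP appears among the configured addresses."""
    return vip in addresses(ip_output)


MASTER_OUTPUT = """\
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
    link/ether 00:e0:81:2b:aa:b5 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.253/24 brd 192.168.1.255 scope global eth0
    inet 192.168.1.1/24 brd 192.168.1.255 scope global secondary eth0
"""

BACKUP_OUTPUT = """\
2: eth0: <BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue
    link/ether 00:e0:81:2b:aa:c3 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.254/24 brd 192.168.1.255 scope global eth0
"""

print(holds_vip(MASTER_OUTPUT, "192.168.1.1"))  # True
print(holds_vip(BACKUP_OUTPUT, "192.168.1.1"))  # False
```

On a live system you could feed it real output instead of the sample text, for example with subprocess.run(["ip", "addr", "show"], capture_output=True, text=True).stdout.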
Testing
Testing Keepalived is straightforward: unplug the master from the network and see if the backup takes over. You can see the Keepalived state changes in syslog; however, you won't see anything in the master syslog when you disconnect it from the network. The master actually does notice that the backup has disappeared and transfers to a fault state, because the master also listens for multicast advertisements from the backup. In practice, the master doesn't do anything in the fault state except wait to hear from the backup.
The backup is more chatty. Its syslog will contain messages such as:
Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE
Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE
Keepalived_vrrp: VRRP_Instance(VI_1) setting protocol VIPs.
Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARP on eth0
This should be pretty easy to understand: the backup lost track of the master, so it decided to become master, take over the VIP, and send a gratuitous ARP to notify the clients. With the sample configuration, this will take 3.6 seconds at most.
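For a rough idea of where a figure like this comes from, RFC 3768 defines the time a backup waits before declaring the master down as three advertisement intervals plus a skew time that depends on priority. The Python sketch below applies that formula, assuming an advertisement interval of one second; whether Keepalived's timers follow the formula exactly is an assumption here, not something this article confirms.

```python
def skew_time(priority):
    """Skew_Time from RFC 3768, in seconds: (256 - Priority) / 256."""
    return (256 - priority) / 256


def master_down_interval(priority, advert_int=1.0):
    """Master_Down_Interval from RFC 3768: 3 * Advertisement_Interval + Skew_Time."""
    return 3 * advert_int + skew_time(priority)


for prio in (100, 50):
    print(prio, round(master_down_interval(prio), 3))
# 100 3.609
# 50 3.805
```

Under this formula a priority of 100 gives roughly 3.6 seconds, and lower priorities wait slightly longer, which is one way the priority contributes to the failover time.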
Once this transition occurs, the backup is now in the MASTER state and
controls the virtual IP address. This continues until the master comes back
(well, or until the backup server goes down). Verify that the backup is in
charge of the VIP by running ip addr show on the backup and
verifying that the VIP is there (as in the previous section).
When the master comes back, you will see this in the backup server syslog:
Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert
Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE
Keepalived_vrrp: VRRP_Instance(VI_1) removing protocol VIPs
Again, this is pretty easy to follow. The backup heard from another, more
important VRRP server (the master, because that's the only other one in the
example), so it went to the BACKUP state and deleted the virtual IP
addresses.
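If you would rather not read syslog by eye during a test, a short Python sketch like this one can pull the state changes out of the log lines. It assumes the message wording shown in the excerpts above; the function name is hypothetical, and other Keepalived versions may word these messages differently.

```python
import re

# Matches lines such as "VRRP_Instance(VI_1) Entering MASTER STATE".
PATTERN = re.compile(
    r"VRRP_Instance\((?P<instance>[^)]+)\) Entering (?P<state>MASTER|BACKUP) STATE"
)


def transitions(lines):
    """Return (instance, state) pairs for every 'Entering ... STATE' line."""
    result = []
    for line in lines:
        match = PATTERN.search(line)
        if match:
            result.append((match.group("instance"), match.group("state")))
    return result


LOG = [
    "Keepalived_vrrp: VRRP_Instance(VI_1) Transition to MASTER STATE",
    "Keepalived_vrrp: VRRP_Instance(VI_1) Entering MASTER STATE",
    "Keepalived_vrrp: VRRP_Instance(VI_1) setting protocol VIPs.",
    "Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARP on eth0",
    "Keepalived_vrrp: VRRP_Instance(VI_1) Received higher prio advert",
    "Keepalived_vrrp: VRRP_Instance(VI_1) Entering BACKUP STATE",
    "Keepalived_vrrp: VRRP_Instance(VI_1) removing protocol VIPs",
]

print(transitions(LOG))  # [('VI_1', 'MASTER'), ('VI_1', 'BACKUP')]
```

The two pairs correspond to the failover and the later return of the original master, as seen from the backup server.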