Improving Network Reliability with Keepalived
Pages: 1, 2, 3
At the same time, when the master comes back on to the network, it notices the backup and forces the backup to give up the VIP:
Keepalived_vrrp: VRRP_Instance(VI_1) Received lower prio advert,
forcing new election
Keepalived_vrrp: VRRP_Instance(VI_1) Sending gratuitous ARP on eth0
At this point, the master is back in charge. Now you know that your Keepalived setup is working.
Failover Time-outs
Why is the maximum failover time in the example 3.6 seconds? This comes from the advertisement interval and the skew time. The default advertisement interval is 1 second (configurable in keepalived.conf). The skew time helps to keep everyone from trying to transition at once. It is a number between 0 and 1, based on the formula
(256 - priority) / 256
As defined in the RFC, the backup must receive an advertisement from the master every
(3 * advert_int) + skew_time
seconds. If it doesn't hear anything from the master, it takes over. With a backup router priority of 100 (as in the example), the failover will happen at most 3.6 seconds after the master goes down.
Closing Thoughts
Keepalived provides a rich set of tools for server monitoring. For our purposes of increasing router redundancy, the most interesting one is VRRP. Take a couple of Linux routers, add Keepalived with VRRP, and you have a much more redundant configuration.
Of course, it is important to note that this is not a complete solution. Consider the standard office setup of one T1 connected to one router. Even if you set a a backup router, you don't have full protection: if the one router with the T1 goes down, your clients will lose all their connectivity. Any complete redundancy solution must also consider external network links, not just internal VRRP routers. The IBM Redpaper on VRRP has some good information on designing a network with robust upstream routing.
In the past, some people have hesitated to consider using Keepalived for just a VRRP setup, as they perceive Keepalived as a large and complex system. I can assure you, based on my experience, that this isn't the case. If you are running a pool of systems such as web servers, you should check out the other features Keepalived has to offer. However, if you just want to add router redundancy to your Linux network, VRRP via Keepalived is just the ticket.
Special thanks to Keepalived developer Alexandre Cassen for reviewing this article and providing valuable feedback and corrections.
Philip Hollenback is a system administrator at a financial firm in Manhattan. When he's not upgrading Linux servers or skateboarding, Phil spends his time updating his web site, www.hollenback.net.
Return to the Linux DevCenter.