The HA-OSCAR project's primary goal is to enhance the existing Beowulf architecture and cluster management systems (such as OSCAR, ROCKS, and Scyld) by providing high-availability and scalability capabilities for Linux clusters. The Open Cluster Group (OCG) recognized the project as an official working group, alongside the existing OSCAR and Thin-OSCAR working groups. HA-OSCAR introduces several enhancements and new features to OSCAR, mainly in the areas of availability, scalability, and security. The new features in the initial release are head node redundancy and self-recovery from hardware, service, and application outages.
This document provides a systematic installation guide for system administrators, as well as a detailed explanation of what happens during the installation. This guide assumes familiarity with basic Linux administration commands. Prior knowledge of OSCAR installation and administration will be useful.
The HA-OSCAR team has verified that HA-OSCAR works with OSCAR 2.3, 2.3.1, and 3.0 on Red Hat 9.0. The test environment for the installation discussed in this article is as follows:
This article assumes that you have built the cluster with OSCAR beforehand. If this is not the case, please refer to the OSCAR project page for the OSCAR installation procedure.
The primary and standby servers should have homogeneous hardware, and each server should have at least two network interface cards. The network interfaces must support PXE boot, and they must all connect to the local switch (two for redundancy purposes).
Figure 1 illustrates the HA-OSCAR architecture.

Figure 1. HA-OSCAR architecture
HA-OSCAR consists of the following major system components:
Each head node must have at least two NICs: eth0 and eth1. One of the NICs
is a public interface to the outside network and the other is a private interface
to its local LAN and towards computing nodes. The exact configuration depends
on how a user wants to connect eth0 and eth1 to either the public or private
network. Our example assumes that eth0 is a private interface and eth1 is
the public interface.
Figure 2 shows the sample network configuration of a HA-OSCAR head node.

Figure 2. Sample HA-OSCAR network configuration for head nodes
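The two-NIC requirement can be checked before installation. The following is a minimal sketch, not part of HA-OSCAR: the function names are hypothetical, and it assumes a kernel that exposes interfaces under /sys/class/net and names them eth0, eth1, and so on. On older systems without sysfs, parse the output of ifconfig -a instead.

```sh
# Count Ethernet-style interfaces (names beginning with "eth") found in a
# sysfs-like directory. The directory defaults to /sys/class/net (assumption:
# the running kernel provides it).
count_eth_nics() {
    net_dir="${1:-/sys/class/net}"
    count=0
    for path in "$net_dir"/eth*; do
        # When nothing matches, the glob stays literal; skip that case.
        [ -e "$path" ] || continue
        count=$((count + 1))
    done
    echo "$count"
}

# Succeed only when at least two such interfaces exist, as a head node needs.
has_required_nics() {
    [ "$(count_eth_nics "${1:-}")" -ge 2 ]
}
```

Calling has_required_nics with no argument inspects the local machine. Passing a directory makes the check easy to try against a mock layout.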
The HA-OSCAR team has developed an easy-to-install package with a graphical interface. Once OSCAR is installed on the system, download HA-OSCAR.
You must be root to install HA-OSCAR. Once you uncompress the package, start the installation by typing the following command:
% ./haoscar_install <interface>
The interface directive is the private network interface for the primary
head, normally eth0.
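Because the installer needs root privileges and an interface argument, a small wrapper can catch both mistakes early. This is only a sketch: preflight_check is a hypothetical helper, not something shipped with HA-OSCAR, and the messages it prints are illustrative.

```sh
# Decide whether it is safe to launch the installer.
# $1: effective user ID (pass "$(id -u)")
# $2: private interface name for the primary head, e.g. eth0
# Prints a short status and returns non-zero when a precondition fails.
preflight_check() {
    uid="$1"
    iface="$2"
    if [ "$uid" != "0" ]; then
        echo "must be root"
        return 1
    fi
    if [ -z "$iface" ]; then
        echo "missing interface argument"
        return 1
    fi
    echo "ok"
}
```

A typical use would be preflight_check "$(id -u)" eth0 && ./haoscar_install eth0, so the installer starts only when both conditions hold.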
The installation wizard should pop up (as shown in Figure 3). The HA-OSCAR installation wizard will walk the user through a complete installation process consisting of the following steps:
The following sections describe the HA-OSCAR wizard installation process and provide visuals for the associated screens.
In step 1, this wizard will install all of the required packages to the OSCAR cluster server and prepare the environment.

Figure 3. HA-OSCAR installation wizard
The first step will take less than one minute to complete. Step 2 is for building a standby server image from the primary node. When you click the button Building Image for Standby server, the wizard will pop up another window requesting a server image name. Normally, you can leave the default value and just press the Fetch image button (shown in Figure 4) to fetch an image for the standby server. This step will take several minutes.
This is an important step in cloning a standby server image from a primary one. For a stringent downtime requirement, we recommend a separate image server for image repository and recovery purposes.

Figure 4. Fetching or cloning a server image
This step will take 10 to 15 minutes. Once it succeeds (a successful status window will pop up), click the Close button.
The third step requires you to enter an alias public IP address. Proceed by clicking on step 3. HA-OSCAR will pop up the Standby server initial network Configuration screen shown in Figure 5. Users normally use this public IP address as a virtual entry point to access the head node. When a failover occurs, the standby server takes over this address so users can continue accessing the cluster as if nothing had happened. The normal procedures within this step are:
eth1 interface. When the failover occurs,
the standby server will automatically clone the cluster public IP on the
designated network interface, probably eth1.
Figure 5. Standby server initial network configuration
This step will take less than a minute. When the successful status window pops up, click on the Close button.
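To make the address takeover concrete, the sketch below prints the kind of command that brings up an IP alias on a network interface. It is an illustration only and not HA-OSCAR's actual failover code: alias_up_command is a hypothetical helper, and the address used in the example is a placeholder from the documentation range.

```sh
# Print (do not run) the ifconfig command that would bring up an IP alias.
# $1: parent interface (e.g. eth1)   $2: alias index (e.g. 1)
# $3: public cluster IP address      $4: netmask
alias_up_command() {
    echo "ifconfig $1:$2 $3 netmask $4 up"
}
```

For example, alias_up_command eth1 1 192.0.2.10 255.255.255.0 prints ifconfig eth1:1 192.0.2.10 netmask 255.255.255.0 up. Printing the command rather than running it keeps the sketch safe to try on a live head node.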
Pay close attention to the following procedures to retrieve the standby
server's MAC address for PXE booting before building its images on the local
drive. One of the standby server network interfaces, typically eth0, connects
to the private LAN and broadcasts its MAC address during its network boot.
When the primary server is ready to build the standby server image, it starts cloning its image using the collected addresses. The standby server then network boots via PXE (or from a floppy) and fetches the image from the primary server, or from an optional image server, onto its local file system. When the cloning succeeds, the standby server reboots from its hard disk. This marks the completion of the standby server installation.
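If you ever need to type the standby server's MAC address by hand, checking its format first avoids a failed network boot. The helpers below are a minimal sketch with hypothetical names. They assume the common colon-separated notation and are not part of the HA-OSCAR wizard.

```sh
# Return success when $1 looks like a colon-separated Ethernet MAC address
# (six pairs of hexadecimal digits, e.g. 00:0C:29:AB:CD:EF).
is_mac_address() {
    printf '%s\n' "$1" | grep -Eq '^([0-9A-Fa-f]{2}:){5}[0-9A-Fa-f]{2}$'
}

# Print the address in lowercase so that later comparisons are consistent.
normalize_mac() {
    printf '%s\n' "$1" | tr 'A-F' 'a-f'
}
```

A script could call is_mac_address on user input and pass the result of normalize_mac to whatever records the address.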
To assign the standby server's MAC address and build a local image on the standby server, proceed to step 4 in "Network Setup & Make boot server." HA-OSCAR will display the standby server MAC address configuration screen as shown in Figure 6.

Figure 6. Standby server MAC address configuration
Step 4 contains the following procedures:
Make sure that the standby server's private interface (typically eth0) is
connected to the local switch, where the primary server's PXE daemon listens
for the broadcast boot request. Otherwise, the primary server will not be
able to collect the standby server's MAC address in the next step.
Figure 7. Standby server MAC address configuration after MAC address collection
Once you have completed all four steps, the cluster should have all of its packages installed and be ready to use or test. HA-OSCAR also provides a web-based management tool for customizing the HA-OSCAR configuration, including the capability to enable new outage monitoring/detection modules and failover capabilities. However, this is a feature for advanced users only, because incorrectly configured HA-OSCAR parameters may leave the cluster in an invalid configuration. The next section elaborates on this topic.
HA-OSCAR provides a default mechanism for monitoring system resources and outages and for self-healing recovery. It also provides a web-based service monitoring and configuration program based on Webmin and Mon. You can use HA-OSCAR Webmin to customize resource management, configuration, and service monitoring.
The following sections describe, step by step, how to manually configure the virtual network interface, the (heartbeat) detection channel, and optional service monitoring. Again, we intend the following procedures and features for advanced users only. The normal initial head node configuration steps are:
Access HA-OSCAR Webmin by opening http://localhost:10000 (but only if you have it running) and selecting the HA-OSCAR category to configure the system (Figure 8). A manual configuration (Figure 9) involves the following steps:
eth0 and eth1. Other users can also log in later and manage your system with the web-based tool.

Figure 8. Step-by-step instructions to set up a virtual network interface and detection channel

Figure 9. The main Webmin screen
First select Detection channel configuration shown in Figure 10, and navigate
into the corresponding screen shown in Figure 11. Initially, there should be
three network interfaces: eth0, eth1, and lo, all created during OSCAR and
HA-OSCAR installation. Add virtual network interfaces for eth0 and eth1 by
clicking on the Add a new interface button in Figure 12. Sample screens
(Figures 13 and 14) show how to add virtual network interfaces for eth0 and
eth1.

Figure 10. HA-OSCAR monitoring configuration screen

Figure 11. Detection channel configuration

Figure 12. Network interface screen

Figure 13. Sample eth0 virtual network interface creation

Figure 14. Sample eth1 virtual network interface creation
After you create the virtual network interfaces, the next step is to define them as HA-OSCAR (health) detection channels. Access the detection channel configuration from the HA-OSCAR Webmin screen shown in Figure 15. When you've made your selection, enter the network interface information in the form shown in Figure 16.

Figure 15. Channel configuration selection

Figure 16. Primary server channel configuration screen
When you complete the channel setup and primary server configuration, make sure to click the Save button in Figure 17 and then click the Apply configuration button in Figure 18.

Figure 17. Network and monitoring service configuration screen

Figure 18. Main monitor configuration screen
HA-OSCAR provides a useful default set of monitoring policies. However, users can add new services and change monitoring parameters. We recommend this option only for advanced users.

Figure 19. Monitor list task
Return to the index page, and apply all of the configurations to HA-OSCAR.

Figure 20. Details of the "Process Server" monitoring policy
Make sure to apply the change to the new configuration.
After you complete the primary server network and detection channel configuration, create a virtual network interface and enable channel detection on the standby server. First, switch to the standby server terminal and access HA-OSCAR Webmin by opening http://localhost:10000 (the link will work only if you have Webmin running locally). The setup steps are similar to those shown earlier in the primary server setup section. When you finish configuring the network interface, select only the Channel configuration on standby server button. Selecting anything else may cause unpredictable behavior and an invalid configuration.
Virtual Network Interface Creation on the Standby Server
You can create a virtual network interface on the standby server in the same
way as on the primary server. Figure 21 shows how to set up the standby
server's virtual network interface. When you create the virtual public
interface, perhaps eth1:1, be sure not to activate it at boot time. The
virtual public IP should come up only at failover.

Figure 21. The standby server network interface screen

Figure 22. Adding a new virtual network interface to eth1

Figure 23. A new network interface created on the standby server

Figure 24. Configuring a detection channel on the standby server

Figure 25. More configuration for the detection channel
When you complete configuring both the network interface and the detection channel, return to the index page and apply the configuration (Figure 26).

Figure 26. Applying the changes after channel configuration
This article is meant as a guide to help you get on your feet with the installation and configuration of a highly available Linux cluster using HA-OSCAR.
Open source projects have a special dynamic, especially popular projects, and they tend to advance and change as their users request. Therefore, if any of the steps/functionality/screen captures above are not valid by the time you read this article, this will be due to changes in the HA-OSCAR package; please forgive us and post an update on the discussion forum.
We hope you find this article useful. Have fun.
Happy Hacking!
Ibrahim Haddad is the Director of Technology for the Software Operations Group (Home & Network Mobility Business Unit) at Motorola Inc.
Chokchai Leangsuksun is an Associate Professor of Computer Science, Louisiana Tech University.
Stephen L. Scott is a founding member of OCG and OSCAR and has served as both release manager and working group chair.
Copyright © 2009 O'Reilly Media, Inc.