If you want to try out a High Availability Cluster on OpenSolaris, but don’t have the physical hardware, you can easily prototype it in VirtualBox. You need only a single physical machine with an AMD or Intel processor and at least 3 GB of RAM. Even laptops work fine; I’m using a Toshiba Tecra M10 laptop.
When using VirtualBox, the “cluster” will be two VirtualBox guests. Because of a new preview feature in Open HA Cluster 2009.06, called “weak membership,” you don’t need to worry about a quorum device or quorum server. More on that below.
For detailed instructions on running Open HA Cluster 2009.06 in VirtualBox, I highly recommend Thorsten Frueauf’s whitepaper (pdf link). This post won’t attempt to be a substitute for that document. Instead, I will describe a single, simple configuration to get you up and running. If this piques your interest, please read Thorsten’s whitepaper for more details.
Without further ado, here are the instructions for running a two-node Open HA Cluster in VirtualBox.
First, create and activate a new boot environment on the host so that you can roll back if anything goes wrong, then reboot:
# beadm create cluster
# beadm activate cluster
# init 6
Next, unpack and install the VirtualBox packages:
# gunzip VirtualBox-2.2.0-45846-SunOS.tar.gz
# tar -xf VirtualBox-2.2.0-45846-SunOS.tar
# pkgadd -G -d VirtualBoxKern-2.2.0-SunOS-r45846.pkg
# pkgadd -G -d VirtualBox-2.2.0-SunOS-r45846.pkg
After installing OpenSolaris in each guest, create and activate a new boot environment there as well:
# beadm create cluster
# beadm activate cluster
Disabling the graphical login helps reduce memory consumption by the guests. Perform the following procedure in each of the two guests:
# cp /rpool/boot/grub/menu.lst /rpool/boot/grub/menu.lst.bak1
# vi /rpool/boot/grub/menu.lst
The diffs should be something like this (though your line numbers may vary depending on what BEs you have in the GRUB menu):
# diff /rpool/boot/grub/menu.lst /rpool/boot/grub/menu.lst.bak1
> splashimage /boot/solaris.xpm
> foreground d25f00
> background 115d93
< kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
> kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS,console=graphics
Your BE entry should look something like this:
kernel$ /platform/i86pc/kernel/$ISADIR/unix -B $ZFS-BOOTFS
Finally, disable the graphical login service and reboot:
# svcadm disable graphical-login/gdm
# init 6
In order to form the cluster, you’ll later configure “bridged networking” for the VirtualBox guests. Once you do, the guests won’t be able to access the Internet without some additional steps that I won’t document here (see Thorsten’s whitepaper for details). Thus, install all the packages you’ll need from the repositories before you configure the networking.
Add the ha-cluster publisher, using the key and certificate obtained from pkg.sun.com:
# pkg set-publisher -k /var/pkg/ssl/Open_HA_Cluster_2009.06.key.pem -c /var/pkg/ssl/Open_HA_Cluster_2009.06.certificate.pem -O https://pkg.sun.com/opensolaris/ha-cluster ha-cluster
# pkg install ha-cluster-full
The COMSTAR and iSCSI packages let one guest export storage that the other can share over the network, and Tomcat and MySQL can serve as example applications to make highly available later:
# pkg install SUNWstmf SUNWiscsi SUNWiscsit
# pkg install SUNWtcat
# pkg install SUNWmysql51
You can now set up the networking framework that allows the two cluster nodes (the VirtualBox guests) to communicate both with each other and with the physical host. The etherstub acts as a virtual switch: vnic0 will carry the host’s own address, while vnic1 through vnic4 will back the guests’ network adapters (two per guest: one public, one private).
# dladm create-etherstub etherstub0
# dladm create-vnic -l etherstub0 -m a:b:c:d:1:2 vnic0
# dladm create-vnic -l etherstub0 -m a:b:c:d:1:3 vnic1
# dladm create-vnic -l etherstub0 -m a:b:c:d:1:4 vnic2
# dladm create-vnic -l etherstub0 -m a:b:c:d:1:5 vnic3
# dladm create-vnic -l etherstub0 -m a:b:c:d:1:6 vnic4
Switch the host from automatic (NWAM) to manual network configuration:
# svcadm disable nwam
# svcadm enable network/physical:default
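Before going further, you can sanity-check that the virtual switch and links exist; both commands ship with the Crossbow networking in OpenSolaris 2009.06:

```shell
# List the etherstub (virtual switch) and the five VNICs created above
dladm show-etherstub
dladm show-vnic
```

The output should show etherstub0 plus vnic0 through vnic4 with the MAC addresses you assigned.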
The addresses used below are arbitrary choices; feel free to use any IP addresses within the proper subnet. Give the host an address on vnic0 and make it persist across reboots:
# ifconfig vnic0 plumb
# ifconfig vnic0 inet 10.0.2.97/24 up
# echo "10.0.2.97/24" > /etc/hostname.vnic0
Configure the host’s physical NIC via DHCP so the host keeps its external connectivity:
# ifconfig e1000g0 plumb
# ifconfig e1000g0 dhcp start
# touch /etc/hostname.e1000g0 /etc/dhcp.e1000g0
Check that name resolution via DNS is enabled:
# grep dns /etc/nsswitch.conf
hosts: files dns
As described earlier, you need to use “bridged networking” for the guests, which gives each guest emulated physical adapters that run on the VNICs on the host. Give each guest two adapters: one for the public network and one for the cluster private interconnect. Note that you can’t create VNICs inside the guests, because they don’t work inside VirtualBox.
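The VirtualBox side of this can be sketched with VBoxManage. The guest names and the VNIC-to-adapter assignment below are illustrative assumptions (“chopin” is the first guest’s hostname used later; “node2” is a hypothetical name for the second guest):

```shell
# Bridge each guest's two emulated adapters onto host VNICs:
# adapter 1 -> public network, adapter 2 -> cluster private interconnect.
# Guest name "node2" and the exact VNIC assignments are assumptions.
VBoxManage modifyvm chopin --nic1 bridged --bridgeadapter1 vnic1 \
                           --nic2 bridged --bridgeadapter2 vnic3
VBoxManage modifyvm node2  --nic1 bridged --bridgeadapter1 vnic2 \
                           --nic2 bridged --bridgeadapter2 vnic4
```

You can also make these assignments in each guest’s network settings in the VirtualBox GUI; vnic0 stays reserved for the host’s own address.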
In each guest, likewise switch from NWAM to manual network configuration:
# svcadm disable network/physical:nwam
# svcadm enable network/physical:default
Then, on the first guest (named “chopin” here), configure the public-network adapter:
# ifconfig e1000g0 plumb
# ifconfig e1000g0 inet 10.0.2.98/24 up
# echo "10.0.2.98/24" > /etc/hostname.e1000g0
Verify connectivity between the guest and the host (pinging the guest by name assumes the host can resolve “chopin”, e.g. via /etc/hosts):
chopin# ping 10.0.2.97
host# ping 10.0.2.98
host# ping chopin
On the second guest, do the same with a different address:
# ifconfig e1000g0 inet 10.0.2.99/24 up
# echo "10.0.2.99/24" > /etc/hostname.e1000g0
On each guest, allow remote RPC connections, which the cluster framework requires:
# svccfg -s rpc/bind setprop config/local_only = boolean: false
# svcadm refresh rpc/bind
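You can confirm the property change took effect with svcprop:

```shell
# Should print "false" once rpc/bind has been refreshed
svcprop -p config/local_only rpc/bind
```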
If you’re familiar with HA clusters, you may notice that you haven’t configured a quorum device or quorum server to break a tie and ensure that only one side of the cluster stays up in the case of a network partition. Instead, you can use “weak membership,” a new preview feature in Open HA Cluster 2009.06. Weak membership allows a two-node cluster to run without a quorum device or server. Instead, you use a “ping target” arbitrator, which can be any network device on the same subnet. In the case of node death or a network partition, each node attempts to ping the ping target; if it succeeds, that node stays up. As you might guess, this mechanism is imperfect: in the worst case it can lead to a split-brain scenario in which both nodes provide services simultaneously, which can cause data loss. To configure weak membership in the VirtualBox setup, use the physical host (10.0.2.97 on vnic0) as the ping target.
# /usr/cluster/bin/clq set -p multiple_partitions=true -p ping_targets=10.0.2.97 membership
This action might result in data corruption or loss.
Are you sure you want to enable multiple partitions in the cluster to be operational (y/n) [n]?y
# /usr/cluster/bin/clq reset
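If you want to double-check the result, the cluster command-line tools can report membership from either guest (a quick sanity check; both commands are part of the ha-cluster packages installed earlier):

```shell
# Both nodes should report a status of "Online"
/usr/cluster/bin/clnode status
# Show the quorum/membership configuration, including the ping target
/usr/cluster/bin/clq status
```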
Now your cluster is ready to use!
I’m happy to report that the first Open HA Cluster Summit last week was a fantastic event.
The summit kicked off with Professor David Cheriton’s keynote address, The Network is the Cluster (pdf link). If the audience took away only one thing from the whole summit, I hope it was the point Cheriton made up-front on the second slide, that everyone needs high availability because the alternative is unpredictability and un-dependability. We all take availability of the computer services we use for granted until those services are not there. Here’s a photo of Professor Cheriton concluding his talk:
As you can see, the room was full and the audience attentive (I’m in the front row on the left):
After Dr. Cheriton’s address, we had a discussion featuring panelists from Aster Data Systems, Google, and Sun Microsystems and moderated by Eve Kleinknecht.
One point I took away from the discussion is that high availability isn’t for the 364 days a year that everything works right. It’s for the one day a year when something goes wrong. Here Thorsten Frueauf (left) and I are watching the panel:
The morning session wrapped up with my talk on a minimal and modular HA cluster for OpenSolaris (pdf link). Here I am at the podium:
Lunch featured a nice centerpiece at each table: a copy of the OpenSolaris Bible, which went home with one lucky winner from that table.
I happily signed the books for all the winners.
The event concluded with a fun “Casino Night”:
First two photos by Thorsten Frueauf. All other photos by Tirthankar Das.
Here’s an interview with my Engineering Director, Meenakshi Kaul-Basu, and Dan Roberts, Director of OpenSolaris Product Management, about the Open HA Cluster 2009.06 release that was announced last week.
I’m excited that Professor David Cheriton has agreed to give the keynote address at the Open HA Cluster summit on May 31. Dr. Cheriton has an impressive resume in both academia and industry, and his lecture should be quite interesting. I can testify from personal experience that Dr. Cheriton is an entertaining speaker, as I took CS244b (Distributed Systems) from him as a grad student. This was my first exposure to distributed systems, and I’ve been in the field ever since.
The summit is free and open to anyone. It falls on the Sunday directly before CommunityOne and JavaOne, so if you’re planning to attend those conferences, come a day early and check out this one. There will be free food, and the first 10 students to arrive at the conference, bright and early at 8:45 AM, will be able to participate in a drawing for an iPod nano. Later, at the evening reception, we will give away a Toshiba Portege laptop. You can register on the summit wiki.
Please join us at the first Open HA Cluster summit on May 31, 2009! This event will come almost exactly two years after we formed the HA Clusters community and released the first Open HA Cluster source code, and one year after we released the code for the Sun Cluster Core. I’m looking forward to showing off project Colorado. Here’s the official invitation from Jatin, our new community manager:
You are invited to participate in the first OpenSolaris Summit for Open HA Cluster.
Open HA Cluster Summit
Sunday, May 31st, 2009
San Francisco Marriott (Next to Moscone Convention Center)
55 Fourth Street
San Francisco, CA 94103 USA
The Open HA Cluster Summit will precede the CommunityOne West and JavaOne Conferences, which start on June 1. We will bring together members of the HA Clusters community, technologists, and users of High Availability and Business Continuity software. Not only will experts lead interactive sessions, panel discussions, and technical tracks, but there will also be ample time for you to be an active participant.
We invite you to register yourself for this event at your earliest convenience. Email email@example.com if you have any difficulty with the registration. Attendance is free. There will be a reception and Community Marketplace, an informal venue to showcase your products and ideas, in the evening following the technical sessions.
HA Clusters Community Manager
This event is sponsored by Sun Microsystems, Inc. Spread the word.
I’ve launched a project page on OpenSolaris.org for project Colorado. As this is quite a large project, there are several efforts occurring in parallel. One of these tasks is to write the requirements for the project. If you’re interested, you can read a draft of the requirements and send any comments to firstname.lastname@example.org by September 10.
Another task is basic building and bringup of Sun Cluster / Open HA Cluster on OpenSolaris. You can see some of the status on that effort on the project wiki. We are also starting to investigate what it will take to convert the existing SVR4 packages to IPS packages. There’s a lot of work to do! If you’re interested in getting involved with the project, please let me know.
I’ve just proposed a new OpenSolaris project to port Open HA Cluster to the OpenSolaris distribution, including the new Image Packaging System. To quote from my proposal email, this distribution of OHAC will provide basic cluster functionality with a low barrier to entry and easy deployment for OpenSolaris. Additional functionality will be provided via optional IPS packages. The intended audience is system administrators needing simple HA and developers needing an HA framework for their applications or software appliances.
This project feels to me like the natural next step from the open-sourcing work I’ve been doing for the past year and a half. Now that the code is out there, it’s time to get it running on OpenSolaris. I’m particularly excited to get back to hands-on engineering after suffering through the legal and process work of open sourcing.
One note on the project name: following the state-name precedent of Nevada and Indiana, I naturally chose the state in which I live.
As I announced to the OpenSolaris community yesterday, we’ve released a new version of Solaris Cluster Express. SCX, as we call it, is a build of Solaris Cluster for Solaris Express. This release runs on SXCE build 86. If you don’t want to build the cluster source code yourself, this binary distribution is a good option for trying out Solaris Cluster / Open HA Cluster on OpenSolaris.
Solaris Cluster Express 10/07 is now available for download.
As usual, Solaris Cluster Express is a complete version of the Solaris Cluster product that runs on the Solaris Express platform. This release of Solaris Cluster Express runs on Solaris Express Developer Edition 9/07.