Kerrighed consists of a kernel, which does the actual work, and a set of tools to control its behavior. The kernel part is provided as a Linux kernel patch.
Kerrighed provides SSI features using a Linux container (LXC). In a few words, a container is a light-weight virtual machine sharing its kernel with the host OS. Depending on the needs, it may share resources with its host or isolate them, such as PIDs, IPCs, network, file systems, etc., and it provides resource control groups (allowed memory usage, etc.).
On a Kerrighed kernel, the host system does not provide Kerrighed features. Those features are only available inside a special Linux container called the Kerrighed container. A process running on the host system behaves as it would on an unpatched kernel. Processes running in the Kerrighed container have the ability to migrate from one node to another, to checkpoint and restart, to use remote memory, etc.
By default on a Kerrighed system, the host system shares most of its resources with the Kerrighed container (network addresses, physical devices, file system, system users, etc.). When the Kerrighed service boots the container, it additionally executes a configurable set of commands. By default, an SSH server listening on port 2222 is launched on each node. Once connected, you are in the SSI cluster!
The build process consists of the traditional ./configure && make && make install.
cd /usr/src
wget http://gforge.inria.fr/frs/download.php/27161/kerrighed-3.0.0.tar.gz
tar zxf kerrighed-3.0.0.tar.gz
mv kerrighed-3.0.0 kerrighed-src
cd kerrighed-src
export CONCURRENCY_LEVEL=<number of processors/cores available in the computer>
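If GNU coreutils is available, the core count can be filled in automatically instead of typed by hand:

```shell
# nproc prints the number of processing units available to the shell;
# on systems without it, `getconf _NPROCESSORS_ONLN` works as well.
export CONCURRENCY_LEVEL=$(nproc)
```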
cd _kernel
make menuconfig
cd /usr/src/kerrighed-src
make
make install DESTDIR=/path/to/an/exported/root/ INSTALL_PATH=/path/to/an/exported/root/boot/ INSTALL_MOD_PATH=/path/to/an/exported/root/
There are several ways to set up a Kerrighed cluster. You may choose whether to install the system on a hard drive or to boot it over the network. Each has its benefits and drawbacks.
The network boot is a combination of PXE, which serves the kernel, and an NFS root, which consists of mounting the whole system tree over NFS. This provides very good consistency for the cluster: every system in the cluster boots the same kernel and has the same binaries and libraries. However, this requires fine tuning of the kernel configuration.
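As a sketch of what that tuning involves: booting with an NFS root typically requires the network driver and the following options to be built into the kernel (not as modules). This is an assumed minimal fragment, not a complete configuration:

```
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_NFS_FS=y
CONFIG_ROOT_NFS=y
```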
An install on hard drive will be documented later.
Quick usage guide
Before being able to set up nodes, you will have to configure your Kerrighed service.
Settings are located in 3 files:
Here we describe the most important settings:
Finally, you may want to execute the Kerrighed service at boot time.
update-rc.d kerrighed-host defaults 60
chkconfig --add kerrighed-host
ln -s ../init.d/kerrighed-host /etc/rc2.d/S60kerrighed-host
The Kerrighed tools include several utilities that control its behavior and report its state:
Please note that those tools only work when run in the Kerrighed container.
Running applications/tools in Kerrighed
Kerrighed implements the Single System Image only inside a Linux container. In the default configuration, to log in to the container you need to connect to the SSH server listening on port 2222. The command should look like this:
ssh <cluster-user>@<node IP address> -p 2222
It is quite useful to use SSH keys to authenticate directly, without typing the cluster user's password every time. Check the manpages of ssh-keygen and ssh-copy-id for details.
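For example, assuming the container's SSH server listens on the default port 2222, the setup may look like this (with recent versions of ssh-copy-id, which accept a -p option):

```shell
# Generate a key pair on your workstation, if you do not have one yet.
ssh-keygen -t rsa
# Copy the public key to the container's SSH server on a node.
ssh-copy-id -p 2222 <cluster-user>@<node IP address>
# From now on, logging in no longer asks for a password.
ssh <cluster-user>@<node IP address> -p 2222
```

Older versions of ssh-copy-id do not accept -p; in that case, pass the port as part of a single quoted argument, e.g. ssh-copy-id "-p 2222 <cluster-user>@<node IP address>".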
There are several ways to visualize information about your cluster, but all of them are available only when you are logged in to a node's container (see above). Once you are connected (which, by the way, is a good sign that the container is available), these are your options to check the cluster status:
krgadm cluster status
krgadm nodes status
alias psk='ps axf -o user,pid:9,ppid:9,pgid:9,sid:9,%cpu,%mem,rss,tty,stat,start,time,psr,command:80'
Load balancing is achieved by a scheduler running over every node of the cluster. The legacy scheduler, whose rules are loaded automatically, aims to detect processes that use a lot of CPU and to balance them. This scheduler acts on fork()s, spreading new processes over the cluster (distant fork) in a round-robin fashion, or, at any moment, dynamically relocating some running processes to other nodes (auto-migration).
Note: Before being able to distant fork or migrate, a process must have the right to do so. Kerrighed implements a capabilities system, which rules the rights of each process. Contrary to Linux capabilities, Kerrighed capabilities are divided into four sets: effective, permitted, inheritable effective, and inheritable permitted. See the manpage for a full description of each one.
krgcapset --pid 10486538 -e +DISTANT_FORK
krgcapset --pid 10486538 -e +CAN_MIGRATE
Note that additional capabilities are available to control multiple aspects of a process behavior. See the kerrighed_capabilities (7) manpage for more information.
migrate 10486538 3
Note that if a Kerrighed scheduler is loaded when you use the migrate (1) command, it might relocate the process right after you do. If a process has the CAN_MIGRATE capability, schedulers may migrate it at any time unless you configure them not to do so.
For more information on schedulers, see the dedicated page.
The last key feature of Kerrighed is checkpointing.
checkpoint 10486538
...
restart 10486538 1
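As an illustration, a typical checkpoint/restart sequence inside the container might look like this (long_running_job is a hypothetical application; the checkpoint version number 1 is an example):

```shell
# Start a job inside the Kerrighed container and remember its PID.
long_running_job &
JOB_PID=$!
# Take a checkpoint of the running process.
checkpoint $JOB_PID
# Later (for instance after the process has been killed), restore it
# from its checkpoint version 1.
restart $JOB_PID 1
```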