INTRODUCTION
============

Autocluster is a set of scripts for building virtual clusters to test
clustered Samba. It uses libvirt and Linux's KVM virtualisation
engine.

BASIC SETUP
===========

Before you start, make sure you have the latest version of
autocluster. To download autocluster do this:

  git clone git://git.samba.org/tridge/autocluster.git autocluster

Or to update it, run "git pull" in the autocluster directory.

To setup a virtual cluster for SoFS with autocluster follow these
steps:

1) Install kvm, libvirt, qemu-nbd and nbd-client. For various
   distros:

   * RHEL/CentOS

     For RHEL5/CentOS5, useful packages for both kvm and libvirt can
     be found here:

       http://www.lfarkas.org/linux/packages/centos/5/x86_64/

     You will need to install a matching kmod-kvm package to get the
     kernel module.

     qemu-nbd is in the kvm package.

     Unless you can find an RPM for nbd-client, you need to download
     the source from:

       http://sourceforge.net/projects/nbd/

     and build it.

   * Fedora Core

     Useful packages ship with Fedora Core 10 (Cambridge) and later.

     qemu-nbd is in the kvm package.

     nbd-client is in the nbd package.

   * Ubuntu

     Useful packages ship with Ubuntu 8.10 (Intrepid Ibex) and later.

     qemu-nbd is in the kvm package but is called kvm-nbd, so you
     need to set the QEMU_NBD configuration variable.

     nbd-client is in the nbd-client package.

   For other distributions you'll have to backport distro sources or
   compile from upstream source as described below:

   * For KVM see the "Downloads" and "Code" sections at:

       http://www.linux-kvm.org/

   * For libvirt see:

       http://libvirt.org/

   * As mentioned above, nbd can be found at:

       http://sourceforge.net/projects/nbd/

   You will need to configure the right kvm networking setup. The
   files in host_setup/etc/libvirt/qemu/networks/ should help. This
   command will install the right networks for kvm:

     rsync -av --delete host_setup/etc/libvirt/qemu/networks/ \
       /etc/libvirt/qemu/networks/

2) You need a caching web proxy on your local network. If you don't
   have one, then install a squid proxy on your host. See
   host_setup/etc/squid/squid.conf for a sample config suitable for a
   virtual cluster. Make sure it caches large objects and has plenty
   of space. This is needed to make downloading all the RPMs to each
   client sane.

   To test your squid setup, run a command like this:

     http_proxy=http://10.0.0.1:3128/ wget <url>

3) Setup a DNS server on your host. See host_setup/etc/bind/ for a
   sample config that is suitable. It needs to redirect DNS queries
   for your SoFS virtual domain to your windows domain controller.

4) Download a RHEL install ISO.

5) Create a "config" file in the autocluster directory. See the
   CONFIGURATION section below for more details.

6) Use "./autocluster create base" to create the base install image.
   The install will take about 10 to 15 minutes and you will see the
   packages installing in your terminal.

   Before you run "create base", make sure your web proxy cache is
   authenticated with the Mainz BSO (eg. connect to
   https://9.155.61.11 with a web browser).

7) When that has finished I recommend you mark that base image
   immutable like this:

     chattr +i /virtual/SoFS-1.5-base.img

   That will ensure it won't change. This is a precaution, as the
   image will be used as a basis file for the per-node images, and if
   it changes your cluster will become corrupt.

8) Now run "./autocluster create cluster", specifying a cluster name.
   For example:

     ./autocluster create cluster c1

   That will create your cluster nodes and the TSM server node.
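As a quick recap before booting anything, steps 6 to 8 boil down to
the following sequence (using the image path and cluster name from
the examples above):

  ./autocluster create base              # takes about 10-15 minutes
  chattr +i /virtual/SoFS-1.5-base.img   # protect the shared base image
  ./autocluster create cluster c1        # node images + TSM server node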
9) Now boot your cluster nodes like this:

     ./vircmd start c1

   The most useful vircmd commands are:

     start    : boot a node
     shutdown : graceful shutdown of a node
     destroy  : power off a node immediately

10) You can watch boot progress like this:

      tail -f /var/log/kvm/serial.c1*

    All the nodes have serial consoles, making it easier to capture
    kernel panic messages and to watch the nodes via ssh.

11) Now you can ssh into your nodes. You may like to look at the
    small set of scripts in /root/scripts on the nodes. In
    particular:

      setup_tsm_server.sh : run this on the TSM node to setup the
                            TSM server
      setup_tsm_client.sh : run this on the GPFS nodes to setup HSM
      mknsd.sh            : sets up the local shared disks as GPFS
                            NSDs
      setup_gpfs.sh       : sets up GPFS, creates a filesystem etc.,
                            bypassing the SoFS GUI. Useful for quick
                            tests.

12) If using the SoFS GUI, then you may want to lower the memory it
    uses so that it fits easily on the first node. Just edit this
    file on the first node:

      /opt/IBM/sofs/conf/overrides/sofs.javaopt

13) For automating the SoFS GUI, you may wish to install the iMacros
    extension to firefox, and look at some sample macros I have put
    in the imacros/ directory of autocluster. They will need editing
    for your environment, but they should give you some hints on how
    to automate the final GUI stage of the installation of a SoFS
    cluster.

CONFIGURATION
=============

* See config.sample for an example of a configuration file. Note
  that all items in the sample file are commented out by default.

* Configuration options are defined in config.d/*.defconf. All
  configuration options have an equivalent command-line option.

* Use "autocluster --help" to list all available command-line
  options - all the items listed under "configuration options:" are
  the equivalents of the settings for config files.

* Run "autocluster --dump > config.foo" (or similar) to create a
  config file containing the default values for all options that you
  can set. You can then delete all options for which you wish to
  keep the default values and modify the remaining ones, resulting
  in a relatively small config file.

* Use the --with-release option on the command-line or the
  with_release function in a configuration file to get default
  values for building virtual clusters for releases of particular
  "products". Currently there are only release definitions for SoFS.

  For example, you can setup default values for SoFS-1.5.3 by
  running:

    autocluster --with-release=SoFS-1.5.3 ...

  Equivalently you can use the following syntax in a configuration
  file:

    with_release "SoFS-1.5.3"

  The release definitions are stored in releases/*.release. The
  available releases are listed in the output of "autocluster
  --help".

  NOTE: Occasionally you will need to consider the position of
  with_release in your configuration. If you want to override
  options handled by a release definition then you will obviously
  need to set them later in your configuration. This will be the
  case for most options you will want to set. However, some options
  will need to appear before with_release so that they can be used
  within a release definition - the most obvious one is the (rarely
  used) RHEL_ARCH option, which is used in the default ISO setting
  for each release.
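As a concrete sketch, a minimal config file tying the above together
might look like this. Only options mentioned elsewhere in this
document are used; the values are illustrative assumptions for a
hypothetical Ubuntu host, not recommendations:

  # RHEL_ARCH must come before with_release so the release
  # definition can use it when computing the default ISO setting
  RHEL_ARCH="x86_64"

  with_release "SoFS-1.5.3"

  # Assumed Ubuntu host: qemu-nbd ships as kvm-nbd (see step 1)
  QEMU_NBD="kvm-nbd"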
DEVELOPMENT HINTS
=================

The -e option provides support for executing arbitrary bash code.
This is useful for testing and debugging.

One good use of this option is to test template substitution using
the function substitute_vars(). For example:

  ./autocluster --with-release=SoFS-1.5.3 \
    -e 'CLUSTER=foo; DISK=foo.qcow2; UUID=abcdef; NAME=foon1;
        set_macaddrs; substitute_vars templates/node.xml'

This prints templates/node.xml with all appropriate substitutions
done. Some internal variables (e.g. CLUSTER, DISK, UUID, NAME) are
given fairly arbitrary values, but the various MAC address strings
are set using the function set_macaddrs().
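In a similar spirit, it can be useful to see exactly which defaults a
release definition changes. One way (a sketch, assuming --dump
reflects options given earlier on the same command line) is to diff
two --dump outputs:

  ./autocluster --dump > defaults.conf
  ./autocluster --with-release=SoFS-1.5.3 --dump > sofs153.conf
  diff -u defaults.conf sofs153.conf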