Autocluster is a set of scripts for building virtual clusters to test
clustered Samba. It uses Linux's libvirt and KVM virtualisation.

Autocluster is a collection of scripts, templates and configuration
files that allow you to create a cluster of virtual nodes very
quickly. You can create a cluster from scratch in less than 30
minutes. Once you have a base image you can then recreate a cluster
or create new virtual clusters in minutes.

Autocluster has recently been tested to create virtual clusters of
RHEL 6/7 nodes. Older versions were tested with RHEL 5.

INSTALLING AUTOCLUSTER
======================

Before you start, make sure you have the latest version of
autocluster. To download autocluster do this:

  git clone git://git.samba.org/autocluster.git

Or to update it, run "git pull" in the autocluster directory.

You probably want to add the directory where autocluster is installed
to your PATH, otherwise things may quickly become tedious.

HOST MACHINE SETUP
==================

This section explains how to set up a host machine to run virtual
clusters generated by autocluster.

1) Install and configure required software.

a) Install kvm, libvirt and expect.

Autocluster creates virtual machines that use libvirt to run under
KVM. This means that you will need to install both KVM and libvirt
on your host machine. Expect is used by the waitfor() function and
should be available for installation from your distribution.

Autocluster should work with the standard RHEL qemu-kvm and
libvirt packages. It will try to find the qemu-kvm binary. If
you've done something unusual then you'll need to set the KVM
configuration variable.

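For example, if your qemu-kvm binary lives somewhere non-standard, a
config entry along these lines (the path shown is only illustrative)
points autocluster at it:

  # illustrative path; adjust to wherever your qemu-kvm binary lives
  KVM=/usr/libexec/qemu-kvm
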
For RHEL5/CentOS5, useful packages for both kvm and libvirt used to
be available at:

  http://www.lfarkas.org/linux/packages/centos/5/x86_64/

However, since recent versions of RHEL5 ship with KVM, 3rd party
KVM RPMs for RHEL5 are now scarce.

RHEL5.4's KVM also has problems when autocluster uses virtio
shared disks, since multipath doesn't notice virtio disks. This
is fixed in RHEL5.6 and in a recent RHEL5.5 update - you should
be able to use the settings recommended above for RHEL6.

If you're still running RHEL5.4, you have lots of time, you have
lots of disk space, and you like complexity, then see the
sections below on "iSCSI shared disks" and "Raw IDE system disks".

Useful packages ship with Fedora Core 10 (Cambridge) and later.
Some of the above notes on RHEL might apply to Fedora's KVM.

Useful packages ship with Ubuntu 8.10 (Intrepid Ibex) and later.
In recent Ubuntu versions (e.g. 10.10 Maverick Meerkat) the KVM
package is called "qemu-kvm". Older versions have a package
simply called "kvm".

For other distributions you'll have to backport distro sources or
compile from upstream source as described below.

* For KVM see the "Downloads" and "Code" sections at:

  http://www.linux-kvm.org/

b) Install guestfish or qemu-nbd and nbd-client.

Autocluster needs a method of updating files in the disk image for
each node.

Recent Linux distributions, including RHEL since 6.0, contain
guestfish. Guestfish (see http://libguestfs.org/ - there are
binary packages for several distros here) is a CLI for
manipulating KVM/QEMU disk images. Autocluster supports
guestfish, so if guestfish is available then you should use it.
It should be more reliable than NBD.

Autocluster attempts to use the best available method (guestmount
-> guestfish -> loopback) for accessing disk images. If it chooses
a suboptimal method (e.g. nodes created with guestmount sometimes
won't boot), you can force the method:

  SYSTEM_DISK_ACCESS_METHOD=guestfish

If you can't use guestfish then you'll have to use NBD. For this
you will need the qemu-nbd and nbd-client programs, which
autocluster uses to loopback-nbd-mount the disk images when
configuring each node.

NBD for various distros:

qemu-nbd is only available in the old packages from lfarkas.org.
Recompiling the RHEL5 kvm package to support NBD is quite
straightforward. RHEL6 doesn't have an NBD kernel module, so is
harder to retrofit for NBD support - use guestfish instead.

Unless you can find an RPM for nbd-client, you will need to
download the source from:

  http://sourceforge.net/projects/nbd/

qemu-nbd is in the qemu-kvm or kvm package.

nbd-client is in the nbd package.

qemu-nbd is in the qemu-kvm or kvm package. In older releases
it is called kvm-nbd, so you need to set the QEMU_NBD
configuration variable.

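For instance, a config entry along these lines should do it (the
value is an assumption; use the binary's full path if it is not on
your PATH):

  # assumes the older Ubuntu binary name; adjust or use a full path
  QEMU_NBD=kvm-nbd
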
nbd-client is in the nbd-client package.

* As mentioned above, nbd can be found at:

  http://sourceforge.net/projects/nbd/

c) Environment and libvirt virtual networks

You will need to add the autocluster directory to your PATH.

You will need to configure the right libvirt networking setup. To
do this, run:

  host_setup/setup_networks.sh [ <myconfig> ]

If you're using a network setup different to the default then pass
your autocluster configuration filename, which should set the
NETWORKS variable. If you're using a variety of networks for
different clusters then you can probably run this script multiple
times.

You might also need to set:

  VIRSH_DEFAULT_CONNECT_URI=qemu:///system

in your environment so that virsh does KVM/QEMU things by default.

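One way to make this stick, as a sketch, is to export it from your
shell profile (which file you use is up to you):

  # e.g. in ~/.bashrc, so virsh defaults to the system QEMU/KVM instance
  export VIRSH_DEFAULT_CONNECT_URI=qemu:///system
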
2) Configure a local web/install server to provide the required YUM
repositories.

If your install server is far away then you may need a caching web
proxy on your local network.

If you don't have one, then you can install a squid proxy on your
host and set something like:

  WEBPROXY="http://10.0.0.1:3128/"

See host_setup/etc/squid/squid.conf for a sample config suitable
for a virtual cluster. Make sure it caches large objects and has
plenty of space. This will be needed to make downloading all the
RPMs to each client sane.

To test your squid setup, run a command like this:

  http_proxy=http://10.0.0.1:3128/ wget <some-url>

Check your firewall setup. If you have problems accessing the
proxy from your nodes (including from kickstart postinstall) then
check it again! Some distributions install nice "convenient"
firewalls by default that might block access to the squid port
from the nodes. On a current version of Fedora Core you may be
able to run system-config-firewall-tui to reconfigure the
firewall.

3) Set up a DNS server on your host. See host_setup/etc/bind/ for a
sample config that is suitable. It needs to redirect DNS queries
for your virtual domain to your Windows domain controller.

4) Download a RHEL (or CentOS) install ISO.

CREATING A CLUSTER
==================

A cluster comprises a single base disk image, a copy-on-write disk
image for each node and some XML files that tell libvirt about each
node's virtual hardware configuration. The copy-on-write disk images
save a lot of disk space on the host machine because they each use the
base disk image - without them the disk image for each cluster node
would need to contain the entire RHEL install.

The cluster creation process can be broken down into several main
steps:

1) Create a base disk image.

2) Create per-node disk images and corresponding XML files.

3) Update /etc/hosts to include cluster nodes.

4) Boot virtual machines for the nodes.

5) Post-boot configuration.

However, before you do this you will need to create a configuration
file. See the "CONFIGURATION" section below for more details.

Here are more details on the "create cluster" process. Note that
unless you have done something extra special then you'll need to run
all of this as root.

1) Create the base disk image using:

  ./autocluster base create

The first thing this step does is to check that it can connect to
the YUM server. If this fails make sure that there are no
firewalls blocking your access to the server.

The install will take about 10 to 15 minutes and you will see the
packages installing in your terminal.

The installation process uses kickstart. The choice of
postinstall script is set using the POSTINSTALL_TEMPLATE variable.
This can be used to install packages that will be common to all
nodes into the base image. This saves time later when you're
setting up the cluster nodes. However, current usage (given that
we test many versions of CTDB) is to default POSTINSTALL_TEMPLATE
to "" and install packages post-boot. This seems to be a
reasonable compromise between flexibility (the base image can be,
for example, a pristine RHEL7.0-base.qcow2, CTDB/Samba packages
are selected post-base creation) and speed of cluster creation.

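If you do want to bake common packages into the base image, the
setting would look something like this in your config file (the
template path is purely hypothetical):

  # hypothetical postinstall template; the default is "" (install post-boot)
  POSTINSTALL_TEMPLATE=templates/my-postinstall.sh
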
When that has finished you should mark that base image immutable,
like this:

  chattr +i /virtual/ac-base.img

That will ensure it won't change. This is a precaution, as the
image will be used as a basis file for the per-node images, and if
it changes your cluster will become corrupt.

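If you later need to rebuild or replace the base image, remove the
immutable flag first:

  chattr -i /virtual/ac-base.img
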
2) Now run "autocluster cluster build", specifying a configuration
file. For example:

  autocluster -c m1.autocluster cluster build

This will create and install the XML node descriptions and the
disk images for your cluster nodes, and any other nodes you have
configured. Each disk image is initially created as an "empty"
copy-on-write image, which is linked to the base image. Those
images are then accessed using guestfish or loopback-nbd-mounted,
and populated with system configuration files and other
potentially useful things (such as scripts). /etc/hosts is
updated, the cluster is booted and post-boot configuration is
done.

Instead of doing all of steps 2-5 with a single command you can do:

2) autocluster -c m1.autocluster cluster create

3) autocluster -c m1.autocluster cluster update_hosts

4) autocluster -c m1.autocluster cluster boot

5) autocluster -c m1.autocluster cluster configure

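Putting the pieces together, a complete run from nothing to a booted,
configured cluster is roughly the following sketch (the config name
and base image path are the illustrative ones used above):

  # one-off: build the base image and freeze it
  autocluster -c m1.autocluster base create
  chattr +i /virtual/ac-base.img

  # then build (create + update_hosts + boot + configure) the cluster
  autocluster -c m1.autocluster cluster build
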
BOOTING/DESTROYING A CLUSTER
============================

Autocluster provides a command called "vircmd", which is a thin
wrapper around libvirt's virsh command. vircmd takes a cluster name
instead of a node/domain name and runs the requested command on all
nodes in the cluster.

The most useful vircmd commands are:

  shutdown : graceful shutdown of a node
  destroy  : power off a node immediately

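Since vircmd simply passes the verb through to virsh for every node in
the cluster, an invocation is presumably along these lines (the
cluster name is illustrative):

  # gracefully shut down every node in cluster "m1"
  vircmd shutdown m1
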
You can watch boot progress like this:

  tail -f /var/log/kvm/serial.c1*

All the nodes have serial consoles, making it easier to capture
kernel panic messages and watch the nodes via ssh.

Autocluster copies some scripts to cluster nodes to enable post-boot
configuration. These are used to configure specialised subsystems
like GPFS or Samba and are installed in /root/scripts/ on each node.
The 2 main entry points are install_packages.sh and setup_cluster.sh.
To set up a clustered NAS system you will normally need to run
setup_gpfs.sh and setup_cluster.sh on one of the nodes. If you want
to run these manually, see autocluster's cluster_configure() function
for details.

There are also some older scripts that haven't been used for a while
and have probably bit-rotted, such as setup_tsm_client.sh and
setup_tsm_server.sh. However, they are still provided as examples.

CONFIGURATION
=============

Autocluster uses configuration files containing Unix shell style
variables. For example,

  FIRSTIP=30

indicates that the last octet of the first IP address in the cluster
will be 30. If an option contains multiple words then they will be
separated by underscores ('_'), as in ISO_DIR.

All options have an equivalent command-line option. Command-line
options are lowercase, with words separated by dashes ('-').

Normally you would use a configuration file with variables so that
you can repeat steps easily. The command-line equivalents are useful
for trying things out without resorting to an editor. You can
specify a configuration file to use on the autocluster command-line
using the -c option. For example:

  autocluster -c config-foo create base

If you don't provide a configuration file then autocluster will
look for a file called "config" in the current directory.

You can also use environment variables to override the default values
of configuration variables. However, both command-line options and
configuration file entries will override environment variables.

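As a quick way to see which value wins (FIRSTIP, --dump and the
config-foo file name all appear elsewhere in this document; 40 is an
arbitrary value), run something like:

  # if config-foo sets FIRSTIP, the environment's 40 is ignored
  FIRSTIP=40 autocluster -c config-foo --dump | grep FIRSTIP
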
Potentially useful information:

* Use "autocluster --help" to list all available command-line options
  - all the items listed under "configuration options:" are the
  equivalents of the settings for config files. This output also
  shows descriptions of the options.

* You can use the --dump option to check the current value of
  configuration variables. This is most useful when used in
  combination with grep:

    autocluster --dump | grep ISO_DIR

  In the past we recommended using --dump to create an initial
  configuration file. Don't do this - it is a bad idea! There are a
  lot of options and you'll create a huge file that you don't
  understand and can't debug!

* Configuration options are defined in config.d/*.defconf. You
  shouldn't need to look in these files... but sometimes they contain
  comments about options that are too long to fit into help strings.

* I recommend that you aim for the smallest possible configuration
  file. Start with just the settings you need (see the sketch after
  this list) and move on from there.

* The NODES configuration variable controls the types of nodes that
  are created. At the time of writing, the default value is:

    NODES="nas:0-3 rhel_base:4"

  This means that you get 4 clustered NAS nodes, at IP offsets 0, 1,
  2 & 3 from FIRSTIP, all part of the CTDB cluster. You also get an
  additional utility node at IP offset 4 that can be used, for
  example, as a test client. The base node will not be part of the
  CTDB cluster. It is just an extra node that can be used as a test
  client.

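As suggested above, a minimal configuration file might look something
like the sketch below. Every variable shown here is described
elsewhere in this document, but the values (particularly ISO_DIR's
path) are only illustrative:

  FIRSTIP=30
  NODES="nas:0-3 rhel_base:4"
  ISO_DIR=/data/ISOs
  WEBPROXY="http://10.0.0.1:3128/"
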
ISCSI SHARED DISKS
==================

The RHEL5 version of KVM does not support SCSI block device
emulation, so you can use either virtio or iSCSI shared disks.
Unfortunately, in RHEL5.4 and early versions of RHEL5.5, virtio block
devices are not supported by the version of multipath in RHEL5. So
this leaves iSCSI as the only choice.

The main configuration options you need for iSCSI disks are:

  SHARED_DISK_TYPE=iscsi
  NICMODEL=virtio   # Recommended for performance
  add_extra_package iscsi-initiator-utils

Note that SHARED_DISK_PREFIX and SHARED_DISK_CACHE are ignored for
iSCSI shared disks because KVM doesn't (need to) know about them.

You will need to install the scsi-target-utils package on the host
system. After creating a cluster, autocluster will print a message
that points you to a file tmp/iscsi.$CLUSTER - you need to run the
commands in this file (probably via: sh tmp/iscsi.$CLUSTER) before
booting your cluster. This will remove any old target with the same
ID, and create the new target, LUNs and ACLs.

You can use the following command to list information about the
iSCSI target:

  tgtadm --lld iscsi --mode target --op show

If you need multiple clusters using iSCSI on the same host then each
cluster will need to have a different setting for ISCSI_TID.

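For example, a second cluster's config file could simply carry a
different target ID (the value 2 is arbitrary):

  # give this cluster its own iSCSI target ID
  ISCSI_TID=2
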
RAW IDE SYSTEM DISKS
====================

RHEL versions of KVM do not support SCSI block device emulation, so
autocluster now defaults to using an IDE system disk instead of a
SCSI one. Therefore, you can use virtio or IDE system disks.
However, writeback caching, qcow2 and virtio are incompatible and
result in I/O corruption. So, you can use either virtio system disks
without any caching, accepting reduced performance, or you can use
IDE system disks with writeback caching, with nice performance.

For IDE disks, here are the required settings:

  SYSTEM_DISK_PREFIX=hd
  SYSTEM_DISK_CACHE=writeback

The next problem is that RHEL5's KVM does not include qemu-nbd. The
best solution is to build your own qemu-nbd and stop reading this
section.

If, for whatever reason, you're unable to build your own qemu-nbd,
then you can use raw, rather than qcow2, system disks. If you do
this then you need significantly more disk space (since the system
disks will be *copies* of the base image) and cluster creation time
will no longer be pleasantly snappy (due to the copying time - the
images are large and a single copy can take several minutes). So,
having tried to warn you off this option, if you really want to do
this then you'll need this setting:

  SYSTEM_DISK_FORMAT=raw

Note that if you're testing cluster creation with iSCSI shared disks
then you should find a way of switching off raw disks. This avoids
every iSCSI glitch costing you a lot of time while raw disks are
copied.

The -e option provides support for executing arbitrary bash code.
This is useful for testing and debugging.

One good use of this option is to test template substitution using
the function substitute_vars(). For example:

  ./autocluster -c example.autocluster -e 'CLUSTER=foo; DISK=foo.qcow2; UUID=abcdef; NAME=foon1; set_macaddrs; substitute_vars templates/node.xml'

This prints templates/node.xml with all appropriate substitutions
done. Some internal variables (e.g. CLUSTER, DISK, UUID, NAME) are
given fairly arbitrary values but the various MAC address strings are
set using the function set_macaddrs().

The -e option is also useful when writing scripts that use
autocluster. Given the complexities of the configuration system you
probably don't want to parse configuration files yourself to
determine the current settings. Instead, you can ask autocluster to
tell you useful pieces of information. For example, say you want to
script creating a base disk image and you want to ensure the image
is marked immutable:

  base_image=$(autocluster -c $CONFIG -e 'echo $VIRTBASE/$BASENAME.img')
  chattr -V -i "$base_image"

  if autocluster -c $CONFIG create base ; then
      chattr -V +i "$base_image"
  fi

Note that the command that autocluster should run is enclosed in
single quotes. This means that $VIRTBASE and $BASENAME will be
expanded within autocluster after the configuration file has been
loaded.

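By contrast, with double quotes your own shell would expand those
variables (almost certainly to empty strings) before autocluster ever
sees the command, which is not what you want:

  # Wrong: $VIRTBASE and $BASENAME expand in the calling shell, not autocluster
  autocluster -c $CONFIG -e "echo $VIRTBASE/$BASENAME.img"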