metze/ctdb/wip.git
14 years ago doc: add metainfo "manual" and "source" in the ctdbd manual page
Michael Adam [Wed, 24 Feb 2010 11:53:21 +0000 (12:53 +0100)]
doc: add metainfo "manual" and "source" in the ctdbd manual page

14 years ago doc: fill metainfo "manual" and "source" in the ctdb manual page
Michael Adam [Wed, 24 Feb 2010 11:52:30 +0000 (12:52 +0100)]
doc: fill metainfo "manual" and "source" in the ctdb manual page

14 years ago Correction of spelling errors.
Mathieu Parent [Tue, 5 Jan 2010 10:04:24 +0000 (11:04 +0100)]
Correction of spelling errors.

* interupted -> interrupted
* dont -> don't

(thanks to lintian)

See https://bugzilla.samba.org/show_bug.cgi?id=6935

14 years ago Correction of spelling errors in manpages
Mathieu Parent [Tue, 5 Jan 2010 09:59:44 +0000 (10:59 +0100)]
Correction of spelling errors in manpages

thanks to lintian

See https://bugzilla.samba.org/show_bug.cgi?id=6935

14 years ago fix bug #7152: check NFS-Shares, fails with too long path-names
Michael Adam [Tue, 23 Feb 2010 10:00:23 +0000 (11:00 +0100)]
fix bug #7152: check NFS-Shares, fails with too long path-names

Thanks to Thomas Sesselmann <t.sesselmann@dkfz.de>.

Michael

14 years ago server:ctdb_send_dmaster_reply: fix a message typo.
Michael Adam [Wed, 6 Jan 2010 13:59:23 +0000 (14:59 +0100)]
server:ctdb_send_dmaster_reply: fix a message typo.

Michael

14 years ago doc: regenerate ctdb.1*
Stefan Metzmacher [Tue, 23 Feb 2010 09:29:27 +0000 (10:29 +0100)]
doc: regenerate ctdb.1*

metze

14 years ago doc/ctdb.1.xml: document "ctdb setifacelink <iface> <status>"
Stefan Metzmacher [Tue, 23 Feb 2010 09:36:46 +0000 (10:36 +0100)]
doc/ctdb.1.xml: document "ctdb setifacelink <iface> <status>"

metze

14 years ago doc/ctdb.1.xml: document "ctdb ipinfo <ip>"
Stefan Metzmacher [Tue, 23 Feb 2010 09:04:51 +0000 (10:04 +0100)]
doc/ctdb.1.xml: document "ctdb ipinfo <ip>"

metze

14 years ago doc/ctdb.1.xml: update "ctdb ip" documentation
Stefan Metzmacher [Tue, 23 Feb 2010 09:03:00 +0000 (10:03 +0100)]
doc/ctdb.1.xml: update "ctdb ip" documentation

metze

14 years ago doc/ctdb.1.xml: document "ctdb ifaces"
Stefan Metzmacher [Tue, 23 Feb 2010 09:01:50 +0000 (10:01 +0100)]
doc/ctdb.1.xml: document "ctdb ifaces"

metze

14 years ago doc/ctdb.1.xml: document PARTIALLYONLINE status
Stefan Metzmacher [Tue, 23 Feb 2010 07:35:08 +0000 (08:35 +0100)]
doc/ctdb.1.xml: document PARTIALLYONLINE status

metze

14 years ago config/13.per_ip_routing: fix typo in error message
Stefan Metzmacher [Fri, 12 Feb 2010 08:54:46 +0000 (09:54 +0100)]
config/13.per_ip_routing: fix typo in error message

metze

14 years ago config/13.per_ip_routing: use better names for release_script and setup_script
Stefan Metzmacher [Fri, 12 Feb 2010 13:06:40 +0000 (14:06 +0100)]
config/13.per_ip_routing: use better names for release_script and setup_script

As the basename of the script will be used for the readd script
from setup_iface_ip_readd_script, it's now easier to identify
what script is called by delete_ip_from_iface() while readding
ips to the interface.

metze

14 years ago config/13.per_ip_routing: register the setup script with setup_iface_ip_readd_script()
Stefan Metzmacher [Fri, 12 Feb 2010 08:52:09 +0000 (09:52 +0100)]
config/13.per_ip_routing: register the setup script with setup_iface_ip_readd_script()

This is needed because we need to resetup the routing table when
the delete_ip_from_iface() function readds the ip to the interface.

metze

14 years ago config/13.per_ip_routing: add a setup_per_ip_routing() function
Stefan Metzmacher [Tue, 9 Feb 2010 15:34:59 +0000 (16:34 +0100)]
config/13.per_ip_routing: add a setup_per_ip_routing() function

This combines the logic into a shell function which can be used by the
"takeip" and "updateip" hooks.

We check the return values of the "ip" commands now
instead of ignoring them.

We now create a setup_script.sh similar to the release_script.sh
which makes it easier to analyze problems.

metze

14 years ago server: add "setup" event
Stefan Metzmacher [Fri, 12 Feb 2010 10:24:08 +0000 (11:24 +0100)]
server: add "setup" event

This is needed because the "init" event can't use 'ctdb' commands.

metze

14 years ago config/10.interface: use delete_ip_from_iface also in the "init" event
Stefan Metzmacher [Fri, 12 Feb 2010 10:25:26 +0000 (11:25 +0100)]
config/10.interface: use delete_ip_from_iface also in the "init" event

metze

14 years ago config/11.natgw: use delete_ip_from_iface() instead of remove_ip()
Stefan Metzmacher [Fri, 12 Feb 2010 09:33:54 +0000 (10:33 +0100)]
config/11.natgw: use delete_ip_from_iface() instead of remove_ip()

This also initializes the variables correctly for the
shutdown|removenatgw code path to delete_all.

metze

14 years ago config: make remove_ip() a wrapper of delete_ip_from_iface()
Stefan Metzmacher [Fri, 12 Feb 2010 09:24:44 +0000 (10:24 +0100)]
config: make remove_ip() a wrapper of delete_ip_from_iface()

metze

14 years ago config: interface_modify states in a $CTDB_BASE/state/interface_modify directory
Stefan Metzmacher [Fri, 12 Feb 2010 09:23:17 +0000 (10:23 +0100)]
config: interface_modify states in a $CTDB_BASE/state/interface_modify directory

metze

14 years ago config: add setup_iface_ip_readd_script() helper function
Stefan Metzmacher [Fri, 12 Feb 2010 08:48:01 +0000 (09:48 +0100)]
config: add setup_iface_ip_readd_script() helper function

This adds a generic infrastructure to register scripts which will
be called when the delete_ip_from_iface() function needs to readd
secondary ips to an interface.

metze

14 years ago config: readd ips with a broadcast address in delete_ip_from_iface()
Stefan Metzmacher [Fri, 12 Feb 2010 08:55:28 +0000 (09:55 +0100)]
config: readd ips with a broadcast address in delete_ip_from_iface()

metze

14 years ago In ctdb_control_end_recovery,
Ronnie Sahlberg [Tue, 23 Feb 2010 01:43:49 +0000 (12:43 +1100)]
In ctdb_control_end_recovery,

We used to talloc_steal c (the command packet) and make it a child of the
"event script state context".
If we failed to create an eventscript child context for some reason,
this would have talloc freed state, but at the same time it would also
implicitly have freed c.
Once ctdb_control_end_recovery() returns the error back to the caller,
the caller would dereference both c, and also outdata which is a child of c
and we would either read garbage data or segv.

Change the ordering so we only talloc_steal c as a child of state IFF
we have successfully created a child context for the script.

BZ61068

14 years ago Make sure that the natgw eventscript also triggers on the "stopped" event
Ronnie Sahlberg [Mon, 22 Feb 2010 23:14:51 +0000 (10:14 +1100)]
Make sure that the natgw eventscript also triggers on the "stopped" event
    to remove the natgw configuration and ip assignments used.

BZ61036

14 years ago ctdb regsrvids is much more useful for testing if it sleeps once it has registered...
Ronnie Sahlberg [Mon, 22 Feb 2010 04:34:26 +0000 (15:34 +1100)]
ctdb regsrvids is much more useful for testing if it sleeps once it has registered its srvid.
Otherwise, as soon as it terminates, ctdbd will deregister the id automatically.

14 years ago From Sumit Bose <sbose@redhat.com>
Ronnie Sahlberg [Mon, 22 Feb 2010 03:06:52 +0000 (14:06 +1100)]
From Sumit Bose <sbose@redhat.com>

Fixes for init script to meet guidelines

14 years ago From Elia Pinto <gitter.spiros@gmail.com>
Ronnie Sahlberg [Mon, 22 Feb 2010 03:00:33 +0000 (14:00 +1100)]
From Elia Pinto <gitter.spiros@gmail.com>

We don't need to include getopt.h under AIX

14 years ago Ignore any scripts that time out for most events, except startup.
Ronnie Sahlberg [Tue, 16 Feb 2010 00:18:43 +0000 (11:18 +1100)]
Ignore any scripts that time out for most events, except startup.

Always treat hung scripts (except at startup) as success.

14 years ago try to restart rpc-rquotad if it is not running
Ronnie Sahlberg [Fri, 12 Feb 2010 02:19:57 +0000 (13:19 +1100)]
try to restart rpc-rquotad if it is not running

bz60317

14 years ago Leave sequence number alone when merely migrating records.
Rusty Russell [Fri, 12 Feb 2010 06:32:56 +0000 (17:02 +1030)]
Leave sequence number alone when merely migrating records.

(Based on earlier version from Ronnie which modified tdb; this one
is standalone).

When storing records in a tdb that has "automatic seqnum updates"
also check if the actual data for the record has changed or not.

If it has not changed at all, except for possibly the header,
this is likely just a dmaster migration operation in which case
we want to write the record to the tdb but we do not want the tdb
sequence number to be increased.

This resolves the problem of notify.tdb being thrashed under load:
the heuristic in smbd to only reread this when the sequence number
increases (rarely) breaks down.

Before, running nbench --num-progs=512 across 4 nodes, we saw numbers like:
 512      1496  118.33 MB/sec  execute 60 sec  latency 0.00 msec
And turning on latency tracking, this was typical in the logs:
 ctdbd: High latency 9380914.000000s for operation lockwait on database notify.tdb

After this commit:
  512      2451  143.85 MB/sec  execute 60 sec  latency 0.00 msec
And no more latency messages...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years ago Reduce loglevel for two eventscript related debug messages
Ronnie Sahlberg [Thu, 11 Feb 2010 01:00:43 +0000 (12:00 +1100)]
Reduce loglevel for two eventscript related debug messages

14 years ago Reducing the log level for a debug message
Ronnie Sahlberg [Thu, 11 Feb 2010 00:54:46 +0000 (11:54 +1100)]
Reducing the log level for a debug message

              DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\

14 years ago Reduce the log level for two debug messages
Ronnie Sahlberg [Thu, 11 Feb 2010 00:49:48 +0000 (11:49 +1100)]
Reduce the log level for two debug messages

       DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has
       DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n",

14 years ago Add a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
Ronnie Sahlberg [Thu, 11 Feb 2010 00:32:22 +0000 (11:32 +1100)]
Add a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
to control whether or not to check if we are swapping, and produce
useful output into the logfile if we are.

For production systems with dedicated nas-heads we should never swap.
But for developer/test systems we often use smaller nondedicated systems where
we can no longer guarantee that we will not be using swap.
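
A minimal sketch of such a check, assuming swap usage is read from /proc/meminfo-style text. The helper names swap_in_use_kb and check_swap are made up for illustration and are not the actual event script code; only the variable CTDB_CHECK_SWAP_IS_NOT_USED comes from the commit message above.

```shell
#!/bin/sh
# Hypothetical helper: print the kB of swap in use, given
# /proc/meminfo-style text on stdin (SwapTotal/SwapFree lines in kB).
swap_in_use_kb () {
    awk '/^SwapTotal:/ { total = $2 }
         /^SwapFree:/  { free = $2 }
         END           { print total - free }'
}

# Hypothetical wrapper: only complain when the check is enabled with
# CTDB_CHECK_SWAP_IS_NOT_USED="yes" and some swap is actually in use.
# Reads the meminfo text from stdin.
check_swap () {
    [ "$CTDB_CHECK_SWAP_IS_NOT_USED" = "yes" ] || return 0
    used=$(swap_in_use_kb)
    if [ "$used" -gt 0 ] ; then
        echo "WARNING: ${used} kB of swap in use"
        return 1
    fi
    return 0
}
```

On a real system this could be called as "check_swap < /proc/meminfo"; a developer machine would simply leave the variable unset or set it to "no".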

14 years ago lower the loglevel for a debug message for redundant releases of public ips
Ronnie Sahlberg [Thu, 11 Feb 2010 00:19:08 +0000 (11:19 +1100)]
lower the loglevel for a debug message for redundant releases of public ips

14 years ago Add a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
Ronnie Sahlberg [Thu, 11 Feb 2010 00:09:39 +0000 (11:09 +1100)]
Add a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
when set to "yes" this will skip checking if knfsd has hung or not.

bz59626

14 years ago fixed printing of high latency
Andrew Tridgell [Fri, 5 Feb 2010 06:11:29 +0000 (17:11 +1100)]
fixed printing of high latency

14 years ago Merge commit 'martins/master'
Ronnie Sahlberg [Thu, 11 Feb 2010 03:08:41 +0000 (14:08 +1100)]
Merge commit 'martins/master'

14 years ago Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.
Martin Schwenke [Wed, 10 Feb 2010 09:27:53 +0000 (20:27 +1100)]
Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.

Recent updates to the test meant that it only worked with the latest
ctdb versions.  This changes things so that we never bother matching
the machine readable header, just the actual data in the output.  It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years ago Merge commit 'origin/master'
Martin Schwenke [Wed, 10 Feb 2010 09:24:28 +0000 (20:24 +1100)]
Merge commit 'origin/master'

14 years ago commands that relate to manual failover of ip addresses (moveip)
Ronnie Sahlberg [Tue, 9 Feb 2010 07:34:47 +0000 (18:34 +1100)]
commands that relate to manual failover of ip addresses (moveip)
can sometimes take a long time, so allow a longer timeout for the controls used.

14 years ago don't just exit(0) upon successful completion of waiting for an ipreallocate to finish.
Ronnie Sahlberg [Tue, 9 Feb 2010 03:35:10 +0000 (14:35 +1100)]
don't just exit(0) upon successful completion of waiting for an ipreallocate to finish.
return success back to the caller instead.

otherwise things like 'ctdb enable -n all' will just finish after the first disabled node has become enabled.

14 years ago event scripts: add logging for low memory conditions
Rusty Russell [Tue, 9 Feb 2010 02:16:35 +0000 (12:46 +1030)]
event scripts: add logging for low memory conditions

We should never enter swap; if we do, show the memory state of the machine and the process list.  This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years ago ctdb: migrate to new dlinklist.h from Samba
Andrew Tridgell [Sun, 7 Feb 2010 08:02:06 +0000 (19:02 +1100)]
ctdb: migrate to new dlinklist.h from Samba

14 years ago onnode documentation - update documentation to reflect recent onnode changes.
Martin Schwenke [Fri, 5 Feb 2010 04:30:39 +0000 (15:30 +1100)]
onnode documentation - update documentation to reflect recent onnode changes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years ago Merge branch 'master' of git://git.samba.org/sahlberg/ctdb
Martin Schwenke [Fri, 5 Feb 2010 03:00:23 +0000 (14:00 +1100)]
Merge branch 'master' of git://git.samba.org/sahlberg/ctdb

14 years ago ctdb: when we fill the client packet queue we need to drop the client
Andrew Tridgell [Thu, 4 Feb 2010 03:36:14 +0000 (14:36 +1100)]
ctdb: when we fill the client packet queue we need to drop the client

We can't just drop packets to the list, as those packets could be part
of the core protocol the client is using. This happens (for example)
when Samba is doing a traverse. If we drop a traverse packet then
Samba hangs indefinitely. We are better off dropping the ctdb socket
to Samba.

14 years ago ctdb: move ctdb_io.c to use TLIST_*() macros
Andrew Tridgell [Thu, 4 Feb 2010 03:14:18 +0000 (14:14 +1100)]
ctdb: move ctdb_io.c to use TLIST_*() macros

This will make large packet queues much more efficient

14 years ago util: added TLIST_*() macros
Andrew Tridgell [Thu, 4 Feb 2010 03:13:49 +0000 (14:13 +1100)]
util: added TLIST_*() macros

The TLIST_*() macros are like the DLIST_*() macros, but take both a
head and tail pointer for the list. This means that adding an element
to the end of the list is efficient (it doesn't need to walk the
list).

We should move all uses of the DLIST_*() macros which use
DLIST_ADD_END() to use the TLIST_*() macros instead.

14 years ago When trying to enable/disable a node.
Ronnie Sahlberg [Wed, 3 Feb 2010 23:03:21 +0000 (10:03 +1100)]
When trying to enable/disable a node.
Check if the node is already enabled/disabled and log an information
message if so.

14 years ago We only queued up to 1000 packets per queue before we start dropping
Ronnie Sahlberg [Wed, 3 Feb 2010 22:54:06 +0000 (09:54 +1100)]
We only queued up to 1000 packets per queue before we start dropping
packets, to prevent the queue from growing excessively if smbd has blocked.

This could cause traverse packets to be discarded in case the main
smbd daemon does a traverse of a database while there is a recovery
(sending a reconfigured message to smbd, causing an avalanche of unlock
messages to be sent across the cluster.)

This avalanche of messages could also cause the traversal message to be
discarded, causing the main smbd process to hang indefinitely waiting
for the traversal message that will never arrive.

Bump the maximum queue length before starting to discard messages from
1000 to 1000000 and at the same time rework the queueing slightly so we
can append messages cheaply to the queue instead of walking the list
from head to tail every time.

14 years ago add two new debug controls to send and receive messages
Ronnie Sahlberg [Wed, 3 Feb 2010 22:45:32 +0000 (09:45 +1100)]
add two new debug controls to send and receive messages

ctdb msglisten and msgsend

14 years ago Drop the debug level for logging fd creation to DEBUG_DEBUG
Ronnie Sahlberg [Wed, 3 Feb 2010 19:37:41 +0000 (06:37 +1100)]
Drop the debug level for logging fd creation to DEBUG_DEBUG

14 years ago tdb: fix an early release of the global lock that can cause data corruption
Volker Lendecke [Fri, 29 Jan 2010 17:21:09 +0000 (18:21 +0100)]
tdb: fix an early release of the global lock that can cause data corruption

There was a bug in tdb where the

                tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);

(ending the transaction-"mutex") was done before the

                        /* remove the recovery marker */

This means that when a transaction is committed there is a window where another
opener of the file sees the transaction marker while the transaction committer
is still fully functional and working on it. This led to transaction being
rolled back by that second opener of the file while transaction_commit() gave
no error to the caller.

This patch moves the F_UNLCK to after the recovery marker was removed, closing
this window.

14 years ago eventscripts: stop loadconfig function from loading ctdb config file twice.
Martin Schwenke [Fri, 22 Jan 2010 06:19:12 +0000 (17:19 +1100)]
eventscripts: stop loadconfig function from loading ctdb config file twice.

If "$1" was empty then loadconfig would load the ctdb config twice.
This stops that from happening.

Signed-off-by: Martin Schwenke <martin@meltin.net>
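
One way to avoid a double load is sketched here. This is an illustration only and not the actual loadconfig code: the function name load_config_once and the CONFIG_DIR variable are assumptions made for this example.

```shell
#!/bin/sh
# Sketch only: a hypothetical loadconfig-like function that sources the
# main "ctdb" config first and then an optional per-service config,
# taking care not to source "ctdb" a second time when "$1" is empty.
# CONFIG_DIR is an assumed variable for this example.
load_config_once () {
    name="$1"

    # Always load the main ctdb config first.
    if [ -f "$CONFIG_DIR/ctdb" ] ; then
        . "$CONFIG_DIR/ctdb"
    fi

    # An empty or "ctdb" argument means there is nothing more to load.
    case "$name" in
        ""|ctdb) return 0 ;;
    esac

    # Load the per-service config if it exists.
    if [ -f "$CONFIG_DIR/$name" ] ; then
        . "$CONFIG_DIR/$name"
    fi
}
```

The early return is the point of the sketch: an empty argument must not fall through to a second load of the main config.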
14 years ago eventscript: Use of $NFS_TICKLE_SHARED_DIRECTORY must be after loadconfig.
Martin Schwenke [Fri, 22 Jan 2010 06:14:50 +0000 (17:14 +1100)]
eventscript: Use of $NFS_TICKLE_SHARED_DIRECTORY must be after loadconfig.

Proper fix for 085d1bea78fabf754ef6dd6d323f74a1d361e45c's workaround.
$NFS_TICKLE_SHARED_DIRECTORY was being used before it is set via
loadconfig.

Ronnie actually spotted this one.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years ago initscript: Remove bash-ism.
Martin Schwenke [Fri, 22 Jan 2010 06:13:17 +0000 (17:13 +1100)]
initscript: Remove bash-ism.

Also, change the order of the comparison so it is consistent with
others in the script.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years ago Merge commit 'origin/master'
Martin Schwenke [Fri, 22 Jan 2010 06:05:11 +0000 (17:05 +1100)]
Merge commit 'origin/master'

14 years ago initscript: handle spaces in option values inserted into $CTDB_OPTIONS.
Martin Schwenke [Fri, 22 Jan 2010 02:19:00 +0000 (13:19 +1100)]
initscript: handle spaces in option values inserted into $CTDB_OPTIONS.

This puts single quotes around everything and uses eval on the
command-lines that actually start ctdbd.  The eval causes the single
quotes to be interpreted.

The "redhat" init style no longer uses the Red Hat daemon function.
It loses the quoting and re-splits on spaces.  Instead we add an extra
line that uses the success/failure functions to keep things pretty.
Note that this means that we don't respect daemon's
$DAEMON_COREFILE_LIMIT variable but we do our own core file handling
with $CTDB_SUPPRESS_COREFILE anyway.  daemon's core file handling was
probably overriding what we were doing anyway, so this can be regarded
as a bug fix.

Signed-off-by: Martin Schwenke <martin@meltin.net>
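
The quoting technique described above can be sketched as follows. The helpers add_option and count_args, the OPTIONS variable and the option names are all hypothetical and only illustrate the single-quote plus eval idea; they are not taken from the init script.

```shell
#!/bin/sh
# Sketch only: accumulate options in a string, wrapping each value in
# single quotes, then let eval re-parse the string so that a value
# containing spaces stays one argument.
OPTIONS=""

add_option () {
    # $1 = option name, $2 = option value (in this simplified sketch the
    # value must not itself contain a single quote)
    OPTIONS="$OPTIONS $1='$2'"
}

# Stand-in for the daemon: just report how many arguments it received.
count_args () {
    echo "$#"
}

add_option --some-file "/tmp/my log"
add_option --other-dir "/tmp/lock dir"

# With eval the single quotes are interpreted, so each option arrives as
# one argument. A plain "count_args $OPTIONS" would split on the spaces
# and pass the quote characters through literally.
eval "count_args $OPTIONS"
```

Here the eval form passes two arguments, while the unquoted expansion without eval would pass four.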
14 years ago onnode: update algorithm for finding nodes file.
Martin Schwenke [Thu, 21 Jan 2010 02:40:03 +0000 (13:40 +1100)]
onnode: update algorithm for finding nodes file.

2 changes:

* If a relative nodes file is specified via -f or $CTDB_NODES_FILE but
  this file does not exist then try looking for the file in /etc/ctdb
  (or $CTDB_BASE if set).

* If a nodes file is specified via -f or $CTDB_NODES_FILE but this
  file does not exist (even when checked as per above) then do not
  fall back to /etc/ctdb/nodes (or $CTDB_BASE if set).  The old
  behaviour was surprising and hid errors.

Signed-off-by: Martin Schwenke <martin@meltin.net>
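
The lookup order from the two changes above can be sketched as a small shell function. The name find_nodes_file and the error text are assumptions for illustration; this is not the code from onnode itself.

```shell
#!/bin/sh
# Sketch only: find_nodes_file is a hypothetical function name; the
# lookup order follows the two changes described above.
find_nodes_file () {
    requested="$1"                  # from -f or $CTDB_NODES_FILE, may be empty
    base="${CTDB_BASE:-/etc/ctdb}"

    # Nothing requested: use the default nodes file.
    if [ -z "$requested" ] ; then
        echo "$base/nodes"
        return 0
    fi

    # The requested file exists as given.
    if [ -e "$requested" ] ; then
        echo "$requested"
        return 0
    fi

    # A relative name that does not exist: try it under the base directory.
    case "$requested" in
        /*) ;;
        *)
            if [ -e "$base/$requested" ] ; then
                echo "$base/$requested"
                return 0
            fi
            ;;
    esac

    # Do not fall back to the default file: report the error instead.
    echo "nodes file \"$requested\" not found" >&2
    return 1
}
```

Failing loudly in the last step is the design choice the commit argues for: silently using the default nodes file would hide a mistyped file name.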
14 years ago onnode - respect $CTDB_BASE rather than hard-coding /etc/ctdb.
Martin Schwenke [Thu, 21 Jan 2010 02:16:18 +0000 (13:16 +1100)]
onnode - respect $CTDB_BASE rather than hard-coding /etc/ctdb.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years ago config: 10.interface: search "ethtool" in $PATH instead of using a hardcoded path
Stefan Metzmacher [Mon, 18 Jan 2010 12:05:54 +0000 (13:05 +0100)]
config: 10.interface: search "ethtool" in $PATH instead of using a hardcoded path

This is very useful for testing, I use such a script:

cat ~/bin/ethtool
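 #!/bin/sh
 # Test wrapper around the real ethtool.

 IFACE="$1"

 case "$IFACE" in
        Neth2)
                ;;
        Neth3)
                ;;
        Neth4)
                ;;
        Neth5)
                ;;
        *)
                # Any other interface: hand over to the real ethtool unchanged.
                exec /usr/sbin/ethtool "$@"
                ;;
 esac

 # Interfaces matched above: force the link down before querying it.
 ip link set down "$IFACE"

 exec /usr/sbin/ethtool "$@"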

metze
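
Why a $PATH lookup helps testing can be shown with a throwaway stub. This is a sketch only: the stub directory, the stub's fixed output and the interface name eth0 are made up for the example.

```shell
#!/bin/sh
# Sketch only: shows how a stub placed early in $PATH shadows the real
# ethtool. stub_dir and the stub's output are made up for this example.
stub_dir=$(mktemp -d)

cat > "$stub_dir/ethtool" <<'EOF'
#!/bin/sh
# Fake ethtool: always report the link as up.
echo "Link detected: yes"
EOF
chmod +x "$stub_dir/ethtool"

# With the stub directory first in $PATH, a plain "ethtool" call finds the
# stub, whereas a hardcoded /usr/sbin/ethtool would bypass it.
PATH="$stub_dir:$PATH"
ethtool eth0
```

The same mechanism is what lets a wrapper like the one quoted in the commit message sit in ~/bin during tests.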

14 years ago server: reload the public addresses before doing a takeover run
Stefan Metzmacher [Tue, 19 Jan 2010 07:42:48 +0000 (08:42 +0100)]
server: reload the public addresses before doing a takeover run

metze

14 years ago server: ban ourself if the ctdb and kernel knowledge of a public ip differs
Stefan Metzmacher [Mon, 18 Jan 2010 14:04:32 +0000 (15:04 +0100)]
server: ban ourself if the ctdb and kernel knowledge of a public ip differs

metze

14 years ago server: give an error if we're getting a takeover_ip event with a wrong pnn
Stefan Metzmacher [Mon, 18 Jan 2010 14:38:01 +0000 (15:38 +0100)]
server: give an error if we're getting a takeover_ip event with a wrong pnn

metze

14 years ago server: return an error if we get a takeover ip event and we cannot serve the ip
Stefan Metzmacher [Mon, 18 Jan 2010 14:08:15 +0000 (15:08 +0100)]
server: return an error if we get a takeover ip event and we cannot serve the ip

metze

14 years ago server: print node number as signed integer on release ip event
Stefan Metzmacher [Mon, 18 Jan 2010 14:12:46 +0000 (15:12 +0100)]
server: print node number as signed integer on release ip event

metze

14 years ago server: debug redundant takeover ip events with level INFO
Stefan Metzmacher [Mon, 18 Jan 2010 14:22:16 +0000 (15:22 +0100)]
server: debug redundant takeover ip events with level INFO

metze

14 years ago server: be less verbose on redundant release_ip events
Stefan Metzmacher [Mon, 18 Jan 2010 14:04:32 +0000 (15:04 +0100)]
server: be less verbose on redundant release_ip events

metze

14 years ago server: add a ctdb_do_updateip()
Stefan Metzmacher [Sat, 16 Jan 2010 14:01:17 +0000 (15:01 +0100)]
server: add a ctdb_do_updateip()

metze

14 years ago server: split out a ctdb_do_takeover_ip() function
Stefan Metzmacher [Sat, 16 Jan 2010 12:30:58 +0000 (13:30 +0100)]
server: split out a ctdb_do_takeover_ip() function

metze

14 years ago server: split out a ctdb_announce_vnn_iface() function
Stefan Metzmacher [Sat, 16 Jan 2010 12:20:45 +0000 (13:20 +0100)]
server: split out a ctdb_announce_vnn_iface() function

metze

14 years ago events: add updateip event to 13.per_ip_routing
Stefan Metzmacher [Mon, 21 Dec 2009 07:45:19 +0000 (08:45 +0100)]
events: add updateip event to 13.per_ip_routing

metze

14 years ago events: 10.interface handle updateip event
Stefan Metzmacher [Mon, 21 Dec 2009 07:40:50 +0000 (08:40 +0100)]
events: 10.interface handle updateip event

metze

14 years ago server: add updateip event
Stefan Metzmacher [Mon, 21 Dec 2009 07:33:55 +0000 (08:33 +0100)]
server: add updateip event

metze

14 years ago config: add CTDB_PARTIALLY_ONLINE_INTERFACES to ctdb.sysconfig
Stefan Metzmacher [Mon, 21 Dec 2009 13:02:03 +0000 (14:02 +0100)]
config: add CTDB_PARTIALLY_ONLINE_INTERFACES to ctdb.sysconfig

With this option set to "yes", we don't become unhealthy
as long as at least one interface is still available.

metze

14 years ago server: start with disabled interfaces and let the event scripts enable the interface...
Stefan Metzmacher [Mon, 21 Dec 2009 18:18:10 +0000 (19:18 +0100)]
server: start with disabled interfaces and let the event scripts enable the interfaces explicitly

This makes sure that we don't get public addresses assigned during the
initial recovery and remove them again in the startup event.

metze

14 years ago config: 10.interfaces call monitor_interfaces on startup
Stefan Metzmacher [Tue, 22 Dec 2009 14:25:30 +0000 (15:25 +0100)]
config: 10.interfaces call monitor_interfaces on startup

metze

14 years ago config: 10.interfaces call ctdb ifaces and ctdb setifacelink for monitoring
Stefan Metzmacher [Tue, 22 Dec 2009 14:25:30 +0000 (15:25 +0100)]
config: 10.interfaces call ctdb ifaces and ctdb setifacelink for monitoring

metze

14 years ago events: split out a monitor_interfaces function in 10.interface
Stefan Metzmacher [Mon, 14 Dec 2009 10:59:45 +0000 (11:59 +0100)]
events: split out a monitor_interfaces function in 10.interface

metze

14 years ago server: monitor interfaces in verify_ip_allocation()
Stefan Metzmacher [Tue, 22 Dec 2009 14:21:08 +0000 (15:21 +0100)]
server: monitor interfaces in verify_ip_allocation()

metze

14 years ago server: only trigger one takeover run in verify_ip_allocation()
Stefan Metzmacher [Tue, 22 Dec 2009 14:21:08 +0000 (15:21 +0100)]
server: only trigger one takeover run in verify_ip_allocation()

metze

14 years ago tools/ctdb: add PartiallyOnline state for "ctdb status" and "ctdb status -Y"
Stefan Metzmacher [Mon, 21 Dec 2009 12:30:45 +0000 (13:30 +0100)]
tools/ctdb: add PartiallyOnline state for "ctdb status" and "ctdb status -Y"

This is based on the GET_IFACES control against each node.

metze

14 years ago tools/ctdb: display interfaces in "ctdb ip" and "ctdb ip -Y" outputs
Stefan Metzmacher [Sat, 16 Jan 2010 09:36:35 +0000 (10:36 +0100)]
tools/ctdb: display interfaces in "ctdb ip" and "ctdb ip -Y" outputs

metze

14 years ago tests: add an all_ips_on_node() helper function that wraps ctdb ip -Y
Stefan Metzmacher [Sat, 16 Jan 2010 09:35:41 +0000 (10:35 +0100)]
tests: add an all_ips_on_node() helper function that wraps ctdb ip -Y

metze

14 years ago tests/simple/11_ctdb_ip.sh: be more strict in checking ctdb ip -Y output
Stefan Metzmacher [Fri, 15 Jan 2010 09:53:14 +0000 (10:53 +0100)]
tests/simple/11_ctdb_ip.sh: be more strict in checking ctdb ip -Y output

metze

14 years ago tools/ctdb: add "ctdb ipinfo <ip>"
Stefan Metzmacher [Thu, 17 Dec 2009 10:23:59 +0000 (11:23 +0100)]
tools/ctdb: add "ctdb ipinfo <ip>"

metze

14 years ago tools/ctdb: add "ctdb setifacelink <iface> <status>"
Stefan Metzmacher [Wed, 16 Dec 2009 16:02:23 +0000 (17:02 +0100)]
tools/ctdb: add "ctdb setifacelink <iface> <status>"

metze

14 years ago tools/ctdb: add "ctdb ifaces"
Stefan Metzmacher [Wed, 16 Dec 2009 15:50:23 +0000 (16:50 +0100)]
tools/ctdb: add "ctdb ifaces"

metze

14 years ago server: implement ctdb_control_set_iface_link()
Stefan Metzmacher [Thu, 17 Dec 2009 09:30:36 +0000 (10:30 +0100)]
server: implement ctdb_control_set_iface_link()

This only marks the interface status and doesn't
generate any directly triggered action.

The action is later taken by the recovery process
in verify_ip_allocation.

metze

14 years ago server: implement ctdb_control_get_ifaces()
Stefan Metzmacher [Wed, 16 Dec 2009 10:14:44 +0000 (11:14 +0100)]
server: implement ctdb_control_get_ifaces()

metze

14 years ago server: implement ctdb_control_get_public_ip_info()
Stefan Metzmacher [Wed, 16 Dec 2009 10:20:28 +0000 (11:20 +0100)]
server: implement ctdb_control_get_public_ip_info()

metze

14 years ago client: implement ctdb_ctrl_set_iface_link()
Stefan Metzmacher [Wed, 16 Dec 2009 15:18:36 +0000 (16:18 +0100)]
client: implement ctdb_ctrl_set_iface_link()

metze

14 years ago client: implement ctdb_ctrl_get_ifaces()
Stefan Metzmacher [Wed, 16 Dec 2009 14:30:07 +0000 (15:30 +0100)]
client: implement ctdb_ctrl_get_ifaces()

metze

14 years ago client: implement ctdb_ctrl_get_public_ip_info()
Stefan Metzmacher [Wed, 16 Dec 2009 15:23:08 +0000 (16:23 +0100)]
client: implement ctdb_ctrl_get_public_ip_info()

metze