metze/ctdb/wip.git
9 years agoconfig: add setup_iface_ip_readd_script() helper function
Stefan Metzmacher [Fri, 12 Feb 2010 08:48:01 +0000 (09:48 +0100)]
config: add setup_iface_ip_readd_script() helper function

This adds a generic infrastructure to register scripts which will
be called when the delete_ip_from_iface() function needs to re-add
secondary ips to an interface.

metze

9 years agoconfig: readd ips with a broadcast address in delete_ip_from_iface()
Stefan Metzmacher [Fri, 12 Feb 2010 08:55:28 +0000 (09:55 +0100)]
config: readd ips with a broadcast address in delete_ip_from_iface()

metze

9 years agoIn ctdb_control_end_recovery,
Ronnie Sahlberg [Tue, 23 Feb 2010 01:43:49 +0000 (12:43 +1100)]
In ctdb_control_end_recovery,

We used to talloc_steal c (the command packet) and make it a child of the
"event script state context".
If we failed to create an eventscript child context for some reason,
this would have talloc freed state, but at the same time it would also
implicitly have freed c.
Once ctdb_control_end_recovery() returns the error back to the caller,
the caller would dereference both c, and also outdata which is a child of c
and we would either read garbage data or segv.

Change the ordering so we only talloc_steal c as a child of state IFF
we have successfully created a child context for the script.

BZ61068

9 years ago Make sure that the natgw eventscript also triggers on the "stopped" event
Ronnie Sahlberg [Mon, 22 Feb 2010 23:14:51 +0000 (10:14 +1100)]
Make sure that the natgw eventscript also triggers on the "stopped" event
    to remove the natgw configuration and ip assignments used.

BZ61036

9 years agoctdb regsrvids is much more useful for testing if it sleeps once it has registered...
Ronnie Sahlberg [Mon, 22 Feb 2010 04:34:26 +0000 (15:34 +1100)]
ctdb regsrvids is much more useful for testing if it sleeps once it has registered its srvid.
Otherwise, as soon as it terminates, ctdbd will deregister the id automatically.

9 years agoFrom Sumit Bose <sbose@redhat.com>
Ronnie Sahlberg [Mon, 22 Feb 2010 03:06:52 +0000 (14:06 +1100)]
From Sumit Bose <sbose@redhat.com>

Fixes for init script to meet guidelines

9 years agoFrom Elia Pinto <gitter.spiros@gmail.com>
Ronnie Sahlberg [Mon, 22 Feb 2010 03:00:33 +0000 (14:00 +1100)]
From Elia Pinto <gitter.spiros@gmail.com>

We don't need to include getopt.h under AIX

9 years agoIgnore any scripts that time out for most events, except startup.
Ronnie Sahlberg [Tue, 16 Feb 2010 00:18:43 +0000 (11:18 +1100)]
Ignore any scripts that time out for most events, except startup.

Always treat hung scripts (except during startup) as success.

9 years agotry to restart rpc-rquotad if it is not running
Ronnie Sahlberg [Fri, 12 Feb 2010 02:19:57 +0000 (13:19 +1100)]
try to restart rpc-rquotad if it is not running

bz60317

9 years agoLeave sequence number alone when merely migrating records.
Rusty Russell [Fri, 12 Feb 2010 06:32:56 +0000 (17:02 +1030)]
Leave sequence number alone when merely migrating records.

(Based on earlier version from Ronnie which modified tdb; this one
is standalone).

When storing records in a tdb that has "automatic seqnum updates"
also check if the actual data for the record has changed or not.

If it has not changed at all, except for possibly the header,
this is likely just a dmaster migration operation in which case
we want to write the record to the tdb but we do not want the tdb
sequence number to be increased.

This resolves the problem of notify.tdb being thrashed under load:
the heuristic in smbd to only reread this when the sequence number
increases (rarely) breaks down.

Before, running nbench --num-progs=512 across 4 nodes, we saw numbers like:
 512      1496  118.33 MB/sec  execute 60 sec  latency 0.00 msec
And turning on latency tracking, this was typical in the logs:
 ctdbd: High latency 9380914.000000s for operation lockwait on database notify.tdb

After this commit:
  512      2451  143.85 MB/sec  execute 60 sec  latency 0.00 msec
And no more latency messages...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
9 years agoReduce loglevel for two eventscript related debug messages
Ronnie Sahlberg [Thu, 11 Feb 2010 01:00:43 +0000 (12:00 +1100)]
Reduce loglevel for two eventscript related debug messages

9 years agoReducing the log level for a debug message
Ronnie Sahlberg [Thu, 11 Feb 2010 00:54:46 +0000 (11:54 +1100)]
Reducing the log level for a debug message

              DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\

9 years agoReduce the log level for two debug messages
Ronnie Sahlberg [Thu, 11 Feb 2010 00:49:48 +0000 (11:49 +1100)]
Reduce the log level for two debug messages

       DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has
       DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n",

9 years agoAdd a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
Ronnie Sahlberg [Thu, 11 Feb 2010 00:32:22 +0000 (11:32 +1100)]
Add a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
to control whether or not to check if we are swapping, and produce
useful output into the logfile if we are.

For production systems with dedicated nas-heads we should never swap.
But for developer/test systems we often use smaller nondedicated systems where
we can no longer guarantee that we will not be using swap.
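
A minimal sketch of how such a check could look in an event script. This is
an illustration only, not the code from this commit: the function names
swap_used_kb and check_swap are hypothetical, and it assumes a
/proc/meminfo-style file with SwapTotal and SwapFree lines.

```shell
#!/bin/sh
# Hypothetical sketch, not the actual ctdb event script code.

# swap_used_kb FILE: print the kB of swap in use (SwapTotal - SwapFree),
# reading a /proc/meminfo-style FILE.
swap_used_kb () {
    awk '$1 == "SwapTotal:" { total = $2 }
         $1 == "SwapFree:"  { free = $2 }
         END { print total - free }' "$1"
}

# check_swap FILE: print a warning when the option is set to "yes"
# and some swap is in use; do nothing otherwise.
check_swap () {
    if [ "${CTDB_CHECK_SWAP_IS_NOT_USED:-no}" != "yes" ]; then
        return 0
    fi
    used=$(swap_used_kb "$1")
    if [ "$used" -gt 0 ]; then
        echo "WARNING: ${used} kB of swap in use"
    fi
    return 0
}
```

With the variable set to "yes", a script could call check_swap /proc/meminfo
on each monitor event; with any other value the check is skipped entirely.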

9 years agolower the loglevel for a debug message for redundant releases of public ips
Ronnie Sahlberg [Thu, 11 Feb 2010 00:19:08 +0000 (11:19 +1100)]
lower the loglevel for a debug message for redundant releases of public ips

9 years agoAdd a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
Ronnie Sahlberg [Thu, 11 Feb 2010 00:09:39 +0000 (11:09 +1100)]
Add a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
when set to "yes" this will skip checking if knfsd has hung or not.

bz59626

9 years agofixed printing of high latency
Andrew Tridgell [Fri, 5 Feb 2010 06:11:29 +0000 (17:11 +1100)]
fixed printing of high latency

9 years agoMerge commit 'martins/master'
Ronnie Sahlberg [Thu, 11 Feb 2010 03:08:41 +0000 (14:08 +1100)]
Merge commit 'martins/master'

9 years agoTest suite: Make "ctdb ip" test backward compatible with older ctdb versions.
Martin Schwenke [Wed, 10 Feb 2010 09:27:53 +0000 (20:27 +1100)]
Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.

Recent updates to the test meant that it only worked with the latest
ctdb versions.  This changes things so that we never bother matching
the machine readable header, just the actual data in the output.  It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoMerge commit 'origin/master'
Martin Schwenke [Wed, 10 Feb 2010 09:24:28 +0000 (20:24 +1100)]
Merge commit 'origin/master'

9 years agocommands that relate to manual failover of ip addresses (moveip)
Ronnie Sahlberg [Tue, 9 Feb 2010 07:34:47 +0000 (18:34 +1100)]
commands that relate to manual failover of ip addresses (moveip)
can sometimes take a long time, so allow a longer timeout for the controls used.

9 years agodon't just exit(0) upon successful completion of waiting for an ipreallocate to finish.
Ronnie Sahlberg [Tue, 9 Feb 2010 03:35:10 +0000 (14:35 +1100)]
don't just exit(0) upon successful completion of waiting for an ipreallocate to finish.
return success back to the caller instead.

Otherwise, things like 'ctdb enable -n all' will just finish after the first disabled node has become enabled.

9 years agoevent scripts: add logging for low memory conditions
Rusty Russell [Tue, 9 Feb 2010 02:16:35 +0000 (12:46 +1030)]
event scripts: add logging for low memory conditions

We should never enter swap; if we do, show the memory state of the machine and the process list.  This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
9 years agoctdb: migrate to new dlinklist.h from Samba
Andrew Tridgell [Sun, 7 Feb 2010 08:02:06 +0000 (19:02 +1100)]
ctdb: migrate to new dlinklist.h from Samba

9 years agoonnode documentation - update documentation to reflect recent onnode changes.
Martin Schwenke [Fri, 5 Feb 2010 04:30:39 +0000 (15:30 +1100)]
onnode documentation - update documentation to reflect recent onnode changes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoMerge branch 'master' of git://git.samba.org/sahlberg/ctdb
Martin Schwenke [Fri, 5 Feb 2010 03:00:23 +0000 (14:00 +1100)]
Merge branch 'master' of git://git.samba.org/sahlberg/ctdb

9 years agoctdb: when we fill the client packet queue we need to drop the client
Andrew Tridgell [Thu, 4 Feb 2010 03:36:14 +0000 (14:36 +1100)]
ctdb: when we fill the client packet queue we need to drop the client

We can't just drop packets to the list, as those packets could be part
of the core protocol the client is using. This happens (for example)
when Samba is doing a traverse. If we drop a traverse packet then
Samba hangs indefinitely. We are better off dropping the ctdb socket
to Samba.

9 years agoctdb: move ctdb_io.c to use TLIST_*() macros
Andrew Tridgell [Thu, 4 Feb 2010 03:14:18 +0000 (14:14 +1100)]
ctdb: move ctdb_io.c to use TLIST_*() macros

This will make large packet queues much more efficient

9 years agoutil: added TLIST_*() macros
Andrew Tridgell [Thu, 4 Feb 2010 03:13:49 +0000 (14:13 +1100)]
util: added TLIST_*() macros

The TLIST_*() macros are like the DLIST_*() macros, but take both a
head and tail pointer for the list. This means that adding an element
to the end of the list is efficient (it doesn't need to walk the
list).

We should move all uses of the DLIST_*() macros which use
DLIST_ADD_END() to use the TLIST_*() macros instead.

9 years agoWhen trying to enable/disable a node.
Ronnie Sahlberg [Wed, 3 Feb 2010 23:03:21 +0000 (10:03 +1100)]
When trying to enable/disable a node.
Check if the node is already enabled/disabled and log an information
message if so.

9 years agoWe only queued up to 1000 packets per queue before we start dropping
Ronnie Sahlberg [Wed, 3 Feb 2010 22:54:06 +0000 (09:54 +1100)]
We only queued up to 1000 packets per queue before we start dropping
packets, to keep the queue from growing excessively if smbd has blocked.

This could cause traverse packets to be discarded if the main
smbd daemon does a traverse of a database while there is a recovery
(which sends a reconfigured message to smbd, causing an avalanche of unlock
messages to be sent across the cluster).

This avalanche of messages could also cause the traversal message to be
discarded, causing the main smbd process to hang indefinitely waiting
for the traversal message that will never arrive.

Bump the maximum queue length before starting to discard messages from
1000 to 1000000 and at the same time rework the queueing slightly so we
can append messages cheaply to the queue instead of walking the list
from head to tail every time.

9 years agoadd two new debug controls to send and receive messages
Ronnie Sahlberg [Wed, 3 Feb 2010 22:45:32 +0000 (09:45 +1100)]
add two new debug controls to send and receive messages

ctdb msglisten and msgsend

9 years agoDrop the debug level for logging fd creation to DEBUG_DEBUG
Ronnie Sahlberg [Wed, 3 Feb 2010 19:37:41 +0000 (06:37 +1100)]
Drop the debug level for logging fd creation to DEBUG_DEBUG

9 years agotdb: fix an early release of the global lock that can cause data corruption
Volker Lendecke [Fri, 29 Jan 2010 17:21:09 +0000 (18:21 +0100)]
tdb: fix an early release of the global lock that can cause data corruption

There was a bug in tdb where the

                tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);

(ending the transaction-"mutex") was done before the

                        /* remove the recovery marker */

This means that when a transaction is committed there is a window where another
opener of the file sees the transaction marker while the transaction committer
is still fully functional and working on it. This led to transaction being
rolled back by that second opener of the file while transaction_commit() gave
no error to the caller.

This patch moves the F_UNLCK to after the recovery marker was removed, closing
this window.

9 years agoeventscripts: stop loadconfig function from loading ctdb config file twice.
Martin Schwenke [Fri, 22 Jan 2010 06:19:12 +0000 (17:19 +1100)]
eventscripts: stop loadconfig function from loading ctdb config file twice.

If "$1" was empty then loadconfig would load the ctdb config twice.
This stops that from happening.

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoeventscript: Use of $NFS_TICKLE_SHARED_DIRECTORY must be after loadconfig.
Martin Schwenke [Fri, 22 Jan 2010 06:14:50 +0000 (17:14 +1100)]
eventscript: Use of $NFS_TICKLE_SHARED_DIRECTORY must be after loadconfig.

Proper fix for 085d1bea78fabf754ef6dd6d323f74a1d361e45c's workaround.
$NFS_TICKLE_SHARED_DIRECTORY was being used before it is set via
loadconfig.

Ronnie actually spotted this one.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoinitscript: Remove bash-ism.
Martin Schwenke [Fri, 22 Jan 2010 06:13:17 +0000 (17:13 +1100)]
initscript: Remove bash-ism.

Also, change the order of the comparison so it is consistent with
others in the script.

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoMerge commit 'origin/master'
Martin Schwenke [Fri, 22 Jan 2010 06:05:11 +0000 (17:05 +1100)]
Merge commit 'origin/master'

9 years agoinitscript: handle spaces in option values inserted into $CTDB_OPTIONS.
Martin Schwenke [Fri, 22 Jan 2010 02:19:00 +0000 (13:19 +1100)]
initscript: handle spaces in option values inserted into $CTDB_OPTIONS.

This puts single quotes around everything and uses eval on the
command-lines that actually start ctdbd.  The eval causes the single
quotes to be interpreted.

The "redhat" init style no longer uses the Red Hat daemon function.
It loses the quoting and re-splits on spaces.  Instead we add an extra
line that uses the success/failure functions to keep things pretty.
Note that this means that we don't respect daemon's
$DAEMON_COREFILE_LIMIT variable but we do our own core file handling
with $CTDB_SUPPRESS_COREFILE anyway.  daemon's core file handling was
probably overriding what we were doing anyway, so this can be regarded
as a bug fix.
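
The quoting technique described above can be sketched as follows. This is a
minimal illustration under stated assumptions, not the init script itself:
add_option and count_args are hypothetical helpers, the option names are
only examples, and values are assumed to contain no single quotes.

```shell
#!/bin/sh
# Hypothetical sketch of the quote-then-eval technique.

# add_option NAME VALUE: append --NAME='VALUE' to CTDB_OPTIONS.
# Assumes VALUE contains no single quotes.
add_option () {
    CTDB_OPTIONS="${CTDB_OPTIONS}${CTDB_OPTIONS:+ }--$1='$2'"
}

# count_args: stand-in for ctdbd; prints how many arguments it received.
count_args () {
    echo "$#"
}

CTDB_OPTIONS=""
add_option logfile "/var/log/my ctdb.log"
add_option reclock "/shared/lock file"

# Without eval the string is split on every space and the quotes stay literal.
count_args $CTDB_OPTIONS         # prints 4
# With eval the single quotes are interpreted, so each value stays intact.
eval count_args "$CTDB_OPTIONS"  # prints 2
```

The same reasoning explains why a wrapper that re-splits its arguments on
spaces cannot be used: the quotes only take effect if the final command line
is parsed once by eval.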

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoonnode: update algorithm for finding nodes file.
Martin Schwenke [Thu, 21 Jan 2010 02:40:03 +0000 (13:40 +1100)]
onnode: update algorithm for finding nodes file.

2 changes:

* If a relative nodes file is specified via -f or $CTDB_NODES_FILE but
  this file does not exist then try looking for the file in /etc/ctdb
  (or $CTDB_BASE if set).

* If a nodes file is specified via -f or $CTDB_NODES_FILE but this
  file does not exist (even when checked as per above) then do not
  fall back to /etc/ctdb/nodes (or $CTDB_BASE if set).  The old
  behaviour was surprising and hid errors.
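
The two rules above can be sketched as a small shell function. This is an
illustration of the lookup order only, not the actual onnode code; the
function name find_nodes_file and the error message are hypothetical.

```shell
#!/bin/sh
# Hypothetical sketch of the lookup order; not the actual onnode code.

# find_nodes_file [FILE]: print the nodes file to use, or fail.
# FILE stands for a value given via -f or $CTDB_NODES_FILE.
find_nodes_file () {
    base="${CTDB_BASE:-/etc/ctdb}"
    file="${1:-}"

    # Nothing specified: use the default nodes file.
    if [ -z "$file" ]; then
        echo "$base/nodes"
        return 0
    fi

    # Use the file as given if it exists.
    if [ -e "$file" ]; then
        echo "$file"
        return 0
    fi

    # A relative name gets one more chance under $CTDB_BASE.
    case "$file" in
        /*) ;;
        *)  if [ -e "$base/$file" ]; then
                echo "$base/$file"
                return 0
            fi
            ;;
    esac

    # No silent fallback to the default: report the error instead.
    echo "nodes file \"$file\" not found" >&2
    return 1
}
```

Failing with a non-zero status when an explicitly named file is missing is
the point of the second rule: a typo in the file name is reported rather
than hidden behind the default nodes file.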

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoonnode - respect $CTDB_BASE rather than hard-coding /etc/ctdb.
Martin Schwenke [Thu, 21 Jan 2010 02:16:18 +0000 (13:16 +1100)]
onnode - respect $CTDB_BASE rather than hard-coding /etc/ctdb.

Signed-off-by: Martin Schwenke <martin@meltin.net>
9 years agoconfig: 10.interface: search "ethtool" in $PATH instead of using a hardcoded path
Stefan Metzmacher [Mon, 18 Jan 2010 12:05:54 +0000 (13:05 +0100)]
config: 10.interface: search "ethtool" in $PATH instead of using a hardcoded path

This is very useful for testing, I use such a script:

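cat ~/bin/ethtool
 #!/bin/sh
 # Test wrapper: take the test interfaces Neth2..Neth5 down before
 # handing over to the real ethtool.

 IFACE=$1

 case "$IFACE" in
        Neth2|Neth3|Neth4|Neth5)
                # Test interface: continue with the code below.
                ;;
        *)
                # Any other interface: run the real ethtool unchanged.
                exec /usr/sbin/ethtool "$@"
                ;;
 esac

 ip link set down "$IFACE"

 exec /usr/sbin/ethtool "$@"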

metze

9 years agoserver: reload the public addresses before doing a takeover run
Stefan Metzmacher [Tue, 19 Jan 2010 07:42:48 +0000 (08:42 +0100)]
server: reload the public addresses before doing a takeover run

metze

9 years agoserver: ban ourself if the ctdb and kernel knowledge of a public ip differs
Stefan Metzmacher [Mon, 18 Jan 2010 14:04:32 +0000 (15:04 +0100)]
server: ban ourself if the ctdb and kernel knowledge of a public ip differs

metze

9 years agoserver: give an error if we're getting a takeover_ip event with a wrong pnn
Stefan Metzmacher [Mon, 18 Jan 2010 14:38:01 +0000 (15:38 +0100)]
server: give an error if we're getting a takeover_ip event with a wrong pnn

metze

9 years agoserver: return an error if we get a takeover ip event and we cannot serve the ip
Stefan Metzmacher [Mon, 18 Jan 2010 14:08:15 +0000 (15:08 +0100)]
server: return an error if we get a takeover ip event and we cannot serve the ip

metze

9 years agoserver: print node number as signed integer on release ip event
Stefan Metzmacher [Mon, 18 Jan 2010 14:12:46 +0000 (15:12 +0100)]
server: print node number as signed integer on release ip event

metze

9 years agoserver: debug redundant takeover ip events with level INFO
Stefan Metzmacher [Mon, 18 Jan 2010 14:22:16 +0000 (15:22 +0100)]
server: debug redundant takeover ip events with level INFO

metze

9 years agoserver: be less verbose on redundant release_ip events
Stefan Metzmacher [Mon, 18 Jan 2010 14:04:32 +0000 (15:04 +0100)]
server: be less verbose on redundant release_ip events

metze

9 years agoserver: add a ctdb_do_updateip()
Stefan Metzmacher [Sat, 16 Jan 2010 14:01:17 +0000 (15:01 +0100)]
server: add a ctdb_do_updateip()

metze

9 years agoserver: split out a ctdb_do_takeover_ip() function
Stefan Metzmacher [Sat, 16 Jan 2010 12:30:58 +0000 (13:30 +0100)]
server: split out a ctdb_do_takeover_ip() function

metze

9 years agoserver: split out a ctdb_announce_vnn_iface() function
Stefan Metzmacher [Sat, 16 Jan 2010 12:20:45 +0000 (13:20 +0100)]
server: split out a ctdb_announce_vnn_iface() function

metze

9 years agoevents: add updateip event to 13.per_ip_routing
Stefan Metzmacher [Mon, 21 Dec 2009 07:45:19 +0000 (08:45 +0100)]
events: add updateip event to 13.per_ip_routing

metze

9 years agoevents: 10.interface handle updateip event
Stefan Metzmacher [Mon, 21 Dec 2009 07:40:50 +0000 (08:40 +0100)]
events: 10.interface handle updateip event

metze

9 years agoserver: add updateip event
Stefan Metzmacher [Mon, 21 Dec 2009 07:33:55 +0000 (08:33 +0100)]
server: add updateip event

metze

9 years agoconfig: add CTDB_PARTIALLY_ONLINE_INTERFACES to ctdb.sysconfig
Stefan Metzmacher [Mon, 21 Dec 2009 13:02:03 +0000 (14:02 +0100)]
config: add CTDB_PARTIALLY_ONLINE_INTERFACES to ctdb.sysconfig

With this option set to "yes", we don't become unhealthy
as long as at least one interface is still available.
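
A minimal sketch of the decision this option describes. It is an assumption
about the logic, not code from this commit: the helper iface_health is
hypothetical, and it assumes that without the option any interface being
down makes the node unhealthy.

```shell
#!/bin/sh
# Hypothetical sketch of the health decision; not the actual event script.

# iface_health STATE...: each STATE is "up" or "down".
# Returns 0 (healthy) or 1 (unhealthy).
iface_health () {
    up=0
    down=0
    for state in "$@"; do
        case "$state" in
            up) up=$((up + 1)) ;;
            *)  down=$((down + 1)) ;;
        esac
    done

    # All interfaces up: always healthy.
    if [ "$down" -eq 0 ]; then
        return 0
    fi

    # Some interfaces down: healthy only if the option is set to "yes"
    # and at least one interface is still up.
    if [ "${CTDB_PARTIALLY_ONLINE_INTERFACES:-no}" = "yes" ] && [ "$up" -gt 0 ]; then
        return 0
    fi
    return 1
}
```

For example, iface_health up down succeeds only when
CTDB_PARTIALLY_ONLINE_INTERFACES is "yes", while iface_health down down
fails in both modes.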

metze

9 years agoserver: start with disabled interfaces and let the event scripts enable the interface...
Stefan Metzmacher [Mon, 21 Dec 2009 18:18:10 +0000 (19:18 +0100)]
server: start with disabled interfaces and let the event scripts enable the interfaces explicitly

This makes sure that we don't get public addresses assigned during the
initial recovery and remove them again in the startup event.

metze

9 years agoconfig: 10.interfaces call monitor_interfaces on startup
Stefan Metzmacher [Tue, 22 Dec 2009 14:25:30 +0000 (15:25 +0100)]
config: 10.interfaces call monitor_interfaces on startup

metze

9 years agoconfig: 10.interfaces call ctdb ifaces and ctdb setifacelink for monitoring
Stefan Metzmacher [Tue, 22 Dec 2009 14:25:30 +0000 (15:25 +0100)]
config: 10.interfaces call ctdb ifaces and ctdb setifacelink for monitoring

metze

9 years agoevents: split out a monitor_interfaces function in 10.interface
Stefan Metzmacher [Mon, 14 Dec 2009 10:59:45 +0000 (11:59 +0100)]
events: split out a monitor_interfaces function in 10.interface

metze

9 years agoserver: monitor interfaces in verify_ip_allocation()
Stefan Metzmacher [Tue, 22 Dec 2009 14:21:08 +0000 (15:21 +0100)]
server: monitor interfaces in verify_ip_allocation()

metze

9 years agoserver: only trigger one takeover run in verify_ip_allocation()
Stefan Metzmacher [Tue, 22 Dec 2009 14:21:08 +0000 (15:21 +0100)]
server: only trigger one takeover run in verify_ip_allocation()

metze

9 years agotools/ctdb: add PartiallyOnline state for "ctdb status" and "ctdb status -Y"
Stefan Metzmacher [Mon, 21 Dec 2009 12:30:45 +0000 (13:30 +0100)]
tools/ctdb: add PartiallyOnline state for "ctdb status" and "ctdb status -Y"

This is based on the GET_IFACES control against each node.

metze

9 years agotools/ctdb: display interfaces in "ctdb ip" and "ctdb ip -Y" outputs
Stefan Metzmacher [Sat, 16 Jan 2010 09:36:35 +0000 (10:36 +0100)]
tools/ctdb: display interfaces in "ctdb ip" and "ctdb ip -Y" outputs

metze

9 years agotests: add an all_ips_on_node() helper function that wraps ctdb ip -Y
Stefan Metzmacher [Sat, 16 Jan 2010 09:35:41 +0000 (10:35 +0100)]
tests: add an all_ips_on_node() helper function that wraps ctdb ip -Y

metze

9 years agotests/simple/11_ctdb_ip.sh: be more strict in checking ctdb ip -Y output
Stefan Metzmacher [Fri, 15 Jan 2010 09:53:14 +0000 (10:53 +0100)]
tests/simple/11_ctdb_ip.sh: be more strict in checking ctdb ip -Y output

metze

9 years agotools/ctdb: add "ctdb ipinfo <ip>"
Stefan Metzmacher [Thu, 17 Dec 2009 10:23:59 +0000 (11:23 +0100)]
tools/ctdb: add "ctdb ipinfo <ip>"

metze

9 years agotools/ctdb: add "ctdb setifacelink <iface> <status>"
Stefan Metzmacher [Wed, 16 Dec 2009 16:02:23 +0000 (17:02 +0100)]
tools/ctdb: add "ctdb setifacelink <iface> <status>"

metze

9 years agotools/ctdb: add "ctdb ifaces"
Stefan Metzmacher [Wed, 16 Dec 2009 15:50:23 +0000 (16:50 +0100)]
tools/ctdb: add "ctdb ifaces"

metze

9 years agoserver: implement ctdb_control_set_iface_link()
Stefan Metzmacher [Thu, 17 Dec 2009 09:30:36 +0000 (10:30 +0100)]
server: implement ctdb_control_set_iface_link()

This only marks the interface status and doesn't
generate any directly triggered action.

The action is later taken by the recovery process
in verify_ip_allocation.

metze

9 years agoserver: implement ctdb_control_get_ifaces()
Stefan Metzmacher [Wed, 16 Dec 2009 10:14:44 +0000 (11:14 +0100)]
server: implement ctdb_control_get_ifaces()

metze

9 years agoserver: implement ctdb_control_get_public_ip_info()
Stefan Metzmacher [Wed, 16 Dec 2009 10:20:28 +0000 (11:20 +0100)]
server: implement ctdb_control_get_public_ip_info()

metze

9 years agoclient: implement ctdb_ctrl_set_iface_link()
Stefan Metzmacher [Wed, 16 Dec 2009 15:18:36 +0000 (16:18 +0100)]
client: implement ctdb_ctrl_set_iface_link()

metze

9 years agoclient: implement ctdb_ctrl_get_ifaces()
Stefan Metzmacher [Wed, 16 Dec 2009 14:30:07 +0000 (15:30 +0100)]
client: implement ctdb_ctrl_get_ifaces()

metze

9 years agoclient: implement ctdb_ctrl_get_public_ip_info()
Stefan Metzmacher [Wed, 16 Dec 2009 15:23:08 +0000 (16:23 +0100)]
client: implement ctdb_ctrl_get_public_ip_info()

metze

9 years agocontrols: add stubs for GET_PUBLIC_IP_INFO, GET_IFACES and SET_IFACE_LINK_STATE
Stefan Metzmacher [Wed, 16 Dec 2009 13:40:21 +0000 (14:40 +0100)]
controls: add stubs for GET_PUBLIC_IP_INFO, GET_IFACES and SET_IFACE_LINK_STATE

metze

9 years agoserver: use CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE during a takeover run
Stefan Metzmacher [Wed, 16 Dec 2009 15:09:40 +0000 (16:09 +0100)]
server: use CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE during a takeover run

We now ask for the known and available interfaces.
This means a node gets a RELEASE_IP event for all interfaces
it "knows", but doesn't serve and a node only gets a TAKE_IP event
for "available" interfaces.

metze

9 years agoserver: implement CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE behavior
Stefan Metzmacher [Wed, 16 Dec 2009 15:08:45 +0000 (16:08 +0100)]
server: implement CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE behavior

metze

9 years agoclient: add CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE ctdb_ctrl_get_public_ips_flags()
Stefan Metzmacher [Wed, 16 Dec 2009 14:50:06 +0000 (15:50 +0100)]
client: add CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE ctdb_ctrl_get_public_ips_flags()

metze

9 years agoreserve upper bits in ctdb_control->flags for opcode specific flags
Stefan Metzmacher [Mon, 21 Dec 2009 11:10:18 +0000 (12:10 +0100)]
reserve upper bits in ctdb_control->flags for opcode specific flags

metze

9 years agoserver: keep the interface information in a list of ctdb_iface structures
Stefan Metzmacher [Wed, 16 Dec 2009 09:39:40 +0000 (10:39 +0100)]
server: keep the interface information in a list of ctdb_iface structures

metze

9 years agoserver: we don't need to copy strings we pass as talloc_asprintf() arguments
Stefan Metzmacher [Wed, 16 Dec 2009 08:48:21 +0000 (09:48 +0100)]
server: we don't need to copy strings we pass as talloc_asprintf() arguments

metze

9 years agoevents: 10.interfaces allow multiple interfaces per public address
Stefan Metzmacher [Mon, 21 Dec 2009 07:39:21 +0000 (08:39 +0100)]
events: 10.interfaces allow multiple interfaces per public address

metze

9 years agoserver: allow multiple interfaces comma separated in public_addresses
Stefan Metzmacher [Mon, 14 Dec 2009 17:52:06 +0000 (18:52 +0100)]
server: allow multiple interfaces comma separated in public_addresses

metze

9 years agoserver: add a ctdb_vnn_iface_string() helper function to access vnn->iface
Stefan Metzmacher [Wed, 16 Dec 2009 07:54:02 +0000 (08:54 +0100)]
server: add a ctdb_vnn_iface_string() helper function to access vnn->iface

metze

9 years agoserver: add a ctdb_set_single_public_ip() helper function
Stefan Metzmacher [Mon, 14 Dec 2009 18:33:35 +0000 (19:33 +0100)]
server: add a ctdb_set_single_public_ip() helper function

metze

9 years agoconfig: add 13.per_ip_routing event script
Stefan Metzmacher [Sat, 19 Dec 2009 17:26:01 +0000 (18:26 +0100)]
config: add 13.per_ip_routing event script

With this script it's possible to generate routing tables
per public ip address.

metze

9 years agoconfig: add some ipv4 helper shell functions
Stefan Metzmacher [Fri, 11 Dec 2009 18:56:36 +0000 (19:56 +0100)]
config: add some ipv4 helper shell functions

Many thanks to Michael Adam <obnox@samba.org>
for the basic work.

metze

9 years agoconfig: add interface_modify.sh and call it under flock to make modification on inter...
Stefan Metzmacher [Wed, 20 Jan 2010 10:10:48 +0000 (11:10 +0100)]
config: add interface_modify.sh and call it under flock to make modification on interfaces atomic

When two releaseip events run in parallel it's possible that the 2nd script
readds a secondary ip that was removed by the 1st script.
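
The serialisation can be sketched with the flock(1) utility as follows. This
is a minimal illustration, not the code added by this commit: the helper
name with_iface_lock and the choice of lock file are assumptions.

```shell
#!/bin/sh
# Hypothetical sketch; the helper name and lock file are assumptions.

# with_iface_lock LOCKFILE COMMAND [ARGS...]
# Run COMMAND while holding an exclusive lock on LOCKFILE, so that two
# callers never modify the interface at the same time.
with_iface_lock () {
    lockfile="$1"
    shift
    (
        # Block until file descriptor 9 (the lock file) is exclusively locked.
        flock -x 9 || exit 1
        "$@"
    ) 9>"$lockfile"
}
```

The lock is released when the subshell exits and descriptor 9 is closed, so
a second releaseip event waits until the first has finished removing and
re-adding addresses before it inspects the interface.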

metze

9 years agoevents/10.interfaces: move some parts to helper functions
Stefan Metzmacher [Fri, 18 Dec 2009 10:08:22 +0000 (11:08 +0100)]
events/10.interfaces: move some parts to helper functions

metze

9 years agoconfig/functions: add tickle_tcp_connections()
Stefan Metzmacher [Fri, 18 Dec 2009 08:43:20 +0000 (09:43 +0100)]
config/functions: add tickle_tcp_connections()

metze

9 years agoserver: add "init" event
Stefan Metzmacher [Tue, 19 Jan 2010 09:07:14 +0000 (10:07 +0100)]
server: add "init" event

This is needed because the "startup" event runs after the initial recovery,
but we need to do some actions before the initial recovery.

metze

9 years agoserver: setup fault handler to get the built-in backtrace support
Stefan Metzmacher [Thu, 7 Jan 2010 08:21:56 +0000 (09:21 +0100)]
server: setup fault handler to get the built-in backtrace support

The panic action feature will be added later.

metze

9 years agolib/util: add pre and post panic action hooks
Stefan Metzmacher [Tue, 12 Jan 2010 11:17:00 +0000 (12:17 +0100)]
lib/util: add pre and post panic action hooks

metze

9 years agolib/util: import fault/backtrace handling from samba.
Stefan Metzmacher [Fri, 18 Dec 2009 11:32:38 +0000 (12:32 +0100)]
lib/util: import fault/backtrace handling from samba.

metze

9 years agoconfigure: don't overwrite AC_CHECK_FUNC_EXT and AC_CHECK_LIB_EXT
Stefan Metzmacher [Fri, 18 Dec 2009 11:14:28 +0000 (12:14 +0100)]
configure: don't overwrite AC_CHECK_FUNC_EXT and AC_CHECK_LIB_EXT

This currently has no effect on the generated configure and config.h.in files.

metze