git.samba.org - ctdb.git/log


commit | commitdiff | tree

Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]

tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.

The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).

Rename it to OPEN_LOCK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 7ab422d6fbd4f8be02838089a41f872d538ee7a7)

commit | commitdiff | tree

Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]

tdb: make _tdb_transaction_cancel static.

Now that tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel(), we can make the latter static.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit a6e0ef87d25734760fe77b87a9fd11db56760955)

commit | commitdiff | tree

Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]

tdb: cleanup: split brlock and brunlock methods.

This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.

For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fcntl locks don't need this). This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.

We also use a "flags" argument to tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 452b4a5a6efeecfb5c83475f1375ddc25bcddfbe)
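
The split described above can be sketched roughly as follows. This is a minimal illustration on a bare file descriptor, not the tdb code itself: the sketch_* names and flag values are hypothetical, and the real functions take a tdb context rather than an fd.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical flag names and values, modelled on the commit message. */
enum sketch_lock_flags {
    SKETCH_LOCK_WAIT      = 0, /* block until the lock is granted */
    SKETCH_LOCK_NOWAIT    = 1, /* use F_SETLK instead of F_SETLKW */
    SKETCH_LOCK_MARK_ONLY = 2, /* caller only wants bookkeeping, no fcntl() */
    SKETCH_LOCK_PROBE     = 4  /* failure is expected; caller should not log it */
};

/* Apply one fcntl() byte-range operation, retrying if interrupted. */
static int sketch_fcntl_range(int fd, short type, int cmd, off_t offset, off_t len)
{
    struct flock fl;
    int ret;

    memset(&fl, 0, sizeof(fl));
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = offset;
    fl.l_len = len;

    do {
        ret = fcntl(fd, cmd, &fl);
    } while (ret == -1 && errno == EINTR);
    return ret;
}

/* Take a read (F_RDLCK) or write (F_WRLCK) lock on a byte range. */
int sketch_brlock(int fd, int rw_type, off_t offset, off_t len, int flags)
{
    if (flags & SKETCH_LOCK_MARK_ONLY) {
        return 0; /* nothing to do at the fcntl level */
    }
    int cmd = (flags & SKETCH_LOCK_NOWAIT) ? F_SETLK : F_SETLKW;
    return sketch_fcntl_range(fd, (short)rw_type, cmd, offset, len);
}

/* Release a byte-range lock. rw_type is accepted for debugging only:
 * fcntl() itself does not need to know what kind of lock is released. */
int sketch_brunlock(int fd, int rw_type, off_t offset, off_t len)
{
    (void)rw_type;
    return sketch_fcntl_range(fd, F_UNLCK, F_SETLKW, offset, len);
}
```

Folding the wait, mark-only and probe behaviours into one flags word keeps the lock call's signature stable as further variants are added.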

commit | commitdiff | tree

Brad Hards [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]

Spelling fixes for tdb.

Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>
(Imported from commit 09e756b1d651caef203a4b7e02234f6dea374b08)

commit | commitdiff | tree

Andrew Tridgell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]

tdb: use fdatasync() instead of fsync() in transactions

This might help on some filesystems

(Imported from commit 1373e748aa53fbd3afe4d2377208257d42628d86)
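
For context, the difference between the two calls can be shown with a small wrapper. This is only a sketch assuming a platform that provides fdatasync() (Linux, for example); the function name is hypothetical. fdatasync() flushes the file data plus only the metadata needed to read it back, so it may skip updates such as the modification time that fsync() would also write out.

```c
#include <unistd.h>

/* Flush a file to stable storage.
 * data_only != 0: fdatasync(), which may skip non-essential metadata.
 * data_only == 0: fsync(), which flushes data and all metadata. */
int sketch_sync_file(int fd, int data_only)
{
    return data_only ? fdatasync(fd) : fsync(fd);
}
```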

commit | commitdiff | tree

Volker Lendecke [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]

tdb: Apply some const, just for clarity

(Imported from commit 6824c6f46ba7c15e8af91d5aa8b21a946b63107b)

commit | commitdiff | tree

Rusty Russell [Thu, 22 Apr 2010 04:23:41 +0000 (13:53 +0930)]

tdb: fix recovery reuse after crash

If a process (or the machine) dies just after writing the
recovery head (pointing at the end of file), the recovery record will be
filled with 0x42. This will not invoke a recovery on open, since rec.magic
!= TDB_RECOVERY_MAGIC.

Unfortunately, the first transaction commit will happily reuse that
area: tdb_recovery_allocate() doesn't check the magic. The recovery
record has length 0x42424242 (and it writes that back into the
now-valid-looking transaction header) for the next comer (which
happens to be tdb_wipe_all in my tests).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit b37b452cb8c1f56b37b04abe7bffdede371ca361)
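
The message describes the failure rather than the fix. One plausible guard, sketched here under assumptions, is to trust a recovery record's length only when its magic is a recognised value. The struct layout, the function name and the SKETCH_RECOVERY_MAGIC value are hypothetical; only the "invalid" value of 0 comes from the neighbouring commit.

```c
#include <stdint.h>

/* Placeholder standing in for TDB_RECOVERY_MAGIC; not the real value. */
#define SKETCH_RECOVERY_MAGIC         0x12345678u
/* The named "invalid recovery area" constant is 0 per the log. */
#define SKETCH_RECOVERY_INVALID_MAGIC 0u

/* Hypothetical, simplified recovery record header. */
struct sketch_recovery_rec {
    uint32_t magic;
    uint32_t rec_len;
};

/* Return 1 if an existing recovery area may be reused for `needed` bytes.
 * A record full of 0x42 filler has an unrecognised magic, so its bogus
 * length (0x42424242) is never trusted. */
int sketch_recovery_area_reusable(const struct sketch_recovery_rec *rec,
                                  uint32_t needed)
{
    if (rec->magic != SKETCH_RECOVERY_MAGIC &&
        rec->magic != SKETCH_RECOVERY_INVALID_MAGIC) {
        return 0; /* left over from a crash: allocate a fresh area instead */
    }
    return rec->rec_len >= needed;
}
```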

commit | commitdiff | tree

Rusty Russell [Thu, 22 Apr 2010 04:23:26 +0000 (13:53 +0930)]

tdb: give a name to the invalid recovery area constant (0)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 6269cdcd1538e2e3cead9e0f3c156b0363d607a0)

commit | commitdiff | tree

Simo Sorce [Thu, 22 Apr 2010 04:23:21 +0000 (13:53 +0930)]

release-scripts: parametrize scripts

This should make it easier to keep all release scripts aligned, as it will reduce
the difference between them to, ideally, a few variables.

Also moves the tdb script in the scripts directory.

(Imported from commit 6339de7f4fef46fb3ad32d1ecf9379f5b5d24ccb)

commit | commitdiff | tree

Simo Sorce [Thu, 22 Apr 2010 04:15:58 +0000 (13:45 +0930)]

tdb: raise version to 1.2.1

After recent fixes we need to raise the version to 1.2.1 so that
we can also require the correctly patched version.

(Imported from commit 70534adee10fc6f5bba2d9304668dc6508e5de5a)

commit | commitdiff | tree

Martin Schwenke [Tue, 20 Apr 2010 00:52:31 +0000 (10:52 +1000)]

Fix a thinko in 2ea0a9f1a93781a0d036feb9fcc0d120b182922f.

If the driver is virtio_net then we assume that the link is up rather
than ignoring the check altogether.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Ralph Wuerthner [Thu, 15 Apr 2010 06:38:19 +0000 (16:38 +1000)]

ethtool does not support virtio_net devices.

Skip the link test for this type of device.

Signed-off-by: Ralph Wuerthner <ralph.wuerthner@de.ibm.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Thu, 15 Apr 2010 03:45:50 +0000 (13:45 +1000)]

Merge branch 'master' of git://git.samba.org/sahlberg/ctdb

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 8 Apr 2010 04:30:01 +0000 (14:30 +1000)]

Merge root@10.1.1.27:/shared/ctdb/ctdb-git

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 8 Apr 2010 04:28:52 +0000 (14:28 +1000)]

Fix a compiler warning

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 8 Apr 2010 04:07:57 +0000 (14:07 +1000)]

In the recovery daemon, keep track of which node each public ip
address has been assigned to, and verify that the remote nodes have/keep
a consistent view of assigned addresses.

If a remote node has an inconsistent view of addresses vis-à-vis the recovery
master, this will trigger a full ip reallocation.

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 7 Apr 2010 00:45:27 +0000 (10:45 +1000)]

Merge root@10.1.1.27:/shared/ctdb/ctdb-git

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 7 Apr 2010 00:42:51 +0000 (10:42 +1000)]

Lower the loglevel for "Recovery lock successfully taken"
from ERR to NOTICE

BZ62086

commit | commitdiff | tree

Martin Schwenke [Wed, 31 Mar 2010 06:52:42 +0000 (17:52 +1100)]

Merge commit 'origin/master'

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 30 Mar 2010 01:50:19 +0000 (12:50 +1100)]

Merge root@10.1.1.27:/shared/ctdb/ctdb-git

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 30 Mar 2010 01:47:54 +0000 (12:47 +1100)]

When we forcefully abort a running eventscript, don't log this as if
the script timed out.

Instead send a different signal (SIGABRT) to the child process to silently
kill the process group for the script and its children without logging
anything.

We abort any running "monitor" script anytime any other event is generated
either by ctdbd itself or by "ctdb eventscript ..."

BZ61043

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 30 Mar 2010 00:58:37 +0000 (11:58 +1100)]

Merge root@10.1.1.27:/shared/ctdb/ctdb-git

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 30 Mar 2010 00:57:25 +0000 (11:57 +1100)]

Reduce the loglevel for two log messages for Registering and Deregistering server ids.

BZ61890

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 29 Mar 2010 06:06:50 +0000 (17:06 +1100)]

Merge root@10.1.1.27:/shared/ctdb/ctdb-git

commit | commitdiff | tree

Volker Lendecke [Wed, 24 Mar 2010 09:35:10 +0000 (10:35 +0100)]

In ctdb catdb, print the payload data length without the ctdb header length

commit | commitdiff | tree

Volker Lendecke [Mon, 22 Feb 2010 14:04:16 +0000 (15:04 +0100)]

Fix a typo in run_startrecovery_eventscript

commit | commitdiff | tree

Michael Adam [Fri, 26 Mar 2010 16:33:51 +0000 (17:33 +0100)]

events:50.samba: wipe the local part of the serverid db before starting winbind/smbd/nmbd

This is necessary for the new serverid approach.

Michael

commit | commitdiff | tree

Volker Lendecke [Wed, 24 Mar 2010 09:35:10 +0000 (10:35 +0100)]

In ctdb catdb, print the payload data length without the ctdb header length

commit | commitdiff | tree

Volker Lendecke [Mon, 22 Feb 2010 14:04:16 +0000 (15:04 +0100)]

Fix a typo in run_startrecovery_eventscript

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 24 Mar 2010 06:21:10 +0000 (17:21 +1100)]

new version 1.0.114

commit | commitdiff | tree

Stefan Metzmacher [Fri, 26 Feb 2010 11:41:21 +0000 (12:41 +0100)]

config: let 13.per_ip_routing use a flock for generate_auto_link_local()

metze

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Mar 2010 07:34:32 +0000 (18:34 +1100)]

Merge commit 'obnox/master-rebase'

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Mar 2010 07:15:41 +0000 (18:15 +1100)]

Merge root@10.1.1.27:/shared/ctdb/ctdb-git

commit | commitdiff | tree

Christian Ambach [Wed, 10 Mar 2010 17:46:15 +0000 (18:46 +0100)]

adjust a vacuum log level

made the severity of the decreasing interval log level the same as for the increasing,
they are both just info logs because they don't report errors

commit | commitdiff | tree

Wolfgang Mueller-Friedt [Wed, 10 Mar 2010 09:39:31 +0000 (10:39 +0100)]

ctdb_setstatus in /etc/ctdb/functions was not working correctly because it was called with a wrong parameter list

commit | commitdiff | tree

Wolfgang Mueller-Friedt [Wed, 10 Mar 2010 09:39:31 +0000 (10:39 +0100)]

ctdb_setstatus in /etc/ctdb/functions was not working correctly because it was called with a wrong parameter list

commit | commitdiff | tree

Michael Adam [Wed, 24 Feb 2010 13:52:55 +0000 (14:52 +0100)]

packaging: add tdbtool and tdbdump as dependencies to the RPM

The init script relies on their existence.
This should fix bug #6773 on bugzilla.samba.org:
https://bugzilla.samba.org/show_bug.cgi?id=6773

Michael

commit | commitdiff | tree

Michael Adam [Wed, 24 Feb 2010 13:52:04 +0000 (14:52 +0100)]

doc: regenerate ctdb(1) manpages after xml change

commit | commitdiff | tree

Michael Adam [Wed, 24 Feb 2010 13:50:37 +0000 (14:50 +0100)]

doc: fix a linebreak in the example output of "ctdb getdbmap" in ctdb(1)

commit | commitdiff | tree

Mathieu Parent [Thu, 4 Mar 2010 15:06:11 +0000 (16:06 +0100)]

Fix some more bashisms

commit | commitdiff | tree

Mathieu Parent [Mon, 8 Mar 2010 20:19:35 +0000 (21:19 +0100)]

Correct nice_service()

nice takes a binary as argument and not a function or builtin command

commit | commitdiff | tree

Michael Adam [Wed, 24 Feb 2010 11:58:57 +0000 (12:58 +0100)]

doc: regenerate ctdb and ctdbd manpages after xml changes

Michael

commit | commitdiff | tree

Michael Adam [Wed, 24 Feb 2010 11:53:21 +0000 (12:53 +0100)]

doc: add metainfo "manual" and "source" in the ctdbd manual page

commit | commitdiff | tree

Michael Adam [Wed, 24 Feb 2010 11:52:30 +0000 (12:52 +0100)]

doc: fill metainfo "manual" and "source" in the ctdb manual page

commit | commitdiff | tree

Mathieu Parent [Tue, 5 Jan 2010 10:04:24 +0000 (11:04 +0100)]

Correction of spelling errors.

* interupted -> interrupted
* dont -> don't

(thanks to lintian)

See https://bugzilla.samba.org/show_bug.cgi?id=6935

commit | commitdiff | tree

Mathieu Parent [Tue, 5 Jan 2010 09:59:44 +0000 (10:59 +0100)]

Correction of spelling errors in manpages

thanks to lintian

See https://bugzilla.samba.org/show_bug.cgi?id=6935

commit | commitdiff | tree

Michael Adam [Tue, 23 Feb 2010 10:00:23 +0000 (11:00 +0100)]

fix bug #7152: check NFS-Shares, fails with too long path-names

Thanks to Thomas Sesselmann <t.sesselmann@dkfz.de> .

Michael

commit | commitdiff | tree

Michael Adam [Wed, 6 Jan 2010 13:59:23 +0000 (14:59 +0100)]

server:ctdb_send_dmaster_reply: fix a message typo.

Michael

commit | commitdiff | tree

Stefan Metzmacher [Tue, 23 Feb 2010 09:29:27 +0000 (10:29 +0100)]

doc: regenerate ctdb.1*

metze

commit | commitdiff | tree

Stefan Metzmacher [Tue, 23 Feb 2010 09:36:46 +0000 (10:36 +0100)]

doc/ctdb.1.xml: document "ctdb setifacelink <iface> <status>"

metze

commit | commitdiff | tree

Stefan Metzmacher [Tue, 23 Feb 2010 09:04:51 +0000 (10:04 +0100)]

doc/ctdb.1.xml: document "ctdb ipinfo <ip>"

metze

commit | commitdiff | tree

Stefan Metzmacher [Tue, 23 Feb 2010 09:03:00 +0000 (10:03 +0100)]

doc/ctdb.1.xml: update "ctdb ip" documentation

metze

commit | commitdiff | tree

Stefan Metzmacher [Tue, 23 Feb 2010 09:01:50 +0000 (10:01 +0100)]

doc/ctdb.1.xml: document "ctdb ifaces"

metze

commit | commitdiff | tree

Stefan Metzmacher [Tue, 23 Feb 2010 07:35:08 +0000 (08:35 +0100)]

doc/ctdb.1.xml: document PARTIALLYONLINE status

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 08:54:46 +0000 (09:54 +0100)]

config/13.per_ip_routing: fix typo in error message

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 13:06:40 +0000 (14:06 +0100)]

config/13.per_ip_routing: use better names for release_script and setup_script

As the basename of the script will be used for the readd script
from setup_iface_ip_readd_script, it's now easier to identify
which script is called by delete_ip_from_iface() while readding
ips to the interface.

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 08:52:09 +0000 (09:52 +0100)]

config/13.per_ip_routing: register the setup script with setup_iface_ip_readd_script()

This is needed because we need to resetup the routing table when
the delete_ip_from_iface() function readds the ip to the interface.

metze

commit | commitdiff | tree

Stefan Metzmacher [Tue, 9 Feb 2010 15:34:59 +0000 (16:34 +0100)]

config/13.per_ip_routing: add a setup_per_ip_routing() function

This combines the logic into a shell function which can be used by the
"takeip" and "updateip" hooks.

We check the return values of the "ip" commands now
instead of ignoring them.

We now create a setup_script.sh similar to the release_script.sh
which makes it easier to analyze problems.

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 10:24:08 +0000 (11:24 +0100)]

server: add "setup" event

This is needed because the "init" event can't use 'ctdb' commands.

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 10:25:26 +0000 (11:25 +0100)]

config/10.interface: use delete_ip_from_iface also in the "init" event

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 09:33:54 +0000 (10:33 +0100)]

config/11.natgw: use delete_ip_from_iface() instead of remove_ip()

This also initializes the variables correctly for the
shutdown|removenatgw code path to delete_all.

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 09:24:44 +0000 (10:24 +0100)]

config: make remove_ip() a wrapper of delete_ip_from_iface()

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 09:23:17 +0000 (10:23 +0100)]

config: interface_modify states in a $CTDB_BASE/state/interface_modify directory

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 08:48:01 +0000 (09:48 +0100)]

config: add setup_iface_ip_readd_script() helper function

This adds a generic infrastructure to register scripts which will
be called when the delete_ip_from_iface() function needs to readd
secondary ips to an interface.

metze

commit | commitdiff | tree

Stefan Metzmacher [Fri, 12 Feb 2010 08:55:28 +0000 (09:55 +0100)]

config: readd ips with a broadcast address in delete_ip_from_iface()

metze

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 23 Feb 2010 01:43:49 +0000 (12:43 +1100)]

In ctdb_control_end_recovery,

We used to talloc_steal c (the command packet) and make it a child of the
"event script state context".
If we failed to create an eventscript child context for some reason,
this would have talloc freed state, but at the same time it would also
implicitly have freed c.
Once ctdb_control_end_recovery() returns the error back to the caller,
the caller would dereference both c and outdata, which is a child of c,
and we would either read garbage data or segv.
Change the ordering so we only talloc_steal c as a child of state IFF
we have successfully created a child context for the script.

BZ61068

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 22 Feb 2010 23:14:51 +0000 (10:14 +1100)]

Make sure that the natgw eventscript also triggers on the "stopped" event
to remove the natgw configuration and ip assignments used.

BZ61036

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 22 Feb 2010 04:34:26 +0000 (15:34 +1100)]

ctdb regsrvids is much more useful for testing if it sleeps once it has registered its srvid.
Otherwise, as soon as it terminates, ctdbd will deregister the id automatically.

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 22 Feb 2010 03:06:52 +0000 (14:06 +1100)]

From Sumit Bose <sbose@redhat.com>

Fixes for init script to meet guidelines

commit | commitdiff | tree

Ronnie Sahlberg [Mon, 22 Feb 2010 03:00:33 +0000 (14:00 +1100)]

From Elia Pinto <gitter.spiros@gmail.com>

We don't need to include getopt.h under AIX

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 16 Feb 2010 00:18:43 +0000 (11:18 +1100)]

Ignore any scripts that time out for most events, except startup.

Always treat hung scripts (except for startup) as success.

commit | commitdiff | tree

Ronnie Sahlberg [Fri, 12 Feb 2010 02:19:57 +0000 (13:19 +1100)]

try to restart rpc-rquotad if it is not running

bz60317

commit | commitdiff | tree

Rusty Russell [Fri, 12 Feb 2010 06:32:56 +0000 (17:02 +1030)]

Leave sequence number alone when merely migrating records.

(Based on earlier version from Ronnie which modified tdb; this one
is standalone).

When storing records in a tdb that has "automatic seqnum updates"
also check if the actual data for the record has changed or not.

If it has not changed at all, except for possibly the header,
this is likely just a dmaster migration operation in which case
we want to write the record to the tdb but we do not want the tdb
sequence number to be increased.

This resolves the problem of notify.tdb being thrashed under load:
the heuristic in smbd to only reread this when the sequence number
increases (rarely) breaks down.

Before, running nbench --num-progs=512 across 4 nodes, we saw numbers like:
512 1496 118.33 MB/sec execute 60 sec latency 0.00 msec
And turning on latency tracking, this was typical in the logs:
ctdbd: High latency 9380914.000000s for operation lockwait on database notify.tdb

After this commit:
512 2451 143.85 MB/sec execute 60 sec latency 0.00 msec
And no more latency messages...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
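
The check described above amounts to comparing the two versions of a record while skipping the leading header. A minimal sketch follows; the function name and the idea of passing the header length as a parameter are assumptions for illustration, not the ctdb code.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if the payload after the first header_len bytes differs between
 * the old and new versions of a record, 0 if only the header can differ.
 * A caller would bump the sequence number only when this returns 1. */
int sketch_payload_changed(const unsigned char *old_rec, size_t old_len,
                           const unsigned char *new_rec, size_t new_len,
                           size_t header_len)
{
    if (old_len != new_len) {
        return 1; /* different sizes: the data certainly changed */
    }
    if (new_len <= header_len) {
        return 0; /* no payload beyond the header on either side */
    }
    return memcmp(old_rec + header_len, new_rec + header_len,
                  new_len - header_len) != 0;
}
```

A migration that rewrites only the header then leaves the sequence number untouched, so readers that poll it do not reread the database needlessly.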

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 01:00:43 +0000 (12:00 +1100)]

Reduce loglevel for two eventscript related debug messages

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 00:54:46 +0000 (11:54 +1100)]

Reducing the log level for a debug message

DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 00:49:48 +0000 (11:49 +1100)]

Reduce the log level for two debug messages

DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has
DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n",

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 00:32:22 +0000 (11:32 +1100)]

Add a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
to control whether or not to check if we are swapping, and produce
useful output into the logfile if we are.

For production systems with dedicated nas-heads we should never swap.
But for developer/test systems we often use smaller nondedicated systems where
we can no longer guarantee that we will not be using swap.

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 00:19:08 +0000 (11:19 +1100)]

lower the loglevel for a debug message for redundant releases of public ips

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 00:09:39 +0000 (11:09 +1100)]

Add a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
when set to "yes" this will skip checking if knfsd has hung or not.

bz59626

commit | commitdiff | tree

Andrew Tridgell [Fri, 5 Feb 2010 06:11:29 +0000 (17:11 +1100)]

fixed printing of high latency

commit | commitdiff | tree

Ronnie Sahlberg [Thu, 11 Feb 2010 03:08:41 +0000 (14:08 +1100)]

Merge commit 'martins/master'

commit | commitdiff | tree

Martin Schwenke [Wed, 10 Feb 2010 09:27:53 +0000 (20:27 +1100)]

Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.

Recent updates to the test meant that it only worked with the latest
ctdb versions. This changes things so that we never bother matching
the machine readable header, just the actual data in the output. It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Wed, 10 Feb 2010 09:27:53 +0000 (20:27 +1100)]

Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.

Recent updates to the test meant that it only worked with the latest
ctdb versions. This changes things so that we never bother matching
the machine readable header, just the actual data in the output. It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Wed, 10 Feb 2010 09:24:28 +0000 (20:24 +1100)]

Merge commit 'origin/master'

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 9 Feb 2010 07:34:47 +0000 (18:34 +1100)]

commands that relate to manual failover of ip addresses (moveip)
can sometimes take long so allow for a longer timeout for the controls used.

commit | commitdiff | tree

Ronnie Sahlberg [Tue, 9 Feb 2010 03:35:10 +0000 (14:35 +1100)]

Don't just exit(0) upon successful completion of waiting for an ipreallocate to finish.
Return success back to the caller instead.

Otherwise things like 'ctdb enable -n all' will just finish after the first disabled node has become enabled.

commit | commitdiff | tree

Rusty Russell [Tue, 9 Feb 2010 02:16:35 +0000 (12:46 +1030)]

event scripts: add logging for low memory conditions

We should never enter swap; if we do, show the memory state of the machine and the process list. This will help us diagnose what caused the condition before it's too late and the box starts OOM-killing processes.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>

commit | commitdiff | tree

Andrew Tridgell [Sun, 7 Feb 2010 08:02:06 +0000 (19:02 +1100)]

ctdb: migrate to new dlinklist.h from Samba

commit | commitdiff | tree

Martin Schwenke [Fri, 5 Feb 2010 04:30:39 +0000 (15:30 +1100)]

onnode documentation - update documentation to reflect recent onnode changes.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Fri, 5 Feb 2010 03:00:23 +0000 (14:00 +1100)]

Merge branch 'master' of git://git.samba.org/sahlberg/ctdb

commit | commitdiff | tree

Andrew Tridgell [Thu, 4 Feb 2010 03:36:14 +0000 (14:36 +1100)]

ctdb: when we fill the client packet queue we need to drop the client

We can't just drop packets to the list, as those packets could be part
of the core protocol the client is using. This happens (for example)
when Samba is doing a traverse. If we drop a traverse packet then
Samba hangs indefinitely. We are better off dropping the ctdb socket
to Samba.

commit | commitdiff | tree

Andrew Tridgell [Thu, 4 Feb 2010 03:14:18 +0000 (14:14 +1100)]

ctdb: move ctdb_io.c to use TLIST_*() macros

This will make large packet queues much more efficient

commit | commitdiff | tree

Andrew Tridgell [Thu, 4 Feb 2010 03:13:49 +0000 (14:13 +1100)]

util: added TLIST_*() macros

The TLIST_*() macros are like the DLIST_*() macros, but take both a
head and tail pointer for the list. This means that adding an element
to the end of the list is efficient (it doesn't need to walk the
list).

We should move all uses of the DLIST_*() macros which use
DLIST_ADD_END() to use the TLIST_*() macros instead.
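
The idea behind keeping a tail pointer can be sketched with plain functions. The names below are hypothetical and this is not the TLIST_*() macro set itself; it only shows why appending no longer needs a walk of the list.

```c
#include <stddef.h>

/* Hypothetical queue element and queue, for illustration only. */
struct sketch_pkt {
    int id;
    struct sketch_pkt *prev, *next;
};

struct sketch_queue {
    struct sketch_pkt *head, *tail;
    size_t len;
};

/* Append in O(1): the tail pointer removes the need to walk the list. */
void sketch_queue_add_end(struct sketch_queue *q, struct sketch_pkt *p)
{
    p->next = NULL;
    p->prev = q->tail;
    if (q->tail != NULL) {
        q->tail->next = p;
    } else {
        q->head = p; /* list was empty */
    }
    q->tail = p;
    q->len++;
}

/* Remove and return the first element, or NULL if the queue is empty. */
struct sketch_pkt *sketch_queue_pop_front(struct sketch_queue *q)
{
    struct sketch_pkt *p = q->head;
    if (p == NULL) {
        return NULL;
    }
    q->head = p->next;
    if (q->head != NULL) {
        q->head->prev = NULL;
    } else {
        q->tail = NULL; /* queue is now empty */
    }
    p->prev = p->next = NULL;
    q->len--;
    return p;
}
```

With only a head pointer, each append costs time proportional to the queue length, which becomes significant once queues are allowed to grow very large.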

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 3 Feb 2010 23:03:21 +0000 (10:03 +1100)]

When trying to enable/disable a node, check if the node is already
enabled/disabled and log an informational message if so.

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 3 Feb 2010 22:54:06 +0000 (09:54 +1100)]

We only queued up to 1000 packets per queue before we started dropping
packets, to keep the queue from growing excessively if smbd has blocked.

This could cause traverse packets to be discarded if the main
smbd daemon does a traverse of a database while there is a recovery
(sending a reconfigured message to smbd, causing an avalanche of unlock
messages to be sent across the cluster).

This avalanche of messages could also cause the traversal message to be
discarded, causing the main smbd process to hang indefinitely waiting
for a traversal message that will never arrive.

Bump the maximum queue length before starting to discard messages from
1000 to 1000000 and at the same time rework the queueing slightly so we
can append messages cheaply to the queue instead of walking the list
from head to tail every time.

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 3 Feb 2010 22:45:32 +0000 (09:45 +1100)]

add two new debug controls to send and receive messages

ctdb msglisten and msgsend

commit | commitdiff | tree

Ronnie Sahlberg [Wed, 3 Feb 2010 19:37:41 +0000 (06:37 +1100)]

Drop the debug level for logging fd creation to DEBUG_DEBUG

commit | commitdiff | tree

Volker Lendecke [Fri, 29 Jan 2010 17:21:09 +0000 (18:21 +0100)]

tdb: fix an early release of the global lock that can cause data corruption

There was a bug in tdb where the

tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);

(ending the transaction-"mutex") was done before the

/* remove the recovery marker */

This means that when a transaction is committed there is a window where another
opener of the file sees the transaction marker while the transaction committer
is still fully functional and working on it. This led to the transaction being
rolled back by that second opener of the file while transaction_commit() gave
no error to the caller.

This patch moves the F_UNLCK to after the recovery marker was removed, closing
this window.

commit | commitdiff | tree

Martin Schwenke [Fri, 22 Jan 2010 06:19:12 +0000 (17:19 +1100)]

eventscripts: stop loadconfig function from loading ctdb config file twice.

If "$1" was empty then loadconfig would load the ctdb config twice.
This stops that from happening.

Signed-off-by: Martin Schwenke <martin@meltin.net>

commit | commitdiff | tree

Martin Schwenke [Fri, 22 Jan 2010 06:14:50 +0000 (17:14 +1100)]

eventscript: Use of $NFS_TICKLE_SHARED_DIRECTORY must be after loadconfig.

Proper fix for 085d1bea78fabf754ef6dd6d323f74a1d361e45c's workaround.
$NFS_TICKLE_SHARED_DIRECTORY was being used before it is set via
loadconfig.

Ronnie actually spotted this one. :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
