metze/ctdb/wip.git
14 years agotdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()

tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it.  Change
this, and rename it to show it's clearly transaction-specific.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit a84222bbaf9ed2c7b9c61b8157b2e3c85f17fa32)

14 years agotdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.

Now the transaction allrecord lock is the standard one, and thus is cleaned
in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to
know what type it is.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit dd1b508c63034452673dbfee9956f52a1b6c90a5)

14 years agotdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade

Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.

Then we use this in the transaction code.  Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.

One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit fca1621965c547e2d076eca2a2599e9629f91266)

14 years agotdb: suppress record write locks when allrecord lock is taken.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: suppress record write locks when allrecord lock is taken.

Records themselves get (read) locked by the traversal code against delete.
Interestingly, this locking isn't done when the allrecord lock has been
taken, though the allrecord lock until recently didn't cover the actual
records (it now goes to end of file).

The write record lock, grabbed by the delete code, is not suppressed
by the allrecord lock.  This is now bad: it causes us to punch a hole
in the allrecord lock when we release the write record lock.  Make this
consistent: *no* record locks of any kind when the allrecord lock is
taken.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit caaf5c6baa1a4f340c1f38edd99b3a8b56621b8b)

14 years agotdb: cleanup: always grab allrecord lock to infinity.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: cleanup: always grab allrecord lock to infinity.

We were previously inconsistent with our "global" lock: the
transaction code grabbed it from FREELIST_TOP to end of file, and the
rest of the code grabbed it from FREELIST_TOP to end of the hash
chains.  Change it to always grab to end of file for simplicity and
so we can merge the two.
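
A minimal sketch of what "grab to end of file" means at the fcntl() level, not the actual tdb code. For POSIX record locks, l_len == 0 means the range runs to end of file however large the file grows. The function names and the freelist_top parameter are hypothetical stand-ins for FREELIST_TOP.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Build the byte range for a whole-database lock: from freelist_top to
 * end of file.  For fcntl(), l_len == 0 means "to end of file, however
 * large it grows", so one range covers the hash chains and the records. */
struct flock allrecord_range(short ltype, off_t freelist_top)
{
	struct flock fl = {0};

	assert(ltype == F_RDLCK || ltype == F_WRLCK);
	fl.l_type = ltype;
	fl.l_whence = SEEK_SET;
	fl.l_start = freelist_top;  /* hypothetical: offset of the free list head */
	fl.l_len = 0;               /* 0 == lock to end of file */
	return fl;
}

/* Place the lock, blocking until it is granted.  Returns 0 or -1. */
int allrecord_lock_fd(int fd, short ltype, off_t freelist_top)
{
	struct flock fl = allrecord_range(ltype, freelist_top);

	return fcntl(fd, F_SETLKW, &fl);
}
```

Using one range everywhere is what lets the transaction code and the rest of the code share a single lock record.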

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 9341f230f8968b4b18e451d15dda5ccbe7787768)

14 years agotdb: remove num_locks
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: remove num_locks

This was redundant before this patch series: it mirrored num_lockrecs
exactly.  It still does.

Also, skip useless branch when locks == 1: unconditional assignment is
cheaper anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 1ab8776247f89b143b6e58f4b038ab4bcea20d3a)

14 years agotdb: use tdb_nest_lock() for seqnum lock.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: use tdb_nest_lock() for seqnum lock.

This is pure overhead, but it centralizes the locking.  Realloc (esp. as
most implementations are lazy) is fast compared to the fcntl anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit d48c3e4982a38fb6b568ed3903e55e07a0fe5ca6)

14 years agotdb: use tdb_nest_lock() for active lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for active lock.

Use our newly-generic nested lock tracking for the active lock.

Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 4738d474c412cc59d26fcea64007e99094e8b675)

14 years agotdb: use tdb_nest_lock() for open lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for open lock.

This never nests, so it's overkill, but it centralizes the locking into
lock.c and removes the ugly flag in the transaction code to track whether
we have the lock or not.

Note that we have a temporary hack so this places a real lock, despite
the fact that we are in a transaction.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 9136818df30c7179e1cffa18201cdfc990ebd7b7)

14 years agotdb: use tdb_nest_lock() for transaction lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for transaction lock.

Rather than a boutique lock and a separate nest count, use our
newly-generic nested lock tracking for the transaction lock.

Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit e8fa70a321d489b454b07bd65e9b0d95084168de)

14 years agotdb: cleanup: find_nestlock() helper.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: find_nestlock() helper.

Factor out two loops which find locks; we are going to introduce a couple
more so a helper makes sense.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit ce41411c84760684ce539b6a302a0623a6a78a72)

14 years agotdb: cleanup: tdb_release_extra_locks() helper
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_release_extra_locks() helper

Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit db270734d8b4208e00ce9de5af1af7ee11823f6d)

14 years agotdb: cleanup: tdb_have_extra_locks() helper
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_have_extra_locks() helper

In many places we check whether locks are held: add a helper to do this.

The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit fba42f1fb4f81b8913cce5a23ca5350ba45f40e1)

14 years agotdb: don't suppress the transaction lock because of the allrecord lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: don't suppress the transaction lock because of the allrecord lock.

tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we
hold the allrecord lock.  However, the two locks don't overlap, so
this is wrong.

This simplification makes the transaction lock a straight-forward nested
lock.

There are two callers for these functions:
1) The transaction code, which already makes sure the allrecord_lock
   isn't held.
2) The traverse code, which wants to stop transactions whether it has the
   allrecord lock or not.  There have been deadlocks here before, however
   this should not bring them back (I hope!)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit b754f61d235bdc3e410b60014d6be4072645e16f)

14 years agotdb: cleanup: tdb_nest_lock/tdb_nest_unlock
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_nest_lock/tdb_nest_unlock

Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0.  We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.

To generalize this:

1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the
   allrecord check out to the callers (except the mark case which doesn't
   care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
   move the allrecord out to the callers (except mark again).
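
The counting scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the code in lock.c: the table is a fixed-size array rather than the realloc'd tdb->lockrecs, all struct and function names are hypothetical, and the real fcntl() calls are replaced by counters so the behaviour is easy to observe.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NEST_LOCKS 16   /* assumption: fixed table instead of realloc */

struct nest_lock {
	long offset;        /* file offset the lock covers */
	int count;          /* how many times we currently hold it */
	int ltype;          /* lock type requested by the first holder */
};

struct lock_table {
	struct nest_lock recs[MAX_NEST_LOCKS];
	size_t num;
	int placed;         /* stand-in: number of real locks placed */
	int released;       /* stand-in: number of real locks released */
};

/* Find the tracking entry for an offset, or NULL if we don't hold it. */
static struct nest_lock *find_lock(struct lock_table *t, long offset)
{
	size_t i;

	for (i = 0; i < t->num; i++) {
		if (t->recs[i].offset == offset)
			return &t->recs[i];
	}
	return NULL;
}

/* Only the first holder places the real lock; later ones bump the count. */
int nest_lock(struct lock_table *t, long offset, int ltype)
{
	struct nest_lock *l = find_lock(t, offset);

	if (l != NULL) {
		l->count++;
		return 0;
	}
	if (t->num == MAX_NEST_LOCKS)
		return -1;
	t->placed++;        /* real code would call fcntl() here */
	t->recs[t->num].offset = offset;
	t->recs[t->num].count = 1;
	t->recs[t->num].ltype = ltype;
	t->num++;
	return 0;
}

/* Only the last holder releases the real lock. */
int nest_unlock(struct lock_table *t, long offset)
{
	struct nest_lock *l = find_lock(t, offset);

	if (l == NULL)
		return -1;
	assert(l->count > 0);
	if (--l->count > 0)
		return 0;
	t->released++;      /* real code would call fcntl() here */
	*l = t->recs[--t->num];  /* fill the gap with the last entry */
	return 0;
}
```

Keying the table on the file offset rather than the hash list number is what allows the same tracking to be reused for locks that are not record chains.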

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 5d9de604d92d227899e9b861c6beafb2e4fa61e0)

14 years agotdb: cleanup: rename global_lock to allrecord_lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: rename global_lock to allrecord_lock.

The word global is overloaded in tdb.  The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.

Rename it to allrecord_lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit e9114a758538d460d4f9deae5ce631bf44b1eff8)

14 years agotdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.

The word global is overloaded in tdb.  The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).

Rename it to OPEN_LOCK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 7ab422d6fbd4f8be02838089a41f872d538ee7a7)

14 years agotdb: make _tdb_transaction_cancel static.
Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: make _tdb_transaction_cancel static.

Now tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel, we can make it static.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit a6e0ef87d25734760fe77b87a9fd11db56760955)

14 years agotdb: cleanup: split brlock and brunlock methods.
Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: cleanup: split brlock and brunlock methods.

This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.

For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fcntl locks don't need this).  This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.

We also use a "flags" argument to tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.
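
A rough sketch of how such a flags argument might be decoded. The flag names come from the list above; the bit values, the helper names, and the reading of "probe" as "do not log a failure" are assumptions made for illustration.

```c
#include <assert.h>
#include <fcntl.h>

/* Flag names are from the commit message; the bit values are assumed. */
enum lock_flags {
	TDB_LOCK_NOWAIT    = 1,  /* fail instead of blocking */
	TDB_LOCK_MARK_ONLY = 2,  /* update bookkeeping, skip the real lock */
	TDB_LOCK_PROBE     = 4   /* assumed: failure is expected, stay quiet */
};

#define ALL_LOCK_FLAGS (TDB_LOCK_NOWAIT | TDB_LOCK_MARK_ONLY | TDB_LOCK_PROBE)

/* Choose the fcntl() command: F_SETLK returns at once, F_SETLKW waits. */
int lock_cmd(int flags)
{
	assert((flags & ~ALL_LOCK_FLAGS) == 0);
	return (flags & TDB_LOCK_NOWAIT) ? F_SETLK : F_SETLKW;
}

/* A mark-only request never touches the file. */
int lock_touches_file(int flags)
{
	return !(flags & TDB_LOCK_MARK_ONLY);
}

/* A probe is allowed to fail without being reported. */
int lock_logs_failure(int flags)
{
	return !(flags & TDB_LOCK_PROBE);
}
```

Separating the three concerns into independent bits means a caller can combine them, for example a non-blocking probe, without overloading the lock type.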

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 452b4a5a6efeecfb5c83475f1375ddc25bcddfbe)

14 years agoSpelling fixes for tdb.
Brad Hards [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
Spelling fixes for tdb.

Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>
(Imported from commit 09e756b1d651caef203a4b7e02234f6dea374b08)

14 years agotdb: use fdatasync() instead of fsync() in transactions
Andrew Tridgell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: use fdatasync() instead of fsync() in transactions

This might help on some filesystems

(Imported from commit 1373e748aa53fbd3afe4d2377208257d42628d86)

14 years agotdb: Apply some const, just for clarity
Volker Lendecke [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: Apply some const, just for clarity

(Imported from commit 6824c6f46ba7c15e8af91d5aa8b21a946b63107b)

14 years agotdb: fix recovery reuse after crash
Rusty Russell [Thu, 22 Apr 2010 04:23:41 +0000 (13:53 +0930)]
tdb: fix recovery reuse after crash

If a process (or the machine) dies just after writing the
recovery head (pointing at the end of file), the recovery record will be
filled with 0x42.  This will not invoke a recovery on open, since rec.magic
!= TDB_RECOVERY_MAGIC.

Unfortunately, the first transaction commit will happily reuse that
area: tdb_recovery_allocate() doesn't check the magic.  The recovery
record has length 0x42424242, and it writes that back into the
now-valid-looking transaction header for the next comer (which
happens to be tdb_wipe_all in my tests).
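
One way to express the missing check, as a sketch only. The struct layout, the constant names and values, and the policy of trusting a record only when its magic is one of two known values are all assumptions; the actual fix in tdb_recovery_allocate() may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder constants, not the values used by tdb. */
#define EXAMPLE_RECOVERY_MAGIC         0xf00dfaceU
#define EXAMPLE_RECOVERY_INVALID_MAGIC 0x0U

/* Hypothetical cut-down recovery record header. */
struct recovery_rec {
	uint32_t magic;
	uint32_t rec_len;
};

/* Return how many bytes of an old recovery area may be reused, or 0 if
 * the header cannot be trusted (e.g. it is still 0x42 fill from a crash). */
uint32_t reusable_recovery_len(const struct recovery_rec *rec)
{
	assert(rec != NULL);
	if (rec->magic != EXAMPLE_RECOVERY_MAGIC &&
	    rec->magic != EXAMPLE_RECOVERY_INVALID_MAGIC) {
		return 0;   /* garbage header: treat the area as absent */
	}
	return rec->rec_len;
}
```

With a check like this, a header filled with 0x42 yields a length of 0 rather than 0x42424242, so the bogus length is never written back.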

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit b37b452cb8c1f56b37b04abe7bffdede371ca361)

14 years agotdb: give a name to the invalid recovery area constant (0)
Rusty Russell [Thu, 22 Apr 2010 04:23:26 +0000 (13:53 +0930)]
tdb: give a name to the invalid recovery area constant (0)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 6269cdcd1538e2e3cead9e0f3c156b0363d607a0)

14 years agorelease-scripts: parametrize scripts
Simo Sorce [Thu, 22 Apr 2010 04:23:21 +0000 (13:53 +0930)]
release-scripts: parametrize scripts

This should make it easier to keep all release scripts aligned as it will reduce
the difference between them to ideally a few variables

Also moves the tdb script in the scripts directory.

(Imported from commit 6339de7f4fef46fb3ad32d1ecf9379f5b5d24ccb)

14 years agotdb: raise version to 1.2.1
Simo Sorce [Thu, 22 Apr 2010 04:15:58 +0000 (13:45 +0930)]
tdb: raise version to 1.2.1

after recent fixes we need to raise the version to 1.2.1 so that
we can require also the right patched version.

(Imported from commit 70534adee10fc6f5bba2d9304668dc6508e5de5a)

14 years agoFix a thinko in 2ea0a9f1a93781a0d036feb9fcc0d120b182922f.
Martin Schwenke [Tue, 20 Apr 2010 00:52:31 +0000 (10:52 +1000)]
Fix a thinko in 2ea0a9f1a93781a0d036feb9fcc0d120b182922f.

If the driver is virtio_net then we assume that the link is up rather
than ignoring the check altogether.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoethtool does not support virtio_net devices.
Ralph Wuerthner [Thu, 15 Apr 2010 06:38:19 +0000 (16:38 +1000)]
ethtool does not support virtio_net devices.

Skip the link test for this type of device.

Signed-off-by: Ralph Wuerthner <ralph.wuerthner@de.ibm.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoMerge branch 'master' of git://git.samba.org/sahlberg/ctdb
Martin Schwenke [Thu, 15 Apr 2010 03:45:50 +0000 (13:45 +1000)]
Merge branch 'master' of git://git.samba.org/sahlberg/ctdb

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Thu, 8 Apr 2010 04:30:01 +0000 (14:30 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoFix a compiler warning
Ronnie Sahlberg [Thu, 8 Apr 2010 04:28:52 +0000 (14:28 +1000)]
Fix a compiler warning

14 years agoIn the recovery daemon, keep track of which node we have assigned public ip
Ronnie Sahlberg [Thu, 8 Apr 2010 04:07:57 +0000 (14:07 +1000)]
In the recovery daemon, keep track of which node we have assigned public ip
addresses and verify that the remote nodes have/keep a consistent view of
assigned addresses.

If a remote node has an inconsistent view of addresses vis-a-vis the recovery
master this will trigger a full ip reallocation.

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Wed, 7 Apr 2010 00:45:27 +0000 (10:45 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoLower the loglevel for "Recovery lock successfully taken"
Ronnie Sahlberg [Wed, 7 Apr 2010 00:42:51 +0000 (10:42 +1000)]
Lower the loglevel for "Recovery lock successfully taken"
from ERR to NOTICE

BZ62086

14 years agoMerge commit 'origin/master'
Martin Schwenke [Wed, 31 Mar 2010 06:52:42 +0000 (17:52 +1100)]
Merge commit 'origin/master'

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Tue, 30 Mar 2010 01:50:19 +0000 (12:50 +1100)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoWhen we forcefully abort a running eventscript, don't log this as if
Ronnie Sahlberg [Tue, 30 Mar 2010 01:47:54 +0000 (12:47 +1100)]
When we forcefully abort a running eventscript, don't log this as if
the script timed out.

Instead send a different signal (SIGABRT) to the child process to silently
kill the process group for the script and its children without logging
anything.

We abort any running "monitor" script anytime any other event is generated
either by ctdbd itself or by "ctdb eventscript ..."

BZ61043

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Tue, 30 Mar 2010 00:58:37 +0000 (11:58 +1100)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoReduce the loglevel for two log messages for Registering and Deregistering server...
Ronnie Sahlberg [Tue, 30 Mar 2010 00:57:25 +0000 (11:57 +1100)]
Reduce the loglevel for two log messages for Registering and Deregistering server ids.

BZ61890

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Mon, 29 Mar 2010 06:06:50 +0000 (17:06 +1100)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoIn ctdb catdb, print the payload data length without the ctdb header length
Volker Lendecke [Wed, 24 Mar 2010 09:35:10 +0000 (10:35 +0100)]
In ctdb catdb, print the payload data length without the ctdb header length

14 years agoFix a typo in run_startrecovery_eventscript
Volker Lendecke [Mon, 22 Feb 2010 14:04:16 +0000 (15:04 +0100)]
Fix a typo in run_startrecovery_eventscript

14 years agoevents:50.samba: wipe the local part of the serverid db before starting winbind/smbd...
Michael Adam [Fri, 26 Mar 2010 16:33:51 +0000 (17:33 +0100)]
events:50.samba: wipe the local part of the serverid db before starting winbind/smbd/nmbd

This is necessary for the new serverid approach.

Michael

14 years agonew version 1.0.114
Ronnie Sahlberg [Wed, 24 Mar 2010 06:21:10 +0000 (17:21 +1100)]
new version 1.0.114

14 years agoconfig: let 13.per_ip_routing use a flock for generate_auto_link_local()
Stefan Metzmacher [Fri, 26 Feb 2010 11:41:21 +0000 (12:41 +0100)]
config: let 13.per_ip_routing use a flock for generate_auto_link_local()

metze

14 years agoMerge commit 'obnox/master-rebase'
Ronnie Sahlberg [Thu, 11 Mar 2010 07:34:32 +0000 (18:34 +1100)]
Merge commit 'obnox/master-rebase'

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Thu, 11 Mar 2010 07:15:41 +0000 (18:15 +1100)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agoadjust a vacuum log level
Christian Ambach [Wed, 10 Mar 2010 17:46:15 +0000 (18:46 +0100)]
adjust a vacuum log level

made the severity of the decreasing interval log level the same as for the increasing,
they are both just info logs because they don't report errors

14 years agoctdb_setstatus in /etc/ctdb/functions was not working correctly because it was called...
Wolfgang Mueller-Friedt [Wed, 10 Mar 2010 09:39:31 +0000 (10:39 +0100)]
ctdb_setstatus in /etc/ctdb/functions was not working correctly because it was called with a wrong parameter list

14 years agopackaging: add tdbtool and tdbdump as dependencies to the RPM
Michael Adam [Wed, 24 Feb 2010 13:52:55 +0000 (14:52 +0100)]
packaging: add tdbtool and tdbdump as dependencies to the RPM

The init script relies on their existence.
This should fix bug #6773 on bugzilla.samba.org:
https://bugzilla.samba.org/show_bug.cgi?id=6773

Michael

14 years agodoc: regenerate ctdb(1) manpages after xml change
Michael Adam [Wed, 24 Feb 2010 13:52:04 +0000 (14:52 +0100)]
doc: regenerate ctdb(1) manpages after xml change

14 years agodoc: fix a linebreak in the example output of "ctdb getdbmap" in ctdb(1)
Michael Adam [Wed, 24 Feb 2010 13:50:37 +0000 (14:50 +0100)]
doc: fix a linebreak in the example output of "ctdb getdbmap" in ctdb(1)

14 years agoFix some more bashisms
Mathieu Parent [Thu, 4 Mar 2010 15:06:11 +0000 (16:06 +0100)]
Fix some more bashisms

14 years agoCorrect nice_service()
Mathieu Parent [Mon, 8 Mar 2010 20:19:35 +0000 (21:19 +0100)]
Correct nice_service()

nice takes a binary as argument and not a function or builtin command

14 years agodoc: regenerate ctdb and ctdb manpages after xml changes
Michael Adam [Wed, 24 Feb 2010 11:58:57 +0000 (12:58 +0100)]
doc: regenerate ctdb and ctdb manpages after xml changes

Michael

14 years agodoc: add metainfo "manual" and "source" in the ctdbd manual page
Michael Adam [Wed, 24 Feb 2010 11:53:21 +0000 (12:53 +0100)]
doc: add metainfo "manual" and "source" in the ctdbd manual page

14 years agodoc: fill metainfo "manual" and "source" in the ctdb manual page
Michael Adam [Wed, 24 Feb 2010 11:52:30 +0000 (12:52 +0100)]
doc: fill metainfo "manual" and "source" in the ctdb manual page

14 years agoCorrection of spelling errors.
Mathieu Parent [Tue, 5 Jan 2010 10:04:24 +0000 (11:04 +0100)]
Correction of spelling errors.

* interupted -> interrupted
* dont -> don't

(thanks to lintian)

See https://bugzilla.samba.org/show_bug.cgi?id=6935

14 years agoCorrection of spelling errors in manpages
Mathieu Parent [Tue, 5 Jan 2010 09:59:44 +0000 (10:59 +0100)]
Correction of spelling errors in manpages

thanks to lintian

See https://bugzilla.samba.org/show_bug.cgi?id=6935

14 years agofix bug #7152: check NFS-Shares, fails with too long path-names
Michael Adam [Tue, 23 Feb 2010 10:00:23 +0000 (11:00 +0100)]
fix bug #7152: check NFS-Shares, fails with too long path-names

Thanks to Thomas Sesselmann <t.sesselmann@dkfz.de> .

Michael

14 years agoserver:ctdb_send_dmaster_reply: fix a message typo.
Michael Adam [Wed, 6 Jan 2010 13:59:23 +0000 (14:59 +0100)]
server:ctdb_send_dmaster_reply: fix a message typo.

Michael

14 years agodoc: regenerate ctdb.1*
Stefan Metzmacher [Tue, 23 Feb 2010 09:29:27 +0000 (10:29 +0100)]
doc: regenerate ctdb.1*

metze

14 years agodoc/ctdb.1.xml: document "ctdb setifacelink <iface> <status>"
Stefan Metzmacher [Tue, 23 Feb 2010 09:36:46 +0000 (10:36 +0100)]
doc/ctdb.1.xml: document "ctdb setifacelink <iface> <status>"

metze

14 years agodoc/ctdb.1.xml: document "ctdb ipinfo <ip>"
Stefan Metzmacher [Tue, 23 Feb 2010 09:04:51 +0000 (10:04 +0100)]
doc/ctdb.1.xml: document "ctdb ipinfo <ip>"

metze

14 years agodoc/ctdb.1.xml: update "ctdb ip" documentation
Stefan Metzmacher [Tue, 23 Feb 2010 09:03:00 +0000 (10:03 +0100)]
doc/ctdb.1.xml: update "ctdb ip" documentation

metze

14 years agodoc/ctdb.1.xml: document "ctdb ifaces"
Stefan Metzmacher [Tue, 23 Feb 2010 09:01:50 +0000 (10:01 +0100)]
doc/ctdb.1.xml: document "ctdb ifaces"

metze

14 years agodoc/ctdb.1.xml: document PARTIALLYONLINE status
Stefan Metzmacher [Tue, 23 Feb 2010 07:35:08 +0000 (08:35 +0100)]
doc/ctdb.1.xml: document PARTIALLYONLINE status

metze

14 years agoconfig/13.per_ip_routing: fix typo in error message
Stefan Metzmacher [Fri, 12 Feb 2010 08:54:46 +0000 (09:54 +0100)]
config/13.per_ip_routing: fix typo in error message

metze

14 years agoconfig/13.per_ip_routing: use better names for release_script and setup_script
Stefan Metzmacher [Fri, 12 Feb 2010 13:06:40 +0000 (14:06 +0100)]
config/13.per_ip_routing: use better names for release_script and setup_script

As the basename of the script will be used for the readd script
from setup_iface_ip_readd_script, it's now easier to identify
what script is called by delete_ip_from_iface() while readding
ips to the interface.

metze

14 years agoconfig/13.per_ip_routing: register the setup script with setup_iface_ip_readd_script()
Stefan Metzmacher [Fri, 12 Feb 2010 08:52:09 +0000 (09:52 +0100)]
config/13.per_ip_routing: register the setup script with setup_iface_ip_readd_script()

This is needed because we need to resetup the routing table when
the delete_ip_from_iface() function readds the ip to the interface.

metze

14 years agoconfig/13.per_ip_routing: add a setup_per_ip_routing() function
Stefan Metzmacher [Tue, 9 Feb 2010 15:34:59 +0000 (16:34 +0100)]
config/13.per_ip_routing: add a setup_per_ip_routing() function

This combines the logic into a shell function which can be used by the
"takeip" and "updateip" hooks.

We check the return values of the "ip" commands now
instead of ignoring them.

We now create a setup_script.sh similar to the release_script.sh
which makes it easier to analyze problems.

metze

14 years agoserver: add "setup" event
Stefan Metzmacher [Fri, 12 Feb 2010 10:24:08 +0000 (11:24 +0100)]
server: add "setup" event

This is needed because the "init" event can't use 'ctdb' commands.

metze

14 years agoconfig/10.interface: use delete_ip_from_iface also in the "init" event
Stefan Metzmacher [Fri, 12 Feb 2010 10:25:26 +0000 (11:25 +0100)]
config/10.interface: use delete_ip_from_iface also in the "init" event

metze

14 years agoconfig/11.natgw: use delete_ip_from_iface() instead of remove_ip()
Stefan Metzmacher [Fri, 12 Feb 2010 09:33:54 +0000 (10:33 +0100)]
config/11.natgw: use delete_ip_from_iface() instead of remove_ip()

This also initializes the variables correctly for the
shutdown|removenatgw code path to delete_all.

metze

14 years agoconfig: make remove_ip() a wrapper of delete_ip_from_iface()
Stefan Metzmacher [Fri, 12 Feb 2010 09:24:44 +0000 (10:24 +0100)]
config: make remove_ip() a wrapper of delete_ip_from_iface()

metze

14 years agoconfig: interface_modify states in a $CTDB_BASE/state/interface_modify directory
Stefan Metzmacher [Fri, 12 Feb 2010 09:23:17 +0000 (10:23 +0100)]
config: interface_modify states in a $CTDB_BASE/state/interface_modify directory

metze

14 years agoconfig: add setup_iface_ip_readd_script() helper function
Stefan Metzmacher [Fri, 12 Feb 2010 08:48:01 +0000 (09:48 +0100)]
config: add setup_iface_ip_readd_script() helper function

This adds a generic infrastructure to register scripts which will
be called when the delete_ip_from_iface() function needs to readd
secondary ips to an interface.

metze

14 years agoconfig: readd ips with a broadcast address in delete_ip_from_iface()
Stefan Metzmacher [Fri, 12 Feb 2010 08:55:28 +0000 (09:55 +0100)]
config: readd ips with a broadcast address in delete_ip_from_iface()

metze

14 years agoIn ctdb_control_end_recovery,
Ronnie Sahlberg [Tue, 23 Feb 2010 01:43:49 +0000 (12:43 +1100)]
In ctdb_control_end_recovery,

We used to talloc_steal c (the command packet) and make it a child of the
"event script state context".
If we failed to create an eventscript child context for some reason,
this would have talloc freed state, but at the same time it would also
implicitly have freed c.
Once ctdb_control_end_recovery() returns the error back to the caller,
the caller would dereference both c, and also outdata which is a child of c
and we would either read garbage data or segv.

Change the ordering so we only talloc_steal c as a child of state IFF
we have successfully created a child context for the script.

BZ61068

14 years ago Make sure that the natgw eventscript also triggers on the "stopped" event
Ronnie Sahlberg [Mon, 22 Feb 2010 23:14:51 +0000 (10:14 +1100)]
Make sure that the natgw eventscript also triggers on the "stopped" event
    to remove the natgw configuration and ip assignments used.

BZ61036

14 years agoctdb regsrvids is much more useful for testing if it sleeps once it has registered...
Ronnie Sahlberg [Mon, 22 Feb 2010 04:34:26 +0000 (15:34 +1100)]
ctdb regsrvids is much more useful for testing if it sleeps once it has registered its srvid.
Otherwise, as soon as it terminates, ctdbd will deregister the id automatically.

14 years agoFrom Sumit Bose <sbose@redhat.com>
Ronnie Sahlberg [Mon, 22 Feb 2010 03:06:52 +0000 (14:06 +1100)]
From Sumit Bose <sbose@redhat.com>

Fixes for init script to meet guidelines

14 years agoFrom Elia Pinto <gitter.spiros@gmail.com>
Ronnie Sahlberg [Mon, 22 Feb 2010 03:00:33 +0000 (14:00 +1100)]
From Elia Pinto <gitter.spiros@gmail.com>

We don't need to include getopt.h under AIX

14 years agoIgnore any scripts that time out for most events, except startup.
Ronnie Sahlberg [Tue, 16 Feb 2010 00:18:43 +0000 (11:18 +1100)]
Ignore any scripts that time out for most events, except startup.

Treat hung scripts always (except startup) as success.

14 years agotry to restart rpc-rquotad if it is not running
Ronnie Sahlberg [Fri, 12 Feb 2010 02:19:57 +0000 (13:19 +1100)]
try to restart rpc-rquotad if it is not running

bz60317

14 years agoLeave sequence number alone when merely migrating records.
Rusty Russell [Fri, 12 Feb 2010 06:32:56 +0000 (17:02 +1030)]
Leave sequence number alone when merely migrating records.

(Based on earlier version from Ronnie which modified tdb; this one
is standalone).

When storing records in a tdb that has "automatic seqnum updates"
also check if the actual data for the record has changed or not.

If it has not changed at all, except for possibly the header,
this is likely just a dmaster migration operation in which case
we want to write the record to the tdb but we do not want the tdb
sequence number to be increased.

This resolves the problem of notify.tdb being thrashed under load:
the heuristic in smbd to only reread this when the sequence number
increases (rarely) breaks down.
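
The comparison described above can be sketched as follows. This is an illustration, not the ctdb code: the function names are hypothetical, and the record is assumed to be a fixed-size header of hdr_len bytes followed by the payload.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the payload after the header differs, 0 if only the
 * header (for example the dmaster field) may have changed. */
int payload_changed(const unsigned char *old_rec, size_t old_len,
		    const unsigned char *new_rec, size_t new_len,
		    size_t hdr_len)
{
	assert(old_len >= hdr_len && new_len >= hdr_len);
	if (old_len != new_len)
		return 1;
	return memcmp(old_rec + hdr_len, new_rec + hdr_len,
		      old_len - hdr_len) != 0;
}

/* The record is always written; the sequence number only moves when
 * the payload really changed. */
unsigned next_seqnum(unsigned seqnum,
		     const unsigned char *old_rec, size_t old_len,
		     const unsigned char *new_rec, size_t new_len,
		     size_t hdr_len)
{
	if (payload_changed(old_rec, old_len, new_rec, new_len, hdr_len))
		return seqnum + 1;
	return seqnum;
}
```

A reader that rereads a database only when the sequence number moves then ignores pure migrations, which is the behaviour the notify.tdb heuristic depends on.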

Before, running nbench --num-progs=512 across 4 nodes, we saw numbers like:
 512      1496  118.33 MB/sec  execute 60 sec  latency 0.00 msec
And turning on latency tracking, this was typical in the logs:
 ctdbd: High latency 9380914.000000s for operation lockwait on database notify.tdb

After this commit:
  512      2451  143.85 MB/sec  execute 60 sec  latency 0.00 msec
And no more latency messages...

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agoReduce loglevel for two eventscript related debug messages
Ronnie Sahlberg [Thu, 11 Feb 2010 01:00:43 +0000 (12:00 +1100)]
Reduce loglevel for two eventscript related debug messages

14 years agoReducing the log level for a debug message
Ronnie Sahlberg [Thu, 11 Feb 2010 00:54:46 +0000 (11:54 +1100)]
Reducing the log level for a debug message

              DEBUG(DEBUG_DEBUG,("pnn %u starting migration of %08x t\

14 years agoReduce the log level for two debug messages
Ronnie Sahlberg [Thu, 11 Feb 2010 00:49:48 +0000 (11:49 +1100)]
Reduce the log level for two debug messages

       DEBUG(DEBUG_DEBUG,("pnn %u dmaster response %08x\n", ctdb->pnn, ctdb_has
       DEBUG(DEBUG_DEBUG,("pnn %u dmaster request on %08x for %u from %u\n",

14 years agoAdd a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
Ronnie Sahlberg [Thu, 11 Feb 2010 00:32:22 +0000 (11:32 +1100)]
Add a variable CTDB_CHECK_SWAP_IS_NOT_USED="yes"
to control whether or not to check if we are swapping, and produce
useful output into the logfile if we are.

For production systems with dedicated nas-heads we should never swap.
But for developer/test systems we often use smaller nondedicated systems where
we can no longer guarantee that we will not be using swap.

14 years agolower the loglevel for a debug message for redundant releases of public ips
Ronnie Sahlberg [Thu, 11 Feb 2010 00:19:08 +0000 (11:19 +1100)]
lower the loglevel for a debug message for redundant releases of public ips

14 years agoAdd a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
Ronnie Sahlberg [Thu, 11 Feb 2010 00:09:39 +0000 (11:09 +1100)]
Add a new variable : CTDB_NFS_SKIP_KNFSD_ALIVE_CHECK
when set to "yes" this will skip checking if knfsd has hung or not.

bz59626

14 years agofixed printing of high latency
Andrew Tridgell [Fri, 5 Feb 2010 06:11:29 +0000 (17:11 +1100)]
fixed printing of high latency

14 years agoMerge commit 'martins/master'
Ronnie Sahlberg [Thu, 11 Feb 2010 03:08:41 +0000 (14:08 +1100)]
Merge commit 'martins/master'

14 years agoTest suite: Make "ctdb ip" test backward compatible with older ctdb versions.
Martin Schwenke [Wed, 10 Feb 2010 09:27:53 +0000 (20:27 +1100)]
Test suite: Make "ctdb ip" test backward compatible with older ctdb versions.

Recent updates to the test meant that it only worked with the latest
ctdb versions.  This changes things so that we never bother matching
the machine readable header, just the actual data in the output.  It
also takes a slightly more liberal approach in massaging the human
readable output to ensure it matches the machine readable output.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoMerge commit 'origin/master'
Martin Schwenke [Wed, 10 Feb 2010 09:24:28 +0000 (20:24 +1100)]
Merge commit 'origin/master'