sahlberg/ctdb.git
13 years agochange the controls to return a struct ctdb_client_control_state* instead of a handle libctdb
Ronnie Sahlberg [Wed, 19 May 2010 00:56:57 +0000 (10:56 +1000)]
change the controls to return a struct ctdb_client_control_state* instead of a handle
and use a dedicated cancel function for these: ctdb_cancel_control()

13 years agochange ...generic_callback to ...control_callback
Ronnie Sahlberg [Wed, 19 May 2010 00:43:40 +0000 (10:43 +1000)]
change ...generic_callback to ...control_callback

13 years agoAdd verification that we called the correct *_recv() function for the handle from...
Ronnie Sahlberg [Tue, 18 May 2010 06:07:42 +0000 (16:07 +1000)]
Add verification that we called the correct *_recv() function for the handle from the callbacks and elsewhere.
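The kind of check this commit describes can be illustrated with a minimal sketch. Every name below (`ex_handle`, `ex_handle_type`, `ex_getpnn_recv`, `ex_getrecmaster_recv`) is hypothetical and not taken from libctdb; the sketch only assumes that each handle remembers which operation created it, so that a mismatched *_recv() call can be rejected.

```c
#include <stddef.h>

/* Hypothetical: which *_send() call produced this handle. */
enum ex_handle_type { EX_HANDLE_GETPNN, EX_HANDLE_GETRECMASTER };

struct ex_handle {
    enum ex_handle_type type; /* set when the request is sent */
    int value;                /* reply payload, filled in on completion */
};

/* Returns 0 on success, -1 if the handle belongs to a different operation. */
int ex_getpnn_recv(const struct ex_handle *h, int *pnn)
{
    if (h == NULL || h->type != EX_HANDLE_GETPNN) {
        return -1; /* wrong *_recv() function for this handle */
    }
    *pnn = h->value;
    return 0;
}

int ex_getrecmaster_recv(const struct ex_handle *h, int *recmaster)
{
    if (h == NULL || h->type != EX_HANDLE_GETRECMASTER) {
        return -1;
    }
    *recmaster = h->value;
    return 0;
}
```

Calling `ex_getrecmaster_recv()` on a handle tagged `EX_HANDLE_GETPNN` fails and leaves the output untouched.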

13 years agoadd a missing argument to the macros to test for samba/test owner for a particular...
Ronnie Sahlberg [Tue, 18 May 2010 02:48:12 +0000 (12:48 +1000)]
add a missing argument to the macros to test for samba/test owner for a particular server id

13 years agomove the definition of server/message id port values to the public ctdb.h header
Ronnie Sahlberg [Tue, 18 May 2010 02:46:00 +0000 (12:46 +1000)]
move the definition of server/message id port values to the public ctdb.h header

change tst.c example to use message id ports from the range reserved for test purposes

13 years agocreate macros that define that all server ids for messaging where
Ronnie Sahlberg [Tue, 18 May 2010 02:41:13 +0000 (12:41 +1000)]
create macros that define that all server ids for messaging where
the top 32 bits are all zero are reserved for samba

allocate the prefix 0x00000001 for test purposes
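A minimal sketch of the reservation scheme described in this commit, assuming a 64-bit server id whose top 32 bits act as a prefix. The macro and function names (`EX_SRVID_*`, `ex_srvid_is_samba`, `ex_srvid_is_test`) are hypothetical; the actual macros in ctdb.h may be named and laid out differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the real definitions live in ctdb.h. */
#define EX_SRVID_PREFIX_MASK  0xFFFFFFFF00000000ULL
#define EX_SRVID_SAMBA_PREFIX 0x0000000000000000ULL /* top 32 bits all zero */
#define EX_SRVID_TEST_PREFIX  0x0000000100000000ULL /* prefix 0x00000001 */

/* True if the server id falls in the range reserved for samba. */
static inline bool ex_srvid_is_samba(uint64_t srvid)
{
    return (srvid & EX_SRVID_PREFIX_MASK) == EX_SRVID_SAMBA_PREFIX;
}

/* True if the server id falls in the range reserved for test purposes. */
static inline bool ex_srvid_is_test(uint64_t srvid)
{
    return (srvid & EX_SRVID_PREFIX_MASK) == EX_SRVID_TEST_PREFIX;
}
```

Masking the prefix keeps the lower 32 bits free for the owner of each range to assign as it sees fit.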

13 years agoRemove ctdb_set_callback() from the public API and provide the callback as part
Ronnie Sahlberg [Tue, 18 May 2010 02:19:12 +0000 (12:19 +1000)]
Remove ctdb_set_callback() from the public API and provide the callback as part
of the *_send() signature.
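The API shape this commit moves to, with the callback passed directly to *_send(), might look roughly like the sketch below. All names (`ex_callback_t`, `ex_request`, `ex_getpnn_send`, `ex_request_complete`, `ex_store_value`) are hypothetical, and the real libctdb signatures may differ.

```c
#include <stddef.h>

/* Hypothetical generic callback type shared by all operations. */
typedef void (*ex_callback_t)(int status, int value, void *private_data);

struct ex_request {
    ex_callback_t callback;
    void *private_data;
    int completed;
};

/* The callback is registered as part of the send call itself,
 * so there is no window in which a reply could arrive without one. */
int ex_getpnn_send(struct ex_request *req, ex_callback_t cb, void *private_data)
{
    if (req == NULL) {
        return -1;
    }
    req->callback = cb;
    req->private_data = private_data;
    req->completed = 0;
    return 0;
}

/* Would be invoked by the event loop when the reply arrives. */
void ex_request_complete(struct ex_request *req, int status, int value)
{
    req->completed = 1;
    if (req->callback != NULL) {
        req->callback(status, value, req->private_data);
    }
}

/* Example callback: store the value on success. */
void ex_store_value(int status, int value, void *private_data)
{
    if (status == 0) {
        *(int *)private_data = value;
    }
}
```

A caller that prefers to poll can pass a NULL callback and inspect `completed` instead.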

13 years agomake the getrecovery master control use the new generic set_callback
Ronnie Sahlberg [Tue, 18 May 2010 01:24:39 +0000 (11:24 +1000)]
make the getrecovery master control use the new generic set_callback
api and remove the callback argument from the _send() signature

Update the example tst.c function to show the three different modes to talk to ctdb

13 years agoupdate getpnn to use a generic "set_callback" call to register the callback on the...
Ronnie Sahlberg [Tue, 18 May 2010 01:11:12 +0000 (11:11 +1000)]
update getpnn to use a generic "set_callback" call to register the callback on the handle
this allows us to have a generic signature for all callbacks instead of different types for each operation.

13 years agoadd a function ctdb_writerecord() to write a record to the database.
Ronnie Sahlberg [Mon, 17 May 2010 08:40:24 +0000 (18:40 +1000)]
add a function ctdb_writerecord() to write a record to the database.

This function can only be called while holding a ctdb_readrecordlock*() handle:
either from the callback provided, or after ctdb_readrecordlock_recv() has been called but before ctdb_free() is used to release the handle.

13 years agomove ctdb_call_recv to libctdb
Ronnie Sahlberg [Mon, 17 May 2010 05:55:40 +0000 (15:55 +1000)]
move ctdb_call_recv to libctdb

13 years ago createdb and getdbpath do not need to be in the public api
Ronnie Sahlberg [Mon, 17 May 2010 05:43:37 +0000 (15:43 +1000)]
createdb and getdbpath do not need to be in the public api

13 years agoremove the old ctdb_attach() function and move all callers over to the new ctdb_attac...
Ronnie Sahlberg [Mon, 17 May 2010 04:26:19 +0000 (14:26 +1000)]
remove the old ctdb_attach() function and move all callers over to the new ctdb_attachdb() in libctdb

13 years agocreate a ctdb_attachdb_recv() function for libctdb.
Ronnie Sahlberg [Mon, 17 May 2010 04:01:49 +0000 (14:01 +1000)]
create a ctdb_attachdb_recv() function for libctdb.

13 years agocreate an async version of ctdb_attachdb_send() for libctdb
Ronnie Sahlberg [Mon, 17 May 2010 02:30:22 +0000 (12:30 +1000)]
create an async version of ctdb_attachdb_send() for libctdb

13 years ago create an async version of the control to get the path to a ctdb database file
Ronnie Sahlberg [Fri, 14 May 2010 05:29:05 +0000 (15:29 +1000)]
create an async version of the control to get the path to a ctdb database file
for libctdb

13 years ago in ctdb_getrecmaster_recv return 0 (success) when the function completed correctly...
Ronnie Sahlberg [Fri, 14 May 2010 03:43:00 +0000 (13:43 +1000)]
in ctdb_getrecmaster_recv, return 0 (success) when the function completed correctly, not state->status, which is the PNN number in this case.

fix some additional bugs and improve error handling in the callbacks
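The bug fixed here is easy to see in a small sketch. The struct and function below (`ex_control_state`, `ex_recmaster_recv`) are hypothetical stand-ins, not the libctdb definitions; the only assumption is that the reply's status field carries the recovery master's PNN, so returning it directly would make any non-zero PNN look like a failure.

```c
#include <stddef.h>

/* Hypothetical reply state for a "get recovery master" control. */
struct ex_control_state {
    int failed; /* non-zero if the control itself failed */
    int status; /* for this control: the PNN of the recovery master */
};

/* Returns 0 on success and -1 on failure; the PNN goes to *recmaster. */
int ex_recmaster_recv(const struct ex_control_state *state, int *recmaster)
{
    if (state == NULL || state->failed) {
        return -1;
    }
    if (recmaster != NULL) {
        *recmaster = state->status;
    }
    /* Not "return state->status": a PNN such as 3 is a valid answer,
     * not an error code. */
    return 0;
}
```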

13 years agochange ctdb_ctrl_attach to use the new libctdb versions to create the database
Ronnie Sahlberg [Fri, 14 May 2010 03:13:49 +0000 (13:13 +1000)]
change ctdb_ctrl_attach to use the new libctdb versions to create the database

13 years agowe sometimes need to send special flags when asking ctdb to create a new database...
Ronnie Sahlberg [Fri, 14 May 2010 03:05:38 +0000 (13:05 +1000)]
we sometimes need to send special flags when asking ctdb to create a new database since ctdbd will also open the same database internally

13 years ago let the ctdb_createdb*() functions return the db_id
Ronnie Sahlberg [Fri, 14 May 2010 02:43:00 +0000 (12:43 +1000)]
let the ctdb_createdb*() functions return the db_id

13 years agoupdate the internal ctdb_ctrl_createdb() call to use the async libctdb code
Ronnie Sahlberg [Fri, 14 May 2010 01:16:31 +0000 (11:16 +1000)]
update the internal ctdb_ctrl_createdb() call to use the async libctdb code

13 years agoadd a libctdb version of the control to create a database
Ronnie Sahlberg [Fri, 14 May 2010 00:59:45 +0000 (10:59 +1000)]
add a libctdb version of the control to create a database

13 years ago change the old ctdb_ctrl_getpnn() function with timeout to use the new
Ronnie Sahlberg [Thu, 13 May 2010 02:16:00 +0000 (12:16 +1000)]
change the old ctdb_ctrl_getpnn() function with timeout to use the new
libctdb functions instead of calling ctdb_control() directly.

13 years agoChange the ctdb_getpnn*() functions to take a destination node parameter so we can...
Ronnie Sahlberg [Thu, 13 May 2010 02:00:22 +0000 (12:00 +1000)]
Change the ctdb_getpnn*() functions to take a destination node parameter so we can send it to any node (as a "ping are you there?")

move the special addresses like CTDB_CURRENT_NODE from ctdb_protocol.h to ctdb.h

13 years agoAdd a control to read the PNN number of the local node to libctdb
Ronnie Sahlberg [Thu, 13 May 2010 00:14:27 +0000 (10:14 +1000)]
Add a control to read the PNN number of the local node to libctdb

13 years agotemporary kludge
Ronnie Sahlberg [Wed, 12 May 2010 23:46:38 +0000 (09:46 +1000)]
temporary kludge

add a temporary kludge to the ctdb_service() function to handle the timed events with zero timeout that are used in the libctdb code.

13 years agoremove the timeout parameter to ctdb_control_send() and
Ronnie Sahlberg [Wed, 12 May 2010 04:34:17 +0000 (14:34 +1000)]
remove the timeout parameter to ctdb_control_send() and
have all callers set this explicitly when they need a timeout
for the control

13 years agomove some timed event calls out from libctdb and put it back in client/ctdb_client.c
Ronnie Sahlberg [Wed, 12 May 2010 04:16:17 +0000 (14:16 +1000)]
move some timed event calls out from libctdb and put it back in client/ctdb_client.c

13 years agoooops, add a file that was missing
Ronnie Sahlberg [Wed, 12 May 2010 04:09:04 +0000 (14:09 +1000)]
ooops,  add a file that was missing

13 years agodont include events.h from files that dont need/use it.
Ronnie Sahlberg [Wed, 12 May 2010 03:46:34 +0000 (13:46 +1000)]
dont include events.h from files that dont need/use it.

13 years ago show how to use _send() _recv() in the example program
Ronnie Sahlberg [Wed, 12 May 2010 02:11:07 +0000 (12:11 +1000)]
show how to use _send() _recv() in the example program

13 years agoUpdate ctdb_remove_message_handler to provide a nonblocking async version
Ronnie Sahlberg [Tue, 11 May 2010 23:40:16 +0000 (09:40 +1000)]
Update ctdb_remove_message_handler to provide a nonblocking async version

13 years agoadditional cleanup.
Ronnie Sahlberg [Tue, 11 May 2010 23:22:12 +0000 (09:22 +1000)]
additional cleanup.

create async version of ctdb_set_message_handler()

13 years agomove messaging functions into libctdb
Ronnie Sahlberg [Tue, 11 May 2010 18:53:44 +0000 (04:53 +1000)]
move messaging functions into libctdb

13 years agochange the libctdb_ prefix to ctdb_
Ronnie Sahlberg [Tue, 11 May 2010 18:10:18 +0000 (04:10 +1000)]
change the libctdb_ prefix to ctdb_

13 years agoChange the ifdefs to match the filename
Ronnie Sahlberg [Tue, 11 May 2010 17:59:15 +0000 (03:59 +1000)]
Change the ifdefs to match the filename

13 years agorename ctdb.h to ctdb_protocol.h
Ronnie Sahlberg [Tue, 11 May 2010 17:56:20 +0000 (03:56 +1000)]
rename ctdb.h to ctdb_protocol.h
rename libctdb.h to ctdb.h

update all source files to include ctdb_protocol.h when required

13 years agodont include talloc in libctdb.a
Ronnie Sahlberg [Tue, 11 May 2010 17:32:41 +0000 (03:32 +1000)]
dont include talloc in libctdb.a
use the talloc that the application we link with provides

13 years agoadd libctdb getrecmaster control recv function and
Ronnie Sahlberg [Tue, 11 May 2010 01:26:48 +0000 (11:26 +1000)]
add libctdb getrecmaster control recv function and
a sync version of getrecmaster

13 years agoupdates
Ronnie Sahlberg [Tue, 11 May 2010 00:58:27 +0000 (10:58 +1000)]
updates

13 years agoexample libctdb.a and test program libctdb/tst.c
Ronnie Sahlberg [Mon, 10 May 2010 21:23:56 +0000 (07:23 +1000)]
example libctdb.a and test program libctdb/tst.c
for evaluation.

libctdb.a contains a lot of code used in ctdbd
as well as a small libctdb wrapper that is callable from clients:
libctdb/ctdb_connect.c

13 years agowhen performing a recovery,
Ronnie Sahlberg [Wed, 5 May 2010 23:33:08 +0000 (09:33 +1000)]
when performing a recovery,
ensure that all nodes use the same reclock file setting as the recovery master

13 years agoAdd a new eventscript 62.cnfs to integrate better with gpfs/cnfs
Ronnie Sahlberg [Tue, 4 May 2010 03:56:55 +0000 (13:56 +1000)]
Add a new eventscript 62.cnfs to integrate better with gpfs/cnfs

13 years agoMerge commit 'rusty/signal-fix'
Ronnie sahlberg [Mon, 3 May 2010 05:57:41 +0000 (15:57 +1000)]
Merge commit 'rusty/signal-fix'

13 years ago Dont check ip assignment across the cluster while ip-verification
Ronnie Sahlberg [Mon, 3 May 2010 05:52:02 +0000 (15:52 +1000)]
Dont check ip assignment across the cluster while ip-verification
    checks are disabled

13 years agoThe recent change to the recovery daemon to keep track of and
Ronnie Sahlberg [Wed, 28 Apr 2010 05:43:11 +0000 (15:43 +1000)]
The recent change to the recovery daemon to keep track of and
verify that all nodes agree on the most recent ip address assignments
broke "ctdb moveip ..." since that call would never trigger
a full takeover run and thus would immediately trigger an inconsistency.

Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments.

BZ62782

13 years agoMake create_merged_ip_list() a static function since
Ronnie Sahlberg [Wed, 28 Apr 2010 04:47:37 +0000 (14:47 +1000)]
Make create_merged_ip_list() a static function since
it is not called from outside of ctdb_takeover.c

13 years agoIn the log message when we have found an inconsistent ip address allocation,
Ronnie Sahlberg [Wed, 28 Apr 2010 04:44:53 +0000 (14:44 +1000)]
In the log message when we have found an inconsistent ip address allocation,
add extra log information about what the inconsistency is.

14 years agoIf the admin makes a configuration mistake and configures NATGW to use the
Ronnie Sahlberg [Tue, 27 Apr 2010 22:46:41 +0000 (08:46 +1000)]
If the admin makes a configuration mistake and configures NATGW to use the
same ip address as a normal public-address,
check for this in the natgw script and warn the user.

Also prevent ctdb from starting up since this configuration will not work.

BZ60933

14 years agoMerge commit 'rusty/tdb-update'
Ronnie sahlberg [Thu, 22 Apr 2010 23:25:25 +0000 (09:25 +1000)]
Merge commit 'rusty/tdb-update'

14 years agoAdd a setting where CTDB will monitor and warn for low memory conditions.
Ronnie Sahlberg [Thu, 22 Apr 2010 22:52:09 +0000 (08:52 +1000)]
Add a setting where CTDB will monitor and warn for low memory conditions.

    CTDB_MONITOR_FREE_MEMORY_WARN

BZ 59747

14 years agoIn the example script to remove all ip addresses after a ctdb crash,
Ronnie Sahlberg [Thu, 22 Apr 2010 22:35:01 +0000 (08:35 +1000)]
In the example script to remove all ip addresses after a ctdb crash,
add the NATGW address as one to be removed in addition to the
public addresses.

14 years agotdb: define _PUBLIC_ so we can compile tdb. rusty/tdb-update
Rusty Russell [Thu, 22 Apr 2010 04:41:38 +0000 (14:11 +0930)]
tdb: define _PUBLIC_ so we can compile tdb.

The Samba tree defines _PUBLIC_ (and _PRIVATE_) for libraries to
control visibility.  The last commit absorbed this from their tdb,
but we need to #define to stub it out since ctdb doesn't use it
(and doesn't need to: we only use tdb internally).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
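A minimal sketch of the stubbing described in this commit, assuming the markers are simply defined to nothing when the build does not provide them. The function `ex_exported_function` is a hypothetical placeholder; where exactly ctdb places these defines is not stated in the message.

```c
/* Stub out Samba's visibility markers when building outside the Samba tree.
 * With empty definitions, the annotations compile away entirely. */
#ifndef _PUBLIC_
#define _PUBLIC_
#endif

#ifndef _PRIVATE_
#define _PRIVATE_
#endif

/* Hypothetical function carrying the marker, as imported tdb code would. */
_PUBLIC_ int ex_exported_function(void)
{
    return 42;
}
```

The `#ifndef` guards keep the stub harmless if a build system later supplies real definitions.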
14 years agotdb: update tdb ABI to use hide_symbols=True
Andrew Tridgell [Thu, 22 Apr 2010 04:31:36 +0000 (14:01 +0930)]
tdb: update tdb ABI to use hide_symbols=True

We now use -fvisibility=hidden to hide symbols from outside the tdb
shared library.

This also moved tdb_transaction_recover() into the tdb_private.h
header, as it should never have been a public API. For that reason we
are changing the version number. We're only doing a minor version
increment as it is extremely unlikely that anyone was actually using
tdb_transaction_recover() as its locking requirements were rather
unusual.

Pair-Programmed-With: Rusty Russell <rusty@samba.org>

(Imported from commit 773a8afbba27a5e2e48577100f3ca9873b506615)

14 years agosubunit: Support formatting compatible with upstream subunit, for consistency.
Jelmer Vernooij [Thu, 22 Apr 2010 04:29:22 +0000 (13:59 +0930)]
subunit: Support formatting compatible with upstream subunit, for consistency.

Upstream subunit makes a ":" after commands optional, so I've fixed any
places where we might trigger commands accidentally. I've filed a bug
about this in subunit.

(Imported from commit 7da94cc4a664521be279b019e9f32121cd410193)

14 years agotdb: update exports and signatures files
Simo Sorce [Thu, 22 Apr 2010 04:28:35 +0000 (13:58 +0930)]
tdb: update exports and signatures files

(Imported from commit c1f6f61f620e865516d1856c9d937b5326a29046)

14 years agotdb: Add a non-blocking version of tdb_transaction_start
Volker Lendecke [Thu, 22 Apr 2010 04:28:35 +0000 (13:58 +0930)]
tdb: Add a non-blocking version of tdb_transaction_start

(Imported from commit 261c3b4f1beed820647061bacbee3acccbcbb089)

14 years agotdb: Fix indentation in tdb_new_database()
Volker Lendecke [Thu, 22 Apr 2010 04:28:07 +0000 (13:58 +0930)]
tdb: Fix indentation in tdb_new_database()

(Imported from commit 59315887a07033316edf91c0c57563eee5ea992d)

14 years agoFix some nonempty blank lines
Volker Lendecke [Thu, 22 Apr 2010 04:28:07 +0000 (13:58 +0930)]
Fix some nonempty blank lines

(Imported from commit ea8e0d5d54b020c530e392c4edaeed43e20af303)

14 years agopython: use '#!/usr/bin/env python' to cope with varying install locations
Andrew Tridgell [Thu, 22 Apr 2010 04:27:17 +0000 (13:57 +0930)]
python: use '#!/usr/bin/env python' to cope with varying install locations

this should be much more portable

(Imported from commit 088096d1bad51428a2e2d487214995d4fdfc7ccc)

14 years agotdb: Fix bug 7248, avoid the nanosleep dependency
Volker Lendecke [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: Fix bug 7248, avoid the nanosleep dependency

(Imported from commit e2c7e5c4f72565fe49265d5b036531926ea1ac92)

14 years agotdb: If tdb_parse_record does not find a record, return -1 instead of 0
Volker Lendecke [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: If tdb_parse_record does not find a record, return -1 instead of 0

(Imported from commit fb98f60594b6cabc52d0f2f49eda08f793ba4748)

14 years agotdb: handle processes dying during transaction commit.
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: handle processes dying during transaction commit.

tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit.  Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.

The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:

Before:
2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75

After:
2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit ec96ea690edbe3398d690b4a953d487ca1773f1c)

14 years agopatch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch

(Imported from commit 1bf482b9ef9ec73dd7ee4387d7087aa3955503dd)

14 years agotdb: add -k option to tdbtorture
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: add -k option to tdbtorture

To test the case of death of a process during transaction commit, add
a -k (kill random) option to tdbtorture.  The easiest way to do this
is to make every worker a child (unless there's only one child), which
is why this patch is bigger than you might expect.

Using -k without -t (always transactions) you expect corruption, though
it doesn't happen every time.  With -t, we currently get corruption but
the next patch fixes that.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit ececeffd85db1b27c07cdf91a921fd203006daf6)

14 years agotdb: don't truncate tdb on recovery
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: don't truncate tdb on recovery

The current recovery code truncates the tdb file on recovery.  This is
fine if recovery is only done on first open, but is a really bad idea
as we move to allowing recovery on "live" databases.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 8c3fda4318adc71899bc41486d5616da3a91a688)

14 years agotdb: remove lock ops
Rusty Russell [Thu, 22 Apr 2010 04:24:06 +0000 (13:54 +0930)]
tdb: remove lock ops

Now the transaction code uses the standard allrecord lock, that stops
us from trying to grab any per-record locks anyway.  We don't need to
have special noop lock ops for transactions.

This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 9f295eecffd92e55584fc36539cd85cd32c832de)

14 years agotdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()

tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it.  Change
this, and rename it to show it's clearly transaction-specific.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit a84222bbaf9ed2c7b9c61b8157b2e3c85f17fa32)

14 years agotdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.

Now the transaction allrecord lock is the standard one, and thus is cleaned
in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't need to
know what type it is.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit dd1b508c63034452673dbfee9956f52a1b6c90a5)

14 years agotdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade

Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.

Then we use this in the transaction code.  Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.

One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit fca1621965c547e2d076eca2a2599e9629f91266)

14 years agotdb: suppress record write locks when allrecord lock is taken.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: suppress record write locks when allrecord lock is taken.

Records themselves get (read) locked by the traversal code against delete.
Interestingly, this locking isn't done when the allrecord lock has been
taken, though the allrecord lock until recently didn't cover the actual
records (it now goes to end of file).

The write record lock, grabbed by the delete code, is not suppressed
by the allrecord lock.  This is now bad: it causes us to punch a hole
in the allrecord lock when we release the write record lock.  Make this
consistent: *no* record locks of any kind when the allrecord lock is
taken.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit caaf5c6baa1a4f340c1f38edd99b3a8b56621b8b)

14 years agotdb: cleanup: always grab allrecord lock to infinity.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: cleanup: always grab allrecord lock to infinity.

We were previously inconsistent with our "global" lock: the
transaction code grabbed it from FREELIST_TOP to end of file, and the
rest of the code grabbed it from FREELIST_TOP to end of the hash
chains.  Change it to always grab to end of file for simplicity and
so we can merge the two.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 9341f230f8968b4b18e451d15dda5ccbe7787768)
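Locking "to infinity" with fcntl relies on the POSIX rule that an `l_len` of 0 extends the lock to the end of the file, however far the file later grows. The helper below is a minimal sketch of that idea; the name `ex_lock_to_infinity` is hypothetical, and the start offset is left to the caller because the value of FREELIST_TOP is not given in the message.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Lock or unlock the byte range from `start` to the end of the file.
 * `type` is F_RDLCK, F_WRLCK or F_UNLCK. Non-blocking (F_SETLK).
 * Returns 0 on success, -1 on error (see errno). */
int ex_lock_to_infinity(int fd, off_t start, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = start;
    fl.l_len = 0; /* 0 means "to end of file", including future growth */

    return fcntl(fd, F_SETLK, &fl);
}
```

Because the range is open-ended, records appended after the lock is taken are covered as well, which a lock ending at the hash chains would not guarantee.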

14 years agotdb: remove num_locks
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: remove num_locks

This was redundant before this patch series: it mirrored num_lockrecs
exactly.  It still does.

Also, skip useless branch when locks == 1: unconditional assignment is
cheaper anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 1ab8776247f89b143b6e58f4b038ab4bcea20d3a)

14 years agotdb: use tdb_nest_lock() for seqnum lock.
Rusty Russell [Thu, 22 Apr 2010 04:24:05 +0000 (13:54 +0930)]
tdb: use tdb_nest_lock() for seqnum lock.

This is pure overhead, but it centralizes the locking.  Realloc (esp. as
most implementations are lazy) is fast compared to the fcntl anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit d48c3e4982a38fb6b568ed3903e55e07a0fe5ca6)

14 years agotdb: use tdb_nest_lock() for active lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for active lock.

Use our newly-generic nested lock tracking for the active lock.

Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 4738d474c412cc59d26fcea64007e99094e8b675)

14 years agotdb: use tdb_nest_lock() for open lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for open lock.

This never nests, so it's overkill, but it centralizes the locking into
lock.c and removes the ugly flag in the transaction code to track whether
we have the lock or not.

Note that we have a temporary hack so this places a real lock, despite
the fact that we are in a transaction.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 9136818df30c7179e1cffa18201cdfc990ebd7b7)

14 years agotdb: use tdb_nest_lock() for transaction lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: use tdb_nest_lock() for transaction lock.

Rather than a boutique lock and a separate nest count, use our
newly-generic nested lock tracking for the transaction lock.

Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit e8fa70a321d489b454b07bd65e9b0d95084168de)

14 years agotdb: cleanup: find_nestlock() helper.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: find_nestlock() helper.

Factor out two loops which find locks; we are going to introduce a couple
more so a helper makes sense.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit ce41411c84760684ce539b6a302a0623a6a78a72)

14 years agotdb: cleanup: tdb_release_extra_locks() helper
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_release_extra_locks() helper

Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit db270734d8b4208e00ce9de5af1af7ee11823f6d)

14 years agotdb: cleanup: tdb_have_extra_locks() helper
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_have_extra_locks() helper

In many places we check whether locks are held: add a helper to do this.

The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit fba42f1fb4f81b8913cce5a23ca5350ba45f40e1)

14 years agotdb: don't suppress the transaction lock because of the allrecord lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: don't suppress the transaction lock because of the allrecord lock.

tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we
hold the allrecord lock.  However, the two locks don't overlap, so
this is wrong.

This simplification makes the transaction lock a straight-forward nested
lock.

There are two callers for these functions:
1) The transaction code, which already makes sure the allrecord_lock
   isn't held.
2) The traverse code, which wants to stop transactions whether it has the
   allrecord lock or not.  There have been deadlocks here before, however
   this should not bring them back (I hope!)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit b754f61d235bdc3e410b60014d6be4072645e16f)

14 years agotdb: cleanup: tdb_nest_lock/tdb_nest_unlock
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: tdb_nest_lock/tdb_nest_unlock

Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0.  We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.

To generalize this:

1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock, make it non-static and move the
   allrecord check out to the callers (except the mark case which doesn't
   care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
   move the allrecord check out to the callers (except mark again).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 5d9de604d92d227899e9b861c6beafb2e4fa61e0)
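The counting scheme described in this commit can be sketched as follows. This is not the tdb implementation: the names (`ex_lock_ctx`, `ex_lockrec`, `ex_nest_lock`, `ex_nest_unlock`) and the fixed-size array are assumptions, and the real fcntl calls are replaced by counters so the sketch stays self-contained.

```c
#include <stddef.h>

#define EX_MAX_LOCKS 8

struct ex_lockrec {
    long offset; /* file offset the lock covers */
    int count;   /* how many times it is held by this process */
};

struct ex_lock_ctx {
    struct ex_lockrec recs[EX_MAX_LOCKS];
    int num;      /* number of entries in use */
    int placed;   /* stand-in for fcntl lock calls */
    int released; /* stand-in for fcntl unlock calls */
};

static struct ex_lockrec *ex_find(struct ex_lock_ctx *c, long offset)
{
    for (int i = 0; i < c->num; i++) {
        if (c->recs[i].offset == offset) {
            return &c->recs[i];
        }
    }
    return NULL;
}

/* Take the lock at `offset`; only the first holder touches the system lock. */
int ex_nest_lock(struct ex_lock_ctx *c, long offset)
{
    struct ex_lockrec *rec = ex_find(c, offset);

    if (rec != NULL) {
        rec->count++; /* already held: just bump the count */
        return 0;
    }
    if (c->num == EX_MAX_LOCKS) {
        return -1;
    }
    c->placed++; /* count goes to 1: real code would call fcntl here */
    c->recs[c->num].offset = offset;
    c->recs[c->num].count = 1;
    c->num++;
    return 0;
}

/* Drop one reference; only the last holder releases the system lock. */
int ex_nest_unlock(struct ex_lock_ctx *c, long offset)
{
    struct ex_lockrec *rec = ex_find(c, offset);

    if (rec == NULL) {
        return -1; /* not held */
    }
    if (--rec->count > 0) {
        return 0;
    }
    c->released++; /* count goes to 0: real code would unlock here */
    *rec = c->recs[--c->num]; /* move the last entry into the freed slot */
    return 0;
}
```

Keying the records by offset rather than list number is what lets the same bookkeeping serve locks that are not hash-chain locks.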

14 years agotdb: cleanup: rename global_lock to allrecord_lock.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: rename global_lock to allrecord_lock.

The word global is overloaded in tdb.  The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.

Rename it to allrecord_lock.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit e9114a758538d460d4f9deae5ce631bf44b1eff8)

14 years agotdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
Rusty Russell [Thu, 22 Apr 2010 04:23:51 +0000 (13:53 +0930)]
tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.

The word global is overloaded in tdb.  The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).

Rename it to OPEN_LOCK.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 7ab422d6fbd4f8be02838089a41f872d538ee7a7)

14 years agotdb: make _tdb_transaction_cancel static.
Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: make _tdb_transaction_cancel static.

Now tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel, we can make it static.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit a6e0ef87d25734760fe77b87a9fd11db56760955)

14 years agotdb: cleanup: split brlock and brunlock methods.
Rusty Russell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: cleanup: split brlock and brunlock methods.

This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.

For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fcntl locks don't need this).  This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.

We also use a "flags" argument to tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 452b4a5a6efeecfb5c83475f1375ddc25bcddfbe)
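How a single flags argument can replace the three separate mechanisms listed above is sketched below. The flag names mirror the ones in the message, but the `EX_` prefix, the numeric values and both helper functions are hypothetical. The sketch also assumes that "mark only" means no system lock is taken and that "probe" only suppresses failure logging.

```c
#include <fcntl.h>
#include <stdbool.h>

/* Hypothetical values; only the idea of independent bits matters here. */
enum ex_lock_flags {
    EX_LOCK_NOWAIT    = 1, /* fail instead of blocking */
    EX_LOCK_MARK_ONLY = 2, /* record the lock, do not take it */
    EX_LOCK_PROBE     = 4  /* failure is expected, stay quiet */
};

/* Pick the fcntl command for a lock request,
 * or -1 if no fcntl call should be made at all. */
int ex_fcntl_cmd(int flags)
{
    if (flags & EX_LOCK_MARK_ONLY) {
        return -1;
    }
    return (flags & EX_LOCK_NOWAIT) ? F_SETLK : F_SETLKW;
}

/* Whether a failed lock attempt should be logged. */
bool ex_should_log_failure(int flags)
{
    return (flags & EX_LOCK_PROBE) == 0;
}
```

Since the bits are independent, combinations such as `EX_LOCK_NOWAIT | EX_LOCK_PROBE` need no extra parameters.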

14 years agoSpelling fixes for tdb.
Brad Hards [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
Spelling fixes for tdb.

Signed-off-by: Matthias Dieter Wallnöfer <mwallnoefer@yahoo.de>
(Imported from commit 09e756b1d651caef203a4b7e02234f6dea374b08)

14 years agotdb: use fdatasync() instead of fsync() in transactions
Andrew Tridgell [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: use fdatasync() instead of fsync() in transactions

This might help on some filesystems

(Imported from commit 1373e748aa53fbd3afe4d2377208257d42628d86)
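For context, fdatasync() flushes the file's data plus only the metadata needed to read that data back (such as the file size), whereas fsync() also flushes other metadata such as timestamps. The helper below is a minimal sketch of a write followed by fdatasync(); the name `ex_write_and_sync` is hypothetical, it is not the tdb transaction code, and it does not retry on EINTR.

```c
#define _POSIX_C_SOURCE 200809L
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Write the whole buffer, then force the data to stable storage.
 * Returns 0 on success, -1 on error (see errno). */
int ex_write_and_sync(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    /* Data-only sync: may skip metadata that is not needed to read the data. */
    return fdatasync(fd);
}
```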

14 years agotdb: Apply some const, just for clarity
Volker Lendecke [Thu, 22 Apr 2010 04:23:42 +0000 (13:53 +0930)]
tdb: Apply some const, just for clarity

(Imported from commit 6824c6f46ba7c15e8af91d5aa8b21a946b63107b)

14 years agotdb: fix recovery reuse after crash
Rusty Russell [Thu, 22 Apr 2010 04:23:41 +0000 (13:53 +0930)]
tdb: fix recovery reuse after crash

If a process (or the machine) dies just after writing the
recovery head (pointing at the end of file), the recovery record will be filled
with 0x42.  This will not invoke a recovery on open, since rec.magic
!= TDB_RECOVERY_MAGIC.

Unfortunately, the first transaction commit will happily reuse that
area: tdb_recovery_allocate() doesn't check the magic.  The recovery
record has length 0x42424242, and it writes that back into the
now-valid-looking transaction header for the next comer (which
happens to be tdb_wipe_all in my tests).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit b37b452cb8c1f56b37b04abe7bffdede371ca361)

14 years agotdb: give a name to the invalid recovery area constant (0)
Rusty Russell [Thu, 22 Apr 2010 04:23:26 +0000 (13:53 +0930)]
tdb: give a name to the invalid recovery area constant (0)

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(Imported from commit 6269cdcd1538e2e3cead9e0f3c156b0363d607a0)

14 years agorelease-scripts: parametrize scripts
Simo Sorce [Thu, 22 Apr 2010 04:23:21 +0000 (13:53 +0930)]
release-scripts: parametrize scripts

This should make it easier to keep all release scripts aligned as it will reduce
the difference between them to ideally a few variables.

Also moves the tdb script in the scripts directory.

(Imported from commit 6339de7f4fef46fb3ad32d1ecf9379f5b5d24ccb)

14 years agoadd an example script that can be called from crontab to cleanup
Ronnie Sahlberg [Thu, 22 Apr 2010 04:02:11 +0000 (14:02 +1000)]
add an example script that can be called from crontab to cleanup
and release public ip addresses if ctdbd is no longer running

14 years ago add a missing ||
Ronnie Sahlberg [Thu, 22 Apr 2010 04:22:46 +0000 (14:22 +1000)]
add a missing ||
    to make the 10.interface script not fail with a syntax error

14 years agotdb: raise version to 1.2.1
Simo Sorce [Thu, 22 Apr 2010 04:15:58 +0000 (13:45 +0930)]
tdb: raise version to 1.2.1

after recent fixes we need to raise the version to 1.2.1 so that
we can require also the right patched version.

(Imported from commit 70534adee10fc6f5bba2d9304668dc6508e5de5a)

14 years agoFix a thinko in 2ea0a9f1a93781a0d036feb9fcc0d120b182922f.
Martin Schwenke [Tue, 20 Apr 2010 00:52:31 +0000 (10:52 +1000)]
Fix a thinko in 2ea0a9f1a93781a0d036feb9fcc0d120b182922f.

If the driver is virtio_net then we assume that the link is up rather
than ignoring the check altogether.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoethtool does not support virtio_net devices.
Ralph Wuerthner [Thu, 15 Apr 2010 06:38:19 +0000 (16:38 +1000)]
ethtool does not support virtio_net devices.

Skip link test for this type of devices

Signed-off-by: Ralph Wuerthner <ralph.wuerthner@de.ibm.com>
Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoMerge branch 'master' of git://git.samba.org/sahlberg/ctdb
Martin Schwenke [Thu, 15 Apr 2010 03:45:50 +0000 (13:45 +1000)]
Merge branch 'master' of git://git.samba.org/sahlberg/ctdb

14 years agoeventscript: simplify script timeout handling rusty/signal-fix
Rusty Russell [Thu, 8 Apr 2010 05:41:05 +0000 (15:11 +0930)]
eventscript: simplify script timeout handling

Now the script child signal handler doesn't do anything, we can unify the
"timeout" and "abort" cases introduced in 9dd25cb751919799.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
14 years agoeventscript: wait for debugging dump before killing timedout script
Rusty Russell [Thu, 8 Apr 2010 05:39:08 +0000 (15:09 +0930)]
eventscript: wait for debugging dump before killing timedout script

Fairly simple: prevent the destructor from killing the script, and do it
explicitly from the debugging child.

We can remove the extra "already dead" test, since this will be detected
in the destructor anyway.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>