sahlberg/ctdb.git
14 years agoversion 1.0.93 ctdb-1.0.93
Ronnie Sahlberg [Tue, 6 Oct 2009 06:05:14 +0000 (17:05 +1100)]
version 1.0.93

14 years agoupdate natgw eventscript to allow you to force it to update and / or to remove the...
Ronnie Sahlberg [Tue, 6 Oct 2009 05:09:24 +0000 (16:09 +1100)]
update natgw eventscript to allow you to force it to update and / or to remove the configuration at runtime

14 years agoMerge commit 'origin/master'
Martin Schwenke [Tue, 6 Oct 2009 02:39:31 +0000 (13:39 +1100)]
Merge commit 'origin/master'

14 years agoDocument CTDB_NODES_FILE environment variable used by onnode.
Martin Schwenke [Tue, 6 Oct 2009 02:38:00 +0000 (13:38 +1100)]
Document CTDB_NODES_FILE environment variable used by onnode.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoalways send the release/take ip controls to make sure all nodes are updated
Ronnie Sahlberg [Tue, 6 Oct 2009 01:25:44 +0000 (12:25 +1100)]
always send the release/take ip controls to make sure all nodes are updated

14 years agoadd a new message to ask the recovery daemon to temporarily disable checking ip addre...
Ronnie Sahlberg [Tue, 6 Oct 2009 01:11:32 +0000 (12:11 +1100)]
add a new message to ask the recovery daemon to temporarily disable checking ip address consistency.

This is useful when moving addresses within the cluster using moveip; otherwise, a collision with the recovery daemon's own check could trigger a recovery.

14 years agoupdate addip/moveip/delip to make it less likely to trigger an accidental recovery
Ronnie Sahlberg [Tue, 6 Oct 2009 00:41:18 +0000 (11:41 +1100)]
update addip/moveip/delip to make it less likely to trigger an accidental recovery

14 years agochange some loglevels and also print the pnn of the ip for takeip/releaseip logging
Ronnie Sahlberg [Tue, 6 Oct 2009 00:40:38 +0000 (11:40 +1100)]
change some loglevels and also print the pnn of the ip for takeip/releaseip logging

14 years agoadd a new function to collect a list of all active nodes EXCEPT a certain node
Ronnie Sahlberg [Mon, 5 Oct 2009 23:52:31 +0000 (10:52 +1100)]
add a new function to collect a list of all active nodes EXCEPT a certain node

14 years agoallocate takeoverip state as a child of vnn and also make the takeoverip context...
Ronnie Sahlberg [Mon, 5 Oct 2009 22:35:15 +0000 (09:35 +1100)]
allocate takeoverip state as a child of vnn and also make the takeoverip context a child of vnn

14 years agoWhen adding a public ip to a node, make sure to push the assignment of ip addresses...
Ronnie Sahlberg [Mon, 5 Oct 2009 21:19:25 +0000 (08:19 +1100)]
When adding a public ip to a node, make sure to push the assignment of ip addresses out to all nodes so all nodes become aware who currently holds the ip.

14 years agoversion 1.0.92 ctdb-1.0.92
Ronnie Sahlberg [Fri, 2 Oct 2009 04:38:16 +0000 (14:38 +1000)]
version 1.0.92

14 years agowe should close this file on exec
Ronnie Sahlberg [Fri, 2 Oct 2009 03:41:54 +0000 (13:41 +1000)]
we should close this file on exec

14 years agoMerge commit 'martins/master'
Ronnie Sahlberg [Thu, 1 Oct 2009 05:46:01 +0000 (15:46 +1000)]
Merge commit 'martins/master'

14 years agoTest suite: The ctdb ping test should allow time to go backwards.
Martin Schwenke [Thu, 1 Oct 2009 05:39:09 +0000 (15:39 +1000)]
Test suite: The ctdb ping test should allow time to go backwards.

Time can actually go backwards during this test if ntpd happens to
adjust it a little bit.  So we should cope...

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agodont exit on a commit failure
Ronnie Sahlberg [Thu, 1 Oct 2009 04:53:35 +0000 (14:53 +1000)]
dont exit on a commit failure

14 years agoRevert "Revert "allow the transaction commit to fail""
Ronnie Sahlberg [Thu, 1 Oct 2009 04:51:32 +0000 (14:51 +1000)]
Revert "Revert "allow the transaction commit to fail""

This reverts commit 74e416108df6934f45ca646d709785dd76ab3c35.

14 years agodocument how to use the notification script
Ronnie Sahlberg [Thu, 1 Oct 2009 04:31:55 +0000 (14:31 +1000)]
document how to use the notification script

14 years agoadd a new notification to trigger on when ctdb has started
Ronnie Sahlberg [Thu, 1 Oct 2009 04:05:30 +0000 (14:05 +1000)]
add a new notification to trigger on when ctdb has started

14 years agoMinor fixes to 01.reclock eventscript.
Martin Schwenke [Wed, 30 Sep 2009 11:21:56 +0000 (21:21 +1000)]
Minor fixes to 01.reclock eventscript.

test -z really needs its argument to be quoted.  Simplified a status
test.

Signed-off-by: Martin Schwenke <martin@meltin.net>
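
The quoting point in the 01.reclock fix above can be illustrated with a short sketch. This is not taken from the eventscript; the variable name and path are hypothetical, and the exact error status of the unquoted form depends on the shell.

```shell
#!/bin/sh
# Illustration only (not the contents of 01.reclock).
reclock="/clusterfs/my lock"   # hypothetical path containing a space

# Quoted: test sees exactly one operand and answers cleanly
# (status 1 here, because the string is not empty).
[ -z "$reclock" ] && quoted_status=0 || quoted_status=$?

# Unquoted: the value splits into two words, so test receives a
# malformed expression and typically reports an error instead of
# a plain true/false answer.
[ -z $reclock ] 2>/dev/null && unquoted_status=0 || unquoted_status=$?

echo "quoted=$quoted_status unquoted=$unquoted_status"
```

With the quoted form the script can rely on a clean true or false result whatever the variable holds.
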
14 years ago40.vsftpd monitor event only fails after 2 failures to connect to port 21.
Martin Schwenke [Wed, 30 Sep 2009 11:05:16 +0000 (21:05 +1000)]
40.vsftpd monitor event only fails after 2 failures to connect to port 21.

Change the monitor event in 40.vsftpd so it only fails if there are 2
successive failures connecting to port 21.  This reduces the
likelihood of unhealthy nodes due to vsftpd being restarted for
reconfiguration due to node failover or system reconfiguration.

New eventscript functions ctdb_counter_init, ctdb_counter_incr,
ctdb_counter_limit.  These are used to count arbitrary things in
eventscripts, depending on the eventscript name and a tag that is
passed, and determine if a specified limit has been hit.  They're good
for counting failures!

These functions are used in 40.vsftpd and also in 01.reclock - the
latter used to do the counting without these functions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
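
A minimal sketch of how counters keyed by eventscript name and tag might work is shown below. It is not ctdb's implementation: the `demo_counter_*` names, the one-file-per-counter storage and the temporary directory are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical sketch; names and storage layout are assumptions.
DEMO_COUNTER_DIR="${TMPDIR:-/tmp}/demo-counters.$$"

# Path of the file holding one counter. $1 = script name, $2 = tag
demo_counter_file () {
    echo "$DEMO_COUNTER_DIR/$1.$2"
}

# Reset the counter to zero.
demo_counter_init () {
    mkdir -p "$DEMO_COUNTER_DIR"
    echo 0 > "$(demo_counter_file "$1" "$2")"
}

# Add one to the counter, treating a missing file as zero.
demo_counter_incr () {
    mkdir -p "$DEMO_COUNTER_DIR"
    _f=$(demo_counter_file "$1" "$2")
    _n=$(cat "$_f" 2>/dev/null || echo 0)
    echo $(($_n + 1)) > "$_f"
}

# Succeed when the counter has reached the limit. $3 = limit
demo_counter_limit () {
    _n=$(cat "$(demo_counter_file "$1" "$2")" 2>/dev/null || echo 0)
    [ "$_n" -ge "$3" ]
}

# Simulate two successive failed connections with a limit of 2.
demo_counter_init 40.vsftpd connect
demo_counter_incr 40.vsftpd connect
demo_counter_limit 40.vsftpd connect 2 && hit_after_one=yes || hit_after_one=no
demo_counter_incr 40.vsftpd connect
demo_counter_limit 40.vsftpd connect 2 && hit_after_two=yes || hit_after_two=no
echo "limit hit after 1 failure: $hit_after_one, after 2: $hit_after_two"
rm -rf "$DEMO_COUNTER_DIR"
```

In this sketch a monitor event would call the init function after a successful connection, so only successive failures accumulate.
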
14 years agoMerge commit 'origin/master'
Martin Schwenke [Wed, 30 Sep 2009 09:22:59 +0000 (19:22 +1000)]
Merge commit 'origin/master'

14 years agoNew version 1.0.91 ctdb-1.0.91
Ronnie Sahlberg [Tue, 29 Sep 2009 03:31:41 +0000 (13:31 +1000)]
New version 1.0.91

14 years agoFrom Wolfgang Mueller-Friedt
Ronnie Sahlberg [Tue, 29 Sep 2009 03:20:18 +0000 (13:20 +1000)]
From Wolfgang Mueller-Friedt

Remove the explicit vacuum/repack commands from the 00.ctdb eventscript
and implement this in the ctdb daemon.

Combine vacuuming and repacking into one
cheap read traverse to enumerate all candidate records
and one write traverse that both repacks the database and deletes the records locally where we are lmaster and where the records have already been deleted remotely.

This code also adds initial autotuning heuristics for the vacuum intervals and for how many records to delete in each iteration.

Minor stylistic changes made by ronnie s

14 years agoMerge commit 'origin/master'
Martin Schwenke [Tue, 29 Sep 2009 02:59:10 +0000 (12:59 +1000)]
Merge commit 'origin/master'

14 years agochange the reclock fail count to 19 monitor intervals before we shut down ctdbd
Ronnie Sahlberg [Mon, 28 Sep 2009 04:12:59 +0000 (14:12 +1000)]
change the reclock fail count to 19 monitor intervals before we shut down ctdbd

14 years ago add a new eventscript 01.reclock
Ronnie Sahlberg [Mon, 28 Sep 2009 04:06:40 +0000 (14:06 +1000)]
add a new eventscript 01.reclock

    if the reclock file has been set, then this script will test that the
    reclock file can actually be accessed.
    if the file does not exist, or if the attempt to stat the file hangs,
    the node will be marked unhealthy after the third failed monitoring event
    and after the tenth failure, ctdb itself will shut down.
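
The escalation described above can be sketched as follows. The function names and structure are assumptions rather than the contents of 01.reclock, the thresholds are passed in as parameters, and detecting a hung stat is left out.

```shell
#!/bin/sh
# Hypothetical sketch of the escalation policy, not the real eventscript.

# Succeeds when the reclock file exists. Detecting a hung stat (for
# example with a timeout wrapper) is deliberately omitted.
reclock_accessible () {
    [ -f "$1" ]
}

# Map a count of failed monitor events to an action.
# $1 = failures so far, $2 = unhealthy threshold, $3 = shutdown threshold
reclock_action () {
    if [ "$1" -ge "$3" ]; then
        echo shutdown
    elif [ "$1" -ge "$2" ]; then
        echo unhealthy
    else
        echo ok
    fi
}

# Three monitor events against a path that does not exist.
failures=0
for _event in 1 2 3; do
    reclock_accessible "/nonexistent/reclock.$$" || failures=$(($failures + 1))
done
echo "after $failures failed events: $(reclock_action "$failures" 3 10)"
echo "after 10 failed events: $(reclock_action 10 3 10)"
```

Keeping the thresholds as arguments means a change to the shutdown count only touches the caller.
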

14 years agoadd machine-readable output for the ctdb getreclock command
Ronnie Sahlberg [Mon, 28 Sep 2009 03:39:54 +0000 (13:39 +1000)]
add machine-readable output for the ctdb getreclock command

14 years agoTest suite: Print debug info on node status timeouts.
Martin Schwenke [Fri, 25 Sep 2009 08:00:17 +0000 (18:00 +1000)]
Test suite: Print debug info on node status timeouts.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoMerge commit 'obnox/master-rebase'
Ronnie Sahlberg [Fri, 25 Sep 2009 07:34:59 +0000 (17:34 +1000)]
Merge commit 'obnox/master-rebase'

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Fri, 25 Sep 2009 03:18:18 +0000 (13:18 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agowith the new banning logic with one struct for each node we no longer "forget" the...
Ronnie Sahlberg [Fri, 25 Sep 2009 03:14:53 +0000 (13:14 +1000)]
with the new banning logic with one struct for each node we no longer "forget" the other culprits as often as we used to do, which means that things like "ctdb recover" can now actually lead to a node becoming banned if we perform too many recoveries too frequently.

change this to provide absolution to all nodes once they have participated in a recovery session.

14 years agoRevert "dont check if commit failed, we do allow the commit to fail sometimes"
Michael Adam [Thu, 10 Sep 2009 14:21:01 +0000 (16:21 +0200)]
Revert "dont check if commit failed, we do allow the commit to fail sometimes"

This reverts commit affa6f47432507e84b7e76b88a2c27fff8e6e2e4.

Transaction commit should not be allowed to fail.
This is a fatal error.

Michael

14 years agoRevert "allow the transaction commit to fail"
Michael Adam [Thu, 10 Sep 2009 14:20:26 +0000 (16:20 +0200)]
Revert "allow the transaction commit to fail"

This reverts commit 7a6134e684c9ac4763bf198ef1410867b6082c94.

Transaction commit should not be allowed to fail.
This is a fatal error.

Michael

14 years agoctdb_client: fix race in starting concurrent transactions on a single node
Michael Adam [Tue, 4 Aug 2009 07:45:50 +0000 (09:45 +0200)]
ctdb_client: fix race in starting concurrent transactions on a single node

There are two races in concurrent transactions on a single node.
One in starting a transaction, and one with committing (replaying).

This commit closes the first race by storing the pid in the
transaction-lock record and comparing the own pid against it
as a measure to prevent starting a second transaction when
a second node has come in between and changed the pid in the lock
record.

Michael

14 years agoMerge commit 'martins/master'
Ronnie Sahlberg [Fri, 18 Sep 2009 04:23:37 +0000 (14:23 +1000)]
Merge commit 'martins/master'

14 years agodont mark the recovery daemon as a ban culprit just because a node in the cluster...
Ronnie Sahlberg [Fri, 18 Sep 2009 02:58:30 +0000 (12:58 +1000)]
dont mark the recovery daemon as a ban culprit just because a node in the cluster was set to recovery mode == ACTIVE.

This happens normally when someone explicitly triggers a recovery using "ctdb recover"

14 years agotry restarting statd indefinitely not just once
Ronnie Sahlberg [Tue, 15 Sep 2009 09:33:53 +0000 (19:33 +1000)]
try restarting statd indefinitely, not just once

14 years agoRevert "try to restart statd every time it fails, not just the first time"
Ronnie Sahlberg [Tue, 15 Sep 2009 09:33:35 +0000 (19:33 +1000)]
Revert "try to restart statd every time it fails, not just the first time"

This reverts commit 4f7b39a4871af28df1c4545ec37db179fa47a7da.

14 years agotry to restart statd every time it fails, not just the first time
Ronnie Sahlberg [Tue, 15 Sep 2009 03:35:58 +0000 (13:35 +1000)]
try to restart statd every time it fails, not just the first time

14 years agoMerge commit 'obnox/master-rebase'
Ronnie Sahlberg [Mon, 14 Sep 2009 22:05:33 +0000 (08:05 +1000)]
Merge commit 'obnox/master-rebase'

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git ctdb-1.0.90
Ronnie Sahlberg [Fri, 11 Sep 2009 21:05:21 +0000 (07:05 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years ago new version 1.0.90
Ronnie Sahlberg [Fri, 11 Sep 2009 21:30:18 +0000 (07:30 +1000)]
 new version  1.0.90

14 years agoTest suite: Update "complex" tests for wait_until_node_has_status() change.
Martin Schwenke [Fri, 11 Sep 2009 06:15:31 +0000 (16:15 +1000)]
Test suite: Update "complex" tests for wait_until_node_has_status() change.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoTest suite: wait_until_node_has_status() now uses "onnode any".
Martin Schwenke [Fri, 11 Sep 2009 05:55:53 +0000 (15:55 +1000)]
Test suite: wait_until_node_has_status() now uses "onnode any".

Many tests currently do this sort of thing:

  onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status 1 disconnected

In fact, they all use exactly the same "onnode 0 $CTDB_TEST_WRAPPER"
idiom.  This is both repetitious and dangerous, since node 0 might be
shut down during a test.  Instead, we push "onnode any
$CTDB_TEST_WRAPPER" (which selects a connected node) into
wait_until_node_has_status() and just call that function directly in
tests, like this:

  wait_until_node_has_status 1 disconnected

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoTest suite: Rework the cluster (re)start code.
Martin Schwenke [Fri, 11 Sep 2009 04:06:12 +0000 (14:06 +1000)]
Test suite: Rework the cluster (re)start code.

Make it possible to start on only 1 node - for tests that need to
restart a particular node.

_ctdb_hack_options() attempts to see what options are being passed to
a daemon that is being run via the initscript.  It then sets a
corresponding environment variable that the initscript knows about.
Currently only the --start-as-stopped option is supported.  This is
extremely ugly but it seems like the only way...  :-(

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoIntroduce sysconfig variable CTDB_SYSLOG=yes/no (default "no").
Michael Adam [Fri, 28 Aug 2009 11:01:27 +0000 (13:01 +0200)]
Introduce sysconfig variable CTDB_SYSLOG=yes/no (default "no").

This allows for controlling start of ctdbd with or without the option "--syslog"
from the sysconfig/ctdb file.

Michael
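
A minimal sketch of how an initscript might turn this setting into a daemon option is shown below. Only `CTDB_SYSLOG` and `--syslog` come from the commit message; the function name is hypothetical and the real initscript may be structured differently.

```shell
#!/bin/sh
# Hypothetical sketch; ctdbd_log_option is not a real ctdb function.

# Print "--syslog" when CTDB_SYSLOG is "yes"; print nothing otherwise.
# An unset variable falls back to the documented default of "no".
ctdbd_log_option () {
    case "${CTDB_SYSLOG:-no}" in
        yes) echo "--syslog" ;;
        *)   echo "" ;;
    esac
}

CTDB_SYSLOG=yes
with_syslog=$(ctdbd_log_option)
unset CTDB_SYSLOG
without_syslog=$(ctdbd_log_option)
echo "yes -> '$with_syslog', unset -> '$without_syslog'"
```
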

14 years agoctdb_logging: fix a comment typo.
Michael Adam [Fri, 28 Aug 2009 10:45:43 +0000 (12:45 +0200)]
ctdb_logging: fix a comment typo.

Michael

14 years agoRename the CTDB_INIT_STYLE "ubuntu" to "debian" - this is where it comes from.
Michael Adam [Thu, 27 Aug 2009 23:04:47 +0000 (01:04 +0200)]
Rename the CTDB_INIT_STYLE "ubuntu" to "debian" - this is where it comes from.

Michael

14 years agoUpdate outdated autotools helper files.
Mathieu Parent [Thu, 27 Aug 2009 22:58:52 +0000 (00:58 +0200)]
Update outdated autotools helper files.

This fixes https://bugzilla.samba.org/show_bug.cgi?id=6370
and http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=536256

Signed-off-by: Michael Adam <obnox@samba.org>
14 years agoFix bashism in nfstickle event script.
Mathieu Parent [Thu, 27 Aug 2009 21:44:39 +0000 (23:44 +0200)]
Fix bashism in nfstickle event script.

Signed-off-by: Michael Adam <obnox@samba.org>
14 years agoFix bashisms in samba event script.
Mathieu Parent [Thu, 27 Aug 2009 21:36:07 +0000 (23:36 +0200)]
Fix bashisms in samba event script.

Signed-off-by: Michael Adam <obnox@samba.org>
14 years agoFix bashisms in multipathd event script.
Mathieu Parent [Thu, 27 Aug 2009 21:35:41 +0000 (23:35 +0200)]
Fix bashisms in multipathd event script.

Signed-off-by: Michael Adam <obnox@samba.org>
14 years agoFix bashism in natgw eventscript.
Mathieu Parent [Thu, 27 Aug 2009 21:35:03 +0000 (23:35 +0200)]
Fix bashism in natgw eventscript.

Signed-off-by: Michael Adam <obnox@samba.org>
14 years agoallow the transaction commit to fail
Ronnie Sahlberg [Wed, 9 Sep 2009 02:50:55 +0000 (12:50 +1000)]
allow the transaction commit to fail

14 years agoMerge commit 'martins/master'
Ronnie Sahlberg [Wed, 9 Sep 2009 02:50:21 +0000 (12:50 +1000)]
Merge commit 'martins/master'

14 years agoMerge commit 'origin/master'
Martin Schwenke [Wed, 9 Sep 2009 02:48:40 +0000 (12:48 +1000)]
Merge commit 'origin/master'

14 years agodont check if commit failed, we do allow the commit to fail sometimes
Ronnie Sahlberg [Wed, 9 Sep 2009 02:48:21 +0000 (12:48 +1000)]
dont check if commit failed, we do allow the commit to fail sometimes

14 years agodont force an election just because the ban flag differs across the cluster.
Ronnie Sahlberg [Wed, 9 Sep 2009 00:57:39 +0000 (10:57 +1000)]
dont force an election just because the ban flag differs across the cluster.
a simple push to resync this flag is sufficient

14 years agoDocument onnode "onnode any".
Martin Schwenke [Tue, 8 Sep 2009 05:19:24 +0000 (15:19 +1000)]
Document onnode "onnode any".

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoonnode: add "any" nodespec to select any node with running CTDB.
Martin Schwenke [Tue, 8 Sep 2009 05:10:20 +0000 (15:10 +1000)]
onnode: add "any" nodespec to select any node with running CTDB.

In testing and other situations (e.g. eventscripts) it is necessary to
select a node where a ctdb command can be run.  The whole idea here is
to avoid nodes where ctdbd is not running and where most ctdb commands
would fail.  This implements a standard way of doing this involving a
recursive onnode command.

There is still a small window for a race, where the selected node is
suddenly shut down, but this is unavoidable.

Signed-off-by: Martin Schwenke <martin@meltin.net>
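
The idea of selecting a node where ctdbd is running can be sketched as a simple probe loop. This is an assumption-laden illustration: the commit describes a recursive onnode command, whereas the sketch below uses a plain loop, and `probe_node`, `pick_any_node` and `RUNNING_NODES` are hypothetical stand-ins.

```shell
#!/bin/sh
# Hypothetical sketch; not how onnode implements the "any" nodespec.

# Stand-in probe: a node counts as running if it is listed in
# RUNNING_NODES. A real probe would run a ctdb command on the node.
probe_node () {
    case " $RUNNING_NODES " in
        *" $1 "*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print the first candidate node that answers the probe.
pick_any_node () {
    for _node in "$@"; do
        if probe_node "$_node"; then
            echo "$_node"
            return 0
        fi
    done
    return 1
}

RUNNING_NODES="1 2"          # pretend ctdbd is down on node 0
picked=$(pick_any_node 0 1 2)
echo "selected node: $picked"
```

As the commit message notes, the selected node can still go away between the probe and the command that follows.
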
14 years agoMerge commit 'origin/master'
Martin Schwenke [Mon, 7 Sep 2009 05:29:34 +0000 (15:29 +1000)]
Merge commit 'origin/master'

14 years agolower the loglevel for the info messages that a public ip is not hosted locally for...
Ronnie Sahlberg [Thu, 3 Sep 2009 18:09:30 +0000 (04:09 +1000)]
lower the loglevel for the info messages that a public ip is not hosted locally for takeip/releaseip

14 years ago new version 1.0.89 1.0.89 ctdb-1.0.89
Ronnie Sahlberg [Thu, 3 Sep 2009 17:05:37 +0000 (03:05 +1000)]
 new version 1.0.89

14 years agomake it possible to have ctdb manage (start/stop/monitor) winbind without having...
Ronnie Sahlberg [Thu, 3 Sep 2009 16:59:24 +0000 (02:59 +1000)]
make it possible to have ctdb manage (start/stop/monitor) winbind without having samba

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Thu, 3 Sep 2009 16:00:14 +0000 (02:00 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agonew prototype banning code
Ronnie Sahlberg [Thu, 3 Sep 2009 16:20:39 +0000 (02:20 +1000)]
new prototype banning code

14 years agooverwrite the state file, dont append to it.
Ronnie Sahlberg [Tue, 1 Sep 2009 18:39:17 +0000 (04:39 +1000)]
overwrite the state file, dont append to it.
dont log errors if trying to delete a nonexisting state file

this eliminates some annoying log entries in the ctdb log
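
The two behaviours can be illustrated in shell. This is a sketch under assumptions: the file name is hypothetical, and the commit does not say which commands the script actually uses.

```shell
#!/bin/sh
# Illustration only; the state file path is hypothetical.
state_file="${TMPDIR:-/tmp}/demo-state.$$"

# ">" truncates the file on each write, so only the latest state
# remains; ">>" would keep appending lines.
echo "first" > "$state_file"
echo "second" > "$state_file"
state_contents=$(cat "$state_file")

# "rm -f" stays silent and succeeds even when the file is already gone.
rm -f "$state_file"
rm -f "$state_file" && rm_status=0 || rm_status=$?

echo "contents='$state_contents' second rm status=$rm_status"
```
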

14 years agoredirect stderr to dev null since the rule might not exist when we try to uncondition...
Ronnie Sahlberg [Tue, 1 Sep 2009 17:12:27 +0000 (03:12 +1000)]
redirect stderr to dev null since the rule might not exist when we try to unconditionally delete it

14 years agoset broadcast addresses in the takeip event.
Michael Adam [Thu, 27 Aug 2009 20:09:42 +0000 (22:09 +0200)]
set broadcast addresses in the takeip event.

Michael

14 years agoremove a check for the reclock file we dont need
Ronnie Sahlberg [Thu, 27 Aug 2009 19:19:44 +0000 (05:19 +1000)]
remove a check for the reclock file we dont need

14 years agoTest suite: fix minor typo in complex/32_cifs_tickle.sh
Martin Schwenke [Thu, 27 Aug 2009 02:35:52 +0000 (12:35 +1000)]
Test suite: fix minor typo in complex/32_cifs_tickle.sh

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoMerge commit 'origin/master'
Martin Schwenke [Thu, 27 Aug 2009 02:33:43 +0000 (12:33 +1000)]
Merge commit 'origin/master'

14 years agoTest suite: Fix debug code for unexpectedly unhealthy cluster.
Martin Schwenke [Thu, 16 Jul 2009 04:04:06 +0000 (14:04 +1000)]
Test suite: Fix debug code for unexpectedly unhealthy cluster.

The debug code should run "ctdb status" on a cluster node, not on the
test client.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agoskip any persistent databases ending in .bak
Ronnie Sahlberg [Tue, 18 Aug 2009 22:25:50 +0000 (08:25 +1000)]
skip any persistent databases ending in .bak

14 years agoMerge commit 'origin/master'
Martin Schwenke [Mon, 17 Aug 2009 03:08:42 +0000 (13:08 +1000)]
Merge commit 'origin/master'

14 years agonew version 1.0.88 ctdb-1.0.88
Ronnie Sahlberg [Mon, 17 Aug 2009 01:04:40 +0000 (11:04 +1000)]
new version 1.0.88

14 years agoreduce the loglevel for the message that we switch to a different recmaster while...
Ronnie Sahlberg [Mon, 17 Aug 2009 00:56:12 +0000 (10:56 +1000)]
reduce the loglevel for the message that we switch to a different recmaster while waiting for ipreallocate to finish

14 years agoif no timeout at all is specified to the ctdb tool, neither using -T nor by setting...
Ronnie Sahlberg [Mon, 17 Aug 2009 00:54:45 +0000 (10:54 +1000)]
if no timeout at all is specified to the ctdb tool, neither using -T nor by setting CTDB_TIMEOUT, then use 120 seconds as a default timeout before the ctdb command will exit with an error.
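
The fallback can be sketched as a small function. The ctdb tool itself is written in C, so this only illustrates the selection logic; the function name is hypothetical, and letting -T take precedence over the CTDB_TIMEOUT environment variable is an assumption the commit message does not confirm.

```shell
#!/bin/sh
# Hypothetical sketch of the timeout selection; not the ctdb tool's code.

# $1 = value given with -T, or empty when -T was not used
effective_timeout () {
    if [ -n "$1" ]; then
        echo "$1"                 # assumed: -T wins over the environment
    elif [ -n "${CTDB_TIMEOUT:-}" ]; then
        echo "$CTDB_TIMEOUT"
    else
        echo 120                  # default from the commit message
    fi
}

unset CTDB_TIMEOUT
t_default=$(effective_timeout "")
CTDB_TIMEOUT=30
t_env=$(effective_timeout "")
t_cli=$(effective_timeout 5)
echo "default=$t_default env=$t_env cli=$t_cli"
```
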

14 years agoTest suite: ctdb_persistent.c needs to use transactions.
Martin Schwenke [Fri, 14 Aug 2009 10:47:38 +0000 (20:47 +1000)]
Test suite: ctdb_persistent.c needs to use transactions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
14 years agodocument enable/disablescript
Ronnie Sahlberg [Thu, 13 Aug 2009 03:02:00 +0000 (13:02 +1000)]
document enable/disablescript

14 years agoadd new controls to make it possible to enable/disable individual eventscripts
Ronnie Sahlberg [Thu, 13 Aug 2009 03:04:08 +0000 (13:04 +1000)]
add new controls to make it possible to enable/disable individual eventscripts

update scriptstatus output so it lists disabled scripts

14 years agoMerge commit 'origin/master'
Martin Schwenke [Tue, 11 Aug 2009 22:48:03 +0000 (08:48 +1000)]
Merge commit 'origin/master'

14 years agoMerge root@10.1.1.27:/shared/ctdb/ctdb-git
Ronnie Sahlberg [Sun, 9 Aug 2009 21:33:52 +0000 (07:33 +1000)]
Merge root@10.1.1.27:/shared/ctdb/ctdb-git

14 years agotests: fix the 52_ctdb_fetch.sh test.
Michael Adam [Thu, 30 Jul 2009 10:02:27 +0000 (12:02 +0200)]
tests: fix the 52_ctdb_fetch.sh test.

The parser for the output of the ctdb_fetch program
did not match the output that ctdb_fetch generates.
It seemed to rather come from the ctdb_bench test...

This patch adapts the parser to correctly interpret
the output of ctdb_fetch.

Michael

14 years agoclient: fix a debug message (misplaced newline).
Michael Adam [Sat, 11 Jul 2009 22:39:29 +0000 (00:39 +0200)]
client: fix a debug message (misplaced newline).

Michael

14 years agoclient:ctdb_control_send: remove duplicate setting of the reqid header.
Michael Adam [Wed, 15 Jul 2009 08:03:03 +0000 (10:03 +0200)]
client:ctdb_control_send: remove duplicate setting of the reqid header.

Michael

14 years agoctdbd: use ctdb_syslog_log() as debug_add function for syslog
Michael Adam [Tue, 21 Jul 2009 07:50:56 +0000 (09:50 +0200)]
ctdbd: use ctdb_syslog_log() as debug_add function for syslog

Michael

14 years agoctdbd: set debug_add hook to be able to use dump_data in the daemon.
Michael Adam [Tue, 21 Jul 2009 07:48:10 +0000 (09:48 +0200)]
ctdbd: set debug_add hook to be able to use dump_data in the daemon.

Michael

14 years agodebug: add debug_add and dump_data functions
Michael Adam [Tue, 21 Jul 2009 07:47:07 +0000 (09:47 +0200)]
debug: add debug_add and dump_data functions

Michael

14 years agotdb: don't alter tdb->flags in tdb_reopen_all()
Rusty Russell [Thu, 30 Jul 2009 02:22:39 +0000 (11:52 +0930)]
tdb: don't alter tdb->flags in tdb_reopen_all()

The flags are user-visible, via tdb_get_flags/add_flags/remove_flags.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
14 years agotdb: Reimplementation of Metze's "lib/tdb: if we know pwrite and pread are thread...
Rusty Russell [Thu, 30 Jul 2009 02:22:08 +0000 (11:52 +0930)]
tdb: Reimplementation of Metze's "lib/tdb: if we know pwrite and pread are thread/fork safe tdb_reopen_all() should be a noop".

This version just wraps the reopen code, so we still re-grab the lock and do
the normal sanity checks.

The reason we do this at all is to avoid global fd limits, see:
http://forums.fedoraforum.org/showthread.php?t=210393

Note also that this whole reopen concept is fundamentally racy: if the parent
goes away before the child calls tdb_reopen_all, the database can be left
without an active lock and another TDB_CLEAR_IF_FIRST opener will clear it.
A fork_with_tdbs() wrapper could use a pipe to solve this, but it's hardly
elegant (what if there are other independent things which have similar needs?).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
14 years agorealloc() has that horrible overloaded free semantic when size is 0: current code...
Rusty Russell [Thu, 30 Jul 2009 20:10:33 +0000 (13:10 -0700)]
realloc() has that horrible overloaded free semantic when size is 0: current code does a free of the old record in this case, then fails.

14 years agoIf the record is at the end of the database, pretending it has length 1 might take...
Rusty Russell [Thu, 30 Jul 2009 20:09:33 +0000 (13:09 -0700)]
If the record is at the end of the database, pretending it has length 1 might take us out-of-bounds. Only pretend to be length 1 for the malloc.

14 years agoPort from SAMBA tdb: commit 54a51839ea65aa788b18fce8de0ae4f9ba63e4e7 Author: Rusty...
Rusty Russell [Wed, 29 Jul 2009 05:23:03 +0000 (14:53 +0930)]
Port from SAMBA tdb: commit 54a51839ea65aa788b18fce8de0ae4f9ba63e4e7 Author: Rusty Russell <rusty@rustcorp.com.au> Date: Sat Jul 18 15:28:58 2009 +0930

Make tdb transaction lock recursive (samba version)

    This patch replaces 6ed27edbcd3ba1893636a8072c8d7a621437daf7 and
    1a416ff13ca7786f2e8d24c66addf00883e9cb12, which fixed the bug where traversals
    inside transactions would release the transaction lock early.

    This solution is more general, and solves the more minor symptom that nested
    traversals would also release the transaction lock early.  (It was also suggested in
    Volker's comment in 6ed27ed).

    This patch also applies to ctdb, if the traverse.c part is removed (ctdb's tdb
    code never received the previous two fixes).

    Tested using the testsuite from ccan (adapted to the samba code).  Thanks to
    Michael Adam for feedback.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael Adam <obnox@samba.org>
commit 760104188d0d2ed96ec4a70138e6d0bf86d797ed
Author: Rusty Russell <rusty@rustcorp.com.au>
Date:   Tue Jul 21 16:23:35 2009 +0930

    tdb: fix locking error

    54a51839ea65aa788b18fce8de0ae4f9ba63e4e7 "Make tdb transaction lock
    recursive (samba version)" was broken: I "cleaned it up" and prevented
    it from ever unlocking.

    To see the problem:
        $ bin/tdbtorture -s 1248142523
        tdb_brlock failed (fd=3) at offset 8 rw_type=1 lck_type=14 len=1
        tdb_transaction_lock: failed to get transaction lock
        tdb_transaction_start failed: Resource deadlock avoided

    My testcase relied on the *count* being correct, which it was.  Fixing that
    now.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Michael Adam <obnox@samba.org>
14 years agoPort from SAMBA tdb: commit a6cc04a20089e8fbcce138c271961c37ddcd6c34 Author: Andrew...
Rusty Russell [Wed, 29 Jul 2009 05:21:34 +0000 (14:51 +0930)]
Port from SAMBA tdb: commit a6cc04a20089e8fbcce138c271961c37ddcd6c34 Author: Andrew Tridgell <tridge@samba.org> Date: Mon Jun 1 13:13:07 2009 +1000

overallocate all records by 25%

    This greatly reduces the fragmentation of databases where records
    tend to grow slowly by a small amount each time. The case where this
    is most seen is the ldb index records. Adding this overallocation
    reduced the size of the resulting database by more than 20x when
    running a test that adds 10k users.
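
The arithmetic behind a 25% overallocation can be shown in a few lines. This is only an illustration of the sizing rule: tdb is written in C, and any rounding or alignment it applies is not represented here. The function name is hypothetical.

```shell
#!/bin/sh
# Arithmetic illustration only; not tdb's allocation code.

# $1 = requested record length in bytes
overallocated_size () {
    echo $(($1 + $1 / 4))     # integer arithmetic: length plus a quarter
}

size_100=$(overallocated_size 100)
size_1000=$(overallocated_size 1000)
echo "100 -> $size_100, 1000 -> $size_1000"
```

A record that grows by a few bytes at a time can then be rewritten in place several times before it needs to move.
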

14 years agoPort from SAMBA tdb: commit a386173fa1c7c5bcc11ea9260d84b6c52c154b3d Author: Andrew...
Rusty Russell [Wed, 29 Jul 2009 05:21:12 +0000 (14:51 +0930)]
Port from SAMBA tdb: commit a386173fa1c7c5bcc11ea9260d84b6c52c154b3d Author: Andrew Tridgell <tridge@samba.org> Date: Mon Jun 1 13:11:39 2009 +1000

auto-repack in transactions that expand the tdb

    The idea behind this is to recover from badly fragmented free
    lists. Choosing the point where the file expands is fairly arbitrary,
    but seems to work well.

14 years agoPort from SAMBA ctdb: commit 936d76802f98d04d9743b2ca8eeeaadd4362db51 Author: Andrew...
Rusty Russell [Wed, 29 Jul 2009 06:32:51 +0000 (16:02 +0930)]
Port from SAMBA ctdb: commit 936d76802f98d04d9743b2ca8eeeaadd4362db51 Author: Andrew Tridgell <tridge@samba.org> Date: Tue Dec 16 14:38:17 2008 +1100

imported the tdb_repack() code from CTDB

    The tdb_repack() function repacks a TDB so that it has a single
    freelist entry. The file doesn't shrink, but it does remove all
    freelist fragmentation. This code originated in the CTDB vacuuming
    code, but will now be used in ldb to cope with fragmentation from
    re-indexing

14 years agoPort from SAMBA tdb: commit 4b4fec65db4e202afa13b2d15867f4d8a54d154e Author: Andrew...
Rusty Russell [Wed, 29 Jul 2009 05:20:39 +0000 (14:50 +0930)]
Port from SAMBA tdb: commit 4b4fec65db4e202afa13b2d15867f4d8a54d154e Author: Andrew Tridgell <tridge@samba.org> Date: Thu May 28 16:08:28 2009 +1000

make TDB_NOSYNC affect all the fsync/msync calls in transactions

    During a transaction commit tdb normally uses fsync/msync calls to
    make it crash safe. This can be disabled using the TDB_NOSYNC flag,
    but it wasn't disabling all the code paths that caused a fsync/msync.

14 years agoPort from SAMBA tdb: commit a91bcbccf8a2243dac57cacec6fdfc9907580f69 Author: Jim...
Rusty Russell [Wed, 29 Jul 2009 05:19:57 +0000 (14:49 +0930)]
Port from SAMBA tdb: commit a91bcbccf8a2243dac57cacec6fdfc9907580f69 Author: Jim McDonough <jmcd@samba.org> Date: Thu May 21 16:26:26 2009 -0400

Detect tight loop in tdb_find()