ctdb.git
10 years agoNew version 1.2.67 ctdb-1.2.67
Amitay Isaacs [Wed, 14 Aug 2013 06:23:27 +0000 (16:23 +1000)]
New version 1.2.67

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoclient: Change timeout to 10 seconds for the call to ctdb_ctrl_getpnn()
Martin Schwenke [Wed, 14 Aug 2013 09:17:46 +0000 (19:17 +1000)]
client: Change timeout to 10 seconds for the call to ctdb_ctrl_getpnn()

A more flexible solution would be to backport the patch to add a
timeout argument to ctdb_cmdline_client() but that breaks too many
things for this branch.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Increase default control timeout to 10 seconds
Martin Schwenke [Fri, 9 Aug 2013 01:56:29 +0000 (11:56 +1000)]
tools/ctdb: Increase default control timeout to 10 seconds

The current 3 second timeout is arbitrary and users trip over it
sometimes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit b49c4f39666d5b1596213bf41bcdc47ed3c327ae)

10 years agorecoverd: Use TDB_INCOMPATIBLE_HASH when creating volatile databases
Amitay Isaacs [Tue, 13 Aug 2013 04:02:46 +0000 (14:02 +1000)]
recoverd: Use TDB_INCOMPATIBLE_HASH when creating volatile databases

When creating missing databases either locally or remotely, the recovery
master calls ctdb_ctrl_createdb().  The recovery master always passes 0
for tdb_flags.  For volatile databases, if TDB_INCOMPATIBLE_HASH is not
specified, then they will be attached without using the Jenkins hash,
causing database corruption.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 2fc6b6403707a292d134140fc0b9145b454992c5)
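
The flag handling described in the commit above can be sketched as follows. This is a minimal illustration, not the actual patch: demo_createdb_flags() and DEMO_TDB_INCOMPATIBLE_HASH are hypothetical names, and the flag value is a placeholder rather than the definition from tdb.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder for TDB_INCOMPATIBLE_HASH. The real constant lives in
 * tdb.h; the value used here is only for illustration. */
#define DEMO_TDB_INCOMPATIBLE_HASH 0x800u

/*
 * Hypothetical helper: pick the tdb_flags to use when the recovery
 * master creates a missing database.  Volatile (non-persistent)
 * databases always get the incompatible-hash flag, even when the
 * caller passed 0.
 */
uint32_t demo_createdb_flags(bool persistent, uint32_t tdb_flags)
{
    if (!persistent) {
        tdb_flags |= DEMO_TDB_INCOMPATIBLE_HASH;
    }
    return tdb_flags;
}
```

With this shape, a caller that passes 0 for a volatile database still ends up with the hash flag set, which is the situation the commit message describes.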

10 years agoctdbd: Print tdb flags when logging attached to database message
Amitay Isaacs [Wed, 10 Jul 2013 02:23:30 +0000 (12:23 +1000)]
ctdbd: Print tdb flags when logging attached to database message

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 846109169ee5e3d03135156e45c8dac93aa2e95b)

10 years agotools/ctdb: Make ban/unban more resilient to timeouts
Martin Schwenke [Wed, 14 Aug 2013 05:40:27 +0000 (15:40 +1000)]
tools/ctdb: Make ban/unban more resilient to timeouts

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Move NFS reconfigure to "ipreallocated" event
Martin Schwenke [Thu, 8 Aug 2013 04:37:03 +0000 (14:37 +1000)]
eventscripts: Move NFS reconfigure to "ipreallocated" event

Doing this in the "monitor" event is unsafe because it causes the node
health status to flip-flop.  At the moment when a node goes unhealthy
it is failed out, IPs are released and the monitor event handles the
reconfigure, returning 0 even though the service failure is
unresolved.

This change was made in the master branch a long time ago.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Change the nfsd RPC check failure policy
Martin Schwenke [Tue, 6 Aug 2013 06:46:21 +0000 (16:46 +1000)]
eventscripts: Change the nfsd RPC check failure policy

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: New function ctdb_check_counter()
Martin Schwenke [Tue, 6 Aug 2013 06:46:01 +0000 (16:46 +1000)]
eventscripts: New function ctdb_check_counter()

This provides much more flexible counter handling.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Add optional counter name argument to some counter functions
Martin Schwenke [Tue, 6 Aug 2013 06:44:50 +0000 (16:44 +1000)]
eventscripts: Add optional counter name argument to some counter functions

This helps some calling code look less like line noise.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Banned nodes should not be told to run "ipreallocated" event
Martin Schwenke [Fri, 2 Aug 2013 06:29:32 +0000 (16:29 +1000)]
recoverd: Banned nodes should not be told to run "ipreallocated" event

They will reject it because they are in recovery.  This can result in
extra banning credits being applied to banned nodes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Call takeover fail callback only once per node
Martin Schwenke [Mon, 22 Jul 2013 06:39:46 +0000 (16:39 +1000)]
recoverd: Call takeover fail callback only once per node

Currently the fail callback is called once per (takeip/releaseip) control
failure.  This is overkill and can get a node banned much too quickly.

Instead, keep track of control failures per node and only call fail
callback once per failed node.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit bf4a7c1ad87e0e848296d15d63eb8cd901ca5335)

Conflicts:
server/ctdb_takeover.c
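
The per-node failure tracking described in the commit above might look roughly like this. It is a sketch under assumptions: struct control_result, fail_callback_fn and report_failed_nodes() are hypothetical names and do not come from server/ctdb_takeover.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one takeip/releaseip control result. */
struct control_result {
    unsigned int pnn;   /* node the control was sent to */
    int status;         /* 0 on success, non-zero on failure */
};

/* Hypothetical callback type, invoked for a node that failed. */
typedef void (*fail_callback_fn)(unsigned int pnn, void *private_data);

/*
 * Mark every node that had at least one failed control, then invoke
 * fail_cb (if non-NULL) exactly once per marked node.  node_failed
 * must have room for num_nodes entries.  Returns the number of
 * distinct failed nodes.
 */
size_t report_failed_nodes(const struct control_result *results,
                           size_t num_results,
                           bool *node_failed, size_t num_nodes,
                           fail_callback_fn fail_cb, void *private_data)
{
    size_t i, failed = 0;

    assert(results != NULL || num_results == 0);
    assert(node_failed != NULL || num_nodes == 0);

    for (i = 0; i < num_nodes; i++) {
        node_failed[i] = false;
    }

    /* First pass: remember which nodes saw any failure. */
    for (i = 0; i < num_results; i++) {
        if (results[i].status != 0 && results[i].pnn < num_nodes) {
            node_failed[results[i].pnn] = true;
        }
    }

    /* Second pass: a single callback per failed node. */
    for (i = 0; i < num_nodes; i++) {
        if (node_failed[i]) {
            failed++;
            if (fail_cb != NULL) {
                fail_cb((unsigned int)i, private_data);
            }
        }
    }
    return failed;
}
```

Three failed controls against the same node then cost that node one callback rather than three, which is the behaviour the commit message asks for.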

10 years agorecoverd: Log node that causes takeover run to fail
Martin Schwenke [Fri, 31 May 2013 04:55:07 +0000 (14:55 +1000)]
recoverd: Log node that causes takeover run to fail

Extend takeover_fail_callback() to just log (and not do any ban
processing) when the callback data is NULL.  Always call
ctdb_takeover_run() with the callback so that useful errors are always
logged.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit c429394afbabaee09f9216dc743419adddf523ea)

10 years agoclient: Exit with non-zero status when unix socket is closed
Amitay Isaacs [Mon, 24 Jun 2013 07:37:15 +0000 (17:37 +1000)]
client: Exit with non-zero status when unix socket is closed

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 733fc909425860f6a02c205c2d8f34a731853922)

10 years agoNew version 1.2.66 ctdb-1.2.66
Martin Schwenke [Thu, 18 Jul 2013 03:33:04 +0000 (13:33 +1000)]
New version 1.2.66

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: A missing interface should cause monitoring to fail
Martin Schwenke [Tue, 16 Jul 2013 09:31:05 +0000 (19:31 +1000)]
eventscripts: A missing interface should cause monitoring to fail

A missing interface is at least as bad as an interface with a link
that is down so should have a similar effect.

This couldn't be done previously because orphaned interfaces used to
be listed for monitoring.  This was worked around in 10.interface in
commit a5b8e2c1ec1b4fd7ef25e70a919ef4c70f3e1c75.

If $CTDB_PARTIALLY_ONLINE_INTERFACES="yes" then monitoring won't
actually fail but the interface is still marked as down.

This effectively reverts d40330453854d81d182112b49f5f6f2e0814b231 and
89547a1910fd74f98ae9d5737914328eb5cc3eaf.  However, it heeds the
warning in the commit message for the latter by avoiding an early exit.
It just flags a failure and marks the interfaces as down in ctdbd.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoNew version 1.2.65 ctdb-1.2.65
Amitay Isaacs [Tue, 2 Jul 2013 07:19:05 +0000 (17:19 +1000)]
New version 1.2.65

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Don't ban self if init or shutdown event fails
Amitay Isaacs [Tue, 2 Jul 2013 02:40:37 +0000 (12:40 +1000)]
ctdbd: Don't ban self if init or shutdown event fails

There is no point in banning the node if init or shutdown event times
out since it's going to quit anyway.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit ef1c4e99ca66e7a990bc557f34abb624c315e6ba)

10 years agodoc: The second half of monitoring is only for recovery master
Amitay Isaacs [Thu, 27 Jun 2013 07:46:43 +0000 (17:46 +1000)]
doc: The second half of monitoring is only for recovery master

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit fcd5e1f04c5fe6c98399429b8f0918b8779acba6)

10 years agorecoverd: when the recmaster is banned, use that information when forcing an election
Michael Adam [Wed, 26 Jun 2013 07:23:22 +0000 (09:23 +0200)]
recoverd: when the recmaster is banned, use that information when forcing an election

When we trigger an election because the recmaster considers itself inactive,
update our local nodemap with the recmaster's flags before calling
force_election(). This way, we don't send the inactive node commands
(e.g. freeze) that may fail and then lead to ourselves getting banned.

The theory is that this should help avoid banning loops.

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 932360992b08a5483d90c0590218ba0fd756119e)

10 years agorecoverd: fix a comment typo
Michael Adam [Wed, 26 Jun 2013 05:11:51 +0000 (07:11 +0200)]
recoverd: fix a comment typo

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 741944f118e98f178b860194eecb215180949d18)

10 years agorecoverd: fix a comment in main_loop
Michael Adam [Fri, 21 Jun 2013 15:57:37 +0000 (17:57 +0200)]
recoverd: fix a comment in main_loop

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95)

10 years agorecoverd: eliminate some trailing spaces from ctdb_election_win()
Michael Adam [Fri, 21 Jun 2013 12:06:22 +0000 (14:06 +0200)]
recoverd: eliminate some trailing spaces from ctdb_election_win()

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit df30c0a05ed908fc2a997c56ff5484736b23b70f)

10 years agorecoverd: Don't continue if the current node gets banned
Martin Schwenke [Fri, 28 Jun 2013 06:31:07 +0000 (16:31 +1000)]
recoverd: Don't continue if the current node gets banned

A banned node cannot continue with recovery or with monitoring the cluster.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a)

10 years agorecoverd: Refactor code to ban misbehaving nodes
Amitay Isaacs [Fri, 28 Jun 2013 04:31:02 +0000 (14:31 +1000)]
recoverd: Refactor code to ban misbehaving nodes

Since we have nodemap information, there is no need to hardcode the
limit of 20.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
(cherry picked from commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe)

10 years agorecoverd: Move code to ban other nodes after we get local node flags
Amitay Isaacs [Thu, 27 Jun 2013 06:01:16 +0000 (16:01 +1000)]
recoverd: Move code to ban other nodes after we get local node flags

If a node gets banned first, then it should not ban other nodes.

This code was moved up in main_loop to avoid waiting for nodemap
from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795).

To prevent a banned node from banning other nodes, we need to first get
nodemap information from local node, so trying to ban other nodes can
fail if we are already banned.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit ae1693905036ecdbc4594fde1f12500faae4a554)

10 years agorecoverd: Delay the initial election if node is started in stopped state
Amitay Isaacs [Thu, 27 Jun 2013 05:44:27 +0000 (15:44 +1000)]
recoverd: Delay the initial election if node is started in stopped state

Since there is an early exit if a node is stopped or banned, we can wait till
the node becomes active to start initial election.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 593a17678fbd3109e118154b034d43b852659518)

10 years agorecoverd: Update capabilities only if the current node is active
Amitay Isaacs [Thu, 27 Jun 2013 05:33:49 +0000 (15:33 +1000)]
recoverd: Update capabilities only if the current node is active

Since we do an early return if a node is stopped or banned, move update
capabilities code below the early return and just before we check the
capabilities of current recovery master.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 93bcb6617e1024f810533e12390a572f51703ca0)

10 years agorecoverd: No need to check if node is recovery master when inactive
Amitay Isaacs [Thu, 27 Jun 2013 05:46:04 +0000 (15:46 +1000)]
recoverd: No need to check if node is recovery master when inactive

If a node is stopped or banned, it will cause early return from the
main_loop, so this check is redundant.  The election will be called by an
active node.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 815ddd3341b7e9db39e05a3a3fcd9a1420f053bc)

Conflicts:
server/ctdb_recoverd.c

10 years agorecoverd: Always do an early exit from main_loop if node is stopped or banned
Amitay Isaacs [Thu, 27 Jun 2013 05:39:15 +0000 (15:39 +1000)]
recoverd: Always do an early exit from main_loop if node is stopped or banned

A stopped or banned node cannot do anything useful.  So do not participate
in any cluster activity and do not cause any unnecessary network traffic.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 2396981c4bcf30530aeb7f4395093cc202105b50)

Conflicts:
server/ctdb_recoverd.c

10 years agorecoverd: Do not set banning credits on a node if current node is inactive
Amitay Isaacs [Fri, 28 Jun 2013 04:10:47 +0000 (14:10 +1000)]
recoverd: Do not set banning credits on a node if current node is inactive

If the current node is banned or stopped, then it should not assign banning
credits to other nodes since the current node will not have up-to-date flags
of other nodes.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 38304f88e0c634e97d4687c25adef975f71537b8)

10 years agobanning: Do not come out of ban if databases are not frozen
Amitay Isaacs [Mon, 1 Jul 2013 07:40:36 +0000 (17:40 +1000)]
banning: Do not come out of ban if databases are not frozen

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit a60f228f8380f222f838eb619d2ab55f96f11ac2)

10 years agobanning: No need to check if banned pnn is for local node
Amitay Isaacs [Mon, 24 Jun 2013 04:33:32 +0000 (14:33 +1000)]
banning: No need to check if banned pnn is for local node

If the banned pnn is not the local node, the function returns early.
So no need for additional check.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 297d93cecc3c0655e72ecac38508e113bdbeab9c)

10 years agobanning: Make ctdb_local_node_got_banned() a void function
Amitay Isaacs [Fri, 28 Jun 2013 04:04:18 +0000 (14:04 +1000)]
banning: Make ctdb_local_node_got_banned() a void function

When this function is called, we are already committed to banning
and there is no point in failing this function.  If freezing of
databases fails, it will be fixed by the recovery daemon.
(cherry picked from commit bb178338658b4ae32382a1f62f7c21cee1d4878f)

10 years agorecoverd: Also check if current node is in recovery when it is banned
Amitay Isaacs [Fri, 28 Jun 2013 04:02:44 +0000 (14:02 +1000)]
recoverd: Also check if current node is in recovery when it is banned

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8)

10 years agorecoverd: Set node_flags information as soon as we get nodemap
Amitay Isaacs [Fri, 28 Jun 2013 04:09:35 +0000 (14:09 +1000)]
recoverd: Set node_flags information as soon as we get nodemap

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 8d622660a14c929e365d306147b378ea6ab92175)

10 years agorecoverd: Remove old comment as the code corresponding to that has gone away
Amitay Isaacs [Wed, 26 Jun 2013 06:02:23 +0000 (16:02 +1000)]
recoverd: Remove old comment as the code corresponding to that has gone away

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 34af2cdf686d5d77854cbaa7bbcd8f878e9171c7)

10 years agobanning: Log ban state changes for other nodes at higher debug level
Amitay Isaacs [Mon, 24 Jun 2013 04:31:50 +0000 (14:31 +1000)]
banning: Log ban state changes for other nodes at higher debug level

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit c6f8407648abb37f2ed781afa5171dad8c9f59e9)

10 years agofreeze: Make ctdb_start_freeze() a void function
Amitay Isaacs [Mon, 1 Jul 2013 06:28:04 +0000 (16:28 +1000)]
freeze: Make ctdb_start_freeze() a void function

If this function fails due to memory errors, there is no way to recover.
The best course of action is to abort.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 46efe7a886f8c4c56f19536adc98a73c22db906a)

Conflicts:
server/ctdb_freeze.c

10 years agofreeze: If priority is invalid here, it's time to abort
Amitay Isaacs [Mon, 1 Jul 2013 06:21:00 +0000 (16:21 +1000)]
freeze: If priority is invalid here, it's time to abort

ctdb_start_freeze() is called from ctdb_control_freeze() which fixes the
priority if it's 0 and returns an error if it's invalid.  Other callers of
ctdb_start_freeze() are internal to CTDB.  So if priority is invalid in
ctdb_start_freeze(), something is seriously wrong.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 87716e8f504d659515d3dbcf93badbf106873bc8)

Conflicts:
server/ctdb_freeze.c
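
The split between the control entry point and the internal function, as described in the commit above, can be sketched like this. The names are hypothetical, the upper bound of 3 priorities is an assumption, and mapping 0 to priority 1 is a guess at what "fixes the priority" means.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed number of database priorities, for illustration only. */
#define DEMO_NUM_DB_PRIORITIES 3u

/*
 * Control-side validation (hypothetical): a priority of 0 is mapped
 * to the default of 1; anything out of range is reported to the
 * caller as an error and *priority is left untouched.
 */
int demo_control_freeze_priority(uint32_t requested, uint32_t *priority)
{
    assert(priority != NULL);

    if (requested == 0) {
        requested = 1;
    }
    if (requested > DEMO_NUM_DB_PRIORITIES) {
        return -1;
    }
    *priority = requested;
    return 0;
}

/*
 * Internal entry point (hypothetical): every caller is expected to
 * pass a valid priority, so an invalid one is treated as a
 * programming error and the process aborts.
 */
void demo_start_freeze(uint32_t priority)
{
    if (priority < 1 || priority > DEMO_NUM_DB_PRIORITIES) {
        abort();
    }
    /* ... freezing of databases at this priority would start here ... */
}
```

The design choice is that input from outside is validated and rejected gracefully, while an invalid value reaching the internal function can only mean a bug inside the daemon.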

10 years agofreeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()
Amitay Isaacs [Mon, 1 Jul 2013 03:26:33 +0000 (13:26 +1000)]
freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()

This ensures that whenever databases are frozen, either via sending a
control or by calling ctdb_start_freeze(), the action is logged.
Since ctdb_control_freeze() calls ctdb_start_freeze(), move its log
message into the early return taken when databases are already frozen.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 478e24bceda3fedfba54ccb48faa115df726b819)

10 years agorecoverd: Print banning message only after verifying pnn
Amitay Isaacs [Mon, 24 Jun 2013 04:18:58 +0000 (14:18 +1000)]
recoverd: Print banning message only after verifying pnn

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 4be8dff3a4451192f838497b4747273685959bed)

10 years agorecoverd: When updating flags on nodes, send updated flags and not old flags
Amitay Isaacs [Wed, 26 Jun 2013 05:22:46 +0000 (15:22 +1000)]
recoverd: When updating flags on nodes, send updated flags and not old flags

This was broken by commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa.
Instead of a SRVID_SET_NODE_FLAGS message to recovery daemon, a control
was sent to the local daemon which in turn informed the recovery daemon.
And while doing this change old flags were sent via CONTROL_MODIFY_FLAGS.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc)

10 years agotools/ctdb: Add "force" option to "recover" command
Martin Schwenke [Wed, 26 Jun 2013 04:34:47 +0000 (14:34 +1000)]
tools/ctdb: Add "force" option to "recover" command

At the moment there is no easy way to force a recovery when attempting
to reproduce certain classes of bugs.  This option is added without
documentation because it is dangerous until the bugs are fixed!  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 4f87925a287f612a6ab3b5da1a387a31c7bea28f)

10 years agorecoverd: remove bogus comment "qqq" from "add prototype new banning code"
Michael Adam [Fri, 22 Mar 2013 16:48:00 +0000 (17:48 +0100)]
recoverd: remove bogus comment "qqq" from "add prototype new banning code"

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 9f01b8db72780acf2f88f1392bc0a796dd4c6176)

10 years agorecoverd: Clarify some misleading log messages
Martin Schwenke [Thu, 11 Oct 2012 04:59:00 +0000 (15:59 +1100)]
recoverd: Clarify some misleading log messages

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 14589bf7c16ba017fe00d4e8bea8cc501546c60f)

10 years agorecoverd: try to become the recovery master if we have the capability, but the curren...
Stefan Metzmacher [Tue, 21 Jun 2011 13:49:30 +0000 (15:49 +0200)]
recoverd: try to become the recovery master if we have the capability, but the current master doesn't

metze
(cherry picked from commit 6ba8af28f8a8f79db65120a97d7157dcc5c7e083)

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit ccd67cf7f26713e695000d89d9ce8cfa78bfe00f)

10 years agoserver/recoverd: do takeover_run after verifying the reclock file
Stefan Metzmacher [Tue, 31 Aug 2010 06:42:32 +0000 (08:42 +0200)]
server/recoverd: do takeover_run after verifying the reclock file

metze
(cherry picked from commit 93df096773c89f21f77b3bcf9aa90bf28881b852)

10 years agoserver/banning: also release all ips if we're banning ourself
Stefan Metzmacher [Tue, 31 Aug 2010 07:28:34 +0000 (09:28 +0200)]
server/banning: also release all ips if we're banning ourself

metze
(cherry picked from commit c386f2c62f06f1c60047b7d4b1ec7a9eec11873c)

10 years agoNew version 1.2.64 ctdb-1.2.64
Amitay Isaacs [Thu, 20 Jun 2013 04:40:00 +0000 (14:40 +1000)]
New version 1.2.64

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agorecoverd: Fix printing of node flags from local information
Amitay Isaacs [Wed, 23 Jan 2013 03:35:47 +0000 (14:35 +1100)]
recoverd: Fix printing of node flags from local information

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 124e2a471aeda9c900fd898178a30522d7d74221)

10 years agotools/ctdb: Do not exit prematurely on control timeout if retrying in a loop
Amitay Isaacs [Tue, 18 Jun 2013 04:27:34 +0000 (14:27 +1000)]
tools/ctdb: Do not exit prematurely on control timeout if retrying in a loop

This avoids premature exits from "ctdb stop" and "ctdb continue" due to
intermittent control (e.g. getpnn, getnodemap) timeouts.

This needs a proper fix to distinguish between timeout and failure
conditions and take appropriate action.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit c48583fd238496a81ddc46a21892f0b49559036a)
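
The commit above notes that a proper fix would distinguish timeouts from failures. One possible shape for such a retry loop is sketched below. This is not what the commit implements; enum ctl_status, control_fn, demo_flaky_control() and retry_control() are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for a control sent to the daemon. */
enum ctl_status { CTL_OK = 0, CTL_TIMEOUT = 1, CTL_FAILED = 2 };

/* Hypothetical control function: performs one attempt. */
typedef enum ctl_status (*control_fn)(void *state);

/*
 * Demo control used for illustration: state points to an int holding
 * the number of timeouts still to be simulated before success.
 */
enum ctl_status demo_flaky_control(void *state)
{
    int *timeouts_left = state;

    if (*timeouts_left > 0) {
        (*timeouts_left)--;
        return CTL_TIMEOUT;
    }
    return CTL_OK;
}

/*
 * Retry fn up to max_attempts times.  A timeout triggers another
 * attempt; success or a hard failure ends the loop immediately.
 * Returns the status of the last attempt.
 */
enum ctl_status retry_control(control_fn fn, void *state, int max_attempts)
{
    enum ctl_status st = CTL_FAILED;
    int attempt;

    assert(fn != NULL && max_attempts > 0);

    for (attempt = 0; attempt < max_attempts; attempt++) {
        st = fn(state);
        if (st != CTL_TIMEOUT) {
            break;
        }
    }
    return st;
}
```

A caller such as a stop or continue command could then keep waiting on CTL_TIMEOUT while still exiting promptly on CTL_FAILED.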

10 years agotools/ctdb: Remove duplicate command definition for "sync"
Martin Schwenke [Thu, 23 May 2013 06:06:47 +0000 (16:06 +1000)]
tools/ctdb: Remove duplicate command definition for "sync"

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

(cherry picked from commit 9e7b7cd04adc5e66e2ffa4edf463a682aaea379b)

Conflicts:
tools/ctdb.c

10 years agotools/ctdb: Fix racy ipreallocate code
Amitay Isaacs [Thu, 23 May 2013 03:04:06 +0000 (13:04 +1000)]
tools/ctdb: Fix racy ipreallocate code

This code tried to find the recovery master and send an ipreallocate
request to that node.  When a node is stopped, this code asked the
stopped node for recovery master.  Stopped node does not have up-to-date
information on the current recovery master.  So ipreallocate requests
were sent to the wrong node and ignored by that node which is not the
recovery master.

Send ipreallocate request to all active nodes.  That way we guarantee
that the current recovery master will see it and respond to it.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

(cherry picked from commit 0577ce3c68e4febf49a1ef5093e918db9d5ec636)

Conflicts:
tools/ctdb.c
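
Selecting the recipients for the ipreallocate request, as described in the commit above, might be sketched as follows. The struct, the function and the flag bits are hypothetical stand-ins; they are not the definitions used in tools/ctdb.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits only; the real node flags may differ. */
#define DEMO_FLAG_DISCONNECTED 0x01u
#define DEMO_FLAG_BANNED       0x08u
#define DEMO_FLAG_STOPPED      0x20u
#define DEMO_FLAGS_INACTIVE \
    (DEMO_FLAG_DISCONNECTED | DEMO_FLAG_BANNED | DEMO_FLAG_STOPPED)

/* Hypothetical nodemap entry. */
struct demo_node {
    uint32_t pnn;
    uint32_t flags;
};

/*
 * Copy the pnn of every active node into out (which must have room
 * for num_nodes entries) and return how many were copied.  The
 * request is then sent to each of them instead of to whichever node
 * a possibly stale query names as recovery master.
 */
size_t demo_list_active_nodes(const struct demo_node *nodes,
                              size_t num_nodes, uint32_t *out)
{
    size_t i, n = 0;

    assert(nodes != NULL && out != NULL);

    for (i = 0; i < num_nodes; i++) {
        if ((nodes[i].flags & DEMO_FLAGS_INACTIVE) == 0) {
            out[n++] = nodes[i].pnn;
        }
    }
    return n;
}
```

Because the current recovery master is itself an active node, it is always in the returned list, so it sees the request regardless of which node was asked.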

10 years agoeventscripts: New configuration variable $CTDB_NFS_DUMP_STUCK_THREADS
Martin Schwenke [Thu, 13 Jun 2013 01:56:25 +0000 (11:56 +1000)]
eventscripts: New configuration variable $CTDB_NFS_DUMP_STUCK_THREADS

If some nfsd threads are still alive after a shutdown during a restart
then this indicates the maximum number of threads for which a stack
trace should be dumped.  This can be useful for trying to determine
why nfsd is stuck.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 2503245db10d567af708a04edd3a3b488c24f401)

10 years agoeventscripts: Add new option $CTDB_MONITOR_NFS_THREAD_COUNT
Martin Schwenke [Thu, 13 Jun 2013 00:17:20 +0000 (10:17 +1000)]
eventscripts: Add new option $CTDB_MONITOR_NFS_THREAD_COUNT

Consider the following example:

1. There are 256 nfsd threads configured.
2. 200 threads are "stuck" in system calls, perhaps waiting for the
   underlying filesystem when an attempt is made to restart NFS.
3. 56 threads exit when NFS is stopped.
4. 56 new threads are started when NFS is started.
5. 200 "stuck" threads exit leaving only 56 threads running.

Setting this option to "yes" makes the 60.nfs monitor event look for
this situation and try to correct it.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 99b0d8b8ecc36dfc493775b9ebced54539c182d2)

Conflicts:
config/events.d/60.nfs

10 years agoNew version 1.2.63 ctdb-1.2.63
Amitay Isaacs [Mon, 17 Jun 2013 03:19:39 +0000 (13:19 +1000)]
New version 1.2.63

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotdb: Update TDB library to 1.2.12 from upstream
Amitay Isaacs [Thu, 13 Jun 2013 02:52:38 +0000 (12:52 +1000)]
tdb: Update TDB library to 1.2.12 from upstream

This comprises the following patches from Samba upstream.

d1feccb35e987545f4ae8e3f4eb0b4fc741e7e7e tdb: change version to tdb-1.2.12
1f269fcc6e2bb46b765d547eb1add2bc52272c47 tdb: Add another overflow check to tdb_expand_adjust
d9b4f19e73f241a1ccf64f04c3cc28d481550bb7 tdb: Make tdb_recovery_allocate overflow-safe
8b215df4454883b3733733af4f49f87eb0a2a46a tdb: Make tdb_recovery_size overflow-safe
7ae09a9695bcc5fad606441db3ab6e413b9d48ce tdb: add proper OOM/ENOSPC handling to tdb_expand()
854c5f0aac03c7c6d7e1b37997dcdc848fec1499 tdb: add overflow detection to tdb_expand_adjust()
e19d46f7e31a32e2b5ae3ec05e13f32b8ac2109d tdb: add overflow/ENOSPC handling to tdb_expand_file()
a07ba17e0c91d726416db946e6f65b064b2d17ec tdb: add a 'new_size' helper variable to tdb_expand_file()
4483bf143ddfee9ec07aed8f124559b00f757d9a tdb: Add overflow-checking tdb_add_off_t
3bd686c5ad4756af1033ac14ba09a40156cc6d47 tdb: fix logging of offets and lengths.
cd4b413cb0574c459c1c24cf07f8d6b44f5fc077 build: Remove autoconf build system
11f467d0bd8e2264f311d82f3299443b99526bb3 tdb: include information about hash function being used in tdbtool info output
c8c0bf74805c61b1379dab1d4529df0004872bb4 tdb: Fix blank line endings
a92c08e18bca2f1db671dc5e2d0db4adbf39752d tdb: Little format change
68698b4e64831d2fdf762b5f8577ff404f10a3cb tdb: Slightly simplify tdb_expand_file
2f4b21bb57c4f96c5f5b57a69d022c142d8088d5 ntdb: switch between secrets.tdb and secrets.ntdb depending on 'use ntdb'
a7fdd4f7c2e64eedf12cb46c3435edbec772a4ab tdb: Slightly simplify transaction_write
fcb345f5d6be9659a0f8a5afe62a937010a33d5c tdb: Make tdb_release_transaction_locks use tdb_allrecord_unlock
5929e38b6cdbd4f9721293a19f079ceae1af76b0 tdb: Don't segfault if tdb_open_ex for check failed
3534e4e8d5bebfaaaef5886dcea219a7e4047fc7 tdb: Factor out the retry loop from tdb_allrecord_upgrade
1f93f0836484ccc1abe335f5bd2cfd35df0b7631 tdb: Simplify fcntl_lock() a bit
542400a966039178effd6c8da932ed3a8d749131 tdb: Use tdb_null in freelistcheck
68559b787e7c9a83e055493bde638ec02e1097d1 tdb: Enhance lock tracking a bit
05235d5b444558f6d06ef12ea7d74850800425cf tdb: Fix a typo
72cd5d5ff664dc46afb3dd6a5ea45a28ef7b8591 tdb: Remove "header" from tdb_context
71247ec4bdefb3a1b16619f7ea7404bcbafb5b60 tdb: Pass argument "header" to check_header_hash
1436107b0769c88e7eb50057b5f05ad5b8573990 tdb: Pass argument "header" to tdb_new_database
80a6fe84271d15cc22caa3d08768ab5559ef9ed7 Remove some unused variables.
f2d67af7bc0b316f54d6cc1a44d07f1b24244378 tdb: Fix undefined prototype warnings
1beb4bc9d12fb124935e9e4710f48ad616dacc60 tdb: Fix \n in error messages
a444bb95a270ca5b331041c11a7e785c1e0559b7 tdb: Add a comment explaining the "check"
3109b541c9b2f0063e1ccb0cdaec0a8e388b29b4 tdb: Make tdb_new_database() follow a more conventional style
d972e6fa74b6499403d4c3d3c6a84cbda7eded39 tdb: Fix a typo
c04de8f3a4deba0062f8cdbcbe74d3735a80b735 tdb: Fix a typo
24755d75b0ee7170195bc26cf28bab4ffdb6f752 tdb: Use tdb_lock_covered_by_allrecord_lock in tdb_unlock
f8dafe5685008671f4f983a4defc90b4a05cf992 tdb: Factor out tdb_lock_covered_by_allrecord_lock from tdb_lock_list
26b8545df44df7e60ba0ba7336ffbdda8a14e423 tdb: Simplify logic in tdb_lock_list slightly
0f4e7a1401998746a6818b9469ab369d70418ac1 tdb: Slightly simplify tdb_lock_list
116ec13bb0718eb1de1ac1f4410d5c33f1db616f tdb: Fix blank line endings
7237fdd4ddc0b9c848b5936431b4f8731ce56dba tdb: Fix a comment
d2b852d79bd83754d8952a0e3dece00e513549f2 tdb: Fix a typo
2c3fd8a13e7dde23a23404cd88078a04c8b338ea tdb: Fix a missing CONVERT
c9053796b389758e8bacff4cd37d633fd65171f9 tdb: Improve the documentation of tdb_reopen() and tdb_close().
7f08365a28770fdcc1bb625d8a16d11d8f15c29a tdb: Fix possible crash bugs in the python tdb code.
ede2aaef281048123cacab9ae879f5c546787080 lib/tdb: Rename manpages/ to man/.
68c6dcb0942244f542eec7bbe5fba78ef7f66051 docs: man tdbtool: Add missing meta data.
c62f8baff878001ead921112dd653ff69d1cfe7d tdb: Make tdb robust against improper CLEAR_IF_FIRST restart
37fd93194db10fc832ed3fa1ec880ebc26be904b tdb: Make robust against shrinking tdbs
100d38d6e0fae1dc666ae43941570c9f106b73c2 tdb: add -e option to tdbdump (and docment it).
ffde8678910ae838588ab380b48a544333f81241 tdb: tdbdump should log errors, and fail in that case.
90f463b25f7bb0bc944732773c56e356834ea203 tdb: add tdb_rescue()
a168a7c791a4be1730a370d059b3a1073fbb0bdd tdb: Fix a typo
1f50b6c3aefe9a7ac64641b1e9c23e014459647f tdb/test: fix build on OSF/1
41cffa3c8b126570203e32c2024d5a8f439b529e doc: Remove build/ from doxygen config or it will not work in brew.
ea6b8ee026a4c53d9dfb5a42e4d9e485b89018e3 lib/tdb: Fix format string errors found by -Werror=format in tdb tests
c92a5670e3d60decbc13bd8680de37184bc12815 pytdb: Check if the database is closed before we touch it
a8e88332a394f4a4c3e43b496738008fba39d39f pytdb: Check for errors parsing strings into TDB_DATA
66f59f040984bef5023fc844097b85bebee88f09 tdb: finish weaning off err.h.
3c4263e7580143c69225729f5b67f09c00add2fd tdb: don't use err.h in tests.
1783fe34433f9bb4b939de3231a7c296390ec426 tdb: make TDB_NOSYNC merely disable sync.
f7f6992c1e6ee8ac4a55c2fddf169ac695362036 autobuild: always set TDB_NO_FSYNC.
bf5934ca1b80930d8fd2f19ef12e32092b34fa4d tdb/wscript: Remove unecessary semicolons.
e2caba054f977b631720f8dc2528ba03dc237122 tdb: remove unused debug_fprintf() macro that breaks the build
0688cf102d2a513875d1832ad0de6052b837a72a tdb:tests: fix use of a non-existent word (existant)
c8877d8f63ea367401fae4377cd28ee91b58d9e3 build: Remove unused release scripts for tdb
593e731097bc6f2fd50034f5e3ddac017894e584 lib/tdb: Update ABI
3fdeaa3992bb0599613e20d8e3248c478897531f lib/tdb: Add/expose lock functions to support CTDB
4442c0b2c92e4b2e88661e15022228c5f6547112 lib/tdb: fix transaction issue for HAVE_INCOHERENT_MMAP.
c12970cc91cb4da8976801e194e29e33e02b340a lib/tdb: fix test/run-die-during-transaction when HAVE_INCOHERENT_MMAP.
330e3e1b91ecbf99af3b598b324f21b5eff933fd lib/tdb: fix missing return 0 code.
fde694274e1e5a11d1473695e7ec7a97f95d39e4 lib/tdb: fix OpenBSD incoherent mmap.
eafd83736918bc5953e4a91cf2d893e68f2da2a2 lib/tdb: fix up run-die-during-transaction test cases on Solaris.
3272ba0d2d63e6a7d00972bc2c052aee84f073fd lib/tdb: remove unnecessary XOPEN and FILE_OFFSET_BITS defines in test/
583ffeae404cc632eebc43fed054391a59dffee2 lib/tdb: fix tests for standalone out-of-tree.
4d58d0fa8f936e7efdc02e31c053d42a47b3e62a tdb: build and run unit tests in tdb/test/
205242e1769f96e0e8fccd52378965d35dd02093 tdb/test: fix up tests for use in SAMBA tdb code.
8fa345d952328c5866f3a0f835f3599343c51b00 tdb: wean CCAN-style unit tests off of tap.
0802791081ba39298aa93f0e6860c3b62800df73 tdb: import unit tests from CCAN into tdb/test/
390b9a2dd8447ecd16e3957c02fa886781797733 tdb: make tdb_private.h idempotent.
eff69aa0f908f5cb44b3cb846c8a4ada874240fa Add "repack" command to tdbtool.
7b42ceb414d3e14c411dd95dcd0b200113fe1191 Fix compile when TDB_TRACE is enabled.
c1e9537ed0f58404fed96a7aa9e581b8ebb1fb60 tdb: Use tdb_parse_record in tdb_update_hash
5767224b7f4703c3195aa69eef4352d80980f95e tdb: don't free old recovery area when expanding if already at EOF.
3a2a755e3380a8f81374009d463cd06161352507 tdb: use same expansion factor logic when expanding for new recovery area.
664add17757836c5ee98618aef11371f412b6e44 tdb: Avoid a malloc/memcpy in _tdb_store
b64494535dc62f4073fc6302847593ed6e6ec38b tdb: be more careful on 4G files.
20789bfdde37ee3140422b5599d41a280c01d29f tdb: Fix python documentation for tdb module
3741cf999e0f05a831d88330ac6bfa7ad34c2ec7 Remove unused variable.
3e6e1aed949a4483fc38607e443b5c8b715aca3b Fix a bunch of "warning: variable ‘XXXX’ set but not used [-Wunused-but-set-variable]" warnings from the new gcc.
86afe83d867229b11fd4ec9cb6e29af698cacdef waf: Factor checking for undefined symbol flags out into separate method.
3585abcd4cc1b6ffeeb7f64abe3d21a12f9633f6 pytdb: Shorter description which fits on a single line.
774f85649b5d9f8872ebfdd359964330b4ff436a tdb: Only check for pkg-config file when checking for system tdb.
31912781ca84db9b27264b5182729d1097c0661d wafsamba: Only install .pc files if libraries are public.
a5025a3c2fa83c67e0a53611ad8fbe264888a590 tdb: Install pkg-config file.
ee720fc19cebf9108711429dfe25ccaf192e2c7e tdb: increment sequence number in tdb_wipe_all().
e01f3108ff447239fb3cb2f89b4749c5f7b88c3b tdb: remove 'EOF' print from tdbrestore
5eecc854236f0b943aaa89e0c3a46f9fbd208ca9 tdb2: create tdb2 versions of various testing TDBs.
6bc59d77b64d1ecbe5c906ed2fa80a7b8ca5d420 tdb_store: check returns for 0, not -1.
4fa51257b283c2e8bb415cc7f9c001d64c8a2669 tdb: enable VALGRIND to remove valgrind noise.
43ab5aa390769ee9b57918cf5b57aa4a22586805 lib/tdb/python/tests/simple.py: don't assume TDB ordering.
73c31f044e32103276558a194698ea6cf876b4f2 tdb: fix a build warning.
bf3b2e2aee284c85ecea6a3142bc1fa5344b430a Support the 'PYTHON' environment variable.
1804d9a64662d37f6c7c50bdd7b8edd80f42192b tdb_backup: avoid transaction on backup file, use lockall
36cfa7b79e36d880cdbf24d0769558be44d0edda tdb: make sure we skip over recovery area correctly.
cb884186a55c9ef8aca6ee48b16423b3c881e689 tdb_expand: limit the expansion with huge records
094ab60053bcc0bc3542af8144e394d83270053e tdb: tdb_repack() only when it's worthwhile.
6aa72dae8fc341de5b497f831ded1f8f519fa8fb tdb: fix transaction recovery area for converted tdbs.
0080f944b47f3afa676153e5da7093a8667fc005 tdb: Fix Coverity ID 2238: SECURE_CODING
25397de589e577e32bb291576b10c18978b5bc4e tdb: Fix Coverity ID 2192: NO_EFFECT
bfce962c8f5219e72a07810a663a14542355927d tdb: rename convert_string() to tdb_convert_string()
c56e3ccfc9eafbb69b03dc40346e3115bec42ef6 lib: don't install public headers if a private library
7b948a39e1783ff4732f9734830b0544d6a814b1 tdb: use public_headers to install header files
0a0ebd73fb98002f099544eca5deaf6763790277 tdb: use system include style for system headers
949427c208159f4ac580f547dd5465a70b4751b7 python: use os.environ[] instead of os.putenv()
91cad71390bd2a0330891083c65d3f9000b74657 tdb: Fix a C++ warning
8b8caac6d0ac980e59bc5bcbfb06502deebb9f42 build: removed the old autogen.sh and autogen-waf.sh scripts
b42afa0edf375c944d39a888f4db422e8d2b13cf tdb: Added doxygen documentation.
005c6370cdaab69d4228ecbf5e7369ebc61b86ae waf: ensure "make dist" works from a clean git tree for all libraries
24d5a7202ab521b92eb07c93647ae2d381e181a5 tdbrestore: Update to GPLv3+, remove old FSF address.
5792fa90ace06f736661d9924ec9a75c3a0a9771 s4-python: Only set BASETYPE flag if subclassing is supported.
51239bb26a714bf4c41fb15fde211df1255f9468 talloc/tdb/tevent: Remove obsolete signatures files.
b222615b5978aa78e82af79359b7cc3baec0bc87 tdb: add ABI/tdb-1.2.9.sigs
cac57328a6077dc428396402036636095f139569 tdb: tdb_summary() support.
7ea1b767977c8c201c0f5bfaeb01f96af4b51f7b tdb: setup TEST_DATA_PREFIX for make test
b83672b36c1ea8c35833c40c3919b63809f16624 tdb:tdbtorture: use TEST_DATA_PREFIX for files
d81ceeb9833ec80d7ab21b34e4b3af3d79aeb845 tdb:tdbtest: use TEST_DATA_PREFIX for files
9e8a04984327ffae611165244a159ff9c6ca30f4 tdb: Remove autotools support.
46ee6908be64c4405b3a8f7477abc119aa060020 tdb: add ABI/tdb-1.2.8.sigs
c754fad5712cc7c1912f27eb5595c12cf65e55c6 tdb: Bump version number after symbol versioning changes.
51e7244269e9c14a920f91a485cda6c785b2fc85 pytdb: Make PyTdb variable static.
87337383572324e3d1d00ed710614ebe217aa2b2 build: introduce SAMBA_CHECK_PYTHON_HEADERS
57f2f1d72a70a80e61a2ed6f1abc63a177a590ab waf: remove the restriction that private libraries must not have a vnum
ebe2867fc2c01fb5288d62eedb0e2f43788b9f27 waf-abi: auto-generate per-symbol versions from ABI files
735c1cd2da15167748e92ba6de48fdb5169db587 s4-pkgconfig: add @LIB_RPATH@ to our link flags
989d8803f28826e6541667127abad801c4fa4566 tdb:common/open.c - use "discard_const_p" for certain "tdb->name" assignments
d2560cd7dc106d7853442133f237001f68bcb971 tdb:tdbstore.c - remove a useless '\'
2ac5cedb719d220db412d0bdc69e34bad9ab26f1 Avoid the use of PyAPI_DATA, which is for internal Python API's.
dedd064aa825edd57f992b12218a184398db9586 tdb: set tdb->name early, as it's needed for tdb_name()
f0a472a2d678dd0374181f1a6ac0a3d35503636e waf: added reconfigure targets to our libraries
1aa8308c30962ac04a2997acaa7f2a7458729cc2 tdb: Use waf by default.
3deece559159150a0710d8160f39583ba7f2e582 s4: Remove the old perl/m4/make/mk-based build system.
50256c01d061c6d73bb2d8ee2c60785d58748e6c waf: Only specify vnum for non-private libraries.
49ef2888193dd7cc37c3fe0a980b7cc1abdac805 waf: Rename some BUNDLED_ functions to PRIVATE_.
dec00bf0974ea3b5079c32e2a6e6253954297253 tdb: Revert re-addition of tdb_set_logging_function.
ee913f45683e66d4391944e034217a56d42e7ab5 tdb: commit the version 1.2.7 signatures
c529317fe2b48e045b35a613cfd1ad3f03b68435 Lowercase socket_wrapper name.
62c4af99428abb2d4ac1b18454d72e0c8cbb67e8 tdb: Set _PUBLIC_ in C file rather than header files (Debian bug 600898)
7cba3cfac8781061e4114573517b30baedbf891a waf: replace the is_bundled option with private_library
713900b81297548c44a890c3bca1dde9019af8bc s4-build: fixed some formatting
05c1beb6b47e607dac9850e81cef775a1d9b00ae tdb: Bump version to 1.2.7 after addition of pytdb.__version__.
bb0017615d44b66828c98a408ca15b50956f3e91 waf: fixed exit status of test suites
20d39691a8eecd57b27cb709a70c50bf572b8114 tdb: Only use system pytdb when using system tdb.
e805bf52c9ed32bd53759996b5700c5d582a2a58 tdb: Support using system pytdb.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agoNew version 1.2.62 ctdb-1.2.62
Amitay Isaacs [Mon, 22 Apr 2013 04:26:56 +0000 (14:26 +1000)]
New version 1.2.62

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agoctdbd: Set num_clients statistic from ctdb->num_clients
Amitay Isaacs [Fri, 19 Apr 2013 03:29:04 +0000 (13:29 +1000)]
ctdbd: Set num_clients statistic from ctdb->num_clients

This fixes the problem of "ctdb statisticsreset" clearing the number of
clients even when there are active clients.

Values returned in statistics for frozen, recovering, memory_used are based on
the current state of CTDB and are not maintained as statistics.  This should
include num_clients as well.

Currently ctdb->num_clients is unused. So use that to track the number of
clients and fill in statistics field only when requested.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit dc4ca816630ed44b419108da53421331243fb8c7)
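
The idea above can be illustrated with a minimal sketch. The structure and function names below (struct daemon, stats_reset(), stats_get()) are hypothetical stand-ins, not the real ctdbd definitions; the point is only that a value derived from live state survives a reset of the accumulated counters because it is filled in at request time.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for the daemon structures. */
struct stats {
	uint32_t num_clients;   /* derived from live state, not accumulated */
	uint32_t total_calls;   /* a genuinely accumulated counter */
};

struct daemon {
	uint32_t num_clients;   /* live count, updated on connect/disconnect */
	struct stats statistics;
};

/* Equivalent of a statistics reset: wipe the accumulated counters only. */
static void stats_reset(struct daemon *d)
{
	memset(&d->statistics, 0, sizeof(d->statistics));
}

/* Fill in state-derived fields only when statistics are requested. */
static struct stats stats_get(const struct daemon *d)
{
	struct stats s = d->statistics;

	s.num_clients = d->num_clients;
	return s;
}
```

With this shape, a reset followed by a query still reports the number of connected clients, while the accumulated counter starts again from zero.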

11 years agoctdbd: Log PID file creation and removal at NOTICE level
Martin Schwenke [Mon, 22 Apr 2013 03:52:04 +0000 (13:52 +1000)]
ctdbd: Log PID file creation and removal at NOTICE level

Unexpected removal of this file can have serious consequences, so it
is best if this is logged at the default level.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit bfed6a8d1771db3401d12b819204736c33acb312)

11 years agoscripts: Crash cleanup script should pass a tag to logger
Martin Schwenke [Tue, 16 Apr 2013 06:10:04 +0000 (16:10 +1000)]
scripts: Crash cleanup script should pass a tag to logger

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoscripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running
Martin Schwenke [Mon, 15 Apr 2013 05:42:55 +0000 (15:42 +1000)]
scripts: ctdb-crash-cleanup.sh uses initscript to see if ctdbd is running

"ctdb ping" (or "ctdb status") can time out.  How many times should we
try?

Instead, depend on the initscript to implement something sane.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 90cb337e5ccf397b69a64298559a428ff508f196)

Conflicts:
config/ctdb-crash-cleanup.sh

11 years agoinitscript: Use a PID file to implement the "status" option
Martin Schwenke [Mon, 15 Apr 2013 05:18:12 +0000 (15:18 +1000)]
initscript: Use a PID file to implement the "status" option

Using "ctdb ping" and "ctdb status" is fraught with danger.  These
commands can time out when ctdbd is running, leading callers to believe
that ctdbd is not running.  Timeouts could be increased but we would
still have to handle potential timeouts.

Everything else in the world implements the "status" option by
checking if the relevant process is running.  This change makes CTDB
do the same thing and uses standard distro functions.

This change is backward compatible in the sense that a missing
/var/run/ctdb/ directory means that we don't do a PID file check but
just depend on the distro's checking method.  Therefore, if CTDB was
started with an older version of this script then "service ctdb
status" will still work.

This script does not support changing the value of CTDB_VALGRIND
between calls.  If you start with CTDB_VALGRIND=yes then you need to
check status with the same setting.  CTDB_VALGRIND is a debug
variable, so this is acceptable.

This also adds sourcing of /lib/lsb/init-functions to make the Debian
function status_of_proc() available.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 687e2eace4f48400cf5029914f62b6ddabb85378)

Conflicts:
config/ctdb.init
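
The actual change lives in a shell initscript and relies on distro helper functions. As a rough illustration of the underlying check only, here is a sketch in C of what "status via PID file" amounts to. The function name pidfile_status() is hypothetical, and the return values follow the usual LSB status convention (0 running, 1 dead but PID file exists, 3 not running), which is an assumption rather than something this commit states.

```c
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper: 0 = running, 1 = PID file exists but process is
 * gone (or file is malformed), 3 = no PID file at all.
 */
static int pidfile_status(const char *path)
{
	FILE *f = fopen(path, "r");
	long pid = 0;
	int n;

	if (f == NULL) {
		return 3;
	}
	n = fscanf(f, "%ld", &pid);
	fclose(f);
	if (n != 1 || pid <= 0) {
		return 1;
	}
	/* Signal 0 performs only the existence and permission checks. */
	if (kill((pid_t)pid, 0) == 0 || errno == EPERM) {
		return 0;
	}
	return 1;
}
```

Unlike "ctdb ping", this check never talks to the daemon, so it cannot time out because the daemon is busy.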

11 years agoctdbd: Add --pidfile option
Amitay Isaacs [Fri, 19 Apr 2013 06:47:32 +0000 (16:47 +1000)]
ctdbd: Add --pidfile option

Default is not to create a pid file.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 996e74d3db0c50f91b320af8ab7c43ea6b1136af)

Conflicts:
server/ctdb_daemon.c

11 years agoctdbd: Change some fork() calls to ctdb_fork()
Martin Schwenke [Fri, 19 Apr 2013 05:16:19 +0000 (15:16 +1000)]
ctdbd: Change some fork() calls to ctdb_fork()

This guarantees that ctdb_set_child_info() is called.

Signed-off-by: Martin Schwenke <martin@meltin.net>
11 years agoutil: ctdb_fork() should call ctdb_set_child_info()
Martin Schwenke [Fri, 19 Apr 2013 04:54:03 +0000 (14:54 +1000)]
util: ctdb_fork() should call ctdb_set_child_info()

For now we pass NULL as the child name.  Later we'll give ctdb_fork()
and friends an extra argument and pass that through.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(backported from commit ba8866d40125bab06391a17d48ff06a4a9f9da89)

11 years agoutil: New functions ctdb_set_child_info() and ctdb_is_child_process()
Martin Schwenke [Fri, 19 Apr 2013 04:42:44 +0000 (14:42 +1000)]
util: New functions ctdb_set_child_info() and ctdb_is_child_process()

Must be called by all child processes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Michael Adam <obnox@samba.org>
(backported from commit 59b019a97aad9a731f9080ea5be14d0dbdfe03d6)

11 years agoLogging: Fix breakage when freeing the log ringbuffer
Martin Schwenke [Fri, 19 Apr 2013 04:35:49 +0000 (14:35 +1000)]
Logging: Fix breakage when freeing the log ringbuffer

Commit c6e1b84595039edb5c49a5851b440710dc0e2ac1 broke fetching from
the log ringbuffer.  The solution there is still generally good: there
is no need to keep the ringbuffer in children created by
ctdb_fork()... except for those special children that are created to
fetch data from the ringbuffer!

Introduce a new function ctdb_fork_no_free_ringbuffer() that does
everything ctdb_fork() needs to do except free the ringbuffer (i.e. it
is the old ctdb_fork() function).  The new ctdb_fork() function just
calls that function and then frees the ringbuffer in the child.

This means all callers of ctdb_fork() have the convenience of having
the ringbuffer freed, apart from the special case in the ringbuffer
fetching code where we call ctdb_fork_no_free_ringbuffer() instead.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(backported from commit 00db5fa00474f8a83f1aa3b603fd756cc9b49ff4)

11 years agoctdb_call: don't bump the rsn in ctdb_become_dmaster() any more
Michael Adam [Wed, 3 Apr 2013 10:02:59 +0000 (12:02 +0200)]
ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more

This is now done in ctdb_ltdb_store_server(), so this
extra bump can be spared.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit cad3107b12e8392f786f9a758ee38cf3a3d58538)

11 years agoFix a severe recovery bug that can lead to data corruption for SMB clients.
Michael Adam [Wed, 3 Apr 2013 09:40:25 +0000 (11:40 +0200)]
Fix a severe recovery bug that can lead to data corruption for SMB clients.

Problem:
Recovery can under certain circumstances lead to old record copies
resurrecting: Recovery selects the newest record copy purely by RSN. At
the end of the recovery, the recovery master is the dmaster for all
records in all (non-persistent) databases. And the other nodes locally
hold the complete copy of the databases. The bug is that the recovery
process does not increment the RSN on the recovery master at the end of
the recovery. Now clients acting directly on the recovery master will
change a record's content on the recmaster without migration
and hence without RSN bump.  So a subsequent recovery cannot tell that
the recmaster's copy is newer than the copies on the other nodes, since
their RSN is the same. Hence, if the recmaster is not node 0 (or more
precisely not the active node with the lowest node number), the recovery
will choose copies from nodes with lower number and stick to these.

Here is how to reproduce:

- assume we have a cluster with at least 2 nodes
- ensure that the recmaster is not node 0
  (maybe ensure with "onnode 0 ctdb setrecmasterrole off")
  say recmaster is node 1
- choose a new database name, say "test1.tdb"
  (make sure it is not yet attached as persistent)
- choose a key name, say "key1"
- all cluster nodes should be ok and no recovery should be running
- now do the following on node 1:

1. dbwrap_tool test1.tdb store key1 uint32 1
2. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 1
3. ctdb recover
4. dbwrap_tool test1.tdb store key1 uint32 2
5. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 2
6. ctdb recover
7. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 1
   ==> BUG

This is a very severe bug, since when applied to Samba's locking.tdb
database, it means that for SMB clients on clustered Samba there is
the potential for locking out oneself from previously opened files
or even worse, data corruption:

Case 1: locking out

- client on recmaster opens file
- recovery propagates open file handle (entry in locking.tdb) to
  other nodes
- client closes file
- client opens the same file
- recovery resurrects old copy of open file record in locking.tdb
  from lower node
- client closes file but fails to delete entry in locking.tdb
- client tries to open same file again but fails, since
  the old record locks it out (since the client is still connected)

Case 2: data corruption

- client1 on recmaster opens file
- recovery propagates open file info to other nodes
- client1 closes the file and disconnects
- client2 opens the same file
- recovery resurrects old copy of locking.tdb record,
  where client2 has no entry, but client1 has.
- but client2 believes it still has a handle
- client3 opens the file and succeeds without
  conflicting with client2
  (the detached entry for client1 is discarded because
   the server does not exist any more).
=> both client2 and client3 believe they have exclusive
  access to the file and writing creates data corruption

Fix:

When storing a record on the dmaster, bump its RSN.

The ctdb_ltdb_store_server() is the central function for storing
a record to a local tdb from the ctdbd server context.
So this is also the place where the RSN of the record to be stored
should be incremented, when storing on the dmaster.

For the case of the record migration, this is currently done in
ctdb_become_dmaster() in ctdb_call.c, but there are other places
such as in recovery, where we should bump the RSN, but currently
don't do it.

So moving the RSN incrementation into ctdb_ltdb_store_server fixes
the recovery-record-resurrection bug.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit feb1d40b21a160737aead22e398f3c34ff3be8de)
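
The failure mode and the fix can be reproduced with a toy model. The sketch below is not ctdb code: it assumes a two-node cluster with node 1 as recmaster, models recovery as "highest RSN wins, ties go to the lowest node number" as the message describes, and starts from the state right after the first recovery (every node holds value 1 with the same RSN). All names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_NODES 2

struct record {
	uint64_t rsn;
	uint32_t value;
};

/* One copy of the same record per node. */
static struct record copies[NUM_NODES];

/* Store on the dmaster; bump_rsn selects the fixed or unfixed behaviour. */
static void store_on_dmaster(int node, uint32_t value, bool bump_rsn)
{
	copies[node].value = value;
	if (bump_rsn) {
		copies[node].rsn++;
	}
}

/* Highest RSN wins, ties go to the lowest node; winner is pushed out. */
static void recover(void)
{
	int best = 0;

	for (int i = 1; i < NUM_NODES; i++) {
		if (copies[i].rsn > copies[best].rsn) {
			best = i;
		}
	}
	for (int i = 0; i < NUM_NODES; i++) {
		copies[i] = copies[best];
	}
}

/* Store 2 on the recmaster, recover, and fetch on the recmaster. */
static uint32_t replay(bool bump_rsn)
{
	for (int i = 0; i < NUM_NODES; i++) {
		copies[i] = (struct record){ .rsn = 1, .value = 1 };
	}
	store_on_dmaster(1, 2, bump_rsn);
	recover();
	return copies[1].value;
}
```

In this model replay(false) yields the stale value 1, because node 0's identical RSN wins the tie, while replay(true) yields 2, because the bumped RSN makes the recmaster's copy strictly newer.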

11 years agoREADONLY: dont schedule for fast vacuum deletion if any of the readonly record flags...
Ronnie Sahlberg [Mon, 20 Feb 2012 19:54:09 +0000 (06:54 +1100)]
READONLY: dont schedule for fast vacuum deletion if any of the readonly record flags are set
(cherry picked from commit b3307d78fd15f446b423f8cdd1e403f89fbe8ac8)

11 years agoReadOnly: Make sure we dont try to fast-vacuum records that are set for readonly...
Ronnie Sahlberg [Mon, 20 Feb 2012 10:13:46 +0000 (21:13 +1100)]
ReadOnly: Make sure we dont try to fast-vacuum records that are set for readonly delegation
(cherry picked from commit 303134cf10a08ce61954d5de9025d9bbcb5f75ef)

11 years agoctdb_ltdb_store_server: when storing a record that is not to be scheduled for deletio...
Michael Adam [Thu, 7 Apr 2011 10:17:42 +0000 (12:17 +0200)]
ctdb_ltdb_store_server: when storing a record that is not to be scheduled for deletion, remove it from the delete queue

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
(cherry picked from commit 489148e465e2b8aed87ea836e3518f43490671ca)

11 years agovacuum: add ctdb_local_remove_from_delete_queue()
Michael Adam [Thu, 7 Apr 2011 10:17:16 +0000 (12:17 +0200)]
vacuum: add ctdb_local_remove_from_delete_queue()

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
(cherry picked from commit a5065b42a98c709173503e02d217f97792878625)

11 years agoNew version 1.2.61 ctdb-1.2.61
Amitay Isaacs [Fri, 5 Apr 2013 05:26:24 +0000 (16:26 +1100)]
New version 1.2.61

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agolockwait: Pass CTDB daemon PID on command line
Amitay Isaacs [Fri, 5 Apr 2013 04:31:26 +0000 (15:31 +1100)]
lockwait: Pass CTDB daemon PID on command line

In the lockwait helper process we cannot rely on getppid() to find the pid
of the CTDB daemon, as the daemon can go away before the helper executes. In
that case the helper process would hang around forever.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
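
A minimal sketch of the approach, with hypothetical helper names: the helper takes the daemon PID from its argument list and tests that specific process. If it used getppid() after the daemon had already exited, it would see whichever process adopted it, which typically never goes away.

```c
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>

/* Parse a PID passed as a command-line argument; -1 if malformed. */
static pid_t parse_daemon_pid(const char *arg)
{
	char *end;
	long v;

	errno = 0;
	v = strtol(arg, &end, 10);
	if (errno != 0 || end == arg || *end != '\0' || v <= 0) {
		return -1;
	}
	return (pid_t)v;
}

/* True while the given process exists; signal 0 only tests existence. */
static bool daemon_alive(pid_t pid)
{
	return kill(pid, 0) == 0 || errno == EPERM;
}
```

A helper built this way can poll daemon_alive() with the parsed PID and exit as soon as the daemon is gone.
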
11 years agorecoverd/takeover: Use IP->node mapping info from nodes hosting that IP
Amitay Isaacs [Fri, 5 Apr 2013 02:34:06 +0000 (13:34 +1100)]
recoverd/takeover: Use IP->node mapping info from nodes hosting that IP

When collating IP information for IP layout, only trust the nodes that are
hosting an IP, to have correct information about that IP.  Ignore what all the
other nodes think.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 1c7adbccc69ac276d2b957ad16c3802fdb8868ca)

11 years agostatd-callout: Make sure statd callout script always runs as root
Amitay Isaacs [Wed, 3 Apr 2013 03:44:08 +0000 (14:44 +1100)]
statd-callout: Make sure statd callout script always runs as root

In RHEL 6+, rpc.statd runs as "rpcuser" instead of root as on RHEL 5. This
prevents CTDB tool commands from talking to the daemon, since "rpcuser" cannot
access the CTDB socket.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
(cherry picked from commit fe8c4880b371492a38554868d4ca10918c54e412)

Conflicts:
packaging/RPM/ctdb.spec.in

11 years agoclient: Set the socket non-blocking only after connect succeeds
Amitay Isaacs [Mon, 18 Mar 2013 02:45:08 +0000 (13:45 +1100)]
client: Set the socket non-blocking only after connect succeeds

If the socket is set non-blocking before connect, then we should catch
EAGAIN errors and retry. Instead of adding an arbitrary number of retries,
it is better to wait for connect to succeed and then set the socket to
non-blocking.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 524ec206e6a5e8b11723f4d8d1251ed5d84063b0)
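
The ordering described above can be sketched like this. The function name connect_then_nonblock() is hypothetical and the code is a generic Unix domain socket client, not the ctdb client library itself.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Connect in blocking mode, then switch to non-blocking. Returns fd or -1. */
static int connect_then_nonblock(const char *path)
{
	struct sockaddr_un addr;
	int fd, flags;

	if (strlen(path) >= sizeof(addr.sun_path)) {
		return -1;
	}
	memset(&addr, 0, sizeof(addr));
	addr.sun_family = AF_UNIX;
	strcpy(addr.sun_path, path);

	fd = socket(AF_UNIX, SOCK_STREAM, 0);
	if (fd == -1) {
		return -1;
	}
	/* Still blocking here, so no EAGAIN retry loop is needed. */
	if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == -1) {
		close(fd);
		return -1;
	}
	/* Only now make the socket non-blocking for normal event-driven I/O. */
	flags = fcntl(fd, F_GETFL, 0);
	if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}
```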

11 years agocommon/messaging: Use the jenkins hash in ctdb_message
Volker Lendecke [Wed, 3 Apr 2013 12:59:21 +0000 (14:59 +0200)]
common/messaging: Use the jenkins hash in ctdb_message

This gives a better hash distribution
(cherry picked from commit f7f8bde2376f8180a0dca6d7b8d7d2a4a12f4bd8)

11 years agocommon/messaging: use tdb_parse_record in message_list_db_fetch
Volker Lendecke [Fri, 5 Apr 2013 02:11:31 +0000 (13:11 +1100)]
common/messaging: use tdb_parse_record in message_list_db_fetch

This avoids malloc/free in a hot code path.
(cherry picked from commit c137531fae8f7f6392746ce1b9ac6f219775fc29)

11 years agocommon/messaging: Abstract db related operations inside db functions
Amitay Isaacs [Wed, 3 Apr 2013 04:08:14 +0000 (15:08 +1100)]
common/messaging: Abstract db related operations inside db functions

This simplifies the use of message indexdb API and abstracts tdb related code
inside the API.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit bf7296ce9b98563bcb8426cd035dbeab6d884f59)

11 years agocommon/messaging: Don't forget to free the result returned by tdb_fetch()
Amitay Isaacs [Tue, 2 Apr 2013 05:57:51 +0000 (16:57 +1100)]
common/messaging: Don't forget to free the result returned by tdb_fetch()

This fixes a memory leak in the messaging code.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 20be1f991dd75c2333c9ec9db226432a819f57ba)

11 years agocommon/messaging: Free message list header if all message handlers are freed
Amitay Isaacs [Tue, 2 Apr 2013 01:08:39 +0000 (12:08 +1100)]
common/messaging: Free message list header if all message handlers are freed

This makes sure that even if the srvids are not deregistered, the header
structure is freed when the last message handler has been freed as a result of
client going away.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 4e1ec7412866f2d31c41de1bec0fbf788c03051b)

11 years agoNew version 1.2.60 ctdb-1.2.60
Amitay Isaacs [Mon, 25 Mar 2013 07:05:07 +0000 (18:05 +1100)]
New version 1.2.60

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agolockwait: Allow for zero length key requests
Amitay Isaacs [Thu, 14 Mar 2013 04:44:44 +0000 (15:44 +1100)]
lockwait: Allow for zero length key requests

Samba sends zero length key requests for the notify database. To support older
Samba behaviour for now, allow zero length key requests. A zero length key is
encoded as the string "NULL".

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
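
A sketch of such an encoding, under a stated assumption: non-empty keys are taken to be hex-encoded for the command line (suggested by the hex_decode_talloc() commit, but not confirmed here), and encode_key_arg() is a hypothetical name. Under that assumption the marker "NULL" cannot collide with a real key, since N, U and L are not hex digits.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Encode a key as one argument string; 0 on success, -1 if out is too small. */
static int encode_key_arg(const unsigned char *key, size_t len,
			  char *out, size_t outsize)
{
	if (len == 0) {
		/* Zero length key: use the literal marker instead of "". */
		if (outsize < sizeof("NULL")) {
			return -1;
		}
		strcpy(out, "NULL");
		return 0;
	}
	if (outsize == 0 || len > (outsize - 1) / 2) {
		return -1;
	}
	for (size_t i = 0; i < len; i++) {
		/* Two hex digits per byte; snprintf adds the terminator. */
		snprintf(out + 2 * i, 3, "%02x", (unsigned int)key[i]);
	}
	return 0;
}
```
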
11 years agolockwait: Pass all locking information on commandline to lockwait helper
Amitay Isaacs [Wed, 13 Mar 2013 06:05:00 +0000 (17:05 +1100)]
lockwait: Pass all locking information on commandline to lockwait helper

Simplify lockwait code by getting rid of the communication between ctdbd
and ctdb lockwait helper child by passing all the locking information
on command line.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agolockwait: Check result of lockwait child
Volker Lendecke [Tue, 12 Mar 2013 12:35:17 +0000 (13:35 +0100)]
lockwait: Check result of lockwait child

11 years agolockwait: fix a comment typo
Michael Adam [Wed, 13 Mar 2013 08:12:50 +0000 (09:12 +0100)]
lockwait: fix a comment typo

Signed-off-by: Michael Adam <obnox@samba.org>
11 years agoutil: Add hex_decode_talloc() to decode hex string into a binary blob
Amitay Isaacs [Wed, 13 Mar 2013 11:57:44 +0000 (22:57 +1100)]
util: Add hex_decode_talloc() to decode hex string into a binary blob

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 307416afda707b687f5e89e8438e45c154a4c806)

11 years agologging: Do not ignore stdout/stderr from the exec'd children
Amitay Isaacs [Wed, 13 Mar 2013 00:46:18 +0000 (11:46 +1100)]
logging: Do not ignore stdout/stderr from the exec'd children

To log debugging information from child processes that are started
with vfork and exec, do not set close_on_exec on STDOUT and STDERR for
that process.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 08c53ee609b80f87450a7a1d7dd24fbcdf5ab7bc)
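
In code, the rule amounts to skipping the two standard output descriptors when marking descriptors close-on-exec. The helper below is a hypothetical sketch of that rule, not the function changed by this commit.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Mark fd close-on-exec, except for stdout and stderr, which exec'd
 * children keep so that their output still reaches the log.
 * Returns 0 on success, -1 on error.
 */
static int set_close_on_exec_unless_stdio(int fd)
{
	int flags;

	if (fd == STDOUT_FILENO || fd == STDERR_FILENO) {
		return 0;
	}
	flags = fcntl(fd, F_GETFD);
	if (flags == -1) {
		return -1;
	}
	return fcntl(fd, F_SETFD, flags | FD_CLOEXEC);
}
```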

11 years agoNew Version 1.2.59 ctdb-1.2.59
Amitay Isaacs [Wed, 6 Mar 2013 06:48:44 +0000 (17:48 +1100)]
New Version 1.2.59

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agoctdbd: Exec lockwait helper for locking a record
Amitay Isaacs [Mon, 18 Feb 2013 07:05:28 +0000 (18:05 +1100)]
ctdbd: Exec lockwait helper for locking a record

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agoctdbd: Create a standalone helper for record locking
Amitay Isaacs [Mon, 18 Feb 2013 07:04:07 +0000 (18:04 +1100)]
ctdbd: Create a standalone helper for record locking

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
11 years agotevent: optimize adding new timer events
Stefan Metzmacher [Fri, 22 Feb 2013 11:45:39 +0000 (12:45 +0100)]
tevent: optimize adding new timer events

There're two cases:

1. Adding a timer with a zero timestamp.
   Such events were used before we had immediate events.
   It's likely that there're a lot of these events
   and we need to add new ones in fifo order.

2. Adding a timer with a real timestamp.
   As these timestamps typically get higher:-)
   it's better to traverse the existing list from
   the tail.

This is not completely optimal, but it should be better
than before.

Signed-off-by: Stefan Metzmacher <metze@samba.org>
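
The second case can be sketched with a plain sorted doubly-linked list. This is a simplified illustration with hypothetical types, not the tevent implementation, and it does not include any special handling for zero timestamps. Because new timestamps are usually the largest so far, the backwards walk normally stops at the first node it looks at.

```c
#include <stddef.h>
#include <stdint.h>

struct timer {
	uint64_t when;
	struct timer *prev, *next;
};

struct timer_list {
	struct timer *head, *tail;
};

/*
 * Insert te so the list stays sorted by 'when', searching from the tail.
 * Timers with equal timestamps keep their insertion (FIFO) order.
 */
static void timer_add(struct timer_list *l, struct timer *te)
{
	struct timer *cur = l->tail;

	while (cur != NULL && cur->when > te->when) {
		cur = cur->prev;
	}
	/* te goes right after cur, or at the head when cur is NULL. */
	te->prev = cur;
	te->next = (cur != NULL) ? cur->next : l->head;
	if (te->next != NULL) {
		te->next->prev = te;
	} else {
		l->tail = te;
	}
	if (cur != NULL) {
		cur->next = te;
	} else {
		l->head = te;
	}
}
```
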
11 years agocommon/io: For scheduling immediate events use tevent_schedule_immediate
Amitay Isaacs [Fri, 22 Feb 2013 01:59:39 +0000 (12:59 +1100)]
common/io: For scheduling immediate events use tevent_schedule_immediate

tevent_schedule_immediate() is much more efficient at handling events that need
to be processed immediately rather than creating timed events with
timeval_zero().

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Cherry-pick-from: 11734be353a1e246163eda631d35dfe55d1d6fb1

11 years agoctdbd: Add an index db for message list for faster searches
Amitay Isaacs [Thu, 21 Feb 2013 02:16:15 +0000 (13:16 +1100)]
ctdbd: Add an index db for message list for faster searches

When CTDB is busy with lots of smbd, CTDB was spending too much time in
daemon_check_srvids() which searches a list of srvids in the registered
message handlers.  Using a hash based index significantly improves the
performance of search in a linked list.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Cherry-pick-from: 3e09f25d419635f6dd679b48fa65370f7860be7d
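
To illustrate why an index helps, here is a sketch that substitutes a small in-memory chained hash table for the index database used by the commit. All names are hypothetical. A lookup inspects only the handlers that share a bucket with the requested srvid instead of walking every registered handler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_BUCKETS 64

struct handler {
	uint64_t srvid;
	struct handler *next;   /* chain within one bucket */
};

static struct handler *buckets[NUM_BUCKETS];

static size_t bucket_of(uint64_t srvid)
{
	/* Fold the high half in so srvids differing only there spread out. */
	return (size_t)((srvid ^ (srvid >> 32)) % NUM_BUCKETS);
}

static void handler_register(struct handler *h)
{
	size_t b = bucket_of(h->srvid);

	h->next = buckets[b];
	buckets[b] = h;
}

/* Walks a single bucket chain rather than the full handler list. */
static bool srvid_registered(uint64_t srvid)
{
	for (struct handler *h = buckets[bucket_of(srvid)]; h != NULL; h = h->next) {
		if (h->srvid == srvid) {
			return true;
		}
	}
	return false;
}
```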

11 years agotools/ctdb: delip no longer fails if IP can not be moved
Martin Schwenke [Wed, 27 Feb 2013 05:01:55 +0000 (16:01 +1100)]
tools/ctdb: delip no longer fails if IP can not be moved

Moving the IP is an optimisation so should not cause failure.

Refactor and simplify the retry-move-IP into new function
try_moveip().

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Cherry-pick-from: 5402f85dde045576cbaf64e01c68e28ed52204e8

11 years agorecoverd: Do not send "ipreallocated" event to stopped nodes
Martin Schwenke [Mon, 18 Feb 2013 05:32:14 +0000 (16:32 +1100)]
recoverd: Do not send "ipreallocated" event to stopped nodes

Stopped nodes will reject "ipreallocated" because they are in
recovery, so they will eventually be banned.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Cherry-pick-from: c270381ee81903ff459a8b23fd57c997d038cf14