ctdb.git
9 years ago New version 1.0.114.8 ctdb-1.0.114.8
Michael Adam [Wed, 25 Jun 2014 13:37:15 +0000 (15:37 +0200)]
New version 1.0.114.8

Signed-off-by: Michael Adam <obnox@samba.org>
9 years ago ctdbd: Avoid leaking file descriptor if talloc fails
Amitay Isaacs [Mon, 5 Aug 2013 07:38:42 +0000 (17:38 +1000)]
ctdbd: Avoid leaking file descriptor if talloc fails

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit d7f6bc3fed2dc61e6e587b4c0ec0ac27d533bbbe)
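
The general pattern behind this kind of fix can be sketched as follows. This is a minimal illustration, not the actual patch: `demo_conn`, `demo_conn_open()` and the allocator hook are hypothetical names, and plain `malloc` stands in for talloc so the example stays self-contained.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical object that owns a file descriptor. */
struct demo_conn {
	int fd;
};

/* Allocator hook (assumption) so the failure path can be exercised;
 * in CTDB the allocation would be a talloc call instead. */
typedef void *(*demo_alloc_fn)(size_t size);

/* Open path and wrap the descriptor in a demo_conn.
 * Returns NULL on failure without leaving the descriptor open. */
struct demo_conn *demo_conn_open(const char *path, demo_alloc_fn alloc)
{
	assert(path != NULL && alloc != NULL);

	int fd = open(path, O_RDONLY);
	if (fd == -1) {
		return NULL;
	}

	struct demo_conn *conn = alloc(sizeof(*conn));
	if (conn == NULL) {
		close(fd);	/* the step that prevents the leak */
		return NULL;
	}

	conn->fd = fd;
	return conn;
}

/* Allocator that always fails, for testing the error path. */
void *demo_alloc_fail(size_t size)
{
	(void)size;
	return NULL;
}
```

Passing `demo_alloc_fail` forces the allocation-failure branch; afterwards the lowest free descriptor number is unchanged, which is a simple way to confirm nothing leaked.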

9 years ago ouch, the ordering of the constants and the strings must be kept in sync
Ronnie Sahlberg [Mon, 30 Aug 2010 09:42:30 +0000 (19:42 +1000)]
ouch, the ordering of the constants and the strings must be kept in sync
manually and there is no check for errors. should fix this later

(cherry picked from commit e824af1a41f8ceec1edf6b3d1d6e1758fa00deb2)
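
One possible way to get a check for this kind of drift, sketched here under assumptions: the enum and string names below are illustrative and not CTDB's own, and the technique (designated initializers plus a C11 `static_assert`) is a suggestion rather than what the project later adopted.

```c
#include <assert.h>	/* static_assert (C11) */
#include <stddef.h>

/* Hypothetical event list; the names are illustrative only. */
enum demo_event {
	DEMO_EVENT_INIT,
	DEMO_EVENT_STARTUP,
	DEMO_EVENT_TAKE_IP,
	DEMO_EVENT_RELEASE_IP,
	DEMO_EVENT_MAX		/* must stay last */
};

/* Designated initializers tie each string to its constant, so
 * reordering the enum cannot silently shift the names. */
static const char *const demo_event_names[] = {
	[DEMO_EVENT_INIT]       = "init",
	[DEMO_EVENT_STARTUP]    = "startup",
	[DEMO_EVENT_TAKE_IP]    = "takeip",
	[DEMO_EVENT_RELEASE_IP] = "releaseip",
};

/* Breaks the build if a constant is appended without a string.
 * A gap in the middle is not caught here; it shows up as NULL. */
static_assert(sizeof(demo_event_names) / sizeof(demo_event_names[0])
	      == DEMO_EVENT_MAX,
	      "demo_event_names is out of sync with enum demo_event");

/* Returns the name for an event, or NULL if out of range or unnamed. */
const char *demo_event_to_string(enum demo_event ev)
{
	if ((unsigned)ev >= DEMO_EVENT_MAX) {
		return NULL;
	}
	return demo_event_names[ev];
}
```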

9 years ago recoverd: avoid triggering a full recovery if just some ip allocation
Ronnie Sahlberg [Mon, 10 Jan 2011 05:51:56 +0000 (16:51 +1100)]
recoverd: avoid triggering a full recovery if just some ip allocation
has failed.
We don't need to rebuild the databases in this situation, we just
need to try again to sort out the ip address allocations.

(cherry picked from commit 044c398ffea23d36ee033c8ddf07d11028197346)

9 years ago when checking that the interfaces exist in ctdb_add_public_address()
Ronnie Sahlberg [Wed, 21 Sep 2011 01:42:19 +0000 (11:42 +1000)]
when checking that the interfaces exist in ctdb_add_public_address()
we can't talloc off vnn since it is not yet initialized and might not always be NULL
(cherry picked from commit 3d37be3e2bfb61ede824028aeebaa18ba304faae)

9 years ago When we find an ip we shouldn't host, just release it
Ronnie Sahlberg [Wed, 20 Jun 2012 05:10:05 +0000 (15:10 +1000)]
When we find an ip we shouldn't host, just release it

Don't call a full-blown clusterwide ipreallocation, just release it locally
(cherry picked from commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e)

9 years ago recoverd: Fix spurious warnings when running with --nopublicipcheck
Amitay Isaacs [Wed, 4 Apr 2012 04:42:23 +0000 (14:42 +1000)]
recoverd: Fix spurious warnings when running with --nopublicipcheck

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 7f8096f56d8274151705ac822b582d972078f8fe)

9 years ago ctdbd: Stop takeovers and releases from colliding in mid-air
Martin Schwenke [Wed, 11 Jul 2012 04:46:07 +0000 (14:46 +1000)]
ctdbd: Stop takeovers and releases from colliding in mid-air

There's a race here where release and takeover events for an IP can
run at the same time.  For example, a "ctdb deleteip" and a takeover
initiated by the recovery daemon.  The timeline is as follows:

1. The release code registers a callback to update the VNN.  The
   callback is executed *after* the eventscripts run the releaseip
   event.

2. The release code calls the eventscripts for the releaseip event,
   removing IP from its interface.

   The takeover code "updates" the VNN saying that IP is on some
   iface.... even if/though the address is already there.

3. The release callback runs, removing the iface associated with IP in
   the VNN.

   The takeover code calls the eventscripts for the takeip event,
   adding IP to an interface.

As a result, CTDB doesn't think it should be hosting IP but IP is on
an interface.  The recovery daemon fixes this later... but it
shouldn't happen.

This patch can cause some additional noise in the logs:

  Release of IP 10.0.2.133/24 on interface eth2  node:2
  recoverd:We are still serving a public address '10.0.2.133' that we should not be serving. Removing it.
  Release of IP 10.0.2.133/24 rejected update for this IP already in flight
  recoverd:client/ctdb_client.c:2455 ctdb_control for release_ip failed
  recoverd:Failed to release local ip address

In this case the node has started releasing an IP when the recovery
daemon notices the address is still hosted and initiates another
release.  This noise is harmless but annoying.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit bfe16cf69bf2eee93c0d831f76d88bba0c2b96c2)
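
The "rejected update for this IP already in flight" message above suggests a per-address guard. A minimal sketch of such a guard, assuming a simplified state struct (`demo_vnn` and both function names are hypothetical, not CTDB's real types or API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for CTDB's per-address state. */
struct demo_vnn {
	const char *iface;	/* interface believed to host the IP, or NULL */
	bool update_in_flight;	/* a takeip/releaseip event is still running */
};

/* Try to begin a takeover or release for this address.
 * Returns 0 on success, -1 if another update has not finished yet. */
int demo_begin_update(struct demo_vnn *vnn)
{
	assert(vnn != NULL);
	if (vnn->update_in_flight) {
		return -1;	/* reject instead of racing the other event */
	}
	vnn->update_in_flight = true;
	return 0;
}

/* Completion callback: record the final interface and clear the guard. */
void demo_finish_update(struct demo_vnn *vnn, const char *iface)
{
	assert(vnn != NULL);
	vnn->iface = iface;
	vnn->update_in_flight = false;
}
```

With a guard like this, the second of two overlapping requests fails fast, which matches the extra log noise described in the commit message.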

9 years ago ctdbd: Fix ctdb_control_release_ip() on local daemons
Martin Schwenke [Mon, 2 Jul 2012 04:09:32 +0000 (14:09 +1000)]
ctdbd: Fix ctdb_control_release_ip() on local daemons

When running on local daemons no IPs are actually assigned to
interfaces.  Commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e broke
ctdb_control_release_ip() for local daemons because it asks the system
which interface the given IP is on, instead of the old behaviour of
trusting CTDB's internal records.

For local daemons (i.e. !ctdb->do_checkpublicip) revert to the old
behaviour of looking up the interface internally.  This is good
enough, given that the tests don't tend to misconfigure the addresses.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 38e8651b955afdbaf0ae87c24c55c052f8209290)

9 years ago When we release an ip, get the interface name from the kernel
Ronnie Sahlberg [Wed, 20 Jun 2012 00:08:11 +0000 (10:08 +1000)]
When we release an ip, get the interface name from the kernel

instead of using the interface where ctdb thinks the ip is hosted.
The difference is that this now allows us to handle cases where we want to release an ip but ctdbd does not know which interface the ip is assigned on
(e.g. the user has used 'ip addr add...' and manually assigned an ip to the wrong interface).
(cherry picked from commit c6bf22ba5c01001b7febed73dd16a03bd3fd2bed)

9 years ago ctdbd: Fix spurious warnings when running with --nopublicipcheck
Amitay Isaacs [Wed, 4 Apr 2012 04:42:56 +0000 (14:42 +1000)]
ctdbd: Fix spurious warnings when running with --nopublicipcheck

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 67b909a0718d6cfce82ffce0830da3a6ff1f6c4b)

9 years ago Interface monitoring: add an event to trigger every 30 seconds to check that all inter...
Ronnie Sahlberg [Tue, 6 Sep 2011 07:02:19 +0000 (17:02 +1000)]
Interface monitoring: add an event to trigger every 30 seconds to check that all interfaces referenced by the public address list actually exist.

This will make it much easier to root-cause problems such as
S1029023
when an external application deleted the interface while it is still in use by ctdbd.
(cherry picked from commit 9abf9c919a7e6789695490e2c3de56c21b63fa57)

9 years ago IP reallocation. If a public address is already hosted on the node when we start up...
Ronnie Sahlberg [Sun, 13 Mar 2011 22:55:28 +0000 (09:55 +1100)]
IP reallocation. If a public address is already hosted on the node when we start up, log a warning message but do not cause the recovery to fail.

CQ S1022356

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 89f8169c24da96c1fdd0ac19b8a1e0e1df01a72a)

9 years ago IPALLOCATION : If the node is held pinned down in "init" state
Ronnie Sahlberg [Wed, 12 Jan 2011 22:35:37 +0000 (09:35 +1100)]
IPALLOCATION : If the node is held pinned down in "init" state
by external services failing to start, or blocking CTDBD from finishing the startup phase,
we can encounter a situation where we have not yet fully initialized, but a
remote recovery master tries to release a certain ip clusterwide.

In this situation the node that is pinned down in init/startup phase
would fail to perform the release of the ip address since we are not yet fully operational and do not yet host any valid interfaces.

In this situation, we just need to remain unhealthy, there is no need to
also ban the node.

Remove the autobanning for this condition and just let the node remain in
unhealthy mode.
Banning is overkill in this situation when the system is broken and just
draws attention to ctdbd instead of the root cause.
(cherry picked from commit d8af74e4c4961deb94c18dde8ba7fc07e944729c)

9 years ago When adding an ip at runtime, it might not yet have an iface assigned to it, so ensur...
Ronnie Sahlberg [Tue, 1 Jun 2010 06:22:48 +0000 (16:22 +1000)]
When adding an ip at runtime, it might not yet have an iface assigned to it, so ensure that the next takeover_ip call will fall through to accept the ip and add it.
(cherry picked from commit 2d60f96680d16c2992e2a35517822f88c12538b7)

9 years ago add a new serverid to send a message every time an ip address is taken on the local...
Ronnie Sahlberg [Mon, 13 Sep 2010 05:42:00 +0000 (15:42 +1000)]
add a new serverid to send a message every time an ip address is taken on the local node

(cherry picked from commit 1261f3d9702800a4e59550c881350daf479f00ef)

Conflicts:

include/ctdb_protocol.h

9 years ago Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281
Martin Schwenke [Thu, 22 Mar 2012 04:27:25 +0000 (15:27 +1100)]
Undo damage done by d8d37493478a26c5f1809a5f3df89ffd6e149281

The implementation of DisableIPFailover got intermingled with
--nopublicipcheck.  This just looks wrong - Ronnie must have been
having a bad day.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 5083b266dd68b292c4275505f3d1b878dbf12f11)

Conflicts:

include/ctdb_private.h

9 years ago during ip allocation, there are failure modes where a node might hold an ip address
Ronnie Sahlberg [Fri, 3 Dec 2010 02:28:35 +0000 (13:28 +1100)]
during ip allocation, there are failure modes where a node might hold an ip address
but think it is still unassigned (-1).

add code to the recovery daemon to detect this case and trigger a reallocation
so that the ip gets covered

and change the takeip code to allow for this condition, taking on an ip address that is
already hosted.

cq s1021073
(cherry picked from commit 9020baf27cab7821c9094cda185206fb7af0fee7)

9 years ago Don't check remote ip allocation if public ip mgmt is disabled
Ronnie Sahlberg [Wed, 10 Nov 2010 03:46:05 +0000 (14:46 +1100)]
Don't check remote ip allocation if public ip mgmt is disabled
(cherry picked from commit 441ad00af842a8b7b5291de60d8ab08a064f5327)

9 years ago don't check the public ip assignment, or even whether we are hosting ips we shouldn't,
Ronnie Sahlberg [Wed, 10 Nov 2010 01:06:05 +0000 (12:06 +1100)]
don't check the public ip assignment, or even whether we are hosting ips we shouldn't,
when public ips have been disabled
(cherry picked from commit 7d07a74dc7f907ac757d20626f68e257d7ba16be)

9 years ago Don't check ip assignment across the cluster while ip-verification
Ronnie Sahlberg [Mon, 3 May 2010 05:52:02 +0000 (15:52 +1000)]
Don't check ip assignment across the cluster while ip-verification
checks are disabled
(cherry picked from commit 189f4a5af1053271b0834522e35c336df959aa03)

9 years ago Add a new tunable : DisableIPFailover that when set to non-0
Ronnie Sahlberg [Tue, 9 Nov 2010 04:19:06 +0000 (15:19 +1100)]
Add a new tunable : DisableIPFailover that when set to non-0
will stop any ip reallocations at all from happening.
(cherry picked from commit d8d37493478a26c5f1809a5f3df89ffd6e149281)

Conflicts:

server/ctdb_tunables.c

9 years ago Add a new event "ipreallocated"
Ronnie Sahlberg [Mon, 30 Aug 2010 08:08:38 +0000 (18:08 +1000)]
Add a new event "ipreallocated"
    This is called every time a reallocation is performed.

    While STARTRECOVERY/RECOVERED events are only called when
    we do ipreallocation as part of a full database/cluster recovery,
    this new event can be used to trigger on when we just do a light
    failover due to a node becoming unhealthy.

    I.e. situations where we do a failover but we do not perform a full
    cluster recovery.

    Use this to trigger for natgw so we select a new natgw master node
    when failover happens and not just when cluster rebuilds happen.
(cherry picked from commit 7f4c591388adae20e98984001385cba26598ec67)

Conflicts:

include/ctdb_protocol.h

9 years ago Add new command to find which interface an ip is located on
Ronnie Sahlberg [Wed, 20 Jun 2012 03:32:02 +0000 (13:32 +1000)]
Add new command to find which interface an ip is located on

(cherry picked from commit f07376309e70f5ccdb7de8453caacc71b451ab48)

Conflicts:

tools/ctdb.c

9 years ago Check interfaces: when reading the public addresses file to create the vnn list
Ronnie Sahlberg [Tue, 6 Sep 2011 06:11:00 +0000 (16:11 +1000)]
Check interfaces: when reading the public addresses file to create the vnn list
check that the actual interface exists; print an error and fail startup if the interface does not exist.
(cherry picked from commit cd33bbe6454b7b0316bdfffbd06c67b29779e873)

9 years ago ctdb:server: fix DEBUG message for wrong event script options.
Michael Adam [Thu, 5 Jun 2014 10:48:03 +0000 (12:48 +0200)]
ctdb:server: fix DEBUG message for wrong event script options.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay@samba.org>
Autobuild-Date(master): Thu Jun  5 19:51:36 CEST 2014 on sn-devel-104

(Imported from commit 4811cbea933c3665a08f9f0b37ad43afeace2b6c)

10 years ago Less verbosity when there is no public addresses file
Michael Adam [Thu, 24 Apr 2014 11:56:19 +0000 (13:56 +0200)]
Less verbosity when there is no public addresses file

This is partly cherry-picked from master commits:

8837daa424732aeb5a20814b1709c345a97a0e09
e646142f4d28b5401235cd5edee325f7a29f8193

Signed-off-by: Michael Adam <obnox@samba.org>
10 years ago New version 1.0.114.7 ctdb-1.0.114.7
Michael Adam [Wed, 4 Sep 2013 13:30:37 +0000 (15:30 +0200)]
New version 1.0.114.7

Signed-off-by: Michael Adam <obnox@samba.org>
10 years ago doc: The second half of monitoring is only for recovery master
Amitay Isaacs [Thu, 27 Jun 2013 07:46:43 +0000 (17:46 +1000)]
doc: The second half of monitoring is only for recovery master

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit fcd5e1f04c5fe6c98399429b8f0918b8779acba6)

10 years ago recoverd: when the recmaster is banned, use that information when forcing an election
Michael Adam [Wed, 26 Jun 2013 07:23:22 +0000 (09:23 +0200)]
recoverd: when the recmaster is banned, use that information when forcing an election

When we trigger an election because the recmaster considers itself inactive,
update our local nodemap with the recmaster's flags before calling
force_election(). This way, we don't send the inactive node freeze commands
(e.g.) that may fail and then lead to ourselves getting banned.

The theory is that this should help avoiding banning loops.

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 932360992b08a5483d90c0590218ba0fd756119e)

10 years ago recoverd: fix a comment typo
Michael Adam [Wed, 26 Jun 2013 05:11:51 +0000 (07:11 +0200)]
recoverd: fix a comment typo

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 741944f118e98f178b860194eecb215180949d18)

10 years ago recoverd: fix a comment in main_loop
Michael Adam [Fri, 21 Jun 2013 15:57:37 +0000 (17:57 +0200)]
recoverd: fix a comment in main_loop

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95)

10 years ago recoverd: eliminate some trailing spaces from ctdb_election_win()
Michael Adam [Fri, 21 Jun 2013 12:06:22 +0000 (14:06 +0200)]
recoverd: eliminate some trailing spaces from ctdb_election_win()

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit df30c0a05ed908fc2a997c56ff5484736b23b70f)

10 years ago recoverd: Don't continue if the current node gets banned
Martin Schwenke [Fri, 28 Jun 2013 06:31:07 +0000 (16:31 +1000)]
recoverd: Don't continue if the current node gets banned

A banned node cannot continue with recovery or with monitoring the cluster.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a)

10 years ago recoverd: Refactor code to ban misbehaving nodes
Amitay Isaacs [Fri, 28 Jun 2013 04:31:02 +0000 (14:31 +1000)]
recoverd: Refactor code to ban misbehaving nodes

Since we have nodemap information, there is no need to hardcode the
limit of 20.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>
(cherry picked from commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe)

Conflicts:

server/ctdb_recoverd.c

10 years ago recoverd: Move code to ban other nodes after we get local node flags
Amitay Isaacs [Thu, 27 Jun 2013 06:01:16 +0000 (16:01 +1000)]
recoverd: Move code to ban other nodes after we get local node flags

If a node gets banned first, then it should not ban other nodes.

This code was moved up in main_loop to avoid waiting for nodemap
from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795).

To prevent a banned node from banning other nodes, we need to first get
nodemap information from local node, so trying to ban other nodes can
fail if we are already banned.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit ae1693905036ecdbc4594fde1f12500faae4a554)

10 years ago recoverd: Delay the initial election if node is started in stopped state
Amitay Isaacs [Thu, 27 Jun 2013 05:44:27 +0000 (15:44 +1000)]
recoverd: Delay the initial election if node is started in stopped state

Since there is an early exit if a node is stopped or banned, we can wait till
the node becomes active to start initial election.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 593a17678fbd3109e118154b034d43b852659518)

10 years ago recoverd: Update capabilities only if the current node is active
Amitay Isaacs [Thu, 27 Jun 2013 05:33:49 +0000 (15:33 +1000)]
recoverd: Update capabilities only if the current node is active

Since we do an early return if a node is stopped or banned, move update
capabilities code below the early return and just before we check the
capabilities of current recovery master.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 93bcb6617e1024f810533e12390a572f51703ca0)

10 years ago recoverd: No need to check if node is recovery master when inactive
Amitay Isaacs [Thu, 27 Jun 2013 05:46:04 +0000 (15:46 +1000)]
recoverd: No need to check if node is recovery master when inactive

If a node is stopped or banned, it will cause early return from the
main_loop, so this check is redundant.  The election will be called by an
active node.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 815ddd3341b7e9db39e05a3a3fcd9a1420f053bc)

10 years ago recoverd: Always do an early exit from main_loop if node is stopped or banned
Amitay Isaacs [Thu, 27 Jun 2013 05:39:15 +0000 (15:39 +1000)]
recoverd: Always do an early exit from main_loop if node is stopped or banned

A stopped or banned node cannot do anything useful.  So do not participate
in any cluster activity and do not cause any unnecessary network traffic.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 2396981c4bcf30530aeb7f4395093cc202105b50)
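
The early exit described here can be sketched roughly as below. The flag values and function names are assumptions made for illustration; they are not copied from CTDB's headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative flag bits (assumed values, not CTDB's definitions). */
#define DEMO_NODE_FLAGS_BANNED  0x01u
#define DEMO_NODE_FLAGS_STOPPED 0x02u

/* A node that is stopped or banned cannot do anything useful. */
bool demo_node_is_inactive(uint32_t flags)
{
	return (flags & (DEMO_NODE_FLAGS_BANNED | DEMO_NODE_FLAGS_STOPPED)) != 0;
}

/* One pass of a hypothetical main loop.  Returns true if cluster
 * work was performed, false if the pass bailed out early. */
bool demo_main_loop_pass(uint32_t local_flags, unsigned *work_done)
{
	assert(work_done != NULL);

	if (demo_node_is_inactive(local_flags)) {
		/* No cluster activity and no network traffic. */
		return false;
	}

	(*work_done)++;	/* stand-in for elections, checks, recovery */
	return true;
}
```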

10 years ago recoverd: main_loop() should not verify local IPs if node is stopped
Martin Schwenke [Tue, 3 Jul 2012 00:30:29 +0000 (10:30 +1000)]
recoverd: main_loop() should not verify local IPs if node is stopped

Doing these checks is pointless and potentially causes unnecessary log
messages.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit a0c30c820fd47d4f8620dc060c825be10754f5d1)
(cherry picked from commit d181a5dadffacc5bfe04dcab6595b03499e613ad)

10 years ago recoverd: Do not set banning credits on a node if current node is inactive
Amitay Isaacs [Fri, 28 Jun 2013 04:10:47 +0000 (14:10 +1000)]
recoverd: Do not set banning credits on a node if current node is inactive

If the current node is banned or stopped, then it should not assign banning
credits to other nodes since the current node will not have up-to-date flags
of other nodes.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 38304f88e0c634e97d4687c25adef975f71537b8)

10 years ago banning: Do not come out of ban if databases are not frozen
Amitay Isaacs [Mon, 1 Jul 2013 07:40:36 +0000 (17:40 +1000)]
banning: Do not come out of ban if databases are not frozen

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit a60f228f8380f222f838eb619d2ab55f96f11ac2)

10 years ago banning: No need to check if banned pnn is for local node
Amitay Isaacs [Mon, 24 Jun 2013 04:33:32 +0000 (14:33 +1000)]
banning: No need to check if banned pnn is for local node

If the banned pnn is not the local node, the function returns early.
So no need for additional check.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 297d93cecc3c0655e72ecac38508e113bdbeab9c)

10 years ago banning: Make ctdb_local_node_got_banned() a void function
Amitay Isaacs [Fri, 28 Jun 2013 04:04:18 +0000 (14:04 +1000)]
banning: Make ctdb_local_node_got_banned() a void function

When this function is called, we are already committed to banning
and there is no point in failing this function.  In case freezing of
databases fails, it will be fixed by the recovery daemon.
(cherry picked from commit bb178338658b4ae32382a1f62f7c21cee1d4878f)

10 years ago recoverd: Also check if current node is in recovery when it is banned
Amitay Isaacs [Fri, 28 Jun 2013 04:02:44 +0000 (14:02 +1000)]
recoverd: Also check if current node is in recovery when it is banned

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8)

10 years ago recoverd: Set node_flags information as soon as we get nodemap
Amitay Isaacs [Fri, 28 Jun 2013 04:09:35 +0000 (14:09 +1000)]
recoverd: Set node_flags information as soon as we get nodemap

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 8d622660a14c929e365d306147b378ea6ab92175)

10 years ago recoverd: Remove old comment as the code corresponding to that has gone away
Amitay Isaacs [Wed, 26 Jun 2013 06:02:23 +0000 (16:02 +1000)]
recoverd: Remove old comment as the code corresponding to that has gone away

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 34af2cdf686d5d77854cbaa7bbcd8f878e9171c7)

10 years ago banning: Log ban state changes for other nodes at higher debug level
Amitay Isaacs [Mon, 24 Jun 2013 04:31:50 +0000 (14:31 +1000)]
banning: Log ban state changes for other nodes at higher debug level

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit c6f8407648abb37f2ed781afa5171dad8c9f59e9)

10 years ago freeze: Make ctdb_start_freeze() a void function
Amitay Isaacs [Mon, 1 Jul 2013 06:28:04 +0000 (16:28 +1000)]
freeze: Make ctdb_start_freeze() a void function

If this function fails due to memory errors, there is no way to recover.
The best course of action is to abort.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 46efe7a886f8c4c56f19536adc98a73c22db906a)

Conflicts:

server/ctdb_freeze.c

10 years ago freeze: If priority is invalid here, it's time to abort
Amitay Isaacs [Mon, 1 Jul 2013 06:21:00 +0000 (16:21 +1000)]
freeze: If priority is invalid here, it's time to abort

ctdb_start_freeze() is called from ctdb_control_freeze(), which fixes the
priority if it's 0 and returns an error if it's invalid.  Other callers of
ctdb_start_freeze() are internal to CTDB.  So if priority is invalid in
ctdb_start_freeze(), something is definitely seriously wrong.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 87716e8f504d659515d3dbcf93badbf106873bc8)

Conflicts:

server/ctdb_freeze.c

10 years ago freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()
Amitay Isaacs [Mon, 1 Jul 2013 03:26:33 +0000 (13:26 +1000)]
freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()

This ensures that whenever databases are frozen either via sending
control or by calling ctdb_start_freeze(), the action is logged.
Since ctdb_control_freeze() calls ctdb_start_freeze(), move logging of
message in early return condition if databases are already frozen.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 478e24bceda3fedfba54ccb48faa115df726b819)

10 years ago recoverd: Print banning message only after verifying pnn
Amitay Isaacs [Mon, 24 Jun 2013 04:18:58 +0000 (14:18 +1000)]
recoverd: Print banning message only after verifying pnn

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 4be8dff3a4451192f838497b4747273685959bed)

10 years ago recoverd: When updating flags on nodes, send updated flags and not old flags
Amitay Isaacs [Wed, 26 Jun 2013 05:22:46 +0000 (15:22 +1000)]
recoverd: When updating flags on nodes, send updated flags and not old flags

This was broken by commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa.
Instead of a SRVID_SET_NODE_FLAGS message to recovery daemon, a control
was sent to the local daemon which in turn informed the recovery daemon.
And while doing this change old flags were sent via CONTROL_MODIFY_FLAGS.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc)

10 years ago vacuum: Reduce the priority of non-critical error
Amitay Isaacs [Fri, 24 May 2013 08:07:39 +0000 (18:07 +1000)]
vacuum: Reduce the priority of non-critical error

Since the complete database is not locked when the receive_records
control is received, it's possible that we may not be able to obtain
lock on a chain.  We will try again to store this record.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 32723c9efdad1c6ca4aa53f308ccd9bef1aadfff)

10 years ago ctdbd: remove a nonempty blank line
Michael Adam [Fri, 17 May 2013 09:01:31 +0000 (11:01 +0200)]
ctdbd: remove a nonempty blank line

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit d9e24782a90d9ce29c0e6584b75d2b186142174d)

10 years ago ctdbd: update comment describing ctdb_call_send_redirect()
Michael Adam [Fri, 17 May 2013 09:00:32 +0000 (11:00 +0200)]
ctdbd: update comment describing ctdb_call_send_redirect()

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 9a21d417c51fb9cad8f2e87e00ca54d379aef860)

Conflicts:

server/ctdb_call.c

10 years ago recoverd: Clarify some misleading log messages
Martin Schwenke [Thu, 11 Oct 2012 04:59:00 +0000 (15:59 +1100)]
recoverd: Clarify some misleading log messages

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 14589bf7c16ba017fe00d4e8bea8cc501546c60f)

10 years ago recoverd: All inactive nodes should yield recovery master role
Martin Schwenke [Fri, 6 Jul 2012 10:43:46 +0000 (20:43 +1000)]
recoverd: All inactive nodes should yield recovery master role

Not just stopped nodes.  In reality, this means that banned nodes will
also yield, since nodes in the other inactive states won't be running
a daemon.

This seems sensible since if another node notices that an inactive
node is the recovery master then it will force an election anyway.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit fc18188b7b63eb0dafbc47e3abf80e306e1dfc31)

10 years ago Clean up warnings: remove changed_flags in monitor_helper
Martin Schwenke [Wed, 9 Nov 2011 03:45:01 +0000 (14:45 +1100)]
Clean up warnings: remove changed_flags in monitor_helper

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit 3e4fa518f02db75e4e4a7f326a71df226913f8a8)

10 years ago speed startup: alter recovery loop
Rusty Russell [Tue, 22 Jun 2010 13:20:23 +0000 (22:50 +0930)]
speed startup: alter recovery loop

We do a recovery on startup.  But the code does:
   Sleep for ctdb->tunable.recover_interval.
   Check for recovery.

We want to do it in the other order.  This is best done by extracting
the loop into a separate "main_loop" function.

Seconds between ctdbd first log message and node healthy:
BEFORE: 24.09
AFTER: 23.58

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from commit 097046025176b9fcb670839d1a9f100f890e7ed2)
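
The reordering described in this commit can be illustrated with a small trace. This is only a sketch: the trace struct and the `demo_` functions are hypothetical stand-ins, with 'S' marking a sleep of `recover_interval` and 'C' marking one recovery check.

```c
#include <assert.h>
#include <stddef.h>

/* Records the order of steps so the two loop shapes can be compared. */
struct demo_trace {
	char steps[32];
	size_t len;
};

static void demo_record(struct demo_trace *t, char step)
{
	assert(t != NULL);
	if (t->len < sizeof(t->steps)) {
		t->steps[t->len++] = step;
	}
}

/* Stand-in for sleeping for the recover interval. */
static void demo_sleep(struct demo_trace *t) { demo_record(t, 'S'); }

/* Stand-in for one recovery check (the extracted "main_loop"). */
static void demo_main_loop(struct demo_trace *t) { demo_record(t, 'C'); }

/* Old shape: the first check waits for a full sleep interval. */
void demo_monitor_sleep_first(struct demo_trace *t, int passes)
{
	for (int i = 0; i < passes; i++) {
		demo_sleep(t);
		demo_main_loop(t);
	}
}

/* New shape: check immediately at startup, then sleep. */
void demo_monitor_check_first(struct demo_trace *t, int passes)
{
	for (int i = 0; i < passes; i++) {
		demo_main_loop(t);
		demo_sleep(t);
	}
}
```

Extracting the body into its own function is what makes the swap trivial: only the order of two calls inside the loop changes.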

10 years ago rename ctdb_set_message_handler to ctdb_client_set_message_handler
Ronnie Sahlberg [Tue, 1 Jun 2010 23:51:47 +0000 (09:51 +1000)]
rename ctdb_set_message_handler to ctdb_client_set_message_handler
to avoid a collision with the function of the same name in libctdb
(cherry picked from commit 41dbdd4fc0ab560420fb0e24a3179ff7c94c5bb7)

Conflicts:

include/ctdb_client.h
tests/src/ctdb_fetch.c

10 years ago Added some #ifndefs to stop files being included multiple times.
Martin Schwenke [Fri, 11 Nov 2011 01:41:24 +0000 (12:41 +1100)]
Added some #ifndefs to stop files being included multiple times.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit fdca12c25e6fce6206135b994dedf44265e4eb09)

10 years ago The recent change to the recovery daemon to keep track of and
Ronnie Sahlberg [Wed, 28 Apr 2010 05:43:11 +0000 (15:43 +1000)]
The recent change to the recovery daemon to keep track of and
verify that all nodes agree on the most recent ip address assignments
broke "ctdb moveip ..." since that call would never trigger
a full takeover run and thus would immediately trigger an inconsistency.

Add a new message to the recovery daemon where we can tell the recovery daemon to update its assignments.

BZ62782
(cherry picked from commit e7069082e5f0380dcddee247db8754218ce18cab)

10 years ago Fix the build after backporting f3bf2ab61f8dbbc806ec23a68a87aaedd458e712.
Michael Adam [Wed, 21 Aug 2013 07:16:47 +0000 (09:16 +0200)]
Fix the build after backporting f3bf2ab61f8dbbc806ec23a68a87aaedd458e712.

This patch (keeping track of public IP assignment in recovery daemon)
which was backported to 1.0.114 as 9640e2bb889bd99389d9fb247191a19785a75104
renamed "struct _trbt_tree_t" to "struct trbt_tree".

In master, this patch came before the introduction of the delete queue
to the db context. So in the 1.0.114 branch we need to fix up afterwards.

Signed-off-by: Michael Adam <obnox@samba.org>
10 years ago In the recovery daemon, keep track of which node we have assigned public ip
Ronnie Sahlberg [Thu, 8 Apr 2010 04:07:57 +0000 (14:07 +1000)]
In the recovery daemon, keep track of which node we have assigned public ip
addresses and verify that the remote nodes have/keep a consistent view of
assigned addresses.

If a remote node has an inconsistent view of addresses vis-a-vis the recovery
master this will trigger a full ip reallocation.
(cherry picked from commit f3bf2ab61f8dbbc806ec23a68a87aaedd458e712)

Conflicts:

include/ctdb_private.h

10 years ago vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)
Amitay Isaacs [Mon, 12 Aug 2013 05:50:30 +0000 (15:50 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)

This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster.  This makes a request for
that record bounce between nodes endlessly.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit f0853013655ac3bedf1b793de128fb679c6db6c6)

Conflicts:

server/ctdb_recover.c

10 years ago vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)
Amitay Isaacs [Mon, 12 Aug 2013 05:51:00 +0000 (15:51 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)

This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster.  This makes a request for
that record bounce between nodes endlessly.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit a610bc351f0754c84c78c27d02f9a695e60c5b0f)

10 years ago Set FD_CLOEXEC for epoll file descriptors
Sumit Bose [Wed, 10 Aug 2011 15:14:40 +0000 (17:14 +0200)]
Set FD_CLOEXEC for epoll file descriptors

Don't leak file descriptors.
This showed up as selinux AVCs on RHEL:
https://bugzilla.redhat.com/show_bug.cgi?id=728545

Reviewed-by: Michael Adam <obnox@samba.org>
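
Marking an epoll descriptor close-on-exec might look like the following on Linux. This is a sketch of the general technique rather than the patch itself; `demo_epoll_create_cloexec()` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Create an epoll descriptor that is closed automatically on exec(),
 * so child processes such as event scripts do not inherit it.
 * Returns the descriptor, or -1 on failure.  Linux-only. */
int demo_epoll_create_cloexec(void)
{
	/* The size argument is only a hint but must be greater than 0. */
	int fd = epoll_create(64);
	if (fd == -1) {
		return -1;
	}

	int flags = fcntl(fd, F_GETFD);
	if (flags == -1 || fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) {
		close(fd);	/* do not leak the descriptor on error */
		return -1;
	}

	return fd;
}
```

On kernels that provide it, `epoll_create1(EPOLL_CLOEXEC)` sets the flag atomically at creation time and avoids the window between the two calls.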
10 years ago Print deleted nodes as well
Sumit Bose [Mon, 19 Nov 2012 17:45:37 +0000 (18:45 +0100)]
Print deleted nodes as well

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 0930a3b806977555509c3228726e2250aef1f971)

Conflicts:

tools/ctdb.c

10 years ago IPv6 neighbor solicit cleanup
Sumit Bose [Thu, 1 Sep 2011 13:18:46 +0000 (15:18 +0200)]
IPv6 neighbor solicit cleanup

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit a81edf7eb908659a379f0cb55fd5d04551dc2c37)

10 years ago Fix memory leak in ctdb_send_message()
Sumit Bose [Mon, 19 Nov 2012 10:13:03 +0000 (11:13 +0100)]
Fix memory leak in ctdb_send_message()

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit da87395d29f5d11ecfedaf36b53fa060a9140bfd)

10 years ago tdb: Fix Coverity ID 2192: NO_EFFECT
Volker Lendecke [Sun, 27 Mar 2011 19:43:53 +0000 (21:43 +0200)]
tdb: Fix Coverity ID 2192: NO_EFFECT

(ret < 0) can never be true
(cherry picked from commit 25397de589e577e32bb291576b10c18978b5bc4e)

10 years ago Fixes for various issues found by Coverity
Sumit Bose [Wed, 10 Aug 2011 15:53:56 +0000 (17:53 +0200)]
Fixes for various issues found by Coverity

Corresponds to commit 05bfdbbd0d4abdfbcf28e3930086723508b35952 from master.

10 years ago When memory allocations for recovery fail,
Ronnie Sahlberg [Fri, 3 Sep 2010 01:58:27 +0000 (11:58 +1000)]
When memory allocations for recovery fail,
don't dereference a null pointer while trying to print the log message for the failure.

also shutdown ctdb with ctdb_fatal()
(cherry picked from commit f8642d0438c6bbb34a72c25d6a904b626e247410)

10 years ago idtree: fix overflow for v. large ids on allocation and removal
Rusty Russell [Mon, 6 Dec 2010 03:22:38 +0000 (13:52 +1030)]
idtree: fix overflow for v. large ids on allocation and removal

(Imported from SAMBA commit 09a6538969ac).

Chris Cowan tracked down a SEGV in sub_alloc: idp->level can actually
be equal to 7 (MAX_LEVEL) there, as it can be in sub_remove.

(We unfairly blamed a shift of a signed var for this crash in commit
 2db1987f5a3a).

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(cherry picked from commit 73764104356d3738d9d20a9d06ce51535f74f475)

10 years ago idtree: fix right shift of signed ints, crash on large ids on AIX
Rusty Russell [Tue, 5 Oct 2010 02:36:19 +0000 (13:06 +1030)]
idtree: fix right shift of signed ints, crash on large ids on AIX

Right-shifting signed integers is undefined; indeed it seems that on
AIX with their compiler, doing a 30-bit shift on (INT_MAX-200) gives
0, not 1 as we might expect.

The obvious fix is to make id and oid unsigned: l (level count) is also
logically unsigned.

(Note: Samba doesn't generally get to ids > 1 billion, but ctdb does)

Reported-by: Chris Cowan <cc@us.ibm.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Autobuild-User: Rusty Russell <rusty@samba.org>
Autobuild-Date: Wed Oct  6 08:31:09 UTC 2010 on sn-devel-104
(cherry picked from commit 2db1987f5a3a4268ce64fe570ff598e3bf4ecc73)
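
A small sketch of the idea behind making the id unsigned before shifting. The helper name `shift_id` is hypothetical and is not taken from idtree.c; it only illustrates that an unsigned right shift by fewer bits than the type's width has a fully defined result in C.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: shift an id right by `bits`.
 * With an unsigned operand the result is well defined as long as the
 * shift count is smaller than the width of the type, so that is
 * checked first.
 */
static unsigned int shift_id(unsigned int id, unsigned int bits)
{
	assert(bits < sizeof(id) * CHAR_BIT);
	return id >> bits;
}
```

For the value mentioned above, `shift_id((unsigned int)INT_MAX - 200u, 30)` yields 1 on platforms with a 32-bit `int`.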

10 years ago Check return value of tdb_delete()
Sumit Bose [Mon, 19 Nov 2012 10:20:31 +0000 (11:20 +0100)]
Check return value of tdb_delete()

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 5cdcc3d45d358ddbcd7e864898eed9cbd9935429)

10 years ago New version 1.0.114.6 ctdb-1.0.114.6
Michael Adam [Fri, 26 Apr 2013 15:22:16 +0000 (17:22 +0200)]
New version 1.0.114.6

10 years ago vacuum: Update (C)
Michael Adam [Fri, 22 Feb 2013 15:12:17 +0000 (16:12 +0100)]
vacuum: Update (C)

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 61264debba58355b9716ac1637fdedef5ed249c8)

10 years ago vacuum: extend the header comment for ctdb_process_delete_list()
Michael Adam [Sat, 29 Dec 2012 16:23:27 +0000 (17:23 +0100)]
vacuum: extend the header comment for ctdb_process_delete_list()

Describe the (new) process more precisely.
And mention that it is the last step of the vacuuming process
that is performed on the lmaster.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 06de786c786f1cab4c6721adf47c2cb1e8a72adb)

10 years ago vacuum: turn the vacuuming on lmaster into a three-phase process.
Michael Adam [Sat, 5 Jan 2013 00:20:18 +0000 (01:20 +0100)]
vacuum: turn the vacuuming on lmaster into a three-phase process.

More precisely, before locally deleting an empty record that has been
migrated with data and that we are dmaster and lmaster for, we now perform
the deletion on the other nodes in two steps instead of a single step.

- First send out the list of records to be deleted to all
  other nodes with the new RECEIVE_RECORDS control to store
  the lmaster's current empty copy.
- Then send those records that could be deleted on all nodes
  to all nodes again with the TRY_DELETE_RECORDS control
  as before for deletion.
- Finally delete those records locally that were successfully
  deleted remotely in the previous step.

This fixes an old race where a recovery that hits the vacuum process
square between the eyes can create gaps in the record's history and
hence let the records resurrect. In the case of the locking.tdb,
that could mean that a file that was already closed, was recorded as
being open and locked again, so samba clients were locked out of that
file until samba was restarted.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit eee23d44b6427be8ab49bbfcee3abb62f37dfcc7)
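
The three phases described above can be sketched as a small simulation. Everything here is an assumption made for illustration: the `control_fn` callback stands in for sending a control (RECEIVE_RECORDS or TRY_DELETE_RECORDS) to one remote node, the fixed node and record counts are arbitrary, and the two `demo_*` stubs fake remote failures. It is not the code in ctdb_vacuum.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_REMOTE  2 /* hypothetical number of remote nodes */
#define NUM_RECORDS 3 /* hypothetical number of delete candidates */

/* Stand-in for sending one control for one record to one remote node. */
typedef bool (*control_fn)(int node, int record);

/* Keep only the candidates that every remote node handled successfully. */
static void filter_by_all_nodes(bool keep[NUM_RECORDS], control_fn send)
{
	for (int r = 0; r < NUM_RECORDS; r++) {
		for (int n = 0; keep[r] && n < NUM_REMOTE; n++) {
			if (!send(n, r)) {
				keep[r] = false;
			}
		}
	}
}

/* Returns the number of records that end up deleted locally. */
static size_t vacuum_three_phase(bool keep[NUM_RECORDS],
				 control_fn receive_records,
				 control_fn try_delete_records)
{
	size_t deleted = 0;

	assert(receive_records != NULL && try_delete_records != NULL);

	/* Phase 1: remote nodes store the lmaster's empty copy. */
	filter_by_all_nodes(keep, receive_records);
	/* Phase 2: remote nodes try to delete the surviving records. */
	filter_by_all_nodes(keep, try_delete_records);
	/* Phase 3: delete locally only what was deleted everywhere. */
	for (int r = 0; r < NUM_RECORDS; r++) {
		if (keep[r]) {
			deleted++;
		}
	}
	return deleted;
}

/* Demo stub: node 1 fails to store record 1. */
static bool demo_receive(int node, int record)
{
	return !(node == 1 && record == 1);
}

/* Demo stub: node 0 fails to delete record 2. */
static bool demo_try_delete(int node, int record)
{
	return !(node == 0 && record == 2);
}
```

With the demo stubs, only record 0 passes both remote phases, so it is the only one deleted locally.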

10 years ago vacuum: introduce the RECEIVE_RECORDS control
Michael Adam [Thu, 20 Dec 2012 23:24:47 +0000 (00:24 +0100)]
vacuum: introduce the RECEIVE_RECORDS control

This is in preparation for turning the vacuuming on the lmaster into
a two phase process:

- First the node sends the list of records to be vacuumed
  to all other nodes with this new RECEIVE_RECORDS control.
  The remote nodes should store the lmaster's empty current copy.
- Only those records that could be stored on all other nodes
  are processed further. They are sent to all other nodes with
  the TRY_DELETE_RECORDS control as before for deletion.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit e397702e271af38204fd99733bbeba7c1db3a999)

Conflicts:

include/ctdb_protocol.h
server/ctdb_control.c

10 years ago vacuum: reorder some of ctdb_process_delete_list() more intuitively
Michael Adam [Sat, 29 Dec 2012 17:32:39 +0000 (18:32 +0100)]
vacuum: reorder some of ctdb_process_delete_list() more intuitively

Now that the nodemap and its talloc children don't hang off of the
delete_records_list talloc context, we can build the nodemap
earlier, and move the construction of the delete_records_list
to where it is more obvious what it is used for.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit e3740899c1af6962f93c85ad7d1cb71bddce45c6)

10 years ago vacuum: add explicit temporary memory context to ctdb_process_delete_list()
Michael Adam [Sat, 29 Dec 2012 16:16:33 +0000 (17:16 +0100)]
vacuum: add explicit temporary memory context to ctdb_process_delete_list()

This removes the implicit artificial talloc hierarchy and makes the
code easier to understand.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit b7c3b8cdf92c597e621e3dae28b110d321de5ea8)

10 years ago vacuum: fix indentation in ctdb_process_delete_list()
Michael Adam [Sat, 5 Jan 2013 00:19:06 +0000 (01:19 +0100)]
vacuum: fix indentation in ctdb_process_delete_list()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 59a887e12469266e514ad7d4e34810e7ea888ba3)

10 years ago vacuum: free temporary allocated memory correctly in ctdb_process_delete_list().
Michael Adam [Mon, 17 Dec 2012 16:31:55 +0000 (17:31 +0100)]
vacuum: free temporary allocated memory correctly in ctdb_process_delete_list().

Add a common exit point for cleanup.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 11d728465a9c635e1829abaae17e2f7720433b69)
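
A generic sketch of the "common exit point for cleanup" pattern in C. It uses plain `malloc()`/`free()` as a stand-in for a temporary talloc context, and the function name `process_list` and its `fail_second` switch are hypothetical; the actual ctdb_process_delete_list() is not reproduced here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical worker that needs two temporary buffers.
 * fail_second simulates a failing second allocation.
 * Returns 0 on success, -1 on failure; every path leaves via "done",
 * so the temporaries are released exactly once.
 */
static int process_list(size_t n, int fail_second)
{
	int ret = -1;
	char *nodes = NULL;
	char *records = NULL;

	assert(n > 0);

	nodes = malloc(n);
	if (nodes == NULL) {
		goto done;
	}

	records = fail_second ? NULL : malloc(n);
	if (records == NULL) {
		goto done;
	}

	memset(nodes, 0, n);
	memset(records, 0, n);
	ret = 0;

done:
	/* free(NULL) is a no-op, so partially built state is fine here. */
	free(records);
	free(nodes);
	return ret;
}
```

With talloc, the same shape would typically free a single temporary context at the exit label instead of each buffer.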

10 years ago vacuum: move variable into scope of use in ctdb_process_delete_list()
Michael Adam [Mon, 17 Dec 2012 16:26:22 +0000 (17:26 +0100)]
vacuum: move variable into scope of use in ctdb_process_delete_list()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 3710dd0f313f551f1b302b4961e0203243e3d661)

10 years ago vacuum: move variable into scope of use in ctdb_process_delete_list()
Michael Adam [Mon, 17 Dec 2012 12:07:21 +0000 (13:07 +0100)]
vacuum: move variable into scope of use in ctdb_process_delete_list()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 4640979b526b6dac69a6a0555bfce75fe0206dac)

10 years ago vacuum: simplify ctdb_process_delete_list(): reduce indentation
Michael Adam [Mon, 17 Dec 2012 12:03:42 +0000 (13:03 +0100)]
vacuum: simplify ctdb_process_delete_list(): reduce indentation

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit f3e6e7f8ef22bd70dd2f101d818e2e5ab5ed3cd8)

Conflicts:

server/ctdb_vacuum.c

10 years ago vacuum: add DEBUG to skip conditions in delete_record_traverse()
Michael Adam [Wed, 3 Apr 2013 12:12:27 +0000 (14:12 +0200)]
vacuum: add DEBUG to skip conditions in delete_record_traverse()

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 817c77a3d0a3546bf46389cec5f6b54778dd1693)

Conflicts:

server/ctdb_vacuum.c

10 years ago client: fix ctdb_control() to be able to cope with CTDB_CTRL_FLAG_NOREPLY
Michael Adam [Mon, 22 Apr 2013 14:21:02 +0000 (10:21 -0400)]
client: fix ctdb_control() to be able to cope with CTDB_CTRL_FLAG_NOREPLY

This was apparently not used before in this context, and the bug hence
not detected. It becomes necessary when ctdb_local_schedule_for_deletion()
is called from a client ctdbd (the vacuuming child), hence needs to send
the SCHEDULE_FOR_DELETION control to its parent.

Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>

Signed-off-by: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit e72a5e11845fe445baaee4730bb0bea8588ee9e3)

10 years ago ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more
Michael Adam [Wed, 3 Apr 2013 10:02:59 +0000 (12:02 +0200)]
ctdb_call: don't bump the rsn in ctdb_become_dmaster() any more

This is now done in ctdb_ltdb_store_server(), so this
extra bump can be spared.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit cad3107b12e8392f786f9a758ee38cf3a3d58538)

10 years ago Fix a severe recovery bug that can lead to data corruption for SMB clients.
Michael Adam [Wed, 3 Apr 2013 09:40:25 +0000 (11:40 +0200)]
Fix a severe recovery bug that can lead to data corruption for SMB clients.

Problem:
Recovery can under certain circumstances lead to old record copies
resurrecting: Recovery selects the newest record copy purely by RSN. At
the end of the recovery, the recovery master is the dmaster for all
records in all (non-persistent) databases. And the other nodes locally
hold the complete copy of the databases. The bug is that the recovery
process does not increment the RSN on the recovery master at the end of
the recovery. Now clients acting directly on the Recovery master will
directly change a record's content on the recmaster without migration
and hence without RSN bump.  So a subsequent recovery can not tell that
the recmaster's copy is newer than the copies on the other nodes, since
their RSN is the same. Hence, if the recmaster is not node 0 (or more
precisely not the active node with the lowest node number), the recovery
will choose copies from nodes with lower number and stick to these.

Here is how to reproduce:

- assume we have a cluster with at least 2 nodes
- ensure that the recmaster is not node 0
  (maybe ensure with "onnode 0 ctdb setrecmasterrole off")
  say recmaster is node 1
- choose a new database name, say "test1.tdb"
  (make sure it is not yet attached as persistent)
- choose a key name, say "key1"
- all cluster nodes should be ok and no recovery running
- now do the following on node 1:

1. dbwrap_tool test1.tdb store key1 uint32 1
2. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 1
3. ctdb recover
4. dbwrap_tool test1.tdb store key1 uint32 2
5. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 2
6. ctdb recover
7. dbwrap_tool test1.tdb fetch key1 uint32
   ==> 1
   ==> BUG

This is a very severe bug, since when applied to Samba's locking.tdb
database, it means that for SMB clients on clustered Samba there is
the potential for locking out oneself from previously opened files
or even worse, data corruption:

Case 1: locking out

- client on recmaster opens file
- recovery propagates open file handle (entry in locking.tdb) to
  other nodes
- client closes file
- client opens the same file
- recovery resurrects old copy of open file record in locking.tdb
  from lower node
- client closes file but fails to delete entry in locking.tdb
- client tries to open same file again but fails, since
  the old record locks it out (since the client is still connected)

Case 2: data corruption

- client1 on recmaster opens file
- recovery propagates open file info to other nodes
- client1 closes the file and disconnects
- client2 opens the same file
- recovery resurrects old copy of locking.tdb record,
  where client2 has no entry, but client1 has.
- but client2 believes it still has a handle
- client3 opens the file and succeeds without
  conflicting with client2
  (the detached entry for client1 is discarded because
   the server does not exist any more).
=> both client2 and client3 believe they have exclusive
  access to the file and writing creates data corruption

Fix:

When storing a record on the dmaster, bump its RSN.

The ctdb_ltdb_store_server() is the central function for storing
a record to a local tdb from the ctdbd server context.
So this is also the place where the RSN of the record to be stored
should be incremented, when storing on the dmaster.

For the case of the record migration, this is currently done in
ctdb_become_dmaster() in ctdb_call.c, but there are other places
such as in recovery, where we should bump the RSN, but currently
don't do it.

So moving the RSN incrementation into ctdb_ltdb_store_server fixes
the recovery-record-resurrection bug.

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-By: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit feb1d40b21a160737aead22e398f3c34ff3be8de)

Conflicts:

server/ctdb_ltdb_server.c
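
The resurrection bug and its fix can be illustrated with a toy model. This is only a sketch under stated assumptions: the `record` and `cluster` structs, the two-node setup, and all function names are hypothetical simplifications, not ctdb data structures. Recovery here picks the copy with the highest RSN and resolves ties in favour of the lowest node number, as the description above says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_NODES 2 /* hypothetical cluster size */

/* Simplified record: real records carry a larger header. */
struct record {
	uint64_t rsn;   /* record sequence number */
	uint32_t value; /* payload */
};

/* One copy of the same record per node, indexed by node number. */
struct cluster {
	struct record copy[NUM_NODES];
};

/* Local store on the dmaster; bump_rsn models the fix. */
static void store_on_dmaster(struct cluster *c, int node,
			     uint32_t value, bool bump_rsn)
{
	assert(node >= 0 && node < NUM_NODES);
	c->copy[node].value = value;
	if (bump_rsn) {
		c->copy[node].rsn++;
	}
}

/* Highest RSN wins, ties go to the lowest node; winner is pushed out. */
static void recover(struct cluster *c)
{
	int winner = 0;

	for (int n = 1; n < NUM_NODES; n++) {
		if (c->copy[n].rsn > c->copy[winner].rsn) {
			winner = n;
		}
	}
	struct record w = c->copy[winner];
	for (int n = 0; n < NUM_NODES; n++) {
		c->copy[n] = w;
	}
}

/* Replays the second store, the second recovery and the final fetch. */
static uint32_t replay(bool bump_rsn)
{
	struct cluster c;
	const int recmaster = 1;

	/* State after the first recovery: value 1 everywhere, same RSN. */
	for (int n = 0; n < NUM_NODES; n++) {
		c.copy[n].rsn = 1;
		c.copy[n].value = 1;
	}
	store_on_dmaster(&c, recmaster, 2, bump_rsn);
	recover(&c);
	return c.copy[recmaster].value;
}
```

Without the bump, `replay(false)` returns the stale value 1 because node 0 wins the RSN tie; with the bump, `replay(true)` returns 2.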

10 years ago logging: fix comment typo
Michael Adam [Mon, 15 Apr 2013 10:50:42 +0000 (12:50 +0200)]
logging: fix comment typo

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 4c0cbfbe8b19f2e6fe17093b52c734bec63dd8b7)

10 years ago ctdbd: unimplement the unused SET_DMASTER control
Michael Adam [Wed, 3 Apr 2013 12:03:32 +0000 (14:03 +0200)]
ctdbd: unimplement the unused SET_DMASTER control

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 2e92deef5221ee651028ef87138b3113f1fece91)

Conflicts:

include/ctdb_protocol.h
server/ctdb_recover.c

10 years ago recoverd: remove bogus comment "qqq" from "add prototype new banning code"
Michael Adam [Fri, 22 Mar 2013 16:48:00 +0000 (17:48 +0100)]
recoverd: remove bogus comment "qqq" from "add prototype new banning code"

Signed-off-by: Michael Adam <obnox@samba.org>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
(cherry picked from commit 9f01b8db72780acf2f88f1392bc0a796dd4c6176)

11 years ago Fix typo in ctdb_ltdb_store_server()
Martin Schwenke [Wed, 9 Nov 2011 03:55:07 +0000 (14:55 +1100)]
Fix typo in ctdb_ltdb_store_server()

The if statement uses ret but means to use ret2.

Signed-off-by: Martin Schwenke <martin@meltin.net>
(cherry picked from commit f40101a615f8b9826a484e4697bfea6ee2b9ba88)

11 years ago ctdb:recover: fix a comment typo
Michael Adam [Tue, 20 Nov 2012 10:20:34 +0000 (11:20 +0100)]
ctdb:recover: fix a comment typo

Signed-off-by: Michael Adam <obnox@samba.org>
(cherry picked from commit 5067392d2e06795559f25828b65c129608b65c0b)

11 years ago vacuum: Avoid some tallocs in ctdb recovery
Volker Lendecke [Thu, 22 Nov 2012 14:27:51 +0000 (15:27 +0100)]
vacuum: Avoid some tallocs in ctdb recovery

In a heavily loaded and volatile database a lot of SCHEDULE_FOR_DELETION
requests can come in between fast vacuuming runs. This can lead to
significant ctdb cpu load due to the cost of doing talloc_free. This
reduces the number of objects a bit by coalescing the two objects
of delete_record_data into one. It will also avoid having to allocate
another talloc header for a SCHEDULE_FOR_DELETION key. Not the full fix
for this problem, but it might contribute a bit.
(cherry picked from commit 9a02f61547ddf74629aca21639d8fb61c1df7cbb)