amitay/ctdb.git
10 years agotraverse: Send traverse end record from traverse child process master
Amitay Isaacs [Mon, 9 Sep 2013 02:46:26 +0000 (12:46 +1000)]
traverse: Send traverse end record from traverse child process

To improve the traverse performance, records are directly sent from
traverse child process to originating node.  This creates a race condition
between ctdbd and traverse child.  There are two fds from traverse child
to ctdbd - a pipe to track status of the child process and unix socket
connection for sending records.  It's possible that last few records
are sitting in unix socket buffer when ctdbd processes the status write
from traverse child.  This will be interpreted as end of traverse and
ctdbd will send the last empty record to originating node before it has
processed the pending packets in unix socket connection.

The race is avoided by sending the last empty record marking end of
traverse from the child process.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
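
A minimal sketch of the idea in the commit above, not the CTDB code itself: when the end-of-traverse marker travels on the same socket as the records, the receiver cannot see it before the records that precede it, even if the status byte on the separate pipe is read first. The framing (4-byte length prefix, empty record as end marker) and all function names here are assumptions made for illustration.

```python
import os
import socket
import struct


def send_record(sock, payload):
    # Hypothetical framing: 4-byte big-endian length, then the payload.
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise EOFError("socket closed mid-record")
        buf += chunk
    return buf


def recv_record(sock):
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length) if length else b""


def child_traverse(sock, status_fd, records):
    for rec in records:
        send_record(sock, rec)
    send_record(sock, b"")      # empty record marks end of traverse, in-band
    os.write(status_fd, b"\0")  # status byte on the separate pipe


def parent_collect(sock, status_fd):
    # Read the status first to model the unlucky ordering described above.
    os.read(status_fd, 1)
    records = []
    while True:
        rec = recv_record(sock)
        if rec == b"":          # end marker arrives only after all records
            return records
        records.append(rec)


def demo():
    parent_sock, child_sock = socket.socketpair()
    read_fd, write_fd = os.pipe()
    try:
        child_traverse(child_sock, write_fd, [b"k1=v1", b"k2=v2", b"k3=v3"])
        return parent_collect(parent_sock, read_fd)
    finally:
        parent_sock.close()
        child_sock.close()
        os.close(read_fd)
        os.close(write_fd)
```

Because a stream socket preserves ordering, the receiver in this sketch always sees every record before the empty end marker, regardless of when the pipe status is processed.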
10 years agotraverse: Wait till all data has been flushed from output queue
Amitay Isaacs [Tue, 10 Sep 2013 07:52:26 +0000 (17:52 +1000)]
traverse: Wait till all data has been flushed from output queue

To improve the traverse performance, records are directly sent from
traverse child process to the originating node.  Make sure that all the
data is sent via socket, before informing ctdbd that traverse is complete.

Without waiting for all the packets to be flushed from the queue,
child process can incorrectly signal ctdbd that traverse has ended.
This will cause the pending records in the queue never to make it to
the originating node and traverse information will not be complete.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
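
The following is a hedged sketch of "wait until the output queue is flushed" using a non-blocking socket. The class OutQueue and its methods are hypothetical names and do not correspond to the CTDB queue API; the point is only that completion is reported after the pending list is empty.

```python
import select
import socket
from collections import deque


class OutQueue:
    """Toy output queue on a non-blocking socket (illustrative only)."""

    def __init__(self, sock):
        self.sock = sock
        self.sock.setblocking(False)
        self.pending = deque()

    def enqueue(self, data):
        self.pending.append(data)

    def flush_once(self):
        # Send as much as the socket accepts right now.
        while self.pending:
            data = self.pending[0]
            try:
                sent = self.sock.send(data)
            except BlockingIOError:
                return
            if sent < len(data):
                self.pending[0] = data[sent:]
                return
            self.pending.popleft()

    def wait_until_flushed(self, timeout=5.0):
        # Do not report completion while anything is still queued.
        while self.pending:
            _, writable, _ = select.select([], [self.sock], [], timeout)
            if not writable:
                raise TimeoutError("output queue did not drain")
            self.flush_once()


def demo():
    a, b = socket.socketpair()
    try:
        queue = OutQueue(a)
        for i in range(3):
            queue.enqueue(b"rec%d;" % i)
        queue.wait_until_flushed()
        flushed = not queue.pending  # only now is it safe to signal "done"
        expected = len(b"rec0;rec1;rec2;")
        received = b""
        while len(received) < expected:
            received += b.recv(4096)
        return flushed, received
    finally:
        a.close()
        b.close()
```

In this model the "traverse complete" signal would be sent only after wait_until_flushed() returns, so no queued record can be left behind.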
10 years agotraverse: Check if local traverse failed or succeeded
Amitay Isaacs [Fri, 6 Sep 2013 08:11:40 +0000 (18:11 +1000)]
traverse: Check if local traverse failed or succeeded

Passing the result of tdb_traverse_read() allows ctdbd to determine
if the local traverse succeeded or not.  In case of a problem with local
traverse, ctdbd can log an error.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Use ctdb local variable for convenience
Amitay Isaacs [Fri, 13 Sep 2013 03:28:31 +0000 (13:28 +1000)]
traverse: Use ctdb local variable for convenience

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotraverse: Log information when traverse starts and ends
Amitay Isaacs [Fri, 6 Sep 2013 04:51:54 +0000 (14:51 +1000)]
traverse: Log information when traverse starts and ends

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoeventscripts: Load CTDB configuration settings in 70.iscsi
Amitay Isaacs [Mon, 16 Sep 2013 04:35:13 +0000 (14:35 +1000)]
eventscripts: Load CTDB configuration settings in 70.iscsi

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agostatd-callout: Disable statd-callout till persistent transactions are fixed
Amitay Isaacs [Fri, 6 Sep 2013 04:53:47 +0000 (14:53 +1000)]
statd-callout: Disable statd-callout till persistent transactions are fixed

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agocommon: Make parse_ip() valgrind-clean
Martin Schwenke [Mon, 9 Sep 2013 06:16:24 +0000 (16:16 +1000)]
common: Make parse_ip() valgrind-clean

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agorecoverd: Remove an orphaned comment
Martin Schwenke [Tue, 27 Aug 2013 05:27:30 +0000 (15:27 +1000)]
recoverd: Remove an orphaned comment

This should have been removed with the associated code in commit
14bd0b6961ef1294e9cba74ce875386b7dfbf446.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Update a comment to use current terminology
Martin Schwenke [Tue, 27 Aug 2013 05:24:17 +0000 (15:24 +1000)]
recoverd: Update a comment to use current terminology

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoclient: Remove unused function list_of_active_nodes_except_pnn()
Martin Schwenke [Tue, 27 Aug 2013 05:16:51 +0000 (15:16 +1000)]
client: Remove unused function list_of_active_nodes_except_pnn()

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: list_of_active_nodes_except_pnn() -> list_of_nodes()
Martin Schwenke [Tue, 27 Aug 2013 05:14:10 +0000 (15:14 +1000)]
tools/ctdb: list_of_active_nodes_except_pnn() -> list_of_nodes()

list_of_active_nodes_except_pnn() is only used here and can be removed
if we remove this call.  Less is more...

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Fix a memory leak in parse_nodestring()
Martin Schwenke [Wed, 28 Aug 2013 05:36:27 +0000 (15:36 +1000)]
tools/ctdb: Fix a memory leak in parse_nodestring()

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotests/eventscripts: Tests for memory checking in 00.ctdb
Martin Schwenke [Fri, 6 Sep 2013 06:37:52 +0000 (16:37 +1000)]
tests/eventscripts: Tests for memory checking in 00.ctdb

... plus updates to test infrastructure to support.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Clean up monitoring of system memory in 00.ctdb
Martin Schwenke [Fri, 6 Sep 2013 02:13:31 +0000 (12:13 +1000)]
eventscripts: Clean up monitoring of system memory in 00.ctdb

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoserver: standardize formatting of comment block for ctdb_reply_dmaster() while I...
Michael Adam [Thu, 22 Aug 2013 14:17:09 +0000 (16:17 +0200)]
server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..

This was the comment block I was touching and meant to adapt in
commit 00d3bf092e2f72eda330978c75ec85f17e870553.
My search was apparently not unique...

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agodoc: Update NEWS
Martin Schwenke [Wed, 21 Aug 2013 04:01:25 +0000 (14:01 +1000)]
doc: Update NEWS

Signed-off-by: Martin Schwenke <martin@meltin.net>
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agobuild: Fix build dependencies for ctdb_lock_tdb
Amitay Isaacs [Thu, 22 Aug 2013 07:59:31 +0000 (17:59 +1000)]
build: Fix build dependencies for ctdb_lock_tdb

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotests/simple: Minimise the chance of a monitor event being cancelled
Martin Schwenke [Thu, 22 Aug 2013 04:04:59 +0000 (14:04 +1000)]
tests/simple: Minimise the chance of a monitor event being cancelled

A monitor event following a "ctdb delip" might reconfigure services.
If the monitor event is cancelled then a service might be stopped but
not yet restarted and this could result in the subsequent monitor
events failing.

This obviously needs to be fixed in CTDB itself.  This will happen by
making "ctdb reloadips" the supported way of reconfiguring IPs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agopackaging: Remove pushd/popd from maketarball.sh, don't need bash
Martin Schwenke [Wed, 21 Aug 2013 07:24:03 +0000 (17:24 +1000)]
packaging: Remove pushd/popd from maketarball.sh, don't need bash

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb_diagnostics: Add output of "ctdb getdbmap"
Martin Schwenke [Wed, 21 Aug 2013 06:48:21 +0000 (16:48 +1000)]
tools/ctdb_diagnostics: Add output of "ctdb getdbmap"

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb_diagnostics: Safer temporary file creation
Martin Schwenke [Wed, 21 Aug 2013 06:38:17 +0000 (16:38 +1000)]
tools/ctdb_diagnostics: Safer temporary file creation

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Avoid using a temporary file in 62.cnfs
Martin Schwenke [Wed, 21 Aug 2013 04:34:49 +0000 (14:34 +1000)]
eventscripts: Avoid using a temporary file in 62.cnfs

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoscripts: Remove gdb_backtrace
Martin Schwenke [Wed, 21 Aug 2013 04:27:39 +0000 (14:27 +1000)]
scripts: Remove gdb_backtrace

This uses potentially insecure temporary files and is not referenced
anywhere else.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Make most non-auto-all commands abort if run with -n all
Martin Schwenke [Mon, 19 Aug 2013 04:40:52 +0000 (14:40 +1000)]
tools/ctdb: Make most non-auto-all commands abort if run with -n all

Or if run with -n A,B,...

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Remove more non-essential fetching of PNN from daemon
Martin Schwenke [Wed, 14 Aug 2013 19:02:37 +0000 (05:02 +1000)]
tools/ctdb: Remove more non-essential fetching of PNN from daemon

The useful cases are either CTDB_CURRENT_NODE, in which case
ctdb_get_pnn() does the job, or a PNN, which is... ummm... a PNN!  :-)

This works because parse_nodestring() validates PNNs.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Improve auto-all settings for some commands
Martin Schwenke [Mon, 19 Aug 2013 03:54:49 +0000 (13:54 +1000)]
tools/ctdb: Improve auto-all settings for some commands

* ipreallocate is cluster-wide so should not be auto-all

* enablescript, disablescript, getreclock, setreclock, natgwlist can
  all be auto-all without issues

* xpnn, ipiface are local-only so don't work with -n, so might as well
  not be auto-all

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Remove an unused temporary talloc context
Martin Schwenke [Fri, 16 Aug 2013 10:27:25 +0000 (20:27 +1000)]
recoverd: Remove an unused temporary talloc context

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c
Martin Schwenke [Fri, 16 Aug 2013 04:10:57 +0000 (14:10 +1000)]
recoverd: Move struct ctdb_public_ip_list back into ctdb_takeover.c

This is an internal structure.  It was moved into ctdb_private.h a
long time ago to allow unit testing.  Unit test compilation was
changed shortly afterwards to make this unnecessary.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agorecoverd: Log more information when interfaces change
Martin Schwenke [Thu, 15 Aug 2013 07:04:01 +0000 (17:04 +1000)]
recoverd: Log more information when interfaces change

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotraverse: Log when database traverse is started
Amitay Isaacs [Thu, 11 Jul 2013 06:00:30 +0000 (16:00 +1000)]
traverse: Log when database traverse is started

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Finish eventscript callback processing before debugging hung script
Amitay Isaacs [Thu, 22 Aug 2013 05:12:17 +0000 (15:12 +1000)]
ctdbd: Finish eventscript callback processing before debugging hung script

This ensures that the result of eventscripts is updated and callback is
processed before debugging hung script.  So "ctdb scriptstatus" output
will be useful from debug hung script.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

10 years agoctdbd: Make sure call data is freed if doing an early return
Amitay Isaacs [Tue, 23 Jul 2013 06:00:15 +0000 (16:00 +1000)]
ctdbd: Make sure call data is freed if doing an early return

This should avoid memory bloat when a request bounces between nodes.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agocommon/io: Limit the queue buffer size for fair scheduling via tevent
Amitay Isaacs [Wed, 21 Aug 2013 04:42:06 +0000 (14:42 +1000)]
common/io: Limit the queue buffer size for fair scheduling via tevent

If we process all the data available in a socket buffer, CTDB can stay busy
processing lots of packets via immediate event mechanism in tevent.  After
processing an immediate event, tevent returns without epoll_wait.  So as long
as there are immediate events, tevent will never poll other FDs.  CTDB will
report this as "Event handling took xx seconds" warning.  This is misleading
since CTDB is very busy processing packets, but never gets to the point of
polling FDs.

The improvement in socket handling made it worse when handling traverse
control.  There were lots of packets filled in the socket buffer quickly and
CTDB stayed busy processing those packets and not polling other FDs and timer
events.  This can lead to controls timing out and, in the worst case, other
nodes marking the busy node as disconnected.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
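
A toy model of the fairness problem described above, assuming a simplified loop rather than tevent itself: if each pass handles only a bounded number of packets from the busy socket, other event sources get a turn before the backlog is drained. The function run_loop and the parameter max_per_pass are illustrative names, not CTDB or tevent identifiers.

```python
from collections import deque


def run_loop(busy_packets, other_events, max_per_pass):
    """Return the order in which work items are handled."""
    busy = deque(busy_packets)
    other = deque(other_events)
    order = []
    while busy or other:
        # Handle at most max_per_pass packets from the busy socket...
        for _ in range(max_per_pass):
            if not busy:
                break
            order.append(busy.popleft())
        # ...then let one event from the other fds/timers run.
        if other:
            order.append(other.popleft())
    return order
```

With a bound of 2, run_loop(["p1", "p2", "p3", "p4"], ["timer"], 2) handles the timer after two packets; with a bound as large as the backlog, the timer waits until every packet has been processed.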
10 years agoRevert "common/io: Keep queue buffer size multiple of 4K"
Amitay Isaacs [Tue, 20 Aug 2013 04:20:09 +0000 (14:20 +1000)]
Revert "common/io: Keep queue buffer size multiple of 4K"

This reverts commit 5e9b1a7e24d058ff88aaa0563db36a804e866fa9.

This is not the best approach.  Allowing queue buffer size to grow
indefinitely causes large number of CTDB packets to be queued up very
quickly which when processed via immediate events will block CTDB from
processing events from other FDs.  If there are immediate events queued
up, tevent will never process any of the FDs till all immediate events
are processed.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoRevert "LACOUNT: Add back lacount mechanism to defer migrating a fetched/read copy...
Amitay Isaacs [Mon, 19 Aug 2013 05:04:46 +0000 (15:04 +1000)]
Revert "LACOUNT:  Add back lacount mechanism to defer migrating a fetched/read copy until after default of 20 consecutive requests from the same node"

This reverts commit 035c0d981bde8c0eee8b3f24ba8e2dc817e5b504.

This is a premature optimization.  Record can bounce between nodes
very quickly if it is a contended record.  There is no need to hold a
record on a node unnecessarily.  In case record contention becomes bad,
enabling sticky records on a database is a better idea.

Conflicts:
include/ctdb_private.h
server/ctdb_tunables.c

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Print a log message when a key becomes hot
Amitay Isaacs [Mon, 15 Jul 2013 05:39:47 +0000 (15:39 +1000)]
ctdbd: Print a log message when a key becomes hot

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster
Amitay Isaacs [Fri, 9 Aug 2013 07:22:55 +0000 (17:22 +1000)]
ctdbd: For volatile databases, write an empty record with rsn=0 only on dmaster

Empty record with rsn=0 should not be written on any node other than
dmaster.  This is however not true for persistent databases.  So currently
apply the check only for volatile databases.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: Fix message in showban when node is banned
Martin Schwenke [Fri, 9 Aug 2013 07:00:10 +0000 (17:00 +1000)]
tools/ctdb: Fix message in showban when node is banned

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Reimplement ban/unban using update_flags_wait_and_ipreallocate()
Martin Schwenke [Fri, 9 Aug 2013 06:58:42 +0000 (16:58 +1000)]
tools/ctdb: Reimplement ban/unban using update_flags_wait_and_ipreallocate()

This has the side effect of making these commands more resilient to
control timeouts.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Factor out common pattern used in disable/enable/stop/continue
Martin Schwenke [Fri, 9 Aug 2013 06:34:59 +0000 (16:34 +1000)]
tools/ctdb: Factor out common pattern used in disable/enable/stop/continue

Now we will only have one set of bugs.  :-)

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agotools/ctdb: Factor, simplify and improve robustness of ipreallocate code
Martin Schwenke [Fri, 9 Aug 2013 05:41:37 +0000 (15:41 +1000)]
tools/ctdb: Factor, simplify and improve robustness of ipreallocate code

Having other functions call control_ipreallocate() suggests that it
might look at the argc/argv arguments that are passed.  This is not
the case.  Change the callers so they call the new ipreallocate()
function instead.

Broadcast CTDB_SRVID_TAKEOVER_RUN to all connected nodes.  Inactive
nodes will ignore it.  This is safe since we only want 1 reply.  If we
didn't get a response, we don't actually care if there's no active
recovery master - just fire, wait, retry, ...

Ignore some failures on the basis that they might be transient, so it
is probably worth retrying.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotools/ctdb: Use ctdb_get_pnn() to get PNN of the current node
Martin Schwenke [Wed, 14 Aug 2013 18:38:02 +0000 (04:38 +1000)]
tools/ctdb: Use ctdb_get_pnn() to get PNN of the current node

This has already been stored at connect time and can't fail.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoutil: In passing the code, fix a space vs. tab in set_close_on_exec().
Michael Adam [Mon, 19 Aug 2013 14:54:06 +0000 (16:54 +0200)]
util: In passing the code, fix a space vs. tab in set_close_on_exec().

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agoserver: standardize formatting of comment block for ctdb_reply_dmaster() while I...
Michael Adam [Mon, 19 Aug 2013 15:07:19 +0000 (17:07 +0200)]
server: standardize formatting of comment block for ctdb_reply_dmaster() while I'm at it..

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agoserver: fix wording and punctuation in comment block for ctdb_reply_dmaster().
Michael Adam [Tue, 13 Aug 2013 08:17:45 +0000 (10:17 +0200)]
server: fix wording and punctuation in comment block for ctdb_reply_dmaster().

Signed-off-by: Michael Adam <obnox@samba.org>
10 years agorecoverd: Improve log message when nodes disagree on recmaster
Amitay Isaacs [Wed, 14 Aug 2013 01:44:12 +0000 (11:44 +1000)]
recoverd: Improve log message when nodes disagree on recmaster

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agocommon: Null terminate process name string so valgrind doesn't complain
Amitay Isaacs [Fri, 2 Aug 2013 01:05:08 +0000 (11:05 +1000)]
common: Null terminate process name string so valgrind doesn't complain

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agovacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)
Amitay Isaacs [Mon, 12 Aug 2013 05:50:30 +0000 (15:50 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 2)

This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster.  This makes a request for
that record bounce between nodes endlessly.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agovacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)
Amitay Isaacs [Mon, 12 Aug 2013 05:51:00 +0000 (15:51 +1000)]
vacuuming: Fix vacuuming bug where requests keep bouncing between nodes (part 1)

This is caused by corruption of a record header such that the records
on two nodes point to each other as dmaster.  This makes a request for
that record bounce between nodes endlessly.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agodb_wrap: Make sure tdb messages are logged correctly
Amitay Isaacs [Tue, 6 Aug 2013 04:37:13 +0000 (14:37 +1000)]
db_wrap: Make sure tdb messages are logged correctly

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoeventscripts: Become unhealthy faster on nfsd failure
Martin Schwenke [Mon, 12 Aug 2013 01:36:25 +0000 (11:36 +1000)]
eventscripts: Become unhealthy faster on nfsd failure

Anecdotal evidence suggests that most nfsd RPC check failures are due
to cluster filesystem or storage problems.  Apparently these are rarely
helped by attempting to restart the NFS service because the restart
tends to hang.

Fail after 2 nfsd RPC check failures, instead of waiting for 6
failures.  Restart on every 10th failure to try to bring the node back
to good health.

Update unit tests to match.

Signed-off-by: Martin Schwenke <martin@meltin.net>
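
The failure policy in the commit above can be sketched as a small decision function. This is an illustration only: the function name nfsd_actions and its parameters are hypothetical, and the real eventscript logic may combine the checks differently.

```python
def nfsd_actions(consecutive_failures, unhealthy_after=2, restart_every=10):
    """Return the actions to take for a given count of consecutive failures."""
    actions = []
    # Restart on every 10th failure (10, 20, 30, ...), via a modulo check.
    if consecutive_failures > 0 and consecutive_failures % restart_every == 0:
        actions.append("restart")
    # Become unhealthy once the failure count reaches the threshold.
    if consecutive_failures >= unhealthy_after:
        actions.append("unhealthy")
    return actions
```

For example, one failure yields no action, two failures mark the node unhealthy, and the tenth failure both restarts the service and keeps the node unhealthy.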
10 years agotools/ctdb: Increase default control timeout to 10 seconds
Martin Schwenke [Fri, 9 Aug 2013 01:56:29 +0000 (11:56 +1000)]
tools/ctdb: Increase default control timeout to 10 seconds

The current 3 second timeout is arbitrary and users trip over it
sometimes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Improve message logged when a counter hits a limit
Martin Schwenke [Thu, 8 Aug 2013 06:02:44 +0000 (16:02 +1000)]
eventscripts: Improve message logged when a counter hits a limit

It should print the actual number of consecutive failures rather than
the limit.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Print a message when waiting for TCP connections to be killed
Martin Schwenke [Tue, 6 Aug 2013 02:42:13 +0000 (12:42 +1000)]
eventscripts: Print a message when waiting for TCP connections to be killed

This makes the gaps in the logs more obvious.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: New configuration variable $CTDB_RPCINFO_LOCALHOST
Martin Schwenke [Mon, 5 Aug 2013 05:12:14 +0000 (15:12 +1000)]
eventscripts: New configuration variable $CTDB_RPCINFO_LOCALHOST

Passing "localhost" to the rpcinfo command causes overheads, like
reading /etc/services multiple times.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agoeventscripts: Add modulo (%) operator to ctdb_check_counter()
Martin Schwenke [Fri, 2 Aug 2013 05:18:47 +0000 (15:18 +1000)]
eventscripts: Add modulo (%) operator to ctdb_check_counter()

Also add it to the corresponding eventscript unit test infrastructure.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Separate out RPC service restart code
Martin Schwenke [Fri, 2 Aug 2013 06:05:46 +0000 (16:05 +1000)]
eventscripts: Separate out RPC service restart code

While doing this:

* Explicitly assign RPC program and version information in
  _nfs_check_rpc_common().  This is more lines of code but is easier
  to read.

* Don't print the options when starting a service.  Trying to print it
  makes the code messy for little benefit.

  Update the eventscript unit testing code and a Ganesha test to
  reflect this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotests/eventscripts: Override background_with_logging(), just prepend "&"
Martin Schwenke [Fri, 2 Aug 2013 06:03:42 +0000 (16:03 +1000)]
tests/eventscripts: Override background_with_logging(), just prepend "&"

That is, output that goes through background_with_logging() just gets
"&" prepended to each line.  This is cleaner than having the tests
grovel through logs.

Update some 49.winbind/50.samba tests to deal with this.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: Remove support for RPC service 'q' and 's' restart flags
Martin Schwenke [Tue, 30 Jul 2013 06:24:24 +0000 (16:24 +1000)]
eventscripts: Remove support for RPC service 'q' and 's' restart flags

They're hard to maintain and provide very little benefit.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoeventscripts: When restarting the nfslock service only show output of start
Martin Schwenke [Tue, 30 Jul 2013 06:21:36 +0000 (16:21 +1000)]
eventscripts: When restarting the nfslock service only show output of start

That is, /dev/null the "stop" output.  This is consistent with the way
CTDB generally deals with the output when stopping a service.

It also makes updating the eventscript unit tests easier.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agotests/simple: Unreachable node test should wait for recovery to complete
Martin Schwenke [Mon, 29 Jul 2013 05:27:24 +0000 (15:27 +1000)]
tests/simple: Unreachable node test should wait for recovery to complete

This should minimise the chances of a control timing out.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agotests/simple: Fix the missing IP test
Martin Schwenke [Mon, 29 Jul 2013 05:09:23 +0000 (15:09 +1000)]
tests/simple: Fix the missing IP test

Update the missing IP test to wait until restarts are complete.
Otherwise a service restart can collide with the following monitor
event and cause chaos.

Also, do not disable 10.interface until it matters.  Disabling it too
early can cause even more chaos if something goes wrong with the
monitor step.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years agorecoverd: Use TDB_INCOMPATIBLE_HASH when creating volatile databases
Amitay Isaacs [Tue, 13 Aug 2013 04:02:46 +0000 (14:02 +1000)]
recoverd: Use TDB_INCOMPATIBLE_HASH when creating volatile databases

When creating missing databases either locally or remotely, recovery
master calls ctdb_ctrl_createdb().  Recovery master always passes 0
for tdb_flags.  For volatile databases, if TDB_INCOMPATIBLE_HASH is not
specified, then they will be attached without using jenkins hash causing
database corruption.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoRevert "recoverd: Use correct tdb flags when creating missing databases"
Amitay Isaacs [Tue, 13 Aug 2013 03:55:47 +0000 (13:55 +1000)]
Revert "recoverd: Use correct tdb flags when creating missing databases"

This reverts commit 10a057d8e15c8c18e540598a940d3548c731b0b4.

This approach would not work when creating local databases since currently
there is no control to receive TDB flags for remote databases.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agocommon/io: Keep queue buffer size multiple of 4K
Amitay Isaacs [Mon, 5 Aug 2013 07:28:47 +0000 (17:28 +1000)]
common/io: Keep queue buffer size multiple of 4K

Currently queue buffer size is realloc'd every time we need to extend the
buffer.  Small increments can cause memory fragmentation.  Instead always
extend buffer in multiples of 4K.  This should reduce multiple talloc_realloc
calls when there are lots of packets in the socket buffer.

Also, if queue buffer has grown larger than 64K, throw away the buffer once
all the requests in the queue have been processed.  That way queue does not
hold on to large buffers.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
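
A rough sketch of the sizing rule described in the commit above (which, as this log records, was later reverted). The class QueueBuffer tracks sizes only and is a hypothetical stand-in for the real talloc-managed buffer; the constants mirror the 4K and 64K figures from the commit message.

```python
CHUNK = 4 * 1024       # grow in multiples of 4K
MAX_IDLE = 64 * 1024   # drop the buffer if it grew beyond 64K


def round_up(nbytes, multiple=CHUNK):
    return ((nbytes + multiple - 1) // multiple) * multiple


class QueueBuffer:
    """Tracks capacity and usage only; no real memory is allocated."""

    def __init__(self):
        self.capacity = 0
        self.used = 0

    def append(self, nbytes):
        needed = self.used + nbytes
        if needed > self.capacity:
            # One larger reallocation instead of many small increments.
            self.capacity = round_up(needed)
        self.used = needed

    def consume_all(self):
        # All queued requests processed; release an oversized buffer.
        self.used = 0
        if self.capacity > MAX_IDLE:
            self.capacity = 0
```

Appending 100 bytes gives a 4096-byte capacity; appending a further 70000 bytes rounds the capacity up to 73728, and consume_all() then discards the buffer because it exceeds 64K.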
10 years agopackaging: Allow setting custom release number in RPM spec file
Martin Schwenke [Fri, 26 Jul 2013 03:57:03 +0000 (13:57 +1000)]
packaging: Allow setting custom release number in RPM spec file

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-Programmed-With: Amitay Isaacs <amitay@gmail.com>

10 years agoctdbd: When a record is made sticky, log only once
Amitay Isaacs [Wed, 31 Jul 2013 05:59:11 +0000 (15:59 +1000)]
ctdbd: When a record is made sticky, log only once

Instead of logging from ctdb_request_call(), log the message from
ctdb_make_record_sticky().  That way if the record is already sticky, the
message is not repeated unnecessarily.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Improve high hopcount log messages when request is redirected
Amitay Isaacs [Mon, 15 Jul 2013 07:34:31 +0000 (17:34 +1000)]
ctdbd: Improve high hopcount log messages when request is redirected

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoscripts: Do not run ctdb tool commands when debugging hung "init" event
Martin Schwenke [Tue, 6 Aug 2013 06:11:40 +0000 (16:11 +1000)]
scripts: Do not run ctdb tool commands when debugging hung "init" event

CTDB daemon is not ready to accept clients in INIT runstate (init event).
CTDB daemon will start accepting connections in SETUP runstate (setup event)
and later.

Also, minor log formatting changes.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years agoctdbd: Avoid leaking file descriptor if talloc fails
Amitay Isaacs [Mon, 5 Aug 2013 07:38:42 +0000 (17:38 +1000)]
ctdbd: Avoid leaking file descriptor if talloc fails

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoeventscript: Wait for debug hung script to finish or timeout before continuing
Amitay Isaacs [Mon, 5 Aug 2013 04:08:28 +0000 (14:08 +1000)]
eventscript: Wait for debug hung script to finish or timeout before continuing

Currently if the debug hung script takes a long time to finish, the subsequent
monitor event can collide with the previous event which is not yet finished.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoeventscripts: Use configured RECLOCK file instead of asking CTDB
Amitay Isaacs [Fri, 2 Aug 2013 05:49:06 +0000 (15:49 +1000)]
eventscripts: Use configured RECLOCK file instead of asking CTDB

On a cluster where the recovery lock file is not being used, asking the CTDB
daemon is unnecessary overhead.  And if CTDB is using a recovery lock file,
then changing configuration without restarting is *stupid*.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-Programmed-With: Martin Schwenke <martin@meltin.net>

10 years agolocking: Do not create multiple lock processes for the same key
Amitay Isaacs [Fri, 2 Aug 2013 00:54:38 +0000 (10:54 +1000)]
locking: Do not create multiple lock processes for the same key

If there are multiple lock helper processes waiting for the same record, then
it will cause a thundering herd when that record has been unlocked.  So avoid
scheduling lock contexts for the same record.  This will also mean that
multiple requests will get queued up behind the same lock context and can be
processed quickly once the lock has been obtained.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
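
One way to picture the scheduling change above is a map from record key to a single lock context, with later requests for the same key queued behind it. This is a sketch under that assumption; LockScheduler and its methods are hypothetical names, not the ctdb_lock API.

```python
from collections import OrderedDict


class LockScheduler:
    """At most one lock helper per key; extra requests wait behind it."""

    def __init__(self):
        self.contexts = OrderedDict()  # key -> list of waiting requests
        self.helpers_started = 0

    def request(self, key, req):
        if key in self.contexts:
            # A helper is already waiting on this record: queue behind it.
            self.contexts[key].append(req)
        else:
            self.contexts[key] = [req]
            self.helpers_started += 1

    def lock_obtained(self, key):
        # All queued requests for the key can be processed together.
        return self.contexts.pop(key, [])
```

Three requests for the same key start only one helper in this model, and lock_obtained() hands back all three at once, which avoids the thundering herd described in the commit message.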
10 years agolocking: Move function find_lock_context() before ctdb_lock_schedule()
Amitay Isaacs [Fri, 2 Aug 2013 00:51:45 +0000 (10:51 +1000)]
locking: Move function find_lock_context() before ctdb_lock_schedule()

So that ctdb_lock_schedule() can call this function without requiring extra
prototype declaration.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Print set db sticky message after it's set
Amitay Isaacs [Tue, 30 Jul 2013 04:17:55 +0000 (14:17 +1000)]
ctdbd: Print set db sticky message after it's set

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotests: Add a test program to hold a lock on a database
Amitay Isaacs [Tue, 4 Dec 2012 07:27:10 +0000 (18:27 +1100)]
tests: Add a test program to hold a lock on a database

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agorecoverd: Use correct tdb flags when creating missing databases
Amitay Isaacs [Tue, 30 Jul 2013 02:45:01 +0000 (12:45 +1000)]
recoverd: Use correct tdb flags when creating missing databases

When creating missing databases either locally or remotely, make sure
to use the correct tdb flags from other nodes.  Without this, volatile
databases can get attached without TDB_INCOMPATIBLE_HASH flag.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoclient: Always use jenkins hash when attaching volatile databases
Amitay Isaacs [Thu, 1 Aug 2013 01:07:59 +0000 (11:07 +1000)]
client: Always use jenkins hash when attaching volatile databases

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agorecoverd: Make sure to use jenkins hash for recovery databases
Amitay Isaacs [Mon, 29 Jul 2013 03:50:44 +0000 (13:50 +1000)]
recoverd: Make sure to use jenkins hash for recovery databases

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agorecoverd: Assemble up-to-date node flags information from remote nodes
Amitay Isaacs [Mon, 22 Jul 2013 07:26:28 +0000 (17:26 +1000)]
recoverd: Assemble up-to-date node flags information from remote nodes

Currently nodemap used by recovery master is the one obtained from the local
node.  This information may have been updated while processing main loop.
Before comparing node flags on all the nodes, create up-to-date node flags
information based on the information received from all the nodes.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agotools/ctdb: Only print the hot records with non-zero hopcount
Amitay Isaacs [Mon, 15 Jul 2013 06:35:30 +0000 (16:35 +1000)]
tools/ctdb: Only print the hot records with non-zero hopcount

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years agoctdbd: Don't consider a hot record if the hopcount is zero
Amitay Isaacs [Mon, 15 Jul 2013 06:32:40 +0000 (16:32 +1000)]
ctdbd: Don't consider a hot record if the hopcount is zero

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years ago ctdbd: Fix updating of hot keys in database statistics
Amitay Isaacs [Fri, 12 Jul 2013 07:33:13 +0000 (17:33 +1000)]
ctdbd: Fix updating of hot keys in database statistics

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years ago ctdbd: Remove incomplete ctdb_db_statistics_wire structure
Amitay Isaacs [Mon, 15 Jul 2013 05:24:11 +0000 (15:24 +1000)]
ctdbd: Remove incomplete ctdb_db_statistics_wire structure

Instead of maintaining another structure, add an element as a placeholder for
the marshall buffer of hot keys.  This avoids duplication of the structure.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years ago Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure"
Amitay Isaacs [Mon, 15 Jul 2013 04:52:07 +0000 (14:52 +1000)]
Revert "ctdbd: Remove incomplete ctdb_db_statistics_wire structure"

The structure cannot be removed without adding support for marshalling keys
for hot records.

This reverts commit 26a4653df594d351ca0dc1bd5f5b2f5b0eb0a9a5.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
10 years ago doc: Update XML files to use standard DocBook DTD
Martin Schwenke [Fri, 26 Jul 2013 05:09:24 +0000 (15:09 +1000)]
doc: Update XML files to use standard DocBook DTD

This simplifies building since we don't use any of the Samba
extensions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years ago initscript: The wrapper script should export CTDB_SOCKET
Martin Schwenke [Fri, 26 Jul 2013 01:20:47 +0000 (11:20 +1000)]
initscript: The wrapper script should export CTDB_SOCKET

This ensures that any invocation of the ctdb tool (within the wrapper)
gets the desired value.  This at least ensures that ctdbd will be
started.

If a non-standard value is set for CTDB_SOCKET then command-line users
will still need the variable in their environment.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years ago ctdbd: Kill client process without checking for tracked child
Martin Schwenke [Thu, 25 Jul 2013 06:17:07 +0000 (16:17 +1000)]
ctdbd: Kill client process without checking for tracked child

Commit f73a4b1495830bcdd094a93732a89dd53b3c2f78 added a safety check
to ensure that CTDB never kills unrelated processes.  However, client
processes are unrelated.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years ago eventscripts: kill_tcp_connections() should send connections to stdin
Martin Schwenke [Thu, 25 Jul 2013 03:40:43 +0000 (13:40 +1000)]
eventscripts: kill_tcp_connections() should send connections to stdin

This avoids issuing multiple "ctdb killtcp" commands to terminate tcp
connections, one per connection.  This will considerably reduce the
time taken when there is a large number of tcp connections.  This also makes
it possible to avoid calling "ctdb killtcp" when there are no connections.

Add a couple of unit tests for killtcp and update the eventscript unit
test infrastructure to support them.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years ago tools/ctdb: Allow killtcp to read connections from standard input
Martin Schwenke [Thu, 25 Jul 2013 03:28:26 +0000 (13:28 +1000)]
tools/ctdb: Allow killtcp to read connections from standard input

This allows eventscripts to send information about multiple tcp
connections to a single "ctdb killtcp" command, saving the overhead of
setting up a client connection per tcp connection.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
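
The batching idea can be sketched in Python as follows: read every connection from one input stream and handle them all inside a single process, so that client setup happens once rather than once per connection. The line format (`<src-addr:port> <dst-addr:port>`), the function names and the `kill_one` callback are assumptions made for this example and are not taken from the ctdb tool.

```python
import io


def read_connections(stream):
    """Parse one connection per line: '<src-addr:port> <dst-addr:port>'.

    The line format is an assumption made for this sketch.
    Blank lines are skipped; malformed lines raise ValueError.
    """
    connections = []
    for lineno, line in enumerate(stream, start=1):
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 2:
            raise ValueError(f"line {lineno}: expected 2 fields, got {len(fields)}")
        connections.append((fields[0], fields[1]))
    return connections


def kill_all(connections, kill_one):
    """Handle every connection in this one process; return how many were handled."""
    for src, dst in connections:
        kill_one(src, dst)
    return len(connections)


# Stand-in for standard input; a real tool would pass sys.stdin instead.
sample = io.StringIO("10.0.0.1:445 10.0.0.2:51234\n\n10.0.0.1:445 10.0.0.3:40000\n")
killed = []
count = kill_all(read_connections(sample), lambda src, dst: killed.append((src, dst)))
```

An empty input yields an empty list, which lets the caller skip the work entirely when there are no connections.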

10 years ago tests: Always tally the number of passed/failed tests
Martin Schwenke [Mon, 22 Jul 2013 10:11:58 +0000 (20:11 +1000)]
tests: Always tally the number of passed/failed tests

Regardless of whether a summary is being printed!

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years ago recoverd: Call takeover fail callback only once per node
Martin Schwenke [Mon, 22 Jul 2013 06:39:46 +0000 (16:39 +1000)]
recoverd: Call takeover fail callback only once per node

Currently the fail callback is called once per (takeip/releaseip) control
failure.  This is overkill and can get a node banned much too quickly.

Instead, keep track of control failures per node and only call fail
callback once per failed node.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
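
A minimal Python sketch of the bookkeeping described above: failures are first collected per node, and the fail callback then runs once for each node that had at least one failed control. The names `run_controls` and `fail_callback` and the `(node, ok)` result pairs are hypothetical; the real recovery daemon is written in C and structured differently.

```python
def run_controls(results, fail_callback):
    """Invoke fail_callback once per node that had any failed control.

    results: iterable of (node, ok) pairs, one per takeip/releaseip control.
    Returns the set of nodes that failed.
    """
    failed_nodes = set()
    for node, ok in results:
        if not ok:
            failed_nodes.add(node)   # repeated failures collapse to one entry
    for node in sorted(failed_nodes):
        fail_callback(node)
    return failed_nodes


# Node 0 fails two controls but is reported only once.
calls = []
failed = run_controls([(0, False), (0, False), (1, True), (2, False)], calls.append)
```

With one callback per failed control, node 0 here would have been penalised twice; collapsing failures to a set keeps the penalty at one per node.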

10 years ago scripts: Run scriptstatus for hung event
Martin Schwenke [Mon, 22 Jul 2013 05:08:32 +0000 (15:08 +1000)]
scripts: Run scriptstatus for hung event

The timeout information printed by ctdbd is less than useful because
it refers to the cumulative time taken by the eventscripts run so far.
Adding scriptstatus output indicates where time was actually spent.

Since there is now quite a bit of output, serialise the calls to this
script using flock.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
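
The serialisation step can be illustrated in Python with `fcntl.flock`, which uses the same advisory lock as the `flock` utility: only one holder of the exclusive lock runs at a time, so output from concurrent invocations is not interleaved. The lock file name and the `run_serialised` helper are assumptions for this sketch; the actual debug script is a shell script.

```python
import fcntl
import os
import tempfile


def run_serialised(lock_path, action):
    """Run action() while holding an exclusive flock on lock_path."""
    with open(lock_path, "w") as lock_file:
        fcntl.flock(lock_file, fcntl.LOCK_EX)   # blocks until the lock is free
        try:
            return action()
        finally:
            fcntl.flock(lock_file, fcntl.LOCK_UN)


# Hypothetical lock file location, used only for this example.
lock_path = os.path.join(tempfile.mkdtemp(), "debug-hung-script.lock")
output = run_serialised(lock_path, lambda: "scriptstatus output would go here")
```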

10 years ago ctdbd: Pass event name to hung script debugger
Martin Schwenke [Mon, 22 Jul 2013 05:06:52 +0000 (15:06 +1000)]
ctdbd: Pass event name to hung script debugger

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years ago tests/complex: Fix NFS tests to work with root_squash
Martin Schwenke [Mon, 22 Jul 2013 04:32:13 +0000 (14:32 +1000)]
tests/complex: Fix NFS tests to work with root_squash

Refactor the NFS test setup/cleanup code into new common functions.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years ago tests: Fix exit status of run_tests when a single test is run with -H
Martin Schwenke [Fri, 19 Jul 2013 09:59:43 +0000 (19:59 +1000)]
tests: Fix exit status of run_tests when a single test is run with -H

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years ago tests/simple: Add -p in onnode test to help show groups of connections
Martin Schwenke [Fri, 19 Jul 2013 05:33:38 +0000 (15:33 +1000)]
tests/simple: Add -p in onnode test to help show groups of connections

Change the command from "true" to "hostname" since the former won't
produce any output when used in combination with "onnode -p".  This
could just be changed to "echo" but the hostname might actually be
useful.

Signed-off-by: Martin Schwenke <martin@meltin.net>
10 years ago ctdbd: Sleep at exit to allow time for log messages to flush
Martin Schwenke [Wed, 17 Jul 2013 01:14:37 +0000 (11:14 +1000)]
ctdbd: Sleep at exit to allow time for log messages to flush

Register print_exit_message() earlier so that it covers most of the
early exits.

Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>

10 years ago ctdbd: Exit if something is already listening on CTDB socket
Martin Schwenke [Fri, 19 Jul 2013 05:36:29 +0000 (15:36 +1000)]
ctdbd: Exit if something is already listening on CTDB socket

Don't blindly remove the socket.

Signed-off-by: Martin Schwenke <martin@meltin.net>
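
One common way to tell a live Unix socket from a stale socket file is to try connecting to it, and the Python sketch below takes that approach. Whether ctdbd performs the check this way is an assumption; the function names are hypothetical, and treating every `OSError` as "nothing listening" is a simplification.

```python
import os
import socket
import tempfile


def something_is_listening(path):
    """Return True if a connect() to the Unix socket at path succeeds."""
    probe = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        probe.connect(path)
        return True
    except OSError:
        # For example ECONNREFUSED (stale socket file) or ENOENT (no file).
        return False
    finally:
        probe.close()


def prepare_socket_path(path):
    """Refuse to start if the socket is live; otherwise clear a stale file."""
    if something_is_listening(path):
        raise SystemExit(f"something is already listening on {path}")
    if os.path.exists(path):
        os.unlink(path)


# Demonstration with a throwaway socket in a temporary directory.
sock_path = os.path.join(tempfile.mkdtemp(), "demo.socket")
server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(sock_path)
server.listen(1)
live_before = something_is_listening(sock_path)   # a listener exists
server.close()
live_after = something_is_listening(sock_path)    # only the stale file remains
prepare_socket_path(sock_path)                    # stale file is removed
```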