Martin Schwenke [Tue, 26 Nov 2013 01:35:44 +0000 (12:35 +1100)]
recoverd: Ignore failed ipreallocated controls to inactive nodes
Currently timeouts for controls to inactive nodes can cause banning
credits to be applied. This should not happen.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 25 Nov 2013 08:28:10 +0000 (19:28 +1100)]
Update NEWS
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 26 Nov 2013 04:41:50 +0000 (15:41 +1100)]
scripts: Be careful when generating unique pids for stack traces
sort expects the data to be line based, so make it so.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 26 Nov 2013 03:38:58 +0000 (14:38 +1100)]
config: Simplify the default CTDB configuration file
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Tue, 26 Nov 2013 03:29:52 +0000 (14:29 +1100)]
scripts: Replace hard-coded /var/ctdb with CTDB_VARDIR
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 26 Nov 2013 02:27:46 +0000 (13:27 +1100)]
scripts: Set defaults for CTDB_DBDIR and CTDB_DBDIR_PERSISTENT
If these configuration variables are not defined, then there should be
a default fallback. This is a workaround until CTDB compile-time
configuration can be accessed at runtime.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 26 Nov 2013 00:39:54 +0000 (11:39 +1100)]
eventscripts: Perform share check before NFS RPC checks in 60.ganesha
If NFS RPC checks do restart Ganesha, then it's possible that share
check can fail prematurely.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 22 Nov 2013 02:57:31 +0000 (13:57 +1100)]
tools/ctdb: Improve error checking when parsing node string
If a node isn't numeric then it is silently converted to 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>
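Illustration for the commit above (not the tool's actual code): a lenient conversion in the style of C's atoi() maps a non-numeric string to 0, which is also a valid node number. The Python sketch below uses a hypothetical helper, parse_node, that rejects such input instead of guessing.

```python
def parse_node(text):
    """Return the node number in text, or raise ValueError.

    Hypothetical helper: only decimal digits are accepted, so a typo
    such as "n1" is reported instead of silently becoming node 0.
    """
    if not text.isdigit():
        raise ValueError("invalid node number: %r" % text)
    return int(text)
```

A caller would catch ValueError and print a usage error rather than contacting node 0.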
Martin Schwenke [Fri, 22 Nov 2013 02:57:03 +0000 (13:57 +1100)]
recoverd: Only respond to currently queued ipreallocated requests
Otherwise new requests can come in during the latter parts of the
takeover run when the IP allocation algorithm has already run, and the
new requests will be dequeued even though they haven't really been
processed.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
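Illustration for the commit above: one way to answer only the requests that were queued before the run started is to detach the queue up front. This is a minimal Python sketch under that assumption; ReplyQueue, takeover_run and late_arrivals are hypothetical names, not recoverd code.

```python
class ReplyQueue:
    """Hypothetical queue of callers waiting for a takeover run."""

    def __init__(self):
        self.pending = []

    def add(self, requester):
        self.pending.append(requester)

    def take_snapshot(self):
        # Detach everything queued so far; later arrivals go into a
        # fresh list and wait for the next takeover run.
        snapshot, self.pending = self.pending, []
        return snapshot


def takeover_run(queue, late_arrivals=()):
    to_answer = queue.take_snapshot()
    # ... the IP allocation algorithm would run here ...
    for requester in late_arrivals:
        queue.add(requester)  # arrives after allocation has run
    return to_answer  # only these requesters get a reply now
```

Requests that arrive late stay in the queue, so the next run processes them properly instead of acknowledging them unprocessed.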
Martin Schwenke [Tue, 19 Nov 2013 04:40:08 +0000 (15:40 +1100)]
scripts: Add an early exit to statd-callout's notify case
If $statd_state is empty then the loop will run once and print
spurious errors.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 19 Nov 2013 04:37:58 +0000 (15:37 +1100)]
eventscripts: Remove the nfs_statd_update() call from 60.ganesha
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 18 Nov 2013 10:04:49 +0000 (21:04 +1100)]
tests/integration: Neaten up some of the persistent database tests
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Mon, 18 Nov 2013 04:09:27 +0000 (15:09 +1100)]
tools/ctdb: Fix tstore command to generate ltdb header internally
This fixes an alignment discrepancy on 32-bit vs 64-bit platforms.
sizeof(struct ctdb_ltdb_header) = 20 (32-bit)
sizeof(struct ctdb_ltdb_header) = 24 (64-bit)
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
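Illustration for the commit above: a 20 versus 24 byte difference is what one would expect from a structure whose widest member is 64 bits, since a platform that aligns such members to 8 bytes pads the total to a multiple of 8. The sketch below uses Python's struct module with a hypothetical layout (one 64-bit field followed by three 32-bit fields); the real members of struct ctdb_ltdb_header are not listed in the commit message.

```python
import struct

# Hypothetical layout: one uint64 followed by three uint32 fields.
FIELDS = "QIII"

# "=" uses standard sizes with no padding: 8 + 4 + 4 + 4 bytes.
packed_size = struct.calcsize("=" + FIELDS)

# "@" uses native sizes and alignment; the trailing "0Q" pads the end
# to the alignment of a native unsigned long long, as a C compiler
# would when computing sizeof().
native_size = struct.calcsize("@" + FIELDS + "0Q")

print(packed_size, native_size)
```

packed_size is the same everywhere, while native_size depends on the platform's alignment rules, which is why a header built by one tool and read by another can disagree in length.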
Martin Schwenke [Fri, 15 Nov 2013 04:31:03 +0000 (15:31 +1100)]
tests/takeover: Fix bogus test description
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 15 Nov 2013 04:23:14 +0000 (15:23 +1100)]
tests/simple: Use sleep_for() instead of sleep
Progress...
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 15 Nov 2013 04:21:58 +0000 (15:21 +1100)]
tests/simple: Update persistent DB tests
* Low level DB checks should ignore the sequence number record.
* A restart is needed after messing with the RecoverPDBBySeqNum
tunable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 15 Nov 2013 04:20:40 +0000 (15:20 +1100)]
recoverd: For persistent databases a sequence number of 0 is valid
Otherwise recovery ends up done by RSN when it is unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 19 Nov 2013 04:31:39 +0000 (15:31 +1100)]
locking: Use vfork instead of fork to exec helpers
There is a significant overhead using fork() over vfork(), especially
when the child process execs a helper. The overhead is in memory space
and time.
# strace -c ./test_fork 1024 200
count=1024, size=204800, total=200M
failed fork=0
time for fork() = 4879.597000 us
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
100.00    4.543321        3304      1375       375 clone
  0.00    0.000071           0      1033           mmap
  0.00    0.000000           0         1           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0         2           open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         3           fstat
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    4.543392                  2429       376 total
# strace -c ./test_vfork 1024 200
count=1024, size=204800, total=200M
failed fork=0
time for fork() = 82.041000 us
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 96.47    0.001204           1      1000           vfork
  3.53    0.000044           0      1033           mmap
  0.00    0.000000           0         1           read
  0.00    0.000000           0         3           write
  0.00    0.000000           0         2           open
  0.00    0.000000           0         2           close
  0.00    0.000000           0         3           fstat
  0.00    0.000000           0         3           mprotect
  0.00    0.000000           0         1           munmap
  0.00    0.000000           0         3           brk
  0.00    0.000000           0         1         1 access
  0.00    0.000000           0         1           execve
  0.00    0.000000           0         1           arch_prctl
------ ----------- ----------- --------- --------- ----------------
100.00    0.001248                  2054         1 total
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Tue, 19 Nov 2013 05:13:20 +0000 (16:13 +1100)]
common: Refactor code to keep track of child processes
This code can then be used to track child processes created with vfork().
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 15 Nov 2013 07:59:04 +0000 (18:59 +1100)]
scripts: Run a single instance of debug_locks.sh at a given time
This prevents spamming of logs if multiple lock requests are waiting
and keep timing out.
Also, improve the logging format with separators.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 15 Nov 2013 07:36:09 +0000 (18:36 +1100)]
locking: Update current lock statistics when lock is scheduled
When a child process is created for a lock request, the current locks
statistics should be updated immediately. This will provide accurate
information on number of active lock requests.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 18 Nov 2013 04:48:22 +0000 (15:48 +1100)]
locking: Do not merge multiple lock requests to avoid unfair scheduling
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Fri, 15 Nov 2013 04:58:59 +0000 (15:58 +1100)]
locking: Implement active lock requests limit per database
Previously this limit was global rather than per database, which
prevented database freeze lock requests from being scheduled once
the global limit was reached.
Only individual record requests should be limited and database freeze
requests should always get scheduled.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
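Illustration for the commit above: the policy described (limit record requests per database, never limit freeze requests) can be sketched as follows. LockScheduler, try_schedule and the limit value are hypothetical; this is not the locking code itself.

```python
from collections import defaultdict


class LockScheduler:
    """Hypothetical scheduler; names and limit are illustrative."""

    def __init__(self, max_active_per_db):
        self.max_active_per_db = max_active_per_db
        self.active_records = defaultdict(int)  # db name -> active count

    def try_schedule(self, db, is_freeze):
        if is_freeze:
            return True  # database freeze requests are never limited
        if self.active_records[db] >= self.max_active_per_db:
            return False  # this database is at its limit; others are not
        self.active_records[db] += 1
        return True
```

With a per-database counter, a busy database can no longer starve requests for other databases or block a freeze.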
Martin Schwenke [Fri, 8 Nov 2013 05:41:11 +0000 (16:41 +1100)]
scripts: Rewrite statd-callout to avoid 10 minute lag
This is naive and assumes no performance problems when updating
persistent DBs. It also does no error handling.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 13 Nov 2013 06:45:25 +0000 (17:45 +1100)]
client: Treat empty __db_sequence_number__ record as 0
This fixes the issue of transaction commit failing due to an empty
__db_sequence_number__ record in persistent database left by previous
cancelled transaction.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
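Illustration for the commit above: treating an empty record as sequence number 0 amounts to a length check before decoding. The encoding assumed here (an 8-byte unsigned integer in native byte order) and the helper name read_seqnum are assumptions for the sketch; the commit message does not state the record format.

```python
import struct


def read_seqnum(record):
    """Decode a sequence number record (hypothetical encoding).

    Assumes the value is stored as an 8-byte unsigned integer in
    native byte order; the real format is not stated in the commit.
    """
    if len(record) == 0:
        return 0  # empty record left by a cancelled transaction
    if len(record) != 8:
        raise ValueError("unexpected record length: %d" % len(record))
    return struct.unpack("=Q", record)[0]
```

The commit path can then increment the returned value without special-casing a leftover empty record.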
Martin Schwenke [Wed, 13 Nov 2013 05:19:00 +0000 (16:19 +1100)]
doc: Update ctdb.1 - primarily to add pdelete/pfetch/pstore/ptrans
Also:
* Move <refentryinfo> above <refmeta> to make the XML valid.
* Describe DB argument in introduction and use it for database
commands.
* Remove unnecessary format="linespecific" from <screen> tags, since
it will not be allowed in DocBook 5.0.
* Sort the items in "INTERNAL COMMANDS".
* Update/simplify some command descriptions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 6 Nov 2013 02:43:53 +0000 (13:43 +1100)]
tools/ctdb: New ptrans command
Also add test.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Wed, 13 Nov 2013 03:04:17 +0000 (14:04 +1100)]
onnode: New -i option to stop stdin from being closed
This can be useful for piping data to onnode in certain circumstances.
There are now also enough command-line options that they should
definitely be alphabetically ordered.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 13 Nov 2013 03:13:52 +0000 (14:13 +1100)]
tests/integration: try_command_on_node() shouldn't lose onnode options
Currently it only passes the last (non -v) option seen. It should
pass them all.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 12 Nov 2013 04:16:49 +0000 (15:16 +1100)]
recoverd: Fix backward compatibility for CTDB_SRVID_TAKEOVER_RUN
When running a mixed-version cluster, compatibility with older
versions was broken during recent refactorisation.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 4 Nov 2013 01:56:39 +0000 (12:56 +1100)]
scripts: debug_locks.sh should use configuration to find TDB location
That is, don't use fixed paths.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 1 Nov 2013 03:34:20 +0000 (14:34 +1100)]
recoverd: A node refuses to play against itself
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Thu, 14 Nov 2013 03:25:47 +0000 (14:25 +1100)]
recoverd: Remove duplicate code to update flags during recovery
This also happens earlier in do_recovery() and the nodemap is not
updated after that, so this update is redundant.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 14 Nov 2013 03:14:10 +0000 (14:14 +1100)]
build: Update to latest upstream config.guess
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Wed, 13 Nov 2013 04:25:46 +0000 (15:25 +1100)]
tools/ctdb: Fix db commands when dbid is given instead of name
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 13 Nov 2013 03:33:31 +0000 (14:33 +1100)]
tests: CTDB tool should always be invoked as $CTDB instead of ctdb
$CTDB_TEST_WRAPPER is required only to run test functions or test binaries
on remote nodes. For running ctdb command, $CTDB is sufficient.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 13 Nov 2013 03:25:59 +0000 (14:25 +1100)]
tests: No need to run onnode in parallel for single node
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 13 Nov 2013 03:19:43 +0000 (14:19 +1100)]
tests: Remove -q option to try_command_on_node
This option is always passed to onnode by default.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 11 Nov 2013 01:41:17 +0000 (12:41 +1100)]
tests: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 11 Nov 2013 01:41:00 +0000 (12:41 +1100)]
tcp: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 11 Nov 2013 01:40:44 +0000 (12:40 +1100)]
tools/ctdb: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 11 Nov 2013 01:40:28 +0000 (12:40 +1100)]
common: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 11 Nov 2013 01:39:48 +0000 (12:39 +1100)]
client: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Mon, 11 Nov 2013 01:39:27 +0000 (12:39 +1100)]
server: Coverity fixes
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 7 Nov 2013 05:01:49 +0000 (16:01 +1100)]
tests: Fix calling of ctdb tool from test
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 7 Nov 2013 04:54:28 +0000 (15:54 +1100)]
Revert "tests: If transaction_start fails, try again"
This reverts commit
ed7d999214ee009e480c26410a04fa105028cb8e.
This is not necessary since ctdb_transaction_start() now will return NULL
only when there is a failure and not when another transaction is currently
active.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Thu, 7 Nov 2013 04:54:20 +0000 (15:54 +1100)]
client: Make g_lock_lock() wait till lock is obtained
This makes the behaviour of g_lock_lock() similar to that implemented in
Samba. Now ctdb_transaction_start() will return NULL only when there are
failures and not when another transaction is active.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Srikrishan Malik [Thu, 31 Oct 2013 06:24:58 +0000 (11:54 +0530)]
eventscript: Fix link creation failure if the link already exists but the target path is missing
Signed-off-by: Srikrishan Malik <srimalik@in.ibm.com>
Martin Schwenke [Wed, 16 Oct 2013 00:46:54 +0000 (11:46 +1100)]
doc: Update NEWS
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 30 Oct 2013 02:22:21 +0000 (13:22 +1100)]
web: Add links to new manpages
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 23 Sep 2013 06:26:16 +0000 (16:26 +1000)]
doc: Major updates to manual pages
This includes new manpages for ctdb.7, ctdb.conf.5 and ctdb-tunables.7.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 30 Oct 2013 01:37:15 +0000 (12:37 +1100)]
tunables: Remove obsolete tunables
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Wed, 30 Oct 2013 01:17:37 +0000 (12:17 +1100)]
recoverd: Rebalancing should be done regardless of tunable
Rebalance target nodes should be set even if a deferred rebalance is
not configured. The user can explicitly cause a takeover run.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 30 Oct 2013 00:32:28 +0000 (11:32 +1100)]
recoverd: Improve an error message in the election code
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 29 Oct 2013 05:38:42 +0000 (16:38 +1100)]
Revert "if a new node enters the cluster, that node will already be frozen at start"
This is unnecessary due to
03e2e436db5cfd29a56d13f5d2101e42389bfc94.
Furthermore, if a node doesn't force an election but wins it then it
can fail to record that it is the new recovery master. This can lead
to a reverse split brain where there is no recovery master.
This reverts commit
c5035657606283d2e35bea40992505e84ca8e7be.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Conflicts:
server/ctdb_recoverd.c
Martin Schwenke [Tue, 29 Oct 2013 03:05:41 +0000 (14:05 +1100)]
ctdbd: When a node is connected, log at DEBUG_NOTICE not DEBUG_INFO
This is important enough that we should see it when the log level is
DEBUG_NOTICE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 28 Oct 2013 05:20:44 +0000 (16:20 +1100)]
tests/complex: Remove CTDB_NFS_SKIP_SHARE_CHECK test
This is a needlessly complex way of testing the same thing as the
eventscripts unit tests 60.nfs.monitor.161.sh and
60.nfs.monitor.162.sh.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 28 Oct 2013 05:14:40 +0000 (16:14 +1100)]
tests/complex: Remove CTDB_SAMBA_SKIP_SHARE_CHECK test
This is adequately covered by eventscripts unit tests
50.samba.monitor.105.sh and 50.samba.monitor.106.sh.
This test is broken if CTDB_SAMBA_CHECK_PORTS is not specified in the
CTDB configuration. Fixing it is hard and involves adding a more
complex stub for testparm. We already have that in the eventscript
unit tests above.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 28 Oct 2013 05:00:54 +0000 (16:00 +1100)]
eventscripts: Rewrite the smb.conf cache file handling
The background update is never guaranteed to complete before the cache
is used, so don't bother trying it at the beginning. Instead, put a
timeout on a foreground update.
If the foreground update fails:
* If there's no available cache file then die.
* If there is a previous cache file then use it and log a warning.
* Do a background update at the end of the monitor event.
Also remove commas in the "smb ports" list before use, since (newer?)
testparm seems to insert commas into the default value. Update the
associated test to add a comma.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
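Illustration for the commit above: the fallback rules and the comma handling can be sketched in Python. The eventscripts are shell; choose_cache and normalise_ports are hypothetical helpers that only mirror the rules listed in the message.

```python
def choose_cache(update_ok, have_previous_cache):
    """Return (action, warn) after a timed foreground update attempt.

    Hypothetical helper mirroring the rules in the commit message.
    """
    if update_ok:
        return ("use_new", False)
    if not have_previous_cache:
        return ("die", False)
    # A stale cache is better than none; a background refresh would
    # be started at the end of the monitor event.
    return ("use_previous", True)


def normalise_ports(value):
    # "445, 139" and "445 139" both become ["445", "139"].
    return value.replace(",", " ").split()
```

Only the case with a failed update and no earlier cache is fatal; every other case leaves the monitor event with a usable configuration.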
Martin Schwenke [Fri, 25 Oct 2013 05:25:25 +0000 (16:25 +1100)]
tools/ctdb: Fix documentation string for ban command
Ban time of 0 is not supported.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 24 Oct 2013 00:13:16 +0000 (11:13 +1100)]
Revert "recoverd: Disable takeover runs on other nodes for 5 minutes"
5 minutes is too long to leave the cluster in limbo if the recovery
daemon dies during a takeover run, even though this is quite unlikely.
We need a new recovery master to be able to do takeover runs fairly
quickly.
This reverts commit
71080676bb4acbd0d9b595a30cf7fe6dddbf426f.
Martin Schwenke [Thu, 24 Oct 2013 03:15:53 +0000 (14:15 +1100)]
tools/onnode: Fix healthy/ok node handling
This bit-rotted a long time ago when the "ThisNode" column was added
to "ctdb -Y status" output. The fake "ctdb -Y status" output in the
test was never updated to reflect this change.
Instead of making sure that all columns are "0", just check that
they're not "1". This implicitly ignores "Y" and "N" in this
"ThisNode" column without having to do anything else clever.
Also update associated tests. The main "ctdb ok" test had a duplicate
opening line for a here document, which was tickled by this change.
This fixes samba bz#8122.
Signed-off-by: Martin Schwenke <martin@meltin.net>
onnode test fixup
Signed-off-by: Martin Schwenke <martin@meltin.net>
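Illustration for the commit above: checking that no column is "1", rather than that every column is "0", tolerates a trailing "Y"/"N" column without special handling. The line layout below (node number, IPv4 address, flag columns, then "ThisNode") is an assumption for this sketch, and is_ok is a hypothetical helper, not onnode's shell code.

```python
def is_ok(status_line):
    """True if no flag column in a colon-separated status line is "1".

    The column layout (node number, IPv4 address, flag columns, then
    a Y/N "ThisNode" column) is assumed for illustration only.
    """
    fields = status_line.strip(":").split(":")
    flags = fields[2:]  # skip node number and address
    return "1" not in flags
```

For example, is_ok(":0:192.168.1.1:0:0:0:0:0:0:0:Y:") is true, while a line with any flag column set to "1" is rejected.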
Amitay Isaacs [Mon, 28 Oct 2013 07:49:51 +0000 (18:49 +1100)]
daemon: Change the default recovery method for persistent databases
Use sequence numbers to do recovery for persistent databases instead of
RSNs. This fixes the problem of registry corruption during recovery.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Amitay Isaacs [Wed, 23 Oct 2013 04:37:41 +0000 (15:37 +1100)]
packaging: Create runtime directories for CTDB
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Wed, 23 Oct 2013 00:28:26 +0000 (11:28 +1100)]
initscript: Update systemd configuration to put PID file in /run/ctdb
Elsewhere we're moving the socket to /var/run/ctdb. We might end up
with PID files and sockets for other daemons later, so let's call the
directory "ctdb" instead of "ctdbd".
Signed-off-by: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Thu, 3 Oct 2013 05:19:05 +0000 (15:19 +1000)]
build: Move the default CTDB socket from /tmp to /var/run/ctdb
Use /var/run/ctdb/ctdbd.socket because there might be other daemons
that need sockets in the future.
The local daemons test code to create a link for the default
convenience socket has to be removed because the link can't be created
as a regular user in the new location. This should be OK since all
calls to the ctdb tool in the test code should be wrapped in onnode.
When debugging tests, a developer will have to set CTDB_SOCKET by
hand.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Thu, 3 Oct 2013 05:47:30 +0000 (15:47 +1000)]
packaging: Move ctdb/ directory from /var to /var/lib
Introduce CTDB_VARDIR variable that points to /var/lib/ctdb by default.
This makes CTDB_VARDIR consistent across C code and scripts.
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 21 Oct 2013 08:36:36 +0000 (19:36 +1100)]
ctdbd: Simplify database directory setting logic
No need to check if the options are set. The options are always set
via static defaults.
No need to talloc_strdup() the values via wrapper functions. The
options aren't going away. Remove now unused ctdb_set_tdb_dir() and
similar functions.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 21 Oct 2013 08:36:36 +0000 (19:36 +1100)]
ctdbd: Remove duplicate database directory setting logic
Defaults for ctdb->db_directory and similar variables are currently
set in 2 places.
Change this to set them in only 1 place and make the directories at
initialisation time instead of waiting until later.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 21 Oct 2013 08:29:39 +0000 (19:29 +1100)]
common: New function ctdb_mkdir_p_or_die()
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 21 Oct 2013 08:08:52 +0000 (19:08 +1100)]
common: New function mkdir_p()
Behaves like mkdir -p.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
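Illustration for the commit above: "behaves like mkdir -p" means creating each missing component in turn and treating an already existing directory as success. The Python sketch below shows that behaviour; it is not the C function from the commit, and the default mode is an assumption.

```python
import os


def mkdir_p(path, mode=0o755):
    """Create path and any missing parent directories.

    Illustrative only: walks the path one component at a time, the
    way a C implementation built on mkdir(2) might.
    """
    current = os.sep if os.path.isabs(path) else ""
    for part in path.split(os.sep):
        if not part:
            continue  # skip empty pieces from leading or doubled separators
        current = os.path.join(current, part)
        try:
            os.mkdir(current, mode)
        except FileExistsError:
            if not os.path.isdir(current):
                raise  # a non-directory is in the way
```

Calling it twice on the same path is harmless, which is the property startup code relies on.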
Amitay Isaacs [Thu, 3 Oct 2013 05:13:41 +0000 (15:13 +1000)]
tcp: Create socket lock in /var/run/ctdb instead of /tmp
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Pair-programmed-with: Martin Schwenke <martin@meltin.net>
Amitay Isaacs [Thu, 24 Oct 2013 03:26:12 +0000 (14:26 +1100)]
doc/examples: Add CTDB configuration examples
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Mathieu Parent [Thu, 29 Aug 2013 06:20:05 +0000 (08:20 +0200)]
Add missing $remote_fs LSB dependency
Mathieu Parent [Thu, 29 Aug 2013 05:42:12 +0000 (07:42 +0200)]
Improved check_ctdb
- increase verbosity with "-v"
- concat error messages (if there are several)
- handle 255 return code as warning (as it is the return code when any of the nodes is missing)
- read /etc/ctdb/nodes remotely (ctdb_check can be run on a non-ctdb host)
Mathieu Parent [Thu, 15 Aug 2013 18:23:57 +0000 (20:23 +0200)]
Add missing events.d/99.timeout
Amitay Isaacs [Thu, 24 Oct 2013 03:37:41 +0000 (14:37 +1100)]
eventscripts: Instead of listing all tunables, query EventScriptTimeout
Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Michael Adam [Tue, 22 Oct 2013 22:46:34 +0000 (00:46 +0200)]
ctdb_client.h: fix build on AIX by removing C++-style comments
Reported by John P Janosik <jpjanosi@us.ibm.com>
Signed-off-by: Michael Adam <obnox@samba.org>
Martin Schwenke [Mon, 21 Oct 2013 08:52:01 +0000 (19:52 +1100)]
ctdbd: Pass the public address file location in ctdb context
No need to pass it as an extra argument to ctdb_start_daemon.
Also ensure options.public_address_list gets a nice static default.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 1 Oct 2013 05:13:29 +0000 (15:13 +1000)]
ctdbd: Debug locks by default with override from environment variable
Default is debug_locks.sh, relative to CTDB_BASE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 15 Oct 2013 03:10:58 +0000 (14:10 +1100)]
ctdbd: Default for event_script_dir should use CTDB_BASE
Also get rid of ctdb_set_event_script_dir(). It creates an
unnecessary copy of something that will be around for the lifetime of
the process.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 21 Oct 2013 08:33:10 +0000 (19:33 +1100)]
ctdbd: Add nodes_file member to struct ctdb_context
This allows ctdb_load_nodes_file() to move to ctdb_server.c and
ctdb_set_nlist() to become static.
Setting ctdb->nodes_file needs to be done early, before the nodes file
is loaded. It is now set from CTDB_BASE instead of ETCDIR, so setting
CTDB_BASE also needs to be done earlier.
Unhack ctdbd_test.c - it no longer needs to define
ctdb_load_nodes_file().
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Mon, 21 Oct 2013 08:43:47 +0000 (19:43 +1100)]
tools/ctdb: CTDB_BASE is the default location of configuration files
Ensure that environment variable CTDB_BASE is set.
Update defaults for nodes and natgw_nodes to use CTDB_BASE.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Tue, 15 Oct 2013 03:02:31 +0000 (14:02 +1100)]
ctdbd: Don't check CTDB_BASE before setting it, just don't override
That's what the 3rd argument to setenv(3) is for... :-)
Signed-off-by: Martin Schwenke <martin@meltin.net>
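Illustration for the commit above: in C, setenv(name, value, 0) leaves an existing variable untouched, so no prior getenv() check is needed. A rough Python analogue is os.environ.setdefault(); set_default_base and its default path are hypothetical placeholders, not values taken from the commit.

```python
import os


def set_default_base(default="/etc/ctdb"):
    """Set CTDB_BASE only if it is not already set.

    os.environ.setdefault() plays the role of setenv(name, value, 0);
    the default path here is a placeholder for illustration.
    """
    return os.environ.setdefault("CTDB_BASE", default)
```

A value already exported by the administrator wins; otherwise the default is installed in one call.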
Martin Schwenke [Tue, 22 Oct 2013 04:36:30 +0000 (15:36 +1100)]
tests/integration: Pass --valgrinding option when running under valgrind
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Mon, 21 Oct 2013 08:42:32 +0000 (19:42 +1100)]
ctdbd: Fix some errors in the popt configuration
That 4th argument isn't a default or similar, so consistently make it 0.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 18 Oct 2013 05:43:26 +0000 (16:43 +1100)]
initscript: New configuration variable CTDB_DBDIR_STATE
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Fri, 18 Oct 2013 02:24:03 +0000 (13:24 +1100)]
scripts: Make detect_init_style() more readable
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 05:44:24 +0000 (16:44 +1100)]
eventscripts: Rework the iSCSI eventscript
* It should run on "ipreallocated" instead of "recovered"
* Variable name NODE -> ip since that's what it is
* Simplify some logic
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 05:20:18 +0000 (16:20 +1100)]
eventscripts: Don't update static routes on "recovered" event
Routes only need to be updated when IPs have moved. IP takeover runs
will generate "ipreallocated", which is enough. "recovered" always
follows "ipreallocated" anyway, so avoid the redundancy.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 05:17:26 +0000 (16:17 +1100)]
eventscripts: NAT gateway script doesn't need to handle "recovered" event
Any time a node changes flags in any significant way there will be a
takeover run, which will generate an "ipreallocated" event. The
"recovered" event always happens straight after a takeover run so we
update the NAT gateway twice.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 05:14:14 +0000 (16:14 +1100)]
eventscripts: Delete placeholder "recovered" and "shutdown" events
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 05:13:21 +0000 (16:13 +1100)]
eventscripts: Clean up comment at the top of 00.ctdb
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 05:00:39 +0000 (16:00 +1100)]
eventscripts: Remove reconfigure check from samba and winbind eventscripts
There is no reconfigure code for these scripts so no need to check for
reconfiguration.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 04:58:25 +0000 (15:58 +1100)]
eventscripts: Remove reconfigure code from httpd eventscript
Nothing ever (or has ever) set the "needs reconfigure" flag, so this
code is unnecessary.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 04:23:35 +0000 (15:23 +1100)]
eventscripts: Fold ctdb_check_tcp_ports_ctdb() into ctdb_check_tcp_ports()
A generic framework is no longer needed now that the "ctdb" checker is
the only one left. Simplify the code.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Thu, 17 Oct 2013 00:02:54 +0000 (11:02 +1100)]
eventscripts: Remove TCP port checks other than the built-in CTDB one
"ctdb checktcpport" is no longer experimental so the other checkers
are no longer required.
Remove tests related to the removed checkers.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 16 Oct 2013 23:52:00 +0000 (10:52 +1100)]
scripts: Remove setting of PATH from functions file
The current setting is inconsistent with settings on most systems,
putting /bin before /sbin. Use of /usr/local/bin, which may be
required on some systems, is also overridden. This can make it
difficult to do interactive debugging of script problems.
Rely on the system PATH instead.
If system-specific changes need to be made then this can be done in a
configuration file.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Wed, 16 Oct 2013 23:39:09 +0000 (10:39 +1100)]
tests/eventscripts: Run scripts under sh by default
Some scripts are disabled by default so are not executable. Explicitly
running them under sh allows them to be run without having to mess
around and make them executable or similar.
Signed-off-by: Martin Schwenke <martin@meltin.net>
Martin Schwenke [Tue, 15 Oct 2013 05:44:45 +0000 (16:44 +1100)]
tests/eventscripts: New tests for 20.multipathd
Signed-off-by: Martin Schwenke <martin@meltin.net>