Volker Lendecke [Wed, 9 Dec 2009 14:11:45 +0000 (15:11 +0100)]
Make fetch_locked more scalable
This patch improves the handling of the fetch_lock operation on non-persistent
databases that ctdb clients have to do very frequently.
The normal flow is as follows:
1. Client does a local fetch_lock on the database
2. Client checks whether the local node is dmaster.
If yes, everything is fine
If no, continue here
3. Client unlocks the local record
4. Client issues a "get me the record" call to ctdbd
5. ctdbd goes out and fetches the dmaster role
6. ctdbd tells the client to retry
7. Client starts over again
The problem is between step 6 and 7: Before the client has had the chance to
retry (i.e. catch the record with a fetch_locked), another node might have come
asking ctdbd to migrate away the record again. This is a real problem, I've
seen >20 loops of this kind in real workloads.
This patch does the following: Whenever ctdb receives a record as result of
step 5, it puts the key on a "holdback list". As long as a key is on this list,
a request to migrate away the dmaster is put on hold. It is the client's duty
to issue the "CTDB_CONTROL_GOTIT" control when it has successfully done step 2
after having asked ctdb to fetch the record. This will release the key from the
"holdback list" and re-issue all dmaster migration requests.
As a safeguard against malicious clients, once a second (default 1000msecs,
tunable "HoldbackCleanupInterval" in milliseconds) ctdbd goes over the list of
held back keys, deletes them and releases all held back migration requests.
(This used to be ctdb commit
5736e17c139c9a8049e235429aeae0c6c9d0e93d)
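The holdback list described in the fetch_locked commit above can be sketched roughly as follows. This is a minimal model and not the ctdbd implementation: keys are plain strings rather than TDB_DATA blobs, the list is a fixed-size array, and every name (holdback_add, migration_request, holdback_release, holdback_cleanup, MAX_HELD, MAX_KEY) is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD 16      /* illustrative capacity, not a ctdb constant */
#define MAX_KEY  64      /* illustrative key size, not a ctdb constant */

/* One held-back key plus the migration requests parked on it. */
struct held_key {
    bool in_use;
    char key[MAX_KEY];
    int  parked;
};

struct holdback_list {
    struct held_key slots[MAX_HELD];
};

static struct held_key *holdback_find(struct holdback_list *l, const char *key)
{
    for (size_t i = 0; i < MAX_HELD; i++) {
        if (l->slots[i].in_use && strcmp(l->slots[i].key, key) == 0) {
            return &l->slots[i];
        }
    }
    return NULL;
}

/* Step 5 finished: the daemon received the record, so hold the key back. */
bool holdback_add(struct holdback_list *l, const char *key)
{
    if (strlen(key) >= MAX_KEY) {
        return false;
    }
    if (holdback_find(l, key) != NULL) {
        return true;
    }
    for (size_t i = 0; i < MAX_HELD; i++) {
        if (!l->slots[i].in_use) {
            l->slots[i].in_use = true;
            l->slots[i].parked = 0;
            strcpy(l->slots[i].key, key);
            return true;
        }
    }
    return false;
}

/* Another node asks for the dmaster role. Returns true if the migration
 * may proceed now, false if the request was parked on the holdback list. */
bool migration_request(struct holdback_list *l, const char *key)
{
    struct held_key *h = holdback_find(l, key);
    if (h == NULL) {
        return true;
    }
    h->parked++;
    return false;
}

/* The client confirmed it caught the record: release the key and return
 * the number of parked migration requests that must be re-issued. */
int holdback_release(struct holdback_list *l, const char *key)
{
    struct held_key *h = holdback_find(l, key);
    if (h == NULL) {
        return 0;
    }
    h->in_use = false;
    return h->parked;
}

/* Periodic safeguard: drop every held key, return the total released. */
int holdback_cleanup(struct holdback_list *l)
{
    int released = 0;
    for (size_t i = 0; i < MAX_HELD; i++) {
        if (l->slots[i].in_use) {
            released += l->slots[i].parked;
            l->slots[i].in_use = false;
        }
    }
    return released;
}
```

In this model the daemon would call holdback_cleanup() from a timer at the HoldbackCleanupInterval, so a client that never confirms cannot block migrations for longer than one interval.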
Volker Lendecke [Thu, 10 Dec 2009 12:02:29 +0000 (13:02 +0100)]
Import "talloc_array_length" from upstream talloc
(This used to be ctdb commit
844aa6300ee4d87561e698001ebc15ac1e455528)
Michael Adam [Fri, 11 Dec 2009 15:39:58 +0000 (16:39 +0100)]
tests: temporarily disable the transaction test tool.
Make it return success for make test.
This is temporarily disabled until the rewrite of the
transaction code (in samba and the daemon) using the global
lock feature has been ported to the ctdb client code.
Michael
(This used to be ctdb commit
78ca29352aa39f4ef4e41096b92d55cb2e0d348a)
Michael Adam [Fri, 11 Dec 2009 14:31:02 +0000 (15:31 +0100)]
Add a new control CTDB_GET_DB_SEQNUM - fetch a persistent db's sequence number.
Michael
(This used to be ctdb commit
a7e3b5fac6b3f5d74473f26eb86c067b35647996)
Michael Adam [Fri, 11 Dec 2009 13:19:55 +0000 (14:19 +0100)]
define CTDB_DB_SEQNUM_KEY - used with the new implementation of transactions.
Michael
(This used to be ctdb commit
4b1dbcf0853bdc4832d39a477823ae34f216da52)
Volker Lendecke [Wed, 9 Dec 2009 16:20:23 +0000 (17:20 +0100)]
Tiny simplification of ctdb_queue_packet()
(This used to be ctdb commit
1640da1cab7e8b545367824204c82931f3346848)
Volker Lendecke [Tue, 8 Dec 2009 16:00:55 +0000 (17:00 +0100)]
Rename a struct member for clarity
(This used to be ctdb commit
6af5e74a21546d723008d69d6752ebebf898c947)
Michael Adam [Thu, 3 Dec 2009 16:59:49 +0000 (17:59 +0100)]
server: add a new control CTDB_CONTROL_TRANS3_COMMIT
This is a simplified version of the trans2 commit control:
It just rolls out the marshal buffer to all active nodes.
It is the main ctdbd part of the re-implementation of the
persistent transactions. The client code is changed to
take a global lock to start a transaction and store into
the marshal buffer instead of writing to the local tdb
under a local transaction.
The old transaction implementation is going to be
removed in a later commit.
Michael
(This used to be ctdb commit
f66428f9d2013080a414404c1ba6117888352fd6)
Ronnie Sahlberg [Wed, 9 Dec 2009 21:53:55 +0000 (08:53 +1100)]
From: Volker Lendecke <vl@samba.org>
Date: Wed, 9 Dec 2009 22:45:12 +0100
Subject: [PATCH] Revert an accidental commit
(This used to be ctdb commit
af6656f2844d8fd72204a70358c9d589dbe1bd34)
Michael Adam [Wed, 9 Dec 2009 21:04:48 +0000 (22:04 +0100)]
tests: remove the no_trans mode from ctdb_transaction.
Writes without transaction are not possible any more on
persistent databases.
Michael
(This used to be ctdb commit
59f46d7261dfdbdef900bf95dd9eb28ad22a46b2)
Michael Adam [Thu, 30 Jul 2009 09:59:59 +0000 (11:59 +0200)]
tests: remove the persistent_unsafe writes test.
This is useless now that persistent write operations without
transaction are forbidden.
Michael
(This used to be ctdb commit
b022863d44026c19d5aae54aa485b670bea0540e)
Michael Adam [Thu, 30 Jul 2009 09:59:02 +0000 (11:59 +0200)]
tests: remove persistent_safe write test.
This is useless now that persistent writes without transactions are forbidden.
Michael
(This used to be ctdb commit
9ac82311d796e1fab31f8de62b8ccc754445093c)
Michael Adam [Wed, 9 Dec 2009 20:38:44 +0000 (21:38 +0100)]
test: add test 54_ctdb_transaction_recovery.sh
This is like the 53_ctdb_transaction test, but it additionally
runs a loop with recoveries while the transactions are running.
When called like this, the transaction loops run for 10 minutes:
CTDB_TEST_TIMELIMIT=600 tests/scripts/run_tests tests/simple/54_ctdb_transaction_recovery.sh
The default timelimit is 30 seconds.
Michael
(This used to be ctdb commit
2ff2679e8f3d50ebf735f2c420898a84268bdc95)
Michael Adam [Wed, 9 Dec 2009 20:36:42 +0000 (21:36 +0100)]
test: get value for --timelimit from environment var CTDB_TEST_TIMELIMIT in transaction test
Michael
(This used to be ctdb commit
c13077ca64f6e6569c30ef7fcb044e5711dce1a3)
Michael Adam [Wed, 9 Dec 2009 14:05:20 +0000 (15:05 +0100)]
client: lower level of commit retry message WARNING->DEBUG
This can happen frequently when recoveries intercept transactions.
Michael
(This used to be ctdb commit
c46adb210e47530488503e20d682d4d182c0fb79)
Michael Adam [Wed, 9 Dec 2009 12:48:49 +0000 (13:48 +0100)]
client: lower debug level of transaction-active-retry message to DEBUG
This reduces some noise.
Michael
(This used to be ctdb commit
54d227811753f4a87f1a2c9dc0b1389f5ca2a12f)
Michael Adam [Wed, 9 Dec 2009 12:43:38 +0000 (13:43 +0100)]
call: lower the debug message "refusing migration while transction" to lvl INFO
This gets just too noisy on a busy system.
And it is purely informational anyways...
Michael
(This used to be ctdb commit
7f64a00c76203fdf6673c3f862a4bfd17fb848d7)
Volker Lendecke [Wed, 9 Dec 2009 16:14:16 +0000 (17:14 +0100)]
Run only one event for each epoll_wait/select call
This might be a bit less efficient, but experience in winbind has shown that
event callbacks can trigger changes in the socket state in very hard to
diagnose ways.
(This used to be ctdb commit
a78b8ea7168e5fdb2d62379ad3112008b2748576)
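The "one event per epoll_wait/select call" policy of the commit above can be illustrated with poll(). This is only a sketch of the idea under assumed names (run_one_event, fd_handler_t, drain_one_byte are hypothetical); the real change lives in ctdb's event library and uses epoll/select.

```c
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

typedef void (*fd_handler_t)(int fd, void *private_data);

/* Wait for activity, then dispatch at most ONE ready descriptor.
 * Returns the index that was handled, or -1 if nothing was ready
 * or poll() failed. */
int run_one_event(struct pollfd *fds, nfds_t nfds, int timeout_ms,
                  fd_handler_t handler, void *private_data)
{
    int ready = poll(fds, nfds, timeout_ms);
    if (ready <= 0) {
        return -1;
    }
    for (nfds_t i = 0; i < nfds; i++) {
        if (fds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
            handler(fds[i].fd, private_data);
            /* Stop here: the handler may have changed the state of the
             * other descriptors, so their revents can no longer be trusted. */
            return (int)i;
        }
    }
    return -1;
}

/* Example handler: consume one byte and count the call. */
void drain_one_byte(int fd, void *private_data)
{
    char c;
    if (read(fd, &c, 1) == 1) {
        (*(int *)private_data)++;
    }
}
```

The cost is one extra poll() per ready descriptor; the benefit is that every callback runs against a freshly polled socket state.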
Christian Ambach [Tue, 8 Dec 2009 18:23:19 +0000 (19:23 +0100)]
reduce vacuuming lognoise
syslog.h says:
LOG_NOTICE 5 normal but significant condition
LOG_INFO 6 informational
several vacuuming related logs logged at NOTICE level although I don't see
any real significance, these are just informational messages for me
Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>
(This used to be ctdb commit
142111983c103e90ccccbe26fd580c4eb28e949f)
Christian Ambach [Tue, 8 Dec 2009 18:08:37 +0000 (19:08 +0100)]
improve time jump logging
add the __location__ macro to the logs to get a better idea
in which loop the problem occurred
Signed-off-by: Christian Ambach <christian.ambach@de.ibm.com>
(This used to be ctdb commit
dccb549fd6a6e338063699544e52f2a1a6a966b5)
Ronnie Sahlberg [Wed, 9 Dec 2009 03:26:42 +0000 (14:26 +1100)]
Merge commit 'rusty/script-report'
(This used to be ctdb commit
6e8b279ed307eccac08386e98510361ba3ab3d36)
Ronnie Sahlberg [Wed, 9 Dec 2009 00:33:04 +0000 (11:33 +1100)]
Bond devices can have any name the user configures, so
when checking link status for an interface, first
check if this interface is in fact a bond device
(by the presence of a /proc/net/bonding/IFACE file)
and use that file for checking status.
Otherwise assume ib* is an infiniband interface, which we don't know how
to check; anything else is treated as an ethernet interface, where ethtool
should hopefully work.
(This used to be ctdb commit
8cc6c5de3d7abb0b72eaa6e769e70963b02d84cb)
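The interface classification order from the bond-device commit above is sketched here in C. The actual check is done by an event script, so this is only a rendering of the decision order; classify_iface, enum iface_kind and the bonding_dir parameter are hypothetical names introduced for illustration.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

enum iface_kind { IFACE_BOND, IFACE_INFINIBAND, IFACE_ETHERNET };

/* Decide how link status should be checked for an interface.
 * bonding_dir would normally be "/proc/net/bonding". */
enum iface_kind classify_iface(const char *bonding_dir, const char *iface)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/%s", bonding_dir, iface);

    /* 1. A file named after the interface marks a bond device,
     *    whatever name the user gave it. */
    if (n > 0 && (size_t)n < sizeof(path) && access(path, F_OK) == 0) {
        return IFACE_BOND;
    }
    /* 2. ib* is assumed to be infiniband, which cannot be checked. */
    if (strncmp(iface, "ib", 2) == 0) {
        return IFACE_INFINIBAND;
    }
    /* 3. Everything else is treated as ethernet (ethtool). */
    return IFACE_ETHERNET;
}
```

Checking for the bonding file first matters because a bond named, say, "ib_bond" would otherwise be misclassified by the name prefix.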
Ronnie Sahlberg [Wed, 9 Dec 2009 00:13:29 +0000 (11:13 +1100)]
make sure to also check that interfaces used for NATGW are ok
and have a link.
if not the node should become unhealthy
(This used to be ctdb commit
03b5bbaae1b53830a4cd20d3079ab8f45ffce923)
Stefan Metzmacher [Mon, 7 Dec 2009 13:37:21 +0000 (14:37 +0100)]
events/50.samba: only use wbinfo --ping-dc if available
metze
(This used to be ctdb commit
7b73834ba3ac197cc8a3020c111f9bb2c567e70b)
Rusty Russell [Mon, 7 Dec 2009 15:20:55 +0000 (01:50 +1030)]
ctdb: scriptstatus can now query non-monitor events
We also no longer return an error before scripts have been run; a special
zero-length data means we have never run the scripts.
"ctdb scriptstatus all" returns all event script results.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
9b90d671581e390e2892d3a68f3ca98d58bef4df)
Rusty Russell [Mon, 7 Dec 2009 15:17:13 +0000 (01:47 +1030)]
eventscript: expose call names and enum
We're going to need this so ctdb can query non-monitor status.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
53bc5ca23ca55a3ac63a440051f16716944a2a51)
Rusty Russell [Mon, 7 Dec 2009 15:02:36 +0000 (01:32 +1030)]
eventscript: lock logging on timeout.
Ronnie suggested this; seems like a very good idea.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
93153bca68926401dc9ae7fd77ed3f17be923344)
Rusty Russell [Mon, 7 Dec 2009 15:01:53 +0000 (01:31 +1030)]
ctdb: support --machinereadable (-Y) for scriptstatus
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
47ffe75848f216568ce3db0a60ca88cfe3d6903a)
Rusty Russell [Tue, 8 Dec 2009 01:59:10 +0000 (12:29 +1030)]
eventscript: get rid of ctdb_control_event_script_finished altogether
We always have to call it before freeing the state; we should just do
this work in the destructor itself.
Unfortunately, the script state would already be freed by the time
the state destructor is called, so we make the script state a child of
ctdb, and talloc_free() it manually on the one path which doesn't use
the destructor.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
c1ba1392fe52762960e896ace0aca0ee4faa94d5)
Rusty Russell [Tue, 8 Dec 2009 01:57:48 +0000 (12:27 +1030)]
eventscript: save state for all script invocations
Rather than only transferring to last_status for monitor events, do
it for every event (ctdb->last_status is now an array).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
c73ea56275d4be76f7ed983d7565b20237dbdce3)
Rusty Russell [Tue, 8 Dec 2009 01:54:56 +0000 (12:24 +1030)]
eventscript: cleanup finished to take state arg
We only need ctdb->current_monitor so we can kill it when we want to run
something else; we don't need to use it here as we always know what script
we are running.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
4cf1b7c32bcf7e4b65aec1fa7ee1a4b162cac889)
Rusty Russell [Tue, 8 Dec 2009 02:18:17 +0000 (12:48 +1030)]
eventscript: use wire format internally for script status.
The only difference between the exposed and internal structure now is
that the name and output fields were pointers. Switch to using
ctdb_scripts_wire/ctdb_script_wire internally as well so marshalling
is a noop.
We now reject scripts which are too long and truncate logging to the
511 characters we have space for (the entire output will be in the
normal ctdbd log).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
fd2f04554e604bc421806be96b987e601473a9b8)
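The fixed-size wire record from the commit above, with over-long names rejected and output truncated to 511 characters, might look roughly like this. The struct layout, the name limit (MAX_SCRIPT_NAME) and the function names are assumptions for illustration and are not the real ctdb_script_wire definition; only the 511-character output limit comes from the commit message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SCRIPT_NAME   31    /* assumed limit, not taken from ctdb */
#define MAX_SCRIPT_OUTPUT 511   /* from the commit message */

/* Fixed-size record: no pointers, so it can be sent as-is. */
struct script_wire {
    char    name[MAX_SCRIPT_NAME + 1];
    int32_t status;
    char    output[MAX_SCRIPT_OUTPUT + 1];
};

/* Reject script names that do not fit into the record. */
bool script_set_name(struct script_wire *s, const char *name)
{
    if (strlen(name) > MAX_SCRIPT_NAME) {
        return false;
    }
    strcpy(s->name, name);
    return true;
}

/* Append script output, silently dropping whatever no longer fits.
 * Returns the number of bytes actually stored. */
size_t script_append_output(struct script_wire *s, const char *data, size_t len)
{
    size_t used = strlen(s->output);
    size_t room = MAX_SCRIPT_OUTPUT - used;

    if (len > room) {
        len = room;
    }
    memcpy(s->output + used, data, len);
    s->output[used + len] = '\0';
    return len;
}
```

Because the record contains no pointers, "marshalling" reduces to copying the struct, which is the noop the commit message refers to.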
Rusty Russell [Mon, 7 Dec 2009 14:21:24 +0000 (00:51 +1030)]
eventscript: rename ctdb_monitoring_wire to ctdb_scripts_wire
We're going to allow fetching status of all script runs, so this
name is no longer appropriate.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
f5cb41ecf3fa986b8af243e8546eb3b985cd902a)
Rusty Russell [Tue, 8 Dec 2009 02:17:24 +0000 (12:47 +1030)]
eventscript: get_current_script() helper
This neatens the code slightly. We also use the name 'current' in
ctdb_event_script_handler() for uniformity.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
e9661b383e0c50b9e3d114b7434dfe601aff5744)
Rusty Russell [Tue, 8 Dec 2009 02:17:05 +0000 (12:47 +1030)]
eventscript: use an array rather than a linked list of scripts
This brings us closer to the wire format, by using a simple array
and a 'current' iterator.
The downside is that a 'struct ctdb_script' is no longer a talloc
object: the state must be passed to our log fn, and the current
script extracted with &state->scripts->scripts[state->current].
The wackiness of marshalling is simplified, and as a bonus, we can
distinguish between an empty event directory
(state->scripts->num_scripts == 0) and an error (state->scripts ==
NULL).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
76e8bdc11b953398ce8850de57aa51f30cb46bff)
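The array-plus-iterator layout from the commit above, together with a get_current_script() style helper, can be sketched as follows. The struct contents are reduced to a single field and the type and function names (script_list, script_state, script_list_new, script_list_state) are hypothetical; only the access pattern &state->scripts->scripts[state->current] and the empty-versus-error distinction are taken from the commit message.

```c
#include <stddef.h>
#include <stdlib.h>

struct script {
    int status;
};

struct script_list {
    unsigned int  num_scripts;
    struct script scripts[];      /* flexible array member */
};

struct script_state {
    struct script_list *scripts;  /* NULL: listing the event dir failed */
    unsigned int        current;  /* index of the script being run */
};

enum list_state { LIST_ERROR, LIST_EMPTY, LIST_HAS_SCRIPTS };

/* Allocate a zeroed list with room for n scripts. */
struct script_list *script_list_new(unsigned int n)
{
    struct script_list *l = calloc(1, sizeof(*l) + n * sizeof(struct script));
    if (l != NULL) {
        l->num_scripts = n;
    }
    return l;
}

/* Return the script the iterator points at, or NULL if there is none. */
struct script *get_current_script(struct script_state *state)
{
    if (state->scripts == NULL ||
        state->current >= state->scripts->num_scripts) {
        return NULL;
    }
    return &state->scripts->scripts[state->current];
}

/* An empty event directory and a failed listing are now distinguishable. */
enum list_state script_list_state(const struct script_state *state)
{
    if (state->scripts == NULL) {
        return LIST_ERROR;
    }
    return state->scripts->num_scripts == 0 ? LIST_EMPTY : LIST_HAS_SCRIPTS;
}
```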
Rusty Russell [Tue, 8 Dec 2009 02:16:18 +0000 (12:46 +1030)]
eventscript: record script status for all events
This unifies almost everything: the state->current pointer points to
the struct ctdb_script where we record start, finish, status and
output.
We still only marshall up the monitor events; the rest disappear when
the state structure is freed.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
c476c81f3e3d8fc62f2e53d82fce5774044ee9ce)
Rusty Russell [Tue, 8 Dec 2009 02:15:17 +0000 (12:45 +1030)]
eventscript: use scripts array directly, rather than separate list
We rename ctdb_monitor_script_status to ctdb_script, and instead of
allocating them as the scripts are executed, we allocate them up front
and keep a "current" iterator.
This slightly simplifies the code, though it means we only marshall up
to the last successfully run script.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
b2a300768536d10bd867a987ad4cf1c5268c44bc)
Rusty Russell [Tue, 8 Dec 2009 02:14:30 +0000 (12:44 +1030)]
eventscript: ctdb_fork_with_logging()
A new helper function which sets up an event attached to the child's
stdout/stderr which gets routed to the logging callback after being
placed in the normal logs.
This is a generalization of the previous code which was hardcoded to
call ctdb_log_event_script_output.
The only subtlety is that we hang the child fds off the output buffer;
the destructor for that will flush, which means it has to be destroyed
before the output buffer is.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
32cfdc3aec34272612f43a3588e4cabed9c85b68)
Rusty Russell [Mon, 7 Dec 2009 14:01:29 +0000 (00:31 +1030)]
eventscript: pass struct ctdb_log_state directly to ctdb_log_handler().
The current logging logic assumes that any stdout/stderr belongs to
the currently running monitor script output. This isn't quite right
anyway, and we'd like to capture stderr output of other script
invocations.
So we move towards multiple struct ctdb_log_state by handing it
directly to ctdb_log_handler to use, rather than having it assume
ctdb->log. We need a ctdb pointer inside the log struct now though.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
497766cf186442de00fb324343150442457be858)
Rusty Russell [Mon, 7 Dec 2009 13:57:40 +0000 (00:27 +1030)]
eventscript: remove unused ctdb_ctrl_event_script*
The child no longer uses ctdb_ctrl_event_script_init or
ctdb_ctrl_event_script_finished, and the others are redundant: it
doesn't need to tell us it's starting a script when it only runs one.
We move start and stop calls to the parent, and eliminate the RPC
infrastructure altogether.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
391926a87a7af73840f10bb314c0a2f951a0854c)
Rusty Russell [Mon, 7 Dec 2009 13:52:55 +0000 (00:22 +1030)]
eventscript: refactor forking code into fork_child_for_script()
We do the same thing in two places: fire off a child from the initial
ctdb_event_script_callback_v() and also from the ctdb_event_script_handler()
when it's done.
Unify this logic into fork_child_for_script().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
814704a3286756d40c2a6c508c1c0b77fa711891)
Rusty Russell [Mon, 7 Dec 2009 13:51:25 +0000 (00:21 +1030)]
eventscript: fork() a child for each script.
We rename child_run_scripts() to child_run_script(), because it now
runs a single script rather than walking the list. When it's
finished, we fork the next child from the ctdb_event_script_handler()
callback.
ctdb_control_event_script_init() and ctdb_control_event_script_finished()
are now called directly by the parent process; the child still calls
ctdb_ctrl_event_script_start() and ctdb_ctrl_event_script_stop() before
and after the script.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
0fafdcb8d3532a05846abaa5805b2e2f3cee8f47)
Rusty Russell [Mon, 7 Dec 2009 13:45:18 +0000 (00:15 +1030)]
eventscript: store from_user and script_list inside state structure
This means all the state about running the scripts is in that structure,
which helps in the next patch.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
020fd21e0905e7f11400f6537988645987f2bb32)
Rusty Russell [Mon, 7 Dec 2009 13:44:01 +0000 (00:14 +1030)]
eventscript: use direct script state pointer for current monitor
We put a "scripts" member in ctdb_event_script_state, rather than using
a special struct for monitor events. This will fit better as we further
unify the different events, and holds the reports from the child process
running each monitor script.
Rather than making the monitor state a child of current_monitor_status_ctx,
we just point current_monitor directly at it. This means we need to reset
that pointer in the destructor for ctdb_event_script_state.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
9a2b4f6b17e54685f878d75bad27aa5090b4571f)
Rusty Russell [Mon, 7 Dec 2009 13:39:20 +0000 (00:09 +1030)]
eventscript: make current_monitor_status_ctx serve as monitor_event_script_ctx
We have monitor_event_script_ctx and other_event_script_ctx, and
current_monitor_status_ctx in struct ctdb_context. This seems more
complex than it needs to be.
We use a single "event_script_ctx" as parent for all event script
state structures. Then we explicitly reparent monitor events under
current_monitor_status_ctx: this is freed every script invocation to
kill off any running scripts anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
0d925e6f2767691fa561f15bbb857a2aec531143)
Rusty Russell [Mon, 7 Dec 2009 13:25:03 +0000 (23:55 +1030)]
eventscript: split ctdb_run_event_script into multiple parts
Simple refactoring in preparation for switching to one-child-per-script.
We also call the functions run by the child process "child_".
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
bfee777faff75e9bed4aedc1558957483616a6d3)
Rusty Russell [Mon, 7 Dec 2009 13:23:35 +0000 (23:53 +1030)]
eventscript: hoist work out of child process, into parent
This is the start of a move towards finer-grained reporting, with one
child per script. Simple code motion to do sanity check and get the
list of scripts before fork().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
816b9177f51ae5b21b92ff4a404f548fe9723c96)
Rusty Russell [Mon, 7 Dec 2009 13:22:01 +0000 (23:52 +1030)]
eventscript: don't make ourselves healthy if we're under ban_count
If we've timed out, but we've not timed out more than
ctdb->tunable.script_ban_count, we pretend we haven't.
There's a logic bug in the way this is done: if we were unhealthy before,
this would set us to "healthy" again (status == 0). I don't think this
would happen in real life, but it's a little surprising.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
e6488c0e05bab5c4c2c0a6370930b0b27e5ed56e)
Rusty Russell [Mon, 7 Dec 2009 13:18:57 +0000 (23:48 +1030)]
eventscript: handle banning within the callbacks
Currently the timeout handler in eventscript.c does the banning if a
timeout happens. However, because monitor events are different, it has
to special case them.
As we call the callback anyway in this case, we should make that handle
-ETIME as it sees fit: for everyone but the monitor event, we simply ban
ourselves. The more complicated monitor event banning logic is now in
ctdb_monitor.c where it belongs.
Note: I wrapped the other bans in "if (status == -ETIME)", though they
should probably ban themselves on any error. This change should be a
noop.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
9ecee127e19a9e7cae114a66f3514ee7a75276c5)
Rusty Russell [Mon, 7 Dec 2009 12:48:40 +0000 (23:18 +1030)]
eventscript: expose ctdb_ban_self()
eventscript.c uses this now, but our next patch makes others use it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
a305cb7743c24386e464f6b2efab7e2108bb1e7e)
Rusty Russell [Mon, 7 Dec 2009 12:47:23 +0000 (23:17 +1030)]
eventscript: handle v. unlikely timeout race
If we time out just as the child exits, we currently will report an
uninitialized cb_status field. Set it to -ETIME as expected.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
024386931bda9757079f206238ae09bae4de6ea2)
Rusty Russell [Mon, 7 Dec 2009 12:45:56 +0000 (23:15 +1030)]
eventscript: replace other -1 returns with -errno
This completes our "problem with script" reporting; we never set cb_status
to -1 on error. Real errnos are used where the failure is a system call
(eg. read, setpgid), otherwise -EIO is used if we couldn't communicate with
the parent.
The latter case is a bit useless, since the parent probably won't see
the error anyway, but it's neater.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
1269458547795c90d544371332ba1de68df29548)
Rusty Russell [Mon, 7 Dec 2009 12:43:12 +0000 (23:13 +1030)]
eventscript: simplify ctdb_run_event_script loop
If we break, we avoid cut & paste code inside the loop. Need to initialize
ret to 0 for the "no scripts" case.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
ec36ced9446da7e3bf866466d265ee8e18f606c1)
Rusty Russell [Mon, 7 Dec 2009 12:42:19 +0000 (23:12 +1030)]
eventscript: handle and report generic stat/execution errors
Rather than ignoring deleted event scripts (or pretending that they were "OK"),
and discarding other stat errors, we save the errno and turn it into a negative
status.
This gives us a bit more information if we can't execute a script (eg.
too many symlinks or other weird errors).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
5d894e1ae5228df6bbe4fc305ccba19803fa3798)
Rusty Russell [Mon, 7 Dec 2009 12:41:47 +0000 (23:11 +1030)]
eventscript: use -ENOEXEC for disabled status value
This unifies code paths and simplifies things: we just hand -ENOEXEC to
ctdb_ctrl_event_script_stop().
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
eadf5e44ef97d7703a7d3bce0e7ea0f21cb11f14)
Rusty Russell [Mon, 7 Dec 2009 12:39:02 +0000 (23:09 +1030)]
eventscript: enhance script delete race check
We currently assume 127 == script removed. The script can also return 127;
best to re-check the execution status in this case (and for 126, which will
happen if the script is non-executable).
If the script is no longer executable/not present, we ignore it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
0a53d6b5ac81daf0efa32f35e7758ede2a5bdb63)
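The re-check for exit codes 126 and 127 described in the commit above could be sketched like this. It is an assumption-laden illustration: check_executable here returns 0 or a negative errno, and should_ignore_exit is a hypothetical helper, so neither signature should be read as the ctdb one.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/stat.h>

/* Returns 0 if path exists and is executable by its owner,
 * otherwise a negative errno describing why not. */
int check_executable(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        return -errno;
    }
    if (!(st.st_mode & S_IXUSR)) {
        return -ENOEXEC;
    }
    return 0;
}

/* 126 and 127 are what the shell reports for "not executable" and
 * "not found", but a script may also return them itself. Only ignore
 * the result if the script really is gone or no longer executable. */
bool should_ignore_exit(const char *path, int exit_code)
{
    if (exit_code != 126 && exit_code != 127) {
        return false;
    }
    return check_executable(path) != 0;
}
```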
Rusty Russell [Mon, 7 Dec 2009 12:39:39 +0000 (23:09 +1030)]
eventscript: check_executable() to centralize stat/perm checks
This is used later in the "script vanished" check.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
8ddb97040842375daf378cbb5816d0c2b031fa65)
Rusty Russell [Mon, 7 Dec 2009 12:35:58 +0000 (23:05 +1030)]
talloc: save errno over talloc_free
As we start to use errno more, it's a huge pain if talloc_free() can blatt
it (esp. destructors).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
76a0ca77feba14e1e1162c195ffbdf516e62aa4d)
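The idea behind the talloc commit above, preserving errno across a free that may run destructors, is shown here with plain malloc/free. This is not talloc code: free_keep_errno, destructor_t and noisy_destructor are hypothetical names used only to demonstrate the save/restore pattern.

```c
#include <errno.h>
#include <stdlib.h>

typedef void (*destructor_t)(void *ptr);

/* Free ptr, running an optional destructor first, without letting
 * either step change the caller's errno. */
void free_keep_errno(void *ptr, destructor_t destructor)
{
    int saved_errno = errno;

    if (ptr != NULL && destructor != NULL) {
        destructor(ptr);
    }
    free(ptr);
    errno = saved_errno;
}

/* Example destructor that clobbers errno, as a failing close()
 * inside a real destructor might. */
void noisy_destructor(void *ptr)
{
    (void)ptr;
    errno = EBADF;
}
```

With this in place an error path can free its state first and still report the original errno afterwards.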
Rusty Russell [Mon, 7 Dec 2009 12:39:42 +0000 (23:09 +1030)]
eventscript: use -ETIME for timeout status value
This starts the move toward more expressive encoding of return values:
positive values mean the script ran, negative means we had a problem with
the script (and the value is the errno).
This patch converts the timeout case and changes the ctdb tool to recognize it.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
0eb1d0aa14e68b598d9e281c8a02b8f94a042fd9)
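The return-value convention introduced by this series (non-negative: the script ran; negative: a problem with the script, with the value being -errno) can be decoded along these lines. The function name and the label strings are assumptions for illustration, not the output of the ctdb tool; -ETIME and -ENOEXEC are the values named in the commit messages.

```c
#include <errno.h>
#include <string.h>

/* Map a script status to a short human-readable label. */
const char *script_status_str(int status)
{
    if (status == 0) {
        return "OK";            /* script ran and succeeded */
    }
    if (status > 0) {
        return "ERROR";         /* script ran and exited non-zero */
    }
    if (status == -ETIME) {
        return "TIMEDOUT";      /* script did not finish in time */
    }
    if (status == -ENOEXEC) {
        return "DISABLED";      /* script is present but not executable */
    }
    return strerror(-status);   /* any other failure: plain errno text */
}
```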
Rusty Russell [Mon, 7 Dec 2009 12:39:40 +0000 (23:09 +1030)]
eventscript: marshall onto last_status immediately
This simplifies the code a little: last_status is now ready to go
(it's only used by the scriptstatus command at the moment).
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
6be931266a4e41fd0253f760936ad9707dd97c47)
Ronnie Sahlberg [Mon, 7 Dec 2009 08:04:41 +0000 (19:04 +1100)]
version 1.0.108
(This used to be ctdb commit
fff280878e670e93a818c0071f3172056214e8c4)
Ronnie Sahlberg [Mon, 7 Dec 2009 07:27:46 +0000 (18:27 +1100)]
Use wbinfo --ping-dc instead of wbinfo -p since this is a more reliable way to determine if winbindd is in a useful state.
(This used to be ctdb commit
7c95e56ba871a4e0cb893a5cb5d821e7ff6e6dd6)
Michael Adam [Fri, 4 Dec 2009 22:18:12 +0000 (23:18 +0100)]
packaging: package tests/bin/ctdb_transaction under /usr/share/doc/tests/bin
For testing/diagnostic purposes.
Michael
(This used to be ctdb commit
b796d736946856abfbe53de95dfcd73072ee8ccd)
Michael Adam [Thu, 3 Dec 2009 23:19:44 +0000 (00:19 +0100)]
client: improve two error messages in ctdb_transaction_commit().
Michael
(This used to be ctdb commit
d971b2ca84c0451dc7e5acbf4a5ade06270a2044)
Michael Adam [Thu, 3 Dec 2009 23:06:34 +0000 (00:06 +0100)]
server:trans2_commit: move the check for active recovery down.
This needs to be done after the control-dispatcher:
In the TRANS2_COMMIT control, the client->db_id needs
to be set before bailing out, since otherwise the
next TRANS2_COMMIT_RETRY will fail...
Michael
(This used to be ctdb commit
59faf3f923a5989b5ee94ef02a12827412775bae)
Michael Adam [Wed, 2 Dec 2009 23:28:32 +0000 (00:28 +0100)]
client: increase the number of commit retries 10-->100
To cope with timeouts when recoveries and transactions collide.
Maybe 100 is too high.
Michael
(This used to be ctdb commit
c23d804165e84bdf95ba960c953c736d361011d7)
Michael Adam [Wed, 2 Dec 2009 23:27:34 +0000 (00:27 +0100)]
client: untangle checks and produce more detailed error messages
in ctdb_transaction_fetch_start
Michael
(This used to be ctdb commit
428914377851a98b3fc893798783fbfebffc1c0d)
Michael Adam [Wed, 2 Dec 2009 23:26:52 +0000 (00:26 +0100)]
client: increase the rsn of the __transaction_lock__ when storing
So that it is correctly handled by recoveries.
Also explicitly set the dmaster field to the current node's pnn.
Michael
(This used to be ctdb commit
03a5bb727b9db1ba952632f08ceb5355f0df842d)
Michael Adam [Fri, 4 Dec 2009 10:21:29 +0000 (11:21 +0100)]
recovery: add special pull-logic for persistent databases
The decision mechanism which records of a persistent db
are to be pulled into the recdb during recovery is now
as follows:
* Usually a record with the higher rsn than that already
stored is taken. (Just as for normal tdbs.)
* If a transaction is running on some node, then that
node's copies of all records are taken and are not
overwritten later by other nodes' copies.
In order to keep track of whether a record's copy was obtained
from a node with a transaction running, the recovery mechanism
misuses the ctdb tdb header field 'lacount' in the recdb.
It is cleared later when pushing out the recdb database to the
other nodes.
This way, an incomplete transaction is not spoiled when
a recovery interrupts and the replay should usually succeed
(possibly after a few retries).
Michael
(This used to be ctdb commit
8aef46d2aab3efb322dda51eaa202653cefd5222)
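The pull decision for persistent databases described in the commit above can be written as a small predicate. This is a sketch under assumptions: struct rec_copy and should_take_copy are hypothetical, the boolean stands in for the marker that the commit stores in the 'lacount' header field of the recdb, and the case where both copies come from a transaction node is resolved in favour of the stored copy, which the commit message does not spell out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct rec_copy {
    uint64_t rsn;                    /* record sequence number */
    bool     from_transaction_node;  /* copy came from a node with a
                                        transaction running */
};

/* Decide whether an incoming copy should replace the one already in
 * the recovery database. stored == NULL means no copy exists yet. */
bool should_take_copy(const struct rec_copy *stored,
                      const struct rec_copy *incoming)
{
    if (stored == NULL) {
        return true;
    }
    /* A copy from the transaction node is never overwritten. */
    if (stored->from_transaction_node) {
        return false;
    }
    /* A copy from the transaction node always wins over a normal one. */
    if (incoming->from_transaction_node) {
        return true;
    }
    /* Otherwise the usual rule applies: the higher rsn is taken. */
    return incoming->rsn > stored->rsn;
}
```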
Michael Adam [Wed, 2 Dec 2009 23:25:16 +0000 (00:25 +0100)]
make ctdb_ctrl_transaction_active public.
Michael
(This used to be ctdb commit
e5496a83ef4a01604195b27c4b97f50d4979510e)
Michael Adam [Sun, 29 Nov 2009 10:17:18 +0000 (11:17 +0100)]
recovery: for persistent db's don't set the dmaster to the recmaster node number
It is important to keep track of the dmaster (i.e. the node that last committed
a transaction containing changes to this node).
Michael
(This used to be ctdb commit
fe68972eb9cf3aa1f16ba1aacf57ade5d66e647c)
Michael Adam [Sun, 29 Nov 2009 10:14:31 +0000 (11:14 +0100)]
recovery: pass the persistent flag to recover_database()
and further down to pull_remote_database(), pull_one_remote_database(),
and push_recdb_database().
This is in preparation of special handling of persistent databases
during recoveries.
Michael
(This used to be ctdb commit
90abc4ac7c16e854cf6e8f96b60a77bc92e35e07)
Michael Adam [Sun, 29 Nov 2009 10:07:36 +0000 (11:07 +0100)]
tests:ctdb_transaction: print extra counters when a commit fails
Michael
(This used to be ctdb commit
4113385865f53a57b18ea752a7dad8a08bed588e)
Michael Adam [Sun, 29 Nov 2009 09:38:33 +0000 (10:38 +0100)]
client: in catdb, print the keyname first, and separate records by a blank line
Michael
(This used to be ctdb commit
b9882710e12f28c96a0af298e419160f00578241)
Michael Adam [Tue, 1 Dec 2009 22:54:12 +0000 (23:54 +0100)]
packaging: remove the lib/popt from the tarball in debian mode
Debian CTDB packaging fails when this is included.
Michael
(This used to be ctdb commit
574702f8d701fe3e493b31948420b2981eb36f93)
Michael Adam [Tue, 1 Dec 2009 22:51:51 +0000 (23:51 +0100)]
packaging: rework maketarball.sh to accept an arbitrary githash to pack
The githash can be specified through the environment variable "GITHASH"
that can contain a commit hash or a tag name, e.g.
The call syntax is now
[GITHASH=xyz] [USE_GITHASH=yes/no] [DEBIAN_MODE=yes/no] maketarball.sh
Michael
(This used to be ctdb commit
41aa9bdfa2934f564bdc14374362437dfad0045f)
Michael Adam [Sun, 29 Nov 2009 03:05:03 +0000 (04:05 +0100)]
ctdb: add command "ctdb wipedb" to wipe the contents of an attached tdb
Michael
(This used to be ctdb commit
5a7c1e7f15693522bbf1c39a53be2304ece9a134)
Michael Adam [Thu, 29 Oct 2009 21:40:50 +0000 (22:40 +0100)]
tests: turn printfs into DEBUG statements in the ctdb_transaction test
Michael
(This used to be ctdb commit
0e130d79ab71cf3aa65c40af91866823246a0283)
Martin Schwenke [Fri, 4 Dec 2009 03:44:46 +0000 (14:44 +1100)]
Merge branch 'status-test-2'
(This used to be ctdb commit
5fc297a6bd49d9366703eef3edb9bdf0fe8505cc)
Ronnie Sahlberg [Fri, 4 Dec 2009 00:45:37 +0000 (11:45 +1100)]
Don't store debug level DEBUG_DEBUG in the in-memory ringbuffer.
It is unlikely we will need something this verbose for normal troubleshooting.
This allows us to keep a significantly longer time interval of log messages
in the 500k slots available in the ringbuffer.
(This used to be ctdb commit
cc99c05c0c6484ad574039a454e6133852cb41fa)
Ronnie Sahlberg [Fri, 4 Dec 2009 00:36:27 +0000 (11:36 +1100)]
Use statically allocated ringbuffer to store the last 500k log entries
in memory instead of dynamically allocated ones so that we reduce the pressure
on malloc/free.
(This used to be ctdb commit
c5cbb95512f034abeec515579983bf7ac55eadd9)
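The two ringbuffer commits above (static allocation, and not storing DEBUG-level messages) can be combined into one small sketch. The slot count is shrunk from 500k to 8 so the wrap-around is easy to see; the level values, entry layout and function names are illustrative assumptions and not the ctdb definitions.

```c
#include <stddef.h>
#include <stdio.h>

#define RING_SLOTS 8     /* the commits use 500k slots; small for the sketch */
#define MSG_LEN    128   /* assumed per-entry message size */

/* Illustrative levels: higher means more verbose. */
enum log_level { LVL_ERR = 0, LVL_NOTICE = 2, LVL_INFO = 3, LVL_DEBUG = 4 };

struct log_entry {
    int  level;
    char msg[MSG_LEN];
};

/* Statically allocated: logging never touches malloc/free. */
static struct log_entry ring[RING_SLOTS];
static unsigned int ring_next;   /* slot the next message goes into */
static unsigned int ring_count;  /* number of valid entries */

/* Store a message; returns 1 if stored, 0 if it was too verbose. */
int ring_log(int level, const char *msg)
{
    struct log_entry *e;

    if (level >= LVL_DEBUG) {
        return 0;   /* keep the buffer for less verbose messages */
    }
    e = &ring[ring_next];
    e->level = level;
    snprintf(e->msg, sizeof(e->msg), "%s", msg);
    ring_next = (ring_next + 1) % RING_SLOTS;
    if (ring_count < RING_SLOTS) {
        ring_count++;
    }
    return 1;
}

unsigned int ring_size(void)
{
    return ring_count;
}

/* Fetch entry i, where 0 is the oldest entry still in the buffer. */
const struct log_entry *ring_get(unsigned int i)
{
    unsigned int first;

    if (i >= ring_count) {
        return NULL;
    }
    first = (ring_next + RING_SLOTS - ring_count) % RING_SLOTS;
    return &ring[(first + i) % RING_SLOTS];
}
```

Dropping the most verbose level means the same number of slots covers a longer stretch of wall-clock time, which is the motivation given in the commit message.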
Ronnie Sahlberg [Thu, 3 Dec 2009 21:33:56 +0000 (08:33 +1100)]
Document the procedure to remove/change the NATGW configuration at
runtime without restarting the ctdb service
(This used to be ctdb commit
0a0526e03ef995b6b6634f5b75c7a17cb7b5df8f)
Rusty Russell [Wed, 2 Dec 2009 05:45:57 +0000 (16:15 +1030)]
eventscript: reduce code duplication for ending a script, and fix bug
Commit 50c2caed57c0 removed a gratuitous talloc_steal from the code in
ctdb_control_event_script_finished(), but not ctdb_event_script_timeout().
Easiest to call ctdb_control_event_script_finished() at the bottom of the
timeout routine.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
17fa252d0d6981fbae8083a818f26d5ce9c5102e)
Ronnie Sahlberg [Wed, 2 Dec 2009 03:53:21 +0000 (14:53 +1100)]
lower the loglevel for the message that a client has attached to a persistent database
(This used to be ctdb commit
2027cf3881ba890648c543bacbfd5b06464efc10)
Ronnie Sahlberg [Wed, 2 Dec 2009 03:51:57 +0000 (14:51 +1100)]
lower the loglevel for the message that a client has attached through a domain socket
(This used to be ctdb commit
de9e5236b20d70eac5ed29991703d6d25a103963)
Ronnie Sahlberg [Wed, 2 Dec 2009 02:58:27 +0000 (13:58 +1100)]
Add a proper function to process a process-exist control in the daemon.
This control is only used by samba when samba wants to check if a subrecord held by a <node-id>:<smbd-pid> is still valid or if it can be reclaimed.
If the node is banned or stopped, we kill the smbd process and return that the process does not exist to the caller. This allows us to recover subrecords from stopped/banned nodes where smbd is hung waiting for the databases to thaw.
bz58185
(This used to be ctdb commit
157807af72ed4f7314afbc9c19756f9787b92c15)
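The decision described above might look roughly like this. It is a sketch under assumptions, not the daemon's actual handler: `enum node_state`, `client_process_exists` and the boolean return convention are all hypothetical, and the node state is passed in instead of being read from the daemon context.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical node states; the daemon's real flags are not shown here. */
enum node_state { NODE_OK, NODE_BANNED, NODE_STOPPED };

/*
 * Decide what to answer for a process-exist query about `pid`.
 * Returns true if the caller should treat the process as alive.
 */
static bool client_process_exists(enum node_state state, pid_t pid)
{
    if (pid <= 0) {
        /* Never signal process groups or "every process". */
        return false;
    }
    if (state == NODE_BANNED || state == NODE_STOPPED) {
        /* The process may be hung waiting for a thaw: kill it and
         * report it as gone so its subrecords can be reclaimed. */
        kill(pid, SIGKILL);
        return false;
    }
    /* Signal 0 only probes for existence; EPERM means the process
     * exists but belongs to another user. */
    return kill(pid, 0) == 0 || errno == EPERM;
}
```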
Ronnie Sahlberg [Wed, 2 Dec 2009 02:41:04 +0000 (13:41 +1100)]
Add a doubly linked list to the ctdb_context to store a mapping between client pids and client structures.
Add the mapping to the list every time we accept() a new client connection
and set it up to be removed in the destructor when the client structure is freed.
(This used to be ctdb commit
f75d379377f5d4abbff2576ddc5d58d91dc53bf4)
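A minimal sketch of such a pid-to-client mapping, assuming plain malloc/free in place of talloc and an explicit remove call in place of a talloc destructor. The names `daemon_ctx`, `client_pid_entry`, `client_pid_add`, `client_pid_remove` and `client_by_pid` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>

/* Stand-in for the daemon's per-connection client structure. */
struct client {
    int fd;
};

/* One pid -> client mapping entry in a doubly linked list. */
struct client_pid_entry {
    struct client_pid_entry *prev, *next;
    pid_t pid;
    struct client *client;
};

/* Stand-in for the daemon context that owns the list head. */
struct daemon_ctx {
    struct client_pid_entry *client_pids;
};

/* Called when a connection is accepted: link a new entry at the head. */
static struct client_pid_entry *client_pid_add(struct daemon_ctx *ctx,
                                               pid_t pid, struct client *client)
{
    struct client_pid_entry *e = calloc(1, sizeof(*e));

    if (e == NULL) {
        return NULL;
    }
    e->pid = pid;
    e->client = client;
    e->next = ctx->client_pids;
    if (e->next != NULL) {
        e->next->prev = e;
    }
    ctx->client_pids = e;
    return e;
}

/* Called from the client teardown path: unlink the entry and free it. */
static void client_pid_remove(struct daemon_ctx *ctx, struct client_pid_entry *e)
{
    assert(e != NULL);
    if (e->prev != NULL) {
        e->prev->next = e->next;
    } else {
        ctx->client_pids = e->next;
    }
    if (e->next != NULL) {
        e->next->prev = e->prev;
    }
    free(e);
}

/* Look up the client structure registered for a pid, or NULL. */
static struct client *client_by_pid(const struct daemon_ctx *ctx, pid_t pid)
{
    const struct client_pid_entry *e;

    for (e = ctx->client_pids; e != NULL; e = e->next) {
        if (e->pid == pid) {
            return e->client;
        }
    }
    return NULL;
}
```

Because the list is doubly linked, removing an entry from the teardown path needs no walk over the list.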
Ronnie Sahlberg [Wed, 2 Dec 2009 02:17:12 +0000 (13:17 +1100)]
Use the PID we pick up from the domain socket when a client connects
and store this in the client structure.
There is no longer any need to rely on the hack where samba sends special
message handler registrations that encode the pid in the srvid.
This might not work on AIX, since I recall some issues getting the pid
this way on that platform.
(This used to be ctdb commit
b4a7efa7e53e060a91dea0e8e57b116e2aeacebf)
Ronnie Sahlberg [Wed, 2 Dec 2009 00:28:42 +0000 (11:28 +1100)]
version 1.0.107
(This used to be ctdb commit
22f00368b4cb3a6bfb92033a7dbe693d31b41a54)
Rusty Russell [Tue, 1 Dec 2009 22:27:42 +0000 (08:57 +1030)]
ctdb_io: fix use-after-free on invalid packets
Wolfgang saw a talloc complaint about using freed memory in ctdb_tcp_read_cb.
His fix was to remove the talloc_free() in that function, which causes
loops when a socket is closed (as it does not get removed from the event
system), e.g.:
netcat 192.168.1.2 4379 < /dev/null
The real bug is that when we have more than one pending packet in the
queue, we loop calling the callback without any safeguards should that
callback free the queue (as it tends to do on invalid packets). This
can be reproduced by sending more than one bogus packet at once:
# Length word at start: 4 == empty packet (assumed little endian)
/usr/bin/printf \\4\\0\\0\\0\\4\\0\\0\\0 > /tmp/pkt
netcat 192.168.1.2 4379 < /tmp/pkt
Using a destructor we can check if the callback frees us, and exit
immediately. Elsewhere, we return after the callback anyway.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
(This used to be ctdb commit
4d0523dd94fb07e860b3e8118691f93d1ef8d0fa)
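The guard described in this commit can be sketched without talloc as follows. This is an assumption-laden illustration: the real fix uses a talloc destructor, while here `queue_free` plays that role by setting a flag that lives on the stack of the processing loop. All names (`struct queue`, `queue_process`, `drop_on_invalid`) are hypothetical, and packets are reduced to plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

struct queue;
typedef void (*packet_cb)(struct queue *q, int pkt);

struct queue {
    int pending[4];      /* already-received packets waiting for delivery */
    size_t npending;
    packet_cb callback;
    bool *destroyed;     /* points at a flag in queue_process() while it runs */
};

static struct queue *queue_new(packet_cb cb)
{
    struct queue *q = calloc(1, sizeof(*q));

    if (q != NULL) {
        q->callback = cb;
    }
    return q;
}

/* Plays the role of the destructor: tell a running loop that we are gone. */
static void queue_free(struct queue *q)
{
    if (q->destroyed != NULL) {
        *q->destroyed = true;
    }
    free(q);
}

/* Deliver all pending packets; returns how many callbacks were made. */
static size_t queue_process(struct queue *q)
{
    bool destroyed = false;
    size_t delivered = 0;
    size_t i;

    assert(q->callback != NULL);
    q->destroyed = &destroyed;
    for (i = 0; i < q->npending; i++) {
        q->callback(q, q->pending[i]);
        delivered++;
        if (destroyed) {
            /* The callback freed the queue: do not touch q again. */
            return delivered;
        }
    }
    q->destroyed = NULL;
    q->npending = 0;
    return delivered;
}

/* Example callback: drop the whole connection on an invalid packet. */
static void drop_on_invalid(struct queue *q, int pkt)
{
    if (pkt < 0) {
        queue_free(q);
    }
}
```

With two bogus packets queued at once, the loop stops after the first callback instead of dereferencing the freed queue for the second one.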
Ronnie Sahlberg [Wed, 2 Dec 2009 00:26:51 +0000 (11:26 +1100)]
version 1.0.106
(This used to be ctdb commit
b5a21fd39269a6e2a9d1c8182dd42a1773ccbb3f)
Martin Schwenke [Tue, 1 Dec 2009 07:08:57 +0000 (18:08 +1100)]
Eventscripts: Fix syntax error in 00.ctdb.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit
9ea261f791ab919eb1ce5b37073b4f1d30694bb8)
Michael Adam [Thu, 26 Nov 2009 07:35:20 +0000 (08:35 +0100)]
packaging:maketarball.sh: add a DEBIAN_MODE to the tarball creation
It is triggered by setting DEBIAN_MODE=yes in the environment.
This creates a tarball suitable for use in debian packages.
The differences from the standard tarball are these:
* The tar ball file is called ctdb_VERSION.orig.tar.gz
* The base directory in the tar ball is ctdb-VERSION.orig/
Michael
(This used to be ctdb commit
83e7c161efa93cd7acdfc803142b4fb3bfde7538)
Michael Adam [Thu, 26 Nov 2009 07:34:44 +0000 (08:34 +0100)]
configure:maketarball.sh: call autogen.sh and include configure in the tarball
Michael
(This used to be ctdb commit
bc8aee079e09164e06533a1474f5e9d899795933)
Michael Adam [Thu, 26 Nov 2009 07:32:24 +0000 (08:32 +0100)]
packaging:maketarball.sh: create the specfile from the ctdb.spec.in
Michael
(This used to be ctdb commit
bb8d02abd88899d259085b9b23fa52accb222be9)
Martin Schwenke [Tue, 1 Dec 2009 06:54:45 +0000 (17:54 +1100)]
Eventscripts: Remove executable bit accidentally set on some scripts.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit
4c6e68ae942c05224c5f8b683fbc2dc1adced8ee)
Martin Schwenke [Tue, 1 Dec 2009 06:43:47 +0000 (17:43 +1100)]
Eventscript argument cleanups and introduction of ctdb_standard_event_handler.
The functions file no longer causes a side-effect by doing a shift.
It also doesn't set a convenience variable for $1.
All eventscripts now explicitly use "$1" in their case statement, as
does the initscript. The absence of a shift means that the
takeip/releaseip events now explicitly reference $2-$4 rather than
$1-$3.
New function ctdb_standard_event_handler handles the status and
setstatus events, and exits for either of those events. It is called
via a default case in each eventscript, replacing an explicit status
case where applicable.
Signed-off-by: Martin Schwenke <martin@meltin.net>
(This used to be ctdb commit
3d55408cbbb3bb71670b80f3dad5639ea0be5b5b)
Ronnie Sahlberg [Tue, 1 Dec 2009 05:06:59 +0000 (16:06 +1100)]
When we detect an IP allocation mismatch, just force a new IP reassignment
instead of a full-blown recovery
(This used to be ctdb commit
4f50aa8bb8be544058523f2f544109a26c2b3b51)
Ronnie Sahlberg [Tue, 1 Dec 2009 02:19:58 +0000 (13:19 +1100)]
When starting up ctdbd, wait until all initial recoveries have finished
and until we have gone through a full re-recovery timeout without triggering
any pending recoveries before we start up the services and start monitoring
the node.
(This used to be ctdb commit
821333afb458358f90446062b0242790695e5060)
Ronnie Sahlberg [Mon, 30 Nov 2009 23:53:18 +0000 (10:53 +1100)]
Merge commit 'martins/status-test-2'
Conflicts:
server/eventscript.c
(This used to be ctdb commit
e9b3477a5b9a2eff18f727e7d59338bfb5214793)