ctdb-tests: Add synchronisation points in reload IPs tests
author Martin Schwenke <martin@meltin.net>
Wed, 15 Feb 2017 08:33:02 +0000 (19:33 +1100)
committer Martin Schwenke <martins@samba.org>
Fri, 24 Feb 2017 06:47:11 +0000 (07:47 +0100)
commit fdc0dbee29f8cb81dfcb1c995df6468469fd75ce
tree 6a84ca04933cfa117c7d97eab48581c1c3bb3bfb
parent 2d22454f17a691648dc6d26864a896588de944b2

The use of ipreallocate() by "ctdb reloadips" can result in a spurious
takeover run.  This can cause a subsequent "ctdb reloadips" to fail
to disable takeover runs (due to there being one already in progress).

There are various possible improvements but a proper fix probably
requires a protocol change.  That would mean receiving an ACK for a
takeover run request to indicate that the request will be processed
and then a broadcast to indicate a completed takeover run.

There are various other partial fixes (e.g. de-duping queued takeover
run requests against those in the in-progress queue) and workarounds
(e.g. always do a double ipreallocate() in the tool, which should
absorb the spurious takeover run).

However, this is unlikely to be a real-world problem.  Real use cases
should not involve repeatedly reloading the IP configuration.

Instead, work around the problem of flaky tests by manually adding
"ctdb sync" commands to cause extra no-op takeover runs.  These should
not add spurious takeover runs and will create synchronisation points
to help avoid the issue.
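
As a rough illustration of the workaround described above, the sketch
below follows each "ctdb reloadips" with a "ctdb sync".  The helper
name reloadips_with_sync and the CTDB override variable are
hypothetical and are not taken from the modified test scripts, which
may invoke the commands differently.

```shell
# Hypothetical helper, for illustration only.
# CTDB can be overridden (e.g. with a stub) when no cluster is available.
CTDB="${CTDB:-ctdb}"

reloadips_with_sync ()
{
	# Reload the public IP configuration; stop if this fails.
	$CTDB reloadips || return 1

	# Trigger an extra no-op takeover run.  This acts as a
	# synchronisation point intended to help avoid a later
	# "ctdb reloadips" colliding with a takeover run in progress.
	$CTDB sync
}
```

A test that reloads the IP configuration more than once would call
reloadips_with_sync each time instead of calling "ctdb reloadips"
directly.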

Signed-off-by: Martin Schwenke <martin@meltin.net>
Reviewed-by: Amitay Isaacs <amitay@gmail.com>
ctdb/tests/complex/18_ctdb_reloadips.sh
ctdb/tests/simple/18_ctdb_reloadips.sh