ctdb-recoverd: Avoid duplicate recoverd event in parallel recovery
author Amitay Isaacs <amitay@gmail.com>
Wed, 8 Jun 2016 04:15:22 +0000 (14:15 +1000)
committer Amitay Isaacs <amitay@samba.org>
Wed, 8 Jun 2016 08:33:19 +0000 (10:33 +0200)
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11956

In do_recovery(), once recovery and takeover are complete, the
"recovered" event is triggered.  When parallel database recovery was
split out, ctdb_recovery_helper started sending the END_RECOVERY
control, which also triggers the "recovered" event.  So with parallel
database recovery, the "recovered" event is triggered twice.

Instead, move the call to run_recovered_eventscript() into the serial
recovery code path.  This avoids triggering the "recovered" event
twice.

Signed-off-by: Amitay Isaacs <amitay@gmail.com>
Reviewed-by: Martin Schwenke <martin@meltin.net>
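
The control flow described in the commit message can be modelled with a small standalone sketch. This is not CTDB code: every model_* function, the fixed flag, and the counter are hypothetical stand-ins, and the only behaviour assumed is what the message states (the parallel helper's END_RECOVERY control triggers the "recovered" event, and do_recovery used to trigger it again after the takeover run).

```c
#include <stdbool.h>

/* Hypothetical counter: how many times the "recovered" event was triggered. */
static int recovered_events;

/* Stand-in for whatever ends up running the "recovered" event script. */
static void model_trigger_recovered_event(void)
{
	recovered_events++;
}

/* Parallel path: the helper sends END_RECOVERY, which triggers the event. */
static int model_db_recovery_parallel(void)
{
	model_trigger_recovered_event();
	return 0;
}

/* Serial path: after the change it triggers the event itself. */
static int model_db_recovery_serial(bool fixed)
{
	if (fixed) {
		model_trigger_recovered_event();
	}
	return 0;
}

/* Returns the number of "recovered" events seen in one recovery run. */
static int model_do_recovery(bool parallel, bool fixed)
{
	int ret;

	recovered_events = 0;
	ret = parallel ? model_db_recovery_parallel()
		       : model_db_recovery_serial(fixed);
	if (ret != 0) {
		return -1;
	}

	/* The takeover run would happen here. */

	/* Before the change, do_recovery triggered the event unconditionally. */
	if (!fixed) {
		model_trigger_recovered_event();
	}

	return recovered_events;
}
```

In this model, calling model_do_recovery(true, false) counts two events, which is the duplication being fixed. With fixed set to true, both the parallel and the serial path count exactly one.
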
ctdb/server/ctdb_recoverd.c

index b55f8b0d1f632488f7f8f640190eae27ddd796b2..01807b4e2be242f51c9ca9d7289951305d345dac 100644 (file)
@@ -2027,6 +2027,15 @@ static int db_recovery_serial(struct ctdb_recoverd *rec, TALLOC_CTX *mem_ctx,
 
        DEBUG(DEBUG_NOTICE, (__location__ " Recovery - disabled recovery mode\n"));
 
+       /* execute the "recovered" event script on all nodes */
+       ret = run_recovered_eventscript(rec, nodemap, "do_recovery");
+       if (ret!=0) {
+               DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'recovered' event on cluster. Recovery process failed.\n"));
+               return -1;
+       }
+
+       DEBUG(DEBUG_NOTICE, (__location__ " Recovery - finished the recovered event\n"));
+
        return 0;
 }
 
@@ -2192,15 +2201,6 @@ static int do_recovery(struct ctdb_recoverd *rec,
 
        do_takeover_run(rec, nodemap);
 
-       /* execute the "recovered" event script on all nodes */
-       ret = run_recovered_eventscript(rec, nodemap, "do_recovery");
-       if (ret!=0) {
-               DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'recovered' event on cluster. Recovery process failed.\n"));
-               goto fail;
-       }
-
-       DEBUG(DEBUG_NOTICE, (__location__ " Recovery - finished the recovered event\n"));
-
        /* send a message to all clients telling them that the cluster 
           has been reconfigured */
        ret = ctdb_client_send_message(ctdb, CTDB_BROADCAST_CONNECTED,