tevent_threads: Fix a rundown race introduced with 1828011317b
authorVolker Lendecke <vl@samba.org>
Thu, 15 Jun 2017 09:48:24 +0000 (11:48 +0200)
committerStefan Metzmacher <metze@samba.org>
Fri, 16 Jun 2017 17:34:17 +0000 (19:34 +0200)
commit1fe7ec237a7036d76764ef1981de6b3000b2cfd3
treeda1de9f705ea0e0cce178275214bc4450014d19a
parentaafc1c28289933e05d7434c79c0dda2caef0fde8
tevent_threads: Fix a rundown race introduced with 1828011317b

The race is easily reproduced by adding a poll(NULL,0,10) in between the two
pthread_mutex_unlock calls in _tevent_threaded_schedule_immediate.

Before 1828011317b, the main thread was signalled only after the helper
had already unlocked event_ctx_mutex.

Full explaination follows:
-----------------------------------------------------------------
Inside _tevent_threaded_schedule_immediate() we have:

476         ret = pthread_mutex_unlock(&ev->scheduled_mutex);
477         if (ret != 0) {
478                 abort();
479         }

HERE!!!!

481         ret = pthread_mutex_unlock(&tctx->event_ctx_mutex);
482         if (ret != 0) {
483                 abort();
484         }

At the HERE!!! point, what happens is tevent_common_threaded_activate_immediate(),
which is blocked on ev->scheduled_mutex, get released and does:

514         while (ev->scheduled_immediates != NULL) {
515                 struct tevent_immediate *im = ev->scheduled_immediates;
516                 DLIST_REMOVE(ev->scheduled_immediates, im);
517                 DLIST_ADD_END(ev->immediate_events, im);
518         }

- making an immediate event ready to be scheduled.

This then returns into epoll_event_loop_once(), which then calls:

910         if (ev->immediate_events &&
911             tevent_common_loop_immediate(ev)) {
912                 return 0;
913         }

which causes the immediate event to fire. This immediate
event is the pthread job terminate event, which was previously
set up in pthreadpool_tevent_job_signal() by:

198         if (state->tctx != NULL) {
199                 /* with HAVE_PTHREAD */
200                 tevent_threaded_schedule_immediate(state->tctx, state->im,
201                                                    pthreadpool_tevent_job_done,
202                                                    state);

So we now call pthreadpool_tevent_job_done() - which does:

225         TALLOC_FREE(state->tctx);

calling tevent_threaded_context_destructor():

384         ret = pthread_mutex_destroy(&tctx->event_ctx_mutex); <---------------- BOOM returns an error !
385         if (ret != 0) {
386                 abort();
387         }

as we haven't gotten to line 481 above (the line after
HERE!!!!) so the tctx->event_ctx_mutex is still
locked when we try to destroy it.

So doing an additional:

        ret = pthread_mutex_lock(&tctx->event_ctx_mutex);
        ret = pthread_mutex_unlock(&tctx->event_ctx_mutex);

(error checking elided) forces tevent_threaded_context_destructor()
to wait until tctx->event_ctx_mutex is unlocked before it locks/unlocks
and then is guaranteed safe to destroy.

Signed-off-by: Volker Lendecke <vl@samba.org>
Reviewed-by: Jeremy Allison <jra@samba.org>
Reviewed-by: Stefan Metzmacher <metze@samba.org>
lib/tevent/tevent_threads.c