replace: add checks for atomic_thread_fence(memory_order_seq_cst) and add possible fallbacks
This implements a full memory barrier.
On ubuntu amd64 with results in an 'mfence' instruction.
This is required to syncronization between threads, where
there's typically only one write of a memory that should be
synced between all threads with the barrier.
Much more details can be found here:
https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/_005f_005fatomic-Builtins.html#g_t_005f_005fatomic-Builtins
https://gcc.gnu.org/onlinedocs/gcc-7.3.0/gcc/_005f_005fsync-Builtins.html#g_t_005f_005fsync-Builtins
The main one we use seems to be in C11 via stdatomic.h,
the oldest fallback is __sync_synchronize(), which is available
since 2005 in gcc.
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Reviewed-by: Ralph Boehme <slow@samba.org>