Changes in CTDB 2.3
===================

User-visible changes
--------------------

* 2 new configuration variables for 60.nfs eventscript:

   - CTDB_MONITOR_NFS_THREAD_COUNT
   - CTDB_NFS_DUMP_STUCK_THREADS

  See ctdb.sysconfig for details.

* Removed DeadlockTimeout tunable. To enable debugging of locking
  issues set

   CTDB_DEBUG_LOCKS=/etc/ctdb/debug_locks.sh

* In overall statistics and database statistics, lock buckets have
  been updated to use the following timings:

   < 1ms, < 10ms, < 100ms, < 1s, < 2s, < 4s, < 8s, < 16s, < 32s, < 64s, >= 64s

* Initscript is now simplified with most CTDB-specific functionality
  split out to ctdbd_wrapper, which is used to start and stop ctdbd.

* Add systemd support.

* CTDB subprocesses are now given informative names to allow them to
  be easily distinguished when using programs like "top" or "perf".

Important bug fixes
-------------------

* ctdb tool should not exit from a retry loop if a control times out
  (e.g. under high load). This simple fix will stop an exit from the
  retry loop on any error.

* When updating flags on all nodes, use the correct updated flags.
  This should avoid wrong flag change messages in the logs.

* The recovery daemon will not ban other nodes if the current node is
  banned.

* ctdb dbstatistics command now correctly outputs database statistics.

* Fixed a panic with overlapping shutdowns (regression in 2.2).

* Fixed 60.ganesha "monitor" event (regression in 2.2).

* Fixed a buffer overflow in the "reloadips" implementation.

* Fixed segmentation faults in ping_pong (called with incorrect
  argument) and test binaries (called when ctdbd not running).

Important internal changes
--------------------------

* The recovery daemon on a stopped or banned node will stop
  participating in any cluster activity.

* Improve cluster-wide database traverse by sending the records
  directly from the traverse child process to the requesting node.

* TDB checking and dropping of all IPs moved from initscript to "init"
  event in 00.ctdb.

* To avoid "rogue IPs" the release IP callback now fails if the
  released IP is still present on an interface.


Changes in CTDB 2.2
===================

User-visible changes
--------------------

* The "stopped" event has been removed.

  The "ipreallocated" event is now run when a node is stopped. Use
  this instead of "stopped".

* New --pidfile option for ctdbd, used by initscript.

* The 60.nfs eventscript now uses configuration files in
  /etc/ctdb/nfs-rpc-checks.d/ for timeouts and actions instead of
  hardcoding them into the script.

* Notification handler scripts can now be dropped into
  /etc/ctdb/notify.d/.

* The NoIPTakeoverOnDisabled tunable has been renamed to
  NoIPHostOnAllDisabled and now works properly when set on individual
  nodes.

* New ctdb subcommand "runstate" prints the current internal runstate.
  Runstates are used for serialising startup.

Important bug fixes
-------------------

* The Unix domain socket is now set to non-blocking after the
  connection succeeds. This avoids connections failing with EAGAIN
  and not being retried.

* Fetching from the log ringbuffer now succeeds if the buffer is full.

* Fix a severe recovery bug that can lead to data corruption for SMB
  clients.

* The statd-callout script now runs as root via sudo.

* "ctdb delip" no longer fails if it is unable to move the IP.

* A race in the ctdb tool's ipreallocate code was fixed. This fixes
  potential bugs in the "disable", "enable", "stop", "continue",
  "ban", "unban", "ipreallocate" and "sync" commands.

* The monitor cancellation code could sometimes hang indefinitely.
  This could cause "ctdb stop" and "ctdb shutdown" to fail.

Important internal changes
--------------------------

* The socket I/O handling has been optimised to improve performance.

* IPs will not be assigned to nodes during CTDB initialisation. They
  will only be assigned to nodes that are in the "running" runstate.

* Improved database locking code.
  One improvement is to use a standalone locking helper executable -
  this avoids creating many forked copies of ctdbd and potentially
  running a node out of memory.

* New control CTDB_CONTROL_IPREALLOCATED is now used to generate
  "ipreallocated" events.

* Message handlers are now indexed, providing a significant
  performance improvement.