tdb: workaround starvation problem in locking entire database.
author Rusty Russell <rusty@rustcorp.com.au>
Fri, 13 Aug 2010 16:43:26 +0000 (02:13 +0930)
committer Rusty Russell <rusty@rustcorp.com.au>
Fri, 13 Aug 2010 17:01:22 +0000 (02:31 +0930)
commit 11ab43084b10cf53b530cdc3a6036c898b79ca38
tree 04549a9c16f5e68349aded2f8ead56571df01312
parent f00b61c7d4611802c66495824c97af6cad69704e

We saw tdb_lockall() take 71 seconds under heavy load; this is because Linux
(at least) doesn't prevent new small locks being obtained while we're waiting
for a big lock.

The workaround is to do divide and conquer using non-blocking chainlocks: if
we get down to a single chain we block.  Using a simple test program where
children did "hold lock for 100ms, sleep for 1 second" the time to do
tdb_lockall() dropped significantly.  There are on the order of
log(hashsize) locks taken in the contended case, but that's slow anyway.
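
The divide-and-conquer idea above can be sketched as follows.  This is an
illustration only, not the code in lib/tdb/common/lock.c: range_lock() and
lock_gradual() are hypothetical names, the byte range stands in for the hash
chains, and plain POSIX fcntl() record locks are assumed.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: apply `type` (F_RDLCK, F_WRLCK or F_UNLCK) to
 * `len` bytes starting at `off`.  If `wait` is non-zero, block until
 * the lock is granted; otherwise fail at once when it is contended. */
static int range_lock(int fd, short type, off_t off, off_t len, int wait)
{
	struct flock fl = {0};
	int ret;

	fl.l_type = type;
	fl.l_whence = SEEK_SET;
	fl.l_start = off;
	fl.l_len = len;

	do {
		ret = fcntl(fd, wait ? F_SETLKW : F_SETLK, &fl);
	} while (ret == -1 && errno == EINTR);
	return ret;
}

/* Sketch of gradual locking: try the whole range without blocking; if
 * it is contended, split it in half and lock each half the same way.
 * Only a single-byte range (one "chain") is allowed to block. */
static int lock_gradual(int fd, short type, off_t off, off_t len)
{
	off_t half;

	assert(len >= 1);	/* l_len == 0 would mean "to end of file" */

	if (len == 1)
		return range_lock(fd, type, off, len, 1);

	if (range_lock(fd, type, off, len, 0) == 0)
		return 0;

	/* Anything other than "somebody else holds it" is a real error. */
	if (errno != EAGAIN && errno != EACCES)
		return -1;

	half = len / 2;
	if (lock_gradual(fd, type, off, half) == -1)
		return -1;
	if (lock_gradual(fd, type, off + half, len - half) == -1) {
		/* Don't leave the first half locked on failure. */
		range_lock(fd, F_UNLCK, off, half, 0);
		return -1;
	}
	return 0;
}
```

A caller would use lock_gradual(fd, F_WRLCK, start, nchains) where it would
otherwise issue one blocking lock over the whole range.  In the uncontended
case this still costs a single fcntl() call; the extra calls are only made
when the big lock would have had to wait anyway.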

More analysis is given in my blog at http://rusty.ozlabs.org/?p=120

This may also help transactions, though in that case it's the initial
read lock which uses this gradual locking routine; the update-to-write-lock
code is separate and still tries to update in one go.

Even though the ABI doesn't change, the minor version is bumped so the
behavior change can be easily detected.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
lib/tdb/ABI/tdb-1.2.3.sigs [new file with mode: 0644]
lib/tdb/common/lock.c
lib/tdb/configure.ac
lib/tdb/wscript