Eventscripts: clean up 60.nfs monitor event.
This adds a helper function called nfs_check_rpc_service() and uses it
to make the monitor event much more readable. An example of usage is
as follows:
nfs_check_rpc_service "mountd" \
-ge 10 "verbose restart:b unhealthy" \
-eq 5 "restart:b"
The first argument to nfs_check_rpc_service() is the name of the RPC
service to be checked. The RPC service corresponding to this command
is checked for availability using the rpcinfo command. If the service
is available then the function succeeds and subsequent arguments are
ignored.
If the rpcinfo check fails then a failure counter for that particular
RPC service is incremented and subsequent arguments are processed in
groups of 3:
1. An integer comparison operator supported by test.
2. An integer failure limit.
3. An action string.
The value of the failure counter is checked using (1) and (2) above.
The first check that succeeds has its action string processed - note
that this explains the somewhat curious reverse ordering of checks.
It the example above:
* If the counter is >= 10 then a verbose message is printed
describing the failure, the service is restarted in the background
and the node is marked as unhealthy (via an "exit 1" from the
function).
* If the counter is == 5 then the service us restarted in the
background.
For more action options please see the code.
This also changes the ctdb_check_rpc() function so that it no longer
takes a program number to check. It now just takes a real RPC program
name that rpcinfo can resolve via /etc/rpc.
Signed-off-by: Martin Schwenke <martin@meltin.net>