This directory is where you should put any local or application-specific
event scripts for ctdb to call.

All event scripts start with the prefix 'NN.' where N is a digit.
The event scripts are run in sequence based on NN.
Thus 10.interfaces will be run before 60.nfs.

Each NN must be unique and duplicates will cause undefined behaviour.
I.e. having both 10.interfaces and 10.otherstuff is not allowed.

As a special case, any event script that ends with a '~' character will be
ignored, since this is a common suffix that some editors append to older
versions of a file.

The event scripts are called with a varying number of arguments.
The first argument is the "event" and the rest of the arguments depend
on which event was triggered.
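
As a rough illustration of the calling convention, a minimal event
script could dispatch on its first argument like this (a sketch only;
the script name and the commands behind each branch are placeholders):

    #!/bin/sh
    # hypothetical event script, e.g. 99.myservice

    event="$1"
    shift

    case "$event" in
        startup)
            # start the managed service here
            ;;
        shutdown)
            # stop the managed service here
            ;;
        monitor)
            # exit non-zero if the service is unhealthy
            ;;
        *)
            # ignore events this script does not care about
            ;;
    esac

    exit 0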

All of the events except the 'shutdown' and 'startrecovery' events will be
called with the ctdb daemon in NORMAL mode (i.e. not in recovery).

The events currently implemented are:

init
    This event does not take any additional arguments.
    This event is only invoked once, when ctdb is starting up.
    This event is used to do some cleanup work from earlier runs
    and to prepare the basic setup.

    Example: 00.ctdb cleans up $CTDB_BASE/state
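
    A minimal 'init' branch in the spirit of 00.ctdb might look like
    this (the state path follows the example above):

        case "$1" in
            init)
                # throw away state from earlier runs and recreate the
                # directory for this one
                rm -rf "$CTDB_BASE/state"
                mkdir -p "$CTDB_BASE/state"
                ;;
        esac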

startup
    This event does not take any additional arguments.
    This event is only invoked once, when ctdb has finished the
    initial recoveries. This event is used to wait for the service
    to start and for all resources of the service to become
    available.

    This is used to prevent ctdb from starting up and advertising its
    services until all dependent services have become available.

    All services that are managed by ctdb should implement this
    event and use it to start the service.

    Example: 50.samba uses this event to start the samba daemon
    and then wait until samba and all its associated services have
    become available. It then also proceeds to wait until all
    shares have become available.
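
    A sketch of a 'startup' branch in the spirit of 50.samba; the
    start command and the readiness check are placeholders, not what
    50.samba actually runs:

        case "$1" in
            startup)
                # start the daemon (placeholder command)
                service smb start

                # block until the service answers, so ctdb does not
                # advertise it before it is usable
                while ! smbclient -L localhost -N >/dev/null 2>&1 ; do
                    sleep 1
                done
                ;;
        esac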

shutdown
    This event is called when the ctdb service is shutting down.

    All services that are managed by ctdb should implement this event
    and use it to perform a controlled shutdown of the service.

    Example: 60.nfs uses this event to shut down nfs and all associated
    services and stop exporting any shares when this event is invoked.
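
    The corresponding 'shutdown' branch is usually just a controlled
    stop (placeholder command):

        case "$1" in
            shutdown)
                # stop the managed service and all of its helpers
                service nfs stop
                ;;
        esac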

monitor
    This event is invoked at a regular interval.
    The interval can be configured using the MonitorInterval tunable
    but defaults to 15 seconds.

    This event is triggered by ctdb to continuously monitor that all
    managed services are healthy.
    When invoked, the event script will check that the service is
    healthy and return 0 if so. If the service is not healthy the
    event script should return non-zero.

    If a service returns non-zero from this script, ctdb will consider
    the node status as UNHEALTHY and will cause the public address and
    all associated services to be failed over to a different node in
    the cluster.

    All managed services should implement this event.

    Example: 10.interfaces checks that the public interface (if used)
    is healthy, i.e. that it has a physical link established.
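
    A sketch of a 'monitor' branch, using an NFS RPC ping as an
    example health check (the check itself is a placeholder; use
    whatever proves your own service healthy):

        case "$1" in
            monitor)
                # exit 0 when healthy; any non-zero exit makes ctdb
                # flag this node UNHEALTHY and fail its public
                # addresses over to another node
                if rpcinfo -u localhost nfs >/dev/null 2>&1 ; then
                    exit 0
                else
                    exit 1
                fi
                ;;
        esac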

takeip
    This event is triggered every time the node takes over a public ip
    address during recovery.
    This event takes three additional arguments:
    'interface', 'ipaddress' and 'netmask'.

    Before this event there will always be a 'startrecovery' event.

    This event will always be followed by a 'recovered' event once
    all ip addresses have been reassigned to new nodes and the ctdb
    database has been recovered.
    If multiple ip addresses are reassigned during recovery it is
    possible to get several 'takeip' events followed by a single
    'recovered' event.

    Since taking over an ip address might involve substantial work for
    the service, and since multiple ip addresses might be taken over
    in a single recovery, it is often best to only mark which addresses
    are being taken over in this event and defer the actual work to
    reconfigure or restart the services until the 'recovered' event.

    Example: 60.nfs just records which ip addresses are being taken
    over into a local state directory and defers the actual restart
    of the services until the 'recovered' event.
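
    A sketch of that deferral pattern (the state file name and path
    are illustrative, not what 60.nfs actually uses):

        case "$1" in
            takeip)
                iface="$2"
                ip="$3"
                netmask="$4"
                # only record the change here; the 'recovered' branch
                # does the actual service restart
                mkdir -p "$CTDB_BASE/state/myservice"
                echo "$ip" >>"$CTDB_BASE/state/myservice/restart"
                ;;
        esac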

releaseip
    This event is triggered every time the node releases a public ip
    address during recovery.
    This event takes three additional arguments:
    'interface', 'ipaddress' and 'netmask'.

    In all other regards this event is analogous to the 'takeip' event
    above.
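
    The matching 'releaseip' branch can mirror the 'takeip' sketch
    above, recording the released address and deferring the real work:

        case "$1" in
            releaseip)
                iface="$2"
                ip="$3"
                netmask="$4"
                mkdir -p "$CTDB_BASE/state/myservice"
                echo "$ip" >>"$CTDB_BASE/state/myservice/restart"
                ;;
        esac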

updateip
    This event is triggered every time the node moves a public ip
    address between interfaces.
    This event takes four additional arguments:
    'old-interface', 'new-interface', 'ipaddress' and 'netmask'.

    Example: 10.interfaces
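
    A sketch of an 'updateip' branch; note the extra interface
    argument (the body is a placeholder for whatever rebinding the
    service needs):

        case "$1" in
            updateip)
                old_iface="$2"
                new_iface="$3"
                ip="$4"
                netmask="$5"
                # the address stayed on this node but moved from
                # $old_iface to $new_iface; rebind here if needed
                ;;
        esac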

startrecovery
    This event is triggered every time we start a recovery process
    or before we start changing ip address allocations.
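
    A 'startrecovery' branch can be used to reset per-recovery state,
    so that the 'takeip'/'releaseip' events that follow start from a
    clean slate (a hypothetical pattern, reusing the state file from
    the sketches above):

        case "$1" in
            startrecovery)
                rm -f "$CTDB_BASE/state/myservice/restart"
                ;;
        esac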

recovered
    This event is triggered every time we have finished a full recovery
    and also after we have finished reallocating the public ip addresses
    across the cluster.

    Example: 60.nfs, which, if the ip address configuration has changed
    during the recovery (i.e. if addresses have been taken over or
    released), will kill off any tcp connections that exist for that
    service and also send out statd notifications to all registered
    clients.
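
    Continuing the deferral pattern, a 'recovered' branch can check
    whether any 'takeip'/'releaseip' event recorded a change and then
    do the expensive work once (placeholder restart command):

        case "$1" in
            recovered)
                if [ -f "$CTDB_BASE/state/myservice/restart" ] ; then
                    rm -f "$CTDB_BASE/state/myservice/restart"
                    service myservice restart
                fi
                ;;
        esac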

stopped
    This event is called when a node is STOPPED and can be used to
    perform additional cleanup that is required.
    Note that a stopped node is considered inactive, so it will not
    be issuing the recovered event once the cluster has recovered.
    See 91.lvs for a use of this event.
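
    In the spirit of 91.lvs, a 'stopped' branch might withdraw the
    node from duties that only active nodes should perform; the
    ipvsadm call below is illustrative only:

        case "$1" in
            stopped)
                # this node is administratively stopped; clear the
                # LVS configuration it was serving
                ipvsadm -C
                ;;
        esac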

Additional note for takeip, releaseip and recovered:

ALL services that depend on the ip address configuration of the node must
implement all three of these events.

ALL services that use TCP should also implement these events and at least
kill off any tcp connections to the service if the ip address configuration
has changed, in a similar fashion to how 60.nfs does it.

The reason one must do this is that ESTABLISHED tcp connections may survive
when an ip address is released and removed from the host, until the ip
address is taken over by another node.
Any tcp connections that survive a release/takeip sequence can potentially
cause the client/server tcp connection to get out of sync with sequence and
ack numbers and cause a disruptive ack storm.
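
A rough sketch of the connection-killing idea, assuming the
'ctdb killtcp' command that some ctdb versions provide (60.nfs itself
relies on helper functions shipped with ctdb; the helper name, the
port, 2049 for nfs, and the netstat parsing are all illustrative):

    # hypothetical helper: reset every ESTABLISHED connection to a
    # released address on the nfs port (2049)
    kill_tcp_to_ip () {
        ip="$1"
        netstat -tn | awk -v ip="$ip" \
            '$6 == "ESTABLISHED" && $4 == (ip ":2049") { print $5, $4 }' |
        while read src dst ; do
            ctdb killtcp "$src" "$dst"
        done
    }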