metze/ctdb/wip.git
16 years agoafter we have checked dest address that it is a public address
Ronnie Sahlberg [Mon, 30 Jul 2007 06:10:14 +0000 (16:10 +1000)]
after we have checked dest address that it is a public address
update addr to the source address so the printout in the log matches
the client that attached to samba

16 years agoadd a small tool to compare rb tree with a timeval_compare()+add an
Ronnie Sahlberg [Mon, 30 Jul 2007 00:50:35 +0000 (10:50 +1000)]
add a small tool to compare rb tree with a timeval_compare()+add an
entry to the end of the list DLIST (worst case insert)

16 years agofix the remaining bugs with tree delete that testing found.
Ronnie Sahlberg [Sun, 29 Jul 2007 23:09:34 +0000 (09:09 +1000)]
fix the remaining bugs with tree delete that testing found.

the binary tree should work reasonably well now for delete.
insert always worked fine.

16 years agoremove dead code
Ronnie Sahlberg [Wed, 25 Jul 2007 21:22:36 +0000 (07:22 +1000)]
remove dead code

16 years agofix some remaining bugs with deleting nodes
Ronnie Sahlberg [Wed, 25 Jul 2007 21:21:32 +0000 (07:21 +1000)]
fix some remaining bugs with deleting nodes

16 years agothere were situations where we were not guaranteed that a sibling had 2
Ronnie Sahlberg [Wed, 25 Jul 2007 07:53:55 +0000 (17:53 +1000)]
there were situations where we were not guaranteed that a sibling had 2
child nodes which would cause a segv when trying to dereference those
two child nodes in order to read their color

16 years agoif sibling is NULL it is a leaf node and thus black.
Ronnie Sahlberg [Wed, 25 Jul 2007 07:22:04 +0000 (17:22 +1000)]
if sibling is NULL it is a leaf node and thus black.
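
A minimal sketch of the idea behind the two delete fixes above, assuming a conventional node layout. The names `rb_node`, `rb_color_of` and `rb_sibling_children_black` are hypothetical and are not taken from the ctdb sources. Routing every colour read through a NULL-safe helper means a missing sibling or a missing child is treated as a black leaf instead of being dereferenced.

```c
#include <stddef.h>

/* Hypothetical node layout for illustration; not the ctdb definition. */
enum rb_color { RB_BLACK, RB_RED };

struct rb_node {
    struct rb_node *parent;
    struct rb_node *left;
    struct rb_node *right;
    enum rb_color color;
};

/* A NULL pointer stands for a leaf, and leaves are always black. */
static enum rb_color rb_color_of(const struct rb_node *n)
{
    return n == NULL ? RB_BLACK : n->color;
}

/*
 * Returns 1 only when the sibling exists and both of its children are
 * black. Missing children are read through rb_color_of(), so they count
 * as black and are never dereferenced.
 */
static int rb_sibling_children_black(const struct rb_node *sibling)
{
    if (sibling == NULL) {
        return 0;
    }
    return rb_color_of(sibling->left) == RB_BLACK &&
           rb_color_of(sibling->right) == RB_BLACK;
}
```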

16 years agono need to have a separate assignment of the tcparray pointer followed
Ronnie Sahlberg [Tue, 24 Jul 2007 22:03:58 +0000 (08:03 +1000)]
no need to have a separate assignment of the tcparray pointer followed
by a talloc_steal()
use the returned pointer in talloc_steal as the value to assign

16 years agoinitial version of talloc based red-black trees
Ronnie Sahlberg [Tue, 24 Jul 2007 08:51:13 +0000 (18:51 +1000)]
initial version of talloc based red-black trees
very initial version

16 years agowhen we build the arp structure for sending gratuitous arp (and tcp
Ronnie Sahlberg [Mon, 23 Jul 2007 21:46:51 +0000 (07:46 +1000)]
when we build the arp structure for sending gratuitous arp (and tcp
tickles) just talloc_steal the entire tcp_array into the arp
structure instead of copying each of the entries into a linked list
and then releasing the tcparray.

16 years agoset the tcp tickle update flag to true once we have done a takeover and
Ronnie Sahlberg [Fri, 20 Jul 2007 09:11:45 +0000 (19:11 +1000)]
set the tcp tickle update flag to true once we have done a takeover and
tickled all connections
otherwise the other nodes will still remember this list until next time
we have had a connection/client closing.

16 years agowhen a client connects with TCP_CLIENT we should look at the
Ronnie Sahlberg [Fri, 20 Jul 2007 07:04:08 +0000 (17:04 +1000)]
when a client connects with TCP_CLIENT, we should look at the
destination address to find the public address, not the source address

16 years agoupdated ctdb tickle management
Ronnie Sahlberg [Fri, 20 Jul 2007 05:05:55 +0000 (15:05 +1000)]
updated ctdb tickle management

there is an array for each node/public address that contains tcp tickles

we send a TCP_ADD as a broadcast to all nodes when a client is added

if tcp tickles are removed, they are only removed immediately from the
local node.
once every 20 seconds a node will push/broadcast out the tickle list for
all public addresses it manages. this will remove any deleted tickles
from the remote nodes
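
The scheme described above can be sketched roughly as follows. This is an illustration only: the fixed-size `tickle_array`, the helper names and the field layout are assumptions, not the ctdb data structures, and the network broadcast is modelled as a plain struct copy. An add is applied on every node, a remove is applied only on the owning node, and the periodic push overwrites the remote copy so stale entries disappear.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_TICKLES 16  /* illustrative fixed capacity */

/* One client connection to tickle (hypothetical layout). */
struct tickle {
    uint32_t client_ip;
    uint16_t client_port;
};

/* One array of tickles per node/public address. */
struct tickle_array {
    struct tickle conn[MAX_TICKLES];
    size_t num;
};

/* Applied on every node when a TCP_ADD is broadcast. */
static int tickle_add(struct tickle_array *a, struct tickle t)
{
    if (a->num == MAX_TICKLES) {
        return -1;
    }
    a->conn[a->num++] = t;
    return 0;
}

/* Applied only on the local node when a client goes away. */
static void tickle_remove(struct tickle_array *a, struct tickle t)
{
    size_t i;

    for (i = 0; i < a->num; i++) {
        if (a->conn[i].client_ip == t.client_ip &&
            a->conn[i].client_port == t.client_port) {
            /* Fill the hole with the last entry; order is not kept. */
            a->conn[i] = a->conn[a->num - 1];
            a->num--;
            return;
        }
    }
}

/*
 * Periodic push: the remote copy is replaced wholesale by the owner's
 * list, which drops any tickles the owner has already removed.
 */
static void tickle_push(const struct tickle_array *owner,
                        struct tickle_array *remote)
{
    *remote = *owner;
}
```

Replacing the whole list on each push keeps removal cheap: no per-connection delete message has to be sent, at the cost of remote nodes holding stale entries until the next push.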

16 years agochange the tickle list from one global list into an array per public
Ronnie Sahlberg [Fri, 20 Jul 2007 00:06:41 +0000 (10:06 +1000)]
change the tickle list from one global list into an array per public
ip/node

once we have started sending all tickles for a specific ip, delete the
entire array so that the tickles dont remain forever in the ctdb
server

add a control to send the full list of every tickle that is registered
for a particular public ip/node

16 years agomerge from tridge
Ronnie Sahlberg [Thu, 19 Jul 2007 05:07:27 +0000 (15:07 +1000)]
merge from tridge

16 years ago- log registering of tcp clients
Andrew Tridgell [Thu, 19 Jul 2007 05:04:54 +0000 (15:04 +1000)]
- log registering of tcp clients
- don't remove a tcp entry if we do not own the ip

16 years agomake sure we still run events when waiting for ctdb_event_script()
Andrew Tridgell [Thu, 19 Jul 2007 03:36:00 +0000 (13:36 +1000)]
make sure we still run events when waiting for ctdb_event_script()

16 years agomerge from tridge
Ronnie Sahlberg [Wed, 18 Jul 2007 21:29:53 +0000 (07:29 +1000)]
merge from tridge

16 years agomerged from ronnie
Andrew Tridgell [Wed, 18 Jul 2007 10:13:57 +0000 (20:13 +1000)]
merged from ronnie

16 years agoadd a check if start_node is beyond the end of the nodemap and reset it
Ronnie Sahlberg [Sun, 15 Jul 2007 22:36:09 +0000 (08:36 +1000)]
add a check if start_node is beyond the end of the nodemap and reset it
back to 0 if it is to prevent an infinite loop.

this could happen if in the future we add a mechanism to add/remove
nodes to a cluster at runtime

16 years agochange the way we pick/find a new node to takeover for a failed node
Ronnie Sahlberg [Sun, 15 Jul 2007 22:28:44 +0000 (08:28 +1000)]
change the way we pick/find a new node to takeover for a failed node
to keep a static that controls at which node to start searching the
list for takeover candidates next time we need to find a node.

each time we find a node to takeover, reset the start variable to point
to the next node in the list

this makes the distribution of takeover nodes much more even
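
A rough sketch of the selection scheme described above, together with the bounds check from the later commit. The `struct node`, the `NODE_UNAVAILABLE` flag and the function name are hypothetical stand-ins, not the ctdb definitions. The static start index advances past each chosen node so that successive takeovers rotate through the cluster, and it is reset to 0 if the nodemap has become shorter than the stored index.

```c
#include <stddef.h>

#define NODE_UNAVAILABLE 0x1u  /* hypothetical flag, for illustration */

struct node {
    unsigned flags;
};

/*
 * Returns the index of the node that should take over, or -1 if no
 * node is available. The search starts where the previous one ended.
 */
static int pick_takeover_node(const struct node *nodes, size_t num_nodes)
{
    static size_t start_node = 0;
    size_t i;

    if (num_nodes == 0) {
        return -1;
    }

    /* The nodemap may have shrunk since the last call. */
    if (start_node >= num_nodes) {
        start_node = 0;
    }

    for (i = 0; i < num_nodes; i++) {
        size_t candidate = (start_node + i) % num_nodes;

        if (!(nodes[candidate].flags & NODE_UNAVAILABLE)) {
            /* Next search begins at the node after this one. */
            start_node = (candidate + 1) % num_nodes;
            return (int)candidate;
        }
    }
    return -1;
}
```

Bounding the loop by `num_nodes` iterations, rather than looping until a wrap-around is detected, is one way to rule out the infinite loop mentioned above even when the start index is stale.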

16 years agowe dont do nfstickles unless ctdb manages nfs
Ronnie Sahlberg [Sun, 15 Jul 2007 01:43:11 +0000 (11:43 +1000)]
we dont do nfstickles unless ctdb manages nfs

16 years agofix bug introduced in previous commit
Ronnie Sahlberg [Sun, 15 Jul 2007 01:37:22 +0000 (11:37 +1000)]
fix bug introduced in previous commit

16 years agothere is no point in doing anything in 10.interfaces unless we have a
Ronnie Sahlberg [Sun, 15 Jul 2007 01:28:53 +0000 (11:28 +1000)]
there is no point in doing anything in 10.interfaces unless we have a
public interface

16 years agotry netstat as a last attempt to check a tcp port in
Ronnie Sahlberg [Sat, 14 Jul 2007 23:29:08 +0000 (09:29 +1000)]
try netstat as a last attempt to check a tcp port in
ctdb_check_tcp_ports() as well

16 years agoif we dont have nc or netcat, try using netstat as a final attempt to
Ronnie Sahlberg [Sat, 14 Jul 2007 23:26:54 +0000 (09:26 +1000)]
if we dont have nc or netcat,  try using netstat as a final attempt to
check for tcp ports

(the check for these tools should not really use hardcoded paths)

16 years agoif we dont have /etc/sysconfig and we dont have /etc/default
Ronnie Sahlberg [Sat, 14 Jul 2007 23:13:50 +0000 (09:13 +1000)]
if we dont have /etc/sysconfig  and we dont have /etc/default
check /etc/ctdb/sysconfig as a last option

16 years agowhen we have found that /etc/rc.d/init.d/SERVICE exists, then run that
Ronnie Sahlberg [Sat, 14 Jul 2007 22:54:48 +0000 (08:54 +1000)]
when we have found that /etc/rc.d/init.d/SERVICE exists, then run that
script and not /etc/rc.d/SERVICE

16 years agoadd some configure magic to make it configure and build properly on
Ronnie Sahlberg [Sat, 14 Jul 2007 05:16:52 +0000 (15:16 +1000)]
add some configure magic to make it configure and build properly on
linux and aix

16 years agoadd some support for controlling Linux or AIX in the makefile
Ronnie Sahlberg [Sat, 14 Jul 2007 00:58:51 +0000 (10:58 +1000)]
add some support for controlling Linux or AIX in the makefile

this should really be done by configure

16 years agoadd an initial system_aix.c to manage raw sockets under aix
Ronnie Sahlberg [Sat, 14 Jul 2007 00:27:34 +0000 (10:27 +1000)]
add an initial system_aix.c  to manage raw sockets under aix

16 years agoupdate the comment at the top of file to reflect the purpose of the file
Ronnie Sahlberg [Fri, 13 Jul 2007 07:10:09 +0000 (17:10 +1000)]
update the comment at the top of file to reflect the purpose of the file

16 years agoadd a private_data field to the killtcp structure and let the system
Ronnie Sahlberg [Fri, 13 Jul 2007 07:07:10 +0000 (17:07 +1000)]
add a private_data field to the killtcp structure and let the system
specific routines populate it as they see fit when creating a
capture socket.
pass this structure to read_tcp and close capture socket as parameter

16 years agoensure killtcp structure is initialised
Andrew Tridgell [Fri, 13 Jul 2007 01:55:58 +0000 (11:55 +1000)]
ensure killtcp structure is initialised

16 years ago- merge from ronnie
Andrew Tridgell [Fri, 13 Jul 2007 01:31:18 +0000 (11:31 +1000)]
- merge from ronnie
- cleaner handling of system capture socket

16 years agomerge from tridge
Ronnie Sahlberg [Fri, 13 Jul 2007 01:30:19 +0000 (11:30 +1000)]
merge from tridge

16 years agofully save/restore scheduler parameters
Andrew Tridgell [Thu, 12 Jul 2007 23:35:46 +0000 (09:35 +1000)]
fully save/restore scheduler parameters
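
As an illustration of what saving and restoring the full scheduler state can look like with the POSIX calls, here is a small sketch. The `saved_sched` struct and both function names are made up for this example and are not the ctdb code; a pid of 0 refers to the calling process. Both the policy and the `sched_param` (whose POSIX priority field is `sched_priority`) are recorded, so a later restore puts back exactly what was in effect.

```c
#include <sched.h>

/* Hypothetical container for the state to restore later. */
struct saved_sched {
    int policy;
    struct sched_param param;
};

/* Record the current policy and parameters of the calling process. */
static int save_scheduler(struct saved_sched *s)
{
    s->policy = sched_getscheduler(0);
    if (s->policy == -1) {
        return -1;
    }
    return sched_getparam(0, &s->param);
}

/* Put back both the policy and the parameters saved earlier. */
static int restore_scheduler(const struct saved_sched *s)
{
    return sched_setscheduler(0, s->policy, &s->param);
}
```

Saving only the priority would not be enough if the process later switches policy, which is why the sketch keeps the policy alongside the parameters.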

16 years agofixed the sense of do_setsched
Andrew Tridgell [Thu, 12 Jul 2007 23:14:31 +0000 (09:14 +1000)]
fixed the sense of do_setsched

16 years agoallow extra option override in /etc/sysconfig/ctdb
Andrew Tridgell [Thu, 12 Jul 2007 23:14:15 +0000 (09:14 +1000)]
allow extra option override in /etc/sysconfig/ctdb

16 years agoadded --nosetsched option to ctdbd
Andrew Tridgell [Thu, 12 Jul 2007 22:47:02 +0000 (08:47 +1000)]
added --nosetsched option to ctdbd

16 years agonetinet/if_ether.h is more portable than net/ethernet.h
Ronnie Sahlberg [Thu, 12 Jul 2007 01:43:30 +0000 (11:43 +1000)]
netinet/if_ether.h is more portable than net/ethernet.h

16 years agothe posix.4 name for the priority field is sched_priority
Ronnie Sahlberg [Thu, 12 Jul 2007 01:31:20 +0000 (11:31 +1000)]
the posix.4 name for the priority field is sched_priority
not __sched_priority

16 years agoas an optimization for when we want to send multiple tickles at a time
Ronnie Sahlberg [Wed, 11 Jul 2007 23:22:06 +0000 (09:22 +1000)]
as an optimization for when we want to send multiple tickles at a time
let the caller create the sending socket and use a single socket instead
of one new one for each tickle.
pass a sending socket to ctdb_sys_send_tcp()

ctdb_sys_kill_tcp is no longer used so remove it

set the socketflags for close on exec and nonblocking in the helper that
creates the sockets instead of in the caller

add a helper to create a sending socket to send tickles from
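
A sketch of the "set the flags in the helper" idea, using only standard `fcntl()` calls. The function names are hypothetical. The real sending socket is presumably a raw socket, which needs privileges, so the domain, type and protocol are left as parameters here. The caller gets back a descriptor that is already non-blocking and close-on-exec, and can reuse it for many tickles.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Mark a descriptor non-blocking and close-on-exec. */
static int set_nonblocking_cloexec(int fd)
{
    int fl;
    int fdfl;

    fl = fcntl(fd, F_GETFL, 0);
    if (fl == -1 || fcntl(fd, F_SETFL, fl | O_NONBLOCK) == -1) {
        return -1;
    }

    fdfl = fcntl(fd, F_GETFD, 0);
    if (fdfl == -1 || fcntl(fd, F_SETFD, fdfl | FD_CLOEXEC) == -1) {
        return -1;
    }
    return 0;
}

/*
 * Create a socket with both flags already applied, so callers do not
 * have to repeat the fcntl() sequence. Returns -1 on any failure.
 */
static int create_socket_with_flags(int domain, int type, int protocol)
{
    int fd = socket(domain, type, protocol);

    if (fd == -1) {
        return -1;
    }
    if (set_nonblocking_cloexec(fd) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```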

16 years agorename killtcp->fd to killtcp->capture_fd
Ronnie Sahlberg [Wed, 11 Jul 2007 22:52:24 +0000 (08:52 +1000)]
rename killtcp->fd to killtcp->capture_fd

we might want to have two sockets attached to the killtcp structure
one for capturing and a second one for sending  so we dont have to
create a new socket for each tickle we want to send

16 years agoctdb killtcp no longer takes a <numrst> argument to control how many
Ronnie Sahlberg [Wed, 11 Jul 2007 22:31:56 +0000 (08:31 +1000)]
ctdb killtcp  no longer takes a <numrst> argument to control how many
times to try the reset.

the reset retry attempt is now handled inside the daemon

update the 60.nfs script and remove this parameter that is no longer
used

16 years agomake the ctdb tool use the killtcp control in the daemon instead of
Ronnie Sahlberg [Wed, 11 Jul 2007 22:30:04 +0000 (08:30 +1000)]
make the ctdb tool use the killtcp control in the daemon instead of
calling killtcp directly

16 years agoadd daemon code for the new kill_tcp control
Ronnie Sahlberg [Wed, 11 Jul 2007 08:24:25 +0000 (18:24 +1000)]
add daemon code for the new kill_tcp control

16 years agoadd a ctdb_ prefix to two public functions
Ronnie Sahlberg [Wed, 11 Jul 2007 08:13:03 +0000 (18:13 +1000)]
add a ctdb_ prefix to two public functions

16 years agofirst cut at a better and more scalable socketkiller
Ronnie Sahlberg [Wed, 11 Jul 2007 07:43:51 +0000 (17:43 +1000)]
first cut at a better and more scalable socketkiller
that can kill multiple connections asynchronously using one listening
socket

16 years agoadd a ctdb_kill_tcp_callback() that will perform a kill tcp using a
Ronnie Sahlberg [Wed, 11 Jul 2007 02:33:14 +0000 (12:33 +1000)]
add a ctdb_kill_tcp_callback() that will perform a kill tcp using a
background process

16 years agopass the header to ctdb_become_dmaster instead of just the reqid
Ronnie Sahlberg [Tue, 10 Jul 2007 23:44:52 +0000 (09:44 +1000)]
pass the header to ctdb_become_dmaster instead of just the reqid

this allows us to print which node Invalid or Dropped orphan become
dmaster packets came from

16 years agoprint the operation code in the debug message when we discard a packet
Ronnie Sahlberg [Tue, 10 Jul 2007 22:41:29 +0000 (08:41 +1000)]
print the operation code in the debug message when we discard a packet
due to incorrect generation number

16 years agoregenerated ctdbd manpage
Ronnie Sahlberg [Tue, 10 Jul 2007 22:27:22 +0000 (08:27 +1000)]
regenerated ctdbd manpage

16 years agomerge from tridge
Ronnie Sahlberg [Tue, 10 Jul 2007 09:07:23 +0000 (19:07 +1000)]
merge from tridge

16 years agominor back-merge from samba4
Andrew Tridgell [Tue, 10 Jul 2007 08:13:47 +0000 (18:13 +1000)]
minor back-merge from samba4

16 years agomerge from tridge
Ronnie Sahlberg [Tue, 10 Jul 2007 07:45:04 +0000 (17:45 +1000)]
merge from tridge

16 years agomore merges for GPLv3 update
Andrew Tridgell [Tue, 10 Jul 2007 05:46:05 +0000 (15:46 +1000)]
more merges for GPLv3 update

16 years agoupdate lib/events from samba4 (If->if)
Andrew Tridgell [Tue, 10 Jul 2007 05:34:00 +0000 (15:34 +1000)]
update lib/events from samba4 (If->if)

16 years agoupdate lib/tdb from samba4
Andrew Tridgell [Tue, 10 Jul 2007 05:32:27 +0000 (15:32 +1000)]
update lib/tdb from samba4

16 years agoupdate lib/replace from samba4
Andrew Tridgell [Tue, 10 Jul 2007 05:29:31 +0000 (15:29 +1000)]
update lib/replace from samba4

16 years agomerge from ronnie
Andrew Tridgell [Tue, 10 Jul 2007 04:59:23 +0000 (14:59 +1000)]
merge from ronnie

16 years agouse the socketkiller to kill off all lock manager sessions as well
Ronnie Sahlberg [Tue, 10 Jul 2007 03:09:35 +0000 (13:09 +1000)]
use the socketkiller to kill off all lock manager sessions as well

16 years agoupdate the documentation for NFS to mention that the lock manager must
Ronnie Sahlberg [Tue, 10 Jul 2007 02:43:46 +0000 (12:43 +1000)]
update the documentation for NFS to mention that the lock manager must
run on the same port on all nodes.

remove the CTDB_MANAGES_NFSLOCK variable that is no longer used

16 years agomake it possible to specify how many times ctdb killtcp will try to RST
Ronnie Sahlberg [Tue, 10 Jul 2007 00:24:20 +0000 (10:24 +1000)]
make it possible to specify how many times ctdb killtcp will try to RST
the tcp connection

change the 60.nfs script to run ctdb killtcp in the foreground so we
dont get lots of these running in parallel when there are a lot of tcp
connections to rst

16 years agorun the ctdb killtcp in the background
Ronnie Sahlberg [Tue, 10 Jul 2007 00:07:26 +0000 (10:07 +1000)]
run the ctdb killtcp in the background

16 years agodont restart the tcp service after an ip takeover, it is more efficient
Ronnie Sahlberg [Mon, 9 Jul 2007 23:45:14 +0000 (09:45 +1000)]
dont restart the tcp service after an ip takeover, it is more efficient
to just kill off the tcp connections

16 years agonicer handling of DISCONNECTED flag when we update the node flags from
Ronnie Sahlberg [Mon, 9 Jul 2007 07:40:15 +0000 (17:40 +1000)]
nicer handling of DISCONNECTED flag  when we update the node flags from
a remote message

16 years agowhen a remote node has sent us a message to update the flags for a node,
Ronnie Sahlberg [Mon, 9 Jul 2007 03:21:17 +0000 (13:21 +1000)]
when a remote node has sent us a message to update the flags for a node,
dont let those messages modify the DISCONNECTED flag.

the DISCONNECTED flag must be managed locally since it describes whether
the local node can communicate with the remote node or not
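
One way to express this rule is as a bit mask, sketched below. The flag names and values are illustrative and are not the ctdb constants. Every bit is taken from the remote message except DISCONNECTED, which keeps whatever value the local node has already determined.

```c
#include <stdint.h>

/* Illustrative flag values; not the ctdb constants. */
#define FLAG_DISCONNECTED 0x1u
#define FLAG_BANNED       0x2u

/*
 * Merge a remote flag update into the locally held flags. The remote
 * side may change any bit except DISCONNECTED, which reflects only
 * whether this node can reach the other one.
 */
static uint32_t apply_remote_flags(uint32_t local, uint32_t remote)
{
    return (remote & ~FLAG_DISCONNECTED) | (local & FLAG_DISCONNECTED);
}
```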

16 years agoa better way to fix the DISCONNECT|BANNED vs DISCONNECT bug
Ronnie Sahlberg [Mon, 9 Jul 2007 02:55:15 +0000 (12:55 +1000)]
a better way to fix the DISCONNECT|BANNED vs DISCONNECT bug

16 years agowhen checking the nodemap flags for consistency while monitoring the
Ronnie Sahlberg [Mon, 9 Jul 2007 02:33:00 +0000 (12:33 +1000)]
when checking the nodemap flags for consistency while monitoring the
cluster, we can't check that the BANNED and the DISCONNECTED flags
are both set at the same time, since if a node becomes banned just
before it is DISCONNECTED there is no guarantee that all other nodes
will have seen the BANNED flag.

So we must first check the DISCONNECTED flag only, and only if the
DISCONNECTED flag is not set should we check the BANNED flag.

otherwise this can cause a recovery loop where some nodes think the
disconnected node is DISCONNECTED|BANNED and others think it is just
DISCONNECTED
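
The ordering described above can be sketched as a comparison function. The flag names, values and the function itself are hypothetical and are not lifted from the recovery daemon. Two views of the same node are compared on DISCONNECTED first, and the BANNED bit is only compared when the node is still connected.

```c
#include <stdint.h>

/* Illustrative flag values; not the ctdb constants. */
#define FLAG_DISCONNECTED 0x1u
#define FLAG_BANNED       0x2u

/*
 * Returns 1 if two nodes' views of a third node's flags agree well
 * enough that no recovery is needed, 0 otherwise.
 */
static int node_flags_consistent(uint32_t local, uint32_t remote)
{
    /* Step 1: both sides must agree on DISCONNECTED. */
    if ((local & FLAG_DISCONNECTED) != (remote & FLAG_DISCONNECTED)) {
        return 0;
    }

    /*
     * Step 2: if the node is disconnected, a BANNED bit set just
     * before the disconnect may not have reached everyone, so a
     * mismatch there is ignored.
     */
    if (local & FLAG_DISCONNECTED) {
        return 1;
    }

    /* Step 3: the node is connected, so BANNED must match too. */
    return (local & FLAG_BANNED) == (remote & FLAG_BANNED);
}
```

With this ordering, `DISCONNECTED|BANNED` on one side and plain `DISCONNECTED` on the other compare as consistent, which is the case that previously triggered the recovery loop.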

16 years agomerge from tridge
Ronnie Sahlberg [Sun, 8 Jul 2007 22:38:01 +0000 (08:38 +1000)]
merge from tridge

16 years agofixed sense of inet_aton test
Andrew Tridgell [Sun, 8 Jul 2007 11:09:09 +0000 (21:09 +1000)]
fixed sense of inet_aton test

16 years agocall kill_clients when releasing all IPs, as well as for individual IPs
Andrew Tridgell [Sun, 8 Jul 2007 10:45:12 +0000 (20:45 +1000)]
call kill_clients when releasing all IPs, as well as for individual IPs

16 years agowe do tell banned nodes to release IPs
Andrew Tridgell [Sun, 8 Jul 2007 10:24:03 +0000 (20:24 +1000)]
we do tell banned nodes to release IPs

16 years agolog the generation numbers to give a hint about this bug
Andrew Tridgell [Sun, 8 Jul 2007 09:36:55 +0000 (19:36 +1000)]
log the generation numbers to give a hint about this bug

16 years agoincrement rpm release number
Andrew Tridgell [Sun, 8 Jul 2007 00:41:30 +0000 (10:41 +1000)]
increment rpm release number

16 years agomerge from ronnie - we have an official port number, yay!
Andrew Tridgell [Fri, 6 Jul 2007 06:17:31 +0000 (16:17 +1000)]
merge from ronnie - we have an official port number, yay!

16 years agouse the official iana number for ctdb and not 9001
Ronnie Sahlberg [Fri, 6 Jul 2007 05:29:03 +0000 (15:29 +1000)]
use the official iana number for ctdb and not 9001

16 years agouse 'ctdb tickle' instead of sendip to tickle nfs clients.
Ronnie Sahlberg [Fri, 6 Jul 2007 01:51:34 +0000 (11:51 +1000)]
use 'ctdb tickle' instead of sendip to tickle nfs clients.

16 years agoremove 59.nfslock and fold this into 60.nfs
Ronnie Sahlberg [Fri, 6 Jul 2007 00:54:42 +0000 (10:54 +1000)]
remove 59.nfslock and fold this into 60.nfs
add a 61.nfstickle script to make nfs failover faster

16 years agomerge from tridge
Ronnie Sahlberg [Fri, 6 Jul 2007 00:48:46 +0000 (10:48 +1000)]
merge from tridge

16 years agomerge from ronnie (with spelling fixes)
Andrew Tridgell [Thu, 5 Jul 2007 05:06:42 +0000 (15:06 +1000)]
merge from ronnie (with spelling fixes)

16 years agobreak the tickle description into two paragraphs
Ronnie Sahlberg [Thu, 5 Jul 2007 00:17:46 +0000 (10:17 +1000)]
break the tickle description into two paragraphs

16 years agoupdate the manpage for ctdb to describe killtcp and tickle
Ronnie Sahlberg [Thu, 5 Jul 2007 00:16:11 +0000 (10:16 +1000)]
update the manpage for ctdb to describe killtcp and tickle

16 years agomerge from tridge
Ronnie Sahlberg [Thu, 5 Jul 2007 00:01:35 +0000 (10:01 +1000)]
merge from tridge

16 years agofixed help layout
Andrew Tridgell [Thu, 5 Jul 2007 00:00:51 +0000 (10:00 +1000)]
fixed help layout

16 years agofixed error message on bad IP/port
Andrew Tridgell [Wed, 4 Jul 2007 23:59:45 +0000 (09:59 +1000)]
fixed error message on bad IP/port

16 years agomerge from ronnie
Andrew Tridgell [Wed, 4 Jul 2007 23:59:11 +0000 (09:59 +1000)]
merge from ronnie

16 years agoadd a command to ctdb to send tickle-ack's
Ronnie Sahlberg [Wed, 4 Jul 2007 22:56:02 +0000 (08:56 +1000)]
add a command to ctdb to send tickle-ack's

16 years agomerge from tridge
Ronnie Sahlberg [Wed, 4 Jul 2007 07:53:16 +0000 (17:53 +1000)]
merge from tridge

16 years agoforgot to add this
Andrew Tridgell [Wed, 4 Jul 2007 07:45:46 +0000 (17:45 +1000)]
forgot to add this

16 years agomerge from tridge
Ronnie Sahlberg [Wed, 4 Jul 2007 07:37:26 +0000 (17:37 +1000)]
merge from tridge

16 years agomerge from tridge
Ronnie Sahlberg [Wed, 4 Jul 2007 07:35:16 +0000 (17:35 +1000)]
merge from tridge

16 years agoremoved unused makefile var
Andrew Tridgell [Wed, 4 Jul 2007 06:52:38 +0000 (16:52 +1000)]
removed unused makefile var

16 years ago- neaten up the command line for killtcp
Andrew Tridgell [Wed, 4 Jul 2007 06:51:13 +0000 (16:51 +1000)]
- neaten up the command line for killtcp
- split out the event script code into a separate module
- get rid of the separate takeover directory

16 years agomore careful checking of lengths
Andrew Tridgell [Wed, 4 Jul 2007 06:22:09 +0000 (16:22 +1000)]
more careful checking of lengths

16 years agomerge from ronnie
Andrew Tridgell [Wed, 4 Jul 2007 04:51:33 +0000 (14:51 +1000)]
merge from ronnie

16 years agowe dont need socketkiller anymore now that the
Ronnie Sahlberg [Wed, 4 Jul 2007 04:16:28 +0000 (14:16 +1000)]
we dont need socketkiller anymore now that the
kill-tcp-connection code is available from the ctdb tool

16 years agoadd a killtcp command to the ctdb tool
Ronnie Sahlberg [Wed, 4 Jul 2007 04:14:48 +0000 (14:14 +1000)]
add a killtcp command to the ctdb tool

16 years agoadd a new ctdb_sys_kill_tcp() function that kills (RST) the specified
Ronnie Sahlberg [Wed, 4 Jul 2007 03:53:22 +0000 (13:53 +1000)]
add a new  ctdb_sys_kill_tcp() function that kills (RST) the specified
connection