Ronnie Sahlberg [Thu, 30 Aug 2007 05:27:45 +0000 (15:27 +1000)]
when we start 60.nfs we must make sure that the shared storage
nfs-state directory actually exists (by creating it)
or else the lock manager will not start
Ronnie Sahlberg [Mon, 27 Aug 2007 05:03:52 +0000 (15:03 +1000)]
make the ctdb shutdown command use the async _send() function to send
the shutdown command
and return success to the caller if the _send() was successful
Ronnie Sahlberg [Fri, 24 Aug 2007 05:53:41 +0000 (15:53 +1000)]
add an initial implementation of a service_id structure and three
controls to register/unregister/check a server id.
a server id consists of TYPE:VNN:ID where TYPE is specific to the
application, VNN is the node where the server id was registered and ID
might be a node-unique identifier such as a pid or similar.
Clients can register a server id for themselves at the local ctdb daemon.
When a client disappears or when the domain socket connection for the
client drops, any and all server ids registered across that domain
socket will also be automatically removed from the store.
clients can register as many server_ids as they want at the same time
but each TYPE:VNN:ID must be globally unique.
Clients have the option of explicitly unregistering a server id by using
the UNREGISTER control.
Registration and unregistration can only be done by clients to the local
daemon. clients cannot register their server id to a remote node.
clients can check whether a server id exists on any ctdb node in the
network by using the check control
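The register/unregister/check behaviour described above can be sketched roughly as follows. This is an illustration only, not the ctdb implementation: every name (sketch_server_id, sketch_register() and so on), the fixed-size table and the use of a file descriptor to identify the client are assumptions made for the example. It covers the local store only, so the check against remote nodes is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout of a TYPE:VNN:ID triple (not the real ctdb struct). */
struct sketch_server_id {
    uint32_t type; /* application-specific type */
    uint32_t vnn;  /* node where the id was registered */
    uint32_t id;   /* node-unique value, e.g. a pid */
};

#define SKETCH_MAX_IDS 16

struct sketch_entry {
    struct sketch_server_id sid;
    int client_fd; /* domain socket the id was registered over */
    bool used;
};

static struct sketch_entry sketch_store[SKETCH_MAX_IDS];

/* Find the slot holding exactly this TYPE:VNN:ID, or NULL. */
static struct sketch_entry *sketch_find(const struct sketch_server_id *sid)
{
    size_t i;

    for (i = 0; i < SKETCH_MAX_IDS; i++) {
        struct sketch_entry *e = &sketch_store[i];
        if (e->used && e->sid.type == sid->type &&
            e->sid.vnn == sid->vnn && e->sid.id == sid->id) {
            return e;
        }
    }
    return NULL;
}

/* Check: is this server id currently registered in the local store? */
bool sketch_check(const struct sketch_server_id *sid)
{
    assert(sid != NULL);
    return sketch_find(sid) != NULL;
}

/* Register: fails with -1 if the id already exists or the table is full. */
int sketch_register(int client_fd, const struct sketch_server_id *sid)
{
    size_t i;

    assert(sid != NULL);
    if (sketch_find(sid) != NULL) {
        return -1; /* each TYPE:VNN:ID must be unique */
    }
    for (i = 0; i < SKETCH_MAX_IDS; i++) {
        if (!sketch_store[i].used) {
            sketch_store[i].sid = *sid;
            sketch_store[i].client_fd = client_fd;
            sketch_store[i].used = true;
            return 0;
        }
    }
    return -1;
}

/* Explicit unregister: only the client that registered the id may remove it. */
int sketch_unregister(int client_fd, const struct sketch_server_id *sid)
{
    struct sketch_entry *e;

    assert(sid != NULL);
    e = sketch_find(sid);
    if (e == NULL || e->client_fd != client_fd) {
        return -1;
    }
    e->used = false;
    return 0;
}

/* Called when a client's domain socket drops: remove all of its ids. */
void sketch_client_gone(int client_fd)
{
    size_t i;

    for (i = 0; i < SKETCH_MAX_IDS; i++) {
        if (sketch_store[i].used && sketch_store[i].client_fd == client_fd) {
            sketch_store[i].used = false;
        }
    }
}
```

A client would call sketch_register() once per id it owns; the daemon side would call sketch_client_gone() from whatever code notices that the socket has closed.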
Ronnie Sahlberg [Fri, 24 Aug 2007 00:42:06 +0000 (10:42 +1000)]
change the api for managing callbacks to controls so that instead of
passing it as a parameter we set the callback function explicitly from
the caller if the ..._send() function returned a valid state pointer.
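The calling pattern this describes might look like the sketch below. All names here (sketch_control_state, sketch_control_send(), sketch_control_reply(), sketch_count_reply()) are hypothetical stand-ins, not the ctdb API, and the "send" does no real I/O; the point is only that the caller attaches the callback itself after checking the returned state pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-request state returned by the send function. */
struct sketch_control_state {
    int opcode;
    int status;
    void (*callback)(struct sketch_control_state *state);
    void *private_data;
};

/* Start an async request. Returns NULL if the request could not be
 * queued. In this sketch a negative opcode stands in for any failure. */
struct sketch_control_state *sketch_control_send(int opcode)
{
    struct sketch_control_state *state;

    if (opcode < 0) {
        return NULL;
    }
    state = calloc(1, sizeof(*state));
    if (state == NULL) {
        return NULL;
    }
    state->opcode = opcode;
    return state;
}

/* What the event loop would do when the reply arrives: record the
 * status and invoke the callback if the caller installed one. */
void sketch_control_reply(struct sketch_control_state *state, int status)
{
    assert(state != NULL);
    state->status = status;
    if (state->callback != NULL) {
        state->callback(state);
    }
}

/* Example callback: count replies through private_data. */
void sketch_count_reply(struct sketch_control_state *state)
{
    int *counter = state->private_data;

    (*counter)++;
}
```

The caller checks the pointer returned by sketch_control_send() and only then sets state->callback and state->private_data, so a failed send never leaves a callback registered.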
Ronnie Sahlberg [Thu, 23 Aug 2007 03:00:10 +0000 (13:00 +1000)]
hang the ctdb_req_control structure off the ctdb_client_control_state
struct so that if we timeout a control we can print debug info such as
what opcode failed and to which node
we don't need the *status parameter to ctdb_client_control_state
Ronnie Sahlberg [Thu, 23 Aug 2007 01:58:09 +0000 (11:58 +1000)]
in ctdb_call_recv() we must check that state is non-NULL since
ctdb_call() may pass a null pointer to _recv() and this would cause a
segfault.
fortunately there appear to be no critical users of this codepath
right now, so the risk was more theoretical: IF clients start using this
call it could segfault.
change ctdb_control() to become fully async so we can later make the
recovery daemon do the expensive controls to nodes in parallel instead
of in sequence
Ronnie Sahlberg [Wed, 22 Aug 2007 02:53:24 +0000 (12:53 +1000)]
when we receive a packet from the network, check explicitly that the
node is not banned if the call is for a database record, i.e. a REQ/REPLY
CALL/DMASTER
if we get such a call while banned, ignore the packet and write an entry
in the logfile
Ronnie Sahlberg [Wed, 22 Aug 2007 00:38:35 +0000 (10:38 +1000)]
when a node becomes banned its databases are no longer part of ctdb
and it should thus no longer serve any database access calls until it
has been reintroduced into the cluster.
when becoming banned, reset the local generation id to 1 to prevent
any further database access calls from other nodes from being processed.
Ronnie Sahlberg [Tue, 21 Aug 2007 07:25:15 +0000 (17:25 +1000)]
change the structure used for node flag change messages so that we can
see both the old flags as well as the new flags (so we can tell which
flags changed)
send the CTDB_SRVID_RECONFIGURE messages to connected nodes only, not to
every node, connected or not, in the cluster.
in the handler inside the recovery daemon which is invoked for node flag
change messages, only do a takeover_run() and redistribute the ip
addresses IF it was the disabled or the unhealthy flags that changed.
Also send out the cluster reconfigured message in this case.
If any of the other flags changed we don't need to do the takeover_run()
here since that will be done during recovery.
Ronnie Sahlberg [Mon, 20 Aug 2007 23:46:27 +0000 (09:46 +1000)]
when we shutdown the service due to receiving a 'ctdb shutdown' command
from the administrator, log this as 'Received SHUTDOWN command. Stopping
CTDB daemon.' so that the administrator will know when looking at the
log 'why' the ctdb service was terminated.
Previously the only thing logged was 'shutting down' which is not
detailed enough.
Ronnie Sahlberg [Mon, 20 Aug 2007 04:16:58 +0000 (14:16 +1000)]
if a public address has already been taken over by a node, then let that
public address remain at that node until either the node becomes
unhealthy or the original/primary node for that address becomes healthy
again.
Otherwise what will happen is
1, if we ban a node, the banning code immediately does a
takeover_run() and reassigns the public address to a different node in
the cluster.
2, a few seconds later (at most) the recovery daemon will detect that
the number of nodes has shrunk and will initiate a recovery.
During the recovery the public address would again be assigned to a
node, this time a different node.
Ronnie Sahlberg [Wed, 15 Aug 2007 04:44:03 +0000 (14:44 +1000)]
call the service specific event scripts directly from the forked child
instead of from /etc/ctdb/events so that we can get better debugging
output in the logs when something fails in the scripts
Ronnie Sahlberg [Wed, 15 Aug 2007 00:01:00 +0000 (10:01 +1000)]
add a wrapper function to create the key used to insert/lookup a certain
tcp connection in the tree that stores the tcp connections to kill by
sending an RST
add a define that specifies the key length instead of hardcoding it as 4
Ronnie Sahlberg [Thu, 9 Aug 2007 04:08:59 +0000 (14:08 +1000)]
change the mem hierarchy for trees. let the node be owned by the data
we store in the tree and use a node destructor so that when the data is
talloc_free()d we also remove the node from the tree.
Ronnie Sahlberg [Wed, 8 Aug 2007 05:09:19 +0000 (15:09 +1000)]
when we want to kill a tcp connection we stored the connection
description (src + dst sockaddr_in) in a linked list.
every time we received a captured packet from the network we had to walk
this list in linear time to see if the packet matched a connection we
wanted to RST,
which wouldn't scale very well.
replace the linked list with a redblack tree that is indexed by
src address, src port, dst address, dst port
to make checking whether the packet belongs to a connection we want to
RST very fast and scalable
the reason we need to capture packets when we want to kill a TCP
connection is because we must wait for an ACK coming back from the
remote host so that we can learn which sequence number to use in the
RST.
Most tcp stacks today will ignore any and all RST segments unless the
sequence number lies exactly on the right edge of the window, to make
spoofing RST a little bit more difficult.
Ronnie Sahlberg [Wed, 8 Aug 2007 01:21:18 +0000 (11:21 +1000)]
add a tree insert function that takes a callback function to populate
the data of the tree.
this callback makes it more convenient to manage cases where one might
want to insert multiple entries into the tree with the same key
rename the tree->tree pointer to tree->root since this is supposed to
point to the root of the tree
Ronnie Sahlberg [Tue, 7 Aug 2007 22:20:46 +0000 (08:20 +1000)]
when inserting data in the tree, if there was already a node with the
same key then replace the data in the node with the new data and return
the pointer to the previous data held in the node.
this allows a caller to avoid having to first check if a node already
exists before inserting a possibly duplicate/colliding entry and lets
the caller do whatever it needs to do after the fact.
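The replace-and-return-previous behaviour can be illustrated with the sketch below. It uses a plain unbalanced binary search tree with a single integer key to stay short; the red-black rebalancing and talloc memory handling of the real tree are deliberately left out, and all names (sketch_tree, sketch_node, sketch_insert()) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_node {
    uint32_t key;
    void *data;
    struct sketch_node *left;
    struct sketch_node *right;
};

struct sketch_tree {
    struct sketch_node *root;
};

/* Insert data under key. If a node with this key already exists, its
 * data is replaced and the previous data pointer is returned so the
 * caller can free or merge it. Returns NULL when a new node was
 * created. Allocation failure is treated as fatal in this sketch. */
void *sketch_insert(struct sketch_tree *t, uint32_t key, void *data)
{
    struct sketch_node **link;

    assert(t != NULL);
    link = &t->root;
    while (*link != NULL) {
        if (key == (*link)->key) {
            void *old = (*link)->data;
            (*link)->data = data;
            return old;
        }
        link = (key < (*link)->key) ? &(*link)->left : &(*link)->right;
    }
    *link = calloc(1, sizeof(**link));
    if (*link == NULL) {
        abort();
    }
    (*link)->key = key;
    (*link)->data = data;
    return NULL;
}
```

A caller can insert unconditionally and deal with a collision only when the return value is non-NULL.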
Ronnie Sahlberg [Sat, 4 Aug 2007 01:23:04 +0000 (11:23 +1000)]
do not restart lockd/statd when we take over an ip address. this is
overkill since
1, we now kill the tcp connections for lockd in 60.nfs
2, rpc.statd on linux sends out the notifications using the wrong
interface anyway, which breaks a lot of clients including linux!
Ronnie Sahlberg [Mon, 30 Jul 2007 06:10:14 +0000 (16:10 +1000)]
after we have checked that the dest address is a public address,
update addr to the source address so the printout in the log matches
the client that attached to samba
Ronnie Sahlberg [Wed, 25 Jul 2007 07:53:55 +0000 (17:53 +1000)]
there were situations where we were not guaranteed that a sibling had 2
child nodes which would cause a segv when trying to dereference those
two child nodes in order to read their color
Ronnie Sahlberg [Tue, 24 Jul 2007 22:03:58 +0000 (08:03 +1000)]
no need to have a separate assignment of the tcparray pointer followed
by a talloc_steal()
use the returned pointer in talloc_steal as the value to assign
Ronnie Sahlberg [Mon, 23 Jul 2007 21:46:51 +0000 (07:46 +1000)]
when we build the arp structure for sending gratuitous arp (and tcp
tickles) just talloc_steal the entire tcp_array into the arp
structure instead of copying each of the entries into a linked list
and then releasing the tcparray.