git.ipfire.org Git - thirdparty/samba.git/log
18 years ago: merge shutdown control from ronnie
Andrew Tridgell [Thu, 17 May 2007 00:48:43 +0000 (10:48 +1000)] 
merge shutdown control from ronnie
(This used to be ctdb commit 61bfe26dde0bfd494d4f12f0aa2a3bb78852ab31)

18 years ago: add a control to shutdown/kill a node
Ronnie Sahlberg [Thu, 17 May 2007 00:45:31 +0000 (10:45 +1000)] 
add a control to shutdown/kill a node

(This used to be ctdb commit 3802f7304fd59d56062c855987e2561753e85a69)

18 years ago: merge from tridge
Ronnie Sahlberg [Wed, 16 May 2007 08:44:51 +0000 (18:44 +1000)] 
merge from tridge

(This used to be ctdb commit 0c6dc471e33e80db00a2b006262c4107f39fa023)

18 years ago: - merge from ronnie
Andrew Tridgell [Wed, 16 May 2007 08:10:26 +0000 (18:10 +1000)] 
- merge from ronnie
- fixed a memory leak found by dmitry

(This used to be ctdb commit ae87bf0005666b50850161c3843d6bc7cb5c8971)

18 years ago: remove a prototype we no longer need
Ronnie Sahlberg [Wed, 16 May 2007 04:45:43 +0000 (14:45 +1000)] 
remove a prototype we no longer need

(This used to be ctdb commit 4a11373ec5e8196cf430f18f6171915f790f794b)

18 years ago: if a caller specifies a timeout when calling a control, it makes no
Ronnie Sahlberg [Wed, 16 May 2007 02:34:30 +0000 (12:34 +1000)]
if a caller specifies a timeout when calling a control, it makes no
sense to have the daemon requeue the packets if they time out or cannot
be delivered to the remote node

(This used to be ctdb commit 9fb753046787190970654aeb937e96685ac53184)

18 years ago: merge from tridge
Ronnie Sahlberg [Wed, 16 May 2007 01:12:28 +0000 (11:12 +1000)] 
merge from tridge

(This used to be ctdb commit 8d424b41d6cf2973b28a749d1b8e6a028dad9ffe)

18 years ago: enable TCP keepalives
Andrew Tridgell [Tue, 15 May 2007 08:40:56 +0000 (18:40 +1000)] 
enable TCP keepalives
(This used to be ctdb commit a44f760f6260359201d8431d2f1267af2bc6b1b1)
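
As an illustration of what enabling TCP keepalives involves at the socket level, here is a minimal sketch in Python (ctdb itself is written in C, and this is not the code from the commit). The helper name `enable_keepalive` is hypothetical; `SO_KEEPALIVE` is the standard socket option.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Ask the kernel to probe an idle TCP connection so a dead peer is noticed.

    Hypothetical helper for illustration; only the SO_KEEPALIVE option is set,
    probe timing is left at the operating system defaults.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    # A non-zero value means the option is switched on for this socket.
    print(s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

Without keepalives, a connection to a node that has silently disappeared can look healthy until the next write fails.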

18 years ago: moved the recovery daemon into the main ctdbd and enable it by default
Andrew Tridgell [Tue, 15 May 2007 05:13:36 +0000 (15:13 +1000)] 
moved the recovery daemon into the main ctdbd and enable it by default
(This used to be ctdb commit 2a7d42124731f43d013cb76a798525eab4cc1ee0)

18 years ago: fixed two more places where we don't correctly handle write errors on sockets
Andrew Tridgell [Tue, 15 May 2007 04:08:58 +0000 (14:08 +1000)] 
fixed two more places where we don't correctly handle write errors on sockets
(This used to be ctdb commit f4a71bb63e7f75d21b66f9eaeac997c2029cd146)

18 years ago: merge from tridge
Ronnie Sahlberg [Tue, 15 May 2007 00:34:14 +0000 (10:34 +1000)] 
merge from tridge

(This used to be ctdb commit eb64cde53ec5ed6949df1684e5c148f2294b1da7)

18 years ago: fixed a fd close error on reconnect
Andrew Tridgell [Tue, 15 May 2007 00:33:28 +0000 (10:33 +1000)] 
fixed a fd close error on reconnect
(This used to be ctdb commit 240651a6f67f914b06e273696cef6180d788221e)

18 years ago: merge from tridge
Ronnie Sahlberg [Tue, 15 May 2007 00:28:41 +0000 (10:28 +1000)] 
merge from tridge

(This used to be ctdb commit 0697f59a044deeab126a39bff97bcd5c1101298e)

18 years ago: added a control to get the local vnn
Andrew Tridgell [Tue, 15 May 2007 00:17:16 +0000 (10:17 +1000)] 
added a control to get the local vnn
(This used to be ctdb commit 0b109f574b710f290372512d0694290ea7cd4368)

18 years ago: check for error on ctdb_ltdb_store
Andrew Tridgell [Tue, 15 May 2007 00:16:59 +0000 (10:16 +1000)] 
check for error on ctdb_ltdb_store
(This used to be ctdb commit c4a34bac4ad4d2f9699e08074668d25586e3c0da)

18 years ago: added a -i switch to run ctdbd without forking
Andrew Tridgell [Mon, 14 May 2007 23:44:33 +0000 (09:44 +1000)] 
added a -i switch to run ctdbd without forking
(This used to be ctdb commit 327df14ecd58f405fbe8b38afa2ee54a8dd0a2e4)

18 years ago: reading on the write side of a pipe isn't allowed - this caused us to run without...
Andrew Tridgell [Mon, 14 May 2007 23:44:03 +0000 (09:44 +1000)] 
reading on the write side of a pipe isn't allowed - this caused us to run without locking in the lockwait code
(This used to be ctdb commit 2ac67ce737f30258915cc25bde531d361092ae14)
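
The operating-system behaviour behind this fix can be shown with a small Python sketch (not the ctdb lockwait code, which is C). It only demonstrates that a read on the write end of a pipe fails at once with `EBADF` instead of blocking; the function name is hypothetical.

```python
import errno
import os


def read_fails_on_write_end() -> bool:
    """Return True if reading from a pipe's write end is rejected with EBADF."""
    read_fd, write_fd = os.pipe()
    try:
        os.read(write_fd, 1)  # wrong end: this descriptor is write-only
    except OSError as exc:
        return exc.errno == errno.EBADF
    else:
        return False
    finally:
        os.close(read_fd)
        os.close(write_fd)


if __name__ == "__main__":
    print(read_fails_on_write_end())
```

Code that treats "the read returned" as "the child signalled us" would therefore proceed immediately, which fits the symptom the commit message describes.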

18 years ago: AIX needs sin_len field for bind()
Andrew Tridgell [Mon, 14 May 2007 23:42:52 +0000 (09:42 +1000)] 
AIX needs sin_len field for bind()
(This used to be ctdb commit cd6c35d4aa4f4a4cfeedf6902cda84e43d7aeba4)

18 years ago: merge from tridge
Ronnie Sahlberg [Mon, 14 May 2007 04:07:19 +0000 (14:07 +1000)] 
merge from tridge

(This used to be ctdb commit d1dae4fc8f4c2d16d313a27968d67c5825a133d1)

18 years ago: merge from tridge
Ronnie Sahlberg [Mon, 14 May 2007 04:05:49 +0000 (14:05 +1000)] 
merge from tridge

(This used to be ctdb commit 65f4415e618dbbac0260f6a4e51e051e6df64a61)

18 years ago: we must not free the fde until after we no longer need the lock child
Andrew Tridgell [Mon, 14 May 2007 04:01:33 +0000 (14:01 +1000)] 
we must not free the fde until after we no longer need the lock child
(This used to be ctdb commit e06776c7c37b63f5c3165c7043d665e0c1a95337)

18 years ago: kill the lockwait child if the pipe goes away
Andrew Tridgell [Mon, 14 May 2007 03:49:01 +0000 (13:49 +1000)] 
kill the lockwait child if the pipe goes away
(This used to be ctdb commit bdfa8ba9932fade074a05a6cb6bc14ae3b84618c)

18 years ago: don't allow setvnnmap while not frozen
Andrew Tridgell [Mon, 14 May 2007 03:48:40 +0000 (13:48 +1000)] 
don't allow setvnnmap while not frozen
(This used to be ctdb commit a73f47f565894cc7e346177d87f2e6813837e1c6)

18 years ago: don't allow setrecmaster while not frozen
Andrew Tridgell [Mon, 14 May 2007 03:48:14 +0000 (13:48 +1000)] 
don't allow setrecmaster while not frozen
(This used to be ctdb commit e84b05ba6062ffc45b7f3c23e88feef1d39069c4)

18 years ago: remove the control to bump the rsn since we don't need it anymore
Ronnie Sahlberg [Sun, 13 May 2007 22:03:48 +0000 (08:03 +1000)]
remove the control to bump the rsn since we don't need it anymore

(This used to be ctdb commit a646b6d77bd8adf6c986259c534a05400c4bde11)

18 years ago: add a missing parameter to the new signature for ctdb_control
Ronnie Sahlberg [Sun, 13 May 2007 20:50:24 +0000 (06:50 +1000)]
add a missing parameter to the new signature for ctdb_control

(This used to be ctdb commit 3a3304cd48d644c758f416ec283faf3ba9690c04)

18 years ago: merge from tridge
Ronnie Sahlberg [Sun, 13 May 2007 20:25:15 +0000 (06:25 +1000)] 
merge from tridge

(This used to be ctdb commit 7bca79ad6357149fd7c6b28ce4b05de3d223a7de)

18 years ago: make sure the ctdb control socket is secure
Andrew Tridgell [Sat, 12 May 2007 23:20:16 +0000 (09:20 +1000)] 
make sure the ctdb control socket is secure
(This used to be ctdb commit 2954f2e501a418af578e75e8705b0b39a77c1861)

18 years ago: added error messages in ctdb_control replies
Andrew Tridgell [Sat, 12 May 2007 11:25:26 +0000 (21:25 +1000)] 
added error messages in ctdb_control replies
(This used to be ctdb commit bd848f5b760e6b2a73ebfc67fd8adb3c31479fb5)

18 years ago: prioritise the dmaster in case of matching rsn
Andrew Tridgell [Sat, 12 May 2007 09:57:12 +0000 (19:57 +1000)] 
prioritise the dmaster in case of matching rsn
(This used to be ctdb commit 4996a12174aa0d215a5b14cb970bdf83eed34a39)
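
One way to read this rule, together with the "merge records based on rsn during recovery" wording elsewhere in this log, is sketched below in Python. This is an assumption-laden illustration, not the ctdb implementation: the `RecordCopy` fields, the `pick_copy` name and the exact tie-break are all hypothetical. The sketch assumes the copy with the higher rsn wins, and that on equal rsn the copy held by the node named as dmaster is preferred.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordCopy:
    """One node's copy of a record (hypothetical layout for illustration)."""
    holder: int   # node this copy was pulled from
    dmaster: int  # node this copy's header names as data master
    rsn: int      # record sequence number
    data: bytes


def pick_copy(a: RecordCopy, b: RecordCopy) -> RecordCopy:
    """Choose which of two copies survives a merge.

    Assumed rule: higher rsn wins; on a tie, prefer the copy whose holder
    is also its dmaster. If neither or both qualify, keep the first copy.
    """
    if a.rsn != b.rsn:
        return a if a.rsn > b.rsn else b
    if b.holder == b.dmaster and a.holder != a.dmaster:
        return b
    return a


if __name__ == "__main__":
    on_dmaster = RecordCopy(holder=1, dmaster=1, rsn=7, data=b"current")
    elsewhere = RecordCopy(holder=0, dmaster=1, rsn=7, data=b"stale")
    print(pick_copy(elsewhere, on_dmaster).data)
```

The tie-break matters because two nodes can hold copies with the same rsn while only the dmaster's copy reflects the latest write.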

18 years ago: the invalid dmaster is no longer needed in recovery
Andrew Tridgell [Sat, 12 May 2007 09:56:31 +0000 (19:56 +1000)] 
the invalid dmaster is no longer needed in recovery
(This used to be ctdb commit bd638ea63d11485bc3a8c50d923262a48095c2f3)

18 years ago: the retry client code is no longer needed now that we use a freeze on recovery
Andrew Tridgell [Sat, 12 May 2007 09:55:55 +0000 (19:55 +1000)] 
the retry client code is no longer needed now that we use a freeze on recovery
(This used to be ctdb commit 4213475a2db93b149705bfbb578c78936124c608)

18 years ago: ensure we propagate the correct rsn for a request dmaster
Andrew Tridgell [Sat, 12 May 2007 09:55:18 +0000 (19:55 +1000)]
ensure we propagate the correct rsn for a request dmaster
(This used to be ctdb commit 70c1c67db865db8a49b56e8e3e8fd56ec5063208)

18 years ago: simplify the generation checking on incoming call packets
Andrew Tridgell [Sat, 12 May 2007 09:54:40 +0000 (19:54 +1000)] 
simplify the generation checking on incoming call packets
(This used to be ctdb commit 87ee47f7fbbf71228bc9cc16faff86b4c59333a2)

18 years ago: make sure we ignore requeued ctdb_call packets of older generations except for packet...
Andrew Tridgell [Sat, 12 May 2007 08:08:50 +0000 (18:08 +1000)] 
make sure we ignore requeued ctdb_call packets of older generations except for packets from the client
(This used to be ctdb commit facab105fbd7fe50f96bdd763ae50ddc54fbdacc)

18 years ago: added -t option to ctdb_control
Andrew Tridgell [Sat, 12 May 2007 06:04:56 +0000 (16:04 +1000)] 
added -t option to ctdb_control
(This used to be ctdb commit 658141280eeb121a570d71c4b0af36d03004f320)

18 years ago: - nicer message if freeze child dies
Andrew Tridgell [Sat, 12 May 2007 05:59:49 +0000 (15:59 +1000)] 
- nicer message if freeze child dies
- change local generation count after recovery/freeze started

(This used to be ctdb commit d9768142797f083a8c09b55d6a8a93cc12089348)

18 years ago: show total frozen/recovering in status
Andrew Tridgell [Sat, 12 May 2007 05:51:08 +0000 (15:51 +1000)]
show total frozen/recovering in status
(This used to be ctdb commit 0d0eb66a63fe6912edb85bf7387ac76acb70babd)

18 years ago: report number of frozen/thawed nodes
Andrew Tridgell [Sat, 12 May 2007 05:44:56 +0000 (15:44 +1000)] 
report number of frozen/thawed nodes
(This used to be ctdb commit 997720bc0e15d882aefed3464fe285674beed691)

18 years ago: watch for the freeze child exiting
Andrew Tridgell [Sat, 12 May 2007 05:44:35 +0000 (15:44 +1000)] 
watch for the freeze child exiting
(This used to be ctdb commit 7f350eca8598022ebd198b2476d1f2c2a8f03a8d)

18 years ago: more robust freeze/thaw logic
Andrew Tridgell [Sat, 12 May 2007 05:29:06 +0000 (15:29 +1000)] 
more robust freeze/thaw logic
(This used to be ctdb commit 51c1e51aeb7dfac1683584df7ef1bef98c092f76)

18 years ago: separate out the freeze/thaw handling from recovery
Andrew Tridgell [Sat, 12 May 2007 05:15:27 +0000 (15:15 +1000)] 
separate out the freeze/thaw handling from recovery
(This used to be ctdb commit 0b0640bd8b8334961f240e0cf276ac112cd6e616)

18 years ago: added lockwait child code for entering recovery mode. A child process holds lockall...
Andrew Tridgell [Sat, 12 May 2007 04:34:21 +0000 (14:34 +1000)]
added lockwait child code for entering recovery mode. A child process holds lockall locks for the entire recovery process
(This used to be ctdb commit f892f30def75b0d964c35eae38c4cf675597dd28)

18 years ago: added _mark calls for tdb_lockall
Andrew Tridgell [Sat, 12 May 2007 04:33:10 +0000 (14:33 +1000)] 
added _mark calls for tdb_lockall
(This used to be ctdb commit e59134fd2af67c746b907c23fdcde2eccbbe17cf)

18 years ago: fixed debug message
Andrew Tridgell [Fri, 11 May 2007 07:29:21 +0000 (17:29 +1000)] 
fixed debug message
(This used to be ctdb commit 9802bf1ef9104b31977020e803b0f81da71c7169)

18 years ago: we have to get a NEW generation id after completing recovery
Ronnie Sahlberg [Fri, 11 May 2007 02:03:19 +0000 (12:03 +1000)] 
we have to get a NEW generation id after completing recovery
to solve a race condition with the logic to retransmit in
ctdb_call.c/ctdb_call_timeout()

(This used to be ctdb commit 1044ddca9ff5c434816de35d3f659aa182704e97)

18 years ago: merge from tridge
Ronnie Sahlberg [Fri, 11 May 2007 00:37:42 +0000 (10:37 +1000)] 
merge from tridge

(This used to be ctdb commit 826058b547b8e836f0a7066e9479e481ad9c472e)

18 years ago: add a control to bump the rsn number for all records in a database
Ronnie Sahlberg [Fri, 11 May 2007 00:36:47 +0000 (10:36 +1000)] 
add a control to bump the rsn number for all records in a database

use this control from the recovery daemon to ensure that the recmaster
always has a higher rsn than any other node for the records after
recovery completes

(This used to be ctdb commit 6fb6a8b981a804bfcc460c4481c51c7c647230f6)

18 years ago: - merge from ronnie
Andrew Tridgell [Fri, 11 May 2007 00:33:43 +0000 (10:33 +1000)] 
- merge from ronnie
- increment rsn only in become_dmaster
- add torture check for rsn regression in ctdb_ltdb_store

(This used to be ctdb commit 8047506a08bb53ee01aa64f25c9f72839e1e2d68)

18 years ago: we must bump the rsn every time we do a REQ_DMASTER or a REPLY_DMASTER
Ronnie Sahlberg [Thu, 10 May 2007 20:08:17 +0000 (06:08 +1000)]
we must bump the rsn every time we do a REQ_DMASTER or a REPLY_DMASTER
to make sure that the "merge records based on rsn during recovery" will
merge correctly.

this is extra important since samba3 never bumps the record when it
writes new data to it!

(This used to be ctdb commit 857e67204065603592c2dbbadbd8667ebba9ccdb)

18 years ago: make ctdb_control catdb work again
Ronnie Sahlberg [Thu, 10 May 2007 19:40:11 +0000 (05:40 +1000)] 
make ctdb_control catdb work again

(This used to be ctdb commit 40a8fb68c71be0b9f54ae88bf8aa39a4c71f3b5a)

18 years ago: merge from tridge
Ronnie Sahlberg [Thu, 10 May 2007 07:59:51 +0000 (17:59 +1000)] 
merge from tridge

(This used to be ctdb commit f261f554ccf5d85a90f504cc20fc6f1f8b3f14d6)

18 years ago: - got rid of the complex hand marshalling in the recovery controls
Andrew Tridgell [Thu, 10 May 2007 07:43:45 +0000 (17:43 +1000)] 
- got rid of the complex hand marshalling in the recovery controls

- fixed the re-send of ctdb calls after a generation change

- fixed a reqid idr leak in controls

- removed the write_record test code

- use the new nonblock lockall code to prevent ctdbd from ever doing a
  blocking lock that could deadlock with smbd

- moved more of the recovery controls into ctdb_recover.c

(This used to be ctdb commit 565a21aa4f1e842309986ab97d6244801153deec)

18 years ago: added nonblocking variants of the two lockall functions to tdb
Andrew Tridgell [Thu, 10 May 2007 07:43:08 +0000 (17:43 +1000)]
added nonblocking variants of the two lockall functions to tdb
(This used to be ctdb commit 2e99fa41ce01fa282bc0f3244ca42a78173743ed)

18 years ago: better timeout handling for calls, controls and traverses
Andrew Tridgell [Thu, 10 May 2007 04:06:48 +0000 (14:06 +1000)] 
better timeout handling for calls, controls and traverses
(This used to be ctdb commit 63346a6c59d4821b4c443939b5d88db8cd20f5fe)

18 years ago: merge from ronnie
Andrew Tridgell [Thu, 10 May 2007 03:15:58 +0000 (13:15 +1000)] 
merge from ronnie
(This used to be ctdb commit 92b7a849565730744c75a7fb776173554e9f57bf)

18 years ago: setup the random number generator a bit better
Andrew Tridgell [Thu, 10 May 2007 03:10:23 +0000 (13:10 +1000)] 
setup the random number generator a bit better
(This used to be ctdb commit 708585eb0ed31b0df6543a1d7a20b82e751877c2)

18 years ago: create a correct vnnmap structure to prevent a segv
Ronnie Sahlberg [Thu, 10 May 2007 00:10:58 +0000 (10:10 +1000)] 
create a correct vnnmap structure to prevent a segv

(This used to be ctdb commit 17777bb5e6208e97a82a171243c6c406f53ee02e)

18 years ago: update ctdb_control to create a correct ctdb_vnn_map->map array
Ronnie Sahlberg [Thu, 10 May 2007 00:03:21 +0000 (10:03 +1000)] 
update ctdb_control to create a correct ctdb_vnn_map->map array

(This used to be ctdb commit e510cc89068557881688d6cada38915b3e51f8cd)

18 years ago: when starting a new election, also force all nodes into recovery mode so
Ronnie Sahlberg [Wed, 9 May 2007 23:48:14 +0000 (09:48 +1000)] 
when starting a new election, also force all nodes into recovery mode so
there is no internode traffic to interfere with our election

(This used to be ctdb commit ccfb67a076c72a0e7f2b6dc5fce9c19f652ba2ad)

18 years ago: when starting recovery repoint dmaster to an invalid node and not the
Ronnie Sahlberg [Wed, 9 May 2007 23:46:10 +0000 (09:46 +1000)] 
when starting recovery repoint dmaster to an invalid node and not the
current vnn

(This used to be ctdb commit 3c2dcc7448b335cf42e8f7edffba21229dccbd79)

18 years ago: merge from tridge
Ronnie Sahlberg [Wed, 9 May 2007 23:44:28 +0000 (09:44 +1000)] 
merge from tridge

(This used to be ctdb commit 8c5e6836280499243c0cd247093844a891f00da3)

18 years ago: actually check the remote nodes and not just the local node
Ronnie Sahlberg [Wed, 9 May 2007 23:43:01 +0000 (09:43 +1000)] 
actually check the remote nodes and not just the local node

(This used to be ctdb commit 09df21be6361743d320fafc120718211eece85c3)

18 years ago: remove old s3 recovery code
Andrew Tridgell [Wed, 9 May 2007 22:49:57 +0000 (08:49 +1000)] 
remove old s3 recovery code
fixed vnnmap wire format in recover daemon

(This used to be ctdb commit e03fab7bfe0cf43f40c49a3d63e75dc44001d8d8)

18 years ago: fixed setvnnmap to use wire structures too
Andrew Tridgell [Wed, 9 May 2007 22:22:26 +0000 (08:22 +1000)] 
fixed setvnnmap to use wire structures too
(This used to be ctdb commit 1208e4219d220b80e2f74974cac8ed2b8956d3ef)

18 years ago: separate the wire format and internal format for the vnn_map
Andrew Tridgell [Wed, 9 May 2007 22:13:19 +0000 (08:13 +1000)] 
separate the wire format and internal format for the vnn_map
(This used to be ctdb commit 9a71718d87c5162f1423d85c2e86a01f6771925e)

18 years ago: moved the vnn_map initialisation out of the cmdline code
Andrew Tridgell [Wed, 9 May 2007 21:55:46 +0000 (07:55 +1000)] 
moved the vnn_map initialisation out of the cmdline code
(This used to be ctdb commit 81492b840d608dc724d5a25ddef6eb0ce12b95fb)

18 years ago: merged ronnie's code to delay client requests when in recovery mode
Andrew Tridgell [Wed, 9 May 2007 21:43:18 +0000 (07:43 +1000)]
merged ronnie's code to delay client requests when in recovery mode
(This used to be ctdb commit dfca37076d642f3407c63dfe3b685287d27c8f8d)

18 years ago: merge from tridge
Ronnie Sahlberg [Wed, 9 May 2007 20:55:28 +0000 (06:55 +1000)] 
merge from tridge

(This used to be ctdb commit 190cca8488dff982062ae7b1a82cb33cc1cdfaf7)

18 years ago: hang the event from the retry structure instead of the hdr structure
Ronnie Sahlberg [Wed, 9 May 2007 04:08:11 +0000 (14:08 +1000)] 
hang the event from the retry structure instead of the hdr structure

(This used to be ctdb commit 8536c8c3a30a986ba4945d02aef82b47495ce3f8)

18 years ago: when we are in recovery mode and we get a REQ_CALL from a client,
Ronnie Sahlberg [Wed, 9 May 2007 04:06:47 +0000 (14:06 +1000)] 
when we are in recovery mode and we get a REQ_CALL from a client,
defer it for one second and try again

(This used to be ctdb commit 606fb6414b97d1813056982cda7c0fe84d746e67)
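
The defer-and-retry behaviour described above can be sketched with a simulated clock. This Python model is only an illustration under stated assumptions: the `Daemon` class, its fields and the `advance` method are hypothetical, and the real daemon is C code driven by timed events. Only the one-second delay comes from the commit message.

```python
import heapq
import itertools

RETRY_DELAY = 1.0  # seconds, as stated in the commit message


class Daemon:
    """Toy model of a daemon that defers client calls while in recovery."""

    def __init__(self) -> None:
        self.in_recovery = False
        self.now = 0.0                 # simulated clock, in seconds
        self._deferred = []            # heap of (due_time, seq, request)
        self._seq = itertools.count()  # tie-breaker keeps FIFO order
        self.handled = []              # requests that were processed

    def req_call(self, request: str) -> None:
        """Handle a client call, or park it for RETRY_DELAY during recovery."""
        if self.in_recovery:
            due = self.now + RETRY_DELAY
            heapq.heappush(self._deferred, (due, next(self._seq), request))
            return
        self.handled.append(request)

    def advance(self, seconds: float) -> None:
        """Move the clock forward and retry every request that is now due."""
        self.now += seconds
        due = []
        while self._deferred and self._deferred[0][0] <= self.now:
            due.append(heapq.heappop(self._deferred)[2])
        for request in due:
            self.req_call(request)  # may be deferred again if still recovering


if __name__ == "__main__":
    d = Daemon()
    d.in_recovery = True
    d.req_call("fetch key1")
    d.advance(1.0)          # still recovering: deferred a second time
    d.in_recovery = False
    d.advance(1.0)          # retry succeeds now
    print(d.handled)
```

A request keeps being re-deferred for as long as recovery lasts, so a client only sees extra latency rather than an error.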

18 years ago: merge from ronnie
Andrew Tridgell [Wed, 9 May 2007 01:54:37 +0000 (11:54 +1000)] 
merge from ronnie
(This used to be ctdb commit f67a4842e7b1efb2ad61c41e4895c7698e564bf3)

18 years ago: add a command line flag to ctdbd to start a recovery daemon.
Ronnie Sahlberg [Tue, 8 May 2007 23:59:23 +0000 (09:59 +1000)] 
add a command line flag to ctdbd to start a recovery daemon.

update the recovery test script to start all ctdb daemons with a
recovery daemon

(This used to be ctdb commit 47794e16df285cacefc30208d892d931a6e46b96)

18 years ago: change the name of the recovery daemon to ctdb_recoverd
Ronnie Sahlberg [Tue, 8 May 2007 23:31:53 +0000 (09:31 +1000)] 
change the name of the recovery daemon to ctdb_recoverd

(This used to be ctdb commit b0cf919e4f38961e5cf4e1e79a0cfe4bb4a96d76)

18 years ago: add a small tool to monitor recovery
Ronnie Sahlberg [Tue, 8 May 2007 22:05:53 +0000 (08:05 +1000)] 
add a small tool to monitor recovery

(This used to be ctdb commit b45936828713c31ee670e2106b49c2351234f310)

18 years ago: fixed a problem with the number of timed events growing without bound with the new...
Andrew Tridgell [Tue, 8 May 2007 11:16:29 +0000 (21:16 +1000)] 
fixed a problem with the number of timed events growing without bound with the new seqnum code
(This used to be ctdb commit 6109ae3dae8d93c93a2dc76cc561ea6e21458aa6)

18 years ago: we must repoint dmaster to an invalid node during recovery to stop the
Ronnie Sahlberg [Tue, 8 May 2007 04:51:55 +0000 (14:51 +1000)] 
we must repoint dmaster to an invalid node during recovery to stop the
shortcut from working

(This used to be ctdb commit 5e18930be8c0efb87aa9e2780d9457634b24e156)

18 years ago: fix alignment bug for pulldb
Ronnie Sahlberg [Tue, 8 May 2007 04:42:00 +0000 (14:42 +1000)] 
fix alignment bug for pulldb

(This used to be ctdb commit f1188289c18805c2c5f8bae61d73df3fc762faee)

18 years ago: merge from tridge
Ronnie Sahlberg [Sun, 6 May 2007 22:07:26 +0000 (08:07 +1000)] 
merge from tridge

(This used to be ctdb commit da8636707547e77c76dc7e368ddfae35b8a21402)

18 years ago: merged from ronnie
Andrew Tridgell [Sun, 6 May 2007 21:56:38 +0000 (07:56 +1000)] 
merged from ronnie
(This used to be ctdb commit 49aad9fb09ca2c787e6f82ba03cb229cc51844f0)

18 years ago: hang the timeout event off state and thus we don't need to explicitly
Ronnie Sahlberg [Sun, 6 May 2007 21:54:17 +0000 (07:54 +1000)]
hang the timeout event off state, and thus we don't need to explicitly
free it, and also we won't accidentally return from the function without
killing the event first

(This used to be ctdb commit e3d72d024ef7342a808e5c488fd646a39e5fac78)

18 years ago: it now works to talloc_free() the timed event if we no longer want it to
Ronnie Sahlberg [Sun, 6 May 2007 21:47:16 +0000 (07:47 +1000)] 
it now works to talloc_free() the timed event if we no longer want it to
trigger

this must have been a side effect of a different bug in the recoverd.c
code that has now been fixed

(This used to be ctdb commit 676446fd1083c371ad0ff72dd8c636ec8e6d1423)

18 years ago: recovery daemon with recovery master election
Ronnie Sahlberg [Sun, 6 May 2007 20:51:58 +0000 (06:51 +1000)] 
recovery daemon with recovery master election

election is primitive, it elects the lowest vnn as the recovery master

two new controls, to get/set recovery master for a node

to use the recovery daemon, start one
./bin/recoverd --socket=ctdb.socket*
for each ctdb daemon

it has been briefly tested by deleting and adding nodes to a 4 node
cluster but needs more testing

(This used to be ctdb commit 541d1cc49d46d44042a31a8404d521412ef2fdb3)
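
The election rule stated in this commit message ("it elects the lowest vnn as the recovery master") can be sketched in Python as follows. The function name `elect_recmaster` and the list-of-integers input are hypothetical, and how the real daemon discovers active nodes is not described in this log.

```python
from typing import Iterable


def elect_recmaster(active_vnns: Iterable[int]) -> int:
    """Return the vnn of the recovery master: the lowest active vnn.

    Hypothetical helper; raises ValueError when no node is active.
    """
    vnns = list(active_vnns)
    if not vnns:
        raise ValueError("cannot elect a recovery master without active nodes")
    return min(vnns)


if __name__ == "__main__":
    # 4-node cluster, as in the brief test mentioned in the commit message.
    print(elect_recmaster([0, 1, 2, 3]))
    # After node 0 leaves, the next-lowest vnn takes over.
    print(elect_recmaster([1, 2, 3]))
```

Because every node applies the same deterministic rule to the same membership list, they agree on the result without further negotiation, provided they agree on which nodes are active.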

18 years ago: add new controls to get and set the recovery master node of a daemon
Ronnie Sahlberg [Sun, 6 May 2007 19:02:48 +0000 (05:02 +1000)] 
add new controls to get and set the recovery master node of a daemon
i.e. which node is "elected" to check for and drive recovery

(This used to be ctdb commit d577093eb4b619392c71ab5ce81e8c02565d93f0)

18 years ago: add a test in the function that checks whether the cluster needs
Ronnie Sahlberg [Sun, 6 May 2007 18:41:12 +0000 (04:41 +1000)]
add a test in the function that checks whether the cluster needs
recovery or not, to verify that all active nodes are in normal mode.
If we discover that some node is still in recovery mode, it may indicate
that a previous recovery ended prematurely and thus we should start a
new recovery

(This used to be ctdb commit c15517872e6c98c8c425a8d47d2b348ecb0620b0)

18 years ago: update a comment to be more descriptive
Ronnie Sahlberg [Sun, 6 May 2007 02:46:56 +0000 (12:46 +1000)]
update a comment to be more descriptive

(This used to be ctdb commit 96082c54d830974bf9a4d5bad33ad60379a85798)

18 years ago: change a lot of printf into debug statements
Ronnie Sahlberg [Sun, 6 May 2007 00:51:25 +0000 (10:51 +1000)] 
change a lot of printf into debug statements

(This used to be ctdb commit 6edb9149c7eb36da47e4e6a9dd3ede22263ce3f9)

18 years ago: break out the code to update all nodes to the new vnnmap into a helper
Ronnie Sahlberg [Sun, 6 May 2007 00:42:18 +0000 (10:42 +1000)] 
break out the code to update all nodes to the new vnnmap into a helper
function

(This used to be ctdb commit 81d39177949b54715710907d14ddc888dc09b064)

18 years ago: create a helper function for recovery to push all local databases out
Ronnie Sahlberg [Sun, 6 May 2007 00:38:44 +0000 (10:38 +1000)] 
create a helper function for recovery to push all local databases out
onto the remote nodes

(This used to be ctdb commit 1ba76d374652cfa29e56fb77c7190349e42d3bcc)

18 years ago: add an extra blank line
Ronnie Sahlberg [Sun, 6 May 2007 00:30:18 +0000 (10:30 +1000)] 
add an extra blank line

(This used to be ctdb commit 75096dde58df6532abbf5b9ebd771e8810156483)

18 years ago: break the code that repoints dmaster for all local and remote records
Ronnie Sahlberg [Sun, 6 May 2007 00:22:13 +0000 (10:22 +1000)] 
break the code that repoints dmaster for all local and remote records
into a separate helper function

(This used to be ctdb commit d5ab30d0ac21e736eb34eaa19bccfee5f0ce7cfb)

18 years ago: create a helper function for recovery that pulls and merges all remote
Ronnie Sahlberg [Sun, 6 May 2007 00:16:48 +0000 (10:16 +1000)] 
create a helper function for recovery that pulls and merges all remote
databases onto the local node

(This used to be ctdb commit 5cecc47449c369f91e83389a94b987ac32b1e3f4)

18 years ago: create a helper function to make sure the local node that does recovery
Ronnie Sahlberg [Sun, 6 May 2007 00:12:42 +0000 (10:12 +1000)] 
create a helper function to make sure the local node that does recovery
has all the databases that exist on any other remote node

(This used to be ctdb commit 0f436e3d40fea6e6a146019b0c664e80e81e88b4)

18 years ago: add a helper function to create all missing remote databases detected
Ronnie Sahlberg [Sun, 6 May 2007 00:04:37 +0000 (10:04 +1000)] 
add a helper function to create all missing remote databases detected
during recovery

(This used to be ctdb commit 04758c6f7d8f61260be6d2472380cb7904984427)

18 years ago: break out the setting/clearing of recovery mode into a dedicated helper
Ronnie Sahlberg [Sat, 5 May 2007 23:53:12 +0000 (09:53 +1000)] 
break out the setting/clearing of recovery mode into a dedicated helper
function

(This used to be ctdb commit dba4e4f8aa4f2fde1e9f8d93bdf3a33f7de8ce18)

18 years ago: don't allocate arrays where we can just return a single integer
Ronnie Sahlberg [Sat, 5 May 2007 22:05:22 +0000 (08:05 +1000)]
don't allocate arrays where we can just return a single integer

(This used to be ctdb commit 07bc338e490e0f7018808a2450bc54863eb88c94)

18 years ago: don't use arrays where a uint32_t works just as well
Ronnie Sahlberg [Sat, 5 May 2007 21:52:20 +0000 (07:52 +1000)]
don't use arrays where a uint32_t works just as well

(This used to be ctdb commit 843e974b29c93df891ae7cf13323ee960a334f60)

18 years ago: add an ifdeffed-out block to the call.
Ronnie Sahlberg [Sat, 5 May 2007 21:32:16 +0000 (07:32 +1000)]
add an ifdeffed-out block to the call.

we really should kill the event in case the call completed before the
timeout, so that we can also make timed_out non-static
(This used to be ctdb commit f297eed589b1d4e188f77f195683365cf91d0e62)

18 years ago: the timed_out variable needs to be static and cannot be on the stack
Ronnie Sahlberg [Sat, 5 May 2007 21:07:47 +0000 (07:07 +1000)]
the timed_out variable needs to be static and cannot be on the stack,
since if the command times out and we return from ctdb_control, we may
have events that can trigger later which will overwrite data that is no
longer in our stackframe

(This used to be ctdb commit 93942543092be618c0bd8ef68b470b0789bad7ad)

18 years ago: update to the recovery daemon
Ronnie Sahlberg [Sat, 5 May 2007 20:58:01 +0000 (06:58 +1000)]
update to the recovery daemon
when ctdb_ctrl_ calls are timed out due to nodes arriving or leaving the
cluster, it crashes the recovery daemon afterwards with a SEGV but no
useful stack backtrace

(This used to be ctdb commit cd3abc7349e86555ccd87cd47a1dcc2adad2f46c)