Martin Schwenke [Sat, 29 Jun 2024 02:25:59 +0000 (12:25 +1000)]
ctdb-scripts: Avoid flapping NFS services at startup
If an NFS service check is set to, say, unhealthy_after=2 then it will
always switch from the (default startup) unhealthy state to healthy,
even if there is a fatal problem. If all services/scripts appear OK
then the node will become healthy. When the counter hits the limit it
will return to unhealthy. This is misleading.
Instead, never use the counter at startup, until the service becomes
healthy. This stops services flapping unhealthy-healthy-unhealthy.
A side-effect is that a service that starts in a broken state will
never be restarted to try to fix the problem. This makes sense. The
counting and restarting really exist to deal with problems that might
occur under load. The first monitor events occur before public IPs
are hosted, so there can be no load. If a service doesn't start
reliably the first time then the admin probably wants to know about
it.
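A minimal shell sketch of the startup behaviour described above. It is not the actual event script code: the function name monitor_result, the variables ever_healthy, failcount and monitor_state, and keeping state in shell variables rather than on disk are all assumptions made for illustration. It also assumes that unhealthy_after=N means the service is reported unhealthy on the Nth consecutive failure.

```sh
#!/bin/sh
# Hypothetical sketch, not the actual CTDB event script code.

unhealthy_after=2	# consecutive failures tolerated (assumed semantics)
ever_healthy=false	# has the service passed a check since startup?
failcount=0		# consecutive failures since last success
monitor_state=""	# result of the most recent monitor event

# monitor_result CHECK_STATUS
# CHECK_STATUS is 0 if the service check passed, non-zero otherwise.
# Sets monitor_state to "healthy" or "unhealthy".
monitor_result ()
{
	if [ "$1" -eq 0 ] ; then
		ever_healthy=true
		failcount=0
		monitor_state="healthy"
		return
	fi

	if ! $ever_healthy ; then
		# Startup: the counter is not used, so a broken service
		# never appears healthy and there is no flapping
		monitor_state="unhealthy"
		return
	fi

	# Service has been healthy before: count failures up to the limit
	failcount=$((failcount + 1))
	if [ "$failcount" -ge "$unhealthy_after" ] ; then
		monitor_state="unhealthy"
	else
		monitor_state="healthy"
	fi
}
```

With unhealthy_after=2, a service that fails its very first check is reported unhealthy immediately, whereas a single failure after the service has been healthy is tolerated.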
nfs_iterate_test() is updated to run an initial monitor event to mark
the services as healthy. This initialises the counter so it can be
used for the important part of the test. Passing the -i option avoids
running the extra monitor event, so the first iteration will be the
initial monitor event.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Sat, 29 Jun 2024 09:24:25 +0000 (19:24 +1000)]
ctdb-scripts: Make initial statistics output empty
This makes initial failure to retrieve statistics less likely to
result in a statistics change. To help with this, statistics
retrieval stderr now goes to the log - only stdout goes to the file.
This means that the test code for checking statistics changes needs to
be redone to actually run the statistics command and check. As with
rpcinfo output, this output needs to behave as deterministically in
the test code as it does in the event script.
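The effect of starting with empty statistics can be sketched in shell as follows. This is an illustration only: stats_changed and its arguments are hypothetical names, a missing file is treated as empty statistics, and the sketch assumes that stderr of the caller ends up in the log.

```sh
#!/bin/sh
# Hypothetical sketch, not the actual CTDB event script code.

# stats_changed FILE COMMAND [ARG...]
# Run COMMAND, compare its stdout with the statistics saved in FILE,
# save the new statistics and return 0 only if they differ.
# stderr of COMMAND is not captured, so it goes wherever the caller's
# stderr goes (assumed here to be the log).
stats_changed ()
{
	_file="$1"
	shift

	# A missing file is treated the same as empty statistics
	_old=$(cat "$_file" 2>/dev/null) || true

	# Only stdout is captured; a failing command yields empty output
	_new=$("$@") || true

	printf '%s\n' "$_new" >"$_file"

	[ "$_new" != "$_old" ]
}
```

Because the initial file is empty and a failed retrieval produces nothing on stdout, an initial failure compares empty with empty and is not seen as a change.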
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Sun, 30 Jun 2024 00:35:09 +0000 (10:35 +1000)]
ctdb-scripts: Only consider statistics on timeout
Checking statistics is only really relevant to timeouts. That is, if
an rpcinfo times out it is worth checking if the service is making
progress. If the RPC service is not registered then the statistics
don't need to be checked because they shouldn't be changing.
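The decision described above can be sketched as a small shell function. The function name rpc_check_failed and the status labels ok, timeout and unregistered are hypothetical, chosen only for illustration; the real event script works from rpcinfo results.

```sh
#!/bin/sh
# Hypothetical sketch, not the actual CTDB event script code.

# rpc_check_failed STATUS STATS_CHANGED
# STATUS is "ok", "timeout" or anything else (e.g. "unregistered").
# STATS_CHANGED is "yes" if the service statistics changed since the
# previous check.
# Returns 0 if this check should count as a failure, 1 otherwise.
rpc_check_failed ()
{
	case "$1" in
	ok)
		# rpcinfo succeeded: never a failure
		return 1
		;;
	timeout)
		# Only a timeout consults statistics: changing statistics
		# mean the service is still making progress
		[ "$2" != "yes" ]
		;;
	*)
		# Not registered (or other error): statistics are irrelevant
		return 0
		;;
	esac
}
```

In this sketch a timeout with changing statistics is not counted as a failure, while an unregistered service always is, whatever the statistics say.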
The 2 tests previously added to check statistics progress now
behave identically and fail on all iterations. To support testing
with "timeouts", an optional TIMEOUT flag can now be added to the RPC
service passed to nfs_iterate_test(). 2 new tests are added to
exercise the new behaviour.
The 2 new "if" statements in nfs_iterate_test() could be combined.
However, a subsequent commit would then have to split them again, which
would make that commit more difficult to read.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Thu, 4 Jul 2024 01:10:59 +0000 (11:10 +1000)]
ctdb-tests: Make NFS RPC monitoring tests consistent
Update the remaining RPC monitoring tests to use nfs_iterate_test(),
depending on it to set results. This makes all RPC monitoring tests
consistent, so they will all benefit from future improvements.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 5 Jul 2024 00:46:30 +0000 (10:46 +1000)]
ctdb-tests: Simplify handling of statistics change
Handling this across two different functions led to insanity, so
simplify.
The handling of unhealthy_after when $_numfails = 0 implicitly causes
the node to be healthy. This is how the "rpcinfo succeeds" case
works. Doing it this way for statistics makes this patch easier to
read. The implicit behaviour will go away in the next patch.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Amitay Isaacs <amitay@gmail.com>
Martin Schwenke [Fri, 9 Aug 2024 01:20:06 +0000 (11:20 +1000)]
ctdb-tcp: Use path_rundir_append() to construct lock_path
The current constant value doesn't respect CTDB_TEST_MODE/CTDB_BASE.
Instead use the path module to allow automatic listening in test mode
with local daemons.
A single node can be tested with local daemons, using something like:
The trick is that commenting out the node address in ctdb.conf means
the chosen node address is the first one from the nodes file that
allows bind/listen. In this case it is the only line.
The following ensures that automatic listening works for a node that
isn't the first:
Note that the first address isn't local on this host, so will always
fail.
So, doing the above and starting both nodes yields...
...
$ tests/local_daemons.sh foo start 1
$ sleep 3; tests/local_daemons.sh foo start 0
$ tests/local_daemons.sh foo print-log all | grep -i 'chose\|bind'
[...] node.1 ctdbd[26351]: ctdb chose network address 127.0.0.1:4379
[...] node.0 ctdbd[26438]: ctdb_tcp_listen_addr: Failed to bind() to socket - Address already in use (98)
[...] node.0 ctdbd[26438]: Unable to bind to any node address - giving up
... as expected.
It would be nice to add tests for this, but we don't really have
infrastructure for that. At least manual testing shows, for the
obvious cases, the previous commits didn't break anything. :-)
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>
Martin Schwenke [Fri, 26 Jul 2024 00:49:16 +0000 (10:49 +1000)]
ctdb-daemon: Remove a use of ctdb_errstr()
Code to setup the transport is about to be cleaned up, including
removing uses of ctdb_set_error(), so avoid logging a NULL pointer or
some other old error.
Signed-off-by: Martin Schwenke <mschwenke@ddn.com> Reviewed-by: Volker Lendecke <vl@samba.org>
Ralph Boehme [Fri, 5 Apr 2024 13:25:03 +0000 (15:25 +0200)]
smbd: add options "fs:[logical|aligned|performance|effective aligned] bytes per sector"
In order to support certain Windows applications that make use of copy reflink,
we need some way to allow configuring these values. According to testing, the
application somehow uses the value of phys_bytes_per_sector_atomic for some check
when requesting server-side reflink copies, e.g. for ZFS the following is needed:
block size = 131072
fs:aligned bytes per sector = 131072
For some reason "block size" must also be set to the value of fs:aligned bytes
per sector, but fs:logical bytes per sector, which according to the spec should
match "block size", must stay at the default of 512, otherwise the application
does not work.
As the whole client behaviour could not be fully understood, I'm proposing to
introduce these options as undocumented parametric options, so we can at least
start testing with them.
Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: David Disseldorp <ddiss@samba.org>
Autobuild-User(master): Ralph Böhme <slow@samba.org>
Autobuild-Date(master): Tue Aug 20 07:01:19 UTC 2024 on atb-devel-224
Ralph Boehme [Thu, 6 Jun 2024 13:38:16 +0000 (15:38 +0200)]
smbd: consolidate fs capabilities code in vfswrap_fs_capabilities()
This ensures the values we return via SMB_FS_ATTRIBUTE_INFORMATION are the same
as those we use internally via conn->fs_capabilities.
This deliberately preserves existing behaviour as much as possible and leaves
possible improvements as a future exercise. Particularly, FILE_VOLUME_QUOTAS is
already set inside SMB_VFS_STATVFS() depending on backend filesystem flags
which is probably the correct way to do it instead of just setting the
capability when Samba was built with quota support.
Signed-off-by: Ralph Boehme <slow@samba.org> Reviewed-by: David Disseldorp <ddiss@samba.org>
Signed-off-by: Pavel Filipenský <pfilipensky@samba.org> Reviewed-by: Andreas Schneider <asn@samba.org>
Autobuild-User(master): Pavel Filipensky <pfilipensky@samba.org>
Autobuild-Date(master): Mon Aug 19 13:21:08 UTC 2024 on atb-devel-224
kdc: warn if DES-only keys enforced on the account
With MIT Kerberos 1.21+ DES is not available by default and will be
refused. This means userAccountFlags with UF_DES_KEYS_ONLY will result
in a likely authentication failure (unless allow_des=true is set in
krb5.conf).
Warn about such cases to give admins yet another chance to detect an
error in setting userAccountFlags.
Signed-off-by: Alexander Bokovoy <ab@samba.org> Reviewed-by: Andreas Schneider <asn@samba.org>
Autobuild-User(master): Alexander Bokovoy <ab@samba.org>
Autobuild-Date(master): Sat Aug 17 11:59:01 UTC 2024 on atb-devel-224
Anoop C S [Wed, 14 Aug 2024 14:19:04 +0000 (19:49 +0530)]
docs-xml: Fix script location in syncmachinepasswordscript.xml
Update the change in installation path for winbind_ctdb_updatekeytab.sh
from SAMBA_DATADIR to newly defined CTDB_DATADIR.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15689 Signed-off-by: Anoop C S <anoopcs@samba.org> Reviewed-by: Andreas Schneider <asn@samba.org> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Pavel Filipenský <pfilipensky@samba.org>
Autobuild-User(master): Anoop C S <anoopcs@samba.org>
Autobuild-Date(master): Fri Aug 16 09:49:30 UTC 2024 on atb-devel-224
Anoop C S [Wed, 14 Aug 2024 14:17:35 +0000 (19:47 +0530)]
source3/script: Fix installation of winbind_ctdb_updatekeytab.sh
winbind_ctdb_updatekeytab.sh assumes the presence of the `onnode` utility to
execute `net ads` command on all nodes in the cluster. But `onnode`
is only built when configured with clustering support. Therefore perform
the script installation only with ctdb configuration. Also fix the
installation path to /usr/share/ctdb/scripts.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=15689 Signed-off-by: Anoop C S <anoopcs@samba.org> Reviewed-by: Andreas Schneider <asn@samba.org> Reviewed-by: Guenther Deschner <gd@samba.org> Reviewed-by: Pavel Filipenský <pfilipensky@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Tue Aug 13 22:29:28 UTC 2024 on atb-devel-224
Pair-Programmed-With: Stefan Metzmacher <metze@samba.org>
Signed-off-by: Pavel Filipenský <pfilipensky@samba.org> Signed-off-by: Stefan Metzmacher <metze@samba.org>
Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Aug 13 15:27:26 UTC 2024 on atb-devel-224
Shachar Sharon [Mon, 5 Aug 2024 13:21:10 +0000 (16:21 +0300)]
vfs_ceph_new: use 'ceph_new' for config-param prefix
Use explicit 'ceph_new' prefix to each of the ceph specific config
parameters to avoid confusion with legacy 'vfs_ceph' module. Hence,
users will have in their smb.conf a format similar to:
Take special care for readdir errno setting: in case of error, update
errno by libcephfs (and protect from possible over-write by debug
logging); in the case of successful result or end-of-stream restore
errno to its previous value before calling the readdir_fn VFS hook.
vfs_ceph{_new}: do not set errno upon successful call to libcephfs
There is code in Samba that expects errno from a previous system call
to be preserved through a subsequent system call. Thus, avoid setting
"errno = 0" in status_code() and lstatus_code() upon successful return
from libcephfs API call.
smbd: Assert we have an fsp in smbd_do_setfilepathinfo
With this, in the future we can avoid some special cases in our callees.
Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>
Autobuild-User(master): Jeremy Allison <jra@samba.org>
Autobuild-Date(master): Tue Aug 6 17:37:39 UTC 2024 on atb-devel-224
Pass the persistent/volatile handle as uint64's. Why? I found the
talloc_memdup() slightly misleading, and smbXcli handles those 2 id's
separately. map_smb2_handle_to_fnum() is the function to create the
smb2_hnd.
Signed-off-by: Volker Lendecke <vl@samba.org> Reviewed-by: Jeremy Allison <jra@samba.org>
Signed-off-by: Stefan Metzmacher <metze@samba.org>
Autobuild-User(master): Stefan Metzmacher <metze@samba.org>
Autobuild-Date(master): Tue Aug 6 16:16:27 UTC 2024 on atb-devel-224