Unlike the pthread variant, the Windows RCU implementation uses
condvar broadcast instead of a targeted signal in some places,
unnecessarily burning CPU cycles on threads that are woken up.
retire_qp should wake up only one thread to proceed, not all of
them. update_qp signals after increasing writers_alloced, so
waking up all threads there does not make sense either.
The speedup is significant for lhash_test when running on many CPUs
(on 32 cores, run time drops from 6:20 to 1:40 minutes on test hw).
Co-Authored-By: Claude Opus 4.6 Extended <noreply@anthropic.com>
Signed-off-by: Milan Broz <gmazyland@gmail.com>
Reviewed-by: Saša Nedvědický <sashan@openssl.org>
Reviewed-by: Nikola Pajkovsky <nikolap@openssl.org>
Reviewed-by: Tomas Mraz <tomas@openssl.org>
MergeDate: Fri Mar 13 17:25:47 2026
(Merged from https://github.com/openssl/openssl/pull/30388)
#endif
/* wake up any waiters */
- ossl_crypto_condvar_broadcast(lock->alloc_signal);
+ ossl_crypto_condvar_signal(lock->alloc_signal);
ossl_crypto_mutex_unlock(lock->alloc_lock);
return &lock->qp_group[current_idx];
}
{
ossl_crypto_mutex_lock(lock->alloc_lock);
lock->writers_alloced--;
- ossl_crypto_condvar_broadcast(lock->alloc_signal);
+ ossl_crypto_condvar_signal(lock->alloc_signal);
ossl_crypto_mutex_unlock(lock->alloc_lock);
}
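
For illustration only, the wake-one pattern used in retire_qp can be
sketched with plain pthreads. This is not the OpenSSL code: struct
slot_pool, slot_acquire, slot_release and run_pool_demo are hypothetical
names, and only the field names and the wait predicate are borrowed from
the change above. In this sketch only the release path signals. Since a
release frees exactly one slot, one woken waiter is enough, and every
waiter re-checks the predicate under the mutex.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for the allocation state of an RCU lock. */
struct slot_pool {
    pthread_mutex_t alloc_lock;
    pthread_cond_t alloc_signal;
    int group_count;     /* total number of slots */
    int writers_alloced; /* slots currently held */
    long completed;      /* finished acquire/release pairs */
};

struct worker_arg {
    struct slot_pool *pool;
    int iterations;
};

static void slot_acquire(struct slot_pool *p)
{
    pthread_mutex_lock(&p->alloc_lock);
    /* Re-check the predicate in a loop: a wakeup may be spurious, or
     * another thread may have taken the freed slot first. */
    while (p->group_count - p->writers_alloced < 2)
        pthread_cond_wait(&p->alloc_signal, &p->alloc_lock);
    p->writers_alloced++;
    /* At least one slot always stays free. */
    assert(p->writers_alloced < p->group_count);
    pthread_mutex_unlock(&p->alloc_lock);
}

static void slot_release(struct slot_pool *p)
{
    pthread_mutex_lock(&p->alloc_lock);
    p->writers_alloced--;
    p->completed++;
    /* Exactly one slot was freed, so wake one waiter, not all. */
    pthread_cond_signal(&p->alloc_signal);
    pthread_mutex_unlock(&p->alloc_lock);
}

static void *worker(void *arg)
{
    struct worker_arg *wa = arg;

    for (int i = 0; i < wa->iterations; i++) {
        slot_acquire(wa->pool);
        slot_release(wa->pool);
    }
    return NULL;
}

/* Runs nthreads workers and returns the number of completed
 * acquire/release pairs, or -1 on bad arguments or thread failure. */
long run_pool_demo(int nthreads, int iterations, int group_count)
{
    enum { MAX_THREADS = 64 };
    pthread_t tids[MAX_THREADS];
    struct slot_pool pool;
    struct worker_arg wa;
    int started;

    if (nthreads < 1 || nthreads > MAX_THREADS || iterations < 0
        || group_count < 2)
        return -1;

    pthread_mutex_init(&pool.alloc_lock, NULL);
    pthread_cond_init(&pool.alloc_signal, NULL);
    pool.group_count = group_count;
    pool.writers_alloced = 0;
    pool.completed = 0;
    wa.pool = &pool;
    wa.iterations = iterations;

    for (started = 0; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &wa) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_cond_destroy(&pool.alloc_signal);
    pthread_mutex_destroy(&pool.alloc_lock);
    return started == nthreads ? pool.completed : -1;
}
```

Calling run_pool_demo(8, 1000, 3) from a test program (built with
-pthread) runs eight threads contending for two usable slots. Swapping
pthread_cond_signal for pthread_cond_broadcast keeps the sketch correct
but wakes every waiter on each release, which is the overhead this
change removes.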