gh-148820: Fix _PyRawMutex use-after-free on spurious semaphore wakeup (gh-148852)
_PyRawMutex_UnlockSlow CAS-removes the waiter from the list and then
calls _PySemaphore_Wakeup, with no handshake. If _PySemaphore_Wait
returns Py_PARK_INTR, the waiter can destroy its stack-allocated
semaphore before the unlocker's Wakeup runs, causing a fatal error from
ReleaseSemaphore / sem_post.
Loop in _PyRawMutex_LockSlow until _PySemaphore_Wait returns Py_PARK_OK,
which is only signalled when a matching Wakeup has been observed.
Also include GetLastError() and the handle in the Windows fatal messages
in _PySemaphore_Init, _PySemaphore_Wait, and _PySemaphore_Wakeup to make
similar races easier to diagnose in the future.