_Py_closerange() currently assumes that close_range() closes
all file descriptors even if it returns an error (other than ENOSYS).
This assumption can be wrong on Linux if a seccomp sandbox denies
the underlying syscall, pretending that it returns EPERM or EACCES.
In this case _Py_closerange() won't close any descriptors at all,
which in the worst case can be a security issue.
Fix this by falling back to other methods in case of any close_range()
error. Note that fallbacks will not be triggered on any problems with
closing individual file descriptors because close_range() is documented
to ignore such errors on both Linux[1] and FreeBSD[2].
[1] https://man7.org/linux/man-pages/man2/close_range.2.html
[2] https://www.freebsd.org/cgi/man.cgi?query=close_range&sektion=2
(cherry picked from commit
1c8b3b5d66a629258f1db16939b996264a8b9c37)
Co-authored-by: Alexey Izbyshev <izbyshev@ispras.ru>
--- /dev/null
+Fix ``os.closerange()`` potentially being a no-op in a Linux seccomp
+sandbox.
first = Py_MAX(first, 0);
_Py_BEGIN_SUPPRESS_IPH
#ifdef HAVE_CLOSE_RANGE
- if (close_range(first, last, 0) == 0 || errno != ENOSYS) {
- /* Any errors encountered while closing file descriptors are ignored;
- * ENOSYS means no kernel support, though,
- * so we'll fallback to the other methods. */
+ if (close_range(first, last, 0) == 0) {
+ /* close_range() ignores errors when it closes file descriptors.
+ * Possible reasons of an error return are lack of kernel support
+ * or denial of the underlying syscall by a seccomp sandbox on Linux.
+ * Fallback to other methods in case of any error. */
}
else
#endif /* HAVE_CLOSE_RANGE */