Aurelien Darragon found a case of leak when working on ticket #2184.
When a reexec_on_failure() happens *BEFORE* protocol_bind_all(), the
worker is not fork and the mworker_proc struct is still there with
its 2 socketpairs.
The socketpair that is supposed to be in the master is already closed in
mworker_cleanup_proc(), the one for the worker was suppposed to
be cleaned up in mworker_cleanlisteners().
However, since the fd is not bound during this failure, the fd is never
closed.
This patch fixes the problem by setting the fd to -1 in the mworker_proc
after the fork, so we ensure that this it won't be close if everything
was done right, and then we try to close it in mworker_cleanup_proc()
when it's not set to -1.
This could be triggered with the script in ticket #2184 and a `ulimit -H
-n 300`. This will fail before the protocol_bind_all() when trying to
increase the nofile setrlimit.
In recent version of haproxy, there is a BUG_ON() in fd_insert() that
could be triggered by this bug because of the global.maxsock check.
Must be backported as far as 2.6.
The problem could exist in previous version but the code is different
and this won't be triggered easily without other consequences in the
master.
child->timestamp = date.tv_sec;
child->pid = ret;
child->version = strdup(haproxy_version);
+ /* at this step the fd is bound for the worker, set it to -1 so
+ * it could be close in case of errors in mworker_cleanup_proc() */
+ child->ipc_fd[1] = -1;
break;
}
}
list_for_each_entry_safe(child, it, &proc_list, list) {
if (child->pid == -1) {
- /* Close the socketpair master side. We don't need to
- * close the worker side, because it's stored in the
- * GLOBAL cli listener which was supposed to be in the
- * worker and which will be closed in
- * mworker_cleanlisteners()
- */
+ /* Close the socketpairs. */
if (child->ipc_fd[0] > -1)
close(child->ipc_fd[0]);
+ if (child->ipc_fd[1] > -1)
+ close(child->ipc_fd[1]);
if (child->srv) {
/* only exists if we created a master CLI listener */
srv_drop(child->srv);