From: Alex Rousskov Date: Wed, 21 Sep 2011 18:05:55 +0000 (-0600) Subject: Temporary fix: Avoid killing Coordinator with unregistered cache mgr actions X-Git-Tag: BumpSslServerFirst.take01~142 X-Git-Url: http://git.ipfire.org/gitweb.cgi?a=commitdiff_plain;h=46a7dc66ead0e64123576d9760680fdc82b91875;p=thirdparty%2Fsquid.git Temporary fix: Avoid killing Coordinator with unregistered cache mgr actions that cause isOpen() assertions. If a worker forwards a cache manager request to Coordinator and Coordinator does not have that action registered, CacheManager::createRequestedAction() throws (as it should) and Mgr::Request cleanup asserts when its half-baked connection tries to close a not-yet-imported socket descriptor. This workaround catches the exception, reports it, and manually closes the socket descriptor. It also prevents an ACK response from being sent to the worker, which triggers a worker timeout. Mid-term TODO: Coordinator should register all actions that are known to kids. Should Coordinator respond with an error instead of relying on a timeout? Long-term TODO: Consider an API where cache manager responses can be aggregated and formatted by Coordinator without knowing action-specific details. After all, there are not so many types of action information (size, count, rate, etc.) and most actions have simple reporting formats. Currently, it is awkward to guarantee that Coordinator and all workers know all actions, especially when some actions may be specific to non-worker kids such as Coordinator and diskers. --- diff --git a/src/ipc/Coordinator.cc b/src/ipc/Coordinator.cc index 6379f4b59e..7bc106ffce 100644 --- a/src/ipc/Coordinator.cc +++ b/src/ipc/Coordinator.cc @@ -164,15 +164,26 @@ Ipc::Coordinator::handleCacheMgrRequest(const Mgr::Request& request) { debugs(54, 4, HERE); + try { + Mgr::Action::Pointer action = + CacheManager::GetInstance()->createRequestedAction(request.params); + AsyncJob::Start(new Mgr::Inquirer(action, request, strands_)); + } + catch (const std::exception &ex) { + debugs(54, DBG_IMPORTANT, "BUG: cannot aggregate mgr:" << + request.params.actionName << ": " << ex.what()); + // TODO: Avoid half-baked Connections or teach them how to close. + ::close(request.conn->fd); + request.conn->fd = -1; + return; // the worker will timeout and close + } + // Let the strand know that we are now responsible for handling the request Mgr::Response response(request.requestId); TypedMsgHdr message; response.pack(message); SendMessage(MakeAddr(strandAddrPfx, request.requestorId), message); - Mgr::Action::Pointer action = - CacheManager::GetInstance()->createRequestedAction(request.params); - AsyncJob::Start(new Mgr::Inquirer(action, request, strands_)); } void