Nathan Hoad [Fri, 25 Mar 2016 13:03:30 +0000 (02:03 +1300)]
Fix memory leak of AccessLogEntry::url
... created by ACLFilledChecklist::syncAle().
::syncAle() is the only place in the codebase that assigns a URL that
AccessLogEntry is expected to free(), which AccessLogEntry doesn't do.
This results in a memory leak.
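The ownership problem described above can be sketched as follows. This is only an illustration, not the actual Squid fix: LogEntrySketch and syncUrlSketch() are hypothetical stand-ins, and the sketch assumes the URL is a malloc()-allocated C string that the entry owns.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical stand-in for AccessLogEntry; not the real Squid class.
struct LogEntrySketch {
    char *url = nullptr; // owned by the entry; allocated with malloc()

    LogEntrySketch() = default;
    LogEntrySketch(const LogEntrySketch &) = delete;            // copying would double free()
    LogEntrySketch &operator=(const LogEntrySketch &) = delete;

    // Without this free(), every entry that was handed a URL would leak it.
    ~LogEntrySketch() { std::free(url); }
};

// Hypothetical stand-in for the assignment made by ACLFilledChecklist::syncAle().
inline void syncUrlSketch(LogEntrySketch &entry, const char *source) {
    if (!entry.url) {
        const std::size_t size = std::strlen(source) + 1;
        entry.url = static_cast<char *>(std::malloc(size));
        if (entry.url)
            std::memcpy(entry.url, source, size);
    }
}
```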
Alex Rousskov [Thu, 24 Mar 2016 17:02:25 +0000 (11:02 -0600)]
Added shared_memory_locking configuration directive to control mlock(2).
Locking shared memory at startup avoids SIGBUS crashes when the kernel runs
out of RAM at runtime. Why not enable it by default? Unfortunately,
locking requires privileges and/or much-higher-than-default
RLIMIT_MEMLOCK limits. Thus, requiring locked memory by default is
likely to cause too many complaints, especially since Squid has not
required that before. The default is off, at least for now.
As we gain more experience, we may try to enable locking by default
while making default locking failures non-fatal and warning about
significant [accumulated] locking delays.
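A minimal sketch of non-fatal locking with mlock(2), assuming a POSIX system. The helper name lockRegionSketch() is hypothetical and is not part of Squid; it only shows the "warn instead of abort" idea mentioned above.

```cpp
#include <sys/mman.h>
#include <cerrno>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not Squid code: try to pin an already mapped region in RAM.
// Returns true when the region is locked. On failure it only warns, so the
// caller can keep running with unlocked memory.
inline bool lockRegionSketch(void *start, std::size_t size) {
    if (mlock(start, size) == 0)
        return true;
    const int savedErrno = errno; // save before other calls can overwrite it
    // EPERM or ENOMEM usually point at missing privileges or a low RLIMIT_MEMLOCK
    std::fprintf(stderr, "WARNING: mlock(%zu bytes) failed: %s\n",
                 size, std::strerror(savedErrno));
    return false;
}
```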
Bug description:
- The client side and server side are finished
- On the server side, Ftp::Relay::finalizeDataDownload() is called and
schedules Ftp::Server::originDataCompletionCheckpoint.
- On the client side, Ftp::Server::userDataCompletionCheckpoint is
called. It schedules a write to the control connection and closes the
data connection.
- Ftp::Server::originDataCompletionCheckpoint is then called. It tries
to write to the control connection, and the assertion is triggered.
This bug is a corner case where the client side (Ftp::Server) should
wait for the server side (Ftp::Client/Ftp::Relay) to finish its job before
responding to the FTP client. In this bug, the existing mechanism, designed
to handle such problems, did not work correctly and resulted in a double
write of the response to the client.
This patch tries to fix the existing mechanism as follows:
- When Ftp::Server receives a "startWaitingForOrigin" callback, it postpones
writing possible responses to the client and keeps waiting for the
stopWaitingForOrigin callback.
- When the Ftp::Server receives a "stopWaitingForOrigin" callback,
resumes any postponed response.
- When the Ftp::Client starts working on a DATA-related transaction, it calls the
Ftp::Server::startWaitingForOrigin callback.
- When the Ftp::Client finishes its job or aborts abnormally, it checks
whether it needs to call the Ftp::Server::stopWaitingForOrigin callback.
- This patch also tries to fix the status code returned to the FTP client,
taking into account the status code returned by the FTP server. The
"Ftp::Server::stopWaitingForOrigin" callback is used to pass the returned status code
to the client side.
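The waiting mechanism listed above can be modeled roughly as follows. This is a simplified, hypothetical model (FtpServerSketch is not the real Ftp::Server), and the rule for choosing between the postponed status and the origin status is an assumption made for illustration.

```cpp
#include <vector>

// Hypothetical, simplified model of the waiting logic; not the real Ftp::Server.
class FtpServerSketch {
public:
    // Called when the server side starts a DATA-related transaction.
    void startWaitingForOrigin() { waitingForOrigin = true; }

    // Called when the server side finishes or aborts. originStatus is the
    // status code reported by the FTP server (0 means "none" in this sketch).
    void stopWaitingForOrigin(int originStatus) {
        waitingForOrigin = false;
        if (havePostponed) {
            havePostponed = false;
            // assumption: prefer the origin status code when one was reported
            writeNow(originStatus ? originStatus : postponedStatus);
        }
    }

    // Called when the client side is ready to respond on the control connection.
    void userDataCompletionCheckpoint(int status) {
        if (waitingForOrigin) {
            postponedStatus = status; // remember the response; do not write yet
            havePostponed = true;
            return;
        }
        writeNow(status);
    }

    std::vector<int> written; // status codes written to the control connection

private:
    void writeNow(int status) { written.push_back(status); }

    bool waitingForOrigin = false;
    bool havePostponed = false;
    int postponedStatus = 0;
};
```

With this ordering, only one response reaches the control connection, no matter which side finishes first.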
Author: Eduard Bagdasaryan <eduard.bagdasaryan@measurement-factory.com>
Added ACL-driven server_pconn_for_nonretriable squid.conf directive.
This directive provides fine-grained control over persistent connection
reuse when forwarding HTTP requests that Squid cannot retry. It is
useful in environments where opening new connections is very expensive
and race conditions associated with persistent connections are very rare
and/or only cause minor problems.
Alex Rousskov [Sat, 12 Mar 2016 18:40:29 +0000 (11:40 -0700)]
Trying to avoid "looser throw specifier" error with Wheezy GCC.
AFAICT, the default CbdataParent destructor gets implicit
"noexcept(true)" specifier (because the default destructor does not
throw itself, and CbdataParent has no data members or parents that could
have contributed potentially throwing destructors). The AsyncJob child
uses a lot of things that might throw during destruction (the compiler
cannot tell for sure because we do not use noexcept specifiers). Thus,
the compiler has to use "noexcept(false)" specifier for ~AsyncJob, which
is "looser" than "noexcept(true)" for ~CbdataParent and, hence, violates
the parent interface AsyncJob is implementing/overriding.
I have doubts about the above analysis because many other compilers,
including GCC v5 and clang are happy with the default virtual
CbdataParent destructor. If my analysis is correct, then the rule of
thumb is: Base classes must not use "= default" destructors until all
our implicit destructors become "noexcept".
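The destructor specifier mismatch can be illustrated with a small sketch. All type names here are hypothetical. To keep the behavior deterministic, the sketch marks one destructor noexcept(false) explicitly, which is an assumption and not what the Squid classes do.

```cpp
#include <type_traits>

// Hypothetical member type with a potentially throwing destructor.
struct MayThrowOnDestroySketch {
    ~MayThrowOnDestroySketch() noexcept(false) {}
};

// Like the defaulted parent destructor discussed above: implicitly noexcept(true).
struct StrictBaseSketch {
    virtual ~StrictBaseSketch() = default;
};

// A parent that explicitly allows throwing destructors in its children.
struct LooseBaseSketch {
    virtual ~LooseBaseSketch() noexcept(false) {}
};

// The implicit ~ChildSketch() is noexcept(false) because of its member,
// which is compatible with the LooseBaseSketch destructor it overrides.
struct ChildSketch : LooseBaseSketch {
    MayThrowOnDestroySketch member;
};

// Expected to be rejected with a "looser" exception specification error,
// because the implicit child destructor would be noexcept(false):
// struct BadChildSketch : StrictBaseSketch { MayThrowOnDestroySketch member; };

static_assert(std::is_nothrow_destructible<StrictBaseSketch>::value, "noexcept(true)");
static_assert(!std::is_nothrow_destructible<LooseBaseSketch>::value, "noexcept(false)");
static_assert(!std::is_nothrow_destructible<ChildSketch>::value, "noexcept(false)");
```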
AsyncJob classes can now use C++11 overrides as long as they use the new
CBDATA_CHILD() macro instead of old CBDATA_CLASS().
I have prohibited multiple CBDATA_CHILD() classes on the same
inheritance branch by adding the "final" specifier to toCbdata(). Such
classes feel dangerous because they may have different sizes and it is
not obvious to me whether the cbdata code will call the right size-
specific delete for them. We can easily relax this later if needed.
Alex Rousskov [Fri, 11 Mar 2016 18:00:51 +0000 (11:00 -0700)]
Bug 7: Update cached entries on 304 responses.
New Store API to update entry metadata and headers on 304s.
Support entry updates in shared memory cache and rock cache_dirs.
No changes to ufs-based cache_dirs: Their entries are still not updated.
* Atomic StoreEntry metadata updating
StoreEntry metadata (swap_file_sz, timestamps, etc.) is used
throughout Squid code. Metadata cannot be updated atomically because
it has many fields, but a partial update to those fields causes
assertions. Still, we must update metadata when updating HTTP
headers. Locking the entire entry for a rewrite does not work well
because concurrent requests will attempt to download a new entry
copy, defeating the very HTTP 304 optimization we want to support.
Ipc::StoreMap index now uses an extra level of indirection (the
StoreMap::fileNos index) which allows StoreMap to control which
anchor/fileno is associated with a given StoreEntry key. The entry
updating code creates a disassociated (i.e., entry/key-less) anchor,
writes new metadata and headers using that new anchor, and then
_atomically_ switches the map to use that new anchor. This allows old
readers to continue reading using the stale anchor/fileno as if
nothing happened while a new reader gets the new anchor/fileno.
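A rough sketch of the indirection idea, assuming a single writer. StoreMapSketch and AnchorSketch are hypothetical miniatures, not the real Ipc::StoreMap: the key resolves to an anchor number through an atomic index, so publishing a fully written spare anchor is a single atomic exchange.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Hypothetical miniature of per-entry metadata.
struct AnchorSketch {
    std::uint64_t swapFileSize;
    std::int64_t timestamp;
};

// Hypothetical miniature of a map with an extra level of indirection.
class StoreMapSketch {
public:
    StoreMapSketch(int keyCount, int anchorCount)
        : anchors(anchorCount), fileNos(keyCount) {
        for (int k = 0; k < keyCount; ++k)
            fileNos[k].store(k); // initially, key k uses anchor k
    }

    // Readers resolve the key once and keep using the anchor number they got.
    int anchorFor(int key) const { return fileNos[key].load(std::memory_order_acquire); }

    const AnchorSketch &anchorAt(int fileNo) const { return anchors[fileNo]; }

    // Writer: fill a spare (key-less) anchor, then publish it atomically.
    // Returns the stale anchor number so that the caller can recycle it later.
    int update(int key, int spareFileNo, const AnchorSketch &fresh) {
        anchors[spareFileNo] = fresh;
        return fileNos[key].exchange(spareFileNo, std::memory_order_acq_rel);
    }

private:
    std::vector<AnchorSketch> anchors;
    std::vector<std::atomic<int>> fileNos;
};
```

A reader that resolved the key before update() keeps reading the stale anchor, while a reader that resolves it afterwards sees the fresh one.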
Shared memory usage increase: 8 additional bytes per cache entry: 4
for the extra level of indirection (StoreMapFileNos) plus 4 for
splicing fresh chain prefix with the stale chain suffix
(StoreMapAnchor::splicingPoint). However, if the updated headers are
larger than the stale ones, Squid will allocate shared memory pages
to accommodate the increase, leading to shared memory
fragmentation/waste for small increases.
* Revamped rock index rebuild process
The index rebuild process had to be completely revamped because
splicing fresh and stale entry slot chain segments implies tolerating
multiple entry versions in a single chain and the old code was based
on the assumption that different slot versions are incompatible. We
were also uncomfortable with the old cavalier approach to accessing
two differently indexed layers of information (entry vs. slot) using
the same set of class fields, making it trivial to accidentally
access entry data while using slot index.
During the rewrite of the index rebuilding code, we also discovered a
way to significantly reduce RAM usage for the index build map (a
temporary object that is allocated in the beginning and freed at the
end of the index build process). The savings depend on the cache
size: A small cache saves about 30% (17 vs 24 bytes per entry/slot)
while a 1TB cache_dir with 32KB slots (which implies uneven
entry/slot indexes) saves more than 50% (~370MB vs. ~800MB).
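The quoted savings can be double-checked with a little arithmetic. The helper names below are hypothetical, and the slot count assumes binary units (1TB = 2^40 bytes, 32KB = 2^15 bytes), which the text above does not state.

```cpp
#include <cstdint>

// Fraction of memory saved when usage drops from `before` to `after`.
constexpr double savedFractionSketch(double before, double after) {
    return 1.0 - after / before;
}

// Number of slots in a cache_dir of the given size.
constexpr std::uint64_t slotCountSketch(std::uint64_t dirBytes, std::uint64_t slotBytes) {
    return dirBytes / slotBytes;
}

// Small cache: 17 vs 24 bytes per entry/slot is a saving of about 29%.
constexpr double SmallCacheSaving = savedFractionSketch(24.0, 17.0);

// Large cache: ~370MB vs ~800MB is a saving of about 54%.
constexpr double LargeCacheSaving = savedFractionSketch(800.0, 370.0);

// Assuming binary units, a 1TB cache_dir with 32KB slots has 2^25 slots.
constexpr std::uint64_t LargeCacheSlots = slotCountSketch(1ULL << 40, 32ULL << 10);
static_assert(LargeCacheSlots == 33554432ULL, "2^40 / 2^15 = 2^25");
```

Under the same assumption, 2^25 slots at 24 bytes each come to 768MiB, which is in line with the ~800MB figure for the old code.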
Adjusted how invalid slots are counted. The code was sometimes
counting invalid entries and sometimes invalid entry slots. We should
always count _slots_ now because progress is measured in the number
of slots scanned, not entries loaded. This accounting change may
surprise users with much higher "Invalid entries" count in cache.log
upon startup, but at least the new reports are meaningful.
This rewrite does not attempt to solve all rock index build problems.
For example, the code still assumes that StoreEntry metadata fits a
single slot, which is not always true for very small slots.
Alex Rousskov [Fri, 11 Mar 2016 17:24:13 +0000 (10:24 -0700)]
Removed SWAPOUT_WRITING assertion from storeSwapMetaBuild().
I do not see any strong dependency of that code on that state and we
need to be able to build swap metadata when updating a stale entry
(which would not normally be in the SWAPOUT_WRITING state).
The biggest danger is that somebody calls storeSwapMetaBuild() when the
entry metadata is not yet stable. I am not sure we have a way of
detecting that without using something as overly strong as
SWAPOUT_WRITING.
Squid crashes on shutdown while cleaning up idle ICAP connections.
The global Adaptation::Icap::TheConfig object is automatically
destroyed when Squid exits. Its destructor destroys Icap::ServiceRep
objects that, in turn, close all open connections in the idle
connections pool. Since this happens after comm_exit has destroyed all
Comm structures associated with those connections, Squid crashes.
Amos Jeffries [Tue, 1 Mar 2016 02:57:50 +0000 (15:57 +1300)]
RFC 7725: Add registry entry for 451 status text
While Squid does not generate these messages automatically we still have
to relay the status line text accurately, and admins may want to use it
for deny_info status.
After certain failures, FwdState::retryOrBail() may be called twice,
once from FwdState::handleUnregisteredServerEnd() [called from
HttpStateData::swanSong()] and once from the FwdState's own connection
close handler. This may result in two concurrent connections to the
remote server, followed by an assertion upon a connection closure.
This patch:
- After HttpStateData failures, instead of closing the squid-to-peer
connection directly (and, hence, triggering closure handlers), calls
HttpStateData::closeServer() and mustStop() for a cleaner exit with
fewer wasteful side effects and better debugging.
- Creates and remembers a FwdState close handler AsyncCall so that
comm_remove_close_handler() can cancel an already scheduled callback.
The conversion to the AsyncCall was necessary because legacy [close
handler callbacks] cannot be canceled once scheduled.
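Why a remembered call object helps can be shown with a toy model. CallSketch and QueueSketch are hypothetical and much simpler than Squid's AsyncCall machinery: the point is only that a callback kept as an object can still be canceled after it has been queued.

```cpp
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical miniature of a cancelable, queued callback; not Squid's AsyncCall.
struct CallSketch {
    explicit CallSketch(std::function<void()> f) : handler(std::move(f)) {}

    void cancel() { canceled = true; }

    // Runs the handler unless the call was canceled after being scheduled.
    void fire() {
        if (!canceled)
            handler();
    }

    std::function<void()> handler;
    bool canceled = false;
};

using CallPointerSketch = std::shared_ptr<CallSketch>;

// Hypothetical miniature of an asynchronous call queue.
struct QueueSketch {
    void schedule(const CallPointerSketch &call) { pending.push(call); }

    void drain() {
        while (!pending.empty()) {
            const CallPointerSketch call = pending.front();
            pending.pop();
            call->fire();
        }
    }

    std::queue<CallPointerSketch> pending;
};
```

The owner keeps the CallPointerSketch it scheduled, so it can call cancel() at any time before the queue is drained.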
Marcos Mello [Tue, 23 Feb 2016 22:27:43 +0000 (11:27 +1300)]
Bug 3826: SMP compatibility with systemd
** These changes require capabilities changes specific to Squid-4 and
require systemd 209+
NOTE: 'squid -z' command does not yet support SMP with systemd.
Differences from the Squid-3 tools/systemd/squid.service:
- After=nss-lookup.target, for people running a local DNS server like BIND.
Since there is no requirement dependency, it is a NOP when no such
service is running.
- Type=forking and squid without -N in ExecStart: SMP now works.
- PIDFile=/var/run/squid.pid to tell systemd what pid is the main one. This
is actually optional with Squid 4, because systemd will consider its first
child as the main pid. But let's be safe. DEFAULT_PID_FILE could be used
here with proper autoconf/automake magic...
- ExecReload calls kill rather than 'squid -k reconfigure'. systemd already
knows the main pid.
- KillMode=mixed. The old KillMode=process sends SIGTERM (and SIGKILL after
TimeoutStopSec) only to the main daemon process. 'mixed', OTOH, sends SIGTERM
only to the main process but SIGKILL to all processes in the service's cgroup
after the timeout. With 'mixed', systemd ensures that any leftover processes
are cleaned up if daemon shutdown fails. 'mixed' requires systemd >= 209.
Alex Rousskov [Fri, 19 Feb 2016 21:26:00 +0000 (14:26 -0700)]
Fix propagation of response status line parsing error details.
This is a follow-up patch to trunk r14548 (Bug 4432). Now that the
calling code is using the right field to get the parsing error details
(parseStatusCode), we need to fix the code that sets those parsing error
details [in case of response status line parsing errors].
TODO: To minimize chances of similar "I forgot to set parseStatusCode"
bugs slipping through, hide that data member behind a method that
returns scInvalidHeader (or a new scInternalSquidError) if parseError_
is still zero. Rename parseStatusCode to parseError_ and stop confusing
it with the response status code.
Alex Rousskov [Fri, 19 Feb 2016 21:23:08 +0000 (14:23 -0700)]
Throw instead of asserting on some String overflows.
Note that Client-caught exceptions result in HTTP 500 (Internal Server
Error) responses with X-Squid-Error set to "ERR_CANNOT_FORWARD 0".
Also avoid stuck Client jobs on exceptions. See trunk r8266 for a
similar fix with a detailed discussion. Here, I added doneWithFwd
instead of setting fwd to NULL because we dereference fwd (and store
pointers to things stored in fwd!) in many places. I think it is too
risky to just clear refcounted FwdState pointer (except in the
destructor where doing so is pointless).
Using doneWithFwd correctly is difficult because there are many ways we
can be "done" with FwdState, including:
* calling fwd->complete(),
* calling fwd->handleUnregisteredServerEnd(), and
* closing the connection that FwdState monitors for closures.
The latter is an especially tricky case because the closing is initiated in
many places, the process is asynchronous, and not all control
connections are monitored by FwdState.
For example, the updated control connection closure handler assumes that
it is being used for either external closures or internal closures
incorrectly used instead of mustStop()/abortAll(). In both cases, either
FwdState is still monitoring the connection (OK) or we forgot to call
one of its "done" methods listed above before closing. The latter would
be a bug, but I did not find any signs of it and fixing it would be
outside this change scope anyway.
Also unified String size limit checks [that I could find].
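Throwing on overflow, instead of asserting, can be sketched as below. LimitedStringSketch is hypothetical and is not Squid's String class. The 65535-byte limit is taken from the 'len_ + len <65536' assertion text that appears in this log, and the exception type is an assumption.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Assumed limit, derived from the 'len_ + len <65536' assertion text.
const std::size_t StringSizeMaxSketch = 65535;

// Hypothetical length-limited string; not Squid's String class.
class LimitedStringSketch {
public:
    void append(const char *data, std::size_t len) {
        // buf.size() never exceeds the limit, so the subtraction cannot wrap
        if (len > StringSizeMaxSketch - buf.size())
            throw std::length_error("String size limit exceeded");
        buf.append(data, len);
    }

    std::size_t size() const { return buf.size(); }

private:
    std::string buf;
};
```

A caller can catch the exception and turn it into an error response, whereas a failed assertion would terminate the whole process.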
external_acl parameters separated by %20 instead of space
If an external ACL is configured with more than one parameter as shown
in the example below, then Squid sends those parameters to the
external_acl helper separated by %20 characters instead of spaces:
acl TEST external ACLTYPE param1=val1 param2=val2
This change fixes a regression introduced in trunk r14351 (Support
logformat %macros in external_acl_type format) but more work may
be needed to make Squid behave as squid.conf.documented promises.
Amos Jeffries [Fri, 19 Feb 2016 15:06:42 +0000 (04:06 +1300)]
Revert r14303: Migrate StoreEntry to using MEMPROXY_CLASS
This change has been identified as the trigger for several object caching
errors. The real cause is not yet known, but reverting this optimisation
avoids it, so is being done for stability.
This resolves bugs 4370 and maybe also 4354 and 4355
William Lima [Thu, 18 Feb 2016 12:48:08 +0000 (01:48 +1300)]
Bug 3870: assertion failed: String.cc: 'len_ + len <65536' in ESI::CustomParser
The custom ESI parser used in absence of libxml2 or libexpat parsers was
restricted to handling 64KB buffers but under some conditions could expand
to over 64KB during the parse process, hitting this assertion.
TODO: the parser can now be redesigned to make use of Tokenizer and
CharacterSet parsing tools. But that is left for later work.
* Do not use parsing leftovers, such as HTTP response status code. Doing
so screws up error detection logic in continueAfterParsingHeader() and
leads to stuck transactions instead of error responses.
* Do not store the fake half-baked response (via replaceHttpReply).
Doing so leads to assertions. The fake response is only meant for
continueAfterParsingHeader().
I also removed a misleading XXX about connection closure. Our
continueAfterParsingHeader() handles errors, not processReplyHeader().
TODO: The error detection/propagation code is ugly and should be
rewritten [using C++ exceptions].
tangqinghao [Thu, 18 Feb 2016 02:48:41 +0000 (15:48 +1300)]
Bug 4111: leave_suid() does not properly handle error codes returned by setuid
... this will cause privilege escalation in the rare case that setuid fails.
So far there are no known cases of this happening when downgrading from root.
Also fixes several incorrect uses of errno which may have been obscuring
error message details if it did happen.
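Checking the setuid(2) result can be sketched as follows, assuming a POSIX system. The helper dropToUserSketch() is hypothetical and is not Squid's leave_suid(). It saves errno right after the failing call, before any other call can overwrite it.

```cpp
#include <sys/types.h>
#include <unistd.h>
#include <cerrno>
#include <cstdio>
#include <cstring>

// Hypothetical helper, not Squid's leave_suid(): switch to the given user ID
// and report failure instead of silently continuing with the old privileges.
inline bool dropToUserSketch(uid_t target) {
    if (setuid(target) != 0) {
        const int savedErrno = errno; // save before other calls can overwrite it
        std::fprintf(stderr, "ERROR: setuid(%ld) failed: %s\n",
                     static_cast<long>(target), std::strerror(savedErrno));
        return false;
    }
    // double-check that the effective user ID really is the requested one
    return geteuid() == target;
}
```

A caller that gets false back should stop rather than carry on with elevated privileges.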