Joel Rosdahl [Thu, 6 Jun 2019 18:10:10 +0000 (20:10 +0200)]
Remove the hard link mode
Rationale:
* The hard link feature is prone to errors: a) changes to files outside
the cache will corrupt the cache, and b) the mtime field in the file's
i-node is used for different purposes by ccache and build tools like
make.
* The upcoming enabling of LZ4 compression by default will make the hard
link mode obsolete as a means of saving cache space.
* Not supporting hard links will make a future backend storage API
simpler.
Joel Rosdahl [Thu, 6 Jun 2019 11:44:16 +0000 (13:44 +0200)]
Improve error handling of (de)compressors
Previously, some kinds of corruption were not detected by the zlib
decompressor since it didn’t check that it had reached the end of the
stream and therefore didn’t verify the Adler-32 checksum.
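As a rough illustration of the check that was being skipped, here is a minimal sketch of Adler-32 (as defined in RFC 1950) and of comparing it with the four big-endian trailer bytes that end a zlib stream. The helper names are hypothetical and this is not ccache's code; it only shows that the comparison is possible once the whole stream has been consumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Adler-32 per RFC 1950: two running sums modulo 65521. */
static uint32_t adler32_simple(const unsigned char *data, size_t len)
{
	uint32_t a = 1, b = 0;

	assert(data != NULL || len == 0);
	for (size_t i = 0; i < len; i++) {
		a = (a + data[i]) % 65521;
		b = (b + a) % 65521;
	}
	return (b << 16) | a;
}

/*
 * Hypothetical helper: compare the checksum of the decompressed bytes with
 * the 4-byte big-endian trailer. A reader that stops before the end of the
 * stream never reaches the trailer and so never performs this comparison.
 */
static int trailer_matches(const unsigned char *plain, size_t len,
                           const unsigned char trailer[4])
{
	uint32_t stored = ((uint32_t)trailer[0] << 24)
	                  | ((uint32_t)trailer[1] << 16)
	                  | ((uint32_t)trailer[2] << 8)
	                  | (uint32_t)trailer[3];
	return adler32_simple(plain, len) == stored;
}
```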
Joel Rosdahl [Tue, 4 Jun 2019 19:49:52 +0000 (21:49 +0200)]
Use the compression API for results
It didn’t feel right to use zlib’s gzip format for the embedded content,
especially since other compression libraries don’t support a similar
interface. Therefore, use the standard low-level zlib API instead.
Joel Rosdahl [Thu, 30 May 2019 18:37:12 +0000 (20:37 +0200)]
Revise disk format for results
* Removed unused hash_size and reserved fields. Since there are no
hashes stored in the result metadata, hash size is superfluous. The
reserved bits field is also unnecessary; if we need to change the
format, we can just step RESULT_VERSION and be done with it.
* Instead of storing file count in the header, store an EOF marker after
the file entries. The main reason for this is that files then can be
appended to the result file without having to precalculate how many
files the result will contain.
* Don’t include trailing NUL in suffix strings since the length is known.
* Instead of potentially compressing the whole file, added an
uncompressed header telling how/if the rest of the file is
compressed (which algorithm and level). This makes it possible to more
efficiently recompress files in a batch job since it’s possible to
reasonably efficiently check if a cached file should be repacked. The
reason for not having compression info in each subfile
header (supporting different compression algorithms/levels per
subfile) is to make the repacking scenario simpler.
* Prepared for adding support for “reference entries”, which refer to
other results. There are two potential use cases for reference
entries: a) deduplication and b) storing partial results with a
different compression algorithm/level. It’s probably only the
deduplication use case that is interesting, though. It can be done
either at cache miss time or later as a batch job. If we really want
to, we can in the future add similar “raw reference entries” that
refer to files stored verbatim in the storage, thus re-enabling hard
link functionality.
* Changed to cCrS as the magic bytes for result files. This is analogous
to the magic bytes used for manifest files.
* Added documentation of the format.
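The EOF-marker idea from the list above can be sketched as follows. The layout here (marker byte, one-byte suffix length, suffix without trailing NUL, one-byte data length, data) and all names are assumptions made for illustration; they are not the actual ccache result format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical marker values, not taken from ccache's format. */
enum { MARKER_EOF = 0, MARKER_FILE = 1 };

struct out_buf {
	unsigned char bytes[256];
	size_t len;
};

/* Append one entry without knowing how many entries will follow. */
static int append_entry(struct out_buf *out, const char *suffix,
                        const void *data, uint8_t data_len)
{
	assert(out != NULL && suffix != NULL && data != NULL);
	size_t suffix_len = strlen(suffix);
	size_t needed = 1 + 1 + suffix_len + 1 + data_len;

	/* Keep one byte free for the EOF marker. */
	if (suffix_len > 255 || out->len + needed + 1 > sizeof(out->bytes)) {
		return -1;
	}
	out->bytes[out->len++] = MARKER_FILE;
	out->bytes[out->len++] = (unsigned char)suffix_len;
	memcpy(out->bytes + out->len, suffix, suffix_len); /* no trailing NUL */
	out->len += suffix_len;
	out->bytes[out->len++] = data_len;
	memcpy(out->bytes + out->len, data, data_len);
	out->len += data_len;
	return 0;
}

/* Terminate the entry list. */
static int finish(struct out_buf *out)
{
	if (out->len >= sizeof(out->bytes)) {
		return -1;
	}
	out->bytes[out->len++] = MARKER_EOF;
	return 0;
}

/* Count entries up to the EOF marker; -1 if the input is truncated. */
static int count_entries(const unsigned char *p, size_t len)
{
	size_t pos = 0;
	int count = 0;

	while (pos < len) {
		unsigned char marker = p[pos++];
		if (marker == MARKER_EOF) {
			return count;
		}
		if (marker != MARKER_FILE || pos >= len) {
			return -1;
		}
		size_t suffix_len = p[pos++];
		if (suffix_len >= len - pos) { /* suffix plus data length byte */
			return -1;
		}
		pos += suffix_len;
		size_t data_len = p[pos++];
		if (data_len > len - pos) {
			return -1;
		}
		pos += data_len;
		count++;
	}
	return -1; /* ran out of bytes without seeing the EOF marker */
}
```

A side effect of this design is that a file cut off before the marker is recognizably incomplete, which a count stored in the header does not give for free.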
Luboš Luňák [Mon, 20 May 2019 19:18:16 +0000 (21:18 +0200)]
Fix PCH detection in depend mode (+test improvements) (#427)
* do not refer to Clang's PTH in tests
The PTH feature has been removed (https://reviews.llvm.org/D54547)
and according to the commit it has never really been used. Maybe this
made sense at some point in the past, but now those .pth files must be PCHs
internally. This commit actually just changes the .pth extensions
to .pch to avoid confusion, technically nothing should change
except for filenames.
* try to share PCH tests between GCC and Clang
Clang is supposed to be a drop-in for GCC, so in general it should
be able to handle everything GCC can. That's not completely true
in practice since there are differences, but it doesn't make sense
to completely duplicate a testcase just because there are some
differences. So start creating a shared common base for the PCH
tests and do separately only tests that act differently.
* more sharing of PCH tests between GCC and Clang
There's e.g. no need to do all kinds of complex tests with both
.gch and .pch with Clang, except for checking that Clang finds
one of them if none is specified explicitly.
* log also when pch usage is detected from pragma pch_preprocess
* try harder to verify in tests that ccache detects PCH changes
Some of the tests did that, e.g. those 'file changed', but e.g. the cached
.gch creation did not. So try to intentionally change the .gch/.h and test
that it leads to a cache miss. Otherwise there might be a hit simply
because ccache failed to detect PCH usage and ignores the .gch completely.
* clean up #include vs -fpch-preprocess in pch tests
As the manpage says, -fpch-preprocess is needed only with the #include
form, otherwise it's pointless.
* do not mention sloppiness in pch tests, only no sloppiness
Sloppiness is normally required, so there is no point in stating the obvious.
* test also -include-pch with clang
* hash also pch introduced only using -include
GCC does not output the pch in the .d dependencies file, so without
this there would be false cache hits.
* be consistent about sloppiness in pch tests
create pch -> pch_defines
use pch -> time_macros
* test CCACHE_PCH_EXTSUM more thoroughly and also with -include
* pch test for .gch file being in an extra directory
* doc corrections for how to use PCH with ccache
- ccache will fail to properly detect that -include a.h means using
a.h.gch if it requires using path from -I (they are not searched)
- -fpch-preprocess does nothing with Clang, it doesn't output
pragma GCC pch_preprocess and so #include form for PCHs doesn't work
* explain better problems of -MD/-MMD in depend mode
Pavol Sakac [Sun, 5 May 2019 19:04:30 +0000 (21:04 +0200)]
Fix object size verification + bump to 64 bit file sizes in manifest (#407)
Changed manifest format to save the actual file size along with hashed content size.
File size field in manifest updated to 64bits.
Manifest version set to 2.
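To illustrate why a wider size field matters, here is a minimal sketch of writing and reading a 64-bit size. The big-endian byte order and the function names are assumptions; the commit message does not describe the on-disk encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed byte order: big-endian. Not stated in the commit message. */
static void write_u64_be(unsigned char out[8], uint64_t value)
{
	assert(out != 0);
	for (int i = 7; i >= 0; i--) {
		out[i] = (unsigned char)(value & 0xff);
		value >>= 8;
	}
}

static uint64_t read_u64_be(const unsigned char in[8])
{
	uint64_t value = 0;

	for (int i = 0; i < 8; i++) {
		value = (value << 8) | in[i];
	}
	return value;
}
```

A 5 GiB size (5368709120 bytes) survives the round trip through eight bytes, whereas squeezing it into a 32-bit field would leave only 0x40000000.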
Joel Rosdahl [Wed, 1 May 2019 12:51:45 +0000 (14:51 +0200)]
Improve fix in #400 to handle more cases
The dependency file name can come from e.g. DEPENDENCIES_OUTPUT as well,
so hash information about a /dev/null .d file after the argument
processing loop instead.
Joel Rosdahl [Wed, 1 May 2019 11:58:18 +0000 (13:58 +0200)]
Bail out on “-MF /dev/null”
This is an alternative fix for #397, based on the observation/assumption
that “-MF /dev/null” is only ever used as part of a compiler probe
call in combination with “-c /dev/null -o /dev/null”, so there is little
reason to cache the result. The advantage of just bailing out is to
reduce the number of special cases we have to handle.
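A minimal sketch of such a check, assuming the option and its value arrive as two separate arguments as written above. The function name is hypothetical and this is not the code from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if the arguments contain "-MF" followed by "/dev/null". */
static bool has_mf_dev_null(int argc, const char *const *argv)
{
	assert(argc == 0 || argv != NULL);
	for (int i = 0; i + 1 < argc; i++) {
		if (strcmp(argv[i], "-MF") == 0
		    && strcmp(argv[i + 1], "/dev/null") == 0) {
			return true;
		}
	}
	return false;
}
```

A caller would treat a true result as "do not cache this invocation" and fall back to running the real compiler.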
this is useful for determining the length of the generated argument string
* correctly handle @file syntax on Windows
the @file syntax means that the process reads command arguments from the
specified file. this is commonly used in order to shorten commands which
would otherwise be longer than the maximum length limit: many build systems
do this in all cases to avoid hitting this limit.
when a command exceeds 8192 characters on Windows, ccache now writes
the parsed/modified arguments to a tmpfile and then runs the command using
that tmpfile with @tmpfile in order to preserve this mechanism and avoid hitting
the length limit
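The length decision described above might be sketched like this. Only the 8192 figure comes from the text; the function names are hypothetical, and quoting or escaping of arguments is ignored for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Limit quoted in the commit message above. */
#define MAX_COMMAND_LENGTH 8192

/* Length of the NULL-terminated argument list joined by single spaces. */
static size_t joined_length(const char *const *argv)
{
	size_t total = 0;

	assert(argv != NULL);
	for (size_t i = 0; argv[i] != NULL; i++) {
		if (i > 0) {
			total += 1; /* separating space */
		}
		total += strlen(argv[i]);
	}
	return total;
}

/* True if the command should be passed via a temporary @file instead. */
static bool needs_response_file(const char *const *argv)
{
	return joined_length(argv) > MAX_COMMAND_LENGTH;
}
```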
Joel Rosdahl [Mon, 22 Apr 2019 13:22:07 +0000 (15:22 +0200)]
Fix minitrace.c compilation error with GCC 7.3
The error/warning looks like this:
src/minitrace.c: In function ‘mtr_flush’:
src/minitrace.c:256:54: error: ‘%.*s’ directive output may be truncated writing up to 700 bytes into a region of size 252 [-Werror=format-truncation=]
snprintf(arg_buf, ARRAY_SIZE(arg_buf), "\"%s\":\"%.*s\"", raw->arg_name, 700, raw->a_str);
^~~~
In file included from /usr/include/stdio.h:862:0,
from src/minitrace.c:9:
/usr/include/x86_64-linux-gnu/bits/stdio2.h:64:10: note: ‘__builtin___snprintf_chk’ output 6 or more bytes (assuming 706) into a destination of size 256
return __builtin___snprintf_chk (__s, __n, __USE_FORTIFY_LEVEL - 1,
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
__bos (__s), __fmt, __va_arg_pack ());
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Joel Rosdahl [Mon, 15 Apr 2019 19:35:38 +0000 (21:35 +0200)]
Touch up NEWS.adoc
* Use “-” for bullets consistently.
* Use “curly quotation marks” instead of ``asciidoctor'' quotation style
for readability, and similar for apostrophes.
Joel Rosdahl [Sat, 13 Apr 2019 20:52:23 +0000 (22:52 +0200)]
Improve handling of debug levels
Fixes #368.
* Remember if we have seen any option on level 3.
* Let “-g0”, “-ggdb0” and similar cancel out any previously seen “-g”
options except “-gsplit-dwarf”. This is based on observations on how
GCC 7.3 behaves.
* Delay acting on seen debug options until after we have processed all
arguments. This way we can avoid e.g. hashing the current directory if
we get “-g3 -g0”.
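The scan-then-decide approach from the list above can be sketched as follows. The struct and function names are hypothetical, and the handling of “-gdwarf…” options (whose trailing digit is a DWARF version, not a debug level) is an assumption added for the sketch rather than something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical state collected while scanning; names are illustrative. */
struct debug_state {
	bool generating_debuginfo;
	bool level_3;
	bool split_dwarf;
};

/* Only record what was seen; act on the final state after the loop. */
static struct debug_state scan_debug_options(int argc, const char *const *argv)
{
	struct debug_state st = {false, false, false};

	assert(argc == 0 || argv != NULL);
	for (int i = 0; i < argc; i++) {
		const char *arg = argv[i];
		if (strncmp(arg, "-g", 2) != 0) {
			continue;
		}
		if (strcmp(arg, "-gsplit-dwarf") == 0) {
			/* Tracked separately so that a later -g0 does not cancel it. */
			st.split_dwarf = true;
			continue;
		}
		if (strncmp(arg, "-gdwarf", 7) == 0) {
			/* Assumption: the trailing digit here is not a debug level. */
			st.generating_debuginfo = true;
			continue;
		}
		char last = arg[strlen(arg) - 1];
		if (last == '0') {
			/* "-g0", "-ggdb0" or similar cancels earlier -g options. */
			st.generating_debuginfo = false;
			st.level_3 = false;
		} else {
			st.generating_debuginfo = true;
			if (last == '3') {
				st.level_3 = true;
			}
		}
	}
	return st;
}
```

Decisions such as whether to hash the current directory would then be taken from the returned state, so “-g3 -g0” ends up with no debug info recorded.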
Joel Rosdahl [Thu, 11 Apr 2019 20:10:43 +0000 (22:10 +0200)]
Rewrite mdfour routines to not modify state when computing the result
This makes the interface much more intuitive, at the minor expense of
having to copy 32-96 extra bytes (unnecessary if the hash state won’t be
fed with more bytes) when fetching the result.
This change is also partly motivated by the code that handles several
-arch options – it calls get_object_name_from_cpp multiple times to
compute a combined hash, but that function also computes the hash result
each time.
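The interface change can be illustrated with a toy block hash. This is not MD4 and not ccache's mdfour code; the mixing step is a simple FNV-1a style loop chosen only to keep the sketch short, and all names are hypothetical. The point is that the result function takes a const state and applies padding to a local copy, so the caller can keep feeding bytes afterwards.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 8

struct toy_hash {
	uint32_t sum;                   /* value over complete blocks */
	unsigned char tail[BLOCK_SIZE]; /* bytes not yet forming a block */
	size_t tail_len;
	uint64_t total_len;
};

static void mix_block(struct toy_hash *h, const unsigned char *block)
{
	for (size_t i = 0; i < BLOCK_SIZE; i++) {
		h->sum = (h->sum ^ block[i]) * 16777619u;
	}
}

static void toy_init(struct toy_hash *h)
{
	memset(h, 0, sizeof(*h));
	h->sum = 2166136261u;
}

static void toy_update(struct toy_hash *h, const void *data, size_t len)
{
	const unsigned char *p = data;

	assert(h != NULL && (p != NULL || len == 0));
	h->total_len += len;
	for (size_t i = 0; i < len; i++) {
		h->tail[h->tail_len++] = p[i];
		if (h->tail_len == BLOCK_SIZE) {
			mix_block(h, h->tail);
			h->tail_len = 0;
		}
	}
}

/* Pad and finalize a copy; the caller's state is left untouched. */
static uint32_t toy_result(const struct toy_hash *h)
{
	struct toy_hash copy = *h;

	memset(copy.tail + copy.tail_len, 0, BLOCK_SIZE - copy.tail_len);
	mix_block(&copy, copy.tail);
	return copy.sum ^ (uint32_t)copy.total_len;
}
```

With this shape, fetching an intermediate result and then hashing more data gives the same final value as hashing everything in one go.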
Joel Rosdahl [Wed, 10 Apr 2019 20:22:24 +0000 (22:22 +0200)]
Fix bad calculation of object hash in the depend mode
When the depend mode is enabled, there is a code path where the direct
mode hash state is finalized and then more bytes are fed into it to
create a hash sum for the depend mode. The effect is that the last 0-63
bytes (depending on the number of previously hashed bytes) of the data
used for the depend mode will be ignored, which means that the contents
of the last 0-2 or so header files in the .d file are not accounted for
in the object hash used in the depend mode.
Fix this by making a copy of the direct mode hash state before computing
the hash result in calculate_object_hash().