Joel Rosdahl [Tue, 20 Sep 2022 17:05:38 +0000 (19:05 +0200)]
feat: Use subsecond timestamps for include file check
To avoid a race condition, ccache disables the direct mode if an include
file has an mtime or ctime that is too new. Previously this check used
timestamps with one-second resolution, which meant that a generated
include file would often disable direct mode hits for up to one second.
Now ccache uses timestamps with subsecond resolution (nanoseconds on
Linux), so in practice the direct mode no longer has to be disabled for
generated include files.
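
A minimal sketch of why the resolution matters, not ccache's actual code.
It assumes the check is "file time >= compilation start time"; the names
too_new and file_times_ns are hypothetical.

```python
import os

NS_PER_SEC = 1_000_000_000


def file_times_ns(path):
    """Return (mtime, ctime) of a file in nanoseconds since the epoch."""
    st = os.stat(path)
    return st.st_mtime_ns, st.st_ctime_ns


def too_new(file_time_ns, compile_start_ns, resolution_ns=1):
    """Assumed check: a file is "too new" if its timestamp is not older
    than the compilation start, with both values first truncated to the
    given resolution."""
    return file_time_ns // resolution_ns >= compile_start_ns // resolution_ns


# A header generated 0.5 s before the compilation starts, within the same
# wall-clock second (illustrative values).
generated_ns = 10 * NS_PER_SEC + 200_000_000
start_ns = 10 * NS_PER_SEC + 700_000_000

print(too_new(generated_ns, start_ns, NS_PER_SEC))  # one-second resolution
print(too_new(generated_ns, start_ns))              # nanosecond resolution
```

With one-second resolution both timestamps truncate to the same second, so
the header counts as too new; with nanosecond resolution it does not.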
Joel Rosdahl [Wed, 14 Sep 2022 19:19:14 +0000 (21:19 +0200)]
feat: Merge local manifest with fetched remote manifest
With read-only secondary storage, it can happen that primary storage has
a manifest named M with a result entry R1, while secondary storage also
has a manifest M but with result R2. On a compilation that matches R2,
ccache will first find M in primary storage, fail to find R2 in it, and
then fetch M from secondary storage, where R2 can be found. Since M
already exists locally, ccache will simply return the cache hit but not
store knowledge of R2 locally. On a later rebuild that matches R2,
ccache therefore needs to fetch from secondary storage again.
The improvement brought by this commit is that ccache now merges the
manifests from primary and secondary storage and stores the merged
version in primary storage. In other words, ccache setups with read-only
secondary storage will be able to accumulate local header file
combinations and seamlessly combine them with changes from secondary
storage.
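
The merge can be pictured roughly as follows. This is an illustrative
sketch only: it models a manifest as a list of (header combination, result
key) entries, which is an assumption and not ccache's real manifest
format, and merge_manifests is a hypothetical name.

```python
def merge_manifests(local, remote):
    """Return the local entries followed by any remote entries that are
    not already known locally. Entries must be hashable."""
    merged = list(local)
    seen = set(local)
    for entry in remote:
        if entry not in seen:
            merged.append(entry)
            seen.add(entry)
    return merged


# Each entry: (tuple of (include path, content digest) pairs, result key).
local_m = [((("a.h", "digest-a1"),), "R1")]
remote_m = [((("a.h", "digest-a2"),), "R2")]

# The merged manifest is what would be written back to primary storage.
merged_m = merge_manifests(local_m, remote_m)
print([result for _, result in merged_m])
```

After the merge, a rebuild that matches R2 can be resolved from the local
manifest without consulting secondary storage for the manifest again.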
Joel Rosdahl [Sun, 11 Sep 2022 11:48:05 +0000 (13:48 +0200)]
chore: Simplify cache entry reading and writing
Cache entries are now fully read into memory before (de)compressing,
checksumming and parsing, instead of streaming data like before. While
this increases memory usage when working with large object files, it
also simplifies the code a lot. Another motivation for this change is
that cache entry data is not streamed from secondary storage anyway, and
it makes sense to keep the architecture simple and similar for primary
and secondary storage code paths.
The cache entry format has been modified so that the checksum covers the
potentially compressed payload (plus the header), not the uncompressed
payload (plus the header) like before. The checksum is now also stored
in an uncompressed epilogue. Since the cache entry format has been
changed, the input hash has been changed as well.
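
A rough sketch of the new layout, under loud assumptions: the one-byte
header, zlib and BLAKE2b below are stand-ins chosen so that the example
runs with the Python standard library only; they are not ccache's real
header layout, compression or checksum algorithm.

```python
import hashlib
import zlib

CHECKSUM_SIZE = 16  # hypothetical epilogue size in bytes


def _checksum(data):
    # Stand-in checksum function (assumption).
    return hashlib.blake2b(data, digest_size=CHECKSUM_SIZE).digest()


def serialize(payload, compress):
    """Build header + (possibly compressed) payload + uncompressed epilogue."""
    header = bytes([1 if compress else 0])  # hypothetical one-byte header
    stored = zlib.compress(payload) if compress else payload
    # The checksum covers the header plus the payload as stored.
    return header + stored + _checksum(header + stored)


def parse(entry):
    """Verify the checksum on the raw bytes, then decompress if needed."""
    if len(entry) < 1 + CHECKSUM_SIZE:
        raise ValueError("truncated cache entry")
    body, epilogue = entry[:-CHECKSUM_SIZE], entry[-CHECKSUM_SIZE:]
    if _checksum(body) != epilogue:
        raise ValueError("checksum mismatch")
    header, stored = body[:1], body[1:]
    return zlib.decompress(stored) if header[0] == 1 else stored
```

One consequence of this layout is that corruption can be detected before
any decompression is attempted, since the checksum is computed over the
bytes exactly as they are stored.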
Joel Rosdahl [Mon, 5 Sep 2022 18:23:21 +0000 (20:23 +0200)]
refactor: Use memory buffers instead of streams for results
- Result objects now only know and care about the result payload part of
a result cache entry.
- Result objects are no longer tightly coupled with the primary storage
implementation.
This is part of a larger refactoring effort with the goal of simplifying
how cache entries are read and processed.
Joel Rosdahl [Tue, 30 Aug 2022 18:34:32 +0000 (20:34 +0200)]
chore: Remove share-hits attribute for secondary storage
[1] added a share-hits attribute for secondary storages so that it's
possible to avoid sharing hits to primary storage for a specific
secondary storage. I believe that nobody needs that level of control --
what one would like is the ability to not use the primary storage at
all. Such a feature will be added in a future commit, but for now the
share-hits=false functionality is just in the way, so let's remove it.
Joel Rosdahl [Sat, 27 Aug 2022 18:04:26 +0000 (20:04 +0200)]
test: Disable "output file failure" test for unsupported filesystem
The "Failure to write output file" test assumes that the filesystem
supports read-only directories. Improve this by probing this assumption
before running the test.
Joel Rosdahl [Fri, 5 Aug 2022 14:39:29 +0000 (16:39 +0200)]
feat: Improve inode cache robustness
- Only enable the inode cache at compile-time if it's possible to
determine filesystem type.
- Only use the inode cache at run-time if the filesystem type is known
to work with the inode cache, instead of only refusing to use it on NFS.
Joel Rosdahl [Wed, 3 Aug 2022 08:04:03 +0000 (10:04 +0200)]
fix: Hash time information on inode cache hit
As mentioned in discussion #1086: If the inode cache is enabled,
hash_source_code_file will on an inode cache hit fail to hash time
information if there are temporal macros in the code. This is because
hash_source_code_string (called from hash_source_code_file via
hash_source_code_file_nocache) on an inode cache miss adds time
information to the hash if one of those macros was found. However,
on an inode cache hit the return value of hash_source_code_file will be
correctly fetched from the cache, but the hash sum will only be updated
with (the hash of) the include file and not the time information.
The fix is to let the inode cache only cache the effects of hashing the
file and checking for macros, not the hashing of time information since
that's volatile.
After the fix:
- The new do_hash_file function performs file hashing and macro
checking. The inode cache caches this step.
- hash_source_code_file returns a Digest instead of adding data to a
Hash.
- hash_source_code_file calls do_hash_file and then potentially hashes
time information. If there are no temporal macros the returned digest
will be identical to the file hash, otherwise the returned digest will
be of a hash of file content hash + time information.
This also improves hashes that are stored in the direct mode manifest:
Previously they were always the hash of the file content hash but now
they are just the file content hash in the common case when there is no
__DATE__ or __TIMESTAMP__ macro in the file.
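
The split described above can be sketched like this. The function names
follow the commit message, but the bodies are illustrative assumptions:
SHA-256 stands in for ccache's hash, a dict stands in for the inode cache,
and the time information is passed in as plain strings.

```python
import hashlib

# Hypothetical stand-in for the inode cache: key -> (digest, macros found).
inode_cache = {}


def do_hash_file(inode_key, content):
    """Hash the content and check for temporal macros. Only this step is
    cached, since its result depends on the file alone."""
    if inode_key in inode_cache:
        return inode_cache[inode_key]
    digest = hashlib.sha256(content).hexdigest()
    macros = frozenset(
        m for m in (b"__DATE__", b"__TIMESTAMP__") if m in content
    )
    inode_cache[inode_key] = (digest, macros)
    return digest, macros


def hash_source_code_file(inode_key, content, date, mtime):
    """Return a digest; volatile time information is hashed on every call,
    regardless of whether do_hash_file hit the cache."""
    digest, macros = do_hash_file(inode_key, content)
    if not macros:
        return digest  # common case: just the file content hash
    h = hashlib.sha256(digest.encode())
    if b"__DATE__" in macros:
        h.update(date.encode())
    if b"__TIMESTAMP__" in macros:
        h.update(mtime.encode())
    return h.hexdigest()
```

In this sketch a cache hit on a file containing __DATE__ still yields a
different digest when the date changes, which is the behavior the fix is
meant to restore.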
Joel Rosdahl [Mon, 1 Aug 2022 12:53:52 +0000 (14:53 +0200)]
fix: Always rewrite dependency file if base_dir is used
[1] added the has_absolute_include_headers variable as a performance
optimization for base_dir/CCACHE_BASEDIR so that the dependency file
only has to be parsed/rewritten when necessary. This is based on the
assumption that if no include file has an absolute path then no rewrite
is needed, but apparently Clang can insert other paths into the
dependency file as well, for instance the asan_blacklist.txt file when
using -fsanitize=address.
Fix this by simply always parsing the dependency file when base_dir is
active.
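
A simplified sketch of such an unconditional rewrite, not ccache's
implementation. It assumes POSIX paths without spaces, leaves rule targets
(tokens ending in a colon) untouched, and uses a made-up location for the
asan_blacklist.txt file; make_paths_relative is a hypothetical name.

```python
import posixpath
import re


def make_paths_relative(dep_data, base_dir, cwd):
    """Rewrite every dependency path under base_dir to be relative to cwd."""
    prefix = base_dir.rstrip("/") + "/"
    # Split on whitespace but keep the separators so the layout survives.
    parts = re.split(r"(\s+)", dep_data)
    for i, part in enumerate(parts):
        if part.startswith(prefix) and not part.endswith(":"):
            parts[i] = posixpath.relpath(part, cwd)
    return "".join(parts)


# No include file is absolute here, yet one path still needs rewriting.
dep = "foo.o: foo.c \\\n /home/u/proj/asan_blacklist.txt\n"
print(make_paths_relative(dep, "/home/u/proj", "/home/u/proj/build"))
```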
Joel Rosdahl [Thu, 28 Jul 2022 08:58:05 +0000 (10:58 +0200)]
feat: Improve handling of dependency files
- Cache entries are now shared for different -MT/-MQ options. This is
implemented by (on a cache hit) rewriting the stored dependency file
data to have the correct dependency target before writing it to the
destination. Closes #359.
- An intentional side effect of the above is that the correct dependency
target will be produced even when base_dir/CCACHE_BASEDIR is used -
the dependency target will still be an absolute path. Fixes #1042.
- Buggy support for GCC-specific environment variables
DEPENDENCIES_OUTPUT and SUNPRO_DEPENDENCIES has been removed. When one
of those variables was set, ccache used to store and fetch the content
just as if -MMD or -MD were used. This is however incorrect since GCC
*appends* to the destination file instead of (like -MMD/-MD)
*rewriting* it. Since there is no way for ccache to know what the
compiler appended to the dependency file, we simply can't support it.
Reverts #349.
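
The target rewriting on a cache hit could look roughly like the sketch
below. It is a simplification under stated assumptions: the stored data
starts with a single rule, the target contains no ": " sequence, and
rewrite_dep_target is a hypothetical name, not a ccache function.

```python
def rewrite_dep_target(dep_data, new_target):
    """Replace the target of the first rule in dependency file data."""
    sep = dep_data.find(": ")
    if sep == -1:
        return dep_data  # nothing that looks like a rule; leave unchanged
    return new_target + dep_data[sep:]


stored = "obj/foo.o: foo.c \\\n foo.h\n"
# A later build asks for another target, e.g. via -MT /abs/build/foo.o.
print(rewrite_dep_target(stored, "/abs/build/foo.o"), end="")
```

Because the target is substituted when the dependency file is written out,
one stored cache entry can serve builds that differ only in -MT/-MQ.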
Joel Rosdahl [Tue, 19 Jul 2022 13:31:34 +0000 (15:31 +0200)]
feat: Don't remove inode cache file on -C/--clear
-C/--clear is tied to the cache directory while the inode cache file is
a temporary file. I think it makes more sense to not consider the inode
cache part of the main cache directory.