Joel Rosdahl [Thu, 5 Jan 2023 18:14:27 +0000 (19:14 +0100)]
feat: Improve automatic cache cleanup mechanism
The cache cleanup mechanism has worked essentially the same ever since
ccache was initially created in 2002:
- The total number and size of all files in one of the 16 subdirectories
(AKA level 1) are kept in the stats file in said subdirectory.
- On a cache miss, the new compilation result file is written (based on
the first digits of the hash) to a subdirectory of one of those 16
subdirectories, and the stats file is updated accordingly.
- Automatic cleanup is triggered if the size of the level 1 subdirectory
becomes larger than max_size / 16.
- ccache then lists all files in the subdirectory recursively, stats
them to check their size and mtime, sorts the file list on mtime and
deletes the 20% oldest files.
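The last step of the old mechanism can be sketched roughly as follows. This is a minimal Python illustration, not ccache's actual C++ code; the function name select_oldest and the (path, size, mtime) tuple layout are assumptions made for the example.

```python
from typing import List, Tuple

FileInfo = Tuple[str, int, float]  # (path, size in bytes, mtime); hypothetical layout


def select_oldest(files: List[FileInfo], fraction: float = 0.2) -> List[str]:
    """Return the paths of the oldest `fraction` of the files, judged by mtime."""
    by_age = sorted(files, key=lambda f: f[2])  # oldest first
    count = int(len(by_age) * fraction)
    return [path for path, _size, _mtime in by_age[:count]]
```

Note that every file must be listed and stat-ed before the sort can happen, which is where the cost in a large subdirectory comes from.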
Some problems with the approach described above:
- (A) If several concurrent ccache invocations result in a cache miss
and write their results to the same subdirectory then all of them will
start cleaning up the same subdirectory simultaneously, doing
unnecessary work.
- (B) The ccache invocation that resulted in a cache miss will perform
cleanup and then exit, which means that an arbitrary ccache process
that happens to trigger cleanup will take a long time to finish.
- (C) Listing all files in a subdirectory of a large cache can be quite
slow.
- (D) stat-ing all files in a subdirectory of a large cache can be quite
slow.
- (E) Deleting many files can be quite slow.
- (F) Since a cleanup by default removes 20% of the files in a
subdirectory, the actual cache size will (once the cache limit is
reached) on average hover around 90% of the configured maximum size,
which can be confusing.
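The 90% figure in (F) follows from simple arithmetic. The sketch below rests on two simplifying assumptions that the commit message does not state: the subdirectory fills at a steady rate, and removing 20% of the files frees about 20% of the bytes. Under those assumptions the size oscillates between 80% and 100% of the limit.

```python
def average_fill(removed_fraction: float) -> float:
    """Mean cache size as a fraction of the limit, assuming steady growth
    from (1 - removed_fraction) back up to 1.0 between cleanups."""
    low = 1.0 - removed_fraction
    return (low + 1.0) / 2.0


print(f"{average_fill(0.2):.0%}")  # prints 90%
```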
This commit solves or improves on all of the listed problems:
- Before starting automatic cleanup, a global "auto cleanup" lock is
acquired (non-blocking) so that at most one process is performing
cleanup at a time. This solves the potential "cache cleanup stampede"
described in (A).
- Automatic cleanup is now performed in just one of the 256 level 2
directories. This means that a single cleanup on average will be 16
times faster than before. This improves on (B), (C), (D) and (E) since
the cleanup made by a single compilation will not have to access a
large part of the cache. On the other hand, cleanups will be triggered
16 times more often, but the cleanup duty will be more evenly spread
out during a build.
- The total cache size is calculated and compared with the configured
maximum size before starting automatic cleanup. This, in combination
with performing cleanup on level 2, means that the actual cache size
will stay very close to the maximum size instead of about 90% of it. This
solves (F).
The limit_multiple configuration option has been removed since it is no
longer used.
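The control flow described in this commit (check the total size first, then take a non-blocking global lock, then clean a single level 2 directory) can be sketched as below. This is a hedged Python illustration only: the names try_lock and maybe_auto_cleanup are hypothetical, and an exclusively created file stands in for ccache's own LockFile class, whose implementation differs.

```python
import os
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def try_lock(path: str) -> Iterator[bool]:
    """Yield True if the lock was acquired, False if someone else holds it.
    Never blocks. (Stand-in for a real lock; no stale-lock handling.)"""
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        yield False
        return
    try:
        yield True
    finally:
        os.close(fd)
        os.unlink(path)


def maybe_auto_cleanup(
    lock_path: str,
    total_size: int,
    max_size: int,
    clean_one_l2_dir: Callable[[], None],
) -> bool:
    """Run cleanup of one level 2 directory if the cache is over its limit
    and no other process is already cleaning. Return True if cleanup ran."""
    if total_size <= max_size:
        return False  # cache is within its limit, nothing to do
    with try_lock(lock_path) as acquired:
        if not acquired:
            return False  # another process is cleaning; skip instead of waiting
        clean_one_l2_dir()
        return True
```

Because the lock attempt never blocks, a process that loses the race simply continues with its compilation, which is what avoids the stampede described in (A).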
Joel Rosdahl [Thu, 5 Jan 2023 10:07:06 +0000 (11:07 +0100)]
enhance: Make it possible for LockFile::try_acquire to break the lock
If a long-lived lock is stale and has no alive file,
LockFile::try_acquire will never succeed in acquiring the lock. Fix this
by creating the alive file for all lock types and making
LockFile::try_acquire exit when lock activity is seen instead of
immediately after failing to acquire the lock.
Another advantage is that a stale lock can now always be broken right
away if the alive file exists.
Joel Rosdahl [Fri, 30 Dec 2022 20:49:23 +0000 (21:49 +0100)]
fix: Avoid sometimes too wide percent figure in --show-stats
If the numerator is 99999 and the denominator is 100000, the percent
function in Statistics.cpp would return "(100.00%)" instead of the
wanted "(100.0%)". Fix this by using the alternate format string if the
result string overflows its target size.
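The fallback idea can be illustrated with a small Python sketch. It is not the code in Statistics.cpp; the target width of 8 characters is an assumption inferred from the "(100.0%)" example above.

```python
TARGET_WIDTH = 8  # assumed from the length of "(100.0%)"


def percent(numerator: int, denominator: int) -> str:
    """Format a ratio as a percentage that fits in TARGET_WIDTH characters."""
    if denominator == 0:
        return ""
    value = 100.0 * numerator / denominator
    result = f"({value:5.2f}%)"
    if len(result) > TARGET_WIDTH:
        # Two decimals do not fit (value rounds to 100.00), so use one.
        result = f"({value:5.1f}%)"
    return result
```

With this sketch, percent(99999, 100000) gives "(100.0%)" while percent(1, 2) still gives "(50.00%)".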
Joel Rosdahl [Tue, 29 Nov 2022 20:54:08 +0000 (21:54 +0100)]
feat: Do clean/clear/evict-style operations per level 2 directory
Progress bars will now be smoother since the operations are now divided
into 256 instead of 16 "read files + act on files" steps. This is also
in preparation for future improvements related to cache cleanup.
Oleg Sidorkin [Wed, 4 Jan 2023 13:53:21 +0000 (16:53 +0300)]
fix: Use spinlocks for inode cache memory synchronization (#1229)
Changed the inode cache implementation to use spinlocks instead of pthread
mutexes. This makes the inode cache work on FreeBSD and other systems where the
pthread mutexes are destroyed when the last memory mapping containing the
mutexes is unmapped.
Also added tmpfs, ufs and zfs to the list of supported filesystems on macOS and
BSDs.
Joel Rosdahl [Wed, 21 Dec 2022 12:16:12 +0000 (13:16 +0100)]
fix: Fix matching of base directory for MSVC
The base directory will now match case-insensitively with absolute paths
in preprocessed output, or from /showIncludes in the depend mode case,
when compiling with MSVC.
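A minimal sketch of the matching rule, in Python for illustration. The function name matches_base_dir is hypothetical, and the real code may also normalize path separators or handle other details not shown here.

```python
def matches_base_dir(path: str, base_dir: str, case_insensitive: bool) -> bool:
    """Return True if `path` starts with `base_dir`.
    `case_insensitive` would be set when the compiler is MSVC."""
    if case_insensitive:
        path, base_dir = path.lower(), base_dir.lower()
    return path.startswith(base_dir)
```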
Joel Rosdahl [Mon, 5 Dec 2022 19:50:58 +0000 (20:50 +0100)]
enhance: Extract lock keep-alive thread to a manager class
Instead of running one keep-alive thread per lock, a long-lived LockFile
now lets a separate LongLivedLockFileManager object handle keep-alive
for several locks in a single thread.
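The design can be sketched as one background thread that periodically refreshes every registered lock. The Python below is an illustration under assumptions: the class name KeepAliveManager, its method names, and the injected touch callback are all hypothetical and do not mirror the LongLivedLockFileManager interface.

```python
import threading
from typing import Callable, Set


class KeepAliveManager:
    """One background thread that refreshes every registered alive file."""

    def __init__(self, touch: Callable[[str], None], interval: float = 1.0):
        self._touch = touch  # called once per alive file per round
        self._interval = interval
        self._alive_files: Set[str] = set()
        self._cond = threading.Condition()
        self._stop = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def register(self, alive_file: str) -> None:
        with self._cond:
            self._alive_files.add(alive_file)

    def deregister(self, alive_file: str) -> None:
        with self._cond:
            self._alive_files.discard(alive_file)

    def close(self) -> None:
        """Stop the keep-alive thread and wait for it to finish."""
        with self._cond:
            self._stop = True
            self._cond.notify_all()
        self._thread.join()

    def _run(self) -> None:
        with self._cond:
            while not self._stop:
                for alive_file in sorted(self._alive_files):
                    self._touch(alive_file)
                # wait() releases the lock, so register/deregister can proceed.
                self._cond.wait(timeout=self._interval)
```

The thread count stays at one no matter how many locks are held, at the cost of all locks sharing the same refresh interval.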
Joel Rosdahl [Wed, 23 Nov 2022 19:11:12 +0000 (20:11 +0100)]
fix: Don't use copy of mutex/condition in long-lived lock thread
This was essentially due to a typo in
0babd33e84147e923a729ee07a3b85097ec8baa8. Since the LongLivedLockFile
class is not used yet, the bug does not affect any released code.
Joel Rosdahl [Thu, 10 Nov 2022 09:15:12 +0000 (10:15 +0100)]
enhance: Only keep atime if needed
- For the --recompress case, only reset timestamps if mtime has changed,
since local cache LRU cleanup always uses mtime.
- For the --trim-dir/--trim-recompress case, always reset timestamps
since atime may be used for LRU cleanup.
Erik Flodin [Sun, 27 Nov 2022 20:32:36 +0000 (21:32 +0100)]
fix: Fix edge case where a non-temporal identifier is misidentified (#1227)
If a non-temporal identifier that ends with the name of a temporal macro
happens to be at the end of the buffer, with the temporal suffix starting
on the AVX boundary, it would be incorrectly classified as a temporal
macro. This happens because the helper function lacks the context to see
that the data before the match invalidates it.
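The kind of context check involved can be illustrated as follows. This Python sketch scans a whole string rather than AVX-sized chunks, so it does not reproduce the boundary bug itself; it only shows why the characters around a match matter. The function names are hypothetical.

```python
from typing import List, Tuple

TEMPORAL_MACROS = ("__DATE__", "__TIME__", "__TIMESTAMP__")


def _is_ident_char(ch: str) -> bool:
    return ch.isalnum() or ch == "_"


def find_temporal_macros(text: str) -> List[Tuple[int, str]]:
    """Return (offset, macro) for each temporal macro that stands alone,
    i.e. is not part of a longer identifier."""
    found = []
    for macro in TEMPORAL_MACROS:
        start = text.find(macro)
        while start != -1:
            end = start + len(macro)
            before_ok = start == 0 or not _is_ident_char(text[start - 1])
            after_ok = end == len(text) or not _is_ident_char(text[end])
            if before_ok and after_ok:
                found.append((start, macro))
            start = text.find(macro, start + 1)
    return sorted(found)
```

Here an identifier such as MY__DATE__ is rejected because the character before the match is part of an identifier.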
Joel Rosdahl [Thu, 17 Nov 2022 20:31:58 +0000 (21:31 +0100)]
fix: Avoid race condition in inode cache for quick updates
The inode cache has a race condition that consists of these events:
1. A file is written with content C1, size S and timestamp (ctime/mtime)
T.
2. Ccache hashes the file content and asks the inode cache to store the
digest with a hash of S and T (and some other data) as the key.
3. The file is quickly thereafter written with content C2 without
changing size S and timestamp T. The timestamp is not updated since
the file writes are made within a time interval smaller than the
granularity of the clock used for file system timestamps. At the time
of writing, a common granularity on a Linux system is 0.004 s (250
Hz).
4. The inode cache is asked for the file digest and the inode cache
delivers a digest of C1 even though the file's content is C2.
To avoid the race condition, the inode cache now only caches inodes
whose timestamp was updated more than two seconds ago. This conservative
value is chosen since not all file systems have subsecond resolution.
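The new rule amounts to an age check on the file's timestamps before the digest is stored. A minimal Python sketch, with the hypothetical names may_cache_inode and MIN_AGE_SECONDS, and with the clock value passed in explicitly so the logic is easy to follow:

```python
MIN_AGE_SECONDS = 2.0  # the two-second threshold from the commit message


def may_cache_inode(mtime: float, ctime: float, now: float) -> bool:
    """Return True only if both timestamps are more than MIN_AGE_SECONDS old,
    so a same-timestamp rewrite can no longer go unnoticed."""
    return now - max(mtime, ctime) > MIN_AGE_SECONDS
```

A file written one second ago is therefore hashed again on the next lookup instead of being served from the inode cache.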
Joel Rosdahl [Sat, 5 Nov 2022 12:03:22 +0000 (13:03 +0100)]
feat: Include I_MPI_CC/I_MPI_CXX in the input hash
The I_MPI_CC and I_MPI_CXX variables affect which underlying compiler
ICC uses. Reference:
<https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-windows/top/environment-variable-reference/compilation-environment-variables.html>.
rblx-kbuck [Fri, 28 Oct 2022 18:41:28 +0000 (11:41 -0700)]
fix: Process the argument following a -Xarch argument (#1199)
Since there are already checks enforcing that all -Xarch arguments match
each other and -arch, we can assume that the compiler would also
interpret the following argument, so ccache should interpret it too.