From: Joel Rosdahl Date: Sun, 14 Jan 2024 09:46:54 +0000 (+0100) Subject: fix: Disable caching for modified source/include files X-Git-Tag: v4.9.1~7 X-Git-Url: http://git.ipfire.org/?a=commitdiff_plain;h=d75dbe260dbadf56ae9bd8a86f55ba23a27fd75a;p=thirdparty%2Fccache.git fix: Disable caching for modified source/include files On a cache miss, ccache opts out of the direct mode when a "too new" include file is found. This is done to avoid a race condition in the direct mode when a header file if modified after ccache is invoked but before the compiler is executed. Modifying the main source code file is not a race condition when "run_second_cpp = false", which was the case at the time the direct mode was introduced. However, the default of run_second_cpp was changed to true in ccache 3.3, and the potiential race condition then exists for both the source and the include files. Fix this by disabling caching completely (not only the direct mode) when modification of a source/include file is detected. Closes #1378. (cherry picked from commit 43c3a44aadcb122853ee1e3eb8db00db9d8de68f) --- diff --git a/doc/MANUAL.adoc b/doc/MANUAL.adoc index a21d5ad33..52eb00d2c 100644 --- a/doc/MANUAL.adoc +++ b/doc/MANUAL.adoc @@ -1027,13 +1027,15 @@ preprocessing first. directory in the `.gcno` file. *gcno_cwd* also disables hashing of the current working directory if `-fprofile-abs-path` is used. *include_file_ctime*:: - By default, ccache will disable the direct mode if an include file has too - new ctime. This sloppiness disables that check. See also _<>_. + By default, ccache will disable caching if a source code file has a status + change time (ctime) after the start of the ccache invocation. This + sloppiness disables that check. See also _<>_. *include_file_mtime*:: - By default, ccache will disable the direct mode if an include file has too - new mtime. This sloppiness disables that check. See also _<>_. + By default, ccache will disable caching if a source code file has a + modification time (mtime) after the start of the ccache invocation. This + sloppiness disables that check. See also _<>_. *ivfsoverlay*:: Ignore the Clang compiler option `-ivfsoverlay` and its argument. This is useful if you use Xcode, which uses a virtual file system (VFS) for things @@ -1429,6 +1431,9 @@ Preconditions for using <> were not fulfilled. Preconditions for using <> were not fulfilled. +| Could not read or parse input file | +An input file could not be read or parsed (see the debug log for details). + | Could not write to output file | The output path specified with `-o` could not be written to. @@ -1457,6 +1462,9 @@ Failure reading a file specified by | Forced recache | <> was used to overwrite an existing result. +| Input file modified during compilation | +An input file was modified during compilation. + | Internal error | Unexpected failure, e.g. due to problems reading/writing the cache. @@ -1631,31 +1639,23 @@ The depend mode will be disabled if any of the following holds: `/showIncludes` is added automatically if not specified by the user). -== Handling of newly created header files - -If modification time (mtime) or status change time (ctime) of one of the include -files is equal to (or newer than) the time compilation is being done, ccache -disables the direct mode (or, in the case of a <>, disables caching completely). This done as a safety measure to avoid a -race condition (see below). - -To be able to use a newly created header files in direct mode (or use a newly -precompiled header), either: +== Handling of newly created source files -* create the include file earlier in the build process, or -* set <> to - *include_file_ctime,include_file_mtime* if you are willing to take the risk, - for instance if you know that your build system is robust enough not to - trigger the race condition. +If modification time (mtime) or status change time (ctime) of the source file or +one of the include files is equal to (or newer than) the time that ccache was +invoked, ccache disables caching completely. This is done as a safety measure to +avoid a race condition (see below). In practice, this is only a problem when +using file systems with very low timestamp granularity. You can set +<> to *include_file_ctime,include_file_mtime* to +opt out of the safety measure. For reference, the race condition mentioned above consists of these events: -1. The preprocessor is run. -2. An include file is modified by someone. -3. The new include file is hashed by ccache. -4. The real compiler is run on the preprocessor's output, which contains data - from the old header file. -5. The wrong object file is stored in the cache. +1. A source code file is read by ccache and added to the input hash. +2. The source code file is modified. +3. The compiler is executed and reads the modified source code. +4. Ccache stores the compiler output in the cache associated with the incorrect + key (based on the unmodified source code). == Cache debugging @@ -1767,7 +1767,7 @@ to do some things to make it work properly: works in combination with precompiled headers. * You may also want to include *include_file_mtime,include_file_ctime* in <>. See - _<>_. + _<>_. * You must either: + -- @@ -1932,9 +1932,8 @@ problems and what may be done to increase the hit rate: `-Wp,-MMD,`, and `-Wp,-D`) is used. ** This was the first compilation with a new value of the <>. -** A modification or status change time of one of the include files is too new - (created the same second as the compilation is being done). See - _<>_. +** A modification or status change time of one of the include files is too new . + See _<>_. ** The `+__TIME__+` preprocessor macro is (potentially) being used. Ccache turns off direct mode if `+__TIME__+` is present in the source code. This is done as a safety measure since the string indicates that a `+__TIME__+` macro diff --git a/src/ccache.cpp b/src/ccache.cpp index 43eb92b4a..c97e83c73 100644 --- a/src/ccache.cpp +++ b/src/ccache.cpp @@ -304,54 +304,29 @@ guess_compiler(const fs::path& path) #endif } -static bool -include_file_too_new(const Context& ctx, - const std::string& path, - const DirEntry& dir_entry) -{ - // The comparison using >= is intentional, due to a possible race between - // starting compilation and writing the include file. See also the notes under - // "Performance" in doc/MANUAL.adoc. - if (!(ctx.config.sloppiness().contains(core::Sloppy::include_file_mtime)) - && dir_entry.mtime() >= ctx.time_of_invocation) { - LOG("Include file {} too new", path); - return true; - } - - // The same >= logic as above applies to the change time of the file. - if (!(ctx.config.sloppiness().contains(core::Sloppy::include_file_ctime)) - && dir_entry.ctime() >= ctx.time_of_invocation) { - LOG("Include file {} ctime too new", path); - return true; - } - - return false; -} - -// Returns false if the include file was "too new" and therefore should disable -// the direct mode (or, in the case of a preprocessed header, fall back to just -// running the real compiler), otherwise true. -static bool -do_remember_include_file(Context& ctx, - std::string path, - Hash& cpp_hash, - bool system, - Hash* depend_mode_hash) +// This function hashes an include file and stores the path and hash in +// ctx.included_files. If the include file is a PCH, cpp_hash is also updated. +[[nodiscard]] tl::expected +remember_include_file(Context& ctx, + std::string path, + Hash& cpp_hash, + bool system, + Hash* depend_mode_hash) { if (path.length() >= 2 && path[0] == '<' && path[path.length() - 1] == '>') { // Typically or . - return true; + return {}; } if (path == ctx.args_info.normalized_input_file) { // Don't remember the input file. - return true; + return {}; } if (system - && (ctx.config.sloppiness().contains(core::Sloppy::system_headers))) { + && ctx.config.sloppiness().contains(core::Sloppy::system_headers)) { // Don't remember this system header. - return true; + return {}; } // Canonicalize path for comparison; Clang uses ./header.h. @@ -361,7 +336,7 @@ do_remember_include_file(Context& ctx, if (ctx.included_files.find(path) != ctx.included_files.end()) { // Already known include file. - return true; + return {}; } #ifdef _WIN32 @@ -370,52 +345,35 @@ do_remember_include_file(Context& ctx, DWORD attributes = GetFileAttributes(path.c_str()); if (attributes != INVALID_FILE_ATTRIBUTES && attributes & FILE_ATTRIBUTE_DIRECTORY) { - return true; + return {}; } } #endif DirEntry dir_entry(path, DirEntry::LogOnError::yes); if (!dir_entry.exists()) { - return false; + return tl::unexpected(Statistic::bad_input_file); } if (dir_entry.is_directory()) { // Ignore directory, typically $PWD. - return true; + return {}; } if (!dir_entry.is_regular_file()) { // Device, pipe, socket or other strange creature. LOG("Non-regular include file {}", path); - return false; + return tl::unexpected(Statistic::bad_input_file); } for (const auto& ignore_header_path : ctx.ignore_header_paths) { if (file_path_matches_dir_prefix_or_file(ignore_header_path, path)) { - return true; + return {}; } } - const bool is_pch = is_precompiled_header(path); - const bool too_new = include_file_too_new(ctx, path, dir_entry); - - if (too_new) { - // Opt out of direct mode because of a race condition. - // - // The race condition consists of these events: - // - // - the preprocessor is run - // - an include file is modified by someone - // - the new include file is hashed by ccache - // - the real compiler is run on the preprocessor's output, which contains - // data from the old header file - // - the wrong object file is stored in the cache. - - return false; - } - // Let's hash the include file content. Hash::Digest file_digest; + const bool is_pch = is_precompiled_header(path); if (is_pch) { if (ctx.args_info.included_pch_file.empty()) { LOG("Detected use of precompiled header: {}", path); @@ -433,7 +391,7 @@ do_remember_include_file(Context& ctx, } if (!hash_binary_file(ctx, file_digest, path)) { - return false; + return tl::unexpected(Statistic::bad_input_file); } cpp_hash.hash_delimiter(using_pch_sum ? "pch_sum_hash" : "pch_hash"); cpp_hash.hash(util::format_digest(file_digest)); @@ -442,45 +400,24 @@ do_remember_include_file(Context& ctx, if (ctx.config.direct_mode()) { if (!is_pch) { // else: the file has already been hashed. auto ret = hash_source_code_file(ctx, file_digest, path); - if (ret.contains(HashSourceCode::error) - || ret.contains(HashSourceCode::found_time)) { - return false; + if (ret.contains(HashSourceCode::error)) { + return tl::unexpected(Statistic::bad_input_file); + } + if (ret.contains(HashSourceCode::found_time)) { + LOG_RAW("Disabling direct mode"); + ctx.config.set_direct_mode(false); } } - ctx.included_files.emplace(path, file_digest); - if (depend_mode_hash) { depend_mode_hash->hash_delimiter("include"); depend_mode_hash->hash(util::format_digest(file_digest)); } } - return true; -} - -enum class RememberIncludeFileResult { ok, cannot_use_pch }; - -// This function hashes an include file and stores the path and hash in -// ctx.included_files. If the include file is a PCH, cpp_hash is also updated. -static RememberIncludeFileResult -remember_include_file(Context& ctx, - const std::string& path, - Hash& cpp_hash, - bool system, - Hash* depend_mode_hash) -{ - if (!do_remember_include_file( - ctx, path, cpp_hash, system, depend_mode_hash)) { - if (is_precompiled_header(path)) { - return RememberIncludeFileResult::cannot_use_pch; - } else if (ctx.config.direct_mode()) { - LOG_RAW("Disabling direct mode"); - ctx.config.set_direct_mode(false); - } - } + ctx.included_files.emplace(path, file_digest); - return RememberIncludeFileResult::ok; + return {}; } static void @@ -627,10 +564,7 @@ process_preprocessed_file(Context& ctx, Hash& hash, const std::string& path) hash.hash(inc_path); } - if (remember_include_file(ctx, inc_path, hash, system, nullptr) - == RememberIncludeFileResult::cannot_use_pch) { - return tl::unexpected(Statistic::could_not_use_precompiled_header); - } + TRY(remember_include_file(ctx, inc_path, hash, system, nullptr)); p = q; // Everything of interest between p and q has been hashed now. } else if (strncmp(q, incbin_directive, sizeof(incbin_directive)) == 0 && ((q[7] == ' ' @@ -672,7 +606,7 @@ process_preprocessed_file(Context& ctx, Hash& hash, const std::string& path) std::string pch_path = Util::make_relative_path(ctx, ctx.args_info.included_pch_file); hash.hash(pch_path); - remember_include_file(ctx, pch_path, hash, false, nullptr); + TRY(remember_include_file(ctx, pch_path, hash, false, nullptr)); } bool debug_included = getenv("CCACHE_DEBUG_INCLUDED"); @@ -685,7 +619,7 @@ process_preprocessed_file(Context& ctx, Hash& hash, const std::string& path) // Extract the used includes from the dependency file. Note that we cannot // distinguish system headers from other includes here. -static std::optional +static tl::expected result_key_from_depfile(Context& ctx, Hash& hash) { // Make sure that result hash will always be different from the manifest hash @@ -699,7 +633,7 @@ result_key_from_depfile(Context& ctx, Hash& hash) LOG("Failed to read dependency file {}: {}", ctx.args_info.output_dep, file_content.error()); - return std::nullopt; + return tl::unexpected(Statistic::bad_input_file); } for (std::string_view token : Depfile::tokenize(*file_content)) { @@ -707,7 +641,7 @@ result_key_from_depfile(Context& ctx, Hash& hash) continue; } std::string path = Util::make_relative_path(ctx, token); - remember_include_file(ctx, path, hash, false, &hash); + TRY(remember_include_file(ctx, path, hash, false, &hash)); } // Explicitly check the .gch/.pch/.pth file as it may not be mentioned in the @@ -716,7 +650,7 @@ result_key_from_depfile(Context& ctx, Hash& hash) std::string pch_path = Util::make_relative_path(ctx, ctx.args_info.included_pch_file); hash.hash(pch_path); - remember_include_file(ctx, pch_path, hash, false, nullptr); + TRY(remember_include_file(ctx, pch_path, hash, false, nullptr)); } bool debug_included = getenv("CCACHE_DEBUG_INCLUDED"); @@ -759,14 +693,14 @@ struct DoExecuteResult // Extract the used includes from /showIncludes output in stdout. Note that we // cannot distinguish system headers from other includes here. -static std::optional +static tl::expected result_key_from_includes(Context& ctx, Hash& hash, std::string_view stdout_data) { for (std::string_view include : core::MsvcShowIncludesOutput::get_includes( stdout_data, ctx.config.msvc_dep_prefix())) { const std::string path = Util::make_relative_path( ctx, Util::normalize_abstract_absolute_path(include)); - remember_include_file(ctx, path, hash, false, &hash); + TRY(remember_include_file(ctx, path, hash, false, &hash)); } // Explicitly check the .pch file as it is not mentioned in the @@ -775,7 +709,7 @@ result_key_from_includes(Context& ctx, Hash& hash, std::string_view stdout_data) std::string pch_path = Util::make_relative_path(ctx, ctx.args_info.included_pch_file); hash.hash(pch_path); - remember_include_file(ctx, pch_path, hash, false, nullptr); + TRY(remember_include_file(ctx, pch_path, hash, false, nullptr)); } const bool debug_included = getenv("CCACHE_DEBUG_INCLUDED"); @@ -1072,6 +1006,59 @@ rewrite_stdout_from_compiler(const Context& ctx, util::Bytes&& stdout_data) } } +static std::string +format_epoch_time(const util::TimePoint& tp) +{ + return FMT("{}.{:09}", tp.sec(), tp.nsec_decimal_part()); +} + +static bool +source_file_is_too_new(const Context& ctx, const fs::path& path) +{ + const bool sloppy_ctime = + ctx.config.sloppiness().contains(core::Sloppy::include_file_ctime); + const bool sloppy_mtime = + ctx.config.sloppiness().contains(core::Sloppy::include_file_mtime); + + if ((sloppy_mtime && sloppy_ctime) || util::is_dev_null_path(path)) { + return false; + } + + // It's not enough to check if mtime/ctime >= ctx.time_of_invocation since + // filesystem timestamps are granular. See the comment for + // InodeCache::InodeCache for details. + // + // A relatively small safety margin is used in this case to make things safe + // on common filesystems while also not bailing out when creating a source + // file reasonably close in time before the compilation. + const util::Duration min_age(0, 100'000'000); // 0.1 s + util::TimePoint deadline = ctx.time_of_invocation + min_age; + + DirEntry dir_entry(path); + + if (!sloppy_mtime && dir_entry.mtime() >= deadline) { + LOG( + "{} was modified near or after invocation (mtime {}, invocation time {})", + dir_entry.path(), + format_epoch_time(dir_entry.mtime()), + format_epoch_time(ctx.time_of_invocation)); + return true; + } + + // The same logic as above applies to the change time of the file. + if (!sloppy_ctime && dir_entry.ctime() >= deadline) { + LOG( + "{} had status change near or after invocation (ctime {}, invocation" + " time {})", + dir_entry.path(), + format_epoch_time(dir_entry.ctime()), + format_epoch_time(ctx.time_of_invocation)); + return true; + } + + return false; +} + // Run the real compiler and put the result in cache. Returns the result key. static tl::expected to_cache(Context& ctx, @@ -1178,22 +1165,36 @@ to_cache(Context& ctx, if (ctx.config.depend_mode()) { ASSERT(depend_mode_hash); if (ctx.args_info.generating_dependencies) { - result_key = result_key_from_depfile(ctx, *depend_mode_hash); + auto key = result_key_from_depfile(ctx, *depend_mode_hash); + if (!key) { + return tl::unexpected(key.error()); + } + result_key = *key; } else if (ctx.args_info.generating_includes) { - result_key = result_key_from_includes( + auto key = result_key_from_includes( ctx, *depend_mode_hash, util::to_string_view(result->stdout_data)); + if (!key) { + return tl::unexpected(key.error()); + } + result_key = *key; } else { ASSERT(false); } - if (!result_key) { - return tl::unexpected(Statistic::internal_error); - } LOG_RAW("Got result key from dependency file"); LOG("Result key: {}", util::format_digest(*result_key)); } ASSERT(result_key); + if (source_file_is_too_new(ctx, ctx.args_info.input_file)) { + return tl::unexpected(Statistic::modified_input_file); + } + for (const auto& [path, digest] : ctx.included_files) { + if (source_file_is_too_new(ctx, path)) { + return tl::unexpected(Statistic::modified_input_file); + } + } + if (ctx.args_info.generating_dependencies) { Depfile::make_paths_relative_in_output_dep(ctx); } diff --git a/src/core/Statistic.hpp b/src/core/Statistic.hpp index fb0e37907..22b5fe71d 100644 --- a/src/core/Statistic.hpp +++ b/src/core/Statistic.hpp @@ -1,4 +1,4 @@ -// Copyright (C) 2021-2023 Joel Rosdahl and other contributors +// Copyright (C) 2021-2024 Joel Rosdahl and other contributors // // See doc/AUTHORS.adoc for a complete list of contributors. // @@ -79,7 +79,9 @@ enum class Statistic { subdir_size_kibibyte_base = 65, disabled = 81, - END = 82 + bad_input_file = 82, + modified_input_file = 83, + END = 84 }; } // namespace core diff --git a/src/core/Statistics.cpp b/src/core/Statistics.cpp index aa01b18c4..c2fefba2b 100644 --- a/src/core/Statistics.cpp +++ b/src/core/Statistics.cpp @@ -1,4 +1,4 @@ -// Copyright (C) 2021-2023 Joel Rosdahl and other contributors +// Copyright (C) 2021-2024 Joel Rosdahl and other contributors // // See doc/AUTHORS.adoc for a complete list of contributors. // @@ -76,6 +76,9 @@ const StatisticsField k_statistics_fields[] = { // option argument. FIELD(bad_compiler_arguments, "Bad compiler arguments", FLAG_UNCACHEABLE), + // An input file could not be read or parsed (see the debug log for details). + FIELD(bad_input_file, "Could not read or parse input file", FLAG_ERROR), + // The output path specified with -o could not be written to. FIELD(bad_output_file, "Could not write to output file", FLAG_ERROR), @@ -173,6 +176,10 @@ const StatisticsField k_statistics_fields[] = { // cache while another instance removed the file as part of cache cleanup. FIELD(missing_cache_file, "Missing cache file", FLAG_ERROR), + // An input file was modified during compilation. + FIELD( + modified_input_file, "Input file modified during compilation", FLAG_ERROR), + // The compiler was called to compile multiple source files in one go. This is // not supported by ccache. FIELD(multiple_source_files, "Multiple source files", FLAG_UNCACHEABLE), diff --git a/test/suites/base.bash b/test/suites/base.bash index 17056a4cc..b7eab5369 100644 --- a/test/suites/base.bash +++ b/test/suites/base.bash @@ -281,6 +281,48 @@ fi rm -rf src + # ------------------------------------------------------------------------- + TEST "Too new source file" + + touch new.c + touch -t 203801010000 new.c + + $CCACHE_COMPILE -c new.c + expect_stat modified_input_file 1 + expect_stat cache_miss 0 + + CCACHE_SLOPPINESS="$DEFAULT_SLOPPINESS include_file_mtime" $CCACHE_COMPILE -c new.c + expect_stat modified_input_file 1 + expect_stat cache_miss 1 + + # ------------------------------------------------------------------------- + TEST "Too new include file" + + cat <new.c +#include "new.h" +EOF + cat <new.h +int test; +EOF + touch -t 203801010000 new.h + + $CCACHE_COMPILE -c new.c + expect_stat modified_input_file 1 + expect_stat cache_miss 0 + + CCACHE_SLOPPINESS="$DEFAULT_SLOPPINESS include_file_mtime" $CCACHE_COMPILE -c new.c + expect_stat modified_input_file 1 + expect_stat cache_miss 1 + + # ------------------------------------------------------------------------- + TEST "Too new source file ignored if sloppy" + + touch new.c + touch -t 203801010000 new.c + + CCACHE_SLOPPINESS="$DEFAULT_SLOPPINESS include_file_mtime" $CCACHE_COMPILE -c new.c + expect_stat cache_miss 1 + # ------------------------------------------------------------------------- TEST "LANG" diff --git a/test/suites/direct.bash b/test/suites/direct.bash index aebc37146..77c5ac601 100644 --- a/test/suites/direct.bash +++ b/test/suites/direct.bash @@ -1072,27 +1072,6 @@ EOF expect_stat preprocessed_cache_hit 0 expect_stat cache_miss 1 - # ------------------------------------------------------------------------- - TEST "Too new include file disables direct mode" - - cat <new.c -#include "new.h" -EOF - cat <new.h -int test; -EOF - touch -t 203801010000 new.h - - $CCACHE_COMPILE -c new.c - expect_stat direct_cache_hit 0 - expect_stat preprocessed_cache_hit 0 - expect_stat cache_miss 1 - - $CCACHE_COMPILE -c new.c - expect_stat direct_cache_hit 0 - expect_stat preprocessed_cache_hit 1 - expect_stat cache_miss 1 - # ------------------------------------------------------------------------- TEST "__DATE__ in header file results in direct cache hit as the date remains the same" @@ -1117,27 +1096,6 @@ EOF expect_stat preprocessed_cache_hit 0 expect_stat cache_miss 1 - # ------------------------------------------------------------------------- - TEST "New include file ignored if sloppy" - - cat <new.c -#include "new.h" -EOF - cat <new.h -int test; -EOF - touch -t 203801010000 new.h - - CCACHE_SLOPPINESS="$DEFAULT_SLOPPINESS include_file_mtime" $CCACHE_COMPILE -c new.c - expect_stat direct_cache_hit 0 - expect_stat preprocessed_cache_hit 0 - expect_stat cache_miss 1 - - CCACHE_SLOPPINESS="$DEFAULT_SLOPPINESS include_file_mtime" $CCACHE_COMPILE -c new.c - expect_stat direct_cache_hit 1 - expect_stat preprocessed_cache_hit 0 - expect_stat cache_miss 1 - # ------------------------------------------------------------------------- TEST "Sloppy Clang index store" diff --git a/test/suites/pch.bash b/test/suites/pch.bash index fefaac6a9..bff55095d 100644 --- a/test/suites/pch.bash +++ b/test/suites/pch.bash @@ -631,14 +631,6 @@ fi # ------------------------------------------------------------------------- TEST "Too new PCH file" - # If the precompiled header is too new we shouldn't cache the result at all - # since: - # - # - the precompiled header content must be included in the hash, but - # - we don't trust the precompiled header content so we can't hash it - # ourselves, and - # - the preprocessed output doesn't contain the preprocessed header content. - touch lib.h touch main.c @@ -646,10 +638,7 @@ fi touch -d "@$(($(date +%s) + 60))" lib.h.gch # 1 minute in the future CCACHE_SLOPPINESS="$DEFAULT_SLOPPINESS pch_defines,time_macros" $CCACHE_COMPILE -include lib.h -c main.c - expect_stat direct_cache_hit 0 - expect_stat preprocessed_cache_hit 0 - expect_stat cache_miss 0 - expect_stat could_not_use_precompiled_header 1 + expect_stat modified_input_file 1 } pch_suite_clang() {