Document direct mode

author Joel Rosdahl <joel@rosdahl.net>

Tue, 5 Jan 2010 21:44:09 +0000 (22:44 +0100)

committer Joel Rosdahl <joel@rosdahl.net>

Wed, 6 Jan 2010 21:23:10 +0000 (22:23 +0100)
diff --git a/ccache.yo b/ccache.yo

index f7b42d9a56eea3b74fa02a65de8ea4087a548467..9d76600b2fc7098291cced5e2b3e95bcff8945d8 100644 (file)
--- a/ccache.yo
+++ b/ccache.yo
@@ -165,7 +165,7 @@ temporary files.
  
  dit(bf(CCACHE_CPP2)) If you set the environment variable CCACHE_CPP2
  then ccache will not use the optimisation of avoiding the 2nd call to
-the pre-processor by compiling the pre-processed output that was used
+the preprocessor by compiling the preprocessed output that was used
  for finding the hash in the case of a cache miss. This is primarily a
  debugging option, although it is possible that some unusual compilers
  will have problems with the intermediate filename extensions used in
@@ -231,7 +231,7 @@ slightly slower, but makes copes better with compiler upgrades during
  a build bootstrapping process.
  
  dit(bf(CCACHE_UNIFY)) If you set the environment variable CCACHE_UNIFY
-then ccache will use the C/C++ unifier when hashing the pre-processor
+then ccache will use the C/C++ unifier when hashing the preprocessor
  output if -g is not used in the compile. The unifier is slower than a
  normal hash, so setting this environment variable loses a little bit
  of speed, but it means that ccache can take advantage of not
@@ -244,7 +244,7 @@ compiler warning messages. Enabling the unifier implies turning off
  the direct mode.
  
  dit(bf(CCACHE_EXTENSION)) Normally ccache tries to automatically
-determine the extension to use for intermediate C pre-processor files
+determine the extension to use for intermediate C preprocessor files
  based on the type of file being compiled. Unfortunately this sometimes
  doesn't work, for example when using the aCC compiler on HP-UX. On
  systems like this you can use the CCACHE_EXTENSION option to override
@@ -275,25 +275,68 @@ CCACHE_NOCOMPRESS environment variable.
  manpagesection(HOW IT WORKS)
  
  The basic idea is to detect when you are compiling exactly the same
-code a 2nd time and use the previously compiled output. You detect
-that it is the same code by forming a hash of:
+code a second time and use the previously compiled output. The
+detection is done by hashing different kinds of information that
+should be unique for the compilation and then using the hash sum to
+find the cached compilation output. ccache uses MD4, a very fast
+cryptographic hash algorithm, for the hashing. (MD4 is nowadays too
+weak to be useful in cryptographic contexts, but it should be safe
+enough to be used to identify recompilations.) When the same
+compilation is done a second time, ccache is able to supply the
+correct compiler output (including all warnings, dependency file, etc)
+from the cache.
+
+ccache has two ways of doing the detection:
  
  itemization(
-  it() the pre-processor output from running the compiler with -E
+  it() the direct mode (hashes the source code and include files directly)
+  it() the pre-preprocessor mode (hashes output from the preprocessor)
+)
+
+In the direct mode, a hash is formed of:
+
+itemization(
+  it() the input source file
    it() the command line options
-  it() the real compilers size and modification time
-  it() any stderr output generated by the compiler
+  it() the real compiler's size and modification time
+)
+
+Based on the hash, a data structure called "manifest" is looked up in
+the cache. The manifest contains paths to include files (previously
+read by the compiler), their hash sums and associated files produced
+by the compiler. The current contents of the include files are then
+hashed and compared to the information in the manifest. If there is a
+match, ccache knows the result of the compilation. If there is no
+match (or if the direct mode is disabled), ccache falls back to the
+preprocessor mode.
+
+The direct mode will be disabled if any of the following holds:
+
+itemization(
+  it() the environment variable bf(CCACHE_NODIRECT) is set
+  it() a modification timestamp of any of the include files is too new
+       (needed to avoid a race condition)
+  it() the unifier is enabled (the environment variable
+       bf(CCACHE_UNIFY) is set)
+  it() a compiler option unsupported by the direct mode is used
  )
  
-These are hashed using md4 (a strong hash) and a cache file is formed
-based on that hash result. When the same compilation is done a second
-time ccache is able to supply the correct compiler output (including
-all warnings etc) from the cache.
+In the preprocessor mode, a hash is formed of:
+
+itemization(
+  it() the preprocessor output from running the compiler with bf(-E)
+  it() the command line options except options that affect include
+       files (bf(-I), bf(-include), bf(-D), etc; the theory is that
+       these options will change the preprocessor output if they have
+       any effect at all)
+  it() the real compilers size and modification time
+  it() any stderr output generated by the preprocessor
+)
  
  ccache has been carefully written to always produce exactly the same
  compiler output that you would get without the cache. If you ever
  discover a case where ccache changes the output of your compiler then
-please let me know.
+please let us know.
  
  manpagesection(USING CCACHE WITH DISTCC)
  
@@ -367,7 +410,7 @@ manpagesection(CREDITS)
  Thanks to the following people for their contributions to ccache
  itemization(
   it() Erik Thiele for the original compilercache script
- it() Luciano Rocha for the idea of compiling the pre-processor output
+ it() Luciano Rocha for the idea of compiling the preprocessor output
   to avoid a 2nd cpp pass
   it() Paul Russell for many suggestions and the debian packaging
  )
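
The lookup order that the new HOW IT WORKS text describes (hash the direct-mode inputs, find a manifest, re-hash the include files, otherwise fall back to the preprocessor mode) can be sketched roughly as follows. This is an illustration only, not ccache's code: the function names and the in-memory manifest layout (a dict of entries with "includes" and "result" fields) are hypothetical, and SHA-256 is used in place of MD4 because MD4 is not reliably available in Python's hashlib. The check for include files with too-new timestamps and for unsupported compiler options is left out.

```python
import hashlib
import os


def file_digest(path):
    """Hash a file's contents (SHA-256 stands in for ccache's MD4)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def direct_key(source, options, compiler):
    """Hash the direct-mode inputs: the source file, the command line
    options, and the real compiler's size and modification time."""
    st = os.stat(compiler)
    h = hashlib.sha256()
    h.update(file_digest(source).encode())
    h.update(b"\0")
    h.update("\0".join(options).encode())
    h.update(b"\0")
    h.update(f"{st.st_size}:{st.st_mtime_ns}".encode())
    return h.hexdigest()


def direct_mode_enabled(env):
    """Two of the documented reasons for disabling the direct mode."""
    return "CCACHE_NODIRECT" not in env and "CCACHE_UNIFY" not in env


def direct_lookup(manifests, key):
    """Return the cached result whose recorded include-file hashes all
    match the files' current contents, or None on a miss.

    `manifests` is a hypothetical layout:
    key -> [{"includes": {path: digest}, "result": ...}, ...]
    """
    for entry in manifests.get(key, []):
        includes = entry["includes"]
        if all(os.path.isfile(p) and file_digest(p) == d
               for p, d in includes.items()):
            return entry["result"]
    return None  # caller falls back to the preprocessor mode
```

A caller would first check direct_mode_enabled(os.environ), then try direct_lookup(), and run the preprocessor only when that returns None.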