git.ipfire.org Git - thirdparty/openembedded/openembedded-core-contrib.git/commitdiff
user-manual-metadata.xml: Added "Checksums (Signatures)" section.
author Scott Rifenbark <scott.m.rifenbark@intel.com>
Wed, 29 Jan 2014 18:06:15 +0000 (12:06 -0600)
committer Richard Purdie <richard.purdie@linuxfoundation.org>
Tue, 11 Feb 2014 12:16:37 +0000 (12:16 +0000)
Added this section to the end of the Metadata chapter.

Signed-off-by: Scott Rifenbark <scott.m.rifenbark@intel.com>
doc/user-manual/user-manual-metadata.xml

index d191d3589a529f317ae4d1554562d61ea08c5abb..abfc89d75c44558aafc63736d9fd091e8f34d155 100644 (file)
             deletes all the flags for a variable.
         </para>
     </section>
+
+    <section id='task-checksums-and-setscene'>
+        <title>Task Checksums and Setscene</title>
+
+        <para>
+            This list is a placeholder for content that still needs explanation here.
+            Items should be moved to the appropriate sections below as they are completed.
+            <itemizedlist>
+                <listitem><para><filename>STAMP</filename></para></listitem>
+                <listitem><para><filename>STAMPCLEAN</filename></para></listitem>
+                <listitem><para><filename>BB_STAMP_WHITELIST</filename></para></listitem>
+                <listitem><para><filename>BB_STAMP_POLICY</filename></para></listitem>
+                <listitem><para><filename>BB_HASHCHECK_FUNCTION</filename></para></listitem>
+                <listitem><para><filename>BB_SETSCENE_VERIFY_FUNCTION</filename></para></listitem>
+                <listitem><para><filename>BB_SETSCENE_DEPVALID</filename></para></listitem>
+                <listitem><para><filename>BB_TASKHASH</filename></para></listitem>
+            </itemizedlist>
+        </para>
+
+        <section id='checksums'>
+            <title>Checksums (Signatures)</title>
+
+            <para>
+                BitBake uses checksums (or signatures) along with the setscene
+                mechanism to determine whether a task needs to be run.
+                This section describes the process.
+                To help understand how BitBake does this, the section assumes an
+                OpenEmbedded metadata-based example.
+            </para>
+
+            <para>
+                The setscene code uses a checksum, which is a unique signature of a task's
+                inputs, to determine if a task needs to be run again.
+                Because it is a change in a task's inputs that triggers a rerun, the process
+                needs to detect all the inputs to a given task.
+                For shell tasks, this turns out to be fairly easy because
+                BitBake generates a "run" shell script for each task, and
+                a checksum of that script gives you a good idea of when
+                the task's data changes.
+            </para>
+
+            <para>
+                To complicate the problem, some things should not be included in
+                the checksum.
+                First, there is the actual specific build path of a given task,
+                that is, the working directory.
+                It does not matter if the working directory changes because it should not
+                affect the output for target packages.
+                The simplistic approach for excluding the working directory is to set
+                it to some fixed value and create the checksum for the "run" script.
+            </para>
+
+            <para>
+                Another problem results from the "run" scripts containing functions that
+                might or might not get called.
+                The incremental build solution contains code that figures out dependencies
+                between shell functions.
+                This code is used to prune the "run" scripts down to the minimum set,
+                thereby alleviating this problem and making the "run" scripts much more
+                readable as a bonus.
+            </para>
+
+            <para>
+                So far we have solutions for shell scripts.
+                What about Python tasks?
+                The same approach applies even though these tasks are more difficult.
+                The process needs to figure out what variables a Python function accesses
+                and what functions it calls.
+                Again, the incremental build solution contains code that first figures out
+                the variable and function dependencies, and then creates a checksum for the data
+                used as the input to the task.
+            </para>
+
+            <para>
+                Like the working directory case, situations exist where dependencies
+                should be ignored.
+                For these cases, you can instruct the build process to ignore a dependency
+                by using a line like the following:
+                <literallayout class='monospaced'>
+     PACKAGE_ARCHS[vardepsexclude] = "MACHINE"
+                </literallayout>
+                This example ensures that the <filename>PACKAGE_ARCHS</filename> variable does not
+                depend on the value of <filename>MACHINE</filename>, even if it does reference it.
+            </para>
+
+            <para>
+                Equally, there are cases where we need to add dependencies BitBake
+                is not able to find.
+                You can accomplish this by using a line like the following:
+                <literallayout class='monospaced'>
+     PACKAGE_ARCHS[vardeps] = "MACHINE"
+                </literallayout>
+                This example explicitly adds the <filename>MACHINE</filename> variable as a
+                dependency for <filename>PACKAGE_ARCHS</filename>.
+            </para>
+
+            <para>
+                Consider a case with in-line Python, for example, where BitBake is not
+                able to figure out dependencies.
+                When running in debug mode (i.e. using <filename>-DDD</filename>), BitBake
+                produces output when it discovers something for which it cannot figure out
+                dependencies.
+            </para>
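+
+            <para>
+                As a minimal sketch of such a case, assume three hypothetical
+                variables, <filename>FOO</filename>, <filename>FOO_KEY</filename>,
+                and <filename>FOO_A</filename>, that are not part of any real layer.
+                The in-line Python below looks up a variable whose name is only
+                known once <filename>FOO_KEY</filename> has been evaluated, so BitBake
+                would likely be unable to detect the dependency on
+                <filename>FOO_A</filename> on its own, and the
+                <filename>vardeps</filename> flag declares it explicitly:
+                <literallayout class='monospaced'>
+     # Hypothetical variable names, for illustration only.
+     FOO_KEY = "FOO_A"
+     FOO_A = "some value"
+     FOO = "${@d.getVar(d.getVar('FOO_KEY', True), True)}"
+     FOO[vardeps] = "FOO_A"
+                </literallayout>
+            </para>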
+
+            <para>
+                Thus far, this section has limited discussion to the direct inputs into a task.
+                Information based on direct inputs is referred to as the "basehash" in the
+                code.
+                However, there is still the question of a task's indirect inputs, that is,
+                the things that were already built and present in the build directory.
+                The checksum (or signature) for a particular task needs to add the hashes
+                of all the tasks on which the particular task depends.
+                Choosing which dependencies to add is a policy decision.
+                However, the effect is to generate a master checksum that combines the basehash
+                and the hashes of the task's dependencies.
+            </para>
+
+            <para>
+                At the code level, there are a variety of ways both the basehash and the
+                dependent task hashes can be influenced.
+                Within the BitBake configuration file, we can give BitBake some extra information
+                to help it construct the basehash.
+                The following statement effectively results in a list of global variable
+                dependency excludes, that is, variables never included in any checksum.
+                This example uses variables from OpenEmbedded to help illustrate
+                the concept:
+                <literallayout class='monospaced'>
+     BB_HASHBASE_WHITELIST ?= "TMPDIR FILE PATH PWD BB_TASKHASH BBPATH DL_DIR \
+         SSTATE_DIR THISDIR FILESEXTRAPATHS FILE_DIRNAME HOME LOGNAME SHELL TERM \
+         USER FILESPATH STAGING_DIR_HOST STAGING_DIR_TARGET COREBASE PRSERV_HOST \
+         PRSERV_DUMPDIR PRSERV_DUMPFILE PRSERV_LOCKDOWN PARALLEL_MAKE \
+         CCACHE_DIR EXTERNAL_TOOLCHAIN CCACHE CCACHE_DISABLE LICENSE_PATH SDKPKGSUFFIX"
+                </literallayout>
+                The previous example excludes the working directory, which is part of
+                <filename>TMPDIR</filename>.
+            </para>
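+
+            <para>
+                As a hedged illustration, a configuration file could extend this list
+                with a further variable.
+                The name <filename>MY_CACHE_DIR</filename> below is hypothetical and
+                stands for any variable whose value should never affect a checksum:
+                <literallayout class='monospaced'>
+     # MY_CACHE_DIR is a hypothetical variable, for illustration only.
+     BB_HASHBASE_WHITELIST_append = " MY_CACHE_DIR"
+                </literallayout>
+                The leading space in the value matters because
+                <filename>_append</filename> does not insert one.
+            </para>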
+
+            <para>
+                The rules for deciding which hashes of dependent tasks to include through
+                dependency chains are more complex and are generally accomplished with a
+                Python function.
+                The code in <filename>meta/lib/oe/sstatesig.py</filename> shows two examples
+                of this and also illustrates how you can insert your own policy into the system
+                if so desired.
+                This file defines the two basic signature generators OpenEmbedded Core
+                uses:  "OEBasic" and "OEBasicHash".
+                By default, there is a dummy "noop" signature handler enabled in BitBake.
+                This means that behavior is unchanged from previous versions.
+                <filename>OE-Core</filename> uses the "OEBasicHash" signature handler by default
+                through this setting in the <filename>bitbake.conf</filename> file:
+                <literallayout class='monospaced'>
+     BB_SIGNATURE_HANDLER ?= "OEBasicHash"
+                </literallayout>
+                The "OEBasicHash" <filename>BB_SIGNATURE_HANDLER</filename> is the same as the
+                "OEBasic" version but adds the task hash to the stamp files.
+                As a result, any metadata change that changes the task hash automatically
+                causes the task to be run again.
+                This removes the need to bump
+                <link linkend='var-PR'><filename>PR</filename></link>
+                values, and changes to metadata automatically ripple across the build.
+            </para>
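+
+            <para>
+                Because that default is assigned with <filename>?=</filename>, a plain
+                assignment elsewhere in the configuration takes precedence.
+                As a sketch only, and assuming you wanted the other handler named
+                above, the following line would select it:
+                <literallayout class='monospaced'>
+     # Illustrative override; where it is placed depends on your configuration.
+     BB_SIGNATURE_HANDLER = "OEBasic"
+                </literallayout>
+            </para>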
+
+            <para>
+                It is also worth noting that the end result of these signature generators is to
+                make some dependency and hash information available to the build.
+                This information includes:
+                <itemizedlist>
+                    <listitem><para><filename>BB_BASEHASH_task-&lt;taskname&gt;</filename>:
+                        The base hashes for each task in the recipe.
+                        </para></listitem>
+                    <listitem><para><filename>BB_BASEHASH_&lt;filename:taskname&gt;</filename>:
+                        The base hashes for each dependent task.
+                        </para></listitem>
+                    <listitem><para><filename>BBHASHDEPS_&lt;filename:taskname&gt;</filename>:
+                        The task dependencies for each task.
+                        </para></listitem>
+                    <listitem><para><filename>BB_TASKHASH</filename>:
+                        The hash of the currently running task.
+                        </para></listitem>
+                </itemizedlist>
+            </para>
+        </section>
+    </section>
 </chapter>