..
  Copyright 1988-2022 Free Software Foundation, Inc.
  This is part of the GCC manual.
  For copying conditions, see the copyright.rst file.

.. index:: analyzer, internals, static analyzer, internals

.. _analyzer-internals:

Analyzer Internals
******************

Overview
^^^^^^^^

The analyzer implementation works on the gimple-SSA representation.
(I chose this in the hopes of making it easy to work with LTO to
do whole-program analysis.)

The implementation is read-only: it doesn't attempt to change anything,
only to emit warnings.

The gimple representation can be seen using :option:`-fdump-ipa-analyzer`.

.. tip::

  If the analyzer ICEs before this is written out, one workaround is to use
  :option:`--param=analyzer-bb-explosion-factor=0` to force the analyzer
  to bail out after analyzing the first basic block.

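For example, given a small test file like the following (the file and its
contents are purely illustrative, not taken from the GCC sources), compiling
with :option:`-fanalyzer` and :option:`-fdump-ipa-analyzer` writes the
gimple-SSA seen by the analyzer to a dump file alongside the source:

.. code-block:: c++

  /* test.c: compile with "gcc -fanalyzer -fdump-ipa-analyzer -c test.c".
     The gimple fed to the analyzer is written to a dump file whose name
     ends in ".analyzer".  */
  #include <stdlib.h>

  void *
  make_buffer (size_t len)
  {
    return malloc (len);   /* shows up in SSA form in the dump */
  }
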
First, we build a ``supergraph`` which combines the callgraph and all
of the CFGs into a single directed graph, with both interprocedural and
intraprocedural edges. The nodes and edges in the supergraph are called
'supernodes' and 'superedges', and often referred to in code as
``snodes`` and ``sedges``. Basic blocks in the CFGs are split at
interprocedural calls, so there can be more than one supernode per
basic block. Most statements will be in just one supernode, but a call
statement can appear in two supernodes: at the end of one for the call,
and again at the start of another for the return.

The supergraph can be seen using :option:`-fdump-analyzer-supergraph`.

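As an illustration (invented for this section, not from the GCC sources),
consider a function whose single basic block contains a call in the middle;
in the supergraph that block becomes two supernodes, with the call statement
appearing in both:

.. code-block:: c++

  extern int get_value (void);

  int
  caller (int x)
  {
    int a = x + 1;        /* in the supernode that ends with the call */
    int b = get_value (); /* the call: ends one supernode, starts the next */
    return a + b;         /* in the supernode for the return side */
  }
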
We then build an ``analysis_plan`` which walks the callgraph to
determine which calls might be suitable for being summarized (rather
than fully explored) and thus in what order to explore the functions.

Next is the heart of the analyzer: we use a worklist to explore state
within the supergraph, building an "exploded graph".
Nodes in the exploded graph correspond to <point,state> pairs, as in
"Precise Interprocedural Dataflow Analysis via Graph Reachability"
(Thomas Reps, Susan Horwitz and Mooly Sagiv).

We reuse nodes for <point, state> pairs we've already seen, and avoid
tracking state too closely, so that (hopefully) we rapidly converge
on a final exploded graph, and terminate the analysis. We also bail
out if the number of exploded <end-of-basic-block, state> nodes gets
larger than a particular multiple of the total number of basic blocks
(to ensure termination in the face of pathological state-explosion
cases, or bugs). We also stop exploring a point once we hit a limit
of states for that point.

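The following is a deliberately simplified sketch of this kind of
worklist-driven exploration, with node reuse and a per-point limit on the
number of states. All of the types and names here are invented for
illustration; this is not the GCC implementation:

.. code-block:: c++

  #include <functional>
  #include <iostream>
  #include <map>
  #include <queue>
  #include <set>
  #include <string>
  #include <utility>
  #include <vector>

  /* Toy stand-ins for the analyzer's program points and states.  */
  using point_id = int;
  using state = std::string;
  using enode = std::pair<point_id, state>;

  /* Explore <point, state> pairs with a worklist, reusing nodes we've
     already seen and giving up on a point once it has too many states.  */
  std::set<enode>
  explore (const enode &origin,
           const std::function<std::vector<enode> (const enode &)> &successors,
           unsigned per_point_limit)
  {
    std::set<enode> seen {origin};
    std::map<point_id, unsigned> states_at_point;
    std::queue<enode> worklist;
    worklist.push (origin);

    while (!worklist.empty ())
      {
        enode n = worklist.front ();
        worklist.pop ();
        for (const enode &succ : successors (n))
          {
            if (seen.count (succ))
              continue; /* reuse the existing node for this <point, state> */
            if (++states_at_point[succ.first] > per_point_limit)
              continue; /* limit of states for this point has been hit */
            seen.insert (succ);
            worklist.push (succ);
          }
      }
    return seen;
  }

  int
  main ()
  {
    /* A toy transition function: three points in a row, state unchanged.  */
    auto succs = [] (const enode &n) -> std::vector<enode>
    {
      if (n.first < 2)
        return { { n.first + 1, n.second } };
      return {};
    };
    std::cout << explore ({0, "start"}, succs, 8).size () << " nodes\n";
    return 0;
  }
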
We can identify problems directly when processing a <point,state>
instance. For example, if we're finding the successors of

.. code-block:: c++

  <point: before-stmt: "free (ptr);",
   state: {"ptr": freed}>

then we can detect a double-free of "ptr". We can then emit a path
to reach the problem by finding the simplest route through the graph.

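For instance, a double-free is reported for code like this (a minimal
illustration, not a case taken from the analyzer's testsuite):

.. code-block:: c++

  #include <stdlib.h>

  void
  test (void *ptr)
  {
    free (ptr);  /* "ptr" transitions to the freed state here */
    free (ptr);  /* second free of "ptr": reported as a double-free */
  }
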
Program points in the analysis are much more fine-grained than in the
CFG and supergraph, with points (and thus potentially exploded nodes)
for various events, including before individual statements.
By default the exploded graph merges multiple consecutive statements
in a supernode into one exploded edge to minimize the size of the
exploded graph. This can be suppressed via
:option:`-fanalyzer-fine-grained`.
The fine-grained approach seems to make things simpler and more debuggable
than other approaches I tried, in that each point is responsible for one
thing.

Program points in the analysis also have a "call string" identifying the
stack of callsites below them, so that paths in the exploded graph
correspond to interprocedurally valid paths: we always return to the
correct call site, propagating state information accordingly.
We avoid infinite recursion by stopping the analysis if a callsite
appears more than ``analyzer-max-recursion-depth`` times in a call string
(this parameter defaults to 2).

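A rough sketch of that recursion check (with invented types; the actual
call-string implementation in the analyzer is more involved):

.. code-block:: c++

  #include <algorithm>
  #include <vector>

  /* Toy call string: a stack of call-site identifiers.  */
  using callsite_id = int;
  using call_string = std::vector<callsite_id>;

  /* Return true if pushing CALLSITE would make it appear in the call
     string more often than the recursion limit allows (the analyzer's
     analyzer-max-recursion-depth parameter, defaulting to 2).  */
  bool
  would_exceed_recursion_limit (const call_string &cs, callsite_id callsite,
                                int max_recursion_depth = 2)
  {
    auto count = std::count (cs.begin (), cs.end (), callsite);
    return count + 1 > max_recursion_depth;
  }
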
Graphs
^^^^^^

Nodes and edges in the exploded graph are called 'exploded nodes' and
'exploded edges' and often referred to in the code as
``enodes`` and ``eedges`` (especially when distinguishing them
from the ``snodes`` and ``sedges`` in the supergraph).

Each graph numbers its nodes, giving unique identifiers - supernodes
are referred to throughout dumps in the form :samp:`SN: {index}` and
exploded nodes in the form :samp:`EN: {index}` (e.g. :samp:`SN: 2` and
:samp:`EN: 29`).

The supergraph can be seen using :option:`-fdump-analyzer-supergraph`.

The exploded graph can be seen using :option:`-fdump-analyzer-exploded-graph`
and other dump options. Exploded nodes are color-coded in the .dot output
based on state-machine states to make it easier to see state changes at
a glance.

State Tracking
^^^^^^^^^^^^^^

There's a tension between:

* precision of analysis in the straight-line case, vs

* exponential blow-up in the face of control flow.

For example, in general, given this CFG:

.. code-block::

       A
      / \
     B   C
      \ /
       D
      / \
     E   F
      \ /
       G

we want to avoid differences in state-tracking in B and C from
leading to blow-up. If we don't prevent state blowup, we end up
with exponential growth of the exploded graph like this:

.. code-block::

              1:A
             /   \
            /     \
           /       \
        2:B         3:C
         |           |
        4:D         5:D         (2 exploded nodes for D)
       /   \       /   \
     6:E   7:F   8:E   9:F
      |     |     |     |
    10:G  11:G  12:G  13:G      (4 exploded nodes for G)

Similar issues arise with loops.

To prevent this, we follow various approaches:

* state pruning: which tries to discard state that won't be relevant
  later on within the function.
  This can be disabled via :option:`-fno-analyzer-state-purge`.

* state merging. We can try to find the commonality between two
  program_state instances to make a third, simpler program_state.
  We have two strategies here:

  * the worklist keeps new nodes for the same program_point together,
    and tries to merge them before processing, and thus before they have
    successors. Hence, in the above, the two nodes for D (4 and 5) reach
    the front of the worklist together, and we create a node for D with
    the merger of the incoming states.

  * try merging with the state of existing enodes for the program_point
    (which may have already been explored). There will be duplication,
    but only one set of duplication; subsequent duplicates are more likely
    to hit the cache. In particular, (hopefully) all merger chains are
    finite, and so we guarantee termination.
    This is intended to help with loops: we ought to explore the first
    iteration, and then have a "subsequent iterations" exploration,
    which uses a state merged from that of the first, to be more abstract.

We avoid merging pairs of states that have state-machine differences,
as these are the kinds of differences that are likely to be most
interesting. So, for example, given:

.. code-block::

  if (condition)
    ptr = malloc (size);
  else
    ptr = local_buf;

  .... do things with 'ptr'

  if (condition)
    free (ptr);

  ...etc

then we end up with an exploded graph that looks like this:

.. code-block::

                      if (condition)
                     /      T       \ F
            ---------                ----------
           /                                   \
     ptr = malloc (size)                 ptr = local_buf
          |                                    |
       copy of                              copy of
         "do things with 'ptr'"               "do things with 'ptr'"
       with ptr: heap-allocated             with ptr: stack-allocated
          |                                    |
     if (condition)                      if (condition)
          | known to be T                      | known to be F
     free (ptr);                               |
           \                                  /
            ----------------------------------
                          | ('ptr' is pruned, so states can be merged)
                         etc

where some duplication has occurred, but only for the places where the
different paths are worth exploring separately.

Merging can be disabled via :option:`-fno-analyzer-state-merge`.

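Sketched very roughly (with invented types; the real ``program_state``
merging logic is considerably more involved), this merging policy might
look like:

.. code-block:: c++

  #include <map>
  #include <optional>
  #include <string>

  /* Toy per-variable state: a constant value (if known) plus a
     state-machine state such as "unchecked" or "freed".  */
  struct var_state
  {
    std::optional<int> known_value;
    std::string sm_state;
  };

  using toy_program_state = std::map<std::string, var_state>;

  /* Attempt to merge two states into a simpler third state. Values that
     differ are widened to "unknown"; states with differing state-machine
     states are not merged at all, since those differences are the ones
     most likely to matter for diagnostics.  */
  std::optional<toy_program_state>
  maybe_merge (const toy_program_state &a, const toy_program_state &b)
  {
    toy_program_state merged;
    for (const auto &[name, sa] : a)
      {
        auto it = b.find (name);
        if (it == b.end ())
          continue; /* only keep variables tracked in both states */
        const var_state &sb = it->second;
        if (sa.sm_state != sb.sm_state)
          return std::nullopt; /* refuse to merge state-machine differences */
        var_state m;
        m.sm_state = sa.sm_state;
        if (sa.known_value == sb.known_value)
          m.known_value = sa.known_value; /* agreement: keep the value */
        /* otherwise leave known_value empty, i.e. widen to "unknown" */
        merged[name] = m;
      }
    return merged;
  }
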
Region Model
^^^^^^^^^^^^

Part of the state stored at an ``exploded_node`` is a ``region_model``.
This is an implementation of the region-based ternary model described in
`"A Memory Model for Static Analysis of C Programs" <https://www.researchgate.net/publication/221430855_A_Memory_Model_for_Static_Analysis_of_C_Programs>`_
(Zhongxing Xu, Ted Kremenek, and Jian Zhang).

A ``region_model`` encapsulates a representation of the state of
memory, with a ``store`` recording bindings from ``region``
instances to ``svalue`` instances. The bindings are organized into
clusters, where regions accessible via well-defined pointer arithmetic
are in the same cluster. The representation is graph-like because values
can be pointers to regions. It also stores a ``constraint_manager``,
capturing relationships between the values.

Because each node in the ``exploded_graph`` has a ``region_model``,
and each of the latter is graph-like, the ``exploded_graph`` is in some
ways a graph of graphs.

Here's an example of printing a ``program_state``, showing the
``region_model`` within it, along with state for the ``malloc``
state machine.

.. code-block::

  (gdb) call debug (*this)
  rmodel:
  stack depth: 1
    frame (index 0): frame: ‘test’@1
  clusters within frame: ‘test’@1
    cluster for: ptr_3: &HEAP_ALLOCATED_REGION(12)
  m_called_unknown_fn: FALSE
  constraint_manager:
    equiv classes:
    constraints:
  malloc:
    0x2e89590: &HEAP_ALLOCATED_REGION(12): unchecked ('ptr_3')

This is the state at the point of returning from ``calls_malloc`` back
to ``test`` in the following:

.. code-block:: c++

  void *
  calls_malloc (void)
  {
    void *result = malloc (1024);
    return result;
  }

  void test (void)
  {
    void *ptr = calls_malloc ();
    /* etc. */
  }

Within the store, there is the cluster for ``ptr_3`` within the frame
for ``test``, where the whole cluster is bound to a pointer value,
pointing at ``HEAP_ALLOCATED_REGION(12)``. Additionally, this pointer
has the ``unchecked`` state for the ``malloc`` state machine,
indicating it hasn't yet been checked against NULL since the allocation
call.

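A grossly simplified sketch of such a store (with invented types; the real
``store``, ``region`` and ``svalue`` classes are much richer) might represent
the binding above roughly like this:

.. code-block:: c++

  #include <map>
  #include <string>
  #include <variant>

  /* Toy stand-ins: a "region" is named by a string such as
     "frame 'test'/ptr_3" or "HEAP_ALLOCATED_REGION(12)", and an
     "svalue" is either an integer constant or a pointer to a region.  */
  using region_name = std::string;

  struct toy_svalue
  {
    std::variant<long, region_name> value; /* constant, or points-to */
  };

  /* Each base region owns a cluster of bindings from byte offsets within
     it to symbolic values; values can point back at regions, which is
     what makes the representation graph-like.  */
  using toy_cluster = std::map<long, toy_svalue>;
  using toy_store = std::map<region_name, toy_cluster>;

  int
  main ()
  {
    toy_store store;
    /* Mirror the example above: the cluster for ptr_3 in the frame for
       'test' is bound to a pointer to HEAP_ALLOCATED_REGION(12).  */
    store["frame 'test'/ptr_3"][0]
      = toy_svalue { region_name ("HEAP_ALLOCATED_REGION(12)") };
    return 0;
  }
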
Analyzer Paths
^^^^^^^^^^^^^^

We need to explain to the user what the problem is, and to persuade them
that there really is a problem. Hence having a ``diagnostic_path``
isn't just an incidental detail of the analyzer; it's required.

Paths ought to be:

* interprocedurally-valid

* feasible

Without state-merging, all paths in the exploded graph are feasible
(in terms of constraints being satisfied).
With state-merging, paths in the exploded graph can be infeasible.

We collate warnings and only emit them for the simplest path: e.g. for
a bug in a utility function with lots of routes to calling it, we only
emit the simplest path (which could be intraprocedural, if it can be
reproduced without a caller).

We thus want to find the shortest feasible path through the exploded
graph from the origin to the exploded node at which the diagnostic was
saved. Unfortunately, if we simply find the shortest such path and
check if it's feasible we might falsely reject the diagnostic, as there
might be a longer path that is feasible. Examples include the cases
where the diagnostic requires us to go at least once around a loop for a
later condition to be satisfied, or where for a later condition to be
satisfied we need to enter a suite of code that the simpler path skips.

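As a concrete illustration of the loop case (invented for this section,
not a testsuite case), consider a double-free that only happens on the
second iteration of a loop; any path that exhibits it must go around the
loop, and is therefore longer than the shortest path reaching the ``free``
call:

.. code-block:: c++

  #include <stdlib.h>

  void
  test (void)
  {
    void *ptr = malloc (16);
    for (int i = 0; i < 2; i++)
      free (ptr); /* the double-free only occurs on the second iteration */
  }
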
We attempt to find the shortest feasible path to each diagnostic by
first constructing a 'trimmed graph' from the exploded graph,
containing only those nodes and edges from which there are paths to
the target node, and using Dijkstra's algorithm to order the trimmed
nodes by minimal distance to the target.

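In outline (again with invented types, not the GCC implementation), the
distance-to-target ordering can be computed by running Dijkstra's algorithm
over the reversed edges of the trimmed graph, starting from the target node:

.. code-block:: c++

  #include <functional>
  #include <map>
  #include <queue>
  #include <utility>
  #include <vector>

  /* Toy directed graph, represented by its reversed edges: for each
     node, the list of its predecessors, with unit edge weights.  */
  using node_id = int;
  using rev_graph = std::map<node_id, std::vector<node_id>>;

  /* For every node from which the target is reachable, compute the
     minimal number of edges needed to reach the target. With unit
     weights Dijkstra degenerates into breadth-first search, but the
     same shape of code works for weighted edges.  */
  std::map<node_id, unsigned>
  distances_to_target (const rev_graph &preds_of, node_id target)
  {
    std::map<node_id, unsigned> dist;
    using entry = std::pair<unsigned, node_id>; /* (distance, node) */
    std::priority_queue<entry, std::vector<entry>, std::greater<entry>> pq;
    dist[target] = 0;
    pq.push ({0, target});

    while (!pq.empty ())
      {
        auto [d, n] = pq.top ();
        pq.pop ();
        if (d > dist[n])
          continue; /* stale queue entry */
        auto it = preds_of.find (n);
        if (it == preds_of.end ())
          continue;
        for (node_id pred : it->second)
          {
            unsigned nd = d + 1; /* unit weight per edge */
            auto cur = dist.find (pred);
            if (cur == dist.end () || nd < cur->second)
              {
                dist[pred] = nd;
                pq.push ({nd, pred});
              }
          }
      }
    return dist;
  }
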
We then use a worklist to iteratively build a 'feasible graph'
(actually a tree), capturing the pertinent state along each path, in
which every path to a 'feasible node' is feasible by construction,
restricting ourselves to the trimmed graph to ensure we stay on target,
and ordering the worklist so that the first feasible path we find to the
target node is the shortest possible path. Hence we start by trying the
shortest possible path, but if that fails, we explore progressively
longer paths, eventually trying iterations through loops. The
exploration is captured in the feasible_graph, which can be dumped as a
.dot file via :option:`-fdump-analyzer-feasibility` to visualize the
exploration. The indices of the feasible nodes show the order in which
they were created. We effectively explore the tree of feasible paths in
order of shortest path until we either find a feasible path to the
target node, or hit a limit and give up.

This is something of a brute-force approach, but the trimmed graph
hopefully keeps the complexity manageable.

This algorithm can be disabled (for debugging purposes) via
:option:`-fno-analyzer-feasibility`, which simply uses the shortest path,
and notes if it is infeasible.

The above gives us a shortest feasible ``exploded_path`` through the
``exploded_graph`` (a list of ``exploded_edge *``). We use this
``exploded_path`` to build a ``diagnostic_path`` (a list of
**events** for the diagnostic subsystem) - specifically a
``checker_path``.

Having built the ``checker_path``, we prune it to try to eliminate
events that aren't relevant, to minimize how much the user has to read.

After pruning, we notify each event in the path of its ID and record the
IDs of interesting events, allowing for events to refer to other events
in their descriptions. The ``pending_diagnostic`` class has various
vfuncs to support emitting more precise descriptions, so that e.g.

* a deref-of-unchecked-malloc diagnostic might use:

  .. code-block::

    returning possibly-NULL pointer to 'make_obj' from 'allocator'

  for a ``return_event`` to make it clearer how the unchecked value moves
  from callee back to caller

* a double-free diagnostic might use:

  .. code-block::

    second 'free' here; first 'free' was at (3)

  and a use-after-free might use

  .. code-block::

    use after 'free' here; memory was freed at (2)

At this point we can emit the diagnostic.

Limitations
^^^^^^^^^^^

* Only for C so far

* The implementation of call summaries is currently very simplistic.

* Lack of function pointer analysis

* The constraint-handling code assumes reflexivity in some places
  (that values are equal to themselves), which is not the case for NaN.
  As a simple workaround, constraints on floating-point values are
  currently ignored.

* There are various other limitations in the region model (grep for TODO/xfail
  in the testsuite).

* The constraint_manager's implementation of transitivity is currently too
  expensive to enable by default and so must be manually enabled via
  :option:`-fanalyzer-transitivity`.

* The checkers are currently hardcoded and don't allow for user extensibility
  (e.g. adding allocate/release pairs).

* Although the analyzer's test suite has a proof-of-concept test case for
  LTO, LTO support hasn't had extensive testing. There are various
  lang-specific things in the analyzer that assume C rather than LTO.
  For example, SSA names are printed to the user in 'raw' form, rather
  than printing the underlying variable name.

Some ideas for other checkers:

* File-descriptor-based APIs

* Linux kernel internal APIs

* Signal handling