Skip to content

DebugWarnings

Jorge Navas edited this page Jun 10, 2023 · 10 revisions

Understand the reasons why a static analyzer is imprecise is a hard task. There are many sources of imprecision: property is not expressible in the abstract domain, non-optimal abstract transformers, joins, widenings, etc.

In Clam, there are several things you can try to understand better the root cause of false positives.

Offline analysis of the invariants: semantic slicing at the statement level

Clam provides a script debug_assertion.py that slices a CrabIR program annotated with invariants and creates another CrabIR program containing only the relevant invariants to a particular assertion.

This is achieved by running an inter-procedural dataflow analysis that for each statement s computes the set of facts <i,V> such that there is at least on path from s to assertion i whose evaluation of its condition depends on the set of variables V. This is implemented by the assertion crawler analysis implemented in Crab. This analysis allows us to remove a statement if it cannot affect the evaluation of the assertion of interest.

You need first to run clam.py with these specific options:

clam.py -g prog.c --crab-track=mem --crab-inter --crab-check=assert \
                  --crab-print-invariants=true --crab-print-voi=true \
                  --log=region-print-debug \
                  --ocrab=prog.crabir MORE_OPTS

All the above options passed to clam.py are needed. First, the program must be compiled with debug symbols (-g). Second, we want clam.py to print invariants (--crab-print-invariants) and the set of variables of interest (--crab-print-voi) for each assertion. A variable v is of interest for an assertion s, if the reachability of s depends on the evaluation of v. The computation of set of voi's is done by a dataflow analysis that requires interprocedural analysis (--crab-inter). Finally, this script assumes that invariants are printed in a certain form. The option value --crab-track=mem produces usually the most precise results so that this script only understands the invariants printed by that option value.

Second, you need to open prog.crabir and select which assertion you want to debug. In a *.crabir file, all assertions are printed with a unique identifier and they get the same identifier across different executions. Moreover, it's printed whether the assertion was proven by clam or not. With that information, you can select which assertion identifier was not proven and use the script debug_assertion.py as follows:

debug_assertion.py --assertion=NUM prog.crabir

This script will output another *.crabir file but this time only relevant invariants for that particular assertion will be printed.

TODO: add demo

Interactive analysis

Sometimes an offline analysis is too painful because the program can be too large or Clam prints too many invariants which are hard to digest even after the slicing method explained above. Ideally, we would like to have an interactive debugger (similar to Mopsa) which can allow users to add breakpoints and run the analyzer step-by-step. Unfortunately, we have currently very little support for interactive analysis. There are two things you can do:

Use of __CRAB_intrinsic_print_invariants

One thing you can do is to add special calls in the source program (C or C++) to tell clam to print the invariants projected onto some small subset of variables each time that analyzes that special call. This means that if the special call is placed inside a loop, the invariants will be printed as many times as the number of iterations that Clam needs to compute the fixpoint of the loop.

The special call is __CRAB_intrinsic_print_invariants and it's defined in clam/clam.h. See this program to see how to place calls to __CRAB_intrinsic_print_invariants in a C program.

Use Clam logging facilities

Another way to understand what Clam is doing is by running clam.py with option --crab-verbose=3. This option will print a lot of stuff, including the abstract states before and after each CrabIR statement, joins, and widenings. This information is very useful with small programs but it might not be that useful with larger ones.