-
Notifications
You must be signed in to change notification settings - Fork 31
GhostVariables
Ghost variables are variables that do not appear in the
program. They are created dynamically (i.e., during the analysis) by
abstract domains to express complex properties which typically help to
improve the expressiveness of the analysis. In Crab, memory and array
domains create ghost variables. For instance, memory domains use ghost
variables to represent the offset of a pointer or the size of an
allocated object. Array domains can represent each array cell with a
ghost variable that represents its contents. Ghost variables are also
very useful when low-level properties are modeled involving precise
bitwise reasoning (e.g., to represent the k
least or most
significant bits separately) although Crab does not provide such a
domains at the moment.
However, the implementation of ghost variables in Crab has been tricky
because there is a tradeoff between modularity and
performance. Modularity is a very important property for an abstract
domain. This is the idea that complex abstract domains can be composed
of a stack and/or products of simpler abstract domains without making
any assumptions apart from providing sound abstract
operations. However, when one of the simpler domains operates on a
ghost variable created by another domain, modularity can be
broken. Whenever the domain that created a ghost variable renames it
with another ghost variable (this can happen during binary operations
such as join, meet, etc) the simpler domain that uses that ghost
variable needs to be notified. One way of doing this is by heavily using
the abstract rename
operation which is implemented by all the Crab
abstract domains. At each binary operation, the domain that
creates ghost variables generates a new map from program variables to
their ghost variables and communicate recursively to all its dependent
domains by calling rename
. Unfortunately, this operation can be very
expensive. For instance, environment maps which are implemented by
Patricia trees do not implement this operation efficiently and their
sharing properties can be completely destroyed if too many rename
operations take place.
In Crab, each abstract domain is responsible for its ghost variables. There are currently two different strategies to manage ghost variables:
- Crab variable factory provides a method
get
that returns always the same variable for a given fixed pair of variable and string. With this method, we can create ghost variables from an existing variablev
and a string that defines the role of the ghost variable. For instance,get(v, ".offset")
,get(v, ".size")
, etc create ghost variables to represent the offset of a pointer, the size of an allocated object, etc. However, this solution only works under two assumptions:
a. The roles can be efficiently named. For instance, this might not be the case if we use ghost variables to represent symbolically array slices and these slices are determined by complex linear constraints which cannot be efficiently extracted.
b. This solution is sound only if the variable v
is a program
variable or if v
is another ghost variable then it cannot be
changed during the analysis.
Currently, Crab memory domains use this approach because they never act as subdomains or reduced product with the other domains. Note that this might change in the future and modularity in that case should be revisited.
- Each domain that uses ghost variables creates an internal map from program variables to its ghost variables
and use abstract
rename
operation to communicate to other domains changes in the ghost variables during binary operations (join, meet, widening, etc). This is an expensive approach as described above and therefore, it should be used when #1 is not an option.
- Sharing Ghost Variables in a Collection of Abstract Domains by Marc Chevalier and Jerome Feret published in VMCAI 2020. In this work, authors address exactly the same problem described above and provide a clean and elegant way to deal with ghost variables by defining the Reduced Product with Ghost Variables. Authors reported 3x slowdown wrt to a baseline implementation that provided support for ghost variables in a hacky way.