We present an algorithm for extracting control flow graphs from Java bytecode that captures normal as well as exceptional control flow. We prove its correctness, in the sense that the behaviour of the extracted control flow graph is a sound over-approximation of the behaviour of the original program. This makes control flow graphs suitable for performing various static analyses, such as model checking of temporal safety properties. Analysing exceptional control flow for Java bytecode is difficult because of the stack-based nature of the language. We therefore develop the extraction in two stages. In the first, we abstract away from the complications arising from exceptional flows, and relativize the extraction on an oracle that is able to look into the stack and predict the exceptions that can be raised at each instruction. This idealized algorithm provides a specification for concrete extraction algorithms, which have to provide a suitable implementation for the oracle. We prove correctness of the idealized algorithm by means of behavioural simulation. In the second stage, we develop a concrete extraction algorithm that consists of two phases. In the first phase, the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control flow graph is extracted in the second phase. We use this intermediate format because it provides the information needed to implement the oracle, and since it gives rise to more compact graphs. We show that the behaviour of the control flow graph extracted via the intermediate representation is a sound over-approximation of the behaviour of the graph extracted by the direct, idealized algorithm, and thus of the original program. The concrete extraction algorithm is implemented as the ConFlEx tool. A number of test cases are performed to evaluate the efficiency of the algorithm.
|Number of pages
|International journal on software tools for technology transfer
|Published - Nov 2016
- 2023 OA procedure