Provably correct control flow graphs from Java bytecode programs with exceptions

A. Amighi, Pedro de Carvalho Gomes, Dilian Gurov, Marieke Huisman

Research output: Contribution to journalArticle

  • 1 Citations

Abstract

We present an algorithm for extracting control flow graphs from Java bytecode that captures normal as well as exceptional control flow. We prove its correctness, in the sense that the behaviour of the extracted control flow graph is a sound over-approximation of the behaviour of the original program. This makes control flow graphs suitable for performing various static analyses, such as model checking of temporal safety properties. Analysing exceptional control flow for Java bytecode is difficult because of the stack-based nature of the language. We therefore develop the extraction in two stages. In the first, we abstract away from the complications arising from exceptional flows, and relativize the extraction on an oracle that is able to look into the stack and predict the exceptions that can be raised at each instruction. This idealized algorithm provides a specification for concrete extraction algorithms, which have to provide a suitable implementation for the oracle. We prove correctness of the idealized algorithm by means of behavioural simulation. In the second stage, we develop a concrete extraction algorithm that consists of two phases. In the first phase, the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control flow graph is extracted in the second phase. We use this intermediate format because it provides the information needed to implement the oracle, and since it gives rise to more compact graphs. We show that the behaviour of the control flow graph extracted via the intermediate representation is a sound over-approximation of the behaviour of the graph extracted by the direct, idealized algorithm, and thus of the original program. The concrete extraction algorithm is implemented as the ConFlEx tool. A number of test cases are performed to evaluate the efficiency of the algorithm.
LanguageUndefined
Pages653-684
Number of pages32
JournalInternational journal on software tools for technology transfer
Volume18
Issue number6
DOIs
StatePublished - Nov 2016

Keywords

  • IR-104119
  • EWI-27673

Cite this

@article{5d9f991c94d0436f92a49250cb63e4e6,
title = "Provably correct control flow graphs from Java bytecode programs with exceptions",
abstract = "We present an algorithm for extracting control flow graphs from Java bytecode that captures normal as well as exceptional control flow. We prove its correctness, in the sense that the behaviour of the extracted control flow graph is a sound over-approximation of the behaviour of the original program. This makes control flow graphs suitable for performing various static analyses, such as model checking of temporal safety properties. Analysing exceptional control flow for Java bytecode is difficult because of the stack-based nature of the language. We therefore develop the extraction in two stages. In the first, we abstract away from the complications arising from exceptional flows, and relativize the extraction on an oracle that is able to look into the stack and predict the exceptions that can be raised at each instruction. This idealized algorithm provides a specification for concrete extraction algorithms, which have to provide a suitable implementation for the oracle. We prove correctness of the idealized algorithm by means of behavioural simulation. In the second stage, we develop a concrete extraction algorithm that consists of two phases. In the first phase, the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control flow graph is extracted in the second phase. We use this intermediate format because it provides the information needed to implement the oracle, and since it gives rise to more compact graphs. We show that the behaviour of the control flow graph extracted via the intermediate representation is a sound over-approximation of the behaviour of the graph extracted by the direct, idealized algorithm, and thus of the original program. The concrete extraction algorithm is implemented as the ConFlEx tool. A number of test cases are performed to evaluate the efficiency of the algorithm.",
keywords = "IR-104119, EWI-27673",
author = "A. Amighi and {de Carvalho Gomes}, Pedro and Dilian Gurov and Marieke Huisman",
year = "2016",
month = "11",
doi = "10.1007/s10009-015-0375-0",
language = "Undefined",
volume = "18",
pages = "653--684",
journal = "International journal on software tools for technology transfer",
issn = "1433-2779",
publisher = "Springer Berlin Heidelberg",
number = "6",

}

Provably correct control flow graphs from Java bytecode programs with exceptions. / Amighi, A.; de Carvalho Gomes, Pedro; Gurov, Dilian; Huisman, Marieke.

In: International journal on software tools for technology transfer, Vol. 18, No. 6, 11.2016, p. 653-684.

Research output: Contribution to journalArticle

TY - JOUR

T1 - Provably correct control flow graphs from Java bytecode programs with exceptions

AU - Amighi,A.

AU - de Carvalho Gomes,Pedro

AU - Gurov,Dilian

AU - Huisman,Marieke

PY - 2016/11

Y1 - 2016/11

N2 - We present an algorithm for extracting control flow graphs from Java bytecode that captures normal as well as exceptional control flow. We prove its correctness, in the sense that the behaviour of the extracted control flow graph is a sound over-approximation of the behaviour of the original program. This makes control flow graphs suitable for performing various static analyses, such as model checking of temporal safety properties. Analysing exceptional control flow for Java bytecode is difficult because of the stack-based nature of the language. We therefore develop the extraction in two stages. In the first, we abstract away from the complications arising from exceptional flows, and relativize the extraction on an oracle that is able to look into the stack and predict the exceptions that can be raised at each instruction. This idealized algorithm provides a specification for concrete extraction algorithms, which have to provide a suitable implementation for the oracle. We prove correctness of the idealized algorithm by means of behavioural simulation. In the second stage, we develop a concrete extraction algorithm that consists of two phases. In the first phase, the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control flow graph is extracted in the second phase. We use this intermediate format because it provides the information needed to implement the oracle, and since it gives rise to more compact graphs. We show that the behaviour of the control flow graph extracted via the intermediate representation is a sound over-approximation of the behaviour of the graph extracted by the direct, idealized algorithm, and thus of the original program. The concrete extraction algorithm is implemented as the ConFlEx tool. A number of test cases are performed to evaluate the efficiency of the algorithm.

AB - We present an algorithm for extracting control flow graphs from Java bytecode that captures normal as well as exceptional control flow. We prove its correctness, in the sense that the behaviour of the extracted control flow graph is a sound over-approximation of the behaviour of the original program. This makes control flow graphs suitable for performing various static analyses, such as model checking of temporal safety properties. Analysing exceptional control flow for Java bytecode is difficult because of the stack-based nature of the language. We therefore develop the extraction in two stages. In the first, we abstract away from the complications arising from exceptional flows, and relativize the extraction on an oracle that is able to look into the stack and predict the exceptions that can be raised at each instruction. This idealized algorithm provides a specification for concrete extraction algorithms, which have to provide a suitable implementation for the oracle. We prove correctness of the idealized algorithm by means of behavioural simulation. In the second stage, we develop a concrete extraction algorithm that consists of two phases. In the first phase, the program is transformed into a BIR program, a stack-less intermediate representation of Java bytecode, from which the control flow graph is extracted in the second phase. We use this intermediate format because it provides the information needed to implement the oracle, and since it gives rise to more compact graphs. We show that the behaviour of the control flow graph extracted via the intermediate representation is a sound over-approximation of the behaviour of the graph extracted by the direct, idealized algorithm, and thus of the original program. The concrete extraction algorithm is implemented as the ConFlEx tool. A number of test cases are performed to evaluate the efficiency of the algorithm.

KW - IR-104119

KW - EWI-27673

U2 - 10.1007/s10009-015-0375-0

DO - 10.1007/s10009-015-0375-0

M3 - Article

VL - 18

SP - 653

EP - 684

JO - International journal on software tools for technology transfer

T2 - International journal on software tools for technology transfer

JF - International journal on software tools for technology transfer

SN - 1433-2779

IS - 6

ER -