TY - JOUR
T1 - On hypothesis testing for statistical model checking
AU - Reijsbergen, D.P.
AU - de Boer, Pieter-Tjerk
AU - Scheinhardt, Willem R.W.
AU - Haverkort, Boudewijn R.H.M.
N1 - eemcs-eprint-26168
PY - 2015/8
Y1 - 2015/8
N2 - Hypothesis testing is an important part of statistical model checking (SMC). It is typically used to verify statements of the form $p>p_0$ or $p<p_0$, where $p$ is an unknown probability intrinsic to the system model and $p_0$ is a given threshold value. Many techniques for this have been introduced in the SMC literature. We give a comprehensive overview and comparison of these techniques, starting by introducing a framework in which they can all be described. We distinguish between three classes of techniques, differing in what type of output correctness guarantees they give when the true $p$ is very close to the threshold $p_0$. For each technique, we show how to parametrise it in terms of quantities that are meaningful to the user. Having parametrised them consistently, we graphically compare the boundaries of their decision thresholds, and numerically compare the correctness, power and efficiency of the tests. A companion website allows users to get more insight in the properties of the tests by interactively manipulating the parameters.
AB - Hypothesis testing is an important part of statistical model checking (SMC). It is typically used to verify statements of the form $p>p_0$ or $p<p_0$, where $p$ is an unknown probability intrinsic to the system model and $p_0$ is a given threshold value. Many techniques for this have been introduced in the SMC literature. We give a comprehensive overview and comparison of these techniques, starting by introducing a framework in which they can all be described. We distinguish between three classes of techniques, differing in what type of output correctness guarantees they give when the true $p$ is very close to the threshold $p_0$. For each technique, we show how to parametrise it in terms of quantities that are meaningful to the user. Having parametrised them consistently, we graphically compare the boundaries of their decision thresholds, and numerically compare the correctness, power and efficiency of the tests. A companion website allows users to get more insight in the properties of the tests by interactively manipulating the parameters.
KW - EWI-26168
KW - IR-96736
KW - METIS-312683
U2 - 10.1007/s10009-014-0350-1
DO - 10.1007/s10009-014-0350-1
M3 - Article
VL - 17
SP - 377
EP - 395
JO - International journal on software tools for technology transfer
JF - International journal on software tools for technology transfer
SN - 1433-2779
IS - 4
ER -