Increasing System Availability with Local Recovery based on Fault Localization

Hasan Sözer, Rui Abreu, Mehmet Aksit, Arjan J.C. van Gemund

    Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > Academic > peer-review

    2 Citations (Scopus)
    57 Downloads (Pure)

    Abstract

    Because software systems cannot be tested exhaustively, they must cope with residual defects at run-time. Local recovery is an approach to error recovery in which only the defective parts of the system are recovered while the other parts are kept operational. To be efficient, local recovery must know which component is at fault. In this paper, we combine a fault localization technique (spectrum-based fault localization, SFL) with local recovery techniques to achieve fully autonomous fault detection, isolation, and recovery. A framework is used for decomposing the system into separate units that can be recovered in isolation, while SFL is used for monitoring the activities of these units and diagnosing the faulty one whenever an error is detected. We have applied our approach to MPlayer, a large open-source software system. We have observed that SFL can increase the system availability by 23.4% on average.
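
    As an illustration of the diagnosis step described in the abstract, the following minimal sketch ranks recoverable units by suspiciousness from an activity spectrum. The abstract does not say which similarity coefficient is used; the Ochiai coefficient, the function name `ochiai_ranking`, the unit names, and the example spectrum are all assumptions made for illustration, not the authors' implementation or data.

```python
from math import sqrt


def ochiai_ranking(spectrum, errors, units):
    """Rank units by Ochiai suspiciousness (an assumed coefficient).

    spectrum[i][j] is True if unit j was active during run i;
    errors[i] is True if an error was detected in run i.
    """
    scores = {}
    for j, unit in enumerate(units):
        # n11: active in a failing run; n10: active in a passing run;
        # n01: inactive in a failing run.
        n11 = sum(1 for row, e in zip(spectrum, errors) if row[j] and e)
        n10 = sum(1 for row, e in zip(spectrum, errors) if row[j] and not e)
        n01 = sum(1 for row, e in zip(spectrum, errors) if not row[j] and e)
        denom = sqrt((n11 + n01) * (n11 + n10))
        scores[unit] = n11 / denom if denom else 0.0
    # Most suspicious unit first; it is the candidate for local recovery.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical unit names and activity data (not taken from the paper).
units = ["gui", "demuxer", "audio"]
spectrum = [
    [True, True, False],
    [True, False, True],
    [True, True, True],
    [False, True, False],
]
errors = [False, True, True, False]

ranking = ochiai_ranking(spectrum, errors, units)
```

    In this made-up spectrum the hypothetical "audio" unit is active in every failing run and in no passing run, so it is ranked first and would be the unit selected for recovery while the others keep running.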
    Original language: Undefined
    Title of host publication: Proceedings of the 10th International Conference on Quality Software, QSIC 2010
    Place of Publication: USA
    Publisher: IEEE Computer Society
    Pages: 276-281
    Number of pages: 6
    ISBN (Print): 978-1-4244-8078-4
    DOIs: https://doi.org/10.1109/QSIC.2010.29
    Publication status: Published - Jul 2010
    Event: 10th International Conference on Quality Software, QSIC 2010 - Zhangjiajie, China
    Duration: 14 Jul 2010 to 15 Jul 2010

    Publication series

    Name
    Publisher: IEEE Computer Society
    ISSN (Print): 1550-6002

    Conference

    Conference: 10th International Conference on Quality Software, QSIC 2010
    Country: China
    City: Zhangjiajie
    Period: 14/07/10 to 15/07/10

    Keywords

    • IR-75776
    • METIS-276761
    • fault localization
    • EWI-19408
    • Fault Tolerance
    • Availability
    • Recovery

    Cite this

    Sözer, H., Abreu, R., Aksit, M., & van Gemund, A. J. C. (2010). Increasing System Availability with Local Recovery based on Fault Localization. In Proceedings of the 10th International Conference on Quality Software, QSIC 2010 (pp. 276-281). USA: IEEE Computer Society. https://doi.org/10.1109/QSIC.2010.29
    @inproceedings{bf7ad6f1f8f34352b11020798027d1de,
    title = "Increasing System Availability with Local Recovery based on Fault Localization",
    keywords = "IR-75776, METIS-276761, fault localization, EWI-19408, Fault Tolerance, Availability, Recovery",
    author = "Hasan S{\"o}zer and Rui Abreu and Mehmet Aksit and {van Gemund}, {Arjan J.C.}",
    note = "10.1109/QSIC.2010.29",
    year = "2010",
    month = "7",
    doi = "10.1109/QSIC.2010.29",
    language = "Undefined",
    isbn = "978-1-4244-8078-4",
    publisher = "IEEE Computer Society",
    pages = "276--281",
    booktitle = "Proceedings of the 10th International Conference on Quality Software, QSIC 2010",
    address = "United States",

    }
