Towards the Online Reconfiguration of a Dependable Distributed On-Board Computer

Glen te Hofsté*, Andreas Lund, Marco Ottavi, Daniel Lüdtke

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

On-board computers (OBCs) are at the centre of space-faring systems. They provide computational performance to the system with high availability and dependability. However, these systems typically consist of expensive, slow, fault-tolerant hardware to cope with errors or failures during a mission. Commercial-off-the-shelf (COTS) components offer higher performance but do not provide these fault-tolerance mechanisms. The ScOSA (Scalable On-board Computing for Space Avionics) architecture uses COTS and rad-hard components as a distributed system, with the advantage of providing more computing performance than current OBCs while maintaining the dependability properties. ScOSA uses a middleware to manage the COTS components as a distributed system of nodes. In the event of a node failure, the middleware mitigates the effects by switching the system to a pre-determined configuration that excludes the failed node. These configurations are computed offline, and their memory usage grows exponentially with the number of nodes in the system, which limits the system’s scalability. This paper presents an online reconfiguration algorithm as a solution to this scalability problem. Upon the occurrence of a node failure event, the online algorithm makes scheduling decisions at run-time, eliminating the need for pre-determined configurations. A novel online scheduling mechanism, consisting of six phases, which includes a combination of fault-tolerance, parallelism, and the use of the real-time state of the system, is a step towards higher dependability in distributed on-board computing. The online reconfiguration is evaluated by comparing it to the offline reconfiguration in terms of time and network traffic, showing that it is not only capable of generating configurations dynamically but also provides a solution to the scalability problem.
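The scalability argument in the abstract can be illustrated with a minimal sketch. It rests on two assumptions that are not taken from the paper: that the offline approach stores one configuration per non-empty set of surviving nodes, and that an online step can be approximated by moving the failed node's tasks to the least-loaded survivors. The function names `offline_config_count` and `reassign_on_failure` are hypothetical, and the greedy reassignment is a stand-in, not the six-phase mechanism the paper describes.

```python
from collections import Counter


def offline_config_count(num_nodes: int) -> int:
    """Number of stored configurations, ASSUMING one per non-empty
    subset of surviving nodes (an illustration, not the paper's figure)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    return 2 ** num_nodes - 1


def reassign_on_failure(assignment: dict[str, str],
                        failed_node: str,
                        nodes: list[str]) -> dict[str, str]:
    """Toy online step: move each task of the failed node to the
    currently least-loaded surviving node (hypothetical greedy policy)."""
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise RuntimeError("no surviving node left")

    # Count the tasks that already run on each surviving node.
    load = Counter({n: 0 for n in survivors})
    for node in assignment.values():
        if node != failed_node:
            load[node] += 1

    new_assignment = dict(assignment)
    orphaned = sorted(t for t, n in assignment.items() if n == failed_node)
    for task in orphaned:
        # Ties are broken by node name to keep the result deterministic.
        target = min(survivors, key=lambda n: (load[n], n))
        new_assignment[task] = target
        load[target] += 1
    return new_assignment


if __name__ == "__main__":
    for n in (4, 8, 16):
        print(n, "nodes ->", offline_config_count(n), "stored configurations")
    tasks = {"a": "n1", "b": "n2", "c": "n2", "d": "n3"}
    print(reassign_on_failure(tasks, "n2", ["n1", "n2", "n3"]))
```

Under these assumptions the stored table roughly doubles with every added node, whereas the online step only needs the current task assignment and the list of live nodes.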

Original language: English
Title of host publication: Architecture of Computing Systems - 37th International Conference, ARCS 2024, Proceedings
Editors: Dietmar Fey, Benno Stabernack, Stefan Lankes, Mathias Pacher, Thilo Pionteck
Publisher: Springer
Pages: 127-141
Number of pages: 15
ISBN (Electronic): 978-3-031-66146-4
ISBN (Print): 978-3-031-66145-7
DOIs
Publication status: Published - 1 Aug 2024
Event: 37th International Conference on Architecture of Computing Systems, ARCS 2024 - Potsdam, Germany
Duration: 14 May 2024 to 16 May 2024
Conference number: 37

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14842 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 37th International Conference on Architecture of Computing Systems, ARCS 2024
Abbreviated title: ARCS 2024
Country/Territory: Germany
City: Potsdam
Period: 14/05/24 to 16/05/24

Keywords

  • 2024 OA procedure
  • Distributed Systems
  • Embedded Systems
  • Fault-Tolerance
  • Middleware
  • On-board Computers
  • Reconfiguration
  • Self-Configuration
  • Self-Healing
  • Dependability
