Abstract
On-board computers (OBCs) are at the centre of space-faring systems. They provide computational performance to the system with high availability and dependability. However, these systems typically consist of expensive, slow, fault-tolerant hardware to cope with errors or failures during a mission. Commercial-off-the-shelf (COTS) components offer higher performance but do not provide these fault-tolerance mechanisms. The ScOSA (Scalable On-board Computing for Space Avionics) architecture combines COTS and rad-hard components in a distributed system, providing more computing performance than current OBCs while maintaining the dependability properties. ScOSA uses a middleware to manage the COTS components as a distributed system of nodes. In the event of a node failure, the middleware mitigates the effects by switching the system to a pre-determined configuration that excludes the failed node. These configurations are computed offline, and their memory usage grows exponentially with the number of nodes in the system, which limits the system’s scalability. This paper presents an online reconfiguration algorithm as a solution to this scalability problem. When a node fails, the online algorithm makes scheduling decisions at run-time, eliminating the need for pre-determined configurations. The novel online scheduling mechanism consists of six phases and combines fault tolerance, parallelism, and the use of the real-time state of the system; it is a step towards higher dependability in distributed on-board computing. The online reconfiguration is evaluated against the offline reconfiguration in terms of time and network traffic, showing that it is not only capable of generating configurations dynamically but also provides a solution to the scalability problem.
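The contrast between offline and online reconfiguration described in the abstract can be illustrated with a minimal sketch. It is not the paper's six-phase mechanism: the function names, the assumption that an offline scheme stores one configuration per non-empty set of surviving nodes, and the greedy least-loaded reassignment are all hypothetical choices made only for illustration.

```python
from typing import Dict, List


def offline_config_count(num_nodes: int) -> int:
    """Configurations to store offline, ASSUMING one per non-empty set of
    surviving nodes (an illustrative assumption, not a figure from the paper)."""
    return 2 ** num_nodes - 1


def reconfigure_online(assignment: Dict[str, str],
                       failed_node: str,
                       nodes: List[str]) -> Dict[str, str]:
    """Hypothetical greedy stand-in for an online reconfiguration:
    move each task of the failed node to the least-loaded surviving node."""
    survivors = [n for n in nodes if n != failed_node]
    if not survivors:
        raise ValueError("no surviving nodes left to host tasks")

    # Current number of tasks on each surviving node.
    load = {n: 0 for n in survivors}
    for node in assignment.values():
        if node in load:
            load[node] += 1

    new_assignment = {}
    for task, node in sorted(assignment.items()):
        if node == failed_node:
            # Ties are broken by node name to keep the result deterministic.
            target = min(survivors, key=lambda n: (load[n], n))
            load[target] += 1
            new_assignment[task] = target
        else:
            new_assignment[task] = node
    return new_assignment


if __name__ == "__main__":
    tasks = {"t1": "A", "t2": "A", "t3": "B", "t4": "C"}
    print(offline_config_count(10))
    print(reconfigure_online(tasks, "A", ["A", "B", "C"]))
```

Under these assumptions, the stored configuration count roughly doubles with each added node, whereas the online variant keeps only the current task assignment and computes a new one when a failure is reported.
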
Original language | English |
---|---|
Title of host publication | Architecture of Computing Systems - 37th International Conference, ARCS 2024, Proceedings |
Editors | Dietmar Fey, Benno Stabernack, Stefan Lankes, Mathias Pacher, Thilo Pionteck |
Publisher | Springer |
Pages | 127-141 |
Number of pages | 15 |
ISBN (Electronic) | 978-3-031-66146-4 |
ISBN (Print) | 978-3-031-66145-7 |
Publication status | Published - 1 Aug 2024 |
Event | 37th International Conference on Architecture of Computing Systems, ARCS 2024 - Potsdam, Germany; Duration: 14 May 2024 → 16 May 2024; Conference number: 37 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 14842 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |
Conference
Conference | 37th International Conference on Architecture of Computing Systems, ARCS 2024 |
---|---|
Abbreviated title | ARCS 2024 |
Country/Territory | Germany |
City | Potsdam |
Period | 14/05/24 → 16/05/24 |
Keywords
- 2024 OA procedure
- Distributed Systems
- Embedded Systems
- Fault-Tolerance
- Middleware
- On-board Computers
- Reconfiguration
- Self-Configuration
- Self-Healing
- Dependability