Analysis and optimization techniques for real-time streaming image processing software on general purpose systems

Mark Westmijze

Research output: Thesis › PhD Thesis - Research UT, graduation UT › Academic


Abstract

Commercial Off The Shelf (COTS) Chip Multi-Processor (CMP) systems are, for cost reasons, often used in industry for soft real-time stream processing. COTS CMP systems typically have low timing predictability, which makes it difficult to develop software applications with tight temporal requirements for these systems. Restricting the way applications use the hardware and Operating System (OS) might alleviate this difficulty, so that certain types of applications could be run on COTS CMP systems with statistically verified temporal requirements. In this thesis we restrict the application domain to soft real-time medical image processing applications, which have a much more ‘stable’ usage of hardware resources than applications in general. Techniques at the application level are employed to improve the reproducibility (i.e. to reduce the variance) of the end-to-end latency of these image processing systems.

Firstly, we study the effectiveness of a number of scheduling heuristics intended to improve the reproducibility of a stream processing application executed on COTS multiprocessor systems. Experiments show that the proposed heuristics can reduce the end-to-end latency by almost 60% and the variation in the latency by more than 90%, compared with a naive scheduling heuristic that does not consider execution times, dependencies and the memory hierarchy.
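
As a rough illustration of how such reductions can be quantified, the sketch below compares two sets of end-to-end latency samples. It assumes that "variation" is measured as the variance, in line with the definition of reproducibility above; the function names and the sample values are hypothetical and are not taken from the thesis.

```python
from statistics import mean, pvariance


def percent_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


def compare_latencies(naive_ms, tuned_ms):
    """Compare mean and variance of two sets of latency samples (milliseconds)."""
    return {
        "mean_reduction_pct": percent_reduction(mean(naive_ms), mean(tuned_ms)),
        "variance_reduction_pct": percent_reduction(
            pvariance(naive_ms), pvariance(tuned_ms)
        ),
    }


# Made-up samples, for illustration only (not measurements from the thesis).
naive_ms = [100.0, 120.0, 80.0, 100.0]
tuned_ms = [50.0, 52.0, 48.0, 50.0]
result = compare_latencies(naive_ms, tuned_ms)
print(result)
```

For these made-up samples the mean latency drops by 50% and the variance by 99%, which shows that the two figures are independent: a heuristic can tighten the latency distribution far more than it shortens the average.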

Secondly, we want to be able to integrate multiple real-time and best-effort applications on a single COTS CMP system without reducing the reproducibility of the real-time application too much. To this end we examined the first component that is shared between applications running on separate cores: the shared cache, and in particular its bandwidth. We propose a technique that implements cache bandwidth reservation in software. This is achieved by dynamically duty-cycling best-effort applications based on their cache bandwidth usage, measured with processor performance counters. With this technique we can control the latency increase of real-time applications that is caused by best-effort applications.
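
The duty-cycling idea can be sketched as a small control step. This is a minimal sketch under stated assumptions, not the controller from the thesis: it assumes a fixed control period, a bandwidth budget per best-effort application, and a counter reading that has already been converted to bytes. The function name and parameters are hypothetical, and reading the performance counters and suspending or resuming the application are platform-specific and left out.

```python
def next_duty_cycle(
    bytes_transferred: int,
    run_time_s: float,
    budget_bytes_per_s: float,
    min_duty: float = 0.0,
) -> float:
    """Fraction of the next control period a best-effort application may run.

    bytes_transferred: cache traffic measured during the last period
                       (assumed to come from a performance counter).
    run_time_s:        time the application actually ran in that period.
    budget_bytes_per_s: average bandwidth the application is allowed to use.
    """
    if run_time_s <= 0.0:
        # No measurement available, so let the application run and measure it.
        return 1.0
    rate_while_running = bytes_transferred / run_time_s
    if rate_while_running <= budget_bytes_per_s:
        return 1.0  # within budget, no throttling needed
    # Running for this fraction of the period keeps the average at the budget.
    return max(min_duty, budget_bytes_per_s / rate_while_running)


# Hypothetical numbers: 40 MB moved in a 10 ms period is 4 GB/s;
# with a 1 GB/s budget the application may run a quarter of the next period.
duty = next_duty_cycle(40_000_000, 0.010, 1e9)
print(round(duty, 3))
```

The design choice here is to throttle on the average bandwidth over a period rather than on the instantaneous rate, which is what a reservation expressed as a budget implies; a real controller would also have to choose the period length and may need smoothing across periods.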

Thirdly, we introduce the Probabilistic Time Triggered System (PTTS) model to analyze and optimize the end-to-end latency of a complete system that contains multiple time-triggered interfaces. Our case study demonstrates the applicability of the PTTS model and the corresponding analysis techniques to an interventional X-ray system. We expect that the PTTS model is also applicable to systems other than medical image processing systems.
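
To see why time-triggered interfaces call for a probabilistic treatment of latency, consider the toy chain below. It is not the PTTS model itself, only a sketch of the underlying effect: data that becomes ready between two ticks of an interface waits for the next tick, so the end-to-end latency depends on when the data arrives relative to the tick grids. The stage parameters, time unit (integer milliseconds) and function names are all assumptions made for illustration.

```python
def next_tick(t: int, period: int, phase: int) -> int:
    """Earliest tick at or after time t, for ticks at phase + k * period."""
    k = -(-(t - phase) // period)  # ceiling division on integers
    return phase + k * period


def end_to_end_latency(start: int, stages) -> int:
    """Latency of one data item through a chain of stages.

    Each stage is (processing_time, period, phase): the item is processed,
    then waits for the next tick of that stage's time-triggered interface.
    """
    t = start
    for processing_time, period, phase in stages:
        t = next_tick(t + processing_time, period, phase)
    return t - start


# Hypothetical chain: two stages, both interfaces tick every 10 ms,
# the second one offset by 5 ms.
stages = [(4, 10, 0), (2, 10, 5)]

# Sweeping the start time stands in for a random arrival phase.
latencies = [end_to_end_latency(start, stages) for start in range(10)]
print(min(latencies), max(latencies))
```

Although the processing times add up to only 6 ms, the latency in this toy chain ranges from 9 ms to 18 ms depending on the arrival time, so the alignment of the interfaces matters as much as the processing itself.
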
Original language: English
Awarding Institution
  • University of Twente
Supervisors/Advisors
  • Bekooij, Marco Jan Gerrit, Supervisor
Award date: 29 Jun 2018
Place of Publication: Enschede
Publisher: University of Twente
Print ISBNs: 978-90-365-4569-3
DOIs: 10.3990/1.9789036545693
Publication status: Published - 29 Jun 2018

Fingerprint

Image processing
Medical image processing
Bandwidth
Processing
Scheduling
Hardware
Application programs
Imaging techniques
Data storage equipment
X rays

Cite this

@phdthesis{17311ec0785e4321a23c67615dcd5101,
title = "Analysis and optimization techniques for real-time streaming image processing software on general purpose systems",
abstract = "Commercial Off The Shelf (COTS) Chip Multi-Processor (CMP) systems are, for cost reasons, often used in industry for soft real-time stream processing. COTS CMP systems typically have low timing predictability, which makes it difficult to develop software applications with tight temporal requirements for these systems. Restricting the way applications use the hardware and Operating System (OS) might alleviate this difficulty, so that certain types of applications could be run on COTS CMP systems with statistically verified temporal requirements. In this thesis we restrict the application domain to soft real-time medical image processing applications, which have a much more ‘stable’ usage of hardware resources than applications in general. Techniques at the application level are employed to improve the reproducibility (i.e. to reduce the variance) of the end-to-end latency of these image processing systems. Firstly, we study the effectiveness of a number of scheduling heuristics intended to improve the reproducibility of a stream processing application executed on COTS multiprocessor systems. Experiments show that the proposed heuristics can reduce the end-to-end latency by almost 60{\%} and the variation in the latency by more than 90{\%}, compared with a naive scheduling heuristic that does not consider execution times, dependencies and the memory hierarchy. Secondly, we want to be able to integrate multiple real-time and best-effort applications on a single COTS CMP system without reducing the reproducibility of the real-time application too much. To this end we examined the first component that is shared between applications running on separate cores: the shared cache, and in particular its bandwidth. We propose a technique that implements cache bandwidth reservation in software. This is achieved by dynamically duty-cycling best-effort applications based on their cache bandwidth usage, measured with processor performance counters. With this technique we can control the latency increase of real-time applications that is caused by best-effort applications. Thirdly, we introduce the Probabilistic Time Triggered System (PTTS) model to analyze and optimize the end-to-end latency of a complete system that contains multiple time-triggered interfaces. Our case study demonstrates the applicability of the PTTS model and the corresponding analysis techniques to an interventional X-ray system. We expect that the PTTS model is also applicable to systems other than medical image processing systems.",
author = "Mark Westmijze",
note = "DSI Ph.D. thesis series no. 18-002 ISSN 2589-7721",
year = "2018",
month = "6",
day = "29",
doi = "10.3990/1.9789036545693",
language = "English",
isbn = "978-90-365-4569-3",
publisher = "University of Twente",
address = "Netherlands",
school = "University of Twente",

}

Analysis and optimization techniques for real-time streaming image processing software on general purpose systems. / Westmijze, Mark.

Enschede : University of Twente, 2018. 107 p.


TY - THES

T1 - Analysis and optimization techniques for real-time streaming image processing software on general purpose systems

AU - Westmijze, Mark

N1 - DSI Ph.D. thesis series no. 18-002 ISSN 2589-7721

PY - 2018/6/29

Y1 - 2018/6/29

AB - Commercial Off The Shelf (COTS) Chip Multi-Processor (CMP) systems are, for cost reasons, often used in industry for soft real-time stream processing. COTS CMP systems typically have low timing predictability, which makes it difficult to develop software applications with tight temporal requirements for these systems. Restricting the way applications use the hardware and Operating System (OS) might alleviate this difficulty, so that certain types of applications could be run on COTS CMP systems with statistically verified temporal requirements. In this thesis we restrict the application domain to soft real-time medical image processing applications, which have a much more ‘stable’ usage of hardware resources than applications in general. Techniques at the application level are employed to improve the reproducibility (i.e. to reduce the variance) of the end-to-end latency of these image processing systems. Firstly, we study the effectiveness of a number of scheduling heuristics intended to improve the reproducibility of a stream processing application executed on COTS multiprocessor systems. Experiments show that the proposed heuristics can reduce the end-to-end latency by almost 60% and the variation in the latency by more than 90%, compared with a naive scheduling heuristic that does not consider execution times, dependencies and the memory hierarchy. Secondly, we want to be able to integrate multiple real-time and best-effort applications on a single COTS CMP system without reducing the reproducibility of the real-time application too much. To this end we examined the first component that is shared between applications running on separate cores: the shared cache, and in particular its bandwidth. We propose a technique that implements cache bandwidth reservation in software. This is achieved by dynamically duty-cycling best-effort applications based on their cache bandwidth usage, measured with processor performance counters. With this technique we can control the latency increase of real-time applications that is caused by best-effort applications. Thirdly, we introduce the Probabilistic Time Triggered System (PTTS) model to analyze and optimize the end-to-end latency of a complete system that contains multiple time-triggered interfaces. Our case study demonstrates the applicability of the PTTS model and the corresponding analysis techniques to an interventional X-ray system. We expect that the PTTS model is also applicable to systems other than medical image processing systems.

U2 - 10.3990/1.9789036545693

DO - 10.3990/1.9789036545693

M3 - PhD Thesis - Research UT, graduation UT

SN - 978-90-365-4569-3

PB - University of Twente

CY - Enschede

ER -