Repeatability of 18F-FDG PET radiomic features: A phantom study to explore sensitivity to image reconstruction settings, noise, and delineation method

Elisabeth Pfaehler (Corresponding Author), Roelof J. Beukinga, Johan R. de Jong, Riemer H.J.A. Slart, Cornelis H. Slump, Rudi A.J.O. Dierckx, Ronald Boellaard

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

Background: 18F-fluoro-2-deoxy-D-glucose positron emission tomography (18F-FDG PET) radiomics has the potential to guide clinical decision making in cancer patients, but validation is required before radiomics can be implemented in the clinical setting. The aim of this study was to explore how feature space reduction and the repeatability of 18F-FDG PET radiomic features are affected by various sources of variation, such as the underlying data (e.g., object size and uptake), image reconstruction methods and settings, noise, discretization method, and delineation method.

Methods: The NEMA image quality phantom was scanned with various sphere-to-background ratios (SBR), simulating different activity uptakes, including spheres with low uptake, that is, an SBR smaller than 1. Furthermore, images of a phantom containing 3D-printed inserts reflecting realistic heterogeneous uptake patterns were acquired. Data were reconstructed using various matrix sizes, reconstruction algorithms, and scan durations (noise). For every specific reconstruction and noise level, ten statistically equal replicates were generated. The phantom inserts were delineated using CT-based and PET-based segmentation methods. A total of 246 radiomic features were extracted from each image dataset. Images were discretized with a fixed number of 64 bins (FBN) and with a fixed bin width (FBW) of 0.25 for the high uptake data and 0.05 for the low uptake data. In terms of feature reduction, we determined the impact of these factors on the composition of feature clusters, which were defined on the basis of Spearman's correlation matrices. To assess feature repeatability, the intraclass correlation coefficient was calculated over the ten replicates.

Results: In general, larger spheres with high uptake resulted in better repeatability than smaller spheres with low uptake. The repeatability of features extracted from the heterogeneous phantom inserts was comparable to that of features extracted from the larger high uptake spheres. For example, for an EARL-compliant reconstruction, larger and smaller high uptake spheres yielded good repeatability for 32% and 30% of the features, respectively, while the heterogeneous inserts resulted in 34% repeatable features. For the low uptake spheres, this was the case for 22% and 20% of the features for larger and smaller spheres, respectively. Images reconstructed with point-spread-function (PSF) modeling resulted in the highest repeatability when compared with OSEM or time-of-flight, for example, 53%, 30%, and 32% of repeatable features, respectively (for unsmoothed data, discretized with FBN, 300 s scan duration). Reducing image noise (by increasing scan duration and smoothing) and using CT-based segmentation for the low uptake spheres improved repeatability. FBW discretization resulted in higher repeatability than FBN discretization, for example, 89% and 35% of the features, respectively (for the EARL-compliant reconstruction and larger high uptake spheres).

Conclusion: Feature space reduction and repeatability of 18F-FDG PET radiomic features depended on all studied factors. The high sensitivity of PET radiomic features to image quality suggests that a high level of standardization in image acquisition and preprocessing is required before these features can be used as clinical imaging biomarkers.
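The two discretization schemes and the repeatability measure named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The bin formulas follow a common radiomics convention, the lower bound of 0 for FBW is an assumption, and the one-way random-effects form of the intraclass correlation coefficient is assumed because the abstract does not state which form was used. All function names are hypothetical.

```python
import math


def discretize_fbn(values, n_bins=64):
    """Fixed bin number: map the ROI's own intensity range onto n_bins bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1 for _ in values]  # flat ROI: everything lands in bin 1
    bins = []
    for v in values:
        b = math.floor(n_bins * (v - lo) / (hi - lo)) + 1
        bins.append(min(b, n_bins))  # the maximum value goes into the top bin
    return bins


def discretize_fbw(values, bin_width=0.25, lower_bound=0.0):
    """Fixed bin width: bins have a constant width, counted from lower_bound.

    lower_bound=0.0 is an assumption made for this sketch.
    """
    return [math.floor((v - lower_bound) / bin_width) + 1 for v in values]


def icc_one_way(table):
    """One-way random-effects ICC (assumed form) for one feature.

    table: one row per object (e.g. a sphere), one column per replicate.
    """
    n, k = len(table), len(table[0])
    grand_mean = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    # Between-object and within-object mean squares
    msb = k * sum((m - grand_mean) ** 2 for m in row_means) / (n - 1)
    msw = sum(
        (x - m) ** 2 for row, m in zip(table, row_means) for x in row
    ) / (n * (k - 1))
    denominator = msb + (k - 1) * msw
    if denominator == 0:
        raise ValueError("ICC is undefined when all values are identical")
    return (msb - msw) / denominator


if __name__ == "__main__":
    roi = [0.0, 1.0, 2.0, 4.0]
    print(discretize_fbn(roi, n_bins=4))
    print(discretize_fbw([0.0, 0.3, 1.0], bin_width=0.25))
    # Three objects, three replicates each, with small replicate noise
    print(icc_one_way([[1.0, 1.1, 0.9], [2.0, 2.1, 1.9], [3.0, 2.9, 3.1]]))
```

With FBN, the bin edges move with each region's minimum and maximum, so two replicates with slightly different extremes are binned differently. With FBW, the edges stay fixed in intensity units. This is one plausible reading of why the two schemes give different repeatability, not a finding stated in the abstract.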

Original language: English
Pages (from-to): 665-678
Number of pages: 14
Journal: Medical physics
Volume: 46
Issue number: 2
DOI: 10.1002/mp.13322
Publication status: Published - 1 Feb 2019

Fingerprint

Computer-Assisted Image Processing
Deoxyglucose
Positron-Emission Tomography
Noise
Biomarkers
Neoplasms

Keywords

  • UT-Hybrid-D
  • delineation
  • image reconstruction settings
  • 18F-FDG PET/CT radiomic features

Cite this

Pfaehler, Elisabeth ; Beukinga, Roelof J. ; de Jong, Johan R. ; Slart, Riemer H.J.A. ; Slump, Cornelis H. ; Dierckx, Rudi A.J.O. ; Boellaard, Ronald. / Repeatability of 18F-FDG PET radiomic features : A phantom study to explore sensitivity to image reconstruction settings, noise, and delineation method. In: Medical physics. 2019 ; Vol. 46, No. 2. pp. 665-678.
@article{1049bbd462884fdeabfd43bf80587f9c,
title = "Repeatability of 18F-FDG PET radiomic features: A phantom study to explore sensitivity to image reconstruction settings, noise, and delineation method",
keywords = "UT-Hybrid-D, delineation, image reconstruction settings, 18F-FDG PET/CT radiomic features",
author = "Elisabeth Pfaehler and Beukinga, {Roelof J.} and {de Jong}, {Johan R.} and Slart, {Riemer H.J.A.} and Slump, {Cornelis H.} and Dierckx, {Rudi A.J.O.} and Ronald Boellaard",
note = "Wiley deal",
year = "2019",
month = "2",
day = "1",
doi = "10.1002/mp.13322",
language = "English",
volume = "46",
pages = "665--678",
journal = "Medical physics",
issn = "0094-2405",
publisher = "AAPM - American Association of Physicists in Medicine",
number = "2",

}

TY - JOUR

T1 - Repeatability of 18F-FDG PET radiomic features

T2 - A phantom study to explore sensitivity to image reconstruction settings, noise, and delineation method

AU - Pfaehler, Elisabeth

AU - Beukinga, Roelof J.

AU - de Jong, Johan R.

AU - Slart, Riemer H.J.A.

AU - Slump, Cornelis H.

AU - Dierckx, Rudi A.J.O.

AU - Boellaard, Ronald

N1 - Wiley deal

PY - 2019/2/1

Y1 - 2019/2/1

KW - UT-Hybrid-D

KW - delineation

KW - image reconstruction settings

KW - 18F-FDG PET/CT radiomic features

UR - http://www.scopus.com/inward/record.url?scp=85059272006&partnerID=8YFLogxK

U2 - 10.1002/mp.13322

DO - 10.1002/mp.13322

M3 - Article

VL - 46

SP - 665

EP - 678

JO - Medical physics

JF - Medical physics

SN - 0094-2405

IS - 2

ER -