Aggregating published prediction models with individual participant data: A comparison of different approaches

Thomas P.A. Debray*, Hendrik Koffijberg, Yvonne Vergouwe, Karel G.M. Moons, Ewout W. Steyerberg

*Corresponding author for this work

Research output: Contribution to journal, Article, Academic, peer-review

28 Citations (Scopus)

Abstract

In recent decades, interest in prediction models has substantially increased, but approaches to synthesize evidence from previously developed models have failed to keep pace. This causes researchers to ignore potentially useful past evidence when developing a novel prediction model with individual participant data (IPD) from their population of interest. We aimed to evaluate approaches to aggregate previously published prediction models with new data. We consider the situation that models are reported in the literature with predictors similar to those available in an IPD dataset. We adopt a two-stage method and explore three approaches to calculate a synthesis model, thereby relying on the principles of multivariate meta-analysis. The former approach employs a naive pooling strategy, whereas the latter accounts for within-study and between-study covariance. These approaches are applied to a collection of 15 datasets of patients with traumatic brain injury, and to five previously published models for predicting deep venous thrombosis. Here, we illustrate how the generally unrealistic assumption of consistency in the availability of evidence across included studies can be relaxed. Results from the case studies demonstrate that aggregation yields prediction models with improved discrimination and calibration in the vast majority of scenarios, and results in equivalent performance (compared with the standard approach) in a small minority of situations. The proposed aggregation approaches are particularly useful when few participant data are at hand. Assessing the degree of heterogeneity between IPD and literature findings remains crucial to determine the optimal approach in aggregating previous evidence into new prediction models.
Original language: English
Pages (from-to): 2697-2712
Number of pages: 16
Journal: Statistics in Medicine
Volume: 31
Issue number: 23
DOIs: https://doi.org/10.1002/sim.5412
Publication status: Published - 15 Oct 2012

Keywords

  • Bayesian inference
  • Logistic regression
  • Meta-analysis
  • Multivariable
  • Prediction models
  • Prediction research

Cite this

Debray, Thomas P.A.; Koffijberg, Hendrik; Vergouwe, Yvonne; Moons, Karel G.M.; Steyerberg, Ewout W. / Aggregating published prediction models with individual participant data: A comparison of different approaches. In: Statistics in Medicine. 2012; Vol. 31, No. 23. pp. 2697-2712.
@article{ceff85ffae634f2d952fc674f6c1b014,
title = "Aggregating published prediction models with individual participant data: A comparison of different approaches",
abstract = "In recent decades, interest in prediction models has substantially increased, but approaches to synthesize evidence from previously developed models have failed to keep pace. This causes researchers to ignore potentially useful past evidence when developing a novel prediction model with individual participant data (IPD) from their population of interest. We aimed to evaluate approaches to aggregate previously published prediction models with new data. We consider the situation that models are reported in the literature with predictors similar to those available in an IPD dataset. We adopt a two-stage method and explore three approaches to calculate a synthesis model, thereby relying on the principles of multivariate meta-analysis. The former approach employs a naive pooling strategy, whereas the latter accounts for within-study and between-study covariance. These approaches are applied to a collection of 15 datasets of patients with traumatic brain injury, and to five previously published models for predicting deep venous thrombosis. Here, we illustrate how the generally unrealistic assumption of consistency in the availability of evidence across included studies can be relaxed. Results from the case studies demonstrate that aggregation yields prediction models with improved discrimination and calibration in the vast majority of scenarios, and results in equivalent performance (compared with the standard approach) in a small minority of situations. The proposed aggregation approaches are particularly useful when few participant data are at hand. Assessing the degree of heterogeneity between IPD and literature findings remains crucial to determine the optimal approach in aggregating previous evidence into new prediction models.",
keywords = "Bayesian inference, Logistic regression, Meta-analysis, Multivariable, Prediction models, Prediction research",
author = "Debray, {Thomas P.A.} and Koffijberg, Hendrik and Vergouwe, Yvonne and Moons, {Karel G.M.} and Steyerberg, {Ewout W.}",
year = "2012",
month = "10",
day = "15",
doi = "10.1002/sim.5412",
language = "English",
volume = "31",
pages = "2697--2712",
journal = "Statistics in Medicine",
issn = "0277-6715",
publisher = "Wiley",
number = "23",
}


TY - JOUR

T1 - Aggregating published prediction models with individual participant data

T2 - A comparison of different approaches

AU - Debray, Thomas P.A.

AU - Koffijberg, Hendrik

AU - Vergouwe, Yvonne

AU - Moons, Karel G.M.

AU - Steyerberg, Ewout W.

PY - 2012/10/15

Y1 - 2012/10/15

N2 - In recent decades, interest in prediction models has substantially increased, but approaches to synthesize evidence from previously developed models have failed to keep pace. This causes researchers to ignore potentially useful past evidence when developing a novel prediction model with individual participant data (IPD) from their population of interest. We aimed to evaluate approaches to aggregate previously published prediction models with new data. We consider the situation that models are reported in the literature with predictors similar to those available in an IPD dataset. We adopt a two-stage method and explore three approaches to calculate a synthesis model, thereby relying on the principles of multivariate meta-analysis. The former approach employs a naive pooling strategy, whereas the latter accounts for within-study and between-study covariance. These approaches are applied to a collection of 15 datasets of patients with traumatic brain injury, and to five previously published models for predicting deep venous thrombosis. Here, we illustrate how the generally unrealistic assumption of consistency in the availability of evidence across included studies can be relaxed. Results from the case studies demonstrate that aggregation yields prediction models with improved discrimination and calibration in the vast majority of scenarios, and results in equivalent performance (compared with the standard approach) in a small minority of situations. The proposed aggregation approaches are particularly useful when few participant data are at hand. Assessing the degree of heterogeneity between IPD and literature findings remains crucial to determine the optimal approach in aggregating previous evidence into new prediction models.

KW - Bayesian inference

KW - Logistic regression

KW - Meta-analysis

KW - Multivariable

KW - Prediction models

KW - Prediction research

UR - http://www.scopus.com/inward/record.url?scp=84866440887&partnerID=8YFLogxK

U2 - 10.1002/sim.5412

DO - 10.1002/sim.5412

M3 - Article

C2 - 22733546

AN - SCOPUS:84866440887

VL - 31

SP - 2697

EP - 2712

JO - Statistics in Medicine

JF - Statistics in Medicine

SN - 0277-6715

IS - 23

ER -