A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis

Thomas P.A. Debray, Karel G.M. Moons, Ikhlaaq Ahmed, Hendrik Koffijberg, Richard David Riley

Research output: Contribution to journal › Article › Academic › peer-review

74 Citations (Scopus)

Abstract

The use of individual participant data (IPD) from multiple studies is an increasingly popular approach when developing a multivariable risk prediction model. Corresponding datasets, however, typically differ in important aspects, such as baseline risk. This has driven the adoption of meta-analytical approaches for appropriately dealing with heterogeneity between study populations. Although these approaches provide an averaged prediction model across all studies, little guidance exists about how to apply or validate this model to new individuals or study populations outside the derivation data. We consider several approaches to develop a multivariable logistic regression model from an IPD meta-analysis (IPD-MA) with potential between-study heterogeneity. We also propose strategies for choosing a valid model intercept when the model is to be validated or applied to new individuals or study populations. These strategies can be implemented by the IPD-MA developers or future model validators. Finally, we show how model generalizability can be evaluated through internal-external cross-validation when external validation data are lacking, and we extend our framework to count and time-to-event data. In an empirical evaluation, our results show how stratified estimation allows study-specific model intercepts, which can then inform the intercept to be used when applying the model in practice, even to a population not represented by included studies. In summary, our framework allows the development (through stratified estimation), implementation in new individuals (through focused intercept choice), and evaluation (through internal-external validation) of a single, integrated prediction model from an IPD-MA in order to achieve improved model performance and generalizability.
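To make the key ideas concrete, here is a minimal, hypothetical sketch (not from the paper) in Python using statsmodels and scikit-learn on simulated data with made-up variable names (study, x1, x2, y). It fits a logistic model with one intercept per study and common predictor effects (stratified estimation), then performs internal-external cross-validation: each study is left out in turn, the model is refitted on the remaining studies, an intercept for the omitted study is chosen (here simply the average of the remaining study intercepts, only one of the strategies the paper considers), and discrimination and calibration-in-the-large are checked in that study.

# Illustrative sketch only: simulated IPD-MA, assumed variable names.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(42)

# Simulate 5 studies with heterogeneous baseline risk (different intercepts)
# but common predictor effects.
true_alphas = [-1.5, -1.0, -0.5, -1.2, -0.8]
beta = np.array([0.8, -0.5])
studies = []
for j, alpha in enumerate(true_alphas):
    X = rng.normal(size=(300, 2))
    p = 1.0 / (1.0 + np.exp(-(alpha + X @ beta)))
    studies.append(pd.DataFrame({"study": j, "x1": X[:, 0], "x2": X[:, 1],
                                 "y": rng.binomial(1, p)}))
ipd = pd.concat(studies, ignore_index=True)

# Development by stratified estimation: "0 +" drops the global intercept, so
# C(study) yields one intercept per study; x1 and x2 get common slopes.
fit = smf.logit("y ~ 0 + C(study) + x1 + x2", data=ipd).fit(disp=0)
print(fit.params)

# Internal-external cross-validation: leave each study out, refit on the rest,
# choose an intercept for the omitted study (here the mean of the remaining
# study intercepts), and evaluate in the omitted study.
for j in sorted(ipd["study"].unique()):
    dev, val = ipd[ipd["study"] != j], ipd[ipd["study"] == j]
    m = smf.logit("y ~ 0 + C(study) + x1 + x2", data=dev).fit(disp=0)
    alphas = [v for k, v in m.params.items() if k.startswith("C(study)")]
    slopes = m.params[["x1", "x2"]].to_numpy()
    lp = np.mean(alphas) + val[["x1", "x2"]].to_numpy() @ slopes
    pred = 1.0 / (1.0 + np.exp(-lp))
    auc = roc_auc_score(val["y"], pred)   # discrimination (c-statistic)
    oe = val["y"].mean() - pred.mean()    # crude observed-minus-expected event rate
    print(f"left-out study {j}: c-statistic = {auc:.3f}, O-E = {oe:.3f}")

The averaged intercept used above is only a fallback; as the abstract notes, when something is known about the new population's baseline risk (for example, its outcome prevalence), a better-matched intercept can be selected or re-estimated instead.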

Original language: English
Pages (from-to): 3158-3180
Number of pages: 23
Journal: Statistics in medicine
Volume: 32
Issue number: 18
DOI: 10.1002/sim.5732
Publication status: Published - 15 Aug 2013

Keywords

  • Individual participant data (IPD)
  • Internal-external validation
  • Logistic regression
  • Meta-analysis
  • Multivariable
  • Prediction research
  • Risk prediction models

Cite this

Debray, Thomas P.A.; Moons, Karel G.M.; Ahmed, Ikhlaaq; Koffijberg, Hendrik; Riley, Richard David. A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis. In: Statistics in medicine. 2013; Vol. 32, No. 18, pp. 3158-3180.
@article{25b01c86368d450b990f66e7b1b16203,
title = "A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis",
abstract = "The use of individual participant data (IPD) from multiple studies is an increasingly popular approach when developing a multivariable risk prediction model. Corresponding datasets, however, typically differ in important aspects, such as baseline risk. This has driven the adoption of meta-analytical approaches for appropriately dealing with heterogeneity between study populations. Although these approaches provide an averaged prediction model across all studies, little guidance exists about how to apply or validate this model to new individuals or study populations outside the derivation data. We consider several approaches to develop a multivariable logistic regression model from an IPD meta-analysis (IPD-MA) with potential between-study heterogeneity. We also propose strategies for choosing a valid model intercept for when the model is to be validated or applied to new individuals or study populations. These strategies can be implemented by the IPD-MA developers or future model validators. Finally, we show how model generalizability can be evaluated when external validation data are lacking using internal-external cross-validation and extend our framework to count and time-to-event data. In an empirical evaluation, our results show how stratified estimation allows study-specific model intercepts, which can then inform the intercept to be used when applying the model in practice, even to a population not represented by included studies. In summary, our framework allows the development (through stratified estimation), implementation in new individuals (through focused intercept choice), and evaluation (through internal-external validation) of a single, integrated prediction model from an IPD-MA in order to achieve improved model performance and generalizability.",
keywords = "Individual participant data (IPD), Internal-external validation, Logistic regression, Meta-analysis, Multivariable, Prediction research, Risk prediction models",
author = "Debray, {Thomas P.A.} and Moons, {Karel G.M.} and Ikhlaaq Ahmed and Hendrik Koffijberg and Riley, {Richard David}",
year = "2013",
month = "8",
day = "15",
doi = "10.1002/sim.5732",
language = "English",
volume = "32",
pages = "3158--3180",
journal = "Statistics in medicine",
issn = "0277-6715",
publisher = "Wiley",
number = "18",
}

TY - JOUR

T1 - A framework for developing, implementing, and evaluating clinical prediction models in an individual participant data meta-analysis

AU - Debray, Thomas P.A.

AU - Moons, Karel G.M.

AU - Ahmed, Ikhlaaq

AU - Koffijberg, Hendrik

AU - Riley, Richard David

PY - 2013/8/15

Y1 - 2013/8/15

UR - http://www.scopus.com/inward/record.url?scp=84880044696&partnerID=8YFLogxK

U2 - 10.1002/sim.5732

DO - 10.1002/sim.5732

M3 - Article

C2 - 23307585

AN - SCOPUS:84880044696

VL - 32

SP - 3158

EP - 3180

JO - Statistics in medicine

JF - Statistics in medicine

SN - 0277-6715

IS - 18

ER -