TY - JOUR
T1 - Can artificial intelligence separate the wheat from the chaff in systematic reviews of health economic articles?
AU - Oude Wolcherink, M.J.
AU - Pouwels, X.G.L.V.
AU - van Dijk, S.H.B.
AU - Doggen, C.J.M.
AU - Koffijberg, H.
N1 - Publisher Copyright: © 2023 The Author(s). Published by Informa UK Limited, trading as Taylor & Francis Group.
PY - 2023/10/21
Y1 - 2023/10/21
N2 - Objectives: Artificial intelligence-powered tools, such as ASReview, could reduce the burden of title and abstract screening. This study aimed to assess the accuracy and efficiency of using ASReview in a health economic context. Methods: A sample from a previous systematic literature review containing 4,994 articles was used. Previous manual screening resulted in 134 articles included for full-text screening (FT) and 50 for data extraction (DE). Here, accuracy and efficiency were evaluated by comparing the number of identified relevant articles with ASReview versus manual screening. Pre-defined stopping rules using sampling criteria and heuristic criteria were tested. Robustness of the AI-tool’s performance was determined using 1,000 simulations. Results: Considering included stopping rules, median accuracy for FT articles remained below 85%, but reached 100% for DE articles. To identify all relevant articles, a median of 89.9% of FT articles needed to be screened, compared to 7.7% for DE articles. Potential time savings between 49 and 59 hours could be achieved, depending on the stopping rule. Conclusions: In our case study, all DE articles were identified after screening 7.7% of the sample, allowing for substantial time savings. ASReview likely has the potential to substantially reduce screening time in systematic reviews of health economic articles.
AB - Objectives: Artificial intelligence-powered tools, such as ASReview, could reduce the burden of title and abstract screening. This study aimed to assess the accuracy and efficiency of using ASReview in a health economic context. Methods: A sample from a previous systematic literature review containing 4,994 articles was used. Previous manual screening resulted in 134 articles included for full-text screening (FT) and 50 for data extraction (DE). Here, accuracy and efficiency were evaluated by comparing the number of identified relevant articles with ASReview versus manual screening. Pre-defined stopping rules using sampling criteria and heuristic criteria were tested. Robustness of the AI-tool’s performance was determined using 1,000 simulations. Results: Considering included stopping rules, median accuracy for FT articles remained below 85%, but reached 100% for DE articles. To identify all relevant articles, a median of 89.9% of FT articles needed to be screened, compared to 7.7% for DE articles. Potential time savings between 49 and 59 hours could be achieved, depending on the stopping rule. Conclusions: In our case study, all DE articles were identified after screening 7.7% of the sample, allowing for substantial time savings. ASReview likely has the potential to substantially reduce screening time in systematic reviews of health economic articles.
KW - Accuracy
KW - Artificial intelligence
KW - ASReview
KW - Efficiency
KW - Simulation
KW - Stopping rule
KW - Systematic review
KW - UT-Hybrid-D
UR - http://www.scopus.com/inward/record.url?scp=85167875173&partnerID=8YFLogxK
U2 - 10.1080/14737167.2023.2234639
DO - 10.1080/14737167.2023.2234639
M3 - Article
AN - SCOPUS:85167875173
SN - 1473-7167
VL - 23
SP - 1049
EP - 1056
JO - Expert Review of Pharmacoeconomics and Outcomes Research
JF - Expert Review of Pharmacoeconomics and Outcomes Research
IS - 9
ER -