TY - JOUR
T1 - Automatic classification of literature in systematic reviews on food safety using machine learning
AU - van den Bulk, Leonieke M.
AU - Bouzembrak, Yamine
AU - Gavai, Anand
AU - Liu, Ningjing
AU - van den Heuvel, Lukas J.
AU - Marvin, Hans J.P.
N1 - Funding Information:
The research leading to this result has received funding from the Ministry of Agriculture, Nature and Food Quality (LNV), the Netherlands (KB-23: Healthy and Safe Food for Healthy lives).
Publisher Copyright:
© 2021 The Authors
PY - 2022/1
Y1 - 2022/1
AB - Systematic reviews are used to collect relevant literature to answer a research question in a way that is clear, thorough, unbiased and reproducible. They are implemented as a standard method in the domain of food safety to obtain a literature overview on the state-of-the-art research related to food safety topics of interest. A disadvantage of systematic reviews, however, is that the process is time-consuming and requires expert domain knowledge. The work reported here aims to reduce the time needed by an expert to screen all potentially relevant articles by applying machine learning techniques to classify the articles automatically as either relevant or not relevant. Eight different machine learning algorithms and ensembles of all combinations of these algorithms were tested on two different systematic reviews on food safety (i.e. chemical hazards in cereals and leafy greens). The results showed that the best performance was obtained by an ensemble of naive Bayes and a support vector machine, resulting in an average decrease of 32.8% in the number of articles the expert has to read and an average decrease in irrelevant articles of 57.8% while keeping 95% of the relevant articles. It was concluded that automatic classification of the literature in a systematic literature review can support experts in their task and save valuable time without compromising the quality of the review.
KW - Artificial intelligence
KW - Classification models
KW - Document screening
KW - Food safety hazards
KW - Literature reviews
KW - Text mining
UR - http://www.scopus.com/inward/record.url?scp=85121922195&partnerID=8YFLogxK
U2 - 10.1016/j.crfs.2021.12.010
DO - 10.1016/j.crfs.2021.12.010
M3 - Article
SN - 2665-9271
VL - 5
SP - 84
EP - 95
JO - Current Research in Food Science
JF - Current Research in Food Science
ER -