Automatic classification of literature in systematic reviews on food safety using machine learning

Leonieke M. van den Bulk; Yamine Bouzembrak; Anand Gavai; Ningjing Liu; Lukas J. van den Heuvel; Hans J.P. Marvin

Current Research in Food Science (Jan 2022)

Affiliations

Leonieke M. van den Bulk: Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB Wageningen, the Netherlands
Yamine Bouzembrak (corresponding author): Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB Wageningen, the Netherlands
Anand Gavai: Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB Wageningen, the Netherlands
Ningjing Liu: Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB Wageningen, the Netherlands
Lukas J. van den Heuvel: Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB Wageningen, the Netherlands
Hans J.P. Marvin: Wageningen Food Safety Research, Akkermaalsbos 2, 6708 WB Wageningen, the Netherlands

Journal volume & issue: Vol. 5, pp. 84–95

Abstract

Systematic reviews are used to collect relevant literature to answer a research question in a way that is clear, thorough, unbiased and reproducible. They are a standard method in the domain of food safety for obtaining an overview of state-of-the-art research on food safety topics of interest. A disadvantage of systematic reviews, however, is that the process is time-consuming and requires expert domain knowledge. The work reported here aims to reduce the time an expert needs to screen all potentially relevant articles by applying machine learning techniques to classify the articles automatically as either relevant or not relevant. Eight different machine learning algorithms and ensembles of all combinations of these algorithms were tested on two different systematic reviews on food safety (i.e. chemical hazards in cereals and leafy greens). The results showed that the best performance was obtained by an ensemble of naive Bayes and a support vector machine, resulting in an average decrease of 32.8% in the number of articles the expert has to read and an average decrease in irrelevant articles of 57.8%, while keeping 95% of the relevant articles. It was concluded that automatic classification of the literature in a systematic literature review can support experts in their task and save valuable time without compromising the quality of the review.
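
The three figures reported in the abstract (share of relevant articles kept, decrease in articles to read, decrease in irrelevant articles) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that a classifier has already assigned each article a relevance score and that articles scoring below a cut-off are skipped by the expert. The function names `threshold_for_recall` and `screening_savings`, and the toy scores and labels, are hypothetical and chosen only for illustration.

```python
import math


def threshold_for_recall(scores, labels, target_recall=0.95):
    """Return the highest cut-off that keeps at least `target_recall` of the relevant articles.

    `labels` uses 1 for relevant and 0 for not relevant (an assumed encoding).
    """
    relevant_scores = sorted(
        (s for s, y in zip(scores, labels) if y == 1), reverse=True
    )
    if not relevant_scores:
        raise ValueError("at least one relevant article is required")
    # Number of relevant articles that must score at or above the cut-off.
    needed = max(1, math.ceil(target_recall * len(relevant_scores)))
    return relevant_scores[needed - 1]


def screening_savings(scores, labels, threshold):
    """Summarise what the expert saves when only articles with score >= threshold are read."""
    total = len(labels)
    n_relevant = sum(labels)
    n_irrelevant = total - n_relevant
    if n_relevant == 0 or n_irrelevant == 0:
        raise ValueError("both relevant and irrelevant articles are required")
    kept = [y for s, y in zip(scores, labels) if s >= threshold]
    kept_relevant = sum(kept)
    kept_irrelevant = len(kept) - kept_relevant
    return {
        "recall": kept_relevant / n_relevant,                      # relevant articles kept
        "workload_reduction": 1 - len(kept) / total,               # fewer articles to read
        "irrelevant_removed": 1 - kept_irrelevant / n_irrelevant,  # irrelevant articles skipped
    }


# Toy data (hypothetical): ten articles, four of them relevant.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 1, 0, 0, 0, 0]

cutoff = threshold_for_recall(scores, labels, target_recall=0.95)
print(cutoff, screening_savings(scores, labels, cutoff))
```

With only four relevant articles in the toy data, a 95% target forces all four to be kept, so the expert still reads 6 of the 10 articles. On a real review, the cut-off would normally be chosen on held-out labelled data rather than on the articles being screened.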

Published in Current Research in Food Science

ISSN: 2665-9271 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Technology: Home economics: Nutrition. Foods and food supply; Technology: Chemical technology: Food processing and manufacture
Website: https://www.journals.elsevier.com/current-research-in-food-science
