Caffeine Content Prediction in Coffee Beans Using Hyperspectral Reflectance and Machine Learning

Dthenifer Cordeiro Santana; Rafael Felipe Ratke; Fabio Luiz Zanatta; Cid Naudi Silva Campos; Ana Carina da Silva Cândido Seron; Larissa Pereira Ribeiro Teodoro; Natielly Pereira da Silva; Gabriela Souza Oliveira; Regimar Garcia dos Santos; Rita de Cássia Félix Alvarez; Carlos Antonio da Silva Junior; Matildes Blanco; Paulo Eduardo Teodoro

doi:10.3390/agriengineering6040255

AgriEngineering (Nov 2024)

Affiliations

Dthenifer Cordeiro Santana: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Rafael Felipe Ratke: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Fabio Luiz Zanatta: Agronomy Department, Professor Cinobelina Elvas Campus, Federal University of Piauí, Bom Jesus 58930-000, PI, Brazil
Cid Naudi Silva Campos: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Ana Carina da Silva Cândido Seron: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Larissa Pereira Ribeiro Teodoro: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Natielly Pereira da Silva: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Gabriela Souza Oliveira: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Regimar Garcia dos Santos: Plant Sciences Building, Department of Horticulture, The University of Georgia, Athens, GA 30602, USA
Rita de Cássia Félix Alvarez: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Carlos Antonio da Silva Junior: Department of Geography, State University of Mato Grosso (UNEMAT), Sinop 78555-000, MT, Brazil
Matildes Blanco: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil
Paulo Eduardo Teodoro: Agronomy Department, Federal University of Mato Grosso do Sul (UFMS), Chapadão do Sul 79560-000, MS, Brazil

DOI: https://doi.org/10.3390/agriengineering6040255
Journal volume & issue: Vol. 6, no. 4, pp. 4480–4492

Abstract

The application of hyperspectral data in machine learning models can contribute to the rapid and accurate determination of caffeine content in coffee beans. This study aimed to identify the machine learning algorithm that performs best in predicting caffeine content and to determine which input data improve the accuracy of these algorithms. The coffee beans were harvested one year after the seedlings were planted. The fresh beans were taken to the spectroscopy laboratory (Laspec) at the Federal University of Mato Grosso do Sul, Chapadão do Sul campus, for spectral evaluation using a spectroradiometer. To quantify caffeine, the dried coffee beans were ground and sieved, and the analysis was carried out by liquid chromatography on a Waters Acquity 1100 series UPLC system with an automatic sample injector. The spectral data of the beans, as well as the spectral data of the roasted and ground coffee, were analyzed using machine learning (ML) algorithms to predict caffeine content. Four databases were used as input: the spectral information of the bean (CG), the spectral information of the bean with additional clone information (CG+C), the spectral information of the bean after roasting and grinding (CGRG), and the spectral information of the bean after roasting and grinding with additional clone information (CGRG+C). The caffeine content was the output variable to be predicted. Each database was subjected to different machine learning models: artificial neural networks (ANNs), decision tree (DT), linear regression (LR), M5P, and random forest (RF) algorithms. Pearson’s correlation coefficient, mean absolute error, and root mean square error were used as model accuracy metrics. The support vector machine algorithm showed the best accuracy in predicting caffeine content when using hyperspectral data from roasted and ground coffee beans. This performance improved significantly when clone information was included, allowing for an even more accurate analysis.
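
The three accuracy metrics named in the abstract can be illustrated with a minimal sketch. The function names and the sample values below are assumptions made for illustration only; they are not taken from the study, and the authors' actual software and data are not described here.

```python
import math


def pearson_r(observed, predicted):
    """Pearson's correlation coefficient between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)


def mae(observed, predicted):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error: penalizes large errors more than MAE does."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


# Hypothetical caffeine values (arbitrary units), NOT data from the study.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

r_value = pearson_r(observed, predicted)
mae_value = mae(observed, predicted)
rmse_value = rmse(observed, predicted)
print(r_value, mae_value, rmse_value)
```

In this made-up case every prediction is offset by the same amount, so the correlation is perfect while MAE and RMSE are both nonzero. This is why a correlation coefficient is usually reported together with error metrics rather than on its own.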

Published in AgriEngineering

ISSN: 2624-7402 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Agriculture: Agriculture (General); Technology: Engineering (General). Civil engineering (General)
Website: https://www.mdpi.com/journal/agriengineering
