Data in Brief (Jun 2023)
CocoaMFDB: A dataset of cocoa pod maturity and families in an uncontrolled environment in Côte d'Ivoire
Abstract
Cocoa cultivation is the basis for chocolate production; cocoa has a unique aroma that makes it useful in the production of snacks and in cooking and baking. The main cocoa harvest normally occurs once or twice a year and is spread over several months, depending on the country. Determining the best harvesting period for cocoa pods plays a major role in the export process and in pod quality, because the degree of ripening of the pods affects the quality of the resulting beans. Unripe pods do not contain enough sugar, which may prevent proper bean fermentation. Overripe pods are usually dry, and their beans may germinate inside the pod or develop a fungal disease that makes them unusable. Computer-based determination of cocoa pod ripeness through image analysis could facilitate large-scale ripeness detection. Recent advances in computing power, communication systems, and machine learning techniques give agricultural engineers and computer scientists the opportunity to meet needs currently addressed by manual inspection. Diverse and representative sets of pod images are essential for developing and testing automatic cocoa pod maturity detection systems. To this end, we collected images of cocoa pods to build a database of cocoa pods from Côte d'Ivoire named CocoaMFDB. Because lighting conditions were not controlled during image acquisition, we performed a pre-processing step using the CLAHE (Contrast Limited Adaptive Histogram Equalization) algorithm to improve the quality of the images. CocoaMFDB allows the characterization of cocoa pods according to their maturity level and provides information on the pod family for each image. The dataset covers three major families, namely Amelonado, Angoleta, and Guiana, each grouped into two maturity categories: ripe and unripe. It is therefore well suited to developing and evaluating image analysis algorithms in future research.
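To illustrate the CLAHE pre-processing step mentioned in the abstract, the following is a minimal, simplified sketch in plain Python and not the authors' implementation. It assumes a single-channel grayscale image stored as a list of rows whose dimensions are divisible by the tile count. The function names (`clahe`, `clip_histogram`, `tile_lookup_table`), the parameter values (`tiles=2`, `clip_factor=4.0`), and the test image are all hypothetical choices made for illustration; the abstract does not state the settings or the color handling used for CocoaMFDB.

```python
import math


def clip_histogram(hist, clip_limit):
    """Cap every bin at clip_limit and spread the clipped excess over all bins."""
    excess = sum(max(count - clip_limit, 0) for count in hist)
    clipped = [min(count, clip_limit) for count in hist]
    bonus, remainder = divmod(excess, len(hist))
    # Single-pass redistribution: bins may end slightly above clip_limit.
    return [c + bonus + (1 if i < remainder else 0) for i, c in enumerate(clipped)]


def tile_lookup_table(pixels, clip_limit, levels):
    """Build the equalization mapping (scaled CDF) for one tile."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    hist = clip_histogram(hist, clip_limit)
    table, cumulative = [], 0
    for count in hist:
        cumulative += count
        table.append(round((levels - 1) * cumulative / len(pixels)))
    return table


def clahe(image, tiles=2, clip_factor=4.0, levels=256):
    """Simplified CLAHE for a grayscale image given as a list of rows.

    Assumes the image height and width are divisible by `tiles`.
    """
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // tiles, width // tiles
    clip_limit = max(1, int(clip_factor * tile_h * tile_w / levels))

    # One lookup table per tile, computed from that tile's clipped histogram.
    tables = [
        [
            tile_lookup_table(
                [image[y][x]
                 for y in range(ty * tile_h, (ty + 1) * tile_h)
                 for x in range(tx * tile_w, (tx + 1) * tile_w)],
                clip_limit, levels)
            for tx in range(tiles)
        ]
        for ty in range(tiles)
    ]

    def neighbours(pos, tile_size):
        """Indices of the two nearest tile centres and the blend weight."""
        frac = (pos + 0.5) / tile_size - 0.5
        low = min(max(math.floor(frac), 0), tiles - 1)
        high = min(low + 1, tiles - 1)
        weight = min(max(frac - low, 0.0), 1.0)
        return low, high, weight

    # Bilinear interpolation between neighbouring tile mappings avoids
    # visible seams at tile borders.
    output = [[0] * width for _ in range(height)]
    for y in range(height):
        y0, y1, wy = neighbours(y, tile_h)
        for x in range(width):
            x0, x1, wx = neighbours(x, tile_w)
            v = image[y][x]
            top = (1 - wx) * tables[y0][x0][v] + wx * tables[y0][x1][v]
            bottom = (1 - wx) * tables[y1][x0][v] + wx * tables[y1][x1][v]
            output[y][x] = round((1 - wy) * top + wy * bottom)
    return output


# Hypothetical low-contrast 32x32 test image with gray levels 100..115.
flat = [[100 + (x % 16) for x in range(32)] for y in range(32)]
enhanced = clahe(flat)
```

In practice, a library implementation such as OpenCV's `cv2.createCLAHE(clipLimit=..., tileGridSize=...)` would likely be preferred over a hand-written version. For color photographs, one common approach is to apply CLAHE only to a lightness channel; whether that was done for this dataset is not stated here.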