Agriculture (Jul 2024)
A Decision Support System for Crop Recommendation Using Machine Learning Classification Algorithms
Abstract
Today, crop suggestions and necessary guidance have become a regular need for a farmer. Farmers generally depend on their local agriculture officers regarding this, and it may be difficult to obtain the right guidance at the right time. Nowadays, crop datasets are available on different websites in the agriculture sector, and they play a crucial role in suggesting suitable crops. So, a decision support system that analyzes the crop dataset using machine learning techniques can assist farmers in making better choices regarding crop selections. The main objective of this research is to provide quick guidance to farmers with more accurate and effective crop recommendations by utilizing machine learning methods, global positioning system coordinates, and crop cloud data. Here, the recommendation can be more personalized, which enables the farmers to predict crops in their specific geographical context, taking into account factors like climate, soil composition, water availability, and local conditions. In this regard, an existing historical crop dataset that contains the state, district, year, area-wise production rate, crop name, and season was collected for 246,091 sample records from the Dataworld website, which holds data on 37 different crops from different areas of India. Also, for better analysis, a dataset was collected from the agriculture offices of the Rayagada, Koraput, and Gajapati districts in Odisha state, India. Both of these datasets were combined and stored using a Firebase cloud service. Thirteen different machine learning algorithms have been applied to the dataset to identify dependencies within the data. To facilitate this process, an Android application was developed using Android Studio (Electric Eel | 2023.1.1) Emulator (Version 32.1.14), Software Development Kit (SDK, Android SDK 33), and Tools. A model has been proposed that implements the SMOTE (Synthetic Minority Oversampling Technique) to balance the dataset, and then it allows for the implementation of 13 different classifiers, such as logistic regression, decision tree (DT), K-Nearest Neighbor (KNN), SVC (Support Vector Classifier), random forest (RF), Gradient Boost (GB), Bagged Tree, extreme gradient boosting (XGB classifier), Ada Boost Classifier, Cat Boost, HGB (Histogram-based Gradient Boosting), SGDC (Stochastic Gradient Descent), and MNB (Multinomial Naive Bayes) on the cloud dataset. It is observed that the performance of the SGDC method is 1.00 in accuracy, precision, recall, F1-score, and ROC AUC (Receiver Operating Characteristics–Area Under the Curve) and is 0.91 in sensitivity and 0.54 in specificity after applying the SMOTE. Overall, SGDC has a better performance compared to all other classifiers implemented in the predictions.
Keywords