Agriculture (Mar 2024)

Combinations of Feature Selection and Machine Learning Models for Object-Oriented “Staple-Crop-Shifting” Monitoring Based on Gaofen-6 Imagery

  • Yujuan Cao,
  • Jianguo Dai,
  • Guoshun Zhang,
  • Minghui Xia,
  • Zhitan Jiang

DOI
https://doi.org/10.3390/agriculture14030500
Journal volume & issue
Vol. 14, no. 3
p. 500

Abstract

This paper combines feature selection with machine learning algorithms to achieve object-oriented classification of crops in Gaofen-6 remote sensing images. The study provides technical support and methodological references for research on regional monitoring of food crops and precision agriculture management. “Staple-food-shifting” refers to the planting of cash crops on cultivated land that should have been planted with staple crops such as wheat, rice, and maize, resulting in a change in the type of crop grown on that land. An accurate understanding of the spatial and temporal patterns of “staple-food-shifting” on arable land is an important basis for rationalizing land use and protecting food security. In this study, the Shihezi Reclamation Area in Xinjiang is selected as the study area, and Gaofen-6 satellite images are used to study the changes in the cultivated area of staple food crops and their regional distribution. First, the images are segmented at multiple scales and four types of features are extracted, totaling sixty-five feature variables. Second, six feature selection algorithms are used to optimize the feature variables, and a total of nine feature combinations are designed. Finally, k-Nearest Neighbor (KNN), Random Forest (RF), and Decision Tree (DT) are used as the base models for image classification to identify the best combination of feature selection method and machine learning model for classifying wheat, maize, and cotton. The results show that the proposed optimal feature selection method (OFSM) can significantly improve the classification accuracy, by up to 15.02%, compared to the Random Forest Feature Importance Selection (RF-FI), Random Forest Recursive Feature Elimination (RF-RFE), and XGBoost Feature Importance Selection (XGBoost-FI) methods. Among the combinations tested, the OF-RF-RFE model built on KNN performs best, with the overall accuracy, average user's accuracy, average producer's accuracy, and kappa coefficient reaching 90.68%, 87.86%, 86.68%, and 0.84, respectively.
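The four accuracy measures reported above are all derived from a classification confusion matrix. The following is a minimal sketch of how they are conventionally computed; it is not the authors' code. The function name `accuracy_metrics`, the row/column convention (rows are reference classes, columns are predicted classes), and the 3×3 matrix are assumptions made for illustration, and the numbers are invented rather than taken from the paper.

```python
def accuracy_metrics(cm):
    """Compute standard remote-sensing accuracy measures.

    Assumed convention: cm[i][j] is the number of samples whose reference
    class is i and whose predicted class is j. Every class is assumed to
    occur at least once in both the reference and the predicted labels.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(n)]
    ref_totals = [sum(cm[i]) for i in range(n)]                      # row sums
    pred_totals = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # column sums

    overall = sum(diag) / total
    # Producer's accuracy: correct samples / reference samples of that class.
    producer = [diag[i] / ref_totals[i] for i in range(n)]
    # User's accuracy: correct samples / samples predicted as that class.
    user = [diag[j] / pred_totals[j] for j in range(n)]
    # Cohen's kappa: agreement beyond what chance alone would produce.
    expected = sum(ref_totals[k] * pred_totals[k] for k in range(n)) / total ** 2
    kappa = (overall - expected) / (1 - expected)

    return {
        "overall_accuracy": overall,
        "avg_user_accuracy": sum(user) / n,
        "avg_producer_accuracy": sum(producer) / n,
        "kappa": kappa,
    }


# Hypothetical confusion matrix for wheat, maize, and cotton (illustrative only).
example_cm = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]
metrics = accuracy_metrics(example_cm)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Averaging the per-class user's and producer's accuracies without weighting, as done here, is one common choice; the paper's exact averaging scheme is not stated in the abstract.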

Keywords