Exploring the effects of training samples on the accuracy of crop mapping with machine learning algorithm

Yangyang Fu; Ruoque Shen; Chaoqing Song; Jie Dong; Wei Han; Tao Ye; Wenping Yuan

Science of Remote Sensing (Jun 2023)

Affiliations

Yangyang Fu: School of Atmospheric Sciences, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
Ruoque Shen: School of Atmospheric Sciences, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
Chaoqing Song: School of Atmospheric Sciences, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Sun Yat-sen University, Zhuhai, 519082, Guangdong, China
Jie Dong: College of Geomatics & Municipal Engineering, Zhejiang University of Water Resources and Electric Power, Hangzhou, 310018, Zhejiang, China
Wei Han: Shandong General Station of Agricultural Technology Extension, Jinan, 250013, Shandong, China
Tao Ye: Key Laboratory of Environmental Change and Natural Disaster, Ministry of Education, Faculty of Geographical Science, Beijing Normal University, Beijing, 100875, China
Wenping Yuan: School of Atmospheric Sciences, Southern Marine Science and Engineering Guangdong Laboratory (Zhuhai), Sun Yat-sen University, Zhuhai, 519082, Guangdong, China; Corresponding author.

Journal volume & issue: Vol. 7, p. 100081

Abstract

Machine learning algorithms are widely used for crop classification and have been applied to map the distribution of various crops at regional and national scales. Previous studies have emphasized that the number of training samples strongly influences the classification accuracy of machine learning algorithms, which has motivated extensive training sample collection efforts. Taking winter wheat as an example, this study challenges that principle by selecting training samples with the time-weighted dynamic time warping (TWDTW) method, and finds that the classification accuracy of machine learning algorithms depends heavily on the representativeness and class proportion of the training samples rather than on their quantity. As the training samples become more representative, i.e., as they more comprehensively reflect the characteristics of winter wheat, the classification accuracy improves steadily. The best classification accuracy is achieved when the training samples of winter wheat and non-winter wheat are selected according to the ratio of their statistical areas. In contrast, using 1,000 versus 10,000 training samples produced only slight differences in overall accuracy (91.26% and 90.74%), producer’s accuracy (86.33% and 86.65%), and user’s accuracy (97.37% and 96.01%). Overall, this study demonstrates that the characteristics of training samples have a strong impact on the classification accuracy of machine learning algorithms, and that training samples generated by the TWDTW method are reliable for crop mapping.
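The abstract names TWDTW without describing it, so the following is only a minimal sketch of the general idea: ordinary dynamic time warping whose per-step cost is increased by a logistic penalty on the time gap between the two matched observations. The function names, the `alpha` and `beta` values, the closed-boundary alignment, and the toy vegetation-index series are all illustrative assumptions, not the authors' implementation or settings.

```python
import math


def logistic_time_weight(dt, alpha=0.1, beta=50.0):
    """Logistic penalty on a time gap dt (in days).

    alpha (steepness) and beta (midpoint) are illustrative values only.
    """
    return 1.0 / (1.0 + math.exp(-alpha * (dt - beta)))


def twdtw_distance(ref_vals, ref_days, obs_vals, obs_days,
                   alpha=0.1, beta=50.0):
    """Time-weighted DTW distance between a reference curve and a pixel series.

    Simplification (assumption): both series are aligned end to end, and the
    local cost is the absolute value difference plus the time penalty.
    """
    n, m = len(ref_vals), len(obs_vals)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning the first i reference
    # points with the first j observed points.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            value_cost = abs(ref_vals[i - 1] - obs_vals[j - 1])
            time_cost = logistic_time_weight(
                abs(ref_days[i - 1] - obs_days[j - 1]), alpha, beta)
            acc[i][j] = value_cost + time_cost + min(
                acc[i - 1][j],      # reference point i matched again
                acc[i][j - 1],      # observed point j matched again
                acc[i - 1][j - 1],  # both series advance
            )
    return acc[n][m]


# Hypothetical toy data: a reference winter-wheat curve and two pixels.
days = [0, 30, 60, 90, 120]
reference = [0.2, 0.3, 0.6, 0.8, 0.4]
wheat_like_pixel = [0.25, 0.3, 0.55, 0.8, 0.35]
flat_pixel = [0.3, 0.3, 0.3, 0.3, 0.3]

d_wheat = twdtw_distance(reference, days, wheat_like_pixel, days)
d_flat = twdtw_distance(reference, days, flat_pixel, days)
print(d_wheat < d_flat)  # the wheat-like pixel is closer to the reference
```

Under this sketch, pixels with a small distance to the reference curve would be candidate winter-wheat training samples and pixels with a large distance candidate non-winter-wheat samples; the thresholds used to separate them would have to come from the paper itself.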

Published in Science of Remote Sensing

ISSN: 2666-0172 (Online)
Publisher: Elsevier
Country of publisher: Netherlands
LCC subjects: Geography. Anthropology. Recreation: Physical geography; Science
Website: https://www.journals.elsevier.com/science-of-remote-sensing
