Applied Sciences (Nov 2022)
E3W—A Combined Model Based on GreedySoup Weighting Strategy for Chinese Agricultural News Classification
Abstract
With the continuous development of the internet and big data, modernization and informatization are rapidly being realized in the agricultural field. In this line, the volume of agricultural news is also increasing. This explosion of agricultural news has made accurate access to agricultural news difficult, and the spread of news about some agricultural technologies has slowed down, resulting in certain hindrance to the development of agriculture. To address this problem, we apply NLP to agricultural news texts to classify the agricultural news, in order to ultimately improve the efficiency of agricultural news dissemination. We propose a classification model based on ERNIE + DPCNN, ERNIE, EGC, and Word2Vec + TextCNN as sub-models for Chinese short-agriculture text classification (E3W), utilizing the GreedySoup weighting strategy and multi-model combination; specifically, E3W consists of four sub-models, the output of which is processed using the GreedySoup weighting strategy. In the E3W model, we divide the classification process into two steps: in the first step, the text is passed through the four independent sub-models to obtain an initial classification result given by each sub-model; in the second step, the model considers the relationship between the initial classification result and the sub-models, and assigns weights to this initial classification result. The final category with the highest weight is used as the output of E3W. To fully evaluate the effectiveness of the E3W model, the accuracy, precision, recall, and F1-score are used as evaluation metrics in this paper. We conduct multiple sets of comparative experiments on a self-constructed agricultural data set, comparing E3W and its sub-models, as well as performing ablation experiments. The results demonstrate that the E3W model can improve the average accuracy by 1.02%, the average precision by 1.62%, the average recall by 1.21%, and the average F1-score by 1.02%. Overall, E3W can achieve state-of-the-art performance in Chinese agricultural news classification.
Keywords