Plant Phenomics (Jan 2024)

Handling the Challenges of Small-Scale Labeled Data and Class Imbalances in Classifying the N and K Statuses of Rubber Leaves Using Hyperspectroscopy Techniques

  • Wenfeng Hu,
  • Weihao Tang,
  • Chuang Li,
  • Jinjing Wu,
  • Hong Liu,
  • Chao Wang,
  • Xiaochuan Luo,
  • Rongnian Tang

DOI
https://doi.org/10.34133/plantphenomics.0154
Journal volume & issue
Vol. 6

Abstract

Read online

The nutritional status of rubber trees (Hevea brasiliensis) is inseparable from the production of natural rubber. Nitrogen (N) and potassium (K) levels in rubber leaves are 2 crucial criteria that reflect the nutritional status of the rubber tree. Advanced hyperspectral technology can evaluate N and K statuses in leaves rapidly. However, high bias and uncertain results will be generated when using a small size and imbalance dataset to train a spectral estimaion model. A typical solution of laborious long-term nutrient stress and high-intensive data collection deviates from rapid and flexible advantages of hyperspectral tech. Therefore, a less intensive and streamlined method, remining information from hyperspectral image data, was assessed. From this new perspective, a semisupervised learning (SSL) method and resampling techniques were employed for generating pseudo-labeling data and class rebalancing. Subsequently, a 5-classification spectral model of the N and K statuses of rubber leaves was established. The SSL model based on random forest classifiers and mean sampling techniques yielded optimal classification results both on imbalance/balance dataset (weighted average precision 67.8/78.6%, macro averaged precision 61.2/74.4%, and weighted recall 65.7/78.5% for the N status). All data and code could be viewed on the:Github https://github.com/WeehowTang/SSL-rebalancingtest. Ultimately, we proposed an efficient way to rapidly and accurately monitor the N and K levels in rubber leaves, especially in the scenario of small annotation and imbalance categories ratios.