Frontiers in Plant Science (Dec 2024)

Using the Pearson’s correlation coefficient as the sole metric to measure the accuracy of quantitative trait prediction: is it sufficient?

  • Shouhui Pan,
  • Zhongqiang Liu,
  • Yanyun Han,
  • Dongfeng Zhang,
  • Xiangyu Zhao,
  • Jinlong Li,
  • Kaiyi Wang

DOI
https://doi.org/10.3389/fpls.2024.1480463
Journal volume & issue
Vol. 15

Abstract

Evaluating the accuracy of quantitative trait prediction is crucial for choosing the best model among several candidates in plant breeding. Pearson’s correlation coefficient (PCC), a metric that quantifies the strength of the linear association between two variables, is widely used to evaluate the accuracy of quantitative trait prediction models and performs well in most circumstances. However, PCC may not always offer a comprehensive view of predictive accuracy, especially in cases involving nonlinear relationships or complex dependencies in machine learning-based methods. Many papers on quantitative trait prediction use PCC as the sole metric to evaluate the accuracy of their models, which is insufficient and limited from a formal perspective. This study addresses the issue by presenting a typical example and conducting a comparative analysis of PCC and nine other evaluation metrics using four traditional methods and four machine learning-based methods, thereby contributing to improving the practical applicability and reliability of plant quantitative trait prediction models. PCC should be employed in conjunction with other evaluation metrics, selected according to the specific application scenario, to reduce the likelihood of drawing misleading conclusions.
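
The point that PCC alone can mislead may be easier to see with a small numerical sketch. The Python below is illustrative only: the trait values, the two hypothetical models, and the choice of RMSE and MAE as companion metrics are assumptions made for this example, since the abstract does not name the nine other metrics or give any data. Because PCC measures only linear association, a model whose predictions are a shifted and compressed copy of the observations still scores a PCC of 1.

```python
import math


def pearson(y_true, y_pred):
    """Pearson's correlation coefficient between observed and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical observed trait values (for example, plant height in cm).
# These numbers are invented for illustration and are not from the paper.
observed = [150.0, 160.0, 170.0, 180.0, 190.0]

# Hypothetical model A: small, unsystematic errors around the observations.
model_a = [152.0, 158.0, 171.0, 179.0, 192.0]

# Hypothetical model B: an exact linear function of the observations,
# but compressed and shifted, so every prediction is far too low.
model_b = [0.5 * y + 40.0 for y in observed]

for name, pred in [("model A", model_a), ("model B", model_b)]:
    print(
        f"{name}: PCC={pearson(observed, pred):.4f}, "
        f"RMSE={rmse(observed, pred):.2f}, MAE={mae(observed, pred):.2f}"
    )
```

Ranked by PCC alone, model B looks at least as good as model A, yet its RMSE and MAE are more than an order of magnitude larger. Reporting an error-based metric alongside PCC exposes this kind of systematic bias.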

Keywords