The Indian Journal of Agricultural Sciences (Dec 2023)

Performance of clustering procedures for grouping germplasms based on mixture data with missing observations

  • RUPAM KUMAR SARKAR,
  • A R RAO,
  • S D WAHI,
  • K V BHAT

DOI
https://doi.org/10.56093/ijas.v82i12.26254
Journal volume & issue
Vol. 82, no. 12

Abstract

Missing observations are a common feature of breeding experiments that record a mixture of qualitative and quantitative traits, and they make it difficult to cluster the germplasms. The present study compares the performance of clustering procedures meant for mixture data across five clustering methods, six ways of imputing missing data and three levels of missing observations. All the clustering methods were found to be robust against imputation up to 5% missing observations. The INDOMIX and PRINQUAL methods, used in conjunction with k-means clustering, were found to perform better than the EM, ANN and PCAMIX methods for classification of germplasms when missing observations were imputed by (i) mean substitution in quantitative traits and frequency substitution in qualitative traits and (ii) multiple imputation in quantitative traits and 0 imputation in qualitative traits. The study was conducted during 2009–10 at the Indian Agricultural Statistics Research Institute, and the data used for illustration were obtained from the National Bureau of Plant Genetic Resources.
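Imputation strategy (i) can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `impute_mixture`, the trait names and the toy values are hypothetical, and "frequency substitution" is assumed here to mean replacing a missing category with the most frequent observed category of that trait. The completed data would then be passed to a mixture-data method such as INDOMIX or PRINQUAL before k-means clustering, which this sketch does not cover.

```python
from collections import Counter
from statistics import mean


def impute_mixture(rows, quantitative, qualitative):
    """Return a copy of rows with missing values (None) filled per trait.

    Quantitative traits: mean of the observed values (mean substitution).
    Qualitative traits: most frequent observed category (an assumed
    reading of "frequency substitution").
    """
    filled = [dict(r) for r in rows]  # leave the input untouched

    for col in quantitative:
        observed = [r[col] for r in rows if r[col] is not None]
        col_mean = mean(observed)
        for r in filled:
            if r[col] is None:
                r[col] = col_mean

    for col in qualitative:
        observed = [r[col] for r in rows if r[col] is not None]
        most_frequent = Counter(observed).most_common(1)[0][0]
        for r in filled:
            if r[col] is None:
                r[col] = most_frequent

    return filled


# Hypothetical toy accessions, not data from the study.
germplasm = [
    {"plant_height": 50.0, "seed_colour": "brown"},
    {"plant_height": None, "seed_colour": "brown"},
    {"plant_height": 70.0, "seed_colour": None},
    {"plant_height": 60.0, "seed_colour": "white"},
]

completed = impute_mixture(germplasm, ["plant_height"], ["seed_colour"])
```

Each trait is imputed from its own observed values only, so the sketch assumes every trait has at least one non-missing observation.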

Keywords