Genetics Selection Evolution (Nov 2011)

Accuracies of genomic breeding values in American Angus beef cattle using K-means clustering for cross-validation

  • Saatchi Mahdi,
  • McClure Mathew C,
  • McKay Stephanie D,
  • Rolf Megan M,
  • Kim JaeWoo,
  • Decker Jared E,
  • Taxis Tasia M,
  • Chapple Richard H,
  • Ramey Holly R,
  • Northcutt Sally L,
  • Bauck Stewart,
  • Woodward Brent,
  • Dekkers Jack CM,
  • Fernando Rohan L,
  • Schnabel Robert D,
  • Garrick Dorian J,
  • Taylor Jeremy F

DOI
https://doi.org/10.1186/1297-9686-43-40
Journal volume & issue
Vol. 43, no. 1
p. 40

Abstract

Background Genomic selection is a recently developed technology that is beginning to revolutionize animal breeding. The objective of this study was to estimate marker effects to derive prediction equations for direct genomic values for 16 routinely recorded traits of American Angus beef cattle and quantify corresponding accuracies of prediction.
Methods Deregressed estimated breeding values were used as observations in a weighted analysis to derive direct genomic values for 3570 sires genotyped using the Illumina BovineSNP50 BeadChip. These bulls were clustered into five groups using K-means clustering on pedigree estimates of additive genetic relationships between animals, with the aim of increasing within-group and decreasing between-group relationships. All five combinations of four groups were used for model training, with cross-validation performed in the group not used in training. Bivariate animal models were used for each trait to estimate the genetic correlation between deregressed estimated breeding values and direct genomic values.
Results Accuracies of direct genomic values ranged from 0.22 to 0.69 for the studied traits, with an average of 0.44. Predictions were more accurate when animals within the validation group were more closely related to animals in the training set. When training and validation sets were formed by random allocation, the accuracies of direct genomic values ranged from 0.38 to 0.85, with an average of 0.65, reflecting the greater relationship between animals in training and validation. The accuracies of direct genomic values obtained from training on older animals and validating in younger animals were intermediate to the accuracies obtained from K-means clustering and random clustering for most traits. The genetic correlation between deregressed estimated breeding values and direct genomic values ranged from 0.15 to 0.80 for the traits studied.
Conclusions These results suggest that genomic estimates of genetic merit can be produced in beef cattle at a young age but the recurrent inclusion of genotyped sires in retraining analyses will be necessary to routinely produce for the industry the direct genomic values with the highest accuracy.
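
The cross-validation design described in the Methods (cluster animals by relationship, then train on all groups but one and validate in the group left out) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each animal is represented by its row of the additive relationship matrix and uses plain Lloyd's K-means with squared Euclidean distance, whereas the abstract does not state the distance measure or software used. The function names `kmeans_groups` and `leave_one_group_out` and the toy three-family matrix are hypothetical.

```python
import random


def kmeans_groups(A, k, seed=0, max_iter=100):
    """Lloyd's K-means on the rows of a relationship matrix A (list of lists).

    Assumption: row i of A (relationships of animal i to all animals) is
    used as the feature vector for animal i.
    """
    rng = random.Random(seed)
    n = len(A)
    # Start from k distinct animals chosen at random as initial centres.
    centers = [list(A[i]) for i in rng.sample(range(n), k)]
    labels = [-1] * n
    for _ in range(max_iter):
        # Assign every animal to its nearest centre.
        new_labels = [
            min(range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(row, centers[c])))
            for row in A
        ]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Recompute each centre as the mean row of its members.
        for c in range(k):
            members = [A[i] for i in range(n) if labels[i] == c]
            if members:  # keep the old centre if a cluster empties
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def leave_one_group_out(labels, k):
    """Yield (training, validation) index lists, one pair per group."""
    for g in range(k):
        train = [i for i, lab in enumerate(labels) if lab != g]
        valid = [i for i, lab in enumerate(labels) if lab == g]
        yield train, valid


# Toy relationship matrix: 3 families of 4 half-sib-like animals
# (1.0 on the diagonal, 0.5 within a family, 0.0 between families).
families = [i // 4 for i in range(12)]
A = [[1.0 if i == j else (0.5 if families[i] == families[j] else 0.0)
      for j in range(12)] for i in range(12)]

labels = kmeans_groups(A, k=3)
for train, valid in leave_one_group_out(labels, k=3):
    print(len(train), "training animals;", len(valid), "validation animals")
```

Because closely related animals tend to fall into the same group, each validation group is only distantly related to its training set, which is the situation the abstract contrasts with random allocation. K-means depends on its starting centres, so a real analysis would typically try several seeds and check group sizes.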