Aquaculture Reports (Jul 2021)
Data imputation and machine learning improve association analysis and genomic prediction for resistance to fish photobacteriosis in the gilthead sea bream
Abstract
Disease resistance represents a key trait for breeding programs in aquaculture species. Here we re-analysed 2bRAD sequence data from two experimental challenges of gilthead sea bream with Photobacterium damselae subsp. piscicida. Using a high-quality reference genome, we carried out variant calling and data imputation with Beagle to obtain a large set of SNPs (80,744). This allowed the identification of eight novel QTLs for resistance to photobacteriosis across different chromosomes and revealed a highly polygenic genetic architecture. Bayesian regression approaches and machine learning methods (support vector machines and linear bagging) were compared to evaluate their relative performance in classifying individuals as susceptible or resistant. In both data sets, machine learning methods, particularly linear bagging, showed higher Matthews correlation coefficient (MCC) and accuracy values, with a 20–70% increase in prediction performance. Overall, machine learning methods should be explored in parallel with parametric regression approaches to increase the chances of highly effective genomic prediction.
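For readers unfamiliar with the two evaluation metrics named above, the following is a minimal sketch of how accuracy and the MCC can be computed from binary susceptible/resistant predictions. It is illustrative only and is not taken from the study: the function names, the 0/1 label coding (1 = resistant, 0 = susceptible) and the toy labels are all assumptions. The MCC is computed as (TP·TN − FP·FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

```python
import math


def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (assumed coding: 1 = resistant, 0 = susceptible)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        elif truth == 0 and pred == 1:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of individuals classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero (a common convention)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Toy labels, invented for illustration (not data from the study).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]

counts = confusion_counts(y_true, y_pred)
acc_value = accuracy(*counts)
mcc_value = mcc(*counts)
print(counts, acc_value, mcc_value)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it stays informative when the susceptible and resistant classes are unbalanced, which is one reason it is often reported alongside accuracy for binary classifiers.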