The Plant Genome (Jun 2019)
Multiple Maize Reference Genomes Impact the Identification of Variants by Genome-Wide Association Study in a Diverse Inbred Panel
Abstract
Use of a single reference genome for genome-wide association studies (GWAS) limits the gene space represented to that of a single accession. This limitation can complicate identification and characterization of genes located within presence–absence variations (PAVs). In this study, we present the draft de novo genome assembly of ‘PHJ89’, an ‘Oh43’-type inbred line of maize (Zea mays L.). Using three separate reference genome assemblies (‘B73’, ‘PH207’, and PHJ89) that represent the predominant germplasm groups of maize, we generated three separate whole-seedling gene expression profiles and single nucleotide polymorphism (SNP) matrices from a panel of 942 diverse inbred lines. We identified 34,447 (B73), 39,672 (PH207), and 37,436 (PHJ89) transcripts that are not present in the respective reference genome assemblies. Genome-wide association studies were conducted in the panel of 942 inbred lines with both the SNP and expression values to map Sugarcane mosaic virus (SCMV) resistance. Highlighting the impact of alternative reference genomes in gene discovery, the GWAS for SCMV resistance with expression values as a surrogate measure of PAV robustly detected the physical location of a known resistance gene when the B73 reference, which contains the gene, was used, but not when the PH207 reference was used. This study provides the valuable resource of the Oh43-type PHJ89 genome assembly as well as SNP and expression data for 942 individuals generated from three different reference genomes.