Annals of Forest Science (May 2024)
Whole-genome screening for near-diagnostic genetic markers for four western European white oak species identification
Abstract
Key message Mining genome-wide DNA sequences enabled the discovery of near-diagnostic markers for species assignment in four European white oaks (Quercus petraea (Matt.) Liebl., Quercus pubescens Willd., Quercus pyrenaica Willd., and Quercus robur L.) despite their low interspecific differentiation. Near-diagnostic markers are almost fully fixed in one species and absent in the three others. As a result, only a handful of markers are needed for species identification, making this genetic assay a very promising operational taxonomic assignment procedure in research and forestry.
Context Identifying species in the European white oak complex has been a long-standing concern in taxonomy, evolution, forest research, and management. Quercus petraea (Matt.) Liebl., Q. robur L., Q. pubescens Willd., and Q. pyrenaica Willd. are part of this species complex in western temperate Europe and hybridize in mixed stands, which makes species identification challenging.
Aims Our aim was to identify near-diagnostic single-nucleotide polymorphisms (SNPs) for each of the four species that are suitable for routine use and rapid diagnosis in research and applied forestry.
Methods We first scanned existing whole-genome and target-capture data sets in a reduced number of samples (training set) to identify candidate diagnostic SNPs, i.e., genomic positions characterized by a reference allele in one species and by the alternative allele in all other species. Allele frequencies of the candidate SNPs were then explored in a larger, range-wide sample of populations in each species (validation step).
Results We found a subset of 38 SNPs (10 for Q. petraea, 7 for Q. pubescens, 9 for Q. pyrenaica, and 12 for Q. robur) that showed near-diagnostic features across their species distribution ranges, with Q. pyrenaica and Q. pubescens exhibiting the highest (0.876) and lowest (0.747) diagnosticity, respectively.
Conclusions We provide a new, efficient, and reliable molecular tool for the identification of the species Q. petraea, Q. robur, Q. pubescens, and Q. pyrenaica, which can be used as a routine tool in forest research and management. This study highlights the resolution offered by whole-genome sequencing data to design near-diagnostic marker sets for taxonomic assignment, even for species complexes with relatively low differentiation.
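The screening idea described under Methods (one allele in a single species, the other allele in the three remaining species) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the 0/1/2 genotype coding, the function names, the toy data, the 0.9 threshold, and in particular the `contrast_score`, which is a simple frequency contrast and not necessarily the diagnosticity measure reported in the Results.

```python
from typing import Dict, List, Tuple

# Hypothetical input layout: SNP id -> species -> diploid genotypes,
# coded as the number of alternative alleles per individual (0, 1 or 2).
Genotypes = Dict[str, Dict[str, List[int]]]


def alt_allele_frequency(genotypes: List[int]) -> float:
    """Frequency of the alternative allele among diploid individuals."""
    if not genotypes:
        raise ValueError("at least one genotype is required")
    return sum(genotypes) / (2 * len(genotypes))


def contrast_score(freqs: Dict[str, float], target: str) -> float:
    """Absolute difference between the target species' allele frequency
    and the mean frequency across the other species (illustrative score)."""
    others = [f for species, f in freqs.items() if species != target]
    return abs(freqs[target] - sum(others) / len(others))


def candidate_snps(data: Genotypes, threshold: float = 0.9) -> List[Tuple[str, str, float]]:
    """Return (snp_id, species, score) for SNPs that contrast one species
    against all others at or above the assumed threshold."""
    hits = []
    for snp_id, per_species in data.items():
        freqs = {sp: alt_allele_frequency(g) for sp, g in per_species.items()}
        for target in freqs:
            score = contrast_score(freqs, target)
            if score >= threshold:
                hits.append((snp_id, target, score))
    return hits


# Toy training set (invented values, four individuals per species).
toy: Genotypes = {
    "snp_A": {  # nearly fixed for the alternative allele in Q. petraea only
        "petraea": [2, 2, 2, 2],
        "pubescens": [0, 0, 0, 0],
        "pyrenaica": [0, 0, 0, 0],
        "robur": [0, 0, 0, 1],
    },
    "snp_B": {  # intermediate frequencies everywhere: not informative
        "petraea": [1, 1, 0, 2],
        "pubescens": [1, 0, 1, 1],
        "pyrenaica": [0, 1, 1, 1],
        "robur": [1, 1, 1, 0],
    },
}

if __name__ == "__main__":
    for snp_id, species, score in candidate_snps(toy):
        print(f"{snp_id}: candidate marker for {species} (score {score:.3f})")
```

In a workflow of this kind, the candidates retained from the small training set would then be re-scored on the larger, range-wide sample, which corresponds to the validation step in the abstract.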
Keywords