PLoS ONE (Jan 2017)
Testing efficacy of distance and tree-based methods for DNA barcoding of grasses (Poaceae tribe Poeae) in Australia.
Abstract
In Australia, Poaceae tribe Poeae are represented by 19 genera and 99 species, including economically and environmentally important native and introduced pasture grasses [e.g. Poa (Tussock-grasses) and Lolium (Ryegrasses)]. We used this tribe, which is well characterised with regard to morphological diversity and evolutionary relationships, to test the efficacy of DNA barcoding methods. A reference library was generated that included 93.9% of species in Australia (408 individuals, mean = 3.7 individuals per species). Molecular data were generated for official plant barcoding markers (rbcL, matK) and the nuclear ribosomal internal transcribed spacer (ITS) region. We investigated the accuracy of specimen identifications using distance-based (nearest neighbour, best-close match, and threshold identification) and tree-based (maximum likelihood, Bayesian inference) methods, and applied species discovery methods (automatic barcode gap discovery, Poisson tree processes) based on molecular data to assess congruence with recognised species. Across all methods, the success rate for specimen identification was high for genera (87.5-99.5%) and low for species (25.6-44.6%). Distance- and tree-based methods were equally ineffective in providing accurate identifications for specimens to species rank (26.1-44.6% and 25.6-31.3%, respectively). The ITS marker achieved the highest success rate for specimen identification at both generic and species ranks across the majority of methods. For distance-based analyses, the best-close match method provided the greatest accuracy for identification of individuals, with a high percentage of "correct" (97.6%) and a low percentage of "incorrect" (0.3%) generic identifications, based on the ITS marker. For tribe Poeae, and likely for other grass lineages, sequence data in the standard DNA barcode markers are not variable enough for accurate identification of specimens to species rank.
For recently diverged grass species, similar challenges are encountered in applying genetic and morphological data to species delimitation: in both data types, taxonomic signal is limited by extensive infra-specific variation and by polymorphisms shared among species.
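To make the distance-based criteria named in the abstract concrete, the following is a minimal sketch of a best-close match test as it is commonly described: a query is assigned the species of its nearest reference sequence only if that distance falls within a threshold, and the result is scored as "correct", "incorrect", "ambiguous", or "no id". The function names, the uncorrected p-distance, the 1% threshold, and the toy sequences are all assumptions made for illustration; they are not taken from the study, which may have used a different distance model and threshold.

```python
from typing import List, Tuple


def p_distance(a: str, b: str) -> float:
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or ambiguous base are skipped.
    (Assumption: the study may have used a different distance model.)
    """
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        return 1.0  # no comparable sites: treat as maximally distant
    return sum(x != y for x, y in pairs) / len(pairs)


def best_close_match(query_seq: str,
                     query_species: str,
                     library: List[Tuple[str, str]],
                     threshold: float = 0.01) -> str:
    """Score one query against a reference library of (species, sequence).

    Returns "no id" if the nearest reference is farther than `threshold`,
    "ambiguous" if equally near references belong to several species,
    otherwise "correct" or "incorrect" against the known query species.
    The 0.01 default is a hypothetical value chosen for illustration.
    """
    dists = [(p_distance(query_seq, seq), sp) for sp, seq in library]
    if not dists:
        return "no id"
    best = min(d for d, _ in dists)
    if best > threshold:
        return "no id"
    nearest_species = {sp for d, sp in dists if d == best}
    if len(nearest_species) > 1:
        return "ambiguous"
    return "correct" if query_species in nearest_species else "incorrect"


# Toy reference library (invented sequences, not real barcode data).
reference = [
    ("Poa sp. A", "ACGTACGTAC"),
    ("Poa sp. B", "ACGTACGTAA"),
    ("Lolium sp. C", "TTTTACGTAC"),
]

result = best_close_match("ACGTACGTAC", "Poa sp. A", reference)
```

The same scoring can be run at genus rank by passing genus names instead of species names as labels. When closely related species share identical sequences, more queries fall into the "ambiguous" class, which is consistent with the low species-level success rates reported above.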