Evolutionary Bioinformatics (May 2018)
In Silico Study on Molecular Sequences for Identification of Paphiopedilum Species
Abstract
We retrieved all available Paphiopedilum sequences from NCBI (National Center for Biotechnology Information) and tested their capacity to resolve species, both as single loci and in combination. A total of 28 loci were analyzed. Among the nuclear loci, LFY gave the highest resolution, followed by ACO, DEF4, and RAD51. These 4 loci performed better than the widely used ITS region for Paphiopedilum identification. Among the chloroplast regions, the intergenic spacer atpB-rbcL gave the highest species resolution (76.7%), followed by matK, trnL, rpoC2, and ycf1. The divergence of CHS, XDH, 18S, Nad1, ccsA, rbcL, and ycf2 was very low, so these loci should not be used as identification markers for Paphiopedilum. In addition, 2-locus combinations significantly improved the resolving capability for the genus: 14 of the 36 combined data sets resolved interspecies relationships completely (100%). Indel information also provided effective supporting data for molecular discrimination of species.
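To illustrate how a species resolution percentage and the gain from a 2-locus combination might be computed, here is a minimal sketch. It rests on assumptions not stated in the abstract: a species counts as resolved when its aligned sequence differs from that of every other species (uncorrected p-distance greater than zero), and loci are combined by simple concatenation. The function names and the toy sequences are hypothetical and are not data or methods from the study.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned sequences.

    Sites with a gap ('-') or ambiguous base ('N') in either sequence are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x not in "-N" and y not in "-N"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def species_resolution(alignment):
    """Percent of species whose sequence differs from every other species.

    Assumed criterion: a pair with p-distance 0 leaves both species unresolved.
    `alignment` maps species name -> aligned sequence (one sequence per species).
    """
    unresolved = set()
    for (sp1, s1), (sp2, s2) in combinations(alignment.items(), 2):
        if p_distance(s1, s2) == 0.0:
            unresolved.update((sp1, sp2))
    return 100.0 * (len(alignment) - len(unresolved)) / len(alignment)


def concatenate(locus_a, locus_b):
    """Join two loci end to end for the species present in both."""
    shared = locus_a.keys() & locus_b.keys()
    return {sp: locus_a[sp] + locus_b[sp] for sp in sorted(shared)}


# Hypothetical toy alignments, not real Paphiopedilum sequences.
locus_a = {"sp1": "ACGTACGT", "sp2": "ACGTACGT", "sp3": "ACGTACGA", "sp4": "TCGTACGT"}
locus_b = {"sp1": "GGCC", "sp2": "GGCA", "sp3": "GGCC", "sp4": "GGCC"}

print(species_resolution(locus_a))                        # 50.0
print(species_resolution(locus_b))                        # 25.0
print(species_resolution(concatenate(locus_a, locus_b)))  # 100.0
```

In this toy case neither locus separates all four species alone, but the concatenated data set does, which mirrors the kind of improvement the abstract reports for 2-locus combinations. A tree-based or indel-aware criterion would give different numbers.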