Computational and Structural Biotechnology Journal (Dec 2024)
PidTools: Algorithm and web tools for crop pedigree identification analysis
Abstract
Crop pedigrees record the kinship and genetic evolutionary history of breeding materials. Complete and accurate pedigree information is vital for effective genetic improvement of crops and for maximal exploitation of heterosis in crop production. When genealogical information is missing, it is difficult for breeders to select germplasm resources accurately on the basis of breeding experience alone. In this study, we developed PidTools, which comprises five sets of algorithms across three core modules, for accurate pedigree identification analysis. The algorithms and associated tools are suitable for all crops and support the reconstruction and visualization of a complete pedigree for breeding materials. The algorithms and tools were validated with the model crop maize. A genotype database was constructed using Maize6H-60K array data from 5791 maize inbred lines. The pedigree of the maize inbred line Jing72464 was identified without prior provision of any parental information. The pedigree information for Zheng58 was fully identified at the genome-wide scale. With regard to group identification, the parents of a doubled-haploid group were identified from the genotyping data. The pedigrees of 21 Dan340-derived lines were visualized using PidTools. The algorithms are incorporated into a user-friendly online analytical platform, PidTools-WS, with an associated customizable toolkit program, PidTools-CLI. These analytical tools and the present results provide useful information for future maize breeding. The PidTools online analysis platform is available at https://PidTools.plantdna.site/.
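The abstract does not describe the internals of the PidTools algorithms, so the following is only an illustrative sketch of the general idea behind genotype-based parent identification, not the authors' method. It assumes fully homozygous inbred lines, in which case every allele of a derived line should match at least one of its two parents, and it ranks candidate parent pairs by the fraction of loci they explain. The function names, the toy genotypes, and the scoring rule are all hypothetical.

```python
from itertools import combinations


def parental_match_rate(child, parent_a, parent_b):
    """Fraction of comparable loci where the child's allele matches
    at least one of the two candidate parents (assumed scoring rule)."""
    comparable = 0
    explained = 0
    for c, a, b in zip(child, parent_a, parent_b):
        if c is None or a is None or b is None:
            continue  # skip loci with a missing genotype call
        comparable += 1
        if c == a or c == b:
            explained += 1
    return explained / comparable if comparable else 0.0


def rank_parent_pairs(child_name, genotypes, top=3):
    """Score every pair of candidate lines and return the best `top` pairs."""
    child = genotypes[child_name]
    candidates = [name for name in genotypes if name != child_name]
    scored = [
        (parental_match_rate(child, genotypes[p1], genotypes[p2]), p1, p2)
        for p1, p2 in combinations(candidates, 2)
    ]
    # Highest score first; ties broken alphabetically by line name.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))
    return scored[:top]


# Toy data (hypothetical): one homozygous allele per locus for each line.
genotypes = {
    "child": list("AAGGTTCC"),
    "P1": list("AAGGCCAA"),
    "P2": list("GGAATTCC"),
    "P3": list("GGAACCAA"),
}

best = rank_parent_pairs("child", genotypes, top=1)[0]
print(best)
```

In this toy example only the pair P1 and P2 explains every locus of the derived line. A real analysis on array data would additionally need to handle heterozygous calls, genotyping errors, and closely related candidate lines, which this sketch ignores.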