Scientific Reports (Jan 2024)
A phased genome assembly of a Colombian Trypanosoma cruzi TcI strain and the evolution of gene families
Abstract
Abstract Chagas is an endemic disease in tropical regions of Latin America, caused by the parasite Trypanosoma cruzi. High intraspecies variability and genome complexity have been challenges to assemble high quality genomes needed for studies in evolution, population genomics, diagnosis and drug development. Here we present a chromosome-level phased assembly of a TcI T. cruzi strain (Dm25). While 29 chromosomes show a large collinearity with the assembly of the Brazil A4 strain, three chromosomes show both large heterozygosity and large divergence, compared to previous assemblies of TcI T. cruzi strains. Nucleotide and protein evolution statistics indicate that T. cruzi Marinkellei separated before the diversification of T. cruzi in the known DTUs. Interchromosomal paralogs of dispersed gene families and histones appeared before but at the same time have a more strict purifying selection, compared to other repeat families. Previously unreported large tandem arrays of protein kinases and histones were identified in this assembly. Over one million variants obtained from Illumina reads aligned to the primary assembly clearly separate the main DTUs. We expect that this new assembly will be a valuable resource for further studies on evolution and functional genomics of Trypanosomatids.