Bioinformatics and Biology Insights (Jun 2024)

Rapid Targeted Assembly of the Proteome Reveals Evolutionary Variation of GC Content in Avian Lice

  • Avery R Grant
  • Kevin P Johnson
  • Edward L Stanley
  • James Baldwin-Brown
  • Stanislav Kolenčík
  • Julie M Allen

DOI: https://doi.org/10.1177/11779322241257991
Journal volume & issue: Vol. 18

Abstract

Nucleotide base composition plays an influential role in the molecular mechanisms involved in gene function, phenotype, and amino acid composition. GC content (proportion of guanine and cytosine in DNA sequences) shows a high level of variation within and among species. Many studies measure GC content in a small number of genes, which may not be representative of genome-wide GC variation. One challenge when assembling extensive genomic data sets for these studies is the significant amount of resources (monetary and computational) associated with data processing, and many bioinformatic tools have not been optimized for resource efficiency. Using a high-performance computing (HPC) cluster, we manipulated resources provided to the targeted gene assembly program, automated target restricted assembly method (aTRAM), to determine an optimum way to run the program to maximize resource use. Using our optimum assembly approach, we assembled and measured GC content of all of the protein-coding genes of a diverse group of parasitic feather lice. Of the 499 426 genes assembled across 57 species, feather lice were GC-poor (mean GC = 42.96%) with a significant amount of variation within and between species (GC range = 19.57%-73.33%). We found a significant correlation between GC content and standard deviation per taxon for overall GC and GC3, which could indicate selection for G and C nucleotides in some species. Phylogenetic signal of GC content was detected in both GC and GC3. This research provides a large-scale investigation of GC content in parasitic lice, laying the foundation for understanding the basis of variation in base composition across species.
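
For readers unfamiliar with the two measures, the following is a minimal Python sketch of how overall GC content and GC at third codon positions (GC3) might be computed for a single protein-coding sequence. It is an illustration only, not the authors' pipeline: the function names `gc_content` and `gc3_content` are hypothetical, and the sketch assumes the input is an in-frame coding sequence that starts at the first codon position, with ambiguous bases (such as N) excluded from the denominator.

```python
def gc_content(seq: str) -> float:
    """Percent G+C among unambiguous bases (A, C, G, T) in a sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    # Return NaN rather than raising when no unambiguous bases are present.
    return 100.0 * gc / total if total else float("nan")


def gc3_content(cds: str) -> float:
    """Percent G+C at third codon positions.

    Assumption: `cds` is in frame and starts at codon position 1.
    Any trailing partial codon is ignored.
    """
    usable = len(cds) - len(cds) % 3  # drop an incomplete final codon
    third_positions = cds[2:usable:3]  # every third base, starting at index 2
    return gc_content(third_positions)


if __name__ == "__main__":
    example = "ATGGCCTTA"  # codons: ATG, GCC, TTA
    print(f"GC  = {gc_content(example):.2f}%")
    print(f"GC3 = {gc3_content(example):.2f}%")
```

Applied per gene and then summarized per taxon, values like these would give the means and standard deviations of the kind the abstract reports.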