PeerJ (Aug 2021)
Identifying transcript 5′ capped ends in Plasmodium falciparum
Abstract
Background. The genome of the human malaria parasite Plasmodium falciparum is poorly annotated, particularly with respect to the 5′ capped ends of its mRNA transcripts. New approaches are needed to fully catalog P. falciparum transcripts in order to understand gene function and regulation in this organism.

Methods. We developed a transcriptomic method based on next-generation sequencing of complementary DNA (cDNA) enriched for full-length fragments using eIF4E, a 5′ cap-binding protein, alongside an unenriched control. A DNA sequencing adapter was added after enrichment of full-length cDNA using two different ligation protocols. From the mapped sequence reads, enrichment scores were calculated for all transcribed nucleotides and used to calculate P-values of 5′ capped nucleotide enrichment. Sensitivity and accuracy were increased by combining P-values from replicate experiments. Data were obtained for the ring, trophozoite and schizont stages of P. falciparum intra-erythrocytic development.

Results. 5′ capped nucleotide signals were mapped to 17,961 non-overlapping P. falciparum genomic intervals. Analysis of the dominant 5′ capped nucleotide in these genomic intervals revealed two groups with distinctive epigenetic features and sequence patterns. A total of 4,512 transcripts were annotated as 5′ capped based on the correspondence of the 5′ end with a 5′ capped nucleotide annotated from full-length cDNA data.

Discussion. The presence of two groups of 5′ capped nucleotides suggests that alternative mechanisms may exist for producing 5′ capped transcript ends in P. falciparum. The 5′ capped transcripts that are antisense to, outside of, or partially overlapping coding regions may be important regulators of gene function in P. falciparum.
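The Methods describe combining P-values from replicate experiments, but the abstract does not say which combination procedure was used. As an illustration only, the sketch below uses Fisher's method, one common choice, which is an assumption here and not necessarily the authors' approach. The function name `fisher_combine` and the example values are hypothetical, and the replicates are assumed to be independent.

```python
import math


def fisher_combine(p_values):
    """Combine independent P-values with Fisher's method.

    Assumption: this is an illustrative stand-in; the abstract does not
    specify how the replicate P-values were actually combined.
    """
    if not p_values:
        raise ValueError("need at least one P-value")
    if any(p <= 0.0 or p > 1.0 for p in p_values):
        raise ValueError("P-values must lie in (0, 1]")

    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p); under the null hypothesis X follows
    # a chi-square distribution with 2k degrees of freedom. We keep X / 2.
    half_x = -sum(math.log(p) for p in p_values)

    # For even degrees of freedom the chi-square survival function has a
    # closed form: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_x / i
        total += term
    return min(1.0, math.exp(-half_x) * total)


# Hypothetical P-values for one nucleotide position in two replicates.
combined = fisher_combine([0.05, 0.05])
print(combined)
```

With two replicates that each give a borderline P-value, the combined P-value is smaller than either input, which is the sense in which pooling replicates can increase sensitivity.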
Keywords