International Journal of Molecular Sciences (Mar 2023)

Full-Length Transcriptome of the Great Himalayan Leaf-Nosed Bats (Hipposideros armiger) Optimized Genome Annotation and Revealed the Expression of Novel Genes

  • Mingyue Bao,
  • Xue Wang,
  • Ruyi Sun,
  • Zhiqiang Wang,
  • Jiqian Li,
  • Tinglei Jiang,
  • Aiqing Lin,
  • Hui Wang,
  • Jiang Feng

DOI
https://doi.org/10.3390/ijms24054937
Journal volume & issue
Vol. 24, no. 5
p. 4937

Abstract

The Great Himalayan leaf-nosed bat (Hipposideros armiger) is one of the most representative species of echolocating bats and an ideal model for studying the bat echolocation system. An incomplete reference genome and the limited availability of full-length cDNAs have hindered the identification of alternatively spliced transcripts, which has slowed basic research on bat echolocation and evolution. In this study, we analyzed five organs from H. armiger for the first time using PacBio single-molecule real-time (SMRT) sequencing. A total of 120 GB of subreads were generated, including 1,472,058 full-length non-chimeric (FLNC) sequences. Transcriptome structural analysis detected 34,611 alternative splicing (AS) events and 66,010 alternative polyadenylation (APA) sites. Moreover, a total of 110,611 isoforms were identified, consisting of 52% new isoforms of known genes and 5% isoforms of novel gene loci, as well as 2112 novel genes that had not previously been annotated in the current reference genome of H. armiger. Furthermore, several key novel genes, including Pol, RAS, NFKB1, and CAMK4, were identified as being associated with nervous system, signal transduction, and immune system processes, and may be involved in regulating the auditory nervous perception and immune system functions that support echolocation in bats. In conclusion, the full-length transcriptome results optimized and supplemented the existing H. armiger genome annotation in multiple ways and revealed novel or previously unrecognized protein-coding genes and isoforms, which can be used as a reference resource.

Keywords