Virology Journal (Jan 2022)

Integrative profiling of Epstein–Barr virus transcriptome using a multiplatform approach

  • Ádám Fülöp,
  • Gábor Torma,
  • Norbert Moldován,
  • Kálmán Szenthe,
  • Ferenc Bánáti,
  • Islam A. A. Almsarrhad,
  • Zsolt Csabai,
  • Dóra Tombácz,
  • János Minárovits,
  • Zsolt Boldogkői

DOI
https://doi.org/10.1186/s12985-021-01734-6
Journal volume & issue
Vol. 19, no. 1
pp. 1 – 17

Abstract

Background Epstein–Barr virus (EBV) is an important human pathogenic gammaherpesvirus with carcinogenic potential. The EBV transcriptome has previously been analyzed using both Illumina-based short-read sequencing and Pacific Biosciences RS II-based long-read sequencing (LRS) technologies. Since the various sequencing methods have distinct strengths and limitations, the use of multiplatform approaches has proven to be valuable. The aim of this study is to provide a more complete picture of the transcriptomic architecture of EBV.
Methods In this work, we apply the Oxford Nanopore Technologies MinION LRS platform to generate novel transcriptomic data, and integrate these with data generated by others using another LRS approach, Pacific Biosciences RS II sequencing, as well as Illumina CAGE-Seq and Poly(A)-Seq approaches. Both amplified and non-amplified cDNA sequencing were used to generate sequencing reads, with both oligo-d(T)-primed and random oligonucleotide-primed reverse transcription. EBV transcripts are identified and annotated using the LoRTIA software suite developed in our laboratory.
Results This study detected novel genes embedded into longer host genes containing 5′-truncated in-frame open reading frames, which potentially encode N-terminally truncated proteins. We also detected a number of novel non-coding RNAs and transcript length isoforms encoded by the same genes but differing in their start and/or end sites. This study also reports the discovery of novel splice isoforms, many of which may have altered coding potential, and of novel replication-origin-associated transcripts. Additionally, novel mono- and multigenic transcripts were identified. An intricate meshwork of transcriptional overlaps was revealed.
Conclusions An integrative approach applying multiple sequencing technologies is suitable for the reliable identification of complex transcriptomes because each technique has different advantages and limitations, and the techniques can be used to validate the results obtained by any one approach.

Keywords