Data in Brief (Aug 2022)
Transcriptome dataset of six human pathogen RNA viruses generated by nanopore sequencing
Abstract
Long-read sequencing (LRS) approaches have shed new light on the complexity of viral (Kakuk et al., 2021 [1]; Boldogkői et al., 2019 [2]; Depledge et al., 2019 [3]), bacterial (Yan et al., 2018 [4]) and eukaryotic (Tilgner et al., 2014 [5]) transcriptomes. Emerging RNA viruses are zoonotic (Woolhouse et al., 2016 [6]) and create public health problems, e.g., the influenza pandemic caused by the H1N1 virus (Fraser et al., 2009 [7]) and the current SARS-CoV-2 pandemic (Kim et al., 2020 [8]). In this study, we used nanopore sequencing to generate transcriptomic data for the structural and kinetic profiling of six important human pathogen RNA viruses: the H1N1 subtype of Influenza A virus (IVA), the Zika virus (ZIKV), the West Nile virus (WNV), the Crimean-Congo hemorrhagic fever virus (CCHFV), the Coxsackievirus [group B serotype 5 (CVB5)] and the Vesicular stomatitis Indiana virus (VSIV). The data also capture the response of host cells to viral infection. The raw sequencing data were filtered during basecalling, and only high-quality reads (Qscore ≥ 7) were mapped to the appropriate viral and host genomes. The length distributions of the sequencing reads were assessed, and the statistics of the data were plotted with the ReadStat.4 Python script. The datasets can be used to profile the transcriptomic landscape of RNA viruses, to provide information for novel gene annotations, to serve as a resource for studying virus-host interactions, and to analyze RNA base modifications. They can also be used to compare different sequencing techniques, library preparation approaches and bioinformatics pipelines, and to analyze the RNA profiles of viruses with small RNA genomes.
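The quality filter and read-length summary described above can be illustrated with a minimal Python sketch. It is not the ReadStat.4 script and not the basecaller's own filter. The function names (`mean_qscore`, `parse_fastq`, `filter_reads`, `length_stats`) are hypothetical, and the sketch assumes Phred+33 encoded FASTQ input and a per-read Qscore obtained by averaging per-base error probabilities, which is how nanopore tools commonly report it.

```python
import math
from statistics import mean, median


def mean_qscore(quality, offset=33):
    """Per-read Q score: average the per-base error probabilities, then
    convert back to the Phred scale (assumes Phred+33 encoding)."""
    probs = [10 ** (-(ord(ch) - offset) / 10) for ch in quality]
    return -10 * math.log10(sum(probs) / len(probs))


def parse_fastq(lines):
    """Yield (read_id, sequence, quality) from well-formed 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        header = header.strip()
        if not header:
            continue  # skip blank lines between records
        seq = next(it).strip()
        next(it)  # '+' separator line
        qual = next(it).strip()
        yield header[1:].split()[0], seq, qual


def filter_reads(records, min_q=7.0):
    """Keep reads whose mean Q score meets the threshold (7 in the abstract)."""
    return [r for r in records if r[2] and mean_qscore(r[2]) >= min_q]


def length_stats(records):
    """Summarize read lengths: count, mean, median and N50."""
    lengths = sorted((len(seq) for _, seq, _ in records), reverse=True)
    if not lengths:
        return {"count": 0, "mean": 0, "median": 0, "n50": 0}
    half, running, n50 = sum(lengths) / 2, 0, 0
    for length in lengths:
        running += length
        if running >= half:  # shortest read needed to cover half of all bases
            n50 = length
            break
    return {"count": len(lengths), "mean": mean(lengths),
            "median": median(lengths), "n50": n50}


# Toy input: three made-up reads, not taken from the dataset.
example = [
    "@r1 sample=toy", "ACGTACGT", "+", "IIIIIIII",  # Q40 at every base
    "@r2", "ACGT", "+", "####",                      # Q2 at every base
    "@r3", "ACGTAC", "+", "555555",                  # Q20 at every base
]
kept = filter_reads(parse_fastq(example))
stats = length_stats(kept)
```

Averaging error probabilities rather than Q values is a deliberate choice in this sketch: a read with a few very poor bases scores much lower than the arithmetic mean of its Q values would suggest, so the threshold is stricter for uneven reads.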