BMC Bioinformatics (Jun 2020)

PVAmpliconFinder: a workflow for the identification of human papillomaviruses from high-throughput amplicon sequencing

  • Alexis Robitaille,
  • Rosario N. Brancaccio,
  • Sankhadeep Dutta,
  • Dana E. Rollison,
  • Marcis Leja,
  • Nicole Fischer,
  • Adam Grundhoff,
  • Tarik Gheit,
  • Massimo Tommasino,
  • Magali Olivier

DOI
https://doi.org/10.1186/s12859-020-03573-8
Journal volume & issue
Vol. 21, no. 1
pp. 1 – 18

Abstract

Background: The detection of known human papillomaviruses (PVs) from targeted wet-lab approaches has traditionally used PCR-based methods coupled with Sanger sequencing. With the introduction of next-generation sequencing (NGS), these approaches can be revisited to integrate the sequencing power of NGS. Although computational tools have been developed for metagenomic approaches to search for known or novel viruses in NGS data, no appropriate tool is available for the classification and identification of novel viral sequences from data produced by amplicon-based methods.

Results: We have developed PVAmpliconFinder, a data analysis workflow designed to rapidly identify and classify known and potentially new Papillomaviridae sequences from NGS amplicon sequencing with degenerate PV primers. Here, we describe the features of PVAmpliconFinder and its implementation using biological data obtained from amplicon sequencing of human skin swab specimens and oral rinses from healthy individuals.

Conclusions: PVAmpliconFinder identified putative new HPV sequences, including one that was validated by wet-lab experiments. PVAmpliconFinder can be easily modified and applied to other viral families. PVAmpliconFinder addresses a gap by providing a solution for the analysis of NGS amplicon sequencing, increasingly used in clinical research. The PVAmpliconFinder workflow, along with its source code, is freely available on the GitHub platform: https://github.com/IARCbioinfo/PVAmpliconFinder

Keywords