BMC Bioinformatics (May 2010)

Analysing 454 amplicon resequencing experiments using the modular and database oriented Variant Identification Pipeline

  • Deforce Dieter,
  • Coucke Paul,
  • Van Nieuwerburgh Filip,
  • Pattyn Filip,
  • Sabbe Nick,
  • Lefever Steve,
  • De Leeneer Kim,
  • De Schrijver Joachim M,
  • Vandesompele Jo,
  • Bekaert Sofie,
  • Hellemans Jan,
  • Van Criekinge Wim

DOI
https://doi.org/10.1186/1471-2105-11-269
Journal volume & issue
Vol. 11, no. 1
p. 269

Abstract

Background: Next-generation amplicon sequencing enables high-throughput genetic diagnostics, sequencing multiple genes in several patients together in one sequencing run. Currently, no open-source out-of-the-box software solution exists that reliably reports detected genetic variations and that can be used to improve future sequencing effectiveness by analyzing the PCR reactions.

Results: We developed an integrated database-oriented software pipeline for analysis of 454/Roche GS-FLX amplicon resequencing experiments using Perl and a relational database. The pipeline enables variation detection, validation of the detected variations, and advanced data analysis, which provides information that can be used to optimize PCR efficiency using traditional means. The modular approach enables customization of the pipeline where needed and allows researchers to adapt the analysis pipeline to their experiments. Clear documentation and training data are available to test and validate the pipeline prior to using it on real sequencing data.

Conclusions: We designed an open-source database-oriented pipeline that enables advanced analysis of 454/Roche GS-FLX amplicon resequencing experiments using SQL statements. This modular database approach allows easy coupling with other pipeline modules such as variant interpretation or a LIMS. A set of standard reporting scripts is also available.
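As an illustration of what analysis "using SQL statements" might look like in a database-oriented pipeline of this kind, the following is a minimal sketch in Python with an in-memory SQLite database. The schema (tables `reads` and `variants`), the column names, the sample and amplicon labels, and the helper functions are all hypothetical and chosen for the example; they are not the pipeline's actual database layout, and the pipeline itself is written in Perl. The first query counts reads per amplicon, the sort of figure that could point to under-performing PCR reactions; the second reports the fraction of reads supporting each variant.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions,
# not the actual layout used by the Variant Identification Pipeline.
SCHEMA = """
CREATE TABLE reads (
    read_id  INTEGER PRIMARY KEY,
    sample   TEXT NOT NULL,
    amplicon TEXT NOT NULL
);
CREATE TABLE variants (
    read_id  INTEGER NOT NULL REFERENCES reads(read_id),
    position INTEGER NOT NULL,
    ref      TEXT NOT NULL,
    alt      TEXT NOT NULL
);
"""

# Reads per amplicon, lowest coverage first.
COVERAGE_SQL = """
SELECT amplicon, COUNT(*) AS n_reads
FROM reads
GROUP BY amplicon
ORDER BY n_reads ASC, amplicon ASC
"""

# Fraction of reads in each sample/amplicon that carry a given variant.
# Assumes a read contributes at most one row per variant.
VARIANT_FREQ_SQL = """
SELECT r.sample, r.amplicon, v.position, v.ref, v.alt,
       COUNT(*) AS n_variant_reads,
       COUNT(*) * 1.0 / MAX(c.n_reads) AS frequency
FROM variants v
JOIN reads r ON r.read_id = v.read_id
JOIN (SELECT sample, amplicon, COUNT(*) AS n_reads
      FROM reads
      GROUP BY sample, amplicon) c
  ON c.sample = r.sample AND c.amplicon = r.amplicon
GROUP BY r.sample, r.amplicon, v.position, v.ref, v.alt
ORDER BY r.sample, r.amplicon, v.position
"""


def build_demo_db():
    """Create an in-memory database filled with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    reads = [
        (1, "P1", "amp_A"), (2, "P1", "amp_A"),
        (3, "P1", "amp_A"), (4, "P1", "amp_A"),
        (5, "P1", "amp_B"), (6, "P1", "amp_B"),
    ]
    conn.executemany("INSERT INTO reads VALUES (?, ?, ?)", reads)
    variants = [
        (1, 101, "A", "G"),
        (2, 101, "A", "G"),
        (5, 57, "C", "T"),
    ]
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?)", variants)
    conn.commit()
    return conn


def amplicon_coverage(conn):
    """Return (amplicon, n_reads) rows, lowest coverage first."""
    return conn.execute(COVERAGE_SQL).fetchall()


def variant_frequencies(conn):
    """Return one row per sample/amplicon/variant with its read fraction."""
    return conn.execute(VARIANT_FREQ_SQL).fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    for row in amplicon_coverage(db):
        print(row)
    for row in variant_frequencies(db):
        print(row)
```

Keeping reads and variants in relational tables means that a new report is just another query, which is the kind of flexibility the abstract attributes to the database approach.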