PLoS ONE (Jan 2013)

Coval: improving alignment quality and variant calling accuracy for next-generation sequencing data.

  • Shunichi Kosugi,
  • Satoshi Natsume,
  • Kentaro Yoshida,
  • Daniel MacLean,
  • Liliana Cano,
  • Sophien Kamoun,
  • Ryohei Terauchi

DOI
https://doi.org/10.1371/journal.pone.0075402
Journal volume & issue
Vol. 8, no. 10
p. e75402

Abstract

Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads by filtering out mismatched reads that remain in the alignment after local realignment and error correction. The error correction is based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in 'targeted' alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.
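
The correct-then-filter idea described in the abstract can be illustrated with a small Python sketch. This is not Coval's implementation: the `AlignedRead` type, the function names, and the thresholds (`min_qual`, `min_freq`, `max_mismatches`) are all hypothetical, alignments are assumed to be ungapped, and the local realignment step is omitted. It only shows how base quality and allele frequency might be used to revert likely sequencing errors before discarding reads that still carry too many mismatches.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class AlignedRead:
    # Hypothetical minimal read record (not a Coval or SAM structure).
    pos: int          # 0-based start position on the reference
    seq: str          # read bases, assumed aligned without gaps
    quals: List[int]  # Phred base qualities, one per base


def allele_counts(reads, ref):
    """Count how often each base is observed at every reference position."""
    counts = [Counter() for _ in ref]
    for r in reads:
        for i, base in enumerate(r.seq):
            counts[r.pos + i][base] += 1
    return counts


def correct_and_filter(reads, ref, min_qual=20, min_freq=0.3, max_mismatches=2):
    """Revert likely sequencing errors, then drop reads with too many mismatches.

    A non-reference base is reverted to the reference base when its quality is
    below min_qual or its allele frequency at that position is below min_freq.
    All three thresholds are illustrative assumptions.
    """
    counts = allele_counts(reads, ref)
    kept = []
    for r in reads:
        bases = list(r.seq)
        for i, base in enumerate(bases):
            p = r.pos + i
            if base == ref[p]:
                continue
            depth = sum(counts[p].values())
            freq = counts[p][base] / depth
            if r.quals[i] < min_qual or freq < min_freq:
                bases[i] = ref[p]  # treat the mismatch as a sequencing error
        # Count the mismatches that survived correction.
        mismatches = sum(b != ref[r.pos + i] for i, b in enumerate(bases))
        if mismatches <= max_mismatches:
            kept.append(AlignedRead(r.pos, "".join(bases), r.quals))
    return kept
```

In this sketch, a read with a single low-quality mismatch is corrected and kept, while a read that still differs from the reference at many well-supported positions is treated as a probable mismapping and removed.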