PLoS ONE (Jan 2012)

Characterisation and validation of insertions and deletions in 173 patient exomes.

  • Francesco Lescai,
  • Silvia Bonfiglio,
  • Chiara Bacchelli,
  • Estelle Chanudet,
  • Aoife Waters,
  • Sanjay M Sisodiya,
  • Dalia Kasperavičiūtė,
  • Julie Williams,
  • Denise Harold,
  • John Hardy,
  • Robert Kleta,
  • Sebahattin Cirak,
  • Richard Williams,
  • John C Achermann,
  • John Anderson,
  • David Kelsell,
  • Tom Vulliamy,
  • Henry Houlden,
  • Nicholas Wood,
  • Una Sheerin,
  • Gian Paolo Tonini,
  • Donna Mackay,
  • Khalid Hussain,
  • Jane Sowden,
  • Veronica Kinsler,
  • Justyna Osinska,
  • Tony Brooks,
  • Mike Hubank,
  • Philip Beales,
  • Elia Stupka

DOI
https://doi.org/10.1371/journal.pone.0051292
Journal volume & issue
Vol. 7, no. 12
p. e51292

Abstract

Recent advances in genomics technologies have spurred unprecedented efforts in genome and exome re-sequencing, aiming to unravel the genetic component of rare and complex disorders. While in rare disorders this has allowed the identification of novel causal genes, the missing heritability paradox in complex diseases has so far remained unresolved. Despite rapid advances in next-generation sequencing, both the technology and the analysis of the data it produces are in their infancy. At present there is abundant knowledge pertaining to the role of rare single nucleotide variants (SNVs) in rare disorders and of common SNVs in common disorders. Although the 1000 Genomes Project has clearly highlighted the prevalence of rare variants and more complex variants (e.g. insertions, deletions), their role in disease is as yet far from elucidated.

We set out to analyse the properties of sequence variants identified in a comprehensive collection of exome re-sequencing studies performed on samples from patients affected by a broad range of complex and rare diseases (N = 173). Given the known potential for Loss of Function (LoF) variants to be false positives, we performed an extensive validation of the common, rare and private LoF variants identified, which indicated that most of the private and rare variants identified were indeed true, while common novel variants had a significantly higher false positive rate. Our results indicated a strong enrichment of very low-frequency insertion/deletion variants, so far under-investigated, which might be difficult to capture with low-coverage and imputation approaches and for which most study designs would be under-powered. These insertions and deletions might play a significant role in disease genetics, contributing specifically to the underlying rare and private variation predicted to be discovered through next-generation sequencing.