BMC Bioinformatics (Feb 2008)

Supervised Lowess normalization of comparative genome hybridization data – application to lactococcal strain comparisons

  • Karsens Harma A,
  • Zomer Aldert L,
  • Baerends Richard JS,
  • van Hijum Sacha AFT,
  • Martin-Requena Victoria,
  • Trelles Oswaldo,
  • Kok Jan,
  • Kuipers Oscar P

DOI
https://doi.org/10.1186/1471-2105-9-93
Journal volume & issue
Vol. 9, no. 1
p. 93

Abstract

Read online

Abstract Background Array-based comparative genome hybridization (aCGH) is commonly used to determine the genomic content of bacterial strains. Since prokaryotes in general have less conserved genome sequences than eukaryotes, sequence divergences between the genes in the genomes used for an aCGH experiment obstruct determination of genome variations (e.g. deletions). Current normalization methods do not take into consideration sequence divergence between target and microarray features and therefore cannot distinguish a difference in signal due to systematic errors in the data or due to sequence divergence. Results We present supervised Lowess, or S-Lowess, an application of the subset Lowess normalization method. By using a predicted subset of array features with minimal sequence divergence between the analyzed strains for the normalization procedure we remove systematic errors from dual-dye aCGH data in two steps: (1) determination of a subset of conserved genes (i.e. likely conserved genes, LCG); and (2) using the LCG for subset Lowess normalization. Subset Lowess determines the correction factors for systematic errors in the subset of array features and normalizes all array features using these correction factors. The performance of S-Lowess was assessed on aCGH experiments in which differentially labeled genomic DNA fragments of Lactococcus lactis IL1403 and L. lactis MG1363 strains were hybridized to IL1403 DNA microarrays. Since both genomes are sequenced and gene deletions identified, the success rate of different aCGH normalization methods in detecting these deletions in the MG1363 genome were determined. S-Lowess detects 97% of the deletions, whereas other aCGH normalization methods detect up to only 60% of the deletions. Conclusion S-Lowess is implemented in a user-friendly web-tool accessible from http://bioinformatics.biol.rug.nl/websoftware/s-lowess. We demonstrate that it outperforms existing normalization methods and maximizes detection of genomic variation (e.g. deletions) from microbial aCGH data.