BMC Research Notes (Nov 2010)

KC-SMARTR: An R package for detection of statistically significant aberrations in multi-experiment aCGH data

  • Reinders Marcel JT,
  • Holstege Henne,
  • Velds Arno,
  • Klijn Christiaan,
  • de Ronde Jorma J,
  • Jonkers Jos,
  • Wessels Lodewyk FA

DOI
https://doi.org/10.1186/1756-0500-3-298
Journal volume & issue
Vol. 3, no. 1
p. 298

Abstract

Background
Most approaches used to find recurrent or differential DNA Copy Number Alterations (CNA) in array Comparative Genomic Hybridization (aCGH) data from groups of tumour samples depend on discretizing the aCGH data into gain, loss or no-change states. This causes loss of valuable biological information in tumour samples, which are frequently heterogeneous. We have previously developed an algorithm, KC-SMART, that bases its estimate of the magnitude of the CNA at a given genomic location on kernel convolution (Klijn et al., 2008). This accounts for the intensity of the probe signal, its local genomic environment and the signal distribution across multiple samples.

Results
Here we extend the approach to allow comparative analyses of two groups of samples and introduce the R implementation of these two approaches. The comparative module allows a supervised analysis to be performed, enabling the identification of regions that are differentially aberrated between two user-defined classes. We analyzed data from a series of B- and T-cell lymphomas and were able to retrieve all positive control regions (VDJ regions) in addition to a number of new regions. A t-test on segmented data, which we also implemented, likewise located all the positive control regions and a number of new regions, but these regions were highly fragmented.

Conclusions
KC-SMARTR offers recurrent CNA and class-specific CNA detection, at different genomic scales, in a single package without the need for additional segmentation. It is memory efficient and runs on a wide range of machines. Most importantly, it does not rely on data discretization and therefore maximally exploits the biological information in the aCGH data. The program is freely available from the Bioconductor website http://www.bioconductor.org/ under the terms of the GNU General Public License.
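
The idea of estimating CNA magnitude by kernel smoothing, rather than by discretizing into gain, loss or no-change calls, can be illustrated with a small Python sketch. This is not the KC-SMARTR implementation and does not use its API: the function names, the plain Gaussian kernel, the averaging across samples and the toy data are all assumptions made for illustration. The `sigma` parameter stands in for the kernel width, which is loosely what analysis "at different genomic scales" would vary.

```python
import math


def kernel_smoothed_estimate(positions, log_ratios, grid, sigma):
    """Gaussian-kernel weighted mean of probe log-ratios at each grid point.

    Hypothetical helper for illustration; not part of KC-SMARTR.
    positions  : genomic coordinates of the probes (base pairs)
    log_ratios : one continuous log-ratio per probe (no discretization)
    grid       : genomic coordinates at which to evaluate the estimate
    sigma      : kernel width in base pairs
    """
    estimates = []
    for g in grid:
        # Probes close to the grid point receive the largest weights.
        weights = [math.exp(-0.5 * ((p - g) / sigma) ** 2) for p in positions]
        total = sum(weights)
        if total > 0:
            value = sum(w * r for w, r in zip(weights, log_ratios)) / total
        else:
            value = 0.0  # no probe within reach of the kernel
        estimates.append(value)
    return estimates


def smoothed_profile(positions, samples, grid, sigma):
    """Average log-ratios across samples per probe, then smooth along the genome.

    Assumption: a simple mean across samples; the published method may
    combine samples differently.
    """
    mean_ratios = [sum(column) / len(column) for column in zip(*samples)]
    return kernel_smoothed_estimate(positions, mean_ratios, grid, sigma)


# Toy data (invented): four probes, two samples with a gain on the right.
positions = [1_000_000, 2_000_000, 3_000_000, 4_000_000]
samples = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.8, 1.2],
]
grid = [1_500_000, 3_500_000]
profile = smoothed_profile(positions, samples, grid, sigma=500_000)
```

With this toy input the estimate at the second grid point is much larger than at the first, reflecting the gained region while retaining the signal intensity instead of a three-state call.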