BMC Genomics (Feb 2021)

GCSscore: an R package for differential gene expression analysis in Affymetrix/Thermo-Fisher whole transcriptome microarrays

  • Guy M. Harris,
  • Shahroze Abbas,
  • Michael F. Miles

DOI
https://doi.org/10.1186/s12864-021-07370-2
Journal volume & issue
Vol. 22, no. 1
pp. 1 – 12

Abstract

Background: Despite the increasing use of RNAseq for transcriptome analysis, microarrays remain a widely used methodology for genomic studies. The latest generation of Affymetrix/Thermo-Fisher microarrays, the ClariomD/XTA and ClariomS arrays, provides a sensitive and facile method for complex transcriptome expression analysis. However, existing methods of analysis for these high-density arrays do not leverage the statistical power contained in having multiple oligonucleotides representing each gene/exon, but rather summarize probes into a single expression value. We previously developed a methodology, the Sscore algorithm, for probe-level identification of differentially expressed genes (DEGs) between treatment and control samples with oligonucleotide microarrays. The Sscore algorithm was validated for sensitive detection of DEGs by comparison with existing methods. However, the prior version of the Sscore algorithm and its R-based implementation, sscore, do not function with the latest generations of Affymetrix/Thermo-Fisher microarrays, because changes in microarray design eliminated the probes previously used to estimate non-specific binding.

Results: Here we describe the GCSscore algorithm, which utilizes the GC-content of a given oligonucleotide probe to estimate non-specific binding using antigenomic background probes found on new generations of arrays. We implemented this algorithm in an improved GCSscore R package for analysis of modern oligonucleotide microarrays. GCSscore has multiple methods for grouping individual probes on the ClariomD/XTA chips, providing the user with differential expression analysis at the gene level and the exon level. By utilizing the direct probe-level intensities, the GCSscore algorithm was able to detect DEGs under stringent statistical criteria for all Clariom-based arrays. We demonstrate that for older 3′-IVT arrays, GCSscore produced differential gene expression results very similar to those of the original Sscore method. However, GCSscore also functioned well for the newer ClariomS and ClariomD/XTA microarrays and outperformed existing analysis approaches in the number of DEGs and cognate biological functions identified. This was particularly striking for analysis of the highly complex ClariomD/XTA-based arrays.

Conclusions: The GCSscore package represents a powerful new application for analysis of the newest generation of oligonucleotide microarrays, such as the ClariomS and ClariomD/XTA arrays produced by Affymetrix/Thermo-Fisher.
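The idea of estimating non-specific binding from GC-content can be illustrated with a minimal sketch. This is not the GCSscore implementation (which is an R package) and the abstract does not specify the estimator; the sketch assumes, purely for illustration, that background probes are binned by their G+C count and that the median intensity of each bin serves as the background estimate for target probes with the same count. The function names, the median choice, and the nearest-bin fallback are all hypothetical.

```python
from collections import defaultdict
from statistics import median


def gc_count(seq):
    """Number of G or C bases in a probe sequence."""
    return sum(1 for base in seq.upper() if base in "GC")


def background_by_gc(bg_probes):
    """Build a GC-count -> background-intensity table.

    bg_probes: iterable of (sequence, intensity) pairs for background
    probes. ASSUMPTION: the median of each GC bin is used as the
    estimate; the actual package may use a different statistic.
    """
    bins = defaultdict(list)
    for seq, intensity in bg_probes:
        bins[gc_count(seq)].append(intensity)
    return {gc: median(values) for gc, values in bins.items()}


def background_corrected(probe_seq, intensity, bg_table):
    """Subtract the GC-matched background estimate from a probe intensity.

    ASSUMPTION: if no background probe shares the probe's GC count,
    fall back to the nearest available GC bin.
    """
    gc = gc_count(probe_seq)
    if gc not in bg_table:
        gc = min(bg_table, key=lambda k: abs(k - gc))
    return intensity - bg_table[gc]


# Toy data: short sequences stand in for real probes.
background = [("ATATGC", 100.0), ("GCATAT", 120.0), ("GGCCAT", 300.0)]
table = background_by_gc(background)
matched = background_corrected("ATGCAT", 500.0, table)   # GC count 2, bin exists
fallback = background_corrected("GGGCCC", 500.0, table)  # GC count 6, nearest bin is 4
```

In this toy example the first probe is corrected against the bin built from the two background probes with a GC count of 2, while the second has no matching bin and uses the closest one. A probe-level method would then compare such corrected intensities between treatment and control samples rather than first collapsing them into one value per gene.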

Keywords