Genetics Selection Evolution (Nov 2007)

Analysis of the real EADGENE data set: Multivariate approaches and post analysis (Open Access publication)

  • Schuberth Hans-Joachim,
  • van Schothorst Evert M,
  • Lund Mogens,
  • San Cristobal Magali,
  • Robert-Granié Christèle,
  • Pool Marco H,
  • Petzl Wolfram,
  • Nie Haisheng,
  • Cao Kim-Anh,
  • de Koning Dirk-Jan,
  • Jiang Li,
  • Jensen Kirsty,
  • Hulsegge Ina,
  • Jaffrézic Florence,
  • Hornshøj Henrik,
  • Hedegaard Jakob,
  • Glass Liz,
  • Duval Mylène,
  • Delmas Céline,
  • Déjean Sébastien,
  • Closset Rodrigue,
  • Buitenhuis Bart,
  • Bonnet Agnès,
  • Sørensen Peter,
  • Seyfert Hans-Martin,
  • Tosser-Klopp Gwenola,
  • Waddington David,
  • Watson Michael,
  • Yang Wei,
  • Zerbe Holm

DOI
https://doi.org/10.1186/1297-9686-39-6-651
Journal volume & issue
Vol. 39, no. 6
pp. 651 – 668

Abstract

The aim of this paper was to describe, and where possible compare, the multivariate methods used by the participants in the EADGENE WP1.4 workshop. The first approach was class discovery and class prediction using evidence from the data at hand. Several teams used hierarchical clustering (HC) or principal component analysis (PCA) to identify groups of differentially expressed genes with a similar expression pattern across time points and infective agents (E. coli or S. aureus). The main result from these analyses was that HC and PCA were able to separate tissue samples taken at 24 h following E. coli infection from the other samples. The second approach identified groups of differentially co-expressed genes, that is, clusters of genes that were highly correlated when animals were infected with E. coli but not correlated more than expected by chance when the infective pathogen was S. aureus. The third approach looked at differential expression of predefined gene sets, which were defined using information retrieved from biological databases such as Gene Ontology. Based on these annotation sources, the teams used either the GlobalTest or the Fisher exact test to identify differentially expressed gene sets. The main result from these analyses was that gene sets involved in immune defence responses were differentially expressed.
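
To illustrate the third approach, the Fisher exact test for over-representation of differentially expressed genes in a predefined gene set can be sketched as below. This is a minimal sketch and not the workshop participants' code: the function name `fisher_enrichment_pvalue`, its arguments, and the counts in the usage line are hypothetical, and the one-sided (upper-tail) form of the test is an assumption.

```python
from math import comb


def fisher_enrichment_pvalue(n_genes, n_de, set_size, set_de):
    """One-sided Fisher exact test (hypergeometric upper tail).

    Returns the probability of observing at least `set_de` differentially
    expressed (DE) genes in a gene set of `set_size` genes, given that
    `n_de` of the `n_genes` genes on the array are DE overall.
    """
    max_overlap = min(set_size, n_de)
    # Number of ways to draw any gene set of this size from the array.
    total = comb(n_genes, set_size)
    # Number of ways to draw a set containing k DE genes, for k >= set_de.
    tail = sum(
        comb(n_de, k) * comb(n_genes - n_de, set_size - k)
        for k in range(set_de, max_overlap + 1)
    )
    return tail / total


# Hypothetical counts, for illustration only (not taken from the EADGENE data).
p_value = fisher_enrichment_pvalue(n_genes=1000, n_de=100, set_size=40, set_de=12)
```

A small p-value suggests that the gene set contains more differentially expressed genes than expected by chance; in practice the p-values would also need correcting for the number of gene sets tested.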

Keywords