BMC Bioinformatics (Jul 2007)

HoxPred: automated classification of Hox proteins using combinations of generalised profiles

  • Leyns Luc,
  • Thomas-Chollier Morgane,
  • Ledent Valérie

DOI
https://doi.org/10.1186/1471-2105-8-247
Journal volume & issue
Vol. 8, no. 1
p. 247

Abstract

Background: Correct identification of individual Hox proteins is an essential basis for their study in diverse research fields. Common methods to classify Hox proteins focus on the homeodomain that characterises homeobox transcription factors. Classification is hampered by the high conservation of this short domain. Phylogenetic tree reconstruction is a widely used but time-consuming classification method.

Results: We have developed an automated procedure, HoxPred, that classifies Hox proteins into their homology groups. The method relies on a discriminant analysis that classifies Hox proteins according to their scores for a combination of protein generalised profiles. 54 generalised profiles dedicated to each Hox homology group were produced de novo from a curated dataset of vertebrate Hox proteins. Several classification methods were investigated to select the most accurate discriminant functions. These functions were then incorporated into the HoxPred program.

Conclusion: HoxPred shows a mean accuracy of 97%. Predictions on the recently sequenced stickleback fish proteome identified 44 Hox proteins, including HoxC1a, found so far only in zebrafish. Using the UniProt databank, we demonstrate that HoxPred can efficiently contribute to large-scale automatic annotation of Hox proteins into their paralogous groups. As orthologous group predictions show a higher risk of misclassification, they should be corroborated by additional supporting evidence. HoxPred is accessible via SOAP and a Web interface at http://cege.vub.ac.be/hoxpred/. Complete datasets, results and source code are available at the same site.
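
The classification step described in the Results (assigning a protein to a group from its scores against several profiles) can be illustrated with a minimal sketch. This is not the HoxPred implementation: it uses a nearest-centroid rule, which corresponds to a linear discriminant only under the simplifying assumptions of equal priors and identity covariance. The group labels, the three-profile score vectors and the function names are all hypothetical, and the numbers are invented purely for illustration.

```python
from math import dist

# Hypothetical training data: each tuple is one protein's scores against
# three profiles. All values are invented for illustration only.
TRAINING = {
    "PG1": [(9.1, 2.0, 1.5), (8.7, 2.4, 1.1)],
    "PG4": [(2.2, 8.9, 1.9), (1.8, 9.3, 2.3)],
    "PG9": [(1.0, 2.1, 9.4), (1.6, 1.7, 8.8)],
}


def centroids(training):
    """Return the mean score vector of each group."""
    result = {}
    for group, rows in training.items():
        n = len(rows)
        # zip(*rows) iterates over profiles (columns) instead of proteins.
        result[group] = tuple(sum(column) / n for column in zip(*rows))
    return result


def classify(scores, group_centroids):
    """Assign the group whose centroid is nearest in Euclidean distance."""
    return min(group_centroids, key=lambda g: dist(scores, group_centroids[g]))


CENTROIDS = centroids(TRAINING)

if __name__ == "__main__":
    # Score vector of a hypothetical unclassified protein.
    print(classify((8.5, 2.2, 1.0), CENTROIDS))
```

A real discriminant analysis would also estimate the covariance of the scores within each group, which matters when profile scores are correlated, as scores for profiles built on the same conserved domain are likely to be.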