SIFTER-T: A scalable and optimized framework for the SIFTER phylogenomic method of probabilistic protein domain annotation

Danillo C. Almeida-e-Silva; Ricardo Z.N. Vêncio

doi:10.2144/000114266

BioTechniques (Mar 2015)

Affiliations

Danillo C. Almeida-e-Silva: Department of Computing and Mathematics, FFCLRP-USP, University of São Paulo, Ribeirão Preto, Brazil
Ricardo Z.N. Vêncio: Department of Computing and Mathematics, FFCLRP-USP, University of São Paulo, Ribeirão Preto, Brazil

Journal volume & issue: Vol. 58, no. 3, pp. 140–142

Abstract

Statistical Inference of Function Through Evolutionary Relationships (SIFTER) is a powerful computational platform for probabilistic protein domain annotation. Nevertheless, SIFTER is not widely used, likely due to usability and scalability issues. Here we present SIFTER-T (SIFTER Throughput-optimized), a substantial improvement over SIFTER's original proof-of-principle implementation. SIFTER-T is optimized for better performance, allowing it to be used at the genome-wide scale. Compared to SIFTER 2.0, SIFTER-T achieved an 87-fold performance improvement using published test data sets for the known annotations recovering module and a 72.3% speed increase for the gene tree generation module in quad-core machines, as well as a major decrease in memory usage during the realignment phase. Memory optimization allowed an expanded set of proteins to be handled by SIFTER's probabilistic method. The improvement in performance and automation that we achieved allowed us to build a web server to bring the power of Bayesian phylogenomic inference to the genomics community. SIFTER-T and its online interface are freely available under GNU license at http://labpib.fmrp.usp.br/methods/SIFTER-t/ and https://github.com/dcasbioinfo/SIFTER-t.

Published in BioTechniques

ISSN: 0736-6205 (Print); 1940-9818 (Online)
Publisher: Taylor & Francis Group
Country of publisher: United Kingdom
LCC subjects: Science: Biology (General)
Website: https://www.tandfonline.com/journals/ibtn20
