F1000Research (May 2015)

Prediction of multi-drug resistance transporters using a novel sequence analysis method [v2; ref status: indexed, http://f1000r.es/5ef]

  • Jason E. McDermott,
  • Paul Bruillard,
  • Christopher C. Overall,
  • Luke Gosink,
  • Stephen R. Lindemann

DOI
https://doi.org/10.12688/f1000research.6200.2
Journal volume & issue
Vol. 4

Abstract

Many groups of proteins share a similar function, but the determinants of functional specificity may be hidden by a lack of sequence similarity, or by large groups of similar sequences with different functions. Transporters are one such group: the general function, transport, can be easily inferred from the sequence, but the substrate specificity can be impossible to predict from sequence with current methods. In this paper we describe a linguistics-based approach for identifying functional patterns in groups of unaligned protein sequences, and its application to predicting multi-drug resistance transporters (MDRs) from bacteria. We first show that our method can recreate known PROSITE patterns for several motifs from unaligned sequences. We then show that the method, MDRpred, can predict MDRs with greater accuracy and positive predictive value than a collection of currently available family-based models from the Pfam database. Finally, we apply MDRpred to a large collection of protein sequences from an environmental microbiome study to make novel predictions about drug resistance in a potential environmental reservoir.

Keywords