Computational and Structural Biotechnology Journal (Dec 2024)
Enhancing prediction of short linear protein motifs with Wregex 3.0
Abstract
Short linear motifs (SLiMs) play an important role in protein-protein interactions. However, SLiM patterns are intrinsically permissive and yield many matches that occur by chance alone, especially when searching large datasets. To prioritize these matches as candidates for functional testing, we developed Wregex (Weighted regular expression), which ranks regular expression matches by a score derived from a position-specific scoring matrix (PSSM). Here we present Wregex 3.0, an improved version with new functionalities: support for a second, auxiliary motif that helps refine the prediction of a primary SLiM, and post-translational modification (PTM) enrichment, which takes into account that many regulatory SLiM-mediated interactions are modulated by one or more PTMs. This version also incorporates a number of new features, including convenient use of subproteomes, display of UniProt annotations such as disordered regions, searching for all known motifs, and generation of decoy databases for enrichment analysis. We provide case studies to illustrate how these new Wregex functionalities enhance the prediction of short linear protein motifs. The Wregex 3.0 server is freely accessible at https://ehubio.ehu.eus/wregex3/.
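The general idea of ranking regular expression matches by a PSSM-derived score can be illustrated with a minimal Python sketch. This is not the Wregex implementation or its scoring formula. The motif pattern, the weights, and the names `MOTIF`, `PSSM`, `score_match` and `ranked_matches` are all hypothetical, and the score used here is a plain sum of per-position weights chosen only for illustration.

```python
import re

# Hypothetical toy motif: one capture group per motif position.
MOTIF = re.compile(r"([LIV])(.)(.)([LIV])")

# Hypothetical PSSM: one column per capture group, mapping residue -> weight.
# Residues absent from a column receive a default weight of 0.
PSSM = [
    {"L": 2.0, "I": 1.0, "V": 0.5},
    {"E": 1.0, "D": 1.0},
    {"S": 1.0, "T": 0.5},
    {"L": 2.0, "I": 1.0, "V": 0.5},
]


def score_match(groups, pssm, default=0.0):
    """Sum the weight of each matched residue in its PSSM column (illustrative score)."""
    return sum(column.get(residue, default) for residue, column in zip(groups, pssm))


def ranked_matches(sequence, motif=MOTIF, pssm=PSSM):
    """Return (start, matched_text, score) tuples, highest score first."""
    hits = [
        (m.start(), m.group(0), score_match(m.groups(), pssm))
        for m in motif.finditer(sequence)  # non-overlapping matches only
    ]
    return sorted(hits, key=lambda hit: hit[2], reverse=True)


if __name__ == "__main__":
    for start, text, score in ranked_matches("AALESLKKVAAIGG"):
        print(start, text, score)
```

In this toy setting, both hits satisfy the permissive pattern, but the one whose residues agree better with the weight matrix is listed first, which is the kind of prioritization the abstract describes.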