BMC Bioinformatics (Mar 2023)
REDfold: accurate RNA secondary structure prediction using residual encoder-decoder network
Abstract
Abstract Background As the RNA secondary structure is highly related to its stability and functions, the structure prediction is of great value to biological research. The traditional computational prediction for RNA secondary prediction is mainly based on the thermodynamic model with dynamic programming to find the optimal structure. However, the prediction performance based on the traditional approach is unsatisfactory for further research. Besides, the computational complexity of the structure prediction using dynamic programming is $$O(N^3)$$ O ( N 3 ) ; it becomes $$O(N^6)$$ O ( N 6 ) for RNA structure with pseudoknots, which is computationally impractical for large-scale analysis. Results In this paper, we propose REDfold, a novel deep learning-based method for RNA secondary prediction. REDfold utilizes an encoder-decoder network based on CNN to learn the short and long range dependencies among the RNA sequence, and the network is further integrated with symmetric skip connections to efficiently propagate activation information across layers. Moreover, the network output is post-processed with constrained optimization to yield favorable predictions even for RNAs with pseudoknots. Experimental results based on the ncRNA database demonstrate that REDfold achieves better performance in terms of efficiency and accuracy, outperforming the contemporary state-of-the-art methods.
Keywords