Random and Natural Non-Coding RNA Have Similar Structural Motif Patterns but Differ in Bulge, Loop, and Bond Counts

Fatme Ghaddar; Kamaludin Dingle

doi:10.3390/life13030708

Life (Mar 2023)

Affiliations

Fatme Ghaddar: Department of Computer Science, Gulf University for Science and Technology, Hawally 32093, Kuwait
Kamaludin Dingle: Centre for Applied Mathematics and Bioinformatics (CAMB), Department of Mathematics and Natural Sciences, Gulf University for Science and Technology, Hawally 32093, Kuwait

DOI: https://doi.org/10.3390/life13030708
Journal volume & issue: Vol. 13, no. 3, p. 708

Abstract

An important question in evolutionary biology is whether, and in what ways, genotype–phenotype (GP) map biases can influence evolutionary trajectories. Untangling the relative roles of natural selection, biases, and other factors in shaping phenotypes can be difficult. RNA secondary structure (SS) offers a good model system for studying the role of bias because it can be analyzed in detail mathematically and computationally, is biologically relevant, and is supported by a wealth of bioinformatic data. For quite short RNA (length L ≤ 126), it has recently been shown that natural and random RNA types are structurally very similar, suggesting that bias strongly constrains evolutionary dynamics. Here, we extend these results with an emphasis on much longer RNA, with lengths up to 3000 nucleotides. By examining both abstract shapes and structural motif frequencies (i.e., the numbers of helices, bonds, bulges, junctions, and loops), we find that large natural and random structures are also very similar, especially when contrasted with typical structures sampled from the spaces of all possible RNA structures. Our motif frequency study yields a further result: the frequencies of different motifs can be used in machine learning algorithms to classify random and natural RNA with high accuracy, especially for longer RNA (e.g., ROC AUC 0.86 for L = 1000). The most important features for classification are the numbers of bulges, loops, and bonds. This finding may be useful for detecting candidate functional RNA within ‘junk’ DNA regions on the basis of SS.
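
To illustrate what counting structural motifs can look like in practice, the following is a minimal Python sketch that tallies motifs from a secondary structure written in dot-bracket notation. It is not the authors' pipeline: the function names (`pair_table`, `count_motifs`), the dot-bracket input format, and the exact motif definitions (for example, treating a bulge as a loop closed by two base pairs with unpaired bases on one side only, and ignoring the exterior loop) are assumptions made for this example.

```python
def pair_table(db):
    """Map each paired position to its partner in a dot-bracket string."""
    stack, partner = [], {}
    for pos, ch in enumerate(db):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            i = stack.pop()
            partner[i], partner[pos] = pos, i
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '('")
    return partner


def count_motifs(db):
    """Count motifs under the illustrative definitions stated in the lead-in."""
    partner = pair_table(db)
    counts = {"bonds": 0, "helices": 0, "hairpin_loops": 0,
              "bulges": 0, "interior_loops": 0, "junctions": 0}
    for i, j in partner.items():
        if i > j:
            continue  # visit each base pair once, from its opening position
        counts["bonds"] += 1
        # A helix starts at a pair that is not stacked directly on the
        # pair (i - 1, j + 1).
        if partner.get(i - 1) != j + 1:
            counts["helices"] += 1
        # Collect the base pairs enclosed directly by (i, j).
        children, k = [], i + 1
        while k < j:
            if k in partner:
                children.append((k, partner[k]))
                k = partner[k] + 1  # skip over the enclosed substructure
            else:
                k += 1
        if not children:
            counts["hairpin_loops"] += 1
        elif len(children) == 1:
            left = children[0][0] - i - 1   # unpaired bases on the 5' side
            right = j - children[0][1] - 1  # unpaired bases on the 3' side
            if left and right:
                counts["interior_loops"] += 1
            elif left or right:
                counts["bulges"] += 1
            # left == right == 0 is a stacked pair inside a helix
        else:
            counts["junctions"] += 1
    return counts


if __name__ == "__main__":
    print(count_motifs("(((...).))"))
```

Count vectors of this kind, computed for each natural and random structure, are the sort of features that could be passed to a standard classifier; the choice of classifier and its evaluation are left out of this sketch.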

Published in Life

ISSN: 2075-1729 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Science
Website: http://www.mdpi.com/journal/life
