International Journal of Applied Mathematics and Computer Science (Sep 2015)

Building the library of RNA 3D nucleotide conformations using the clustering approach

  • Zok Tomasz,
  • Antczak Maciej,
  • Riedel Martin,
  • Nebel David,
  • Villmann Thomas,
  • Lukasiak Piotr,
  • Blazewicz Jacek,
  • Szachniuk Marta

DOI
https://doi.org/10.1515/amcs-2015-0050
Journal volume & issue
Vol. 25, no. 3
pp. 689 – 700

Abstract

The growing number of known RNA 3D structures aids the recognition of various RNA families and the identification of their features. These tasks rely on analyses of RNA conformations conducted at different levels of detail. Knowledge of native nucleotide conformations is, in turn, crucial for structure prediction and for understanding RNA folding. However, this knowledge is scattered across structural databases. Therefore, only automated methods for sampling the space of RNA structures can reveal plausible conformational representatives useful for further analysis. Here, we present a machine learning-based approach that inspects a dataset of RNA three-dimensional structures and creates a library of nucleotide conformers. A median neural gas algorithm is applied to cluster nucleotide structures based on their trigonometric description. The clustering procedure has two stages: (i) backbone-driven and (ii) ribose-driven. We present the resulting library, which contains RNA nucleotide representatives covering the entire dataset, and we evaluate its quality by computing normal distribution measures and average RMSD between data points as well as the prototype within each cluster.
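
To make the clustering idea in the abstract concrete, here is a minimal sketch of batch median neural gas, in which each prototype is restricted to being one of the data points. It is an illustration under stated assumptions, not the authors' implementation. The abstract does not define the "trigonometric description", so the sketch assumes it means encoding each torsion angle as a (sin, cos) pair. The function names `trig_encode` and `median_neural_gas`, the annealing schedule, and all parameter values are hypothetical, and the two-stage backbone/ribose procedure is not reproduced.

```python
import numpy as np

def trig_encode(angles_deg):
    """Encode torsion angles (degrees) as sin/cos features.

    Assumption: this stands in for the paper's "trigonometric description".
    Input is (n_nucleotides, n_angles); output is (n_nucleotides, 2 * n_angles).
    The encoding keeps angles such as 359 and 1 degree close together.
    """
    rad = np.deg2rad(np.atleast_2d(np.asarray(angles_deg, dtype=float)))
    return np.concatenate([np.sin(rad), np.cos(rad)], axis=1)

def median_neural_gas(D, n_prototypes, n_epochs=30, seed=0):
    """Batch median neural gas on a precomputed (n x n) dissimilarity matrix.

    Returns the indices of the data points selected as prototypes.
    The exponential annealing of the neighbourhood range is an
    illustrative choice, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    protos = rng.choice(n, size=n_prototypes, replace=False)
    lam_start, lam_end = max(n_prototypes / 2.0, 1.0), 0.01
    for t in range(n_epochs):
        lam = lam_start * (lam_end / lam_start) ** (t / max(n_epochs - 1, 1))
        # ranks[i, j]: rank of prototype i for data point j (0 = closest)
        ranks = np.argsort(np.argsort(D[protos], axis=0), axis=0)
        h = np.exp(-ranks / lam)  # neighbourhood weights, shape (k, n)
        # cost[i, l] = sum_j h[i, j] * D[j, l]; the new prototype i is the
        # data point l with the smallest weighted dissimilarity
        cost = h @ D
        protos = np.argmin(cost, axis=1)
    return protos
```

A typical call would encode the angles, build a Euclidean dissimilarity matrix with `np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)`, and pass it to `median_neural_gas`. Because the prototypes are actual data points, each one can be read directly as a representative conformer. Note that this simple sketch does not prevent two prototypes from collapsing onto the same data point.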

Keywords