Informatika (Dec 2019)
A computational approach and software package RNAexploreR for grouping RNA molecules of human genes by exon features
Abstract
Studying the rules of exon combinatorics in human genes during splicing is of great interest for the diagnosis and treatment of cancer. Part of this research is aimed at developing reliable prediction models for global exon combinatorics during the formation of mature RNA. The primary task is to develop standards or uniform, systematic statistical approaches to the analysis and interpretation of possible exon sequences of genes.
A computational approach is proposed for grouping alternative splicing events in the primary messenger RNA of human genes in order to determine the gene or molecule class to which a transcript belongs. The method consists of four steps: reducing the dimension of the exon feature space and combining closely located exons into a limited number of classes; replacing the exon pathways of RNA generation with sequences of the corresponding exon class labels; calculating the distances between RNA transcripts using a similarity measure; and grouping closely spaced RNA objects into clusters. The performance of the developed algorithms was evaluated on RNA molecules of selected nonhomologous human genes and of the human hybrid oncogene RUNX1/RUNX1T1. The mean accuracy of assigning a transcript to a given gene is about 99.5% for the considered pairs of nonhomologous genes.
A software package and web application, RNAexploreR, which integrates the implemented algorithms for the analysis of alternative splicing of human gene RNA products, has been developed. The proposed algorithms and software can be used to study the organization and functioning of both aberrant and normal human genes.
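The grouping pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the RNAexploreR implementation, and every name and value in it is a hypothetical assumption made for illustration: the toy exon lengths, the length thresholds that stand in for dimension reduction and exon classing, the Levenshtein edit distance used as the similarity measure, and single-linkage grouping with a fixed distance cutoff. The abstract does not specify which features, distance, or clustering method the authors use.

```python
from itertools import combinations

# Hypothetical toy data: exon id -> exon length in nucleotides.
EXON_LENGTH = {
    "a1": 90, "a2": 120, "a3": 800, "a4": 150,
    "b1": 2100, "b2": 2500, "b3": 700, "b4": 3000,
}

# Hypothetical transcripts, each an ordered path of exon ids.
TRANSCRIPTS = {
    "A-1": ["a1", "a2", "a3", "a4"],
    "A-2": ["a1", "a3", "a4"],
    "A-3": ["a1", "a2", "a3"],
    "B-1": ["b1", "b2", "b3", "b4"],
    "B-2": ["b1", "b2", "b4"],
    "B-3": ["b1", "b3", "b4"],
}


def exon_class(length):
    """Map an exon to a class label (assumed thresholds: 500 and 1500 nt)."""
    if length < 500:
        return "S"
    if length < 1500:
        return "M"
    return "L"


def levenshtein(a, b):
    """Edit distance between two label sequences (assumed similarity measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def cluster(labels, max_dist):
    """Single-linkage grouping: link transcripts whose distance <= max_dist."""
    names = sorted(labels)
    parent = {n: n for n in names}

    def find(n):
        while parent[n] != n:
            n = parent[n]
        return n

    for x, y in combinations(names, 2):
        if levenshtein(labels[x], labels[y]) <= max_dist:
            parent[find(x)] = find(y)

    groups = {}
    for n in names:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Replace each exon path with its sequence of class labels.
label_sequences = {
    name: "".join(exon_class(EXON_LENGTH[e]) for e in path)
    for name, path in TRANSCRIPTS.items()
}

# Group transcripts whose label sequences differ by at most one edit.
clusters = cluster(label_sequences, max_dist=1)
print(label_sequences)
print(clusters)
```

In this toy setting the transcripts of each hypothetical gene end up in their own cluster, which mirrors the transcript-to-gene assignment task evaluated in the abstract. With real data the class boundaries and the distance cutoff would have to be estimated rather than fixed by hand.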