IEEE Access (Jan 2022)

Prediction of Aptamer Protein Interaction Using Random Forest Algorithm

  • N. Manju,
  • C. M. Samiha,
  • S. P. Pavan Kumar,
  • H. L. Gururaj,
  • Francesco Flammini

DOI
https://doi.org/10.1109/ACCESS.2022.3172278
Journal volume & issue
Vol. 10
pp. 49677 – 49687

Abstract

Aptamers are oligonucleotides that can bind to amino acids, polypeptides, small molecules, allergens, and living cell membranes. They have applications in therapeutics, biosensing, and diagnostics. In this work, we present a model based on the Random Forest algorithm that predicts interactions between aptamers and target proteins by combining their most prominent characteristics. Amino Acid Composition and Pseudo Amino Acid Composition were used to represent the data, drawing on physicochemical and structural features of the amino acids. The dominant features were selected using feature-importance classifiers such as random forest and eXtreme Gradient Boosting. Compared with these, principal component analysis yielded a good feature set: 98% accuracy was obtained with 50 principal components. Analysing the best set of features revealed many characteristics relevant to defining aptamer-target protein interactions. Our prediction approach is expected to become a valuable tool for discovering aptamer-target interactions, and the features selected and studied in this work may give helpful insight into the process of aptamer-protein interaction.
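
As an illustration of the Amino Acid Composition descriptor named above, the following minimal sketch computes it in its usual form: the fraction of each of the 20 standard residues in a protein sequence. The function name, the residue ordering, and the example sequence are assumptions made for illustration; the abstract does not specify the authors' implementation.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Characters outside the 20 standard residues (e.g. 'X') are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical 10-residue sequence, used only to exercise the function.
example_sequence = "MKTAYIAKQR"
features = amino_acid_composition(example_sequence)
```

Each protein then becomes a fixed-length vector of 20 fractions that sum to 1, which can be concatenated with other descriptors before feature selection or dimensionality reduction.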

Keywords