Explainability Comparison between Random Forests and Neural Networks—Case Study of Amino Acid Volume Prediction

Roberta De Fazio; Rosy Di Giovannantonio; Emanuele Bellini; Stefano Marrone

doi:10.3390/info14010021

Information (Dec 2022)

Affiliations

Roberta De Fazio: Dipartimento di Matematica e Fisica, Università degli Studi della Campania “Luigi Vanvitelli”, Viale Lincoln, 5, 81100 Caserta, Italy
Rosy Di Giovannantonio: Dipartimento di Matematica e Fisica, Università degli Studi della Campania “Luigi Vanvitelli”, Viale Lincoln, 5, 81100 Caserta, Italy
Emanuele Bellini: Dipartimento di Studi Umanistici, Università degli Studi Roma Tre, Via Ostiense, 234, 00146 Roma, Italy
Stefano Marrone: Dipartimento di Matematica e Fisica, Università degli Studi della Campania “Luigi Vanvitelli”, Viale Lincoln, 5, 81100 Caserta, Italy

DOI: https://doi.org/10.3390/info14010021
Journal volume & issue: Vol. 14, no. 1
p. 21

Abstract

Although explainability appears to be the driver for a wiser adoption of Artificial Intelligence in healthcare and, more generally, in critical applications, a comprehensive study of this field is far from complete. On the one hand, a definitive definition and theoretical measures of explainability have not yet been established; on the other hand, some tools and frameworks for the practical evaluation of this feature are now available. This paper presents a concrete experience in applying some of these explainability-related techniques to the problem of predicting the size of amino acids in real-world protein structures. In particular, the feature importance calculation embedded in Random Forest (RF) training is compared with the results of the Eli-5 tool applied to the Neural Network (NN) model. Both predictors are trained on the same dataset, which is extracted from the Protein Data Bank (PDB) by considering 446 myoglobin structures and processing them with several tools to implement a geometrical model and perform analyses on it. The comparison between the two models yields different conclusions about the residues’ geometry and their biological properties.
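
The abstract does not say which Eli-5 feature was applied to the NN. One model-agnostic option that such tools offer for black-box predictors is permutation importance, and the sketch below illustrates that idea only. It is not the authors' pipeline: the data are synthetic, `toy_model` is a hypothetical stand-in for a trained predictor, and the helper names are invented for this example. The importance of a feature is estimated as the average increase in mean squared error when that feature's column is shuffled.

```python
import random


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each feature column is shuffled.

    `predict` maps one row (a list of floats) to a prediction, so any
    trained model can be wrapped this way (model-agnostic).
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = mse(y, [predict(row) for row in X_perm])
            increases.append(permuted - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data (assumption): three features, 200 samples.
data_rng = random.Random(42)
X = [[data_rng.uniform(0.0, 1.0) for _ in range(3)] for _ in range(200)]


def toy_model(row):
    """Hypothetical stand-in for a trained model; it ignores feature 2."""
    return 5.0 * row[0] + 1.0 * row[1]


y = [toy_model(row) for row in X]
scores = permutation_importance(toy_model, X, y)
```

Because `toy_model` never reads the third feature, shuffling it leaves the error unchanged, while the first feature (largest coefficient) receives the highest score. Impurity-based importances from RF training are computed differently, from the splits chosen while the trees are grown, which is one reason the two rankings need not agree.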

Published in Information

ISSN: 2078-2489 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Technology (General): Industrial engineering. Management engineering: Information technology
Website: http://www.mdpi.com/journal/information/
