IEEE Access (Jan 2024)

ARIA, HaRIA, and GeRIA: Novel Metrics for Pre-Model Interpretability

  • Marek Pawlicki

DOI
https://doi.org/10.1109/ACCESS.2024.3454084
Journal volume & issue
Vol. 12
pp. 123561–123580

Abstract

This work proposes three novel Pre-Model Interpretability metrics: ARIA, HaRIA, and GeRIA. They aim to assess the potential utilization of features in machine learning models prior to the training phase by quantifying the Relative Information Availability. These metrics integrate Mutual Information and ANOVA F-values, scaled using Maximum Absolute Scaling. This makes it possible to evaluate the potential of a feature to be used in the learning process efficiently and effectively, without the computational expense of model training. The metrics are designed to provide a holistic view of feature relevance by capturing both the non-linear dependencies and variance effects among features. Validation of these metrics across multiple datasets demonstrates their capability to approximate the importance assigned by more complex models, as evidenced by their strong correlation with traditional feature importance measures and SHAP values obtained after model training. The consistency observed across various datasets underscores the potential of RIA metrics to facilitate early-stage model development decisions, offering a cost-effective tool for feature evaluation in scenarios where computational resources are limited or rapid prototyping is necessary. However, some discrepancies, especially with complex models such as ANNs, indicate areas for future research and refinement. The introduction of these metrics marks a significant step toward enhancing the efficiency and transparency of AI development by enabling a better understanding of data characteristics and potential model behavior before actual model deployment.
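
The abstract does not give the exact formulas, but its named ingredients (per-feature Mutual Information, ANOVA F-values, and Maximum Absolute Scaling) can be sketched with scikit-learn. The snippet below is a minimal, hedged illustration only: the arithmetic, harmonic, and geometric combinations are assumed readings of ARIA, HaRIA, and GeRIA, and the dataset, variable names, and epsilon guard are illustrative choices rather than details taken from the paper.

    # Minimal sketch of pre-model Relative Information Availability (RIA) style scores.
    # ASSUMPTION: the arithmetic/harmonic/geometric combinations are illustrative
    # interpretations of ARIA, HaRIA, and GeRIA; the paper's exact formulas may differ.
    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.feature_selection import mutual_info_classif, f_classif
    from sklearn.preprocessing import MaxAbsScaler

    X, y = load_breast_cancer(return_X_y=True)  # example dataset, not from the paper

    # Per-feature Mutual Information (non-linear dependency) and ANOVA F-values (variance effect)
    mi = mutual_info_classif(X, y, random_state=0)
    f_vals, _ = f_classif(X, y)

    # Maximum Absolute Scaling puts both quantities on a comparable scale
    mi_s = MaxAbsScaler().fit_transform(mi.reshape(-1, 1)).ravel()
    f_s = MaxAbsScaler().fit_transform(f_vals.reshape(-1, 1)).ravel()

    eps = 1e-12  # guard against division by zero in the harmonic combination
    aria = (mi_s + f_s) / 2.0                       # arithmetic combination
    haria = 2.0 * mi_s * f_s / (mi_s + f_s + eps)   # harmonic combination
    geria = np.sqrt(mi_s * f_s)                     # geometric combination

    for name, score in [("ARIA", aria), ("HaRIA", haria), ("GeRIA", geria)]:
        top = np.argsort(score)[::-1][:5]
        print(name, "top feature indices:", top)

Because both Mutual Information and ANOVA F-values are non-negative, Maximum Absolute Scaling maps each to the [0, 1] range, so the combined scores remain comparable across features and can be ranked without training any model.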

Keywords