Scientific Reports (Jan 2024)

Robustness and reproducibility for AI learning in biomedical sciences: RENOIR

  • Alessandro Barberis,
  • Hugo J. W. L. Aerts,
  • Francesca M. Buffa

DOI
https://doi.org/10.1038/s41598-024-51381-4
Journal volume & issue
Vol. 14, no. 1
pp. 1–13

Abstract

Artificial intelligence (AI) techniques are increasingly applied across various domains, favoured by the growing acquisition and public availability of large, complex datasets. Despite this trend, AI publications often suffer from a lack of reproducibility and poor generalisation of findings, undermining their scientific value and contributing to global research waste. To address these issues, focusing on the learning aspect of the AI field, we present RENOIR (REpeated random sampliNg fOr machIne leaRning), a modular open-source platform for robust and reproducible machine learning (ML) analysis. RENOIR adopts standardised pipelines for model training and testing, introducing novel elements such as assessing how an algorithm’s performance depends on the sample size. Additionally, RENOIR offers automated generation of transparent and usable reports, aiming to enhance the quality and reproducibility of AI studies. To demonstrate the versatility of our tool, we applied it to benchmark datasets from the health, computer science, and STEM (Science, Technology, Engineering, and Mathematics) domains. Furthermore, we showcase RENOIR’s successful application in recently published studies, where it identified classifiers for SETD2 and TP53 mutation status in cancer. Finally, we present a use case where RENOIR was employed to address a significant pharmacological challenge: predicting drug efficacy. RENOIR is freely available at https://github.com/alebarberis/renoir.
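
As a rough, hypothetical illustration of the repeated random-sampling idea described in the abstract (a Python/scikit-learn sketch, not the RENOIR package or its API; the dataset, model, and metric are assumptions chosen only for the example), the snippet below repeatedly draws random training sets of increasing size, refits a classifier each time, and summarises how performance varies with sample size.

# Illustrative sketch only (hypothetical, not the RENOIR API): repeated random
# sampling to estimate how classifier performance depends on training-set size.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
rng = np.random.default_rng(0)

train_sizes = [50, 100, 200, 400]   # sample sizes to probe
n_repeats = 20                      # repeated random draws per size

for n in train_sizes:
    scores = []
    for _ in range(n_repeats):
        # Draw a random training set of size n; evaluate on the held-out rest.
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, train_size=n, stratify=y,
            random_state=int(rng.integers(10**6)),
        )
        model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
        model.fit(X_tr, y_tr)
        scores.append(roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
    # Summarise performance at this sample size (mean ± standard deviation).
    print(f"n={n}: AUC = {np.mean(scores):.3f} ± {np.std(scores):.3f}")

Averaging over repeated random splits at each size yields both a stability estimate of the performance and its dependence on sample size, which is the kind of assessment the abstract describes.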