Machine Learning and Knowledge Extraction (Jun 2024)
Cross-Validation Visualized: A Narrative Guide to Advanced Methods
Abstract
This study delves into the multifaceted nature of cross-validation (CV) techniques in machine learning model evaluation and selection, underscoring the challenge of choosing the most appropriate method due to the plethora of available variants. It aims to clarify and standardize terminology such as sets, groups, folds, and samples pivotal in the CV domain, and introduces an exhaustive compilation of advanced CV methods like leave-one-out, leave-p-out, Monte Carlo, grouped, stratified, and time-split CV within a hold-out CV framework. Through graphical representations, the paper enhances the comprehension of these methodologies, facilitating more informed decision making for practitioners. It further explores the synergy between different CV strategies and advocates for a unified approach to reporting model performance by consolidating essential metrics. The paper culminates in a comprehensive overview of the CV techniques discussed, illustrated with practical examples, offering valuable insights for both novice and experienced researchers in the field.
Keywords