PeerJ Computer Science (Sep 2024)
Trade-off between training and testing ratio in machine learning for medical image processing
Abstract
Artificial intelligence (AI) and machine learning (ML) aim to mimic human intelligence and enhance decision making processes across various fields. A key performance determinant in a ML model is the ratio between the training and testing dataset. This research investigates the impact of varying train-test split ratios on machine learning model performance and generalization capabilities using the BraTS 2013 dataset. Logistic regression, random forest, k nearest neighbors, and support vector machines were trained with split ratios ranging from 60:40 to 95:05. Findings reveal significant variations in accuracies across these ratios, emphasizing the critical need to strike a balance to avoid overfitting or underfitting. The study underscores the importance of selecting an optimal train-test split ratio that considers tradeoffs such as model performance metrics, statistical measures, and resource constraints. Ultimately, these insights contribute to a deeper understanding of how ratio selection impacts the effectiveness and reliability of machine learning applications across diverse fields.
Keywords