Applied Sciences (Oct 2024)
Feature Selection with Small Data Sets: Identifying Feature Importance for Predictive Classification of Return-to-Work Date after Knee Arthroplasty
Abstract
In recent decades, the number of knee arthroplasty cases among people of working age has increased. The integrated clinical pathway ‘back at work after surgery’ is an initiative to reduce the potential cost of sick leave. The evaluation of this pathway, like many clinical studies, faces the challenge of small data sets with a relatively high number of features. In this study, we investigate whether feature selection tools can identify the features that are important in determining the duration of rehabilitation, expressed as the return-to-work period. Several models are used to classify the patients’ data into two classes, and the results are evaluated based on the accuracy and on the quality of the ordering of the features, for which we introduce a ranking score. A selection of estimators is used in an optimization step that reorganizes the feature ranking. The results show that, for some models, the proposed optimization yields a better ordering of the features. The ordering of the features is evaluated both visually and by means of the ranking score. Furthermore, for all models, applying the optimization process achieves higher accuracy, with a maximum of 91%. The features identified as relevant for the duration of the return-to-work period are discussed and provide input for further research.
Keywords