Scripta Medica (Jan 2022)
Optimised feature selection and cervical cancer prediction using Machine learning classification
Abstract
Background: Screening and early detection play a key role in cervical cancer prevention. The present study predicts the outcome of various diagnostic tests used to diagnose cervical cancer using machine learning algorithms. Methods: The present study ran various cervical cancer risk factors on a machine learning (ML) classifier to predict outcomes of Hinselmann, Schiller, cytology and biopsy. The dataset is publicly available on the Machine Learning Repository website of the University of California Irvine. The imbalanced dataset was pre-processed using oversampling methods. The significantly varied features between the two levels of a response variable were used to train the machine learning classifiers on MATLAB. The classifiers used were Decision Trees, Support Vector Machine, K-Nearest Neighbours and Ensemble learning classifiers. The performance metrics of the classifiers were expressed as accuracy, the area under the receiver operator characteristic (AU-ROC) curve, sensitivity and specificity. Results: The Fine Gaussian SVM classifier was the best to classify Hinselmann, cytology and biopsy with the accuracy of 97.5 %, 62.5 % and 98 %, respectively. However, Boosted trees performed best in the classification of Schiller with 81.3 % accuracy. Conclusion: The present study selected optimised features among multiple risk factors to train various ML classifiers to predict cervical cancer.
Keywords