Increasing efficiency of SVMp+ for handling missing values in healthcare prediction.

Yufeng Zhang; Zijun Gao; Emily Wittrup; Jonathan Gryak; Kayvan Najarian

doi:10.1371/journal.pdig.0000281

PLOS Digital Health (Jun 2023)

DOI: https://doi.org/10.1371/journal.pdig.0000281
Journal volume & issue: Vol. 2, no. 6
p. e0000281

Abstract

Missing data presents a challenge for machine learning applications, particularly when electronic health records are used to develop clinical decision support systems. Values are missing in part because clinical data are complex and their content is personalized to each patient. Several methods have been developed to handle this issue, such as imputation and complete case analysis, but their limitations reduce the robustness of findings. However, recent studies have explored how using some features as fully available privileged information can increase model performance, including in SVMs. Building on this insight, we propose a computationally efficient kernel SVM-based framework (l2-SVMp+) that leverages partially available privileged information to guide model construction. Our experiments validated the superiority of l2-SVMp+ over common approaches for handling missingness and over previous implementations of SVMp+ in digit recognition, disease classification, and patient readmission prediction tasks. Performance improves as the percentage of available privileged information increases. Our results showcase the capability of l2-SVMp+ to handle incomplete but important features in real-world medical applications, surpassing traditional SVMs that lack privileged information. Additionally, l2-SVMp+ achieves comparable or superior performance relative to models that use imputed privileged features.

Published in PLOS Digital Health

ISSN: 2767-3170 (Online)
Publisher: Public Library of Science (PLoS)
Country of publisher: United States
LCC subjects: Medicine: Medicine (General): Computer applications to medicine. Medical informatics
Website: https://journals.plos.org/digitalhealth/