Automated verbal autopsy classification: using one-against-all ensemble method and Naïve Bayes classifier [version 2; referees: 2 approved]

Syed Shariyar Murtaza; Patrycja Kolpak; Ayse Bener; Prabhat Jha

doi:10.12688/gatesopenres.12891.2

Gates Open Research (Jan 2019)

Affiliations

Syed Shariyar Murtaza: Data Science Lab, Ryerson University, Toronto, Ontario, M5B 2K3, Canada
Patrycja Kolpak: Centre for Global Health Research, St. Michael's Hospital, Toronto, Ontario, Canada
Ayse Bener: Data Science Lab, Ryerson University, Toronto, Ontario, M5B 2K3, Canada
Prabhat Jha: Centre for Global Health Research, St. Michael's Hospital, Toronto, Ontario, Canada

Journal volume & issue: Vol. 2

Abstract

Verbal autopsy (VA) uses post-mortem surveys about deaths, mostly in low- and middle-income countries where the majority of deaths occur at home rather than in a hospital, to assign causes of death (COD) retrospectively and thereby support evidence-based health system strengthening. Automated algorithms for VA COD assignment have been developed, and their performance has been assessed against physician and clinical diagnoses. Because the performance of automated classification methods remains low, we aimed to enhance the Naïve Bayes Classifier (NBC) algorithm to produce better ranked COD classifications on 26,766 deaths from four globally diverse VA datasets than some of the leading VA classification methods, namely Tariff, InterVA-4, InSilicoVA and NBC. We used a different strategy, training multiple NBC algorithms with the one-against-all approach (OAA-NBC). To compare performance, we computed cumulative cause-specific mortality fraction (CSMF) accuracies for population-level agreement over rank one to five COD classifications. To assess individual-level COD assignments, cumulative partially chance-corrected concordance (PCCC) and sensitivity were measured for up to five ranked classifications. Overall, the results show that OAA-NBC consistently assigns CODs that are most similar to physician and clinical COD assignments, compared with some of the leading algorithms, based on cumulative CSMF accuracy, PCCC and sensitivity scores. The results demonstrate that our approach improves classification performance (sensitivity) by between 6% and 8% compared with other VA algorithms. Population-level agreements for OAA-NBC and NBC were similar to or higher than those of the other algorithms used in the experiments.
Although OAA-NBC still requires improvement for individual-level COD assignment, the one-against-all approach improved its ability to assign CODs that more closely resemble physician or clinical COD classifications compared to some of the other leading VA classifiers.
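
To make the approach described in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. It assumes binary symptom indicators and a Bernoulli Naïve Bayes model with Laplace smoothing (the abstract does not specify the NBC variant), trains one "this cause versus all others" model per COD, and ranks causes by the positive-class posterior. It also sketches the commonly used definitions of CSMF accuracy and top-k PCCC; the paper's cumulative CSMF accuracy over ranks one to five is not reproduced here. All function names are hypothetical.

```python
import math
from collections import Counter

def train_binary_nb(X, y_bin):
    """Fit a Bernoulli Naive Bayes model for one binary problem (assumed variant)."""
    model = {}
    for label in (0, 1):
        rows = [x for x, y in zip(X, y_bin) if y == label]
        prior = (len(rows) + 1) / (len(X) + 2)  # Laplace-smoothed class prior
        probs = [(sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                 for j in range(len(X[0]))]     # smoothed P(symptom j present | label)
        model[label] = (math.log(prior), probs)
    return model

def positive_posterior(model, x):
    """Posterior probability that record x belongs to the positive class."""
    logp = {}
    for label, (log_prior, probs) in model.items():
        logp[label] = log_prior + sum(
            math.log(p if xi else 1 - p) for xi, p in zip(x, probs))
    top = max(logp.values())
    total = sum(math.exp(v - top) for v in logp.values())
    return math.exp(logp[1] - top) / total

def train_oaa(X, y):
    """One-against-all: one binary model per cause (that cause vs. the rest)."""
    return {c: train_binary_nb(X, [1 if yi == c else 0 for yi in y])
            for c in sorted(set(y))}

def rank_causes(models, x, k=5):
    """Return up to k causes ordered by decreasing positive-class posterior."""
    scores = {c: positive_posterior(m, x) for c, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

def csmf_accuracy(true, pred, causes):
    """Population-level agreement between true and predicted cause fractions."""
    n = len(true)
    t, p = Counter(true), Counter(pred)
    abs_err = sum(abs(t[c] / n - p[c] / n) for c in causes)
    return 1 - abs_err / (2 * (1 - min(t[c] / n for c in causes)))

def pccc(true, ranked, k, n_causes):
    """Partially chance-corrected concordance for the top-k ranked causes."""
    hit_rate = sum(t in r[:k] for t, r in zip(true, ranked)) / len(true)
    return (hit_rate - k / n_causes) / (1 - k / n_causes)

# Toy data (invented for illustration): three symptoms, three causes.
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]]
y = ["A", "A", "B", "B", "C", "C"]
models = train_oaa(X, y)
ranked = [rank_causes(models, x, k=2) for x in X]
top1 = [r[0] for r in ranked]
```

On real VA data, the ranked lists from `rank_causes` would be scored against physician or clinical CODs with `pccc` at k = 1 to 5, and the rank-one predictions with `csmf_accuracy`.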

Published in Gates Open Research

ISSN: 2572-4754 (Online)
Publisher: F1000 Research Ltd
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: https://gatesopenresearch.org/