Performance of InSilicoVA for assigning causes of death to verbal autopsies: multisite validation study using clinical diagnostic gold standards

Abraham D. Flaxman; Jonathan C. Joseph; Christopher J. L. Murray; Ian Douglas Riley; Alan D. Lopez

doi:10.1186/s12916-018-1039-1

BMC Medicine (Apr 2018)

Affiliations

Abraham D. Flaxman: Institute for Health Metrics and Evaluation, University of Washington
Jonathan C. Joseph: Institute for Health Metrics and Evaluation, University of Washington
Christopher J. L. Murray: Institute for Health Metrics and Evaluation, University of Washington
Ian Douglas Riley: School of Population and Global Health, University of Melbourne
Alan D. Lopez: School of Population and Global Health, University of Melbourne

DOI: https://doi.org/10.1186/s12916-018-1039-1
Journal volume & issue: Vol. 16, no. 1
pp. 1 – 11

Abstract

Background: Recently, a new algorithm for automatic computer certification of verbal autopsy data named InSilicoVA was published. The authors presented their algorithm as a statistical method and assessed its performance using a single set of model predictors and one age group.

Methods: We perform a standard procedure for analyzing the predictive accuracy of verbal autopsy classification methods using the same data and the publicly available implementation of the algorithm released by the authors. We extend the original analysis to include children and neonates, instead of only adults, and test accuracy using different sets of predictors, including the set used in the original paper and a set that matches the released software.

Results: The population-level performance (i.e., predictive accuracy) of the algorithm varied from 2.1 to 37.6% when trained on data preprocessed similarly as in the original study. When trained on data that matched the software default format, the performance ranged from −11.5 to 17.5%. When using the default training data provided, the performance ranged from −59.4 to −38.5%. Overall, the InSilicoVA predictive accuracy was found to be 11.6–8.2 percentage points lower than that of an alternative algorithm. Additionally, the sensitivity for InSilicoVA was consistently lower than that for an alternative diagnostic algorithm (Tariff 2.0), although the specificity was comparable.

Conclusions: The default format and training data provided by the software lead to results that are at best suboptimal, with poor cause-of-death predictive performance. This method is likely to generate erroneous cause of death predictions and, even if properly configured, is not as accurate as alternative automated diagnostic methods.
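The abstract reports population-level performance figures that can be negative but does not name the metric. One metric with that property that is commonly used in verbal autopsy validation is chance-corrected cause-specific mortality fraction (CSMF) accuracy. The sketch below is a minimal illustration under the assumption that this is the metric intended. The function names, the example fractions, and the 0.632 chance baseline are assumptions that do not come from the text above, and this is not the authors' code.

```python
def csmf_accuracy(true_csmf, pred_csmf):
    """CSMF accuracy: 1 - sum|true - pred| / (2 * (1 - min(true))).

    Both arguments map cause name -> fraction of deaths (each sums to 1).
    A cause missing from one mapping is treated as a fraction of 0.
    """
    causes = set(true_csmf) | set(pred_csmf)
    true_vals = {c: true_csmf.get(c, 0.0) for c in causes}
    abs_err = sum(abs(true_vals[c] - pred_csmf.get(c, 0.0)) for c in causes)
    worst_case = 2.0 * (1.0 - min(true_vals.values()))
    return 1.0 - abs_err / worst_case


def chance_corrected(accuracy, chance=0.632):
    """Rescale so that random guessing scores about 0 and perfection scores 1.

    The default baseline of 0.632 is an assumed value for the expected
    CSMF accuracy of random allocation; adjust it if your reference differs.
    """
    return (accuracy - chance) / (1.0 - chance)


# Hypothetical three-cause population, for illustration only.
true_csmf = {"cause_a": 0.5, "cause_b": 0.3, "cause_c": 0.2}
all_one_cause = {"cause_a": 1.0}  # a method that assigns every death to cause_a

raw = csmf_accuracy(true_csmf, all_one_cause)
corrected = chance_corrected(raw)
print(f"raw={raw:.3f} corrected={corrected:.3f}")
```

Under this definition a negative value means the predicted cause distribution is further from the truth than random allocation would be expected to be, which is one way to read the negative ranges reported in the Results.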

Published in BMC Medicine

ISSN: 1741-7015 (Online)
Publisher: BMC
Country of publisher: United Kingdom
LCC subjects: Medicine
Website: http://bmcmedicine.biomedcentral.com
