BMC Medical Informatics and Decision Making (May 2019)

Decision tree–based classifier in providing telehealth service

  • Ching-Chin Chern,
  • Yu-Jen Chen,
  • Bo Hsiao

DOI
https://doi.org/10.1186/s12911-019-0825-9
Journal volume & issue
Vol. 19, no. 1
pp. 1 – 15

Abstract

Background: Although previous research has shown that telehealth services can reduce the misuse of resources and urban–rural disparities, most healthcare insurers do not include telehealth services in their health insurance schemes. As a result, no target variable exists for classification approaches to learn from or train with. Identifying the potential recipients of telehealth services when introducing such services into health welfare or health insurance schemes therefore becomes an unsupervised classification problem without a target variable.

Methods: We propose HDTTCA, a systematic approach for identifying those who are eligible for telehealth services. The main process of HDTTCA involves (1) data set preprocessing, (2) decision tree model building, and (3) predicting and explaining the most important attributes in the data set for patients who qualify for telehealth service.

Results: This work uses 2012 data from the NHIRD, provided by the NHIA in Taiwan, as its research scope. The data set consists of 55,389 distinct hospitals and 653,209 distinct patients, with 15,882,153 outpatient and 135,775 inpatient records. After HDTTCA produces the final version of the decision tree, the rules can be used to assign the values of the target variable in the entire NHIRD. Our data indicate that 3.56% (23,262 out of 653,209) of the patients were eligible for telehealth services in 2012. This study verifies the efficiency and validity of HDTTCA by using a large data set from the NHI of Taiwan.

Conclusion: This study repeats a series of experiments 30 times to compare the HDTTCA results with those of logistic regression, measuring average performance to determine which model better addresses the telehealth patient classification problem. Four metrics are used for the comparison: sensitivity, accuracy, specificity, and precision. In terms of sensitivity, the decision trees generated by HDTTCA and the logistic regression model perform equally well. In terms of accuracy, specificity, and precision, the decision tree generated by HDTTCA performs better than the logistic regression model. When HDTTCA is applied, the decision tree model delivers competitive performance and provides clear, easily understandable rules. Therefore, HDTTCA is a suitable choice for solving telehealth service classification problems.
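
The four metrics named in the abstract follow their standard definitions in terms of binary confusion-matrix counts. The Python sketch below is a minimal illustration of how they could be computed and averaged over repeated runs, assuming that "positive" means a patient is eligible for telehealth service. The names `ConfusionCounts`, `classification_metrics`, and `average_metrics` are hypothetical, and the counts are invented for illustration; none of this is taken from the study's implementation or results.

```python
from dataclasses import dataclass
from statistics import fmean


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper type).

    'Positive' is assumed to mean 'eligible for telehealth service'.
    """
    tp: int  # eligible patients classified as eligible
    fp: int  # ineligible patients classified as eligible
    tn: int  # ineligible patients classified as ineligible
    fn: int  # eligible patients classified as ineligible


def _ratio(numerator, denominator):
    # Return NaN instead of raising when a metric is undefined.
    return numerator / denominator if denominator else float("nan")


def classification_metrics(c):
    """Compute the four metrics from one run's confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "sensitivity": _ratio(c.tp, c.tp + c.fn),  # true positive rate
        "specificity": _ratio(c.tn, c.tn + c.fp),  # true negative rate
        "precision": _ratio(c.tp, c.tp + c.fp),
    }


def average_metrics(runs):
    """Average each metric over repeated runs (the study uses 30)."""
    per_run = [classification_metrics(c) for c in runs]
    return {
        name: fmean([m[name] for m in per_run])
        for name in per_run[0]
    }


# Illustrative counts only; these are NOT results from the study.
example_runs = [
    ConfusionCounts(tp=80, fp=20, tn=880, fn=20),
    ConfusionCounts(tp=90, fp=10, tn=890, fn=10),
]
averaged = average_metrics(example_runs)

for name, value in averaged.items():
    print(f"{name}: {value:.4f}")
```

Calling `average_metrics` once per model (for example, once for the decision tree runs and once for the logistic regression runs) would give two dictionaries that can be compared metric by metric.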

Keywords