Journal of Spatial and Organizational Dynamics (Sep 2016)
Classification of the financial sustainability of health insurance beneficiaries through data mining techniques
Abstract
Advances in information technology have led organizations to store large amounts of data. Analyzing this data with data mining techniques provides important support for decision-making. This article applies classification techniques to the beneficiaries of a health insurance operator in Brazil, classifying them by financial sustainability on the basis of their sociodemographic characteristics and healthcare cost history. Beneficiaries with a loss ratio greater than 0.75 are considered unsustainable. The sample consists of 38,875 beneficiaries who were active between 2011 and 2013. The techniques used were logistic regression and classification trees. Model performance was compared using accuracy rates and receiver operating characteristic (ROC) curves, summarized by the area under the curve (AUC). The results showed that most of the sample is composed of sustainable beneficiaries. The logistic regression model achieved an accuracy of 68.43% with an AUC of 0.7501, and the classification tree achieved an accuracy of 67.76% with an AUC of 0.6855. Age and type of plan were the most important beneficiary-profile variables in the classification. Among healthcare costs, annual spending on consultations and on dental insurance stood out.
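
The labeling rule and the two evaluation measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the loss ratio, so the sketch assumes the common definition (healthcare costs divided by premiums paid), and the function names, the 0.5 decision cutoff, and the toy figures are all hypothetical. Only the 0.75 threshold comes from the text.

```python
from typing import Sequence

LOSS_RATIO_THRESHOLD = 0.75  # threshold stated in the abstract


def is_unsustainable(costs: float, premiums: float,
                     threshold: float = LOSS_RATIO_THRESHOLD) -> bool:
    """Label a beneficiary as unsustainable.

    Assumption: loss ratio = healthcare costs / premiums paid.
    """
    if premiums <= 0:
        raise ValueError("premiums must be positive")
    return costs / premiums > threshold


def accuracy(y_true: Sequence[bool], y_pred: Sequence[bool]) -> float:
    """Share of beneficiaries whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true: Sequence[bool], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a random
    positive case receives a higher score than a random negative case
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: (annual costs, annual premiums) per beneficiary.
records = [(900.0, 1000.0), (300.0, 1000.0), (800.0, 1000.0), (750.0, 1000.0)]
labels = [is_unsustainable(c, p) for c, p in records]

# Hypothetical model scores (estimated probability of being unsustainable).
scores = [0.8, 0.3, 0.4, 0.6]
predictions = [s >= 0.5 for s in scores]  # assumed decision cutoff of 0.5

print(labels)
print(accuracy(labels, predictions))
print(auc(labels, scores))
```

Accuracy depends on the chosen decision cutoff, whereas AUC summarizes ranking quality across all cutoffs. That difference is consistent with the two models reported above having similar accuracy rates but clearly different AUC values.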