Journal of Big Data (Jul 2019)
Medicare fraud detection using neural networks
Abstract
Abstract Access to affordable healthcare is a nationwide concern that impacts a large majority of the United States population. Medicare is a Federal Government healthcare program that provides affordable health insurance to the elderly population and individuals with select disabilities. Unfortunately, there is a significant amount of fraud, waste, and abuse within the Medicare system that costs taxpayers billions of dollars and puts beneficiaries’ health and welfare at risk. Previous work has shown that publicly available Medicare claims data can be leveraged to construct machine learning models capable of automating fraud detection, but challenges associated with class-imbalanced big data hinder performance. With a minority class size of 0.03% and an opportunity to improve existing results, we use the Medicare fraud detection task to compare six deep learning methods designed to address the class imbalance problem. Data-level techniques used in this study include random over-sampling (ROS), random under-sampling (RUS), and a hybrid ROS–RUS. The algorithm-level techniques evaluated include a cost-sensitive loss function, the Focal Loss, and the Mean False Error Loss. A range of class ratios are tested by varying sample rates and desirable class-wise performance is achieved by identifying optimal decision thresholds for each model. Neural networks are evaluated on a 20% holdout test set, and results are reported using the area under the receiver operating characteristic curve (AUC). Results show that ROS and ROS–RUS perform significantly better than baseline and algorithm-level methods with average AUC scores of 0.8505 and 0.8509, while ROS–RUS maximizes efficiency with a 4× speedup in training time. Plain RUS outperforms baseline methods with up to 30× improvements in training time, and all algorithm-level methods are found to produce more stable decision boundaries than baseline methods. Thresholding results suggest that the decision threshold always be optimized using a validation set, as we observe a strong linear relationship between the minority class size and the optimal threshold. To the best of our knowledge, this is the first study to compare multiple data-level and algorithm-level deep learning methods across a range of class distributions. Additional contributions include a unique analysis of the relationship between minority class size and optimal decision threshold and state-of-the-art performance on the given Medicare fraud detection task.
Keywords