Analyzing the Data Completeness of Patients’ Records Using a Random Variable Approach to Predict the Incompleteness of Electronic Health Records

Varadraj P. Gurupur; Paniz Abedin; Sahar Hooshmand; Muhammed Shelleh

doi:10.3390/app122110746

Applied Sciences (Oct 2022)

Affiliations

Varadraj P. Gurupur: School of Global Health Management and Informatics, University of Central Florida, Orlando, FL 32816, USA
Paniz Abedin: Department of Computer Science, Florida Polytechnic University, Lakeland, FL 33805, USA
Sahar Hooshmand: Department of Computer Science, California State University-Dominguez Hills, Carson, CA 90747, USA
Muhammed Shelleh: Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA

DOI: https://doi.org/10.3390/app122110746
Journal volume & issue: Vol. 12, no. 21, p. 10746

Abstract

This article investigates methods that can be used effectively to predict the data incompleteness of a dataset. The investigators conceptualize data incompleteness as a random variable. The overall goal of the experimentation is to provide a 360-degree view of this concept by treating the incompleteness of a dataset both as a continuous and as a discrete random variable, depending on the aspect of the analysis required. During the experiments, the investigators identified the Kolmogorov–Smirnov goodness-of-fit test, the Mielke distribution, and the beta distribution as key methods for analyzing the incompleteness of the datasets used for experimentation. These methods were also compared with a mixture density network. Overall, the investigators provide key insights into methods and algorithms that can be used to predict data incompleteness, and they outline a pathway for further exploration and prediction of data incompleteness.
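
As a rough illustration of the random-variable view described in the abstract, the sketch below treats the per-record incompleteness ratio as a value in (0, 1), fits a beta distribution to it, and measures the fit with a Kolmogorov–Smirnov statistic. Everything in it is an assumption made for illustration and is not taken from the article: the helper names (`incompleteness_ratio`, `fit_beta_moments`, `beta_cdf`, `ks_statistic`), the rule that a field counts as missing when it is `None` or an empty string, the method-of-moments fit, and the synthetic sample that stands in for real patient records.

```python
import math
import random
from statistics import mean, variance


def incompleteness_ratio(record, fields):
    """Fraction of the expected fields that are missing (None or empty) in one record."""
    missing = sum(1 for f in fields if record.get(f) in (None, ""))
    return missing / len(fields)


def fit_beta_moments(xs):
    """Method-of-moments estimates (alpha, beta) for values strictly inside (0, 1)."""
    m, v = mean(xs), variance(xs)
    common = m * (1 - m) / v - 1  # must be positive for a valid beta fit
    return m * common, (1 - m) * common


def beta_cdf(x, a, b, steps=400):
    """Beta CDF by midpoint-rule integration of the density (assumes a, b >= 1)."""
    if x <= 0:
        return 0.0
    x = min(x, 1.0)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h  # midpoint of the i-th sub-interval, never 0 or 1
        total += math.exp(log_norm + (a - 1) * math.log(t) + (b - 1) * math.log(1 - t))
    return total * h


def ks_statistic(xs, a, b):
    """Largest gap between the empirical CDF of xs and the fitted beta CDF."""
    xs = sorted(xs)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = beta_cdf(x, a, b)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Synthetic stand-in for per-record incompleteness ratios (not data from the study).
random.seed(0)
ratios = [random.betavariate(2, 5) for _ in range(500)]
a_hat, b_hat = fit_beta_moments(ratios)
print(f"alpha={a_hat:.2f}, beta={b_hat:.2f}, KS D={ks_statistic(ratios, a_hat, b_hat):.3f}")
```

A small KS statistic suggests the fitted beta distribution tracks the empirical distribution closely. Because the parameters are estimated from the same sample, standard KS critical values would be too lenient here. In practice, a library such as SciPy (`scipy.stats.beta`, `scipy.stats.mielke`, `scipy.stats.kstest`) would likely replace the hand-written CDF and fitting code.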

Published in Applied Sciences

ISSN: 2076-3417 (Online)
Publisher: MDPI AG
Country of publisher: Switzerland
LCC subjects: Technology: Engineering (General). Civil engineering (General); Science: Biology (General); Science: Physics; Science: Chemistry
Website: http://www.mdpi.com/journal/applsci
