Journal of King Saud University: Science (Jun 2021)

Illustration of missing data handling technique generated from hepatitis C induced hepatocellular carcinoma cohort study

  • Jesna Jose,
  • Gajendra K. Vishwakarma,
  • Atanu Bhattacharjee

Journal volume & issue
Vol. 33, no. 4
p. 101403

Abstract

Background and Objectives: Missing outcome data are common in clinical research trials. 'Complete case analysis' is a widely adopted method for handling missing observations. However, it reduces the sample size of the study and thus lowers statistical power. Hence, every effort should be made to reduce the amount of missing data. The objective of this work is to illustrate the application of different analytical tools and imputation techniques for handling missing data. Methods: We used imputation techniques such as the EM algorithm, MCMC, regression, and predictive mean matching, and compared the results on hepatitis C virus-induced hepatocellular carcinoma (HCV-HCC) data. Generalized Estimating Equations, time-dependent Cox regression, and joint modeling were applied to obtain statistical inference on the imputed data. A missing data handling technique compatible with Principal Component Analysis (PCA) was found suitable for high-dimensional data. Results: Joint modeling provides a slightly lower standard error than the other analytical methods for each imputation method. According to our methodology, joint modeling analysis with EM algorithm imputation appeared to be the most appropriate method for the HCV-HCC data. However, Generalized Estimating Equations and time-dependent Cox regression were relatively easy to run. Conclusion: Multiple imputation methods are efficient for providing inference with missing data. They are technically more robust than any ad hoc approach to working with missing data.
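As an illustration of one of the imputation techniques named in the Methods, the following is a minimal sketch of predictive mean matching. It is not the authors' implementation. The function name `pmm_impute`, the donor count `k`, and the simulated data are all assumptions made for this example, and the sketch is simplified: it fits the regression once by least squares instead of drawing the coefficients from their posterior distribution, as a full multiple imputation procedure would.

```python
import numpy as np


def pmm_impute(x, y, k=5, rng=None):
    """Simplified predictive mean matching (hypothetical helper).

    x: fully observed covariate(s); y: outcome with NaN marking missing values.
    Each missing y is replaced by the observed value of a donor chosen at
    random among the k observed cases with the closest predicted mean.
    """
    rng = np.random.default_rng(rng)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    design = np.column_stack([np.ones(len(y)), x])  # add an intercept column

    missing = np.isnan(y)
    observed = ~missing

    # Fit a linear regression on the complete cases only.
    beta, *_ = np.linalg.lstsq(design[observed], y[observed], rcond=None)
    predicted = design @ beta

    observed_pred = predicted[observed]
    observed_vals = y[observed]
    y_imputed = y.copy()
    for i in np.flatnonzero(missing):
        # Rank observed cases by distance between predicted means.
        distances = np.abs(observed_pred - predicted[i])
        donors = np.argsort(distances)[:k]
        y_imputed[i] = observed_vals[rng.choice(donors)]
    return y_imputed


# Simulated data (an assumption for illustration, not the HCV-HCC cohort).
data_rng = np.random.default_rng(0)
n = 200
x = data_rng.normal(size=n)
y = 2.0 + 1.5 * x + data_rng.normal(scale=0.5, size=n)
y_missing = y.copy()
y_missing[data_rng.random(n) < 0.3] = np.nan  # roughly 30% missing

# Several completed datasets, one per seed, as in multiple imputation.
completed = [pmm_impute(x, y_missing, k=5, rng=seed) for seed in range(5)]
complete_case_mean = np.nanmean(y_missing)
imputed_means = [c.mean() for c in completed]
```

Because every imputed value is copied from an observed donor, the completed data stay within the range of the observed outcomes. In practice the analysis model would be fitted to each completed dataset and the estimates pooled across datasets.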

Keywords