Data (Jan 2022)

Regression-Based Approach to Test Missing Data Mechanisms

  • Serguei Rouzinov,
  • André Berchtold

DOI
https://doi.org/10.3390/data7020016
Journal volume & issue
Vol. 7, no. 2
p. 16

Abstract

Missing data occur in almost all surveys, and handling them correctly requires knowing their type. Missing data are generally divided into three types (or generating mechanisms): missing completely at random, missing at random, and missing not at random. The first step in identifying the type of missing data generally consists of testing whether the data are missing completely at random. Several tests have been developed for that purpose, but they struggle with non-continuous variables and with data containing few missing values. Our approach checks whether the missing data are missing completely at random or missing at random using a regression model and a distribution test, and it can be applied to continuous and categorical data. The simulation results show that our regression-based approach tends to be more sensitive to the quantity and the type of missing data than the commonly used methods.
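
The abstract does not spell out the procedure, so the following is only a minimal sketch of one generic way to combine "a regression model and a distribution test", not the authors' method. It assumes a single fully observed predictor `x` and a partially missing outcome `y`: a line is fitted on the complete cases, predictions are computed for every row, and a two-sample Kolmogorov-Smirnov test compares the predictions for rows where `y` is observed with those where it is missing. The function names, the simulated data, and the choice of the Kolmogorov-Smirnov test are all illustrative assumptions.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def ks_two_sample(a, b):
    """Two-sample Kolmogorov-Smirnov statistic and asymptotic p-value."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        v = min(a[i], b[j])
        while i < n and a[i] <= v:
            i += 1
        while j < m and b[j] <= v:
            j += 1
        d = max(d, abs(i / n - j / m))  # gap between the two empirical CDFs
    en = math.sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:  # the p-value is numerically 1 here; the series converges slowly
        return d, 1.0
    series = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                 for k in range(1, 101))
    return d, min(max(2.0 * series, 0.0), 1.0)


def mcar_check(xs, ys):
    """xs is fully observed; ys uses None for missing entries (assumed layout)."""
    complete = [(x, y) for x, y in zip(xs, ys) if y is not None]
    intercept, slope = fit_line([x for x, _ in complete],
                                [y for _, y in complete])
    pred_obs = [intercept + slope * x for x, y in zip(xs, ys) if y is not None]
    pred_mis = [intercept + slope * x for x, y in zip(xs, ys) if y is None]
    return ks_two_sample(pred_obs, pred_mis)


def simulate(mechanism, n=2000, seed=0):
    """Toy data: y depends on x; missingness in y is MCAR or MAR (driven by x)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = []
    for x in xs:
        y = 2.0 * x + rng.gauss(0.0, 1.0)
        if mechanism == "MCAR":
            p_missing = 0.2
        else:  # "MAR": probability of missingness increases with x
            p_missing = 1.0 / (1.0 + math.exp(-(2.0 * x - 1.5)))
        ys.append(None if rng.random() < p_missing else y)
    return xs, ys


for mechanism in ("MCAR", "MAR"):
    d, p = mcar_check(*simulate(mechanism))
    print(f"{mechanism}: D = {d:.3f}, p = {p:.4f}")
```

With a single predictor the predictions are a monotone function of `x`, so this reduces to comparing the distribution of `x` between the two groups. With several predictors, the fitted value would act as a one-dimensional summary of them. A small p-value argues against missing completely at random, but a check of this kind cannot separate missing at random from missing not at random.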

Keywords