Tutorials in Quantitative Methods for Psychology (Jun 2023)
How to Generate Missing Data For Simulation Studies
Abstract
Missing data are common in psychological and educational research. With the improvement in computing technology in recent decades, more researchers have begun developing missing data techniques. In their research, they often conduct Monte Carlo simulation studies to compare the performance of different missing data techniques. During such simulation studies, researchers must generate missing data in the simulated datasets by deciding which data values to delete. However, the current literature offers few guidelines on how to generate missing data for simulation studies. This paper is one of the first to examine ways of generating missing data for simulation studies. I emphasize the importance of specifying missing data rules, which are statistical models for generating missing data. I begin the paper by reviewing the types of missing data mechanisms and missing data patterns. I then explain how to specify missing data rules to generate missing data with different mechanisms and patterns. I highlight the advantages and disadvantages of using different missing data rules and algorithms to generate missing data for simulation studies. Next, I discuss other important aspects of simulation studies involving missing data. I end the paper by offering recommendations for generating missing data for simulation studies.
Keywords