Journal of Statistical Software (Dec 2011)

mice: Multivariate Imputation by Chained Equations in R

  • Stef van Buuren,
  • Karin Groothuis-Oudshoorn

Journal volume & issue
Vol. 45, no. 3

Abstract

The R package mice imputes incomplete multivariate data by chained equations. The software mice 1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. mice 1.0 introduced predictor selection, passive imputation and automatic pooling. This article documents mice 2.9, which extends the functionality of mice 1.0 in several ways. In mice 2.9, the analysis of imputed data is made completely general, whereas the range of models under which pooling works is substantially extended. mice 2.9 adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs. Imputation of categorical data is improved in order to bypass problems caused by perfect prediction. Special attention is paid to transformations, sum scores, indices and interactions using passive imputation, and to the proper setup of the predictor matrix. mice 2.9 can be downloaded from the Comprehensive R Archive Network. This article provides a hands-on, stepwise approach to solve applied incomplete data problems.

Keywords