mice: Multivariate Imputation by Chained Equations in R

Research output: Contribution to journal › Article › Academic

Abstract

The R package mice imputes incomplete multivariate data by chained equations. The software mice 1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. mice 1.0 introduced predictor selection, passive imputation and automatic pooling. This article documents mice, which extends the functionality of mice 1.0 in several ways. In mice, the analysis of imputed data is made completely general, whereas the range of models under which pooling works is substantially extended. mice adds new functionality for imputing multilevel data, automatic predictor selection, data handling, post-processing imputed values, specialized pooling routines, model selection tools, and diagnostic graphs. Imputation of categorical data is improved in order to bypass problems caused by perfect prediction. Special attention is paid to transformations, sum scores, indices and interactions using passive imputation, and to the proper setup of the predictor matrix. mice can be downloaded from the Comprehensive R Archive Network. This article provides a hands-on, stepwise approach to solve applied incomplete data problems.
Original language: Undefined
Number of pages: 67
Journal: Journal of Statistical Software
Volume: 45
Issue number: 3
Publication status: Published - 2011

Keywords

  • MICE
  • R
  • multiple imputation
  • fully conditional specification
  • Gibbs sampler
  • chained equations
  • predictor selection
  • passive imputation
