Multiple Imputation Method in SAS Exemplified through a Case Study of Programmatic Data from Emergency Nutrition Programs Open Access

Bugli, Dante (2017)

Permanent URL: https://etd.library.emory.edu/concern/etds/4q77fs297?locale=en%255D
Published

Abstract

Background. Missing data is a problem that all researchers encounter. Historically applied imputation methods expose a study to bias, whereas the more advanced statistical method known as multiple imputation (MI) introduces the least bias. Although MI rests on a complex theoretical basis, statistical software now provides a sound and rapid implementation of it. Few resources, however, detail how to apply the method.

Objective. This paper provides a brief explanation of the foundations of the MI method and applies it as a sensitivity analysis of a study implemented across three countries. Comparing the results of model selection from the original analysis and the MI analysis allows factors with a significant impact on programmatic success to be identified more clearly.

Methods. Using a dataset of information from an exit questionnaire of a supplemental feeding program (SFP) implemented in emergency settings, MI was applied to artificially complete the dataset. Bivariate and multivariate regression were used to determine appropriate models to identify important factors that would lead to a patient defaulting from the program.

Results. Missing data was a large problem in this case study's dataset, with missingness in individual variables ranging from 14% to 52%. MI produced 10 imputed datasets for multivariate analysis. The models selected from the imputed datasets were not entirely identical to those from the original analysis, but where the two coincided the adjusted odds ratios were similar and more precise.

Conclusions. MI was valuable as a sensitivity analysis to identify important modifiable factors to decrease program defaulting. By identifying factors that significantly influenced or impeded participants' ability or desire to remain in the SFP, future programming may be improved. This paper shows that applying MI to categorical datasets can still confirm the results of a primary analysis and aid in targeting key factors.
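As a rough illustration of the pooling phase summarized in the abstract, the sketch below combines one coefficient across imputed datasets using Rubin's rules, the standard combining rules for MI. It is not the thesis's SAS code: the function name `pool_rubin` and the example numbers are hypothetical, and Python is used only to show the arithmetic.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool one parameter across m imputed datasets with Rubin's rules.

    estimates: per-dataset point estimates (e.g. log odds ratios)
    variances: per-dataset squared standard errors of those estimates
    Returns (pooled estimate, total variance, pooled standard error).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    q_bar = mean(estimates)           # pooled point estimate
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (m - 1 denominator)
    total = within + (1 + 1 / m) * between
    return q_bar, total, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical log odds ratios and variances from 10 imputed datasets.
    log_ors = [0.42, 0.39, 0.47, 0.44, 0.40, 0.45, 0.41, 0.43, 0.46, 0.38]
    ses_sq = [0.010, 0.011, 0.009, 0.010, 0.012, 0.010, 0.011, 0.009, 0.010, 0.011]
    q_bar, total, se = pool_rubin(log_ors, ses_sq)
    # Exponentiate the pooled log odds ratio to report an adjusted odds ratio.
    print(math.exp(q_bar), se)
```

The total variance adds the between-imputation spread to the average within-imputation variance, so the pooled standard error reflects the uncertainty introduced by the missing values rather than treating imputed values as observed.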

Table of Contents

Introduction

Methods

Preparation for Imputation

Imputation Phase

Analysis and Pooling Phase

Analysis

Ethics Statement

Sample

Imputation Phase

Analysis and Pooling Phase

Results

Analysis of Missingness

Imputation

Discussion

Limitations

Conclusions

Acknowledgements

References

Tables

Figures

Appendices

PROC MI Sample

PROC SURVEYLOGISTIC Sample

PROC MIANALYZE Sample

Sample Code from Case Study

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English