Likelihood Methods for Logistic Regression with Missing Data Open Access
Lin, Ji (2012)
Abstract
In biometric research, missing data are often encountered. This dissertation explores methods for handling missing data in logistic regression analysis. The disease status and the risk exposure may be subject to missingness separately or together. The interest is in identifying the covariate-adjusted association between disease status and risk exposure while accounting for the potential impact of the missing data.
The first research topic focuses on providing an intuitive and computationally accessible approach when the missing at random (MAR) assumption is imposed. We propose a weighting method that uses an expanded dataset, with two approaches to estimation for different situations. The MAR assumption is usually imposed in practice but is often not testable. It is therefore important to assess how sensitive the results are to violations of this assumption.
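For reference, the MAR assumption can be stated formally in the standard notation of the missing-data literature. The symbols here are assumptions introduced for illustration and are not taken from the dissertation: $R$ is the vector of missingness indicators, $Z_{\mathrm{obs}}$ and $Z_{\mathrm{mis}}$ are the observed and missing parts of the data, and $\psi$ parameterizes the missingness mechanism.

```latex
% MAR: given the observed data, missingness does not depend on the unobserved values
P(R \mid Z_{\mathrm{obs}}, Z_{\mathrm{mis}}; \psi) = P(R \mid Z_{\mathrm{obs}}; \psi)
\quad \text{for all } Z_{\mathrm{mis}}.
```

Because the condition involves values that were never observed, the observed data alone generally cannot confirm or refute it.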
In the second research topic, we propose a framework for sensitivity analysis based on specifying alternative missing data mechanisms. The result from each specified scenario is compared with that under MAR in order to assess the magnitude of change in the parameter estimates relative to the deviation from MAR. Examples and simulation results suggest that the proposed method succeeds in detecting the direction and magnitude of bias in parameter estimates even if the alternative missing data mechanism is not specified completely correctly.
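The idea of comparing alternative scenarios with the MAR result can be illustrated with a minimal sketch for a 2x2 table with no covariates. This is a generic pattern-mixture style calculation, not the dissertation's method: the counts are hypothetical, and the sensitivity parameter `delta` (the odds of disease among subjects with missing outcome relative to the odds among observed subjects in the same exposure group) is an assumption introduced here. Setting `delta` to 1 in both groups reproduces the MAR (complete-case) odds ratio.

```python
def odds_ratio_under_scenario(observed, missing, delta):
    """Odds ratio after filling in missing outcomes under an assumed mechanism.

    observed: dict mapping group -> (cases, noncases) with outcome observed
    missing:  dict mapping group -> number of subjects with outcome missing
    delta:    dict mapping group -> odds of disease among the missing,
              relative to the odds among the observed (1.0 corresponds to MAR)
    """
    odds = {}
    for group in ("exposed", "unexposed"):
        cases, noncases = observed[group]
        odds_observed = cases / noncases
        # Assumed odds of disease among subjects whose outcome is missing.
        odds_missing = delta[group] * odds_observed
        p_missing = odds_missing / (1.0 + odds_missing)
        # Expected counts after distributing the missing subjects.
        filled_cases = cases + missing[group] * p_missing
        filled_noncases = noncases + missing[group] * (1.0 - p_missing)
        odds[group] = filled_cases / filled_noncases
    return odds["exposed"] / odds["unexposed"]


# Hypothetical counts, for illustration only.
observed = {"exposed": (40, 60), "unexposed": (20, 80)}
missing = {"exposed": 50, "unexposed": 50}

# Reference result under MAR.
reference = odds_ratio_under_scenario(
    observed, missing, {"exposed": 1.0, "unexposed": 1.0}
)

# Vary the departure from MAR in the exposed group only.
for d in (0.5, 1.0, 2.0):
    scenario = odds_ratio_under_scenario(
        observed, missing, {"exposed": d, "unexposed": 1.0}
    )
    print(f"delta={d}: OR={scenario:.3f}, ratio to MAR={scenario / reference:.3f}")
```

The ratio to the MAR result gives a rough indication of the direction and size of the change in the estimate under each assumed departure from MAR.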
In the third research topic, we explore the reassessment design, in which a second wave of sampling is conducted in an attempt to recover some portion of the data missing from the original data collection. We construct a joint likelihood based on the original model of interest and a model for the missing data mechanism, with emphasis on "non-ignorable" missingness. Estimation is carried out by numerical maximization of the joint likelihood, and standard errors are estimated via a close approximation of the Hessian matrix. We show how likelihood ratio tests can be used for model selection and how they facilitate testing the hypothesis that missingness is at random, an assumption that can be suspect in many practical applications. Examples and simulations are presented to demonstrate the performance of the proposed method.
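As a rough sketch of how a likelihood ratio test for missingness at random can be set up, suppose the missing data mechanism is described by a logistic selection model. The notation is assumed for illustration and may differ from the dissertation's: $R$ indicates whether the exposure $X$ is observed, $Y$ is the disease status, $C$ is a covariate, $\beta$ collects the parameters of the model of interest, and $\psi$ collects those of the missingness model.

```latex
% Hypothetical selection model for the probability that X is observed
\operatorname{logit} P(R = 1 \mid Y, X, C) = \psi_0 + \psi_Y Y + \psi_C C + \psi_X X

% MAR corresponds to \psi_X = 0; the statistic compares the unrestricted and restricted fits
\Lambda = 2 \left\{ \ell(\hat\beta, \hat\psi) - \ell(\tilde\beta, \tilde\psi) \right\},
\qquad \Lambda \;\overset{\text{approx.}}{\sim}\; \chi^2_1 \quad \text{under } H_0 : \psi_X = 0
```

Here $\ell$ is the joint log-likelihood, $(\hat\beta, \hat\psi)$ is the unrestricted maximizer, and $(\tilde\beta, \tilde\psi)$ is the maximizer under the restriction $\psi_X = 0$.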
Table of Contents
Chapter 1. INTRODUCTION AND BACKGROUND
1.1. Introduction
1.2. Background
1.2.1. Missing-Data Mechanisms
1.2.2. Complete-Case Analysis
1.2.3. Maximum Likelihood Method
1.2.4. Inverse Propensity Weighting
1.2.5. Weighted Estimating Equations
1.2.6. Multiple Imputation
1.2.7. Predictive Probability Weighting
1.2.8. Jackknife Resampling Method
1.2.9. Sensitivity Analysis with Data Missing Not-At-Random
1.2.10. Reassessment Data in Missing Data Problems
Chapter 2. A WEIGHTING METHOD FOR LOGISTIC REGRESSION WITH DATA MISSING-AT-RANDOM
2.1. Introduction
2.2. Methods
2.2.1. Outcome Missing
2.2.2. Predictor Variable Missing
2.2.3. More Than One Variable Missing
2.3. Simulation Results
2.3.1. With Categorical Covariate C
2.3.2. With Continuous Covariates C
2.4. Discussions
2.4.1. Comparison of the Methods
2.4.2. Connection between the IPW and the "Flipped-Around" Model
Chapter 3. SENSITIVITY ANALYSIS FOR DATA NOT MISSING AT RANDOM IN LOGISTIC REGRESSION
3.1. Introduction
3.2. Methods
3.2.1. The No-Covariate Case: Basic Sensitivity Analysis
3.2.2. The Covariate Case
3.2.3. Standard Error Estimation
3.3. Examples
3.3.1. No Covariate Case
3.3.2. Covariate Case
3.3.3. Monte Carlo Sensitivity Analysis
3.4. Simulations
3.5. Discussions
3.5.1. Connection Between the Three Ways to Specify Alternative Missing Mechanism
3.5.2. Extensions
Chapter 4. JOINT MODEL FOR LOGISTIC REGRESSION WITH MISSING DATA AND REASSESSMENT DESIGN
4.1. Introduction
4.2. Methods
4.2.1. Outcome or Exposure Missing in Logistic Regression with Reassessment Data
4.2.2. Outcome and Exposure Missing in Logistic Regression with Reassessment Data
4.2.3. Estimation
4.2.4. Model Selection and Testing Not-Missing-At-Random
4.3. Simulations
4.3.1. Comparison of Methods with NMAR X
4.3.2. Comparison of Methods with MAR X
4.3.3. Both Outcome and Exposure NMAR with Reassessment
4.4. Example
4.5. Discussion
Chapter 5. SUMMARY AND FUTURE RESEARCH
5.1. Summary
5.2. Future Research
BIBLIOGRAPHY
Primary PDF
Title | Date Uploaded |
---|---|
Likelihood Methods for Logistic Regression with Missing Data | 2018-08-28 16:13:03 -0400 |