Analysis of Data with Complex Misclassification in Response or Predictor Variables by Incorporating Validation Subsampling

Open Access

Tang, Li (2012)

Permanent URL: https://etd.library.emory.edu/concern/etds/wp988k18n?locale=en%255D
Published

Abstract

Misclassification is common in epidemiological and clinical research and may affect an exposure variable, an outcome variable, or both. It is well known that the validity of analytic results (e.g., estimates of odds ratios of interest) can be questionable when no correction is attempted. Valid and accessible methods for dealing with these issues therefore remain in high demand.
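
As a small illustration of why uncorrected misclassification matters, the sketch below computes the expected odds ratio from a 2-by-2 table before and after a binary exposure is measured with imperfect sensitivity and specificity. The counts, the sensitivity of 0.8, the specificity of 0.9, and the function names are hypothetical values chosen for illustration; they are not taken from the dissertation, and the example assumes the simplest case of nondifferential exposure misclassification only.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a, b: exposed and unexposed cases; c, d: exposed and unexposed controls.
    """
    return (a * d) / (b * c)


def misclassify_exposure(exposed, unexposed, se, sp):
    """Expected observed counts when true exposure is recorded with
    sensitivity `se` and specificity `sp`."""
    obs_exposed = se * exposed + (1.0 - sp) * unexposed
    obs_unexposed = (1.0 - se) * exposed + sp * unexposed
    return obs_exposed, obs_unexposed


# Hypothetical true counts (illustrative only, not study data).
true_cases = (200.0, 300.0)      # exposed, unexposed
true_controls = (100.0, 400.0)   # exposed, unexposed
se, sp = 0.8, 0.9                # assumed, identical for cases and controls

true_or = odds_ratio(*true_cases, *true_controls)
obs_cases = misclassify_exposure(*true_cases, se, sp)
obs_controls = misclassify_exposure(*true_controls, se, sp)
naive_or = odds_ratio(*obs_cases, *obs_controls)

print(f"true OR  = {true_or:.3f}")
print(f"naive OR = {naive_or:.3f}")
```

Under these assumed values the naive odds ratio is pulled toward 1 relative to the true one. With differential misclassification, the bias can go in either direction.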

In this dissertation, we first consider the situation when correlated binary response variables are subject to misclassification. Building upon prior work that extended McNemar's test to correct paired-data odds ratio estimation, we propose a nonlinear mixed model-based approach to adjust for potentially complex differential misclassification in correlated binary responses via internal validation sampling.

In the second topic, we turn to predictor misclassification, for which we develop likelihood-based approaches built on generalized linear and generalized linear mixed models that can efficiently incorporate internal validation data in univariate and multivariate settings, respectively. We discuss the use of the approach both when a baseline predictor is misclassified and when a time-dependent predictor is misclassified.
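
For orientation, a likelihood of this general kind is often written as below for the univariate case. The notation is an assumption made here for illustration and may differ from the dissertation's: $Y$ is the outcome, $X$ the true binary predictor, $X^{*}$ its error-prone version, $C$ the covariates, $V$ the internal validation subsample (where $X$ is observed), and $M$ the remaining main-study subjects (where only $X^{*}$ is observed).

```latex
L(\theta) =
  \prod_{i \in V} P(Y_i \mid X_i, C_i)\, P(X_i^{*} \mid X_i, Y_i, C_i)\, P(X_i \mid C_i)
  \;\times\;
  \prod_{i \in M} \sum_{x=0}^{1} P(Y_i \mid x, C_i)\, P(X_i^{*} \mid x, Y_i, C_i)\, P(x \mid C_i)
```

Letting $P(X^{*} \mid X, Y, C)$ depend on $Y$ is what makes the misclassification differential in this sketch. Dropping that dependence gives the nondifferential special case.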

In the final topic, we elucidate extensions of well-studied methods in order to facilitate misclassification adjustment when a binary outcome and binary exposure variable are both subject to complex differential misclassification in the 2-by-2 table scenario. We develop maximum likelihood approaches to accommodate a broad range of complexity in the joint misclassification process while incorporating various types of internal validation observations. We then generalize the method to a more standard binary regression setting, allowing the incorporation of covariates both in the main health effects model of interest and in misclassification models for both the binary outcome and exposure variable. Throughout, illustrative examples are presented via detailed analyses of bacterial vaginosis and trichomoniasis data from the HIV Research Epidemiology Study (HERS).
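
To give a concrete sense of the simplest table-based adjustment, the sketch below estimates sensitivity and specificity from a validation subsample and then back-corrects observed exposure counts within cases and controls. It is a minimal illustration of the classic correction under nondifferential exposure-only misclassification, not the maximum likelihood approach developed in the dissertation. All counts and function names are hypothetical, and the validation counts are treated as a simple cross-classification of true against observed exposure.

```python
def estimate_se_sp(tp, fn, fp, tn):
    """Sensitivity and specificity from validation counts.

    tp, fn: truly exposed subjects observed as exposed / unexposed.
    fp, tn: truly unexposed subjects observed as exposed / unexposed.
    """
    return tp / (tp + fn), tn / (fp + tn)


def correct_counts(obs_exposed, obs_unexposed, se, sp):
    """Invert the 2x2 misclassification matrix in closed form to recover
    the expected true (exposed, unexposed) counts."""
    if se + sp <= 1.0:
        raise ValueError("correction requires se + sp > 1")
    n = obs_exposed + obs_unexposed
    true_exposed = (obs_exposed - (1.0 - sp) * n) / (se + sp - 1.0)
    return true_exposed, n - true_exposed


# Hypothetical validation subsample (illustrative only).
se_hat, sp_hat = estimate_se_sp(tp=80, fn=20, fp=10, tn=90)

# Hypothetical observed main-study counts: (exposed, unexposed).
obs_cases = (190.0, 310.0)
obs_controls = (120.0, 380.0)

adj_cases = correct_counts(*obs_cases, se_hat, sp_hat)
adj_controls = correct_counts(*obs_controls, se_hat, sp_hat)

naive_or = (obs_cases[0] * obs_controls[1]) / (obs_cases[1] * obs_controls[0])
corrected_or = (adj_cases[0] * adj_controls[1]) / (adj_cases[1] * adj_controls[0])

print(f"naive OR     = {naive_or:.3f}")
print(f"corrected OR = {corrected_or:.3f}")
```

A correction of this kind can produce negative cell counts when the estimated sensitivity and specificity are poor or the sample is small, which is one reason likelihood-based approaches are often preferred.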

Key Words: Differential; Misclassification; Internal Validation; Likelihood

Table of Contents

Chapter 1 Introduction

1.1 Overview
1.2 Misclassification in Correlated Binary Responses
1.3 Misclassification in Predictors
1.4 Misclassification in Response and Predictor Variables in 2×2 Tables
1.5 Misclassification in Response and Predictor Variables in Regression
1.6 Motivating Example

Chapter 2 Regression Analysis for Differentially Misclassified Correlated Binary Responses

2.1 Methods

2.1.1 Notation
2.1.2 Validation Sampling Scheme
2.1.3 Non-differential Misclassification with External Validation
2.1.4 Differential Misclassification
2.1.5 Main-study Only and Sensitivity Analysis
2.1.6 Estimation
2.1.7 Correlation in Misclassification Processes

2.2 Simulation Studies

2.2.1 Non-differential Misclassification
2.2.2 Differential Misclassification
2.2.3 Importance of Correctly Specifying SE/SP Model
2.2.4 A Note About Correlated Misclassification

2.3 Example

2.3.1 HERS Example
2.3.2 Example 1: Pairwise No-covariate case
2.3.3 Example 2: Pairwise Covariate-adjusted case
2.3.4 Example 3: Longitudinal Analysis with >2 Time Points

2.4 Discussion

Chapter 3 Regression Analysis for Differentially Misclassified Binary Covariates

3.1 Univariate Case

3.1.1 Model Specification
3.1.2 External Validation: Non-differential Misclassification
3.1.3 Internal Validation: Differential Misclassification
3.1.4 Note on Impact of Mis-specifying X|C Model

3.2 Extension to Repeated Measures

3.2.1 Model Specification
3.2.2 External Validation: Non-Differential Misclassification
3.2.3 Internal Validation: Differential
3.2.4 Estimation

3.3 Simulation Studies

3.3.1 External Validation in Univariate Case: Non-Differential Misclassification
3.3.2 Internal Validation in Univariate Case: Differential Misclassification
3.3.3 External Validation in Longitudinal Case: Non-Differential Misclassification
3.3.4 Internal Validation in Longitudinal Case: Differential Misclassification

3.4 Example

3.4.1 HERS Example
3.4.2 Example 1: Univariate Analysis with Visit 4
3.4.3 Example 2: Longitudinal Analysis

3.5 Discussion

Chapter 4 Misclassification in Response and Predictor Variables in 2×2 Tables

4.1 Methods

4.1.1 Notations and Terminology
4.1.2 Maximum Likelihood (ML) Approach
4.1.3 Generalized Matrix Method
4.1.4 Generalized Inverse Matrix Method
4.1.5 Estimation of Misclassification Probabilities and Variance
4.1.6 Notes on Case-Control Studies
4.1.7 Model Selection
4.1.8 Comments Regarding Null Testing

4.2 Simulation Studies

4.2.1 Study I: Mimicking Real-data Example
4.2.2 Study II: Different Types of Misclassification
4.2.3 Study III: Performance of Model Selection
4.2.4 Study IV: Misclassification in Case-control studies

4.3 Example
4.4 Discussion

Chapter 5 Misclassification in Response and Predictor Variables in Logistic Regression

5.1 Methods

5.1.1 Notation
5.1.2 Independent Nondifferential Misclassification
5.1.3 Independent Differential Misclassification
5.1.4 Dependent and Differential Misclassification
5.1.5 Other Types of Misclassification

5.2 Example
5.3 Simulation Studies
5.4 Discussion

References

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English