Bias Correction under Measurement Error: A Comparative Study of Extrapolation Strategies in the SIMEX Framework

Open Access

Hua, Shujie (Spring 2025)

Permanent URL: https://etd.library.emory.edu/concern/etds/6682x5461?locale=en
Published

Abstract

Measurement error in covariates is a common challenge in regression analysis, leading to biased parameter estimates, particularly attenuation of slope coefficients. The Simulation-Extrapolation (SIMEX) method provides a flexible and model-agnostic framework for correcting such bias by simulating additional noise and extrapolating to a noise-free scenario. This thesis presents a comprehensive comparative study of three extrapolation strategies—linear, quadratic, and nonlinear—within the SIMEX framework, focusing on their ability to recover unbiased slope estimates in linear and logistic regression models. We conduct extensive Monte Carlo simulations under varying levels of measurement error and assess the performance of each method in terms of bias, variance, and root mean squared error (RMSE). Our theoretical analysis justifies the structural form of nonlinear extrapolation and explains its superior performance, while also identifying conditions under which it may fail to converge. We further examine variance estimation for the corrected estimator and discuss open questions surrounding uncertainty quantification in SIMEX. Finally, we demonstrate the extension of SIMEX to logistic regression and evaluate its effectiveness compared to the naive and regression calibration approaches. Our findings highlight the practical value and limitations of different extrapolation strategies and provide guidance for applying SIMEX in both classical and generalized linear modeling contexts.
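
The two SIMEX stages described in the abstract (adding simulated noise, then extrapolating back to the noise-free case) can be illustrated with a minimal sketch for simple linear regression. This is not the thesis's implementation: the function names `ols_slope` and `simex_slope`, the grid of noise multipliers `lambdas`, the number of replicates `n_rep`, and the toy data are all assumptions made for illustration. It also assumes classical additive error with a known standard deviation `sigma_u`, and it uses only the quadratic extrapolant, fitted with `numpy.polyfit` and evaluated at λ = -1.

```python
import numpy as np


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


def simex_slope(w, y, sigma_u, lambdas=(0.5, 1.0, 1.5, 2.0),
                n_rep=200, degree=2, seed=0):
    """Hypothetical SIMEX sketch for a simple-regression slope.

    w       : error-prone covariate, assumed W = X + U with U ~ N(0, sigma_u^2)
    sigma_u : measurement-error standard deviation, assumed known
    lambdas : extra-noise multipliers used in the simulation stage
    degree  : degree of the polynomial extrapolant (2 = quadratic)
    """
    rng = np.random.default_rng(seed)
    lam_grid = [0.0]
    slopes = [ols_slope(w, y)]  # lambda = 0 is the naive estimate

    # Simulation stage: inflate the error variance to (1 + lambda) * sigma_u^2
    # and average the refitted slope over n_rep pseudo data sets.
    for lam in lambdas:
        reps = [
            ols_slope(w + np.sqrt(lam) * sigma_u * rng.standard_normal(w.size), y)
            for _ in range(n_rep)
        ]
        lam_grid.append(lam)
        slopes.append(float(np.mean(reps)))

    # Extrapolation stage: fit a polynomial in lambda and evaluate it at
    # lambda = -1, which corresponds to zero measurement-error variance.
    coefs = np.polyfit(lam_grid, slopes, deg=degree)
    return float(np.polyval(coefs, -1.0)), np.array(lam_grid), np.array(slopes)


# Toy data (all values are illustrative assumptions).
rng = np.random.default_rng(1)
n, beta, sigma_u = 5000, 1.0, 0.5
x = rng.standard_normal(n)                      # true covariate
y = 0.5 + beta * x + 0.3 * rng.standard_normal(n)
w = x + sigma_u * rng.standard_normal(n)        # observed, error-prone covariate

naive = ols_slope(w, y)
simex, lam_grid, slopes = simex_slope(w, y, sigma_u)
print(f"naive slope: {naive:.3f}, SIMEX (quadratic) slope: {simex:.3f}")
```

In this setting the averaged slopes shrink as λ grows, and the quadratic extrapolant moves the estimate back toward the true slope without necessarily recovering it exactly, which is consistent with the abstract's point that the choice of extrapolation strategy matters.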

Table of Contents

1 Introduction

2 Method

 2.1 Model Specification

 2.2 Standard Method: Corrected-Estimate

  2.2.1 Mathematical Rationale

  2.2.2 Limitations

 2.3 SIMEX Framework

  2.3.1 Simulation Stage

  2.3.2 Extrapolation Stage

 2.4 Data Generation

  2.4.1 Core Parameter Settings

  2.4.2 Experimental Design

  2.4.3 Parameter Selection Rationale

 2.5 Theoretical Analysis

  2.5.1 Theoretical Basis of Nonlinear Extrapolation and Choice of ω→1

  2.5.2 Theoretical Mechanisms of Convergence Failure

 2.6 Extension to Logistic Regression

 2.7 Variance Estimation

  2.7.1 Variance Estimation for Corrected-Estimate

3 Results

 3.1 SIMEX Workflow: Data Generation, Averaging, and Extrapolation

 3.2 Comparison of Extrapolation Methods

 3.3 Convergence Failure of Nonlinear Extrapolation

 3.4 Application to Logistic Regression

4 Discussion

References

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English