Stability of Inference Derived from Machine Learning-based Doubly Robust Estimators of Treatment Effects Open Access

Song, Weishan (Spring 2020)


The doubly robust targeted minimum loss-based estimator (DRTMLE) is a causal inference technique used to estimate covariate-adjusted treatment effects. Such estimators often rely on super learning, a flexible regression technique that uses cross-validation. Because cross-validation splits the data at random, the estimates and inference obtained with this methodology may change when different seeds are set to control the splitting process, which may decrease the trustworthiness of such analyses. In this paper, we evaluate two solutions to this problem. We present simulation studies that assess the performance of both tactics in different scenarios, along with a real data analysis. We conclude that averaging estimates over repeated runs with different seeds yields more stable results without a deleterious effect on estimator performance.
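The seed dependence described above, and the averaging remedy, can be illustrated with a minimal sketch. This is not DRTMLE or a super learner: it assumes a toy cross-fitted plug-in estimator on simulated data, and every name in it (`simulate_data`, `cell_means`, `cross_fitted_estimate`, `averaged_estimate`) is hypothetical and chosen only for illustration. Only the averaging tactic is sketched, since it is the one the abstract names.

```python
import random
import statistics


def simulate_data(n, seed):
    """Toy data (an assumption, not the thesis design): binary W, binary A, continuous Y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        w = int(rng.random() < 0.5)
        a = int(rng.random() < (0.7 if w else 0.3))  # treatment depends on W
        y = 1.0 * a + 2.0 * w + rng.gauss(0.0, 1.0)
        rows.append((w, a, y))
    return rows


def cell_means(rows):
    """Return q(w, a): mean outcome in each (w, a) cell, overall mean if a cell is empty."""
    sums, counts = {}, {}
    for w, a, y in rows:
        sums[(w, a)] = sums.get((w, a), 0.0) + y
        counts[(w, a)] = counts.get((w, a), 0) + 1
    overall = statistics.fmean(y for _, _, y in rows)

    def q(w, a):
        if (w, a) in counts:
            return sums[(w, a)] / counts[(w, a)]
        return overall

    return q


def cross_fitted_estimate(rows, seed, n_folds=5):
    """Plug-in effect estimate whose value depends on the random fold split."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)  # the seed controls which observations land in which fold
    folds = [idx[k::n_folds] for k in range(n_folds)]
    diffs = []
    for fold in folds:
        held_out = set(fold)
        train = [rows[i] for i in idx if i not in held_out]
        q = cell_means(train)  # fit on the training folds only
        diffs.extend(q(rows[i][0], 1) - q(rows[i][0], 0) for i in fold)
    return statistics.fmean(diffs)


def averaged_estimate(rows, seeds, n_folds=5):
    """Average the seed-dependent estimate over repeated runs with different seeds."""
    return statistics.fmean(cross_fitted_estimate(rows, s, n_folds) for s in seeds)


data = simulate_data(200, seed=0)
# Spread of the estimate across 200 single-seed runs on the same data.
single_runs = [cross_fitted_estimate(data, s) for s in range(200)]
# Spread across 10 analyses that each average over 20 distinct seeds.
averaged_runs = [
    averaged_estimate(data, range(1000 + 20 * g, 1000 + 20 * (g + 1)))
    for g in range(10)
]
print("SD across single-seed runs:", statistics.stdev(single_runs))
print("SD across averaged runs:   ", statistics.stdev(averaged_runs))
```

In this sketch the data are held fixed, so all variation across runs comes from the random fold assignment. Averaging over more seeds trades extra computation for a result that depends less on any one split.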

Table of Contents

1. Introduction

2. Methods

2.1 Causal Inference with Doubly Robust Methods

2.2 Super Learner

2.3 Dependence of Results on Random Number Generation

2.4 Proposed Solutions

3. Simulation

3.1 Study Design

3.2 Results

4. Implementation on Clinical Study of Tuberculosis Drug-Resistance

5. Discussion

References

Appendix: Tables and Figures

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Subfield / Discipline
  • English