Assess Improvement of Balancing Covariates by Propensity Score approach using Generalized Boosted Model (GBM) and Application Based on National Cancer Database

Song, Haocan (Spring 2018)

Permanent URL: https://etd.library.emory.edu/concern/etds/k3569441r?locale=pt-BR

Published

Abstract

Background: Observational studies are among the most commonly used designs in medical research, but they are vulnerable to selection bias, which limits valid causal inference. Propensity score (PS) matching and weighting are popular methods for reducing this bias and estimating causal effects in observational studies. In this work, we focused on the Generalized Boosted Model (GBM), a tree-based approach that yields more accurate PS estimates without specifying the form of the prediction function, and we compared its covariate-balancing performance with that of conventional model-based approaches such as logistic regression.
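
For reference, the propensity score and the main-effect logistic model referred to above can be written as follows. The notation (T for the treatment indicator, X for the covariate vector, beta for the regression coefficients, p for the number of covariates) is assumed here for illustration and is not taken from the thesis.

```latex
e(x) = \Pr(T = 1 \mid X = x), \qquad
\operatorname{logit}\, e(x) = \beta_0 + \sum_{j=1}^{p} \beta_j x_j
```

In a boosted approach, the linear predictor on the right is replaced by a sum of small regression trees, which is why no functional form has to be specified in advance.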

Method and Study Design: In this study, we tested three alternative methods for propensity score (PS) estimation: a main-effect logistic regression model (model 1: LOGREG), a comprehensive logistic regression model with all two-way interactions and polynomial terms (model 2: LOGREG(INT)), and GBM (model 3). We implemented these algorithms in an application to prostate cancer data from the National Cancer Database (NCDB), where we aimed to compare overall survival between proton radiation therapy and conventional x-ray-based radiation therapy. To eliminate confounding effects, we performed propensity score matching (PSM) with a caliper and matching ratios up to 1:5. Balance was evaluated before and after matching using the standardized difference. A proportional hazards model was fitted to the matched sample to estimate the hazard ratio for proton therapy with its 95% confidence interval.
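
The matching and balance-checking steps described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the thesis's implementation: the function names, the toy propensity scores, the caliper applied on the raw PS scale, and the choice to keep treated units that find fewer than `ratio` controls are all hypothetical.

```python
import math


def standardized_difference(treated, control):
    """Absolute standardized mean difference of one continuous covariate."""
    mean_t = sum(treated) / len(treated)
    mean_c = sum(control) / len(control)
    # Sample variances (n - 1 denominator) within each group.
    var_t = sum((x - mean_t) ** 2 for x in treated) / (len(treated) - 1)
    var_c = sum((x - mean_c) ** 2 for x in control) / (len(control) - 1)
    pooled_sd = math.sqrt((var_t + var_c) / 2)
    if pooled_sd == 0:
        return 0.0 if mean_t == mean_c else math.inf
    return abs(mean_t - mean_c) / pooled_sd


def greedy_caliper_match(ps_treated, ps_control, caliper, ratio=1):
    """Greedy nearest-neighbour 1:ratio matching without replacement.

    Returns a dict mapping each matched treated index to a list of
    control indices whose PS lies within `caliper` of the treated PS.
    Assumption: treated units with at least one match are kept.
    """
    available = set(range(len(ps_control)))
    matches = {}
    for i, p in enumerate(ps_treated):
        chosen = []
        for _ in range(ratio):
            # Unused controls inside the caliper, as (distance, index) pairs.
            candidates = [
                (abs(p - ps_control[j]), j)
                for j in available
                if abs(p - ps_control[j]) <= caliper
            ]
            if not candidates:
                break
            _, best = min(candidates)  # ties go to the lowest index
            available.remove(best)
            chosen.append(best)
        if chosen:
            matches[i] = chosen
    return matches


# Toy propensity scores (made up for illustration only).
ps_treated = [0.30, 0.62, 0.90]
ps_control = [0.28, 0.33, 0.60, 0.65, 0.10]
matches = greedy_caliper_match(ps_treated, ps_control, caliper=0.05, ratio=2)
print(matches)  # {0: [0, 1], 1: [2, 3]}
```

In this toy run the third treated unit (PS 0.90) has no control within the caliper and is dropped, which shows how a caliper trades sample size for closer matches. Calling `standardized_difference` on a covariate before and after matching gives the balance comparison described in the abstract.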

Conclusion: The study shows that covariate balance can be improved by a more accurate PS estimation model, either GBM or comprehensive logistic regression, and both approaches should be encouraged in practice. In the case study, we also found that proton radiation therapy provided an improved clinical benefit in long-term survival for prostate cancer patients.

Table of Contents

1. INTRODUCTION

1.1 Observational Study

1.2 Propensity Score

1.3 Variable Selection for the Propensity Score Model

1.4 Propensity Score Calculation

1.4.1 Main-effect Logistic Regression Model (LOGREG)

1.4.2 Comprehensive Logistic Regression Model with all Two-way Interactions and Polynomial Terms (LOGREG(INT))

1.4.3 Generalized Boosted Models (GBM)

1.5 Propensity Score Matching

1.5.1 Greedy Matching

1.5.2 1-1 to 1-N Caliper Matching

1.6 Treatment Effect

1.6.1 Average Treatment Effect (ATE)

1.6.2 Average Treatment Effect Among the Treated (ATT)

1.7 Checking balance on the covariates before and after matching

2. CASE STUDY

2.1 Study Objective

2.2 NCDB database

2.3 Define study population

2.4 Select the covariates

2.5 Statistical methods

3. RESULTS

3.1 Patients characteristics

3.2 Estimating propensity scores

3.3 PS Matching

3.4 Checking balance on the covariates before and after matching

3.4.1 Greedy Matching

3.4.2 1-1 to 1-N Caliper Matching

4. DISCUSSION

Bibliography

APPENDIX

About this Master's Thesis

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Rollins School of Public Health
Department	Biostatistics
Degree	M.P.H.
Submission	Master's Thesis
Language	English
Research Field	Health Sciences, Oncology
Keywords	GBM; matching; propensity score; covariates balance check; observational study
Committee Chair / Thesis Advisor	Liu, Yu-an, Emory University
Committee Members	Suprateek, Kundu, Emory University
Partnering Agencies	Emory University schools, faculty or affiliated programs

Primary PDF

Title	Date Uploaded
Assess Improvement of Balancing Covariates by Propensity Score approach using Generalized Boosted Model (GBM) and Application Based on National Cancer Database	2018-04-10 09:30:23 -0400

Supplemental Files

Title	Date Uploaded
Appendix Tables (Additional Tables for Thesis)	2018-04-11 07:38:32 -0400
