Evaluation of Propensity Score Matching Techniques on Overall Survival using the National Cancer Data Base
Open Access

Zhong, Luer (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/sf268622c?locale=en
Published

Abstract

Background: Observational studies are often used to mimic randomized controlled trials. Propensity scores can facilitate this because matching on them balances the distribution of baseline covariates between the treated and the untreated. However, it is not clear whether a 1:1 match can be improved by increasing the number of controls (N) matched to each case. In this analysis, we calculated propensity scores and matched cases with controls using the greedy matching algorithm. We then compared one-to-one and one-to-N greedy matching, along with different matching digits, and determined which approach performed better for two National Cancer Data Base (NCDB) files.

Methods: We calculated propensity scores using a logistic regression model and performed 1-to-1 through 1-to-5 greedy matching across HPV status, with 5-to-1 digit and 5-to-2 digit matching, on both the larynx and hypopharynx cancer datasets from NCDB. Overall survival was the clinical outcome; match rate and standardized difference were used to determine which approach performed better. For the survival outcome, Kaplan-Meier survival curves, stratified log-rank tests, and hazard ratios with 95% confidence intervals from the Cox proportional hazards model were reported.
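The digit-based greedy matching named in the Methods can be illustrated with a minimal Python sketch. It assumes the common reading of "5-to-1 digit" matching: cases are first matched to controls whose propensity scores agree when rounded to 5 decimal places, and the still-unmatched cases are then retried at 4, 3, 2, and finally 1 decimal place ("5-to-2 digit" stops at 2). The function names `digit_key` and `greedy_digit_match`, their parameters, the use of rounding rather than truncation, and the rule of keeping only cases that receive all N controls are assumptions made for illustration; they are not taken from the thesis.

```python
from collections import defaultdict


def digit_key(score, digits):
    """Integer key for a propensity score rounded to `digits` decimal places."""
    return int(round(score * 10 ** digits))


def greedy_digit_match(cases, controls, ratio=1, max_digits=5, min_digits=1):
    """Greedy 1:ratio matching on rounded propensity scores.

    cases, controls: dict mapping subject id -> propensity score.
    Returns a dict mapping case id -> list of matched control ids.
    Assumption: only cases that received all `ratio` controls are returned.
    """
    matches = {case_id: [] for case_id in cases}
    available = dict(controls)  # controls not yet used (matching without replacement)

    # Go from the strictest precision (max_digits) to the loosest (min_digits).
    for digits in range(max_digits, min_digits - 1, -1):
        # Group the remaining controls by their rounded score.
        buckets = defaultdict(list)
        for ctrl_id, score in available.items():
            buckets[digit_key(score, digits)].append(ctrl_id)

        for case_id, score in cases.items():
            pool = buckets.get(digit_key(score, digits), [])
            # Greedy step: take whatever is available now and never revisit it.
            while len(matches[case_id]) < ratio and pool:
                ctrl_id = pool.pop()
                matches[case_id].append(ctrl_id)
                del available[ctrl_id]

    return {c: m for c, m in matches.items() if len(m) == ratio}


if __name__ == "__main__":
    # Made-up scores, purely for illustration.
    cases = {"a": 0.123451, "b": 0.50}
    controls = {"x": 0.123449, "y": 0.54, "z": 0.90}
    print(greedy_digit_match(cases, controls, min_digits=1))  # 5-to-1 digit
    print(greedy_digit_match(cases, controls, min_digits=2))  # 5-to-2 digit
```

In this toy input, case "b" can only be paired at 1-digit precision, so stopping at 2 digits leaves it unmatched. Under the stated assumptions, this is why a 5-to-2 digit scheme can never match more cases than a 5-to-1 digit scheme.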

Results: The number of matched HPV-positive patients is smaller for 5-to-2 digit matching than for 5-to-1 digit matching. The hazard ratio confidence intervals are generally narrower for 5-to-1 digit matching than for 5-to-2 digit matching. Almost no standardized differences exceed 0.1 once N is greater than 2, except for 5-to-2 digit matching on larynx cancer stage 1&2.
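The 0.1 threshold above refers to the standardized difference used for the balance check. A minimal sketch of the usual definitions is given below, assuming the standard pooled-standard-deviation form for continuous and binary covariates. The function names are hypothetical, and the sketch is unweighted, which is a simplification for 1-to-N matched samples where matched controls are often weighted.

```python
import math
from statistics import mean, variance


def std_diff_continuous(treated, control):
    """Standardized difference for a continuous covariate.

    (mean_t - mean_c) / sqrt((var_t + var_c) / 2), using sample variances.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


def std_diff_binary(p_treated, p_control):
    """Standardized difference for a binary covariate given group proportions."""
    pooled_sd = math.sqrt(
        (p_treated * (1 - p_treated) + p_control * (1 - p_control)) / 2
    )
    return (p_treated - p_control) / pooled_sd


def is_balanced(d, threshold=0.1):
    """True when the absolute standardized difference is within the threshold."""
    return abs(d) <= threshold


if __name__ == "__main__":
    # Made-up values, purely for illustration.
    d_age = std_diff_continuous([61, 58, 66, 70], [60, 59, 65, 71])
    d_sex = std_diff_binary(0.60, 0.40)
    print(d_age, is_balanced(d_age))
    print(d_sex, is_balanced(d_sex))
```

Because the standardized difference is scaled by the pooled standard deviation rather than by a standard error, it does not shrink simply because the sample grows, which makes it comparable across matching ratios.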

Conclusion: As the case-to-control matching ratio increases from 1:1 to 1:5, covariate balance improves. 5-to-1 digit matching gives better covariate balance and higher match rates than 5-to-2 digit matching. If a dataset is large enough to allow 1-to-3, 1-to-4, or 1-to-5 matching, we recommend increasing the matching ratio to achieve better balance across baseline characteristics.

Keywords: Observational study, propensity score, greedy matching, survival outcome, balance check

Table of Contents

INTRODUCTION

METHODS
 Data Description
  1. Data sets
  2. Study Population
  3. Variable selection
 Statistical Analysis

RESULTS
 Descriptive Statistics
  1. Larynx cancer
  2. Hypopharynx cancer
 Kaplan-Meier Analysis
 Propensity Score Calculation
 Overall Survival Analysis
  1. Larynx cancer
  2. Hypopharynx cancer
 Balance Check
  1. Larynx Cancer – 5-to-1 digit matching
  2. Larynx Cancer – 5-to-2 digit matching
  3. Hypopharynx Cancer – 5-to-1 digit matching
  4. Hypopharynx Cancer – 5-to-2 digit matching

DISCUSSION

APPENDIX

BIBLIOGRAPHY

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English