Evaluation of Propensity Score Matching Techniques on Overall Survival using the National Cancer Data Base (Public)
Zhong, Luer (Spring 2019)
Abstract
Background: Observational studies are often used to mimic randomized controlled trials. Propensity scores facilitate this because matching on them balances the distribution of baseline covariates between treated and untreated subjects. However, it is not clear whether a 1 to 1 match can be improved by increasing the number of controls (N) matched to each case. In this analysis, we calculated propensity scores and matched cases with controls using the greedy matching algorithm. We then compared 1 to 1 and 1 to N greedy matching under different matching digits, and determined which matching approach performed better on two National Cancer Data Base (NCDB) files.
Methods: We calculated the propensity score using a logistic regression model and performed 1 to 1 through 1 to 5 greedy matching across HPV status, with 5-to-1 digit and 5-to-2 digit matching, on both the larynx and hypopharynx cancer datasets from the NCDB. Overall survival was the clinical outcome, and match rate and standardized difference were used to determine which approach performed better. For the survival outcome, we report Kaplan-Meier survival curves, stratified log-rank tests, and hazard ratios with 95% confidence intervals from the Cox proportional hazards model.
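As an illustration of the matching step, the following is a minimal sketch of greedy 1 to N digit matching, not the code used in the thesis. It assumes that "5-to-1 digit matching" follows the common greedy scheme in which cases are first matched to controls whose propensity scores agree when rounded to 5 decimal digits, and still-unmatched cases are retried at 4, 3, 2, and finally 1 digit ("5-to-2" stops at 2 digits). The function name `greedy_digit_match`, its parameters, the use of rounding, and the rule that a case is matched only when N controls are available at the same digit level are all assumptions made for this example. Propensity scores are taken as given.

```python
from collections import defaultdict


def greedy_digit_match(case_ps, control_ps, n_controls=1,
                       start_digits=5, stop_digits=1):
    """Greedy 1:N matching on propensity scores rounded to fewer and fewer digits.

    case_ps, control_ps: dicts mapping a subject id to its propensity score.
    Returns a dict mapping each matched case id to a list of control ids.
    Hypothetical helper: the rounding rule and the all-or-nothing 1:N rule
    are assumptions, not the thesis's implementation.
    """
    unmatched_cases = dict(case_ps)
    free_controls = dict(control_ps)
    matches = {}
    for digits in range(start_digits, stop_digits - 1, -1):
        # Bucket the controls that are still available by rounded score.
        buckets = defaultdict(list)
        for control_id, ps in sorted(free_controls.items()):
            buckets[round(ps, digits)].append(control_id)
        # Greedy pass: a matched control is never reconsidered.
        for case_id, ps in sorted(unmatched_cases.items()):
            pool = buckets[round(ps, digits)]
            if len(pool) >= n_controls:
                chosen = [pool.pop(0) for _ in range(n_controls)]
                matches[case_id] = chosen
                for control_id in chosen:
                    del free_controls[control_id]
        # Cases matched at this digit level drop out of later passes.
        for case_id in matches:
            unmatched_cases.pop(case_id, None)
    return matches


# Toy scores, invented for illustration only.
cases = {"c1": 0.31242, "c2": 0.52, "c3": 0.9}
controls = {"k1": 0.312421, "k2": 0.54, "k3": 0.1}

five_to_one = greedy_digit_match(cases, controls, stop_digits=1)
five_to_two = greedy_digit_match(cases, controls, stop_digits=2)
print(five_to_one, len(five_to_one) / len(cases))
print(five_to_two, len(five_to_two) / len(cases))
```

In this toy example, stopping at 2 digits leaves more cases unmatched than continuing down to 1 digit, which is the trade-off between match rate and closeness of match that the two digit settings control.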
Results: The number of matched HPV-positive patients is smaller for 5-to-2 digit matching than for 5-to-1 digit matching. The hazard ratio confidence intervals for 5-to-1 digit matching are generally narrower than those for 5-to-2 digit matching. Once N exceeds 2, almost no standardized differences are greater than 0.1, except for 5-to-2 digit matching on larynx cancer stages 1 and 2.
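For reference, the balance criterion above can be sketched as follows. This is a minimal example assuming the widely used definition of the standardized difference (difference in means or proportions divided by the pooled standard deviation of the two groups); the abstract does not state the exact formula used, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def std_diff_continuous(treated, control):
    """Standardized difference for a continuous covariate (assumed definition)."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


def std_diff_binary(treated, control):
    """Standardized difference for a 0/1 covariate (assumed definition)."""
    p_t, p_c = mean(treated), mean(control)
    pooled_sd = sqrt((p_t * (1 - p_t) + p_c * (1 - p_c)) / 2)
    return (p_t - p_c) / pooled_sd


def is_balanced(std_diff, threshold=0.1):
    """Flag a covariate as balanced when |d| does not exceed the 0.1 cutoff."""
    return abs(std_diff) <= threshold


# Toy covariate values, invented for illustration only.
d = std_diff_binary([1, 1, 0, 0], [1, 0, 0, 0])
print(round(d, 4), is_balanced(d))
```

Both functions divide by the pooled standard deviation, so they raise `ZeroDivisionError` when a covariate is constant in both groups; a real balance table would need to handle that case.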
Conclusion: Covariate balance improves as the case-to-control matching ratio increases from 1 to 1 to 1 to 5. Variable balance is better and match rates are higher with 5-to-1 digit matching than with 5-to-2 digit matching. If a dataset has the capacity to allow 1 to 3, 1 to 4, or 1 to 5 matching, we recommend increasing the matching ratio to achieve better balance across baseline characteristics.
Keywords: Observational study, propensity score, greedy matching, survival outcome, balance check
Table of Contents
INTRODUCTION
METHODS
Data Description
1. Data sets
2. Study Population
3. Variable selection
Statistical Analysis
RESULTS
Descriptive Statistics
1. Larynx cancer
2. Hypopharynx cancer
Kaplan-Meier Analysis
Propensity Score Calculation
Overall Survival Analysis
1. Larynx cancer
2. Hypopharynx cancer
Balance Check
1. Larynx Cancer – 5-to-1 digit matching
2. Larynx Cancer – 5-to-2 digit matching
3. Hypopharynx Cancer – 5-to-1 digit matching
4. Hypopharynx Cancer – 5-to-2 digit matching
DISCUSSION
APPENDIX
BIBLIOGRAPHY
About this Master's Thesis
Primary PDF
Title | Date Uploaded |
---|---|
Evaluation of Propensity Score Matching Techniques on Overall Survival using the National Cancer Data Base | 2019-04-07 20:55:47 -0400 |