Statistical Methods for Spatial Data in Public Health
Kianian, Behzad (Fall 2020)
Abstract
Data in public health often contain a spatial component relevant to understanding underlying relationships of interest. Accounting for different manifestations of spatial components in statistical analyses is frequently challenged by a dearth of developed methodology or high computational costs. First, we consider the problem of estimating treatment effects from observational data with propensity score matching, allowing for the presence of spatial and multi-level confounding. We build on recently developed distance-adjusted propensity score matching (DAPSm) and propose a two-stage approach that first matches within clusters (WC) and then uses the DAPSm approach to match remaining subjects (WC+DAPSm). We demonstrate the benefits and robustness of our approach through an extensive simulation study. We apply our method to a population of patients in Georgia who have recently started dialysis, where both the treatment (informed of transplant options) and outcome (1-year referral for transplant) may be plausibly affected by individual, facility, and area-level factors.
Next, we consider the task of using satellite-derived aerosol optical depth (AOD) as a predictor for particulate matter (PM2.5) concentrations, allowing broader coverage than the network of air pollution monitors. However, AOD contains large contiguous areas of missing data due to cloud cover. We propose imputing missing AOD data using lattice kriging, a large-scale spatial statistical method, and random forest, a regression tree-based machine learning method, as well as a distance-based ensemble for combining the two methods. Throughout our application, we construct cross-validation folds and testing data based on spatially clustered holdouts more closely mimicking observed data patterns than traditional random holdouts. Our results show that the proposed distance-based ensemble outperforms individual methods.
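To illustrate the idea of spatially clustered holdouts, the following minimal Python sketch assigns points to cross-validation folds by spatial block, so that every held-out region is a contiguous patch rather than a scatter of random points. It is not the dissertation's procedure: the square-grid blocking, the function name `spatial_block_folds`, and the parameters `block_size` and `n_folds` are assumptions made for illustration.

```python
import random
from collections import defaultdict


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign each (x, y) point to a fold so that all points falling in the
    same square block of side `block_size` share a fold.

    Hypothetical helper: square grid blocks stand in for whatever spatial
    clustering a real analysis would use.
    """
    # Group point indices by the grid block that contains them.
    blocks = defaultdict(list)
    for idx, (x, y) in enumerate(coords):
        key = (int(x // block_size), int(y // block_size))
        blocks[key].append(idx)

    # Shuffle the blocks (not the points), then deal them out to folds.
    keys = sorted(blocks)
    random.Random(seed).shuffle(keys)

    folds = [0] * len(coords)
    for i, key in enumerate(keys):
        for idx in blocks[key]:
            folds[idx] = i % n_folds
    return folds


# Toy example: an 8 x 8 grid of cell centres, held out in 2 x 2 blocks.
coords = [(x + 0.5, y + 0.5) for x in range(8) for y in range(8)]
folds = spatial_block_folds(coords, block_size=2.0, n_folds=4)
```

Because whole blocks are held out together, a model evaluated on a fold must predict across a contiguous gap, which is closer to the cloud-cover missingness described above than a random holdout would be.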
For the third topic, we discuss ongoing work assessing the equity of COVID-19 testing site access in the Atlanta area. We adapt methods from the environmental justice literature using empirical cumulative distribution functions to compare demographic subgroup access to testing sites. We consider different measures of access, and we conduct Monte Carlo simulations of test site placements under different sampling schemes to assess factors associated with site placement.
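As a rough illustration of comparing subgroup access with empirical cumulative distribution functions, the sketch below builds a population-weighted ECDF of distance to the nearest testing site for two subgroups. The data, the names `weighted_ecdf`, `distance_km`, `pop_group_a`, and `pop_group_b`, and the choice of distance as the access measure are all hypothetical; the dissertation considers several access measures, and this is only one simple instance.

```python
from bisect import bisect_right


def weighted_ecdf(values, weights):
    """Return a function F where F(t) is the share of total weight
    attached to values less than or equal to t."""
    pairs = sorted(zip(values, weights))
    xs = [v for v, _ in pairs]
    total = sum(w for _, w in pairs)

    # Cumulative share of weight at each sorted value.
    cum = []
    running = 0.0
    for _, w in pairs:
        running += w
        cum.append(running / total)

    def F(t):
        k = bisect_right(xs, t)  # number of values <= t
        return cum[k - 1] if k else 0.0

    return F


# Hypothetical areal units: distance to the nearest site (km) and the
# number of residents from each subgroup living in that unit.
distance_km = [1.0, 2.0, 4.0, 8.0]
pop_group_a = [400, 300, 200, 100]
pop_group_b = [100, 200, 300, 400]

F_a = weighted_ecdf(distance_km, pop_group_a)
F_b = weighted_ecdf(distance_km, pop_group_b)

# Share of each subgroup living within 2 km of a site.
share_a = F_a(2.0)
share_b = F_b(2.0)
```

Evaluating both curves over a range of distances shows whether one subgroup's ECDF sits consistently above the other's, meaning a larger share of that subgroup lives within any given distance of a site.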
Table of Contents
Propensity score matching for multi-level and spatial data
Imputing satellite-derived aerosol optical depth using a multi-resolution spatial model and random forest for PM2.5 prediction
A framework for assessing COVID-19 testing site spatial access
Appendix A. Supplemental Materials to “Propensity score matching for multi-level and spatial data”
Appendix B. Supplemental Materials to “Imputing satellite-derived aerosol optical depth using a multi-resolution spatial model and random forest for PM2.5 prediction”
Appendix C. Supplemental Materials to “A framework for assessing COVID-19 testing site spatial access”
Bibliography
Primary PDF
Title | Date Uploaded |
---|---|
Statistical Methods for Spatial Data in Public Health | 2020-11-19 13:03:23 -0500 |