The development and application of advanced PM2.5 exposure models driven by satellite data
Open Access
Xiao, Qingyang (Spring 2018)
Satellite aerosol optical depth (AOD) is increasingly used to predict ground-level PM2.5 concentrations and assess PM2.5 exposures. However, non-randomly missing AOD due to cloud and snow cover and the complex non-linear relationship between AOD and PM2.5 concentration make this task highly challenging. Previous studies used ground PM2.5 measurements to fill missing data and included predictors constructed from ground measurements to improve model performance; however, these strategies cannot be applied in developing regions where historical air quality measurements are unavailable. In this study, we developed an original gap-filling method that provides high-resolution, complete-coverage PM2.5 predictions (Aim 1). We then assessed maternal PM2.5 exposure with these satellite-based predictions to estimate its associations with adverse birth outcomes in Shanghai, China (Aim 2). In Aim 3, we developed an ensemble machine learning model to hindcast historical PM2.5 levels in China, where routine air quality monitoring began only recently.
For Aim 1, we applied a multiple imputation (MI) method that combined emerging high-resolution satellite retrievals with chemical transport model (CTM) AOD simulations and cloud fraction retrievals to fill missing AOD. We then fitted a two-stage statistical model driven by gap-filled AOD, meteorology, and land use information to estimate daily PM2.5 concentrations in the Yangtze River Delta at 1-km resolution. For Aim 2, we obtained birth registration records of 132,783 singleton live births during 2011-2014 in Shanghai and assessed maternal exposures with the satellite predictions from Aim 1. Linear and logistic regressions were used to estimate associations with term birth weight and term low birth weight, respectively. Logistic and discrete-time survival models were used to estimate associations with preterm birth. For Aim 3, we designed a clustering method to control for unobserved spatial heterogeneity in PM2.5 prediction models. Regional models for each cluster were trained with various machine learning algorithms, including random forest, generalized additive model, and extreme gradient boosting. We then fitted a generalized additive model that fused the predictions from these algorithms to improve hindcast accuracy and robustness.
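The fusion step in Aim 3 can be illustrated with a much simpler stand-in. The sketch below is not the dissertation's method: it replaces the generalized additive model with a single least-squares blending weight between two base models, and all function names and PM2.5 values are hypothetical, chosen only to show why fusing predictions cannot do worse on the fitting data than either model alone.

```python
# Minimal sketch of prediction fusion, assuming two base models and a
# linear blend (a simplification; the dissertation fuses with a GAM).

def blend_weight(obs, pred_a, pred_b):
    """Least-squares weight w for the blend w*pred_a + (1 - w)*pred_b."""
    num = sum((y - b) * (a - b) for y, a, b in zip(obs, pred_a, pred_b))
    den = sum((a - b) ** 2 for a, b in zip(pred_a, pred_b))
    if den == 0:
        # The two models agree everywhere, so every weight gives the same blend.
        return 0.5
    return num / den

def blend(pred_a, pred_b, w):
    """Apply the blending weight to two prediction series."""
    return [w * a + (1 - w) * b for a, b in zip(pred_a, pred_b)]

def rmse(obs, pred):
    """Root mean squared error between observations and predictions."""
    return (sum((y - p) ** 2 for y, p in zip(obs, pred)) / len(obs)) ** 0.5

# Hypothetical daily PM2.5 values (ug/m3) at one monitor, for illustration only.
obs = [35.0, 60.0, 80.0, 120.0, 45.0]
pred_a = [30.0, 66.0, 75.0, 130.0, 40.0]   # e.g. output of one base model
pred_b = [42.0, 55.0, 88.0, 112.0, 50.0]   # e.g. output of another base model

w = blend_weight(obs, pred_a, pred_b)
blended = blend(pred_a, pred_b, w)
print(f"weight={w:.3f}  rmse_a={rmse(obs, pred_a):.2f}  "
      f"rmse_b={rmse(obs, pred_b):.2f}  rmse_blend={rmse(obs, blended):.2f}")
```

Because the weights w = 1 and w = 0 reproduce the two base models, the fitted blend is at least as accurate as the better of them on the data used to fit the weight. Performance on held-out years, which is what matters for hindcasting, has to be checked separately.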
In Aim 1, our gap-filling method did not rely on ground PM2.5 measurements and outperformed previous gap-filling methods, providing complete coverage and high accuracy. In Aim 2, we observed decreased term birth weight, increased risk of preterm birth, and increased risk of term low birth weight in association with maternal PM2.5 exposure. We also found that satellite-based exposure assessments that did not account for missing data led to attenuation of the estimated health effects. In Aim 3, our ensemble model provided more accurate PM2.5 hindcasts at the daily and monthly levels than previous models. Cluster-based models outperformed the corresponding national models.
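The effect of non-randomly missing satellite data on an exposure estimate can be seen in a toy calculation. The sketch below is an illustration under stated assumptions, not the dissertation's exposure assessment: the dates, concentrations, the `window_mean` helper, and the choice that the most polluted days are the missing ones are all hypothetical.

```python
from datetime import date, timedelta

def window_mean(daily, start, end):
    """Mean of the available daily values in [start, end]; None marks a missing day."""
    values = []
    day = start
    while day <= end:
        value = daily.get(day)
        if value is not None:
            values.append(value)
        day += timedelta(days=1)
    return sum(values) / len(values) if values else None

# Hypothetical 9-day exposure window: every third day is heavily polluted.
start = date(2013, 1, 1)
end = start + timedelta(days=8)
truth = {start + timedelta(days=i): (90.0 if i % 3 == 0 else 45.0)
         for i in range(9)}

# Assumption for illustration: retrievals are missing on the polluted days.
observed = {d: (None if v > 80.0 else v) for d, v in truth.items()}

complete = window_mean(truth, start, end)     # 60.0
gappy = window_mean(observed, start, end)     # 45.0
print(complete, gappy)
```

Averaging only the days with retrievals gives 45.0 instead of 60.0 here, so the exposure is misestimated whenever missingness is related to the pollution level. A gap-filled series avoids dropping those days.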
We presented a gap-filling method that corrected the exposure bias due to missing satellite data and a machine learning-based ensemble model that provided reliable historical PM2.5 predictions. Our methods can support epidemiological studies on the chronic and acute health effects of PM2.5 in highly polluted regions with limited ground PM2.5 monitoring.
Table of Contents
INTRODUCTION
DISSERTATION AIMS
REFERENCES
Chapter 1
    ABSTRACT
    KEYWORDS
    INTRODUCTION
    METHODS
    RESULTS AND DISCUSSION
    CONCLUSIONS
    ACKNOWLEDGMENTS
    REFERENCES
    SUPPLEMENTARY MATERIALS
Chapter 2
    ABSTRACT
    INTRODUCTION
    METHODS
    RESULTS
    DISCUSSION
    CONCLUSIONS
    ACKNOWLEDGMENTS
    REFERENCES
    APPENDIX A
    APPENDIX B
Chapter 3
    ABSTRACT
    INTRODUCTION
    METHODS
    RESULTS
    DISCUSSION
    CONCLUSIONS
    ACKNOWLEDGMENTS
    REFERENCES
    SUPPLEMENTARY MATERIALS
CONCLUSIONS
About this Dissertation
|Committee Chair / Thesis Advisor|
|The development and application of advanced PM2.5 exposure models driven by satellite data|2018-04-02 13:16:06 -0400|