Bayesian Spatial-temporal Models for Areal Count Data Open Access
Ling, Qiang (2014)
Abstract
Analyses of spatial-temporally correlated areal data arise
frequently in public health. In this dissertation, a series of
hierarchical models are developed for spatial-temporal count data
in the Bayesian framework, with a focus on addressing potential
overdispersion and inflated zero counts in the data.
The ability to predict areal count data and quantify the associated
prediction uncertainty is valuable for describing population
health. We first consider recent developments in Bayesian
hierarchical modeling approaches with flexible spatio-temporal
interactions, and examine their use in projecting future annual
county-level cancer incidence rates, with an application to the
Colorado cancer incidence data reported to the National Program of
Cancer Registries at the US Centers for Disease Control and
Prevention for 1998 to 2007. By examining the 2-year-ahead
predictive performance of models with different random effect
specifications, our results demonstrate the advantages of
considering temporal trends in spatial associations when modeling
cancer incidence rates.
Overdispersion due to zero-inflation is a common challenge in
analyzing count data. To address this issue, we first develop
spatial-temporal zero-inflated models, which have two parts: a
Poisson count model and a logistic model for predicting excess
zeros. We further consider a class of two-part hurdle models. The
hurdle models also consist of two components: a binary component
modeling the probability of any occurrence and a truncated count
component modeling the counts given occurrence. The two components
in zero-inflated and hurdle models address, respectively, the
abundance of zeros and the skewness of the nonzero counts. Several
distributions for the non-zero component are considered, including
Poisson, negative binomial, and generalized Poisson. We also
evaluate the spatial-temporal dependence between the two model
components via multivariate conditionally autoregressive priors,
which provide spatial and temporal smoothing.
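To make the difference between the two model classes concrete, the sketch below writes out the probability mass functions of a zero-inflated Poisson and a hurdle Poisson for a single observation. It is only an illustration under simplifying assumptions, not the dissertation's implementation: the function names and the parameters `lam`, `pi_zero`, and `p_occur` are hypothetical, the probabilities are fixed numbers rather than outputs of a logistic regression, and no spatial or temporal random effects are included.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability mass at a non-negative integer count y."""
    return math.exp(-lam) * lam ** y / math.factorial(y)


def zip_pmf(y, lam, pi_zero):
    """Zero-inflated Poisson (illustrative).

    With probability pi_zero the observation is an excess zero;
    otherwise it is an ordinary Poisson(lam) draw, which can also be zero.
    """
    base = (1.0 - pi_zero) * poisson_pmf(y, lam)
    return pi_zero + base if y == 0 else base


def hurdle_poisson_pmf(y, lam, p_occur):
    """Hurdle Poisson (illustrative).

    A binary part gives the probability p_occur of any occurrence;
    positive counts follow a zero-truncated Poisson(lam).
    """
    if y == 0:
        return 1.0 - p_occur
    return p_occur * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


# Sanity check with made-up parameter values: each pmf should sum to about 1.
zip_total = sum(zip_pmf(y, 2.5, 0.3) for y in range(60))
hurdle_total = sum(hurdle_poisson_pmf(y, 2.5, 0.6) for y in range(60))
print(zip_total, hurdle_total)
```

In the zero-inflated form, zeros come from two sources (the excess-zero part and the count part), whereas in the hurdle form every zero comes from the binary part alone. Replacing `poisson_pmf` with a negative binomial or generalized Poisson mass function would give the other base distributions mentioned above.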
The zero-inflated and hurdle models are applied to (1) Iowa cancer
data reported to the Surveillance, Epidemiology, and End Results
(SEER) program of the National Cancer Institute during 1998-2007,
and (2) emergency department visits data in the Duke University
Health System. Results demonstrate that the two-component models
using negative binomial and generalized Poisson as the base
distribution outperform the standard Poisson models.
Table of Contents
1 Introduction
1.1 Overview
1.2 Proximity Matrix W
1.3 Spatial and Temporal Association
1.4 Conditional Autoregressive Prior Distributions
1.5 Bayesian Model Comparison
1.6 Overall Goals and Organization
2 STUDY 1: BAYESIAN SPATIAL-TEMPORAL DISEASE MAPPING AND PROJECTION FOR COLORADO LUNG AND BRONCHUS CANCER DATA
2.1 US Cancer Surveillance
2.2 Statistical Challenges and Objectives
2.3 Overview
2.4 Methods
2.4.1 Bayesian spatio-temporal models for areal data
2.4.2 Modeling space-time random effects
2.4.3 Accounting for geographic unit boundary changes
2.4.4 Estimation and computation details
2.5 Results
2.6 Discussion
3 STUDY 2: BAYESIAN SPATIAL ZERO-INFLATED MIXTURE MODELS ACCOUNTING FOR ZERO-INFLATION AND OVERDISPERSION IN AREAL COUNT DATA
3.1 Introduction
3.2 Methods
3.2.1 Iowa Central Cancer Registry Data, 1998-2007
3.2.2 Statistical Models
3.3 Results
3.4 Discussion
4 STUDY 3: Bayesian Dynamic Spatial-temporal Hurdle Models for Zero-inflated Count Data
4.1 Introduction
4.2 Methods
4.2.1 The DSR Data, 2007-2011
4.2.2 The Hurdle Model
4.2.3 Spatial-temporal Hurdle Model
4.2.4 Choice of Base Distribution
4.2.5 Computation Details
4.3 Results
4.4 Discussion
5 Summary
5.1 Conclusions
5.2 Future Work
Appendices
A Metropolis-Hastings algorithm for estimating the dispersion parameter in the generalized Poisson model
Bibliography
Primary PDF
Title | Date Uploaded |
---|---|
Bayesian Spatial-temporal Models for Areal Count Data | 2018-08-28 14:40:13 -0400 |
Supplemental Files
Title | Date Uploaded |
---|---|
mybib.bib | 2018-08-28 14:51:25 -0400 |
Appendix.tex | 2018-08-28 14:52:56 -0400 |