Bayesian Spatial-temporal Models for Areal Count Data Open Access

Ling, Qiang (2014)

Permanent URL: https://etd.library.emory.edu/concern/etds/hh63sw68k?locale=en
Published

Abstract

Analyses of spatial-temporally correlated areal data arise frequently in public health. In this dissertation, we develop a series of hierarchical models for spatial-temporal count data in the Bayesian framework, with a focus on addressing potential overdispersion and inflated zero counts in the data.

The ability to predict areal count data and quantify the associated prediction uncertainty is valuable for describing population health. We first consider recent developments in Bayesian hierarchical modeling approaches with flexible spatio-temporal interactions and examine their use in projecting future annual county-level cancer incidence rates. The application uses Colorado cancer incidence data reported to the National Program of Cancer Registries at the US Centers for Disease Control and Prevention for 1998 to 2007. By examining the 2-year-ahead predictive performance of models with different random effect specifications, our results demonstrate the advantages of considering temporal trends in spatial associations when modeling cancer incidence rates.

Overdispersion due to zero-inflation is a common challenge in analyzing count data. To address this issue, we first develop spatial-temporal zero-inflated models, which have two parts: a Poisson count model and a logistic model for predicting excess zeros. We further consider a class of two-part hurdle models. The hurdle models also consist of two components: a binary component modeling the probability of any occurrence and a truncated count component modeling the counts given occurrence. The two components in zero-inflated and hurdle models address, respectively, the abundance of zeros and the skewness of the nonzero counts. Several distributions for the nonzero component are considered, including Poisson, negative binomial, and generalized Poisson. We also evaluate the spatial-temporal dependence between the two model components via multivariate conditionally autoregressive priors, which provide spatial and temporal smoothing.
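To make the distinction between the two model families concrete, the following is a minimal sketch of the two probability mass functions with a Poisson base distribution. It is an illustration only, not the dissertation's implementation: the function names and the parameter names pi (excess-zero probability), p (probability of any occurrence), and lam (Poisson mean) are assumptions introduced here, and the spatial-temporal random effects and regression structure are omitted.

```python
import math


def poisson_pmf(y, lam):
    """Poisson probability of count y with mean lam > 0, computed on the log scale."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))


def zero_inflated_poisson_pmf(y, pi, lam):
    """Zero-inflated Poisson (illustrative parameterization).

    With probability pi the observation is an excess zero; otherwise it is
    drawn from Poisson(lam), which can itself produce zeros.
    """
    base = (1.0 - pi) * poisson_pmf(y, lam)
    return pi + base if y == 0 else base


def hurdle_poisson_pmf(y, p, lam):
    """Hurdle Poisson (illustrative parameterization).

    p is the probability of any occurrence (y > 0). Positive counts follow a
    Poisson(lam) truncated at zero, so all zeros come from the binary part.
    """
    if y == 0:
        return 1.0 - p
    return p * poisson_pmf(y, lam) / (1.0 - math.exp(-lam))


# Both mass functions should sum to (approximately) one over the counts.
zip_total = sum(zero_inflated_poisson_pmf(y, 0.3, 2.0) for y in range(100))
hurdle_total = sum(hurdle_poisson_pmf(y, 0.6, 2.0) for y in range(100))
print(zip_total, hurdle_total)
```

In the zero-inflated form, zeros arise from two sources (the excess-zero part and the count part), whereas in the hurdle form the binary component alone determines the zero probability. In a regression setting, pi or p would typically be linked to covariates and random effects through a logit link; that step is left out of this sketch.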

The zero-inflated and hurdle models are applied to (1) Iowa cancer data reported to the Surveillance, Epidemiology, and End Results (SEER) program of the National Cancer Institute during 1998-2007, and (2) emergency department visit data from the Duke University Health System. Results demonstrate that the two-component models using negative binomial and generalized Poisson as the base distribution outperform the standard Poisson models.

Table of Contents

1 Introduction

1.1 Overview
1.2 Proximity Matrix W
1.3 Spatial and Temporal Association
1.4 Conditional Autoregressive Prior Distributions
1.5 Bayesian Model Comparison
1.6 Overall Goals and Organization


2 STUDY 1: BAYESIAN SPATIAL-TEMPORAL DISEASE MAPPING AND PROJECTION FOR COLORADO LUNG AND BRONCHUS CANCER DATA
2.1 US Cancer Surveillance
2.2 Statistical Challenges and Objectives
2.3 Overview
2.4 Methods
2.4.1 Bayesian spatio-temporal models for areal data
2.4.2 Modeling space-time random effects
2.4.3 Accounting for geographic unit boundary changes
2.4.4 Estimation and computation details
2.5 Results
2.6 Discussion


3 STUDY 2: BAYESIAN SPATIAL ZERO-INFLATED MIXTURE MODELS ACCOUNTING FOR ZERO-INFLATION AND OVERDISPERSION IN AREAL COUNT DATA
3.1 Introduction
3.2 Methods
3.2.1 Iowa Central Cancer Registry Data, 1998-2007
3.2.2 Statistical Models
3.3 Results
3.4 Discussion


4 STUDY 3: Bayesian Dynamic Spatial-temporal Hurdle Models for Zero-inflated Count Data
4.1 Introduction
4.2 Methods
4.2.1 The DSR Data, 2007-2011
4.2.2 The Hurdle Model
4.2.3 Spatial-temporal Hurdle Model
4.2.4 Choice of Base Distribution
4.2.5 Computation Details
4.3 Results
4.4 Discussion


5 Summary
5.1 Conclusions
5.2 Future Work

Appendices
A Metropolis-Hastings algorithm for estimating the dispersion parameter in the generalized Poisson model


Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
