Framework for Spatial Health Analytics Using Digital Exhaust, Crowd-Source Data and Electronic Health Records

Restricted; Files Only

Salari Sharif Abad, Mohsen (Fall 2022)

Permanent URL:


Low accessibility to healthy and nutritious food has been hypothesized to increase health disparities. Low-accessibility areas, called food deserts, are particularly common in lower-income neighborhoods. However, indices for modeling food desert intensity are subjectively defined, and there is little agreement in the literature on their validity or relative strength. Moreover, such indices are largely based on census data, which limits their update frequency and geographical resolution to those of the census.

In this dissertation, we first propose an assessment framework for objectively defining and comparing the utility of food desert indices using machine learning models. We introduce the concept of a food desert index utility score, which lets us compare the strength of indices for describing the food environment. We then focus on the Metro Atlanta area in Georgia, USA, as a case study to explore the effect of the spatial resolution of models and the impact of adding or ignoring neighborhood-level income or vehicle access on the utility of food desert indices.

We then use demographic, geographic, and health data, as well as real-time data from platforms such as Yelp and Google Maps and crowd-sourced data collected through Amazon Mechanical Turk, to build a food desert index that has higher spatial and temporal resolution than the standard food desert measures currently used in the US Food Research Atlas Data Base (FRADB). The new food desert index can be used, for example, to measure a person's exposure to varying food environments during a commute, to analyze the effect of food environments on health outcomes or nutrition behaviors, or to suggest behavioral changes to reduce exposures. We use this temporally and spatially high-resolution, context-aware index (factoring in pseudo-real-time traffic density) in a concept application that suggests alternative routes with similar ETAs between a source and destination in the Atlanta metropolitan area to expose a traveler to better food environments. The resulting model was sensitive to changes in the environment that occurred after the census data were collected. In addition to informing community planners and policymakers more accurately than the traditional food desert indices in the FRADB, our novel food desert index allows us to measure environmental effects on individuals and suggest personal behavioral changes. For example, our index was sensitive to a new healthy food outlet that opened after the FRADB data were collected, and it suggested routes in the proximity of this newly opened outlet. In particular, an evaluation of 248,000 routes from random locations to 28,000 food retailers demonstrates that the Atlanta food environment creates a strong bias towards eating out rather than preparing a meal at home when access to vehicles is limited.

Finally, as a case study, we analyze the hospitalization data of more than 64,000 COVID-19 patients in the Metro Atlanta area in the first 19 months of the pandemic and observe that living in a food desert was associated with a higher per capita number of hospitalizations and a higher number of deaths per hospitalization in the first months of the pandemic.

Table of Contents

1 Introduction
1.1 Aim of this thesis
1.2 Thesis outline
1.3 List of publications

2 How to Identify Food Deserts: An Analytical Framework for Comparing Utility of Metrics and Indices; Case Study of Key Factors, Concurrences, and Divergence
2.1 Abstract
2.2 Introduction
2.3 Methods
2.3.1 Data
2.3.2 Food Desert Indices
2.3.3 Assessment Framework
2.4 Results
2.5 Discussion
2.5.1 Effect of Distance Granularity
2.5.2 Effect of Distinguishing Between Rural and Urban Tracts
2.5.3 Effect of Adding Income
2.5.4 Generalizability
2.6 Limitations

3 Combining Crowd-Sourcing, Census Data, and Public Review Forums for Real-Time, High-Resolution Food Desert Estimation
3.1 Abstract
3.2 Introduction
3.3 Data
3.4 Methods
3.4.1 Representative points
3.4.2 Extracting retailer information
3.4.3 Estimating actual distance
3.4.4 Merging similar retailers by fuzzy matching
3.4.5 Crowd-sourcing retailer health scores
3.4.6 Feature engineering
3.4.7 Data preprocessing and normalization
3.4.8 Kriging Census-Level Features
3.4.9 Training the Model
3.5 Results
3.6 Conclusion

4 An Open-Source Privacy-Preserving Large-Scale Mobile Framework for Cardiovascular Health Monitoring and Intervention Planning With an Urban African American Population of Young Adults: User-Centered Design Approach
4.1 Abstract
4.2 Introduction
4.2.1 Background
4.2.2 Objectives
4.3 Methods
4.3.1 Overview
4.3.2 Community-Based Participatory Research and User-Centered Design
4.3.3 Co-design Sessions
4.3.4 Participants and Mentors
4.3.5 A Cloud-Enabled Health Insurance Portability and Accountability Act–Compliant mHealth Sensing Infrastructure
4.3.6 Pilot Testing
4.3.7 Analysis
4.4 Results
4.4.1 Overview
4.4.2 Individual Factors
4.4.3 Interpersonal Factors
4.4.4 Expert-Informed Factors
4.4.5 Technological Factors
4.4.6 Final Design
4.4.7 Pilot Test
4.5 Discussion
4.5.1 Principal Findings
4.6 Conclusions

5 Socio-economic factors of COVID-19 early spread and mortality in Atlanta Metropolitan Area, a case study of food desert dynamics and pandemic spread
5.1 Abstract
5.2 Introduction
5.3 Methods
5.3.1 COVID-19 Hospitalizations and Deaths
5.3.2 Defining Food Desert Neighborhoods
5.3.3 Re-hospitalization rate
5.3.4 Duration of Care
5.3.5 Aligning neighborhood level infections
5.3.6 Hypothesis tests
5.4 Results
5.5 Conclusions

6 Conclusion
6.1 Summary and contributions
6.2 Limitations
6.3 Future work

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
  • English