Food Flows to Assess Community Exposure to Escherichia Coli O157:H7 in Romaine Lettuce Open Access

Burke, Thomas (Summer 2020)

Permanent URL:


In 2018, two nationwide E. coli O157:H7 outbreaks were tied to consumption of romaine lettuce originating from the Yuma, Arizona, and Salinas, California, growing regions (1, 2). The "first in, first out" (FIFO) principle and other industry practices intended to maximize sellable product may create conditions in which some consumers face a higher risk of consuming romaine that has been subject to temperature abuse, other food safety risks, or both. Proximity to food distribution centers may affect product quality and potentially safety, given the transit time to the ultimate retail destination. This ecological study uses Poisson regression to evaluate the hypothesis that proximity to a primary distribution node (DN), i.e., the first destination of product originating from Yuma County, Arizona, influences case counts at the county level. The model uses the Food Flows dataset created by Lin et al. to identify primary distribution nodes; proximity distances were calculated in Python. Average county temperatures in March 2018 were also included to account for their influence on cold chain integrity. Other covariates evaluated were the proportions of the population under age 15 and over age 60, which are ages of special vulnerability to illness from E. coli O157:H7. The final model includes average March temperature, the exposure of interest (DN proximity), and an interaction term capturing effect modification between temperature and DN proximity. We find an association between DN proximity and case counts in the Yuma outbreak, with a risk ratio of 5.86 (95% CI 3.08, 11.2) for a 250-kilometer increase in distance from a DN, holding temperature constant. Overall, greater distance from a DN was associated with increased risk of cases in the Yuma outbreak. Higher temperatures were also associated with increased risk. Because potential confounders could not be included, ecological bias is a primary concern in interpreting the results.
Further exploration using Bayesian methods for mapping supply chain risk may better account for intercounty variation in underlying covariates. Public health entities should consider collecting supply chain characteristics as part of epidemiologic data collection to improve the evaluation of future foodborne illness outbreaks.
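
The abstract notes that proximity distances to primary distribution nodes were calculated in Python but does not describe the method. The following is a minimal sketch of one common approach, assuming county and node locations are available as latitude/longitude pairs in degrees and that great-circle (haversine) distance is an acceptable proxy for proximity. The function names `haversine_km` and `nearest_node_km` are hypothetical and are not taken from the thesis code.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_node_km(county, nodes):
    """Distance from one county location (lat, lon) to the closest node in `nodes`."""
    return min(haversine_km(*county, *node) for node in nodes)
```

As a usage illustration with made-up coordinates (not data from the study), `nearest_node_km((0.0, 0.0), [(0.0, 1.0), (0.0, 2.0)])` returns the distance to the first node, roughly 111 km. The resulting distance could then be rescaled (for example, divided by 250) before entering a regression, so that a coefficient corresponds to a 250-kilometer increase.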

Table of Contents

Introduction and Concept



Public Health Consequences

Description of Supply Chains

Population at Risk

Model Justification


Data Source Description


Food Flows Description and Primary Distribution Node Proximity

Temperature Data

Model Analysis

Proposed Model

Analysis of Exposure

Analysis of Covariates

Analysis of Case Count Data

Model Building

Analysis Conclusion




Appendix 1: SAS Code

Appendix 2: Python Data Cleaning Code

Appendix 3: Data Dictionary

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Subfield / Discipline
  • English
Research Field
Committee Chair / Thesis Advisor
Partnering Agencies
Last modified
