Variability in case and mortality between WHO/MOH and HealthMap-curated news reports during the West Africa Ebola outbreak (March 14 - August 28, 2014) Open Access

Moore, Roxanne (2015)

Background: The public is more likely to obtain information about Ebola from the news than from scientific journals or official reports. How news media portray scientific information, particularly case and mortality values, affects trust in formal health agencies, public donations, and fear mongering. Previous research has demonstrated that HealthMap-curated news articles can be used effectively as a sentinel surveillance system for early detection of infectious diseases (1-6).

Methods: This study uses HealthMap-identified news articles as a representative sample of online news to evaluate how news-derived case and mortality counts vary from World Health Organization (WHO) and Ministry of Health (MOH) official reports, obtained via the Humanitarian Data Exchange (HDX) sub-national time series dataset.
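The comparison described in the Methods can be pictured as pairing each news-derived count with the official count for the same country and date, then taking the difference. The following is a minimal sketch under stated assumptions, not the thesis's actual procedure: the record layout, the function name `pair_counts`, and all numbers are hypothetical and invented for illustration.

```python
from datetime import date

# Hypothetical records: (country, report date, cumulative cases).
# All values are invented for illustration; none come from the thesis.
official = [
    ("Guinea", date(2014, 4, 1), 100),
    ("Guinea", date(2014, 4, 8), 150),
    ("Liberia", date(2014, 4, 8), 20),
]
news = [
    ("Guinea", date(2014, 4, 1), 110),
    ("Guinea", date(2014, 4, 8), 140),
    ("Liberia", date(2014, 4, 8), 20),
    ("Liberia", date(2014, 4, 9), 25),  # no official report on this date
]


def pair_counts(news_rows, official_rows):
    """Match news rows to official rows on (country, date).

    Returns tuples of (country, date, news count, official count, difference).
    News rows without a matching official report are dropped.
    """
    lookup = {(country, day): count for country, day, count in official_rows}
    paired = []
    for country, day, news_count in news_rows:
        key = (country, day)
        if key in lookup:
            official_count = lookup[key]
            paired.append(
                (country, day, news_count, official_count, news_count - official_count)
            )
    return paired


pairs = pair_counts(news, official)
for row in pairs:
    print(row)
```

An exact match on date is the simplest possible pairing rule; an analysis of real reports would likely need a rule for news articles published between official reporting dates.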

Results: HealthMap counts, country, and reoriented date accurately predict 75.9% of the change in estimate for WHO/MOH cases and 90.3% for deaths. When the analysis is limited to news articles that provide a citation for case and mortality counts, prediction increases to 92.2% for cases and 95.9% for deaths, with no statistically significant difference between news-derived and WHO/MOH official estimates for cases under the subset model.
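One way to read the percentages above is as the share of variation in the official counts that a regression on news-derived counts accounts for (a coefficient of determination). That reading is an assumption, and the sketch below is deliberately simpler than the models the abstract describes: it uses a single predictor and omits country and date. The function name `fit_and_r_squared` and the sample values are hypothetical.

```python
def fit_and_r_squared(x, y):
    """Fit y = intercept + slope * x by least squares and return
    (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual and total sums of squares around the fitted line and the mean.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot


# Invented example values: news-derived counts (x) and official counts (y).
news_counts = [10, 20, 30, 40]
official_counts = [12, 19, 33, 38]
slope, intercept, r2 = fit_and_r_squared(news_counts, official_counts)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

Adding country indicators and a date term would turn this into a multiple regression, which is closer to the set of predictors named in the Results.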

Discussion: It is hypothesized that the lower predictive capacity for cases is related to greater variability in case estimates, which stems from multiple case definitions (suspected, probable, and confirmed cases) as well as changes in case reporting for Sierra Leone during the study period. Two strengths of this study are its evaluation of a secondary use of HealthMap and its quantification of distrust in media-reported values. Limitations include non-repetitive news article sources and non-longitudinal analysis, as well as inaccuracies in the official source. As a result, differences between news and official counts may be caused by the official reports rather than by the news.

Future Directions: Five alternative analyses are highlighted: case and death comparisons, rumors, regional-level reporting, longitudinal time series, and incidence analysis. Analysis of variability between news and official counts may not directly aid health responders, but it can be used to evaluate public perception. News-derived predictive models should not be used as a supplement to official estimates. Rather, news articles should link the public to official reports, thereby reducing variability in the public domain.

Table of Contents

Background/Literature Review

A. Introduction

B. Ebola Virus Disease (EVD)

1. EVD transmission

2. Symptoms and sequelae

3. Previous EVD outbreaks

4. How the 2014-2015 West African EVD outbreak differs

5. Case definitions

6. Surveillance

7. EVD Conclusion

C. Big data surveillance

1. Introduction

2. Web 2.0 and health surveillance terminology

3. HealthMap: How big data surveillance is used for infectious diseases

4. Discussion and future research

5. Big data surveillance conclusion

D. Literature Review Conclusion

Methods

A. Hypothesis

B. Study Design

C. Variable Selection

1. WHO/MOH

2. HealthMap

3. WHO/MOH and HealthMap merged dataset

D. Method of Analysis

Results

Discussion

A. Strengths and Weaknesses

Future Directions

References

Tables

Table 1: Count of news-curated case and mortality by country (March 14 - August 28, 2014)

Table 2: News reports excluded by topic due to missing case and mortality counts (March 14 - August 28, 2014)

Table 3: Date of first Ebola news rumor, news report, and official report for Guinea, Liberia, and Sierra Leone (March 14 - August 28, 2014)

Table 4: News-curated variable counts by cases and deaths (March 14 - August 28, 2014)

Table 5: Unadjusted means for news-curated variables by cases and deaths (March 14 - August 28, 2014)

Table 6: Full analysis t-test, ANOVA, and beta estimates for WHO/MOH cases (n=178) and deaths (n=222)

Table 7: Sub-analysis t-test, ANOVA, and beta estimates for WHO/MOH cases (n=145) and deaths (n=156)

Figures and Figure Legends

Figure 1: Map of 2014 EVD origin Meliandou, Guinea (80)

Figure 2: HealthMap: Alerts from past week (July 5, 2015) (81)

Figure 3: HealthMap Access Data Collection Form from March 14, 2014

Figure 4: Variability between WHO/MOH and news reported case and mortality for Guinea (March 14 - August 28, 2014)

Figure 5: Variability between WHO/MOH and news reported case and mortality for Liberia (March 14 - August 28, 2014)

Figure 6: Variability between WHO/MOH and news reported case and mortality for Sierra Leone (March 14 - August 28, 2014)

Figure 7: Variability between WHO/MOH and news reported case and mortality for Guinea, Liberia, and Sierra Leone (March 14 - August 28, 2014)

Figure 8: Variability between WHO/MOH and news reported case and mortality for West Africa total (March 14 - August 28, 2014)

Appendices

A. Data Dictionaries

1. Data Dictionary: HDX Sub-national time series (March 23 - August 28, 2015)

2. Data Dictionary: HealthMap 2014 Ebola Outbreak (March 14 - August 28, 2014)

3. Data Dictionary: WHO/MOH and HealthMap merge

B. Model Selection Process

1. Model 1 & 2: Full analysis for cases and deaths

2. Model 3 & 4: Sub-analysis for cases and deaths

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Subfield / Discipline
  • English