Statistical Methods for Mediation Analysis of Omics Data Open Access

Lane, Andrea (Summer 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/bc386k63b?locale=en
Published

Abstract

Epigenome-wide association studies (EWAS) have identified associations between epigenetic modifications (e.g., DNA methylation) and both exposures (e.g., smoking status) and certain health outcomes (e.g., lung function). These associations naturally motivate studying DNA methylation as a potential mediator between an exposure and an outcome. To date, EWAS mediation studies have adopted the canonical mediation analysis method and ignored an important aspect of the data: sample complexity. The samples being studied (such as blood) are composed of a mixture of cell types. Distinct cell types are known to present distinct methylation profiles and to play unique mediating roles in disease pathogenesis.

In this dissertation, we develop novel statistical methods to study cell type-specific mediating effects from population-level EWAS data. In the first project, we present a statistical method called TOols for the Analysis of heterogeneouS Tissues – Mediation with a Continuous outcome (TOAST-MC) to detect cell type-specific mediation effects with a continuous outcome. Our method extends traditional mediation models by treating the unobserved cell type-specific methylation as missing data. We then derive an EM algorithm for parameter estimation and perform a bootstrap test of the indirect effect.
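For readers unfamiliar with a bootstrap test of an indirect effect, the following is a minimal sketch of the idea in the classical single-mediator setting. It is not TOAST-MC: it assumes the mediator is directly observed, uses ordinary least squares in place of the EM algorithm, and uses a percentile bootstrap interval. The function names `indirect_effect` and `bootstrap_ci`, the simulated data, and the chosen coefficients are all hypothetical and serve only as illustration.

```python
import numpy as np

def indirect_effect(x, m, y):
    """Product-of-coefficients estimate a*b for a single observed mediator."""
    # Mediator model (assumed linear): m = a0 + a*x + error
    design_m = np.column_stack([np.ones_like(x), x])
    a = np.linalg.lstsq(design_m, m, rcond=None)[0][1]
    # Outcome model (assumed linear): y = b0 + c*x + b*m + error
    design_y = np.column_stack([np.ones_like(x), x, m])
    b = np.linalg.lstsq(design_y, y, rcond=None)[0][2]
    return a * b

def bootstrap_ci(x, m, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(x)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        # Resample subjects with replacement and re-estimate a*b
        idx = rng.integers(0, n, size=n)
        draws[i] = indirect_effect(x[idx], m[idx], y[idx])
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(draws, [tail, 1.0 - tail])
    return lower, upper

# Simulated illustration only (not data from the dissertation):
# binary exposure, continuous mediator and outcome, true a*b = 0.5 * 0.8 = 0.4
rng = np.random.default_rng(1)
n = 2000
x = rng.binomial(1, 0.5, size=n).astype(float)
m = 0.5 * x + rng.normal(size=n)
y = 0.8 * m + 0.2 * x + rng.normal(size=n)

estimate = indirect_effect(x, m, y)
lower, upper = bootstrap_ci(x, m, y)
print(f"indirect effect: {estimate:.3f}, 95% bootstrap CI: ({lower:.3f}, {upper:.3f})")
```

In this simplified setting, an interval that excludes zero is taken as evidence of mediation. The method described above addresses the harder case in which cell type-specific methylation is not observed directly.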

In the second project, we develop a procedure called TOAST-MB that can handle both continuous and binary outcomes. The method uses a Bayesian model framework to obtain a marginal posterior distribution of the indirect effect for each cell type. Posterior samples are obtained via Hamiltonian Monte Carlo.

In the third project, we conduct a series of simulation studies to compare the performance of three methods for high-dimensional mediation analysis: HIgh dimensional Mediation Analysis (HIMA), Divide-Aggregate Composite-null Test (DACT), and BAyesian Mediation Analysis (BAMA). We then apply the three methods to a dataset from the Grady Trauma Project, in which we analyze the role of DNA methylation as a mediator between smoking and weight.

The statistical methods and tools developed in this dissertation help to better analyze EWAS data and can potentially aid in the discovery of novel diagnostic biomarkers and therapeutic targets. 

Table of Contents

1 Introduction
  1.1 The omics revolution
  1.2 Omics mediation
  1.3 Outline

2 Detecting cell type-specific mediation effects from bulk omics data with continuous outcomes
  2.1 Introduction
  2.2 Methods
    2.2.1 Notation and Models
    2.2.2 Simulation Study
  2.3 Results
    2.3.1 Simulation Study
    2.3.2 Real Data Analysis
  2.4 Discussion

3 Detecting cell type-specific mediation effects from bulk omics data with binary outcomes
  3.1 Introduction
  3.2 Methods
    3.2.1 Notation and Models
    3.2.2 Simulation Study
  3.3 Results
    3.3.1 Simulation Study
    3.3.2 Real Data Analysis
  3.4 Discussion

4 A comparison of high dimensional mediation methods
  4.1 Introduction
    4.1.1 General overview of mediation with multiple mediators
    4.1.2 General overview of proposed high-dimensional mediation methods
  4.2 Methods
    4.2.1 Selected high-dimensional mediation methods
    4.2.2 Simulation
  4.3 Results
    4.3.1 Simulation
    4.3.2 Real Data Analysis
  4.4 Discussion

5 Discussion

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
