Cell Type–specific Gene Expression and DNA Methylation Differences in Complex Tissues

Open Access

Geng, Siyi (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/3r074v993?locale=en
Published

Abstract

Most tissues, such as blood and tumor, are complex, heterogeneous samples containing different cell types. Thus, the genomic or epigenomic profiles of tissue samples obtained from high-throughput technologies are mixed signals. The heterogeneity in such data complicates analysis and introduces bias unless it is properly adjusted for.

We extend an existing method, TOAST (TOols for the Analysis of heterogeneouS Tissues), to model data from mixed, heterogeneous samples and detect cell-specific differential signals. We design a series of simulation studies on cell-specific differential expression (csDE) detection to evaluate TOAST's performance. Furthermore, we analyze DNA methylation (DNAm) data from two existing human blood datasets. We use a reference-based method, EpiDISH, to estimate cell proportions and apply TOAST to detect age-related cell-specific differentially methylated CpG sites (csDMCs).

Simulation studies and real-data analysis show that the upgraded TOAST performs well at csDE/csDM detection. In the simulation study, a larger sample size improves detection accuracy, while a higher noise level reduces it. In the real-data study, we find that age is related to the cell proportions of mixed samples. Through csDM analysis using TOAST, we identify a variety of age-related DMCs in each cell type, and the number of csDMCs differs among cell types. These results show that the upgraded TOAST provides a flexible statistical method for analyzing cell-specific differential gene expression and DNA methylation.
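The idea summarized in the abstract can be illustrated with a small simulation. The sketch below is not the TOAST implementation; it is a minimal illustration, assuming that the bulk signal is a proportion-weighted sum of cell-type-specific signals and that a cell-specific effect can be estimated from proportion-by-covariate interaction terms in an ordinary least-squares fit. The function names, the three cell types, the baseline values, and the effect size are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical baseline signal of one feature in each of three cell types.
BASELINE = np.array([5.0, 3.0, 4.0])


def simulate_mixture(n_samples, noise_sd, rng, effect=2.0):
    """Simulate one feature measured in bulk (mixed) tissue.

    Cell type 0 carries a true group difference of size `effect`;
    the other cell types carry none.
    """
    k = BASELINE.size
    # Cell-type proportions per sample; each row sums to 1.
    props = rng.dirichlet(np.ones(k), size=n_samples)
    # Binary covariate, e.g. case (1) versus control (0).
    group = rng.integers(0, 2, size=n_samples)
    delta = np.zeros(k)
    delta[0] = effect
    # Bulk signal = proportion-weighted sum of cell-type-specific signals.
    signal = props @ BASELINE + group * (props @ delta)
    y = signal + rng.normal(0.0, noise_sd, size=n_samples)
    return props, group, y


def fit_cell_specific_effects(props, group, y):
    """OLS fit with proportion and proportion-by-group interaction terms.

    Returns the estimated cell-specific effects and their t statistics.
    No intercept is included because the proportions sum to 1.
    """
    k = props.shape[1]
    design = np.hstack([props, props * group[:, None]])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    dof = design.shape[0] - design.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(design.T @ design)
    est = coef[k:]
    se = np.sqrt(np.diag(cov)[k:])
    return est, est / se


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for noise_sd in (0.1, 1.0):
        props, group, y = simulate_mixture(500, noise_sd, rng)
        est, t_stat = fit_cell_specific_effects(props, group, y)
        print(f"noise_sd={noise_sd}: estimates={est.round(2)}, t={t_stat.round(1)}")
```

Running the script with different values of `n_samples` and `noise_sd` gives a rough feel for the qualitative pattern reported above: the test statistic for the truly differential cell type tends to grow with sample size and shrink as noise increases.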

Table of Contents

Introduction
  Background of Genetics and Genomics
  Sample Mixture Problem
  The Existing Statistical Methods
  Age Impact on DNA Methylation
  Study Goals

Method
  Estimating mixture proportions using EpiDISH
  TOAST Model
  Simulation
  Real Data Application
  Enrichment Analysis

Results
  Simulation
    Impact of noise level and sample size
  Real Data
    Descriptions of the datasets
    Reference-based deconvolution results
    Age Effect on Cell Type Proportions
    Age Effect on CpGs
    Pathway Analysis based on DMCs

Discussion

Reference

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
