Cell Type–specific Gene Expression and DNA Methylation Differences in Complex Tissues Public
Geng, Siyi (Spring 2019)
Abstract
Most tissues, such as blood and tumor, are complex, heterogeneous samples containing multiple cell types. The genomic or epigenomic profiles that high-throughput technologies produce from such tissue samples are therefore mixed signals. This heterogeneity complicates data analysis and introduces bias unless it is properly adjusted for.
We extend an existing method, TOAST (TOols for the Analysis of heterogeneouS Tissues), to model data from mixed, heterogeneous samples and detect cell-specific differential signals. We design a series of simulation studies on cell-specific differential expression (csDE) detection to evaluate TOAST's performance. We then analyze DNA methylation (DNAm) data from two existing human blood datasets, using the reference-based method EpiDISH to estimate cell proportions and applying TOAST to detect age-related cell-specific differentially methylated CpG sites (csDMCs).
Simulation studies and real-data analysis show that the upgraded TOAST performs well at csDE and cell-specific differential methylation (csDM) detection. In the simulations, larger sample sizes improve detection accuracy, while higher noise levels reduce it. In the real data, we find that age is associated with the cell proportions of mixed samples. Through csDM analysis with TOAST, we identify various age-related DMCs in each cell type, and the number of csDMCs differs among cell types. These results show that the upgraded TOAST provides a flexible statistical method for analyzing cell-specific differential gene expression and DNA methylation.
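The idea behind cell-specific differential detection in mixed samples can be illustrated with a small simulation. The sketch below is not the TOAST package or its API. It is a simplified ordinary-least-squares stand-in that assumes the observed signal is a proportion-weighted sum of cell-type means, with a covariate effect present in only one cell type. The function names, parameter values, and the binary covariate are all hypothetical choices made for illustration.

```python
import numpy as np


def simulate_mixture(n_samples=200, n_cell_types=3, effect=2.0,
                     noise_sd=0.5, seed=0):
    """Simulate one feature measured in mixed samples (hypothetical setup)."""
    rng = np.random.default_rng(seed)
    # Cell-type proportions per sample; each row sums to 1.
    props = rng.dirichlet(np.ones(n_cell_types), size=n_samples)
    # Binary covariate of interest (for example, two age groups).
    z = rng.integers(0, 2, size=n_samples).astype(float)
    # Baseline mean per cell type (arbitrary illustrative values).
    baseline = np.linspace(5.0, 7.0, n_cell_types)
    # Only the first cell type responds to the covariate.
    delta = np.zeros(n_cell_types)
    delta[0] = effect
    cell_means = baseline + np.outer(z, delta)  # shape: samples x cell types
    # Observed signal is the proportion-weighted mixture plus noise.
    y = np.sum(props * cell_means, axis=1) + rng.normal(0.0, noise_sd, n_samples)
    return y, props, z


def fit_cell_specific_effects(y, props, z):
    """Estimate a covariate effect per cell type by least squares."""
    # Design matrix: proportions and proportion-by-covariate interactions.
    # No separate intercept is used because the proportions sum to 1.
    X = np.hstack([props, props * z[:, None]])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = X.shape[0] - X.shape[1]
    sigma2 = resid @ resid / dof
    cov = sigma2 * np.linalg.inv(X.T @ X)
    k = props.shape[1]
    estimates = coef[k:]               # interaction terms = cell-specific effects
    std_errors = np.sqrt(np.diag(cov)[k:])
    return estimates, estimates / std_errors


y, props, z = simulate_mixture()
estimates, t_stats = fit_cell_specific_effects(y, props, z)
```

Rerunning `simulate_mixture` with a smaller `n_samples` or a larger `noise_sd` is one way to see, in this toy setting, how sample size and noise level affect the t-statistics.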
Table of Contents
Introduction
Background of Genetics and Genomics
Sample Mixture Problem
The Existing Statistical Methods
Age Impact on DNA Methylation
Study Goals
Method
Estimating mixture proportions using EpiDISH
TOAST Model
Simulation
Real Data Application
Enrichment Analysis
Results
Simulation
Impact of noise level and sample size
Real Data
Descriptions of the datasets
Reference-based deconvolution results
Age Effect on Cell Type Proportions
Age Effect on CpGs
Pathway Analysis based on DMCs
Discussion
Reference
About this Master's Thesis
School | |
---|---|
Department | |
Subfield / Discipline | |
Degree | |
Submission | |
Language | |
Research Field | |
Keyword | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Title | Date Uploaded |
---|---|
Cell Type–specific Gene Expression and DNA Methylation Differences in Complex Tissues | 2019-04-09 10:01:46 -0400 |