Powerful variance-component method for TWAS identifies novel and known risk genes for Alzheimer’s dementia Open Access

Tang, Shizhen (Spring 2020)

Permanent URL: https://etd.library.emory.edu/concern/etds/bg257g26f?locale=en


Background: Existing methods for detecting genes associated with complex diseases include genome-wide association studies (GWAS) and transcriptome-wide association studies (TWAS). Typically, GWAS focuses on detecting associations between common single nucleotide polymorphisms (SNPs) and traits. However, the biological mechanisms underlying the majority of GWAS signals remain to be determined. Existing TWAS methods such as PrediXcan, FUSION, and TIGAR employ different regression models to estimate cis-eQTL effect sizes from reference panels, but all conduct gene-based association tests with a Burden approach, which models each variant's effect size as a linear function of its cis-eQTL effect size estimate. This assumption may not hold for the majority of genes.


Methods and Materials: We propose a novel TWAS method, VC-TWAS, based on the Sequence Kernel Association Test (SKAT). VC-TWAS takes cis-eQTL effect size estimates as variant weights but does not model the directions of variant effect sizes. We used PrediXcan and the nonparametric Bayesian Dirichlet process regression (DPR) model to estimate cis-eQTL effect sizes. In simulation studies, we compared the performance of VC-TWAS and Burden-TWAS, using real genotype data from the ROS/MAP dataset to simulate gene expression levels and phenotypes under two models. In the real-data application, we applied VC-TWAS with the DPR model to study Alzheimer’s dementia related phenotypes.
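
The contrast between the Burden and variance-component approaches can be illustrated with a small sketch. This is not the thesis implementation: the function name, the toy data, and the intercept-only null model (no covariates) are assumptions made for illustration, and the statistics are left unnormalized with no p-value computation. The sketch shows only that a Burden-type statistic sums the signed, weighted per-variant scores before squaring, whereas a SKAT-type statistic squares each weighted score before summing, so it does not depend on effect directions.

```python
def score_statistics(genotypes, phenotype, weights):
    """Return (burden, vc) unnormalized score statistics for one gene.

    genotypes: list of per-sample rows, one entry per variant
    phenotype: list of per-sample trait values
    weights:   per-variant weights (e.g. cis-eQTL effect size estimates)

    Assumption: the null model is intercept-only, so residuals are
    simply phenotype values minus their mean.
    """
    n_samples = len(phenotype)
    mean_y = sum(phenotype) / n_samples
    residuals = [y - mean_y for y in phenotype]

    # Per-variant score: S_j = sum_i g_ij * (y_i - mean(y))
    scores = [
        sum(genotypes[i][j] * residuals[i] for i in range(n_samples))
        for j in range(len(weights))
    ]
    weighted = [w * s for w, s in zip(weights, scores)]

    burden = sum(weighted) ** 2            # (sum_j w_j S_j)^2
    vc = sum(ws ** 2 for ws in weighted)   # sum_j (w_j S_j)^2
    return burden, vc


# Toy data: two variants with equal weights but opposite trait effects.
toy_genotypes = [[1, 0], [1, 0], [0, 1], [0, 1]]
toy_phenotype = [2.0, 2.0, 0.0, 0.0]
toy_weights = [0.5, 0.5]

burden_stat, vc_stat = score_statistics(toy_genotypes, toy_phenotype, toy_weights)
print(burden_stat, vc_stat)  # prints "0.0 2.0"
```

In this toy case the two weighted scores cancel in the Burden statistic but not in the variance-component statistic, which is the situation where weights that misstate the direction of variant effects would be expected to cost the Burden test power.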


Results: In simulation studies, VC-TWAS with weights derived from the DPR method achieved the highest power, compared with Burden-TWAS, when phenotypes were simulated under the assumption of random effects. In the meta-analysis, we detected 13 significant TWAS genes (FDR < 0.05) for AD diagnosis, including the well-known GWAS risk gene TOMM40 (2.86 × 10^-12). VC-TWAS also identified the top novel risk gene ZNF234 (FDR 1.40 × 10^-12) and TRAPPC6A (FDR 1.52 × 10^-10), a gene previously detected by Burden-type TWAS. All significant loci are proximal to APOE, the major known risk locus for Alzheimer’s dementia.


Conclusion: Our findings provide potential biological interpretations for known AD risk genes that also had significant TWAS p-values, in terms of genetic effects mediated through gene expression and significant associations with both AD diagnosis and AD pathology indices.

Table of Contents


1. Introduction

2. Methods

2.1. TWAS Procedure

2.2. VC-TWAS Method

2.3. Cis-eQTL effect size estimation

2.4. Computational considerations of VC-TWAS

2.5. ROS/MAP data

2.6. Mayo Clinic LOAD GWAS data

2.7. Simulation Study Design

3. Results

3.1. Simulation results

3.2. Application results

4. Discussion

5. Appendix

5.1. Tables

5.2. Figures

5.3. Supplemental Data

5.4. Web Resources

6. References

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Subfield / Discipline
  • English