Integrate Proteomics Data with GWAS Summary data for Studying Alzheimer’s Disease by Nonparametric Bayesian Method Open Access

Tingyang, Hu (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/d791sh27f?locale=en%5D
Published

Abstract

Background: Alzheimer’s disease (AD) is a neurodegenerative disorder related to aging with polygenic inheritance. Genome-wide association studies (GWAS) of AD have identified many risk loci, but little is currently known about the underlying biological mechanisms. A proteome-wide association study (PWAS), which integrates proteomics data with GWAS summary data to identify risk genes associated with AD, would provide novel insights into the impact of genetic variation on AD that is potentially mediated through brain protein abundance.

Method: We conducted a weighted protein network analysis of human proteomes from ROS/MAP participants of European ancestry (12691 proteins from 400 samples) to identify clusters of proteins by unsupervised hierarchical clustering and to relate the protein modules to external clinical traits of AD. The PWAS was implemented with the Transcriptome-Integrated Genetic Association Resource V2 (TIGAR-V2) tool in two stages. First, we applied either nonparametric Bayesian Dirichlet Process Regression (DPR) or Elastic-Net penalized regression (as used by PrediXcan) to train protein abundance imputation models, taking protein abundance as the outcome and cis-SNPs as predictors. Second, the protein quantitative trait locus (pQTL) effect sizes estimated from the protein abundance prediction models were integrated with AD GWAS summary-level data to perform association tests using burden test statistics.
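The second stage can be illustrated with a minimal sketch. It assumes the FUSION-style burden Z-score, Z = (w . z) / sqrt(w' V w), where w are the pQTL weights, z are the GWAS Z-scores of the same cis-SNPs, and V is their LD correlation matrix from a reference panel. The abstract does not say which burden statistic was used, so this choice, the function names, and the toy numbers are all assumptions made for illustration; this is not the TIGAR-V2 implementation.

```python
import math


def burden_z(weights, gwas_z, ld):
    """Burden Z-score (w . z) / sqrt(w' V w) for one protein.

    weights: pQTL effect sizes from the abundance prediction model
    gwas_z:  GWAS Z-scores for the same cis-SNPs, in the same order
    ld:      SNP-by-SNP correlation (LD) matrix from a reference panel
    """
    n = len(weights)
    if len(gwas_z) != n or len(ld) != n or any(len(row) != n for row in ld):
        raise ValueError("weights, gwas_z and ld must have matching dimensions")
    numerator = sum(w * z for w, z in zip(weights, gwas_z))
    variance = sum(
        weights[i] * ld[i][j] * weights[j] for i in range(n) for j in range(n)
    )
    if variance <= 0:
        raise ValueError("w' V w must be positive")
    return numerator / math.sqrt(variance)


def two_sided_p(z):
    """Two-sided P-value of a Z-score under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))


# Toy inputs (made up for illustration, not taken from the thesis).
toy_weights = [0.2, -0.1, 0.05]
toy_gwas_z = [3.0, -2.0, 1.0]
toy_ld = [
    [1.0, 0.3, 0.0],
    [0.3, 1.0, 0.1],
    [0.0, 0.1, 1.0],
]
toy_z = burden_z(toy_weights, toy_gwas_z, toy_ld)
toy_p = two_sided_p(toy_z)
```

Because the statistic needs only the weights, the GWAS Z-scores, and a reference LD matrix, the test can be run without individual-level GWAS genotypes, which is what makes summary-level data sufficient.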

Results: Weighted protein network analysis of 8874 proteins after quality control identified 32 network modules, ranging in size from 33 to 2386 proteins. We observed 2 protein modules significantly associated with AD clinical traits. At the training stage of PWAS, we obtained 6673 protein abundance prediction models trained by Bayesian DPR, all of which were valid with 5-fold CV R2 > 0.005. Of the 6389 protein abundance prediction models trained by Elastic-Net regression, only 1835 had 5-fold CV R2 > 0.005. Based on GWAS summary statistics of AD and Bayesian estimated pQTL weights, the PWAS detected 13 genes associated with AD at FDR < 0.05, 3 of which were previously known GWAS risk genes of AD. Furthermore, we compared the PWAS results of AD using pQTL weights estimated by DPR with those using weights estimated by the Elastic-Net method (PrediXcan function integrated in our TIGAR tool). PrediXcan detected 7 significant genes at FDR < 0.05, of which 2 were also identified by TIGAR.
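The FDR threshold used to call significant genes can be sketched as follows. The abstract does not name the FDR procedure, so the Benjamini-Hochberg adjustment below is an assumption, and the function name and P-values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    # Indices of the P-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-gene PWAS P-values (not results from the thesis).
toy_pvalues = [0.01, 0.04, 0.03, 0.005]
toy_adjusted = benjamini_hochberg(toy_pvalues)
toy_significant = [i for i, q in enumerate(toy_adjusted) if q < 0.05]
```

A gene is then reported when its adjusted value falls below 0.05, which is how a statement such as "13 genes at FDR < 0.05" would be read under this assumption.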

Conclusion: In this work, we detected PWAS risk genes for AD and demonstrated the usefulness of nonparametric Bayesian DPR method in PWAS for AD. We believe this approach can be applied widely to study other complex polygenic diseases and provide new insights into their pathogenesis. 

Table of Contents

1 Introduction

2 Method and Materials
2.1 Data Source
2.2 Weighted Protein Network Analysis
2.3 PWAS Framework

3 Protein Network Analysis
3.1 Proteomics Data from ROS/MAP Studies
3.2 Protein Modules/Networks
3.2.1 Associations between Protein Modules and Clinical AD Traits
3.2.2 Protein Networks by STRING

4 PWAS of AD
4.1 Train Protein Abundance Prediction Models
4.2 PWAS Results by TIGAR/DPR
4.3 Compare with PWAS Results by PrediXcan
4.4 Compare with PWAS Results by FUSION as in Wingo’s Paper

5 Conclusion

Bibliography

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English