Investigating genetic associations using power-optimizing analytic approaches
Holleman, Aaron (Fall 2021)
Abstract
Substantial progress has been made toward identifying genetic factors that contribute to many complex phenotypes, yet the genetics underlying such traits remain incompletely understood, partly because of insufficient study power. Increasing sample size yields greater power but can be difficult to accomplish within study constraints, and other approaches are sometimes preferable for achieving power gains. In these situations, power-optimizing analytic techniques can be particularly useful. For this dissertation, we applied such techniques to investigate genetic associations with phenotypes of interest more powerfully.
In Aim 1, we employed polygenic risk score (PRS) methods to examine the contribution of common genetic variation to atrioventricular septal defects (AVSD) in individuals with Down syndrome (DS). Using one of the largest available AVSD in DS case-control datasets, we constructed a PRS for each individual from large sets of common variants, using effect estimates from the largest available GWAS of congenital heart defects as weights. PRS were associated with AVSD, with odds ratios ranging from 1.2 to 1.3 per standard deviation increase in PRS, suggesting that common variants collectively make at least a small contribution to DS-associated AVSD.
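As a rough illustration of the general PRS idea described above (not the dissertation's actual pipeline), a score can be sketched as a weighted sum of each individual's allele dosages, with GWAS effect estimates as weights, then standardized so that an association is expressed per standard deviation. The function names, the toy weights, and the toy dosages below are all hypothetical.

```python
from math import exp, sqrt

def polygenic_scores(dosages, weights):
    """Return one raw score per individual: sum over variants of weight * dosage."""
    return [sum(w * d for w, d in zip(weights, row)) for row in dosages]

def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]

def odds_ratio_per_sd(log_odds_coefficient):
    """Convert a logistic-regression coefficient on a standardized PRS to an odds ratio."""
    return exp(log_odds_coefficient)

# Hypothetical effect estimates (log odds ratios) for three variants.
weights = [0.10, -0.05, 0.20]

# Hypothetical allele dosages (0, 1, or 2 copies) for four individuals.
dosages = [
    [0, 1, 2],
    [1, 1, 0],
    [2, 0, 1],
    [0, 2, 0],
]

raw_prs = polygenic_scores(dosages, weights)
prs_z = standardize(raw_prs)
```

In a real analysis the standardized score would then enter a logistic regression of case status on PRS (typically with covariates), and the fitted coefficient would be passed through `odds_ratio_per_sd` to obtain an odds ratio per standard deviation increase.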
In Aim 2, we evaluated the Gene Association with Multiple Traits (GAMuT) method as a potentially powerful approach to identify genes harboring common variants that influence psychiatric phenotypes. When applied to simulated data, GAMuT’s multivariate modeling of Beck Depression Inventory (BDI) items demonstrated greater power for identifying common variant associations than univariate methods analyzing a summary BDI score. Application of GAMuT to Grady Trauma Project data identified common variant associations with the PTSD Symptom Scale and the BDI.
In Aim 3, we investigated associations of rare regulatory variants with gene expression, for genes with schizophrenia-associated expression levels. We employed a modified version of a burden method developed to increase power for investigating rare variant associations with gene expression, and consistently observed U-shaped patterns of estimated association whereby rare regulatory allele burden was increased at both low and high expression levels.
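To make the U-shaped pattern concrete, the sketch below counts rare alleles per individual and compares mean burden across low, middle, and high expression bins. This is a purely descriptive illustration under assumed inputs, not the modified burden method used in the dissertation; the function names, the z-score cutoff of 1.0, and the toy data are hypothetical.

```python
def rare_allele_burden(genotypes):
    """Return the total count of rare regulatory alleles carried by each individual."""
    return [sum(row) for row in genotypes]

def mean_burden_by_expression_bin(expression_z, burden, cutoff=1.0):
    """Average burden among individuals with low, middle, and high expression z-scores."""
    bins = {"low": [], "middle": [], "high": []}
    for z, b in zip(expression_z, burden):
        if z < -cutoff:
            bins["low"].append(b)
        elif z > cutoff:
            bins["high"].append(b)
        else:
            bins["middle"].append(b)
    # An empty bin has no defined mean, so report None for it.
    return {name: (sum(v) / len(v) if v else None) for name, v in bins.items()}

# Hypothetical expression z-scores for one gene in six individuals.
expression_z = [-1.8, -0.3, 0.1, 0.4, 1.6, -1.2]

# Hypothetical rare-allele counts at three regulatory sites per individual.
genotypes = [
    [1, 1, 0],
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 1],
    [0, 0, 1],
]

burden = rare_allele_burden(genotypes)
means = mean_burden_by_expression_bin(expression_z, burden)
```

A U-shaped pattern would appear here as higher mean burden in both the `low` and `high` bins than in the `middle` bin; a formal analysis would test this with an appropriate regression model rather than by comparing bin means.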
By applying certain power-optimizing analytic approaches, we have generated novel findings suggestive of genetic associations with phenotypes of interest.
Table of Contents
Chapter 1: Introduction and Background
Introduction
Overarching goal and specific aims
Background
Aim 1
Aim 2
Aim 3
Chapter 2: Employing polygenic risk score methods to examine the contribution of common genetic variants to atrioventricular septal defects in infants with Down syndrome
Abstract
Introduction
Methods
Overview of the PRS method
Target dataset sources
Whole genome sequencing dataset
Genome-wide imputation dataset
Target dataset preparation
Primary analyses
Secondary analyses
Analytic approach
Discovery data used to define weights for the PRS
Generating PRS for the primary analyses
Generating PRS for the secondary analyses
Testing association of PRS with DS+AVSD
Results
Primary analyses
Secondary analyses
Discussion
Chapter 3: A powerful multivariate method for examining genetic associations with psychiatric phenotypes
Abstract
Introduction
Methods
Overview of GAMuT
Simulated data analyses
Type I error
Power
Applied analyses
PSS analyses
BDI analyses
Multiple testing differences and correction
Results
Simulated data analyses
Type I error
Power
Applied analyses
PSS
BDI
Discussion
Supplement
Chapter 4: Investigating the association of rare regulatory variation and gene expression among genes with schizophrenia-associated expression levels
Abstract
Introduction
Methods
Data sources
Processing and QC of targeted DNA sequencing and expression datasets
Targeted DNA sequencing dataset
RNA sequencing dataset
Microarray expression dataset
Generating final analytic datasets
RNA sequencing samples
Microarray expression samples
Analytic approach
Overview of burden method
Genes for analyses
Covariate adjustment
Analysis subsets
Sensitivity analyses
Results
RNA sequencing and microarray expression discordance
RNA-sequencing-only analyses
Genes with SZ-associated expression levels
Genes with SZ-associated expression levels or within a SZ CNV
Filtering the combined gene set based on gene constraint metrics
Sensitivity analyses
Discussion
Supplement
Chapter 5: Summary of Results, Future Research
References
Primary PDF
Title | Date Uploaded |
---|---|
Investigating genetic associations using power-optimizing analytic approaches | 2021-11-05 10:09:52 -0400 |