Investigating genetic associations using power-optimizing analytic approaches

Holleman, Aaron (Fall 2021)

Permanent URL: https://etd.library.emory.edu/concern/etds/79407z51r?locale=fr

Abstract

Substantial progress has been made toward identifying genetic factors that contribute to many complex phenotypes, yet the genetics underlying such traits remain incompletely understood, partly because of insufficient study power. Increasing sample size yields greater power but may be difficult given study constraints, and at times other approaches may be preferable for achieving power gains. In these situations, power-optimizing analytic techniques can be particularly useful. In this dissertation, we applied such techniques to investigate genetic associations with phenotypes of interest with greater power.

In Aim 1, we employed polygenic risk score (PRS) methods to optimally examine the contribution of common genetic variation to atrioventricular septal defects (AVSD) in individuals with Down syndrome (DS). Using one of the largest available AVSD in DS case-control datasets, we constructed PRS based on large sets of common variants for each individual, using effect estimates from the largest available GWAS of congenital heart defects as weights. PRS were associated with AVSD with odds ratios ranging from 1.2 to 1.3 per standard deviation increase in PRS, suggesting at least a small contribution by common variants collectively to DS-associated AVSD.
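As a rough illustration of the scoring step described above, a PRS is commonly computed as a weighted sum of an individual's allele counts, with discovery-GWAS effect estimates as the weights, and then standardized so that effects can be reported per standard deviation. The sketch below assumes a tiny made-up genotype matrix and made-up weights; the function names, data, and values are hypothetical and are not the dissertation's pipeline or results.

```python
from statistics import mean, pstdev


def polygenic_risk_scores(genotypes, weights):
    """Return one weighted allele-count sum per individual.

    genotypes: list of per-individual lists of allele counts (0, 1, or 2)
    weights:   per-variant effect estimates, e.g. log odds ratios
               taken from a discovery GWAS (hypothetical values here)
    """
    for row in genotypes:
        if len(row) != len(weights):
            raise ValueError("each genotype row needs one count per weight")
    return [sum(g * w for g, w in zip(row, weights)) for row in genotypes]


def standardize(scores):
    """Center and scale scores to mean 0 and (population) SD 1."""
    mu = mean(scores)
    sd = pstdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - mu) / sd for s in scores]


# Hypothetical toy data: 4 individuals x 3 variants (not real data).
toy_genotypes = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 0, 0]]
toy_weights = [0.10, -0.05, 0.20]  # made-up log odds ratios

raw_prs = polygenic_risk_scores(toy_genotypes, toy_weights)
z_prs = standardize(raw_prs)
```

In a case-control analysis, the standardized score would typically enter a logistic regression as a predictor; exponentiating its coefficient gives an odds ratio per standard deviation increase in PRS, which is the scale on which the results above are reported.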

In Aim 2, we evaluated the Gene Association with Multiple Traits (GAMuT) method as a potentially powerful approach to identify genes harboring common variants that influence psychiatric phenotypes. When applied to simulated data, GAMuT’s multivariate modeling of Beck Depression Inventory (BDI) items demonstrated greater power for identifying common variant associations than univariate methods analyzing a summary BDI score. Application of GAMuT to Grady Trauma Project data identified common variant associations with the PTSD Symptom Scale and the BDI.

In Aim 3, we investigated associations of rare regulatory variants with gene expression, for genes with schizophrenia-associated expression levels. We employed a modified version of a burden method developed to increase power for investigating rare variant associations with gene expression, and consistently observed U-shaped patterns of estimated association whereby rare regulatory allele burden was increased at both low and high expression levels.
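To make the idea of a rare-allele burden concrete, here is a minimal generic sketch. It is not the modified burden method used in the dissertation, whose details are not given here. It assumes a per-individual burden defined as the count of minor alleles at variants below a frequency cutoff, and it summarizes mean burden across expression bins. All names, the cutoff, and the toy data are hypothetical; the toy values were chosen by hand to mimic a U-shaped pattern and are not results.

```python
from statistics import mean


def rare_allele_burden(genotypes, allele_freqs, maf_cutoff=0.01):
    """Count minor alleles per individual at variants rarer than maf_cutoff."""
    rare = [j for j, f in enumerate(allele_freqs) if f < maf_cutoff]
    return [sum(row[j] for j in rare) for row in genotypes]


def mean_burden_by_expression_bin(burden, expression, n_bins=3):
    """Rank individuals by expression, split into bins, average burden per bin."""
    if len(burden) != len(expression):
        raise ValueError("burden and expression must have the same length")
    if len(expression) < n_bins:
        raise ValueError("need at least one individual per bin")
    order = sorted(range(len(expression)), key=lambda i: expression[i])
    size = len(order) // n_bins
    means = []
    for b in range(n_bins):
        start = b * size
        # The last bin absorbs any remainder.
        end = (b + 1) * size if b < n_bins - 1 else len(order)
        means.append(mean(burden[i] for i in order[start:end]))
    return means


# Hypothetical toy data: 6 individuals x 3 variants (not real data).
toy_freqs = [0.005, 0.20, 0.008]  # the second variant is common and is ignored
toy_genotypes = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 2, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 1],
]
toy_expression = [-2.0, -1.5, -0.1, 0.2, 1.4, 2.1]  # made-up expression values

burden = rare_allele_burden(toy_genotypes, toy_freqs)
bin_means = mean_burden_by_expression_bin(burden, toy_expression)
```

With these hand-picked values, mean burden is higher in the lowest and highest expression bins than in the middle bin, which is the qualitative shape the paragraph above describes.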

By applying certain power-optimizing analytic approaches, we have generated novel findings suggestive of genetic associations with phenotypes of interest.

Table of Contents

Chapter 1: Introduction and Background

Introduction

Overarching goal and specific aims

Background

Aim 1

Aim 2

Aim 3

Chapter 2: Employing polygenic risk score methods to examine the contribution of common genetic variants to atrioventricular septal defects in infants with Down syndrome

Abstract

Introduction

Methods

Overview of the PRS method

Target dataset sources

Whole genome sequencing dataset

Genome-wide imputation dataset

Target dataset preparation

Primary analyses

Secondary analyses

Analytic approach

Discovery data used to define weights for the PRS

Generating PRS for the primary analyses

Generating PRS for the secondary analyses

Testing association of PRS with DS+AVSD

Results

Primary analyses

Secondary analyses

Discussion

Chapter 3: A powerful multivariate method for examining genetic associations with psychiatric phenotypes

Abstract

Introduction

Methods

Overview of GAMuT

Simulated data analyses

Type I error

Power

Applied analyses

PSS analyses

BDI analyses

Multiple testing differences and correction

Results

Simulated data analyses

Type I error

Power

Applied analyses

PSS

BDI

Discussion

Supplement

Chapter 4: Investigating the association of rare regulatory variation and gene expression among genes with schizophrenia-associated expression levels

Abstract

Introduction

Methods

Data sources

Processing and QC of targeted DNA sequencing and expression datasets

Targeted DNA sequencing dataset

RNA sequencing dataset

Microarray expression dataset

Generating final analytic datasets

RNA sequencing samples

Microarray expression samples

Analytic approach

Overview of burden method

Genes for analyses

Covariate adjustment

Analysis subsets

Sensitivity analyses

Results

RNA sequencing and microarray expression discordance

RNA-sequencing-only analyses

Genes with SZ-associated expression levels

Genes with SZ-associated expression levels or within a SZ CNV

Filtering the combined gene set based on gene constraint metrics

Sensitivity analyses

Discussion

Supplement

Chapter 5: Summary of Results, Future Research

References

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English