Statistical Method for Studying Genome-Wide Interaction and the Natural History of Alzheimer’s Disease
Restricted; Files Only
Bian, Shijia (Spring 2025)
Abstract
Researchers commonly use single-nucleotide polymorphism (SNP) data from genome-wide association studies (GWAS) to estimate the narrow-sense heritability of a trait or disease using statistical methods like GCTA and LD Score Regression. Typically, SNP-based heritability estimates of a trait are smaller than the corresponding family-based estimates derived from kinship. This discrepancy may arise from non-additive genetic effects, including gene-gene or gene-environment interactions. Therefore, identifying SNPs involved in interactions is crucial for accurately understanding genetic contributions to trait heritability. Variance-based tests offer a computationally efficient way to screen SNPs for potential interaction effects without specifying the interacting variables. However, these tests typically evaluate only one trait at a time, neglecting pleiotropy, which, if accounted for, could enhance power to detect true interactions. To address this limitation, we introduce SCAMPI (Scalable Cauchy Aggregate test using Multiple Phenotypes to test Interactions), a computationally scalable and interpretable approach that screens for SNPs with interaction effects across multiple traits simultaneously.
Additionally, given the rapid growth in medical data, there is an increasing need to review and synthesize genotype interaction evidence across independent studies. Challenges such as restrictions on sharing sensitive data and variability in sample sizes often lead to false positives from inflated type-I errors or false negatives due to limited statistical power, making replication difficult. In response, we propose MetaSCAMPI, a meta-analysis extension of SCAMPI designed to combine effect estimates from multiple studies without requiring the pooling of individual-level data. MetaSCAMPI significantly reduces logistical and regulatory hurdles associated with data sharing, while improving statistical power and reproducibility.
Lastly, we expand our analyses beyond the genome and consider the role of the proteome in the origins of complex human disease. In particular, we studied proteomic data derived from cerebrospinal fluid (CSF) samples to determine the role of CSF proteins in disease progression among individuals with autosomal-dominant Alzheimer’s disease (ADAD). This investigation revealed a comprehensive six-decade timeline of pathological evolution. Understanding this extended pathological progression provides valuable insights for developing precision therapeutic interventions and identifying novel biomarkers beyond traditional targets such as Aβ and Tau.
Table of Contents
1 Introduction
1.1 Background
1.2 Building on Prior Work: A Review and Our Research Contributions
1.2.1 Modeling and Screening Gene-Environment and Gene-Gene Interactions in Complex Traits
Traditional Modeling Approaches
Early Methods for Detecting Interactions
Variance-Based Screening Approaches
SCAMPI: A Multivariate, Scalable Framework
1.2.2 Meta-Analysis in Genetic Research and the Development of MetaSCAMPI
Importance of Meta-Analysis in Biological and Medical Research
Meta-Analysis Approaches
MetaSCAMPI: A Summary-Based Multivariate Meta-Analysis Framework
1.2.3 Decoding Alzheimer’s Disease Pathology Through Biomarkers
Genetic and Epigenetic Biomarkers
Fluid Biomarkers
Contribution of Our Work: CSF Proteomic Profiling in Autosomal Dominant AD
Limitations and Future Directions
2 SCAMPI: A scalable statistical framework for genome-wide interaction testing harnessing cross-trait correlations
2.1 Abstract
2.2 Introduction
2.3 Materials and Methods
2.3.1 Motivation
2.3.2 Notation and Trait Standardization
2.3.3 Analysis Strategy
2.3.4 Cauchy Combination Test (CCT)
2.3.5 Overview of the SCAMPI Framework
2.3.6 Application to UK Biobank Data
2.3.7 Simulations
2.4 Results
2.4.1 Simulation Studies
2.4.2 Application to UKBB
2.4.3 SCAMPI Analysis in UKBB Adjusting for APOE
2.4.4 Computational Performance
2.5 Discussion
3 MetaSCAMPI: A scalable variance- and covariance-based meta-analysis framework for genome-wide interaction testing utilizing summary statistics
3.1 Abstract
3.2 Introduction
3.3 Methods
3.3.1 Overview of SCAMPI
3.3.2 MetaSCAMPI Framework
3.3.3 Simulation Study
3.3.4 Application of MetaSCAMPI to UKBB
3.4 Result
3.4.1 Type I Error Simulation
3.4.2 Power Simulation
3.4.3 UKBB Data Analysis
3.5 Discussion
4 Proteomic analysis of cerebrospinal fluid from individuals with autosomal dominant Alzheimer’s disease reveals how this complex and chronic disease evolves over many decades
4.1 Abstract
4.2 Introduction
4.3 Results
4.3.1 Proteomics Identifies Early Elevations in SMOC1 and the Matrisome with Subsequent Cascading Pathological Changes
4.3.2 The Proteome Strongly Discriminates Mutation Carriers from Non-carriers Prior to Symptom Onset
4.4 Method
4.4.1 Participants
4.4.2 Clinical assessment and EYO
4.4.3 Statistical analysis
Bayesian Modeling
Classification
4.5 Discussion
5 Conclusions and Future Directions
5.1 Summary
5.2 Multimodal Biomarkers for AD
5.3 Outlook on Future Strategies by Integrating Multimodal Data for AD
A Appendix for Chapter 2
B Appendix for Chapter 3
C Appendix for Chapter 4
Bibliography
Primary PDF
Title | Date Uploaded | Actions |
---|---|---|
File download under embargo until 22 May 2027 | 2025-04-18 23:28:17 -0400 | File download under embargo until 22 May 2027 |