Estimating Genetic Effects When Stratification-Score Matching Is Used to Correct for Confounding by Population Stratification in Case-Control Studies

Open Access

Sun, Zhe (2014)

Permanent URL: https://etd.library.emory.edu/concern/etds/7w62f877f?locale=en
Published

Abstract

Case-control studies are the most frequently used design for investigating the association between genetic variation and the risk of developing a particular disorder. This association may be confounded by population stratification, which arises when genetic variation is correlated with variation in disease risk across latent subpopulations in the case-control sample. Failing to account for this confounding can produce false associations between genetic markers and disease. An efficient correction proposed by Epstein et al. (2007, 2012) infers ancestry from the principal components of the sample correlation matrix of SNP genotypes and then fine-matches cases and controls on the stratification score, which is the probability of disease given genomic variables. However, this approach provides only a hypothesis test of the association, not an estimate of the genetic effect. In this thesis, we propose a novel estimation method based on the fine-matched case-control sample. We carried out extensive simulation studies to evaluate the performance of the proposed method and to compare it with a few alternative strategies. The simulation results show that the proposed estimator has little bias, even when ancestry is strongly associated with the genetic marker under study.
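To make the idea of a stratification score concrete, the following is a minimal sketch, not the thesis's implementation. It assumes the ancestry components have already been computed and are passed in as plain lists, estimates the score with an ordinary logistic regression fitted by gradient ascent, and then pairs cases with controls using a simple greedy nearest-neighbor rule within a caliper. The greedy pairing is a deliberately simplified stand-in for fine matching, and all function names, the toy data, and the caliper value are hypothetical.

```python
import math


def sigmoid(eta):
    """Logistic function mapping a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(X, y, steps=2000, lr=0.1):
    """Fit logistic regression by plain gradient ascent on the log-likelihood.

    Returns [intercept, slope_1, ..., slope_p]. Assumed illustration only;
    a real analysis would use an established statistics package.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(steps):
        grad = [0.0] * (p + 1)
        for xi, yi in zip(X, y):
            eta = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            resid = yi - sigmoid(eta)
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def stratification_scores(X, beta):
    """Estimated P(disease | ancestry components) for each subject."""
    return [
        sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
        for xi in X
    ]


def greedy_match(scores, y, caliper=0.1):
    """Pair each case with the nearest unused control within `caliper`.

    Greedy and order-dependent: a simplified stand-in, not fine matching.
    """
    controls = {i for i, yi in enumerate(y) if yi == 0}
    pairs = []
    for i in [k for k, yk in enumerate(y) if yk == 1]:
        if not controls:
            break
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs


# Hypothetical toy data: one ancestry component, two latent subpopulations.
X = [[-1.0], [-1.2], [-0.8], [-0.9], [1.0], [1.1], [0.9], [1.2]]
y = [0, 0, 1, 0, 1, 1, 0, 1]  # 1 = case, 0 = control

beta = fit_logistic(X, y)
scores = stratification_scores(X, beta)
pairs = greedy_match(scores, y, caliper=0.1)
print(pairs)  # list of (case index, control index) pairs
```

Subjects paired this way have similar estimated disease probability given ancestry, so a comparison of marker genotypes within pairs is less affected by ancestry differences. Cases with no control inside the caliper are left unmatched in this sketch.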

Table of Contents

1. Introduction

2. Methods

2.1 Ancestry Components and Stratification Score

2.2 Fine Matching

2.3 Conditional Logistic Regression Based on Stratification Score

2.4 Software Implementation

3. Results

3.1 Simulation Studies

3.2 Under the Null Hypothesis

3.3 Under the Alternative Hypothesis

4. Discussion

5. Reference

6. Appendix

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English