Identification of the Effect of Population Stratification on Association Studies of Rare Variants
Jiang, Yunxuan (2011)
Abstract
Human genome research, which aims to uncover the genetic etiology of disease, has an increasingly profound influence on public health. Rare variants, which can have large effect sizes and may explain a substantial proportion of heritability, are becoming the focus of current human genome research. Although several statistical methods have been developed to increase the power to detect rare variants and reduce the false positive rate, none of these methods addresses an important issue that often arises in genetic studies: false positives due to population stratification.

Population stratification is a well-known problem that can substantially inflate the false positive rate and decrease the power to detect real associations. We simulated several case-control studies with different sample sizes and population structures, using a series of disease prevalences for each population (European and African), and found that population stratification can have a significant influence on rare-variant studies. The false positive rate increases dramatically as the sample size increases and the population structure becomes more extreme. We applied principal component analysis to control for population structure. Our results showed that the principal component method performed very well even for highly structured data: the false positive rate remained around 0.05 in our simulations. Our results imply that researchers need to carefully match case and control ancestry in order to avoid false positives caused by population structure in rare-variant studies. If recruiting samples from different populations is unavoidable, researchers can correct for the resulting stratification with our easily implemented method.
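
The confounding mechanism described in the abstract, and the principal component correction, can be illustrated with a minimal sketch. It is not the thesis's simulation: genotypes are drawn from a simple Balding-Nichols model rather than the genealogy-based haplotype simulation listed in the table of contents, individuals are sampled as a cohort rather than as matched cases and controls, and the adjustment residualizes both the burden score and the phenotype on the top principal components by linear regression. All function names, allele frequencies, prevalences, and sample sizes below are assumptions chosen for illustration.

```python
import numpy as np

def simulate_genotypes(n_per_pop, n_snps, fst, rng):
    """Draw 0/1/2 genotypes for two populations (Balding-Nichols model, assumed here)."""
    p_anc = rng.uniform(0.1, 0.9, size=n_snps)        # ancestral allele frequencies
    a = p_anc * (1 - fst) / fst
    b = (1 - p_anc) * (1 - fst) / fst
    blocks = [rng.binomial(2, rng.beta(a, b), size=(n_per_pop, n_snps))
              for _ in range(2)]                      # one block per population
    labels = np.repeat([0, 1], n_per_pop)
    return np.vstack(blocks), labels

def top_principal_components(genotypes, k):
    """Return the top-k principal component scores of standardized genotypes."""
    g = genotypes.astype(float)
    g = g - g.mean(axis=0)
    sd = g.std(axis=0)
    g = g[:, sd > 0] / sd[sd > 0]                     # drop monomorphic markers
    u, s, _ = np.linalg.svd(g, full_matrices=False)
    return u[:, :k] * s[:k]

def residualize(y, covariates):
    """Remove the linear effect of the covariates (plus an intercept) from y."""
    y = np.asarray(y, dtype=float)
    x = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(x, y, rcond=None)
    return y - x @ beta

rng = np.random.default_rng(0)
genotypes, pop = simulate_genotypes(n_per_pop=2000, n_snps=500, fst=0.1, rng=rng)

# Null scenario: disease status and rare-allele burden both depend on
# population only, so any burden-disease association is spurious.
disease = rng.binomial(1, np.where(pop == 0, 0.1, 0.4))    # assumed prevalences
burden = rng.binomial(40, np.where(pop == 0, 0.005, 0.03)) # 20 rare sites, 2 alleles each

pcs = top_principal_components(genotypes, k=2)
raw_r = np.corrcoef(burden, disease)[0, 1]
adj_r = np.corrcoef(residualize(burden, pcs), residualize(disease, pcs))[0, 1]

print(f"burden-disease correlation, unadjusted:  {raw_r:.3f}")
print(f"burden-disease correlation, PC-adjusted: {adj_r:.3f}")
```

In this sketch the unadjusted correlation reflects only the shared dependence on ancestry, and it shrinks toward zero once the leading principal components are regressed out. A full analysis would instead apply a rare-variant association test with the principal components as covariates and estimate the false positive rate over many replicates.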
Identification of the Effect of Population Stratification on Association Studies of Rare Variants
By Yunxuan Jiang
Bachelor of Science
Beijing Forestry University
2009
Thesis Committee Chair: Karen N. Conneely, Ph.D
Michael P. Epstein, Ph.D
A thesis submitted to the Faculty of the
Rollins School of Public Health of Emory University
in partial fulfillment of the requirements for the degree of
Master of Science in Public Health
in Biostatistics
2011
Table of Contents
Chapter 1 Introduction
Chapter 2 Review of the Literature
2.1 Common vs. Rare Variants
2.2 Study Design
2.2.1 Linkage Analysis
2.2.2 Association Studies
2.2.2.1 Candidate Gene Association Studies
2.2.2.2 Genome-Wide Association Studies
2.3 Population Stratification
2.3.1 Genomic Control
2.3.2 Structure
2.3.3 Principal Component Analysis
2.4 Statistical Methods for Analyzing Rare Variants
Chapter 3 Methodology
3.1 Simulating Population-Specific Haplotypes
3.1.1 Building the Genealogy
3.1.2 Mutation
3.1.3 Adding Neutral Mutations to the Genealogy
3.1.4 Migration
3.1.5 Recombination
3.2 Simulating a Case-Control Study
3.3 Simulating GWAS Data
3.4 Methods to Calculate Principal Components
3.5 Testing for Association Between Rare Variants and Disease
Chapter 4 Results
4.1 Simulated Study
4.2 False Positive Rate
4.3 Correcting for Population Stratification Using Principal Components
Chapter 5 Conclusions, Implications and Recommendations
References
Primary PDF
Title | Date Uploaded |
---|---|
Identification of the Effect of Population Stratification on Association Studies of Rare Variants | 2018-08-28 11:21:09 -0400 |