Evaluation of the impact of DNA Sequence Variations on in vivo Transcription Factor Binding Affinity
Jiang, Jiahui (Spring 2021)
Abstract
Genome-wide association studies (GWASs) have identified a large number of single nucleotide variants (SNVs), and thousands of SNVs within non-coding regions are associated with complex diseases. However, how non-coding SNVs contribute to disease is not yet clear. Recently, the number of studies focusing on the impact of these SNVs has been increasing rapidly. One possible mechanism is that some non-coding SNVs alter regulatory elements, for example by disrupting transcription factor (TF) binding sites, leading to changes in gene expression that result in disease. Traditionally, it has been assumed that SNVs within TF binding sites affect TF binding. However, a growing number of studies show that not all such SNVs affect TF binding, since most TF binding motifs are not well conserved. Therefore, more information is needed to annotate SNVs within TF binding sites. In this study, we conducted a comprehensive survey to quantify the impact of SNVs on TF binding affinity using a creative sequence-based machine learning method. We found that only 20% of SNVs within putative TF binding sites are likely to significantly affect in vivo TF binding.
Table of Contents
1. Introduction
2. Method
2.1 Using Phenotype–Genotype Integrator (PheGenI) and UCSC Table Browser to find SNVs
2.2 Using PWM method to evaluate motif
2.3 Using gkm-SVM method to evaluate motif
3. Results
3.1 Correlation between PWM scores and gkm-SVM weights
3.2 Potential association between TFs and complex diseases
4. Discussion
5. References
6. Figures
7. Supplementary Materials
About this Master's Thesis
School | |
---|---|
Department | |
Subfield / Discipline | |
Degree | |
Submission | |
Language | |
Research Field | |
Keyword | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Title | Date Uploaded |
---|---|
Evaluation of the impact of DNA Sequence Variations on in vivo Transcription Factor Binding Affinity | 2021-04-24 23:30:16 -0400 |