Genomics studies of population structure and evolution in Neisseria gonorrhoeae
Open Access

Ezewudo, Matthew Nwachukwu (2015)

Permanent URL: https://etd.library.emory.edu/concern/etds/1c18dg573?locale=en
Published

Abstract

Improvements in whole-genome sequencing technology have created opportunities to answer questions about evolution within biological populations and about the genetic architecture underlying phenotypes of interest. This dissertation takes advantage of these advances by developing and refining bioinformatics tools that interpret high-throughput data from next-generation sequencing platforms, both to address classical genetics questions and to better characterize microbial populations in the emerging field of microbial genomics.

In this body of work, we analyzed genome-wide sequencing data from Neisseria gonorrhoeae isolates collected across the globe to infer the evolutionary forces acting on the pathogen population. N. gonorrhoeae causes gonorrhea, a sexually transmitted infection of public health relevance because the pathogen has shown the ability to evolve resistance to most known antibiotics. Its natural competence for transformation is believed to underlie this ability. Our analysis suggests an appreciable effect of recombination within the population and reconfirms the presence of previously described horizontally transferred resistance determinants in strains resistant to third-generation cephalosporins.

We also point out some limitations of the more widely used current sequencing platforms for making accurate inferences from whole-genome data. A broader sample set than the collection we assembled is needed to further characterize this pathogen and similar microbial populations.

Table of Contents

1. Chapter 1: Introduction

1.1 Neisseria gonorrhoeae: Overview of pathogen and disease

1.2 Bacteria evolution and population biology

1.2.1 Genetic Drift

1.2.2 Natural Selection

1.2.2.1 De novo point mutation and bacteria evolution

1.2.2.2 Impact of recombination in bacteria population

1.2.2.3 Estimating positive selection in bacteria population

1.3 Bacteria antibiotic resistance

1.3.1 Molecular mechanisms of antibiotic resistance in bacteria

1.3.1.1 Antibiotics resistance mechanisms in Neisseria gonorrhoeae

1.3.2 Antibiotics resistance in commensal bacteria

1.4 Next generation sequencing (NGS) and big data analysis

1.4.1 Association studies using bacteria NGS

1.5 Questions examined in this thesis

1.6 Outline of thesis and chapter summaries

2. Chapter 2: Evaluating Rare Variants in Complex Disorders Using Next-Generation Sequencing

2.1 Introduction

2.2 Common genetic variation and complex diseases

2.3 Rare genetic variation and complex diseases

2.4 Next-generation sequencing, targeted enrichment and complex diseases

2.5 Challenges facing next-generation sequencing and complex diseases

2.5.1 Accurate identification of genomic variation

2.5.2 Efficient analysis of next-generation sequencing data

2.5.3 Interpreting the functional effects of genetic variation

2.6 Conclusion

2.7 Appendix

3. Chapter 3: SeqAnt 3.0: Revisions and updates on sequence annotation web application

3.1 Introduction

3.2 Implementation

3.2.1 Overview

3.2.2 Database Platforms

3.2.3 Gene Annotation Track

3.2.4 SNP Annotation Tracks

3.2.5 Clinical variations Annotation Tracks

3.2.6 Conservation Score Tracks

3.2.7 Prokaryotic Annotations

3.2.8 Web Interface

3.3 Results and Discussion

3.4 Conclusion

3.5 Appendix

4. Chapter 4: Population structure of Neisseria gonorrhoeae based on whole genome data and its relationship with antibiotic resistance

4.1 Introduction

4.2 Materials and Methods

4.2.1 Neisseria gonorrhoeae isolates

4.2.2 Sequence generation and assembly

4.2.3 Genome-wide phylogeny and pangenome analysis

4.2.4 Multi-locus sequence typing (MLST) locus analysis

4.2.5 Estimating population parameters and homologous recombination

4.2.6 Population structure analysis

4.2.7 Mapping the movement of DNA between Neisseria gonorrhoeae clades

4.2.8 Comparison of nucleotide substitution rates

4.2.9 Analysis of positive selection

4.2.10 Confirming known predictors of antibiotic resistance phenotypes

4.3 Results and Discussion

4.3.1 Genome-wide homologous recombination in diverse N. gonorrhoeae

4.3.2 Neisseria gonorrhoeae population structure and biogeography

4.3.3 Genetic admixture within N. gonorrhoeae and with other Neisseria species

4.3.4 Genes under positive selection

4.3.5 Analysis of known genetic predictors for AMR phenotypes

4.4 Conclusions

4.5 Appendix

5. Chapter 5: Genome-wide tests for antibiotic resistance-associated variants within Neisseria gonorrhoeae

5.1 Introduction

5.2 Materials and Methods

5.2.1 Neisseria gonorrhoeae isolates

5.2.2 Sequence generation

5.2.3 Variant calling

5.2.4 Nucleotide diversity analysis

5.2.5 Pangenome analysis and test for flexible genes underlying resistance

5.2.6 Genome-wide association analysis

5.3 Results and Discussion

5.3.1 Nucleotide diversity of sample set

5.3.2 Potential novel resistance genetic variants

5.3.2.1 Comparison between PPFS and QROADTRIPS Methods

5.4 Conclusion

5.5 Appendix

6. Chapter 6: Summary and future directions

7. Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English