Multi-omic Analysis to Define Mechanisms of Antigenic Variation in Malaria

Tseng, Christopher (Spring 2018)

Permanent URL: https://etd.library.emory.edu/concern/etds/02870v887?locale=en
Published

Abstract

During Plasmodium knowlesi malaria infections, the parasite produces distinct variant antigens that are presented at the surface of infected erythrocytes. These antigens are significant to malaria pathogenesis because they can allow infected erythrocytes to avoid detection by the host immune system, which also makes them a potential target for vaccine development. Because SICA (schizont-infected cell agglutination) antigens play an important role in the virulence of P. knowlesi, studying the conditions under which the SICAvar genes are expressed can give a clearer picture of how variant antigens contribute to malaria pathogenesis. We are especially interested in how P. knowlesi establishes and expresses five different cloned phenotypes depending on host conditions, such as the absence of a spleen in the host (SICA(-) parasites), or reinfection of a host previously cured of an infection by P. knowlesi A or B clones (B and C cloned phenotypes, respectively). To investigate this, we first use long reads from PacBio next-generation sequencing to confirm high coverage of the recently generated P. knowlesi B and C genome assemblies. With the completed P. knowlesi A, B, and C genomes, we then compare RNA-Seq data among the A+, B+, and C+ clones, as well as between the A+ and A- clones, to determine whether there are significant, large-scale differences in gene expression, perhaps due to switching at the genomic level, that could result in the expression of these different protein repertoires. Finally, we apply several machine learning techniques to the RNA-Seq data and explore how they can be used to detect additional patterns of association between parasite genes. Building upon our recent efforts in developing the first PacBio-based Plasmodium genome sequence and studying P. knowlesi gene expression through transcriptomic analysis, we use the established P. knowlesi model system to gain novel insights into the underlying causes of antigen variability and virulence. By better understanding how the parasite adapts to specific host environments, we can contribute to the development of more effective control measures and the eventual eradication of malaria.

Table of Contents

Chapter 1: Introduction

1.1 Malaria: The disease

1.2 Malaria biology

1.3 Intraerythrocytic cycle

1.4 Symptoms, Treatment, Vaccine

1.5 Parasitic antigens

1.6 Plasmodium knowlesi

1.7 SICA variant antigens

1.8 Systems Biology/Omics

1.9 Genomics/Next Generation Sequencing

1.10 Transcriptomics/RNA-Seq

Chapter 2: Methods

2.1 Genome Sequence Analysis

2.2 RNA-Seq

2.3 Quality Assessment

2.4 Splice Transcript Aligned to a Reference (STAR)

2.5 High-Throughput Sequencing (HT-Seq)

2.6 Normalization

2.7 Preliminary Classification

2.8 Clustering

2.9 Hierarchical Clustering

2.10 K-means Clustering

2.11 Self-Organizing Map (SOM)

2.12 Plotting Clusters

2.13 Consensus Clustering

2.14 Second stage of clustering

2.15 Gene Pathway/PlasmoDB Analysis

2.16 Machine Learning Approach

2.17 Linear Regression

2.18 Neural Network

Chapter 3: Results

3.1 Genomics results

3.2 RNA-Seq analysis results

3.3 Machine learning results

Chapter 4: Discussion

4.1 Genomics Analysis Discussion

4.2 RNA-Seq Analysis Discussion

4.3 Consensus Clustering Discussion

4.4 Machine Learning Discussion

Chapter 5: Conclusion

References

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English