Transmission Bottleneck Size Estimation from De Novo Viral Genetic Variation Open Access
Shi, Yike (Spring 2024)
Abstract
Sequencing of viral infections has become increasingly common over the last decade. Deep sequencing data in particular have proven useful in characterizing the roles that genetic drift and natural selection play in shaping within-host viral populations. They have also been used to estimate transmission bottleneck sizes from identified donor–recipient pairs. These bottleneck sizes quantify the number of viral particles that establish genetic lineages in the recipient host and are important to estimate because of their impact on viral evolution. Current approaches for estimating bottleneck sizes consider only the subset of viral sites that are observed as polymorphic in the donor individual. However, these approaches have the potential to substantially underestimate true transmission bottleneck sizes. Here, we present a new statistical approach that instead estimates bottleneck sizes using patterns of viral genetic variation that arise de novo within a recipient individual. Specifically, our approach makes use of the number of clonal viral variants observed in a transmission pair, defined as the number of viral sites that are monomorphic in both the donor and the recipient but carry different alleles. We first test our approach on a simulated dataset and then apply it to both influenza A virus sequence data and SARS-CoV-2 sequence data from identified transmission pairs. Our results confirm the existence of extremely tight transmission bottlenecks for these two respiratory viruses.
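The definition of a clonal variant given above can be illustrated with a minimal sketch. It is not the thesis's implementation. It assumes each site is biallelic and summarized by the frequency of one variant allele in the donor and in the recipient, and it treats a site as monomorphic when that frequency lies within a hypothetical calling threshold of 0 or 1. The function names and the 3% threshold are illustrative assumptions only.

```python
from typing import Optional, Sequence


def fixed_allele(freq: float, threshold: float) -> Optional[int]:
    """Return 0 or 1 if the site is called monomorphic, else None.

    Assumption: frequencies within `threshold` of 0 or 1 are treated
    as fixed; anything in between is called polymorphic.
    """
    if freq <= threshold:
        return 0
    if freq >= 1.0 - threshold:
        return 1
    return None


def count_clonal_variants(
    donor_freqs: Sequence[float],
    recipient_freqs: Sequence[float],
    threshold: float = 0.03,  # hypothetical variant-calling threshold
) -> int:
    """Count sites monomorphic in both hosts but carrying different alleles."""
    if len(donor_freqs) != len(recipient_freqs):
        raise ValueError("donor and recipient must cover the same sites")
    count = 0
    for d, r in zip(donor_freqs, recipient_freqs):
        donor_allele = fixed_allele(d, threshold)
        recipient_allele = fixed_allele(r, threshold)
        # A clonal variant needs both hosts fixed, for different alleles.
        if (
            donor_allele is not None
            and recipient_allele is not None
            and donor_allele != recipient_allele
        ):
            count += 1
    return count


# Toy frequencies for five sites (made up for illustration, not real data).
donor = [0.0, 0.0, 0.40, 1.0, 0.0]
recipient = [0.0, 1.0, 1.0, 1.0, 0.25]
print(count_clonal_variants(donor, recipient))
```

In this toy input only the second site counts: the third is polymorphic in the donor and the fifth is polymorphic in the recipient, so neither meets the definition.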
Table of Contents
Introduction
Methods
The Stochastic Within-Host Model
Derivation of the Probability Distribution for the Number of Clonal Variants
Results
Application to Simulated Data
Application to Empirical Data
Application to IAV
Application to SARS-CoV-2
Guarding Against the Erroneous Calling of Clonal Variants
Considering Alternative Distributions for the Initial Number of Viral Particles That Start an Infection
Discussion
Supplemental Material
Derivation of the probability distribution for the number of clonal variants
Rederivation of the Bozic et al. (2016) equation for the mean number of clonal variants
Calculation of the mean transmission bottleneck size 𝐍𝐛
Quantification of the number of clonal variants for the influenza A virus data set
Quantification of the number of clonal variants for the SARS-CoV-2 data set
Probability of a donor iSNV transmitting and fixing in a recipient
Supplemental Tables
Supplemental Figures
References
About this Honors Thesis
Primary PDF
Title | Date Uploaded |
---|---|
Transmission Bottleneck Size Estimation from De Novo Viral Genetic Variation | 2024-04-03 17:34:53 -0400 |