The Effect of Unclassified Taxa on the UniFrac Distance Measurement

Chen, Jessica Lawanna (2016)

Permanent URL: https://etd.library.emory.edu/concern/etds/0z708w992?locale=en
Published

Abstract

The microbes in our microbiome can both benefit and harm the human host. Researchers are still working out which combinations of microbes are beneficial to humans, with the aim of preventing or fighting disease. The UniFrac measure is a distance between two samples that shows how similar the samples are biologically. Many Operational Taxonomic Units (OTUs) have unclassified taxa, and researchers usually either keep these OTUs or delete them from analyses. This study analyzes the weighted UniFrac distance when the counts for OTUs with unclassified taxa are imputed onto a known OTU. The distances calculated after imputation are compared with those calculated from the original OTUs and with those calculated after OTUs with unknown taxa are removed. We find that, on average, deleting OTUs with unknown taxa yields smaller UniFrac distances than the original OTUs, whereas imputation yields greater UniFrac distances than the original OTUs.
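The three treatments compared in the abstract can be illustrated with a minimal Python sketch. It assumes the raw (unnormalized) weighted UniFrac, i.e. the sum over branches of branch length times the absolute difference in the proportion of each sample's reads descending from that branch. The tree, the OTU names, the counts, and the choice of `otu3` as the known OTU that receives the imputed counts are all hypothetical; the thesis's own imputation rule and code are not reproduced here, and the toy numbers are not expected to match the average trends reported above.

```python
from collections import defaultdict

# Hypothetical rooted tree: node -> (parent, branch length to parent).
TREE = {
    "n1": ("root", 0.5),
    "n2": ("root", 0.5),
    "otu1": ("n1", 1.0),
    "otu2": ("n1", 1.0),
    "otu3": ("n2", 1.0),
    "otu_u": ("n2", 1.0),  # stands in for an OTU with unclassified taxa
}


def branch_proportions(tree, counts):
    """Fraction of a sample's reads that descend from each branch."""
    total = sum(counts.values())
    props = defaultdict(float)
    for otu, n in counts.items():
        node = otu
        while node in tree:  # walk up the tree until the root is reached
            props[node] += n / total
            node = tree[node][0]
    return props


def weighted_unifrac(tree, a, b):
    """Raw weighted UniFrac: sum of length * |p_a - p_b| over all branches."""
    pa = branch_proportions(tree, a)
    pb = branch_proportions(tree, b)
    return sum(length * abs(pa[node] - pb[node])
               for node, (_, length) in tree.items())


def delete_otu(counts, otu):
    """Deletion: drop the OTU and its reads from the sample."""
    return {k: v for k, v in counts.items() if k != otu}


def impute_otu(counts, otu, target):
    """Imputation (assumed rule): move the OTU's reads onto a known OTU."""
    out = delete_otu(counts, otu)
    out[target] = out.get(target, 0) + counts.get(otu, 0)
    return out


# Arbitrary toy counts for two samples.
sample_a = {"otu1": 10, "otu2": 0, "otu3": 2, "otu_u": 8}
sample_b = {"otu1": 2, "otu2": 8, "otu3": 8, "otu_u": 2}

d_original = weighted_unifrac(TREE, sample_a, sample_b)
d_deleted = weighted_unifrac(TREE, delete_otu(sample_a, "otu_u"),
                             delete_otu(sample_b, "otu_u"))
d_imputed = weighted_unifrac(TREE, impute_otu(sample_a, "otu_u", "otu3"),
                             impute_otu(sample_b, "otu_u", "otu3"))
print(d_original, d_deleted, d_imputed)
```

Deletion changes each sample's total read count, so every remaining proportion shifts, while imputation keeps the totals fixed and only relocates reads on the tree. This is one reason the two treatments can move the distance in different directions.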

Table of Contents

List of Figures
List of Tables
1 Introduction
1.1 Human Microbiome
1.2 Measuring Biological Diversity
1.2.1 α-diversity
1.2.2 β-diversity
1.3 Human Microbiome Project
1.4 Unclassified Taxa
1.5 Purpose
1.6 Definitions
2 Methods
2.1 Cleaning Data
2.2 Phylogenetic Tree
2.3 Imputation
2.4 Measuring Weighted UniFrac
3 Results
4 Discussion
5 Bibliography
6 Code

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English