Developing enrichment analyzing methods at sub-cell type level to generate novel insights on complex disease pathogenesis

Wang, Boqi (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/8c97kr82w?locale=en

Abstract

Advances in biotechnology and the resulting high-throughput experiments have produced an enormous amount of biomedical data, creating an urgent need for methods that can make use of it. In particular, bioinformatics tools that perform gene enrichment analysis at the sub-cell type level in complex diseases and traits are needed to elucidate disease etiology and to develop new treatment strategies.

In this study, we addressed this problem using recently available single-cell gene expression data and developed two approaches to accurately identify the cell types affected in specific diseases. The first approach builds logistic regression models using cell type-specific marker genes; the second uses expression quantitative trait loci (eQTL) colocalization and target gene read proportions in single-nucleus RNA sequencing (snRNA-seq) data. Cell types are ranked by the significance of their association with each disease. The central hypothesis is that most disease-associated genes are expressed preferentially in affected cell types. Both methods draw on single-cell gene expression data from hECA and NCBI GEO, together with other large biomedical datasets such as eQTLs from GTEx and disease-associated genes from DisGeNET.
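To make the first approach concrete, the following is a minimal sketch of ranking cell types by a logistic regression of disease-gene status on marker-gene membership. It is an illustration under stated assumptions, not the thesis's implementation: the abstract does not specify the model, so the single binary predictor, the one-sided Wald test, the toy gene names, the cell type labels, and the function names `fit_logistic` and `rank_cell_types` are all hypothetical.

```python
import math

def fit_logistic(x, y, n_iter=25):
    """Fit y ~ intercept + slope * x by Newton-Raphson.

    Returns (slope, Wald z-statistic of the slope). Assumes the data are
    not perfectly separated (all four x/y combinations occur).
    """
    b0 = b1 = 0.0
    for _ in range(n_iter):
        # Gradient (g) and negative Hessian (h) of the log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta += H^{-1} g for the 2x2 system
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se = math.sqrt(h00 / det)  # standard error of the slope
    return b1, b1 / se

def rank_cell_types(genes, disease_genes, markers):
    """Rank cell types by one-sided p-value for marker/disease-gene association."""
    y = [1 if g in disease_genes else 0 for g in genes]
    results = []
    for cell_type, marker_set in markers.items():
        x = [1 if g in marker_set else 0 for g in genes]
        slope, z = fit_logistic(x, y)
        # One-sided p-value for enrichment (slope > 0), normal approximation
        p_value = 0.5 * math.erfc(z / math.sqrt(2))
        results.append((cell_type, slope, p_value))
    return sorted(results, key=lambda r: r[2])

# Hypothetical toy data: 12 genes, 4 of them disease-associated
genes = [f"G{i}" for i in range(1, 13)]
disease_genes = {"G1", "G2", "G3", "G4"}
markers = {
    "type_A": {"G1", "G2", "G3", "G5"},  # markers overlap disease genes
    "type_B": {"G4", "G6", "G7", "G8"},  # markers mostly unrelated
}

ranked = rank_cell_types(genes, disease_genes, markers)
for cell_type, slope, p_value in ranked:
    print(f"{cell_type}: slope={slope:.2f}, one-sided p={p_value:.3f}")
```

In this toy setup, the cell type whose markers overlap the disease genes ranks first. A real analysis would need far larger gene sets, covariates such as expression level, and correction for multiple testing across cell types and diseases.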

Our approaches produced significantly more accurate results. Cell type-disease associations were identified for 916 diseases and traits, and some of them suggest potential explanations for disease pathogenesis. The results were highly consistent with previous findings. Overall, our methods show considerable potential for uncovering novel pathogenesis mechanisms of complex diseases. In-depth analysis and experimental validation are still required to fully understand the discovered tissue-trait associations and their enriched genes.

Table of Contents

Project 1: Loci2path

Project 2: Loci2tissue

Project 3: Single-cell approaches

Materials & Methods

Results

Discussion

Supplementary Materials

References

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English