Cell type identification in single-cell genomics and its applications

Ma, Wenjing (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/t722hb31f?locale=en


Advances in techniques for measuring genomic features at single-cell resolution provide great opportunities to uncover cellular heterogeneity. Initiated by the introduction of single-cell RNA sequencing (scRNA-seq), which measures transcriptomic information, single-cell techniques have expanded to encompass epigenomic modalities as well. Among the scientific goals of single-cell genomics studies, precise cell type identification (celltyping) is a fundamental and crucial step in analyzing the data. Supervised celltyping methods have become increasingly popular due to their superior accuracy, robustness, and efficiency. In this dissertation, we primarily focus on the development and application of supervised celltyping methods.

The dissertation starts by evaluating key factors that affect supervised celltyping methods developed for scRNA-seq data. After performing extensive real data analyses, we suggest combining all individuals from available datasets to construct the reference dataset and using the multi-layer perceptron (MLP) as the classifier, with the F-test as the feature selection method. This benchmark study not only offers valuable insights and suggestions for method developers but also lays the groundwork for our subsequent research endeavors.
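As an illustration of the feature selection step only, the sketch below ranks genes by a one-way ANOVA F statistic across cell type labels and keeps the top-scoring ones. It is a minimal example under stated assumptions, not the dissertation's implementation: the function names, the toy expression values, the gene names, and the choice of `top_k` are all hypothetical, and the MLP classifier that would be trained on the selected genes is omitted.

```python
from statistics import mean


def f_statistic(values, labels):
    """One-way ANOVA F statistic for one feature across labeled groups.

    Assumes at least two groups and more cells than groups.
    """
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        # No within-group variation: perfectly separating or constant feature.
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_top_features(matrix, labels, feature_names, top_k):
    """Rank features (columns) by F statistic and return the top_k names."""
    scores = {
        name: f_statistic([row[j] for row in matrix], labels)
        for j, name in enumerate(feature_names)
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


# Toy reference: rows are cells, columns are hypothetical genes.
genes = ["geneA", "geneB", "geneC"]
expression = [
    [5.0, 1.0, 2.0],
    [6.0, 1.2, 2.1],
    [0.5, 1.1, 2.0],
    [0.4, 0.9, 2.2],
]
cell_types = ["T cell", "T cell", "B cell", "B cell"]

selected = select_top_features(expression, cell_types, genes, top_k=1)
print(selected)
```

In this toy reference only the first gene differs clearly between the two labeled groups, so it receives by far the largest F statistic. In practice the reference would pool cells from all available individuals, as suggested above, before scoring genes.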

We then developed a novel computational method with open-source software called Cellcano, which is specifically designed for single-cell chromatin accessibility profiling (scATAC-seq) data. Cellcano is based on a two-round supervised learning algorithm and provides significantly improved accuracy, robustness, and computational efficiency compared to existing tools. We have also explored using scRNA-seq data as references to perform supervised celltyping and data integration for scATAC-seq data.

Upon accurate identification of distinct cell types, specific markers unique to each cell type can be extracted to enable diverse applications and downstream analyses. Based on cell-type-specific marker genes, we developed a method named LRcell to identify cellular activities associated with psychiatric disorders.

The computational and statistical methods employed in this dissertation are designed to provide a comprehensive understanding of cell-type specificity. We anticipate that this research will contribute to the understanding of cellular functions in biological mechanisms and disease progression, potentially providing valuable insights for biomedical researchers.

Table of Contents

This table of contents is under embargo until 22 May 2025

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English