A new method of network-guided dimension reduction Open Access

Hu, Jiani (2016)

Permanent URL: https://etd.library.emory.edu/concern/etds/bg257f58h?locale=en
Published

Abstract

High-throughput technologies have shifted the focus of research from single-gene and single-protein studies to the genome scale. Given the size and complexity of high-throughput data, dimension reduction is often used to simplify and visualize the data. However, one obstacle to effective dimension reduction of a complex gene expression matrix is the loss of true biological information caused by pervasive correlation and by interference from high measurement noise. To address these issues, we incorporated existing knowledge, as represented by known biological networks, by developing a new network-guided dimension reduction method. The effectiveness of this method was tested on both simulated and real gene expression data. The simulation results show that the method has high power to detect the major signals in a large-scale network. The results from the real data analysis show that the first few dimensions found by the method are dominated by meaningful biological signals. Network-guided dimension reduction is an effective method for capturing the main signals contained in a large data matrix.

By Jiani Hu

Table of Contents

1 Introduction
2 Methods
2.1 Scale-free gene network simulation
2.2 Gene expression simulation
Table 1 Parameters in simulations
2.3 Network Guided Dimension Reduction
2.4 Sparse Principal component analysis (SPCA) and canonical correlation calculation
2.5 Test on yeast cycle data
3 Results
3.1 Simulation results
3.2 Testing the method on yeast cycle data
4 Discussion
References
5 Appendices
A Figures & Tables
Figure 1 Detected Correlation of hub one
Figure 2 Detected Correlation of hub two
Figure 3 Factor score of real yeast cell cycle data captured by the first PC
Figure 4 Factor score of real yeast cell cycle data captured by the second PC
Table 2 Gene Ontology Classification result of Gene signal captured by the second PC
Figure 5 Factor score of real yeast cell cycle data captured by the third PC
Table 3 Gene Ontology Classification result of Gene signal captured by the third PC
B R Code

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
