Novel Statistical and Machine Learning Methods with Application to Brain Imaging Data (Public)
Wang, Yikai (Spring 2020)
Abstract
Brain imaging has been a breakthrough technique for understanding the function and organization of the human brain, and it serves as a fundamental basis for neuroscientific research. This dissertation develops statistical and machine learning methods for brain imaging data.
For the first topic, we propose a novel hierarchical independent component modeling framework for longitudinal fMRI studies (L-ICA). Existing ICA methods are applicable only to cross-sectional studies. In this topic, we provide the first formal statistical modeling framework extending ICA to longitudinal studies. By incorporating subject-specific random effects and visit-specific covariate effects, L-ICA provides more accurate estimates of brain networks and borrows information across repeated scans to increase statistical power in detecting covariate effects. We develop a fully tractable EM algorithm and a subspace-based approximate EM algorithm that greatly reduces computation time while retaining high accuracy. Simulation and real data results demonstrate the advantages of L-ICA.
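The two-level structure described above can be written schematically as follows. This is only a notational sketch: the symbols $Y_{ij}$, $A_{ij}$, $s_{ij}$, $s_0$, $b_i$, $\alpha_j$, $\beta_j$, $x_{ij}$, $e_{ij}$ and $\gamma_{ij}$ are assumptions chosen for illustration and are not taken from the dissertation, and all distributional assumptions are omitted.

```latex
% Hypothetical notation: subject i = 1..N, visit j = 1..K, voxel v, q ICs.
\begin{align*}
\text{Level 1:}\quad & Y_{ij}(v) = A_{ij}\, s_{ij}(v) + e_{ij}(v), \\
\text{Level 2:}\quad & s_{ij}(v) = s_0(v) + b_i(v) + \alpha_j(v)
                       + \beta_j(v)^{\top} x_{ij} + \gamma_{ij}(v).
\end{align*}
```

Here $Y_{ij}(v)$ stands for the fMRI signal of subject $i$ at visit $j$, $A_{ij}$ for a subject/visit-specific mixing matrix, $s_0(v)$ for the population-level source signals, $b_i(v)$ for subject-specific random effects, $\alpha_j(v)$ for visit effects, and $\beta_j(v)$ for visit-specific effects of the covariates $x_{ij}$. Under this reading, a test of a covariate effect at one visit concerns $\beta_j(v)$, and a test of a longitudinal change compares $\beta_j(v)$ across visits.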
For the second topic, we propose a novel blind signal separation (BSS) model for decomposing brain connectivity matrices. Existing BSS methods mainly focus on decomposing neural activity signals rather than brain connectivity. In this topic, we propose a low-rank decomposition method with uniform sparsity (LOCUS) for brain network measures. LOCUS adopts a low-rank factorization of each latent source for robust recovery and incorporates a novel penalization approach for sparsity control on the latent sources. We propose a highly efficient algorithm for parameter estimation. Simulation and real data results show that LOCUS provides more reproducible findings than existing approaches.
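The idea of giving each latent connectivity source a low-rank structure can be illustrated with a small NumPy sketch. It assumes, purely for illustration, that a source is a symmetric matrix of the form `X diag(d) X^T` and that each subject's edge vector is a linear mixture of the sources plus noise. The function names, dimensions and noise level are hypothetical, and neither the sparsity penalty nor the estimation algorithm is shown.

```python
import numpy as np

def low_rank_source(X, d):
    """Build a symmetric p x p source S = X diag(d) X^T from a p x R factor X."""
    return (X * d) @ X.T

def upper_triangle(S):
    """Vectorize the strictly upper-triangular entries (the unique edges)."""
    i, j = np.triu_indices(S.shape[0], k=1)
    return S[i, j]

rng = np.random.default_rng(0)
# Hypothetical sizes: nodes, rank per source, number of sources, subjects.
p, R, q, n = 10, 2, 3, 20

sources = []
for _ in range(q):
    X = rng.normal(size=(p, R))   # node-level factor of one source
    d = rng.normal(size=R)        # signed weights of the R components
    sources.append(low_rank_source(X, d))

S = np.stack([upper_triangle(s) for s in sources])   # q x p(p-1)/2 source matrix
A = rng.normal(size=(n, q))                          # subject-specific mixing weights
Y = A @ S + 0.1 * rng.normal(size=(n, S.shape[1]))   # simulated observed edge vectors
```

Each source then has `p * R + R` free parameters instead of `p * (p - 1) / 2`, which is the kind of reduction a low-rank factorization is meant to provide.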
For the third topic, we propose a novel deep learning (DL) framework for brain network data. DL methods are often criticized for low interpretability and instability. By incorporating existing brain subnetwork structure, we propose a DL framework with an adaptively shaped graph convolutional layer (DLconv) for brain network data. Specifically, the shape of each convolutional filter is determined by a brain subnetwork, and subnetwork-level features are propagated separately until the final layer. Exploiting this inherent structure in DLconv, we propose a robust training procedure that updates the subnetwork-specific parameters in parallel. Real data studies demonstrate the advantages of DLconv.
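The description above, where filters are shaped by subnetworks and subnetwork-level features are kept separate until the final layer, can be sketched as a forward pass in NumPy. This is not the DLconv implementation: the masked linear filter, the `tanh` activations, the function names and all dimensions are assumptions made for illustration, and no training procedure is shown.

```python
import numpy as np

def masked_filter_features(conn, masks, filters):
    """
    conn:    (p, p) connectivity matrix for one subject
    masks:   (m, p, p) binary subnetwork masks
    filters: (m, f, p, p) weights, f filters per subnetwork
    returns: (m, f) features; each filter only sees edges inside its mask
    """
    masked_weights = filters * masks[:, None, :, :]
    return np.tanh(np.einsum("mfij,ij->mf", masked_weights, conn))

def forward(conn, masks, filters, score_w, final_w):
    """Combine filters within each subnetwork, then combine subnetworks."""
    feats = masked_filter_features(conn, masks, filters)   # (m, f)
    scores = np.tanh((feats * score_w).sum(axis=1))        # (m,) one score per subnetwork
    logit = float(scores @ final_w)                        # final layer across subnetworks
    prob = 1.0 / (1.0 + np.exp(-logit))
    return prob, scores

rng = np.random.default_rng(0)
p, m, f = 6, 2, 5                      # hypothetical: nodes, subnetworks, filters each
masks = np.zeros((m, p, p))
masks[0, :3, :3] = 1.0                 # subnetwork 1: nodes 0-2
masks[1, 3:, 3:] = 1.0                 # subnetwork 2: nodes 3-5
filters = rng.normal(size=(m, f, p, p))
score_w = rng.normal(size=(m, f))
final_w = rng.normal(size=m)

conn = rng.normal(size=(p, p))
conn = (conn + conn.T) / 2             # symmetric toy connectivity matrix
prob, scores = forward(conn, masks, filters, score_w, final_w)
```

Because each subnetwork has its own filters and score weights, its parameters do not interact with those of other subnetworks before the final layer, which is what makes separate or parallel updates per subnetwork conceivable in a design of this kind.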
Table of Contents
2.1 Schematic illustration of the hierarchical modeling framework of L-ICA. (A) The first-level model of L-ICA with N subjects and K visits, where each subject/visit-specific fMRI dataset is decomposed into q subject/visit-specific ICs; here q = 2 for illustration purposes. (B) The second-level model of L-ICA for one specific IC, where the subject/visit-specific ICs are modeled in terms of population-level source signals, subject-specific random effects, visit effects, and visit-specific covariate effects.
2.2 Comparison between the proposed L-ICA and the TC-GICA based approach for estimating the population-level IC maps at baseline and the last visit (N = 20, low subject/visit-specific random variability): (A) truth, (B) L-ICA estimates, and (C) estimates from TC-GICA. Column (i) shows the IC maps at baseline; column (ii) shows the IC maps at the last visit; column (iii) shows the longitudinal trends for activated voxels (where each line represents a voxel) in the first IC (IC1). The results show that L-ICA provides more accurate estimates than TC-GICA at each visit and more precisely captures the voxel-specific longitudinal trends.
2.3 Simulation results for testing covariate effects based on 1000 runs with sample size N = 40 using the proposed L-ICA method (red) and the TC-GICA based method (blue). We considered two types of hypothesis tests: testing the time-specific covariate effect at a given visit (the 2nd visit), i.e. $H_0: \beta_2(v) = 0$ (the left column), and testing the time-varying longitudinal covariate effects between the 1st and 2nd visits, i.e. $H_0: \beta_1(v) = \beta_2(v)$ (the right column). Panels (A) and (B) present the type I error rates and the statistical power, respectively. The results show that the L-ICA method has lower type I error and higher statistical power than the TC-GICA based method.
2.4 L-ICA estimates of subpopulation spatial source signal maps for the DMN for the four disease groups across the visits, evaluated at the mean baseline age (73.7 years) and averaged across genders. All IC maps are thresholded based on the source signal intensity level.
2.5 L-ICA estimates of subpopulation spatial source signal maps for the medial visual network for the four disease groups across the visits, evaluated at the mean baseline age (73.7 years) and averaged across genders. All IC maps are thresholded based on the source signal intensity level.
2.6 L-ICA estimates of subpopulation spatial source signal maps for the occipital visual network for the four disease groups across the visits, evaluated at the mean baseline age (73.7 years) and averaged across genders. All IC maps are thresholded based on the source signal intensity level.
2.7 L-ICA estimates of subpopulation spatial source signal maps for the FPL for the four disease groups across the visits, evaluated at the mean baseline age (73.7 years) and averaged across genders. All IC maps are thresholded based on the source signal intensity level.
2.8 L-ICA estimates of longitudinal trends for voxels in the DMN for each disease group in the ADNI2 study. The results show that AD and late MCI (LMCI) patients generally show more change across visits and that the AD group has higher within-network variation than the other disease groups at each visit.
2.9 L-ICA estimates of longitudinal trends for voxels in the FPL and visual networks for each disease group in the ADNI2 study. The results show that AD and LMCI patients generally show more change across visits and that the AD group has higher within-network variation than the other disease groups at each visit.
2.10 p-values for testing group differences in the DMN between AD and CN subjects at each visit. The first row shows the test results based on L-ICA and the second row shows the results from the TC-GICA based approach.
2.11 p-values, thresholded at 0.05, for testing group differences in the DMN between EMCI and LMCI subjects at each visit. L-ICA finds between-group differences in the DMN at each visit, while TC-GICA detects few group differences.
2.12 Longitudinal changes between baseline and later visits in the DMN within the AD, LMCI, EMCI and CN groups. The first column shows the comparison of year 1 versus baseline and the second column shows the comparison of year 2 versus baseline, where the values represent the longitudinal differences in source signal intensity for DMN voxels.
2.13 p-values, thresholded at 0.05, for longitudinal changes between baseline and year 2 for the default mode network (DMN) in the AD group. L-ICA finds longitudinal changes in major regions of the DMN among AD patients, while TC-GICA detects few changes in the DMN among these patients.
3.1 Preliminary findings based on the PNC study: (A) and (B) show the averaged covariance matrix and the averaged Pearson correlation matrix, respectively, across 515 healthy subjects in the PNC study.
3.2 Four estimated latent sources based on connICA and the PNC study, where each source is further thresholded to ensure sparsity.
3.3 Illustration of the Node Moving algorithm up to the 10th iteration for the LOCUS method, based on a simulated dataset from setting 1 with a medium variance level and 100 samples. The algorithm starts with a noisy estimate in which the pattern is barely visible; after 10 iterations the $X_l(v)$'s are grouped into several clusters that become orthogonal to each other, resulting in sparse and low-rank latent sources.
3.4 Generated underlying true source signals for the two settings in the simulation study.
3.5 Estimated latent signals of 4 randomly selected simulation runs in setting 1 across all methods. The first row of each panel is a direct visualization of the estimated latent signal, and the second row of each panel is the trace plot of the estimated latent signal.
3.6 Simulation results for the latent sources, comparing LOCUS with other methods across 100 simulation runs under the first setting. The first row shows the averaged Pearson correlation between true and estimated latent sources. The second row shows the standard deviation of the Pearson correlation between true and estimated latent sources on a log scale.
3.7 Simulation results for the latent sources, comparing LOCUS with other methods across 100 simulation runs under the second setting. The first row shows the averaged Pearson correlation between true and estimated latent sources. The second row shows the standard deviation of the Pearson correlation between true and estimated latent sources on a log scale.
3.8 Simulation results on the reproducibility of the estimated latent sources, comparing LOCUS with other methods across 100 simulation runs under the first setting. The first row shows the averaged adjusted Pearson correlation between true and estimated latent sources. The second row shows the averaged adjusted Jaccard index between true and estimated latent sources.
3.9 Simulation results on the reproducibility of the estimated latent sources, comparing LOCUS with other methods across 100 simulation runs under the second setting. The first row shows the averaged adjusted Pearson correlation between true and estimated latent sources. The second row shows the averaged adjusted Jaccard index between true and estimated latent sources.
3.10 Estimated latent signals of 4 randomly selected simulation runs in setting 2 across all methods. The first row of each panel is a direct visualization of the estimated latent signal, and the second row of each panel is the trace plot of the estimated latent signal.
3.11 Heatmaps of six matched latent sources from LOCUS and connICA with high reproducibility; the six latent sources estimated by LOCUS have a Pearson-based reproducibility higher than 0.7.
3.12 Reproducibility analysis for 18 matched latent sources from LOCUS and connICA. The left panel is based on Pearson's correlation and the right panel on the Jaccard index. For the matched latent sources, LOCUS tends to have higher reproducibility than connICA.
3.13 Intensity plots of six matched latent sources from LOCUS and connICA with high reproducibility; the six latent sources estimated by LOCUS have a Pearson-based reproducibility higher than 0.7.
3.14 Visualization, using BrainNetViewer, of the top 1% of brain connections in the 6 matched latent signals estimated by LOCUS.
3.15 Comparison between LOCUS and connICA. We selected the three most correlated latent sources from the two methods and show the differences between them. The first row shows scatter plots of the intensities from LOCUS and connICA with a threshold at 0.08, where blue dots represent the edges that are significant in connICA but not in LOCUS. In the second row, those edges (significant in connICA but not in LOCUS) are visualized as heatmaps.
3.16 Two latent sources estimated by LOCUS that are not identified by connICA. These two latent sources have relatively high reproducibility and are significantly associated with subjects' clinical outcomes, i.e. gender and age.
3.17 Visualization of the top 1% of brain connections in the 2 latent signals estimated by LOCUS that are not identified by connICA.
4.1 Visualization, using BrainNetViewer, of some highly reproducible brain functional subnetworks derived from the PNC study in Chapter 3.
4.2 Heatmaps of some highly reproducible binary brain functional subnetwork masks derived from the PNC study in Chapter 3.
4.3 A visualization of the DLconv modeling framework for brain network data analysis. This DLconv model contains an Mconv layer with 5 filters for each subnetwork, a Mask2Score framework with 1 layer combining the output from Mconv into subnetwork-specific output, and a final layer combining the information from all subnetworks into the final output.
4.4 Model performance stability analysis across 50 initializations for DLconv, FullNN, and Mconv + FullNN. The solid line represents the average and the shaded area represents the 95% quantile.
4.5 Boxplot of subnetwork-specific weights in the last layer of DLconv across 50 bootstrap runs for two training strategies.
4.6 Boxplot of subnetwork-specific AUC on the testing dataset across 50 bootstrap runs of the DLconv model trained with the SepIC strategy.
4.7 The 5 most predictive functional brain subnetworks for gender difference based on DLconv trained via the SepIC algorithm. Subnetworks are selected based on average weight or AUC across 50 bootstrap runs, and the visualized subnetwork-specific filters are the ones with the largest weight in the Mask2Score layer of the best-performing model across the 50 runs.
About this Dissertation
School | |
---|---|
Department | |
Degree | |
Submission | |
Language | |
Research Field | |
Keywords | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Title | Date Uploaded |
---|---|
Novel Statistical and Machine Learning Methods with Application to Brain Imaging Data | 2020-03-26 00:41:57 -0400 |