Prediction Approaches for High-dimensional and Complex Neuroimaging Data

Ma, Xin (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/pv63g154c
Published

Abstract

Neuroimaging studies continue to scale up, with more participants, multiple follow-up visits, and higher scanning resolutions. High dimensionality, spatial structure, and low signal-to-noise ratios make neuroimaging data challenging to work with, requiring the development of novel and flexible methodology for prediction and feature selection.

In topic 1, we develop a novel two-stage Bayesian regression framework that uses functional connectivity networks as covariates and a continuous scalar outcome. The approach first finds a lower-dimensional node-specific representation of the networks, then embeds these representations in a flexible Gaussian process regression framework with node selection via a spike-and-slab prior. Extensive simulations and a real data application show distinct advantages of the proposed approach in prediction, coverage, and node selection. To our knowledge, the proposed approach is one of the first nonlinear semi-parametric Bayesian regression models based on high-dimensional functional connectivity features.
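The two-stage structure can be illustrated in a few lines: stage 1 builds a low-dimensional representation of each node's connectivity profile, and stage 2 feeds the concatenated embeddings into a flexible nonparametric regressor. The sketch below is a simplified stand-in, not the dissertation's model: it uses PCA embeddings and an off-the-shelf Gaussian process in place of the Bayesian formulation with spike-and-slab node selection, and all data are simulated.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
n, V, k = 60, 10, 2          # subjects, network nodes, latent dims per node

# Simulate symmetric connectivity matrices and a continuous outcome
A = rng.normal(size=(n, V, V))
A = (A + A.transpose(0, 2, 1)) / 2
y = A[:, 0, 1] + 0.5 * A[:, 2, 3] + 0.1 * rng.normal(size=n)

# Stage 1: node-specific low-dimensional representations
# (PCA across subjects on each node's connectivity profile)
Z = np.concatenate(
    [PCA(n_components=k).fit_transform(A[:, v, :]) for v in range(V)],
    axis=1,
)                             # n x (V * k) feature matrix

# Stage 2: flexible Gaussian process regression on the embeddings
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(Z, y)
print("in-sample R^2:", round(gp.score(Z, y), 3))
```

In the dissertation's framework, the stage-2 regression is Bayesian and node relevance is learned via the spike-and-slab prior; here the GP merely demonstrates how a nonlinear regressor consumes the stage-1 embeddings.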

In topic 2, we propose a novel joint scalar-on-image regression framework that uses wavelet-based image representations with grouped penalties to pool information across inter-related images for joint learning. We explicitly account for noise in the images via a corrected objective function. We derive non-asymptotic statistical error bounds under the grouped penalties that allow the number of voxels to grow exponentially with the sample size. A projected gradient descent algorithm is used for computation and is shown to approximate the optimal solution, with non-asymptotic optimization error bounds under noisy images. Extensive simulations and an application to an Alzheimer's disease study demonstrate significantly improved predictability and greater power to detect signals.
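The corrected-objective idea for quadratic loss can be sketched concretely: when the observed predictors W are the true predictors plus additive noise with known variance, the Gram matrix W'W/n is replaced by the unbiased surrogate W'W/n - sigma_w^2 I, and the resulting (possibly non-convex) objective is minimized over an l1 ball by projected gradient descent. The sketch below is a minimal single-task stand-in under simulated data with known noise variance; the dissertation's estimator is multi-task, wavelet-based, and uses grouped penalties. The oracle l1 radius used here would be tuned in practice.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, s = 200, 50, 5
sigma_w = 0.3                            # known measurement-error sd

beta_true = np.zeros(p)
beta_true[:s] = 1.0
X = rng.normal(size=(n, p))              # true (unobserved) predictors
y = X @ beta_true + 0.1 * rng.normal(size=n)
W = X + sigma_w * rng.normal(size=(n, p))  # noisy observed predictors

# Corrected (unbiased) surrogates for X'X/n and X'y/n
Gamma = W.T @ W / n - sigma_w**2 * np.eye(p)
gamma = W.T @ y / n

def project_l1(v, radius):
    """Euclidean projection onto the l1-ball of the given radius."""
    if np.abs(v).sum() <= radius:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u * np.arange(1, len(v) + 1) > css - radius)[0][-1]
    theta = (css[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

# Projected gradient descent on the corrected quadratic objective
beta = np.zeros(p)
step, radius = 0.2, np.abs(beta_true).sum()   # oracle radius, for illustration
for _ in range(300):
    beta = project_l1(beta - step * (Gamma @ beta - gamma), radius)

print("estimation error:", round(float(np.linalg.norm(beta - beta_true)), 3))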

In topic 3, we generalize the idea of topic 2 to Lipschitz-continuous loss functions, including the logistic, hinge, and quantile regression losses. We propose a unified sparse learning framework for the high-dimensional setting with a built-in strategy for handling measurement errors. Unlike the corrected objective function used for linear models in topic 2, we find a sparse estimator within a confidence set defined by the gradient of the empirical loss function. We derive non-asymptotic statistical error bounds and sign consistency results for the proposed estimator. We develop a Newton-Raphson type algorithm based on linear programming and conduct extensive numerical experiments illustrating the superior performance of the proposed estimator in various settings.
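The confidence-set construction has a characteristic linear-programming structure: minimize the l1 norm of the coefficients subject to an l-infinity bound on the (estimated) gradient of the empirical loss. The sketch below illustrates only this LP structure in its simplest instance, a quadratic-loss analogue (a Dantzig-selector-type program) on simulated noiseless data; for general Lipschitz losses and noisy predictors the dissertation's algorithm iterates such programs with Newton-Raphson-type local approximations.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(2)
n, p, s = 100, 20, 3
beta_true = np.zeros(p)
beta_true[:s] = 1.0
X = rng.normal(size=(n, p))
y = X @ beta_true + 0.1 * rng.normal(size=n)

S = X.T @ X / n          # gradient of quadratic loss is S @ b - c0
c0 = X.T @ y / n
lam = 0.1                # radius of the gradient confidence set

# min ||b||_1  s.t.  ||S b - c0||_inf <= lam
# Split b = u - v with u, v >= 0 so the objective is linear.
A_ub = np.block([[S, -S], [-S, S]])
b_ub = np.concatenate([c0 + lam, lam - c0])
res = linprog(c=np.ones(2 * p), A_ub=A_ub, b_ub=b_ub,
              bounds=[(0, None)] * (2 * p), method="highs")
beta = res.x[:p] - res.x[p:]
print("estimation error:", round(float(np.linalg.norm(beta - beta_true)), 3))
```

Replacing the quadratic-loss gradient with the gradient of a logistic, hinge, or quantile loss (linearized at the current iterate) yields the Newton-Raphson-type iteration described in the abstract; each iteration remains a linear program of this form.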

Table of Contents

1 Introduction
  1.1 Overview
  1.2 Magnetic Resonance Imaging of Brain
  1.3 Motivating Examples
    1.3.1 Grady Trauma Project
    1.3.2 Alzheimer’s Disease Neuroimaging Initiative
  1.4 Prediction Modeling on Neuroimaging Data

2 Semi-parametric Bayes Regression with Network-valued Covariates
  2.1 Introduction
  2.2 Related Literature
  2.3 Proposed Methods
    2.3.1 Model Formulation
    2.3.2 Computation Framework
    2.3.3 Prediction for Testing Samples
    2.3.4 Hyper-parameter Selection
  2.4 Empirical Experiments
    2.4.1 Simulation Studies
    2.4.2 PTSD Data Application
  2.5 Conclusion and Future Direction
  2.6 Appendices

3 Multi-task Learning with High-Dimensional Noisy Images
  3.1 Introduction
  3.2 Multi-task Learning without Measurement Errors
    3.2.1 Weak Oracle Properties under Group Bridge with Uncorrupted Images
    3.2.2 Computation under Group Bridge with Uncorrupted Images
  3.3 Multi-task Learning with Measurement Errors
    3.3.1 Theoretical Properties under Noisy Images
    3.3.2 Case with Unknown Noise Covariance
    3.3.3 Lower- and Upper-RE Conditions
    3.3.4 Computational Algorithms
  3.4 Simulations
    3.4.1 Scenario with Known Noise Covariance
    3.4.2 Scenario with Unknown Noise Covariance
    3.4.3 Additional Simulations with Other Signal Patterns
    3.4.4 Sensitivity Analysis to Noise Covariance Estimation Bias
    3.4.5 Summary of Results
  3.5 Analysis of ADNI Data
    3.5.1 Data Pre-processing
    3.5.2 Analysis Outline
    3.5.3 Results
  3.6 Discussion
  3.7 Appendices
    3.7.1 Discrete Wavelet Transform in 3-D
    3.7.2 KKT Condition
    3.7.3 Proof of Theorem 3.2.1
    3.7.4 Proof of Corollary 3.2.1
    3.7.5 Proof of Lemma 3.3.1
    3.7.6 Proof of Theorem 3.3.1
    3.7.7 Proof of Corollary 3.3.1
    3.7.8 Proof of Theorem 3.3.2
    3.7.9 Proof of Theorem 3.3.3
    3.7.10 Proof of Corollary 3.3.2
    3.7.11 Proof of Lemma 3.3.2

4 A Unified Sparse Learning Framework for Lipschitz Loss Functions
  4.1 Introduction
  4.2 Proposed Method with Lipschitz Losses
    4.2.1 Estimation with Noiseless Predictors
    4.2.2 Estimation with Noisy Predictors
  4.3 Computations
    4.3.1 Computational Algorithms
    4.3.2 Parameter Tuning and Initialization
  4.4 Simulations
  4.5 Real Data Application
  4.6 Appendices
    4.6.1 Proof of Lemma 4.2.1
    4.6.2 Proof of Lemma 4.2.2
    4.6.3 Proof of Lemma 4.2.3
    4.6.4 Proof of Theorem 4.2.1
    4.6.5 Proof of Theorem 4.2.2
    4.6.6 Proof of Lemma 4.2.4
    4.6.7 Proof of Lemma 4.2.5
    4.6.8 Proof of Theorem 4.2.3
    4.6.9 Proof of Theorem 4.2.4

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
