Statistical Methods for Handling Missing Data in Functional Data Analysis Open Access

Zhu, Wanzhe (Spring 2018)

Permanent URL: https://etd.library.emory.edu/concern/etds/5d86p020k?locale=en
Published

Abstract

Statistical analysis of functional data has drawn increasing attention in recent years, yet handling missing data remains a notable obstacle in functional data analysis. This work is motivated by a renal study on the detection of kidney obstruction, in which up to two imaging scans, a baseline scan and a scan after furosemide treatment, are available for each kidney, yielding up to two curves per kidney. In some cases, the kidney is judged to be non-obstructed and the patient does not receive furosemide, so the second scan is missing.

Our first objective is to develop a method that imputes the missing second curve from the first curve, assuming that the first curve is informative about the second (Chapter 2). We model the curves for each individual using a set of candidate basis functions and posit a sparse latent factor model for the basis coefficients, in which a shrinkage prior on the loadings induces basis selection. We employ a Bayesian data augmentation algorithm to simultaneously estimate the model parameters and impute the missing curves. The method is evaluated and compared with existing methods in a simulation study. We illustrate it with the renal study, imputing the second curve for kidneys in which it is missing; the imputed curve can be useful in the interpretation of kidney obstruction.
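As a rough illustration of the imputation step inside a data augmentation scheme, the sketch below draws the basis coefficients of a missing second curve from their conditional normal distribution given the first-curve coefficients. It is a minimal sketch under stated assumptions, not the dissertation's algorithm: the factor loadings and noise variances are treated as known (in a full sampler they would be updated at every iteration under the shrinkage prior), and the function name `impute_second_curve_coefs` and all numerical values are hypothetical.

```python
import numpy as np

def impute_second_curve_coefs(c1, loadings, noise_var, rng):
    """Draw second-curve basis coefficients given first-curve coefficients.

    Assumes the stacked coefficients c = (c1, c2) follow a factor model
    c = loadings @ eta + eps, with eta ~ N(0, I) and eps ~ N(0, diag(noise_var)),
    so that cov(c) = loadings @ loadings.T + diag(noise_var).
    """
    k1 = c1.shape[0]
    cov = loadings @ loadings.T + np.diag(noise_var)
    s11, s12, s22 = cov[:k1, :k1], cov[:k1, k1:], cov[k1:, k1:]
    # w = S21 @ inv(S11); valid because S11 is symmetric
    w = np.linalg.solve(s11, s12).T
    cond_mean = w @ c1                      # E[c2 | c1]
    cond_cov = s22 - w @ s12                # Cov[c2 | c1] (Schur complement)
    cond_cov = (cond_cov + cond_cov.T) / 2  # symmetrize against round-off
    draw = rng.multivariate_normal(cond_mean, cond_cov)
    return cond_mean, cond_cov, draw

# Hypothetical example: 2 coefficients per curve, 1 latent factor.
rng = np.random.default_rng(0)
loadings = np.array([[1.0], [0.5], [0.8], [0.0]])
noise_var = np.full(4, 0.1)
c1 = np.array([1.0, 0.5])
mean, cov, draw = impute_second_curve_coefs(c1, loadings, noise_var, rng)
```

In this toy setup the last coefficient has a zero loading, so the first curve carries no information about it and its conditional mean is zero. This mirrors how shrinking loadings toward zero effectively drops a basis function from the imputation.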

In the same setting, with the second curve missing for some kidneys, we next consider the relationship between functional covariates and a binary outcome. We employ a Bayesian hierarchical model that jointly models the curves, which are measured with error, and the association between the noise-free curves and the binary outcome in the presence of missing data. We consider two approaches to selecting basis functions for modeling the curves and for parameterizing the functional coefficients in the functional generalized linear model that describes this association. In the first approach (Chapter 3), we use cubic B-spline basis functions and select the number of basis functions using the deviance information criterion.
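To make the B-spline parameterization concrete, the sketch below evaluates a cubic B-spline basis with the Cox-de Boor recursion and uses it to compute the success probability in a generic functional logistic model, logit P(Y = 1) = alpha + ∫ X(t) beta(t) dt with beta(t) expanded in the basis. This is only an assumed, simplified form: it takes the curve as observed without error on an equally spaced grid over [0, 1] and treats the coefficients as given, whereas the dissertation estimates everything jointly in a Bayesian hierarchical model. The names `bspline_basis` and `success_probability` are hypothetical.

```python
import numpy as np

def bspline_basis(t, n_basis, degree=3):
    """Clamped B-spline basis on [0, 1] (requires n_basis >= degree + 1)."""
    t = np.asarray(t, dtype=float)
    interior = np.linspace(0, 1, n_basis - degree + 1)[1:-1]
    knots = np.concatenate([np.zeros(degree + 1), interior, np.ones(degree + 1)])
    # Degree 0: indicator of each half-open knot interval
    B = np.zeros((t.size, len(knots) - 1))
    for j in range(len(knots) - 1):
        B[:, j] = (knots[j] <= t) & (t < knots[j + 1])
    B[t == 1.0, n_basis - 1] = 1.0  # close the last non-empty interval at t = 1
    # Cox-de Boor recursion, skipping terms with zero-width knot spans
    for d in range(1, degree + 1):
        B_new = np.zeros((t.size, len(knots) - 1 - d))
        for j in range(len(knots) - 1 - d):
            left = knots[j + d] - knots[j]
            right = knots[j + d + 1] - knots[j + 1]
            if left > 0:
                B_new[:, j] += (t - knots[j]) / left * B[:, j]
            if right > 0:
                B_new[:, j] += (knots[j + d + 1] - t) / right * B[:, j + 1]
        B = B_new
    return B

def success_probability(x, t, alpha, b):
    """P(Y = 1) for one curve x(t) on an equally spaced grid t."""
    beta_t = bspline_basis(t, len(b)) @ b   # beta(t) = sum_k b_k B_k(t)
    w = np.full(t.size, t[1] - t[0])        # trapezoidal quadrature weights
    w[[0, -1]] /= 2
    eta = alpha + np.sum(x * beta_t * w)    # alpha + integral of x(t) beta(t)
    return 1.0 / (1.0 + np.exp(-eta))

# Hypothetical example: constant curve, constant coefficient function.
t = np.linspace(0, 1, 101)
basis = bspline_basis(t, n_basis=6)
p = success_probability(np.full(t.size, 2.0), t, alpha=-2.0, b=np.ones(6))
```

Because the basis functions sum to one everywhere on [0, 1], setting all coefficients to 1 gives beta(t) = 1, so the example's linear predictor is -2 + 2 = 0. Changing `len(b)` changes the number of basis functions, which is the quantity the deviance information criterion is used to choose.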

As an alternative that avoids the difficulty of selecting basis functions, we use functional principal component analysis (FPCA) to derive a more parsimonious model within the same framework, retaining the functional principal components that explain a large percentage of the variation in the curves (Chapter 4). We conduct simulation studies to assess the performance of the proposed methods in the presence of missing functional data, and we illustrate the methods with an application to the renal study.
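The component-selection idea can be sketched for fully observed curves on a common grid as follows. This is a simplified, discretized FPCA computed with a singular value decomposition, not the dissertation's procedure, which works within a Bayesian framework and accommodates missing curves. The 95% threshold, the function name `fpca`, and the simulated curves are assumptions chosen for illustration.

```python
import numpy as np

def fpca(curves, var_threshold=0.95):
    """Discretized FPCA: keep the leading components whose cumulative
    share of variance first reaches var_threshold."""
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are the principal component functions on the grid
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    n_comp = int(np.searchsorted(np.cumsum(explained), var_threshold)) + 1
    n_comp = min(n_comp, len(s))
    components = vt[:n_comp]
    scores = centered @ components.T   # one row of scores per curve
    return mean_curve, components, scores, explained[:n_comp]

# Simulated curves driven by two smooth modes of variation plus small noise.
rng = np.random.default_rng(1)
t = np.linspace(0, 1, 50)
a = rng.normal(0, 2.0, size=(200, 1))
b = rng.normal(0, 1.0, size=(200, 1))
curves = (a * np.sin(2 * np.pi * t) + b * np.cos(2 * np.pi * t)
          + rng.normal(0, 0.05, size=(200, 50)))
mean_curve, components, scores, explained = fpca(curves)
```

Each curve is then summarized by a short vector of scores, which can replace a longer vector of basis coefficients as the covariates in a regression on the binary outcome.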

Table of Contents

1 Introduction
  1.1 Background
  1.2 Motivating Data
  1.3 Notations
  1.4 Literature Review
    1.4.1 Functional Data Analysis
    1.4.2 Missing Data
    1.4.3 Missing Data in Functional Data Analysis
  1.5 Statistical Problems
2 Multiple imputation of functional data with application to renal studies
  2.1 Introduction
  2.2 Methodology
    2.2.1 FK
    2.2.2 SLF
  2.3 Simulation Studies
  2.4 Renal Study
  2.5 Discussion
3 Handling missing data in generalized functional linear models with application to renal studies
  3.1 Introduction
  3.2 Methodology
    3.2.1 Data Structure
    3.2.2 Model
    3.2.3 Likelihood
    3.2.4 MCMC
    3.2.5 Model Selection
  3.3 Simulation Study
  3.4 Renal Study
  3.5 Discussion
4 Handling missing data in generalized functional linear models through functional principal component analysis with application to renal studies
  4.1 Introduction
  4.2 Methodology
    4.2.1 Data Structure and Model
  4.3 Simulation Study
  4.4 Renal Study
  4.5 Discussion
5 Future work
A Appendix for Chapter 3
B Appendix for Chapter 4
Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English