Bayesian Analysis for Repeated Compositional Data and Approaches for Correcting Measurement Errors in General Multivariate Linear Model Open Access

Qin, Tielin (2011)

Permanent URL: https://etd.library.emory.edu/concern/etds/2514nm23c?locale=en
Published

Abstract

Compositional data can be viewed as positive vectors whose components are
proportions or percentages of a whole. In this dissertation, we are motivated by the
need to examine the different subpopulations of white blood cells in the Protective
Immunity Project (PIP) study conducted at the Emory Transplant Center. The first
research question is how to model the white cell compositions over time. The data
obtained from this study are compositional data with repeated measurements. We
develop a Bayesian approach for the analysis of repeatedly measured compositional
data. Our results demonstrate that the Bayesian methodology can be
used to analyze repeatedly measured compositional data. We use MCMC for model
inference and show that the method is practical in high-dimensional problems.
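
As an illustration of the definition above, the following minimal sketch turns a positive vector into a composition and maps it to unconstrained coordinates with the additive log-ratio transform, the transform underlying the logistic normal distribution listed in Section 2.1.3. The cell counts are hypothetical, and whether the dissertation uses this particular log-ratio transform is an assumption.

```python
import math

def closure(parts):
    """Rescale a positive vector so its components sum to 1 (a composition)."""
    total = sum(parts)
    return [p / total for p in parts]

def alr(comp):
    """Additive log-ratio: log of each part relative to the last part."""
    ref = comp[-1]
    return [math.log(p / ref) for p in comp[:-1]]

def alr_inverse(z):
    """Map log-ratio coordinates back to a composition on the simplex."""
    expz = [math.exp(v) for v in z]
    denom = 1.0 + sum(expz)
    return [e / denom for e in expz] + [1.0 / denom]

# Hypothetical counts for three cell subpopulations (not data from the study).
counts = [50.0, 30.0, 20.0]
comp = closure(counts)   # proportions of the whole
z = alr(comp)            # two unconstrained real coordinates
back = alr_inverse(z)    # recovers the original composition
```

A D-part composition maps to D-1 real coordinates, so standard multivariate normal models can be placed on the transformed scale.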

Another research question motivated by the PIP study is how to obtain correct
estimates when measurement errors exist in the total cell count data. In
medical studies, some variables of interest are difficult to obtain, and surrogate
variables are used instead. However, these surrogate variables may contain measurement
errors. We propose likelihood-based estimators for the general multivariate linear
model when non-linear measurement errors exist in the response variables. The
observed response variables are related to the true values through a non-linear
regression model, and the parameters in the measurement error model are estimated
using independent, external calibration data. The pseudo-MLE is used for model
inference to avoid computational problems. Our proposed models provide a tool to
correct for measurement errors in response variables in longitudinal data.

Finally, we propose a Bayesian approach for correcting measurement error in
the general multivariate linear model when non-linear measurement errors exist in
the response variables. We outline how estimation of the parameters of interest
can be carried out in a Bayesian framework using Gibbs sampling and the Metropolis
algorithm. In the Bayesian approach, we impute the values of the unobservable
variable Y by sampling from its conditional distribution given all the observed
data and other parameters. Therefore, the Bayesian approach avoids numerical
integrations, which may be tedious and extensive.
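
The imputation idea described above can be sketched for a single scalar case. This is a minimal illustration, not the dissertation's sampler: it assumes the true response Y is normal with known mean and standard deviation, that the observed value W is normal around a hypothetical non-linear calibration curve g(Y), and that a random-walk Metropolis step is used to draw Y from its conditional distribution. The function names, the form of g, and all numeric values are assumptions made for illustration.

```python
import math
import random

def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - 0.5 * ((x - mean) / sd) ** 2

def g(y):
    """Hypothetical non-linear curve linking the true value to the observed one."""
    return math.exp(0.5 * y)

def log_target(y, w, mu, sigma, tau):
    """Log of p(y | w) up to a constant: model for Y times measurement model for W."""
    return log_normal_pdf(y, mu, sigma) + log_normal_pdf(w, g(y), tau)

def metropolis_impute(w, mu, sigma, tau, n_iter=5000, step=0.5, seed=0):
    """Random-walk Metropolis draws of the unobserved Y given the observed W."""
    rng = random.Random(seed)
    y = mu
    current = log_target(y, w, mu, sigma, tau)
    draws = []
    for _ in range(n_iter):
        proposal = y + rng.gauss(0.0, step)
        candidate = log_target(proposal, w, mu, sigma, tau)
        # Accept with probability min(1, target(proposal) / target(current)).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            y, current = proposal, candidate
        draws.append(y)
    return draws

draws = metropolis_impute(w=2.0, mu=1.0, sigma=1.0, tau=0.2)
kept = draws[1000:]                      # discard burn-in draws
posterior_mean = sum(kept) / len(kept)   # Monte Carlo estimate of E[Y | W]
```

In a full Gibbs sweep, a step like this would alternate with updates of the model parameters given the imputed Y, so no numerical integration over Y is needed.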

Table of Contents

1 INTRODUCTION
1.1 Overview
1.2 An Introduction of Flow Cytometry
1.3 The Motivation Example
1.4 Proposed Research


2 BACKGROUND
2.1 An Introduction of Composition Data Analysis
2.1.1 Ternary diagrams to display compositional data
2.1.2 Algebra for Compositions
2.1.3 Logistic Normal Distribution
2.1.4 State-Space model for Discrete Compositions
2.2 Markov Chain Monte Carlo
2.2.1 Monte Carlo Integration
2.2.2 The Gibbs Sampler
2.2.3 The Metropolis-Hastings algorithm
2.3 The Limitations of Existing Methods
2.3.1 Compositional Data Analysis
2.3.2 Measurement Error in Longitudinal Data

3 BAYESIAN ANALYSIS OF REPEATED COMPOSITIONAL DATA
3.1 Introduction
3.2 Model Structure
3.3 Model Inference
3.3.1 Incorporate The Effect of Covariates
3.3.2 Model Diagnostics
3.4 Data Analysis and Results
3.4.1 Model Diagnostics
3.5 Simulation Study
3.6 Discussion


4 MEASUREMENT ERROR IN GENERAL MULTIVARIATE LINEAR MODEL
4.1 Background
4.1.1 Measurement error in the response in the General linear model
4.1.2 Non-linear response measurement error in linear model
4.1.3 General likelihood methods for response measurement error
4.1.4 General Multivariate Linear Model
4.2 Modeling
4.3 Model Inference
4.3.1 Point Estimation Procedure by EM Algorithm
4.3.2 Asymptotic Covariance
4.4 Simulation Study
4.5 Real-life Example
4.6 Discussion

5 MEASUREMENT ERROR IN GENERAL MULTIVARIATE LINEAR MODEL: A BAYESIAN APPROACH
5.1 Background
5.1.1 Measurement Error in Response Variables
5.1.2 Bayesian methods for measurement errors
5.2 Modeling
5.3 Model Inference
5.4 Simulation Study
5.5 Real-Life Example
5.6 Discussion


6 SUMMARY AND FUTURE WORK
6.1 Summary
6.2 Future Research


A ASYMPTOTIC RESULTS

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English
