Flexible Methods to Incorporate Covariates in Latent Class Analysis

Kim, Grace (Summer 2021)

Permanent URL: https://etd.library.emory.edu/concern/etds/g158bj448?locale=fr
Published

Abstract

Mild Cognitive Impairment (MCI) is a neurocognitive disorder with a complex structure that sometimes precedes dementia. It comprises heterogeneous subclinical entities, so clinicians must assess different domains of cognitive, functional, neuropsychiatric, and possibly biological features for an accurate diagnosis and early intervention. Latent class analysis (LCA) is a method based on rigorous statistical derivation that can be used to explore the heterogeneity of MCI. Latent class regression, an extension of the latent class framework established by Bandeen-Roche et al. (1997), can be used to incorporate covariates as risk factors of class membership. Under the latent class regression model, the population of interest consists of a mixture of different subcategories of MCI with unobserved, or latent, class membership, which is further associated with risk factors of interest.
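For orientation, a latent class regression model of the kind described above is commonly written as a finite mixture whose mixing weights depend on covariates. The display below is a generic textbook form for categorical manifest variables, not necessarily the exact parameterization used in this dissertation; the symbols $J$, $M$, $\eta_j$, $\beta_j$, and $p_{mkj}$ are notation assumed here for illustration.

```latex
% Generic latent class regression (assumed notation, for illustration only):
%   y_i = (y_{i1}, ..., y_{iM})  manifest variables for individual i
%   x_i                          covariates (risk factors) for individual i
%   \eta_j(x_i)                  probability that individual i belongs to class j
%   p_{mkj}                      probability that item m takes level k in class j
\[
  P(Y_i = y_i \mid x_i)
  = \sum_{j=1}^{J} \eta_j(x_i)
    \prod_{m=1}^{M} \prod_{k=1}^{K_m} p_{mkj}^{\,\mathbb{1}(y_{im} = k)},
  \qquad
  \eta_j(x_i) = \frac{\exp(x_i^{\top}\beta_j)}{\sum_{l=1}^{J} \exp(x_i^{\top}\beta_l)},
  \quad \beta_J = 0 .
\]
```

In this form the covariates enter only through the class-membership probabilities $\eta_j(x_i)$, while the item probabilities $p_{mkj}$ define what each class means clinically.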

The first aim of this research is to explore situations in which covariates unintentionally influence the conceptualization of latent classes, and to develop a flexible method that lets researchers incorporate covariates without excessively distorting the clinical interpretation of the latent classes in the maximum likelihood solution. Relative frequencies of latent classes resulting from covariates will be used to help investigate the structure of MCI. The EM algorithm will be used to provide optimal parameter estimates and latent class-specific means of manifest variables.
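To make the role of the EM algorithm concrete, the following is a minimal sketch of EM for a plain latent class model without covariates. It assumes binary manifest variables, so each class-specific mean is an item-endorsement probability; the function name `fit_lca_em` and all parameter names are hypothetical, and this is not the dissertation's covariate-aware estimator.

```python
import numpy as np

def fit_lca_em(Y, n_classes, n_iter=200, tol=1e-8, seed=0):
    """Illustrative EM for a latent class model with binary items.

    Y is an (n, m) array of 0/1 responses. Returns class prevalences,
    class-specific item means, posterior membership probabilities, and
    the log-likelihood from the last E-step.
    """
    rng = np.random.default_rng(seed)
    Y = np.asarray(Y, dtype=float)
    n, m = Y.shape
    prevalence = np.full(n_classes, 1.0 / n_classes)
    # Random starting values break the symmetry between classes.
    item_prob = rng.uniform(0.25, 0.75, size=(n_classes, m))
    prev_loglik = -np.inf
    for _ in range(n_iter):
        # E-step: log of prevalence times Bernoulli likelihood, per class.
        log_joint = (np.log(prevalence)[None, :]
                     + Y @ np.log(item_prob).T
                     + (1.0 - Y) @ np.log1p(-item_prob).T)
        log_norm = np.logaddexp.reduce(log_joint, axis=1)
        posterior = np.exp(log_joint - log_norm[:, None])
        loglik = log_norm.sum()
        # M-step: weighted proportions and weighted item means.
        prevalence = posterior.mean(axis=0)
        item_prob = (posterior.T @ Y) / posterior.sum(axis=0)[:, None]
        # Keep parameters away from 0 and 1 so the logs stay finite.
        item_prob = np.clip(item_prob, 1e-6, 1.0 - 1e-6)
        prevalence = np.clip(prevalence, 1e-12, None)
        prevalence = prevalence / prevalence.sum()
        if loglik - prev_loglik < tol:
            break
        prev_loglik = loglik
    return prevalence, item_prob, posterior, loglik
```

Because EM converges to a local maximum of the likelihood, in practice a routine like this would be run from several random starts (different `seed` values) and the solution with the highest log-likelihood retained.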

The second aim expands on the first by focusing on high-dimensional and potentially correlated covariates to develop a new method, termed compound LCA, that applies dimension reduction in covariate space simultaneously with dimension reduction in manifest variable space. Compound LCA will effectively avoid uncertainties or “fuzziness” in dimension reduction that are propagated in the LCA by introducing a second set of latent classes that are formulated based on the observed high-dimensional covariate patterns. The EM algorithm will be used to find the prevalence of classes of covariates and features, posterior class-membership probabilities for each individual, and latent class-specific means of covariates and feature variables for clinical interpretation of the latent classes. The third aim introduces an extension of compound LCA, which assumes that feature classes are nested within covariate classes. We provide a likelihood ratio test that compares compound LCA and its extension.
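As a reminder of the general form such a comparison takes, a likelihood ratio statistic for two nested models can be written as below. This is the standard construction, shown under the assumption that one of the two models is a restricted version of the other and that the usual regularity conditions hold; the abstract does not state the reference distribution the dissertation actually uses, and the symbols are assumed notation.

```latex
% Generic likelihood ratio statistic (assumed notation, for illustration only):
%   \ell(\hat\theta_0)  maximized log-likelihood of the restricted model
%   \ell(\hat\theta_1)  maximized log-likelihood of the more general model
%   d                   difference in the number of free parameters
\[
  \Lambda = -2\,\bigl\{ \ell(\hat\theta_0) - \ell(\hat\theta_1) \bigr\},
  \qquad
  \Lambda \;\overset{\text{approx.}}{\sim}\; \chi^2_{d}
  \quad \text{under the restricted model.}
\]
```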

Table of Contents

1 Introduction

1.1 Overview

1.2 Motivating Example

1.2.1 Study Sample

1.2.2 Assessments of Functional Abilities

1.2.3 Assessments of Neuropsychiatric Symptoms

1.2.4 Assessments of Cognition

1.2.5 Vascular Risk Factors

1.2.6 Analysis

1.3 Scope of Research

2 Literature Review

2.1 Latent Class Analysis

2.1.1 Overview

2.1.2 Methods

2.1.3 EM Algorithm

2.1.4 Information Matrix Under EM Algorithm

2.1.5 Model Selection

2.1.6 Latent Class Regression Models

2.1.7 High-Dimensional Covariates in Latent Class Analysis

3 Latent Class Analysis with Covariates Activity Governor

3.1 Overview

3.2 Methods

3.2.1 Relative Frequencies Model

3.2.2 Maximum Likelihood Estimation

3.2.3 Model-Averaging

3.2.4 Analysis of Underlying Population

3.3 Results

3.3.1 Simulation Studies

3.3.1.1 Design A: Unstructured Covariates Independent of Manifest Variables

3.3.1.2 Design B: Structured Covariates Independent of Manifest Variables

3.3.1.3 Comorbidity Design

3.3.1.4 Missingness Design

3.3.2 MCI Dataset

3.3.2.1 Overview

3.3.2.2 Analysis of Underlying Population

3.3.2.3 Analysis

3.4 Discussion

3.5 Appendix

4 Compound Latent Class Analysis

4.1 Overview

4.2 Methods

4.2.1 Relative Frequencies Model

4.2.2 Maximum Likelihood Estimation

4.2.3 Information Matrix

4.2.4 Model Selection Criterion

4.2.5 Analysis of Underlying Subpopulations

4.3 Results

4.3.1 Simulation Studies

4.3.1.1 Simulation Results: Sample Size=600

4.3.1.2 Simulation Results: Sample Size=2000

4.3.2 MCI Dataset

4.3.2.1 Study Sample

4.3.2.2 Vascular Risk Factors

4.3.2.3 Demographic Characteristics

4.3.2.4 APOE

4.3.2.5 Analysis

4.4 Discussion

5 Extension of Compound Latent Class Analysis

5.1 Overview

5.2 Methods

5.2.1 Relative Frequencies Model

5.2.2 Maximum Likelihood Estimation

5.2.3 Information Matrix

5.2.4 Likelihood Ratio Test

5.2.5 Analysis of Underlying Subpopulations

5.3 Results

5.3.1 MCI Dataset

5.3.1.1 Overview

5.3.1.2 Analysis

5.4 Discussion

6 Future Research

6.1 Summary

6.2 Future Research

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
