Clustering the Liver Measures of Women Living with HIV (Public)

Gerig, Logan (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/bg257g50h?locale=zh
Published

Abstract

Background: Non-alcoholic fatty liver disease is more prevalent among people living with HIV than in the general population (Maurice et al., 2017). Our previous work found that three commonly used non-invasive liver measures, APRI, FIB-4, and NFS, gave conflicting results when quantifying the degree of liver fibrosis in women living with HIV (WWH) over an extended period (Yu et al., 2022). Clustering, an unsupervised machine learning technique, can partition trajectories into homogeneous, discrete groups that can then be compared with one another (Teuling et al., 2021).

Objectives: Compare five longitudinal clustering algorithms on the liver-measure trajectories of WWH to assess how they perform on observational data with unequal follow-up; explore the clusters identified by the best-performing method; and compare these clusters with those identified from Fibroscan data.

Methods: All three liver measures, taken from the Women's Interagency HIV Study (WIHS) data used in our previous work, were clustered using five methods: longitudinal K-Means (KML), growth-curve modeling into K-Means (GCKM), group-based trajectory modeling (GBTM), generalized linear mixed modeling assuming a normal mixture in the random effects (GLMM), and anchored k-medoids. The clusters from the best-performing method were explored to identify features associated with cluster membership. Cross-sectional Fibroscan data were clustered using K-Means, and the resulting clusters were compared with the longitudinal ones.
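As a rough illustration of the two-step idea behind a method such as GCKM (summarize each trajectory by growth-curve coefficients, then cluster those coefficients with K-Means), the following minimal Python sketch may help. It is not the thesis's implementation: it substitutes a per-subject least-squares line for a fitted growth-curve model, uses a plain Lloyd's K-Means with hand-picked starting centers, and runs on made-up trajectories with unequal follow-up. All names (`fit_line`, `kmeans`, `trajectories`) and all values are hypothetical.

```python
from statistics import mean


def fit_line(times, values):
    """Least-squares (intercept, slope) for one subject's visits."""
    t_bar, y_bar = mean(times), mean(values)
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return (y_bar - slope * t_bar, slope)


def kmeans(points, centers, n_iter=50):
    """Plain Lloyd's algorithm, started from the given initial centers."""
    labels = []
    for _ in range(n_iter):
        # Assignment step: index of the nearest center (squared Euclidean).
        labels = [
            min(
                range(len(centers)),
                key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centers[k])),
            )
            for point in points
        ]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for k, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centers.append(old)  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical trajectories: (visit times in years, liver-measure values).
# Subjects have different numbers of visits and different follow-up lengths.
trajectories = {
    "a": ([0, 1, 2, 3], [0.30, 0.31, 0.29, 0.30]),           # flat
    "b": ([0, 2, 5], [0.28, 0.30, 0.29]),                    # flat, 3 visits
    "c": ([0, 1, 2, 3, 4], [0.40, 0.60, 0.80, 1.00, 1.20]),  # rising
    "d": ([0, 3], [0.50, 1.10]),                             # rising, 2 visits
}

# Step 1: one (intercept, slope) pair per subject.
features = [fit_line(t, y) for t, y in trajectories.values()]

# Step 2: cluster the coefficient pairs (two clusters, assumed for the toy data).
labels, centers = kmeans(features, centers=[features[0], features[2]])

for subject, label in zip(trajectories, labels):
    print(subject, label)
```

Because each subject is reduced to a fixed-length coefficient vector, unequal follow-up needs no imputation in this sketch. In practice the coefficients would usually be standardized before clustering, since intercepts and slopes sit on different scales, and the number of clusters would be chosen by a validation criterion rather than fixed in advance.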

Results: GBTM was the best-performing method in terms of cross-validation and clinical interpretability, with solutions of five, five, and six clusters for APRI, FIB-4, and NFS, respectively. Little correlation was found between the features examined and the clusters identified. Furthermore, cluster membership was inconsistent among the three liver measurements, with all three showing discordance with the two Fibroscan-identified clusters.

Conclusions: Issues such as convergence and extensive imputation were encountered for several of the longitudinal clustering methods, suggesting that more flexible methods such as GBTM should be developed. The clustering identified by GBTM indicated a lack of latent variables responsible for all three liver measurement trajectories. Finally, the observed inconsistency between the three liver measurement clusters and the Fibroscan clusters suggests that clinicians should exercise caution when assessing liver health in WWH.

Table of Contents

  • Introduction
  • Methods
  • Results
  • Discussion
  • References
  • Appendix A

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keywords
Committee Chair / Thesis Advisor
Committee Members
Last modified
