Multi-Modal Data Integration with LLMs for Cardiovascular Disease Prediction

Restricted; Files Only

Liu, Yiyi (Fall 2024)

Permanent URL: https://etd.library.emory.edu/concern/etds/bn999796n?locale=zh
Published

Abstract

The integration of multimodal data in healthcare has emerged as a promising avenue for improving diagnostic accuracy and risk assessment. This study explores an approach to predicting cardiovascular disease (CVD) risk that integrates CT images, lung cancer diagnostics, and electronic health records (EHR), with the aim of giving a more complete picture of patient health. The proposed model uses transformers and large language models (LLMs) fine-tuned for biomedical applications to perform feature extraction and multimodal data fusion.

The method uses a concatenation-based fusion strategy and a cross-attention mechanism to combine image and text features and exploit their complementary information. Experimental results highlight the critical role of EHR data: the text-only model reached an accuracy of 72.91%, while image data, although secondary, offers valuable supplementary insights. The multimodal concatenation fusion model achieved an accuracy of 71.47%, slightly below the text-only model, indicating that further optimization of fusion strategies is needed to fully exploit the potential of multimodal integration.
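The two fusion ideas named above can be illustrated with a minimal sketch. This is not the thesis's implementation: the toy embeddings, the function names (`concat_fusion`, `cross_attention`, `mean_pool`), and the omission of learned query/key/value projections are all assumptions made for illustration. In practice the inputs would be embeddings produced by the image and text encoders.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def mean_pool(vectors):
    """Average a list of equal-length vectors into one vector."""
    n = len(vectors)
    return [sum(v[j] for v in vectors) / n for j in range(len(vectors[0]))]

def concat_fusion(image_vec, text_vec):
    """Concatenation fusion: place the two feature vectors side by side."""
    return list(image_vec) + list(text_vec)

def cross_attention(queries, keys, values):
    """Scaled dot-product cross-attention without learned projections.

    queries come from one modality (here: text tokens); keys and values
    come from the other (here: image patches). Returns one attended
    vector per query, each a weighted average of the value vectors.
    """
    dim = len(keys[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended

# Hypothetical toy embeddings (2-dimensional for readability).
text_tokens = [[1.0, 0.0], [0.0, 1.0]]
image_patches = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Concatenation: pool each modality, then join the pooled vectors.
fused_concat = concat_fusion(mean_pool(image_patches), mean_pool(text_tokens))

# Cross-attention: each text token attends over all image patches.
fused_attn = cross_attention(text_tokens, image_patches, image_patches)
```

Concatenation keeps each modality's features separate and leaves the downstream classifier to relate them, whereas cross-attention lets each text token weight the image patches directly. Either fused representation would then feed a classification head for the CVD prediction.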

Challenges remain in improving the lung cancer diagnostics component, as the performance of BiomedGPT on this dataset is limited, which constrains the overall model's effectiveness. Future work will focus on refining fusion methods, enhancing feature representation with advanced LLMs, and incorporating more detailed diagnostic information to further improve CVD risk prediction.

Table of Contents

Chapter 1    Introduction

1.1    Background and Motivation

1.2    Clinical Relevance between Lung Cancer and Cardiovascular Disease (CVD)

1.3    Problem Statement and Research Goals

1.4    Contributions of This Work

Chapter 2    Related Work

2.1    Cardiovascular Disease Prediction

2.2    Lung Cancer Detection and Prediction

2.3    Multimodal Data Fusion in Medical Diagnostics

2.4    LLMs in Medical Applications

Chapter 3    Methodology

3.1    Overall Model Architecture

3.2    Fine-Tuning BiomedGPT

3.3    Transformer

3.4    Bidirectional Encoder Representations from Transformers (BERT)

3.5    Cross-Attention for Embedding Fusion

Chapter 4    Experiments and Results

4.1    Dataset

4.2    Preprocessing

4.3    Experimental Setup

Chapter 5    Result Analysis and Discussion

5.1    Multimodal CVD Prediction

5.2    Advantages and Limitations of the Proposed Model

5.3    Challenges in Multimodal Data Integration

5.5    Future Works

Chapter 6    Conclusion

References

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keywords
Committee Chair / Thesis Advisor
Committee Members
Last modified

Primary PDF

Supplemental Files