Multi-Modal Data Integration with LLMs for Cardiovascular Disease Prediction
Liu, Yiyi (Fall 2024)
Abstract
The integration of multimodal data in healthcare has emerged as a promising avenue for improving diagnostic accuracy and risk assessment. This study explores an approach to predicting cardiovascular disease (CVD) risk by integrating CT images, lung cancer diagnostics, and electronic health records (EHR), aiming to provide a more comprehensive view of patient health. The proposed model leverages transformers and large language models (LLMs) fine-tuned for biomedical applications to perform feature extraction and multimodal data fusion.
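To make the feature-extraction step concrete, the following is a minimal sketch of how image and text embeddings could be obtained with off-the-shelf encoders. It is illustrative only: the checkpoints `bert-base-uncased` and `google/vit-base-patch16-224-in21k`, the example EHR note, the [CLS] pooling, and the tensor shapes are assumptions standing in for the fine-tuned BiomedGPT and BERT components described in Chapter 3, not the thesis's exact pipeline.

```python
# Illustrative sketch only: generic Hugging Face checkpoints stand in for the
# fine-tuned BiomedGPT / BERT encoders used in the thesis.
import numpy as np
import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer, ViTImageProcessor, ViTModel

device = "cuda" if torch.cuda.is_available() else "cpu"

# Text branch: embed an EHR-style note (hypothetical example text).
text_tok = AutoTokenizer.from_pretrained("bert-base-uncased")
text_enc = AutoModel.from_pretrained("bert-base-uncased").to(device).eval()
note = "67-year-old male, 40 pack-year smoking history, hypertension, stage I NSCLC."
tokens = text_tok(note, return_tensors="pt", truncation=True).to(device)
with torch.no_grad():
    text_emb = text_enc(**tokens).last_hidden_state[:, 0]   # [CLS] vector, shape (1, 768)

# Image branch: embed a (dummy) CT slice as ViT patch features.
img_proc = ViTImageProcessor.from_pretrained("google/vit-base-patch16-224-in21k")
img_enc = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k").to(device).eval()
dummy_slice = Image.fromarray((np.random.rand(224, 224, 3) * 255).astype("uint8"))
pixels = img_proc(images=dummy_slice, return_tensors="pt").to(device)
with torch.no_grad():
    image_emb = img_enc(**pixels).last_hidden_state          # patch features, shape (1, 197, 768)

print(text_emb.shape, image_emb.shape)
```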
The method utilizes a concatenation-based fusion strategy and a cross-attention mechanism to combine image and text features, leveraging their complementary information. Experimental results highlight the critical role of EHR data in achieving high predictive accuracy (72.91% in the text-only model), while image data, although secondary, offers valuable supplementary insights. The multimodal concatenation fusion model achieved an accuracy of 71.47%, indicating that further optimization of fusion strategies is needed to fully exploit the potential of multimodal integration.
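The two fusion strategies mentioned above can be sketched as follows. This is a rough, assumed implementation for illustration: concatenation fusion pools each modality and concatenates the vectors before a classification head, while cross-attention fusion lets the text embedding attend over image patch features (here via PyTorch's `nn.MultiheadAttention`). Layer sizes, pooling, and the residual connection are assumptions; the thesis's actual fusion architecture is detailed in Sections 3.1 and 3.5.

```python
# Minimal sketch of concatenation fusion vs. cross-attention fusion (assumed dimensions).
import torch
import torch.nn as nn

class ConcatFusionClassifier(nn.Module):
    """Pool each modality, concatenate, and classify CVD risk."""
    def __init__(self, dim: int = 768, num_classes: int = 2):
        super().__init__()
        self.head = nn.Sequential(nn.Linear(2 * dim, 256), nn.ReLU(), nn.Linear(256, num_classes))

    def forward(self, text_emb, image_emb):
        # text_emb: (B, dim) pooled text feature; image_emb: (B, N, dim) patch features.
        fused = torch.cat([text_emb, image_emb.mean(dim=1)], dim=-1)
        return self.head(fused)

class CrossAttentionFusionClassifier(nn.Module):
    """Text features attend over image patch features before classification."""
    def __init__(self, dim: int = 768, heads: int = 8, num_classes: int = 2):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=dim, num_heads=heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, text_emb, image_emb):
        # Query = text embedding, Key/Value = image patch features.
        q = text_emb.unsqueeze(1)                        # (B, 1, dim)
        attended, _ = self.attn(q, image_emb, image_emb)
        fused = self.norm(q + attended).squeeze(1)       # residual + layer norm, (B, dim)
        return self.head(fused)

# Toy usage with random embeddings in place of real encoder outputs.
text_emb = torch.randn(4, 768)        # e.g., BERT [CLS] embeddings
image_emb = torch.randn(4, 197, 768)  # e.g., ViT patch embeddings
print(ConcatFusionClassifier()(text_emb, image_emb).shape)          # torch.Size([4, 2])
print(CrossAttentionFusionClassifier()(text_emb, image_emb).shape)  # torch.Size([4, 2])
```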
Challenges remain in the lung cancer diagnostics component: BiomedGPT's limited performance on this dataset constrains the overall model's effectiveness. Future work will focus on refining fusion methods, enhancing feature representation with advanced LLMs, and incorporating more detailed diagnostic information to further improve CVD risk prediction.
Table of Contents
Chapter 1 Introduction
1.1 Background and Motivation
1.2 Clinical Relevance between Lung Cancer and Cardiovascular Disease (CVD)
1.3 Problem Statement and Research Goals
1.4 Contributions of This Work
Chapter 2 Related Work
2.1 Cardiovascular Disease Prediction
2.2 Lung Cancer Detection and Prediction
2.3 Multimodal Data Fusion in Medical Diagnostics
2.4 LLMs in Medical Applications
Chapter 3 Methodology
3.1 Overall Model Architecture
3.2 Fine-Tuning BiomedGPT
3.3 Transformer
3.4 Bidirectional Encoder Representations from Transformers (BERT)
3.5 Cross-Attention for Embedding Fusion
Chapter 4 Experiments and Results
4.1 Dataset
4.2 Preprocessing
4.3 Experimental Setup
Chapter 5 Result Analysis and Discussion
5.1 Multimodal CVD Prediction
5.2 Advantages and Limitations of the Proposed Model
5.3 Challenges in Multimodal Data Integration
5.5 Future Works
Chapter 6 Conclusion
References

Primary PDF: file download under embargo until 09 January 2027 (uploaded 2024-11-30).