Identification of Kidney Transplant Recipients at High-Risk for Post-Transplant Hospitalization using Natural Language Processing Public

Arenson, Michael (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/mg74qn20w?locale=it
Published

Abstract

Rehospitalization after discharge following kidney transplant is a common and preventable problem that is costly to both patients and healthcare systems and is associated with poor outcomes. Epidemiological evidence suggests that up to 50% of surgical readmissions may be preventable (e.g., through discharge planning, patient education, and/or follow-up communication). Predictive analytics have previously been used to identify patients at risk of rehospitalization, with limited success.

 

The vast amount of free-text data that exists as clinical notes in the electronic medical record (EMR) remains untapped in the field of kidney transplantation. To date, EMR free-text clinical notes have not been included in predictive models of 30-day rehospitalization (30DR) after kidney transplant. Such notes are unstructured data, meaning data that cannot easily be placed in a traditional numeric dataset. Analyzing free text requires natural language processing (NLP), a subfield of artificial intelligence that uses computer algorithms to analyze human language. Here, NLP was used to analyze EMR free-text documentation for kidney transplant recipients, with the ultimate goal of reducing readmission after kidney transplant.

 

This was a retrospective observational analysis of first-time kidney transplant recipients at a large institution in the Southeast between January 2005 and December 2015. Both structured data and unstructured data, in the form of clinical notes written in the EMR, were analyzed. Eight clinical notes were characterized and mined for new features that might improve the predictive accuracy of models of 30DR after kidney transplant. Predictive models using unstructured, free-text clinical notes were built using unsupervised machine-learning approaches. These models did not meaningfully improve predictive accuracy over structured data alone. However, the results generated a number of new hypotheses about potentially novel predictors to be examined in future research applying more human-driven approaches.
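The abstract describes turning free-text notes into features for a predictive model, and the table of contents names TF-IDF as one of the representations examined. The following is only a minimal sketch of that general idea, not the thesis's actual pipeline, which is not shown on this page. The function names `tokenize` and `tf_idf`, the weighting formula (term frequency times the natural log of inverse document frequency), and the toy notes are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a note and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tf_idf(notes):
    """Return one {term: weight} dict per note.

    tf  = count of the term in the note / number of tokens in the note
    idf = ln(number of notes / number of notes containing the term)
    This is one common TF-IDF variant, assumed here for illustration.
    """
    tokenized = [tokenize(note) for note in notes]

    # Document frequency: in how many notes does each term appear?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_notes = len(notes)
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_notes / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, invented snippets; these are not real patient notes.
toy_notes = [
    "patient has strong family support at home",
    "patient lives alone with limited support",
    "operative course uncomplicated patient stable",
]
features = tf_idf(toy_notes)
```

In this toy corpus, a word that appears in every note (such as "patient") receives a weight of zero, while a word confined to a single note (such as "family") is weighted more heavily than one shared by two notes (such as "support"). Numeric weights of this kind can sit alongside structured variables in a dataset, which is what makes free text usable by a predictive model.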

Table of Contents

INTRODUCTION

BACKGROUND

METHODS

RESULTS

DISCUSSION

CONCLUSIONS

REFERENCES

TABLES / FIGURES

FIGURE 1: CONCEPTUAL MOCK-UP OF CLINICAL DASHBOARD IDENTIFYING KIDNEY TRANSPLANT PATIENTS AT HIGH-RISK OF 30-DAY READMISSION

FIGURE 2: LIST OF ALL STRUCTURED DATA VARIABLES AND ALL UNSTRUCTURED DATA SOURCES BY TIME COLLECTED IN TRANSPLANT PROCESS

FIGURE 3: TRADITIONAL CONCATENATION VS. ENSEMBLE LOGISTIC REGRESSION METHOD

FIGURE 4: INCLUSION AND EXCLUSION FLOWCHART

TABLE 1: BASELINE CHARACTERISTICS OF KIDNEY TRANSPLANT RECIPIENTS FROM EMORY TRANSPLANT CENTER, STRATIFIED BY READMISSION WITHIN 30 DAYS POST-TRANSPLANT, 2005–2015

FIGURE 5: FREQUENCY OF WORDS IN THREE TYPES OF NOTES GRAPHED BY 30DR VS. NON-30DR PATIENTS

TABLE 2: MOST COMMON WORDS THAT PRECEDE THE WORD "SUPPORT" AMONGST ALL NOTES AS IDENTIFIED BY TERM FREQUENCY

FIGURE 6: TOP 30 TF-IDF FEATURES FOR OPERATIVE, SELECTION CONFERENCE, AND SOCIAL WORK NOTES

FIGURE 7: TOP 20 TERMS IN TOPIC MODEL USING LDA WITH K=8 TOPICS

FIGURE 8: GAMMA FOR EACH TOPIC BY NOTE TYPE

TABLE 3: INDIVIDUAL CLINICAL NOTES ADDED TO STRUCTURED VARIABLES TO CREATE PREDICTIVE MODEL

TABLE 4: ADDING MULTIPLE NOTE TYPES TO PREDICTIVE MODELS FOR HOSPITAL READMISSION AFTER KIDNEY TRANSPLANTS

TABLE 5: RANKING TOP PREDICTIVE FEATURES FOR HIGHER READMISSION OF KIDNEY TRANSPLANT RECIPIENTS (2005–2015) IN HIGHEST PERFORMING PREDICTIVE MODEL FROM TABLE 4

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Last modified
