Using natural language processing to detect stigmatizing provider language and evaluate associations with opioid analgesic pain management outcomes (Open Access)
Walker, Andrew (Spring 2024)
Abstract
Objective: In this dissertation, we sought to detect and classify stigmatizing and biased language in intensive care unit (ICU) electronic health records (EHRs) using natural language processing techniques. We evaluated the prevalence of such language across patient demographics and provider factors, and explored the association of linguistic bias with care outcomes, including opioid analgesic prescription rates, dispensation rates, and rates of self-directed discharge from the ICU.
Methods: Using the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset, we developed a comprehensive lexicon from literature-driven stem words, expanded with Word2Vec and GPT-3.5, to identify stigmatizing patient labels, doubt markers, and scare quotes. This lexicon was used to search 18 million sentences, 3,000 of which were then used to train various classifiers, including bag-of-words and transformer-based models. Supervised learning techniques were used to assess the distribution of linguistic bias and its clustering within patient records, with sentence-level analysis connecting linguistic features to patient care outcomes.
Results: We developed lexicons with high utility in identifying stigmatizing labels and doubt markers, and classifiers showing high accuracy, recall, and precision. Stigmatizing labels and doubt markers were found to be more prevalent among historically marginalized groups, with notable disparities in care outcomes, namely higher likelihoods of self-directed discharge. No significant associations were found between the linguistic features and opioid prescription or dispensation rates.
Discussion: This dissertation supports the feasibility of using natural language processing to identify stigmatizing and doubt-marking language within medical records. It highlights consistent trends of stigmatizing language, particularly against historically stigmatized patients, and underscores the need for further research and intervention to mitigate these stigmas and their downstream effects on health outcomes.
Conclusions: The high performance of the classifiers, titled CARE-SD: Classifier-based Analysis for Recognizing and Eliminating Stigmatizing and Doubt Marker Labels in Electronic Health Records, underscores their potential for broader application in identifying stigmatizing language within healthcare systems. Study findings also highlight the importance of addressing stigmatizing language as a component of quality care and suggest that the methods used in the study can help reduce stigmatization in EHR notes and identify areas for intervention.
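The lexicon search step described in the Methods can be illustrated with a minimal sketch. It assumes a small, hypothetical lexicon and a simple rule for short quoted spans; the terms, the pattern for scare quotes, and the function names are illustrative assumptions and are not taken from the dissertation's lexicons or code. Matches are only candidates: a lexical hit does not establish stigmatizing use, which is why the matched sentences were annotated and used to train classifiers.

```python
import re

# Hypothetical, illustrative terms only; NOT the dissertation's actual lexicon.
LEXICON = {
    "stigmatizing_label": ["noncompliant", "drug seeking", "abuser"],
    "doubt_marker": ["claims", "insists", "allegedly"],
}

def compile_terms(terms):
    """Build one case-insensitive pattern matching any term as a whole word."""
    # Longest terms first so multi-word entries are tried before shorter ones.
    ordered = sorted(terms, key=len, reverse=True)
    # Allow whitespace or a hyphen between the words of a multi-word term.
    parts = [r"[\s-]+".join(re.escape(w) for w in t.split()) for t in ordered]
    return re.compile(r"\b(?:" + "|".join(parts) + r")\b", re.IGNORECASE)

PATTERNS = {feature: compile_terms(terms) for feature, terms in LEXICON.items()}

# Assumed rule: a short span inside straight double quotes is a scare-quote candidate.
SCARE_QUOTE = re.compile(r'"([^"\n]{1,40})"')

def find_candidates(sentence):
    """Return (feature, matched_text) pairs found in one sentence."""
    hits = []
    for feature, pattern in PATTERNS.items():
        for match in pattern.finditer(sentence):
            hits.append((feature, match.group(0).lower()))
    for match in SCARE_QUOTE.finditer(sentence):
        hits.append(("scare_quote", match.group(1).lower()))
    return hits

if __name__ == "__main__":
    example = 'Patient claims pain is 10/10 and is "allergic" to NSAIDs.'
    print(find_candidates(example))
```

Sentences returned by a search like this would still need human annotation before being treated as instances of stigmatizing or doubt-marking language.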
Table of Contents
Introductory Literature Review
Defining Stigma and Bias
Ecosocial theory + stigma/bias
Clinical Outcomes Associated with Stigma
Pain Management
Self-directed discharge
Origins and targets of stigmatization
Stigmatized Chronic Illness Populations frequently prescribed opioids
Current approaches to measuring Stigma and Bias
Survey methods
Administrative Data
Vignette studies
Implicit Association Tests (IATs)
Solution: Mining patient care notes to identify bias and stigma
Linguistic manifestations of bias and stigmatization: The Social Categories and Stereotype Communication Framework
Stigmatizing labels and negative descriptors
Evidentials and markers of doubt
Scare Quotes
Natural Language Processing to identify stigma and bias in EHR Data
Introductory Literature Review References
Aim 1: CARE-SD: Classifier-based Analysis for Recognizing and Eliminating Stigmatizing and Doubt Marker Labels in Electronic Health Records: model development and validation
Abstract
Introduction:
Linguistic manifestations of stigmatization
Stigmatizing labels
Doubt Markers
Scare Quotes
Natural Language Processing to identify stigma and bias in EHR Data
Methods:
MIMIC-III Dataset
Lexicon development and sample preparation
Stigmatizing labels
Doubt markers
Scare quotes
Matching with sentences in MIMIC-III, creating coding samples
Annotation process
Sentence Classification
Results:
Lexicon Development
Regular Expression Search results
Annotation
Linguistic Bias Classifier Model Evaluation Results
Discussion:
Integrative summary of findings
Limitations
References
Appendix 1: Lexicons for Doubt Markers and Stigmatizing Labels
Appendix 2: Stigmatizing Labels Ontology
Coding Process
Link and Phelan Stigma Definition
Stigmatizing labels and negative descriptors in charts
Coding Rules
Appendix 3: Doubt Markers Ontology
Coding Process
Coding Rules
Appendix 4: Scare Quotes Ontology
Coding Process
Coding Rules
Appendix 5: Best Performing Model Hyperparameters
Aim 2: Distribution of stigmatizing and doubt-marking language in EHR across patients, providers, and frequently-stigmatized diagnoses
Introduction
Methods
CARE-SD: Classifier-based Analysis for Recognizing and Eliminating Stigmatizing and Doubt Marker Labels in Electronic Health Records
Doubt markers
Results
Patient-level models- demographic predictors
Provider-level models- provider type
Patient-level models interaction model estimates with Black + African American /Non-Black + African American race variable
Discussion
Appendix 1: Patient Ethnicity and Provider Recategorizations
Appendix 2: Lexicons for Doubt Markers and Stigmatizing Labels
Aim 3: Evaluating the relationships between stigmatizing language features in the EHR and patient care outcomes of opioid analgesic prescription rates and self-directed discharge
Abstract:
Introduction
Defining Stigma
Stigmatizing labels
Doubt Markers
Theoretical frameworks for how provider stigmatizing language impacts health outcomes
Clinical Outcomes Associated with Stigma
Pain Management
Self-directed discharge
Methods
Stigmatizing labels
Doubt markers
Results
Discussion
Conclusions
References
Appendix 1
Dissertation Conclusion
Primary PDF
Title | Date Uploaded |
---|---|
Using natural language processing to detect stigmatizing provider language and evaluate associations with opioid analgesic pain management outcomes | 2024-04-08 17:06:28 -0400 |