Using natural language processing to detect stigmatizing provider language and evaluate associations with opioid analgesic pain management outcomes

Open Access

Walker, Andrew (Spring 2024)

Permanent URL: https://etd.library.emory.edu/concern/etds/dr26z000s?locale=pt-BR%2A

Published

Abstract

Objective: In this dissertation, we sought to detect and classify stigmatizing and biased language in intensive care unit (ICU) electronic health records (EHRs) using natural language processing techniques. We evaluated the prevalence of such language across patient demographics and provider factors, and explored the association of linguistic biases with care outcomes, including opioid analgesic prescription rates, dispensation rates, and rates of self-directed discharge from the ICU. Methods: Using the Medical Information Mart for Intensive Care-III (MIMIC-III) dataset, we developed a comprehensive lexicon from literature-driven stem words, expanded with Word2Vec and GPT-3.5, to identify stigmatizing patient labels, doubt markers, and scare quotes. This lexicon was used to search 18 million sentences, 3,000 of which were then used to train various classifiers, including bag-of-words and transformer-based models. Supervised learning techniques assessed the distribution of linguistic bias and its clustering within patient records, leveraging sentence-level analysis to connect linguistic features with patient care outcomes. Results: We developed lexicons with high utility in identifying stigmatizing labels and doubt markers, and classifiers with high accuracy, recall, and precision. Stigmatizing labels and doubt markers were found to be more prevalent among historically marginalized groups, with notable disparities in care outcomes, namely higher likelihoods of self-directed discharge. No significant associations were found between the linguistic features and opioid prescription or dispensation rates. Discussion: This dissertation supports the feasibility of using natural language processing to identify stigmatizing and doubt-marking language within medical records. It highlights consistent trends of stigmatizing language, particularly against historically stigmatized patients, and underscores the need for further research and intervention to mitigate these stigmas and their downstream health outcomes. Conclusions: The high performance of the classifiers, titled CARE-SD: Classifier-based Analysis for Recognizing and Eliminating Stigmatizing and Doubt Marker Labels in Electronic Health Records, underscores their potential for broader application in identifying stigmatizing language within healthcare systems. Study findings also highlight the importance of addressing stigmatizing language as a component of quality care and suggest that the methods used in the study can help reduce stigmatization in EHR notes and identify areas for intervention.
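The lexicon search step described in the Methods can be illustrated with a minimal sketch. It assumes stem-based regular expressions and uses a handful of made-up stems. The stem lists, the three-word limit on quoted spans, and the function names are all hypothetical and are not taken from the dissertation, whose actual lexicons appear in its appendices. Matching of this kind only surfaces candidate sentences; in the study, sampled candidates were then annotated and used to train classifiers.

```python
import re

# Hypothetical mini-lexicons for illustration only; these are NOT the
# dissertation's lexicons.
STIGMATIZING_STEMS = ["abus", "noncomplian", "drug seek"]
DOUBT_STEMS = ["claim", "alleg", "insist"]


def compile_stem_pattern(stems):
    """Build one case-insensitive pattern matching words that begin with any stem."""
    alternatives = "|".join(re.escape(stem) for stem in stems)
    return re.compile(r"\b(?:" + alternatives + r")\w*", re.IGNORECASE)


# Quoted spans of one to three words: a rough, assumed proxy for candidate
# scare quotes (longer quotations are more likely to be reported speech).
SCARE_QUOTE = re.compile(r'"(\w+(?:\s\w+){0,2})"')

PATTERNS = {
    "stigmatizing_label": compile_stem_pattern(STIGMATIZING_STEMS),
    "doubt_marker": compile_stem_pattern(DOUBT_STEMS),
}


def find_candidates(sentence):
    """Return {feature_name: [matched strings]} for a single sentence."""
    hits = {}
    for name, pattern in PATTERNS.items():
        found = pattern.findall(sentence)
        if found:
            hits[name] = found
    quoted = SCARE_QUOTE.findall(sentence)
    if quoted:
        hits["scare_quote"] = quoted
    return hits


if __name__ == "__main__":
    example = 'Patient claims the pain is "unbearable" and was noncompliant with meds.'
    print(find_candidates(example))
```

A stem match is deliberately over-inclusive (for example, "claim" also matches insurance claims), which is one reason a lexicon hit is better treated as a candidate for annotation and classification than as a final label.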

Table of Contents

Introductory Literature Review 1

Defining Stigma and Bias 1

Ecosocial theory + stigma/bias 3

Clinical Outcomes Associated with Stigma 12

Pain Management 13

Self-directed discharge 13

Origins and targets of stigmatization 14

Stigmatized Chronic Illness Populations frequently prescribed opioids 18

Current approaches to measuring Stigma and Bias 21

Survey methods 21

Administrative Data 22

Vignette studies 22

Implicit Association Tests (IATs) 23

Solution: Mining patient care notes to identify bias and stigma 25

Linguistic manifestations of bias and stigmatization: The Social Categories and Stereotype Communication Framework 26

Stigmatizing labels and negative descriptors 27

Evidentials and markers of doubt 28

Scare Quotes 29

Natural Language Processing to identify stigma and bias in EHR Data 30

Introductory Literature Review References 32

Aim 1: CARE-SD: Classifier-based Analysis for Recognizing and Eliminating Stigmatizing and Doubt Marker Labels in Electronic Health Records: model development and validation 53

Abstract 53

Introduction: 54

Linguistic manifestations of stigmatization 56

Stigmatizing labels 57

Doubt Markers 58

Scare Quotes 59

Natural Language Processing to identify stigma and bias in EHR Data 60

Methods: 60

MIMIC-III Dataset 61

Lexicon development and sample preparation 61

Stigmatizing labels 63

Doubt markers 63

Scare quotes 63

Matching with sentences in MIMIC-III, creating coding samples 64

Annotation process 64

Sentence Classification 65

Results: 66

Lexicon Development 66

Regular Expression Search results 67

Annotation 70

Linguistic Bias Classifier Model Evaluation Results 71

Discussion: 74

Integrative summary of findings 74

Limitations 77

References 79

Appendix 1: Lexicons for Doubt Markers and Stigmatizing Labels 84

Appendix 2: Stigmatizing Labels Ontology 87

Coding Process 87

Link and Phelan Stigma Definition 87

Stigmatizing labels and negative descriptors in charts 88

Coding Rules 90

Appendix 3: Doubt Markers Ontology 92

Coding Process 92

Coding Rules 94

Appendix 4: Scare Quotes Ontology 95

Coding Process 95

Coding Rules 97

Appendix 5: Best Performing Model Hyperparameters 99

Aim 2: Distribution of stigmatizing and doubt-marking language in EHR across patients, providers, and frequently-stigmatized diagnoses 100

Introduction 101

Methods 107

CARE-SD: Classifier-based Analysis for Recognizing and Eliminating Stigmatizing and Doubt Marker Labels in Electronic Health Records 109

Doubt markers 110

Results 113

Patient-level models- demographic predictors 118

Provider-level models- provider type 119

Patient-level models interaction model estimates with Black + African American /Non-Black + African American race variable 120

Discussion 121

Appendix 1: Patient Ethnicity and Provider Recategorizations 138

Appendix 2: Lexicons for Doubt Markers and Stigmatizing Labels 139

Aim 3: Evaluating the relationships between stigmatizing language features in the EHR and patient care outcomes of opioid analgesic prescription rates and self-directed discharge 142

Abstract: 142

Introduction 143

Defining Stigma 143

Stigmatizing labels 144

Doubt Markers 145

Theoretical frameworks for how provider stigmatizing language impacts health outcomes 145

Clinical Outcomes Associated with Stigma 147

Pain Management 148

Self-directed discharge 148

Methods 149

Stigmatizing labels 150

Doubt markers 151

Results 155

Discussion 157

Conclusions 161

References 161

Appendix 1 170

Dissertation Conclusion 172

About this Dissertation

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Laney Graduate School
Department	Behavioral Sciences and Health Education
Degree	Ph.D.
Submission	Dissertation
Language	English
Research Field	Sociology, Sociolinguistics; Biology, Bioinformatics; Health Sciences, Public Health
Keyword	stigma; natural language processing
Committee Chair / Thesis Advisor	Melvin Livingston, Emory University
Committee Members	Jennifer Love, Mount Sinai Health; Hannah LF Cooper, Emory University; Abeed Sarker, Emory University

Primary PDF

Title	Date Uploaded
Using natural language processing to detect stigmatizing provider language and evaluate associations with opioid analgesic pain management outcomes	2024-04-08 17:06:28 -0400
