Improving Biomedical Abstract Screening Using Contrastive Learning Open Access

Li, Tiantian (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/6m311q62d?locale=en

Published

Abstract

Systematic reviews are a crucial tool for evidence-based medicine: they identify and synthesize published medical literature to inform prevention and intervention strategies. However, identifying the relevant articles to include is labor-intensive and time-consuming. Although automated screening based on abstracts has been proposed, its performance remains suboptimal. Contrastive learning has achieved great success in computer vision but has not been used to expedite the systematic review process. In this thesis, we propose a new method that uses an autoencoder trained with a contrastive loss to generate vector representations of abstracts. We apply data augmentation to the abstracts and train the autoencoder so that the representations of an anchor and its positive sample are closer in vector space than those of the anchor and a negative sample. Our experiments suggest that contrastive learning can help filter out irrelevant articles during the abstract screening phase.
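The anchor/positive/negative objective described in the abstract can be illustrated with a small sketch. The abstract does not name the exact contrastive loss, so the triplet margin loss below is an assumption chosen for illustration, not necessarily the thesis's formulation. The function names, the margin value, and the two-dimensional vectors (hypothetical stand-ins for encoder outputs) are likewise illustrative.

```python
import math
from typing import Sequence


def euclidean_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 1.0,  # assumed value, not taken from the thesis
) -> float:
    """Hinge-style triplet loss: zero once the positive is closer to the
    anchor than the negative by at least `margin`."""
    d_pos = euclidean_distance(anchor, positive)
    d_neg = euclidean_distance(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Hypothetical 2-D representations standing in for encoded abstracts.
anchor = [0.0, 0.0]
positive = [0.0, 1.0]  # e.g. an augmented view of the anchor abstract
negative = [3.0, 4.0]  # e.g. an abstract with the opposite label

# Positive is already much closer than the negative: no penalty.
loss_satisfied = triplet_margin_loss(anchor, positive, negative)

# Swapping the roles violates the ordering and is penalized.
loss_violated = triplet_margin_loss(anchor, negative, positive)

print(loss_satisfied, loss_violated)
```

Minimizing a loss of this kind over many triplets pushes the encoder toward the property the abstract describes: anchor and positive representations end up closer in vector space than anchor and negative representations.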

Table of Contents

1 Introduction

2 Background

2.1 Automating Systematic Reviews

2.1.1 Word Embeddings

2.1.2 Deep Learning Models

2.2 Contrastive Representation Learning

3 CTRL-Screener

3.1 Problem Formulation

3.2 Model Architecture

3.2.1 Abstract Representation

3.2.2 Contrastive Autoencoder

3.2.3 Classification

3.3 Implementation Details

3.3.1 Data augmentation

4 Experiment setup

4.1 Datasets

4.2 Evaluation

4.2.1 Baseline Methods

4.2.2 Metrics

4.2.3 Evaluation Strategy

4.2.4 Hyperparameter tuning of CTRL-Screener

5 Results

5.1 Results

5.2 Effect of Relative Order

5.3 Effect of Sampling Percentage

6 Conclusion

6.1 Conclusion

6.2 Future work

6.2.1 Additional datasets

6.2.2 Comparison to other data augmentation techniques

6.2.3 Different levels of automation

6.2.4 Continual learning

Bibliography

About this Honors Thesis

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Emory College
Department	Computer Science
Degree	B.S.
Submission	Honors Thesis
Language	English
Research Field	Computer Science
Keyword	Citation Screening; Contrastive Learning; Biomedical Literature
Committee Chair / Thesis Advisor	Joyce Ho, Emory University
Committee Members	Xiong Li, Emory University; Judy Gichoya, Emory University

Primary PDF

Title	Date Uploaded
Improving Biomedical Abstract Screening Using Contrastive Learning	2023-04-18 01:19:39 -0400
