Incremental sense weight training for contextualized word embedding interpretation

Jiang, Xinyi (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/zw12z642q?locale=es
Published

Abstract

In this work, we propose a new training procedure for learning how important each dimension of a word embedding is in representing word meanings. Our algorithm advances the field of word embedding interpretation, which is critical to NLP because word embeddings remain poorly understood despite their strong performance on NLP tasks. Previous work has pursued interpretability either by building it into the embedding training models or by post-processing pre-trained embeddings. Our algorithm offers a new perspective on word embedding dimension interpretation, in which each dimension is evaluated and can be visualized. It also rests on a novel assumption that not all dimensions are necessary for representing a word sense (word meaning), so negligible dimensions are discarded, an approach that has not been attempted in previous studies.
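The idea of weighting dimensions and discarding negligible ones can be illustrated with a minimal sketch. It is not the training procedure proposed in the thesis, which this abstract does not specify. As a stand-in, it assumes a simple score (between-sense variance divided by within-sense variance) for each dimension, and the names `dimension_weights` and `keep_dimensions`, the toy data, and the threshold of 0.5 are all hypothetical.

```python
import numpy as np


def dimension_weights(vectors, sense_ids):
    """Score each embedding dimension by how well it separates senses.

    Hypothetical stand-in for learned sense weights: the ratio of
    between-sense variance to within-sense variance, scaled to sum to 1.
    """
    vectors = np.asarray(vectors, dtype=float)
    sense_ids = np.asarray(sense_ids)
    overall_mean = vectors.mean(axis=0)
    between = np.zeros(vectors.shape[1])
    within = np.zeros(vectors.shape[1])
    for sense in np.unique(sense_ids):
        group = vectors[sense_ids == sense]
        group_mean = group.mean(axis=0)
        between += len(group) * (group_mean - overall_mean) ** 2
        within += ((group - group_mean) ** 2).sum(axis=0)
    scores = between / (within + 1e-12)  # small constant avoids division by zero
    return scores / scores.sum()


def keep_dimensions(weights, threshold):
    """Return the indices of dimensions whose weight reaches the threshold."""
    return np.flatnonzero(weights >= threshold)


# Toy data (assumed): 4-dimensional vectors for one word used in two senses.
rng = np.random.default_rng(0)
n = 50
sense_ids = np.array([0] * n + [1] * n)
vectors = rng.normal(size=(2 * n, 4))
vectors[sense_ids == 1, 0] += 5.0  # only dimension 0 carries the sense signal

weights = dimension_weights(vectors, sense_ids)
kept = keep_dimensions(weights, threshold=0.5)
```

In this toy setup only dimension 0 distinguishes the two senses, so it receives nearly all of the weight and is the only dimension kept. The per-dimension weights could also be plotted as a bar chart to visualize which dimensions matter for a sense.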

Table of Contents

1 Introduction

1.1 Problem Description

1.2 Motivation

1.3 Objectives

1.4 Related Works

2 Background

2.1 Word embedding extraction

2.2 Embedding evaluations

2.2.1 Intrinsic evaluations

2.2.2 Extrinsic evaluations

2.2.3 Intrinsic and extrinsic evaluations

2.3 Sense Vector extraction

2.4 Neural Networks

2.4.1 Convolutional Neural Network

2.4.2 Long Short-Term Memory Networks

2.4.3 Transformer

3 Qualitative Evaluation of Contextual Word Embeddings

3.1 Embedding Model Structures

3.2 Evaluations on Conversational Dataset

3.2.1 Dataset

3.2.2 Embedding Models Used

3.2.3 Approach

3.2.4 Result

4 Representation of Sense Dimensions in Word Embeddings

4.1 Models used

4.2 Datasets

4.3 Methods

4.4 Algorithm based on PCA

4.4.1 Our Algorithm

5 Test and Evaluation

5.1 Test Results

5.2 Evaluation of Algorithm

6 Conclusion

6.1 Limitation and Future Work

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
