Transformers to Learn Hierarchical Contexts in Multiparty Dialogue

Li, Changmao (Spring 2020)

Permanent URL: https://etd.library.emory.edu/concern/etds/5h73px09v?locale=en
Published

Abstract

This thesis introduces a novel approach to transformers that learns hierarchical representations in multiparty dialogue. First, three language modeling tasks are used to pre-train the transformers: token-level masked LM, utterance-level masked LM, and utterance order prediction. These tasks learn both token and utterance embeddings for a better understanding of dialogue contexts. Then, multi-task learning between utterance ID prediction and token span prediction is applied to fine-tune for span-based question answering (QA). Our approach is evaluated on the FriendsQA dataset and shows improvements of 3.7% and 1.4% over the two state-of-the-art transformer models, BERT and RoBERTa, respectively.
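The fine-tuning objective described above, multi-task learning over utterance ID prediction and token span prediction, can be sketched as a joint loss that sums a cross-entropy term per sub-task. This is a minimal illustrative sketch, not the thesis's implementation: the function names and toy logits are assumptions, and a real model would produce these logits from transformer outputs.

```python
import numpy as np

def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def cross_entropy(logits, gold):
    """Negative log-likelihood of the gold index under softmax(logits)."""
    return -np.log(softmax(logits)[gold])

def multitask_qa_loss(utt_logits, start_logits, end_logits,
                      gold_utt, gold_start, gold_end):
    """Joint fine-tuning loss: utterance ID prediction (which utterance
    contains the answer) plus token span prediction (start and end
    positions of the answer span)."""
    return (cross_entropy(utt_logits, gold_utt)
            + cross_entropy(start_logits, gold_start)
            + cross_entropy(end_logits, gold_end))

# Toy example: 3 candidate utterances, a 4-token context window.
utt_logits = np.array([2.0, 0.5, -1.0])
start_logits = np.array([0.1, 3.0, 0.2, 0.0])
end_logits = np.array([0.0, 0.1, 2.5, 0.3])
loss = multitask_qa_loss(utt_logits, start_logits, end_logits, 0, 1, 2)
```

Summing the per-task losses lets gradients from the coarse utterance-level signal and the fine token-level signal update the shared encoder together; in practice the terms are often weighted.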

Table of Contents

1 Introduction

2 Background

2.1 Language Modeling and Word Embedding

2.2 Transformers and Multi-head Attention

2.3 Related Question Answering Tasks

2.4 Related Transformer-Based Approach

2.5 Character Mining Project

3 Approach

3.1 Transformers for Learning Dialogue

3.2 Pre-training Language Models

3.2.1 Token-level Masked LM

3.2.2 Utterance-level Masked LM

3.2.3 Utterance Order Prediction

3.3 Fine-tuning for QA on Dialogue

3.3.1 Utterance ID Prediction

3.3.2 Token Span Prediction

4 Experiments

4.1 Corpus

4.2 Models

4.3 Results

4.4 Other Experimental Details

5 Analysis

5.1 Ablation Studies Analysis

5.2 Question Type Analysis

5.3 Error Analysis

6 Conclusion and Future Directions

6.1 Conclusion

6.2 Future Directions

Appendix A - Results for Other Character Mining Tasks

A.1 Friends Reading Comprehension Task

A.1.1 Task Description

A.1.2 Results and Analysis

A.2 Friends Emotion Detection Task

A.2.1 Task Description

A.2.2 Results and Analysis

A.3 Friends Personality Detection Task

A.3.1 Task Description

A.3.2 Results and Analysis

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English