Transformers to Learn Hierarchical Contexts in Multiparty Dialogue
Li, Changmao (Spring 2020)
Abstract
This thesis introduces a novel transformer-based approach that learns hierarchical representations of multiparty dialogue. First, three language modeling tasks are used to pre-train the transformers: token-level masked language modeling, utterance-level masked language modeling, and utterance order prediction, which jointly learn token and utterance embeddings for a better understanding of dialogue contexts. Then, multi-task learning between utterance ID prediction and token span prediction is applied to fine-tune the model for span-based question answering (QA). Our approach is evaluated on the FriendsQA dataset and shows improvements of 3.7% and 1.4% over the two state-of-the-art transformer models, BERT and RoBERTa, respectively.
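The abstract describes pre-training with three objectives at once (token-level masked LM, utterance-level masked LM, and utterance order prediction). One common way to realize such multi-objective pre-training is a weighted sum of the per-task losses; the sketch below illustrates that idea only. The function name and the equal default weights are assumptions for illustration, not the thesis's actual implementation.

```python
def combined_pretraining_loss(token_mlm_loss, utterance_mlm_loss,
                              order_pred_loss, weights=(1.0, 1.0, 1.0)):
    """Hedged sketch: combine the three pre-training objectives named in the
    abstract into one scalar training loss via a weighted sum. Each argument
    is the scalar loss of one task for the current batch."""
    w_tok, w_utt, w_ord = weights
    return (w_tok * token_mlm_loss
            + w_utt * utterance_mlm_loss
            + w_ord * order_pred_loss)

# Example: equal weighting simply sums the three task losses.
total = combined_pretraining_loss(0.9, 1.2, 0.3)  # → 2.4
```

In practice the weights would be tuned (or the tasks alternated per batch); the thesis may use a different combination scheme.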
Table of Contents
1 Introduction
2 Background
2.1 Language Modeling and Word Embedding
2.2 Transformers and Multi-head Attention
2.3 Related Question Answering Tasks
2.4 Related Transformer-Based Approach
2.5 Character Mining Project
3 Approach
3.1 Transformers for Learning Dialogue
3.2 Pre-training Language Models
3.2.1 Token-level Masked LM
3.2.2 Utterance-level Masked LM
3.2.3 Utterance Order Prediction
3.3 Fine-tuning for QA on Dialogue
3.3.1 Utterance ID Prediction
3.3.2 Token Span Prediction
4 Experiments
4.1 Corpus
4.2 Models
4.3 Results
4.4 Other Experimental Details
5 Analysis
5.1 Ablation Studies Analysis
5.2 Question Type Analysis
5.3 Error Analysis
6 Conclusion and Future Directions
6.1 Conclusion
6.2 Future Directions
Appendix A - Results for Other Character Mining Tasks
A.1 Friends Reading Comprehension Task
A.1.1 Task Description
A.1.2 Results and Analysis
A.2 Friends Emotion Detection Task
A.2.1 Task Description
A.2.2 Results and Analysis
A.3 Friends Personality Detection Task
A.3.1 Task Description
A.3.2 Results and Analysis