Transformers to Learn Hierarchical Contexts in Multiparty Dialogue
Li, Changmao (Spring 2020)
Abstract
This thesis introduces a novel transformer-based approach that learns hierarchical representations of multiparty dialogue. First, three language modeling tasks are used to pre-train the transformers: token-level masked language modeling, utterance-level masked language modeling, and utterance order prediction, which jointly learn token and utterance embeddings for a better understanding of dialogue contexts. Then, multi-task learning between utterance ID prediction and token span prediction is applied to fine-tune the model for span-based question answering (QA). Our approach is evaluated on the FriendsQA dataset and shows improvements of 3.7% and 1.4% over the two state-of-the-art transformer models, BERT and RoBERTa, respectively.
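The abstract describes pre-training with three objectives at once (token-level masked LM, utterance-level masked LM, and utterance order prediction). One common way to realize such multi-objective pre-training is a weighted sum of the per-task losses; the sketch below illustrates that idea only. The function name and the equal default weights are assumptions for illustration, not the thesis's actual implementation.

```python
def combined_pretraining_loss(token_mlm_loss, utterance_mlm_loss,
                              order_pred_loss, weights=(1.0, 1.0, 1.0)):
    """Hedged sketch: combine the three pre-training objectives named in the
    abstract into one scalar training loss via a weighted sum. Each argument
    is the scalar loss of one task for the current batch."""
    w_tok, w_utt, w_ord = weights
    return (w_tok * token_mlm_loss
            + w_utt * utterance_mlm_loss
            + w_ord * order_pred_loss)

# Example: equal weighting simply sums the three task losses.
total = combined_pretraining_loss(0.9, 1.2, 0.3)  # → 2.4
```

In practice the weights would be tuned (or the tasks alternated per batch); the thesis may use a different combination scheme.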
Table of Contents
1 Introduction
2 Background
2.1 Language Modeling and Word Embedding
2.2 Transformers and Multi-head Attention
2.3 Related Question Answering Tasks
2.4 Related Transformer-Based Approach
2.5 Character Mining Project
3 Approach
3.1 Transformers for Learning Dialogue
3.2 Pre-training Language Models
3.2.1 Token-level Masked LM
3.2.2 Utterance-level Masked LM
3.2.3 Utterance Order Prediction
3.3 Fine-tuning for QA on Dialogue
3.3.1 Utterance ID Prediction
3.3.2 Token Span Prediction
4 Experiments
4.1 Corpus
4.2 Models
4.3 Results
4.4 Other Experimental Details
5 Analysis
5.1 Ablation Studies Analysis
5.2 Question Type Analysis
5.3 Error Analysis
6 Conclusion and Future Directions
6.1 Conclusion
6.2 Future Directions
Appendix A - Results for Other Character Mining Tasks
A.1 Friends Reading Comprehension Task
A.1.1 Task Description
A.1.2 Results and Analysis
A.2 Friends Emotion Detection Task
A.2.1 Task Description
A.2.2 Results and Analysis
A.3 Friends Personality Detection Task
A.3.1 Task Description
A.3.2 Results and Analysis