Automatic Generation of Multi-turn Dialogues from Reddit

Open Access

Hutsell, William (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/r781wh429?locale=pt-BR%2A
Published

Abstract

High-quality multi-turn dialogue datasets are scarce in the field of Natural Language Processing, and with the recent rise of chatbots powered by seq2seq models that train on these datasets, they have become more important than ever. This thesis describes a model built to deconstruct Reddit posts and sequence the fragments into high-quality, multi-turn, topic-specific conversations. The model uses a post's content as the framework for one speaker's statements in a conversation and fills in the second speaker's utterances with comments left on the same post. Using this method, we generated a dataset of 951 dialogues comprising conversations across two topics: movies and books. This dataset, HuHu, was then manually evaluated against DailyDialog, Topical-Chat, and MultiWOZ, three good-quality datasets with ~10,000 dialogues constructed in varying ways. The results showed that our generated dialogues were considered more natural in 46% of comparisons and at least as natural in 73% of comparisons. This is a strong result given that our model can generate millions of dialogues across any number of topics, limited only by the number of related Reddit posts. Future work on dialogue assembly models appears very promising and could soon yield dialogues at a near-human level.
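The assembly idea described in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the sentence splitter, the word-overlap scorer, and the names `split_sentences`, `overlap_score`, and `assemble_dialogue` are all assumptions introduced here for illustration. The thesis's table of contents indicates that learned scorers (BERT next sentence prediction, DialogRPT) and beam search were explored, so the overlap score below is only a simple stand-in.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break on whitespace that follows ., ! or ?."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokens(text):
    """Lowercased set of word tokens."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def overlap_score(statement, reply):
    """Stand-in relevance score (assumption): Jaccard overlap of word sets."""
    a, b = tokens(statement), tokens(reply)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def assemble_dialogue(post, comments, score=overlap_score):
    """Use post sentences as speaker A's turns and, after each one,
    insert the highest-scoring unused comment as speaker B's turn."""
    remaining = list(comments)
    dialogue = []
    for sentence in split_sentences(post):
        dialogue.append(("A", sentence))
        if not remaining:
            continue  # no comments left; speaker A keeps talking
        best = max(remaining, key=lambda c: score(sentence, c))
        remaining.remove(best)
        dialogue.append(("B", best))
    return dialogue


if __name__ == "__main__":
    post = "I just watched Dune. The soundtrack was amazing."
    comments = [
        "The soundtrack gave me chills.",
        "Dune was great, I watched it twice.",
    ]
    for speaker, utterance in assemble_dialogue(post, comments):
        print(f"{speaker}: {utterance}")
```

The `score` parameter is left pluggable so that a learned response-ranking model could replace the overlap heuristic without changing the assembly loop.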

Table of Contents

1 Introduction

1.1 The Importance of Dialogue

1.2 The Dialogue Assembly Task

1.3 Thesis Statement

2 Background

2.1 The Current State of Dialogue in the Field of Natural Language Processing

2.2 Establishing Feasibility

2.3 Initial Ideas

2.4 Beneficial Reddit Properties

2.5 Connections to the Emory Natural Language Processing Lab

2.6 Collaboration

2.7 Beginnings

3 Evaluation Techniques

3.1 Automatic Dialogue Metrics

3.2 Rubric for Manual Evaluation of Generated Dialogues

4 Development of Model

4.1 Initial Control Flow

4.2 Next Sentence Prediction: BERT vs DialogRPT

4.3 Initial Generated Data and Error Analysis

4.4 Continuing Development

4.5 Advanced Control Flow

4.6 Beam Search

4.7 Threading

5 Evaluation of Final Model

5.1 Evaluation Set-up

5.2 Comparison Datasets

5.3 Amazon Mechanical Turk Task Design

5.4 Results

5.4.1 Disproving Turk Results

5.4.2 Manual Annotation Results

6 Discussion

6.1 Model Analysis

6.1.1 Model Strengths

6.1.2 Current Flaws

6.1.3 Bias Propagation

6.1.4 Future Work and Difficulties

6.2 Data Analysis

6.3 Data Examples

6.4 Analysis of Dialogues from Comparison Datasets

7 Conclusion

Appendix A Core Approach Pseudocode

Appendix B Comparison Dataset Example Dialogues

B.1 Topical-Chat

B.2 MultiWOZ

B.3 DailyDialog

Appendix C Example Dialogue from Each Approach

Appendix D Punctuation Excluded

Appendix E Example Post and Comments

Appendix F Amazon Turk Detailed Results

Bibliography

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English