Contextual Embedding Representations for Retrieval-Based and Generation-Based Dialogue Systems Open Access

Wang, Zihao (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/dz010r397?locale=en%255D
Published

Abstract

Context is crucial for conversational agents to conduct natural and engaging conversations with human users. A context-aware agent can capture, understand, and use relevant information such as named entity mentions, topics of interest, user intents, and emotional semantics. However, incorporating contextual information into dialogue systems is challenging: it takes many forms, and a system must decide which information is most relevant and how to organize and integrate it.

To address these challenges, this thesis explores and experiments with different kinds of contextual information in the embedding space across different models and tasks. It also develops models that overcome two limitations of state-of-the-art language models: the maximum number of tokens they can encode and their inability to fuse arbitrary forms of contextual information. Additionally, diarization methods are explored to resolve speaker ID errors in the transcriptions, which is crucial for training on dialogue data.

The proposed models address the challenges of integrating context into retrieval-based and generation-based dialogue systems. In retrieval-based systems, candidate responses from different components are ranked, and the top-ranked response is returned. A contextualized conversational ranking model is proposed and evaluated on the MSDialog benchmark conversational corpus. It incorporates three types of contextual information: previous conversation utterances from both speakers, semantically similar response candidates, and domain information associated with each candidate response. The contextual response ranking model outperformed the state-of-the-art models reported in previous research, showing the potential of incorporating various forms of context into modeling.
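As a rough illustration of context-aware response ranking, the sketch below scores each candidate against the conversation history and adds a bonus when the candidate's domain matches the conversation's domain. It is not the thesis's neural model: the bag-of-words cosine similarity, the `rank_responses` function, and the `domain_bonus` weight are all assumptions made for illustration, and the third context type (semantically similar response candidates) is left out.

```python
import math
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector (hypothetical stand-in for an embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors; 0.0 if either is empty."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_responses(history, candidates, conversation_domain, domain_bonus=0.1):
    """Rank (text, domain) candidates by similarity to the history plus a domain bonus."""
    context = bow(" ".join(history))
    scored = []
    for text, domain in candidates:
        score = cosine(context, bow(text))
        if domain == conversation_domain:
            score += domain_bonus  # assumed weight, not taken from the thesis
        scored.append((score, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored]


history = ["my laptop wifi keeps dropping", "did you update the wifi driver"]
candidates = [
    ("restart the printer spooler", "office"),
    ("try reinstalling the wifi driver", "windows"),
]
ranked = rank_responses(history, candidates, "windows")
print(ranked[0])
```

A learned ranker would replace the hand-written score with a model trained on labeled conversations, but the inputs it consumes (history, candidates, domain) would be organized in much the same way.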

In generation-based systems, a generative model produces the response returned to the conversing party. A generative model is built on top of the Blenderbot model and overcomes its limitations in order to integrate two types of contextual information: previous conversation utterances from both conversing parties, and heuristically identified stacked questions, which reduce repetition and provide topical diversity in the generated dialogue. The models are trained on an interview dataset and evaluated both on an annotated test set and by professional interviewers and students in real conversations. The average satisfaction score from professional interviewers and students is 3.5 out of 5, which is promising for future applications.
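The table of contents lists a sliding-window component (Section 4.3.1) among the generation techniques, and the abstract mentions the token limit of the underlying language model. One common way to fit a long conversation into a fixed budget is to keep only the most recent utterances. The sketch below shows that idea under stated assumptions: the `sliding_window` function is hypothetical, tokens are approximated by whitespace splitting rather than the model's tokenizer, and the thesis's actual method may differ.

```python
def sliding_window(utterances, max_tokens):
    """Keep the most recent utterances whose total token count fits max_tokens.

    Tokens are counted by whitespace splitting, which is only an
    approximation of a real subword tokenizer.
    """
    kept = []
    used = 0
    # Walk backwards from the newest utterance and stop at the first one
    # that would exceed the budget.
    for utterance in reversed(utterances):
        n_tokens = len(utterance.split())
        if used + n_tokens > max_tokens:
            break
        kept.append(utterance)
        used += n_tokens
    kept.reverse()  # restore chronological order
    return kept


dialogue = [
    "Tell me about yourself.",
    "I grew up in a small town and love robotics.",
    "What drew you to robotics?",
]
window = sliding_window(dialogue, max_tokens=16)
print(window)
```

Dropping whole utterances rather than cutting one in the middle keeps each remaining turn intact, at the cost of sometimes leaving part of the budget unused.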

Additionally, to better understand topics of interest, topical clustering and diversity are investigated by grouping topics and analyzing the topic flow in the interview conversations. Frequently occurring topic clusters give a clear picture of the range of topics an interview typically covers, while each individual interview still retains a wide selection of unique topics. Based on this observation, the thesis discusses how such characteristics could be incorporated to improve conversational dialogue models.

Table of Contents

1 Introduction

1.1 Response Ranking in Dialogue Systems

1.2 End-to-end Generative Dialogue Systems

1.3 Topic Flow in Conversations

1.4 Summary of Contributions

1.5 Thesis Structures

2 Background

2.1 Alexa Prize

2.1.1 Dialogue Manager Framework Implementation and Design

2.1.2 Contextualized Proactivity

2.2 Dialogue Systems

2.3 Response Ranking

2.3.1 Learning to rank

2.3.2 Neural response ranking models

2.3.3 Topic modeling and classification in dialogues

2.3.4 Ranking with integration of external knowledge

2.4 Dialogue Generation

2.4.1 Dialogue Generation Models

2.4.2 Current Applications of Generative Dialogue Systems For Admission Interviews

3 Contextual Response Ranking

3.1 Response Ranking Strategy in Irisbot

3.2 Contextualized Response Ranking Models

3.2.1 Task Formulation

3.3 Approach and Implementation

3.3.1 Approach Overview

3.3.2 Model Architecture

3.4 Experiments

3.4.1 Dataset

3.4.2 Experimental Setup

3.4.3 Model evaluation

3.5 Discussion and Conclusion

4 Response Generation in Dialogue Systems – InterviewBot

4.1 Interview Dataset

4.2 Speaker Diarization

4.2.1 Manual Annotation

4.2.2 Pseudo Annotation

4.2.3 Joint Model

4.2.4 Experiments

4.3 Dialogue Generation

4.3.1 Sliding Window

4.3.2 Context Attention

4.3.3 Question Storing

4.4 Experiments

4.4.1 Static Evaluation

4.4.2 Real-time Evaluation

4.5 Limitations

4.6 Conclusion

5 Topical Investigation in Interview Conversations

5.1 Interview Topic Data Processing

5.2 Interview Data Topic Clustering

5.3 Topic Flow in Interview Conversations

6 Summary, Discussion, and Future Work

A Appendix

A.1 Interviewee Demographics

A.1.1 Examples of Diarization Errors

A.1.2 Examples of Generation Limitations

Bibliography

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English