User Satisfaction Prediction in Open-Domain Conversational Systems

Choi, Jason Ingyu (Summer 2020)

Permanent URL: https://etd.library.emory.edu/concern/etds/qr46r201z?locale=fr
Published

Abstract

As voice-based assistants such as Alexa, Siri, and Google Assistant become ubiquitous, users increasingly expect to hold natural and informative conversations with such systems. For open-domain conversations to be engaging, systems must maintain the user's interest for extended periods without sounding boring or annoying. Unfortunately, evaluating success and failure remains challenging for several reasons: (1) open-domain conversations do not have predefined goals; (2) satisfaction is highly subjective, depending on both the user's preferences and system performance; (3) extracting and understanding user behaviors in open-domain conversations remains underexplored; (4) creating an experimental setting with a functional conversational system requires significant engineering effort.

In this thesis, I propose a new satisfaction prediction model named ConvSAT that addresses these challenges. First, ConvSAT introduces a behavioral feature matrix that breaks down user behavior and system states into individual features, allowing ConvSAT to jointly model heterogeneous signals. Moreover, since many features are generated with direct supervision, measuring feature importance provides a good means of identifying positively and negatively correlated behaviors. Second, many previous studies restricted satisfaction prediction to the offline setting (prediction after the entire conversation). In contrast, ConvSAT supports both offline evaluation and online prediction (prediction at each turn), which can be used as live feedback for adaptive dialogue strategies.

I validate the generality of ConvSAT through several applications, implemented as part of the Alexa Prize challenges and the Dialogue Breakdown Detection Challenge 3. Lastly, this thesis demonstrates one application of ConvSAT: quantifying the effects of prosody modulation (i.e., changing the pitch and cadence of the system response to indicate delight, sadness, or other common emotions) on user satisfaction. Together, the results and insights in this thesis provide promising directions for developing a new generation of more responsive and intelligent conversational agents.

Table of Contents

1 Introduction

1.1 Summary and Contributions

2 Background and Related Work

2.1 Types of Conversational Systems

2.1.1 Rule-based Conversational Systems

2.1.2 End-to-End Conversational Systems

2.1.3 Hybrid Conversational Systems

2.2 Conversational System Evaluation

2.2.1 Previously Proposed Satisfaction Metrics

2.2.2 Satisfaction Prediction

3 Conversational Dataset

3.1 Dialogue Breakdown Detection Challenge

3.1.1 DBDC3 Dataset Statistics

3.2 Amazon Alexa Prize 2018

3.2.1 Irisbot

3.2.2 Alexa Prize Dataset Statistics

3.2.3 User rating vs. user satisfaction

3.2.4 Annotating online satisfaction labels

4 ConvSAT: Conversational Satisfaction Prediction

4.1 ConvSAT: Method Description

4.1.1 Model Architecture

4.1.2 Behavioral Features

4.1.3 Additional Implementation Details

5 Experiments and Main Results

5.1 Experimental Setting

5.1.1 Label generation for Alexa Prize dataset

5.1.2 Baseline Methods

5.1.3 Prediction Tasks

5.1.4 Evaluation Metrics and Training Details

5.2 Main Results

5.2.1 Dialogue Breakdown Detection Results

5.2.2 Online Satisfaction Prediction Results

5.2.3 Offline Satisfaction Prediction Results

5.3 Discussion and Error Analysis

5.3.1 Generalizing from Heuristic Labels

5.3.2 Feature Ablation

5.3.3 Importance of Behavioral Features

5.3.4 Representative Error Analysis

6 Applications of ConvSAT

6.1 Quantifying Prosody Modulation Effects

6.1.1 Controlled Dataset Selection

6.1.2 Proposed Metrics

6.1.3 Pre-training Details

6.1.4 Results and Discussion

7 Conclusions

7.1 Summary of the Results

7.2 Contributions and Future Work

7.3 Acknowledgment

Bibliography

About this Master's Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
