Towards Personality Trait Prediction from Chatbot Conversations Using Machine Learning with Domain Adaptation

Open Access

Sun, Mingyang (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/08612p57x?locale=en%255D
Published

Abstract

Accurate personality prediction has proven useful for tasks such as solving the cold-start problem in personalized recommendation[1]. In recent years, a number of studies have applied natural language processing (NLP) techniques and machine learning algorithms to personality prediction in different areas: written texts[2], movie scripts[3] and social media[4]. In the field of open-domain conversations, however, automatic personality trait detection has only been studied on natural human-human conversations, not on human-machine conversations. We therefore present the first study on personality trait prediction from open-domain conversations with a chatbot.

 

As intelligent assistants such as Google Assistant, Apple Siri and Amazon Alexa have gained popularity with the development of mobile devices, personality prediction on human-machine conversation data has potentially extensive uses. The news recommendation function in these intelligent assistant systems, for example, can take users’ personality as a reference: users with a positive score on the openness trait tend to be interested in aesthetic activities, so they may want to hear trending news about new art shows, exhibitions and movies, while users with high conscientiousness might be more attracted by things happening in the White House. We therefore believe that detecting personality traits during conversations with users is both a challenging and a valuable task.

 

In this thesis, we confirm the feasibility of user personality trait recognition in open-domain human-machine conversations. We explore three methods: 1) models learned on engineered features, 2) models learned on transformed features mapped by linking functions constructed through heterogeneous domain adaptation, and 3) domain adaptation approaches applied to transformed features with social media data as the auxiliary task. The experimental results on real conversations with users support the feasibility of this task and suggest promising directions for future research.
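
The thesis outline lists simple feature augmentation among its domain adaptation approaches. A minimal sketch of one common form of that idea follows, assuming both domains have already been mapped into a shared feature space of equal dimension. The function name `augment`, the domain labels, and the toy vectors are hypothetical and are not taken from the thesis; the exact scheme the author uses may differ.

```python
from typing import List

Vector = List[float]


def augment(features: Vector, domain: str) -> Vector:
    """Expand a d-dimensional vector to 3d dimensions.

    Layout: [shared copy, source-only copy, target-only copy].
    A source example fills the first two blocks, a target example
    fills the first and third; the unused block is all zeros.
    """
    zeros = [0.0] * len(features)
    if domain == "source":
        return features + features + zeros
    if domain == "target":
        return features + zeros + features
    raise ValueError(f"unknown domain: {domain!r}")


# Hypothetical toy vectors standing in for already-aligned features.
social_media_example = [0.2, 0.7]  # auxiliary (source) domain
chatbot_example = [0.5, 0.1]       # conversation (target) domain

augmented_source = augment(social_media_example, "source")
augmented_target = augment(chatbot_example, "target")

print(augmented_source)
print(augmented_target)
```

A single model trained on the augmented vectors from both domains can then weight the shared block for patterns common to social media and chatbot text, and the domain-specific blocks for patterns that hold in only one of them.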

 

Table of Contents

1 Introduction
1.1 Background and Motivation
1.2 Problem Statement
1.3 Research Overview
1.4 Proposed Methods
1.4.1 Feature Matrix Construction
1.4.2 Domain Adaptation
1.5 Contributions
2 Related Work
2.1 Overview: Personality in Psychology
2.2 Personality Prediction in Texts
2.3 Personality Prediction in Social Media
2.4 Personality Prediction in Conversations
3 Datasets
3.1 MyPersonality Project Dataset
3.2 Alexa Prize Chatbot Conversation Dataset
4 Methodology
4.1 Feature Engineering
4.1.1 General Feature Engineering
4.1.2 Specific Feature Engineering on Social Media Dataset
4.1.3 Specific Feature Engineering on Open-domain Human-machine Conversation Dataset
4.2 Domain Adaptation
4.2.1 Simple Feature Augmentation
4.2.2 Heterogeneous Domain Adaptation
4.2.3 Stacked Denoising Autoencoders
5 Experiments and Results
5.1 Experiments
5.1.1 Settings
5.1.2 Evaluation Metrics
5.2 Results
5.2.1 Overall Results
5.2.2 Analysis of Transfer Learning (Domain Adaptation) in Personality Prediction
5.2.3 Comparison between Feature Augmentation and sDAE
6 Conclusion and Future Work
Bibliography

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
Language
  • English