Cloud-based Active Learning System for Question Answering on Multiparty Dialogue

Gao, Shen (Spring 2019)

Permanent URL: https://etd.library.emory.edu/concern/etds/hd76s109j?locale=es
Published

Abstract

This thesis presents the design and architecture of an Active Learning system for Question Answering on Multiparty Dialogue. The goal of the system is to collect a robust Question Answering dataset and to improve its performance on Question Answering challenges on Multiparty Dialogue. The system has an interactive web-based user interface that allows users to challenge it with their own questions about a short passage of dialogue between multiple characters in a TV series. It uses a state-of-the-art Machine Learning model to predict the answers to users’ questions. At the same time, the system learns from users’ responses and performs online updates on the model. The system uses probability functions to guide users toward contributing the data most needed for model improvement. The system is designed to handle high internet traffic by storing data efficiently and by carefully synchronizing the shared resources in the web system. The system has shown promising results in guiding users to contribute high-quality data useful for model training.

Table of Contents

1 Introduction
1.1 Question Answering
1.2 Question Answering on Dialogue
1.3 Active Learning
1.4 Layout of the thesis
2 Background
2.1 QA question types
2.1.1 By Answer Formats
2.1.2 By Degree of Inference
2.2 Annotation
2.2.1 Amazon mTurk
2.2.2 Process of Annotations
2.3 Friends Dataset
2.4 Baseline Model
3 Approach
3.1 User Interaction
3.2 User Guidance
3.3 Architecture
3.3.1 Database
3.3.2 Application Programming Interface
3.3.3 User Interface
3.4 BERT Service
3.4.1 Snapshot
4 Experiment
4.1 Pre-Deployment
4.1.1 User Guidance
4.1.2 Environment
4.1.3 Latency Test
4.1.4 Concurrency Test
4.2 Result
4.2.1 Data Collecting & Model Improvement
4.2.2 User-Model F-1
5 Conclusion

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
