Analysis of Temporal Relations in Various Types of Text Data

Chen, Yingying (Spring 2022)

Permanent URL: https://etd.library.emory.edu/concern/etds/7h149r37p?locale=de
Published

Abstract

Detecting the temporal relations between events in a text is a complex natural language understanding task, yet recovering the timeline of events is key to improving machine comprehension. Previous work has specified approaches to identifying events in texts, proposing appropriate temporal relations, and ordering events with respect to one another. However, the vast majority of existing temporal dependency annotation has been carried out on simple narrative text or news sources, and these annotation schemes are not always applicable to noisy, highly variable social media texts such as Reddit posts. We devise a more generalized and robust scheme to support a broader range of text annotation. In this research, we aim to 1) improve existing annotation guidelines for more complex sentence structures, 2) evaluate annotation performance among student annotators to achieve competitive inter-annotator agreement scores, 3) quantify the characteristics unique to Reddit text and provide a statistical analysis of the difficulties encountered when annotating Reddit data, and 4) compare and contrast the effectiveness of our temporal annotation scheme across three diverse sources: children’s stories, social media texts, and news articles. The results show that our annotation scheme is effective in identifying events, with high inter-annotator agreement scores, but there is still room for improvement in identifying timelines of events. Our results also show the challenges of creating a unified temporal relation scheme for different types of text. These challenges lead to a discussion of how to evaluate the effectiveness of temporal relation schemes.

Table of Contents

1 Introduction

1.1 Motivation

1.2 Thesis Statement

1.3 Objectives

2 Background

2.1 Related Work

2.2 The Use of Social Media Texts (Reddit Corpus)

2.3 Previous Attempts of This Research

3 Methodology

3.1 Data

3.2 Annotation Procedure

3.3 Evaluation Metrics

3.4 Three-Stage Annotation Method

3.5 Annotation Core Rules

3.5.1 The Definition of Event

3.5.2 The Definition of Time Reference

3.5.3 The Annotation of Event (Basic Type)

3.5.4 The Annotation of Event (Considering Context in Discourse)

3.5.5 The Definition of Temporal Relation

3.5.6 The Annotation of Relation

3.5.7 Pair Identification

4 Discussion & Future Work

4.1 Annotation Round 1 (Social Media Texts)

4.2 Annotation Round 2 (Children’s Stories)

4.3 Annotation Round 3 (Small-Scale Children’s Stories)

4.4 Annotation Round 4 (Large-Scale in Social Media Texts, Children’s Stories, News Report)

5 Discussion & Future Work

6 Conclusion

About this Honors Thesis

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
School
Department
Degree
Submission
Language
  • English
Research Field
Keyword
Committee Chair / Thesis Advisor
Committee Members
Last modified
