Analysis of Temporal Relations in Various Types of Text Data
Open Access
Chen, Yingying (Spring 2022)
Abstract
Detecting the temporal relations between events in a text is a complex natural language understanding task, yet establishing the timeline of events is key to improving machine comprehension. Previous work has specified approaches to identifying events in text, proposed appropriate temporal relations, and described ways to order events with respect to one another. However, the vast majority of existing temporal dependency annotation has been carried out on simple narrative text or news sources, and the resulting annotation schemes are not always applicable to noisy, highly variable social media texts such as Reddit posts. We devise a more generalized and robust scheme to support a broader range of text annotation. In this research, we aim to 1) improve existing annotation guidelines to handle more complex sentence structures, 2) evaluate annotation performance among student annotators, with the goal of achieving competitive inter-annotator agreement scores, 3) quantify the characteristics unique to Reddit text and provide a statistical analysis of the difficulties encountered when annotating Reddit data, and 4) compare the effectiveness of our temporal annotation scheme across three diverse sources: children’s stories, social media texts, and news articles. The results show that our annotation scheme identifies events effectively, with high inter-annotator agreement scores, but that there is still room for improvement in identifying the timelines of events. Our results also show the challenges of creating a unified temporal relation scheme for different types of text. These challenges lead to a discussion of how to evaluate the effectiveness of temporal relation schemes.
Table of Contents
1 Introduction
1.1 Motivation
1.2 Thesis Statement
1.3 Objectives
2 Background
2.1 Related Work
2.2 The Use of Social Media Texts (Reddit Corpus)
2.3 Previous Attempts of This Research
3 Methodology
3.1 Data
3.2 Annotation Procedure
3.3 Evaluation Metrics
3.4 Three-Stage Annotation Method
3.5 Annotation Core Rules
3.5.1 The Definition of Event
3.5.2 The Definition of Time Reference
3.5.3 The Annotation of Event (Basic Type)
3.5.4 The Annotation of Event (Considering Context in Discourse)
3.5.5 The Definition of Temporal Relation
3.5.6 The Annotation of Relation
3.5.7 Pair Identification
4 Discussion & Future Work
4.1 Annotation Round 1 (Social Media Texts)
4.2 Annotation Round 2 (Children’s Stories)
4.3 Annotation Round 3 (Small-Scale Children’s Stories)
4.4 Annotation Round 4 (Large-Scale in Social Media Texts, Children’s Stories, News Report)
5 Discussion & Future Work
6 Conclusion
About this Honors Thesis
School | |
---|---|
Department | |
Degree | |
Submission | |
Language | |
Research Field | |
Keyword | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Thumbnail | Title | Date Uploaded | Actions |
---|---|---|---|
 | Analysis of Temporal Relations in Various Types of Text Data | 2022-04-12 23:53:54 -0400 | |