Hierarchical Entity Extraction and Ranking with Unsupervised Graph Convolutions (Public)

Liu, Zhexiong (Spring 2020)

Permanent URL: https://etd.library.emory.edu/concern/etds/b8515p57p?locale=es

Published

Abstract

Entity extraction has been studied extensively as the task of extracting entities from text using natural language processing (NLP). Most research involves training learnable models on large corpora to extract entities and determine their salience. Typically, these systems aim to retrieve a ranked array of entities from a set of documents given queries, and so mainly measure the relevance between queries and entities. In contrast, this thesis leverages semantic and syntactic information within the documents to perform entity extraction as well as entity ranking. In particular, given a document corpus, constituency parsing trees are constructed to extract entity mentions (phrases) for each article. Meanwhile, dependency parsing trees and entity coreference clusters are employed to build a relation graph, in which nodes denote entity mentions and edges denote mention relations. Moreover, graph convolution is performed on the relation graph to normalize the mention representation with respect to mention embeddings. Hierarchical density-based clustering and a ranking mechanism are applied to compute entity priors. To evaluate this work, three models are proposed and evaluated on 60 annotated articles. Preliminary results illustrate that the use of parsing trees, along with entity coreference relations, improves the effectiveness of entity extraction and ranking. The hierarchical trees for entity extraction, the principles for graph construction, and the system architecture serve as the main contributions of this thesis.
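
The graph convolution step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the thesis's exact formulation, so this example assumes the common symmetric normalization D^{-1/2}(A + I)D^{-1/2} applied to mention embeddings with no learned weights. The mention strings, the two-dimensional embeddings, the edge list, and the function names are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected relation graph."""
    adj = np.eye(num_nodes)  # self-loops keep each mention's own embedding
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    inv_sqrt_deg = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * inv_sqrt_deg[:, None] * inv_sqrt_deg[None, :]


def graph_convolve(embeddings, edges, num_layers=1):
    """Smooth mention embeddings over the graph without trainable parameters."""
    a_hat = normalized_adjacency(embeddings.shape[0], edges)
    hidden = embeddings
    for _ in range(num_layers):
        hidden = a_hat @ hidden  # each mention mixes in its neighbours
    return hidden


# Hypothetical toy input: four mentions with made-up 2-d embeddings.
mentions = ["Emory University", "the university", "Atlanta", "a dog"]
embeddings = np.array([[1.0, 0.0], [0.8, 0.2], [0.0, 1.0], [0.5, 0.5]])
# Assumed edges: (0, 1) from a coreference cluster, (0, 2) from a dependency relation.
edges = [(0, 1), (0, 2)]

smoothed = graph_convolve(embeddings, edges)
```

In this sketch, a mention with no relations (index 3) keeps its original embedding, while connected mentions are pulled toward their neighbours. The smoothed vectors could then be passed to a density-based clustering step, which is not shown here.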

Contents

1 Introduction

1.1 Motivation

1.2 Research Questions

1.3 Contribution

1.4 Organization

2 Backgrounds

2.1 Word Embeddings

2.2 Keyphrase Extraction

2.3 Entities Ranking

2.4 Graph-based Approaches

3 Approaches

3.1 Constituency Parsing

3.2 Dependence Parsing

3.3 Coreference Resolution

3.4 Embedding Normalization

3.5 Graph Convolutions

3.6 Clustering

3.7 Models

3.7.1 Baseline Model

3.7.2 Coreference Model

3.7.3 Convolutional Model

4 Experiments

4.1 Experimental Setup

4.2 Data Exploration

4.3 Evaluation Metrics

4.4 Model Evaluation

5 Analysis

Bibliography

About this Master's Thesis

Rights statement

Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.

School	Laney Graduate School
Department	Computer Science and Informatics
Degree	M.S.
Submission	Master's Thesis
Language	English
Research Field	Computer Science
Keyword	Graph Convolution; Entity Extraction; Coreference Resolution
Committee Chair / Thesis Advisor	Jinho D. Choi, Emory University
Committee Members	Michelangelo Grigni, Emory University; Shun Yan Cheung, Emory University

Last modified

Primary PDF

Title	Date Uploaded
Hierarchical Entity Extraction and Ranking with Unsupervised Graph Convolutions	2020-04-25 21:56:20 -0400
