Enhancing Document Understanding through the Incorporation of Structural Inference

Xu, Liyan (Spring 2023)

Permanent URL: https://etd.library.emory.edu/concern/etds/6969z234r?locale=en


Pretrained language models (PLMs) have been remarkably successful across a variety of Natural Language Processing (NLP) tasks by simply modeling language sequences, backed by their powerful sequence encoding capabilities. However, for document understanding tasks involving multi-sentence or multi-paragraph inputs, models must still overcome the inherent challenge of processing information scattered across the entire document context, such as resolving pronouns or recognizing relations among multiple sentences.

To move beyond plain sequence modeling towards effective understanding of document context, this dissertation presents an in-depth study of the incorporation of structural inference, utilizing intrinsic structures of languages and documents. Four research works, outlined in Chapters 3-6, experiment with various structural inference approaches for improving performance on document-oriented tasks. In particular, Chapter 3 proposes to integrate syntactic dependency structures into the document encoding process, capturing inter-sentence dependencies through a designed graph encoding in self-attention, which is shown to be effective for machine reading comprehension, especially in the multilingual setting. Chapter 4 investigates different methods of performing inference on the discourse structure formed by coreference relations, allowing for higher-order decision making, and thus higher-quality predictions, in coreference resolution. Chapter 5 presents a novel formulation of structural inference to facilitate joint information extraction: it incorporates a knowledge-specific structure comprising entity relations, fusing multi-faceted information about document entities in terms of both coreference and relations, and boosting entity-centric information extraction. Lastly, Chapter 6 continues on the same task as Chapter 5 and explores the potential of sequence-to-sequence generation as an approach that performs implicit inference on linearized entity structures without a task-specific decoder design, motivated by its unified encoder-decoder architecture and inherent ability to perform higher-order inference.
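As a rough illustration of the kind of structure-aware encoding discussed above — not the dissertation's actual ISDG encoder — the sketch below biases self-attention scores with an additive term along dependency edges, so that syntactically related tokens attend to each other more strongly. The function name, bias value, and toy graph are all hypothetical; a real implementation would operate inside a PLM's attention layers.

```python
import numpy as np

def graph_biased_attention(Q, K, V, edges, bias=2.0):
    """Toy single-head self-attention in which token pairs connected
    by a (hypothetical) dependency edge receive an additive score bias.
    Q, K, V: (seq_len, d) arrays; edges: list of (head, dependent) pairs."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)      # standard scaled dot-product scores
    for h, m in edges:                 # symmetric bias along graph edges
        scores[h, m] += bias
        scores[m, h] += bias
    # numerically stable row-wise softmax
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy example: 4 tokens with a chain of 3 dependency edges.
rng = np.random.default_rng(0)
Q = K = V = rng.normal(size=(4, 8))
out = graph_biased_attention(Q, K, V, edges=[(0, 1), (1, 2), (2, 3)])
print(out.shape)
```

Under this view, the dependency graph acts as a soft prior over the attention distribution rather than a hard mask, which preserves the PLM's ability to attend outside the graph while emphasizing structurally linked positions.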

The experiments presented in this dissertation demonstrate that applying designed structural inference to certain intrinsic structures of languages or documents can effectively enhance document understanding, yielding improved performance on various benchmarks for document-oriented tasks. The dissertation highlights that modeling dependencies among different parts of the context can lead to more accurate and robust encoding and decoding, as modeling these structures provides auxiliary information that complements the sequence modeling of PLMs. Overall, the dissertation contributes to the field of natural language processing by investigating the potential and benefits of leveraging different structures to advance the state of the art in document understanding.

Table of Contents

1 Introduction

1.1 NLP in Document Understanding

1.2 Challenges and Motivations

1.2.1 Scattered Information

1.2.2 Limited Supervisions

1.2.3 Domain Adaptation

1.3 This Dissertation

2 Technical Foundations

2.1 Sequence Modeling

2.2 Structures within Documents

3 Syntactic Structures for Reading Comprehension

3.1 Introduction

3.1.1 Universal Dependencies (UD)

3.1.2 Motivations

3.1.3 Problem Formulation

3.2 Approach

3.2.1 Multilingual Pretrained Models

3.2.2 Syntactic Features

3.2.3 Inter-Sentence Dependency Graph (ISDG)

3.2.4 ISDG Encoder: Local Encoding

3.2.5 ISDG Encoder: Global Encoding

3.3 Evaluation and Analysis

3.4 Discussion

4 Discourse Structures for Coreference Resolution

4.1 Introduction

4.1.1 Background and Motivations

4.1.2 Problem Formulation

4.2 Approach: Local Inference

4.3 Approach: Higher-Order Inference (HOI)

4.3.1 HOI via Span Refinement

4.3.2 HOI via Maintaining Clusters

4.4 Evaluation and Analysis

4.4.1 HOI Impact

4.5 Discussion

5 Relation Structures for Information Extraction

5.1 Introduction

5.1.1 Entity-Centric Relation Extraction

5.1.2 Background and Motivations

5.1.3 Problem Formulation

5.2 Approach

5.2.1 Independent Decoding

5.2.2 Shallow Task Interactions

5.2.3 Fuse Multi-Task Decoding

5.3 Evaluation and Analysis

5.4 Discussion

6 Implicit Structural Inference through Sequence Generation

6.1 Introduction

6.1.1 Background: Joint Extraction Paradigms

6.1.2 Sequence Generation

6.1.3 Problem Formulation

6.2 Approach

6.2.1 Generation Schema

6.2.2 Pointer-based Inference

6.2.3 Decoder

6.2.4 Training Strategy

6.3 Evaluation and Analysis

6.4 Discussion

7 Conclusion

7.1 Research Contributions

7.2 Future Work

About this Dissertation

Rights statement
  • Permission granted by the author to include this thesis or dissertation in this repository. All rights reserved by the author. Please contact the author for information regarding the reproduction and use of this thesis or dissertation.
  • Language: English