CONSchema: Schema matching with semantics and constraints Public
Wu, Kevin (Spring 2023)
Abstract
Schema matching aims to establish the correspondence between the attributes of database schemas. It has been regarded as the most difficult and crucial stage in the development of many contemporary database and semantic web systems. Manual mapping is a lengthy and laborious process, yet a low-quality algorithmic matcher may cause more trouble. Moreover, data privacy in certain domains, such as healthcare, poses further challenges, as the use of instance-level data should be avoided to prevent the leakage of sensitive information. To address this issue, we propose CONSchema, a model that combines both the textual attribute descriptions and the constraints of the schemas to learn a better matcher. We also propose a new experimental setting to assess the practical performance of schema matching models. Our results on six benchmark datasets across various domains, including healthcare and movies, demonstrate the robustness of CONSchema.
Table of Contents
1 Introduction
1.1 Motivation
1.2 Related Work
2 ConSchema
2.1 Problem Statement
2.2 Model
2.2.1 Textual similarity embedding
2.2.2 Constraint encoding
3 Experiment Setting
3.1 Datasets
3.2 Baseline Methods
4 Results
4.1 Random Partition Evaluation
4.2 Unseen Partition Evaluation
4.2.1 CONSchema-RF
4.2.2 CONSchema-MLP
4.3 Explaining CONSchema Matching Decisions
4.4 CMS Case Study
4.5 Precision Recall Analysis
5 Conclusion
About this Honors Thesis
School | |
---|---|
Department | |
Degree | |
Submission | |
Language | |
Research Field | |
Keyword | |
Committee Chair / Thesis Advisor | |
Committee Members | |
Primary PDF
Title | Date Uploaded |
---|---|
CONSchema: Schema matching with semantics and constraints | 2023-04-08 12:26:55 -0400 |
Supplemental Files
Title | Date Uploaded |
---|---|
CONSchema Files (Data and Script) | 2023-04-06 12:12:13 -0400 |