Ontology Alignment for Accurate Ontology Matching: A Survey

Khan, Hasham; Saqib, Muhammad; Khattak, Hasan Ali; Ali, Syed Imran; Lee, Sungyoung

doi:10.1007/978-3-031-43950-6_31


Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14237)

Included in the following conference series:

International Conference on Smart Homes and Health Telematics


Abstract

Edge computing, a distributed computing architecture within the knowledge-defined network (KDN), faces challenges due to the significant disparities and data heterogeneity among its nodes, which hinder their interaction. Ontology, a solution within the Semantic Web, is well-suited for addressing data heterogeneity, provided that the ontologies involved can be matched effectively. However, ontology matching is difficult because it involves non-linear optimization problems. To overcome these challenges, the generative adversarial network (GAN), an unsupervised learning method, has emerged as a promising tool. A GAN consists of two models with distinct objectives that are trained against each other to achieve optimal outcomes. This paper introduces SA-GAN, an algorithm that combines GAN with simulated annealing to enhance its effectiveness. SA-GAN uses a stagnation counter to speed up the convergence of GAN. Through experiments conducted on a renowned ontology benchmark, the paper demonstrates that SA-GAN, along with other ontology matching algorithms, can identify the best alignments. Consequently, SA-GAN facilitates the construction of bridges in edge computing, improving its overall effectiveness.


1 Introduction

The new paradigm of knowledge-defined networking (KDN) combines artificial intelligence (AI), network analytics (NA), and software-defined networking (SDN). Within KDN, edge computing empowers edge nodes with storage and processing capabilities to perform data forwarding, recognition, and other services. However, these edge nodes often have different data representations, leading to data heterogeneity. An ontology acts as a reference model for information transmission, giving the transmitted data a precise, standardized semantic meaning [10]. To address this heterogeneity, edge node ontologies are used to represent concepts and relationships in specific domains. Ontology, a fundamental technique in the Semantic Web, is widely employed to tackle data heterogeneity by effectively modeling knowledge and formalizing relationships and concepts. The System Upgrade Process in Fig. 1 is an example of how an ontology is applied and mapped to interconnected concepts. Ontology helps identify similar domains, definitions, terms, and their connections, creating a graphical map of entities and their associations, including links to other corporate domains.

Ontology construction is subjective, resulting in heterogeneity in ontologies due to different techniques used for representing ideas and relationships [8]. In an information-rich era, multiple sources produce similar content, making it impossible for systems to have identical ontologies. Ontology matching provides a direct approach to leverage ontologies by identifying heterogeneous entities across multiple ontologies. When given two distinct ontologies with discrete items like attributes and classes, ontology matching involves determining relationships such as equality and formation among these entities [11].

Ontology meta-matching is a complex problem that involves determining suitable weights for combining multiple similarity measures, ensuring high-quality alignment. Unsupervised learning is well-suited for large-scale ontology meta-matching, as it doesn’t rely heavily on manual labeling [9]. Generative adversarial networks (GANs) are popular in unsupervised learning. They consist of a generator and a discriminator. The generator produces simulated samples, while the discriminator provides feedback on their authenticity. Through iterations, the generator improves until the discriminator can’t distinguish between real and generated samples. GANs have found applications in various fields, including computer vision, and have inspired innovative approaches in ontology matching.

The rest of the paper is organized as follows. Section 2 reviews the state-of-the-art literature. Section 3 presents the model of adversarial learning based ontology alignment. Section 4 outlines the experimental analysis. Section 5 compares the approach to instance matching in knowledge graphs, and Sect. 6 concludes the work.

2 Literature Survey

Ontology matching bridges the semantic gap between different domain representations but can be time-consuming. The use of diverse modeling approaches is necessary due to the multiple perspectives in perceiving things. Integrating domains often involves representing combined knowledge using ontologies and employing ontology matching approaches to align them. Our literature review focused on recent publications in the past three years and revealed a widespread interest in ontology matching. Our objective was to classify and identify research trends in ontology matching while developing a reference framework for integrating and categorizing these materials.

We examined a meta-strategy-based KG matching approach that aligns schema and instance-level entities. Experimental results favor algorithm-based meta-heuristic approaches over CNEA-based KG matching [11].

We introduce a heuristic evaluation measure and an optimization model for ontology matching. Our approach utilizes an Adaptive Evolutionary Algorithm (AEA) and outperforms reference alignment-based measures. It avoids local optima and excels in determining high-level alignments compared to existing systems. Our AEA-based ontology meta-matching system generates superior alignments independently [4].

Early work on graph alignment employs a graph embedding algorithm and an absolute orientation rotation method. Studies have shown that this method is effective for aligning structurally similar ontologies and is more robust against alignment noise when dealing with graphs of different sizes and architectures. Future research aims to explore various embedding methodologies and integrate the method with additional features, such as text-based information, in a hybrid matching system [1].

Word normalization and representation learning are employed to improve ontology alignment approaches that use feature-based methodologies. Word normalization helps capture various domain terms, while representation learning discovers semantic and conceptual information, expanding alignment possibilities. Entity embedding learning within the representation learning approach is investigated [2].

Ontology heterogeneity in knowledge-defined networks (KDNs) hampers collaboration among edge nodes. Ontology matching tackles this challenge by assessing similarity. We propose an adversarial learning model based on simulated annealing for ontology meta-matching. It optimizes weights and thresholds using a Generator-Discriminator relationship, outperforming previous systems and achieving effective ontology meta-matching [3].

We propose a mapping procedure to generate a semantic-compatible ontology for Digital Dentistry (DD) using a base ontology and three reference ontologies. Additionally, we suggest a deep learning-based method to identify key factors contributing to depression in DD. Due to limited data availability caused by the COVID-19 pandemic, further experimentation is needed to validate the proposed methods [5].

DeepOM is a deep learning-based ontology matching system that handles large ontologies without partitioning. It generates concept embeddings using a reference ontology and an auto-encoder, resulting in accurate and compact representations. DeepOM outperforms other systems in matching large-scale ontologies. The use of an auto-encoder for concept embeddings proves effective, with all DeepOM parameters contributing to improved matching [6].

This study proposes an approach for aligning IIoT ontologies using NLP. It learns vector embeddings for items and relations using ontology metadata and structure. In the experiments, the approach consistently outperforms the baseline model BERT-INT in HR and MRR scores by 1.2–2.7%. One limitation is that the ontology dataset had to be synthesized because real-world data was unavailable. However, the model benefits from learning language embeddings, which facilitates the identification of nodes with comparable semantic meaning. The structural encoder accurately aligns nodes, eliminating biases and imposing context [7].

3 Model of Adversarial Learning Based Ontology Alignment

In this study, the ontology meta-matching problem is treated as a constrained optimization problem. To solve it, popular heuristic algorithms such as simulated annealing (SA), the genetic algorithm (GA), and related techniques are commonly used. These algorithms handle complex optimization problems well, especially those with large datasets.

The choice of a specific heuristic algorithm depends on the ontology’s characteristics and the optimization problem’s nature. Selecting the right algorithm is vital for achieving effective and efficient meta-matching of ontologies (Fig. 2).

The use of genetic algorithms (GA) to solve the meta-matching problem in ontology has two main limitations: slow convergence and premature convergence. These problems can reduce the algorithm’s effectiveness and efficiency in finding optimal solutions.

To address these challenges, we propose a new approach called SA-GAN. It combines simulated annealing with the adversarial training framework inspired by generative adversarial networks (GANs). We use the Metropolis criterion to generate optimal characteristics for solving the ontology meta-matching problem. In the context of GANs, our methodology has two components: the Generator and Discriminator models.

3.1 Objective Function

Recall in ontology matching is the ratio of the correct alignments found by the ontology matching system to the total number of alignments in the reference alignment. It assesses the system's capacity to recognize and include the pertinent matches in its outcomes, and is therefore an important metric for the completeness of an ontology matching system's results: it indicates how well the system captures the true matches in the given ontologies. Writing $A$ for the reference alignment and $R'$ for the alignment produced by the matching system, recall is defined as:

$$\begin{aligned} recall = \frac{len (A \cap R') }{len (A)} \end{aligned}$$

(1)

Precision is the proportion of correct alignments among all the alignments selected by the system. It assesses the exactness of the ontology matching system's results. The formula is:

$$\begin{aligned} precision = \frac{len (A \cap R') }{len (R')} \end{aligned}$$

(2)

Ideally, recall and precision would be equal (both range between 0 and 1), but this is not always the case. We therefore use the f-measure, a weighted harmonic mean of the two, to assess the quality of ontology matching results; it also reflects the system's performance more intuitively. The weight $\alpha$ in the formula can be adjusted: the closer $\alpha$ is to 0, the closer the result is to recall, and the closer $\alpha$ is to 1, the closer the result is to precision (Xue & Wang, 2015a). The f-measure is defined as:

$$\begin{aligned} f - measure = \frac{recall \times precision}{\alpha \times recall + (1 - \alpha ) \times precision} \end{aligned}$$

(3)
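The three measures in Eqs. (1) to (3) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation. It assumes that $A$ is the reference alignment and $R'$ the alignment returned by the matcher (which is what the denominators of Eqs. (1) and (2) imply) and that the numerators count the correspondences common to both sets. The function names and the toy correspondences are hypothetical.

```python
def recall(reference, found):
    """Share of the reference correspondences that the matcher recovered."""
    return len(reference & found) / len(reference) if reference else 0.0


def precision(reference, found):
    """Share of the matcher's correspondences that are in the reference."""
    return len(reference & found) / len(found) if found else 0.0


def f_measure(rec, prec, alpha=0.5):
    """Weighted harmonic mean of recall and precision, as in Eq. (3)."""
    denominator = alpha * rec + (1 - alpha) * prec
    return rec * prec / denominator if denominator else 0.0


# Hypothetical toy alignments: each pair is (entity in ontology 1, entity in ontology 2).
reference = {("a1", "b1"), ("a2", "b2"), ("a3", "b3"), ("a4", "b4")}
found = {("a1", "b1"), ("a2", "b2"), ("a5", "b5")}

rec = recall(reference, found)       # 2 of 4 reference pairs were found
prec = precision(reference, found)   # 2 of 3 returned pairs are correct
balanced = f_measure(rec, prec, alpha=0.5)
```

With $\alpha = 0$ the function returns the recall and with $\alpha = 1$ it returns the precision, which matches the behaviour of the weight described above.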

In the SA-GAN framework for ontology meta-matching, the goal of the Generator is to generate optimal similarity thresholds and weights that lead to the optimization of alignment quality and maximize the value of the f-measure. The f-measure is a metric that combines precision and recall to assess the overall effectiveness of the ontology matching system.

The ultimate objective of the SA-GAN framework is to achieve a suitable combination of weights and similarity criteria that yield good performance in terms of the f-measure. The Generator plays a crucial role in this process, guided by the Discriminator’s feedback on the f-measure. The framework involves interactive training of both the Generator and Discriminator parameters to optimize the meta-matching process.

Figure 3 illustrates the overall framework, highlighting the interactions and iterative training between the Generator and Discriminator to achieve effective ontology meta-matching.

In ontology meta-matching, when the generator's weights and threshold bring both precision and recall to 1, the f-measure is 1 regardless of how the discriminator is trained. The discriminator then stops training, and the final threshold and weights are output (Fig. 5).

3.2 Simulated Annealing Optimizer

We use simulated annealing, an optimization algorithm that is effective for numerical optimization problems with many local minima, for parameter optimization in ontology meta-matching. Simulated annealing serves as the generator's optimizer to speed up training and improve results, and the update probability of the algorithm is adjusted to enhance exploration of the parameter space. Let $\varDelta E$ denote the change in energy between the new solution and the current one. When $\varDelta E \le 0$, the updated solution is no worse than the previous one, so it is accepted and the current solution is revised. Otherwise, the new solution is accepted with a probability that depends on the current temperature $t$. Algorithm 1 describes the generator's optimization procedure, a modified form of simulated annealing. The goal is to find the best solution within each SA-GAN epoch. The acceptance probability is:

$$\begin{aligned} P= \left\{ {\begin{array}{ll} {1,} &{} {\varDelta E \le 0} \\ {e^{{ - \varDelta E/t}} ,} &{} {\varDelta E > 0} \\ \end{array} } \right. \end{aligned}$$
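As an illustrative calculation, consider a worse candidate with $\varDelta E = 10$ (a hypothetical value) evaluated at the maximum and minimum temperatures used in the experimental settings of Sect. 4. The acceptance probability drops as the temperature falls:

$$\begin{aligned} P(t = 200) &= e^{-10/200} = e^{-0.05} \approx 0.951, \\ P(t = 100) &= e^{-10/100} = e^{-0.1} \approx 0.905 \end{aligned}$$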

In this algorithm, the temperature is lowered by geometric reduction: the temperature $T$ is updated as $T = T_{current}/m$, where $m$ is the cooling rate parameter. This geometric reduction speeds up the annealing process while gradually reducing the temperature.

Simulated annealing accepts worse solutions at high temperatures in order to explore the search space and potentially reach the global optimum. We introduce a stagnation counter in Algorithm 1 to monitor progress: if the same result occurs repeatedly, the search has stalled and the algorithm terminates. This modified approach aims to enhance the generator's effectiveness in finding optimal solutions for ontology meta-matching within each SA-GAN epoch.
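The following Python sketch shows how the Metropolis acceptance rule, a geometric cooling schedule, and a stagnation counter can fit together. It is not Algorithm 1 itself. The function names, the default parameter values, the toy energy function, and the exact counting rule (one count per temperature level without a new best solution) are all assumptions made for illustration. To maximize the f-measure, the energy could for instance be taken as one minus the f-measure.

```python
import math
import random


def accept_probability(delta_e, temperature):
    """Metropolis criterion: improvements are always accepted."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / temperature)


def simulated_annealing(energy, neighbor, x0, t_max=1.0, t_min=1e-3,
                        cooling=1.2, iters_per_temp=50, max_stay=30, seed=0):
    """Minimize `energy`; stop early when the best solution stagnates."""
    rng = random.Random(seed)
    current, current_e = x0, energy(x0)
    best, best_e = current, current_e
    temperature = t_max
    stay = 0  # stagnation counter (assumed rule: levels without a new best)
    while temperature > t_min and stay < max_stay:
        improved = False
        for _ in range(iters_per_temp):
            candidate = neighbor(current, rng)
            candidate_e = energy(candidate)
            delta_e = candidate_e - current_e
            if rng.random() < accept_probability(delta_e, temperature):
                current, current_e = candidate, candidate_e
                if current_e < best_e:
                    best, best_e = current, current_e
                    improved = True
        stay = 0 if improved else stay + 1
        temperature /= cooling  # geometric reduction of the temperature
    return best, best_e


# Hypothetical toy problem: find x in [0, 1] that minimizes (x - 0.3)^2.
def toy_energy(x):
    return (x - 0.3) ** 2


def toy_neighbor(x, rng):
    return min(1.0, max(0.0, x + rng.uniform(-0.1, 0.1)))


best_x, best_energy = simulated_annealing(toy_energy, toy_neighbor, x0=0.9)
```

Keeping the best solution seen so far separate from the current solution means that accepting a worse move for exploration never loses the best result found.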

3.3 Using Gradient Descent and Discriminator

Because the discriminator is trained on the f-measure, its training is treated as a univariate function optimization problem. We use gradient descent, a first-order iterative optimization method for finding local minima of differentiable functions, to quickly identify the optimum.

Gradient descent follows this rule: if the univariate function $f\text{-}measure(\alpha)$ is defined and differentiable in a neighborhood of a point $a$, then $f\text{-}measure(\alpha)$ decreases fastest when moving from $a$ in the direction opposite to its gradient, $-\nabla f\text{-}measure(a)$. It follows that, if $b = a - \gamma \nabla f\text{-}measure(a)$ for a sufficiently small step size $\gamma \in \mathbb{R}_{+}$, then $f\text{-}measure(a) \ge f\text{-}measure(b)$. With this in mind, the method starts from an initial guess $x_0$ for a local minimum of $f\text{-}measure$ and considers the sequence $x_0, x_1, x_2, \ldots, x_n, x_{n+1}$ such that $x_{n+1} = x_n - \gamma_n \nabla f\text{-}measure(x_n)$, $n \ge 0$. Therefore $f\text{-}measure(x_0) \ge f\text{-}measure(x_1) \ge f\text{-}measure(x_2) \ge \cdots$, and if all goes well the sequence converges to the desired local minimum.
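The update rule can be illustrated with a short Python sketch of one-dimensional gradient descent. It is a sketch under stated assumptions, not the authors' discriminator: the loss function is a hypothetical stand-in, the derivative is approximated numerically, and the parameter is clipped to a range. The starting value, range, learning rate, and number of epochs are borrowed from the experimental settings in Sect. 4 purely as example inputs.

```python
def numerical_derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def gradient_descent(f, x0, lower, upper, learning_rate=0.01, epochs=11):
    """Run x_{n+1} = x_n - gamma * f'(x_n), keeping x inside [lower, upper]."""
    x = x0
    history = [x]
    for _ in range(epochs):
        x = x - learning_rate * numerical_derivative(f, x)
        x = min(max(x, lower), upper)  # clip to the allowed parameter range
        history.append(x)
    return x, history


# Hypothetical stand-in loss with its minimum at alpha = 0.5.
def toy_loss(alpha):
    return (alpha - 0.5) ** 2


alpha_final, alpha_history = gradient_descent(
    toy_loss, x0=0.51, lower=0.46, upper=0.55, learning_rate=0.01, epochs=11
)
losses = [toy_loss(a) for a in alpha_history]
```

On this toy loss every step moves the parameter slightly closer to 0.5, so the recorded losses form the non-increasing sequence described above.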

In the experiment, we varied the range of the discriminator's parameter, letting it fluctuate between 0.46 and 0.56, a range chosen approximately from a sensitivity test. Because we aim to improve both the precision rate and the recall rate while training the parameter, we also need the final result to be optimized in the last f-measure calculation, where the parameter is 0.51.

3.4 Training Process of Model

SA-GAN is an adversarial learning model that offers a fresh approach to the problem of ontology meta-matching. The flowchart of the matching process is shown in Fig. 5. In the pre-processing stage, the ontology alignments are extracted and the similarity matrix is calculated. The similarity matrix is then converted into a data set on which the model can be trained. In the training stage, the generator and discriminator models are trained adversarially. Finally, the weights and threshold produced during adversarial training are combined for similarity computation to obtain the final ontology alignments.
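The final step, combining weights and a threshold for similarity computation, can be sketched as follows. This is one plausible reading, a weighted sum of several similarity matrices followed by threshold filtering, and not necessarily the exact computation used in SA-GAN. The function names, the two toy matrices, the weights, and the threshold are hypothetical.

```python
def aggregate(similarity_matrices, weights):
    """Combine several similarity matrices into one by a weighted sum."""
    rows = len(similarity_matrices[0])
    cols = len(similarity_matrices[0][0])
    return [
        [
            sum(w * m[i][j] for w, m in zip(weights, similarity_matrices))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def extract_alignment(matrix, threshold):
    """Keep the entity pairs whose combined similarity reaches the threshold."""
    return {
        (i, j)
        for i, row in enumerate(matrix)
        for j, value in enumerate(row)
        if value >= threshold
    }


# Hypothetical similarity matrices for two entities per ontology.
lexical = [[0.9, 0.1], [0.2, 0.8]]
structural = [[0.7, 0.3], [0.4, 0.6]]

combined = aggregate([lexical, structural], weights=[0.5, 0.5])
alignment = extract_alignment(combined, threshold=0.6)
```

In this toy case only the pairs (0, 0) and (1, 1) pass the threshold. In a meta-matching setting, the weights and the threshold are the quantities that the optimizer tunes.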

4 Experimental Analysis

We evaluated our ontology matching strategy on the benchmark track provided by the Ontology Alignment Evaluation Initiative (OAEI). The generated alignments were compared with the correct correspondences in the reference alignment to measure performance, which allows a comprehensive evaluation and a comparison with the other matching systems in the OAEI benchmark.

Table 1 gives a detailed overview of the OAEI benchmark. Its first column shows the IDs of the test cases, which correspond to the different kinds of ontologies used in the experiment. Using different kinds of test sets reveals which features the system is sensitive to and which it is insensitive to, and this information allows more precise adjustments to the system (Fig. 4).

The SA-GAN model is trained for a maximum of 50 epochs in the adversarial training step. For the fast-SA optimizer, the maximum temperature is set to 200, the minimum temperature to 100, the maximum stay times to 30, and the number of iterations to 99. The generator's weights and threshold are initialized at random. For the discriminator, the initial parameter value is set to 0.51 and its range to 0.46 to 0.55. The number of epochs and the learning rate for gradient descent are set to 11 and 0.01, respectively (Fig. 6).

Table 1. OAEI’s Benchmark Description


4.1 Experimental Results

In the experiment, the evaluation function’s parameter varies, requiring identification of a suitable range for the discriminator’s training parameter. The objective is to control the parameter within a manageable range while improving both recall and precision in ontology matching results. Table 2 presents results achieved in various benchmarks with six groups in the range. Adjusting the range between 0.45 and 0.55 enhances precision and recall.

We compare our method's benchmark results with those of OAEI participants, including CODI, MapPSO, edna, GeRMeSMB, TaxoMap, Falcon, and AROMA. Table 3 shows that our method performs well in five types of OAEI benchmarks. However, GeRMeSMB's results in 201-208 differ slightly from Falcon's in 301-304 due to the lack of entity context and structure consideration, along with imprecise semantic-based similarity metrics.

5 Comparison to Instance Matching in Knowledge Graph

Problem Focus: While the proposed framework focuses on ontology meta-matching, which involves aligning and optimizing different ontologies, instance matching in the knowledge graph aims to identify and match specific entities or instances across different knowledge graphs.

Optimization Approach: The researchers in the proposed framework utilize heuristic algorithms such as simulated annealing and genetic algorithm to solve the complex optimization problem of ontology meta-matching. These algorithms iteratively adjust weights and similarity criteria to improve the alignment results. On the other hand, instance matching in the knowledge graph typically relies on graph-based algorithms and similarity measures specific to entity attributes and relationships.

Evaluation Metric: The framework’s evaluation metric revolves around the f-measure, which considers both precision and recall. The objective is to achieve an f-measure of 1 by iteratively adjusting the weights and thresholds using the Generator and Discriminator. In contrast, instance matching in the knowledge graph often employs metrics like precision, recall, and F1-score at the instance or entity level.

Training Process: The proposed framework’s training process involves interactive training of the Generator and Discriminator parameters. The Generator aims to find optimal weights and similarity criteria, while the Discriminator provides feedback by assessing the alignment results. Training continues until the Discriminator’s precision and recall reach 1, indicating a successful match. In instance matching, the training process focuses on learning and optimizing similarity measures and matching rules specific to the attributes and relationships of instances.

Scope and Data Considerations: The proposed framework deals with ontology meta-matching and addresses the issue of semantic heterogeneity induced by knowledge representation differences. It requires access to ontological metadata and structures to guide the alignment process. Instance matching in the knowledge graph, on the other hand, focuses on identifying correspondences between specific instances, relying on available attributes and relationships within the knowledge graphs.

Applicability and Limitations: The proposed framework enhances ontology meta-matching, achieving a high f-measure through optimized alignment results. It has been evaluated on benchmark datasets, but its effectiveness may vary depending on ontology characteristics and available metadata. On the other hand, instance matching techniques in the knowledge graph target specific entities within the graph, each with its own strengths and limitations based on the chosen approach. Taken together, these points clarify the distinct contributions and limitations of the proposed ontology meta-matching framework relative to instance matching in the knowledge graph.

6 Conclusion and Future Work

Semantic heterogeneity in knowledge-defined networks arises from differences in knowledge representation, impacting collaboration among edge nodes. Ontology matching addresses this issue by determining weights and confidence levels for multiple similarity assessment methodologies. This study proposes a simulated annealing-based adversarial learning framework for ontology meta-matching. It optimizes a single-objective model by iteratively tuning parameters and achieves improved matching outcomes compared to previous systems.

Future work aims to enhance ontology alignments by considering entity structure and improving confidence value accuracy. Addressing the slower convergence of the framework during training is also a focus, involving parameter adjustments and optimization for faster convergence.

References

1. Afzal, M., Hussain, M., Lee, S., Khattak, H.A.: Redesign of clinical decision systems to support precision medicine. In: TENCON 2018-2018 IEEE Region 10 Conference, pp. 2259–2263. IEEE (2018)
2. Arshad, H., Khattak, H.A., Shah, M.A., Abbas, A., Ameer, Z.: Evaluation and analysis of bio-inspired optimization techniques for bill estimation in fog computing. Int. J. Adv. Comput. Sci. Appl. 9(7) (2018)
3. Arshad, H., Shah, M.A., Khattak, H.A., Ameer, Z., Abbas, A., Khan, S.U.: Evaluating bio-inspired optimization techniques for utility price estimation in fog computing. In: 2018 IEEE International Conference on Smart Cloud (SmartCloud), pp. 84–89. IEEE (2018)
4. Guerreiro, A., Pesquita, C., Faria, D.: Vowlmap: graph-based ontology alignment visualization and editing (2021)
5. Khan, O.A., et al.: Leveraging named data networking for fragmented networks in smart metropolitan cities. IEEE Access 6, 75899–75911 (2018)
6. Kiran, S., Khattak, H.A., Butt, H.I., Ahmed, A.: Towards efficient energy monitoring using IoT. In: 2018 IEEE 21st International Multi-Topic Conference (INMIC), pp. 1–4. IEEE (2018)
7. Liu, M., Li, X., Li, J., Liu, Y., Zhou, B., Bao, J.: A knowledge graph-based data representation approach for IIoT-enabled cognitive manufacturing. Adv. Eng. Inform. 51, 101515 (2022)
8. Tounsi Dhouib, M., Faron, C., Tettamanzi, A.G.B.: Measuring clusters of labels in an embedding space to refine relations in ontology alignment. J. Data Semant. 10(3–4), 399–408 (2021)
9. Wang, M., Peng, J.: Word normalization information systems and improved learning representation for ontology matching. In: 2022 IEEE International Conference on Electrical Engineering, Big Data and Algorithms (EEBDA) (2022)
10. Xue, X., Yang, C., Liu, W., Zhu, H.: Evolutionary ontology matching technique with user involvement. In: Tan, Y., Shi, Y. (eds.) ICSI 2021. LNCS, vol. 12690, pp. 313–320. Springer, Cham (2021). https://doi.org/10.1007/978-3-030-78811-7_30
11. Xue, X., Zhu, H.: Matching knowledge graphs with compact niching evolutionary algorithm. Expert Syst. Appl. 203, 117371 (2022)


Acknowledgements

This research was supported by the MSIT (Ministry of Science and ICT), Korea, under the Grand Information Technology Research Center support program (IITP-2022-2020-0-01489) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation) and by Institute of Information & communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (IITP-2022-0-00078, Explainable Logical Reasoning for Medical Knowledge Generation), (IITP-2017-0-00655, Lean UX core technology and platform for any digital artifacts UX evaluation).

Author information

Authors and Affiliations

School of Electrical Engineering and Computer Sciences, National University of Science & Technology (NUST), Islamabad, Pakistan
Hasham Khan, Muhammad Saqib, Hasan Ali Khattak & Syed Imran Ali
Department of Computer Science and Engineering, Kyung Hee University, Giheung-gu, Yongin, 17104, South Korea
Sungyoung Lee


Corresponding author

Correspondence to Syed Imran Ali.

Editor information

Editors and Affiliations

Yonsei University, Wonju, Korea (Republic of)
Kim Jongbae
Institut Mines Télécom, Paris, France
Mounir Mokhtari
Digital Research Centre of Sfax, Sfax, Tunisia
Hamdi Aloulou
Université de Sherbrooke, Sherbrooke, QC, Canada
Bessam Abdulrazak
Yonsei University, Wonju, Korea (Republic of)
Lee Seungbok

Rights and permissions

Open Access This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.


About this paper

Cite this paper

Khan, H., Saqib, M., Khattak, H.A., Ali, S.I., Lee, S. (2023). Ontology Alignment for Accurate Ontology Matching: A Survey. In: Jongbae, K., Mokhtari, M., Aloulou, H., Abdulrazak, B., Seungbok, L. (eds) Digital Health Transformation, Smart Ageing, and Managing Disability. ICOST 2023. Lecture Notes in Computer Science, vol 14237. Springer, Cham. https://doi.org/10.1007/978-3-031-43950-6_31


DOI: https://doi.org/10.1007/978-3-031-43950-6_31
Published: 22 September 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-43949-0
Online ISBN: 978-3-031-43950-6
eBook Packages: Computer Science, Computer Science (R0)
