Introduction

Cultural heritage management, which involves documentation, conservation, restoration, interpretation, and education, plays a critical role in preserving cultural heritage and communicating its significance to a broad range of audiences [1, 2]. Global initiatives like the Charter on the Preservation of Digital Heritage by UNESCO [3] and the Ten Perspectives on the Future of Digital Culture presented by Europeana [4] have encouraged a shift to digital methods, a transition that has been significantly accelerated by innovations in artificial intelligence, blockchain [5], ontology [6], and virtual reality [7]. This move towards digitization, seen in the digital transformation of collections by institutions like the Hamburg Museum of Art [8] and the Metropolitan Museum of Art [9], fosters increased accessibility and enhanced conservation. Concurrently, the development of databases like the Global Fashion Style Cultural Database in Japan [10] and the Intangible Cultural Heritage Database in China [11] reflects optimization efforts in digital heritage management. Standardized data models and representation languages like the Extensible Markup Language (XML) [12], Web Ontology Language (OWL) [13], and Resource Description Framework (RDF) [14] further facilitate data management and sharing, promoting interoperability and meaningful information exchange. Given the significance of cultural heritage, digital technologies can substantially improve its access, preservation, and interpretation.

Fig. 1 The illustration of digital cultural heritage management that uses knowledge graphs and deep learning algorithms for the Palace Museum in China (https://digicol.dpm.org.cn/?category=6)

However, digital cultural heritage management encounters numerous challenges that should be addressed. Firstly, despite the diverse methods proposed by researchers for cultural heritage management [14], many of them still necessitate manual effort in collecting and processing massive cultural heritage data [12, 13], leading to high labour costs and the absence of automatic updates. To enhance the scalability and efficiency of digital cultural heritage management, it is crucial to develop intelligent frameworks that leverage advanced technologies. Secondly, traditional approaches still need to address limited visualisation, fragmentation, and weak data reasoning. Cultural heritage databases and platforms [10] often suffer from fragmentation, where information is stored in silos, leading to poor correlation between data. As a result, it becomes challenging to identify relationships between cultural heritage objects, limiting analysis and interpretation. Overcoming fragmentation is vital to improving data reasoning and analysing cultural heritage data effectively. Thirdly, digital cultural heritage management has predominantly focused on preservation and management rather than providing adequate support for usage scenarios. For example, while museums may store and manage artefacts digitally, they often lack effective search, interpretation, storytelling, collaboration, accessibility, and inclusivity, making it difficult to derive the required information from cultural heritage resources. Consequently, addressing the lack of intelligent frameworks, fragmentary data, and insufficient application support is critical to promoting the effective utilisation and accessibility of cultural heritage. In response, introducing new technologies that improve automation, data visualisation capabilities, and data application capabilities can solve the above challenges.

Regarding the application of digital technologies in the management of cultural heritage, knowledge graphs show promise by integrating disparate data sources, bridging structured and unstructured data, and providing a visual representation of information [15, 16]. Knowledge graph applications span search engines, e-commerce [17], intelligent manufacturing [18], and electronic healthcare, owing to their superior interconnection, visualisation, and data integration abilities. Knowledge graphs show significant advantages over traditional methods [12,13,14] in terms of visualisation [19] and reasoning [20]. Furthermore, knowledge graphs can be effectively employed as input and output in various deep learning models. Therefore, combining knowledge graphs and deep learning algorithms can make cultural heritage management more intelligent, address fragmentary cultural data, and enhance visualisation capabilities.

The knowledge graph applied to digital cultural heritage management is shown in Fig. 1. The existing Palace Museum website presents ceramic information in a list format. This approach limits visual engagement during browsing and provides limited insight into the similarities and differences among ceramic pieces. On the right of Fig. 1, the knowledge graph is utilised to enhance digital ceramic management and address this limitation. The knowledge graph creates a highly automated, interconnected, visually appealing, and readily accessible informational structure for ceramics. In prior research on knowledge graphs applied to digital cultural heritage management, Charles et al. [21] designed a cultural heritage knowledge network that incorporates information about places, people, and concepts. Given the metadata format of the network, identifying matching cultural heritage datasets poses a challenge. Hyvonen et al. [22] proposed an approach integrating a knowledge graph for identifying semantic relationships between different cultural heritage data. However, due to the limitations of transformation rules and missing relations between entities, locating the required cultural heritage knowledge resources can be challenging and inefficient. Debruyne et al. [23] created a knowledge graph for Ireland's history, and further research will focus on determining the relationships between different attributes and the evolution of place names. Meanwhile, deep learning algorithms have been used in conjunction with knowledge graphs to further refine digital cultural heritage management. Dou et al. [24] proposed the Chinese intangible cultural knowledge graph, where the Att-BiLSTM and Bi-GRU machine learning models are employed as the entity extraction and relation extraction models, respectively. Nevertheless, these models form a pipeline rather than a joint model, leading to an accumulation of errors. Additionally, Bobasheva et al. [25] used machine learning to enhance cultural metadata and information retrieval, but the model struggles with overlapping entities and relations, which diminishes the extraction effect. In summary, the research presented above is limited by missing links, inappropriate metadata formats, the relevance of datasets to cultural heritage, the availability of training sets, and the complexity of scenes. Furthermore, pipelined extraction tasks and overlapping triples may reduce the effectiveness of machine learning models.

Knowledge graph construction involves domain ontology construction, knowledge extraction (comprising the two steps of entity extraction and relation extraction), knowledge completion, knowledge fusion, knowledge recommendation, and the knowledge graph application, of which knowledge extraction and knowledge completion are the key steps [26]. Entity extraction has a long history of development and involves statistical and machine learning algorithms, such as support vector machines [27, 28], Conditional Random Fields [29, 30], BiLSTM-CRF [31], and the large-scale pre-training model BERT [32]. Relation extraction follows entity extraction and becomes challenging when entities and relations overlap [33]. Relation extraction initially relied on semantic rules and templates before machine learning algorithms such as BiLSTM [34] and Lattice LSTM [35] were adopted. To address the problem of overlapping entity-relation triples, joint entity-relation triple extraction methods, such as the Seq2Seq model with a copy mechanism [36], reinforcement learning [37], Graph Convolutional Networks [38], Multi-Turn QA models [39], and the CASREL model [40], have been proposed. To solve this problem in knowledge extraction, this study proposes a cascade binary tagging framework inspired by the novel cascade framework of CASREL [40], which shows promising results in addressing overlapping triples. The second core step of knowledge graph construction is knowledge completion, whose methods can be divided into rule-based, description logic-based, and case-based [41]. Subsequently, knowledge embedding was introduced into the knowledge completion task; embedding models can be classified into four categories: combination models (e.g., RESCAL [42], NTN [43], HOLE [44]), translation models (e.g., TransE [45], DISTMULT [46]), convolutional neural network-based models (e.g., ConvE [47], ConvKB [48]), and graph-based models (e.g., R-GCN [49]). While the first three types process each triple independently, graph-based models perform convolution operations on neighbour nodes, treating the knowledge graph as a whole and utilising the implicit information in relationship paths and neighbour nodes to analyse the semantic relationships between entities. The proposed model in this study enhances GATs by incorporating multi-order relations and adjacent node features for knowledge completion, employing an improved graph attention layer and ConvKB [48].

Fig. 2 The framework of digital cultural heritage management using knowledge graphs and deep learning algorithms

Throughout the literature review of knowledge graphs applied to cultural heritage, knowledge extraction models, and knowledge completion models, four challenges are identified as follows:

(1) A knowledge graph framework specific to digital cultural heritage management should be established, as generic frameworks from other areas may not capture the unique features of the local heritage. This issue could potentially be resolved through interdisciplinary research that combines expertise in cultural heritage management with knowledge graphs to co-develop a tailored framework.

(2) It is challenging to develop a knowledge extraction model compatible with cultural heritage data due to overlapping triples. One potential approach could involve a joint entity-relation extraction model that identifies and resolves these overlapping triples, thereby enhancing the compatibility of the model.

(3) A knowledge graph completion model appropriate for cultural heritage data is required. The knowledge completion model should consider the specific nuances of local cultural heritage data. The development of the model might necessitate the integration of machine learning methods with the detailed study of cultural heritage data properties.

(4) There is an evident need for research to pivot towards the knowledge graph application, which should respond intuitively to a variety of usage scenarios in digital cultural heritage management tasks. These scenarios range from searching and interpretation to storytelling and collaboration, as well as ensuring accessibility and inclusivity. Enhanced interactivity and user-friendly design in the knowledge graph visualisation can lead to increased user support and engagement.

Therefore, this study introduces a cultural heritage management method that utilises knowledge graphs and deep learning algorithms to overcome the abovementioned obstacles. Specifically, a generic knowledge graph-driven cultural heritage management framework is proposed to facilitate digital cultural heritage management. In addition, a knowledge extraction model is developed to extract entities and relations jointly from massive fragmentary cultural text automatically. Moreover, a knowledge completion model is introduced to predict missing knowledge and complete the knowledge graph. The effectiveness of the proposed approach is demonstrated in a case study using ceramic data from the Palace Museum in China. The proposed knowledge graph-driven cultural heritage management framework allows for acquiring interconnected and visualised cultural heritage information.

Accordingly, the study is structured as follows: "Methodology" Section describes the knowledge graph framework for digital cultural heritage management and proposes the knowledge extraction and knowledge completion models. "Case study: The ceramic knowledge graph for the Palace Museum in China" Section illustrates the case study using ceramic data from the Palace Museum in China. "Discussion" Section discusses the application value and limitations. Finally, "Conclusion" Section presents the conclusion and future works.

Methodology

The framework of digital cultural heritage management using knowledge graphs and deep learning algorithms

Fig. 3 Knowledge extraction model. The text in orange is the subject, and the text in blue represents an object. The lines with arrows are relationships between subjects and objects

This study proposes a framework that uses knowledge graphs and deep learning algorithms to support digital cultural heritage management and address the above challenges, as depicted in Fig. 2. Furthermore, a joint entity-relation triple extraction model is developed to extract entities and relations from fragmented cultural heritage data. This model automatically identifies entities and relations defined by experts, enabling efficient data integration. Additionally, a knowledge completion model is presented to deal with missing information in cultural heritage data, further enhancing the quality of the knowledge graph. Finally, a cultural heritage knowledge graph is constructed to satisfy cultural heritage management requirements.

Knowledge extraction model

Task description

Knowledge extraction is the process of identifying subjects, relations, and objects, represented as (subject, relation, object). For instance, the sentence 'Pastel nine peach vase, its whole body painted peach tree and flat peach' contains the entity-relation triples (Pastel nine peach vase, has pattern of, peach tree) and (Pastel nine peach vase, has pattern of, flat peach). Knowledge extraction becomes a fixed extraction task when working with a pre-defined set of entity and relation types. However, conventional knowledge extraction models struggle in overlapping circumstances, as shown in Fig. 3. Previous research has treated relations as discrete labels attached to entity pairs [50, 51], an approach that exhibits constrained efficacy in the context of multiple overlapping entity-relation triples.

The framework of knowledge extraction model

To solve the problem of overlapping entity-relation triples in knowledge extraction, this study proposes a cascade binary tagging framework inspired by a novel cascade framework from CASREL [40]. Unlike the approach of assigning discrete labels to entity pairs [50], triples are modelled directly and define training objectives at the triple level. The proposed method shows promising results in addressing overlapping triples.

The objective of the joint entity-relation triple extraction model is to maximize the data likelihood for the training set D, given an annotated sentence \(x_{j}\) from D and overlapping triples \(T_{j}=\left\{ (s,r,o)\right\}\) in \(x_{j}\):

$$\begin{aligned} \prod _{j=1}^{|D|}&\left[ \prod _{(s, r, o) \in T_j} p\left( (s, r, o) \mid x_j\right) \right] \\ =&\prod _{j=1}^{|D|}\left[ \prod _{s \in T_j} p\left( s \mid x_j\right) \prod _{(r, o) \in T_j \mid s} p\left( (r, o) \mid s, x_j\right) \right] \\ =&\prod _{j=1}^{|D|}\left[ \prod _{s \in T_j} p\left( s \mid x_j\right) \prod _{r \in T_j \mid s} p_r\left( o \mid s, x_j\right) \prod _{r \in R \backslash T_j \mid s} p_r\left( o_{\varnothing } \mid s, x_j\right) \right] . \end{aligned}$$
(1)

where \(T_{j}\) represents the set of triples in \(x_{j}\), in which each subject \(s \in T_j\) appears in at least one triple. To be precise, \(T_j \mid s\) denotes the triples led by s in \(T_{j}\), and \((r, o) \in T_j \mid s\) denotes an (r, o) pair led by the subject s in \(T_{j}\). R represents the set of all possible relations, and \(R \backslash T_j \mid s\) denotes all relations except those led by s in \(T_{j}\). \(o_{\varnothing }\) denotes a 'null' object.

The chain rule of probability is applied in this formulation, as shown in Eq. 1. The structure is designed to deal with the problem of overlapping triples without making assumptions about how multiple triples may share entities within a sentence. Moreover, this formulation inspires a novel tagging scheme for triple extraction. A subject tagger \(p\left( s \mid x_j\right)\) is learned to recognize subject entities, and for each relation r, an object tagger \(p_{r}\left( o\mid s, x_j\right)\) is learned to recognize relation-specific objects for a given subject. In contrast to classifying relations for pairs of subjects and objects, this approach treats each relation as a function mapping a subject to an object. The tagging scheme allows multiple triples to be extracted simultaneously, as depicted in Fig. 3: a subject tagger is applied to identify all subjects, and relation-specific object taggers are then used for each subject to identify all relevant relations and their corresponding objects. This study uses binary taggers over a deep bidirectional Transformer, namely BERT [32], to instantiate the subject tagger and relation-specific object taggers.

The contribution of the proposed formulation has several advantages. Firstly, the final evaluation criteria are directly optimized at the triple level of the data likelihood by starting from the triple level. Secondly, it handles overlapping triples without making assumptions about how they share entities in a sentence. Finally, the proposed novel tagging scheme enables multiple triple extraction simultaneously.

BERT encoder

Fig. 4 The Transformer framework in the BERT pre-training model

The encoder module extracts feature information \({\textbf {x}}_{j}\) from \(x_{j}\), and feature information is fed into the following tagging modules. BERT is a multi-layer bidirectional Transformer-based language representation model that learns deep representations based on each word’s left and right contexts, as depicted in Fig. 4. It comprises a stack of N Transformer blocks of the same type. Each Transformer block performs multi-head self-attention, layer normalization, and feedforward neural network operations as follows:

$$\begin{aligned} {\textbf{h}}_0={\textbf{S}} {\textbf{W}}_s+{\textbf{W}}_p. \end{aligned}$$
(2)
$$\begin{aligned} {\textbf{h}}_\alpha ={\text {Trans}}\left( {\textbf{h}}_{\alpha -1}\right) , \quad \alpha \in [1, N]. \end{aligned}$$
(3)

In these equations, \({\textbf {S}}\) is a matrix of one-hot vectors of sub-word indices, \({\textbf {W}}_{s}\) is the sub-word embedding matrix, and \({\textbf {W}}_{p}\) is the positional embedding matrix, where p is the position index. \({\textbf {h}}_{\alpha }\) denotes the hidden state vector, i.e., the context representation of the input sentence at the \(\alpha\)-th layer, and N is the total number of Transformer blocks. Since a single sentence rather than a sentence pair is taken as input, segmentation embedding is not included in Eq. 2. For a more detailed explanation of the Transformer structure, refer to [52].
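For illustration, the encoder of Eqs. 2 and 3 can be sketched in PyTorch as below. This is a minimal sketch, assuming the sizes reported later in the case study (hidden dimension 256, N = 12 blocks, four attention heads); the vocabulary size and the defaults of nn.TransformerEncoderLayer are illustrative stand-ins for the actual BERT implementation.

```python
import torch
import torch.nn as nn

class Encoder(nn.Module):
    def __init__(self, vocab_size=21128, max_len=100, dim=256, n_blocks=12, n_heads=4):
        super().__init__()
        self.W_s = nn.Embedding(vocab_size, dim)   # sub-word embedding matrix W_s
        self.W_p = nn.Embedding(max_len, dim)      # positional embedding matrix W_p
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(block, num_layers=n_blocks)

    def forward(self, token_ids):                  # token_ids: (batch, seq_len)
        pos = torch.arange(token_ids.size(1), device=token_ids.device)
        h0 = self.W_s(token_ids) + self.W_p(pos)   # Eq. 2: h_0 = S W_s + W_p
        return self.blocks(h0)                     # Eq. 3: h_a = Trans(h_{a-1}), N times

x_enc = Encoder()(torch.randint(0, 21128, (2, 20)))  # -> (2, 20, 256)
```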

Cascade decoder

Based on the previous formulation, the cascade binary tagging scheme is instantiated through a two-step cascade process: a subject tagger followed by a set of relation-specific object taggers, as illustrated in Fig. 3. First, subjects are detected in the input sentence. Afterwards, all possible relations related to each candidate subject are examined to determine whether they can associate objects in the sentence with that subject.

To recognise potential subjects, the subject tagger module decodes the encoded vector \({\textbf {h}}_{N}\) obtained from BERT. The module utilises two identical binary classifiers to detect subjects’ starting and ending positions. For this purpose, each token is assigned a binary tag (0/1) that signifies whether it represents a subject’s start or end position as follows:

$$\begin{aligned} p_i^{start\_s}=\sigma \left( {\textbf{W}}_{start} {\textbf{x}}_i+{\textbf{b}}_{start}\right) . \end{aligned}$$
(4)
$$\begin{aligned} p_i^{end\_s}=\sigma \left( {\textbf{W}}_{end} {\textbf{x}}_i+{\textbf{b}}_{end}\right) . \end{aligned}$$
(5)

where \(p_i^{start\_s}\) and \(p_i^{end\_s}\) represent the probabilities that the i-th token in the input sequence is the start and end position of a subject, respectively. The corresponding token is assigned '1' if the probability exceeds a given threshold and '0' otherwise. \({\textbf{x}}_i\) is the encoded representation of token i in the input sequence, \({\textbf{W}}_{(\cdot )}\) is a trainable weight matrix, \({\textbf{b}}_{(\cdot )}\) is the bias, and \(\sigma\) is the sigmoid activation function.

The subject tagger optimises the following likelihood function as Eq. 6 to identify the span of a subject s given representation \({\textbf{x}}\). Using the results of the start-end position taggers, a subject’s span is determined using the nearest start-end pair match principle.

$$p_\theta (s\mid {\textbf{x}}) = \prod _{t \in \{start\_s,\, end\_s\}} \prod _{i=1}^{L} \left( p_i^t\right) ^{{\textbf{I}}\{y_i^t=1\}}\left( 1-p_i^t\right) ^{{\textbf{I}}\{y_i^t=0\}}$$
(6)

where L is the sentence length. \({\textbf{I}}\{z\}=1\) when z is true, and \({\textbf{I}}\{z\}=0\) when it is false. \(y_i^{{start\_s}}\) represents the start position for the i-th token in \({\textbf{x}}\), and \(y_i^{{end\_s}}\) represents the end position. The parameters \(\theta =\left\{ {\textbf{W}}_{ {start }}, {\textbf{b}}_{{start }}, {\textbf{W}}_{{end }}, {\textbf{b}}_{{end }}\right\}\).
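As a hedged sketch of Eqs. 4-6, the subject tagger can be implemented as two token-wise binary classifiers over the encoded sequence; the module and variable names below are illustrative, not taken from the paper's code.

```python
import torch
import torch.nn as nn

class SubjectTagger(nn.Module):
    def __init__(self, dim=256):
        super().__init__()
        self.start = nn.Linear(dim, 1)   # W_start x_i + b_start
        self.end = nn.Linear(dim, 1)     # W_end x_i + b_end

    def forward(self, x):                # x: (batch, seq_len, dim) from the encoder
        p_start = torch.sigmoid(self.start(x)).squeeze(-1)  # Eq. 4
        p_end = torch.sigmoid(self.end(x)).squeeze(-1)      # Eq. 5
        return p_start, p_end

# Tokens with probability above the threshold are tagged '1'; spans then follow
# the nearest start-end pair matching principle. Maximizing Eq. 6 is equivalent
# to minimizing binary cross-entropy against the 0/1 start/end tags:
bce = nn.BCELoss()
```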

Objects and their relationships with subjects are simultaneously identified in relation-specific object taggers, as shown in Fig. 3. Each object tagger identifies the corresponding object(s) for every detected subject. Contrary to the subject tagger, the relation-specific object tagger considers the subject’s characteristics. On each token, the relation-specific object tagger performs the following operations:

$$\begin{aligned} p_i^{start\_o}=\sigma \left( {\textbf{W}}_{start}^r\left( {\textbf{x}}_i+{\textbf{v}}_{sub}^k\right) +{\textbf{b}}_{start}^r\right) . \end{aligned}$$
(7)
$$\begin{aligned} p_i^{end\_o}=\sigma \left( {\textbf{W}}_{end}^r\left( {\textbf{x}}_i+{\textbf{v}}_{sub}^k\right) +{\textbf{b}}_{end}^r\right) . \end{aligned}$$
(8)

where \(p_i^{{start\_o }}\) and \(p_i^{{end\_o}}\) are the probabilities of identifying an object’s start and end positions. \({\textbf{v}}_{{sub}}^k\) is the encoded representation vector of the k-th subject.

Each subject is decoded iteratively using the same process. To perform the additions of \({\textbf{x}}_i\) and \({\textbf{v}}_{{sub}}^k\) in Eqs. 7 and 8, the dimensions of the two vectors must be consistent. Therefore, \({\textbf{v}}_{{sub}}^k\) is computed as the averaged vector representation between the start and end tokens of the k-th subject.

For relation r, the object tagger optimises the following likelihood function to identify the span of the object o given a sentence \({\textbf{x}}\) and a subject s:

$$p_{\phi _r}(o\mid s,{\textbf{x}}) = \prod _{t \in \{start\_o,\, end\_o\}} \prod _{i=1}^{L} \left( p_i^t\right) ^{{\textbf{I}}\{y_i^t=1\}}\left( 1-p_i^t\right) ^{{\textbf{I}}\{y_i^t=0\}}$$
(9)

where \(y_i^{{start\_o }}\) represents the binary tag of the object’s start position for the i-th token in \({\textbf{x}}\), and \(y_i^{{end\_o }}\) represents the tag of the object’s end position. In the case of a ‘null’ \(o_{\varnothing }\), the tags \(y_i^{{start\_o}}=y_i^{ {end\_o}}=0\). The parameters \(\phi _r=\left\{ {\textbf{W}}_{{start }}^r, {\textbf{b}}_{{start }}^r, {\textbf{W}}_{{end }}^r, {\textbf{b}}_{ {end }}^r\right\}\).
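A matching sketch of the relation-specific object taggers in Eqs. 7-8 is given below, keeping one start/end classifier pair per relation (the case study defines 32 relations) and adding the averaged subject vector to each token representation; the shapes and names are assumptions.

```python
import torch
import torch.nn as nn

class ObjectTaggers(nn.Module):
    def __init__(self, n_relations=32, dim=256):
        super().__init__()
        self.start = nn.Linear(dim, n_relations)  # one W^r_start, b^r_start per relation r
        self.end = nn.Linear(dim, n_relations)

    def forward(self, x, v_sub):                  # x: (B, L, dim); v_sub: (B, dim)
        z = x + v_sub.unsqueeze(1)                # x_i + v^k_sub, broadcast over tokens
        p_start = torch.sigmoid(self.start(z))    # Eq. 7, shape (B, L, n_relations)
        p_end = torch.sigmoid(self.end(z))        # Eq. 8
        return p_start, p_end
```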

Training objective

Taking the logarithm of Eq. 1, the training objective \(J(\Theta )\) is:

$$\begin{aligned} J(\Theta )=\sum _{j=1}^{|D|}&\left[ \sum _{s \in T_j} \log p_\theta \left( s \mid {\textbf{x}}_j\right) +\sum _{r \in T_j \mid s} \log p_{\phi _r}\left( o \mid s, {\textbf{x}}_j\right) \right. \\&\left. +\sum _{r \in R \backslash T_j \mid s} \log p_{\phi _r}\left( o_{\varnothing } \mid s, {\textbf{x}}_j\right) \right] . \end{aligned}$$
(10)

where \(p_\theta (s \mid {\textbf{x}})\) is defined in Eq. 6 and \(p_{\phi _r}(o \mid s, {\textbf{x}})\) is defined in Eq. 9. The parameters \(\Theta =\left\{ \theta ,\left\{ \phi _r\right\} _{r \in R}\right\}\). A stochastic gradient descent algorithm trains the model by maximizing \(J(\Theta )\) over shuffled mini-batches.
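In implementation terms, maximizing \(J(\Theta )\) amounts to minimizing summed binary cross-entropy over all taggers, since each factor in Eq. 10 is a product of Bernoulli likelihoods. A minimal sketch, with the tensor layouts assumed:

```python
import torch.nn as nn

bce = nn.BCELoss(reduction="sum")

def training_loss(p_sub, y_sub, p_obj, y_obj):
    # p_* are predicted probabilities from Eqs. 4-5 and 7-8; y_* are 0/1 tags.
    # A 'null' object o_null simply has all-zero start/end tags for that relation.
    return bce(p_sub, y_sub) + bce(p_obj, y_obj)
```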

Fig. 5 Graph attention

Fig. 6 Knowledge completion model

Knowledge completion model

Task description

The task of knowledge completion is to predict the missing relationships between entities within a knowledge graph. Conventional methodologies like Graph Attention Networks (GATs) [53] have exhibited impressive performance in knowledge completion tasks, primarily due to their capacity to capture first-order neighbour node information. Nevertheless, given their disregard for relational characteristics, these methodologies may not be directly transferable to knowledge graphs.

The framework of knowledge completion model

The proposed model in this study enhances GATs by incorporating multi-order relations and adjacent node features for knowledge completion, employing an improved graph attention layer and ConvKB [48]. By leveraging the enhanced graph attention mechanism with multi-order relations and adjacent node features, the model captures the complex relations between entities and generates accurate predictions for missing relations in the knowledge graph. The knowledge completion model is shown in Fig. 6.

The novelty of this work lies in the enhancement of GATs by incorporating multi-order relations and adjacent node features. Furthermore, this model employs an improved graph attention layer with a ConvKB decoder. These enhancements enable the model to capture complex relations between entities, significantly contributing to knowledge graph completion.

Graph attention encoder

To obtain the embedding of an entity \(e_{i}\), it is necessary to learn the embeddings of triples which have relationships with \(e_{i}\). The operation is illustrated in Fig. 5. The linear transformation used here is performed by concatenating entity and relation feature vectors associated with a specific triple \(t^{k}_{ij}=(e_{i},r_{k},e_{j})\):

$$\begin{aligned} {\textbf{c}}_{ijk}={\textbf{W}}_{1}\left[ \overrightarrow{h_{i}} \,\Vert \, \overrightarrow{h_{j}} \,\Vert \, \overrightarrow{g_{k}}\right] . \end{aligned}$$
(11)

where \({\textbf{c}}_{ijk}\) is the vector representation of the triple \(t^{k}_{ij}\), and \(\overrightarrow{h_{i}}\), \(\overrightarrow{h_{j}}\), and \(\overrightarrow{g_{k}}\) are the embeddings of \(e_{i}\), \(e_{j}\), and \(r_{k}\), respectively. \({\textbf{W}}_{1}\) represents the linear transformation matrix. Each triple is assigned an importance score \(b_{ijk}\): the attention value of the triple is obtained by applying a linear transformation parameterized by a weight matrix \({\textbf{W}}_{2}\), followed by the LeakyReLU non-linearity:

$$\begin{aligned} b_{ijk}={\text {LeakyReLU}}\left( {\textbf{W}}_{2}\, {\textbf{c}}_{ijk}\right) . \end{aligned}$$
(12)

Accordingly, the softmax function is applied to \(b_{ijk}\) to obtain relative attention values. Figure 5 illustrates the calculation of the relative attention value \(\alpha _{ijk}\):

$$\begin{aligned} \alpha _{i j k}={\text {softmax}}_{j k}\left( b_{ijk}\right) =\frac{\exp \left( b_{i j k}\right) }{\sum _{n \in N_{i}} \sum _{r\in R_{in}} \exp \left( b_{i n r}\right) }. \end{aligned}$$
(13)

where \({N}_{i}\) represents the neighbourhood of entity \(e_{i}\), and \({R}_{ij}\) represents the set of relations between entities \(e_{i}\) and \(e_{j}\). Based on the attention values of the triple representations, the updated embedding of entity \(e_{i}\) is computed as follows:

$$\begin{aligned} {\textbf{h}}^{\prime }_{i}=\sigma \left( \sum _{j \in N_{i}} \sum _{k \in R_{ij}} \alpha _{i j k}\, {\textbf{c}}_{ijk}\right) . \end{aligned}$$
(14)

To stabilize the learning process and capture features of the neighbourhood, multi-head attention [52] is used, and the representation of embeddings is obtained by concatenating the outputs of M attention mechanisms:

$$\begin{aligned} h_{i}^{\prime }=\Vert _{m=1}^{M}\, \sigma \left( \sum _{j \in N_{i}} \alpha _{i j k}^{m} c_{i j k}^{m}\right) . \end{aligned}$$
(15)

As shown in Fig. 5, this transformation is performed on the input relation embedding matrix \({\textbf {G}}\) which is parameterized by a weight matrix \(\varvec{{\textbf {W}}}^{R} \in {{\mathbb {R}}^{T\times T^{\prime }}}\), where \(T^{\prime }\) represents the dimension of the output relation embedding. The linear function can be calculated as follows:

$$\begin{aligned} {\textbf{G}}^{\prime }={\textbf{G}}\cdot {\textbf{W}}^{R}. \end{aligned}$$
(16)

Average embedding is used to obtain the final entity embedding vectors rather than concatenating the embeddings from multiple heads, as shown in Eq. 17. However, while learning new embeddings, an entity loses the information associated with its initial embedding. To overcome this issue, the initial entity embedding matrix \({\textbf{H}}^{t}\) is linearly transformed by a weight matrix \({\textbf{W}}^{E} \in {\mathbb {R}}^{T^{i} \times T^{f}}\) and added to the embedding matrix \({\textbf{H}}^{f}\) generated by the final attention layer, as shown in Eq. 18:

$$\begin{aligned} h_{i}^{\prime }=\sigma \left( \frac{1}{M} \sum _{m=1}^{M} \sum _{j \in {N}_{i}} \sum _{k \in R_{ij}} \alpha _{i j k}^{m} c_{i j k}^{m}\right) . \end{aligned}$$
(17)
$$\begin{aligned} {\textbf{H}}^{\prime \prime }={\textbf{W}}^{E}{\textbf{H}}^{t}+{\textbf{H}}^{f}. \end{aligned}$$
(18)

The proposed model introduces an auxiliary relation for n-hop neighbours between two entities, extending the concept of an edge to a directed path. In this case, the embedding of the auxiliary relation consists of all other embeddings along the path. Iteratively, the model gathers information from distant neighbours of an entity. Figure 6 illustrates the aggregation process for learning entity embeddings and introduces an auxiliary edge between n-hop neighbours. Entity embeddings are normalized after each generalized GAT layer and before every main iteration.
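The per-triple attention of Eqs. 11-14 can be sketched as follows; the dense per-entity loop is for readability only, and the choice of sigmoid for the nonlinearity \(\sigma\) is an assumption.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TripleAttentionLayer(nn.Module):
    def __init__(self, dim=100):
        super().__init__()
        self.W1 = nn.Linear(3 * dim, dim, bias=False)  # Eq. 11: c_ijk = W1 [h_i || h_j || g_k]
        self.W2 = nn.Linear(dim, 1, bias=False)        # Eq. 12 scoring

    def forward(self, h, g, triples):
        # h: (n_entities, dim) entity embeddings; g: (n_relations, dim) relation
        # embeddings; triples: list of (i, k, j) index tuples for (e_i, r_k, e_j)
        src = torch.tensor([t[0] for t in triples])
        rel = torch.tensor([t[1] for t in triples])
        dst = torch.tensor([t[2] for t in triples])
        c = self.W1(torch.cat([h[src], h[dst], g[rel]], dim=-1))   # Eq. 11
        b = F.leaky_relu(self.W2(c)).squeeze(-1)                   # Eq. 12
        h_new = torch.zeros_like(h)
        for i in range(h.size(0)):                                 # per-entity neighbourhood
            mask = src == i
            if mask.any():
                alpha = F.softmax(b[mask], dim=0)                  # Eq. 13
                h_new[i] = torch.sigmoid((alpha.unsqueeze(-1) * c[mask]).sum(0))  # Eq. 14
        return h_new
```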

Training objective

This model uses a translational scoring function from [45], which learns embeddings such that \(\textbf{h}_{i}+\textbf{g}_{k} \approx \textbf{h}_{j}\) holds for a valid triple \(t^{k}_{ij}=(e_{i},r_{k},e_{j})\). The L1-norm dissimilarity measure \(d_{t_{i j}}=\left\| \overrightarrow{h_{i}}+\overrightarrow{g_{k}}-\overrightarrow{h_{j}}\right\| _{1}\) is minimized to learn entity and relation embeddings. The model is trained using the hinge loss given by Eq. 19, in which the set of invalid triples \(S^{\prime }\) is obtained by replacing the head or tail entity of valid triples with unrelated entities, as shown in Eq. 20.

$$\begin{aligned} L\left( \Omega \right) =\sum _{t_{ij} \in S} \sum _{t^{\prime }_{ij} \in S^{\prime }} \max \left\{ d_{t_{ij}}-d_{t^{\prime }_{ij}}+\gamma , 0\right\} . \end{aligned}$$
(19)
$$\begin{aligned} S^{\prime }=\underbrace{\left\{ t_{i^{\prime } j}^k \mid e_i^{\prime } \in {\mathcal {E}} \backslash \{e_i\}\right\} }_{\text {replace head entity}} \cup \underbrace{\left\{ t_{i j^{\prime }}^k \mid e_j^{\prime } \in {\mathcal {E}} \backslash \{e_j\}\right\} }_{\text {replace tail entity}}. \end{aligned}$$
(20)

where \(\gamma >0\) is a margin hyperparameter, S is the set of valid triples, \(S^{\prime }\) is the set of invalid triples, and \({\mathcal {E}}\) denotes the set of all entities.
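A hedged sketch of this training objective: score triples with the L1 distance \(d = \Vert h_i + g_k - h_j\Vert _1\) and corrupt heads or tails at random to build \(S^{\prime }\) (Eq. 20); the even head/tail split is an assumption.

```python
import torch

def l1_distance(h, g, t):
    # d = ||h_i + g_k - h_j||_1 for triples t given as (head, relation, tail) ids
    return (h[t[:, 0]] + g[t[:, 1]] - h[t[:, 2]]).abs().sum(-1)

def hinge_loss(h, g, pos, neg, gamma=1.0):
    # Eq. 19: push invalid triples at least a margin gamma further than valid ones
    return torch.clamp(l1_distance(h, g, pos) - l1_distance(h, g, neg) + gamma, min=0).sum()

def corrupt(pos, n_entities):
    # Eq. 20: build invalid triples by replacing the head or the tail entity
    neg = pos.clone()
    swap_head = torch.rand(len(pos)) < 0.5
    neg[swap_head, 0] = torch.randint(n_entities, (int(swap_head.sum()),))
    neg[~swap_head, 2] = torch.randint(n_entities, (int((~swap_head).sum()),))
    return neg
```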

Fig. 7 The process of the case study

Fig. 8 Example of instance

ConvKB decoder

ConvKB [48] is used as the decoder in the model. The convolutional layer analyzes the global embedding properties of \(t^{k}_{ij}\) on each dimension and generalizes the transitional properties in the knowledge prediction model. The score function with multiple feature maps can be expressed as follows:

$$\begin{aligned} f\left( t_{i j}^k\right) =\left( \Vert _{m=1}^{\Omega } {\text {ReLU}}\left( \left[ \textbf{h}_i, \textbf{g}_k, \textbf{h}_j\right] * \omega ^m\right) \right) \cdot {\textbf{W}}. \end{aligned}$$
(21)

where \(\omega ^{m}\) is the \(m^{th}\) convolutional filter, \(\Omega\) is a hyperparameter indicating the number of filters to be used, \(*\) is a convolution operator, and \({\textbf{W}} \in {\mathbb {R}}^{\Omega k \times 1}\) represents the linear transformation matrix.

The model is trained using the soft-margin loss defined as follows:

$$\begin{aligned} {\mathcal {L}}=\sum _{t_{i j}^k \in \left\{ S \cup S^{\prime }\right\} } \log \left( 1+\exp \left( l_{t_{i j}^k} \cdot f\left( t_{i j}^k\right) \right) \right) +\frac{\lambda }{2}\Vert {\textbf{W}}\Vert _2^2. \end{aligned}$$
(22)

where \(l_{t_{i j}^{k}}= {\left\{ \begin{array}{ll}1 &{} \text{ for } t_{i j}^{k} \in S \\ -1 &{} \text{ for } t_{i j}^{k} \in S^{\prime }\end{array}\right. }\).
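A minimal sketch of the ConvKB decoder (Eqs. 21-22), assuming a hypothetical filter count: each triple matrix \([\textbf{h}_i, \textbf{g}_k, \textbf{h}_j]\) is convolved with \(\Omega\) filters of shape \(1\times 3\), and the concatenated feature maps are scored by W.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConvKB(nn.Module):
    def __init__(self, dim=100, n_filters=50):
        super().__init__()
        self.conv = nn.Conv2d(1, n_filters, kernel_size=(1, 3))  # Omega filters omega^m
        self.W = nn.Linear(n_filters * dim, 1, bias=False)       # W in R^{Omega*k x 1}

    def forward(self, h_i, g_k, h_j):                  # each input: (batch, dim)
        t = torch.stack([h_i, g_k, h_j], dim=-1)       # triple matrix, (batch, dim, 3)
        maps = F.relu(self.conv(t.unsqueeze(1)))       # feature maps, (batch, n_filters, dim, 1)
        return self.W(maps.flatten(1)).squeeze(-1)     # Eq. 21: score f(t^k_ij)

def soft_margin_loss(model, score, label, lam=1e-3):
    # Eq. 22; label l = +1 for valid triples in S and -1 for invalid ones in S'.
    # softplus(x) = log(1 + exp(x)) gives a numerically stable form of the loss.
    return F.softplus(label * score).sum() + lam / 2 * model.W.weight.norm() ** 2
```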

Case study: The ceramic knowledge graph for the Palace Museum in China

The Palace Museum, located in the heart of Beijing, is one of the most prominent museums in China. The museum's extensive collection of ancient Chinese art objects, including ceramics, reflects the country's rich cultural heritage and long history. To increase the accessibility of the collection, the museum has developed a digital heritage database that currently contains digital records of 10,624 ceramics. Given the considerable quantity of artefacts and the restricted visualisation capabilities of the database, a novel knowledge management approach is needed to enhance information acquisition.

This case study aims to address these issues by using real ceramics from the Palace Museum in China to help cultural heritage professionals and enthusiasts retrieve ceramic data more efficiently. The study comprises five critical steps: ceramic ontology definition, ceramic dataset construction, knowledge graph extraction, knowledge graph completion, and the knowledge graph application. Figure 7 depicts the case study process. The study employed Python 3.6, PyTorch 2.7, CUDA 10.1, the Neo4j graph database, and SQL Server, and was conducted on Ubuntu 16.04 and Windows 10 operating systems with an I7 7700K CPU and a 1080Ti GPU with 11 GB of memory.

Ceramic ontology definition

Fig. 9 The process of dataset construction

A ceramic ontology is an information structure that underpins a structured knowledge graph. A ceramic cultural heritage ontology was designed in this study. To ensure the systematic development of this ceramic ontology, we adhered to well-established ontology engineering principles [54]. Ceramic ontology definition involves defining the domain and scope, reusing existing ontologies, enumerating critical terms, defining concepts and properties, and instantiating these definitions.

Specifically, the study defines the ontology domain and scope of ceramic cultural heritage and refers to relevant historical ontologies for reference. The ceramic ontology was developed using the CIDOC Conceptual Reference Model (CIDOC CRM) [55], a widely accepted and established ontology model in the cultural heritage domain. It is worth noting that the CIDOC CRM consists of 99 classes and 188 properties. In light of the specific scope of this study, we included a portion of the CIDOC CRM to form the foundation of the ontology.

Then, the core classes of the ontology were determined through extensive consultations with a multidisciplinary team. Essential terms were enumerated based on the CIDOC CRM and the Palace Museum digital platform. The multidisciplinary team for this study comprised cultural heritage researchers and computer programmers, each with at least five years of professional experience to ensure the validity of this study. The cultural heritage researchers handled the classification of ceramic knowledge, such as chronology, processing techniques, shape, function, and excavation sites, while the computer programmers were responsible for the computer technology behind the ceramic knowledge graph. This study analysed the domain ontology with five cultural heritage researchers and two programmers, who worked together to develop it. The process of domain ontology construction was as follows: (1) Opinions on the ontology classification were collected from the multidisciplinary team, and the following topics were discussed: What are the components of knowledge of a ceramic object? What framework can be developed? (2) Focus groups were implemented: the multidisciplinary team discussed the topics, and the researcher recorded the verbal and non-verbal information of the participants on video. (3) The concept classification was derived by analysing and organising the information using content analysis and framework structure methods. The resulting domain ontology concepts include name, dynasty, pattern, shape, colour, glaze, and function. Name refers to the names of objects, people, and organisations; for example, 'Blue and white bowl with pine, bamboo and plum pattern' is the name of a porcelain piece. Examples of dynasty, pattern, shape, and colour include 'Qing dynasty', 'bamboo pattern', 'round shape', and 'blue and white'. Glaze is an impervious layer of a vitreous substance that can colour or decorate a piece, or seal an underlying design or texture.

Additionally, the properties of the concepts were defined by incorporating the inverse property feature and the object-centric nature of the CIDOC CRM ontology, which provided flexibility in the ontology construction. The relations between the concepts were established using a top-down approach, and 32 relations were defined. After the ontological model was constructed, it was populated with data obtained from the website of the Palace Museum in China. The ceramic ontology provides a unified and structured framework for the ceramic knowledge graph, as shown in Fig. 8. We used the standard Web Ontology Language (OWL) as the formalisation language, ensuring the ontology's compatibility and interoperability with various digital platforms. By combining knowledge graphs and deep learning algorithms, we aim to offer a robust methodology for capturing, representing, and exploring complex ceramic data.
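As an illustration of the OWL formalisation step, the sketch below encodes a few of the core concepts and one relation with rdflib; the namespace URI and identifier names are illustrative assumptions, not the study's published ontology.

```python
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF, RDFS

CER = Namespace("http://example.org/ceramic#")   # hypothetical namespace
g = Graph()
g.bind("cer", CER)

for cls in ["Ceramic", "Dynasty", "Pattern", "Shape", "Colour", "Glaze", "Function"]:
    g.add((CER[cls], RDF.type, OWL.Class))       # core concepts from the focus groups

g.add((CER.hasPattern, RDF.type, OWL.ObjectProperty))   # one of the 32 relations
g.add((CER.hasPattern, RDFS.domain, CER.Ceramic))
g.add((CER.hasPattern, RDFS.range, CER.Pattern))

vase = CER["Pastel_nine_peach_vase"]             # instantiation of a definition
g.add((vase, RDF.type, CER.Ceramic))
g.add((vase, CER.hasPattern, CER["peach_tree"]))
print(g.serialize(format="turtle"))
```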

Ceramic dataset construction

After defining the ceramic ontology, the ceramic dataset was constructed. The procedure of ceramic dataset construction is illustrated in Fig. 9. Firstly, ceramic data was scraped from the website of the Palace Museum [56] using Beautiful Soup. The scraping process entailed identifying tag names and classes, extracting full text, and locating hrefs, as sketched in the example below. The statistical overview of the ceramic corpus is shown in Table 1: the corpus includes 1151 porcelain pieces, 4658 images, 639,676 words, and 3674 sentences. The corpus was then annotated using the BRAT software [57]. Finally, the corpus was transformed into a dataset for knowledge graph extraction and a dataset for knowledge graph completion, as shown in Fig. 9(d). The statistical overview of the dataset for knowledge graph extraction is shown in Table 2. Each row of this dataset includes a sentence and the entity-relation triples contained in it, with a total of 3,674 sentences and 8,689 triples. The dataset was partitioned into training, validation, and test sets in a ratio of 8:1:1. The statistical overview of the dataset for knowledge graph completion is shown in Table 3. Each row of this dataset is an entity-relation triple and contains the entity ID, relation ID, entity embedding, and relation embedding. In early experiments, we tried several ways of splitting the datasets: the knowledge extraction model showed better stability and generalisation with an 8:1:1 partition, whereas the knowledge completion model performed better with a 7:2:1 split, since a larger validation set allowed more robust hyperparameter tuning and reduced overfitting.
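The scraping step might look like the following sketch with requests and Beautiful Soup; the tag names, classes, and record fields are placeholders, since the Palace Museum site's actual markup is not reproduced here.

```python
import requests
from bs4 import BeautifulSoup

def scrape_page(url):
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    records = []
    for card in soup.find_all("div", class_="collection-item"):    # assumed class name
        records.append({
            "title": card.find("h3").get_text(strip=True),         # extract full text
            "text": card.find("p").get_text(strip=True),
            "href": card.find("a")["href"],                        # locate hrefs
        })
    return records
```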

For the knowledge graph extraction task, the dataset comprises sentences and entity-relation triples. Each sentence contains several entity-relation triples, with each head entity and tail entity labelled with 'name', 'type', etc., and each relation labelled with 'have name of', 'have pattern of', etc. These labels are the ground truth, annotated manually by the multidisciplinary team. The knowledge extraction model is trained to predict these labels from the sentences, and its performance is evaluated by how well its predictions match the ground truth. For the knowledge graph completion task, the dataset comprises entity-relation triples, each labelled with an entity ID, relation ID, entity embedding, and relation embedding. These labels are likewise ground truth annotated manually by the multidisciplinary team, and the knowledge graph completion model is trained and evaluated against them. To evaluate the consistency of the annotation, we randomly selected 100 sentences and calculated the F-value between two annotators, as defined in Eq. 23; an example calculation follows the equation. The results were 96.2% for entity consistency and 93.5% for relation consistency, demonstrating the reliability of the annotation results.

$$\begin{aligned} P&=\frac{\text {consistent annotations between } A_{1} \text { and } A_{2}}{\text {number of annotations from } A_{2}},\\ R&=\frac{\text {consistent annotations between } A_{1} \text { and } A_{2}}{\text {number of annotations from } A_{1}},\\ F&=\frac{2 \times P \times R}{P+R}. \end{aligned}$$
(23)
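For example, Eq. 23 can be checked with a few lines of Python; the annotation counts below are hypothetical and merely reproduce an F-value of 96.2%.

```python
def consistency_f(consistent, n_a1, n_a2):
    p = consistent / n_a2          # precision of A1 against A2
    r = consistent / n_a1          # recall of A1 against A2
    return 2 * p * r / (p + r)

print(round(consistency_f(250, 260, 260), 3))  # 0.962 when 250 of 260 annotations agree
```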
Table 1 Statistical overview of the crawled ceramic corpus
Table 2 Statistical overview of the ceramic dataset for knowledge graph extraction
Table 3 Statistical overview of the ceramic dataset for knowledge graph completion
Table 4 The parameters of knowledge extraction model
Fig. 10 The comparative results of knowledge extraction

Fig. 11 Test results of knowledge extraction

Table 5 The parameters of knowledge completion model
Table 6 The comparative results of knowledge completion
Fig. 12 Entity embedding visualization

Fig. 13 Training process of knowledge completion model

Knowledge graph extraction

This section discusses the automated extraction of ceramic knowledge and evaluates the knowledge extraction model proposed in "Knowledge extraction model" Section, which was trained through ten-fold cross-validation on the dataset in Table 2. The experiment employed Python 3.6, PyTorch 2.7, CUDA 10.1, and Ubuntu 16.04 with an I7 7700K CPU and a 1080Ti GPU. The validation set was used to determine the thresholds of the two tagging layers and the parameters detailed in Table 4. The model was trained with stochastic gradient descent using the Adam optimizer, and early stopping was implemented to prevent overfitting. The pre-training model contained 12 million parameters and 12 bidirectional Transformer blocks with 256 hidden states and four attention heads. To ensure a fair comparison, input sentences were limited to 100 words [50, 58]. Training stopped after 13 epochs without further improvement, for a total training time of about 40 h. The model achieved an F1-score of 86.7%, a precision of 86.4%, and a recall of 87.1%. Although the loss value was high at the start, it dropped rapidly and approached zero after six epochs, with precision stabilizing as the number of epochs increased.

A comparison was made between the proposed model and state-of-the-art knowledge extraction models, which we detail as follows: (a) NovelTagging [50]: A sequence tagging joint learning model. Its architecture comprises a bidirectional Long Short-Term Memory (BiLSTM) encoding layer and an LSTM decoding layer. (b) GraphRel [58]: A graph neural joint learning model. Its structure includes a BiLSTM encoding layer and a bidirectional Graph Convolutional Network (Bi-GCN) layer for decoding dependency trees. (c) Our Proposed Model (BiLSTM): Noting that the encoding layers of the above models use BiLSTM, the encoding layer of our model was adjusted to BiLSTM for ablation experiments.

We evaluated all models on the constructed ceramic dataset for the knowledge extraction task. For models (a)-(c), we used 300-dimensional word vectors created using the skip-gram variant of the word2vec algorithm [59], trained on the Chinese Wikipedia corpus, for the ablation experiments. Word2vec is an algorithm designed to transform words into numerical word vectors. It maps high-dimensional word features into a low-dimensional space, which allows machines to better capture words and their semantic relationships. Typically, a 300-dimensional word vector balances computational efficiency and word information. By representing words as vectors, we can feed them into neural network models (such as BiLSTM or Bi-GCN) and train these models to recognize patterns across the 300-dimensional space.
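A hedged sketch of producing such vectors with gensim (sg=1 selects the skip-gram variant); the two-sentence corpus is a placeholder for the tokenized Chinese Wikipedia dump.

```python
from gensim.models import Word2Vec

sentences = [["blue", "and", "white", "bowl"], ["plum", "bottle"]]  # placeholder corpus
model = Word2Vec(sentences, vector_size=300, sg=1, window=5, min_count=1, workers=4)
vec = model.wv["bowl"]   # 300-dimensional vector fed to the BiLSTM/Bi-GCN encoders
```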

Figure 10 shows that the proposed model outperformed the other comparative models. NovelTagging employs sequence labelling, which cannot handle overlapping triples; as a result, it achieves the lowest F1-score of the compared models (40.3%). The decoding layer of GraphRel introduces the graph structure and enhances the knowledge extraction process, so its F1-score (55.2%) is higher than that of NovelTagging. The proposed model achieves an F1-score of 86.7% on the ceramic dataset, a relative improvement of 57.1% over GraphRel. Even without pre-training, the proposed model with a BiLSTM encoder achieves an F1-score of 82.2%, a relative improvement of approximately 48.9% over GraphRel. The proposed model with BERT pre-training performs approximately 5.5% better than the variant with a BiLSTM encoder. These comparisons demonstrate the proposed model's superiority over NovelTagging and GraphRel.

Fig. 14 The knowledge graph construction and application

Fig. 15 The home pages of the ceramic knowledge graph

Fig. 16 The searching pages of the ceramic knowledge graph

Fig. 17 The pages of collection, comparison, and management for ceramic treasures

Fig. 18 The detailed pages of ceramic treasures

Knowledge graph completion

This subsection evaluates the knowledge completion model proposed in "Knowledge completion model" Section on the ceramic dataset (Table 3). The performance assessment for the knowledge graph completion task used the same hardware and software configuration as the knowledge graph extraction experiments (Fig. 11). The entity and relation embeddings were initialized with 100-dimensional word vectors generated by the skip-gram variant of word2vec. In Fig. 12, words of greater relevance or of the same type lie close together, illustrating that the semantic information of the words is captured effectively.

The knowledge completion model was trained using ten-fold cross-validation, first training the graph attention layers for encoding and then the ConvKB decoder for relation prediction. The parameters were determined using the validation set, as presented in Table 5. To prevent overfitting, the stochastic gradient descent algorithm with the Adam optimizer was used for training. The proposed model was evaluated using commonly used knowledge completion metrics, achieving a mean rank (MR) of 139, a mean reciprocal rank (MRR) of 47.7%, and Hits@1, Hits@3, and Hits@10 values of 41.7%, 49.9%, and 59.0%, respectively. The total training time was approximately 3 h. The training process is illustrated in Fig. 13, which shows that the loss value decreases as training progresses and eventually stabilizes, indicating the robustness of the model.

A comparison was made with other established models to evaluate the proposed knowledge completion model: TransE [45], ConvKB [48], and R-GCN [49]. As shown in Table 6, the proposed model outperforms TransE, ConvKB, and R-GCN. Specifically, R-GCN, based on the graph convolutional neural network, demonstrates poor knowledge completion performance, reflected in its low values across all five metrics. Although ConvKB, based on the convolutional neural network, shows the best MR value, its other results are worse than those of the proposed model. In contrast, the method presented in this study demonstrates relatively stable and robust results, achieving the highest scores on four of the metrics.

The knowledge graph application

The proposed framework and deep learning models were used in the case study to construct the ceramic knowledge graph. This part of the study employed Python 3.6, PyTorch 2.7, CUDA 10.1, the Neo4j graph database, and SQL Server on Ubuntu 16.04 and Windows 10 operating systems with an I7 7700K CPU and a 1080Ti GPU with 11 GB of memory. The Flask web framework was employed to implement the knowledge extraction and knowledge completion functions automatically and to develop the cultural heritage knowledge graph. Initially, the relevant Flask classes were imported, and instances of these classes were created to serve as the foundation of the WSGI (Web Server Gateway Interface) application. The instantiated object was then used as a decorator, wrapping view functions and returning their responses to the browser. To deploy the models, Gunicorn and Nginx were configured. Following the successful deployment of the knowledge extraction and knowledge completion models, the corresponding functionalities within the cultural heritage domain were effectively realised, as sketched below. The ceramic knowledge graph was constructed with a technical architecture consisting of four layers: data infrastructure storage, knowledge graph architecture, knowledge graph service, and the knowledge graph application (Fig. 14). The data infrastructure layer provides a relational database, knowledge-searching functions, graph data, and a big data platform. The knowledge graph architecture layer produces and manages the knowledge graph. The knowledge graph service layer supports semantic query and retrieval services, reasoning services, and knowledge production services. Finally, the knowledge graph application layer offers a platform for users to search, collect, compare, and manage knowledge. Notably, model training did not require a significant amount of computational resources.
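A minimal sketch of such a Flask service is shown below; the route name, request schema, and the stub model are illustrative assumptions rather than the deployed code.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)                     # the instantiated WSGI application

class _StubExtractor:                     # stand-in for the deployed extraction model
    def predict(self, sentence):
        return [["Pastel nine peach vase", "has pattern of", "peach tree"]]

extraction_model = _StubExtractor()

@app.route("/extract", methods=["POST"])  # the decorator wraps the view function
def extract():
    sentence = request.get_json()["sentence"]
    return jsonify({"triples": extraction_model.predict(sentence)})

# Served behind Gunicorn and Nginx, e.g.: gunicorn -w 4 app:app
```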

The users of the ceramic knowledge graph application can be diverse, ranging from academic researchers and students to cultural heritage professionals, policymakers, and the general public, and each group has unique needs. Academic researchers and students might utilise the ceramic knowledge graph as a tool for deepening their understanding of historical artefacts; their primary need is a convenient and accurate representation of artefacts, contexts, and relationships. Cultural heritage professionals, such as museum curators and heritage site managers, might benefit from using the ceramic knowledge graph to catalogue, organise, and manage collections, and would value features such as easy data editing and robust searching. Policymakers may use the ceramic knowledge graph to inform their decision-making processes and would benefit from data that can be visualised and interpreted easily. Members of the public may be interested in exploring the ceramic knowledge graph out of personal interest or for informal learning, so improving the user-friendliness and interactivity of the knowledge graph visualisations could facilitate more effective engagement. Considering this diversity of user types, including cultural and museum researchers, students, and the general public, we summarised the main user scenarios with expert advice: searching, collection, comparison, management, and detailed information.

In the knowledge graph application section, the interface of the home page is discussed (Fig. 15). The home page provides a convenient and intuitive method for users to navigate through interconnected ceramics information and images simply by left-clicking and dragging to alter the viewing scope. The knowledge graph application represents ceramic information in a highly connected and visually compelling format.

The search page allows users to search for specific terms such as ‘underglaze colour’ and ‘twig pattern’ (Fig. 16). Using a combination of entity-relation triple extraction and text-matching, this model identifies entities and relations from the input sentences and matches related knowledge with those entities and relations. For instance, if a user queries ‘bottles with short neck and plump shoulder’, the knowledge graph application platform will identify the entities ‘short neck’ and ‘plump shoulder’. It will match them with related bottles, such as ‘Yongzheng style bucket colour wrapped branch pattern plum bottle’ and ‘Blue and white wrapped branches flower plum bottle’. The result is a visual representation of associated cases, enabling users to acquire information in a relational, visual, and intuitive way.
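Under the hood, such a query could be served by matching the extracted entities against the Neo4j graph; the Cypher pattern, node labels, and connection details below are illustrative assumptions.

```python
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def match_ceramics(entities):
    # Find ceramics linked to any of the extracted feature entities
    query = (
        "MATCH (c:Ceramic)-[:HAS_SHAPE|HAS_PATTERN]->(f:Feature) "
        "WHERE f.name IN $entities "
        "RETURN c.name AS name, collect(f.name) AS matched"
    )
    with driver.session() as session:
        return [record.data() for record in session.run(query, entities=entities)]

print(match_ceramics(["short neck", "plump shoulder"]))
```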

The collection, comparison, and management page helps users obtain detailed ceramic information (Fig. 17). By clicking on a text or picture, users can see related entities and relationships in the knowledge graph visualisation and learn more about concepts such as 'round belly', 'enamel', 'bottle', and 'flower'. Compared with other cultural heritage knowledge acquisition and management methods, the knowledge graph visualisation presents information in an interconnected, visual, and intuitive way.

The detailed analysis page allows users to obtain further decomposed information (Fig. 18). It is possible to extract information about patterns, functional types, colours, glaze classifications, and shapes using the joint entity-relation triple extraction model of the knowledge graph.

This case study illustrates the effectiveness of the knowledge graph application in acquiring ceramic knowledge for researchers focused on intangible cultural heritage. The knowledge graph application provides a visual representation of interconnected ceramic knowledge. It allows automatic entity and relation extraction, knowledge graph completion, relevant knowledge matching, and visualised case correlations. The ceramic knowledge graph application streamlines the process of acquiring ceramic knowledge, making it a valuable tool for researchers in digital cultural heritage management.

Discussion

The knowledge graph application for digital cultural heritage management

The application of knowledge graphs has many benefits, including enhanced search and discovery, improved interpretation and storytelling, facilitated collaboration and data integration, increased accessibility and inclusivity, and better preservation and conservation, as detailed below (Table 7).

  1. (1)

    Enhanced search and discovery: The knowledge graph application enables museums to provide more accurate and personalised search results, improving the overall user experience. For example, museums have implemented a knowledge graph-based search engine that allows users to search for artefacts by different criteria, such as materials, periods, and regions, and obtain a visual representation of the search results. Additionally, the knowledge graph application play a significant role in managing cultural heritage data by organising and categorising vast amounts of information, aiding in efficient data retrieval and exploration.

(2) Improved interpretation and storytelling: The knowledge graph application helps museums present their collections more engagingly by connecting different artefacts and their contexts. For example, museums can use a knowledge graph to create digital stories that illustrate the life and work of Vincent van Gogh. By incorporating cultural heritage data into knowledge graphs, museums can effectively showcase the historical significance and narratives associated with each artefact, enhancing the interpretive experience for visitors.

(3) Facilitated collaboration and data integration: The knowledge graph application enables museums to share and integrate their data more efficiently, both internally and with external partners. This capability is particularly beneficial in managing cultural heritage data, as it allows the seamless integration of diverse information sources related to art history, including archives, catalogues, and bibliographies.

(4) Increased accessibility and inclusivity: The knowledge graph application can help museums provide more inclusive and accessible experiences for visitors with different backgrounds and interests. For example, museums can create a knowledge graph that allows users to explore collections through different keywords, such as colour, glaze, and shape. By incorporating knowledge graphs into the management of cultural heritage data, museums can foster collaboration between different stakeholders and ensure comprehensive access to valuable information.

(5) Better preservation and conservation: The knowledge graph application can help museums manage their collections more efficiently and effectively by providing a more comprehensive and interconnected view of objects and their metadata. For example, museums can develop a knowledge graph that incorporates information about the physical and chemical properties of their artworks, as well as their historical and cultural contexts, to support preservation and conservation efforts.

The knowledge graph application assists in the organisation and management of cultural heritage data, enabling museums to better understand the relationships between artefacts, make informed decisions regarding conservation methods, and ensure the long-term preservation of cultural heritage for future generations.
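As flagged in point (1) above, criteria-based search becomes straightforward once the knowledge graph is exposed in a standard representation such as RDF. The sketch below uses rdflib and SPARQL; the namespace, property names, and artefact identifiers are illustrative assumptions, not the actual schema of the ceramic knowledge graph.

```python
from rdflib import Graph, Literal, Namespace

EX = Namespace("http://example.org/ceramics/")  # hypothetical namespace

g = Graph()
g.add((EX.plum_bottle_01, EX.material, Literal("porcelain")))
g.add((EX.plum_bottle_01, EX.period, Literal("Yongzheng")))
g.add((EX.plum_bottle_01, EX.region, Literal("Jingdezhen")))

# Search by multiple criteria: material, period, and region.
query = """
PREFIX ex: <http://example.org/ceramics/>
SELECT ?artefact WHERE {
    ?artefact ex:material "porcelain" ;
              ex:period   "Yongzheng" ;
              ex:region   "Jingdezhen" .
}
"""
for row in g.query(query):
    print(row.artefact)
```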

Table 7 Digital outcomes of the ceramic knowledge graph and potential applications

Limitations of knowledge graphs for digital cultural heritage management

Although knowledge graphs significantly improve digital cultural heritage management, several limitations should be addressed.

The first limitation is that the ceramic knowledge graph platform requires a large amount of labelled data. As with any machine learning model, the quality of the results depends on the quantity and quality of the training data. A substantial amount of labelled data is needed to train the knowledge extraction model and the knowledge completion model so that they can recognise and predict entities and relations within the ceramic domain.

The second limitation is the incompleteness of the existing ontology framework compared with CIDOC CRM. The ontology framework used in the knowledge graph platform may not fully capture the nuances and complexities of the ceramic domain, which could limit the accuracy and completeness of the knowledge representation. Addressing this limitation may require extending the ontology framework to align it more closely with the ceramic domain and provide a more comprehensive representation of its knowledge.

The third limitation is the need for further data verification and testing on ceramics and other tangible and intangible cultural heritage. While the ceramic knowledge graph application has shown promise in providing visualised and interconnected ceramic knowledge, its effectiveness and accuracy in other cultural heritage domains require further testing and verification. Further research is needed to demonstrate the feasibility of applying the knowledge graph and the proposed deep learning models to other heritage resources, such as textiles, paintings, and sculptures.

Despite the benefits of the knowledge graph application for managing digital cultural heritage, this study lacks robust user testing and evaluation. In practice, user testing often falls short owing to constraints such as time, cost, and the availability of representative user groups. To address this, a comprehensive user testing framework that accommodates the diversity of the user population, together with regular user feedback cycles, should be adopted to continually refine and improve the system.

The final limitation is the model's dependence on textual data. Although integrating machine learning techniques for multimedia data analysis can enhance knowledge graphs, it adds complexity: it requires significant alterations to the current data pipeline, from multimedia data ingestion to feature extraction and potential schema restructuring. Despite these difficulties, further research into multimodal data integration and machine learning for knowledge graphs is critical.

Conclusion

Effective digital cultural heritage management is often hampered by fragmented data, the lack of intelligent frameworks, and insufficient application support, which poses challenges to accessing and utilising valuable cultural heritage information. This study proposes a digital cultural heritage management approach that combines knowledge graphs and deep learning algorithms to address these issues. Specifically, a knowledge graph-driven framework is proposed to manage cultural heritage data automatically, along with a joint entity-relation triple extraction model that extracts entities and relations for knowledge graph construction. The extraction model uses a BERT-based encoder to contextualise input sentences and extracts entities and relations simultaneously through a cascade decoder, overcoming the challenge of overlapping triples in the knowledge extraction of cultural heritage. Additionally, a knowledge completion model with graph attention is proposed to complete missing knowledge. The results demonstrate that both models outperform well-known baselines. The proposed approach was validated by a case study of ceramic data at the Palace Museum in China, where it significantly enhanced search and discovery, improved interpretation and storytelling, facilitated collaboration and data integration, and increased accessibility, inclusivity, and preservation.

Future work will address the completeness of the ontology framework in the cultural heritage domain. The proposed approach will be further validated on a wider range of cultural artefacts, and future implementations should include user evaluation. Additionally, incorporating multi-modal data, such as audio, video, and image files, will be crucial to capturing a more complete representation of cultural heritage.