Considering Semantics on the Discovery of Relations in Knowledge Graphs
Abstract
Knowledge graphs encode semantic knowledge that can be exploited to enhance different data-driven tasks, e.g., query answering, data mining, ranking or recommendation. However, knowledge graphs may be incomplete, and relevant relations may not be included in the graph, affecting the accuracy of these data-driven tasks. We tackle the problem of relation discovery in a knowledge graph, and devise \(\mathcal {KOI}\), a semantic-based approach able to discover relations in portions of knowledge graphs that comprise similar entities. \(\mathcal {KOI}\) exploits both datatype and object properties to compute the similarity among entities, i.e., two entities are similar if their datatype and object properties have similar values. \(\mathcal {KOI}\) implements graph partitioning techniques that exploit similarity values to discover relations from knowledge graph partitions. We conduct an experimental study on a knowledge graph of TED talks with state-of-the-art similarity measures and graph partitioning techniques. Our observed results suggest that \(\mathcal {KOI}\) is able to discover missing edges between related TED talks that cannot be discovered by state-of-the-art approaches. These results reveal that combining semantics encoded both in the similarity measures and in the knowledge graph structure has a positive impact on the relation discovery problem.
Keywords
Relation discovery, Semantic similarity, Graph partitioning
1 Introduction
Following Linked Data initiatives and exploiting features of Semantic Web technologies, large volumes of data are publicly available in the form of knowledge graphs usually described using the RDF data model, e.g., DBpedia^{1} or YAGO^{2}. Simultaneously, data-driven applications that rely on knowledge graphs are progressively increasing [5]. However, like traditional semi-structured data, knowledge graphs may be incomplete, either because relations among graph entities were unknown at the time the graph was created, or because the knowledge graph creation process failed to completely identify all existing relations. This situation encourages the development of techniques for the discovery of missing relations.
Discovering relations in knowledge graphs requires the analysis of both the semantics encoded in the knowledge graph, and the connectivity or structure of the represented relations. However, the majority of the state-of-the-art approaches are based either on the structure of the graph [2, 10], or on properties of the knowledge graph entities [7, 16]. Although some approaches combine both types of knowledge [21], they do not take into account domain semantics encoded in semantic similarity measures to discover missing relations [15].
In this paper we propose \(\mathcal {KOI}\), an approach for relation discovery in knowledge graphs that considers the semantics of both entities represented in the knowledge graph and their neighborhoods. \(\mathcal {KOI}\) receives as input a knowledge graph, and encodes the semantics about the properties of graph entities and their neighbors in a bipartite graph. Entity neighbors correspond to ego-networks, e.g., the friends of a person in a social network or the set of TED talks related to a given TED talk. \(\mathcal {KOI}\) partitions the bipartite graph into parts of highly similar entities connected to ego-networks that are also similar. Relations are discovered in these parts following the homophily prediction principle, which states that entities with similar characteristics tend to be related to similar entities [13]. Intuitively, the homophily prediction principle allows for relating two entities t1 and t2 whenever they have similar datatype and object property values (neighborhoods).
We evaluate the behavior of \(\mathcal {KOI}\) in a knowledge graph of TED talks^{3}; we crafted this knowledge graph by crawling data from the official TED website (http://www.ted.com/). We compare relations discovered by \(\mathcal {KOI}\) with two baselines of relations identified by the METIS [9] and k-Nearest Neighbors (KNN) algorithms. We empowered KNN with statistical and semantic similarity measures (Sect. 6.3). Experimental outcomes suggest the following: (i) semantics encoded in similarity measures and knowledge graph structure enhance the performance of relation discovery methods; and (ii) \(\mathcal {KOI}\) outperforms state-of-the-art approaches, obtaining higher values of precision and recall.
In summary, the contributions of this paper are:
\(\mathcal {KOI}\), a relation discovery method that implements graph partitioning techniques and relies on semantics encoded in similarity measures and graph structure to discover relations in knowledge graphs;
A knowledge graph describing TED talks crafted from the TED website; and
An empirical evaluation on a real-world knowledge graph of TED talks to analyze the performance of \(\mathcal {KOI}\) with respect to state-of-the-art approaches.
This paper comprises six additional sections. Section 2 motivates our approach with an example, and Sect. 3 introduces preliminary definitions. We explain our approach in Sect. 4 and the related work in Sect. 5. Section 6 reports on experimental results and describes the crafted TED knowledge graph. Section 7 concludes and presents future work ideas.
2 Motivating Example
In this section, we provide an example to motivate the problem of knowledge discovery tackled in this paper. We show an example of relation discovery between TED talks publicly available on the TED website. TED talks are described through textual properties, e.g., title, abstract or tags, and through their relations with other talks, which are used to provide recommendations to users. Relations between talks are defined manually by TED curators, which is a time-consuming task that is prone to omissions. Therefore, it would be helpful to have automatic methods able to ease relation discovery and other curation tasks. We checked the TED website in 2015 and 2016, and compared both versions in order to detect relations between talks that are only represented in the newer version. In total, we observed 62 relations that are included in the 2016 version but not in the 2015 version, i.e., TED curators did not discover these relations until 2016. One example is the relation between the talks The politics of fiction^{4} and The shared wonder of film^{5}. Both talks are present in both versions of the website. However, a relation between them can only be found in the 2016 version. Thus, we can conclude that there are missing relations between TED talks in the 2015 version of the website. An approach able to discover these relations automatically would alleviate the effort of curators and improve the quality (completeness) of the data. Although the relation between The politics of fiction and The shared wonder of film is not included in the 2015 website, the remaining knowledge about these talks suggests a high degree of relatedness between them. We observe that both talks have keywords or tags in common, such as Culture or Storytelling. We also find some expressions in their abstracts or descriptions that, although they do not match exactly, are clearly related, such as identity politics and cultural walls, or film and novel.
Moreover, if their sets of related TED talks are compared, we observe that they share two related talks, The clues to a great story^{6} and The mystery box^{7}. Thus, related talks have properties in common. \(\mathcal {KOI}\) relies on this observation and exploits entity properties to discover missing relations between these entities.
3 Preliminaries
In this section we present definitions required to understand our approach.
Definition 1
(RDF Triple [1]). Let U be a set of RDF URI references, B a set of Blank nodes, and L a set of RDF literals. A tuple \((s, p, o) \in UB \times U \times UBL\) is an RDF triple, where s is called subject, p predicate and o object.
Definition 2
(Knowledge graph [18]). Given a set T of RDF triples, a knowledge graph is a pair \(G=(V, E)\), where \(V = \{s | (s, p, o) \in T\} \cup \{o | (s, p, o) \in T\}\) is a set of entities and \(E=\{(s, p, o) \in T\}\) a set of relations.
Figure 1 shows a portion of a knowledge graph describing TED talks. The predicate vol:hasLink connects related talks, while the rest of predicates correspond to datatype properties and connect talks with string literals.
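Definitions 1 and 2 can be illustrated with a minimal Python sketch. The `Triple` type, the `knowledge_graph` helper and the title literal are assumptions made for illustration and are not part of the original work; only the identifiers ted:256, ted:59 and ted:73 and the predicate vol:hasLink are taken from the text.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    s: str  # subject
    p: str  # predicate
    o: str  # object


def knowledge_graph(triples):
    """Build G = (V, E): V collects all subjects and objects, E is the triple set."""
    edges = set(triples)
    entities = {t.s for t in edges} | {t.o for t in edges}
    return entities, edges


# Hypothetical triples; the title literal is made up for this sketch.
T = [
    Triple("ted:256", "vol:hasLink", "ted:59"),
    Triple("ted:256", "vol:hasLink", "ted:73"),
    Triple("ted:256", "dc:title", "An example title"),
]
V, E = knowledge_graph(T)
print(sorted(V))
```

Note that, following Definition 2 literally, literals that appear as objects (here the title string) also end up in V.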
Definition 3
(Ego-Network). Let \(G=(V, E)\) be a knowledge graph and \(L =\{p \mid (s, p, o) \in E\}\) be a set of predicates. Given an entity \(v_i \in V\) and a predicate \(r \in L\), the ego-network of \(v_i\) according to r is defined as the set of entities connected to \(v_i\) through an edge with predicate r: ego-net\((v_i, r)=\{v_j \mid (v_i,r, v_j)\; \in \;E \}\).
The ego-network of the entity ted:256 with respect to the predicate vol:hasLink (Fig. 1) is formed by entities ted:59, ted:73, and ted:184.
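As a sketch of Definition 3, the ego-network can be computed with a single filter over the edge set. The function name `ego_net` and the extra edges in the sample data are assumptions; the expected ego-network of ted:256 is the one stated in the text.

```python
def ego_net(edges, v, r):
    """Return the entities reachable from v through an edge labelled r."""
    return frozenset(o for (s, p, o) in edges if s == v and p == r)


# Edges of ted:256 follow the example above; the remaining two are hypothetical.
E = {
    ("ted:256", "vol:hasLink", "ted:59"),
    ("ted:256", "vol:hasLink", "ted:73"),
    ("ted:256", "vol:hasLink", "ted:184"),
    ("ted:256", "dc:title", "hypothetical title"),
    ("ted:59", "vol:hasLink", "ted:73"),
}
ego = ego_net(E, "ted:256", "vol:hasLink")
print(sorted(ego))
```

A frozenset is used so that ego-networks can later serve as hashable nodes of a bipartite graph.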
4 Our Approach: \(\mathcal {KOI}\)
4.1 Problem Definition
Let \(G'=(V, E')\) and \(G=(V, E)\) be two knowledge graphs. \(G'\) is an ideal knowledge graph that contains all the existing relations between entities in V. G is the actual knowledge graph, which contains only a portion of the relations represented in \(G'\), i.e., \(E \subseteq E'\). Let \(\varDelta (E', E) = E' - E\) be the set of relations existing in the ideal graph that are not represented in the actual knowledge graph G, and \(G_\text {comp}=(V, E_\text {comp})\) the complete knowledge graph, which contains a relation for each possible combination of entities and predicates, so that \(E\subseteq E'\subseteq E_\text {comp}\).
Given a relation \(e \in \varDelta (E_\text {comp}, E)\), the relation discovery problem consists of determining if \(e \in E'\), i.e., if a relation e corresponds to an existing relation in the ideal graph \(G'\).
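To make the size of the search space concrete, the following sketch enumerates \(E_\text {comp}\) and \(\varDelta (E_\text {comp}, E)\) for a toy graph. The helper names and the three-entity graph are hypothetical; the sketch only mirrors the set definitions above.

```python
from itertools import product


def complete_edges(entities, predicates):
    """E_comp: one relation per (subject, predicate, object) combination."""
    return {(s, p, o) for s, o in product(entities, repeat=2) for p in predicates}


def delta(e_big, e_small):
    """Relations present in e_big but missing from e_small."""
    return e_big - e_small


V = {"a", "b", "c"}
E = {("a", "r", "b")}
E_comp = complete_edges(V, {"r"})
unknown = delta(E_comp, E)
print(len(E_comp), len(unknown))
```

With |V| entities and a single predicate, \(E_\text {comp}\) already holds |V|² relations, so the number of relations whose membership in \(E'\) has to be decided grows quadratically with the number of entities.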
4.2 Our Solution
We propose \(\mathcal {KOI}\), a relation discovery method for knowledge graphs that considers semantics encoded in similarity measures and the knowledge graph structure. \(\mathcal {KOI}\) implements an unsupervised graph partitioning approach to identify parts of the graph from which relations are discovered. \(\mathcal {KOI}\) applies the homophily prediction principle to each part of the partitioned bipartite graph, in a way that two entities with similar characteristics are related to similar entities. Similarity values are computed based on: (a) the neighbors or ego-networks of two entities, and (b) their datatype property values (e.g., textual descriptions).
Bipartite Graph Creation. Determining the membership of each relation \(e \in \varDelta (E_\text {comp},E)\) in \(E'\) is expensive in terms of time due to the large number of relations included in \(\varDelta (E_\text {comp},E)\), and may produce a large number of false positives. \(\mathcal {KOI}\) leverages the homophily intuition to tackle this problem by finding highly similar portions of the graph, i.e., portions including entities with similar ego-networks and similar datatype property values. In order to consider both similarities at the same time, \(\mathcal {KOI}\) builds a bipartite graph where each entity is associated with its ego-network. The objective is to find a partitioning of this graph, such that each part contains highly similar entities and highly similar ego-networks. Thus, the \(\mathcal {KOI}\) graph partitioning problem is an optimization problem where these two similarities are maximized on entities of each part.
Definition 4
(\(\mathcal {KOI}\) Bipartite Graph). Let \(G=(V, E)\) be a knowledge graph and \(L =\{p \mid (s, p, o) \in E\}\) be a set of predicates. Given a predicate \(r \in L\), the \(\mathcal {KOI}\) Bipartite Graph of G and r is defined as \(BG(G,r) = (V \cup U(r), E_{BG}(r))\), where \(U(r) = \{\text {ego-net}(v_i,r) \mid v_i \in V\}\) is the set of ego-networks of entities in V, and \(E_{BG}(r) = \{(v_i, u_i) \mid v_i \in V \wedge u_i = \text {ego-net}(v_i, r)\}\) is the set of edges that associate each entity with its ego-network.
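Definition 4 can be sketched as follows. The function names and the three-talk sample graph are assumptions for illustration; the construction itself pairs every entity with its ego-network as the definition states.

```python
def ego_net(edges, v, r):
    """Entities connected to v through predicate r (Definition 3)."""
    return frozenset(o for (s, p, o) in edges if s == v and p == r)


def koi_bipartite_graph(entities, edges, r):
    """Return (U, E_BG): the ego-network nodes and the entity-to-ego-network edges."""
    e_bg = {(v, ego_net(edges, v, r)) for v in entities}
    u = {ego for _, ego in e_bg}
    return u, e_bg


# Hypothetical toy graph with three talks.
V = {"ted:256", "ted:59", "ted:73"}
E = {
    ("ted:256", "vol:hasLink", "ted:59"),
    ("ted:256", "vol:hasLink", "ted:73"),
    ("ted:59", "vol:hasLink", "ted:73"),
}
U, E_BG = koi_bipartite_graph(V, E, "vol:hasLink")
print(len(U), len(E_BG))
```

Because U is a set, two entities with identical ego-networks share one node on the ego-network side; entities without outgoing r-edges are paired with the empty ego-network.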
Bipartite Graph Partitioning. To identify portions of the knowledge graph where the homophily prediction principle can be applied, the bipartite graph BG(G, r) is partitioned in a way that entities in each part are highly similar (i.e., similar datatype properties) and connected (i.e., have similar ego-networks).
Definition 5
Each part \(p_i\) contains a set of edges \(p_i = \{(v_x, u_x) \in E_{BG}\}\),
Each edge \((v_x, u_x)\) in \(E_{BG}\) belongs to one and only one part p of \(P(E_{BG})\), i.e., \(\forall p_i, p_j \in P(E_{BG}), p_i \cap p_j = \emptyset \) and \(E_{BG} = \bigcup _{p \in P(E_{BG})} p\).
Definition 6
Density(\(P(E_{BG})\))=\(\sum _{p \in P(E_{BG})} (\text {partDensity}(p))\), and
\( \text {partDensity}(p) = \overbrace{\frac{\sum _{v_i,v_j \in V_p}[v_i \ne v_j]S_v(v_i, v_j)}{|V_p|(|V_p| - 1)}}^{(A)}+ \overbrace{\frac{\sum _{u_i,u_j \in U_p}[u_i \ne u_j]S_u(u_i, u_j)}{|U_p|(|U_p| - 1)}}^{(B)}\)
\(\mathcal {KOI}\) utilizes the partitioning algorithm proposed by Palma et al. [15] to solve the optimization problem of partitioning a \(\mathcal {KOI}\) bipartite graph.
The bipartite graph in Fig. 3a is partitioned into two parts represented in Fig. 3b. Entities of the part in the bottom are \(V_p=\{ted:256 , ted:595 , ted:184 \}\) and their corresponding ego-networks are \(U_p=\{u_{256}, u_{595}, u_{184}\}\). In order to calculate the partDensity of this part, we compare pair-wise entities in \(V_p\) with \(S_v\) and ego-networks in \(U_p\) with \(S_u\). Thus, we compute the similarity \(S_v\) for entity pairs \(S_v(\text {ted:256 }\!, \text {ted:595 })\), \(S_v(\text {ted:256 }\!, \text {ted:184 })\), and \(S_v(\text {ted:595 }\!, \text {ted:184 })\), and the similarity \(S_u\) for ego-networks pairs \(S_u(u_{256}, u_{595})\), \(S_u(u_{256}, u_{184})\), and \(S_u(u_{595}, u_{184})\). The computed partDensity value is in this case 0.775.
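The density computation can be sketched in Python as below. The similarity values are hypothetical and chosen only to keep the arithmetic easy to follow, so the result does not reproduce the 0.775 reported above; ego-networks are represented by string labels, and returning 0 for a part with fewer than two elements is an assumption, since the formula is undefined in that case.

```python
from itertools import permutations


def mean_pairwise(items, sim):
    """Average similarity over all ordered pairs of distinct items."""
    items = list(items)
    n = len(items)
    if n < 2:
        return 0.0  # assumption: the formula is undefined for fewer than two items
    return sum(sim(a, b) for a, b in permutations(items, 2)) / (n * (n - 1))


def part_density(part, s_v, s_u):
    """Term (A) over the entities of the part plus term (B) over its ego-networks."""
    vs = {v for v, _ in part}
    us = {u for _, u in part}
    return mean_pairwise(vs, s_v) + mean_pairwise(us, s_u)


def density(partition, s_v, s_u):
    """Sum of partDensity over all parts of the partition."""
    return sum(part_density(p, s_v, s_u) for p in partition)


# Hypothetical symmetric similarity tables.
sv_table = {
    frozenset(("ted:256", "ted:595")): 0.5,
    frozenset(("ted:256", "ted:184")): 0.25,
    frozenset(("ted:595", "ted:184")): 0.75,
}
su_table = {
    frozenset(("u256", "u595")): 0.25,
    frozenset(("u256", "u184")): 0.25,
    frozenset(("u595", "u184")): 0.25,
}
s_v = lambda a, b: sv_table[frozenset((a, b))]
s_u = lambda a, b: su_table[frozenset((a, b))]

part = {("ted:256", "u256"), ("ted:595", "u595"), ("ted:184", "u184")}
print(part_density(part, s_v, s_u))
```

With these values, term (A) averages to 0.5 and term (B) to 0.25, giving a partDensity of 0.75.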
Candidate Relation Discovery. \(\mathcal {KOI}\) applies the homophily prediction principle in the parts of a partition of a \(\mathcal {KOI}\) bipartite graph, and discovers relations between entities included in the same part.
Definition 7
(Candidate relation). Let \(G=(V, E)\) and \(G_{comp}=(V, E_{comp})\) be two knowledge graphs, \(BG(G,r) = (V \cup U, E_{BG})\) a \(\mathcal {KOI}\) bipartite graph, and \(P(E_{BG})\) a partition of \(E_{BG}\). Given a part \(p= \{(v_x, u_x) \in E_{BG}\} \in P(E_{BG})\), the set of candidate relations CDR(p) in part p corresponds to the set of relations \(\{(v_i, r, v_j) \in E_{comp}\}\) such that \(v_j\) is included in some ego-network \(u_x\) and the edges \((v_i, u_i)\) and \((v_x, u_x)\) are contained in the part p.
In Fig. 3b candidate relations are represented as red dashed lines. One example is the relation (ted:59, vol:hasLink, ted:595). This candidate relation is discovered due to the presence of ted:59 and ego-net(ted:73, vol:hasLink) in the same part and the inclusion of the entity ted:595 in the ego-network ego-net(ted:73, vol:hasLink).
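A minimal sketch of Definition 7 follows. The contents of the two ego-networks are hypothetical, and skipping self-relations and relations that already exist in E is an assumption of this sketch rather than something the definition states.

```python
def candidate_relations(part, r, known_edges=frozenset()):
    """Relate every entity of a part to every member of any ego-network in that part."""
    sources = {v for v, _ in part}
    targets = set().union(*(u for _, u in part))
    return {
        (vi, r, vj)
        for vi in sources
        for vj in targets
        # Assumption: self-relations and already known relations are not candidates.
        if vi != vj and (vi, r, vj) not in known_edges
    }


# Hypothetical part with two edges of the bipartite graph.
part = {
    ("ted:59", frozenset({"ted:73"})),
    ("ted:73", frozenset({"ted:595"})),
}
E = {
    ("ted:59", "vol:hasLink", "ted:73"),
    ("ted:73", "vol:hasLink", "ted:595"),
}
cdr = candidate_relations(part, "vol:hasLink", E)
print(cdr)
```

In this toy part, ted:59 shares the part with the ego-network of ted:73, which contains ted:595, so the only new candidate is (ted:59, vol:hasLink, ted:595), in line with the example above.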
Constraint Satisfaction. A relation constraint is a set of RDF constraints that states conditions that must be satisfied by a candidate discovered relation in order to become a discovered relation, i.e., relations belonging to the ideal knowledge graph. RDF constraints are expressed using the SPARQL language as suggested by Lausen et al. [11] and Fischer et al. [3]. Only the candidate relations that fulfill relation constraints are considered as discovered relations.
Definition 8
(Discovered Relations). Given a set of candidate relations CDR and a set of relation constraints S, the set of discovered relations DR is defined as the subset of candidate relations that satisfy the given constraints.
Listing 1.1 illustrates a constraint that states a condition for a candidate discovered relation \(cdr=(v_i\; r\; v_j)\) to become a discovered relation. Whenever the candidate discovered relation \(cdr=(v_i\; r \; v_j)\) is identified in several parts of a partition P, the number of times that cdr appears is taken into account, as well as the similarity between the ego-network of \(v_i\) and the ego-networks where \(v_j\) is included. To determine if the constraint is satisfied, a score is computed and the value of this score has to be greater than a threshold \(\theta _i\). The score is defined as the product of the number of times a relation is discovered and the similarity between corresponding ego-networks. For each discovered relation, Fig. 4 contains the value of the corresponding score described in Listing 1.1. Relation (ted:256, vol:hasLink, ted:595) gets the highest value for this score, as it is discovered four times in Fig. 3b. Moreover, the similarity between ego-networks ego-net(ted:595,vol:hasLink) and ego-net(ted:184, vol:hasLink) is 0.5. The constraint, specified as an ASK query, holds if at least one score value is greater than the threshold \(\theta \). Therefore, we consider only the maximum similarity value between the ego-networks.
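The scoring rule described above can be sketched in Python. This is not the SPARQL constraint of Listing 1.1; the function names and the input numbers are illustrative assumptions, and the sketch only encodes the stated rule: the count multiplied by the maximum ego-network similarity must exceed the threshold.

```python
def score(times_discovered, ego_similarities):
    """Number of discoveries times the maximum ego-network similarity."""
    return times_discovered * max(ego_similarities, default=0.0)


def is_discovered(times_discovered, ego_similarities, theta):
    """A candidate becomes a discovered relation when its score exceeds theta."""
    return score(times_discovered, ego_similarities) > theta


# Illustrative values: a candidate found four times with a best similarity of 0.5.
print(score(4, [0.5, 0.2]))
print(is_discovered(4, [0.5, 0.2], theta=0.7))
print(is_discovered(1, [0.5], theta=0.7))
```

Taking the maximum is equivalent to requiring that at least one of the per-ego-network scores exceeds the threshold, because the count is the same for all of them.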
5 Related Work
Palma et al. [15] and Flores et al. [4] present approaches for relation discovery in heterogeneous bipartite graphs. Palma et al. present semEP, a semantic-based graph partitioning approach that finds the minimal partition of a weighted bipartite graph with highest density. semEP utilizes parts in the same way \(\mathcal {KOI}\) does, in order to find missing relations. However, they consider entities as isolated elements and do not consider their ego-networks during the partitioning process. esDSG [4] performs similarly to semEP, i.e., given a weighted bipartite graph, esDSG identifies a subgraph that is highly dense and comprises highly similar entities. Again, ego-networks are not considered.
Researchers in the social network field study the structure of friendship-induced graphs, and define the concept of ego-network as the set of entities that are at one-hop distance from a given entity. Epasto et al. [2] report high-quality results in the friend suggestion task by analyzing the ego-networks of the induced knowledge graphs. In this case, the discovery of the relations is based purely on the ego-network of the entities and no datatype property value is considered.
Redondo et al. [7] propose an approach to discover relations between video fragments based on visual information and background knowledge extracted from the Web of data in the form of annotations. Like [4, 15], entities or video fragments are considered as isolated elements in the knowledge graph, and the similarity is computed as the number of coincident annotations between two video fragments.
Sachan and Ichise [21] discover relations between authors in a co-author network extracted from dblp. Their approach is based on the dense subgraph approach. They consider the connections in the knowledge graph as well as some features of the authors and of their papers, such as the keywords. However, the comparison of such features is performed at the syntactic level, and the semantics is omitted.
Kastrin et al. [10] present an approach to discover relations among biomedical terms. They build a knowledge graph with such terms with the help of SemRep [20], a tool for recovering semantic propositions from the literature. In this case, not only the existence of a relation is important, but also its type. Unlike \(\mathcal {KOI}\), they only consider the graph topology, discarding semantic knowledge encoded in datatype properties.
Nunes et al. [14] link entities based on the number of co-occurrences in a text corpus and distance, measured in number of hops, between the entities in a knowledge graph. Unlike \(\mathcal {KOI}\), this approach needs a corpus labeled with entities and only takes into account the object properties, omitting the semantics encoded in datatype properties.
6 Empirical Evaluation
6.1 Knowledge Graph Creation
In this section we describe the characteristics of the crafted TED knowledge graph and its links to external vocabularies. This knowledge graph is built from a real-world dataset of TED talks and playlists^{8}.
Each TED talk is described with the following datatype properties:
dc:title (Dublin Core vocabulary) represents the title of the talk;
dc:creator models the speaker;
dc:description represents the abstract;
and ted:relatedTags corresponds to the set of related keywords.
Apart from the datatype properties, TED talks are connected to playlists that include them through the object property ted:playlist. A vol:hasLink (Vocabulary Of Links^{9}) object property connects each pair of talks that are together in at least one playlist. We crawled the playlists available on the TED website^{10}. Playlists contain sets of TED talks that usually address similar topics. TED playlists are created and maintained by curators, who decide whether a certain video is included in a certain playlist.
Unlike the knowledge graph created by Taibi et al. [23], our knowledge graph of TED talks includes information about the playlists, the relations between TED talks, and four similarity values for each pair of talks (TFIDF, ESA, Doc2Vec, and Doc2Vec Neighbors). The knowledge graph of TED talks is publicly available at https://goo.gl/7TnsqZ.
6.2 Experimental Configuration
We empirically evaluate the effectiveness of \(\mathcal {KOI}\) to discover missing relations in the 2015 TED knowledge graph, which is based on a real-world dataset. We compare \(\mathcal {KOI}\) with METIS [9] and k-Nearest Neighbors (KNN) empowered with four similarity measures: TFIDF, ESA, Doc2Vec, and Doc2Vec Neighbors.
Research Questions: We aim at answering the following research questions: (RQ1) Does semantics encoded in similarity measures affect the relation discovery task? In order to answer this question we compare four similarity measures, one statistical measure (TFIDF) and three semantic similarity measures (ESA [6], Doc2Vec [12], and Doc2Vec Neighbors). Doc2Vec Neighbors considers both the semantics encoded in datatype properties and the structure of the graph by taking into account the ego-networks. (RQ2) Is \(\mathcal {KOI}\) able to outperform common discovery approaches such as METIS or KNN?
Implementation: We implemented \(\mathcal {KOI}\) in Java 1.8 and executed the experiments on an Ubuntu 14.04 64-bit machine with CPU: Intel(R) Core(TM) i5-4300U 1.9 GHz (4 physical cores) and 8GB RAM. In order to perform a fair evaluation, we used the library WEKA [8] version 3.7.12 to split the dataset following the 10-fold cross-validation strategy. The cross-validation was performed over the set of relations among TED talks. In order to discover relations using the METIS solver version 5.1^{13}, we apply METIS on a KOI Bipartite Graph with the same similarity measures \(S_u\) and \(S_v\) specified above for \(\mathcal {KOI}\). METIS returns a partitioning of the given graph, and we produce candidate discovered relations as explained in Sect. 4. In order to perform a fair comparison, the same constraint (Listing 1.1) is applied to the results of both \(\mathcal {KOI}\) and METIS.
Evaluation Metrics: For each discovery approach, we compute the following metrics: (i) Precision: ratio between the number of correctly discovered relations and the whole set of discovered relations. (ii) Recall: ratio between the number of correctly discovered relations and the number of existing relations in the dataset. (iii) F-Measure: harmonic mean of precision and recall. Values shown in Tables 1 and 2 are the average values over the 10 folds. Moreover, we draw the F-Measure curves for \(\mathcal {KOI}\) and METIS and calculate the Precision-Recall Area Under the Curve (AUC) coefficients (Table 3).
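The three metrics can be sketched in a few lines of Python. The function name and the sample relations are hypothetical; returning 0 when a denominator is empty is an assumption of this sketch.

```python
def precision_recall_f1(discovered, existing):
    """Compare discovered relations against the relations known to exist."""
    discovered, existing = set(discovered), set(existing)
    correct = len(discovered & existing)
    precision = correct / len(discovered) if discovered else 0.0
    recall = correct / len(existing) if existing else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical relations: 4 discovered, 5 existing, 2 in common.
discovered = {("a", "r", "b"), ("a", "r", "c"), ("b", "r", "c"), ("c", "r", "d")}
existing = {("a", "r", "b"), ("b", "r", "c"), ("a", "r", "d"), ("b", "r", "d"), ("d", "r", "a")}
p, r, f = precision_recall_f1(discovered, existing)
print(p, r, round(f, 3))
```

Here precision is 2/4, recall is 2/5, and the F-Measure is their harmonic mean, roughly 0.444.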
6.3 Discovering Relations with K-Nearest Neighbors
In our first experiment, we discover relations in the graph using the K-Nearest Neighbors (KNN) algorithm under the hypothesis that highly similar TED talks should be related. Given a talk, we discover a relation between it and its K most similar talks. This experiment evaluates the impact of considering semantics encoded in domain similarity measures during the relation discovery task (RQ1).
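The KNN baseline described above can be sketched as follows. The similarity table is hypothetical and stands in for any of the four measures; the function name and the tie-breaking rule (alphabetical order of talk identifiers) are assumptions of this sketch.

```python
def knn_relations(talks, sim, k, r="vol:hasLink"):
    """Link every talk to its k most similar other talks."""
    relations = set()
    ordered = sorted(talks)  # fixed order so that ties are broken deterministically
    for t in ordered:
        others = [o for o in ordered if o != t]
        nearest = sorted(others, key=lambda o: sim(t, o), reverse=True)[:k]
        relations.update((t, r, o) for o in nearest)
    return relations


# Hypothetical symmetric similarities between three talks.
table = {
    frozenset(("a", "b")): 0.9,
    frozenset(("a", "c")): 0.1,
    frozenset(("b", "c")): 0.5,
}
sim = lambda x, y: table[frozenset((x, y))]

found = knn_relations({"a", "b", "c"}, sim, k=1)
print(sorted(found))
```

Unlike the partition-based approach, this baseline always proposes exactly k relations per talk, regardless of how low the similarity of the k-th neighbor is.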
Table 1. Effectiveness of KNN. D2V = Doc2Vec, D2VN = Doc2Vec Neighbors. D2VN presents the best results with an F-measure of 0.285 for \(K=4\). Relevance of the knowledge encoded in ego-networks is reported.
6.4 Effectiveness of \(\mathcal {KOI}\) Discovering Relations
Table 2. Comparison of \(\mathcal {KOI}\) and METIS. Values of \(\theta \) correspond to the value of variable THETA of the constraint in Listing 1.1.
Table 3. Area Under the Curve coefficients for \(\mathcal {KOI}\), KNN Doc2Vec Neighbors and METIS
Approach | AUC | F-Measure |
---|---|---|
\(\mathcal {KOI}\) | 0.396 | 0.512 |
METIS | 0.244 | 0.39 |
KNN D2VN | 0.223 | 0.285 |
Table 2 contains the obtained results with \(\mathcal {KOI}\) and METIS. The highest F-measure value is 0.512 and is obtained by \(\mathcal {KOI}\) with \(\theta = 0.7\). This F-measure value is higher than the one obtained with KNN and Doc2Vec Neighbors (0.285) and also higher than the maximum value obtained by METIS (0.39). We also observe that the parameter \(\theta \), which corresponds to THETA in Listing 1.1, can be configured depending on the respective importance of precision and recall. Lower values of \(\theta \) deliver high values of recall, while high values of \(\theta \) deliver high values of precision. Figure 5 shows the F-Measure curve for values of \(\theta \in [0,2]\). \(\mathcal {KOI}\) is able to get higher F-Measure values for almost all \(\theta \) values. We also computed the Precision-Recall Curve for \(\mathcal {KOI}\), METIS and KNN Doc2Vec Neighbors. Table 3 shows that \(\mathcal {KOI}\) gets a higher AUC value (0.396) than METIS (0.244) and KNN (0.223).
7 Conclusions and Future Work
In this paper we present \(\mathcal {KOI}\), an approach that exploits semantics and graph structure information in order to discover missing relations in a knowledge graph. \(\mathcal {KOI}\) considers semantics encoded in entities and their ego-networks to identify relations between entities with similar datatype properties and similar ego-networks. Reported experimental results suggest that \(\mathcal {KOI}\) outperforms state-of-the-art approaches that: (i) do not consider semantics (KNN TFIDF), or (ii) do not identify graph portions containing highly similar entities (KNN D2VN and METIS). In the future, we plan to extend \(\mathcal {KOI}\) to take into account domain-specific knowledge in graphs of more specific domains, e.g., social network, financial, or clinical data. Further, we plan to extend \(\mathcal {KOI}\) to consider the relevance or importance of the entities in ego-networks, as well as to discover relations between different types of entities, e.g., drugs and proteins.
Acknowledgements
This work is supported by the German Ministry of Education and Research within the SHODAN project (Ref. 01IS15021C) and the German Ministry of Economy and Technology within the ReApp project (Ref. 01MA13001A).
References
- 1. Arenas, M., Gutierrez, C., Pérez, J.: Foundations of RDF databases. In: Tessaris, S., Franconi, E., Eiter, T., Gutierrez, C., Handschuh, S., Rousset, M.-C., Schmidt, R.A. (eds.) Reasoning Web. LNCS, vol. 5689, pp. 158–204. Springer, Heidelberg (2009)
- 2. Epasto, A., Lattanzi, S., Mirrokni, V., Sebe, I.O., Taei, A., Verma, S.: Ego-net community mining applied to friend suggestion. VLDB Endow. 9(4), 324–335 (2015)
- 3. Fischer, P.M., Lausen, G., Schätzle, A., Schmidt, M.: RDF constraint checking. In: EDBT/ICDT 2015 Joint Conference (2015)
- 4. Flores, A., Vidal, M., Palma, G.: Exploiting semantics to predict potential novel links from dense subgraphs. In: 9th Alberto Mendelzon International Workshop on Foundations of Data Management (2015)
- 5. Fundulaki, I., Auer, S.: Linked open data - introduction to the special theme. ERCIM News 2014(96) (2014)
- 6. Gabrilovich, E., Markovitch, S.: Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In: IJCAI, vol. 7 (2007)
- 7. García, J.L.R., Sabatino, M., Lisena, P., Troncy, R.: Detecting hot spots in web videos. In: ISWC Poster and Demo Track. CEUR-WS.org (2014)
- 8. Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
- 9. Karypis, G., Kumar, V.: A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM J. Sci. Comput. 20(1) (1998)
- 10. Kastrin, A., Rindflesch, T.C., Hristovski, D.: Link prediction on the semantic MEDLINE network - an approach to literature-based discovery. In: Džeroski, S., Panov, P., Kocev, D., Todorovski, L. (eds.) DS 2014. LNCS, vol. 8777, pp. 135–143. Springer, Heidelberg (2014)
- 11. Lausen, G., Meier, M., Schmidt, M.: Sparqling constraints for RDF. In: 11th International Conference on Extending Database Technology, EDBT. ACM (2008)
- 12. Le, Q.V., Mikolov, T.: Distributed representations of sentences and documents. CoRR, abs/1405.4053 (2014)
- 13. Liben-Nowell, D., Kleinberg, J.: The link-prediction problem for social networks. J. Am. Soc. Inf. Sci. Technol. 58(7), 1019–1031 (2007)
- 14. Pereira Nunes, B., Dietze, S., Casanova, M.A., Kawase, R., Fetahu, B., Nejdl, W.: Combining a co-occurrence-based and a semantic measure for entity linking. In: Cimiano, P., Corcho, O., Presutti, V., Hollink, L., Rudolph, S. (eds.) ESWC 2013. LNCS, vol. 7882, pp. 548–562. Springer, Heidelberg (2013). doi:10.1007/978-3-642-38288-8_37
- 15. Palma, G., Vidal, M.-E., Raschid, L.: Drug-target interaction prediction using semantic similarity and edge partitioning. In: Mika, P., et al. (eds.) ISWC 2014, Part I. LNCS, vol. 8796, pp. 131–146. Springer, Heidelberg (2014)
- 16. Pappas, N., Popescu-Belis, A.: Combining content with user preferences for ted lecture recommendation. In: 11th International Workshop on Content Based Multimedia Indexing. IEEE (2013)
- 17. Pérez, J., Arenas, M., Gutierrez, C.: Semantics and complexity of SPARQL. ACM Trans. Database Syst. 34(3), 30–43 (2009)
- 18. Pirró, G.: Explaining and suggesting relatedness in knowledge graphs. In: Arenas, M., et al. (eds.) ISWC 2015. LNCS, vol. 9366, pp. 622–639. Springer, Heidelberg (2015). doi:10.1007/978-3-319-25007-6_36
- 19. Řehůřek, R., Sojka, P.: Software framework for topic modelling with large corpora. In: LREC 2010 Workshop on New Challenges for NLP Frameworks. ELRA (2010). http://is.muni.cz/publication/884893/en
- 20. Rindflesch, T.C., Kilicoglu, H., Fiszman, M., Rosemblat, G., Shin, D.: Semantic MEDLINE: an advanced information management application for biomedicine. Inf. Serv. Use 31(1–2), 15–21 (2011)
- 21. Sachan, M., Ichise, R.: Using semantic information to improve link prediction results in network datasets. Int. J. Eng. Technol. 2(4), 71–76 (2010)
- 22. Schwartz, J., Steger, A., Weißl, A.: Fast algorithms for weighted bipartite matching. In: Nikoletseas, S.E. (ed.) WEA 2005. LNCS, vol. 3503, pp. 476–487. Springer, Heidelberg (2005)
- 23. Taibi, D., Chawla, S., Dietze, S., Marenzi, I., Fetahu, B.: Exploring TED talks as linked data for education. Br. J. Educ. Technol. 46(5), 1092–1096 (2015)