LinkSUM: Using Link Analysis to Summarize Entity Data
The amount of structured data published on the Web is constantly growing. A significant part of this data is published in accordance with the Linked Data principles. The explicit graph structure enables machines and humans to retrieve descriptions of entities and discover information about relations to other entities. In many cases, descriptions of single entities include thousands of statements, and it becomes difficult for human users to comprehend the data unless a selection of the most relevant facts is provided.
In this paper we introduce LinkSUM, a lightweight link-based approach for the relevance-oriented summarization of knowledge graph entities. LinkSUM optimizes the combination of the PageRank algorithm with an adaptation of the Backlink method together with new approaches for predicate selection. Both quantitative and qualitative evaluations have been conducted to study the performance of the method in comparison to an existing entity summarization approach. The results show a significant improvement over the state of the art and lead us to conclude that prioritizing the selection of related resources leads to better summaries.
Keywords: Entity summarization, Linked data, Knowledge graph, Information filtering
A significant part of search engine result pages (SERPs) is nowadays dedicated to knowledge graph panels about entities (e. g., Fig. 1). In that context, a large amount of information about searched entities is often readily available to be presented to the user in a structured way. In its complete form, data about a single entity may involve thousands of statements, which is more than a human user can take in. Therefore, fact-based information is often filtered and presented with a pre-defined set of predicates, such as “name, age, and date of birth” in the case of persons. Such a listing is usually associated with fixed patterns and static type-based orderings. However, as each entity is special in its own way, it would be more appropriate to select relevant facts with respect to its individual particularities. In the movie domain, for example, some movies are heavily influenced by their main actor(s) (e. g., in the case of “Terminator”) while others are genuine masterpieces by their directors (e. g., in the case of “Pulp Fiction”). It is the goal of entity summarization to distill such individual particularities and present them in a ranked fashion.
Diversity-Centered. Summaries focus more on presenting a diverse selection of predicates (i. e. the type of relation). Repetitive lists of the same type of relation (e. g., “starring Uma Thurman; starring John Travolta; starring...”) are avoided in this setting. Instead, diversification of the relations aims at providing a more complete overview of an entity.
Relevance-Oriented. Summaries are more focused on the values (i. e. the connected resources). The importance of the connected resource and the relevance for the target entity is prioritized. In this setting, a complete summary could involve only one type of relation, if the respective resources are deemed more important than others with different relations.
Both methods present summaries of entities in a top-k manner, i. e. the k most diverse or relevant facts.
In this paper we present LinkSUM, a new method for entity summarization that follows a relevance-oriented approach to produce generic summaries to be displayed in a SERP. LinkSUM goes beyond the state of the art by addressing the following observed limitations of previously developed methods: lack of general applicability (commercial approaches) and the inclusion of redundant information in a summary (commercial and research approaches).
To address these challenges, LinkSUM combines and optimizes techniques for resource selection with approaches for predicate selection in order to provide a generic method for entity summarization. Like other research and commercial approaches [1, 4, 7, 14, 19], LinkSUM is focused on global relevance measures and does not rely on personal or contextual factors like individual interests or temporal trends. Instead, it serves as a foundation which can be extended by such approaches. To study the performance of LinkSUM we compare it with FACES, a recent approach on entity summarization that has been shown to perform better than [4, 17].
We present LinkSUM, a lightweight link-based approach for relevance-oriented entity summarization. We investigate different configuration parameters and evaluate them with respect to their effectiveness.
In a quantitative and qualitative evaluation setting we show that prioritizing the selection of the related resources (rather than focusing on relation selection) and omitting redundancies within the set of related resources leads to better summaries.
The remainder of the paper is organized as follows: Sect. 2 introduces the key components of our approach. Section 3 presents first the experimental setup and afterwards the results of the configuration of the approach. In Sect. 4 we compare the approach to a diversity-centered summarization approach in a quantitative as well as qualitative evaluation setting. Section 5 discusses the obtained results and Sect. 6 provides an overview of related work. Section 7 presents our conclusions and Sect. 8 addresses open topics that will be part of our future work.
2 Proposed Approach
Resource Selection. The goal of this stage is to create a ranked list of resources that are semantically connected to the target entity. The output of this step is a set of triples, where the semantic relation is not fixed, e. g. \(Pulp Fiction - ?relation \rightarrow Quentin Tarantino\). One requirement for a resource to be included in the list of relevant entities is at least one existing semantic relation to the target entity.
Relation Selection. This stage deals with the selection of a semantic relation that connects the resource with the target entity. This step is necessary if more than one relation exists, e. g. \(Pulp Fiction - starring \rightarrow Quentin Tarantino\), and \(Pulp Fiction - director \rightarrow Quentin Tarantino\).
In the entity summarization setting the list of relevant resources is cut off at k after resource selection. We refer to such summaries as top-k summaries. In the following subsections we will explain each of the two parts. We will refer to the target entity as e (i. e. the entity that needs to be summarized).
2.1 Resource Selection
2.2 Relation Selection
In a knowledge graph, two resources can be linked through multiple semantic connections. We provide an example in Fig. 2b which demonstrates that the entities “Pulp Fiction” and “Quentin Tarantino” are connected in multiple ways. As a matter of fact, it is very common that multiple relations between entities exist. However, in many cases, one relation is more relevant than others. In our approach, the relation selection task identifies the most prominent connection for presentation in order to avoid redundancies among the connected resources in the top-k set.
Frequency (FRQ). Ranks the candidate relations according to how often a specific relation is used overall in the complete dataset. The relation that is used the most is selected as the most promising candidate.
Exclusivity (EXC). A relation might not be exclusive for either of the two entities it connects. For example, a movie commonly has more than one starring actor, while an actor usually stars in more than one movie (N:M). This measure considers the exclusivity of a relation with respect to e and \(r \in res(e)\), respectively. For both resources, e and r, we add up the number of times the relation is used with each (N+M). We use the inverse of this number, \(1/(N+M)\), as the exclusivity score (the more exclusive, the better). The upper bound of EXC is 0.5 (for a 1:1 relation).
Description (DSC). Relations are represented by RDF predicates. Those predicates are commonly described with domains, ranges, and labels in different languages. The sum \(|labels| + |ranges| + |domains|\) forms a basic method for estimating the quality of the description of the predicate. The relation with the highest quality is chosen.
For each related resource \(r \in top_k(res(e))\), combinations of the above-presented relation selection mechanisms identify the most relevant connection to e.
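The three measures and their products can be illustrated with a minimal sketch. The toy triples, the predicate metadata counts, and all function names below are hypothetical and only serve the illustration. In particular, reading N as the number of uses of the relation with e as subject and M as the number of uses with r as object is one interpretation of the EXC description that reproduces the stated upper bound of 0.5 for a 1:1 relation.

```python
# Toy dataset of (subject, predicate, object) triples; purely illustrative.
TRIPLES = [
    ("Pulp_Fiction", "starring", "Quentin_Tarantino"),
    ("Pulp_Fiction", "starring", "Uma_Thurman"),
    ("Pulp_Fiction", "starring", "John_Travolta"),
    ("Pulp_Fiction", "starring", "Samuel_L_Jackson"),
    ("Pulp_Fiction", "director", "Quentin_Tarantino"),
    ("Pulp_Fiction", "cinematography", "Andrzej_Sekula"),
    ("Kill_Bill", "starring", "Uma_Thurman"),
    ("Kill_Bill", "director", "Quentin_Tarantino"),
]

# Hypothetical counts of labels, ranges, and domains per predicate.
PREDICATE_INFO = {
    "starring": {"labels": 3, "ranges": 1, "domains": 1},
    "director": {"labels": 6, "ranges": 1, "domains": 1},
    "cinematography": {"labels": 2, "ranges": 1, "domains": 1},
}


def frq(predicate):
    """FRQ: number of times the predicate is used in the whole dataset."""
    return sum(1 for _, p, _ in TRIPLES if p == predicate)


def exc(entity, predicate, resource):
    """EXC: 1 / (N + M), with N = uses of the predicate with the entity as
    subject and M = uses of the predicate with the resource as object
    (an assumed reading of the textual description)."""
    n = sum(1 for s, p, _ in TRIPLES if s == entity and p == predicate)
    m = sum(1 for _, p, o in TRIPLES if p == predicate and o == resource)
    return 1.0 / (n + m)


def dsc(predicate):
    """DSC: |labels| + |ranges| + |domains| of the predicate."""
    info = PREDICATE_INFO[predicate]
    return info["labels"] + info["ranges"] + info["domains"]


def select_relation(entity, resource, use_exc=False, use_dsc=False):
    """Pick the relation between entity and resource with the highest
    FRQ score, optionally multiplied by EXC and/or DSC.
    Assumes at least one relation connects the two."""
    candidates = sorted(
        {p for s, p, o in TRIPLES if s == entity and o == resource}
    )

    def score(predicate):
        value = frq(predicate)
        if use_exc:
            value *= exc(entity, predicate, resource)
        if use_dsc:
            value *= dsc(predicate)
        return value

    return max(candidates, key=score)


print(select_relation("Pulp_Fiction", "Quentin_Tarantino"))
print(select_relation("Pulp_Fiction", "Quentin_Tarantino",
                      use_exc=True, use_dsc=True))
```

On this toy data the frequency-only setup and the full product select different relations, which shows how EXC and DSC can counteract the raw popularity of a predicate. The outcome depends entirely on the invented counts and says nothing about real DBpedia data.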
\(\alpha \)-value. We tested different settings for \(\alpha \) in the range of 0.5 to 1 with 0.1 steps.
Relation Selection. We tested different relation selection mechanisms. We considered only combinations based on frequency, as it has been proven as a robust popularity measure in . The following setups were considered as promising candidates:
FRQ – relations are selected by their frequency in the dataset.
FRQ*EXC – relations are chosen by the product of frequency and exclusivity.
FRQ*DSC – relations are selected by the product of frequency and description.
FRQ*EXC*DSC – relations are chosen by the product of frequency, exclusivity, and description.
As a reference dataset, we use the same one as the FACES approach . The dataset provided in  would also serve as a reference for evaluation. Unfortunately, we could not obtain summaries of the FACES system for the entities covered by .
The dataset provided by FACES involves DBpedia (version 3.9) and features outgoing connections only . Without loss of generality, we also configured LinkSUM to consider outgoing connections only. We also apply LinkSUM on DBpedia version 3.9. We computed the PageRank  scores for each DBpedia entity. As a basis for this, we used DBpedia’s Wikipedia Pagelinks dataset in the English language. This dataset contains triples of the form “Wikipedia page A links to Wikipedia page B”. We only use these Web links, i. e. we do not make use of semantic links (e. g., dbpedia-owl:birthPlace) for the computation of PageRank. The computed PageRank scores are made available online  in Turtle RDF format using the vRank vocabulary . For the Backlink method, we also use the Wikipedia Pagelinks dataset.
3.1 Configuration Setup
Our experimental setup involves a reference dataset as well as measures for computing the agreement and similarity. We use a similar evaluation setup as the FACES approach  as we directly compare LinkSUM with the FACES system (see Sect. 4).
Reference Dataset. The dataset includes 50 DBpedia (version 3.9) entities. The dataset contains at least seven top-5 and seven top-10 reference summaries per entity that were created by 15 experts of the Semantic Web field . For each entity, these references describe outgoing connections to other resources. The average number of these relations is 44. In addition, several relations, such as dcterms:subject and Wikipedia related links, were filtered out for creating the reference dataset as they do not contain sufficient semantic information .
Subject–Object (SO): This measure treats a summary as a set of tuples containing only subjects and objects while ignoring the predicate. The full URIs of the subject and the object are used respectively. Consequently, the relation selection method has no impact on this measure.
Subject–Predicate–Object (SPO): This measure treats summaries as sets of triples. To represent a triple, we use the full URIs of the subject, predicate, and object. This measure therefore also reflects the performance of the relation selection approach.
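The difference between the two measures can be sketched in a few lines. The sketch assumes that Quality is the mean size of the overlap between a system summary and each reference summary. The exact formula is not spelled out in this section, so this is an assumption that merely agrees with the reported scores being bounded by k. The function names and the toy summaries are hypothetical.

```python
def to_so(summary):
    """Project (subject, predicate, object) triples to (subject, object) pairs."""
    return {(s, o) for s, _, o in summary}


def quality(system_summary, reference_summaries, measure="SPO"):
    """Mean overlap between a system summary and each reference summary
    (assumed definition). 'SO' ignores predicates, 'SPO' compares triples."""
    project = to_so if measure == "SO" else set
    system = project(system_summary)
    overlaps = [len(system & project(ref)) for ref in reference_summaries]
    return sum(overlaps) / len(overlaps)


# Hypothetical top-2 summaries for one entity.
system = [
    ("Pulp_Fiction", "director", "Quentin_Tarantino"),
    ("Pulp_Fiction", "starring", "Uma_Thurman"),
]
references = [
    [("Pulp_Fiction", "starring", "Quentin_Tarantino"),
     ("Pulp_Fiction", "starring", "Uma_Thurman")],
    [("Pulp_Fiction", "director", "Quentin_Tarantino"),
     ("Pulp_Fiction", "starring", "John_Travolta")],
]

print(quality(system, references, measure="SO"))
print(quality(system, references, measure="SPO"))
```

Because every matching triple is also a matching subject-object pair, the SPO value can never exceed the SO value for the same summaries.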
3.2 Configuration Results
Agreement among the experts.
In the SO setting, the best achieved scores of LinkSUM are 1.89 (top-5, \(\alpha = 0.8\)) and 4.82 (top-10, \(\alpha = 0.9\)) respectively. The results of the SPO settings are shown in Fig. 3. The FRQ measure provides a good baseline for both top-5 and top-10. While the combination of FRQ with DSC improves the Quality in both settings, the combination with EXC dampens the impact of FRQ. In the top-10 setting, the combination of the three measures (FRQ*EXC*DSC) provides the best values. In the top-5 setting, FRQ*DSC and FRQ*EXC*DSC provide equally good results. In general, the values for \(\alpha \) are best at 0.8 for top-5 and 0.9 for top-10. The impact of the Backlink method on the rankings (\(\alpha < 1.0\)) in comparison to PageRank-only (\(\alpha = 1.0\)) is evident. In addition, it is noticeable that strictly prioritizing all results of the Backlink method (ranked according to their respective PageRank scores) does not yield the best results either (\(\alpha = 0.5\)). A blend of importance and strong connectivity produces the best outcomes.
config-1 (top-5): \(\alpha = 0.8\), FRQ*EXC*DSC
config-2 (top-10): \(\alpha = 0.9\), FRQ*EXC*DSC
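The role of \(\alpha \) in the configurations above can be illustrated with a small sketch. It assumes a linear blend of the normalized PageRank score of a resource with a binary Backlink indicator. This form is an assumption chosen to match the behaviour described above (\(\alpha = 1.0\) is PageRank-only, \(\alpha = 0.5\) places all Backlink results first) and is not taken from the text of this section. The function name, resource names, and scores are hypothetical.

```python
def rank_resources(pagerank, backlinks, alpha, k):
    """Rank the resources connected to an entity e.

    pagerank:  dict mapping each resource in res(e) to its PageRank score
    backlinks: set of resources that also link back to e
    alpha:     weight of PageRank; (1 - alpha) is the weight of Backlink
    k:         number of resources to keep
    """
    max_pr = max(pagerank.values())  # normalize scores to the range [0, 1]

    def score(resource):
        bl = 1.0 if resource in backlinks else 0.0
        return alpha * pagerank[resource] / max_pr + (1 - alpha) * bl

    return sorted(pagerank, key=score, reverse=True)[:k]


# Hypothetical scores: A is globally important but does not link back to e.
pagerank = {"A": 10.0, "B": 8.0, "C": 1.0}
backlinks = {"B", "C"}

for alpha in (1.0, 0.8, 0.5):
    print(alpha, rank_resources(pagerank, backlinks, alpha, k=3))
```

With these invented scores the three settings produce three different orderings: PageRank-only ranks by global importance, \(\alpha = 0.5\) puts both backlinked resources ahead of A, and \(\alpha = 0.8\) lets the strongly connected B overtake A while C stays last.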
In our evaluation setting, we compare LinkSUM with the FACES entity summarization system . FACES focuses on the diversification of the relation types (i. e. no semantically similar predicates should occur in the result summary). The system has two stages: partitioning the feature set and ranking the features. The main idea is to partition the semantic links of an entity into semantically diverse clusters of predicates. For resource selection, the approach uses a tf-idf-related popularity measure for the object. In contrast, our approach first identifies the most relevant object and then selects the predicate. In their evaluation, the authors demonstrate that their system provides better results than [4, 17]. For 50 DBpedia entities, the authors published the results of FACES for top-5 and top-10 summaries (along with the reference dataset described in Sect. 3.1). The DBpedia version used is 3.9.
We compare LinkSUM and FACES in two evaluation settings, a quantitative and a qualitative one. In the following we will first describe the experimental setup and the obtained results afterwards.
4.1 Evaluation Setup
Quantitative Analysis. For evaluating the two methods quantitatively, we chose the same setup as described in Sect. 3.1, i. e. the same reference dataset and the same evaluation measures that were used for the evaluation of the FACES system . For comparison, we use the average Quality of each method. In addition, in order to prevent influence of strong outliers, we use the Quality value of each of the 50 entities per system for computing significance. As a significance test, we use the Wilcoxon Signed-Rank Test with two tails as recommended in . We compare the best configurations of LinkSUM for top-5 and top-10 respectively (see Sect. 3.2) with the published results of FACES.
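The significance test on the paired per-entity Quality values can be sketched as follows. This is a minimal pure-Python version of the two-tailed Wilcoxon signed-rank test using the normal approximation without tie or continuity correction. It is not the implementation used in the evaluation, and the sample values are made up. In practice a library routine such as SciPy's `scipy.stats.wilcoxon` would normally be preferred.

```python
import math


def wilcoxon_signed_rank(xs, ys):
    """Two-tailed Wilcoxon signed-rank test for paired samples.

    Returns (w_plus, p_value), where w_plus is the sum of the ranks of the
    positive differences. Zero differences are dropped and tied absolute
    differences receive average ranks. The p-value uses the normal
    approximation, which is crude for very small samples.
    """
    diffs = [x - y for x, y in zip(xs, ys) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg_rank
        i = j + 1

    w_plus = sum(rank for rank, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-tailed
    return w_plus, p_value


# Hypothetical per-entity scores of two systems (not the paper's data).
system_a = [2, 4, 6, 8, 10, 12]
system_b = [1, 2, 3, 4, 5, 6]
w_plus, p_value = wilcoxon_signed_rank(system_a, system_b)
print(w_plus, p_value)
```

Testing per-entity values rather than only comparing the averages is what keeps a few strong outliers from dominating the comparison, as noted above.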
“I know a lot about this entity” – [Strongly agree; Agree; Neither agree, nor disagree; Disagree; Strongly disagree]
“I am sure that I prefer the chosen summary over the other” – [Very confident; Confident; Neutral; Not very confident; Not at all confident]
In addition, we provided an optional field where participants could comment on their choice. We included the following introductory text in order to instruct the users on how to proceed with the evaluation:
“You have been searching on a Web search engine for an entity. The search engine result page (SERP) is displayed with a picture of the entity, a short textual description, and a box with facts about the entity. For the following ten entities, it is your task to decide which fact box you would like to see in a SERP.”
In addition, we asked the participants to assume that all displayed data is correct. This was to avoid influence of data quality on the results. Finally, for statistical classification, we requested the participants to provide the following information: gender, age, whether or not the participants had a background in computer science, and the time taken for evaluation.
4.2 Evaluation Results
In the following, we present the outcomes of both evaluation settings.
Overall Quality results of the quantitative evaluation and their respective standard deviation (SD). Best results per column are marked with *. \(\dagger \) compared to the best, difference is significant (\(p < 0.05\)); \(\ddagger \) compared with each of the other two settings, difference is significant (\(p<0.05\)).
System | top-5 SO | top-5 SPO | top-10 SO | top-10 SPO
LinkSUM config-1 | 1.89* (SD 0.55) | 1.20* (SD 0.57) | 4.78 (SD 1.05) | 3.15 (SD 0.89)
LinkSUM config-2 | 1.84 (SD 0.60) | 1.20* (SD 0.60) | 4.82* (SD 1.06) | 3.20* (SD 0.87)
FACES | \(1.66^\ddagger \) (SD 0.57) | \(0.93^\ddagger \) (SD 0.54) | \(4.33^\ddagger \) (SD 1.01) | \(2.92^\dagger \) (SD 0.94)
Qualitative Analysis. Of the invited people, a total of 20 participated in the qualitative analysis. 75 % of the participants were between 25 and 35, and 25 % were between 35 and 45 years old. 75 % were male and 25 % were female. 95 % of the participants had a computer science background. The average time taken for the evaluation was 11 min and 27 s. In total, 13 participants used the option to comment on their choice. With respect to these characteristics, we did not find any significant difference within the distribution of the votes. The distribution of the votes is visualized in Fig. 5. 73 % of all votes were given to LinkSUM and 27 % to FACES. Out of ten entities, the LinkSUM system was clearly chosen with more than 15 votes in the case of five entities. For another two entities, the LinkSUM system was chosen with votes in the interval 13 to 14. The votes for the remaining three entities were distributed in the interval of 9 to 11 for both systems. Both systems each received a total of ten low-confidence votes (“Not very confident” or “Not at all confident”). This means that 10 out of 146 votes (about 7 %) in the case of LinkSUM, and 10 out of 54 votes (about 19 %) in the case of FACES, were low-confidence votes. With respect to the total number of votes for each system, this is a disproportionately low number of low-confidence votes for LinkSUM. The amount of knowledge of the participants did not influence the preference for either system: the values for high or low knowledge were both in line with the total distribution of the votes.
Selection: the presented resources are relevant for the entity (e. g., “I like to see Turing machine mentioned for Alan Turing”).
Rejection: redundancy (e. g., “The same thing twice once with prize and once with award”); the presented resources do not characterize the entity (e. g., “I do not care about technical aspects such as format”).
Selecting the most relevant facts that characterize an entity is, in many cases, a subjective task. Thus, producing a generic summary that is not tailored to any specific background or context the user might have is challenging, as it involves the identification of facts that are deemed important by the majority of the users. In order to address this challenge, the LinkSUM method combines and optimizes methods that select relevant facts about entities and at the same time reduce the amount of redundant information. In our experiments and evaluation we assessed and analyzed the effectiveness of the mentioned aspects of the LinkSUM method. In a quantitative as well as qualitative setting we compared LinkSUM to the FACES system. In both setups, we demonstrated that LinkSUM exhibits significantly better results than FACES. The comments of the participants of our qualitative experiment suggest that the related resources should be relevant and at the same time characterize the entity. We cover this by the combination of PageRank with Backlink. Our experiments with the SO-measure demonstrate that the produced Quality values are close to the agreement of the expert summaries (cf. Table 1).
We have tested four different methods for relation selection. The combination of the frequency of the relation, its exclusivity, and its description has been shown to perform best in the top-10 setting, while in the top-5 setting the exclusivity score contributed neither positively nor negatively. The introduced measures should be considered as baselines for future evaluations of the relation selection step.
With regard to the qualitative evaluation, in the cases of the entities “Manchester City F.C.”, “Albert Einstein”, and “Lexus” we could not find any clear majority for either of the two systems. In the case of “Lexus” the set of presented facts has a very high overlap between the systems (with different ordering). In the case of “Manchester City F.C.” and “Albert Einstein” the choices are subjective, as the provided comments suggest: some users liked the listing of players (“Manchester City F.C.”) or children (“Albert Einstein”) while others stated they did not. Contrary to the claims in , we could not find evidence that repetitive relations have a negative impact on the quality of the summaries. For example, the entity “The Cosby Show” contains a listing of various actors with the “starring”-relation in the LinkSUM summary while in the output of FACES this information is missing (see Fig. 4). This led to 17 vs. three votes for the LinkSUM method. In this case many of the participants gave the “inclusion of the actors” in the LinkSUM method as the main reason for their choice. The FACES system does not filter redundancies on the object level: the set of relations can be diverse while, on the object side, a connected resource occurs multiple times (linked through different relations). An example is the entity “Total Recall (1990 film)” where FACES included the following information: director Jerry Goldsmith; Artist Jerry Goldsmith; music Jerry Goldsmith; music composer Jerry Goldsmith; screenplay David Cronenberg; writer David Cronenberg. Those and similar repetitions in the summaries of other entities were described as “redundant” by a high number of participants (in total ten out of 13 participants with comments mentioned redundancy as a problem).
6 Related Work
To the best of our knowledge, Hogan et al. first mentioned the concept of “summaries of the relevant entities” in .
The authors of  introduce RELIN, a summarization system that supports quick identification of entities. The approach applies a “goal directed surfer” which is an adapted version of the random surfer model that is also used in the PageRank algorithm. The main idea of the contribution is to combine textual notions of informativeness and relatedness for the ranking of features. As a major effect, the concise presentation of retrieved entities for quick identification by users after search is one of the scenarios that RELIN supports. In , the system is shown to perform significantly worse than FACES.
Google “Knowledge Graph”  is an example of an entity search system. The main idea is to enrich search results with summarized information about named entities. While the details of the approach are not public, Amit Singhal, the author of , outlines that for summarization, the system goes back to user data in order to “... study in aggregate what they’ve been asking Google about each item”. This indicates that Google uses additional data sources for the summaries, i. e. the queries of the users. In addition, this also provides reason to assume that the analysis focuses on informal and partial statements of the subject + predicate or subject + object kind. Our approach is similar to this methodology and follows the pattern of identifying important objects first and then selecting a predicate.
TripleRank by Franz et al. introduces a tensor-based approach for ranking RDF triples . The approach uses the PARAFAC tensor decomposition method for deriving authority and hub scores as well as information about the importance of the link type. In contrast, in our contribution we separate the steps of deriving importance of the resource and the importance of the link as we put additional focus on the context that the target entity brings (while TripleRank addresses a more general ranking of triples). However, our general PageRank importance scores can be easily augmented or replaced by the scores produced by the TripleRank method.
The authors of  discuss the notion of diversity for graphical entity summarization. Two algorithms are introduced, of which one is diversity-oblivious (called PRECIS) and the other is diversity-aware (called DIVERSUM). The evaluation of the algorithms is shaped towards the movie domain and involved expert-based assessments as well as crowd-sourced experiments. The results suggest that the DIVERSUM algorithm was favored over the PRECIS algorithm. A drawback of the method is that both algorithms treat the predicate-value pairs on a per-predicate basis without measures on the object.
Also with respect to diversity, Schäfer et al. detect anomalies about entities according to their different types . In its current state, the system also needs the specific type as an input. However, if the main type of an entity is detected reliably, the method can be regarded as an entity summarization system that points out hidden or interesting facts.
Blanco et al. introduce Yahoo!’s Spark system , an entity recommendation system that suggests related entities based on a learning approach employing gradient boosted decision trees. The utilized features range from co-occurrence information over popularity features (such as the click frequency) to graph-theoretic features (such as PageRank). The system focuses on related entity recommendation in the domains of locations, movies, people, sports, and TV shows. The types of entities as well as the type of their relation play an important role in the recommendation process. Connecting predicates are not considered by Spark. The system is currently applied in the Yahoo! search system.
In another contribution , Thalhammer et al. exemplify a summarization approach for movie entities that utilizes rating data from the MovieLens dataset. For this, an item neighborhood is established through an item-based collaborative filtering approach. The approach is based on the idea that the semantic background that connects a movie with its neighbors can be found and extracted by making use of structured data. Similar to , the authors treat the object and the predicate as predicate-value compounds. The method introduces a tf-idf-based weighting scheme in order to penalize features that occur commonly in the whole dataset.
Waitelonis and Sack explain in their paper  how different heuristics can be used for discovering related entities in order to support exploratory search. The tested Backlink heuristic achieves the best results in terms of F-measure. In our contribution, we adopted this method and adapted it in order to fit the scenario of entity summarization. Like all tested heuristics of , Backlink provides an unranked set of related entities that is not directly useable in top-k settings. As a consequence, for our resource selection approach, we combine Backlink with PageRank .
In this work, we extended the state of the art in the field of relevance-oriented entity summarization systems [4, 19] and fact ranking in general . Our contribution provides a clear cut between relevance-oriented and diversity-centered systems. We demonstrate that relevance-oriented systems provide a better foundation for displaying summaries in search engine result pages.
We presented LinkSUM, a generic relevance-centric method for entity summarization. LinkSUM works with a lightweight two-stage approach in order to produce summaries for entities. In the first step, the method identifies relevant connected resources. In the second step, the system selects the most promising semantic relation for each of the connected resources. We also investigated the most effective configuration parameters for LinkSUM.
For SERP scenarios, summarization systems should primarily focus on the relevance and the strength of the connection to the related resources. As a second factor the selection of an appropriate semantic relation is of importance.
Redundancies in the set of related resources should be avoided (e. g., see Fig. 1). Commonly, if two entities are related, there is one relation that is more relevant to be mentioned. Summaries should focus on this relation and then present relations to other interesting resources.
Semantic MediaWiki. Semantic MediaWiki (SMW)  is a popular extension of the MediaWiki software (used by Wikipedia). In SMW, (hyper-) textual information about entities is combined with structured information about them. Using the hyperlinks of the MediaWiki articles in combination with the semantic links of the SMW, LinkSUM can be used to provide structured summaries of entities in SMW.
Microdata/RDFa. The number of Web pages that include semantic information about entities is on the rise . On many sites that focus on specific entities, hyperlinks and semantic links occur side by side. A prominent example of such co-occurrence is IMDb. Applied in a Web data setting, LinkSUM can use plain hyperlinks in combination with the hidden semantic information for providing structured summaries of entities that occur on the Web.
LinkSUM is applicable to both of the above-mentioned scenarios and it remains a technical task to implement prototypes. With respect to research, the DBpedia/Wikipedia setting is the most suitable scenario for evaluation as other researchers can also use the same datasets for providing their own summaries and compare them to LinkSUM (that is available online).
Note that the field of entity summarization is not limited to SERPs. As the availability of structured data is growing, applications for different domains and purposes emerge. Examples include business intelligence, e-learning, health information systems, news pages, data sheets, recipes etc. In fact, this includes all domain-specific cases where it is necessary for users to efficiently comprehend large information resources. In addition, entity summarization systems may adapt to user-context factors such as geo-location, cultural background, or time. As entities are retrieved without a specific information demand (as is the case in question answering), the full personalization/contextualization of entity summaries remains an open challenge.
The above and further challenges need to be addressed in the emerging field of entity summarization. The LinkSUM method can serve as a generic foundation for such domain and/or user-centric scenarios.
8 Future Work
While in this paper we have presented the evaluation of LinkSUM for the case of generic search in the Web, the performance of the LinkSUM method is planned to be evaluated in specific domain settings (e. g., health information).
LinkSUM can be combined with a learning-to-rank approach with respect to the \(\alpha \)-value and different linear combinations of the predicate selection methods.
In future versions of LinkSUM, we plan to include literal values - such as strings or dates - as descriptors of the entities. The blending of entity-literal and entity-entity relations into a single summary will receive specific attention.
The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 611346 and by the German Federal Ministry of Education and Research (BMBF) within the Software Campus project “SumOn” (grant no. 01IS12051).
- 2. Bobić, T., Waitelonis, J., Sack, H.: FRanCo - a ground truth corpus for fact ranking evaluation. In: Joint Proceedings of SumPre and HSWI 2015, Co-located with the 12th Extended Semantic Web Conference, vol. 1556. CEUR-WS (2016)
- 3. Brin, S., Page, L.: The anatomy of a large-scale hypertextual web search engine. In: Proceedings of the 7th International Conference on World Wide Web 7. Elsevier (1998)
- 4. Cheng, G., Tran, T., Qu, Y.: RELIN: relatedness and informativeness-based centrality for entity summarization. In: Aroyo, L., Welty, C., Alani, H., Taylor, J., Bernstein, A., Kagal, L., Noy, N., Blomqvist, E. (eds.) ISWC 2011, Part I. LNCS, vol. 7031, pp. 114–129. Springer, Heidelberg (2011)
- 6. Franz, T., Schultz, A., Sizov, S., Staab, S.: TripleRank: ranking semantic web data by tensor decomposition. In: Bernstein, A., Karger, D.R., Heath, T., Feigenbaum, L., Maynard, D., Motta, E., Thirunarayan, K. (eds.) ISWC 2009. LNCS, vol. 5823, pp. 213–228. Springer, Heidelberg (2009)
- 7. Gunaratna, K., Thirunarayan, K., Sheth, A.P.: FACES: diversity-aware entity summarization using incremental hierarchical conceptual clustering. In: Proceedings of the 29th AAAI Conference Artificial Intelligence, 2015, Austin, Texas, USA (2015)
- 8. Hogan, A., Harth, A., Umbrich, J., Decker, S.: Towards a scalable search and query engine for the web. In: Proceedings of the 16th International Conference on World Wide Web, WWW 2007, pp. 1301–1302. ACM, New York, NY, USA (2007)
- 10. Meusel, R., Petrovski, P., Bizer, C.: The WebDataCommons microdata, RDFa and microformat dataset series. In: Mika, P., et al. (eds.) ISWC 2014, Part I. LNCS, vol. 8796, pp. 277–292. Springer, Heidelberg (2014)
- 11. Roa-Valverde, A., Thalhammer, A., Toma, I., Sicilia, M.-A.: Towards a formal model for sharing and reusing ranking computations. In: Proceedings of the 6th International Workshop on Ranking in Databases in conjunction with VLDB 2012 (2012)
- 12. Schäfer, B., Ristoski, P., Paulheim, H.: What is special about Bethlehem, Pennsylvania? identifying unusual facts about DBpedia entities. In: Proceedings of the ISWC 2015 Posters and Demonstrations Track (2015)
- 13. Singhal, A.: Introducing the knowledge graph: things, not strings (2012). http://goo.gl/kH1NKq
- 15. Thalhammer, A.: DBpedia PageRank dataset (2016). http://people.aifb.kit.edu/ath#DBpedia_PageRank
- 17. Thalhammer, A., Rettinger, A.: Browsing DBpedia entities with summaries. In: Presutti, V., Blomqvist, E., Troncy, R., Sack, H., Papadakis, I., Tordai, A. (eds.) ESWC Satellite Events 2014. LNCS, vol. 8798, pp. 511–515. Springer, Heidelberg (2014)
- 19. Thalhammer, A., Toma, I., Roa-Valverde, A.J., Fensel, D.: Leveraging usage data for linked data movie entity summarization. In: Proceedings of the 2nd International Ws. on Usage Analysis and the Web of Data (USEWOD2012) (2012)