Dynamic Discovery of Type Classes and Relations in Semantic Web Data

Ayvaz, Serkan; Aydar, Mehmet

doi:10.1007/s13740-019-00102-6

Original Article
Published: 20 February 2019

Volume 8, pages 57–75, (2019)

Journal on Data Semantics

Abstract

With the rapid growth of Resource Description Framework (RDF) data on the Semantic Web, processing large semantic graphs has become more challenging. Constructing a summary graph from the raw RDF can help obtain semantic type relations and reduce the computational complexity of graph processing. In this paper, we addressed the problem of summarizing RDF graphs and proposed an approach for building summary graph structures automatically from RDF graph data based on instance similarities. To scale our approach, we used a locality-sensitive hashing technique to identify instance pairs that are candidates for the same type class. Moreover, we introduced a measure to help discover optimum class dissimilarity thresholds and an effective method to discover the type classes automatically. In future work, we plan to investigate further options for improving the scalability of the proposed method.
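
The candidate-selection step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each instance is described by a non-empty set of predicate names, and it uses a generic MinHash signature with banding as the locality-sensitive hashing scheme. The function names (`minhash_signature`, `candidate_pairs`) and the parameters (`num_hashes`, `bands`) are hypothetical and chosen only for illustration.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def _hash(seed: int, token: str) -> int:
    """Deterministic 64-bit hash of a token under a given seed."""
    data = f"{seed}:{token}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(features, num_hashes=32):
    """MinHash signature: per seed, the minimum hash over the feature set.

    Assumes `features` is a non-empty collection of strings.
    """
    return [min(_hash(seed, f) for f in features) for seed in range(num_hashes)]


def candidate_pairs(instances, num_hashes=32, bands=16):
    """Return pairs of instance names whose signatures agree on at least one band.

    `instances` maps an instance name to its feature set (an assumption of
    this sketch: the features are the predicates used by the instance).
    """
    rows = num_hashes // bands  # signature rows per band
    buckets = defaultdict(set)
    for name, features in instances.items():
        sig = minhash_signature(features, num_hashes)
        for b in range(bands):
            # Instances with an identical band slice land in the same bucket.
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(name)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(members), 2))
    return pairs
```

Calling `candidate_pairs` on a dictionary of instances returns only the pairs that share a bucket, so a more expensive similarity measure needs to be evaluated on those pairs rather than on all pairs. More bands with fewer rows per band make the filter more permissive; fewer bands with more rows make it stricter.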

Acknowledgements

The authors would like to thank Prof. Austin Melton for his invaluable help and guidance during the study, and Dr. Ruoming Jin and Dr. Viktor Lee for sharing the RoleSim similarity measure.

Author information

Authors and Affiliations

Department of Software Engineering, Bahcesehir University, 34353, Besiktas, Istanbul, Turkey
Serkan Ayvaz
Department of Technology Introduction, Huawei Turkey Research and Development Center, 34768, Umraniye, Istanbul, Turkey
Mehmet Aydar

Corresponding author

Correspondence to Serkan Ayvaz.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ayvaz, S., Aydar, M. Dynamic Discovery of Type Classes and Relations in Semantic Web Data. J Data Semant 8, 57–75 (2019). https://doi.org/10.1007/s13740-019-00102-6

Received: 22 January 2018
Revised: 30 November 2018
Accepted: 13 February 2019
Published: 20 February 2019
Issue Date: 08 March 2019
DOI: https://doi.org/10.1007/s13740-019-00102-6
