Context-aware similarity assessment within semantic space formed in linked data

  • Original Research
Journal of Ambient Intelligence and Humanized Computing

Abstract

The Web is a constantly growing repository of information. The amount of data that becomes available exceeds our ability to search and examine it in a reasonable time and with practical effort. The data is stored as documents, texts and web pages, which are not suitable for comprehensive analysis and search. To make the data stored on the Internet more accessible, a new model of data representation has been introduced: linked data. Linked data provides an open platform for representing and storing structured data as well as metadata. In this paper, we propose a novel approach for calculating the degree of similarity between two entities in the web of linked data. The idea rests on the observation that entities are embedded in linked data and that their semantics is defined by their connections to other entities. Therefore, the similarity between two entities is determined by comparing their connections to other entities. First, the approach is introduced to determine semantic similarity in a context-free manner: this method does not select specific types of connections but takes all of them into consideration. Second, a context-aware approach is presented as a modification of the original method. In this case, a context is defined by a set of connection types, and only connections of those types are considered when determining similarity. The proposed approach uses concepts of possibility theory to determine the lower and upper bounds of similarity intervals. We evaluate the proposed similarity assessment process by applying it to real-world datasets and compare it with other related methods.
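To make the distinction between context-free and context-aware comparison concrete, the following minimal Python sketch may help. It is not the authors' method: the abstract does not give the similarity formula or the possibility-theory bounds, so a plain Jaccard overlap is swapped in as a stand-in measure, and no interval is computed. The names `select_context`, `similarity`, `film_a` and `film_b`, as well as the toy connections, are hypothetical and chosen only for illustration.

```python
from typing import Iterable, Optional, Set, Tuple

# A connection is a (connection type, target entity) pair, i.e. the
# predicate and object of a triple whose subject is the entity itself.
Connection = Tuple[str, str]


def select_context(connections: Set[Connection],
                   context: Optional[Iterable[str]] = None) -> Set[Connection]:
    """Return only the connections whose type is in the context.

    A context of None means context-free: every connection is kept.
    """
    if context is None:
        return set(connections)
    allowed = set(context)
    return {(ctype, target) for (ctype, target) in connections if ctype in allowed}


def similarity(a: Set[Connection],
               b: Set[Connection],
               context: Optional[Iterable[str]] = None) -> float:
    """Jaccard overlap of two entities' connections (an assumed stand-in
    measure, not the formula proposed in the paper)."""
    conn_a = select_context(a, context)
    conn_b = select_context(b, context)
    union = conn_a | conn_b
    if not union:
        return 0.0  # no connections to compare
    return len(conn_a & conn_b) / len(union)


# Hypothetical toy entities; these are not real linked-data facts.
film_a = {
    ("director", "person_1"),
    ("starring", "person_2"),
    ("starring", "person_3"),
    ("genre", "sci_fi"),
}
film_b = {
    ("director", "person_1"),
    ("starring", "person_2"),
    ("starring", "person_4"),
    ("genre", "sci_fi"),
}

if __name__ == "__main__":
    print(similarity(film_a, film_b))                         # context-free
    print(similarity(film_a, film_b, {"starring"}))           # cast only
    print(similarity(film_a, film_b, {"director", "genre"}))  # director and genre
```

In this toy data the two entities share three of five distinct connections, so the context-free value is 0.6, while restricting the context to director and genre gives 1.0 because those connections are identical. The same pair of entities can therefore look more or less similar depending on which connection types the context selects.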

Author information

Corresponding author

Correspondence to Parisa D. Hossein Zadeh.

About this article

Cite this article

D. Hossein Zadeh, P., Reformat, M.Z. Context-aware similarity assessment within semantic space formed in linked data. J Ambient Intell Human Comput 4, 515–532 (2013). https://doi.org/10.1007/s12652-012-0154-7

  • DOI: https://doi.org/10.1007/s12652-012-0154-7
