Journal on Data Semantics, Volume 5, Issue 4, pp 229–248

Discovering Similarity and Dissimilarity Relations for Knowledge Propagation in Web Ontologies

  • Pasquale Minervini
  • Claudia d’Amato
  • Nicola Fanizzi
  • Volker Tresp
Original Article

Abstract

We focus on the problem of predicting missing class memberships and property assertions in Web ontologies. We start from the assumption that related entities influence each other and may be either similar or dissimilar with respect to a given set of properties: the former case is referred to as homophily, the latter as heterophily. We present an efficient method for predicting missing class and property assertions for a set of individuals within an ontology by (1) identifying relations that are likely to encode influence between individuals (learning phase) and (2) leveraging such relations to propagate property information across related entities (inference phase). We show that the complexity of both inference and learning is nearly linear in the number of edges in the influence graph, and we provide an empirical evaluation of the proposed method.
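
To make the inference phase concrete, the following is a minimal sketch of label propagation over a signed influence graph, where positive edge weights stand for similarity (homophily) and negative weights for dissimilarity (heterophily). It is an illustration under stated assumptions, not the formulation used in the article: the function name `propagate`, the damping term `eps`, the fixed number of sweeps and the toy graph are all hypothetical, and the edge weights are given by hand rather than learned.

```python
def propagate(edges, labels, n_nodes, eps=1e-3, n_iters=200):
    """Propagate +1/-1 labels over a signed influence graph.

    edges:  list of (i, j, w); w > 0 marks similarity, w < 0 dissimilarity.
    labels: dict mapping a node to +1.0 or -1.0 where membership is known.
    All names and defaults here are illustrative assumptions.
    """
    # Build an undirected adjacency list from the edge list.
    neighbours = [[] for _ in range(n_nodes)]
    for i, j, w in edges:
        neighbours[i].append((j, w))
        neighbours[j].append((i, w))

    # Known labels are clamped; unknown nodes start at 0 (undecided).
    scores = [labels.get(i, 0.0) for i in range(n_nodes)]

    for _ in range(n_iters):
        for i in range(n_nodes):
            if i in labels or not neighbours[i]:
                continue
            # A similar neighbour pulls the score toward its own value,
            # a dissimilar neighbour pulls it toward the opposite sign.
            total = sum(w * scores[j] for j, w in neighbours[i])
            norm = sum(abs(w) for _, w in neighbours[i]) + eps
            scores[i] = total / norm
    return scores


# Toy graph: 0 ~ 1 ~ 2 are similar, 2 and 3 are dissimilar.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, -1.0)]
labels = {0: 1.0}  # only node 0 has a known class membership
scores = propagate(edges, labels, n_nodes=4)
```

In this sketch each sweep visits every edge twice, so the cost of one sweep grows linearly with the number of edges. Nodes 1 and 2 end up with positive scores and node 3 with a negative one, reflecting the dissimilarity edge.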

Copyright information

© Springer-Verlag Berlin Heidelberg 2016

Authors and Affiliations

  • Pasquale Minervini (1)
  • Claudia d’Amato (1)
  • Nicola Fanizzi (1)
  • Volker Tresp (2)
  1. Department of Computer Science, University of Bari, Bari, Italy
  2. Siemens AG, Corporate Technology, Munich, Germany
