Progress in Artificial Intelligence, Volume 7, Issue 4, pp 251–272

SocialLink: exploiting graph embeddings to link DBpedia entities to Twitter profiles

  • Yaroslav Nechaev
  • Francesco Corcoglioniti
  • Claudio Giuliano
Regular Paper


SocialLink is a project designed to match social media profiles on Twitter to corresponding entities in DBpedia. Built to bridge the vibrant Twitter social media world and the Linked Open Data cloud, SocialLink enables knowledge transfer between the two: it helps Semantic Web practitioners harvest the vast amounts of information available on Twitter, and it allows DBpedia data to be leveraged for social media analysis tasks. In this paper, we extend the original SocialLink approach by exploiting graph-based features derived from both DBpedia and Twitter, represented as graph embeddings learned from vast amounts of unlabeled data. Introducing these new features required redesigning our deep neural network-based candidate selection algorithm and, as a result, we experimentally demonstrate a significant improvement in the performance of SocialLink.


Social Media, Linked Open Data, Machine learning, DBpedia



Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Fondazione Bruno Kessler, Trento, Italy
  2. University of Trento, Trento, Italy
