Author Profile Enrichment for Cross-Linking Digital Libraries

  • Arben Hajra
  • Vladimir Radevski
  • Klaus Tochtermann
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9316)

Abstract

This work aims to enrich author profiles with additional information to better support the search and retrieval of publications across different digital libraries. To achieve this objective, we exploit concepts for cross-linking data to identify correlations between an author and other authors, publications, or other related information. We introduce a profile enrichment approach that adds information (e.g. biographical information) from different sources to existing author profiles. In this context, the linked open data repository DBpedia serves as a valuable source for our profile enrichment approach. One of several challenges is identifying the same author in different sources. To address this challenge, we exploit VIAF (Virtual International Authority File) for author identification. Technically, we apply data mining and clustering techniques to uniquely identify authors.
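
As a rough illustration of the enrichment idea summarized above, the following Python sketch merges attributes from external records into an author profile when both carry the same VIAF identifier. It is not the authors' implementation: the `AuthorProfile` class, the `enrich` function, the record layout, and all sample values are hypothetical, and the sketch assumes that identifier matching has already been resolved (the data mining and clustering step is not shown).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class AuthorProfile:
    """A library-side author record (hypothetical field names)."""
    name: str
    viaf_id: Optional[str] = None
    attributes: Dict[str, str] = field(default_factory=dict)


def enrich(profile: AuthorProfile, external_records: List[dict]) -> AuthorProfile:
    """Return a profile enriched with attributes from matching records.

    A record matches only if it carries the same VIAF identifier as the
    profile. Attribute values already present in the profile are kept.
    """
    if profile.viaf_id is None:
        # Without an identifier we cannot match safely, so nothing is added.
        return profile
    merged = dict(profile.attributes)
    for record in external_records:
        if record.get("viaf_id") != profile.viaf_id:
            continue
        for key, value in record.get("attributes", {}).items():
            merged.setdefault(key, value)  # never overwrite existing values
    return AuthorProfile(profile.name, profile.viaf_id, merged)


# Made-up sample data, for illustration only.
profile = AuthorProfile("Jane Example", "viaf-0001",
                        {"affiliation": "Example University"})
records = [
    {"viaf_id": "viaf-0001",
     "attributes": {"birth_year": "1970", "affiliation": "Other Institute"}},
    {"viaf_id": "viaf-0002", "attributes": {"birth_year": "1985"}},
]
enriched = enrich(profile, records)
```

Matching on a shared identifier rather than on the name string is the design choice the sketch is meant to show: two sources can spell the same name differently, or use the same name for different people, while an authority identifier stays stable.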

Keywords

  • Digital libraries
  • VIAF
  • Author disambiguation
  • Data mining
  • Profile enrichment
  • Linked open data

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Arben Hajra (1)
  • Vladimir Radevski (1)
  • Klaus Tochtermann (2)
  1. South East European University (SEEU), Tetovo, Republic of Macedonia
  2. Leibniz Information Centre for Economics (ZBW), Kiel, Germany
