Utilization of DBpedia Mapping in Cross Lingual Wikipedia Infobox Completion

  • Megawati
  • Saemi Jang
  • Mun Yong Yi
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9992)


Wikipedia plays a central role on the web as one of the largest knowledge sources, owing to its broad coverage of information from many domains. However, because the number of pages is enormous and the number of contributors available to maintain them is limited, information is often missing from Wikipedia articles, especially across the different language versions of an article. Several approaches have been studied to close the information gap between cross-language Wikipedia articles, but they can only be applied to languages that share the same root. In this paper, we propose an approach that generates new information for Wikipedia infoboxes written in languages with different roots by utilizing the existing DBpedia mappings. We combine mapping information from DBpedia with an instance-based method to align existing Korean-English infobox attribute-value pairs and to generate new pairs from the Korean version that fill in missing information in the English version. The results show that we could expand the existing English Wikipedia attribute-value pairs in our datasets by up to 38%, with 61% accuracy.
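The combination described above can be illustrated with a minimal sketch. It is not the authors' implementation: the attribute names, mapping tables, sample infoboxes, and the `min_support` threshold are all hypothetical, and the instance-based step is reduced to exact value matching, whereas a real system would need value normalization and translation.

```python
from collections import Counter

# Hypothetical DBpedia-style mappings: infobox attribute -> ontology property.
KO_MAPPING = {"출생일": "birthDate", "국적": "nationality"}
EN_MAPPING = {"birth_date": "birthDate", "nationality": "nationality"}


def align_by_mapping(ko_map, en_map):
    """Align Korean and English attributes mapped to the same ontology property."""
    prop_to_en = {prop: attr for attr, prop in en_map.items()}
    return {ko: prop_to_en[prop] for ko, prop in ko_map.items() if prop in prop_to_en}


def align_by_instances(article_pairs, known, min_support=2):
    """Instance-based fallback (simplified assumption): align unmapped Korean
    attributes to English attributes whose values coincide in at least
    `min_support` paired articles."""
    votes = Counter()
    for ko_box, en_box in article_pairs:
        for ko_attr, ko_val in ko_box.items():
            if ko_attr in known:
                continue  # already aligned through the mappings
            for en_attr, en_val in en_box.items():
                if ko_val == en_val:
                    votes[(ko_attr, en_attr)] += 1
    best = {}
    for (ko_attr, en_attr), count in votes.most_common():
        if count >= min_support and ko_attr not in best:
            best[ko_attr] = en_attr
    return best


def complete_infobox(ko_box, en_box, alignment):
    """Return pairs found in the Korean infobox but missing from the English one.
    Values are copied untranslated in this sketch."""
    return {
        alignment[attr]: value
        for attr, value in ko_box.items()
        if attr in alignment and alignment[attr] not in en_box
    }


# Hypothetical paired infoboxes: (Korean version, English version).
ARTICLE_PAIRS = [
    ({"출생일": "1990-01-01", "키": "180"}, {"birth_date": "1990-01-01", "height": "180"}),
    ({"키": "175", "국적": "대한민국"}, {"height": "175"}),
]

alignment = align_by_mapping(KO_MAPPING, EN_MAPPING)
alignment.update(align_by_instances(ARTICLE_PAIRS, known=alignment))
new_pairs = complete_infobox(*ARTICLE_PAIRS[1], alignment)
print(new_pairs)
```

In this sketch the mappings are applied first and the instance-based step only considers attributes the mappings leave unaligned. That ordering is a design choice of the example, not something the abstract specifies.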


Infobox alignment; Infobox completion; DBpedia; Cross language Wikipedia



This work was supported by the Industrial Strategic Technology Development Program, 10052955, Experiential Knowledge Platform Development Research for the Acquisition and Utilization of Field Expert Knowledge, funded by the Ministry of Trade, Industry & Energy (MI, Korea).



Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  1. Department of Industrial and Systems Engineering, Graduate School of Knowledge Service Engineering, KAIST, Daejeon, South Korea
