High Quality Arabic Lexical Ontology Based on MUHIT, WordNet, SUMO and DBpedia

  • Eslam Kamal
  • Mohsen Rashwan
  • Sameh Alansary
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9041)

Abstract

In this paper, we aim to advance ontology-based Arabic NLP by experimenting with the generation of a comprehensive Arabic lexical ontology from multiple language resources. We recommend a combination of MUHIT, WordNet and SUMO, and use a simple method to link them, which yields an Arabic-lexicalized version of the SUMO ontology. We then evaluate the generated ontology and propose a method for increasing its named entity coverage using DBpedia, English-to-Arabic transliteration, and named entity recognition. The resulting Arabic lexical ontology has 228K Arabic synsets, linked to 7.8K concepts and 143K instances. It achieves a precision of 96.9% and a recall of 75.5% for NLU scenarios.
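
The abstract does not spell out the linking method. As a rough illustration only, the sketch below assumes that the Arabic dictionary entries and the SUMO concepts can both be keyed by a shared WordNet synset identifier, so that linking reduces to a join on that key. All identifiers, data, and the function name `link_lexicon_to_ontology` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy data; none of these identifiers come from the paper.
# Arabic lemma -> WordNet synset IDs (stand-in for a MUHIT-style dictionary).
arabic_to_synsets = {
    "كتاب": ["wn:book.n.01"],
    "نهر": ["wn:river.n.01"],
    "قطة": ["wn:cat.n.01"],  # no ontology mapping below, so it stays unlinked
}

# WordNet synset ID -> SUMO concept (stand-in for a WordNet-to-SUMO mapping).
synset_to_sumo = {
    "wn:book.n.01": "Book",
    "wn:river.n.01": "River",
}


def link_lexicon_to_ontology(lexicon, mapping):
    """Join the two tables on the shared synset ID.

    Returns a dict from ontology concept to the set of Arabic lemmas
    that lexicalize it. Lemmas whose synsets have no mapping are skipped.
    """
    linked = defaultdict(set)
    for lemma, synsets in lexicon.items():
        for synset in synsets:
            concept = mapping.get(synset)
            if concept is not None:
                linked[concept].add(lemma)
    return dict(linked)


arabic_sumo = link_lexicon_to_ontology(arabic_to_synsets, synset_to_sumo)
```

In this toy setup, lemmas without a mapped synset are simply left out of the result, which is one way coverage gaps of the kind the recall figure reflects could arise.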

Keywords

Arabic NLP; Ontology-based Arabic NLP; Arabic Language Resources; Arabic Lexical Ontology; Arabic WordNet; Arabic SUMO; Arabic Ontology; Multilingual Dictionary; MUHIT; Arabic Named Entities

Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  1. Arab Academy for Science, Technology and Maritime Transport, Cairo, Egypt
  2. The Engineering Company for the Development of Computer Systems, RDI, Cairo, Egypt
  3. Bibliotheca Alexandrina, Alexandria, Egypt