A Multilingual Lexico-Semantic Database and Ontology

  • Francis Bond
  • Christiane Fellbaum
  • Shu-Kai Hsieh
  • Chu-Ren Huang
  • Adam Pease
  • Piek Vossen
Chapter

Abstract

We discuss the development of a multilingual lexicon linked to the Suggested Upper Merged Ontology (SUMO), a formal ontology. Both the ontology and the lexicon have been expressed in the Web Ontology Language (OWL), in addition to their original formats, for use on the semantic web and in linked data. We describe the Open Multilingual Wordnet (OMW), a multilingual wordnet covering 22 languages with a rich structure of semantic relations. It is built by exploiting links from various monolingual wordnets to the English Wordnet. It currently contains 118,337 concepts expressed in 1,643,260 senses. It is available as simple tab-separated files, in Wordnet-Lexical Markup Framework (LMF), or in lemon, and has been used by many projects, including BabelNet and Google Translate. We discuss some issues in extending the wordnets and improving the multilingual representation to cover concepts not lexicalized in English, and how concepts are stated in the formal ontology.
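
The linking idea in the abstract, where each language's lemmas attach to a shared concept identifier taken from the English Wordnet, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the three-column layout (synset ID, `lang:lemma` label, word), the function names `load_senses` and `translate`, and the sample rows are all assumptions made for this example rather than data copied from an OMW release.

```python
from collections import defaultdict

# Toy rows in an ASSUMED "synset<TAB>lang:lemma<TAB>word" layout.
# They are illustrative only and not taken from any OMW release.
SAMPLE = """\
# toy data\tmul\t(no url)\t(no license)
02084071-n\teng:lemma\tdog
02084071-n\tjpn:lemma\t犬
02084071-n\tfra:lemma\tchien
02121620-n\teng:lemma\tcat
02121620-n\tfra:lemma\tchat
"""


def load_senses(text):
    """Group lemmas by synset ID, then by language code (hypothetical helper)."""
    concepts = defaultdict(lambda: defaultdict(list))
    for line in text.splitlines():
        # Skip blank lines and the '#' header/comment line.
        if not line.strip() or line.startswith("#"):
            continue
        synset, label, word = line.split("\t")
        lang, kind = label.split(":", 1)
        if kind == "lemma":
            concepts[synset][lang].append(word)
    return concepts


def translate(concepts, word, src, tgt):
    """Return target-language lemmas that share a synset with the source word."""
    found = []
    for langs in concepts.values():
        if word in langs.get(src, []):
            found.extend(langs.get(tgt, []))
    return found


concepts = load_senses(SAMPLE)
print(translate(concepts, "dog", "eng", "fra"))
```

Because every language points at the same concept identifier, a lookup between any two languages needs no pairwise dictionary. A concept with no lemma in the target language simply yields an empty list, which is the kind of lexical gap the abstract refers to.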

Key Words

  • Multilingual
  • Ontology
  • Open data
  • Semantic lexicon
  • Wordnet

Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Francis Bond (1)
  • Christiane Fellbaum (2)
  • Shu-Kai Hsieh (3)
  • Chu-Ren Huang (4)
  • Adam Pease (5)
  • Piek Vossen (6)

  1. Nanyang Technological University, Singapore, Singapore
  2. Princeton University, Princeton, USA
  3. National Taiwan University, Taipei, Taiwan
  4. Hong Kong Polytechnic University, Hung Hom, Hong Kong
  5. Articulate Software, San Francisco, USA
  6. Vrije Universiteit, Amsterdam, The Netherlands
