The Creation of Onto.PT: A Wordnet-Like Lexical Ontology for Portuguese

  • Hugo Gonçalo Oliveira
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8775)

Abstract

A wordnet is an important tool for developing natural language processing applications for a language, but relying on manual creation limits the development of such a resource. This dissertation studied the automatic construction of Onto.PT, a large Portuguese wordnet, aiming to minimise the main limitations of existing Portuguese wordnets. In this context, we propose ECO, an approach for creating wordnets automatically from text: relation instances are extracted, synonymy clusters (synsets) are discovered, and the remaining relations are then attached to suitable synsets. This document also reports on the contents of Onto.PT, its comparison to other wordnets, and its evaluation.
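
The three steps named in the abstract can be illustrated with a toy Python sketch. It is not the ECO implementation: the relation names, the example triples, and the functions `discover_synsets` and `attach_relations` are all hypothetical. It also makes two simplifying assumptions: synsets are taken to be the connected components of the synonymy graph, and each remaining relation is attached to whichever synset contains its argument word.

```python
from collections import defaultdict

SYNONYMY = "synonym-of"  # hypothetical relation label

# Hypothetical output of step 1: relation instances as (word, relation, word).
TRIPLES = [
    ("carro", SYNONYMY, "automóvel"),
    ("automóvel", SYNONYMY, "viatura"),
    ("veículo", "hypernym-of", "carro"),
    ("roda", "part-of", "viatura"),
]


def discover_synsets(triples):
    """Step 2 (simplified): one synset per connected component of the synonymy graph."""
    words = set()
    neighbours = defaultdict(set)
    for a, relation, b in triples:
        words.update((a, b))
        if relation == SYNONYMY:
            neighbours[a].add(b)
            neighbours[b].add(a)

    synsets, seen = [], set()
    for word in sorted(words):
        if word in seen:
            continue
        # Depth-first traversal collects every word reachable through synonymy.
        component, stack = set(), [word]
        while stack:
            current = stack.pop()
            if current not in component:
                component.add(current)
                stack.extend(neighbours[current] - component)
        seen |= component
        synsets.append(frozenset(component))
    return synsets


def attach_relations(triples, synsets):
    """Step 3 (simplified): lift each non-synonymy triple from words to their synsets."""
    synset_of = {word: synset for synset in synsets for word in synset}
    return {
        (synset_of[a], relation, synset_of[b])
        for a, relation, b in triples
        if relation != SYNONYMY
    }


synsets = discover_synsets(TRIPLES)
relations = attach_relations(TRIPLES, synsets)
```

In this toy setting every word belongs to exactly one synset, so attachment is a dictionary lookup. With ambiguous words a word can belong to several synsets, and choosing the suitable one is the hard part that this sketch leaves out.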

Keywords

Semantic Relation, Lexical Item, Computational Linguistics, Relation Extraction, Language Resource
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Hugo Gonçalo Oliveira (1)
  1. CISUC, Dept. of Informatics Engineering, University of Coimbra, Portugal
