Predicate Matrix: automatically extending the semantic interoperability between predicate resources

Abstract

This paper presents a novel approach to improving the interoperability between four semantic resources that incorporate predicate information. Our proposal defines a set of automatic methods for mapping the semantic knowledge included in WordNet, VerbNet, PropBank and FrameNet. We use advanced graph-based word sense disambiguation algorithms and corpus alignment methods to automatically establish the appropriate mappings among their lexical entries and roles. We study different settings for each method using SemLink as a gold standard for evaluation. The results show that the new approach provides productive and reliable mappings. In fact, the mappings obtained automatically outnumber the set of original mappings in SemLink. Finally, we present a new version of the Predicate Matrix, a lexical-semantic resource resulting from the integration of the mappings obtained by our automatic methods and SemLink.
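Evaluating automatically obtained mappings against a gold standard such as SemLink can be pictured as a set comparison. The following is only a minimal sketch under stated assumptions: the function name `evaluate_mappings`, the pair representation and the toy data are hypothetical illustrations, not the authors' evaluation code or the actual SemLink format.

```python
from typing import Set, Tuple

# A mapping is modelled as a (source entry, target entry) pair.
# This representation is an assumption made for illustration only.
Mapping = Tuple[str, str]


def evaluate_mappings(predicted: Set[Mapping], gold: Set[Mapping]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) of predicted mappings against a gold set."""
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data: the identifiers below are made up for the example.
gold = {
    ("learn.01", "learn-14-1"),
    ("understand.01", "comprehend-87.2-1"),
}
predicted = {
    ("learn.01", "learn-14-1"),      # agrees with the gold set
    ("understand.01", "learn-14-1"),  # not in the gold set
}

p, r, f1 = evaluate_mappings(predicted, gold)
print(f"P={p:.2f} R={r:.2f} F1={f1:.2f}")
```

Note that a predicted pair missing from the gold set is counted as an error here, although it may be a valid new mapping that the gold standard simply does not contain.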

Notes

  1. http://adimen.si.ehu.es/web/PredicateMatrix.

  2. http://wordnet.princeton.edu/.

  3. http://verbs.colorado.edu/~mpalmer/projects/verbnet.html.

  4. http://framenet.icsi.berkeley.edu/.

  5. http://verbs.colorado.edu/~mpalmer/projects/ace.html.

  6. http://verbs.colorado.edu/semlink/.

  7. http://wordnet.princeton.edu/glosstag.shtml.

  8. We obtain better results combining all sources of contexts than exploiting them separately.

  9. https://code.google.com/p/mate-tools/.

  10. The overall performance of the SRL system is 85.50 % F1.

  11. As explained before, to discover new alignments, it is possible to start from Step 1 or Step 2.

  12. http://www.ark.cs.cmu.edu/SEMAFOR/.

  13. https://code.google.com/p/mate-tools/.

  14. The overall performance of the SRL system is 85.50 % F1.

  15. Note that the Only-one strategy applied to obtain mappings between PropBank and FrameNet deals with both lexical and role mappings.

  16. http://verbs.colorado.edu/verb-index/vn/learn-14.php#learn-14-1.

  17. http://verbs.colorado.edu/verb-index/vn/comprehend-87.2.php#comprehend-87.2-1.

  18. http://www.newsreader-project.eu.


Acknowledgments

This work has been partially funded by TUNER (TIN2015-65308-C5-1-R), NewsReader (FP7-ICT-2011-8-316404), as well as the READERS project with the financial support of MINECO, ANR (convention ANR-12-CHRI-0004-03) and EPSRC (EP/K017845/1) in the framework of ERA-NET CHIST-ERA (UE FP7/2007-2013).

Author information

Corresponding author

Correspondence to Egoitz Laparra.

About this article

Cite this article

Lopez de Lacalle, M., Laparra, E., Aldabe, I. et al. Predicate Matrix: automatically extending the semantic interoperability between predicate resources. Lang Resources & Evaluation 50, 263–289 (2016). https://doi.org/10.1007/s10579-016-9348-5

Keywords

  • Verbal lexicon
  • WordNet
  • VerbNet
  • FrameNet
  • PropBank
  • SemLink