PhrasIS: Phrase Inference and Similarity Benchmark

Lopez-Gazpio, I.; de la Puerta, J. Gaviria; García, P.; Sanjurjo-González, H.; Sanz, B.; Maritxalar, M.; Agirre, E.

doi:10.1007/978-3-030-87869-6_25

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1401)

Included in the following conference series:

International Workshop on Soft Computing Models in Industrial and Environmental Applications

Abstract

We present PhrasIS, a dataset of Phrase pairs with Inference and Similarity annotations for the evaluation of semantic representations. The dataset fills the gap between word-level and sentence-level datasets, allowing compositional models to be evaluated at a finer granularity than sentences. Unlike other datasets, the phrase pairs are extracted from naturally occurring text in image captions and news, and were annotated by experts. We analyze the dataset, showing the relation between inference labels and similarity scores, and evaluate several well-known techniques, which obtain satisfactory performance. The gap with respect to annotator agreement shows that there is plenty of room for improvement. In addition, we introduce the use of similarity and relatedness inference relations, showing that they are useful for inference. With 10K phrase pairs split into development and test sets, the dataset is an excellent benchmark for testing meaning representation systems.

Author information

Authors and Affiliations

TIEC Department, University of Deusto (UD), Mundaiz kalea 50, 20012, Basque Country, Spain
I. Lopez-Gazpio, J. Gaviria de la Puerta, P. García, H. Sanjurjo-González & B. Sanz
HiTZ Basque Center for Language Technologies - Ixa NLP Group, University of the Basque Country (UPV/EHU), M. Lardizabal 1, Donostia, 20018, Basque Country, Spain
M. Maritxalar & E. Agirre

Corresponding author

Correspondence to I. Lopez-Gazpio.

Editor information

Editors and Affiliations

Faculty of Engineering, University of Deusto, Bilbao, Spain
Hugo Sanjurjo González
Faculty of Engineering, University of Deusto, Bilbao, Spain
Iker Pastor López
Faculty of Engineering, University of Deusto, Bilbao, Spain
Pablo García Bringas
Department of Industrial Engineering, University of A Coruña, Ferrol, Spain
Héctor Quintián
BISITE Research Group, University of Salamanca, Salamanca, Spain
Emilio Corchado

About this paper

Cite this paper

Lopez-Gazpio, I. et al. (2022). PhrasIS: Phrase Inference and Similarity Benchmark. In: Sanjurjo González, H., Pastor López, I., García Bringas, P., Quintián, H., Corchado, E. (eds) 16th International Conference on Soft Computing Models in Industrial and Environmental Applications (SOCO 2021). SOCO 2021. Advances in Intelligent Systems and Computing, vol 1401. Springer, Cham. https://doi.org/10.1007/978-3-030-87869-6_25

DOI: https://doi.org/10.1007/978-3-030-87869-6_25
Published: 23 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87868-9
Online ISBN: 978-3-030-87869-6
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
