Linguistic resources for paraphrase generation in Portuguese: a lexicon-grammar approach


This paper presents a new linguistic resource for the generation of paraphrases in Portuguese, based on the lexicon-grammar framework. The resource components include: (i) a lexicon-grammar-based dictionary of 2100 predicate nouns co-occurring with the support verb ser de ‘be of’, such as in ser de uma ajuda inestimável ‘be of invaluable help’; (ii) a lexicon-grammar-based dictionary of 6000 predicate nouns co-occurring with the support verb fazer ‘do’ or ‘make’, such as in fazer uma comparação ‘make a comparison’; and (iii) a lexicon-grammar-based dictionary of about 5000 human intransitive adjectives co-occurring with the copula verbs ser and/or estar ‘be’, such as in ser simpático ‘be kind’ or estar entusiasmado ‘be enthusiastic’. A set of local grammars explores the properties described in these linguistic resources, enabling a variety of text transformation tasks for paraphrasing applications. The paper highlights the complementary and synergistic components and the integration efforts, and presents some preliminary evaluation results on the inclusion of such resources in the eSPERTo paraphrase generation system.

  7. This is one of the most frequent verbs in European Portuguese, both in written texts and in the spoken language. Sentences with support verb constructions are often more frequent than sentences with the equivalent verbal constructions. This is corroborated by Barreiro (2009), who searched all sentences of the COMPARA parallel corpus (Frankenberg-Garcia & Santos, 2003; Santos & Inácio, 2006) in which the infinitive form of fazer occurs with a noun or with a left modifier and a noun, and found that the occurrence is a support verb construction 47% of the time.


  9. The underscore indicates that two lexical units (preposition and definite article), normally contracted, were split here for clarity.

  10. This is also a paraphrase of O Pedro fez uma festinha à Joana na cara ‘Pedro did a caress to Joana in the face’.

  11. Many support verb constructions can undergo passive as well.

  12. The asterisk ‘*’ signals that the sentence is unacceptable, while the question mark indicates doubtful acceptability.

  13. Even though variants are also support verbs, they may feature syntactic properties of their own, so a detailed description is in order.

  14. To be precise, the ser de construction not only expresses a human quality but also characterizes an attitude or a gesture of the subject towards the human complement, e.g. A atitude/o gesto do Pedro foi de uma certa gentileza ‘Pedro’s attitude/gesture was of a certain kindness’. The sentence with fazer, on the other hand, is not strictly semantically equivalent: the paraphrase involves a regular meaning difference, in which the expression of a human quality is, at the least, less obvious, and only the second interpretation of the ser de construction is kept. Since the difference is systematic, the two can be treated as approximate paraphrases.


This research work was supported by Fundação para a Ciência e Tecnologia (FCT), under projects EXPL/MHC-LIN/2260/2013, UIDB/50021/2020, and UTAP EXPL/EEI-ESS/0031/2014, and post-doctoral grant SFRH/BPD/91446/2012. The authors would like to thank Max Silberztein for his continued support with NooJ since the development of the first version of Port4NooJ.

  • Language Generation Resources
  • Paraphrasing
  • Lexicon-Grammar
  • Paraphrase Generator
  • Portuguese