Paraphrastic Reformulations in Spoken Corpora
Abstract
Our work addresses the automatic detection of paraphrastic reformulations in French spoken corpora. The proposed approach is syntagmatic: it relies on specific markers and on the particular features of spoken language. Manual multi-dimensional annotation performed by two annotators provides fine-grained reference data. An automatic method is then proposed to decide whether or not a sentence contains a paraphrastic relation. The results show up to 66.4% precision. Analysis of the manual annotations indicates that few paraphrastic segments show morphological modifications (inflection, derivation or compounding) and that syntactic equivalence between the segments is seldom preserved, since the segments usually belong to different syntactic categories.
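As an illustration only, the marker-based idea can be sketched in a few lines of Python. This is not the authors' method: the marker list, the function names `split_on_marker` and `precision`, and the example utterance are all assumptions made for the sketch, since the abstract does not list the markers or describe the classifier.

```python
import re

# Hypothetical marker list: the abstract does not enumerate the markers used.
MARKERS = ["c'est-à-dire", "je veux dire", "disons"]

_MARKER_RE = re.compile(
    r"\b(" + "|".join(re.escape(m) for m in MARKERS) + r")\b",
    flags=re.IGNORECASE,
)


def split_on_marker(utterance):
    """Return (source, marker, target) around the first marker, or None.

    A candidate is kept only if there is text on both sides of the marker,
    i.e. a segment that may be reformulated and a segment that may
    reformulate it. Whether the two segments are truly paraphrastic still
    has to be decided by a later step.
    """
    text = utterance.replace("\u2019", "'")  # normalise typographic apostrophes
    match = _MARKER_RE.search(text)
    if match is None:
        return None
    source = text[:match.start()].strip(" ,")
    target = text[match.end():].strip(" ,")
    if not source or not target:
        return None
    return source, match.group(1).lower(), target


def precision(predicted, gold):
    """Share of items predicted as paraphrastic that the reference confirms."""
    confirmed = [g for p, g in zip(predicted, gold) if p]
    if not confirmed:
        return 0.0
    return sum(confirmed) / len(confirmed)


if __name__ == "__main__":
    # Made-up utterance, not taken from the corpus.
    print(split_on_marker("il est médecin c'est-à-dire il soigne les gens"))
```

The `precision` helper shows how a figure such as the reported 66.4% would be computed against the manual reference annotations: the number of predicted paraphrastic sentences confirmed by the annotators, divided by the total number predicted.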
Keywords
reformulation, paraphrase, spoken corpora