Synthesis of Czech Sentences from Tectogrammatical Trees

  • Jan Ptáček
  • Zdeněk Žabokrtský
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4188)

Abstract

This paper presents a new rule-based approach to natural language generation. The system synthesizes Czech sentences from the Czech tectogrammatical trees supplied by the Prague Dependency Treebank 2.0 (PDT 2.0). Linguistically relevant phenomena, including valency, diathesis, condensation, agreement, word order, punctuation, and vocalization, have been studied and implemented in Perl using the software tools shipped with PDT 2.0. The generated sentences are evaluated with the BLEU score.
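To make the evaluation step concrete, here is a minimal sketch of a BLEU-style score for a single generated sentence against a single reference sentence. It is an illustration only: the abstract does not state the evaluation setup, so the function names `ngrams` and `bleu`, the use of one reference, the 4-gram limit, and the absence of smoothing are all assumptions. The sketch is in Python for brevity, not in the Perl of the actual system.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Simplified sentence-level BLEU with one reference and no smoothing.

    Hypothetical helper for illustration; not the evaluation code of the paper.
    """
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clipped counts: a candidate n-gram is credited at most as many
        # times as it occurs in the reference.
        matched = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, any zero precision gives zero
        log_precisions.append(math.log(matched / total))
    # Brevity penalty: candidates shorter than the reference are penalized.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat today".split()
generated = "the cat sat on the mat".split()
score = bleu(generated, reference)
```

In this usage example every n-gram of the generated sentence also occurs in the reference, so the score is lowered only by the brevity penalty. Corpus-level BLEU, as normally reported, pools the n-gram counts over all sentences before taking the precisions.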

Keywords

Machine Translation, Relative Clause, Word Order, Word Form, Morphological Category
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2006

Authors and Affiliations

  • Jan Ptáček (1)
  • Zdeněk Žabokrtský (1)
  1. Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic
