VIE-GEN A Generator for German Texts

Buchberger, E.; Horacek, H.

doi:10.1007/978-1-4612-3846-1_5

Part of the book series: Symbolic Computation (1064)

Overview

VIE-GEN is a generator that produces German text from a semantic representation. It is a component of the German-language dialogue system VIE-LANG [2], implemented in INTERLISP. The input to VIE-GEN is part of the episodic layer of the semantic network SEMNET; its output consists of German sentences. The generator is not restricted to single sentences: it contains features for creating coherent text, such as the generation of anaphora and gapping. VIE-GEN is designed to suit the idiosyncrasies of German: it can produce alternative word orderings (German word order is less strict than, e.g., that of English), it accounts for the syntactic differences between main and dependent clauses, and it correctly produces all inflectional forms found in German.

To handle all these features, VIE-GEN performs its task in two steps, referred to as the verbalization phase and the realization phase. The input to the verbalization phase is a part of the episodic layer of the semantic net, supplied by the dialogue component. This reflects the nearly classical distinction between “what to say” (decided by the dialogue component) and “how to say it” (decided by the generator) (e.g. [11]). The verbalization process is strongly data-driven; its main sources of information are discrimination nets (DNs) and the so-called syntactico-semantic lexicon (SSL). Applying the DNs and evaluating the SSL creates an intermediate structure (IMS), which forms the input to the realization phase.
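The verbalization step can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the net `INGEST_NET`, the lexicon `SSL`, the role names, and the functions `choose_lexeme` and `verbalize` are hypothetical and are not taken from VIE-GEN, which is implemented in INTERLISP and operates on SEMNET structures rather than dictionaries. The sketch only shows the general idea of a discrimination net selecting a word and a lexicon entry mapping semantic roles to complement types.

```python
# Toy discrimination net (invented): inner nodes test one property of the
# input concept, leaves name a German verb lexeme.
INGEST_NET = {
    "test": "object_kind",
    "branches": {"liquid": "trinken", "solid": "essen"},
    "default": "konsumieren",
}

# Toy syntactico-semantic lexicon (invented): for each verb, the mapping
# from semantic roles to syntactic complement types.
SSL = {
    "trinken": {"agent": "subject", "object": "accusative_object"},
    "essen": {"agent": "subject", "object": "accusative_object"},
}


def choose_lexeme(net, concept):
    """Walk the net until a leaf (a plain string) is reached."""
    node = net
    while isinstance(node, dict):
        answer = concept.get(node["test"])
        node = node["branches"].get(answer, node["default"])
    return node


def verbalize(concept):
    """Build a small IMS-like dictionary from a concept description."""
    verb = choose_lexeme(INGEST_NET, concept)
    frame = SSL.get(verb, {})  # verbs without an entry get no complements
    dependents = [
        {"lexeme": concept[role], "complement_type": ctype}
        for role, ctype in frame.items()
        if role in concept
    ]
    return {
        "lexeme": verb,
        "tense": concept.get("tense", "present"),
        "dependents": dependents,
    }


if __name__ == "__main__":
    # Hypothetical input standing in for a fragment of the episodic layer.
    print(verbalize({"agent": "Kind", "object": "Milch", "object_kind": "liquid"}))
```

Keeping the net and the lexicon as plain data mirrors the data-driven character described above: in this sketch, changing word choice means editing the tables, not the control code.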

The IMS is a tree whose nodes represent either single words (terminals) or groups of words, i.e. constituents (nonterminals), together with their features. The “lexeme” property of a node contains the canonical form of the word (actually a pointer to a lexicon entry, so that morphological data can be accessed in the last step of processing) or, in the nonterminal case, a list of pointers to the dependent nodes. Admissible features include the complement type; an identifying marker that carries information about the individuated source concept (in SEMNET) the node stands for (used as a link for phrase heads and playing an important role in the generation of anaphora and gapping); number; tense; preposition (prepositional phrases are treated like noun phrases, the only difference being that the preposition feature is bound to a non-NIL value); and some others. Details on the IMS are given in a separate section below.
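As a rough illustration of this tree, the following Python sketch models a single IMS node. The class name `IMSNode`, the field names, and the example values are assumptions chosen for readability; the overview does not give the actual INTERLISP property names (apart from "lexeme"), and Python's `None` stands in here for NIL.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class IMSNode:
    """Hypothetical IMS node; the layout is illustrative, not VIE-GEN's own."""

    # Terminal: a key into a lexicon. Nonterminal: the list of dependent nodes.
    lexeme: Union[str, List["IMSNode"]]
    complement_type: Optional[str] = None  # e.g. "subject"
    marker: Optional[str] = None           # link to the source concept
    number: Optional[str] = None           # e.g. "sg" or "pl"
    tense: Optional[str] = None
    preposition: Optional[str] = None      # None for a plain noun phrase

    def is_terminal(self) -> bool:
        return isinstance(self.lexeme, str)

    def is_prepositional(self) -> bool:
        # A prepositional phrase is a noun phrase with a bound preposition.
        return self.preposition is not None

    def terminals(self) -> List[str]:
        """Collect the lexicon keys of all terminal nodes, left to right."""
        if self.is_terminal():
            return [self.lexeme]
        return [key for child in self.lexeme for key in child.terminals()]


if __name__ == "__main__":
    # Invented example nodes: a subject noun phrase and a prepositional phrase.
    subject = IMSNode([IMSNode("der"), IMSNode("Hund")],
                      complement_type="subject", marker="DOG-1", number="sg")
    location = IMSNode([IMSNode("das"), IMSNode("Haus")], preposition="in")
    print(subject.terminals(), location.is_prepositional())
```

Storing the preposition as an ordinary optional feature, rather than introducing a separate node type, follows the treatment of prepositional phrases described in the paragraph above.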

The task of the realization phase is the production of surface sentences out of the IMS. This task is divided into the following subprocedures:

The predicates (in the grammatical sense) of the IMS are split into their finite and non-finite parts (in German, the predicate is often a discontinuous constituent).
All constituents of a sentence are sorted into a standard order. The most interesting parts of this subtask are the partitioning of the IMS into sentences while maintaining their connections, and the selection of appropriate sentence types.
Transformations are applied to these sentences: constituents are reordered for questions and subordinate clauses, and gapping or anaphora are generated.
Noun phrases are linearized.
In a last step, morphological synthesis is performed.
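Two of these subprocedures, the splitting of the predicate and the reordering for different clause types, can be sketched in Python as follows. The function names `split_predicate` and `linearize`, the clause-type labels, and the simplifying assumptions (the first predicate word is the finite verb, constituents arrive as already inflected strings in standard order, at least one constituent is present) are all hypothetical; sentence partitioning, gapping, anaphora, noun-phrase linearization, and morphological synthesis are left out.

```python
def split_predicate(predicate):
    """Split a predicate into its finite and non-finite parts.

    Assumption of this toy: the first word is the finite verb and the
    remaining words (participles, infinitives) are non-finite.
    """
    return predicate[0], predicate[1:]


def linearize(constituents, predicate, clause_type="main"):
    """Place the predicate parts around constituents in standard order."""
    finite, non_finite = split_predicate(predicate)
    first, rest = constituents[0], constituents[1:]
    if clause_type == "main":
        # Verb-second: finite verb after the first constituent,
        # non-finite part at the end of the clause.
        words = [first, finite, *rest, *non_finite]
    elif clause_type == "question":
        # Yes/no question: finite verb moves to the front.
        words = [finite, first, *rest, *non_finite]
    elif clause_type == "subordinate":
        # Verb-final: finite verb follows the non-finite part.
        words = [first, *rest, *non_finite, finite]
    else:
        raise ValueError(f"unknown clause type: {clause_type}")
    return " ".join(words)


if __name__ == "__main__":
    constituents = ["das Kind", "die Milch"]
    predicate = ["hat", "getrunken"]
    for kind in ("main", "question", "subordinate"):
        print(kind, "->", linearize(constituents, predicate, kind))
```

For the invented example, the main clause comes out as "das Kind hat die Milch getrunken", which shows the discontinuous predicate mentioned in the first item: the finite and non-finite parts are separated by the object.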

References

[1] Brachman R.J.: A Structural Paradigm for Representing Knowledge, Bolt Beranek and Newman Inc., Rep. 3605, Cambridge, MA; 1978.
[2] Buchberger E., Steinacker I., Trappl R., Trost H., Leinfellner E.: VIE-LANG - A German Language Understanding System, in Trappl R. (ed.), Cybernetics and Systems Research, North-Holland, Amsterdam; 1982.
[3] Charniak E.: The Case-Slot Identity Theory, Cognitive Science, 5(3), 285–292; 1981.
[4] Engel U.: Syntax der deutschen Gegenwartssprache, Erich Schmidt Verlag; 1977.
[5] Goldman N.M.: Computer Generation of Natural Language from a Deep Conceptual Base, Stanford AI Lab Memo AIM-247; 1974.
[6] Horacek H.: Zur Generierung zusammenhaengender Texte, in Neumann B. (ed.), GWAI-83, Springer, Berlin; 1983.
[7] Katz B.: A Three-Step Procedure for Language Generation, MIT, AI Memo No. 599; 1980.
[8] Kobsa A., Buchberger E., Steinacker I.: Funktion, Inhalt und Aufbau von Partnermodellen in natuerlichsprachigen Dialogsystemen, Bericht 82–08, Inst.f.Med.Kybernetik, Univ.Wien; 1982.
[9] Kunze J., Ruediger B.: Algorithmische Synthese der Flexionsformen des Deutschen, Zeitschrift fuer Phonetik, Sprachwissenschaft und Kommunikationsforschung 21, 245–303; 1968.
[10] Matthiessen C.M.I.M.: A Grammar and a Lexicon for a Text-Production System, in Proceedings of the 19th Annual Meeting of the ACL, 49–55; 1981.
[11] McKeown K.R.: The Text System for Natural Language Generation: An Overview, in Proceedings of the 20th Annual Meeting of the Association for Computational Linguistics, University of Toronto, Toronto, Canada, 113–120; 1982.
[12] Schank R.C.: Conceptual Information Processing, North-Holland, Amsterdam; 1975.
[13] Steinacker I., Buchberger E.: Relating Syntax and Semantics: The Syntactico-Semantic Lexicon of the System VIE-LANG, in Proceedings of the First Conference of the European Chapter of the ACL, Pisa, Italy; 1983.
[14] Steinacker I.: VIE-PAR, ein semantisch gesteuerter Parser zur Analyse deutscher Saetze in einem sprachverstehenden System, Dissertation, Technische Universitaet Wien; 1984.
[15] Trost H., Buchberger E.: Lexikon, morphologische Analyse und Synthese im System VIE-LANG, Bericht 81–02, Inst.f.Med.Kybernetik, Univ.Wien; 1981.
[16] Trost H., Steinacker I.: The Role of Roles: Some Aspects of World Knowledge Representation, in Proceedings of the 7th International Joint Conference on Artificial Intelligence, Univ.British Columbia, Vancouver, Canada; 1981.
[17] Trost H.: Struktur und Zugriffsroutinen der Datenstruktur SEMNET, Institutsbericht 82–13, Inst.f.Med.Kybernetik, Univ.Wien; 1982.
[18] Trost H.: SEMNET - Ein semantisches Netz zur Darstellung von Umweltwissen in einem natuerlichsprachigen System, Dissertation, Technische Universitaet Wien; 1983.
[19] Wilks Y.: Good and Bad Arguments about Semantic Primitives, Communication and Cognition, 10(3/4); 1977.

Editor information

Editors and Affiliations

Brattle Research Corporation, Cambridge, MA, 02138, USA
David D. McDonald
Institute of Informatics, Warsaw, Warsaw University, PKin, pok. 850, PL.00.901, Warszawa, Poland
Leonard Bolc

About this chapter

Cite this chapter

Buchberger, E., Horacek, H. (1988). VIE-GEN A Generator for German Texts. In: McDonald, D.D., Bolc, L. (eds) Natural Language Generation Systems. Symbolic Computation. Springer, New York, NY. https://doi.org/10.1007/978-1-4612-3846-1_5

DOI: https://doi.org/10.1007/978-1-4612-3846-1_5
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4612-8374-4
Online ISBN: 978-1-4612-3846-1
eBook Packages: Springer Book Archive
