Abstract
This paper presents a model for representing document architecture for natural language generation. Document architecture is an abstract notion, perceptible through the formatting applied to the structural elements of a text: titles, the organization of the text into chapters or sections, paragraphs, and formulations such as theorems, definitions, and introductions. It can be seen as the visual structure of a text. The representation model is built from a specialized sublanguage (in the sense of Z. Harris) of natural language. This sublanguage, a metalanguage for text architecture, is tailored to text production. We extracted about thirty classes of metasentences of this metalanguage, covering nearly the full range of architectural phenomena that appear in scientific and technical documents. A specific text is represented by a metadiscourse: a list of instantiated metasentences that satisfies properties ensuring the coherence of the metadiscourse. The model has multiple applications: it can be used by systems that manipulate and transform text through its architecture (analysis, generation, formatting, editing, etc.). In the last part of the paper, we briefly survey a system for generating formatted text based on this model.
References
J. L. Austin, How to do things with words. Oxford University Press. London. 1962.
Daniel Chester, The Translation of Formal Proofs into English. Artificial Intelligence, 7, 261–278. 1976.
Laurence Danlos, Génération automatique de textes en langues naturelles. Masson. 1985.
Melvin Fitting, Intuitionistic Logic Model Theory and Forcing. North-Holland Publishing Co. Amsterdam. 1969.
C. F. Goldfarb, The SGML Handbook. Oxford University Press. 1990.
Maurice Gross, Méthodes en syntaxe. Hermann. 1975.
Barbara J. Grosz & Candace L. Sidner, Attention, Intentions and the Structure of Discourse. Computational Linguistics, 12, 3, 175–204. 1986.
Barbara J. Grosz, Martha E. Pollack & Candace L. Sidner, Discourse. Foundations of Cognitive Science, M. I. Posner (Ed), The MIT Press, 437–468. 1989.
Zelig S. Harris, Mathematical Structures of Language. John Wiley and Sons. 1968.
Eduard H. Hovy, Parsimonious and Profligate Approaches to the Question of Discourse Structure Relations. Proceedings of the 5th International Workshop on Natural Language Generation, 128–136. 1990.
Eduard H. Hovy & Yigal Arens, Automatic Generation of Formatted Text. Proceedings of the 8th Conference of the American Association for Artificial Intelligence, Anaheim, CA, 92–96. 1991.
Aravind K. Joshi, Generation: A New Frontier of Natural Language Processing? Theoretical Issues in Natural Language Processing, Yorick Wilks (Ed), 191–193. 1989.
Martine Landelle, Analyse syntaxique de l'expression de la segmentation dans le lexique français. Rapport de DEA. Toulouse, France. June 1988.
William C. Mann & James A. Moore, Computer Generation of Multiparagraph English Text. American Journal of Computational Linguistics, 7, 1, 17–29. January–March 1981.
William C. Mann & Sandra A. Thompson, Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Research Report RR-87-190, USC/Information Sciences Institute. 1987.
David D. McDonald, Natural language production as a process of decision-making under constraint. Ph.D. dissertation, draft version. MIT, Cambridge, Mass. November 1980.
Kathleen R. McKeown, Text Generation, Using discourse strategies and focus constraints to generate natural language text. Cambridge University Press. 1985.
Mustapha Mojahid, Choice of a Cooperative Model for Structured Document Formatting. Convention IA 91, 167–180. Paris. January 1991.
Mustapha Mojahid, Elsa Pascual & Jacques Virbel, Production de documents intelligemment assistée. Convention IA 89: 1ère Conférence Européenne sur les Techniques et les Applications de l'Intelligence Artificielle en milieu industriel et de service, 267–281. Paris. 23–27 January 1989.
Mustapha Mojahid & Jacques Virbel, Towards a Cognitive Approach of Control Strategies for Document Structures Recognition. ICDAR 91: First International Conference on Document Analysis And Recognition. Saint-Malo, France. 30 September–2 October 1991.
Elsa Pascual, Système de description en langage naturel de l'architecture de documents. Convention IA 91: 3ème Conférence Européenne sur les Techniques et les Applications de l'Intelligence Artificielle en milieu industriel et de service, 181–200. Paris. 14–17 January 1991.
Elsa Pascual, Représentation de l'architecture textuelle et génération de texte. Thèse de l'université Paul Sabatier. Toulouse, France. 16 September 1991.
Elsa Pascual & Jacques Virbel, Le problème de la génération de textes architecturés. RFIA 89: 7ème Congrès Reconnaissance des Formes et Intelligence Artificielle, 1181–1188. Paris. 27–31 November 1989.
Elsa Pascual & Jacques Virbel, The problems of natural linguistic transcription of formalized reasoning. ICCS 91: International Colloquium on Cognitive Science. Donostia-San Sebastian, Spain. 7–11 May 1991.
Elsa Pascual & Jacques Virbel, Connaissances linguistiques et morphodispositionnelles pour le contrôle de la décomposition structurelle des documents. CNED 92: Colloque National sur l'Ecrit et le Document. Nancy, France. 6–7 July 1992. Bigre n° 80, A. Belaid (Ed), 217–224.
Elsa Pascual & Jacques Virbel, On the Unsuspected Importance of Material Shaping in the Processes of Text Production and Understanding. ICCS 93: International Colloquium on Cognitive Science. Donostia-San Sebastian, Spain. 4–8 May 1993.
John R. Searle, A Taxonomy of Illocutionary Acts. Language, Mind and Knowledge. K. Gunderson Ed. University of Minnesota Press. 344–369, 1975.
G. Sperberg-McQueen & Lou Burnard Eds, Guidelines for Electronic Text Encoding and Interchange. ACH-ACL-ALLC. May 1994.
Peter F. Strawson, Logico-linguistic papers. Methuen. London. 1971.
Jacques Virbel, The contribution of linguistic knowledge to the Interpretation of Text Structures. Structured Documents. J. André, V. Quint & R. Furuta Eds. Cambridge University Press. 161–181. 1989.
Jacques Virbel, Formalisation d'une classe de relations structurelles de textes. CNED 92: Colloque National sur l'Ecrit et le Document. Nancy, France. 6–7 July 1992. Bigre n° 80, A. Belaid (Ed), 192–199.
Copyright information
© 1996 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Pascual, E. (1996). Integrating text formatting and text generation. In: Adorni, G., Zock, M. (eds) Trends in Natural Language Generation: An Artificial Intelligence Perspective. EWNLG 1993. Lecture Notes in Computer Science, vol 1036. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-60800-1_31
DOI: https://doi.org/10.1007/3-540-60800-1_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-60800-4
Online ISBN: 978-3-540-49457-7
eBook Packages: Springer Book Archive