Ontology in Coq for a Guided Message Composition

Jakubiec-Jamet, Line

doi:10.1007/978-3-319-08043-7_19

Part of the book series: Text, Speech and Language Technology (TLTB, volume 48)

Abstract

Natural language generation is based on messages, which represent meanings, and on goals, which are the usual starting points of communication. How can we help people provide this conceptual input or, in other words, communicate their thoughts to the computer? In order to express something, one needs to have something to express: an idea, a thought or a concept. The question is how to represent it. In 2009, Michael Zock, Paul Sabatier and Line Jakubiec-Jamet suggested building a resource composed of a linguistically motivated ontology, a dictionary and a graph generator. The ontology guides the user in choosing among a set of concepts (or words) from which to build the message; the dictionary provides knowledge of how to link the chosen elements to yield a message (compositional rules); the graph generator displays the output in visual form (a message graph representing the user’s input). While the goal of the ontology is to generate (or analyse) sentences and to guide message composition (what to say), the graph’s function is to show, at an intermediate level, the result of the encoding process. The Illico system already offers a way to help a user generate (or analyse) sentences and to guide their composition. Another system, the Drill Tutor, is an exercise generator whose goal is to help people become fluent in a foreign language. It assists people in producing a sentence that expresses a message, going from an idea (or a concept) to its linguistic realization (a correct sentence in the foreign language); users build their messages by making choices in the interface. These two systems led us to consider representing the conceptual information in a symbolic language; this representation is encoded in a logic system in order to check the conceptual well-formedness of messages automatically. This logic system is Coq, used here only for its high-level language. Coq is based on a typed \( \lambda \)-calculus. It is used for analysing conceptual input interpreted as types and also for specifying general definitions representing messages. These definitions are typed, and they are then instantiated in order to type-check the conceptual well-formedness of messages.
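To make the idea of type-checking a message more concrete, here is a minimal sketch in Python rather than Coq. It is not the encoding used in the chapter: the small ontology, the predicate signature for "eat", and the names `ONTOLOGY`, `SIGNATURES`, `Message`, `is_a` and `well_formed` are all hypothetical, chosen only to illustrate how concepts read as types can constrain which messages count as conceptually well-formed.

```python
from dataclasses import dataclass

# Hypothetical mini-ontology: each concept maps to its parent type (None = root).
ONTOLOGY = {
    "entity": None,
    "animate": "entity",
    "person": "animate",
    "food": "entity",
    "apple": "food",
}

# Hypothetical predicate signatures: the type expected for each argument role.
SIGNATURES = {
    "eat": {"agent": "animate", "object": "food"},
}


def is_a(concept: str, expected: str) -> bool:
    """Return True if `concept` equals `expected` or descends from it."""
    current = concept
    while current is not None:
        if current == expected:
            return True
        current = ONTOLOGY.get(current)  # unknown concepts end the walk
    return False


@dataclass
class Message:
    predicate: str
    arguments: dict  # role name -> concept filling that role


def well_formed(message: Message) -> bool:
    """Check that every role is filled by a concept of the expected type."""
    signature = SIGNATURES.get(message.predicate)
    # Reject unknown predicates and messages with missing or extra roles.
    if signature is None or set(signature) != set(message.arguments):
        return False
    return all(
        is_a(message.arguments[role], expected)
        for role, expected in signature.items()
    )
```

Under these assumptions, `Message("eat", {"agent": "person", "object": "apple"})` is accepted, while swapping the two arguments is rejected because "apple" is not an animate concept. In Coq the same constraint would be enforced statically by the type checker rather than by a run-time test.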

Notes

1.
The problem we are dealing with here is search. Obviously, the knowledge available at the onset (cognitive state) also plays a very important role in this kind of task, regardless of the goal (determining conceptual input, lexical access, etc.). Search strategies and the relative ease of finding a given piece of information (concept, word) depend crucially on the nature of the input (knowledge available) and the distance between the latter and a given output (target word). Imagine that your target word were 'guy' while you started the search from any of the following inputs: 'cat' (synonym), 'person' (more general term), or 'gars' (equivalent word in French). Obviously, the type of search and the ease of access would not be the same. The nature and number of items among which to choose would be different in each case. The influence of formally similar, i.e. close, words ('libreria' in Spanish vs. 'library' in English) is well known. Cognates tend to prime each other, a fact that, depending on the circumstances, can be helpful or a sheer nuisance.
2.
For more details and references concerning Illico and its applications (natural language interfaces to knowledge bases, simultaneous composition of sentences in different languages, linguistic games for language learning, communication aid for disabled people, software for language rehabilitation, etc.) you may want to take a look at http://pageperso.lif.univ-mrs.fr/paul.sabatier/ILLICO/illico.html.
3.
Of course, we can also assume that the author does not even know that. But this is a bit of an extreme case.
4.
For example, it allows the testing of well-formedness and linguistic coverage of the application one is about to develop. This being so, we can check now whether all the produced continuations are expected and none is missing.
5.
This idea is somehow contained in Tesnière's notion of valency (Tesnière 1959), in Schank's conceptual dependency (Schank 1975) and in McCoy and Cheng's discourse focus trees (McCoy and Cheng 1991).
6.
The upper part shows the conceptual building blocks structured as a tree and the lower part contains the result of the choices made so far, that is, the message built up to this point. To simplify matters we have ignored the attitude or speech-act node in the lower part of our figure.
7.
Suppose you were looking for the word mocha (target word: \( t_w \)), yet the only token coming to your mind were computer (source word: \( s_w \)). Taking the latter as the starting point, the system would show all the connected words, for example, Java, Perl, Prolog (programming languages), mouse, printer (hardware), Mac, PC (types of machines), etc., asking the user to decide on the direction of search by choosing one of these words. After all, s/he knows best which of them comes closest to the \( t_w \). Having started from the \( s_w \) 'computer', and knowing that the \( t_w \) is neither some kind of software nor a type of computer, s/he would probably choose Java, which is not only a programming language but also an island. Taking the latter as the new starting point, s/he might choose coffee (since s/he is looking for some kind of beverage, possibly made from an ingredient produced in Java, coffee), and finally mocha, a type of beverage made from these beans. Of course, the word Java might just as well trigger Kawa, which not only rhymes with the \( s_w \) but also evokes Kawa Igen, a Javanese volcano, or a colloquial word for coffee in French. For more details, see Zock and Schwab (2008).
8.
Of course, conceptual well-formedness, i.e. meaningfulness, does not guarantee communicative adequacy. In other words, it does not ensure that the message makes sense in the context of a conversation. To achieve this goal, additional mechanisms are needed.
9.
I gratefully acknowledge Michael for many fruitful discussions about this approach. He has always been very attentive to others' work, and our collaboration is due to him.
10.
For a similar goal, but with a quite different method, see Boitet et al. (2007).
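
The interactive search described in note 7 can be sketched as a walk through an association network in which the user picks the next word at each step. The following Python sketch is only an illustration under stated assumptions: the network `ASSOCIATIONS` is a hypothetical fragment loosely based on the words in that note (the entries "island" and "espresso" are invented for the example), and `navigate` and `scripted_user` are hypothetical names, not part of any system mentioned here.

```python
# Hypothetical association network, loosely based on the example in note 7.
ASSOCIATIONS = {
    "computer": ["Java", "Perl", "Prolog", "mouse", "printer", "Mac", "PC"],
    "Java": ["island", "coffee", "Kawa"],
    "coffee": ["mocha", "espresso"],
}


def navigate(source, target, choose, max_steps=10):
    """Follow user choices from `source` until `target` is reached.

    `choose` stands in for the user: it receives the candidate words and
    returns one of them. Returns the path taken, or None on failure.
    """
    path = [source]
    current = source
    for _ in range(max_steps):
        if current == target:
            return path
        candidates = ASSOCIATIONS.get(current, [])
        if not candidates:
            return None  # dead end: no connected words to offer
        current = choose(candidates)
        path.append(current)
    return path if current == target else None


def scripted_user(preferences):
    """Simulate a user who picks the first preferred word on offer."""
    def choose(candidates):
        for word in preferences:
            if word in candidates:
                return word
        return candidates[0]
    return choose
```

With a simulated user who prefers "Java", then "coffee", then "mocha", the call `navigate("computer", "mocha", scripted_user(["Java", "coffee", "mocha"]))` follows the path from the note, going from computer to Java to coffee to mocha.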

References

Baker, C. F., Fillmore, C. J., & Lowe, J. B. (1998). The Berkeley Framenet project. In COLING/ACL-98 (pp. 86–90). Montreal.
Bateman, J., & Zock, M. (2003). Natural language generation. In R. Mitkov (Ed.), Oxford handbook of computational linguistics, Chap. 15 (pp. 284–304). Oxford: Oxford University Press.
Boitet, C., Bhattacharyya, P., Blanc, E., Meena, S., Boudhh, S., Fafiotte, G., Falaise, A., & Vacchani, V. (2007). Building Hindi-French-English-UNL resources for SurviTra-CIFLI, a linguistic survival system under construction. In Seventh international symposium on natural language processing.
Briffault, X., & Zock, M. (1994). What do we mean when we say to the left or to the right? How to learn about space by building and exploring a microworld? In 6th International Conference on Artificial Intelligence: Methodology, Systems, Applications (pp. 363–371). Sofia.
Fellbaum, C. (Ed.). (1998). WordNet: An electronic lexical database and some of its applications. Cambridge: MIT Press.
Fromkin, V. (1993). Speech production. In J. Berko-Gleason & N. Bernstein Ratner (Eds.), Psycholinguistics. Austin, TX: Harcourt, Brace and Jovanovich.
Jakubiec-Jamet, L. (2012). A case-study for the semantic analysis of sentences in Coq. Research report, LIF.
Levelt, W. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Ligozat, G., & Zock, M. (1992). How to visualize time, tense and aspect. In Proceedings of COLING '92 (pp. 475–482). Nantes.
McCoy, K., & Cheng, J. (1991). Focus of attention: Constraining what can be said next. In C. Paris, W. Swartout & W. Mann (Eds.), Natural language generation in artificial intelligence and computational linguistics (pp. 103–124). Boston: Kluwer Academic Publisher.
Meteer, M. W. (1992). Expressibility and the problem of efficient text planning. London: Pinter.
Reiter, E., & Dale, R. (2000). Building natural language generation systems. Cambridge: Cambridge University Press.
Sabatier, P. (1997). Un lexique-grammaire du football. Lingvisticae Investigationes, XXI(1), 163–197.
Schank, R. (1975). Conceptual dependency theory. In R. C. Schank (Ed.), Conceptual information processing (pp. 22–82). Amsterdam and New York: North-Holland and Elsevier.
Tesnière, L. (1959). Éléments de syntaxe structurale. Paris: Klincksieck.
Zock, M. (1991). Swim or sink: The problem of communicating thought. In M. Swartz & M. Yazdani (Eds.), Intelligent tutoring systems for foreign language learning (pp. 235–247). New York: Springer.
Zock, M. (1996). The power of words in message planning. In International Conference on Computational Linguistics, Copenhagen.
Zock, M., & Afantenos, S. (2009). Using e-learning to achieve fluency in foreign languages. In N. Tsapatsoulis & A. Tzanavari (Eds.), Affective, interactive and cognitive methods for e-learning design: Creating an optimal education experience. Hershey: IGI Global.
Zock, M., & Lapalme, G. (2010). A generic tool for creating and using multilingual phrasebooks. In Natural Language Processing and Cognitive Science, Funchal.
Zock, M., & Schwab, D. (2008). Lexical access based on underspecified input. In Proceedings of the Workshop on Cognitive Aspects of the Lexicon (COGALEX 2008) (pp. 9–17). Manchester, UK, August 2008 (Coling 2008).

Author information

Authors and Affiliations

Laboratoire d’Informatique Fondamentale de Marseille, Aix-Marseille Université, CNRS, LIF UMR 7279, 163, Avenue de Luminy - Case 901, 13288, Marseille Cedex 9, France
Line Jakubiec-Jamet

Corresponding author

Correspondence to Line Jakubiec-Jamet.

Editor information

Editors and Affiliations

CNRS-LIF, UMR 7279, Aix-Marseille University, Marseille, France
Núria Gala
CNRS-LIF, UMR 7279, Aix-Marseille University and University of Mainz, Marseille, France
Reinhard Rapp
CNRS-LIF, UMR 7279, Aix-Marseille University, Marseille, France
Gemma Bel-Enguix

About this chapter

Cite this chapter

Jakubiec-Jamet, L. (2015). Ontology in Coq for a Guided Message Composition. In: Gala, N., Rapp, R., Bel-Enguix, G. (eds) Language Production, Cognition, and the Lexicon. Text, Speech and Language Technology, vol 48. Springer, Cham. https://doi.org/10.1007/978-3-319-08043-7_19

DOI: https://doi.org/10.1007/978-3-319-08043-7_19
Published: 12 November 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-08042-0
Online ISBN: 978-3-319-08043-7
eBook Packages: Computer Science, Computer Science (R0)