Abstract
The interest in language evolution by various disciplines, such as linguistics, computer science, biology, etc., makes language evolution models an active research topic and many models have been defined in the last decade. In this work, an overview of computational methods and grammars in language evolution models is given. It aims to introduce readers to the main concepts and the current approaches in language evolution research. Some of the language evolution models, developed during the decade 2003–2012, have been described and classified considering both the grammatical representation (context-free, attribute, Christiansen, fluid construction, or universal grammar) and the computational methods (agent-based, evolutionary computation-based or game theoretic). Finally, an analysis of the surveyed models has been carried out to evaluate their possible extension towards multimodal language evolution.
Notes
Human written or spoken languages as opposed to artificially constructed languages.
References
Abrams DM, Strogatz SH (2003) Modeling the dynamics of language death. Nature 424:900
Baldridge J, Kruijff GJM (2003) Multimodal combinatory categorial grammar. In: Proceedings of the 10th conference of the European Chapter of the Association for Computational Linguistics, 12–17 Apr 2003, Budapest, Hungary, pp 211–218
Baronchelli A, Felici M, Loreto V, Caglioti E, Steels L (2006) Sharp transition towards shared vocabularies in multiagent systems. J Stat Mech 6:6–14
Baxter GJ, Blythe RA, Croft W, McKane AJ (2009) Modeling language change: an evaluation of Trudgill’s theory of the emergence of New Zealand English. Lang Var Change 21:257–296
Bel-Enguix G, Christiansen H, Jiménez-López MD (2011) A grammatical view of language evolution. In: Proceedings of the 1st international workshop on AI methods for interdisciplinary research in language and biology—BILC 2011, 29 Jan 2011, Rome, pp 57–66
Benz A, Ebert C, Jäger G, van Rooij R (2011) Language, games and evolution. LNCS 6207. Springer, Berlin
Bickerton D (2007) Language evolution: a brief guide for linguists. Lingua 117(3):510–526
Boyland JT (1996) Conditional attribute grammars. ACM Trans Program Lang Syst (TOPLAS) 18(1):73–108
Briscoe T (2000) Grammatical acquisition: inductive bias and coevolution of language and the language acquisition device. Language 76(2):245–296
Bungum L, Gambäck B (2010) Evolutionary algorithms in natural language processing. In: Norwegian artificial intelligence symposium (NAIS), 22 Nov 2010, Gjøvik. Tapir Akademisk Forlag
Cangelosi A, Parisi D (1998) The emergence of a “language” in an evolving population of neural networks. Connect Sci 10(2):83–89
Caschera MC, D’Ulizia A, Ferri F, Grifoni P (2012) Towards evolutionary multimodal interaction. In: On the move to meaningful Internet systems: OTM 2012 workshops. Springer, Berlin, pp 608–616
Caschera MC, Ferri F, Grifoni P (2013a) From modal to multimodal ambiguities: A classification approach. J Next Gener Inf Technol (JNIT) 4(5):87–109
Caschera MC, Ferri F, Grifoni P (2013b) InteSe: an integrated model for resolving ambiguities in multimodal sentences. IEEE Trans Syst Man Cybern Syst 43(4):911–931
Chatterjee K, Zufferey D, Nowak MA (2012) Evolutionary game dynamics in populations with different learners. J Theor Biol 301:161–173
Chauhan S (2013) Programming languages: design and constructs. University Science Press, New Delhi
Chomsky N (1957) Syntactic structures. Mouton, The Hague
Chomsky N (1965) Aspects of the theory of syntax. MIT Press, Cambridge
Chomsky N (1980) Rules and representations. Behav Brain Sci 3:1–61
Chomsky N (1986) Knowledge of language. Praeger, New York
Christiansen H (1985) Syntax, semantics, and implementation strategies for programming languages with powerful abstraction mechanisms. In: Proceedings of the eighteenth annual hawaii international conference on system sciences, vol 2: Software, pp 57–66
Christiansen H (1990) A survey of adaptable grammars. ACM SIGPLAN Not 25(11):35–44
Christiansen MH, Chater N (1999) Toward a connectionist model of recursion in human linguistic performance. Cogn Sci 23:157–205
Christiansen MH, Kirby S (2003) Language evolution: consensus and controversies. Trends Cogn Sci 7(7):300–307
Ciortuz L, Pantiru S (2009) Towards a LIGHT implementation of fluid construction grammars. In: COMPUTATIONWORLD ’09, 15–20 Nov 2009, Athens, pp 511–514
Darwin C (1871) The descent of man. Murray, London
de Boer B (2006) Computer modelling as a tool for understanding language evolution. In: Evolutionary epistemology, language and culture—a non-adaptationist, systems theoretical approach. Springer, Dordrecht, pp 381–406
de la Cruz M, de la Puente AO, Alfonseca M (2005) Attribute grammar evolution. In: Artificial intelligence and knowledge engineering applications: a bioinspired approach. First international work-conference on the interplay between natural and artificial computation, IWINAC 2005, June 2005, Las Palmas, Canary Islands, Spain, pp 182–191
De Pauw G (2003a) Evolutionary computing as a tool for grammar development. In: Proceedings of GECCO 2003, Chicago, IL, USA, July 12–16 2003, LNCS 2723, Berlin, Heidelberg, pp 549–560
De Pauw G (2003b) GRAEL: an agent-based evolutionary computing approach for natural language grammar development. In: Proceedings of the 18th international joint conference on artificial intelligence, 09–15 Aug 2003, Acapulco, Mexico, pp 823–828
Del Rosa Garcia E (2012) Evolutionary automatic modelling: a general methodology for scientific modeling. PhD thesis, Universidad Autonoma de Madrid
D’Ulizia A, Ferri F (2006) Formalization of multimodal languages in pervasive computing paradigm. In: Advanced Internet based systems and applications, Second international conference on signal-image technology and internet-based systems (SITIS 2006), 17–21 Dec 2006, Hammamet, Tunisia. Revised selected papers, Springer, Lecture Notes in Computer Science 4879, pp 126–136
D’Ulizia A, Ferri F, Grifoni P (2007) A hybrid grammar-based approach to multimodal languages specification. In: Proceedings OTM 2007 workshops, 25–30 Nov 2007, Vilamoura, Portugal, Springer, Lecture Notes in Computer Science 4805, pp 367–376
D’Ulizia A, Ferri F, Grifoni P (2008) Toward the development of an integrative framework for multimodal dialogue processing. In: On the move to meaningful internet systems: OTM 2008 workshops. Springer, Berlin, pp 509–518
D’Ulizia A, Ferri F, Grifoni P (2010) Generating multimodal grammars for multimodal dialogue processing. IEEE Trans Syst Man Cybern Part A Syst Hum 40(6):1130–1145
D’Ulizia A, Ferri F, Grifoni P (2011a) A survey of grammatical inference methods for natural language learning. Artif Intell Rev 36(1):1–27
D’Ulizia A, Ferri F, Grifoni P (2011b) A learning algorithm for multimodal grammar inference. IEEE Trans Syst Man Cybern Part B Cybern 41(6):1495–1510
Ferri F, D’Ulizia A, Grifoni P (2012) Multimodal language specification for human adaptive mechatronics. J Next Gener Inf Technol 3(1):47–57
Fitch WT (2005) The evolution of language: a comparative review. Biol Philos 20(2–3):193–203
Goldberg DE (1998) Genetic algorithms in search, optimization, and machine learning. Addison Wesley, Reading
Gong T, Minett JW, Wang WS (2006) Computational simulation on the coevolution of compositionality and regularity. In: Proceedings of the 6th international conference on the evolution of language, 12–15 Apr 2006, Rome, Italy, pp 99–106
Harrison MA (1978) Introduction to formal language theory. Addison-Wesley, Reading
Hemberg E (2010) An exploration of grammars in grammatical evolution. PhD thesis, University College Dublin
Hemberg E, O’Neill M, Brabazon A (2008) Grammatical bias and building blocks in meta-grammar grammatical evolution. In: Wang J (ed) IEEE World Congress on Computational Intelligence, 1–6 June 2008, Hong Kong, pp 3776–3783
Hopcroft JE (1979) Introduction to automata theory, languages, and computation. Pearson Education, Noida
Jäger G (2004) Learning constraint subhierarchies: the bidirectional gradual learning algorithm. In: Blutner R, Zeevat H (eds) Optimality theory and pragmatics. Palgrave MacMillan, Basingstoke, pp 251–287
Jäger G (2007) Evolutionary game theory and typology: a case study. Language 83(1):74–109
Jäger G (2008) Evolutionary stability conditions for signaling games with costly signals. J Theor Biol 253(1):131–141
Jaeger H, Baronchelli A, Briscoe E, Christiansen MH, Griffiths T, Jäger G, Kirby S, Komarova N, Richerson PJ, Steels L, Triesch J (2009) What can mathematical, computational and robotic models tell us about the origins of syntax? In: Bickerton D, Szathmáry E (eds) Biological foundations and origin of syntax. Strüngmann Forum reports, vol 3. MIT Press, Cambridge, pp 385–410
Jimenez-Lopez MD (2012) A grammar-based multi-agent system for language evolution. In: Highlights on PAAMS. AISC 156:45–52
Johnston M, Bangalore S (2005) Finite-state multimodal integration and understanding. Nat Lang Eng 11(2):159–187
Juergens E, Pizka M (2006) The language evolver lever—tool demonstration. Electron Notes Theor Comput Sci 164(2):55–60
Kandler A, Steele J (2008) Ecological models of language competition. Biol Theor 3:164–173
Kanero J (2014) The gesture theory of language origins: current issues and beyond. In: McCrohon L, Thompson B, Verhoef T, Yamauchi H (eds) The past, present and future of language evolution research. EvoLang9 Organising Committee, Tokyo, pp 1–7
Kaplan F (2005) Simple models of distributed coordination. Connect Sci 17(3–4):249–270
Kay M (1984) Functional unification grammar: a formalism for machine translation. In: Proceedings of the international conference of computational linguistics. Stanford University, Stanford, pp 75–78
Kirby S (2001) Spontaneous evolution of linguistic structure—an iterated learning model of the emergence of regularity and irregularity. IEEE Trans Evol Comput 5(2):102–110
Kirby S, Christiansen M, Chater N (2009) Syntax as an adaptation to the learner. In: Bickerton D, Szathmáry E (eds) Biological foundations and origin of syntax. Strüngmann Forum reports, vol 3. MIT Press, Cambridge
Knuth DE (1968) Semantics of context-free languages. Math Syst Theory 2:127–145
Landsbergen F (2009) Cultural evolutionary modeling of patterns in language change: exercises in evolutionary linguistics. Doctoral dissertation, LOT, Netherlands Graduate School of Linguistics, Utrecht
Levinson SC, Holler J (2014) The origin of human multi-modal communication. Phil Trans R Soc B 369(1651):1–9. http://rstb.royalsocietypublishing.org/content/royptb/369/1651/20130302.full.pdf
Lipowska D (2011) Naming game and computational modelling of language evolution. Comput Methods Sci Technol 17(1–2):41–51
Minett JW, Wang WS (2008) Modeling endangered languages: the effects of bilingualism and social structure. Lingua 118(1):19–45
Mitchener WG (2007) Game dynamics with learning and evolution of universal grammar. Bull Math Biol 69(3):1093–1118
Nettle D (1999) Is the rate of linguistic change constant? Lingua 108:119–136
Niederhut D (2014) Beyond “neuroevidence”. In: McCrohon L, Thompson B, Verhoef T, Yamauchi H (eds) The past, present and future of language evolution research. EvoLang9 Organising Committee, Tokyo, pp 102–109
Nowak MA, Plotkin J, Krakauer D (1999) The evolutionary language game. J Theor Biol 200(2):147–162
Nowak MA, Komarova NL, Niyogi P (2002) Computational and evolutionary aspects of language. Nature 417(6889):611–617
O’Neill M, Ryan C (2003) Grammatical evolution: evolutionary automatic programming in an arbitrary language. Kluwer, Norwell
O’Neill M, Ryan C (2004) Grammatical evolution by grammatical evolution: the evolution of grammar and genetic code. LNCS 3003, pp 138–149
O’Neill M, Brabazon A (2005) mGGA: the meta-grammar genetic algorithm. In: LNCS 3447, Proceedings of the European conference on genetic programming, EuroGP 2005, 30 March–1 Apr 2005, Lausanne, Switzerland, pp 311–320
Ortega A, De La Cruz M, Alfonseca M (2007) Christiansen grammar evolution: grammatical evolution with semantics. IEEE Trans Evol Comput 11(1):77–90
Oviatt SL (1999) Ten myths of multimodal interaction. Commun ACM 42(11):74–81
Parisi D, Antinucci F, Natale F et al (2008) Simulating the expansion of farming and the differentiation of European languages. In: Laks B (ed) Origin and evolution of languages: approaches, models, paradigms. Equinox Publishing, Sheffield, pp 234–258
Patriarca M, Leppänen T (2004) Modeling language competition. Phys A 338(1–2):296–299
Paulmann S, Jessen S, Kotz SA (2009) Investigating the multimodal nature of human communication: insights from ERPs. J Psychophysiol 23(2):63–76
Pereira F, Warren DHD (1980) Definite clause grammars for language analysis—a survey of the formalism and a comparison with augmented transition networks. Artif Intell 13(3):231–278
Regenbogen C, Schneider DA, Gur RE, Schneider F, Habel U, Kellermann T (2012) Multimodal human communication—targeting facial expressions, speech content and prosody. NeuroImage 60(4):2346–2356
Reitter D, Panttaja EM, Cummins F (2004) UI on the fly: generating a multimodal user interface. In: Proceedings of human language technology conference—North American Chapter of the Association for Computational Linguistics (HLT-NAACL-2004), Boston, MA, USA
Saveluc V, Ciortuz L (2010) FCGlight: a system for studying the evolution of natural language. In: 12th international symposium on symbolic and numeric algorithms for scientific computing SYNASC 2010, 23–26 Sept 2010, Timisoara, Romania. IEEE, pp 188–193
Shimazu H, Takashima Y (1995) Multimodal definite clause grammar. Syst Comput Jpn 26(3):93–102
Shutt JN (1998) Recursive adaptable grammars. Doctoral dissertation, Worcester Polytechnic Institute
Singh YN (2005) Computational modelling of evolution of language. Retrieved from http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.7997&rep=rep1&type=pdf on 10 Feb 2015
Smith K, Kirby S, Brighton H (2003) Iterated learning: a framework for the emergence of language. Artif Life 9(4):371–386
Spranger M, Steels L (2012) Synthetic modeling of cultural language evolution. Five approaches to language evolution. Evolang Organization Committee, Tokyo, pp 130–139
Steels L (1995) A self-organizing spatial vocabulary. Artif Life J 2(3):319–332
Steels L (1997) The synthetic modelling of language origins. Evol Commun 1:1–34
Steels L (2010) Modeling the formation of language in embodied agents: methods and open challenges. In: Nolfi S, Mirolli M (eds) Evolution of communication and language in embodied agents. Springer, Berlin, pp 223–233
Steels L (2011a) The cultural modeling of language evolution. Phys Life Rev 8(4):330–356
Steels L (2011b) Introducing fluid construction grammar. In: Steels L (ed) Design patterns in fluid construction grammar. John Benjamins, Amsterdam, pp 3–30
Steels L, De Beule J (2006) A (very) brief introduction to fluid construction grammar. In: Proceedings of the 3rd workshop on scalable natural language understanding, June 2006, New York City, pp 73–80
Van Trijp R (2008) The emergence of semantic roles in fluid construction grammar. In: Proceedings of the 7th international conference EVOLANG 7. World Scientific Publishing, Singapore, pp 346–353
Vigliocco G, Perniss P, Vinson D (2014) Language as a multimodal phenomenon: implications for language learning, processing and evolution. Philos Trans R Soc B 369(1651):1–7. http://rstb.royalsocietypublishing.org/content/royptb/369/1651/20130292.full.pdf
Vogt P (2006) Language evolution and robotics: Issues in symbol grounding and language acquisition. In: Loula A, Gudwin R, Queiroz J (eds) Artificial cognition systems. Idea Group, Hershey, pp 176–209
Vogt P (2009) Modeling interactions between language evolution and demography. Hum Biol 81(2):237–258
Von Neumann J, Morgenstern O (1944) Theory of games and economic behavior. Princeton University Press, Princeton
Waller B, Liebal K, Burrows A, Slocombe K (2013) How can a multimodal approach to primate communication help us understand the evolution of communication? Evol Psychol 11(3):538–549
Wang WS, Liao CC, Gaskins R, Wang MS (1978) QUINCE system: state-of-the-art review. California University, Berkeley, Berkeley
Watumull J, Hauser MD (2014) Conceptual and empirical problems with game theoretic approaches to language evolution. Front Psychol 5:226
Wellens P, Loetzsch M, Steels L (2008) Flexible word meaning in embodied agents. Connect Sci 20(2):173–191
Zuidema W (2002) Language adaptation helps language acquisition—a computational model study. In: Hallam EB, Floreano D, Hallam J, Hayes G, Meyer J (eds) Proceedings of the seventh international conference on simulation of adaptive behavior. MIT Press, Cambridge
Appendix
The basic concepts and definitions provided in this appendix are drawn from textbooks on formal language theory, including Harrison (1978) and Hopcroft (1979), and from the work of Nowak et al. (2002).
The fundamental elements of formal language theory are alphabets, sentences, languages, and grammars (Harrison 1978).
An alphabet is a finite set of symbols (Nowak et al. 2002). The formal definition is as follows.
Definition 1
An alphabet is a set containing finitely many symbols. Conventionally, this alphabet is referred to as \(\Sigma \).
For instance, in natural languages (see Notes) the set of all phonemes or graphemes or the set of all words are possible alphabets. Another example of an alphabet is the binary alphabet, consisting of the symbols 0 and 1.
A sentence is a string of symbols in the alphabet (Nowak et al. 2002). Formally, it is defined in the following way:
Definition 2
A sentence is a sequence of finite length that can be constructed from an alphabet \(\Sigma \). The set of all sentences over \(\Sigma \) is denoted by \(\Sigma ^*\).
As observed by Chomsky (1957), there are infinitely many sentences in natural languages, because the compositionality principle allows arbitrarily long sentences to be constructed from shorter pieces. Two examples of sentences in English are “the cat mews” and “John works at CNR”.
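As a minimal sketch of Definitions 1 and 2, the snippet below lists the sentences over the binary alphabet up to a chosen length. The function name `sentences_up_to` and the length bound are illustrative assumptions, not part of the surveyed work. Because \(\Sigma ^*\) is infinite, only a finite slice of it can be enumerated.

```python
from itertools import product

def sentences_up_to(alphabet, max_len):
    """Return all strings over `alphabet` of length 0..max_len (a finite slice of Sigma*)."""
    result = []
    for n in range(max_len + 1):
        # product(..., repeat=n) yields every sequence of n symbols
        for symbols in product(sorted(alphabet), repeat=n):
            result.append("".join(symbols))
    return result

binary = {"0", "1"}
print(sentences_up_to(binary, 2))
# ['', '0', '1', '00', '01', '10', '11']
```

For an alphabet of k symbols the number of sentences of length n is k to the power n, so the list grows without bound as the length limit increases.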
A language is defined as a set of sentences over the alphabet (Nowak et al. 2002). The formal definition is as follows:
Definition 3
A language L is a subset of \(\Sigma ^*\).
Not all sentences are admissible in a language but only sentences that follow the specific rules of that language. For instance, the sentence “the cat mews” is valid (or meaningful) for the English language, while the sentence “mews cat the” is not.
A grammar is a set of rules that allows the valid sentences of the language to be established (Nowak et al. 2002). A grammar is formally defined by Chomsky (1957) as follows:
Definition 4
A grammar is a tuple (\(\hbox {N}, \Sigma , \hbox {P, S}\)) where:
- N is a finite set of non-terminal symbols;
- \(\Sigma \) is a finite set of terminal symbols (disjoint from N);
- P is a finite set of productions of the form \(\upalpha \rightarrow \upbeta \), where \(\upalpha \) and \(\upbeta \) are strings over \(\hbox {N} \cup \Sigma \) and \(\upalpha \) contains at least one non-terminal in N;
- S is a member of N called the start symbol.
Each production is a rule that may contain elements of the alphabet (terminal symbols) and other elements that act as variables (non-terminal symbols). A rule replaces the string on its left-hand side with the string on its right-hand side. A valid sentence of the language is produced by starting from the non-terminal designated as the start symbol and repeatedly replacing substrings according to the rules of the grammar until only terminal symbols remain. An example of a grammar producing the sentence “the cat mews” is shown in Fig. 5.
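The rewriting process can be sketched in a few lines of Python. The grammar of Fig. 5 is not reproduced here, so the rules below are an assumed toy grammar chosen only because it yields “the cat mews”. The names `leftmost_derivation`, `P` and `START` are likewise illustrative.

```python
# Hypothetical toy grammar (N, Sigma, P, S); not necessarily the one in Fig. 5.
N = {"S", "NP", "VP", "Det", "Noun", "Verb"}
SIGMA = {"the", "cat", "mews"}
P = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"]],
    "VP": [["Verb"]],
    "Det": [["the"]],
    "Noun": [["cat"]],
    "Verb": [["mews"]],
}
START = "S"

def leftmost_derivation(start, productions, nonterminals):
    """Rewrite the leftmost non-terminal with its first rule until only terminals remain."""
    form = [start]
    steps = [list(form)]
    while any(sym in nonterminals for sym in form):
        # position of the leftmost non-terminal in the current sentential form
        i = next(k for k, sym in enumerate(form) if sym in nonterminals)
        form = form[:i] + productions[form[i]][0] + form[i + 1:]
        steps.append(list(form))
    return steps

steps = leftmost_derivation(START, P, N)
for step in steps:
    print(" ".join(step))
# The last line printed is: the cat mews
```

Always choosing the first rule terminates only because this toy grammar has no recursive rules. A grammar with recursion would need a strategy for choosing among alternatives.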
The hierarchy proposed by Chomsky (1965) classifies grammars as regular, context-free, context-sensitive, and unrestricted, according to their expressive power. The expressive power (or expressiveness) of a class of grammars is the set of languages that grammars of that class can generate. Regular grammars are the least expressive, while unrestricted grammars are the most expressive.
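The four levels can be told apart by the shape of the productions. The sketch below is a simplified classifier under stated assumptions: regular grammars are taken in right-linear form, context-sensitive grammars are approximated by the non-contracting condition (the left-hand side is no longer than the right-hand side), and the special case of an empty-string rule for the start symbol is ignored. The function name `chomsky_type` is illustrative.

```python
def chomsky_type(nonterminals, productions):
    """Return the most restrictive Chomsky type (3, 2, 1 or 0) that all productions satisfy.

    Each production is a pair (lhs, rhs) of tuples of symbols.
    """
    def right_linear(lhs, rhs):
        # A -> a  or  A -> a B, with A, B non-terminals and a a terminal
        return (len(lhs) == 1 and lhs[0] in nonterminals
                and 1 <= len(rhs) <= 2
                and rhs[0] not in nonterminals
                and (len(rhs) == 1 or rhs[1] in nonterminals))

    def context_free(lhs, rhs):
        # a single non-terminal on the left-hand side
        return len(lhs) == 1 and lhs[0] in nonterminals

    def noncontracting(lhs, rhs):
        # simplified stand-in for context-sensitivity (assumption)
        return len(lhs) <= len(rhs)

    for level, test in ((3, right_linear), (2, context_free), (1, noncontracting)):
        if all(test(lhs, rhs) for lhs, rhs in productions):
            return level
    return 0

NT = {"S", "A", "B"}
print(chomsky_type(NT, [(("S",), ("a", "S")), (("S",), ("b",))]))           # 3
print(chomsky_type(NT, [(("S",), ("a", "S", "b")), (("S",), ("a", "b"))]))  # 2
print(chomsky_type(NT, [(("a", "A"), ("a", "b"))]))                         # 1
print(chomsky_type(NT, [(("A", "B"), ("a",))]))                             # 0
```

The function classifies a grammar, not a language: a language generated by a type 2 grammar may also have a type 3 grammar that this check never sees.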
The class of Chomsky grammars most widely used for natural language is that of context-free grammars (CFGs), which are defined by rules of the form \(\hbox {A} \rightarrow \upgamma \), where A is a non-terminal symbol and \(\upgamma \) is a string of terminals and non-terminals. The formal definition of a CFG is as follows:
Definition 5
A grammar \(\hbox {G} = (\hbox {N}, \Sigma , \hbox {P, S})\) is said to be context-free if all productions in P have the form \(\hbox {A} \rightarrow \upgamma \), where \(\hbox {A} \in \hbox {N}\) and \(\upgamma \in (\hbox {N} \cup \Sigma )^*\).
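To make Definition 5 concrete, the following sketch checks the context-free condition, reading \(\upgamma \) as a string of terminals and non-terminals, and then enumerates the short sentences of a small CFG. The grammar S → a S b | a b and the helper names `is_context_free` and `generate` are assumptions chosen for illustration.

```python
from collections import deque

N = {"S"}
SIGMA = {"a", "b"}
# Productions as (A, gamma) pairs: S -> a S b | a b  (hypothetical example grammar)
P = [("S", ("a", "S", "b")), ("S", ("a", "b"))]

def is_context_free(nonterminals, terminals, productions):
    """True if every rule is A -> gamma with A in N and gamma a string over N and Sigma."""
    symbols = nonterminals | terminals
    return all(
        lhs in nonterminals and all(sym in symbols for sym in rhs)
        for lhs, rhs in productions
    )

def generate(start, nonterminals, productions, max_len):
    """Breadth-first expansion of the leftmost non-terminal.

    Assumes no rule has an empty right-hand side, so sentential forms never
    shrink and forms longer than max_len can be discarded safely.
    """
    found, seen = set(), {(start,)}
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        idx = next((i for i, sym in enumerate(form) if sym in nonterminals), None)
        if idx is None:  # only terminals left: the form is a sentence
            found.add("".join(form))
            continue
        for lhs, rhs in productions:
            if lhs == form[idx]:
                new_form = form[:idx] + rhs + form[idx + 1:]
                if len(new_form) <= max_len and new_form not in seen:
                    seen.add(new_form)
                    queue.append(new_form)
    return sorted(found, key=len)

print(is_context_free(N, SIGMA, P))  # True
print(generate("S", N, P, 6))        # ['ab', 'aabb', 'aaabbb']
```

The generated strings pair each a with a matching b, a nested dependency that a single non-terminal rewritten independently of its context is enough to express.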
Beyond the classes in Chomsky’s hierarchy, linguists and computer scientists have developed many further formal grammars by extending or modifying those classes, in order to improve expressive power and simplify parsing. Most of these extensions start from the CFG, a simple and intuitively appealing formalism for representing natural language, and extend it to capture context-dependent language features. For instance, attribute grammars (AGs) and Christiansen grammars (CGs) retain a CFG kernel and augment it with a distinct facility that handles context-dependence (Shutt 1998).
Cite this article
Grifoni, P., D’Ulizia, A. & Ferri, F. Computational methods and grammars in language evolution: a survey. Artif Intell Rev 45, 369–403 (2016). https://doi.org/10.1007/s10462-015-9449-3
DOI: https://doi.org/10.1007/s10462-015-9449-3