Quantum Structure in Cognition: Human Language as a Boson Gas of Entangled Words

We model a piece of text of human language telling a story by means of the quantum structure describing a Bose gas in a state close to a Bose-Einstein condensate near absolute zero temperature. For this we introduce energy levels for the words (concepts) used in the story and we also introduce the new notion of 'cogniton' as the quantum of human thought. Words (concepts) are then cognitons in different energy states, as is the case for photons in different energy states, or states of different radiative frequency, when the considered boson gas is that of the quanta of the electromagnetic field. We show that Bose-Einstein statistics delivers a very good model for such texts telling stories, both for short stories and for long stories of the size of novels. We analyze an unexpected connection with Zipf's law in human language, the Zipf ranking relating to the energy levels of the words, and the Bose-Einstein graph coinciding with the Zipf graph. We investigate the issue of 'identity and indistinguishability' from this new perspective and conjecture that the way one can easily understand how two of 'the same concepts' are 'absolutely identical and indistinguishable' in human language is also the way in which quantum particles are absolutely identical and indistinguishable in physical reality, providing in this way new evidence for our conceptuality interpretation of quantum theory.


Introduction
Human language is a substance consisting of combinations of concepts giving rise to meaning. We will show that a good model for this substance is that of a gas of entangled bosonic quantum particles such as they appear in physics in the situation close to a Bose-Einstein condensate. In this respect we also introduce the new notion of 'cogniton' as the entity playing the same role within human language as the 'bosonic quantum particle' plays for the 'quantum gas'. There is a gas of bosonic quantum particles that we all know very well, and that is the electromagnetic field, which we will also briefly call 'light', which is a substance of photons. Often we will use 'light' as an example and inspiration of how we will talk and reason about human language, where 'concepts' (words), as 'states of the cogniton', are then like 'photons of different energies (frequencies, wavelengths)'. With the new findings we present here, we also make an essential and new step forward in the elaboration of our 'conceptuality interpretation of quantum theory', where quantum particles are the concepts of a proto-language, in a similar way that human concepts (words) are the quantum particles (cognitons) of human language (Aerts, 2009a, 2010a,b, 2013, 2014; Aerts et al., 2018d, 2019c).
There are several new results and insights that we will put forward in the coming sections. We summarize them here, referring also to earlier work on which they are built, while keeping the article self-contained, so that it is not necessary to have studied these earlier works to understand its content. The reason we can present here a self-contained theory of human language is that most of our earlier results take a simple and transparent form in the model of a boson gas that we elaborate here for human language. Since we also introduce the basics of the physics of a boson gas, our presentation will remain self-contained from the perspective of physics as well. In the article, we will use the terms 'words' and 'concepts' interchangeably because their difference does not play a role in the aspects of language we study.
We will see that the state of the gas of bosonic quantum particles which we explicitly identify with the state of a piece of text, such as that of a story, is one of very low temperature, i.e. a temperature in the neighborhood of where the fifth state of matter appears, namely the Bose-Einstein condensate. This means that the interactions between 'words', which are the boson particles of language in our description, are mainly of the type of 'quantum superposition' and 'quantum entanglement', or more precisely of 'overlapping de Broglie wave functions'. This corresponds well with some of our earlier findings, when studying the combinations of concepts in human language, namely that superposition and entanglement are abundant, and that the type of entanglement is deep, namely it violates, in addition to Bell's inequality, also the marginal laws (Aerts, 2009b; Aerts, Broekaert & Gabora, 2011; Aerts & Sozzo, 2011; Aerts, Sozzo & Veloz, 2015a; Aerts et al., 2012, 2018a,b,c, 2019a; Aerts Arguëlles, 2018; Beltran & Geriente, 2019).
When we present our model in the next sections, we will see that it contains several new explanations of aspects of human language which we brought up in earlier work. For example, we elaborated an axiomatic quantum model for human concepts, which we called SCoP (state context property system), and in which different exemplars of a specific concept are considered as different states of this concept (Gabora & Aerts, 2002; Aerts & Gabora, 2005a,b; Aerts, 2009b). In the theory of the boson gas for human language that we develop here, we will not only introduce these states explicitly, but also introduce them as eigenstates for specific values of the energy, and a detailed energy scale for all the words appearing in a considered piece of text will be introduced. If we compare this with the quantum description of light, it means that the cognitons of our piece of text of human language will radiate their meaning with different frequencies to the human mind engaging in the meaning of this piece of text. Let us consider an example of a text, namely the Winnie the Pooh story entitled 'In Which Piglet Meets a Haffalump' (Milne, 1926), to make this introduction of 'energy' in our theory of language more concrete. We define the 'energy level' of a word (concept, cogniton) in the story by looking at the number of times this word appears in that story. The most often appearing word, namely 133 times, is the concept And (we will denote concepts or words, when they are looked upon as states of a cogniton, in italics and with a capital letter, as we have denoted concepts in our earlier works) and we attribute to it (for reasons that will become clearer later) the lowest energy level $E_0$. The second most often appearing word, 111 times, is the concept He, and we attribute to it the second lowest energy level $E_1$, and so on, till we reach words such as Able, which only appears once.
In other words, if we think of a story as a 'gas of bosonic particles' in 'thermal equilibrium with its environment', these 'numbers of appearances in the story' indicate different energy levels of the particles of the gas, following the 'energy distribution law governing the gas', and this is our inspiration for the introduction of 'energy' in human language. Remember indeed that each of these words (concepts) is a 'state of the cogniton', exactly like different energy levels of photons (different wavelengths of light) are each 'states of the photon'. Proceeding in this way we arrive at 542 energy levels for the story 'In Which Piglet Meets a Haffalump', the values of which are taken to be
$$E_i = i, \quad i = 0, 1, \ldots, n \tag{1}$$
We denote by $N(E_i)$ the 'number of appearances' of the word (concept, cogniton) with energy level $E_i$, and if we denote by $n$ the total number of energy levels, we have that
$$N = \sum_{i=0}^{n} N(E_i)$$
is the total number of words (concepts, cognitons) of the considered piece of text, which is 2655 for the story 'In Which Piglet Meets a Haffalump'. For each of the energy levels $E_i$, $N(E_i)E_i$ is the amount of energy 'radiated' by the story 'In Which Piglet Meets a Haffalump' with the 'frequency or wavelength' connected to this energy level. For example, the energy level $E_{54} = 54$ is populated by the concept Thought, and the word Thought appears $N(E_{54}) = 10$ times in the story. Each of the 10 appearances of Thought radiates with energy value 54, which means that the total radiation of the story with the wavelength connected to Thought equals $N(E_{54})E_{54} = 10 \cdot 54 = 540$.
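The assignment of energy levels described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure actually used for Table 1: the tokenizer (lowercasing, keeping letters and apostrophes), the function name `energy_spectrum` and the sample sentence are assumptions made here for the example. Ties are broken alphabetically, in line with the convention adopted later for equally frequent words.

```python
from collections import Counter
import re

def energy_spectrum(text):
    """Rank words by frequency and give the word of rank i the energy E_i = i."""
    # Hypothetical tokenizer: lowercase, keep letters and apostrophes only.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Most frequent word first; equally frequent words in alphabetical order.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Each row: (E_i, word, N(E_i), radiated energy N(E_i) * E_i).
    return [(i, w, n, n * i) for i, (w, n) in enumerate(ranked)]

# Illustrative sample text (not taken from the story).
sample = "and he said and he went and it was so"
spectrum = energy_spectrum(sample)
total_words = sum(n for _, _, n, _ in spectrum)   # N, the sum of all N(E_i)
total_energy = sum(e for _, _, _, e in spectrum)  # E, the sum of all N(E_i) * E_i

for level, word, n, energy in spectrum:
    print(level, word, n, energy)
print("N =", total_words, "E =", total_energy)
```

Applied to a full text, the two totals computed at the end are the quantities that the two constraints on the distribution functions are matched against.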
The total energy $E$ radiated by the considered piece of text is therefore
$$E = \sum_{i=0}^{n} N(E_i) E_i$$
For the story 'In Which Piglet Meets a Haffalump' we have $E = 242891$. Let us now present some of the other findings that we will describe in more detail in the following sections. When we applied the Bose-Einstein distribution
$$N(E_i) = \frac{1}{A e^{E_i/B} - 1}$$
to model the data we collected on the story 'In Which Piglet Meets a Haffalump', determining the parameters $A$ and $B$ by the two requirements
$$\sum_{i=0}^{n} \frac{1}{A e^{E_i/B} - 1} = N, \qquad \sum_{i=0}^{n} \frac{E_i}{A e^{E_i/B} - 1} = E$$
we found an almost complete fit with the data (see Section 2, Table 1, Figure 1 (a), Figure 1 (b) and Figure 2). We tested numerous other texts, short stories (see Section 3, Table 4, Figure 3, Figure 4) and long stories of the size of novels (see Section 4, Figure 7 (b)), and each time a modeling by means of a Bose-Einstein statistical energy distribution, as explained above, gave rise to an almost complete fit with the data.
We started this investigation with the idea that 'concepts within human language behave like bosonic entities', an idea we expressed earlier as one of the basic pieces of evidence for the 'conceptuality interpretation' (Aerts, 2009a). The origin of the idea is the simple direct understanding that if one considers, for example, the concept combination Eleven Animals, then, on the level of the 'conceptual realm', each one of the eleven animals is completely 'identical with' and 'indistinguishable from' each other of the eleven animals. It is also a simple direct understanding that in the case of 'eleven physical animals', there will always be differences between each one of the eleven animals, because as 'objects' present in the physical world, they have an individuality, and as individuals, with spatially localized physical bodies, none of them will be really identical with the other ones, which means that each one of them can always be distinguished from the others.
Even if all the animals are horses, simply because they are 'objects' and not 'concepts', they will not be completely identical and hence they will be distinguishable. The idea is that it is 'this not being completely identical and hence being distinguishable' which makes Maxwell-Boltzmann statistics applicable to them. However, when we consider 'eleven animals' as concepts, such that their ontological nature is conceptual, they are all 'completely identical and hence intrinsically indistinguishable'. Within the conceptuality interpretation of quantum theory, where we put forward the hypothesis that quantum entities are 'conceptual' and hence are not 'objects', their 'being completely identical and hence intrinsically indistinguishable' would also be due to their being conceptual instead of objectual entities.
In earlier work we already investigated this idea by looking at simple combinations of concepts with numerals, such as Eleven Animals, and then considering two states of Animal, namely Cat and Dog. We then checked whether the twelve different exemplars that these two states give rise to, namely Eleven Dogs, One Cat And Ten Dogs, Two Cats And Nine Dogs, . . . , Ten Cats And One Dog, Eleven Cats, follow in their appearance in texts a Maxwell-Boltzmann or rather a Bose-Einstein statistical pattern. It was shown, in a less convincing way because of the limited amount of data collected in Aerts (2009a) and Aerts, Sozzo & Veloz (2015b), but very convincingly and with an abundance of data in Beltran (2019), that the Bose-Einstein statistics indeed delivers a better model for the data than the Maxwell-Boltzmann statistics.
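The difference between the two statistical patterns for Eleven Animals can be illustrated with a small Python sketch. It rests on a simplifying assumption that is ours and not necessarily that of the cited studies: the two states Cat and Dog are taken to be a priori equally likely. Under that assumption, distinguishable animals yield a binomial distribution over the twelve exemplars, whereas indistinguishable ones make all twelve exemplars equally probable.

```python
from math import comb

def maxwell_boltzmann(n):
    """Distinguishable items, each independently Cat or Dog with probability 1/2.

    Returns P(k cats and n - k dogs) for k = 0..n (a binomial distribution).
    """
    return [comb(n, k) / 2 ** n for k in range(n + 1)]

def bose_einstein(n):
    """Indistinguishable items: each of the n + 1 occupation patterns
    (k cats, n - k dogs) counts as one state, so all are equally likely."""
    return [1 / (n + 1)] * (n + 1)

mb = maxwell_boltzmann(11)
be = bose_einstein(11)
for k, (p_mb, p_be) in enumerate(zip(mb, be)):
    print(f"{k:2d} cats, {11 - k:2d} dogs: MB {p_mb:.4f}  BE {p_be:.4f}")
```

The Maxwell-Boltzmann pattern concentrates the probability on mixed exemplars such as Five Cats And Six Dogs, while the Bose-Einstein pattern gives the extreme exemplars Eleven Cats and Eleven Dogs the same weight as the mixed ones.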
The result that we put forward in the present article, namely that the Bose-Einstein statistics as explained above models entire texts of any size, is a much stronger one, although it expresses the same idea. Consider any text, and then consider two instances of the word Cat appearing in the text. If one of the concepts Cat is exchanged with the other concept Cat, absolutely nothing changes in the text. Hence, a text contains a perfect symmetry for the exchange of cognitons (concepts, words) in the same state. This is not true for physical reality and its physical objects. Suppose one considers a physical landscape containing two cats. Exchanging the two cats will always change the landscape, because the cats are not identical and are distinguishable as physical objects. If we introduce a quantum description of the text, the wave function must be invariant under the exchange of the two cats, which would not be the case if the wave function described the physical landscape containing two cats as objects. This is the result we will present in Section 2.
Section 3 is devoted to a self-contained presentation of the phenomenon of Bose-Einstein condensation in physics. We illustrate the different aspects of the Bose-Einstein condensation valuable for our discussion by means of two examples of Bose gases, the rubidium 87 atom gas and the sodium atom gas, which were also the first ones to be used to realize a Bose-Einstein condensate (Anderson et al., 1995; Davis et al., 1995). We compare the Bose-Einstein condensates of these gases, and the way their energy level distribution is modeled by the Bose-Einstein distribution function, with our Bose-Einstein modeling of pieces of texts of stories, and indicate the points of correspondence.
Another finding that we will put forward, in Section 4, was completely unexpected. The method of attributing an energy level to a word depending on the number of appearances of the word in a text introduces the typical ranking considered in the well-known Zipf's law analysis of this text (Zipf, 1935, 1949). When we look at the log / log graph of ranking as a function of the number of appearances, we indeed see the linear function, or a slight deviation from it, which represents the most common version of Zipf's law. Zipf's law is an experimental law, which has not yet been given any theoretical foundation, hence our finding of its unexpected connection with Bose-Einstein statistics might provide such a foundation. We also show, in Section 4, how the connection with Zipf's law allows us to develop in more depth the Bose-Einstein model of texts of different sizes, short stories and long stories of the size of novels. In Section 5, we reflect on the issue of 'identity and indistinguishability' from the perspective we developed in the foregoing sections, taking into account the conundrum this issue still is in quantum theory with respect to quantum particles (Dieks & Lubberdink, 2019). Confronting the theoretical view where bosons and fermions are considered to be identical and indistinguishable even if they are in different states, we note that experimentalists take another stance in this respect, considering, for example, photons of different frequencies as distinguishable. A recent experiment shows that if this experimentally accepted possibility to distinguish them is erased by means of a quantum eraser, these different frequency photons behave as indistinguishable.
This makes us put forward the proposal that 'the way in which we clearly see and understand the identity and indistinguishability of concepts (words, cognitons) in human language' is also 'the way in which identity and indistinguishability for quantum particles can be understood'. More specifically, it shows that 'identity and indistinguishability' are contextual notions for a quantum particle, depending on the way a measuring apparatus or a heat bath interacts with the quantum particle, similarly to how 'identity and indistinguishability' are contextual notions for a human concept, depending on how a mind interacts with the concept. We elaborate with examples this new way of interpreting 'identity and indistinguishability' and show how it is a strong confirmation of our conceptuality interpretation of quantum theory.

Human language as a Bose gas
Let us consider again the Winnie the Pooh story 'In Which Piglet Meets a Haffalump' as published in Milne (1926). In Table 1, we have presented the list of all words that appear in the story (in the column 'Words concepts cognitons'), with their 'number of appearances' (in the column 'Appearance numbers $N(E_i)$'), ordered from lowest energy level to highest energy level (in the column 'Energy levels $E_i$'), where the energy levels are attributed according to these numbers of appearances, lower energy levels corresponding to higher numbers of appearances, and their values are given as proposed in (1).
The word And is the most often appearing word, namely 133 times, hence the cognitons in this state populate the ground state energy level $E_0$, which as per (1) we put equal to zero. The word He is the second most often appearing word, namely 111 times, hence the cognitons in this state populate the first energy level $E_1$, which following (1) we put equal to 1. Hence, the 'words', their 'energy levels' and their 'numbers of appearances' are in the first three columns of Table 1.
The question can be asked: 'what is the unit of energy in the model that we put forward?' Is the number '1' that we choose for energy level $E_1$ a quantity expressed in joules, or in electronvolts, or in yet another unit? This question gives us the opportunity to reveal one of the very new aspects of our approach. Energy will not be expressed in kg m$^2$/s$^2$ as is the case in physics. Why not? Well, a human language is not situated somewhere in space, as we believe to be the case with a physical boson gas of atoms, or a photon gas of light. Hence, 'energy' is in our approach a basic quantity, and if we manage to introduce (this is one of our aims in further work) what the 'human language equivalent' of 'physical space' is, then it will be the other way around, namely this 'equivalent of space' will be expressed in units where 'energy appears as a fundamental unit'. Hence, the '1' indicating that 'He radiates with energy 1', or 'the cogniton in state He carries energy 1', stands for a basic measure of energy, just like 'distance (length)' is a basic measure in 'the physics of space and objects inside space', not to be expressed as a combination of other physical quantities. We used the expressions 'He radiates with energy 1' and 'the cogniton in state He carries energy 1', and we will use this way of speaking about 'human language within the view of a boson gas of entangled cognitons that we develop here', in analogy with how we speak in physics about light and photons.
The words The, It, A and To are the four next most often appearing words of the Winnie the Pooh story, and hence the energy levels $E_2$, $E_3$, $E_4$ and $E_5$ are populated by cognitons respectively in the states The, It, A and To, carrying respectively 2, 3, 4 and 5 basic energy units. Hence, the first three columns in Table 1 describe the experimental data that we extracted from the Winnie the Pooh story 'In Which Piglet Meets a Haffalump'. As we said, the story contains in total 2655 words, which give rise to 542 energy levels, where energy levels are connected with words, hence different words radiate with different energies, and the sizes of the energies are determined by 'the number of appearances of the words in the story', the most often appearing words being states of lowest energy of the cogniton and the least often appearing words being states of highest energy of the cogniton. In Table 1, we have not presented all 542 energy levels, because that would lead to too long a table, but we have presented the most important part of the energy spectrum with respect to the further aspects we will point out.
More concretely, we have represented the range from energy level $E_0$, the ground state of the cogniton, which is the cogniton in state And, to energy level $E_{78}$, which is the cogniton in state Put. Then we have represented the energy levels from $E_{538}$, which is the cogniton in state Whishing, to the highest energy level $E_{542}$ of the Winnie the Pooh story, which is the cogniton in state You've.
These last five highest energy levels, from $E_{538}$ to $E_{542}$, corresponding respectively to the cogniton in states Whishing, Word, Worse, Year and You've, each appear only once in the story. They do however radiate with different energies, but the story does not give us enough information to determine whether Whishing radiates with lower energy than Year or vice versa. Since this does not play a role in our actual analysis, we have ordered them alphabetically. So, different words that appear an equal number of times in this specific Winnie the Pooh story, and which radiate with different energies, will be classified from lower to higher energy level alphabetically.
In the column 'Energies from data $E(E_i)$', we represent $E(E_i)$, the 'amount of energy radiated by the Winnie the Pooh story by the cognitons of a specific word, hence of a specific energy level $E_i$'. As we mentioned already in the previous section, this amount is given by the product $N(E_i)E_i$ of the number $N(E_i)$ of cognitons in the state of the word with energy level $E_i$ and the amount of energy $E_i$ radiated by such a cogniton in that state. In the last row of Table 1, we give the totals, namely in the column 'Appearance numbers $N(E_i)$' of this last row the total number of words, and in the column 'Energies from data $E(E_i)$' of the last row the total amount of energy radiated by the story.
[Table 1: Energy-level representation of the words of the Winnie the Pooh story 'In Which Piglet Meets a Haffalump' (Milne, 1926). The words are in the column 'Words concepts cognitons' and the energy levels are in the column 'Energy levels $E_i$', and are attributed according to the 'numbers of appearances' in the column 'Appearance numbers $N(E_i)$', such that lower energy levels correspond to higher numbers of appearances, and the value of the energy levels is determined according to (1). The 'amounts of energies radiated by the words of energy level $E_i$' are in the column 'Energies from data $E(E_i)$'. In the columns 'Bose-Einstein modeling', 'Maxwell-Boltzmann modeling', 'Energies Bose-Einstein' and 'Energies Maxwell-Boltzmann' are respectively the predicted values of the Bose-Einstein and the Maxwell-Boltzmann model of the 'numbers of appearances', and of the 'radiated energies'.]
In columns 'Bose-Einstein modeling' and 'Maxwell-Boltzmann modeling' of Table 1, we give the values of the populations of the different energy states for, respectively, a Bose-Einstein and a Maxwell-Boltzmann model of the data of the considered story. Let us explain what these two models are. As we recalled in the introduction, the Bose-Einstein distribution function is given by
$$N(E_i) = \frac{1}{A e^{E_i/B} - 1} \tag{9}$$
where $N(E_i)$ is the number of bosons obeying the Bose-Einstein statistics in energy level $E_i$, and $A$ and $B$ are two constants that are determined by expressing that the total number of bosons equals the total number of words, and that the total energy radiated equals the total energy of the Winnie the Pooh story
$$\sum_{i=0}^{n} \frac{1}{A e^{E_i/B} - 1} = N \tag{10}$$
$$\sum_{i=0}^{n} \frac{E_i}{A e^{E_i/B} - 1} = E \tag{11}$$
[Figure 1: Numbers of appearances of the words of the Winnie the Pooh story 'In Which Piglet Meets a Haffalump' (Milne, 1926), ranked from lowest energy level, corresponding to the most often appearing word, to highest energy level, corresponding to the least often appearing word, as listed in Table 1. The blue graph (Series 1) represents the data, i.e. the collected numbers of appearances from the story (column 'Appearance numbers $N(E_i)$' of Table 1).]
We remark that the Bose-Einstein distribution function is derived in quantum statistical mechanics for a gas of bosonic quantum particles where the notions of 'identity and indistinguishability' play the specific role they are attributed in quantum theory (Huang, 1987). We will come back to this in Section 5, where we will analyze how our findings, given our conceptuality interpretation of quantum theory, can help to better explain 'identity and indistinguishability' for a physical Bose gas using our understanding of these notions in human language.
Since we want to show the validity of the Bose-Einstein statistics for concepts in human language, we compared our Bose-Einstein distribution model with a Maxwell-Boltzmann distribution model, hence we also introduce the Maxwell-Boltzmann distribution explicitly. It is the distribution described by the following function
$$N(E_i) = \frac{1}{C e^{E_i/D}} \tag{12}$$
where $N(E_i)$ is the number of classical identical particles obeying the Maxwell-Boltzmann statistics in energy level $E_i$, and $C$ and $D$ are two constants that will be determined, as in the case of the Bose-Einstein statistics, by the two conditions
$$\sum_{i=0}^{n} \frac{1}{C e^{E_i/D}} = N \tag{13}$$
$$\sum_{i=0}^{n} \frac{E_i}{C e^{E_i/D}} = E \tag{14}$$
The Maxwell-Boltzmann distribution function is derived for 'classical identical and distinguishable' particles, and can also be shown in quantum statistical mechanics to be a good approximation if the quantum particles are such that their 'de Broglie waves' do not overlap (Huang, 1987). In the last two columns 'Energies Bose-Einstein' and 'Energies Maxwell-Boltzmann' of Table 1, we show the 'energies' related to the Bose-Einstein modeling and to the Maxwell-Boltzmann modeling, respectively.
We have now introduced all that is necessary to announce the principal result of our investigation.
When we determine the two constants $A$ and $B$, respectively $C$ and $D$, in the Bose-Einstein distribution function (9) and the Maxwell-Boltzmann distribution function (12), by putting the total number of particles of the model equal to the total number of words of the considered piece of text, (10) and (13), and by putting the total energy of the model equal to the total energy of the considered piece of text, (11) and (14), we find a remarkably good fit of the Bose-Einstein modeling function with the data of the piece of text, and a large deviation of the Maxwell-Boltzmann modeling function from the data of the piece of text.
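How the two constants can be obtained from the two constraints may be sketched as follows. This is an illustrative solver, not the program used for the results reported here. It assumes the functional forms $N(E_i) = 1/(A e^{E_i/B} - 1)$ for Bose-Einstein and $N(E_i) = 1/(C e^{E_i/D})$ for Maxwell-Boltzmann, with $E_i = i$, and it relies on two monotonicity properties: for a fixed second constant the total population decreases when the first constant grows, and at fixed total population the total energy grows with the second constant. The function names and the bisection brackets are our own choices.

```python
import math

def populations(c, scale, n_levels, bose=True):
    """N(E_i) for E_i = i: 1/(c*exp(i/scale) - 1) if bose, else 1/(c*exp(i/scale))."""
    shift = 1.0 if bose else 0.0
    out = []
    for i in range(n_levels):
        x = i / scale
        # math.exp overflows beyond about 709; such levels are effectively empty.
        out.append(0.0 if x > 700 else 1.0 / (c * math.exp(x) - shift))
    return out

def _solve_c(total_count, scale, n_levels, bose):
    # Geometric bisection on a, with c = 1 + a (Bose-Einstein) or c = a otherwise.
    lo, hi = 1e-12, 1e6
    for _ in range(60):
        mid = math.sqrt(lo * hi)
        c = 1.0 + mid if bose else mid
        if sum(populations(c, scale, n_levels, bose)) > total_count:
            lo = mid  # too many particles: the first constant must grow
        else:
            hi = mid
    a = math.sqrt(lo * hi)
    return 1.0 + a if bose else a

def fit(total_count, total_energy, n_levels, bose=True):
    """Return (A, B) for Bose-Einstein or (C, D) for Maxwell-Boltzmann."""
    lo, hi = 1e-3, 1e6
    for _ in range(60):
        scale = math.sqrt(lo * hi)
        c = _solve_c(total_count, scale, n_levels, bose)
        pops = populations(c, scale, n_levels, bose)
        energy = sum(i * p for i, p in enumerate(pops))
        if energy < total_energy:
            lo = scale  # too little energy: the second constant must grow
        else:
            hi = scale
    scale = math.sqrt(lo * hi)
    return _solve_c(total_count, scale, n_levels, bose), scale

# Round trip on synthetic data generated with illustrative constants (not fitted values).
true_pops = populations(1.01, 100.0, 200, bose=True)
n_total = sum(true_pops)
e_total = sum(i * p for i, p in enumerate(true_pops))
A, B = fit(n_total, e_total, 200, bose=True)
C, D = fit(n_total, e_total, 200, bose=False)
print(A, B, C, D)
```

For a real text, `n_total` and `e_total` would be the total number of words and the total energy extracted from the data, and `n_levels` the number of distinct words.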
The result is expressed in the graphs of Figure 1 (a), where the blue graph represents the data, hence the numbers in column 'Appearance numbers $N(E_i)$' of Table 1, the red graph represents the quantities obtained by the Bose-Einstein model, hence the quantities in column 'Bose-Einstein modeling' of Table 1, and the green graph represents the quantities obtained by the Maxwell-Boltzmann model, hence the quantities in column 'Maxwell-Boltzmann modeling' of Table 1. We can easily see in Figure 1 (a) how the blue and red graphs almost coincide, while the green graph deviates strongly from the two other graphs, which shows that Bose-Einstein statistics is a very good model for the data we collected from the Winnie the Pooh story, while Maxwell-Boltzmann statistics completely fails to model these data.
To construct the two models, we also considered the energies, and expressed as a second constraint the conditions (11) and (14), namely that the total energy of the Bose-Einstein model and the total energy of the Maxwell-Boltzmann model are both equal to the total energy of the data of the Winnie the Pooh story. The result of both constraints, (10), (13) and (11), (14), on the energy functions that express the amount of energy per energy level (or, to use the language customarily used for light, the frequency spectrum of light) can be seen in Figure 2. We see again that the red graph, which represents the Bose-Einstein radiation spectrum, is a much better model for the blue graph, which represents the experimental radiation spectrum, than the green graph, which represents the Maxwell-Boltzmann radiation spectrum.
Both solutions, the Bose-Einstein one shown in the red graph and the Maxwell-Boltzmann one shown in the green graph, have been found by making use of a computer program calculating the values of $A$, $B$, $C$ and $D$ such that (10), (11), (13) and (14) are satisfied, which yields approximate numerical values for these four constants.
In the graphs of Figure 2, we can see that a maximum is reached for the energy level $E_{71}$, corresponding to the word First, which appears seven times in the Winnie the Pooh story. If we use the analogy with light, we can say that the radiation spectrum of the story 'In Which Piglet Meets a Haffalump' has a maximum at First, which would hence be, again in analogy with light, the dominant color of the story. We have indicated this radiation peak in Table 1, where we can see that the amount of energy the story radiates at this level, following the Bose-Einstein model, is 522.79.
Due to their shape, the graphs in Figure 1 (a) are not easily comparable. Although the blue and red graphs are quite obviously almost overlapping, while the blue and green graphs are very different, which shows that the data are well modeled by Bose-Einstein statistics and not well modeled by Maxwell-Boltzmann statistics, it is interesting to consider a transformation where we apply the log function to both the x-values, i.e. the domain values, and the y-values, i.e. the image values, of the functions underlying the graphs. This is a well-known technique to render functions giving rise to this type of graphs more easily comparable.
In Figure 1 (b), we show the graphs obtained by taking the log of the x-coordinates and also the log of the y-coordinates of the graph representing the data, which is again the blue graph, of the graph representing the Bose-Einstein distribution model of these data, which is the red graph, and of the graph representing the Maxwell-Boltzmann distribution model of the data, which is the green graph. Readers acquainted with Zipf's law as it appears in human language will recognize Zipf's graph in the blue graph of Figure 1 (b). It is indeed the log / log graph of 'ranking' versus 'numbers of appearances' of the text of the Winnie the Pooh story 'In Which Piglet Meets a Haffalump', which is the 'definition' of Zipf's graph. As is to be expected, we see Zipf's law being satisfied: the blue graph is well approximated by a straight line with negative gradient close to -1. We see that the Bose-Einstein graph still models this Zipf's graph very well, and what is more, it also models the (small) deviation of Zipf's graph from the straight line. Zipf's law, with its corresponding straight line in a log / log graph, is an empirical law. Intrigued by this correspondence between the Bose-Einstein statistics and the Zipf graph, we analyze it in detail in Section 4.
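The gradient of such a log / log graph can be estimated with an ordinary least-squares fit, as in the following sketch. It is an illustration under assumptions of our own: ranks are counted from 1 (so that rank $i + 1$ corresponds to energy level $E_i$, avoiding the logarithm of zero), and the function name `loglog_slope` and the idealized test data are hypothetical.

```python
import math

def loglog_slope(counts):
    """Least-squares slope of log(count) against log(rank), ranks starting at 1.

    `counts` must be positive and ordered from most to least frequent word.
    """
    xs = [math.log(r) for r in range(1, len(counts) + 1)]
    ys = [math.log(c) for c in counts]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

# Idealized Zipf data: the count is inversely proportional to the rank.
ideal = [1000.0 / r for r in range(1, 101)]
slope = loglog_slope(ideal)
print(slope)
```

For idealized data of this kind the slope equals -1 up to rounding; for the appearance numbers of a real text one would expect a value close to, but not exactly, -1.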
In the next section, however, we want to describe what a Bose gas is in physics when it is brought close to its state of Bose-Einstein condensate, with the aim of identifying the physical equivalent of the Winnie the Pooh story 'In Which Piglet Meets a Haffalump' and of other pieces of text which we will also consider.

The Bose-Einstein condensate in physics
We will explain in this section different aspects related to the experimental realization of a Bose gas close to being a Bose-Einstein condensate, where most of the bosons are in the lowest energy state. The awareness of the existence of this special state of a Bose gas came about as a consequence of a peculiar exchange between the Indian physicist Satyendra Nath Bose and Albert Einstein (Bose, 1924; Einstein, 1924, 1925). Bose devised a new way to derive Planck's radiation law for light (which has the form of a Bose-Einstein statistics and hence, as we now know, is a consequence of the indistinguishability of the photon as a boson, but that was not known in these pre-quantum theory times) and sent the draft of his calculation to Einstein. Although what Bose did was far from being fully understood at that time, the new method of calculation must have caught right away the full attention of Einstein, because he translated the article from English to German and supported its publication in one of the most important scientific journals of that time (Bose, 1924). Einstein himself then, inspired by Bose's method, worked out a new model and calculation for an atomic gas consisting of bosons, and predicted the existence of what we now call a Bose-Einstein condensate, an amazing accomplishment, taking into account that the difference between bosons and fermions and the Pauli exclusion principle were not yet known (Einstein, 1924, 1925). Because of the intense study of Bose-Einstein condensates that took off after their first experimental realizations (Anderson et al., 1995; Bradley et al., 1995; Davis et al., 1995), a lot of new knowledge, experimental but also theoretical, has been obtained, material on which we build for some of the details of the present article (Ketterle & van Druten, 1996; Parkins & Walls, 1998; Dalfovo et al., 1999; Ketterle, Durfee & Stamper-Kurn, 1999; Görlitz et al., 2001; Henn et al., 2008).
The principal idea is still the one foreseen by Einstein, namely to take a dilute gas of boson particles and then stepwise lower its temperature, and as a consequence its total energy, such that at a certain moment there is so little energy in the gas that all boson particles are forced to transition to the lowest energy state. At that moment, all boson particles are in the same state, namely this lowest energy state, and the gas then behaves in a way for which there is no classical equivalent. We will see that, given our conceptuality interpretation of quantum theory and the boson gas model we build here for human language, we will be able to put forward a new way to view the indistinguishability that lies at the heart of a Bose-Einstein condensate (see Section 5).
The Bose-Einstein condensates that have been realized so far all consist mainly of massive boson particles, generally atoms with integer spin, which makes them bosons. The situation of the bosons of light, i.e. of photons, is more complicated, because photons interact so abundantly with matter that their number is never constant, which makes it difficult, albeit not impossible, to realize a thermal equilibrium in this case (Klaers, Verwinger & Weitz, 2010a,b; Klaers et al., 2011; Klaers & Weitz, 2013). We do want to keep using our analogy of language with light. Of course, the pieces of text that we will study contain a fixed number of words, but a dynamic use of human language will also give rise to a continuous coming into existence of new words, which means that for such a dynamic situation the example of light is probably even more representative than gases with a fixed number of atoms. At this stage of our analysis we nevertheless focus on massive bosons, hence atoms with integer spin, also because these Bose-Einstein condensates are the easier ones to realize.
The underlying idea is that the gas consists of atoms that, to a good approximation, do not interact with each other, hence only carry the kinetic energy $K = p^2/2m$ generated by random movements due to the temperature $T$. It can be shown that in this situation the average kinetic energy of a free particle equals $K = \pi k T$, where $k$ is Boltzmann's constant, hence we have

$$\frac{p^2}{2m} = \pi k T \tag{16}$$

where $m$ is the mass of the atoms and $p$ the absolute value of their momentum. From (16) and de Broglie's formula $\lambda = h/p$ we can calculate the 'thermal de Broglie wave length' $\lambda_{th}$ of the atoms of the gas

$$\lambda_{th} = \frac{h}{\sqrt{2 \pi m k T}} \tag{17}$$

Let us make things more concrete and calculate this thermal de Broglie wave length for the atoms that were used in the Bose-Einstein condensates realized by Eric Cornell and Carl Wieman at the University of Colorado at Boulder in their NIST-JILA lab (Anderson et al., 1995), and by the group led by Wolfgang Ketterle at MIT, for which they were jointly awarded the Nobel Prize in physics in 2001. At Boulder they used a vapor of rubidium 87 atoms in a number density of $2.5 \times 10^{12}$ atoms per cubic centimeter, cooled down to a temperature of 170 nanokelvin, to see the condensate fraction appear, containing an estimated 2000 atoms and being preserved for more than 15 seconds. At MIT, they used a dilute gas of sodium atoms in a number density higher than $10^{14}$ atoms per cubic centimeter to realize the formation of a condensate containing up to 500000 atoms at a temperature of 2 microkelvin, with a lifetime of 2 seconds. Let us calculate $\lambda_{th}$ for both these condensate formations. Next to the values of Planck's and Boltzmann's constants, and the value of $\pi$, we only need the value of the mass of a rubidium 87 atom and of a sodium atom to do the calculation.
The atomic masses of a rubidium 87 atom and of a sodium atom are, respectively, 86.909180527 and 22.989769 unified atomic mass units, and given that one such unified atomic mass unit is $1.66053904 \times 10^{-27}$ kg we get

$$m_{Rb} \approx 1.443 \times 10^{-25}\ \mathrm{kg}, \qquad m_{Na} \approx 3.818 \times 10^{-26}\ \mathrm{kg}$$

Using the above values in (17), we obtain for the rubidium gas at 170 nanokelvin and the sodium gas at 2 microkelvin

$$\lambda_{th}(Rb) \approx 4.54 \times 10^{-7}\ \mathrm{m}, \qquad \lambda_{th}(Na) \approx 2.57 \times 10^{-7}\ \mathrm{m}$$

Often one can read that in states of the Bose gas that are 'nearing the Bose-Einstein condensate', the 'de Broglie waves' of the particles start to 'overlap', and that this is the reason why quantum effects become dominant. There is an interesting measure to express this notion of 'overlapping de Broglie waves' in a quantitative way, and it is called the 'phase space density' $\rho_{ps}$ of the boson gas

$$\rho_{ps} = n \lambda_{th}^3 \tag{25}$$

where $n$ is the 'atom density' of the gas expressed in 'number of atoms per cubic centimeter'. From (25) follows that $\rho_{ps}$ corresponds to the number of atoms in a region of space with the size of a cube whose side is the thermal de Broglie wave length. If this number is much smaller than 1, the de Broglie wave length is much smaller than the distance between the atoms, hence there will be no overlapping and the gas will behave classically. The more this number exceeds 1, the more the de Broglie waves of the atoms overlap, hence quantum behavior will increase. It has been shown (Bagnato, Pritchard & Kleppner, 1987) that, independent of the trapping device used for the atoms, a box or a magnetic trap (the latter being the one used in actually realized Bose-Einstein condensates), the condensate starts to form whenever the value of $\rho_{ps}$ is such that

$$\rho_{ps} = n \lambda_{th}^3 \geq 2.612 \tag{26}$$

Considering (17) and (25), the value of $\rho_{ps}$ in the process of formation of a Bose-Einstein condensate is determined by the temperature $T$ and the number density $n$ of the atom gas. In the last stage of the formation, the temperature is lowered by a technique called 'evaporative cooling under influence of a radio frequency field'.
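The thermal de Broglie wave lengths and the phase space density discussed above can be checked numerically. The following is a minimal sketch, not the authors' own calculation: the function names are hypothetical, the CODATA values of $h$ and $k$ are an assumption (the text does not state which values were used), and the onset value 2.612 is the standard criterion $n \lambda_{th}^3 \approx \zeta(3/2)$, taken here as an assumption.

```python
import math

# Physical constants in SI units. These CODATA values are an assumption:
# the text does not say which numerical values of h and k were used.
H = 6.62607015e-34    # Planck constant, J s
K_B = 1.380649e-23    # Boltzmann constant, J / K
AMU = 1.66053904e-27  # unified atomic mass unit in kg, as quoted in the text


def thermal_de_broglie_wavelength(mass_kg, temperature_k):
    """Return lambda_th = h / sqrt(2 pi m k T) in metres."""
    return H / math.sqrt(2.0 * math.pi * mass_kg * K_B * temperature_k)


def phase_space_density(density_per_cm3, wavelength_m):
    """Return rho_ps = n * lambda_th^3, with n in atoms per cubic centimetre."""
    wavelength_cm = wavelength_m * 100.0
    return density_per_cm3 * wavelength_cm ** 3


def onset_density_per_cm3(wavelength_m, rho_onset=2.612):
    """Density at which rho_ps reaches rho_onset (2.612 is an assumed threshold)."""
    wavelength_cm = wavelength_m * 100.0
    return rho_onset / wavelength_cm ** 3


m_rb = 86.909180527 * AMU  # rubidium 87
m_na = 22.989769 * AMU     # sodium

lam_rb = thermal_de_broglie_wavelength(m_rb, 170e-9)  # 170 nanokelvin
lam_na = thermal_de_broglie_wavelength(m_na, 2e-6)    # 2 microkelvin
n_rb_onset = onset_density_per_cm3(lam_rb)

print(f"lambda_th (Rb, 170 nK): {lam_rb * 1e9:.0f} nm")
print(f"lambda_th (Na, 2 uK):   {lam_na * 1e9:.0f} nm")
print(f"Rb onset density:       {n_rb_onset:.2e} atoms per cm^3")
```

Under these assumptions the rubidium onset density comes out close to the $2.8 \times 10^{13}$ atoms per cubic centimeter used further on in the text.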
Evaporative cooling also decreases the number density, hence to attain the quantum regime of overlapping de Broglie wave lengths it is necessary to lower the temperature faster than the gas is diluted. The group at MIT mentions explicitly the number density that they reached when the Bose-Einstein condensate was formed, namely between $10^{14}$ and $4 \times 10^{14}$ atoms per cubic centimeter (Davis et al., 1995). The Boulder group identified the formation of their rubidium Bose-Einstein condensate at a temperature of 170 nK, so that, taking into account (26), we can calculate that the number density of the rubidium gas must have been around $2.8 \times 10^{13}$ atoms per cubic centimeter. We give in Table 2 an overview of the energies and lengths that are characteristic for the realization of the sodium condensate at MIT (Ketterle, Durfee & Stamper-Kum, 1999). Because the gas is very dilute and the temperature is very low, the size of the atoms is very small compared to the distance between the atoms, while the thermal de Broglie wave lengths are large, such that they overlap. With each length scale $l$ there is an associated energy scale, which is the kinetic energy $K = \pi k T$ of a particle with a de Broglie wavelength $l$, that is

$$E = \frac{h^2}{2 m l^2} \tag{27}$$

This gives a good indication of the relation between sizes and energies. A good measure for the size of atoms in a gas as dilute as the considered boson gas is the so-called elastic s-wave scattering length $a = l/2\pi$. For sodium this has been measured to be 3 nanometers, which using (27) corresponds to an energy of 1 millikelvin in temperature units (Marte et al., 2002). Around this temperature elastic s-wave scattering between the atoms will be dominant. The separation between the atoms in the gas can be estimated by considering the cubic root $n^{1/3}$ of the number density, which gives us the number of atoms spread out over 1 centimeter.
For sodium, with a number density higher than $10^{14}$ atoms per cubic centimeter, this gives rise to a spacing between the atoms of around 200 nanometers. The length $l$ can be estimated by making use of (26), and hence, by making use of (27), we find that $E$ is around 2 µK. A temperature of around 1 µK gives rise to a thermal de Broglie wavelength of around 300 nm. The largest length scale is related to the confinement, characterized by the size of the box potential or by the oscillator length $a_{HO} = \frac{1}{2\pi}\sqrt{h/(m\nu)}$, which is the typical size of the ground state wave function in a harmonic oscillator potential of frequency $\nu$ (see Appendix B). With $\nu = 10$ Hz, we get a value for $a_{HO}$ of about 6.5 µm. The energy scale related to the confinement is characterized by the harmonic oscillator energy level spacing, given by $h\nu$. Again, for $\nu = 10$ Hz we get an energy value for the spacing of about 0.5 nK.
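The length and energy scales quoted for the sodium gas can be reproduced to their order of magnitude with a short calculation. This is a sketch under stated assumptions: the function names are hypothetical, the constants are CODATA values, and the energy attached to a length $l$ is read as $h^2/(2 m l^2)$ expressed in kelvin by dividing by $k$, which is one plausible reading of the text rather than a confirmed formula.

```python
import math

H = 6.62607015e-34                   # Planck constant, J s (assumed CODATA value)
K_B = 1.380649e-23                   # Boltzmann constant, J / K (assumed CODATA value)
M_NA = 22.989769 * 1.66053904e-27    # sodium mass in kg, from the values in the text


def energy_scale_kelvin(length_m, mass_kg):
    """Kinetic energy h^2 / (2 m l^2) of a particle with de Broglie wavelength l,
    expressed as a temperature E / k (assumed reading of the energy scale)."""
    return H ** 2 / (2.0 * mass_kg * length_m ** 2) / K_B


def mean_spacing_m(density_per_cm3):
    """Mean distance between atoms, n^(-1/3), converted from cm to m."""
    return density_per_cm3 ** (-1.0 / 3.0) / 100.0


def oscillator_length_m(mass_kg, frequency_hz):
    """a_HO = (1 / 2 pi) * sqrt(h / (m nu)) for a trap of frequency nu."""
    return math.sqrt(H / (mass_kg * frequency_hz)) / (2.0 * math.pi)


def level_spacing_kelvin(frequency_hz):
    """Harmonic oscillator level spacing h nu, expressed in kelvin."""
    return H * frequency_hz / K_B


# Scattering length a = l / (2 pi) = 3 nm, so l = 2 pi * 3 nm.
scattering_energy = energy_scale_kelvin(2.0 * math.pi * 3e-9, M_NA)
spacing = mean_spacing_m(1e14)            # density of 10^14 atoms per cm^3
a_ho = oscillator_length_m(M_NA, 10.0)    # trap frequency of 10 Hz
trap_spacing = level_spacing_kelvin(10.0)

print(f"scattering energy scale: {scattering_energy * 1e3:.2f} mK")
print(f"mean atom spacing:       {spacing * 1e9:.0f} nm")
print(f"oscillator length:       {a_ho * 1e6:.1f} um")
print(f"trap level spacing:      {trap_spacing * 1e9:.2f} nK")
```

The point of the comparison is the separation of scales: the trap level spacing is of the order of a nanokelvin, while the energies of the gas itself are of the order of a microkelvin.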
In Table 3, we made the calculations of length and energy scales for the rubidium 87 Bose-Einstein condensate, taking into account that a density of around $2.8 \times 10^{13}$ atoms per cubic centimeter was realized within the condensate of 2000 atoms. We now want to show that the Winnie the Pooh story 'In Which Piglet Meets a Heffalump', with its Bose-Einstein distribution, is well modeled by a Bose gas close to its Bose-Einstein condensate, and we take the rubidium and sodium gases described above as inspiration. What is important to notice is the difference in order of magnitude between the energy level spacings of the harmonic trap oscillator, which are of the order of 1 nK, and the energies involved with the gas itself, which are of the order of 1 µK. The Winnie the Pooh story 'In Which Piglet Meets a Heffalump' is not in a Bose-Einstein condensate state, because then all the words of the story would have to be the word And, populating the zero energy level. So, it is in a state which is close to a Bose-Einstein condensate.
We have not yet explained what the parameters $A$ and $B$ of (9) are for the situation of a physical boson gas, for which the Bose-Einstein distribution is often written as

$$N(E_i) = \frac{g_i}{e^{(E_i - \mu)/kT} - 1} \tag{29}$$

where $\mu$ is called the 'chemical potential', and $g_i$ the 'multiplicity'. The multiplicity $g_i$ of a specific energy level $E_i$ is the number of states that are different but have this same energy $E_i$. That different states can have the same energy is connected to the symmetries of the configuration, often spatial ones. For example, for the most simple model of the harmonic trap, the one of a quantum harmonic oscillator, the multiplicity of level $n$ in $s$ dimensions equals

$$g_n = \binom{n + s - 1}{n} = \frac{(n + s - 1)!}{n!\,(s - 1)!}$$

which becomes $(n + 1)(n + 2)/2$ in 3 dimensions, $(n + 1)$ in 2 dimensions, and 1 in the one-dimensional situation. The different dimensions are relevant for the Bose-Einstein condensates realized in laboratories because, although the boson gas always exists in 3 dimensions, the harmonic traps often give rise to very elongated cigar-like configurations, such that a quantum description in terms of an effective one-dimensional harmonic oscillator is a better model. In any case, for the text of the Winnie the Pooh story we do not have to hesitate about its dimension: pronouncing a text while reading it is certainly one-dimensional. Also a written text, although materialized on a page which is two-dimensional, is a one-dimensional structure. This means that in the formula for the Bose-Einstein distribution we have rightly taken $g_i = 1$ for every energy level $E_i$. What about the 'chemical potential' $\mu$? There is another quantity which is introduced with respect to it, called the 'fugacity'

$$f = e^{\mu/kT}$$

If we look at (29), taking into account that $g_i = 1$ and $E_0 = 0$, we get $N_0 = 1/(f^{-1} - 1) = f/(1 - f)$, and hence

$$f = \frac{N_0}{N_0 + 1}, \qquad \mu = kT \ln \frac{N_0}{N_0 + 1} \tag{33}$$

which means that the chemical potential and the fugacity are determined by the number $N_0$ of particles that are in the lowest energy state, hence the number of particles that are in the condensate state.
More specifically, for the Winnie the Pooh story we find $f = 0.9923$ and $\mu = -4.581$. Let us note that from (33) follows that the fugacity is a number between 1/2 and 1, in case we have at least one particle in the condensate state, and that the chemical potential is a negative number; they approach 1 and 0 respectively when the condensate grows in terms of the number of particles in the lowest energy level. For what concerns the second constant $B$, we have $B = kT$, which means that the second constant $B$ is given by the temperature of the Bose gas. The rubidium condensate is a better example for the Winnie the Pooh story, as also the number of atoms, 2000, is of the same order of magnitude as the number of words, 2655, of the Winnie the Pooh story. The energy levels of the trap for the rubidium condensate are of the order of 1 nK, while the temperature of the gas is 170 nK (Table 3), which is 170 times bigger. For the Winnie the Pooh story, if we take 1 unit of energy for the energy level spacings, we have $B = kT = 593$, following (15), and hence $\frac{1}{2} kT$, being a good estimate for the average energy per atom of a one-dimensional gas, gives for the latter about 297, which means that in this respect we are also in the same order of magnitude for the Winnie the Pooh story and the rubidium condensate. Hence, we can say that the Winnie the Pooh story can be looked at as behaving similarly to a Bose gas of rubidium 87 atoms in one dimension at a temperature of 170 nK. We will see in Section 4, where we consider the text of the novel 'Gulliver's Travels' of Jonathan Swift (Swift, 1726), that the sodium condensate is a better example for this text.
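The relation between the occupation of the lowest level, the fugacity and the chemical potential can be made explicit in a few lines. The sketch below assumes $g_i = 1$ and $E_0 = 0$ as in the text, with energies measured in units of the level spacing; the function names are hypothetical, and the numerical inputs are the rounded values of $f$ and $kT$ quoted in the text for the two stories, so the results agree with the quoted chemical potentials only up to rounding.

```python
import math


def fugacity_from_n0(n0):
    """Fugacity f = N0 / (N0 + 1) for occupation N0 of the level E_0 = 0."""
    return n0 / (n0 + 1.0)


def chemical_potential(fugacity, kt):
    """Chemical potential mu = kT * ln(f), in the same units as kT."""
    return kt * math.log(fugacity)


def bose_einstein(level, fugacity, kt):
    """Bose-Einstein occupation with multiplicity 1: 1 / (exp(E / kT) / f - 1)."""
    return 1.0 / (math.exp(level / kt) / fugacity - 1.0)


# Rounded parameters quoted in the text (energies in units of the level spacing).
mu_pooh = chemical_potential(0.9923, 593.0)   # Winnie the Pooh story
mu_shop = chemical_potential(0.9951, 722.0)   # 'The magic shop'
n0_pooh = bose_einstein(0.0, 0.9923, 593.0)   # modeled occupation of E_0

print(f"mu (Winnie the Pooh): {mu_pooh:.3f}")
print(f"mu (The magic shop):  {mu_shop:.3f}")
print(f"modeled N0 (Pooh):    {n0_pooh:.1f}")
print(f"fugacity for N0 = 1:  {fugacity_from_n0(1):.2f}")
```

The last line illustrates the lower bound mentioned in the text: with a single particle in the lowest level the fugacity equals 1/2, and it approaches 1 as the condensate grows.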
Let us introduce a second piece of text in Table 4, namely a story entitled 'The magic shop' written by Herbert George Wells (Wells, 1903), with which we want to illustrate an aspect of our 'Bose gas representation of human language' that we have not yet touched upon. For the Winnie the Pooh story, if we look at Figure 2 and Table 1, we can see that the 'energy spectrum' does not cover the whole range of possible energy values. Indeed, the red graph of Figure 2 still has a substantial value on the right hand side of the graph, and is not at all close to zero. Hence one can wonder what happens with this graph further on, in the higher part of the energy spectrum.
On the low energy spectrum, the amount of radiation increases starting from zero radiation for energy level $E_0$. Hence, for the words that are captured in the zero energy level of the Bose-Einstein condensate, there is no radiation emerging from them, following the considered choice of zero in the energy scale; for the case of the Winnie the Pooh story, the zero level energy state puts the cogniton in state And. Then the amount of radiation increases steeply: we have already a radiation of 111 energy units (and 105.84 in the Bose-Einstein model) at energy level $E_1$.

How can we understand this, given that we have in Table 1 exhausted all the words of the Winnie the Pooh story and hence seemingly represented all possible energy levels? But is this true? To see this clearly, we have to reflect on the difference between the numbers in the third and the fourth column of Table 1, respectively the 'numbers of appearances' of the specific words in the Winnie the Pooh story and the 'values of the Bose-Einstein distribution that we used to model these numbers of appearances'. The values in the fourth column are of a probabilistic nature and express averages over stories 'similar' to the one of Winnie the Pooh with respect to the numbers of appearances of the specific words, while the values in the third column express real counts for one specific story. More concretely, by 'similar' we actually mean 'containing the same total number of words, and containing the same total amount of energy'. Remember indeed that the Bose-Einstein distribution function only contains two parameters, which hence will be determined by the total number of words and the total amount of energy.
Or, to put it even more concretely, suppose we collected a vast number of pieces of 'meaningful' text all containing the same total number of words $N$ and the same amount of total energy $E$. The Bose-Einstein distribution function (9) is then supposed to model a specific type of average that can be obtained for all these texts, and the more numerous these texts, the better this average will correspond with the Bose-Einstein distribution function. The reason is that this function is the consequence of the limit process in statistical mechanics of a micro-canonical ensemble of states of particles with the same $N$ and $E$ (Bose, 1924; Einstein, 1924, 1925; Huang, 1987).
The above reasoning indicates that we can consider introducing a 'place for words that do not appear in the considered text but could have appeared'. Remark that these new words do not add to the sum $N$ of all words, since they have 'number of appearances zero', which means that this operation of 'adding new words' leaves $N$ unchanged. In the ranking of energy levels, they have to be classified by 'additional energy levels higher than the highest one we have identified so far, namely that of the last alphabetically classified word that appears one time in the text'. Remark that also $E$ remains unchanged by this adding of words that could have appeared. Indeed, although these newly added words carry high energies, all of them have appearance number zero, so they do not add to the total amount of energy, because the product of the energy of even a very high energy level with the zero of its number of appearances equals zero. Since $N$ and $E$ are left unchanged by the adding of these new words that could have appeared, the micro-canonical ensemble and its thermodynamical equilibrium also remain unchanged. However, the adding of the new words does substantially alter the Bose-Einstein distribution function and the Maxwell-Boltzmann distribution function calculated to model the data, because neither of them has appearance values equal to zero for these words, which means that there will be contributions to the total number of words and to the total energy of their modeling. Hence, this operation of adding words such that the energy spectrum completes itself over the whole range is a necessary operation in the modeling with Bose-Einstein or Maxwell-Boltzmann.
Again more concretely, let us consider the words that appear one time in the Winnie the Pooh story, and look for synonyms of these words. A word that now appears one time could equally well not have appeared, and its synonym could then have appeared instead. So, the synonyms can be listed in a new set of words to add with zero appearances, as words that 'could have appeared', and indeed, the Bose-Einstein distribution function will not be zero for them, which expresses exactly this 'they could have appeared'.
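The bookkeeping argument above can be illustrated with a toy example. The counts and the parameter values below are hypothetical and chosen only for illustration; the sketch shows that appending energy levels with zero appearances leaves the data totals $N$ and $E$ unchanged, while the totals of a Bose-Einstein model (with multiplicity 1 and levels $E_i = i$) do change, because the model value is never zero.

```python
import math


def bose_einstein(level, fugacity, kt):
    """Bose-Einstein occupation with multiplicity 1 at energy `level`."""
    return 1.0 / (math.exp(level / kt) / fugacity - 1.0)


def data_totals(counts):
    """Total number of words N and total energy E for counts per level E_i = i."""
    total_words = sum(counts)
    total_energy = sum(level * n for level, n in enumerate(counts))
    return total_words, total_energy


def model_totals(num_levels, fugacity, kt):
    """Totals of the Bose-Einstein model summed over the first num_levels levels."""
    values = [bose_einstein(level, fugacity, kt) for level in range(num_levels)]
    total_words = sum(values)
    total_energy = sum(level * v for level, v in enumerate(values))
    return total_words, total_energy


# Hypothetical toy data: appearance counts for levels E_0 .. E_3.
counts = [5, 3, 1, 1]
# Six extra levels for words that 'could have appeared' but did not.
extended = counts + [0] * 6

# Hypothetical model parameters, not fitted to anything.
FUGACITY = 0.8
KT = 2.0

print("data totals, original levels:", data_totals(counts))
print("data totals, extended levels:", data_totals(extended))
print("model totals, original levels:", model_totals(len(counts), FUGACITY, KT))
print("model totals, extended levels:", model_totals(len(extended), FUGACITY, KT))
```

This is why the model parameters have to be determined on the completed energy spectrum rather than on the truncated one.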
To illustrate the above, we consider the H. G. Wells story 'The magic shop' (Wells, 1903), for which we have classified the words in energy levels in Table 4. As we can see, the energy level $E_{1153}$, corresponding to the state of the cogniton characterized by the word Youngster, would have been the highest energy level in case we had stopped adding energy levels at the 'one word appearance number', like we did for the Winnie the Pooh story. For this new story 'The magic shop' we have however added the 'zero word appearance number' explicitly, starting with Garden, a word that does not appear in the story and is a synonym of Yard of energy level $E_{1149}$; we attributed energy level $E_{1154}$ to the cogniton in the state characterized by Garden. And indeed, in the third column in the row where Garden appears in Table 4 there is 0, indicating that Garden does not appear in the story 'The magic shop'. In the fourth column, in the row of Garden in Table 4, we however have 0.25, which is the value of the Bose-Einstein distribution function at energy level $E_{1154}$, and in the fifth column, in the row of Garden in Table 4, we have 0.07, which is the value of the Maxwell-Boltzmann distribution function at energy level $E_{1154}$. Both numbers indicate that 'Garden could have appeared in a story similar to the H. G. Wells story', because they are not zero. These numbers are linked to the probability of Garden appearing in a story similar to 'The magic shop' in the way we explained above. And indeed, there should not be zeros in these places, because there is a probability that Garden would appear in such a similar story. We added the word Okay at energy level $E_{1155}$ as synonym of Yes at energy level $E_{1150}$, as a new state of the cogniton that does not appear but could potentially appear in a similar story.
We continued in the same way, adding Junior as synonym of Youngster, but there are no synonyms of You'd and You're, which gives us the occasion to mention that the added words that could appear in a similar story do not have to be synonyms. The only criterion is that 'they appear in a meaningful story with the same total number of words and the same total energy'. Hence, adding synonyms is a simple way to ensure that the whole story remains meaningful, but a completely new meaningful part can also be added to the story with words that are not synonyms. So, we added many more energy levels, namely up to the cogniton being in energy level $E_{3500}$. We have only shown the last seven of these words in Table 4, namely Continued, Adding, Mention, Similar, Criterion, Obviously and Appearing, which have zero number of appearances in the H. G. Wells story, while their values in the Bose-Einstein model, as well as in the Maxwell-Boltzmann model, are not zero.

Table 4: Energy level representation of the words of the story 'The magic shop' by H. G. Wells (1903). The words are in the column 'Words concepts cognitons' and the energy levels are in the column 'Energy levels $E_i$', and are attributed according to the 'numbers of appearances' in the column 'Appearance numbers $N(E_i)$', such that lower energy levels correspond to higher numbers of appearances, and the value of the energy levels is determined according to (1). The 'amounts of energies radiated by the words of energy level $E_i$' are in the column 'Energies from data $E(E_i)$'. In the columns 'Bose-Einstein modeling', 'Maxwell-Boltzmann modeling', 'Energies Bose-Einstein' and 'Energies Maxwell-Boltzmann' are respectively the predicted values of the Bose-Einstein and the Maxwell-Boltzmann model of the 'numbers of appearances', and of the 'radiated energies'. Words and their corresponding energy levels were added with zero number of appearances to complete the energy spectrum for the high energy region as shown in Figure 4.
In Figure 3 (a) and Figure 3 (b), we have represented, respectively, the numbers of appearances of the appearing and not appearing words with respect to the energy levels, a graph going down very steeply, and the log / log graphs of these numbers of appearances, where we take the logarithm of both $y$ and $x$. In Figure 4, we have represented the amounts of radiated energy with respect to the energy levels, and we see that this time the red graph representing the Bose-Einstein model of the data, after steeply going up and reaching a maximum, slowly goes down to come close to the zero level of radiated energy for high energy level cognitons. We see again, like in Figure 1, that the Bose-Einstein distribution function, the red graph, gives an almost complete fit with the data, the blue graph, and definitely a much better fit than the Maxwell-Boltzmann distribution function, the green graph. The same holds for the amounts of energy in Figure 4, which we now look at more carefully. We see that the maximum amount of radiation is reached at energy level $E_{70}$, in the state of the cogniton characterized by Door, and that the amount is 652.55204 energy units. So the frequency of Door would be the dominant color with which the story 'The magic shop' shines.
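The values quoted for Garden and for the maximum of the radiated energy can be reproduced approximately from the two Bose-Einstein parameters alone. The sketch below is an illustration under stated assumptions: it uses the rounded values $f = 0.9951$ and $kT = 722$ reported for this story in the text, multiplicity 1, and linearly spaced levels $E_i = i$; because of the rounding, the position and height of the maximum agree with the quoted ones only approximately. The function and variable names are hypothetical.

```python
import math

# Rounded parameters reported for 'The magic shop' (energies in level-spacing units).
F_SHOP = 0.9951
KT_SHOP = 722.0
NUM_LEVELS = 3501  # levels E_0 .. E_3500


def bose_einstein(level, fugacity=F_SHOP, kt=KT_SHOP):
    """Bose-Einstein value with multiplicity 1 at energy `level`."""
    return 1.0 / (math.exp(level / kt) / fugacity - 1.0)


# Model value for a word with zero appearances placed at level E_1154.
garden_value = bose_einstein(1154)

# Radiated energy per level in the model: E_i * N(E_i).
radiated = [level * bose_einstein(level) for level in range(NUM_LEVELS)]
peak_level = max(range(NUM_LEVELS), key=lambda level: radiated[level])
peak_energy = radiated[peak_level]
tail_energy = radiated[-1]

print(f"model value at E_1154: {garden_value:.2f}")
print(f"maximum radiated energy {peak_energy:.1f} at level {peak_level}")
print(f"radiated energy at E_3500: {tail_energy:.1f}")
```

The model value at $E_{1154}$ is small but not zero, which is the quantitative content of the statement that Garden 'could have appeared'.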
Comparing with the Winnie the Pooh story, we have a higher temperature, $kT$ equals 722 instead of 593, a higher fugacity, $f$ equals 0.9951 instead of 0.9923, and a higher chemical potential, $\mu$ is −3.576 instead of −4.581. This will generally be so when we consider longer texts, as will again be illustrated by the text of 'Gulliver's Travels' considered in Section 4. We mentioned already that the sodium condensate realized at MIT, which we described above in detail, is a better model for the 'magic shop' story, and indeed, in Table 2 we can see that the harmonic oscillator level spacing for the sodium condensate is around 0.5 nK while the temperature of the sodium gas is 1 µK, which is a difference of a factor 2000. In Table 4, we see that we have 3500 energy levels for the story 'The magic shop', which is of the same order of magnitude. The number of atoms in the MIT sodium condensate was estimated to be 500000, which is still way more than the number of words in the H. G. Wells story 'The magic shop', which is 3934. When we analyze larger texts that come closer to this size, such as the text of Gulliver's Travels in Section 4, we find an even better correspondence in magnitudes with the data of the sodium condensate. But before showing this, we have to investigate more in depth another aspect of our modeling, namely the aspect related to the 'global energy level structure'.

Figure 3: In (a) the numbers of appearances of the words of the story 'The magic shop' (Wells, 1903) are represented, ranked from lowest energy level, corresponding to the most often appearing word, to highest energy level, corresponding to the least often appearing word, as listed in Table 4. The blue graph (Series 1) represents the data, i.e. the collected numbers of appearances from the story (column 'Appearance numbers $N(E_i)$' of Table 4), the red graph (Series 2) is a Bose-Einstein distribution model for these numbers of appearances (column 'Bose-Einstein modeling' of Table 4), and the green graph (Series 3) is a Maxwell-Boltzmann distribution model (column 'Maxwell-Boltzmann modeling' of Table 4). In (b) the log / log graphs of the appearance numbers distributions are represented. The red and blue graphs coincide almost completely in both (a) and (b) while the green graph does not coincide at all with the blue graph of the data. This shows that the Bose-Einstein distribution is a good model for the numbers of appearances while the Maxwell-Boltzmann distribution is not.
Figure 4: The energy distribution of the story 'The magic shop' (Wells, 1903) as listed in Table 4. The blue graph represents the energy radiated by the story per energy level (column 'Energies from data $E(E_i)$' of Table 4), the red graph represents the energy radiated by the Bose-Einstein model of the story per energy level (column 'Energies Bose-Einstein' of Table 4), and the green graph represents the energy radiated by the Maxwell-Boltzmann model of the story per energy level (column 'Energies Maxwell-Boltzmann' of Table 4).

We have not yet revealed the parameters $A$, $B$, $C$ and $D$ for the story 'The magic shop', they have the following values

There are two quantum models that are also used in physics as an inspiration for the energy level structure of the trapped atoms: one is the 'harmonic oscillator and its variations' (Appendix B) and the other is the 'particle in a box and its variations' (Appendix A). From the harmonic oscillator model follows that the energy levels are equally (linearly) spaced, which is also the way we have modeled them for the two examples that we have considered, the Winnie the Pooh story and the H. G. Wells story. However, the energy levels of the particle in a box are quadratically spaced. We will see in the remainder of our analysis that, in view of our experimental findings in analyzing numerous texts, the energy levels of the cognitons are in all generality spaced following a power law, depending on the story considered, with a power coefficient which is in principle between 0 and 2, but which for all the stories that we investigated was between 0.75 and 1.25. This indicates that different energy situations on both sides of the 'harmonic oscillator' are at play, from the 'anharmonic oscillator', with converging spacings between energy levels, to the 'particle in a box', with quadratic spacings between energy levels. We will show in the next section how this generalization for the energy spacings strengthens the correspondence with Zipf's law in human language.
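The difference between converging, equal and growing level spacings can be made visible with a small sketch. The form $E_i = i^{p}$ used below is an assumption made only for illustration (the text states that the levels follow a power law but does not give its exact form here), and the function names are hypothetical.

```python
def energy_levels(count, power):
    """Assumed power-law levels E_i = i ** power for i = 0 .. count - 1."""
    return [float(i) ** power for i in range(count)]


def spacings(levels):
    """Differences between consecutive energy levels."""
    return [upper - lower for lower, upper in zip(levels, levels[1:])]


# power 1 mimics the harmonic oscillator, power 2 the particle in a box,
# and a power below 1 an anharmonic oscillator with converging spacings.
for power in (0.75, 1.0, 1.25, 2.0):
    gaps = spacings(energy_levels(6, power))
    print(power, [round(gap, 3) for gap in gaps])
```

With a power below 1 the spacings shrink with increasing level, with power 1 they are constant, and with a power above 1 they grow, which is the sense in which the investigated stories lie on both sides of the harmonic oscillator.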

Zipf's law and the Bose gas of human language
Zipf's law is considered to be one of the mysterious structures encountered in language (Zipf, 1935, 1949). It was originally noted in its most simple form in the following way. When ranking words according to their numbers of appearances in a piece of text, the product of the rank with the number of appearances is a constant. Hence Zipf's law was originally stated mathematically as follows

$$R \times N = c \tag{38}$$

where $R$ is the rank, $N$ the number of appearances, and $c$ is a constant. We have presented in Figure 5 the products $R_i \times N_i$ for the text of the Winnie the Pooh story that we have investigated in Section 2, where $R_i$ is the $i$-th rank in Zipf's ranking and $N_i$ is the number of appearances corresponding to this ranking. The x-coordinate of the graphs in Figure 5 represents the ranks $R_i$, and the y-coordinate represents the products $R_i \times N_i$ for the blue graph, and the values of respectively the Bose-Einstein distribution and the Maxwell-Boltzmann distribution for the red and green graphs.

Figure 5: The blue graph (Series 1) is a representation of the products $R_i \times N_i$ for the text of the Winnie the Pooh story that we have investigated in Section 2, where $R_i$ is the $i$-th rank in Zipf's ranking and $N_i$ is the number of appearances corresponding to this ranking. The x-coordinate represents the ranks $R_i$, and the y-coordinate represents the products $R_i \times N_i$. For the red graph (Series 2) and the green graph (Series 3) the values of respectively the Bose-Einstein distribution and the Maxwell-Boltzmann distribution which we developed in Section 2 were used as a comparison with the graph in Figure 2.

It is not a coincidence that there is a striking resemblance between the graphs shown in Figure 5 and the energy distribution graphs of the Winnie the Pooh story as a boson gas shown in Figure 2. Indeed, the energy levels $E_i$ that we introduced are very simply related to the Zipf rankings $R_i$, the only difference being that we started with value zero for the lowest energy level, while Zipf started with value 1 for his first rank. Hence, more concretely, we have

$$R_i = E_i + 1$$

This means that although none of the values of the Zipf products in Figure 5 is equal to the energies in Figure 2, the differences are small, because $R_i$ equals $E_i + 1$. Consulting Table 1, we can see that the biggest difference is at the zero point of the graph, where on the x-axis $E_0 = 0$ and $R_0 = 1$, hence between the product $R_0 \times N_0$, which equals $(E_0 + 1) \times N_0$, that is $1 \times 133 = 133$, and $E_0 \times N_0 = 0 \times 133 = 0$. This cannot easily be seen as a difference between the graphs of Figure 5 and the graphs of Figure 2, since 133 is still small compared to the values the functions take at $R_1$ and $E_1$. Again consulting Table 1, we indeed see that $R_1 \times N_1 = (E_1 + 1) \times N_1 = 2 \times 111 = 222$, while $E_1 \times N_1 = 1 \times 111 = 111$. This means that both the 'product graph' of Figure 5 and the 'energy distribution graph' of Figure 2 go up quickly between $R_0$ and $R_1$ and between $E_0$ and $E_1$, the first from value 133 to value 222, and the second from value 0 to value 111, which is with almost the same steepness. Both graphs then keep increasing quite quickly and slowly flatten until they reach their maxima at Zipf rank $R_{70}$ and energy level $E_{71}$. From this maximum on, both the Zipf product and the energy distribution slowly decrease to a lower value. More specifically, the maximum value is 522.79 in both cases, and for the last considered Zipf rank $R_{542}$ and energy level $E_{542}$ we find values 359.22 and 358.55 respectively. This shows that the Zipf products decrease, and are not constant as Zipf's law predicts.
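The relation between the Zipf products and the energies per level can be sketched directly from word counts. The code below is a minimal illustration with hypothetical function names and a simple tokenizer chosen for the example (lowercased runs of letters and apostrophes); it is not the procedure used to build Table 1. The two counts 133 and 111 are the ones quoted in the text for the first two ranks of the Winnie the Pooh story.

```python
import re
from collections import Counter


def ranked_counts(text):
    """Numbers of appearances of the words of `text`, sorted from high to low."""
    words = re.findall(r"[a-z']+", text.lower())
    return sorted(Counter(words).values(), reverse=True)


def zipf_products(counts):
    """Products R_i * N_i, with ranks starting at 1."""
    return [(index + 1) * n for index, n in enumerate(counts)]


def level_energies(counts):
    """Energies E_i * N_i, with energy levels E_i = R_i - 1 starting at 0."""
    return [index * n for index, n in enumerate(counts)]


# First two counts quoted in the text for the Winnie the Pooh story.
pooh_head = [133, 111]
print(zipf_products(pooh_head))   # products for the first two ranks
print(level_energies(pooh_head))  # energies for the first two levels

# A toy sentence, to show the ranking step itself.
toy_counts = ranked_counts("the cat and the dog and the bird")
print(toy_counts, zipf_products(toy_counts), level_energies(toy_counts))
```

Since $R_i = E_i + 1$, the Zipf product at each rank exceeds the energy at the corresponding level by exactly the number of appearances $N_i$, which is why the two graphs differ most at the start and become nearly indistinguishable in the tail.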
In the foregoing reasoning on Zipf's law, we have always considered the two graphs, the blue and the red one, in both Figure 5 and Figure 2. Of course, Zipf did not know of the Bose-Einstein distribution that is represented by the red graph in both figures, and which we used to model the data, represented by the blue graph in both figures. Hence Zipf only had the blue graph in Figure 5 available to come up with the hypothesis that the product of rank and number of appearances is a constant. If one considers the blue graph in Figure 5, one could indeed imagine it to vary around a constant function, certainly in the middle part of the graph. The beginning part can then be considered as a deviation, which is also what Zipf did when noting that in the first ranks the law did not hold up well. It was also known to Zipf that the end part of the graph, as a consequence of how ranks and numbers of appearances behave there, making the product go up and down heavily, did not behave very well with respect to his law either, and the slight downward slope all at the end was identified by Zipf as well. We see it explicitly pictured by the red graph, representing the Bose-Einstein distribution modeling of the data.
There is however another aspect of the situation which was overlooked by Zipf. It is self-evident that 'if Zipf's law is a law, it has to be a probabilistic law'. Let us specify what we mean by this. Suppose we had a large number of texts available with exactly the same number of different words in it, such that a Zipf analysis would lead to the same total number of ranks for each of the texts. Zipf's graphs, including the 'product graph', i.e. the blue graph in Figure 5, will then show a statistical pattern for the set of texts where it is tested on. Suppose we make averages for the numbers of appearances pertaining to the same rank over the available texts, then the function representing these averages of the numbers of appearances for the different texts will be a distribution function with a steep upward slope in the first ranks going towards a maximum and then a slow downwards slope in the ranks after this maximum. It will be a function similar to the Bose-Einstein distribution we have used to model texts as Bose gases, i.e. the red graph. This will be even more so when we add the two constraints that in our case follow naturally from our modeling, namely that the different texts need to count the same total number of words, and the sum of the products, which in our interpretation of the Bose gas model is the total energy, needs to be the same for each one of the texts. What is however more important still is that 'if Zipf's law is a probabilistic law, we should also introduce rankings that represent words with a zero number of appearances', exactly like what we have done for the H. G. Wells story 'The magic shop', for which we have represented the data and the Bose-Einstein model in Table 4, and the graphs representing these data in Figure 3 (a), in Figure  3 (b) and in Figure 4.
If we look carefully at the energy distribution graph in Figure 4, we can understand again somewhat better why Zipf came to believe that the products of the ranks and the numbers of appearances are a constant. Indeed, having added the zero numbers of appearances until the energy distribution becomes close to zero in the high energy levels, as shown in Figure 4, we can see how the blue graph first goes far up where the one word appearance cases are, to compensate for the long row of zero appearance cases that take up a great part of the x-axis. So, if one leaves out the zero appearance part, one can easily get the impression that the blue graph represents a constant on average, at least when neglecting the low energy levels at the start, where it goes steeply up. Most of the investigations of Zipf's findings afterwards concentrated on the log / log graph representation, where the log is taken for the rank as well as for the numbers of appearances, hence the Zipf equivalents of the log / log graphs we considered for our Bose gas modeling represented in Figure 1 (b) and in Figure 3 (b). For what concerns Zipf's law expressed in (38), the log / log graph of the Zipf product gives rise to a straight line with gradient equal to −1. Indeed, when we take the log of both sides of (38) we get
$$\log R + \log N = \log c \quad\Leftrightarrow\quad \log N = \log c - \log R$$
with c the constant value of the product in (38). The graph of this relation, with log R on the x-axis and log N on the y-axis, is a straight line with gradient equal to −1.
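The rank, product and log / log quantities discussed above can be made concrete with a small sketch. It is only an illustration under stated assumptions: the function names `zipf_table` and `loglog_gradient` are hypothetical, words are split on whitespace, and the gradient is obtained with an ordinary least-squares fit, which is not necessarily the fitting procedure used for the figures.

```python
import math
from collections import Counter

def zipf_table(text):
    """Return (rank, count, rank * count) rows, most frequent word first."""
    counts = Counter(text.lower().split())
    ordered = sorted(counts.values(), reverse=True)
    return [(rank, n, rank * n) for rank, n in enumerate(ordered, start=1)]

def loglog_gradient(rows):
    """Least-squares gradient of log N (y-axis) against log R (x-axis)."""
    xs = [math.log(rank) for rank, _, _ in rows]
    ys = [math.log(n) for _, n, _ in rows]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Synthetic rows that satisfy R * N = 120 exactly (not data from any story).
ideal = [(r, 120 // r, 120) for r in (1, 2, 3, 4, 5, 6)]
print(loglog_gradient(ideal))  # close to -1 for an exact Zipf product
```

For rows that satisfy a constant product exactly, the fitted gradient is −1 up to rounding; for real texts one would pass the output of `zipf_table` to `loglog_gradient` and compare the result with −1.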
It is indeed much easier to see with the naked eye that a log / log graph like those in Figure 1 (b) and in Figure 3 (b) can be approximated well by a straight line than to see the constancy of Zipf's products in a graph like the one in Figure 5, where the constant needs to be fitted to the up and down moving blue graph. However, the focus of all Zipf's investigations on the log / log graphs also has its downside, in the sense that the upper and lower parts of the graph will be more easily considered as slight deviations from the straight line, while, as we see with our Bose-Einstein distribution modeling in its energy graph version, they really represent essential and significant deviations from Zipf's original product law (38). That in both Figure 1 (b) and in Figure 3 (b) the graphs are slightly bent towards a concave form is the expression of Zipf's law essentially not being satisfied for low ranks and high ranks. The foregoing analysis is meant to provide evidence for the Bose-Einstein distribution being a better model for the Zipf data than a constant, or than later more complex versions of Zipf's law which still hold that the product graph is in good approximation a constant, and the log / log version in good approximation a straight line. There is however another aspect of Zipf's finding that we want to put forward here, since it will be important for our model of a Bose gas for human language.
Figure 6: Representation of the log / log graphs of the Zipf data. The blue graph represents the data (Series 1), the red graph represents the Bose-Einstein model (Series 2), the green graph represents the Maxwell-Boltzmann model (Series 3) and the purple graph represents a straight line (Series 4) that is an 'as good as possible approximation' of the other graphs to illustrate that the gradient of the 'straight line approximation' is not equal to −1.
In Figure 6, we represented the log / log graphs of the Zipf data (blue graph) and the Bose-Einstein (red graph) and Maxwell-Boltzmann (green graph) distributions which we used to model them, and we added a straight line (purple graph) that approximates the other graphs as well as possible. We can see that the gradient of the straight line is not equal to −1, but to −0.94. Although Zipf himself kept focusing on the straight line with gradient −1, it was noted by many who studied Zipf's law that a generalization was needed to take into account the gradient of the straight line usually not being equal to −1, hence the log / log version of the law was generalized to
$$\log N = \log c - p \log R$$
which made the original product of rank and frequency be generalized to
$$R^p \times N = c$$
where c is a constant and p is called the 'power coefficient' of Zipf's law. We will apply this 'power coefficient' in Zipf's law also in our modeling. Let us explain why and how we will do so. First of all, there is no a priori reason why the energy levels would be as simple as we presented them in the two examples that we considered, namely such that
$$E_i = E_0 + i\,(E_1 - E_0) \qquad (43)$$
where $E_1 - E_0$ is the unit of energy that we introduced. Of course, we have systematically taken $E_0 = 0$, see (1), which makes the energy levels we have introduced in both stories even simpler, but it is not necessarily so that $E_0 = 0$ as a rule, which is why we now formulate the 'linear system of energy levels' as in (43). This simple linear system is inspired by the energy levels of the quantum harmonic oscillator (Appendix B), where we have
$$E_n = \left(n + \tfrac{1}{2}\right) h \nu$$
with ν being the frequency of the oscillator. But that the energy spacings between consecutive energy levels are the same, as in the case of the harmonic oscillator, is a very exceptional situation of quantization. For general quantized systems the spacings between consecutive energy levels will not be the same, and both cases exist: for non-confined quantized situations the spacings will decrease, while for confined situations the spacings will increase.
For example, for the quantized energy levels of the 'particle in a box' (Appendix A), we have, for a particle of mass m in a box of width L,
$$E_n = \frac{n^2 h^2}{8 m L^2}$$
which means that the energy levels change quadratically in function of the unit of energy
$$E_1 = \frac{h^2}{8 m L^2}$$
Note that in Appendix A and Appendix B we have used n to indicate the 'quantum numbers', because that is the traditional letter used for quantum numbers within standard quantum theory. In the approach we followed we have used i to indicate the 'energy levels', because we do not want to make a direct and exclusive reference to standard quantum theory alone, since our aim is to also make a connection with Zipf's law in language. More generally, we want to elaborate a 'quantum cognition theory' for 'human language and cognition' from basic principles on a more foundational level than the one where standard quantum theory is situated, building on earlier work in quantum cognition and quantum computer science (Aerts, 1995; Khrennikov, 1999; Atmanspacher, 2002; Gabora & Aerts, 2002; Van Rijsbergen, 2004; Aerts & Czachor, 2004; Widdows, 2004; Bruza & Cole, 2005; Busemeyer et al., 2006; Pothos & Busemeyer, 2009; Lambert Mogilianski Zamir & Zwirn, 2009; Bruza et al., 2009; Busemeyer & Bruza, 2012; Dalla Chiara et al., 2012; Haven & Khrennikov, 2013; Melucci, 2015; Pothos et al., 2015; Blutner & beim Graben, 2016; Moreira & Wichert, 2016; Broekaert et al., 2017; Gabora & Kitto, 2017; Busemeyer & Wang, 2018).
The 'harmonic oscillator' and the 'particle in a box' are both special cases where the one-dimensional Schrödinger equation can be solved analytically, but for boson gases power law potentials have been studied as more general models (Bagnato, Pritchard & Kleppner, 1987), and hence we will also introduce in our approach a more general variation of the energy levels than the linear one, namely one of a 'power law change'
$$E_i = E_0 + i^p\,(E_1 - E_0) \qquad (47)$$
where p is the power coefficient and $E_1 - E_0$ the unit of energy.
Table 5: The eleven lowest energy levels of the novel Gulliver's Travels by Jonathan Swift (Swift, 1726). The values of the Bose-Einstein model are compared with the data, i.e. the numbers of appearances of the words in the text in (a) without the introduction of a power coefficient and in (b) with the introduction of a power coefficient. The comparison for all energy levels can be seen for (a) in Figure 7 (a) and for (b) in Figure 7 (b).
Let us show right away how the introduction of a power law for the energy level spacings gives extra strength to the Bose-Einstein modeling of the texts of stories expressed in human language. This time we choose a much larger text than the two we investigated before, namely the text of the satirical work Gulliver's Travels by Jonathan Swift (Swift, 1726), which contains in total 103184 words, hence of the order of 40 times more than the Winnie the Pooh story and 25 times more than the H. G. Wells story. When analyzed in the same way as the Winnie the Pooh and the H. G. Wells stories, with the hypothesis of equally spaced energy levels, or, which is equivalent, with a power coefficient spacing of the energy levels with power coefficient equal to 1, we find a total of 8294 energy levels without adding the zero number of appearances levels, and the ten highest numbers of appearances and their corresponding words are The (5838), Of (3791), And (3633), To (3400), I (2852), A (2442), In (1976), My (1593), That (1280) and Was (1263). In Figure 7 (a), we represented the log / log version of the 'numbers of appearances' graphs for the Gulliver's Travels story, the blue graph representing the data, the red graph the Bose-Einstein model, and the green graph the Maxwell-Boltzmann model. We can see right away that again the Bose-Einstein model is a much better representation of the data than the Maxwell-Boltzmann model, but we can also see that it is a less good representation of the data than was the case for the Winnie the Pooh story and the H. G. Wells story. Indeed, the red graph indicates noticeably too high values in the low energy levels and for a large region in the middle energy levels it has values that are too low.
Figure 7: The log / log graph of the frequency distributions of the novel 'Gulliver's Travels' (Swift, 1726), (a) with power coefficient p = 1 and (b) with power coefficient p = 1.08. In (a) it is shown how the Bose-Einstein distribution represented by the red graph (Series 2), although still a much better model than the Maxwell-Boltzmann distribution represented by the green graph (Series 3), fails to be as good a model when compared with the Winnie the Pooh story and the H. G. Wells story (Figure 1 (b) and Figure 3 (b)). Indeed, its values (Table 5) are too high in the lowest energy levels and too low in the middle energy levels, when compared to the data represented by the blue graph (Series 1). However, with addition of the power coefficient 1.08, applied to the spacings between energy levels, in (b) it is shown how the Bose-Einstein distribution model is again a very good model for the data. See Table 5 for the explicit values of the eleven lowest energy levels.
In Table 5 (a) we give the values of the Bose-Einstein distribution model for the eleven lowest energy levels, corresponding to the states of the cognitons, i.e. the corresponding words, and compare them with the data, and see that the first ones are too high, while the following ones are too low. For the lowest energy level, with cognitons in state The, we find the Bose-Einstein distribution to have a value of 16454.07 while The appears only 5838 times in the Gulliver's Travels text. This is indeed a big difference, the Bose-Einstein value being almost three times the experimental value of the number of appearances. We find a similar too high value for the Bose-Einstein distribution for the two next states of the cognitons: the state Of has a Bose-Einstein distribution value of 6297.00, while Of appears only 3791 times in the text, and the state And has a Bose-Einstein distribution value of 3893.39, while And appears only 3633 times in the text. For the next states of the cognitons the Bose-Einstein model, however, gives values too low with respect to the experimental data. For To the Bose-Einstein distribution value is 2817.73 while To appears 3400 times in the text, for I the Bose-Einstein distribution value is 2207.73 while I appears 2852 times, for A the Bose-Einstein distribution value is 1814.80 while it appears 2442 times, for In the Bose-Einstein distribution value is 1540.59 while it appears 1976 times, for My the Bose-Einstein distribution value is 1338.35 while it appears 1593 times, for That the Bose-Einstein distribution value is 1183.03 while it appears 1280 times, for Was the Bose-Einstein distribution value is 1060.00 while it appears 1263 times, and for Me the Bose-Einstein distribution value is 960.14 while Me appears 991 times in the text of the Gulliver's Travels story.
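The pattern described above, with the model overshooting the data for the first three words and undershooting afterwards, can be checked directly from the quoted numbers. The following is a minimal sketch that only uses the values stated in the text; the names `levels` and `ratios` are chosen here for illustration.

```python
# (word, appearances in the text, Bose-Einstein value with p = 1), as quoted above.
levels = [
    ("The", 5838, 16454.07),
    ("Of", 3791, 6297.00),
    ("And", 3633, 3893.39),
    ("To", 3400, 2817.73),
    ("I", 2852, 2207.73),
    ("A", 2442, 1814.80),
    ("In", 1976, 1540.59),
    ("My", 1593, 1338.35),
    ("That", 1280, 1183.03),
    ("Was", 1263, 1060.00),
    ("Me", 991, 960.14),
]

def ratios(rows):
    """Ratio of model value to observed count for each word."""
    return [(word, model / data) for word, data, model in rows]

overestimated = [word for word, r in ratios(levels) if r > 1.0]
underestimated = [word for word, r in ratios(levels) if r < 1.0]

for word, r in ratios(levels):
    print(f"{word:5s} model / data = {r:.2f}")
```

A ratio above 1 means the p = 1 model predicts more appearances than are observed; only The, Of and And fall in that group, and the ratio for The is about 2.8.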
We will now apply a 'power law' to the spacings between the energy levels, as per (47), and will see that we can come to a much better match of the Bose-Einstein distribution with the data. Indeed, after applying the power p = 1.08 to the energy spacings between the energy levels, we found an almost perfect match and represented the log / log version of the graphs in Figure 7 (b). The values for the eleven lowest energy levels data compared with the Bose-Einstein model with power coefficient 1.08 are given in Table 5 (b).
Figure 8: A representation of the low energy part of the 'energy distribution' of the novel Gulliver's Travels (Swift, 1726). The blue graph (Series 1) represents the energy radiated by the story per energy level, the red graph (Series 2) represents the energy radiated by the Bose-Einstein model of the story per energy level, and the green graph (Series 3) represents the energy radiated by the Maxwell-Boltzmann model of the story per energy level. We have not added the highest energy levels radiation, but the very slowly descending slope after the maximum 18377.11 has been reached at energy level 43.65 shows that many levels will have to be added with zero number of appearance words for the Bose-Einstein function to approximate zero.
We have tested the Bose-Einstein model on a large number of stories, short stories and long stories of the size of novels, and when we allow the energy spacings between different energy levels to vary according to a power law, we have been able to construct a perfectly matching Bose-Einstein model for the data for all of the considered stories. The power that was each time needed was situated between 0.75 and 1.25.
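To make the effect of the power coefficient on the energy spacings tangible, here is a minimal sketch. It assumes that the power law of (47) takes the form $E_i = E_0 + i^p (E_1 - E_0)$, with $E_0 = 0$ and unit of energy 1; this form is an assumption made for the sketch, chosen because it reproduces the energy level 43.65 quoted for quantum number 33 at p = 1.08. The function names are hypothetical.

```python
def energy_levels(count, p=1.0, e0=0.0, unit=1.0):
    """Energy levels E_i = e0 + unit * i**p for i = 0 .. count - 1 (assumed form)."""
    return [e0 + unit * i ** p for i in range(count)]

def spacings(levels):
    """Differences between consecutive energy levels."""
    return [b - a for a, b in zip(levels, levels[1:])]

linear = energy_levels(34, p=1.0)       # equally spaced levels
gulliver = energy_levels(34, p=1.08)    # power quoted for Gulliver's Travels
compressed = energy_levels(34, p=0.75)  # lower end of the quoted range of powers

print(round(gulliver[33], 2))   # energy of the level with quantum number 33
print(spacings(gulliver)[:3])   # spacings grow with i when p > 1
print(spacings(compressed)[:3]) # spacings shrink with i when p < 1
```

With p = 1 the spacings are all equal, with p > 1 they increase with the level index, and with p < 1 they decrease, which covers the range 0.75 to 1.25 mentioned above.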
We want to emphasize that it is remarkable how the application of the power 1.08 to the linear version of the model for the text of the novel Gulliver's Travels makes the Bose-Einstein model fit the data so well, and we observed the same effect of the introduction of a power on an original linear version of the model for many of the other example texts that we investigated. We mentioned already how those who studied Zipf's law came to add a power to take into account that the gradient of the best fitting straight line in the log / log version of the graphs was not equal to −1. However, the slightly curved concave nature of the graph at the lowest energy level ranks was also noticed, and attempts were made to remedy it by making the law more general still, however in purely ad hoc ways with the only aim to fit the data (Mandelbrot, 1953, 1954; Edmundson, 1972). That this slight concave curvature appears in the Bose-Einstein distribution as a consequence of adding a power to the spacings between energy levels, in exactly a way that makes it fit the data, is in this sense remarkable, and since we saw it happening in many of the other examples for different values of the power, it is a strong indication of the Bose-Einstein model touching on a fundamental property of human language.
In Figure 8, we have represented the low energy part of the 'energy distribution' of the story of Gulliver's Travels (Swift, 1726). The blue graph represents the energy radiated by the story per energy level, the red graph represents the energy radiated by the Bose-Einstein model of the story per energy level, and the green graph represents the energy radiated by the Maxwell-Boltzmann model of the story per energy level. We have not added the highest energy levels radiation because we wanted to show the detail of the low energy distribution, the one where the Bose-Einstein condensate dynamics of the text plays out. The maximum with a value of 18377.11 is reached at energy level 43.65 at quantum number 33, hence very close to the low level energies. The parameters A, B, C and D of the Bose-Einstein and Maxwell-Boltzmann models are Comparing with the Winnie the Pooh story and with the H. G. Wells story we have a higher temperature, kT equals 19356 instead of 722 or 593, a higher fugacity, f equals 0.9998 instead of 0.9951 or 0.9923, and a higher chemical potential, µ equals −3.648 instead of −3.576 or −4.581. As we remarked already, when we compared the parameters for the Winnie the Pooh story and the H. G. Wells story, this is generally what we expect to happen for longer texts.
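The quoted values of temperature, fugacity and chemical potential can be cross-checked against each other. The sketch below assumes the standard statistical-mechanics relation $f = e^{\mu / kT}$ between fugacity and chemical potential, and it assumes that the pairs of comparison values are listed in the order Winnie the Pooh, then H. G. Wells; both are assumptions of this illustration rather than statements taken from the text.

```python
import math

def fugacity(mu, kT):
    """Fugacity f = exp(mu / kT) for chemical potential mu and temperature kT."""
    return math.exp(mu / kT)

# (story, kT, mu, quoted fugacity); the story order is assumed, see lead-in.
quoted = [
    ("Gulliver's Travels", 19356.0, -3.648, 0.9998),
    ("Winnie the Pooh", 722.0, -3.576, 0.9951),
    ("H. G. Wells", 593.0, -4.581, 0.9923),
]

for story, kT, mu, f_quoted in quoted:
    print(f"{story}: f = {fugacity(mu, kT):.4f} (quoted {f_quoted})")
```

Under these assumptions the fugacities computed from kT and µ agree with the quoted ones to four decimals, which suggests that the three parameters of each story are mutually consistent.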

Identity and Indistinguishability
We now want to reflect on what the obtained results can teach us about the notions of 'identity and indistinguishability' with respect to how they are used in human language and in quantum theory. We also want to reflect on the way in which these results support the 'conceptuality interpretation of quantum theory' (Aerts, 2009a, 2010a, 2013, 2014; Aerts et al., 2018d, 2019c). Before we start our analysis, we repeat that all the words appearing in the stories that we considered are 'states' of the 'cogniton', which is the entity that for human language is what a 'photon' is for light, or what a 'rubidium 87 atom' is for the rubidium gas used to fabricate the Bose-Einstein condensate in Anderson et al. (1995). Let us first analyze how the issue of 'identity and indistinguishability' appears in quantum theory. It is structurally speaking a consequence of the generally adopted mathematical rule that wave functions should be symmetrized or anti-symmetrized, depending on whether the quantum particles in question are bosons or fermions. This entails that a multi-particle wave function is always a superposition of products of the single particle building blocks of the multi-particle wave function, such that the different product pieces are chosen in a way that the total wave function is symmetric or anti-symmetric, depending on whether the composed quantum entity is a boson or a fermion. Let us make concrete what this means when we apply a quantum model to the text of the Winnie the Pooh story. The set of energy levels $\{E_0, \ldots, E_{542}\}$ shown in Table 1 is in principle the set of energy levels for a one particle situation in quantum theory, and the many particle situation of a text is then described in a Hilbert space which is the tensor product of, in the case of the Winnie the Pooh story, 2655 Hilbert spaces of which each one describes a one particle situation.
The symmetrization is obtained by a superposition of all possible permutations of the original products and a renormalization to make the wave function a unit vector.
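The symmetrization procedure just described, a superposition of all permutations of the product followed by renormalization, can be sketched for small finite-dimensional examples. This is an illustration only: single-particle states are represented as lists of real amplitudes of equal length, product states as dictionaries keyed by tuples of basis indices, and the names `product_state` and `symmetrize` are hypothetical.

```python
import math
from itertools import permutations, product

def product_state(factors):
    """Amplitudes of a product state, keyed by the tuple of basis indices."""
    dims = [range(len(f)) for f in factors]
    return {idx: math.prod(f[i] for f, i in zip(factors, idx))
            for idx in product(*dims)}

def symmetrize(factors):
    """Superpose all permutations of the factors and renormalize to unit length.

    Assumes all factors live in the same single-particle space and that the
    resulting superposition is not the zero vector.
    """
    total = {}
    for perm in permutations(factors):
        for idx, amp in product_state(perm).items():
            total[idx] = total.get(idx, 0.0) + amp
    norm = math.sqrt(sum(a * a for a in total.values()))
    return {idx: a / norm for idx, a in total.items()}

# Two particles in two orthogonal single-particle states.
psi_a = [1.0, 0.0]
psi_b = [0.0, 1.0]
state = symmetrize([psi_a, psi_b])
print(state)
```

For two orthogonal single-particle states the result has equal amplitudes on the index tuples (0, 1) and (1, 0), so exchanging the two particles leaves the state unchanged. The number of permutations grows factorially, so this direct approach is only usable for a handful of particles, not for the 2655 components of a whole story.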
Let us consider the very simple version of this symmetrization procedure for two boson quantum particles which we call A and B, to see how challenging it is to try to understand its meaning. Both particles, when not part of a composite system, are described by their wave functions $\psi_A(x_A)$ and $\psi_B(x_B)$, where $x_A$ and $x_B$ are the variables we consider for respectively particle A and particle B. When the two particles are joined in a single composite system, the latter is described by the symmetrized wave function
$$\psi(x_A, x_B) = c\,\big(\psi_A(x_A)\,\psi_B(x_B) + \psi_A(x_B)\,\psi_B(x_A)\big) \qquad (49)$$
where c is the renormalization constant. To see to what type of problems this symmetrization procedure leads, suppose for a moment that $x_A$ and $x_B$ are position variables pertaining to separated regions of space $R_A$ and $R_B$, such that for both particles A and B we can understand $\psi_A(x_A)$ and $\psi_B(x_B)$ as being the wave functions representing one particle A mainly present in the region of space $R_A$, and another particle B mainly present in the region of space $R_B$ (for example, $\psi_A(x_A)$ and $\psi_B(x_B)$ are wave packets which have negligible values outside respectively regions $R_A$ and $R_B$ of space). The symmetrized wave function $\psi(x_A, x_B)$ then describes a composite quantum entity which however does not consist of one particle pertaining to the region $R_A$ and another particle pertaining to the region $R_B$, because it also predicts the presence of entanglement correlations between measurements performed in both regions $R_A$ and $R_B$. This entanglement was put into evidence originally by Einstein and two of his students, Boris Podolsky and Nathan Rosen, and the correlations it produces are now called EPR correlations (Einstein, Podolsky & Rosen, 1935).
The theoretical and experimental study of the EPR type of correlations has been one of the major subjects of quantum theory investigation for the last decades and resulted in showing that these correlations are non-local. There is therefore no longer any doubt in the physics community that the EPR type of correlations predicted by the entanglement carried in symmetrized states such as (49) constitute an intrinsic reality in the quantum world, even if there is still an ongoing debate about how to understand them (Bohm, 1951; Bell, 1964, 1987; Aerts et al., 2019a). Such a symmetrization for bosons and anti-symmetrization for fermions, following quantum theory, exists for all bosons and all fermions, which literally means that all identical quantum particles are entangled in this strong way, giving rise to non-local correlations of the EPR type. This state of affairs is still a serious, unsolved and not understood conundrum for theoretical physics and philosophy of physics (Black, 1952; Van Fraassen, 1984; French & Redhead, 1988; Saunders, 2003, 2006; Muller & Seevinck, 2009; Krause, 2010; Dieks & Lubberdink, 2011), and this stands in great contrast with how experimentalists deal with it. For example, photons pertaining to different energy levels, hence carrying different frequencies, are treated by them as distinguishable (Hong, Ou & Mandel, 1987; Knill, Laflamme & Milburn, 2001). The way in which experimentalists look at the 'indistinguishability' of photons has been expressed clearly in more recent times, because of the current importance of the creation of entangled photons for different reasons, e.g. for the fabrication of optically based quantum computers, and hence the focus in quantum optics on how to achieve this.
Spontaneous parametric down conversion, which is a nonlinear optical process that converts one photon of higher energy into a pair of photons of lower energy, has historically been the process for the generation of entangled photon pairs for the well-known Bell's inequality tests (Aspect, Dalibard & Roger, 1982; Weihs et al., 1998). Parametric down conversion is however an inefficient process because it has a low probability, and hence physicists looked for other ways to produce entangled photons. Hence, when a scheme for using linear optics in function of the needs of the production of qubits was presented (Knill, Laflamme & Milburn, 2001), this gave rise to an abundance of new research. Most of the applications of this new research rely on the two-photon interference effect with two 'indistinguishable photons' entering from different sides of a beam splitter and leaving in the same direction after undergoing the so-called Hong-Ou-Mandel interference effect (Hong, Ou & Mandel, 1987). The crucial aspect of Hong-Ou-Mandel interference is the 'indistinguishability of the two photons in the spectral, temporal and polarization degrees of freedom'.
This stimulated the direct study of the 'indistinguishability of photons from different sources', with the finding that 'for photons to behave as indistinguishable bosons neither their frequencies nor their arrival times at the beam splitter can be too different, otherwise they behave as distinguishable quantum particles' (Lettow et al., 2010). What is however most significant for what concerns our take on this, and its value as support of our conceptuality interpretation of quantum theory (Aerts, 2009a, 2010a,b, 2013, 2014; Aerts et al., 2018d, 2019c), is the result of an amazing experiment that was performed in the series of attempts of quantum opticians to create entanglement within linear optics by making use of the interference due to two photon indistinguishability. In this experiment, photons of different frequencies are used to enter the beam splitter, hence given earlier experiments (Lettow et al., 2010), these photons should not behave as indistinguishable bosons, but on the outgoing part of the beam splitter a setup is realized that 'erases' the information about the different frequencies of the incoming photons. The result of the experiment is that this erasing makes the photons of different frequencies behave as indistinguishable bosons. This experiment shows that it is sufficient for the photons to be contextually indistinguishable when they are measured, for them to behave as indistinguishable bosons. We should actually not be amazed by this result, because this is what the so-called 'quantum eraser experiments' are all about (Scully & Druhl, 1982; Kim et al., 2000; Walborn et al., 2002), and if we carefully read the famous analysis of the double-slit experiment by Richard Feynman (Feynman, Leighton & Sands, 1963; Feynman, 1965), the dependence of interference on the possibility of the measurement apparatus to 'know or not know about the available alternatives' was already at the center of his analysis.
Hence, given the above analysis and our conceptuality interpretation of quantum theory, we can now put forward our view on the issue of 'identity and indistinguishability' as follows.
The way in which we understand in a straightforward way 'what identity and indistinguishability are with respect to human language and human mind' teaches us 'what identity and indistinguishability are in quantum theory'.
Let us formulate the reason why it makes sense to state our view as just expressed above given the conceptuality interpretation of quantum theory. The main hypothesis of the latter is that 'the role played by the human mind in relation with language is the same as the role played by a measuring apparatus (but also a heat bath and also a context that is perhaps not willingly used by a human being to make a measurement) in relation with a collection of quantum entities'. The statement above in italics follows directly from this hypothesis.
Let us become more concrete and consider the text of the Winnie the Pooh story of which the words can be found in Table 1. We see that -and the reasoning we develop now can be made for any other of the considered words -the word Piglet corresponds to the cogniton being with energy E 8 , and it appears 47 times in the text of the story. In the quantum wave function that represents the story, which is a multipartite wave function formed by 2655 parts (the total number of words), Piglet is the state associated with 47 of its parts, or components. It is straightforward that each of the Piglet in each of the components can be interchanged with each other of the Piglet in each other of the components without the story being changed even in the slightest way. This means, in physics jargon, that the wave function is symmetric (or anti-symmetric) with respect to the interchange of all these Piglet components. And, the symmetry (or anti-symmetry) is a consequence of their 'absolute indistinguishability'. It is also easy to understand that this 'absolute indistinguishability' is due to Piglet being a concept, and not an object. Indeed, let us imagine for a moment, just to make the above more clear still, that the scenery of the story would be pictured in some physical theatrical form with real piglets on the places where now the concept Piglet appears in the text. If we interchanged these real piglets, of course this would influence the physical scenery of the story. It is indeed not possible to 'interchange a real physical piglet with another real physical piglet without changing the whole of the physical scenery'. That is why real piglets when put in baskets will follow a Maxwell-Boltzmann statistics and not a Bose-Einstein statistics as conceptual piglets do. 
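The interchange argument can be illustrated with a toy example in which a text is a list of word tokens. The sentence below is hypothetical and is not a quotation from the Winnie the Pooh story; the helper `swap` is likewise introduced only for this sketch.

```python
def swap(tokens, i, j):
    """Return a copy of the token list with positions i and j interchanged."""
    out = list(tokens)
    out[i], out[j] = out[j], out[i]
    return out

# Hypothetical sentence used only for illustration.
sentence = "Piglet said that Piglet and Robin went out".split()

same_word_swap = swap(sentence, 0, 3)   # interchange the two Piglet tokens
other_word_swap = swap(sentence, 0, 5)  # interchange Piglet with Robin

print(same_word_swap == sentence)   # True: the text is unchanged
print(other_word_swap == sentence)  # False: the text is a different one
```

Interchanging two occurrences of the same word yields exactly the same token list, whereas interchanging occurrences of two different words yields a different text, which is the distinction between indistinguishable and distinguishable states of the cogniton drawn in this section.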
The 'interchanging of concepts in a piece of text', hence in the components of the wave function representing this piece of text, is an intrinsically different operation than the 'interchange of objects in space', and the basic hypothesis of the conceptuality interpretation of quantum theory consists in believing that quantum particles are like concepts, and that the reason why we find their behavior not understandable is because we think of them as objects. One of the crucial difficulties when thinking of quantum particles as objects comes to the surface exactly in their behavior as indistinguishable entities, as for objects this is something impossible to understand, while for concepts it is something straightforward and natural.
Let us show now how we can also easily understand the difference we indicated above between theoretical physicists who are struggling with the issue that, following quantum theory, all photons should be identical, in contrast with experimental physicists who pragmatically consider photons of different frequency as distinguishable and hence not identical. Consider again the Winnie the Pooh story. Although we all understand right away that all concepts in the Piglet state are 'absolutely indistinguishable', we are also convinced that two different energy states of the cogniton are distinguishable. For example, energy state $E_{43}$, which is the concept Robin, appearing 12 times in the text, is distinguishable from Piglet. It is even very important for the meaning carried by the story that these two states are distinguishable. In a very similar way, for any measuring apparatus that is sensitive to the frequency of light, it is very important that a red photon is distinguishable from a blue photon, e.g. for our eyes, but also, we suppose, for plants practicing photosynthesis. It is even the 'essence of the measuring apparatus' to 'distinguish these two states'. However, when a special purpose apparatus is fabricated such that, when we read the Winnie the Pooh story, the points where Piglet appears are no longer distinguishable from the points where Robin appears (and there is a multitude of ways we can imagine this to be done), the two cognitons that are still read by us will be indistinguishable. Again, such an operation, consisting of completely erasing the Piglet nature and Robin nature of both concepts, can only work 'because both are concepts and not objects'.
Figure 9: Three typical configurations of two particles in two states
Underneath all of the words of the Winnie the Pooh text is indeed the more abstract notion of Concept, and hence we can bring all words into this abstract state of just being an unspecified concept in the text, which would make all of them indistinguishable. There are different ways of 'erasing', some closer to the ontology of the concepts, others closer to the measuring itself, and that is also why the quantum eraser effect can be understood very well within the conceptuality interpretation (see Aerts (2009a), Section 4.4).
Does the above mean that 'words in different states are distinguishable' and 'words in the same state are indistinguishable', and that this clarifies the whole issue? Not yet, so let us refine our analysis. It certainly does not mean that 'words in different states are objects'. They are concepts, and hence behave like concepts and not like objects. And since they are concepts, their 'distinguishability' when they are in different states is not what 'distinguishability' means for objects. We have to return to the main subject of our investigation to find this more subtle form of behavior of words that, in different states, are distinguishable as concepts, a behavior which is at the origin of the disagreements between theoreticians and experimentalists when it comes to considering photons of the same frequency and photons of different frequencies. To start with, let us not forget that the radiation law for photons, including photons of different frequencies, is derived in statistical mechanics by considering these photons to obey Bose-Einstein statistics. Since in the foregoing sections we showed that Bose-Einstein statistics is valid for pieces of text of stories containing a mixture of distinguishable and indistinguishable words, it should be possible to identify what happens differently with distinguishable concepts as compared to distinguishable objects, such that distinguishable concepts obey Bose-Einstein statistics while distinguishable objects obey Maxwell-Boltzmann statistics. Let us start our analysis by considering a very typical and simple situation commonly used to illustrate the difference between Bose-Einstein statistics and Maxwell-Boltzmann statistics. In Figure 9 we have represented two particles, the balls, in two states, the boxes, and three different configurations of this situation.
The first configuration consists of the two particles in the first state, the second configuration of the two particles in the second state, and the third configuration of one particle in one state and the other particle in the other state. If the two particles are indistinguishable in the way quantum indistinguishability is customarily looked upon, which is also the reason that this example is often displayed, the probabilities attached within a Bose-Einstein statistics model are 1/3, 1/3 and 1/3 for the three configurations. However, if the two particles are identical only in the classical sense, and hence remain distinguishable individuals, the probabilities attached within a Maxwell-Boltzmann statistics are 1/4, 1/4 and 1/2. The reason is that the last configuration, of one particle in one state and the other particle in the other state, is realized in two ways classically: one way and its permuted way are different realities. Within 'quantum indistinguishability' these two are not different realities, and given our conceptuality interpretation this would be explained by them indeed not being different realities if they are concepts. What, however, if we consider the three configurations of Figure 9 for distinguishable states of the cogniton, hence for distinguishable concepts? To make things more concrete, suppose we consider the concepts Cat and Dog and the configurations Two Cats, Two Dogs and A Cat And A Dog. Let us remark that this is exactly the situation we have already studied in great detail, showing Bose-Einstein statistics to be a better representation than Maxwell-Boltzmann statistics (Aerts, 2009a; Aerts, Sozzo & Veloz, 2015b; Beltran, 2019). How can we understand that even for distinguishable concepts Bose-Einstein is a better statistics than Maxwell-Boltzmann? The reason is the presence of 'entanglement' and 'superposition' also for distinguishable concepts like Cat and Dog.
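The counting behind these two sets of probabilities can be made explicit. The following is a minimal sketch, assuming that all microstates are equally likely; the names `STATES`, `maxwell_boltzmann` and `bose_einstein` are illustrative choices and not part of the original analysis.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Hypothetical labels for the two states (the boxes) of Figure 9.
STATES = ("box 1", "box 2")


def maxwell_boltzmann(n_particles=2):
    """Labelled particles: each assignment of a state to each particle
    is one equally likely microstate."""
    assignments = list(product(STATES, repeat=n_particles))
    # Forget the particle labels to obtain the visible configuration.
    counts = Counter(tuple(sorted(a)) for a in assignments)
    return {cfg: Fraction(c, len(assignments)) for cfg, c in counts.items()}


def bose_einstein(n_particles=2):
    """Indistinguishable particles: each occupation pattern is itself
    one equally likely microstate."""
    configs = sorted({tuple(sorted(a))
                      for a in product(STATES, repeat=n_particles)})
    return {cfg: Fraction(1, len(configs)) for cfg in configs}


if __name__ == "__main__":
    for name, dist in (("Maxwell-Boltzmann", maxwell_boltzmann()),
                       ("Bose-Einstein", bose_einstein())):
        print(name)
        for cfg, p in sorted(dist.items()):
            print("  ", cfg, p)
```

The only difference between the two functions is whether the permuted assignment of the mixed configuration is counted as a separate microstate, which is the point made in the text.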
Indeed, the probabilities 1/3, 1/3, 1/3 with Bose-Einstein, versus 1/4, 1/4, 1/2 with Maxwell-Boltzmann, actually mean that for Maxwell-Boltzmann there are many more microstates in the third configuration than in each of the first two configurations, namely twice as many. When there is no entanglement and no superposition, and hence Cat and Dog are 'separated', we can understand this. This 'is' what happens when Cat and Dog are objects, hence a real cat and a real dog. Let us make this concrete. Suppose we visit a farm where a lot of cats and dogs live, equal in number, and we receive as a present two of them, randomly chosen for us by the farmer. Then the chance that the gift will be a cat and a dog is double the chance that it will be two cats, or two dogs. What, however, if we promise a child two pets and let him or her choose, for each pet, whether it is a cat or a dog? The microstates that come into play in this case exist in the child's conceptual world, and there is no reason why within this conceptual world there would be twice as many microstates for the choice of a cat and a dog as for the choice of two cats or of two dogs. If two children each choose one pet, independently of each other, Maxwell-Boltzmann statistics will again be the better one, because the number of microstates of the combination of the two choices will be double the number of microstates playing a role for each child apart. We investigated this situation in many different and more complex configurations of this type, with the result that Bose-Einstein is a better statistics than Maxwell-Boltzmann to model the situation (Aerts, 2009a; Aerts, Sozzo & Veloz, 2015b; Beltran, 2019).
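The farm example can be checked with exact arithmetic. The sketch below assumes that the farmer draws the two animals uniformly at random and without replacement from `n_each` cats and `n_each` dogs; the function name and the parameter are hypothetical. Under this assumption the 'double chance' holds in the limit of a large farm, the exact ratio being 2n/(n-1).

```python
from fractions import Fraction


def farm_gift_probabilities(n_each):
    """Exact probabilities of (two cats, two dogs, a cat and a dog) for a
    gift of two animals drawn without replacement from n_each cats and
    n_each dogs (requires n_each >= 2)."""
    total = 2 * n_each
    pairs = Fraction(total * (total - 1), 2)           # unordered pairs
    two_cats = Fraction(n_each * (n_each - 1), 2) / pairs
    two_dogs = two_cats                                # by symmetry
    cat_and_dog = Fraction(n_each * n_each) / pairs
    return two_cats, two_dogs, cat_and_dog


if __name__ == "__main__":
    for n in (2, 10, 1000):
        cats, dogs, mixed = farm_gift_probabilities(n)
        # The ratio approaches 2 as the farm grows.
        print(n, cats, dogs, mixed, float(mixed / cats))
```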
Actually, we already noticed in our study of quantum entanglement with concept combinations that the violation of Bell's inequalities comes about because the combined exemplars (microstates) are exemplars of the combined concept directly (giving rise to the Bose-Einstein situation), rather than exemplars of the separate concepts that are combined afterwards (giving rise to the Maxwell-Boltzmann situation) (Aerts & Sozzo, 2011; Aerts et al., 2019b,a). In our investigation of quantum superposition with concept combinations the situation is even more Bose-Einstein, because the exemplars of the combined concepts that play a role (microstates) are no longer combinations of exemplars of the single concepts. This means that their number will on average be equal to the number of exemplars of the single concepts, so that the situation fulfills the basic requirement to be modeled by Bose-Einstein statistics (Aerts & Gabora, 2005b; Aerts et al., 2010, 2012; Sozzo, 2014; Aerts, Sozzo & Veloz, 2015a; Sozzo, 2015). The insight that combined distinguishable concepts also tend to give rise to Bose-Einstein rather than Maxwell-Boltzmann statistics explains why, for Bose-Einstein statistics to be applicable, it is so important that the thermal de Broglie wave lengths are large with respect to the distance between the quantum particles, a condition whose equivalent for human language is always fulfilled. It also explains why the original Rayleigh-Jeans radiation law for light, which is the Maxwell-Boltzmann version of the Planck radiation law, is satisfied for low frequencies.

We have not yet reflected on 'identity' in itself. With respect to 'the identity' of a quantum particle, it can be proven that when the wave function of two identical quantum particles is considered, there does not exist a self-adjoint operator in the Hilbert space of their states that can represent a measurement that would identify one of the quantum particles (French & Redhead, 1988; Butterfield, 1993).
Can a concept be said to have an identity? Not in the way we understand identity for an object. What can be attributed to a concept is a 'number' indicating 'the number of times it is', and that, one could say, is what can be seen as substituting what identity is for an object. The fact that also a 'number of times it is' can be attributed to a quantum particle is again a support for the hypothesis of our conceptuality interpretation.
Taking into account our above analysis, what we can understand about the nature of reality goes further than what we have formulated till now, in case we interpret quantum theory following the conceptuality interpretation. As we mentioned already, we showed in earlier work that 'combinations of concepts' give rise to quantum superposition (Aerts, Sozzo & Veloz, 2015a). Every sentence in a text is a combination of concepts. Every paragraph in a text is also a combination of concepts, since sentences, as combinations of concepts, combine with each other to form paragraphs. Depending on the nature of the text, this process, of increasingly larger pieces of the text being essentially 'combinations of concepts', keeps going on, certainly up to the level of stories, where the overall meaning content of a story glues all its concepts together in specific combinations. This implies that superpositions will also form for large subsets of combined concepts, and we believe that this is exactly the mechanism which we call 'understanding' when the human mind is engaging in these pieces of text. More concretely, suppose the human mind reads a piece of text. When reading, there is no direct focus on single words as a collection. On the contrary, when the words are read, a 'new state is being formed', which integrates 'the meaning carried by the combination of all the concerned concepts'. This new state, carrying the meaning of the piece of text formed by the combination of these words, is exactly the superposition state which we identified already in earlier work (Aerts, Sozzo & Veloz, 2015a). It is these superposition states that form again and again by combining concepts of sentences or paragraphs, that superpose again in the course of the reading of the whole text, and that lead to the understanding of the whole piece of text. A similar process takes place when talking, thinking or writing, albeit in general in a more discontinuous and complex way than when reading.
We believe that what happens with a physical Bose gas close to its Bose-Einstein condensate state can be understood similarly. The role played by the human mind with respect to the text is now played by the heat bath and the measuring apparatuses applied with respect to the Bose gas. When the temperature is low enough and the diluteness of the gas is such that the phase space density (25) satisfies (26), hence the thermal de Broglie wave length (17) is larger than the distance between the atoms, this process of superposition formation starts to happen. Indeed, the de Broglie waves of the different atoms will overlap heavily and give rise to these superpositions, which means that the process which we call 'understanding' when the human mind and text are involved takes place in the Bose gas with the heat bath. These superpositions are new emergent states that do not pertain to one of the atoms any longer, but represent several atoms joining in a new entity, just like the several combined concepts represent an emergent meaning. The more the temperature is lowered and the density of the gas is kept such that the de Broglie waves overlap on larger and larger regions of the gas, the more new states are formed containing a synthetic material reality different from single atoms. The Bose-Einstein condensate is an ultimate state where all the atoms have been gathered in the lowest energy state so that for the whole gas a single new state has emerged. The stories that we have studied are in states close to this Bose-Einstein condensate state, where synthetic parts of combined concepts emerge in superposition states and the sizes of these parts are determined by the state of understanding of the human mind of the stories.

Appendices

A The Particle in a Box
Schrödinger's equation is the fundamental equation of quantum theory, and we are specifically interested in its time independent form, because that is the form which gives rise to the quantum eigenstates of the energy, hence states with a predictable fixed energy for a specific energetic situation. To describe this energetic situation we can take inspiration from what we know from classical physics, where the energy consists of a part of kinetic energy K plus a part of potential energy U, and hence the total energy E is the sum of both

$$E = K + U$$

For the specific energetic situation of a 'particle in a box', we treat the particle as a free particle as long as it is inside the box, which means that its kinetic energy K equals $p^2/2m$ and the potential energy is a potential which is zero inside the box and infinite in the region outside of the box. The Schrödinger equation 'inside the box', where the potential equals zero, becomes the equation for a free particle with mass m, hence

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} = E\,\psi(x)$$

with $\hbar = h/2\pi$, which is equivalent to the equation

$$\frac{d^2\psi(x)}{dx^2} + \frac{2mE}{\hbar^2}\,\psi(x) = 0$$

When we put

$$k = \frac{\sqrt{2mE}}{\hbar}$$

the Schrödinger equation becomes

$$\frac{d^2\psi(x)}{dx^2} + k^2\psi(x) = 0$$

which is a second order differential equation of which the general solution is well known

$$\psi(x) = a\sin kx + b\cos kx \tag{55}$$

where a and b are constants, which can be complex numbers, that can be chosen depending on extra conditions to be satisfied. Remark that (55) is the wave function representing a free quantum particle in one dimension, because we have not yet expressed in any way the presence of the infinite potential representing the box. Suppose we place the box between x = 0 and x = l, where l is the width of the box, as we have shown in Figure 10. This means that at x = 0 and x = l we need to have $\psi(0) = \psi(l) = 0$, expressing that the walls of the potential representing the box are infinite. Making use of (55) this gives

$$0 = \psi(0) = b \;\Leftrightarrow\; \psi(x) = a\sin kx \tag{56}$$

$$0 = \psi(l) = a\sin kl \;\Leftrightarrow\; \sin kl = 0 \;\Leftrightarrow\; k = \frac{n\pi}{l} \qquad n = 1, 2, \ldots$$

Since $E = \hbar^2k^2/2m$, the energy is quantized as well

$$E_n = \frac{n^2\pi^2\hbar^2}{2ml^2} = \frac{n^2h^2}{8ml^2} \qquad n = 1, 2, \ldots$$
Figure 10: A graphical representation of the 'particle in a box' as solution of the time independent Schrödinger equation with infinite potential well between 0 and l. The wave functions are quantized standing waves inside the box, with wave lengths proportional to the width l of the box, and also the energies are quantized, being inversely proportional to $l^2$, i.e. smaller boxes give rise to shorter wave lengths and higher energies. The energies are quadratic in the quantum number n, so that the spacings between consecutive levels grow with n. We present here the four lowest energy levels.
This means that the energy of the particle is different from zero even in the ground state. This energy is called the 'zero point energy'. It means that quantum mechanically the particle is unable to 'not move'; a complete lack of motion would indeed violate the Heisenberg uncertainty relations. In Figure 10 we have represented the energetic situation of the box described by an infinite potential well, and drawn the wave functions corresponding to the first four quantum numbers n = 1, 2, 3 and 4. We can see that the wave functions are 'standing waves' that can be imagined to be the wave modes in a string whose outer ends are fixed to the walls of the potential well. Remark that, since $k = n\pi/l$, the wave lengths $2l/n$ are proportional to the width l of the box, while the energies are inversely proportional to $l^2$, i.e. smaller boxes give rise to shorter wave lengths and higher energies. This explains some of the differences between the macro-world, where l is large, and hence the energies and the spacings between them are small, such that typical quantum effects are absent, and the micro-world, where the energies and their spacings are large and quantum superposition effects can be abundant (Aerts, 2014).
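The dependence of the levels on n and l can be illustrated numerically. The following is a minimal sketch, assuming an electron in a box of width 1 nm (both illustrative choices that do not come from the text), the standard closed form $E_n = n^2h^2/8ml^2$ that follows from $k = n\pi/l$, and the normalization $a = \sqrt{2/l}$, which the text leaves open. All function names are hypothetical.

```python
import math

H = 6.62607015e-34              # Planck constant in J s (exact SI value)
M_ELECTRON = 9.1093837015e-31   # electron mass in kg, illustrative choice


def box_energy(n, mass, width):
    """Energy of level n: E_n = n^2 h^2 / (8 m l^2), in joule."""
    if n < 1:
        raise ValueError("the quantum number n starts at 1")
    return n ** 2 * H ** 2 / (8 * mass * width ** 2)


def box_wavelength(n, width):
    """Wave length of the standing wave, 2 l / n, from k = n pi / l."""
    return 2 * width / n


def box_wavefunction(n, x, width):
    """Standing wave a sin(kx) with the assumed normalization a = sqrt(2/l)."""
    return math.sqrt(2 / width) * math.sin(n * math.pi * x / width)


if __name__ == "__main__":
    width = 1e-9  # hypothetical box width of one nanometre
    for n in range(1, 5):
        print(n, box_energy(n, M_ELECTRON, width), box_wavelength(n, width))
```

Halving `width` in this sketch multiplies every energy by four and halves every wave length, and the ground state energy stays strictly positive for any finite width.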

B The Quantum Harmonic Oscillator
The potential energy of a harmonic oscillator is traditionally written as follows

$$U(x) = \frac{1}{2}kx^2$$

where k is the force constant, which is a measure of the stiffness of the spring in case we realize the harmonic oscillator by means of a particle with a mass attached to a spring. We can also write the potential energy as a function of the frequency $\nu$ of the oscillator and the mass m of the particle by using $k = 4\pi^2\nu^2m$, and hence the potential energy then becomes

$$U(x) = 2\pi^2\nu^2mx^2$$

This gives rise to the following Schrödinger equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi(x)}{dx^2} + 2\pi^2\nu^2mx^2\,\psi(x) = E\,\psi(x)$$

with $\hbar = h/2\pi$. The 'particle in a box Schrödinger equation' which we considered in Appendix A was easy to solve, and hence we constructed its solution explicitly. The 'quantum harmonic oscillator Schrödinger equation' is less straightforward to solve, and hence we will give its solutions directly. They are again quantized, and to write them in a simpler form we introduce

$$\alpha = \frac{4\pi^2m\nu}{h} \qquad y = \sqrt{\alpha}\,x$$

The general normalized solutions of the Schrödinger equation are then

$$\psi_n(y) = \left(\frac{\alpha}{\pi}\right)^{\frac{1}{4}}\frac{1}{\sqrt{2^n n!}}\,H_n(y)\,e^{-\frac{y^2}{2}}$$

where $H_n(y)$ is the Hermite polynomial of degree n

$$H_n(y) = (-1)^n e^{y^2}\frac{d^n}{dy^n}\left(e^{-y^2}\right)$$

and hence for the lowest energy levels, including the seven illustrated in Figure 11, these polynomials are the following

$$H_0(y) = 1 \qquad H_1(y) = 2y$$

$$H_2(y) = 4y^2 - 2 \qquad H_3(y) = 8y^3 - 12y$$

$$H_4(y) = 16y^4 - 48y^2 + 12 \qquad H_5(y) = 32y^5 - 160y^3 + 120y \tag{68}$$

$$H_6(y) = 64y^6 - 480y^4 + 720y^2 - 120 \qquad H_7(y) = 128y^7 - 1344y^5 + 3360y^3 - 1680y$$

These solutions of the Schrödinger equation lead to a sequence of evenly spaced energy levels characterized by the quantum number n

$$E_n = \frac{h\nu}{2} + nh\nu \tag{70}$$

Figure 11: A graphical representation of a 'quantum harmonic oscillator' as solution of the time independent Schrödinger equation with the harmonic oscillator potential. The wave functions are quantized and also the energies are quantized. The energies are linear in the quantum number n, so that consecutive levels are evenly spaced. We present here the seven lowest energy levels.
Like for the particle in a box, we have a zero point energy different from zero, namely $E_0 = h\nu/2$. The energy spectrum is reminiscent of the energy spectrum of electromagnetic radiation, and indeed, this is a consequence of the traditional way of considering electromagnetic radiation as a collection of harmonic oscillators. The wave functions are essentially Gaussians multiplied by the Hermite polynomials. Hence, as shown in Figure 11, the wave function corresponding to the lowest energy level is a pure Gaussian, since $H_0(y) = 1$, and the higher levels have a pattern fluctuating between positive and negative values, reaching outside of the parabola representing the harmonic oscillator potential, due to the presence of the Hermite polynomials. The harmonic oscillator is one of the foundational situations of quantum theory. Together with the particle in a box, which we presented in Appendix A, it can be used in many situations as a first approximation, which however usually gives very trustworthy indications for a more sophisticated solution. When a quantum mechanical particle is confined as a consequence of the presence of a macroscopic system, the particle in a box, treating the macroscopic confinement as a box, will serve very well as a first approximation. For complex molecules that interact quantum mechanically, the quantum harmonic oscillator will serve very well as a model in the lowest energy levels, where its potential is a good approximation for describing the vibrations that take place as part of the interaction between the molecules.
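The listed Hermite polynomials and the evenly spaced levels can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: instead of the derivative formula given above it uses the standard recurrence $H_{n+1}(y) = 2yH_n(y) - 2nH_{n-1}(y)$, which does not appear in the text, and the function names are hypothetical.

```python
H_PLANCK = 6.62607015e-34  # Planck constant in J s (exact SI value)


def hermite_coefficients(n_max):
    """Integer coefficients, lowest power first, of H_0 .. H_n_max,
    built with the recurrence H_{n+1} = 2y H_n - 2n H_{n-1}."""
    polys = [[1]]                      # H_0(y) = 1
    if n_max >= 1:
        polys.append([0, 2])           # H_1(y) = 2y
    for n in range(1, n_max):
        prev, cur = polys[n - 1], polys[n]
        nxt = [0] * (n + 2)
        for power, c in enumerate(cur):
            nxt[power + 1] += 2 * c    # the term 2y H_n
        for power, c in enumerate(prev):
            nxt[power] -= 2 * n * c    # the term -2n H_{n-1}
        polys.append(nxt)
    return polys


def oscillator_energy(n, nu):
    """Energy of level n: E_n = h nu / 2 + n h nu, in joule."""
    return H_PLANCK * nu * (n + 0.5)


if __name__ == "__main__":
    for n, coeffs in enumerate(hermite_coefficients(7)):
        print(n, coeffs)
```

Reading the coefficient lists from the lowest power upwards gives back the polynomials $H_0$ to $H_7$ listed in this appendix, and the difference between consecutive values of `oscillator_energy` is always $h\nu$.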