# Quantum Entanglement in Corpuses of Documents


## Abstract

We show that data collected from corpuses of documents violate the Clauser-Horne-Shimony-Holt version of Bell’s inequality (CHSH inequality) and therefore indicate the presence of quantum entanglement in their structure. We obtain this result by considering two concepts and their combination and coincidence operations consisting of searches of co-occurrences of exemplars of these concepts in specific corpuses of documents. Measuring the frequencies of these co-occurrences and calculating the relative frequencies as approximate probabilities entering in the CHSH inequality, we obtain manifest violations of the latter for all considered corpuses of documents. In comparing these violations with those analogously obtained in an earlier work for the same combined concepts in psychological coincidence experiments with human participants, also violating the CHSH inequality, we identify the entanglement as being carried by the meaning connection between the two considered concepts within the combination they form. We explain the stronger violation for the corpuses of documents, as compared to the violation in the psychology experiments, as being due to the human mind reaching a broader domain of meaning than any corpus of documents and being possibly also actively influenced during the experimentation. We mention some of the issues to be analyzed in future work such as the violations of the CHSH inequality being larger than the ‘Cirel’son bound’ for all of the considered corpuses of documents.

## Keywords

Corpuses of documents, Quantum entanglement, CHSH inequality, Natural language processing, Information retrieval

## 1 Introduction

Quantum entanglement in human language was studied within the Brussels approach (Aerts et al. 2016) to quantum cognition (Aerts and Aerts 1995; Aerts and Gabora 2005a, b; Aerts 2009; Pothos and Busemeyer 2009; Khrennikov 2010; Busemeyer et al. 2011; Busemeyer and Bruza 2012; Aerts et al. 2013a, b; Kvam et al. 2015; Dalla Chiara et al. 2015a, b) by means of psychological experiments on human participants about the way language is used by them, and several aspects of it were researched (Aerts and Sozzo 2011, 2014; Aerts et al. 2018d, e; Aerts Arguëlles 2018). In the present article we will investigate how quantum entanglement appears in corpuses of documents. We will use the same example of the two concepts *Animal* and *Acts*, that entangle in the concept combination *The Animal Acts*, which were studied in the above mentioned psychological experiments and for which it was proved that the Clauser-Horne-Shimony-Holt (Clauser et al. 1969) version of Bell’s inequality (Bell 1964, 1987) (CHSH inequality) is violated by the relative frequencies of outcomes of the psychological experiments as approximations of the probabilities for these outcomes to occur (Aerts and Sozzo 2011, 2014). This time however, instead of collecting data from the psychological experiments, we will collect data from searches of frequencies of appearance of the respective combinations of exemplars in several corpuses of documents. We will show that, as in the case of the psychological experiments, the collected data violate the CHSH inequality, which hence indicates the presence of entanglement, in the used corpuses of documents. 
We will use three corpuses of documents for our investigation: the corpus ‘Google Books’, which is freely available at https://googlebooks.byu.edu/x.asp, the corpus ‘News on Web’ (NOW), which is freely available at https://corpus.byu.edu/now/, and the ‘Corpus of Contemporary American English’ (COCA), which is freely available at https://corpus.byu.edu/coca/. Google Books is the biggest available corpus, with 560 billion words of books ranging over centuries and scanned by Google. Then comes the NOW corpus, which has 6 billion words of texts from news and periodicals, and finally COCA has 560 million words of texts of the types of stories. We have tested the CHSH inequality on all three corpuses of text so that we could identify the consistency of its violation and compare it with the violation encountered in the psychological experiments on human participants for the same combination of concepts (Aerts and Sozzo 2011, 2014).

The present work contributes to a further study of the presence of entanglement in human cognition (Bruza et al. 2009; Aerts and Sozzo 2011, 2014; Bruza et al. 2015; Gronchi and Strambini 2017; Aerts et al. 2018d, e; Aerts Arguëlles 2018) as studied within the quantum cognition research programme. However, this time we identify entanglement due to the collection of data violating the CHSH inequality in the structure of corpuses of documents, which means that the present result pertains to a domain of research closely related to quantum cognition, one which investigates the presence of quantum structure in computer science with applications to information retrieval and natural language processing. This domain of research, ‘quantum structures in computer science’, developed from 2004 onwards in parallel with quantum cognition into a flourishing research field (van Rijsbergen 2004; Aerts and Czachor 2004; Widdows 2004; Schmitt and Nurnberger 2007; Melucci 2008; Schmitt et al. 2008; Bruza et al. 2009b; Coecke et al. 2010; Piwowarski et al. 2010; Song et al. 2010; Frommholz et al. 2010; Zellhöfer et al. 2011; Di Buccio et al. 2011; Melucci 2015; Wang et al. 2016). Two European Union funded consortia, ‘Quantum Contextual Information Access and Retrieval’ (QONTEXT) between 2010 and 2013, and ‘Quantum Information Access and Retrieval Theory’ (QUARTZ)^{1} between 2017 and 2020, to which the authors of the present article are connected, have substantially contributed to the development of the field. The research presented in this article is part of a general investigation of identification of quantum structures, such as contextuality, interference, superposition, entanglement, Bose-Einstein and Fermi-Dirac statistics, in the texts of corpuses of documents, for which recently a general framework was proposed (Aerts et al. 2018a).

We summarize the content of the present paper in the following. In Sect. 2, we analyse in detail the coincidence operation we performed with corpuses of documents and present the results we obtained in the case of Google Books. In Sect. 3, we instead present the empirical data we collected using NOW and COCA as corpuses of documents. In all cases, we find a significant violation of the CHSH inequality which goes beyond the well known ‘Cirel’son bound’ for quantum mechanical measurements (Cirel’son 1980). The obtained result is compared in Sect. 4 with the violation of the CHSH inequality that was obtained in experiments with human participants (Aerts and Sozzo 2011, 2014). Next, entanglement identified through collocate-type co-occurrences is studied in Sect. 5. Finally, Sect. 6 offers some concluding remarks on the obtained results.

## 2 An Entangled Combination of Concepts in Google Books

The CHSH inequality reads

\[-2\le E(A^{\prime },B^{\prime })+E(A,B^{\prime })+E(A^{\prime },B)-E(A,B)\le 2. \tag{1}\]

Now, (1) is violated in case the intermediate term \(E(A^{\prime },B^{\prime })+E(A,B^{\prime })+E(A^{\prime },B)-E(A,B)\) of the inequality is smaller than \(-2\) or bigger than 2. For the violation obtained in Aerts and Sozzo (2011) such intermediate term is equal to 2.42. We will show that we obtain for all used corpuses of documents violations that are even stronger and will put forward a hypothesis of why this is the case. Before proceeding with the calculation, we want to explain the content of the CHSH inequality and how we will collect the data on the considered corpuses of text leading to its violation.

The intermediate term of the CHSH inequality is formed by the ‘expectation values’ of four coincidence experiments or operations *e*(*A*, *B*), \(e(A,B^{\prime })\), \(e(A^{\prime },B)\) and \(e(A^{\prime },B^{\prime })\). For example, *e*(*A*, *B*) is the experiment or operation consisting in jointly performing the measurements of concepts *A* and *B*, and analogously \(e(A,B^{\prime })\), \(e(A^{\prime },B)\) and \(e(A^{\prime },B^{\prime })\) consist in jointly performing the measurements of respectively concepts *A* and \(B^{\prime }\), \(A^{\prime }\) and *B*, and \(A^{\prime }\) and \(B^{\prime }\). The ‘expectation values’ *E*(*A*, *B*), \(E(A,B^{\prime })\), \(E(A^{\prime },B)\) and \(E(A^{\prime },B^{\prime })\) will be calculated from the data gathered by the experiments or operations *e*(*A*, *B*), \(e(A,B^{\prime })\), \(e(A^{\prime },B)\) and \(e(A^{\prime },B^{\prime })\), respectively, as it will be explained in the following.
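As an illustration, the intermediate term and the violation condition can be written as a short Python sketch. The function names `chsh_term` and `violates_chsh` are introduced here for illustration only and do not come from the original analysis; the example input is the extreme case in which all four expectation values take their most favourable value.

```python
def chsh_term(e_ab, e_abp, e_apb, e_apbp):
    """Intermediate term E(A',B') + E(A,B') + E(A',B) - E(A,B)."""
    return e_apbp + e_abp + e_apb - e_ab


def violates_chsh(term):
    """The CHSH inequality is violated when the term lies outside [-2, 2]."""
    return term < -2 or term > 2


# Extreme case: perfect anti-correlation for (A, B), perfect correlation elsewhere.
extreme = chsh_term(e_ab=-1.0, e_abp=1.0, e_apb=1.0, e_apbp=1.0)
print(extreme, violates_chsh(extreme))  # 4.0 True
```

Since each expectation value lies between \(-1\) and \(+1\), the intermediate term can never exceed 4 in absolute value.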

Let us first explain what the experiment or operation *e*(*A*, *B*) is. Consider for the concept *Animal* the two exemplars *Horse* and *Bear*, as outcomes, and for the concept *Acts* the two exemplars *Growls* and *Whinnies*, as outcomes. The four combinations *The Horse Growls*, *The Horse Whinnies*, *The Bear Growls* and *The Bear Whinnies*, where each of them is an exemplar of *The Animal Acts*, constitute the four outcomes of the experiment or operation *e*(*A*, *B*), where *A* and *B* are jointly measured. To obtain the probabilities associated with these four outcomes, we proceed as follows. In the case of the psychological experiment (Aerts and Sozzo 2011) we proposed the four possibilities to each one of the participants in the experiment, and asked them to choose one of the four. The probabilities were then easily calculated as the large number limit of the relative frequencies of the choices.

The expectation value *E*(*A*, *B*) is then, using (2)–(5),

\[E(A,B)=P(A_1,B_1)-P(A_1,B_2)-P(A_2,B_1)+P(A_2,B_2)=-1.\]

The signs in this expression arise as follows. The choice for *Animal* which is *Horse* is given the value \(+\,1\), while the choice for *Animal* which is *Bear* is given the value \(-\,1\). Similarly, the choice for *Acts* which is *Growls* is given the value \(+\,1\), and the choice for *Acts* which is *Whinnies* is given the value \(-\,1\). Then, combining these values, we obtain that the choice *The Horse Growls* is associated with the value \(+\,1\), obtained by multiplying the value \(+\,1\) for *Horse* with the value \(+\,1\) for *Growls*. Similarly, the choice *The Horse Whinnies* is \(-\,1\) (multiplying \(+\,1\) with \(-\,1\)), the choice *The Bear Growls* is \(-\,1\) (multiplying \(-\,1\) with \(+\,1\)), and the choice *The Bear Whinnies* is \(+\,1\) (multiplying \(-\,1\) with \(-\,1\)). Then, *E*(*A*, *B*) is the ‘expected value’ given by the probabilities \(P(A_1,B_1)\), \(P(A_1,B_2)\), \(P(A_2,B_1)\) and \(P(A_2,B_2)\) of each of these values. Hence \(E(A,B)=-1\) means that there is a perfect anti-correlation, and indeed, *Horse* anti-correlates with *Growls* and *Bear* anti-correlates with *Whinnies*.
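The assignment of values to the four outcomes, and the expectation value as a probability-weighted sum of these values, can be sketched as follows. This is a minimal sketch: the helper names `outcome_values` and `expectation` are hypothetical, and the illustrative probabilities are the relative frequencies 0, 2/8, 6/8 and 0 that follow from the NOW counts reported in Sect. 3.

```python
from itertools import product

# Values attached to the exemplars, as described in the text.
ANIMAL_VALUES = {"Horse": +1, "Bear": -1}
ACTS_VALUES = {"Growls": +1, "Whinnies": -1}


def outcome_values(animal_values, acts_values):
    """Value of each combination = product of the values of its two exemplars."""
    return {
        (animal, act): v_animal * v_act
        for (animal, v_animal), (act, v_act) in product(
            animal_values.items(), acts_values.items()
        )
    }


def expectation(probabilities, values):
    """Expected value: sum over outcomes of probability times outcome value."""
    return sum(probabilities[outcome] * values[outcome] for outcome in values)


values = outcome_values(ANIMAL_VALUES, ACTS_VALUES)

# Illustrative probabilities (relative frequencies from the NOW counts, Sect. 3).
probabilities = {
    ("Horse", "Growls"): 0.0,
    ("Horse", "Whinnies"): 0.25,
    ("Bear", "Growls"): 0.75,
    ("Bear", "Whinnies"): 0.0,
}

print(expectation(probabilities, values))  # -1.0
```

All probability mass sits on the two outcomes with value \(-1\), which is exactly the perfect anti-correlation described above.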

To define the three remaining experiments or operations \(e(A,B^{\prime })\), \(e(A^{\prime },B)\) and \(e(A^{\prime },B^{\prime })\) that are needed to calculate the intermediate term of the CHSH inequality, we consider two different exemplars for *Animal* as well as for *Acts*, namely *Tiger* and *Cat* and *Snorts* and *Meows*. The three experiments or operations are now defined as follows: \(e(A,B^{\prime })\) is the experiment or operation where the previous exemplars for *Animal*, *Horse* and *Bear*, are combined with new exemplars for *Acts*, *Snorts* and *Meows*, \(e(A^{\prime },B)\) is the experiment or operation where new exemplars for *Animal*, *Tiger* and *Cat*, are combined with the previous exemplars for *Acts*, *Growls* and *Whinnies*, and finally \(e(A^{\prime },B^{\prime })\) is the experiment or operation where for both *Animal* and *Acts* the new exemplars are combined, hence *Tiger* and *Cat* with *Snorts* and *Meows*.
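The four coincidence operations and the sixteen search strings they give rise to can be enumerated mechanically. The sketch below assumes lower-case two-word strings of the form ‘horse growls’, as used in the searches reported later; the dictionary layout and the name `search_strings` are illustrative choices, not part of the original procedure.

```python
from itertools import product

# Exemplars for each measurement setting, in the order used in the text.
EXEMPLARS = {
    "A": ("horse", "bear"),
    "A'": ("tiger", "cat"),
    "B": ("growls", "whinnies"),
    "B'": ("snorts", "meows"),
}


def search_strings(animal_key, acts_key):
    """The four strings that form the outcomes of one coincidence operation."""
    return [
        f"{animal} {act}"
        for animal, act in product(EXEMPLARS[animal_key], EXEMPLARS[acts_key])
    ]


operations = {
    (a, b): search_strings(a, b) for a in ("A", "A'") for b in ("B", "B'")
}

for (a, b), strings in operations.items():
    print(f"e({a}, {b}): {strings}")
```

Each operation yields four outcomes, ordered so that the first and last carry the value \(+\,1\) and the middle two the value \(-\,1\).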

## 3 Entanglement in NOW and COCA

We have collected the relative frequencies for the same strings in both the NOW and COCA corpuses of documents (see Sect. 1), and found the following results.

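In NOW, for the coincidence operation *e*(*A*, *B*), the frequencies of appearance of the four strings ‘horse growls’, ‘horse whinnies’, ‘bear growls’ and ‘bear whinnies’ are 0, 2, 6, and 0, respectively. This means that in total the four strings appear 8 times and the relative frequencies of appearance determining the probabilities are

\[P(A_1,B_1)=\frac{0}{8}=0 \tag{23}\]

\[P(A_1,B_2)=\frac{2}{8}=0.25 \tag{24}\]

\[P(A_2,B_1)=\frac{6}{8}=0.75 \tag{25}\]

\[P(A_2,B_2)=\frac{0}{8}=0 \tag{26}\]

The expectation value *E*(*A*, *B*) is then, using (23)–(26),

\[E(A,B)=P(A_1,B_1)-P(A_1,B_2)-P(A_2,B_1)+P(A_2,B_2)=-1.\]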

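In the coincidence operation *e*(*A*, *B*), the frequencies of appearance of the four strings ‘horse growls’, ‘horse whinnies’, ‘bear growls’ and ‘bear whinnies’ in COCA are 0, 11, 0, and 0, respectively. This means that in total the four strings appear 11 times and the relative frequencies of appearance determining the probabilities are

\[P(A_1,B_1)=\frac{0}{11}=0 \tag{44}\]

\[P(A_1,B_2)=\frac{11}{11}=1 \tag{45}\]

\[P(A_2,B_1)=\frac{0}{11}=0 \tag{46}\]

\[P(A_2,B_2)=\frac{0}{11}=0 \tag{47}\]

The expectation value *E*(*A*, *B*) is then, using (44)–(47),

\[E(A,B)=P(A_1,B_1)-P(A_1,B_2)-P(A_2,B_1)+P(A_2,B_2)=-1.\]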

## 4 Comparison with the Psychological Experiments’ Violation

We have found a violation of the CHSH inequality in Google Books, NOW and COCA with values 3.41, 3 and 3.33, respectively, which are all stronger violations of the CHSH inequality than the one we found with the psychological experiments in Aerts and Sozzo (2011), where the value of the violation was 2.42. Let us make explicit the probabilities obtained in the latter, so that we can interpret the difference.

In the coincidence experiment *e*(*A*, *B*), 4 respondents chose *The Horse Growls* as a good example of the combination *The Animal Acts*, 51 respondents chose *The Horse Whinnies*, 21 respondents chose *The Bear Growls*, and 5 respondents chose *The Bear Whinnies*. This means that out of a total of 81 respondents we obtained counts of 4, 51, 21 and 5 for the different combinations considered. This allows us to calculate the probability for one of the combinations to be chosen. We have, using the symbols of Sects. 2 and 3,

\[P(A_1,B_1)=\frac{4}{81}=0.0494, \quad P(A_1,B_2)=\frac{51}{81}=0.6296, \quad P(A_2,B_1)=\frac{21}{81}=0.2593, \quad P(A_2,B_2)=\frac{5}{81}=0.0617.\]

For the expectation value *E*(*A*, *B*), we then get

\[E(A,B)=P(A_1,B_1)-P(A_1,B_2)-P(A_2,B_1)+P(A_2,B_2)=-0.7778.\]

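In the coincidence experiment \(e(A,B^{\prime })\), the remaining 48 of the 81 respondents chose *The Horse Snorts* as a good example of the combination *The Animal Acts*, 2 respondents chose *The Horse Meows*, 24 respondents chose *The Bear Snorts* and 7 respondents chose *The Bear Meows*. This gives

\[P(A_1,B^{\prime }_1)=\frac{48}{81}=0.5926 \tag{70}\]

\[P(A_1,B^{\prime }_2)=\frac{2}{81}=0.0247 \tag{71}\]

\[P(A_2,B^{\prime }_1)=\frac{24}{81}=0.2963 \tag{72}\]

\[P(A_2,B^{\prime }_2)=\frac{7}{81}=0.0864 \tag{73}\]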

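In the coincidence experiment \(e(A^{\prime },B)\), the remaining 63 of the 81 respondents chose *The Tiger Growls* as a good example of the combination *The Animal Acts*, 7 respondents chose *The Tiger Whinnies*, 7 respondents chose *The Cat Growls* and 4 respondents chose *The Cat Whinnies*. This gives

\[P(A^{\prime }_1,B_1)=\frac{63}{81}=0.7778, \quad P(A^{\prime }_1,B_2)=\frac{7}{81}=0.0864, \quad P(A^{\prime }_2,B_1)=\frac{7}{81}=0.0864, \quad P(A^{\prime }_2,B_2)=\frac{4}{81}=0.0494.\]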

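In the coincidence experiment \(e(A^{\prime },B^{\prime })\), the remaining 12 of the 81 respondents chose *The Tiger Snorts* as a good example of the combination *The Animal Acts*, 7 respondents chose *The Tiger Meows*, 8 respondents chose *The Cat Snorts* and 54 respondents chose *The Cat Meows*. This gives

\[P(A^{\prime }_1,B^{\prime }_1)=\frac{12}{81}=0.1481, \quad P(A^{\prime }_1,B^{\prime }_2)=\frac{7}{81}=0.0864, \quad P(A^{\prime }_2,B^{\prime }_1)=\frac{8}{81}=0.0988, \quad P(A^{\prime }_2,B^{\prime }_2)=\frac{54}{81}=0.6667.\]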

The combinations *The Horse Growls*, *The Bear Whinnies*, *The Horse Meows*, *The Bear Meows*, *The Tiger Whinnies*, *The Cat Whinnies* and *The Tiger Meows* all come out with a higher probability in the psychological experiments than in the corpuses of documents. Let us compare them, writing the probabilities in the following order: first Google Books, then NOW, then COCA and then the psychological experiments. We find

*This study has to do with what we have in mind when we use words that refer to categories, and more specifically ‘how we think about examples of categories’. Let us illustrate what we mean. Consider the category ‘fruit’. Then ‘orange’ and ‘strawberry’ are two examples of this category, but also ‘fig’ and ‘olive’ are examples of the same category. In each test of the questionnaire you will be asked to pick one of the examples of a set of given examples for a specific category. And we would like you to pick that example that you find ‘a good example’ of the category. In case there are more than one example which you find a good example, pick then the one you find the best of all the good examples. In case there are two examples which you both find equally good, and hence hesitate which ones to take, just take then the one you slightly prefer, however slight the preference might be. It is mandatory that you always ‘pick one and only one example’, hence in case of doubt, anyhow pick one and only one example. This is necessary for the experiment to succeed. So, one of the tests could be that the category ‘fruit’ is given, and you are asked to pick one of the examples ‘orange’, ‘strawberry’, ‘fig’ or ‘olive’ as a good example, and in case of doubt the best of the ones you doubt about, and in case you cannot decide, pick one anyhow. Let all aspects of yourself play a role in the choice you make, ratio, but also imagination, feeling, emotion, and whatever.*

The sentence that this subgroup had paid much attention to was the last sentence of this introductory text, i.e. ‘Let all aspects of yourself play a role in the choice you make, ratio, but also imagination, feeling, emotion, and whatever’, and so some of them would say that they had chosen ‘the tiger meows’, because that was what they preferred as a choice in what they would fantasize for the overall scenery in the imagination that the test brought about to them. And of course, even in all the books gathered by Google, the fantasy of a ‘tiger meowing’ has little chance to appear.

This allows us to put forward the following hypothesis. Although we believe that the corpuses of documents are collections of meaning related very closely to the human mind, certainly if they are interrogated with a setup planned by human minds such as we have done in this article, they are still very shallow compared to what a human mind itself can carry as a worldview. So, a first aspect which explains the differences in probabilities of appearance between strings of meaning in the corpuses of documents and these same entities of meaning in human minds is the difference in size. Secondly, and perhaps even more important, human minds are active entities, with the possibility to adapt to the mere context of a questionnaire itself, for example the specific sentence at the end of the introduction, while the way we can interrogate a corpus of documents is much more limited. We can search for frequencies of appearance of co-occurrent terms, which is what we did to find the violations of the CHSH inequality. The corpus of documents exists independently of what exactly we are looking for with this specific interrogation, while a human mind being questioned interacts with the question and can be directly influenced by it.

Of course, much more important than the differences we explained above between the data gathered from the corpuses of text and the data collected in the psychological experiments, one has to consider the similarities. In both cases the CHSH inequality is violated structurally in a completely similar way. It is the meaning connections incorporated in the considered combinations of concepts and the considered combinations of exemplars that are at the origin of the violation, and these meaning connections are present in exactly the same way in the corpuses of documents as in the human minds being tested in the psychological experiments.
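The value 2.42 quoted for the psychological experiments can be checked from the respondent counts with a short sketch. Two points are assumptions of this sketch rather than statements of the text: the first count in each of the last three experiments is taken to be the remainder of the 81 respondents (48, 63 and 12), and the names `PSYCH_COUNTS`, `expectation` and `chsh_term` are hypothetical.

```python
# Respondent counts per experiment, ordered as
# (A1,B1), (A1,B2), (A2,B1), (A2,B2) within each experiment.
PSYCH_COUNTS = {
    "AB": (4, 51, 21, 5),
    "AB'": (48, 2, 24, 7),    # 48 assumed: 81 - 2 - 24 - 7
    "A'B": (63, 7, 7, 4),     # 63 assumed: 81 - 7 - 7 - 4
    "A'B'": (12, 7, 8, 54),   # 12 assumed: 81 - 7 - 8 - 54
}

# Outcome values for the four combinations of one experiment.
SIGNS = (+1, -1, -1, +1)


def expectation(counts):
    """Expectation value from counts, using relative frequencies as probabilities."""
    total = sum(counts)
    return sum(sign * count for sign, count in zip(SIGNS, counts)) / total


def chsh_term(counts_by_experiment):
    """Intermediate term E(A',B') + E(A,B') + E(A',B) - E(A,B)."""
    e = {name: expectation(counts) for name, counts in counts_by_experiment.items()}
    return e["A'B'"] + e["AB'"] + e["A'B"] - e["AB"]


psych_value = chsh_term(PSYCH_COUNTS)
print(round(psych_value, 2))  # 2.42
```

That the assumed counts reproduce 2.42 supports them, but the check does not replace the original data tables.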

## 5 Entanglement and Collocates

In the foregoing sections we made searches in the respective corpuses of text for strings of characters. That is, each element of the corpus that we counted in order to calculate the relative frequencies was defined as a string of characters. More concretely, a search for the frequency of appearance of ‘horse whinnies’ was a search for the frequency of appearance of the exact string of characters contained in ‘horse whinnies’. This is a very sharp way to identify meaning connections. For the corpuses of texts that we used, a less sharp way of identifying meaning connections is offered by what are called ‘collocates’. By means of this technique, words that appear in each other’s neighborhood can be spotted.

Let us explain in more detail how such a measure of co-occurrence in a neighborhood is technically devised. We have two words, for example ‘horse’ and ‘whinnies’. One of them is considered as the center of an interval of words; let us call it the target word, and let us choose it to be ‘horse’. One can specify how many words the interval may extend on each side of the target word, and we choose for our operation the maximum available in COCA, which is 9 words. Concretely, whenever the second word ‘whinnies’ is spotted in the interval of 19 words centered on ‘horse’, that is, within 9 words to its left or 9 words to its right, it is registered as a co-occurrence of both words ‘horse’ and ‘whinnies’.

The aim of the use of collocates in our operation is to loosen the strictness of co-occurrence, so that a co-occurrence is already counted whenever the target word ‘horse’ and the collocate word ‘whinnies’ appear in each other’s neighborhood. For example, suppose we consider ‘cat’ as the target word and ‘meows’ as the collocate word and take 9 words before and 9 words after as the spread of the interval. Then a piece of text such as ‘But there, underneath, she sees a skinny orange cat. The cat meows. Ivy’s heart roars’ will be counted as a co-occurrence. It is, by the way, a piece of text that really showed up in the COCA corpus of documents when we did our operation.
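A window-based collocate count of this kind can be sketched in a few lines. This is not the COCA search engine: the tokenisation rule, the function name `count_collocations`, and the choice to count each collocate token at most once (even when several target words fall inside its window) are assumptions made here for illustration.

```python
import re


def count_collocations(text, target, collocate, span=9):
    """Count collocate tokens that have the target word within `span` words."""
    # Naive tokenisation: lower-case runs of letters and apostrophes.
    words = re.findall(r"[a-z’']+", text.lower())
    hits = 0
    for i, word in enumerate(words):
        if word != collocate:
            continue
        # Up to `span` words to the left and to the right of the collocate.
        window = words[max(0, i - span):i] + words[i + 1:i + span + 1]
        if target in window:
            hits += 1
    return hits


example = (
    "But there, underneath, she sees a skinny orange cat. "
    "The cat meows. Ivy’s heart roars"
)
print(count_collocations(example, "cat", "meows"))  # 1
```

Because the distance between two words is symmetric, checking the window around the collocate is equivalent to checking whether the collocate falls in the 19-word interval centered on the target.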

We will not repeat the whole scheme of the operations, because they are identical to the foregoing ones, except that the strings of characters identifying the co-occurrences are now replaced by the target words and the collocate words giving rise to the co-occurrences. We found the following results.

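In the coincidence operation *e*(*A*, *B*), the frequencies of appearance of the four *Collocate Pairs* ‘horse growls’, ‘horse whinnies’, ‘bear growls’ and ‘bear whinnies’ in COCA are 0, 12, 3, and 0, respectively. This means that in total the four *Collocate Pairs* appear 15 times and the relative frequencies of co-occurrence determining the probabilities are

\[P(A_1,B_1)=\frac{0}{15}=0 \tag{93}\]

\[P(A_1,B_2)=\frac{12}{15}=0.8 \tag{94}\]

\[P(A_2,B_1)=\frac{3}{15}=0.2 \tag{95}\]

\[P(A_2,B_2)=\frac{0}{15}=0 \tag{96}\]

The expectation value *E*(*A*, *B*) is then, using (93)–(96),

\[E(A,B)=P(A_1,B_1)-P(A_1,B_2)-P(A_2,B_1)+P(A_2,B_2)=-1.\]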

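In the coincidence operation \(e(A,B^{\prime })\), the frequencies of appearance of the four *Collocate Pairs* ‘horse snorts’, ‘horse meows’, ‘bear snorts’ and ‘bear meows’ in COCA are 12, 0, 0, and 0, respectively. This means that in total the four *Collocate Pairs* appear 12 times and the relative frequencies of co-occurrence determining the probabilities are

\[P(A_1,B^{\prime }_1)=\frac{12}{12}=1, \quad P(A_1,B^{\prime }_2)=0, \quad P(A_2,B^{\prime }_1)=0, \quad P(A_2,B^{\prime }_2)=0.\]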

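In the coincidence operation \(e(A^{\prime },B)\), the frequencies of appearance of the four *Collocate Pairs* ‘tiger growls’, ‘tiger whinnies’, ‘cat growls’ and ‘cat whinnies’ in COCA are 4, 0, 6, and 0, respectively. This means that in total the four *Collocate Pairs* appear 10 times and the relative frequencies of co-occurrence determining the probabilities are

\[P(A^{\prime }_1,B_1)=\frac{4}{10}=0.4, \quad P(A^{\prime }_1,B_2)=0, \quad P(A^{\prime }_2,B_1)=\frac{6}{10}=0.6, \quad P(A^{\prime }_2,B_2)=0.\]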

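In the coincidence operation \(e(A^{\prime },B^{\prime })\), the frequencies of appearance of the four *Collocate Pairs* ‘tiger snorts’, ‘tiger meows’, ‘cat snorts’ and ‘cat meows’ in COCA are 0, 0, 0, and 37, respectively. This means that in total the four *Collocate Pairs* appear 37 times and the relative frequencies of co-occurrence determining the probabilities are

\[P(A^{\prime }_1,B^{\prime }_1)=0, \quad P(A^{\prime }_1,B^{\prime }_2)=0, \quad P(A^{\prime }_2,B^{\prime }_1)=0, \quad P(A^{\prime }_2,B^{\prime }_2)=\frac{37}{37}=1.\]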
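From the collocate counts listed above, the intermediate term of the CHSH inequality follows by the same arithmetic as for the strict string searches. The sketch below uses exact fractions; the names `COLLOCATE_COUNTS`, `expectation` and `chsh_term` are hypothetical, and the resulting value is computed here rather than quoted from the text.

```python
from fractions import Fraction

# Collocate counts in COCA as listed in the text, ordered as
# (A1,B1), (A1,B2), (A2,B1), (A2,B2) within each operation.
COLLOCATE_COUNTS = {
    "AB": (0, 12, 3, 0),
    "AB'": (12, 0, 0, 0),
    "A'B": (4, 0, 6, 0),
    "A'B'": (0, 0, 0, 37),
}

# Outcome values for the four combinations of one operation.
SIGNS = (+1, -1, -1, +1)


def expectation(counts):
    """Exact expectation value from counts via relative frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no co-occurrences found; relative frequencies undefined")
    return Fraction(sum(sign * count for sign, count in zip(SIGNS, counts)), total)


def chsh_term(counts_by_operation):
    """Intermediate term E(A',B') + E(A,B') + E(A',B) - E(A,B)."""
    e = {name: expectation(counts) for name, counts in counts_by_operation.items()}
    return e["A'B'"] + e["AB'"] + e["A'B"] - e["AB"]


collocate_value = chsh_term(COLLOCATE_COUNTS)
print(collocate_value, float(collocate_value))  # 14/5 2.8
```

Under these counts the collocate search gives 2.8, which still violates the CHSH inequality but lies below the value 3.33 reported for the strict string search in COCA. For \(e(A,B^{\prime })\) and \(e(A^{\prime },B^{\prime })\) only one pair occurs at all, so the expectation value is 1 whatever the exact count.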

## 6 Conclusion

We have shown that data we collected from three corpuses of text, Google Books, NOW and COCA, violate the CHSH version (Clauser et al. 1969) of Bell’s inequality (Bell 1964, 1987), which indicates the presence of entanglement in the combination of the two concepts *Animal* and *Acts* into the sentence *The Animal Acts*. More precisely, in Sects. 2 and 3 we have shown the violation collecting data of coincidence operations on different combinations of exemplars of *Animal Acts* as co-occurrences in the respective corpuses of documents Google Books, NOW and COCA, by using the search engines that are available on the Web for these respective corpuses of documents. These search engines are very reliable which we could test in different ways—the measured frequencies are consistent over time and the sentences where the co-occurrences appear can be explicitly consulted—which means that the statistics that we derived by calculating the relative frequencies of appearance of each co-occurrence, as approximations for the probabilities in the CHSH inequality, give rise to a good approximation of the probabilities which are present as a ‘meaning structure’ in each one of the corpuses of documents. That very similar and comparable results are obtained in the three corpuses of documents, Google Books, NOW and COCA, proves the deep nature of the presence of this probability structure leading to the violation of the CHSH inequality and hence straightly indicating the presence of quantum entanglement in each of the corpuses of documents. Our interpretation of this violation of the CHSH inequality is that the entanglement revealed by it is carried by the ‘meaning connection’ between *Animal* and *Acts* in the combination *The Animal Acts*. 
More concretely, it is because the used corpuses of text all are representations in meaning structure of the human mind, due to the texts contained in them being written by humans, that the ‘meaning connection’ between *Animal* and *Acts* is engraved in these corpuses of documents. Still more concretely, more often a co-occurrence between *Horse* and *Whinnies* will appear than a co-occurrence between *Horse* and *Growls*, simply because the meaning contained in *The Animal Acts* makes this be the case for humans living in a world where horses will rather whinny than they will growl.

In Sect. 4 we have compared the violations we obtained in Sects. 2 and 3 with the violation we obtained for the same combination of concepts *The Animal Acts* by means of data collected in psychological experiments with human participants (Aerts and Sozzo 2011, 2014) and we found a great similarity between the violations in the different corpuses of documents and the violation in the psychological experiments. This is another confirmation of what we expressed already in the foregoing paragraph, namely that the violation originates in the presence of a meaning connection between *Animal* and *Acts* in the sentence *The Animal Acts*. The violations in the three corpuses of texts are stronger than the violation in the psychological experiments, and we observed that this greater strength of violation is due to the human participants making statistically non-zero some of the very uncommon combinations, such as *The Horse Meows*, combinations that all give rise instead to zero probability in the three corpuses of documents. This is an interesting observation, and we put forward a specific hypothesis about it in Sect. 4. The hypothesis is that on the one hand the human mind is an active entity much vaster than any of the corpuses of documents, and in this sense it is not strange that *Horse* and *Meows* have zero co-occurrence in all three corpuses of documents. Despite the enormous amount of stories and books contained in them, it is indeed not obvious that a sentence containing the string ‘horse meows’ will occur in even one of them. On the other hand, also for the psychological experiments we would not easily imagine people choosing as their preferred combination *The Horse Meows*, if also *The Horse Snorts* or *The Bear Snorts* are possible choices. Even so, and we can check it in (71), 2 people of the 81 that participated in the psychological experiments preferred *The Horse Meows* to the other three possible choices.

In Sect. 4, we observed that a specific sentence used as an introduction to the series of experiments likely induced a small subgroup of the participants to answer the questions in a very imaginative way, preferring to imagine a horse meowing to the boring alternative of a horse (or a bear) just snorting. Probably some of this little subgroup preferred the bear to meow rather than the horse. Anyhow, this too can be seen as part of our hypothesis, namely that the human mind is an active and creative entity, influenced by every little detail, even the way the experiments are explained. Obviously, corpuses of documents also contain the richness of the human mind, but in a collapsed and frozen way, no longer to be influenced by the way a search is made. Except of course if the search itself is contextual, but that is definitely not the case for the simple straightforward search engines offered for use on the Web and connected to the corpuses of documents Google Books, NOW and COCA.

In Sect. 5 we have partly tested the hypothesis mentioned in the foregoing paragraph. Indeed, we have redone the operations for the combination *The Animal Acts* with the corpus of documents COCA, this time however making use of a fuzzier search system referred to as ‘collocates’. Instead of indicating a co-occurrence for *Horse* and *Whinnies* whenever the string ‘horse whinnies’ appears in a sentence of the documents contained in COCA, with the collocate search a co-occurrence is registered whenever the word ‘whinnies’ appears in an interval of 9 words before or after the word ‘horse’. The fuzziness thus introduced on the part of the search system moves the corpus of documents COCA closer to the human mind, and this is confirmed within the context of our hypothesis by the CHSH inequality being less strongly violated in comparison to searches using strict co-occurrences.

We conclude this article with a remark. We have not investigated here the quantum models in complex Hilbert space that can be constructed to represent the collected data, following the procedures in Aerts and Sozzo (2014) and showing that quantum entanglement can be considered to be present in the operations we performed on corpuses of documents. We plan to carry out this task in a forthcoming article (Aerts et al. 2018b). Here, we limit ourselves to mentioning that it will turn out that these Hilbert space models will show that entanglement is present not only in the state of the considered concepts, but also in the measurements and the evolutions. It is not a very well known fact but, if entanglement is present not only in the states, but also in the measurements and evolutions, then Cirel’son’s bound can be exceeded and the violation of CHSH inequalities can even reach its maximum value of 4. This clarifies why the breaking of Cirel’son’s bound for the entanglement we identified for all the considered corpuses of text is not incompatible with a quantum mechanical modeling, something we will investigate in more detail in a second planned article (Aerts et al. 2018c).
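To make the role of Cirel’son’s bound concrete, the reported values can be compared with the three relevant thresholds: 2 for the CHSH inequality, \(2\sqrt{2}\approx 2.83\) for Cirel’son’s bound, and 4 for the maximum value. The sketch below only classifies the values quoted in Sect. 4; the dictionary `REPORTED` and the function `classify` are illustrative names.

```python
import math

CHSH_BOUND = 2.0
CIRELSON_BOUND = 2 * math.sqrt(2)  # approximately 2.828
MAXIMUM = 4.0

# Values of the intermediate term as reported in Sect. 4.
REPORTED = {
    "Google Books": 3.41,
    "NOW": 3.0,
    "COCA": 3.33,
    "psychological experiments": 2.42,
}


def classify(value):
    """Place a value of the intermediate term relative to the two bounds."""
    magnitude = abs(value)
    if magnitude > MAXIMUM:
        raise ValueError("the intermediate term cannot exceed 4 in absolute value")
    if magnitude <= CHSH_BOUND:
        return "no violation"
    if magnitude <= CIRELSON_BOUND:
        return "violation within Cirel'son's bound"
    return "violation beyond Cirel'son's bound"


for source, value in REPORTED.items():
    print(f"{source}: {value} -> {classify(value)}")
```

All three corpuses fall in the last category, while the psychological experiments violate the CHSH inequality without exceeding Cirel’son’s bound.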

## Footnotes

1. University of Padova (IT), The Open University (UK), University of Bedfordshire (UK), Vrije Universiteit Brussel (BE), University of Copenhagen (DK), Brandenburg University of Technology Cottbus-Senftenberg (DE), Linnæus University (SE).

## Notes

### Acknowledgements

This work was supported by QUARTZ (Quantum Information Access and Retrieval Theory), the Marie Skłodowska-Curie Innovative Training Network 721321 of the European Union’s Horizon 2020 research and innovation programme.

## References

- Aerts Arguëlles, J. (2018). The heart of an image: Quantum superposition and entanglement in visual perception. *Foundations of Science*. https://doi.org/10.1007/s10699-018-9547-1.
- Aerts, D. (2009). Quantum structure in cognition. *Journal of Mathematical Psychology*, *53*, 314–348. https://doi.org/10.1016/j.jmp.2009.04.005.
- Aerts, D., Aerts Arguëlles, J., Beltran, L., Beltran, L., Distrito, I., Sassoli de Bianchi, M., Sozzo, S., & Veloz, T. (2018a). Towards a quantum World Wide Web. *Theoretical Computer Science*. Online First. https://doi.org/10.1016/j.tcs.2018.03.019.
- Aerts, D., Beltran, L., Geriente, S., & Sozzo, S. (2018b). Quantum theoretic modeling in computer science. A complex Hilbert space model for entangled concepts in corpuses of documents. In preparation.
- Aerts, D., Beltran, L., Geriente, S., Sassoli de Bianchi, M., & Sozzo, S. (2018c). Quantum theoretic modeling in computer science. Entanglement in corpuses of documents violating Cirel’son’s bound. In preparation.
- Aerts, D., Aerts Arguëlles, J., Beltran, L., Geriente, S., Sassoli de Bianchi, M., Sozzo, S., et al. (2018d). Spin and wind directions I: Identifying entanglement in nature and cognition. *Foundations of Science*, *23*, 323–335. https://doi.org/10.1007/s10699-017-9528-9.
- Aerts, D., Aerts Arguëlles, J., Beltran, L., Geriente, S., Sassoli de Bianchi, M., Sozzo, S., et al. (2018e). Spin and wind directions II: A Bell State quantum model. *Foundations of Science*, *23*, 337–365. https://doi.org/10.1007/s10699-017-9530-2.
- Aerts, D., & Aerts, S. (1995). Applications of quantum statistics in psychological studies of decision processes. *Foundations of Science*, *1*, 85–97. https://doi.org/10.1007/BF00208726.
- Aerts, D., Broekaert, J., Gabora, L., & Sozzo, S. (2013a). Quantum structure and human thought. *Behavioral and Brain Sciences*, *36*, 274–276. https://doi.org/10.1017/S0140525X12002841.
- Aerts, D., & Czachor, M. (2004). Quantum aspects of semantic analysis and symbolic artificial intelligence. *Journal of Physics A: Mathematical and Theoretical*, *37*, L123–L132.
- Aerts, D., & Gabora, L. (2005a). A theory of concepts and their combinations I: The structure of the sets of contexts and properties. *Kybernetes*, *34*, 167–191. https://doi.org/10.1108/03684920510575799.
- Aerts, D., & Gabora, L. (2005b). A theory of concepts and their combinations II: A Hilbert space representation. *Kybernetes*, *34*, 192–221. https://doi.org/10.1108/03684920510575807.
- Aerts, D., Gabora, L., & Sozzo, S. (2013b). Concepts and their dynamics: A quantum theoretic modeling of human thought. *Topics in Cognitive Science*, *5*, 737–772. https://doi.org/10.1111/tops.12042.
- Aerts, D., Sassoli de Bianchi, M., & Sozzo, S. (2016). On the foundations of the Brussels operational-realistic approach to cognition. *Frontiers in Physics*, *4*, 17. https://doi.org/10.3389/fphy.2016.00017.
- Aerts, D., & Sozzo, S. (2011). Quantum structure in cognition: Why and how concepts are entangled. Quantum Interaction. *Lecture Notes in Computer Science*, *7052*, 116–127. https://doi.org/10.1007/978-3-642-24971-6_12.
- Aerts, D., & Sozzo, S. (2014). Quantum entanglement in concept combinations. *International Journal of Theoretical Physics*, *53*, 3587–3603. https://doi.org/10.1007/s10773-013-1946-z.
- Bell, J. (1964). On the Einstein Podolsky Rosen paradox. *Physics*, *1*, 195–200. https://doi.org/10.1103/PhysicsPhysiqueFizika.1.195.
- Bell, J. (1987). *Speakable and unspeakable in quantum mechanics*. Cambridge, UK: Cambridge University Press.
- Bruza, P., Kitto, K., Nelson, D., & McEvoy, C. (2009a). Extracting spooky-activation-at-a-distance from considerations of entanglement. In P. Bruza, D. Sofge, W. Lawless, K. van Rijsbergen, & M. Klusch (Eds.), *Lecture Notes in Computer Science, Quantum Interaction. QI 2009* (Vol. 5494, pp. 71–83). Berlin: Springer. https://doi.org/10.1007/978-3-642-00834-4_8.
- Bruza, P., Kitto, K., Nelson, D., & McEvoy, C. (2009b). Is there something quantum-like about the human mental lexicon? *Journal of Mathematical Psychology*, *53*, 362–377.
- Bruza, P., Kitto, K., Ramm, B., & Sitbon, L. (2015). A probabilistic framework for analysing the compositionality of conceptual combinations. *Journal of Mathematical Psychology*, *67*, 26–38. https://doi.org/10.1016/j.jmp.2015.06.002.
- Busemeyer, J., & Bruza, P. (2012). *Quantum models of cognition and decision*. Cambridge: Cambridge University Press.
- Busemeyer, J., Pothos, E., Franco, R., & Trueblood, J. (2011). A quantum theoretical explanation for probability judgment errors. *Psychological Review*, *118*, 193–218. https://doi.org/10.1037/a0022542.
- Cirel’son, B. S. (1980). Quantum generalizations of Bell’s Inequality. *Letters in Mathematical Physics*, *4*, 93–100.
- Clauser, J. F., Horne, M. A., Shimony, A., & Holt, R. A. (1969). Proposed experiment to test local hidden-variable theories. *Physical Review Letters*, *23*, 880–884.
- Coecke, B., Sadrzadeh, M., & Clark, S. (2010). Mathematical foundations for a compositional distributional model of meaning. *Linguistic Analysis*, *36*, 345–384.
- Dalla Chiara, M. L., Giuntini, R., Leporini, R., Negri, E., & Sergioli, G. (2015a). Quantum information, cognition, and music. *Frontiers in Psychology*, *6*, 1583. https://doi.org/10.3389/fpsyg.2015.01583.
- Dalla Chiara, M. L., Giuntini, R., & Negri, E. (2015b). A quantum approach to vagueness and to the semantics of music. *International Journal of Theoretical Physics*, *54*, 4546–4556. https://doi.org/10.1007/s10773-015-2694-z.
- Di Buccio, E., Melucci, M., & Song, D. (2011). Towards predicting relevance using a quantum-like framework. In P. Clough, et al. (Eds.), *Lecture notes in computer science, Advances in information retrieval. ECIR 2011* (Vol. 6611, pp. 755–758). Berlin: Springer.
- Frommholz, I., Larsen, B., Piwowarski, B., Lalmas, M., & Ingwersen, P. (2010). Supporting polyrepresentation in a quantum-inspired geometrical retrieval framework. In *IIiX ’10 Proceedings of the third symposium on information interaction in context* (pp. 115–124). https://doi.org/10.1145/1840784.1840802.
- Gronchi, G., & Strambini, E. (2017). Quantum cognition and Bell’s inequality: A model for probabilistic judgment bias. *Journal of Mathematical Psychology*, *78*, 65–75. https://doi.org/10.1016/j.jmp.2016.09.003.
- Khrennikov, A. (2010). *Ubiquitous quantum structure*. Berlin: Springer.
- Kvam, P., Pleskac, T., Yu, S., & Busemeyer, J. (2015). Interference effects of choice on confidence. *Proceedings of the National Academy of Sciences of the USA*, *112*, 10645–10650. https://doi.org/10.1073/pnas.1500688112.
- Melucci, M. (2008). A basis for information retrieval in context. *ACM Transactions on Information Systems (TOIS)*, *26*, 14. https://doi.org/10.1145/1361684.1361687.
- Melucci, M. (2015). *Introduction to information retrieval and quantum mechanics*. Berlin: Springer.
- Piwowarski, B., Frommholz, I., Lalmas, M., & van Rijsbergen, K. (2010). What can quantum theory bring to information retrieval. In *Proceedings of the 19th ACM international conference on Information and knowledge management* (pp. 59–68). https://doi.org/10.1145/1871437.1871450.
- Pothos, E., & Busemeyer, J. (2009). A quantum probability explanation for violations of ‘rational’ decision theory. *Proceedings of the Royal Society B*, *276*, 2171–2178. https://doi.org/10.1098/rspb.2009.0121.
- Schmitt, I., & Nürnberger, A. (2007). Image database search using fuzzy and quantum logic. In *2007 IEEE International Fuzzy Systems Conference*, London, UK. https://doi.org/10.1109/FUZZY.2007.4295682.
- Schmitt, I., Zellhöfer, D., & Nürnberger, A. (2008). Towards quantum logic based multimedia retrieval. In *NAFIPS 2008-2008 annual meeting of the North American fuzzy information processing society*. https://doi.org/10.1109/NAFIPS.2008.4531329.
- Song, D., Lalmas, M., van Rijsbergen, K., Frommholz, I., Piwowarski, B., Wang, J., et al. (2010). How quantum theory is developing the field of information retrieval. In P. Bruza, W. Lawless, K. van Rijsbergen, & D. Sofge (Eds.), *Proceedings of the AAAI Fall symposium on quantum informatics for cognitive, social and semantic processes 2010* (pp. 105–108). Arlington, VA: AAAI Press.
- van Rijsbergen, C. J. (2004). *The Geometry of Information Retrieval*. Cambridge: Cambridge University Press.
- Wang, B., Zhang, P., Li, J., Song, D., Hou, Y., & Shang, Z. (2016). Exploration of quantum interference in document relevance judgement discrepancy. *Entropy*, *18*, 144. https://doi.org/10.3390/e18040144.
- Widdows, D. (2004). *Geometry and Meaning*. Stanford: CSLI publications.
- Zellhöfer, D., Frommholz, I., Schmitt, I., Lalmas, M., & van Rijsbergen, K. (2011). Towards quantum-based DB+IR processing based on the principle of polyrepresentation. In P. Clough, et al. (Eds.), *Advances in Information Retrieval, ECIR 2011, LNCS 6611* (pp. 729–732). Berlin: Springer.

## Copyright information

**Open Access** This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.