# Quantum-Like Structure in Multidimensional Relevance Judgements

## Abstract

A large number of studies in cognitive science have revealed that probabilistic outcomes of certain human decisions do not agree with the axioms of classical probability theory. The field of Quantum Cognition provides an alternative probabilistic model to explain such paradoxical findings. It posits that cognitive systems have an underlying quantum-like structure, especially in decision-making under uncertainty. In this paper, we hypothesise that relevance judgement, being a multidimensional, cognitive concept, can be used to probe the quantum-like structure for modelling users’ cognitive states in information seeking. Extending from an experiment protocol inspired by the Stern-Gerlach experiment in Quantum Physics, we design a crowd-sourced user study to show violation of the Kolmogorovian probability axioms as a proof of the quantum-like structure, and provide a comparison between a quantum probabilistic model and a Bayesian model for predictions of relevance.

## Keywords

Multidimensional relevance · User behaviour · Quantum Cognition

## 1 Introduction

Relevance in Information Retrieval (IR) is widely accepted to be a cognitive feature that drives all our information interactions. All areas of research within IR thus strive to improve the relevance of documents to a user’s information need (IN). These research areas can be broadly divided into two: system-oriented and user-oriented IR. Whereas the system-oriented viewpoint treats relevance as an objective property of the document and query content, the user-oriented approach views relevance as a cognitive property. Although IR fundamentally involves user interaction and decision-making, the user-oriented approach has been found harder to implement, especially in evaluating the performance of IR systems, because of the variability in user judgements of relevance [5]. System-oriented IR thus sought to standardise IR evaluation by replacing the user-cognitive notion of relevance with an objective, topical relevance. This led to evaluation methodologies based on the Cranfield and TREC type test collections, in which the user and all of his or her contexts were removed from the evaluation process.

The recent surge in the availability of online user data has led to the incorporation of more user context in the computation of relevance, e.g. in learning-based ranking algorithms. This context is based on the user’s past interactions with the system, in addition to user attributes like age and interests, and current attributes like location and type of device. The common feature of these various contexts is that they are static: they are determined before the point of the user’s interaction with the IR system. However, the process of IR is interactive and dynamic. In this paper, we focus on another type of context driving user interactions: dynamic context, which changes the user’s cognitive state *during* information interaction.

One well-known example of a dynamic context affecting relevance is the phenomenon of Order Effect [8]. Order effects have been investigated and found to exist in IR in the presentation order of documents [4, 6, 9, 24]. For example, in a recent study reported in [22], two groups of participants were presented with a pair of documents \(D_1\) and \(D_2\) in two different orders. For some such pairs, the relevance of a document as judged by users differed depending on the order in which it was presented. Although the phenomenon may appear to have an intuitive explanation, it violates one of the fundamental assumptions of classical probability theory: the existence of a joint distribution in which, for two random variables \(R_1\) and \(R_2\) representing the relevance of the documents, \(P(R_1, R_2) = P(R_2, R_1)\), i.e., the order of judging the documents does not matter. Such order effects have also been investigated and reported between the different dimensions of relevance, like Topicality, Understandability, Reliability, etc. [1, 19, 20], where different orders of dimensions considered to judge a document lead to different relevance judgements.

The field of Quantum Cognition [2] offers a generalised framework to model probabilistic outcomes of human decision-making. It has been successful in modelling and predicting order effects [16, 23] and other paradoxical findings where axioms of classical probability theory are violated [3, 14]. Conceptually, it challenges the notion that cognitive states have pre-defined values and that a measurement merely records them. Instead, the act of measurement creates a definite state out of an indefinite state and, in doing so, changes the initial state of the cognitive system. In terms of relevance, we cannot pre-assign the relevance of a document for a user. Instead, relevance is defined only at the point of interaction of the user’s cognitive state with the document. Therefore, judging document \(D_2\) first changes the user’s initial state, and the subsequent judgement of relevance of \(D_1\) is different from when \(D_1\) is judged before \(D_2\). Were the relevance of the documents for a user a pre-defined entity, it would not be influenced by the judgement of other documents, and a joint distribution over the relevance of the two documents would exist. We say that these two measurements of relevance are incompatible with each other; that is, it is not possible to jointly consider the relevance of the two documents at the same time. At the mathematical level, measurements in quantum theory are represented by operators which, in general, do not commute with each other.
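The link between non-commuting operators and order effects can be illustrated numerically. The following is a minimal sketch and is not taken from the paper: the two-dimensional real state space, the angles, and the names `projector` and `sequential_prob` are illustrative assumptions. It computes the probability of judging both documents relevant in each of the two orders.

```python
import numpy as np

def projector(theta):
    """Rank-1 projector onto the unit vector at angle `theta` (2-D real space)."""
    v = np.array([np.cos(theta), np.sin(theta)])
    return np.outer(v, v)

def sequential_prob(state, first, second):
    """Probability of 'yes' to `first` and then 'yes' to `second`,
    computed as the squared norm of second @ first @ state."""
    out = second @ first @ state
    return float(np.vdot(out, out).real)

# All values below are hypothetical and chosen only for illustration.
state = np.array([1.0, 0.0])     # assumed initial cognitive state
P_d1 = projector(np.pi / 6)      # assumed "D1 is relevant" measurement
P_d2 = projector(np.pi / 3)      # assumed "D2 is relevant" measurement

p_12 = sequential_prob(state, P_d1, P_d2)   # judge D1 first, then D2
p_21 = sequential_prob(state, P_d2, P_d1)   # judge D2 first, then D1
commutator = P_d1 @ P_d2 - P_d2 @ P_d1

print(p_12, p_21)
print(np.allclose(commutator, 0))
```

With these assumed angles the two orders give approximately 0.5625 and 0.1875, and the commutator is non-zero. If the two projectors were chosen to commute (for example, identical or orthogonal directions), both orders would give the same value.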

In a classical system, all measurements will commute with each other. However, conversely, commutativity of measurements does not necessarily imply that the system is classical. Therefore, the type of measurements becomes imperative in identifying a quantum system. Even then, not all measurements on quantum systems generate data violating the classical probability theory. The system needs to be probed in a way which exploits the underlying quantum structure. In physics, this was done by experiments such as Stern-Gerlach and double-slit experiments [15] which showed the violation of classical probability principles for microscopic particles like electrons and photons. In cognitive science too, several experiments performed by Tversky, Kahneman and colleagues showed such violations in human decision-making under uncertainty [17].

Recently, an experiment protocol inspired by the Stern-Gerlach experiment in Physics has provided a new way to probe cognitive systems such that they exhibit a quantum-like structure [7]. By quantum-like structure we mean the representation of a system using the mathematical framework of quantum theory in order to model and predict the experimental data. In [19], this experiment was performed in an IR scenario involving judgement of relevance with respect to different dimensions. Extending from the Stern-Gerlach protocol, in this paper we design a new experiment to show the violation of classical probability theory in multidimensional relevance judgements. We hypothesise that multidimensional relevance judgement has an underlying quantum-like structure, which when subject to appropriate measurement design can exhibit violations of classical probability theory. Specifically, we investigate the violation of a particular axiom of Kolmogorovian probability theory [11]. Our results show that the experimental data indeed violates classical probability theory, and a quantum framework provides more accurate predictions to describe the data. This experiment not only shows the necessity of the quantum framework as an alternative for constructing probabilistic models, but also gives novel insights into user behaviour in IR. This understanding can contribute to improvement of interactive IR systems and we also discuss such implications in this paper.

## 2 Stern-Gerlach Inspired Protocol for Multidimensional Relevance

The S-G experiment also describes the minimum number of measurements required from a system to construct a complex-valued Hilbert Space structure. In particular, we need three incompatible measurements, each with two mutually exclusive outcomes. We can use this arrangement of measuring properties of a quantum system to measure the relevance of a document in IR. For this, we consider three dimensions of relevance: Topicality (*T*), whether a document is topically relevant to a query; Understandability (*U*), how easy it is to understand the content of the document; and Reliability (*R*), how much the document can be relied upon. Each of these three dimensions can be posed as a question requiring a Yes/No type answer (denoted as \(+\) and \(-\) respectively) for a document. These three dimensions are important factors considered by users for deciding relevance. Besides, they are tied to a single document, unlike diversity or novelty, which are always considered in comparison with other documents. Certain dimensions like Interest, Habit, etc. are difficult to ascertain via crowdsourcing. As reported in [1], the different relevance dimensions can exhibit incompatibility for certain query-document pairs.

In [19], three query-document pairs were designed in such a way as to potentially exhibit incompatibility between judgement of relevance with respect to different dimensions. The content of the documents was altered to introduce uncertainty in judging each of the three dimensions. The participants were presented with three questions related to three relevance dimensions, for each query-document pair, in line with the S-G design. Figure 1 shows the three questions asked to two different groups in different orders. More details about this design can be found in [19] and [7]. This setup enables one to construct a complex-valued Hilbert space, which models the quantum-like structure of the user’s cognitive state during information interaction.

### 2.1 Constructing Complex-Valued Hilbert Space

The first step in building a quantum probabilistic model is to construct a representation for the user’s cognitive state. In the quantum framework, a complex-valued Hilbert space is used to represent a quantum system, and the state of the system is represented as a vector in this Hilbert space.

A vector *A* in a finite dimensional Hilbert space is written as a ket vector \(|A\rangle\) and its conjugate transpose as a bra vector \(\langle A|\). The norm of this vector is the square root of its inner product with its conjugate, \(\sqrt{\langle A|A\rangle}\). For two such vectors \(|A\rangle\) and \(|B\rangle\), their projection onto each other is given as the squared magnitude of their inner product, \(|\langle A|B\rangle|^2\). Each vector is written as a linear combination of the vectors of the basis in which it is represented. For the purpose of representing the cognitive state of a person judging a document as topically relevant or topically irrelevant, we consider a basis formed by two orthogonal vectors \(|T+\rangle\) and \(|T-\rangle\) respectively. Before a user considers a judgement of topicality, the cognitive state is indefinite with respect to considering the document as topically relevant or irrelevant. Both potentialities exist. We say that the cognitive state collapses to either \(|T+\rangle\) or \(|T-\rangle\) after the judgement. Before the judgement, we can represent the indefinite cognitive state in terms of the probabilities of its potential responses. This is represented as a linear combination of the two basis states, each weighted by real or complex coefficients (called probability amplitudes), such that the squared magnitude of the probability amplitude gives the probability of collapsing to the respective state. The initial state *S* is thus written as:

[equation not preserved in the source]

[…] *u*, *r* and \(\theta_r\) comprise the construction of the Hilbert space for the user’s cognitive state w.r.t. the interaction between the three dimensions. The parameter *t* defines the initial state. The experiment design of Fig. 1 was carried out in [19] for three queries. The results are listed in Fig. 2.
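The construction can be sketched in code. This is a minimal illustration under stated assumptions, not the paper's own parameterisation: the form `a|T+> + sqrt(1 - a^2) e^{i phase}|T->` used by the hypothetical helper `superposition`, and the numeric values chosen for *t*, *u*, *r* and \(\theta_r\), are assumptions made only so that the example runs.

```python
import numpy as np

# Topicality basis taken as the standard basis.
T_PLUS = np.array([1.0, 0.0], dtype=complex)
T_MINUS = np.array([0.0, 1.0], dtype=complex)

def superposition(amplitude, phase=0.0):
    """Unit vector amplitude*|T+> + sqrt(1 - amplitude^2)*exp(i*phase)*|T->.
    This parameterisation is an assumption made for illustration."""
    if not 0.0 <= amplitude <= 1.0:
        raise ValueError("amplitude must lie in [0, 1]")
    return (amplitude * T_PLUS
            + np.sqrt(1.0 - amplitude ** 2) * np.exp(1j * phase) * T_MINUS)

def prob(outcome, state):
    """Probability of collapsing from `state` to `outcome`: |<outcome|state>|^2."""
    return float(abs(np.vdot(outcome, state)) ** 2)

# Hypothetical parameter values, not the ones reported in [19].
t, u, r, theta_r = 0.8, 0.6, 0.7, np.pi / 3

S = superposition(t)                 # initial cognitive state
U_PLUS = superposition(u)            # "understandable" outcome
R_PLUS = superposition(r, theta_r)   # "reliable" outcome, with a relative phase

print(prob(T_PLUS, S))       # probability of judging the document topical
print(prob(U_PLUS, T_PLUS))  # probability of U+ once the state is |T+>
print(prob(R_PLUS, T_PLUS))  # probability of R+ once the state is |T+>
```

Under this assumed form, the squared amplitudes directly give the probabilities (here 0.64, 0.36 and 0.49), which is why measured response frequencies can be used to fix the parameters.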

## 3 Formulation of Research Hypotheses

Using the complex-valued Hilbert Space of multidimensional relevance, this paper aims to design an extended experiment to test the following research hypotheses: (1) Fundamental axioms of classical Kolmogorov probability are violated in a multidimensional relevance judgement scenario; (2) Probabilities obtained from the experiment can be better predicted with quantum than classical (Bayesian) probabilistic models. In the following two subsections, we mathematically formulate these hypotheses.

### 3.1 Violation of Kolmogorov Probability and Quantum Correction

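In classical probability, the disjunction and conjunction of two events satisfy:

\[ \delta = P(A \vee B) + P(A \wedge B) - P(A) - P(B) = 0 \qquad (4) \]

where *A*, *B* are subsets of the set of all alternatives \(\varOmega\), and *P*(*A*), *P*(*B*) are the corresponding probabilities. The axiom will be violated if the value of \(\delta\) is different from zero.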
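As a quick check of the classical side, the quantity \(\delta = P(A \vee B) + P(A \wedge B) - P(A) - P(B)\) can be computed from any joint distribution over two binary events. The sketch below uses a hypothetical joint table (the numbers are illustrative, not experimental data) and the assumed helper name `delta`.

```python
def delta(joint):
    """delta = P(A or B) + P(A and B) - P(A) - P(B).

    `joint` maps (a, b) with a, b in {True, False} to a probability.
    """
    p_and = joint[(True, True)]
    p_or = 1.0 - joint[(False, False)]
    p_a = joint[(True, True)] + joint[(True, False)]
    p_b = joint[(True, True)] + joint[(False, True)]
    return p_or + p_and - p_a - p_b

# Hypothetical joint distribution over two events (sums to 1).
joint = {
    (True, True): 0.2,
    (True, False): 0.3,
    (False, True): 0.1,
    (False, False): 0.4,
}

d = delta(joint)
print(abs(d) < 1e-12)
```

Whenever the four probabilities come from a single joint distribution, \(\delta\) is zero up to floating-point error. A non-zero \(\delta\) in data therefore indicates that no single joint distribution generated the four observed probabilities.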

[…] *A* and *B*. The projection operator \(\varPi(U+)\) is equal to the outer product of the state \(|U+\rangle\) with itself, \(|U+\rangle\langle U+|\), where the vector \(|U+\rangle\) is computed using Eq. 2. In order to construct the vector, first the Topicality basis is represented as the standard basis and hence the orthogonal vectors \(|T+\rangle\) and \(|T-\rangle\) are given as:

\[ |T+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad |T-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \]

Thus, vectors \(|U+\rangle\) and \(|R+\rangle\) are given as:

[equation not preserved in the source]

Then the projector \(\varPi(U+)\) is given as:

[equation not preserved in the source]

Similarly, \(\varPi(R+)\) is:

[equation not preserved in the source]

From the values of *u*, *r* and \(\theta_r\) obtained in [19], these projection operators can be constructed. The quantum analogue of \(\delta\) can then be calculated from Eq. (7). The value of \(\delta\) obtained from our experiment is compared to that predicted by the classical (always zero) and quantum probability frameworks.
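The outer-product construction of a projector can be sketched as follows. The component values of the two vectors are hypothetical placeholders, not the values obtained in [19], and `projector` is an assumed helper name. The sketch only checks generic properties (idempotence, Hermiticity, the Born probability, and non-commutativity); it does not reproduce Eq. (7).

```python
import numpy as np

def projector(vec):
    """Projector |v><v| built as the outer product of a normalised vector with itself."""
    vec = np.asarray(vec, dtype=complex)
    vec = vec / np.linalg.norm(vec)
    return np.outer(vec, vec.conj())

# Hypothetical components in the Topicality (standard) basis.
u_plus = np.array([0.6, 0.8])
r_plus = np.array([0.7, np.sqrt(0.51) * np.exp(1j * np.pi / 3)])
t_plus = np.array([1.0, 0.0], dtype=complex)   # state after a "Yes" on Topicality

Pi_U = projector(u_plus)
Pi_R = projector(r_plus)

# Born rule: P(U+ | state) = <state| Pi_U |state>.
p_u = float(np.real(np.vdot(t_plus, Pi_U @ t_plus)))

# A non-zero commutator signals incompatibility of the two measurements.
commutator = Pi_U @ Pi_R - Pi_R @ Pi_U

print(np.allclose(Pi_U @ Pi_U, Pi_U), np.allclose(Pi_U, Pi_U.conj().T))
print(p_u)
print(np.allclose(commutator, 0))
```

For these placeholder vectors, the projectors are idempotent and Hermitian, `p_u` is approximately 0.36, and the commutator is non-zero, so the two measurements would be incompatible.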

### 3.2 Quantum Probabilities vs Classical Probabilities

The violation of a Kolmogorovian probability axiom by a given system would likely lead to inaccurate predictions for that system when using Kolmogorovian probability. This subsection formulates the computation of conditional probabilities of relevance judgement along one dimension given another, using the classical and the quantum frameworks. They will be compared for our experimental data in Sect. 5.

[…] *T* and *U*, as in general \(P(T+,U+) \ne P(U+,T+)\). As we can see, [expression not preserved in the source], which for [condition not preserved in the source] is not equal to \(P(U+,T+)\) in Eq. 10. The conditional probabilities are given according to Lüders’ rule [2, 10] as:

[equation not preserved in the source]

Note that the subscript *q* is added to distinguish it from the classical conditional probability. Then \(P_q(R+|U+,T+)\) is given as (see [19] Sect. 4.2 for derivation):

[equation not preserved in the source]

In contrast, classical probability theory has the basic assumption of commutativity of two events. Therefore the joint probability distribution always exists, which is the basis of calculating conditional probabilities in Bayes’ rule. Consequently, for events *T*, *U* and *R* we have:

[equation not preserved in the source]
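The contrast between the two conditioning rules can be sketched in code. The quantum function below implements the standard form of Lüders' rule (project the state onto each conditioning outcome in order, renormalise, then project onto the target); the classical function divides a joint probability by a marginal. All vectors and the joint table are hypothetical, and the function names are assumptions; this is not the paper's derivation from [19].

```python
import numpy as np

def projector(vec):
    """Projector onto the direction of `vec`."""
    vec = np.asarray(vec, dtype=complex)
    return np.outer(vec, vec.conj()) / np.vdot(vec, vec).real

def luders_conditional(state, given, target):
    """P_q(target | given), with the projectors in `given` applied in order."""
    for proj in given:
        state = proj @ state
    norm_sq = np.vdot(state, state).real
    if norm_sq == 0:
        raise ValueError("conditioning sequence has zero probability")
    after = target @ state
    return float(np.vdot(after, after).real / norm_sq)

def bayes_conditional(joint, r, u, t):
    """Classical P(R=r | U=u, T=t) from a joint table keyed by (t, u, r)."""
    denom = sum(p for (tt, uu, _), p in joint.items() if (tt, uu) == (t, u))
    return joint[(t, u, r)] / denom

# Hypothetical vectors in the Topicality basis.
state = np.array([0.8, 0.6], dtype=complex)
u_plus = np.array([0.6, 0.8], dtype=complex)
r_plus = np.array([0.7, np.sqrt(0.51) * np.exp(1j * np.pi / 3)])
Pi_T, Pi_U, Pi_R = projector([1.0, 0.0]), projector(u_plus), projector(r_plus)

p_q = luders_conditional(state, [Pi_T, Pi_U], Pi_R)

# Hypothetical classical joint distribution over (T, U, R); sums to 1.
joint = {
    ("+", "+", "+"): 0.30, ("+", "+", "-"): 0.10,
    ("+", "-", "+"): 0.15, ("+", "-", "-"): 0.05,
    ("-", "+", "+"): 0.10, ("-", "+", "-"): 0.10,
    ("-", "-", "+"): 0.05, ("-", "-", "-"): 0.15,
}
p_c = bayes_conditional(joint, "+", "+", "+")

print(p_q, p_c)
```

Because each projector here is rank one, the state after judging *T* and then *U* positively is \(|U+\rangle\), so `p_q` reduces to \(|\langle R+|U+\rangle|^2\) regardless of the initial state. The classical value depends only on the joint table and is unaffected by the order in which *T* and *U* are considered.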

## 4 Experiment

### 4.1 Methodology

1. The participants are shown the information need, query and document snippet.

2. Next, they are asked a Yes/No question about the Topicality of the document. This is to prepare the cognitive state of all participants by projecting their initial/background state onto the Topicality subspace of the underlying Hilbert space constructed in the previous experiment in [19].

3. Lastly, they are randomly shown one of the eight possible conjunction or disjunction questions and asked to choose the appropriate answer (Fig. 3).
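The random assignment in the last step can be sketched as below. It assumes that the eight questions arise from combining one of two connectives with a positive or negative outcome for each of Understandability and Reliability; the labels, the helper names and the seed are hypothetical, and the actual question wording is not reproduced here.

```python
import random
from itertools import product

CONNECTIVES = ("and", "or")     # conjunction / disjunction
U_OUTCOMES = ("U+", "U-")       # Understandability: Yes / No
R_OUTCOMES = ("R+", "R-")       # Reliability: Yes / No

def all_questions():
    """Enumerate the assumed 2 x 2 x 2 = 8 question types as (U, connective, R)."""
    return [(u, c, r) for c, u, r in product(CONNECTIVES, U_OUTCOMES, R_OUTCOMES)]

def assign_question(rng):
    """Pick one question type uniformly at random for a participant."""
    return rng.choice(all_questions())

rng = random.Random(0)          # fixed seed only to make the sketch repeatable
assignments = [assign_question(rng) for _ in range(335)]

print(len(all_questions()))
```

Uniform assignment spreads participants roughly evenly across the eight question types, so each conjunction or disjunction probability is estimated from a comparable number of responses.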

### 4.2 Participants and Material

We recruited 335 participants for the experiment using the online crowd-sourcing platform Prolific (prolific.ac). The study was designed using the survey platform Qualtrics (qualtrics.com/uk). The participants were paid at a rate of £6.30/h. We sought the participants’ consent and complied with the local data protection guidelines. The study was approved by The Open University UK’s Human Research Ethics Committee with reference number HREC/3063/Uprety.

## 5 Results and Discussion

### 5.1 Violation of Kolmogorov Probability Axiom

The probabilities of conjunction and disjunction of the Understandability and Reliability questions are reported in Fig. 6. In order to compute the \(\delta\) reported in Eq. 4, we also need the two probabilities related to the single questions \(U+\) and \(R+\), apart from the conjunction and disjunction probabilities. These single question probabilities are obtained from the results in [19] (listed in Fig. 2). Then, we calculate \(\delta = P(U\pm \vee R\pm |T+) + P(U\pm \wedge R\pm |T+) - P(R+|T+) - P(U+|T+)\). In Fig. 6 we see that \(\delta\) is different from zero for all three queries, although according to classical probability we expect \(\delta\) to be zero in all cases. Eq. (7), based on the projection operators in quantum probability, gives the predictions of \(\delta\) shown in the last column of the table.

[…] *U* and *R*. As we can see, if the operators of *U* and *R* commute with each other, the quantum correction term in Eq. (7) vanishes (the commutator is zero). In fact, the probability values obtained may violate some of the other basic axioms of classical/Kolmogorovian probability. For example, for Query 2, we can see that \(P(U- \wedge R+|T+) = 0.414\) and \(P(U-|T+) = 0.198\), which clearly violates \(P(A,B) \le P(A)\). Also, for this query, \(P(U- \wedge R-|T+)\) is greater than both \(P(U-|T+)\) and \(P(R-|T+)\). This type of violation has been termed the conjunction fallacy in the cognitive science literature [18]. Quantum models have previously been used to explain such violations [3], where the fundamental notion of incompatibility in judgements is identified as the potential cause.
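The conjunction bound can be checked mechanically. The sketch below uses the two Query 2 values quoted in the text (0.414 and 0.198); the function name `violates_conjunction_bound` is an assumption introduced for illustration.

```python
def violates_conjunction_bound(p_and, p_a, p_b=None):
    """Return True if P(A and B) exceeds P(A) (and P(B), when supplied).

    Classical probability requires P(A and B) <= min(P(A), P(B)).
    """
    bounds = [p for p in (p_a, p_b) if p is not None]
    return p_and > min(bounds)

# Values quoted in the text for Query 2:
# P(U- and R+ | T+) = 0.414, P(U- | T+) = 0.198
query2_violation = violates_conjunction_bound(0.414, 0.198)
print(query2_violation)
```

For the quoted values the check returns `True`, since the conjunction is judged more probable than one of its own constituents.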

### 5.2 Comparison of Quantum and Classical Probability Predictions

Figure 7 shows a comparison between quantum and classical probabilities with the experimental data for the first two queries. The data for Query 3 had many probabilities close to 0 (see Fig. 2) and hence the sample became too small for a meaningful comparison. The probabilities are calculated for the prediction of the judgement of Reliability given that the participant has judged Understandability and Topicality (positively), using equations derived in Sect. 2.1. Bayesian probabilities, in some cases, are significantly different from the experimental data (\(P(R+|U-,T+)\) for Query 1 and \(P(R-|U-,T+)\) for Query 2). Quantum probabilities are consistently closer to the experimental data.

[…] *R*, *U* and *T* can be jointly measured. In terms of the judgement process, this implies that a user can jointly consider information regarding the Reliability, Understandability and Topicality of a document with respect to the query. The incompatibility revealed in [19] and the order effects shown in [1] suggest that this is not always the case. Therefore we see Bayesian predictions deviate from the experimental data. As the quantum probability theory based on the Hilbert space model is free from this assumption of compatibility, it provides a promising alternative model that gives predictions closer to the experimental data. In fact, the modelling of incompatibility of different judgement perspectives forms one of the pillars of the Quantum Cognition research framework.

## 6 Implications for IR

Quantum models can capture richer cognitive interactions, by way of generalising some of the constraints of classical models like commutativity. Here we discuss a few cases where our findings can inform the design of IR systems and algorithms.

The impossibility of jointly modelling Reliability and Understandability (which leads to the Kolmogorovian axiom violations) can be attributed to the fact that humans make decisions in a sequential manner and consideration of one dimension affects the judgement of the next dimension. Therefore, different orders of consideration of dimensions would lead to different final relevance judgements, *making the order a factor in the variability of relevance judgements by users*. When using an IR system to perform a task or make an important decision, there might be a particular order of dimensions which can lead the user to make an optimal decision. For example, for a health related query, a user might find a document difficult to understand, which may affect his or her judgement of Reliability and hence the overall relevance. However, if another user first judges reliability and finds it highly reliable, the judgement of understandability might be different. The IR system can help users to consider the optimum sequence of dimensions and thus maximise the utility, by providing extra information. For example, if the system can also provide information about the Reliability of the document in terms of a Reliability score or ratings by other users, it can reduce uncertainty in judgement and thus minimise the influence of judgement of other dimensions. Thus, for the given medical document, the low understandability might not affect the perception of Reliability.

Secondly, quantum probabilistic models can replace Bayesian models used in IR algorithms for ranking and evaluation. For example, in [13], a multidimensional evaluation metric is proposed where the gain provided by a document is written as a function of the joint probability of relevance with respect to different dimensions, e.g. *P*(*T*, *U*, *R*, ...). Similar assumptions have also been made in [12, 25]. For documents exhibiting incompatibility between different dimensions, predictions from such a model will be inaccurate. A probabilistic model based on non-commutative operator algebra, accounting for the incompatibility between different dimensions, needs to be considered.

Finally, these results showing violation of classical probability theory call for further user behaviour experiments in IR that exploit the quantum-like structure in human judgements. This would require novel experimental protocols, like those of the Stern-Gerlach and double-slit experiments, to generate data beyond the modelling capacity of classical probability theory. Such experiments might in themselves lead to new insights into user behaviour in IR and information-based decision-making in general.

## 7 Conclusion

Extending a quantum-inspired experiment protocol, in this work we begin with the hypothesis that the multidimensional property of relevance has an underlying quantum cognitive structure, which can be shown as a violation of certain classical (Kolmogorovian) probability axioms. A particular experimental design is reported which can exploit the quantum cognitive structure. The data shows violation of one of the Kolmogorovian probability axioms. We further show that quantum probability theory is a better alternative for modelling multidimensional relevance judgements than its classical counterpart, i.e. the Bayesian model. Finally, we highlight important implications of our research findings for the design of IR algorithms, systems and user experiments.

## Notes

### Acknowledgements

Authors affiliated to the universities in UK, Italy and China are funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 721321, National Key Research and Development Program of China (grant No. 2018YFC0831704) and Natural Science Foundation of China (grant No. U1636203). Authors affiliated to QUT, Australia are supported by the Asian Office of Aerospace Research and Development (AOARD) grant: FA2386-17-1-4016.

## References

1. Bruza, P., Chang, V.: Perceptions of document relevance. Front. Psychol. **5**, 612 (2014). https://doi.org/10.3389/fpsyg.2014.00612
2. Busemeyer, J.R., Bruza, P.D.: Quantum Models of Cognition and Decision, 1st edn. Cambridge University Press, New York (2012)
3. Busemeyer, J.R., Pothos, E.M., Franco, R., Trueblood, J.S.: A quantum theoretical explanation for probability judgment errors. Psychol. Rev. **118**(2), 193–218 (2011). https://doi.org/10.1037/a0022542
4. Clemmensen, M.L., Borlund, P.: Order effect in interactive information retrieval evaluation: an empirical study. J. Doc. **72**(2), 194–213 (2016). https://doi.org/10.1108/JD-04-2015-0051
5. Cool, C., Belkin, N.J.: Interactive information retrieval: history and background. Facet, 1–14 (2011). https://doi.org/10.29085/9781856049740.003
6. Eisenberg, M., Barry, C.: Order effects: a study of the possible influence of presentation order on user judgments of document relevance. J. Am. Soc. Inf. Sci. **39**(5), 293–300 (1988)
7. Fell, L., Dehdashti, S., Bruza, P., Moreira, C.: An experimental protocol to derive and validate a quantum model of decision-making. In: Proceedings of the 41st Annual Meeting of the Cognitive Science Society (COGSCI 2019) (2019)
8. Hogarth, R.M., Einhorn, H.J.: Order effects in belief updating: the belief-adjustment model. Cognitive Psychol. **24**(1), 1–55 (1992). https://doi.org/10.1016/0010-0285(92)90002-j
9. Huang, M.H., Wang, H.Y.: The influence of document presentation order and number of documents judged on user’s judgments of relevance. J. Am. Soc. Inf. Sci. Technol. **55**(11), 970–979 (2004). https://doi.org/10.1002/asi.20047
10. Khrennikov, A.: Basics of quantum theory for quantum-like modeling information retrieval. In: Aerts, D., Khrennikov, A., Melucci, M., Toni, B. (eds.) Quantum-Like Models for Information Retrieval and Decision-Making. SSTEAMH, pp. 51–82. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25913-6_4
11. Kolmogorov, A.N.: Foundations of the Theory of Probability. Martino Fine Books, Eastford (2013)
12. Palotti, J., Goeuriot, L., Zuccon, G., Hanbury, A.: Ranking health web pages with relevance and understandability. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval SIGIR 2016, pp. 965–968. ACM, New York (2016). https://doi.org/10.1145/2911451.2914741
13. Palotti, J., Zuccon, G., Hanbury, A.: MM: a new framework for multidimensional evaluation of search engines. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management CIKM 2018, pp. 1699–1702. ACM, New York (2018). https://doi.org/10.1145/3269206.3269261
14. Pothos, E.M., Busemeyer, J.R.: A quantum probability explanation for violations of ‘rational’ decision theory. Proc. R. Soc. B: Biol. Sci. **276**(1665), 2171–2178 (2009). https://doi.org/10.1098/rspb.2009.0121
15. Sakurai, J.J., Napolitano, J.: Modern Quantum Mechanics. Cambridge University Press, Cambridge (2017)
16. Trueblood, J.S., Busemeyer, J.R.: A quantum probability account of order effects in inference. Cognitive Sci. **35**(8), 1518–1552 (2011). https://doi.org/10.1111/j.1551-6709.2011.01197.x
17. Tversky, A., Kahneman, D.: Judgment under uncertainty: heuristics and biases. Science **185**(4157), 1124–1131 (1974). https://doi.org/10.1126/science.185.4157.1124
18. Tversky, A., Kahneman, D.: Extensional versus intuitive reasoning: the conjunction fallacy in probability judgment. Psychol. Rev. **90**(4), 293–315 (1983). https://doi.org/10.1037/0033-295x.90.4.293
19. Uprety, S., Dehdashti, S., Fell, L., Bruza, P., Song, D.: Modelling dynamic interactions between relevance dimensions. In: Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval - ICTIR 2019. ACM Press (2019). https://doi.org/10.1145/3341981.3344233
20. Uprety, S., Song, D.: Investigating order effects in multidimensional relevance judgment using query logs. In: Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval ICTIR 2018, pp. 191–194. ACM, New York (2018). https://doi.org/10.1145/3234944.3234972
21. Vourdas, A.: Probabilistic inequalities and measurements in bipartite systems. J. Phys. A: Math. Theor. **52**(8), 085301 (2019). https://doi.org/10.1088/1751-8121/aafe97
22. Wang, B., Zhang, P., Li, J., Song, D., Hou, Y., Shang, Z.: Exploration of quantum interference in document relevance judgement discrepancy. Entropy **18**(12), 144 (2016). https://doi.org/10.3390/e18040144
23. Wang, Z., Busemeyer, J.R.: A quantum question order model supported by empirical tests of an a priori and precise prediction. Topics Cognitive Sci. **5**(4), 689–710 (2013)
24. Xu, Y., Wang, D.: Order effect in relevance judgment. J. Am. Soc. Inf. Sci. Technol. **59**(8), 1264–1275 (2008). https://doi.org/10.1002/asi.20826
25. Zuccon, G.: Understandability biased evaluation for information retrieval. In: Ferro, N., et al. (eds.) ECIR 2016. LNCS, vol. 9626, pp. 280–292. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30671-1_21