Quantum-Like Structure in Multidimensional Relevance Judgements
A large number of studies in cognitive science have revealed that probabilistic outcomes of certain human decisions do not agree with the axioms of classical probability theory. The field of Quantum Cognition provides an alternative probabilistic model to explain such paradoxical findings. It posits that cognitive systems have an underlying quantum-like structure, especially in decision-making under uncertainty. In this paper, we hypothesise that relevance judgement, being a multidimensional, cognitive concept, can be used to probe the quantum-like structure for modelling users’ cognitive states in information seeking. Extending from an experiment protocol inspired by the Stern-Gerlach experiment in Quantum Physics, we design a crowd-sourced user study to show violation of the Kolmogorovian probability axioms as a proof of the quantum-like structure, and provide a comparison between a quantum probabilistic model and a Bayesian model for predictions of relevance.
Keywords: Multidimensional relevance · User behaviour · Quantum Cognition
Relevance in Information Retrieval (IR) is widely accepted to be a cognitive feature, driving all our information interactions. All areas of research within IR thus strive to improve the relevance of documents to a user’s information need (IN). These research areas can be broadly divided into two: system-oriented and user-oriented IR. Whereas the system-oriented viewpoint treats relevance as an objective property of the document and query content, the user-oriented approach views relevance as a cognitive property. Although IR fundamentally involves user interaction and decision-making, the user-oriented approach has been found harder to implement, especially in evaluating the performance of IR systems. This is because of the variability in user judgements of relevance. System-oriented IR thus sought to standardise IR evaluation by replacing the user-cognitive notion of relevance with an objective, topical relevance. This led to evaluation methodologies based on the Cranfield and TREC type test collections. The user and all of his/her contexts were removed from the evaluation process.
The recent surge in the availability of online user data has led to the incorporation of more user context in the computation of relevance, e.g. in learning-based ranking algorithms. This context is based on the user’s past interactions with the system, in addition to user attributes like age, interests, etc. and current attributes like location, type of device, etc. The common feature of these various contexts is that they are static: they are determined before the point of the user’s interaction with the IR system. However, the process of IR is interactive and dynamic. In this paper, we focus on another type of context driving user interactions - dynamic context. Dynamic context is one which changes the user’s cognitive state during information interaction.
One well-known example of a dynamic context affecting relevance is the phenomenon of the Order Effect. Order effects have been investigated and found to exist in IR in the presentation order of documents [4, 6, 9, 24]. For example, in a recent study, two groups of participants were presented with a pair of documents \(D_1\) and \(D_2\) in two different orders. For some such pairs, the relevance of a document as judged by users differed depending on the order in which it was presented. Although the phenomenon may appear to have an intuitive explanation, it violates one of the fundamental assumptions of classical probability theory: the existence of a joint distribution. For two random variables \(R_1\) and \(R_2\) representing the relevance of the two documents, \(P(R_1, R_2) = P(R_2, R_1)\), i.e., the order of judging the documents does not matter. Order effects violate this assumption. Such order effects have also been investigated and reported between the different dimensions of relevance, like Topicality, Understandability, Reliability, etc. [1, 19, 20], where different orders in which the dimensions are considered when judging a document lead to different relevance judgements.
The field of Quantum Cognition offers a generalised framework to model probabilistic outcomes of human decision-making. It has been successful in modelling and predicting order effects [16, 23] and other paradoxical findings where axioms of classical probability theory are violated [3, 14]. Conceptually, it challenges the notion that cognitive states have pre-defined values and that a measurement merely records them. Instead, the act of measurement creates a definite state out of an indefinite state and, in doing so, changes the initial state of the cognitive system. In terms of relevance, we cannot pre-assign the relevance of a document for a user. Instead, relevance is defined only at the point of interaction of the user’s cognitive state with the document. Therefore, judging document \(D_2\) first changes the user’s initial state, and the subsequent judgement of the relevance of \(D_1\) differs from the one made when \(D_1\) is judged before \(D_2\). Should the relevance of the documents for a user be a pre-defined entity, it would not be influenced by the judgement of other documents, and a joint distribution over the relevance of the two documents would exist. We also say that these two measurements of relevance are incompatible with each other. That is, it is not possible to jointly consider the relevance of the two documents at the same time. At the mathematical level, measurements in quantum theory are represented by operators which, in general, do not commute with each other.
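The role of non-commuting measurements can be illustrated with a minimal Python sketch. It assumes a real two-dimensional state space, models each "relevant" judgement as a projection onto a single direction, and uses illustrative angles of 30° and 60° for \(D_1\) and \(D_2\). The function names and all numerical values are assumptions made for illustration and are not taken from the study.

```python
import math

def unit(theta):
    """Unit vector at angle theta (radians) in a real 2-D state space."""
    return (math.cos(theta), math.sin(theta))

def project(state, axis):
    """Project `state` onto the ray spanned by the unit vector `axis`."""
    overlap = state[0] * axis[0] + state[1] * axis[1]
    return (overlap * axis[0], overlap * axis[1])

def norm_sq(state):
    """Squared length of a vector, i.e. the probability it carries."""
    return state[0] ** 2 + state[1] ** 2

def yes_then_yes(state, first, second):
    """Probability of 'relevant' for `first`, followed by 'relevant' for `second`."""
    return norm_sq(project(project(state, first), second))

psi = unit(0.0)            # hypothetical initial cognitive state
d1 = unit(math.pi / 6)     # "D1 is relevant" direction (illustrative angle)
d2 = unit(math.pi / 3)     # "D2 is relevant" direction (illustrative angle)

p_d1_then_d2 = yes_then_yes(psi, d1, d2)
p_d2_then_d1 = yes_then_yes(psi, d2, d1)

print(f"D1 then D2: {p_d1_then_d2:.4f}")
print(f"D2 then D1: {p_d2_then_d1:.4f}")
```

Under these assumed angles the two orders give about 0.5625 and 0.1875 respectively, so no single joint distribution with \(P(R_1, R_2) = P(R_2, R_1)\) can reproduce both sequences.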
In a classical system, all measurements will commute with each other. However, conversely, commutativity of measurements does not necessarily imply that the system is classical. Therefore, the type of measurements becomes imperative in identifying a quantum system. Even then, not all measurements on quantum systems generate data violating the classical probability theory. The system needs to be probed in a way which exploits the underlying quantum structure. In physics, this was done by experiments such as Stern-Gerlach and double-slit experiments  which showed the violation of classical probability principles for microscopic particles like electrons and photons. In cognitive science too, several experiments performed by Tversky, Kahneman and colleagues showed such violations in human decision-making under uncertainty .
Recently, an experiment protocol inspired by the Stern-Gerlach experiment in Physics has provided a new way to probe cognitive systems such that they exhibit a quantum-like structure . By quantum-like structure we mean the representation of a system using the mathematical framework of quantum theory in order to model and predict the experimental data. In , this experiment was performed in an IR scenario involving judgement of relevance with respect to different dimensions. Extending from the Stern-Gerlach protocol, in this paper we design a new experiment to show the violation of classical probability theory in multidimensional relevance judgements. We hypothesise that multidimensional relevance judgement has an underlying quantum-like structure, which when subject to appropriate measurement design can exhibit violations of classical probability theory. Specifically, we investigate the violation of a particular axiom of Kolmogorovian probability theory . Our results show that the experimental data indeed violates classical probability theory, and a quantum framework provides more accurate predictions to describe the data. This experiment not only shows the necessity of the quantum framework as an alternative for constructing probabilistic models, but also gives novel insights into user behaviour in IR. This understanding can contribute to improvement of interactive IR systems and we also discuss such implications in this paper.
2 Stern-Gerlach Inspired Protocol for Multidimensional Relevance
The S-G experiment also describes the minimum number of measurements required from a system to construct a complex-valued Hilbert Space structure. In particular, we need three incompatible measurements each with two mutually exclusive outcomes. We can use this arrangement of measuring properties of a quantum system to measure relevance of a document in IR. For this, we consider three dimensions of relevance: Topicality (T) - whether a document is topically relevant to a query, Understandability (U) - how easy it is to understand the content of the document, and Reliability (R) - how much can the document be relied upon. Each of these three dimensions can be posed as questions requiring a Yes/No type answer (denoted as \(+\) and − respectively) for a document. These three dimensions are important factors considered by users for deciding relevance. Besides, they are tied to a single document, unlike diversity or novelty, which is always considered in comparison with other documents. Certain dimensions like Interest, Habit, etc. are difficult to ascertain via crowdsourcing. As reported in , the different relevance dimensions can exhibit incompatibility for certain query-document pairs.
In , three query-document pairs were designed in such a way as to potentially exhibit incompatibility between judgement of relevance with respect to different dimensions. The content of the documents was altered to introduce uncertainty in judging each of the three dimensions. The participants were presented with three questions related to three relevance dimensions, for each query-document pair, in line with the S-G design. Figure 1 shows the three questions asked to two different groups in different orders. More details about this design can be found in  and . This setup enables one to construct a complex-valued Hilbert space, which models the quantum-like structure of the user’s cognitive state during information interaction.
2.1 Constructing Complex-Valued Hilbert Space
The first step in building a quantum probabilistic model is to construct a representation for the user’s cognitive state. In the quantum framework, a complex-valued Hilbert space is used to represent a quantum system, and the state of the system is represented as a vector in this Hilbert space.
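As a rough sketch of such a representation, the Python snippet below builds a two-dimensional complex state space with three two-outcome bases, one per relevance dimension. The basis vectors are borrowed from the standard spin-1/2 example used to describe the Stern-Gerlach experiment; they are an assumption for illustration and are not the vectors fitted to the study's data. The names `BASES`, `inner` and `born` are likewise hypothetical.

```python
import math

def inner(a, b):
    """Complex inner product <a|b> of two 2-component vectors."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def born(state, outcome):
    """Born rule: probability of `outcome` when the system is in `state`."""
    return abs(inner(outcome, state)) ** 2

s = 1 / math.sqrt(2)

# Hypothetical bases (spin-1/2 analogy), NOT the paper's fitted vectors.
BASES = {
    "T": {"+": (1 + 0j, 0j), "-": (0j, 1 + 0j)},
    "U": {"+": (s + 0j, s + 0j), "-": (s + 0j, -s + 0j)},
    "R": {"+": (s + 0j, s * 1j), "-": (s + 0j, -s * 1j)},
}

# Cognitive state after a 'yes' answer on Topicality.
state = BASES["T"]["+"]

probs = {
    dim: {answer: born(state, vec) for answer, vec in basis.items()}
    for dim, basis in BASES.items()
}

for dim, dist in probs.items():
    print(dim, {answer: round(p, 3) for answer, p in dist.items()})
```

In this toy arrangement the third basis needs imaginary amplitudes to be maximally uncertain with respect to the other two, which is one way to see why a complex-valued rather than a real-valued space is used.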
3 Formulation of Research Hypotheses
Using the complex-valued Hilbert Space of multidimensional relevance, this paper aims to design an extended experiment to test the following research hypotheses: (1) Fundamental axioms of classical Kolmogorov probability are violated in a multidimensional relevance judgement scenario; (2) Probabilities obtained from the experiment can be better predicted with quantum than classical (Bayesian) probabilistic models. In the following two subsections, we mathematically formulate these hypotheses.
3.1 Violation of Kolmogorov Probability and Quantum Correction
3.2 Quantum Probabilities vs Classical Probabilities
The violation of a Kolmogorovian probability axiom by a given system would likely lead to inaccurate predictions about that system when Kolmogorovian probability is used. This subsection formulates the computation of conditional probabilities of relevance judgement along one dimension given another, using the classical and the quantum frameworks. The two are compared on our experimental data in Sect. 5.
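The contrast between the two kinds of conditional probability can be sketched in Python as follows. The classical side conditions on a joint table over the answers to (T, U, R); the quantum side applies the standard Lüders rule, i.e. it projects the state onto each earlier answer in order, renormalises, and then applies the Born rule. The joint table, the basis vectors and the function names are all assumptions for illustration; the formulation used in the paper may differ in its details.

```python
import math

def inner(a, b):
    """Complex inner product <a|b>."""
    return sum(x.conjugate() * y for x, y in zip(a, b))

def project(state, axis):
    """Apply the rank-1 projector |axis><axis| to `state`."""
    amp = inner(axis, state)
    return tuple(amp * x for x in axis)

def quantum_conditional(state, given, target):
    """Lüders rule: project onto earlier answers in order, renormalise, apply Born."""
    for axis in given:
        state = project(state, axis)
    norm_sq = inner(state, state).real  # zero if the conditioning sequence is impossible
    return abs(inner(target, state)) ** 2 / norm_sq

def bayes_conditional(joint, target, given):
    """P(target | given) from a joint table keyed by (T, U, R) answers."""
    def matches(key, cond):
        return all(key[i] == v for i, v in cond.items())
    denom = sum(p for k, p in joint.items() if matches(k, given))
    numer = sum(p for k, p in joint.items() if matches(k, {**given, **target}))
    return numer / denom

s = 1 / math.sqrt(2)
T_PLUS = (1 + 0j, 0j)          # hypothetical basis vectors (spin-1/2 analogy)
U_MINUS = (s + 0j, -s + 0j)
R_PLUS = (s + 0j, s * 1j)

# Hypothetical joint distribution over (T, U, R); the values are invented.
JOINT = {
    ("+", "+", "+"): 0.30, ("+", "+", "-"): 0.10,
    ("+", "-", "+"): 0.05, ("+", "-", "-"): 0.15,
    ("-", "+", "+"): 0.10, ("-", "+", "-"): 0.10,
    ("-", "-", "+"): 0.05, ("-", "-", "-"): 0.15,
}

# P(R+ | U-, T+) under each framework; indices 0, 1, 2 stand for T, U, R.
p_classical = bayes_conditional(JOINT, target={2: "+"}, given={0: "+", 1: "-"})
p_quantum = quantum_conditional(T_PLUS, given=[T_PLUS, U_MINUS], target=R_PLUS)

print(f"classical: {p_classical:.3f}  quantum: {p_quantum:.3f}")
```

The classical value depends only on the joint table and is insensitive to the order in which T and U were judged, whereas the quantum value depends on the order of the projections in `given`.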
The participants are shown the information need, the query and a document snippet.
Next, they are asked a Yes/No question about the Topicality of the document. This is to prepare the cognitive state of all participants by projecting their initial/background state onto the Topicality subspace of the underlying Hilbert space constructed in the previous experiment in .
Lastly, they are randomly shown one of the eight possible conjunction or disjunction questions and asked to choose the appropriate answer (Fig. 3).
4.2 Participants and Material
We recruited 335 participants for the experiment using the online crowd-sourcing platform Prolific (prolific.ac). The study was designed using the survey platform Qualtrics (qualtrics.com/uk). The participants were paid at a rate of £6.30/h. We sought the participants’ consent and complied with the local data protection guidelines. The study was approved by The Open University UK’s Human Research Ethics Committee with reference number HREC/3063/Uprety.
5 Results and Discussion
5.1 Violation of Kolmogorov Probability Axiom
The probabilities of conjunction and disjunction of the Understandability and Reliability questions are reported in Fig. 6. In order to compute the \(\delta \) reported in Eq. 4, we also need the two probabilities related to the single questions \(U+\) and \(R+\), apart from the conjunction and disjunction probabilities. These single question probabilities are obtained from the earlier results listed in Fig. 2. Then, we calculate \(\delta = P(U\pm \vee R\pm |T+) + P(U\pm \wedge R\pm |T+) - P(R+|T+) - P(U+|T+)\). In Fig. 6 we see that \(\delta \) is different from zero for all three queries, although according to classical probability we expect \(\delta \) to be zero in all cases. Eq. (7), based on the projection operators in quantum probability, gives predictions of \(\delta \), which are shown in the last column of the table.
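The computation of \(\delta \) can be sketched in a few lines of Python, shown here for the "+" answers. The first part checks that any classical joint distribution over the two answers yields \(\delta = 0\); the second part plugs in invented "observed" probabilities to show what a non-zero value looks like. All numbers and names below are hypothetical and are not the values reported in Fig. 2 or Fig. 6.

```python
def delta(p_or, p_and, p_u, p_r):
    """delta = P(U or R | T+) + P(U and R | T+) - P(U | T+) - P(R | T+)."""
    return p_or + p_and - p_u - p_r

# 1) Classical case: a hypothetical joint distribution over (U, R) given T+.
joint = {
    ("+", "+"): 0.4,
    ("+", "-"): 0.2,
    ("-", "+"): 0.1,
    ("-", "-"): 0.3,
}
p_u = joint[("+", "+")] + joint[("+", "-")]   # marginal P(U+ | T+)
p_r = joint[("+", "+")] + joint[("-", "+")]   # marginal P(R+ | T+)
p_and = joint[("+", "+")]                     # P(U+ and R+ | T+)
p_or = 1.0 - joint[("-", "-")]                # P(U+ or R+ | T+)
classical_delta = delta(p_or, p_and, p_u, p_r)

# 2) Invented "observed" values collected from separate questions.
observed_delta = delta(p_or=0.80, p_and=0.45, p_u=0.60, p_r=0.50)

print(f"classical delta: {classical_delta:.3f}")
print(f"observed delta:  {observed_delta:.3f}")
```

Whenever the four probabilities come from one joint distribution, the inclusion-exclusion identity forces \(\delta \) to vanish, so a value clearly different from zero indicates that no such joint distribution describes the answers.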
5.2 Comparison of Quantum and Classical Probability Predictions
Figure 7 shows a comparison of the quantum and classical probabilities with the experimental data for the first two queries. The data for Query 3 had many probabilities close to 0 (see Fig. 2), and hence the sample became too small for a meaningful comparison. The probabilities are calculated for the prediction of the judgement of Reliability given that the participant has judged Understandability and Topicality (positively), using equations derived in Sect. 2.1. Bayesian probabilities are, in some cases, significantly different from the experimental data (\(P(R+|U-,T+)\) for query 1 and \(P(R-|U-,T+)\) for query 2). Quantum probabilities are consistently closer to the experimental data.
6 Implications for IR
Quantum models can capture richer cognitive interactions, by way of generalising some of the constraints of classical models like commutativity. Here we discuss a few cases where our findings can inform the design of IR systems and algorithms.
The impossibility of jointly modelling Reliability and Understandability (which leads to the Kolmogorovian axiom violations) can be attributed to the fact that humans make decisions in a sequential manner and consideration of one dimension affects the judgement of the next dimension. Therefore, different orders of consideration of dimensions would lead to different final relevance judgements, making the order a factor in the variability of relevance judgements by users. When using an IR system to perform a task or make an important decision, there might be a particular order of dimensions which can lead the user to make an optimal decision. For example, for a health related query, a user might find a document difficult to understand, which may affect his or her judgement of Reliability and hence the overall relevance. However, if another user first judges reliability and finds it highly reliable, the judgement of understandability might be different. The IR system can help users to consider the optimum sequence of dimensions and thus maximise the utility, by providing extra information. For example, if the system can also provide information about the Reliability of the document in terms of a Reliability score or ratings by other users, it can reduce uncertainty in judgement and thus minimise the influence of judgement of other dimensions. Thus, for the given medical document, the low understandability might not affect the perception of Reliability.
Secondly, quantum probabilistic models can replace Bayesian models used in IR algorithms for ranking and evaluation. For example, in , a multidimensional evaluation metric is proposed where the gain provided by a document is written as a function of the joint probability of relevance with respect to different dimensions, e.g. P(T, U, R, ...). Similar assumptions have also been made in [12, 25]. For documents exhibiting incompatibility between different dimensions, predictions from such a model will be inaccurate. A probabilistic model based on non-commutative operator algebra, accounting for the incompatibility between different dimensions, needs to be considered.
Finally, these results on the violation of classical probability theory call for further user behaviour experiments in IR that exploit the quantum-like structure in human judgements. This would require novel experimental protocols, like those of the Stern-Gerlach and double-slit experiments, to generate data beyond the modelling capacity of classical probability theory. Such experiments in themselves might lead us to new insights into user behaviour in IR and information-based decision-making in general.
Extending a quantum-inspired experiment protocol, in this work we begin with the hypothesis that the multidimensional property of relevance has an underlying quantum cognitive structure, which can be shown as a violation of certain classical (Kolmogorovian) probability axioms. A particular experimental design is reported which can exploit the quantum cognitive structure. The data shows a violation of one of the Kolmogorovian probability axioms. We further show that quantum probability theory is a better alternative for modelling multidimensional relevance judgements than its classical counterpart, i.e. the Bayesian model. Finally, we highlight important implications of our research findings for the design of IR algorithms, systems and user experiments.
Authors affiliated to the universities in the UK, Italy and China are funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No. 721321, the National Key Research and Development Program of China (grant No. 2018YFC0831704) and the Natural Science Foundation of China (grant No. U1636203). Authors affiliated to QUT, Australia are supported by the Asian Office of Aerospace Research and Development (AOARD) grant: FA2386-17-1-4016.
- 5. Cool, C., Belkin, N.J.: Interactive information retrieval: history and background. Facet, 1–14 (2011). https://doi.org/10.29085/9781856049740.003
- 7. Fell, L., Dehdashti, S., Bruza, P., Moreira, C.: An experimental protocol to derive and validate a quantum model of decision-making. In: Proceedings of the 41st Annual Meeting of the Cognitive Science Society (COGSCI 2019) (2019)
- 10. Khrennikov, A.: Basics of quantum theory for quantum-like modeling information retrieval. In: Aerts, D., Khrennikov, A., Melucci, M., Toni, B. (eds.) Quantum-Like Models for Information Retrieval and Decision-Making. SSTEAMH, pp. 51–82. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-25913-6_4
- 11. Kolmogorov, A.N.: Foundations of the Theory of Probability. Martino Fine Books, Eastford (2013)
- 12. Palotti, J., Goeuriot, L., Zuccon, G., Hanbury, A.: Ranking health web pages with relevance and understandability. In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval SIGIR 2016, pp. 965–968. ACM, New York (2016). https://doi.org/10.1145/2911451.2914741
- 13. Palotti, J., Zuccon, G., Hanbury, A.: MM: a new framework for multidimensional evaluation of search engines. In: Proceedings of the 27th ACM International Conference on Information and Knowledge Management CIKM 2018, pp. 1699–1702. ACM, New York (2018). https://doi.org/10.1145/3269206.3269261
- 19. Uprety, S., Dehdashti, S., Fell, L., Bruza, P., Song, D.: Modelling dynamic interactions between relevance dimensions. In: Proceedings of the 2019 ACM SIGIR International Conference on Theory of Information Retrieval - ICTIR 2019. ACM Press (2019). https://doi.org/10.1145/3341981.3344233
- 20. Uprety, S., Song, D.: Investigating order effects in multidimensional relevance judgment using query logs. In: Proceedings of the 2018 ACM SIGIR International Conference on Theory of Information Retrieval ICTIR 2018, pp. 191–194. ACM, New York (2018). https://doi.org/10.1145/3234944.3234972
- 23. Wang, Z., Busemeyer, J.R.: A quantum question order model supported by empirical tests of an a priori and precise prediction. Topics Cognitive Sci. 5(4), 689–710 (2013)