Scoring in context

  • S.I.: LogPerSciCog
  • Synthese

Abstract

A number of authors have recently put forward arguments pro or contra various rules for scoring probability estimates. In doing so, they have skipped over a potentially important consideration in making such assessments, to wit, that the hypotheses whose probabilities are estimated can approximate the truth to different degrees. Once this is recognized, it becomes apparent that the question of how to assess probability estimates depends heavily on context.

Notes

  1. Or the quadratic scoring rule, which is a generalization of the Brier score; see below.

  2. That the Brier rule cannot guarantee this is a direct consequence of the general fact mentioned two paragraphs back.

  3. Selten instead prefers the Brier rule, mainly because, as he proves, it is the only scoring rule (up to positive linear transformations) that satisfies each of what he considers to be four important desiderata for such rules, which Selten presents as axioms. According to the first axiom, the ordering of the hypotheses should not influence the score. According to the second, the score should not be affected by the introduction of an additional hypothesis that receives zero probability. The third axiom is the requirement of strict propriety. The fourth axiom, finally, concerns a type of situation that we do not consider in this essay, namely, when a probability assignment is scored in light of another probability assignment rather than in light of the truth of one hypothesis; the axiom requires that, in this situation, the score should be the same regardless of which probability assignment is considered to be the “true” one.

  4. To be entirely precise, instances of the quadratic scoring rule and the VS rule would have to be embellished with a super- or subscript to indicate the weighting function that is being assumed. We will not be so fussy, however.

  5. In this connection, it is also worth mentioning that, at least according to some influential Bayesian statisticians (Gelman and Hill 2007; Gelman and Shalizi 2012, 2013; Kruschke 2013), raising or lowering probabilities in the absence of the kind of evidence with “direct bearing” is accepted as legitimate practice, most notably, as resulting from a so-called posterior predictive check in which a statistical model may be rejected because it is found unsatisfactory (according to informal criteria) in light of simulated data. If rejected, the model is to be replaced by a new one, which requires, among other things, a specification of new prior probabilities. The simulated data that can motivate this kind of model revision—including probability revision—is presumably not the kind of new evidence that Moss has in mind.

  6. This result was obtained by means of the FixedPointList function from Mathematica and therefore holds only up to machine precision. However, that the process would reach a fixed point (even if perhaps not after 442 steps) is guaranteed by Theorem 2, to be stated shortly.

  7. A real-life example of this kind of usage is found in recent work on forecasting carried out by a group of psychologists from various American universities (Mellers et al. 2015; Tetlock and Gardner 2015). These researchers have organized, over a period of several years, a number of prediction tournaments, mostly concerning geopolitical questions. They found that some otherwise ordinary people were much more accurate forecasters than even professional intelligence analysts. A key objective of the research was to determine what distinguishes the most accurate forecasters from the rest of the population. The researchers used a number of different scoring rules for evaluating their participants’ performance, including the Brier score but also the so-called AUROC, which is known to be an improper scoring rule (see, e.g., Agresti 2007, Ch. 5; Hastie et al. 2009, Ch. 9, for details). Given that the participants were never told what the evaluation process consisted of, the use of an improper scoring rule in that process will not have affected their responses. (Note that, although in this research both proper and improper scoring rules were used for the purposes of selection, one could also use an improper scoring rule to select participants while at the same time scoring them via a proper scoring rule to determine their compensation in the experiment. Letting participants know how they will be compensated will then encourage them to post their true probabilities, while the improper scoring rule—the use of which is not disclosed to the participants—may still yield more useful information.)

  8. In fact, to the best of my knowledge, Konek (2016) contains the only reference to the rule (actually, the continuous version of the RPS rule) in the entire philosophical literature.

  9. Because, as noted, the RPS rule is strictly proper, it satisfies Selten’s third axiom (see note 3). To see that it also satisfies his fourth axiom, note that, for comparing a probability assignment \((p_1,\ldots ,p_n)\) with a “true” probability distribution \((p^*_1,\ldots ,p^*_n)\), the RPS rule takes this form:

    $$\begin{aligned} \frac{(p_1-p^*_1)^2 + \bigl ((p_1+p_2)-(p^*_1+p^*_2)\bigr )^2 + \cdots + \bigl ((p_1+\cdots + p_n)-(p^*_1+\cdots + p^*_n)\bigr )^2}{n-1}. \end{aligned}$$

    The symmetry required by the fourth axiom then follows from the fact that the addends in the numerator are all squared. Furthermore, the fact that David’s and Emma’s rank probability scores are different, as seen in the main text, is enough to show that the rule does not satisfy Selten’s first axiom. Finally, to show that neither does it satisfy the second axiom, we can add to the partition consisting of hypotheses A, B, and C the hypothesis that the student will receive a C−, where this has zero probability for David. Keeping his probabilities for A, B, and C as they were, David’s rank probability score then becomes (approximately) 0.243, and hence the addition of the zero-probability alternative did affect the score.

  10. Thanks to Ilkka Niiniluoto for bringing this to my attention.

  11. It might be said that the VS rule used in this section does not do quite as well with respect to the grading example as the RPS rule. Although David does better than Emma—David having a score of 0.117, and Emma, of 0.189—he incurs the same penalty as Frank. However, this result depends on the particular weights we chose for the example. It is easy to choose weights which could still be said to reflect truthlikeness relations but which would lead to qualitatively the same result as the RPS rule.

  12. To my knowledge, the only other author explicitly open to the possibility of “scoring rule pluralism” is Schurz (2018).

  13. I am greatly indebted to Eric Raidl, Christopher von Bülow, Verena Wagner, Sylvia Wenmackers, and two anonymous referees for valuable comments on previous versions of this paper. Thanks also to Lieven Decock, Samuel Fletcher, and Jos Uffink for helpful discussions. Versions of this paper were presented at the Universities of Düsseldorf and Konstanz and at the IHPST (Paris). I thank the audiences on those occasions for stimulating questions and remarks.
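The form of the RPS rule given in note 9 can be sketched in a few lines of Python. This is only an illustration: the function name `rps` and all probabilities below are hypothetical and are not the values used for David, Emma, or Frank in the main text. The sketch checks the three properties discussed in note 9 on made-up numbers: symmetry in the two assignments, sensitivity to the ordering of the hypotheses, and sensitivity to adding a zero-probability hypothesis.

```python
from itertools import accumulate


def rps(p, q):
    """Rank probability score of assignment p relative to distribution q.

    Follows the form in note 9: squared differences between cumulative
    probabilities, summed and divided by n - 1.
    """
    if len(p) != len(q) or len(p) < 2:
        raise ValueError("p and q must have the same length n >= 2")
    cum_p = accumulate(p)
    cum_q = accumulate(q)
    return sum((a - b) ** 2 for a, b in zip(cum_p, cum_q)) / (len(p) - 1)


# Hypothetical numbers (not from the paper): grades ordered A, B, C,
# with B the grade actually received.
truth = [0.0, 1.0, 0.0]
forecast = [0.5, 0.3, 0.2]

score = rps(forecast, truth)

# Fourth axiom: swapping the roles of the two assignments changes nothing.
swapped = rps(truth, forecast)

# First axiom fails: list the same hypotheses in the order B, A, C.
reordered = rps([0.3, 0.5, 0.2], [1.0, 0.0, 0.0])

# Second axiom fails: append a fourth grade that receives probability 0.
extended = rps(forecast + [0.0], truth + [0.0])
```

The last cumulative term is always zero when both vectors sum to 1, so only the first n − 1 addends contribute to the score.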

References

  • Agresti, A. (2007). An introduction to categorical data analysis. Hoboken, NJ: Wiley.

  • Bernardo, J. M. (1979). Expected information as expected utility. Annals of Statistics, 7, 686–690.

  • Bernardo, J. M., & Smith, A. F. M. (2000). Bayesian theory. New York: Wiley.

  • Bickel, J. E. (2007). Some comparisons between quadratic, spherical, and logarithmic scoring rules. Decision Analysis, 4, 49–65.

  • Bickel, J. E. (2010). Scoring rules and decision analysis education. Decision Analysis, 7, 346–357.

  • Brier, G. W. (1950). Verification of forecasts expressed in terms of probability. Monthly Weather Review, 78, 1–3.

  • Brouwer, L. E. J. (1911). Über Abbildungen von Mannigfaltigkeiten. Mathematische Annalen, 71, 97–115.

  • Cevolani, G., Festa, R., & Kuipers, T. A. F. (2013). Verisimilitude and belief change for nomic conjunctive theories. Synthese, 190, 3307–3324.

  • Cooke, R. M. (1991). Experts in uncertainty. Oxford: Oxford University Press.

  • de Finetti, B. (1962). Does it make sense to speak of ‘good probability appraisers’? In I. J. Good (Ed.), The scientist speculates: An anthology of partly-baked ideas (pp. 357–364). New York: Basic Books.

  • Epstein, E. S. (1969). A scoring system for probability forecasts of ranked categories. Journal of Applied Meteorology, 8, 985–987.

  • Gelman, A., & Hill, J. (2007). Data analysis using regression and multilevel/hierarchical models. Cambridge: Cambridge University Press.

  • Gelman, A., & Shalizi, C. R. (2012). Philosophy and the practice of Bayesian statistics in the social sciences. In H. Kincaid (Ed.), The Oxford handbook of philosophy of social science (pp. 259–273). Oxford: Oxford University Press.

  • Gelman, A., & Shalizi, C. R. (2013). Philosophy and the practice of Bayesian statistics. British Journal of Mathematical and Statistical Psychology, 66, 8–38.

  • Good, I. J. (1952). Rational decisions. Journal of the Royal Statistical Society, B14, 107–114.

  • Greaves, H., & Wallace, D. (2006). Justifying conditionalization: Conditionalization maximizes expected epistemic utility. Mind, 115, 607–632.

  • Hastie, T., Tibshirani, R., & Friedman, J. (2009). The elements of statistical learning (2nd ed.). New York: Springer.

  • Joyce, J. (1998). A nonpragmatic vindication of probabilism. Philosophy of Science, 65, 575–603.

  • Konek, J. (2016). Probabilistic knowledge and cognitive ability. Philosophical Review, 125, 509–587.

  • Kruschke, J. K. (2013). Posterior predictive checks can and should be Bayesian. British Journal of Mathematical and Statistical Psychology, 66, 45–56.

  • Kuipers, T. A. F. (2000). From instrumentalism to constructive realism. Dordrecht: Kluwer.

  • Kuipers, T. A. F. (2001). Structures in Science. Dordrecht: Kluwer.

  • Kuipers, T. A. F. (2014). Empirical progress and nomic truth approximation revisited. Studies in History and Philosophy of Science, 46, 64–72.

  • Leitgeb, H., & Pettigrew, R. (2010). An objective justification of Bayesianism I: Measuring inaccuracy. Philosophy of Science, 77, 201–235.

  • Levinstein, B. A. (2012). Leitgeb and Pettigrew on accuracy and updating. Philosophy of Science, 79, 413–424.

  • Lombrozo, T. (2017). ‘Learning by thinking’ in science and in everyday life. In P. Godfrey-Smith & A. Levy (Eds.), The scientific imagination. Oxford: Oxford University Press (in press).

  • McCarthy, J. (1956). Measures of the value of information. Proceedings of the National Academy of Sciences, 42, 654–655.

  • Mellers, B., Stone, E., Murray, T., Minster, A., Rohrbaugh, N., Bishop, M., et al. (2015). Identifying and cultivating superforecasters as a method of improving probabilistic predictions. Perspectives on Psychological Science, 10, 267–281.

  • Moss, S. (2011). Scoring rules and epistemic compromise. Mind, 120, 1053–1069.

  • Murphy, A. (1969). On the ‘ranked probability score’. Journal of Applied Meteorology, 8, 988–989.

  • Niiniluoto, I. (1984). Is science progressive? Dordrecht: Reidel.

  • Niiniluoto, I. (1998). Verisimilitude: The third period. British Journal for the Philosophy of Science, 49, 1–29.

  • Niiniluoto, I. (1999). Critical scientific realism. Oxford: Oxford University Press.

  • O’Hagan, A., Buck, C. E., Daneshkhah, A., Eiser, J. R., Garthwaite, P. H., Jenkinson, D. J., et al. (2006). Uncertain judgements: Eliciting experts’ probabilities. Hoboken, NJ: Wiley.

  • Popper, K. R. (1963). Conjectures and refutations. London: Routledge and Kegan Paul.

  • Rosenkrantz, R. D. (1981). Foundations and applications of inductive probability. Atascadero, CA: Ridgeview Publishing Company.

  • Schurz, G. (1987). A new definition of verisimilitude and its applications. In P. Weingartner & G. Schurz (Eds.), Logic, philosophy of science and epistemology (Proceedings of the 11th international wittgenstein symposium) (pp. 177–184). Vienna: Hölder-Pichler-Tempsky.

  • Schurz, G. (1991). Relevant deduction. Erkenntnis, 35, 391–437.

  • Schurz, G. (2011). Verisimilitude and belief revision. Erkenntnis, 75, 203–221.

  • Schurz, G. (2014). Philosophy of science: A unified approach. New York: Routledge.

  • Schurz, G. (2018). The optimality of meta-induction: A new approach to Hume’s problem. Manuscript.

  • Selten, R. (1998). Axiomatic characterization of the quadratic scoring rule. Experimental Economics, 1, 43–62.

  • Tetlock, P., & Gardner, D. (2015). Superforecasting: The art and science of prediction. London: Penguin Random House.

  • Tichý, P. (1974). On Popper’s definition of verisimilitude. British Journal for the Philosophy of Science, 25, 155–160.

  • Winkler, R. L. (1969). Scoring rules and the evaluation of probability assessors. Journal of the American Statistical Association, 64, 1073–1078.

  • Winkler, R. L. (1996). Scoring rules and the evaluation of probabilities. Test, 5, 1–60.

  • Winkler, R. L., & Murphy, A. H. (1968). ‘Good’ probability assessors. Journal of Applied Meteorology, 7, 751–758.

Author information

Correspondence to Igor Douven.

Additional information

This paper is dedicated to Gerhard Schurz, on the occasion of his 60th birthday.

Appendices

Appendix A

Recall that the weights of a VS rule are all positive and add up to 1, and that they are said to reflect truthlikeness in a minimally adequate sense iff hypotheses are assigned weights as a function of their distance from the truth, with hypotheses farther from the truth being assigned larger weights than hypotheses closer to the truth.

Theorem 1

Every VS rule whose weights reflect truthlikeness in a minimally adequate sense is improper.

Proof

Without loss of generality, consider a hypothesis partition of three hypotheses, \(H_1\), \(H_2\), and \(H_3\). Then, where \(\mathcal {V}\) is some VS rule and \(\mathbf {p}=(p_1,p_2,p_3)\) is a given person’s probability assignment to the aforementioned hypotheses, with \(p_i\) the probability assigned to \(H_i\), this person’s expected \(\mathcal {V}\)-score for a probability assignment \(\mathbf {p}^*\!\) to the same hypotheses is given by the function

$$\begin{aligned} \mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]= & {} p_1\bigl (w_{11}(1 - p^*_1)^2 + w_{21}(p^*_2)^2 + w_{31}(p^*_3)^2\bigr ) \\&+ p_2\bigl (w_{12}(p^*_1)^2 + w_{22}(1 - p^*_2)^2 + w_{32}(p^*_3)^2\bigr ) \\&+ p_3\bigl (w_{13}(p^*_1)^2 + w_{23}(p^*_2)^2 + w_{33}(1 - p^*_3)^2\bigr ). \end{aligned}$$

Again without loss of generality, assume that the hypotheses are ordered by their distances from each other, with \(H_2\) being equally far from \(H_1\) and \(H_3\), and \(H_1\) and \(H_3\) being twice as far from each other as they are from \(H_2\). Then \(w_{11}=w_{33}\), \(w_{21}=w_{23}\), \(w_{31}=w_{13}\), and \(w_{12}=w_{32}\), so that we can simplify notation by defining \(w_1 := w_{11}\); \(w_2 := w_{21}\); \(w_3 := w_{31}\); \(w_4 := w_{12}\); and \(w_5 := w_{22}\). For \(\mathcal {V}\) to be proper, it must hold that \(\mathrm{arg\,min}_{\mathbf {p}^*}\!\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]=\mathbf {p}\), for any distribution \(\mathbf {p}\) on \(\{H_1,H_2,H_3\}\). To see whether this does hold, we use the method of Lagrange multipliers. Specifically, where \(f(\mathbf {p}^*)=p^*_1+p^*_2+p^*_3\), we must find values for \(p^*_1\), \(p^*_2\), \(p^*_3\), and \(\lambda \) such that \(\nabla \mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)] = \lambda \nabla f(\mathbf {p}^*)\) and \(f(\mathbf {p}^*)=1\). Calculating the first-order partial derivatives of \(\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]\), we find

$$\begin{aligned} (\partial /\partial p^*_1)\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]= & {} -2 w_1 p_1 (1 - p^*_1) + 2 w_3 p_3 p^*_1 + 2 w_4 p_2 p^*_1; \\ (\partial /\partial p^*_2)\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]= & {} -2 w_5 p_2 (1 - p^*_2) + 2 w_2 p_1 p^*_2 + 2 w_2 p_3 p^*_2; \\ (\partial /\partial p^*_3)\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]= & {} -2 w_1 p_3 (1 - p^*_3) + 2 w_3 p_1 p^*_3 + 2 w_4 p_2 p^*_3. \end{aligned}$$

Because \(\nabla f(\mathbf {p}^*)=\mathbf {1}\), we have \((\partial /\partial p^*_i)\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]=\lambda \) for all \(i\leqslant 3\). So in particular, expanding the partial derivatives in \(p^*_1\) and \(p^*_3\) and dividing both by 2, we have

$$\begin{aligned} -w_1 p_1 + w_1 p_1 p^*_1 + w_3 p_3 p^*_1 + w_4 p_2 p^*_1 = -w_1 p_3 + w_1 p_3 p^*_3 + w_3 p_1 p^*_3 + w_4 p_2 p^*_3, \end{aligned}$$

and hence

$$\begin{aligned} w_1 p_1 p^*_1 + w_3 p_3 p^*_1 + w_4 p_2 p^*_1 - w_1 p_3 p^*_3 - w_3 p_1 p^*_3 - w_4 p_2 p^*_3 - w_1 p_1 + w_1 p_3 = 0. \end{aligned}$$

Suppose that \(\mathcal {V}\) is proper, so that \(\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]\) reaches its minimum if \(p_1 = p^*_1\), \(p_2 = p^*_2\), and \(p_3 = p^*_3\). Then there must be values for the \(w_i\) such that

$$\begin{aligned} w_1 (p_1)^2 + w_3 p_3 p_1 + w_4 p_2 p_1 - w_1 (p_3)^2 - w_3 p_1 p_3 - w_4 p_2 p_3 - w_1 p_1 + w_1 p_3 \,\, = \,\, 0. \end{aligned}$$

However, factoring the left-hand side yields

$$\begin{aligned} (p_1 - p_3) (-w_1 + w_1 p_1 + w_4 p_2 + w_1 p_3). \end{aligned}$$

This equals 0 iff either (i) \(p_1=p_3\) or (ii) \(w_1=w_4\), where the latter follows from the fact that the condition that the right-hand factor equals 0 can be rewritten as \(w_1(1-p_1-p_3)=w_4 p_2\), in conjunction with the fact that the \(p_i\) sum to 1. Because, as said, for \(\mathcal {V}\) to be proper, it must hold for all \(\mathbf {p}\) that \(\mathrm{arg\,min}_{\mathbf {p}^*}\!\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]=\mathbf {p}\), we may pick a \(\mathbf {p}\) such that \(p_1\ne p_3\), thereby violating (i). As for (ii), note that whichever precise values the \(w_i\) assume, \(w_1\) must be smaller than 1/3 (given that it is assigned to the supposed truth) and \(w_4\) must be greater than 1/3 (given that it is assigned to the two hypotheses supposed false). Consequently, on the supposition that \(\mathcal {V}\) is proper, we can minimize \(\mathbb {E}_{\mathbf {p}}[\mathcal {V}(\mathbf {p}^*)]\) subject to the given constraint iff the truthlikeness weights assigned by the rule do not reflect truthlikeness in a minimally adequate sense. By assumption, the weights do reflect truthlikeness in a minimally adequate sense. Given that we made no further assumptions about \(\mathcal {V}\), it follows that every VS rule is improper if it assigns truthlikeness weights in a minimally adequate fashion. \(\square \)
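The failure of propriety can also be seen numerically. The following Python sketch is an illustration under stated assumptions, not part of the proof: the function names and the weight matrices are hypothetical, with `w[i][j]` the weight on hypothesis \(H_{i+1}\) when \(H_{j+1}\) is supposed true, as in the expected-score formula above. The minimizer is obtained from the Lagrange condition used in the proof, imposing only the constraint that the reported probabilities sum to 1.

```python
def expected_score(p, q, w):
    """Expected VS score of reporting q for someone whose probabilities are p."""
    n = len(p)
    total = 0.0
    for j in range(n):          # j: hypothesis supposed true
        for i in range(n):      # i: hypothesis being scored
            target = 1.0 if i == j else 0.0
            total += p[j] * w[i][j] * (target - q[i]) ** 2
    return total


def best_report(p, w):
    """Report minimizing expected_score subject to sum(q) == 1.

    Setting each partial derivative 2*c_i*q_i - 2*b_i equal to lambda
    gives q_i = (b_i + lambda/2) / c_i; lambda follows from sum(q) == 1.
    """
    n = len(p)
    c = [sum(p[j] * w[i][j] for j in range(n)) for i in range(n)]
    b = [p[i] * w[i][i] for i in range(n)]
    half_lam = (1.0 - sum(bi / ci for bi, ci in zip(b, c))) / sum(1.0 / ci for ci in c)
    return [(bi + half_lam) / ci for bi, ci in zip(b, c)]


# Hypothetical weights: each column sums to 1, and the true hypothesis
# receives the smallest weight in its column.
W_TRUTHLIKE = [[0.1, 0.4, 0.5],
               [0.4, 0.2, 0.4],
               [0.5, 0.4, 0.1]]
# Equal weights for comparison (the Brier-like case).
W_EQUAL = [[1.0 / 3.0] * 3 for _ in range(3)]

p = [0.5, 0.3, 0.2]                        # chosen so that p_1 != p_3
q_truthlike = best_report(p, W_TRUTHLIKE)
q_equal = best_report(p, W_EQUAL)
gain = expected_score(p, p, W_TRUTHLIKE) - expected_score(p, q_truthlike, W_TRUTHLIKE)
```

With the truthlikeness-reflecting weights, `q_truthlike` differs from `p` and `gain` is positive, so reporting one's actual probabilities is not optimal in expectation. With equal weights, `q_equal` coincides with `p` up to rounding.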

Remark

The above proof proceeds by constructing a specific counterexample involving three hypotheses that are assumed to stand in specific relations of truthlikeness to each other. To see that this assumption does not undermine the generality of the proof, we note that the said relations are perfectly possible according to all modern measures of truthlikeness (see page 5 for references). As a matter of fact, one can think of our earlier example concerning the possible grades (A, B, or C) a given student may receive as instantiating exactly the relations of truthlikeness that are assumed to hold in the counterexample. It is also to be noted, however, that not all known measures of truthlikeness will do for the purposes of the proof. Most famously, Tichý (1974) discovered that on Popper’s (1963) measure all false theories are equally far from the truth, contrary to what Popper had hoped to achieve with his measure.

Appendix B

In this appendix we prove

Theorem 2

Let S be the standard unit \((n-1)\)-simplex, let \(\mathbf {p}\) and \(\mathbf {p}^*\!\) range over vectors in S, and let \(m:S\rightarrow S\) be defined as follows:

$$\begin{aligned} m(\mathbf {p}) = \mathrm{arg\,min}_{\mathbf {p}^*\in S} \sum _{i=1}^n \sum _{j=1}^n p_j w_{ij} (\delta _{ij} - p^*_i)^2, \end{aligned}$$

with \(\delta _{ij}\) the Kronecker delta, and with \(w_{ij}>0\) for all i, j, and \(\sum _{i=1}^n\sum _{j=1}^n w_{ij} = 1\). Then there is a \(\mathbf {p}^+\!\in S\) such that (i) \(m(\mathbf {p}^+)=\mathbf {p}^+\!\), (ii) \(\mathbf {p}^+\) is unique, and (iii) \(\mathbf {p}^+\) depends only on the \(w_{ij}\).

Proof

Clause (i) follows from Brouwer’s (1911) fixed-point theorem, which (in one version) states that every continuous function from a simplex into itself has a fixed point. It does not follow from Brouwer’s theorem that the fixed point is unique.

To prove clause (ii), then, one first verifies that the function that is being minimized at each step on the way to the fixed point has the Hessian

$$\begin{aligned} \begin{bmatrix} 2 (p_1 w_{11} + \cdots + p_n w_{1n}) & 0 & \cdots & 0 \\ 0 & 2 (p_1 w_{21} + \cdots + p_n w_{2n}) & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 2 (p_1 w_{n1} + \cdots + p_n w_{nn}) \end{bmatrix}. \end{aligned}$$

This is a diagonal matrix, so its eigenvalues are the diagonal elements, which, given the constraints on the \(p_i\) and \(w_{ij}\), can be seen to be all necessarily positive. Therefore, the Hessian is positive definite everywhere, and given that a simplex is a convex set, it follows that the function that is minimized is strictly convex, and hence the minimum it reaches is unique. So, at each step toward the fixed point, a unique minimum is reached. As a result, the minimum reached at the fixed point is unique as well.

For clause (iii), finally, note that at the fixed point the function that is being minimized is of the form

$$\begin{aligned} m^+(\mathbf {p}) = \sum _{i=1}^n \sum _{j=1}^n p_j w_{ij} (\delta _{ij} - p_i)^2. \end{aligned}$$

Because the fixed point \(\mathbf {p}^+\) is a minimum, it holds that \(\nabla m^+(\mathbf {p}^+)=\mathbf {0}\). We obtain a system of n polynomial equations with n variables and with the \(w_{ij}\) as coefficients by setting \((\partial / \partial p_i)m^+(\mathbf {p}^+)=0\), for all \(i\leqslant n\). This system has a unique solution (in virtue of the first two clauses), which is bound to be strictly in terms of the coefficients. \(\square \)
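The iteration behind Theorem 2 can be sketched in plain Python as a rough analogue of the Mathematica FixedPointList computation mentioned in note 6. Everything here is an assumption made for illustration: the function names, the weights, the starting points, and the tolerance are hypothetical, and `w[i][j]` is read as the weight on hypothesis i when hypothesis j is true, as in Appendix A. Each step uses the closed-form Lagrange solution, which is valid only if the result has no negative component; for the weights below, where the smallest entry of each row lies on the diagonal, that is always the case.

```python
def m(p, w):
    """One step: the report minimizing the p-expected VS score over sum(q) == 1."""
    n = len(p)
    # c[i] is half the i-th diagonal entry of the Hessian; it must be positive.
    c = [sum(p[j] * w[i][j] for j in range(n)) for i in range(n)]
    if min(c) <= 0:
        raise ValueError("Hessian is not positive definite")
    b = [p[i] * w[i][i] for i in range(n)]
    half_lam = (1.0 - sum(bi / ci for bi, ci in zip(b, c))) / sum(1.0 / ci for ci in c)
    q = [(bi + half_lam) / ci for bi, ci in zip(b, c)]
    if min(q) < 0:
        raise ValueError("stationary point lies outside the simplex")
    return q


def fixed_point(p, w, tol=1e-12, max_steps=10_000):
    """Apply m repeatedly until successive vectors differ by less than tol."""
    for steps in range(1, max_steps + 1):
        q = m(p, w)
        if max(abs(a - b) for a, b in zip(p, q)) < tol:
            return q, steps
        p = q
    raise RuntimeError("no convergence within max_steps")


# Hypothetical truthlikeness weights, rescaled so that all entries sum to 1
# as Theorem 2 requires (rescaling does not change the minimizer).
RAW = [[0.1, 0.4, 0.5],
       [0.4, 0.2, 0.4],
       [0.5, 0.4, 0.1]]
W = [[x / 3.0 for x in row] for row in RAW]

# Two different interior starting points.
p_a, steps_a = fixed_point([0.5, 0.3, 0.2], W)
p_b, steps_b = fixed_point([0.2, 0.2, 0.6], W)
```

For these weights, direct substitution confirms that (1/6, 2/3, 1/6) is a fixed point of `m`, and both interior starting points lead to it, in line with clause (iii).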

About this article

Cite this article

Douven, I. Scoring in context. Synthese 197, 1565–1580 (2020). https://doi.org/10.1007/s11229-018-1867-8

  • DOI: https://doi.org/10.1007/s11229-018-1867-8
