6.1 Introduction

We have proposed that properties of objects are associated with modes of empirical interaction of objects with their environments, with the acknowledgment that this interaction makes objects experimentally comparable with one another. Thus we ground our framework upon the assumption that through properties we account for the relational behavior of objects. As already mentioned, this does not mean that we consider a property to exist only if an interaction is observed: our position is simply that observed interactions among objects can be accounted for in terms of their properties. We also accept that the description of an interaction among objects in terms of given properties is always revisable: there must be properties there, but they are not necessarily as we describe them.

As presented in the previous chapter, the framework we are developing is grounded on individual properties, such as lengths and reading comprehension abilities, which we take to be universal entities (see Sect. 5.3.2) that can be instantiated by, or more generically identified as, properties of given objects, such as the lengths of given rods and the reading comprehension abilities of given individuals.Footnote 1 A basic relation of the framework is then

a property of an object identifies an individual property

or more shortly

a property of an object is an individual property

so that, for example, there is a length that a given rod has, i.e., the length of that rod (the property of an object) is that length (an individual property), and there is a reading comprehension ability that a given individual has, i.e., the reading comprehension ability of that individual (the property of an object) is that reading comprehension ability (an individual property).

Each property of an object identifies an individual property: individual properties can be handled mathematically, for example by checking which of two lengths is greater, whereas relations between properties of objects must be investigated empirically. Accordingly, when a property of an object appears in a formal relation, such as a mathematical equation or a logical inference, the actual reference is to the corresponding individual property. This applies in particular to the relation of indistinguishability of properties of objects: as already pointed out, the observation that two properties of objects, P[ai] and P[aj], are indistinguishable, P[ai] ≈ P[aj], is interpreted by assuming that P[ai] and P[aj] either identify the same individual property or identify distinct but empirically indistinguishable individual properties. Since in general it is not possible to ascertain which of these situations is true, the customary notation P[ai] = P[aj] is just a convenient shorthand, acceptable whenever the relation is assumed to be transitive (see Sect. 5.2.6 on the implications of this assumption).

As discussed in Sect. 2.2.3, comparable individual properties are said to be of the same kind (JCGM, 2012: 1.2). Kinds of properties are abstract entities that are reified by assuming the existence of corresponding general properties, so that the adjectives “long”, “heavy”, etc. are replaced by the nouns “length”, “weight”, etc., and a relation such as

$$long\left[ a \right] \, \approx long\left[ b \right]$$

as in Sect. 5.2.6, is more customarily written

$$length\left[ a \right] \, \approx length\left[ b \right]$$

Each individual property is then an instance of a general property, and two individual properties are comparable only if they are instances of the same general property. Again, the examples are obvious: any given length is an instance of length, any given reading comprehension ability is an instance of reading comprehension ability, and so on. A second relation of the framework is then

an individual property is an instance of a general property

These relations are depicted in Fig. 6.1.

Fig. 6.1 Relations between properties of objects, individual properties, and general properties

While such a conceptualization might appear redundant, it is not hard to show that:

  • properties of objects are not identical to individual properties: properties of objects are in fact features of objects and as such have a spatiotemporal location and can be (individual) measurands, and neither individual properties nor general properties share this feature; however, some features, such as being comparable with respect to a given relation, are characteristic of individual properties and are inherited by properties of objects; for example, we can say that the individual length \({\ell}_1\) is greater than the individual length \({\ell}_2\) if they are identified as the length of rod a and of rod b respectively and rod a has been empirically discovered to be longer than rod b;

  • individual properties are not identical to general properties: individual properties can be comparable with each other, and general properties do not share this feature; however, some features, such as being a quantitative or a qualitative property, being a physical property or a psychosocial property, etc., are characteristic of general properties and are inherited by their instances; for example, a given length is a physical quantity because length is a physical quantity.

This provides a pragmatic justification of the structure illustrated in Fig. 6.1.Footnote 2

On this basis, we defer to Sects. 6.5 and 6.6 a more specific analysis of general properties, in particular of their categorization into types, such as nominal, ordinal, and so forth. In the sections that follow, we continue to develop this framework grounded on individual properties, by introducing values of properties and first focusing on values of quantities.

6.2 Towards Values of Properties

A Basic Evaluation Equation, in its simplest version in which uncertainty is not taken into account, is

$${\text{property}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{property}}$$

When Norman Campbell famously stated that “the object of measurement is to enable the powerful weapon of mathematical analysis to be applied to the subject matter of science” (1920: p. 267), it is plausible that he was indeed referring to this kind of equation, and expressly to the specific case

$${\text{quantity}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{quantity}}$$

which, when written in the Q-notation (see Sect. 5.1), enables “the powerful weapon of mathematical analysis” by explicitly including numbers in the equation, e.g.,

$$L\left[ a \right] = 1.2345\,{\text{m}}$$

as multipliers of units (henceforth we write “L[a]” as a shorthand for “length[rod a]”). Analogous is the case of the modified notation

$$L_{\text{in metres}} \left[ a \right] = 1.2345$$

as in some formalizations, such as those adopted in representational theories of measurement (see, e.g., Krantz et al., 1971, and also Kyburg, 1984: p. 17). Through Basic Evaluation Equations, values of properties, and thus values of quantities in particular, are indeed the mathematical counterparts of empirical properties of objects. Values play the fundamental role of providing the information that is the reason for which measurement is performed: before measurement the measurand is known only as a property of an object; after measurement we also know a value for it. (Once again, references to uncertainty are important for a more complete presentation of measurement, but are not relevant here.) Once the relation is accepted as dependable, the value can be mathematically manipulated in place of experimentally operating on the property of the object. As a trivial example, if L[a] = 1.2345 m and L[b] = 2.3456 m then from the mathematical fact that 1.2345 m < 2.3456 m we can immediately infer that L[a] < L[b].
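This substitution of mathematical manipulation for empirical comparison can be illustrated with a minimal code sketch; the `Quantity` class below and its tolerance-free comparison are our own illustrative constructions, not part of the framework:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    """A value of a quantity: a numerical value paired with a unit."""
    numerical_value: float
    unit: str

    def __lt__(self, other):
        # Comparison is meaningful only within the same scale:
        # 1.2345 m and 2.3456 m are multiples of the same unit.
        if self.unit != other.unit:
            raise ValueError("cannot compare values on different scales")
        return self.numerical_value < other.numerical_value

# L[a] = 1.2345 m and L[b] = 2.3456 m
L_a = Quantity(1.2345, "m")
L_b = Quantity(2.3456, "m")

# From 1.2345 m < 2.3456 m we infer L[a] < L[b]
# without any further empirical comparison of the rods.
print(L_a < L_b)  # True
```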

An analysis of the nature and the role of values of properties is then a core component for the development of a measurement-related ontology and epistemology of properties. Let us start by considering the specific case of quantities and their values.Footnote 3 (Henceforth we occasionally use the short term “value”, rather than “value of a property” or “value of a quantity”, if this does not create ambiguity.)

6.2.1 Values of Properties: What They Are not

Values of properties have such a critical role that it is perhaps not surprising that there are multiple and even incompatible positions on what they are. According to two common stereotypes, they are expressions, or they are symbols. Let us start our analysis by showing that neither of these positions is correct. These stereotypes are usually related to quantitative properties rather than properties as such: hence, pars pro toto, we refer to quantities in the discussion that follows.

First, for example, according to the first edition of the VIM the value of a quantity is “the expression of a quantity in terms of a number and an appropriate unit of measurement” (ISO, 1984: 1.17, emphasis added). The first definition that the Oxford English Dictionary (OED) gives of <expression> is “things that people say, write or do in order to show their feelings, opinions and ideas”: thus, in general usage, according to the OED expressions (including mathematical expressions) are linguistic entities, i.e., in the sense of terminology, neither concepts nor objects (see Sect. 2.1). But it should be clear that here, in discussing measurement, values are not linguistic entities. Consider the difference, e.g., between the rod a, which is an object, and the five-character (space included) term “rod a”, which is a linguistic entity: the object has a weight, a color, etc., whereas the term does not. The term “rod a” refers to a given rod, but is not that rod. Analogously, values are communicated by means of terms but they are not terms.Footnote 4 And in fact the same value, e.g., 1.2345 m, can be expressed linguistically in multiple ways, e.g., “one point two … metres”, “1.2345 m”, and “1,2345 m” (for most non-English speaking people), showing that 1.2345 m and “1.2345 m” are different entities. Certainly, values must somehow be expressed by means of linguistic entities to be communicated, but they are not, in themselves, expressions.

Second, sometimes values are said to be symbols, or identifiers, which stand for or represent objects or quantities of objects.Footnote 5 Of course, values may well be used as such, but this does not solve the problem of what they are. Indeed, stating that x is a symbol of y does not say anything about what x is. In this sense, Napoleon can be a symbol of political power, and a sphere can be a symbol of perfection, but this does not change the fact that Napoleon was a human being and a sphere is a geometric object. “To be a symbol” is just convenient shorthand for “to be used as a symbol”. Hence values may be used as symbols to represent quantities of objects, but a definition of <value of a quantity> phrased as “symbol such that…” is ontologically vacuous.

6.2.2 Values of Properties Cannot Be Discarded in Contemporary Measurement

At this point we need to face the possible objection that values are not needed at all, and therefore our whole current problem can be dismissed as immaterial. At least two analogous arguments can be made in support of this position.

One argument is that most equations and the related explanations that appear in the literature on, for example, physics do not even mention units: while often introduced as relations among general quantities (e.g., F = ma), physical laws are also interpreted as equations that relate numerical values of such quantities, under the assumption that their units are consistently chosen in a system of units. Hence it would seem that, after a system of units has been chosen, values can be discarded: one need only report numbers instead of values (e.g., 1.2345 instead of 1.2345 m) to convey information about quantities of objects.

The second argument against values of properties starts from the supposition that measurement produces numbers rather than values. As mentioned above, this seems to be assumed in particular by representational theories of measurement (see, e.g., Krantz et al., 1971), which usually formalize measurement as a mapping from objects, or sometimes properties of objects (see also Sect. 5.2.5), to numbersFootnote 6 by maintaining the unit implicit in the mapping, thus re-writing, e.g., L[a] = 1.2345 m as Lin_metres[a] = 1.2345. This seems to be a reinterpretation of Russell’s well-known assertion that “Measurement of magnitudes is, in its most general sense, any method by which a unique and reciprocal correspondence is established between all or some of the magnitudes of a kind and all or some of the numbers, integral, rational, or real, as the case may be” (1903: p. 176). Indeed, the Q-notation (see Sect. 5.1)

$$Q\left[ a \right] = \left\{ {Q\left[ a \right]} \right\}\,\left[ Q \right]$$

is equivalent to

$$Q\left[ a \right]/\left[ Q \right] = \left\{ {Q\left[ a \right]} \right\}$$

where then L[a] / m is what Lin_metres[a] is actually meant to be. Since in this relation values of quantities seem to have disappeared, it might be concluded that they are only related to the way knowledge is represented and therefore that they can be avoided by an appropriate choice of the representation.
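For concreteness, the equivalence can be instantiated with the length example above (a restatement of relations already given, with the same numbers):

$$L\left[ a \right] = \left\{ {L\left[ a \right]} \right\}\,\left[ L \right] = 1.2345\,{\text{m}}\quad {\text{and}}\quad L\left[ a \right]/{\text{m}} = \left\{ {L\left[ a \right]} \right\} = 1.2345$$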

As we see them, both of these arguments are correct in their premises, but their conclusions are problematic: the fact that in specific cases values can actually be discarded, in favor of dealing with numbers only, is really just a sort of shorthand and does not imply that this is always the case. Rather, there are good reasons for the customary choice of writing the Basic Evaluation Equation in terms of values instead of numbers. The difference between values of quantities and numerical values is that only the former contain information on the metrological context: “1.2345 m” means <1.2345 in the context of the scale generated by the metre>. Reporting only a numerical value, such as 1.2345, loses the reference to such a context, which is crucial for guaranteeing the metrological traceability of measurement data.

Assertions such as Russell’s hide the issue by implicitly assuming that the metrological context is given and is entirely embedded in the definition of the general quantity under measurement, as if a “natural unit of length” were unproblematically available, allowing us to measure the “natural length” of any object by a number, interpreted as the multiple of such a “natural unit” and conveying the information of the traceability to such a unit. It is in fact as if measurement could always be, in its structure, the counting of “natural units”.

But unless and until such “natural units” for all relevant quantities are agreed upon and socially accepted,Footnote 7 it is convenient, and essential, for Basic Evaluation Equations, and measurement results, to contain information on their metrological context, as provided by values, which thus play a critical role in effective communication of the information acquired by means of measurement.

On this basis, let us continue our exploration of what values of quantities are.

6.3 Constructing Values of Quantities

While the concept <value of a property> might appear unusual (as an example, the VIM does not define it), values of quantities are widely used, uncontroversially recognized as multiples of units. Even those who are doubtful about the nature of values of quantities, as discussed above, accept that 1.2345 m and 2.34 kg are examples of them. In order to properly introduce values of properties in our framework, let us then start from values of quantities, by exploiting the familiar additive structure of quantities such as length. What follows is a construction by example, rather than a definition.

6.3.1 Operating on (Additive) Quantities of Objects

Let us consider two rods, r and r’, in the experimental situation depicted in Fig. 6.2.

Fig. 6.2 Constructing values of quantities: first step (quantity-related comparison)

This situation is usually described as

the rods r and r' have the same length

or

the length of r is the same as the length of r'

and therefore

$$L\left[ r \right] \, \approx L\left[ {r^{\prime}} \right]$$

thus highlighting, more explicitly than L[r] = L[r’], that this is an experimental relation and therefore such a sameness is operationally a length-related indistinguishability.Footnote 8

Moreover, let us then assume that, at least for objects such as rods, length is an empirically additive quantity,Footnote 9 so that there exists a length-related concatenation operation ⊕ (hence the symbol “⊕” is used to denote an operation that applies to lengths of objects, not numbers) and the situation depicted in Fig. 6.3 is described as

the length of a is indistinguishable from the length of the length-related concatenation of r and r'

Fig. 6.3 Constructing values of quantities: second step (quantity-related concatenation)

or

$$L\left[ a \right] \, \approx L\left[ r \right] \, \oplus L\left[ {r^{\prime}} \right]$$

Since L[r] ≈ L[r’], this relation can also be written as

$$L\left[ a \right] \, \approx L\left[ r \right] \, \oplus L\left[ r \right]$$

and therefore

$$L\left[ a \right] \approx 2\,L\left[ r \right]$$

for short, where more generally n L[r], for any integer n > 0, denotes the length of n concatenated copies of L[r].Footnote 10 This principle can then be extended also to non-integer relations between L[a] and L[r], by considering, together with iteration, L[a] ≈ n L[r], the inverse operation of partition (as the terms “iteration” and “partition” are used by Weyl, 1949: p. 30), such that L[r] is assumed to be constituted of n’ indistinguishable lengths L[c], so that L[r] ≈ n’ L[c]. By combining the two operations, a length is obtained as n/n’ L[r]. By varying the ratio n/n’ a set of lengths is thus obtained, and while the construction starts from the length of a given object, r, each entity n/n’ L[r] is a length constructed without an object that bears it: what sort of entities are they, then? While leaving this question open for the moment, let us point out that all these relations involve only quantities of objects, and are obtained by experimentally comparing objects.
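As a worked instance of the combined construction, with illustrative numbers of our choosing: if partition gives L[r] ≈ 3 L[c] and iteration gives L[a] ≈ 7 L[c], then

$$L\left[ a \right] \approx 7\,L\left[ c \right] \approx \frac{7}{3}\,L\left[ r \right]$$

so that L[a] is identified as the rational multiple n/n’ = 7/3 of L[r], still without any reference to values.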

Suppose now that the length L[r] is agreed to be taken as a reference quantity and given an identifier for convenience, say “\({\ell}_{\text{ref}}\)” (or, for example, “metre”). The reference length \({\ell}_{\text{ref}}\) is then defined as the length of the object r

$$\ell_{\text{ref}} := L\left[ r \right]$$

and r can be called a reference object. The indistinguishability relation L[a] ≈ 2 L[r] can then also be written as

$$L\left[ a \right] \approx 2\,\ell_{\text{ref}}$$

This shows that the following relations

$$L\left[ a \right] \approx L\left[ r \right] \oplus L\left[ {r^{\prime}} \right]$$
$$L\left[ a \right] \approx L\left[ r \right] \oplus L\left[ r \right]\quad \left( {\text{provided that }}L\left[ r \right] \approx L\left[ {r^{\prime}} \right] \right)$$
$$L\left[ a \right] \approx 2\,L\left[ r \right]\quad \left( {\text{a shorthand of the previous relation}} \right)$$
$$L\left[ a \right] \approx 2\,\ell_{{{\text{ref}}}} \quad \left( {\text{according to the definition of }}\ell_{{{\text{ref}}}} \right)$$

all refer to the same empirical situation and only differ in the way the information is conveyed: in terms of the distinction between senses and referents of expressions (as explained in Sect. 5.3.2), the senses of the involved expressions are different, but their referent is always the same. All these relations—including the last one—involve lengths, and the difference between the length L[r] ⊕ L[r’] and the length 2 \({\ell}_{\text{ref}}\) is only about how such lengths are identified.

The fact that this construction has been developed with no references to values is important. As Alfred Lodge noted (1888: p. 281):

the fundamental equations of mechanics and physics express relations among quantities, and are independent of the mode of measurement of such quantities; much as one may say that two lengths are equal without inquiring whether they are going to be measured in feet or metres; and indeed, even though one may be measured in feet and the other in metres. Such a case is, of course, very simple, but in following out the idea, and applying it to other equations, we are led to the consideration of product and quotients of concrete quantities, and it is evident that there should be some general method of interpreting such products and quotients in a reasonable and simple manner.

With this acknowledgment we may restart our construction.

A rod a can now be calibrated in terms of its length with respect to \({\ell}_{\text{ref}}\) by aligning the left ends of a and r and placing a mark on the rod a at the other end of the rod r. Additional marks can be placed on the rod a, using geometrical methods that implement the iteration and partition methods mentioned above, to denote multiples of \({\ell}_{\text{ref}}\), as depicted in Fig. 6.4.

Fig. 6.4 Constructing values of quantities: third step (quantity-related comparison with an object calibrated with respect to a reference quantity)

Common measuring instruments of length, such as metre sticks and tape measures, are constructed and then calibrated in this way: indeed, the rod a can be placed against other objects to establish where their lengths align with corresponding marks on the rod a itself. Hence, the rod a realizes a sequence of lengths. The length L[b] of an object b can now be compared with the lengths marked on the rod a and thus reported as a multiple x of \({\ell}_{\text{ref}}\),

$$L\left[ b \right] \, \approx x\,\ell_{{{\text{ref}}}}$$

where then x = n/n’, for given n and n’, as depicted in Fig. 6.5.

Fig. 6.5 The comparison of the length L[b] with the lengths marked on the rod a

6.3.2 On Reference Objects and Reference Quantities

Let us now focus on the indistinguishability relation

$$L\left[ b \right] \, \approx x\ell_{{{\text{ref}}}}$$

which holds for a given x = n/n’. The only difference between L[b] ≈ x \({\ell}_{\text{ref}}\) and L[b] ≈ x L[r] is related to the way in which the quantity in the right-hand side of the two relations is referenced, by an identifier of the quantity, \({\ell}_{\text{ref}}\), or by addressing an object, r, with respect to one of its properties, L. Since changing the way in which a quantity is referenced (“the length of the object we agreed to designate as r”, “L[r]”, “\({\ell}_{\text{ref}}\)”, or whatever else) does not change the quantity, one might conclude that this is just an arbitrary lexical choice. While in principle this is correct, there is a subtle point here related to the way we usually deal with identifiers: for the relation (identifier, identified entity) to be useful, it needs to hold in a stable way. This is why entities whose time-variance is acknowledged are identified by means of identifiers indexed by a time-related variable, as in the case of the length L[b, t] of the object b at the time t (see also Sect. 5.2.5). Conversely, if the identifier does not include a reference to time then the identification is supposed to remain valid only on the condition that the identified entity does not change over time. For example, the date of birth of a given person b can be identified as birthday[b], while her height at a given time t as height[b, t]: in this way we acknowledge that birthday is time invariant, whereas height is time variant.

In this sense, the definition \({\ell}_{\text{ref}}\):= L[r], where the identifier “\({\ell}_{\text{ref}}\)” is not indexed with time, assumes that the length L[r] is time invariant. Since quantities of objects are instead usually subject to variations, this is a strong assumption: of course, assigning a name to the quantity of an object does not make it stable.Footnote 11

The consequence of choosing the length of an object r as a reference length \({\ell}_{\text{ref}}\), thus under the condition of its stability, is that \({\ell}_{\text{ref}}\) can also be considered to be the length of any other sufficiently stable object having the same length as r. This allows the assessment of L[a] ≈ x \({\ell}_{\text{ref}}\) not only by means of L[a] ≈ x L[r] but also by means of L[a] ≈ x L[r’], for any sufficiently stable r’ in a class of objects such that L[r’] ≈ L[r]. Hence the choice of referring to a length through an identifier as “\({\ell}_{\text{ref}}\)” (for example “metre”—note: it is not “metre in a given time t”) assumes that the referenced length is both space and time invariant: according to the conceptual framework introduced in Sect. 6.1, it is an individual length, identified by L[r], L[r’], … but abstracted from any particular object.Footnote 12

6.3.3 Alternative Reference Quantities and Their Relations, i.e., Scale Transformations

As remarked, the only condition for having singled out r as a reference object is that its length is stable. Hence nothing precludes the independent choice of an alternative reference object, r*, whose length L[r*] is distinguishable from L[r] and defines a new reference length (for example the foot instead of the metre):

$$\ell_{\text{ref}^*} := L\left[ {r^*} \right]$$

A new rod a* can now be calibrated with respect to \({\ell}_{\text{ref*}}\), exactly as was done before for the rod a with respect to \({\ell}_{\text{ref}}\), so that the same object b could be compared in its length with both rod a and rod a*. Different relations of indistinguishability are then obtained, L[b] ≈ x \({\ell}_{\text{ref}}\) and L[b] ≈ x’ \({\ell}_{\text{ref*}}\), with x ≠ x’, as exemplified in Fig. 6.6.

Fig. 6.6 The comparison of the length L[b] with the lengths marked on the rods a and a*

The lengths marked in this way on rods a and a* can be compared, which is particularly interesting because such lengths are indexed by numbers, attributed according to the hypothesis of empirical additivity, such that the length 2 \({\ell}_{\text{ref}}\) is L[r] ⊕ L[r] and so on. Hence, the hypothesis that the lengths marked on two rods have been additively constructed can be experimentally validated, by finding the factor k such that \({\ell}_{\text{ref}}\)k \({\ell}_{\text{ref*}}\) (in the example in Fig. 6.6, k = 0.5) and then checking whether 2 \({\ell}_{\text{ref}}\) ≈ 2k \({\ell}_{\text{ref*}}\), 3 \({\ell}_{\text{ref}}\) ≈ 3k \({\ell}_{\text{ref*}}\), and so on. Such a systematic validation provides a justification for the specific hypothesis that the two lengths \({\ell}_{\text{ref}}\) and k \({\ell}_{\text{ref*}}\) are in fact equal, \({\ell}_{\text{ref}}\) = k \({\ell}_{\text{ref*}}\), and not just indistinguishable, and therefore that the scale transformation, from multiples of \({\ell}_{\text{ref}}\) to multiples of \({\ell}_{\text{ref*}}\) or vice versa, can be performed as a mathematical operation.Footnote 13
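The systematic validation just described can be sketched computationally; in the following minimal sketch the simulated lengths, the tolerance, and the function names are our own illustrative assumptions:

```python
# Validation of the additive construction of two scales:
# if l_ref ≈ K * l_ref_star, additivity requires that
# n * l_ref ≈ (n * K) * l_ref_star for every multiple n.

K = 0.5            # factor found by comparing the two reference lengths
TOLERANCE = 0.001  # idealized resolution of the empirical comparison

def in_ref_units(n):
    """Length of the mark n * l_ref, taking l_ref itself as yardstick."""
    return n * 1.0

def in_ref_star_units(m):
    """Length of the mark m * l_ref_star, in the same yardstick:
    since l_ref = K * l_ref_star, l_ref_star = l_ref / K."""
    return m * (1.0 / K)

def indistinguishable(x, y):
    return abs(x - y) <= TOLERANCE

# Check 2 l_ref ≈ 2K l_ref_star, 3 l_ref ≈ 3K l_ref_star, and so on.
for n in (2, 3, 4):
    assert indistinguishable(in_ref_units(n), in_ref_star_units(n * K))
```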

6.3.4 Generalizing the Strategy of Definition of Reference Quantities

The definition of reference quantities as quantities of objects (sometimes called “prototypes” or “artifacts” when they are physical objects) that are hypothesized to be stable is conceptually simple, and is typically the starting point of the development of a unit. For example, in 1889 the first General Conference of Weights and Measures (CGPM) asserted that “the Prototype of the metre chosen by the CIPM […] at the temperature of melting ice shall henceforth represent the metric unit of length.” (BIPM, 2019: Appendix 1), where the mentioned prototype of the metre was a specially manufactured metallic rod. But this strategy has some drawbacks that have become more and more apparent with the progressive globalization of measurement science and its applications:

  • First, both physical and non-physical objects at the anthropometric scale are usually not completely stable, with the consequence that, once the definition \({\ell}_{\text{ref}}\):= L[r] is given, for any object a if L[a] ≈ x \({\ell}_{\text{ref}}\) and L[r] changes due to the instability of r, then after that change L[a] ≈ x’ \({\ell}_{\text{ref}}\), with x’ ≠ x, even if L[a] did not change: the numerical representation of a quantity has changed even though the quantity itself did not;Footnote 14

  • Second, having a reference quantity defined as the quantity of one object implies that all traceability chains must start from that object, in the sense that all measuring instruments for that quantity must be directly or indirectly calibrated against that reference object: this is operationally inconvenient and may generate political struggles, given the power that the situation confers to the owner of the object.

Alternative strategies to define stable and accessible reference quantities may be, and in fact have been, envisaged to avoid or at least reduce these flaws. Such alternative strategies are particularly required where the quantities intended to be measured are properties of human beings, which, if the steps described above were to be followed, would imply that, in principle, reference objects should be certain individual humans, a situation that of course is not usually appropriate for several reasons.Footnote 15

Rather than selecting specific objects, a representative sample of objects—hence persons, in some cases—could be then selected, their property of interest somehow evaluated, and the reference quantity defined as a statistic (e.g., the mean or the median) of the obtained values. This makes the reference quantity depend on the selected sample, and therefore in principle it is not entirely stable if new samples are taken or if characteristics of the sampled population change. (In psychometrics, evaluations performed according to this strategy are called “norm-referenced”, to emphasize the normative role of the sample that defines the reference quantity; see Glaser, 1963.Footnote 16)

Another possible strategy for dealing with these issues may be based on the consideration that according to the best available theories there is a class of objects, R = {ri}, that when put in given conditions invariantly have the same quantity of interest, which in those conditions is then a constant.Footnote 17 Defining the reference quantity as a constant quantity characteristic of a class of objects guarantees both its stability and accessibility. And if the identified constant were too far from the anthropometric scale to be suitable, the reference quantity could be defined as an appropriate multiple or submultiple of the constant, so as to maintain a principle of continuity, such that different definitions could subsequently be adopted while ensuring that the defined reference quantity remains the same. For example, in 1960 the 11th General Conference of Weights and Measures redefined the metre as “the length equal to 1 650 763.73 wavelengths in vacuum of the radiation corresponding to the transition between the levels 2p10 and 5d5 of the krypton 86 atom” (BIPM, 2019: Appendix 1). The critical point of this definition is the assumption that the wavelength of the chosen radiation is constant, whereas the numerical value, 1 650 763.73, was only chosen for the purpose of guaranteeing that the metre remained the same length despite the change of its definition.Footnote 18

By exploiting the functional relations that are known to hold among quantities, a more sophisticated version of this strategy allows for the definition of a reference quantity as a function of constants of different kinds, and possibly of previously defined reference quantities. For example, according to Einstein’s theory of relativity the speed of light in vacuum is constant, and so the class of all light beams in vacuum is such that the length of their path in a given time interval is also constant. By exploiting the relation

$$\text{length} = \text{speed} \,\times \text{time duration}$$

among general quantities, the definition is then

$$\ell_{\text{ref}} := S\left[ \text{R} \right]\,\Delta T$$

where S[R] is the speed S of light in vacuum (R being then intended as the class of all light beams in vacuum) and ΔT is the chosen time interval. This is in fact how in 1983 the 17th General Conference of Weights and Measures defined the metre: “the length of the path travelled by light in vacuum during a time interval of 1/299 792 458 of a second” (BIPM, 2019: Appendix 1). Once again, the appropriate choice of the numerical value, 1/299 792 458, was the condition of validity of the principle of continuity.
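The principle of continuity can be checked by direct arithmetic: with the fixed numerical value of the speed of light, the defined reference length is

$$\ell_{\text{ref}} = S\left[ \text{R} \right]\,\Delta T = 299\,792\,458\;{\text{m}}\,{\text{s}}^{-1} \times \frac{1}{299\,792\,458}\;{\text{s}} = 1\;{\text{m}}$$

so that the metre as defined in 1983 is the same length as the previously defined metre.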

With the aim of emphasizing the role of the defining constant quantity S[R], this definition can be rephrased as

$${\text{the reference length }}\ell_{\text{ref}}{\text{ is such that }}S\left[ \text{R} \right] = \ell_{\text{ref}}\,\Delta T^{-1}$$

and this is in fact what became the definition of the metre in 2019 as a result of the 26th General Conference of Weights and Measures: “The metre (…) is defined by taking the fixed numerical value of the speed of light in vacuum c to be 299 792 458 when expressed in the unit m s–1, where the second is defined in terms of the caesium frequency ∆νCs.” (BIPM, 2019: 2.3.1).

Given the condition of the correctness of the theory that considers a quantity constant for a class of objects, this generalization produces three important benefits, by making the unit

  • independent of the conditions of stability of a single object,

  • more widely accessible (in principle, everyone with access to one object of the class can realize the definition of the unit, and therefore operate as the root of a traceability chain), and

  • definable in terms of quantities of kinds other than that of the unit, given the condition that all relevant quantities are related in a system of quantities.

This developmental path, where a unit is defined as

  1. the quantity of a given object (prototype-based definition), then

  2. a statistic of the quantities of a representative set of objects (norm-referenced definition), then

  3. the quantity considered to be constant for a class of objects (constant-based definition, as in the 1960 definition of the metre), then

  4. a quantity functionally related to the quantity/ies considered to be constant for a class of objects (functional constant-based definition, as in the 1983 definition of the metre), with the relation possibly stated in inverse form (inverse functional constant-based definition, as in the 2019 definition of the metre),

may be interpreted as a blueprint of the options for the definition of reference quantities: in fact, a lesson learned from history.

6.3.5 Values of Quantities: What They Are

Let us summarize the main features of the construction proposed in the previous sections. In the special case of an empirically additive general quantity Q, the quantities Q[ai] of objects ai can be concatenated so that the concatenation Q[ai] ⊕ Q[aj] can be empirically indistinguishable from a quantity Q[ak], that is, Q[ai] ⊕ Q[aj] ≈ Q[ak].Footnote 19 On this basis an object r having the quantity Q can be singled out with the conditions that it is sufficiently Q-stable and that Q-related copies of it are available. This allows for the identification of the individual quantity Q[r] not only as “Q[r]”—i.e., the quantity Q of the object r—but also through a time-independent identifier “qref” (“\({\ell}_{\text{ref}}\)” in the example above). This also allows for reporting of the information on a quantity Q[ai] in terms of its indistinguishability from a multiple x of qref, Q[ai] ≈ x qref. Furthermore, other such reference objects r* can be chosen, and the scale transformation qref = k qref* can be experimentally tested, for a given k that depends on qref and qref*.

While everything that has been done in this construction is related to quantities of objects, the conclusions apply to what are commonly acknowledged to be values of quantities, and in fact the indistinguishability

$$Q\left[ {a_{i} } \right] \approx x{\text{q}}_{{{\text{ref}}}}$$

can be interpreted as a Basic Evaluation Equation

$${\text{quantity}}\,{\text{of}}\,{\text{an}}\,{\text{object}} \approx {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{quantity}}$$

as well, as follows:

  • an individual quantity qref is singled out as a quantity unit (e.g., the metre); qref may be defined as the quantity Q[r] of an object r or, being defined in some other way as discussed in Sect. 6.3.4, may be realized by some object r; in either case r is a measurement standard, and possibly in particular the/a primary standard;

  • the individual quantities x qref (e.g., 2 m) are values of quantities, being, by construction, the multiples of qref obtained by means of the concatenation of the chosen unit;

  • working standards r’ can be calibrated against the primary standard r, Q[r’] ≈ Q[r] (ignoring calibration uncertainty), so that the quantity Q[a] of an object a can be compared with Q[r’]; hence the inference that leads, by transitivity,Footnote 20 from qref = Q[r], Q[r’] ≈ Q[r], and Q[a] ≈ x Q[r’] to Q[a] ≈ x qref is the simplest case of a metrological traceability chain (JCGM, 2012: 2.42);

  • the relation Q[a] ≈ x qref for a given x (e.g., L[a] ≈ 2 m) is a Basic Evaluation Equation, and thanks to this traceability it may be a measurement result for the measurand Q[a] (ignoring measurement uncertainty);

  • hence the relations

the quantity of a given object is indistinguishable from a multiple of the quantity of another object

(e.g., L[a] ≈ 2 L[r]) and

the quantity of a given object is a value of a quantity

(e.g., L[a] ≈ 2 \({\ell}_{\text{ref}}\) (or L[a] ≈ 2 m, which indeed is commonly read “the length of a is 2 metres”))

refer to the same empirical situation, the difference being in the way the two relations convey the information about the individual quantities involved.

The conclusion is then obvious: a value of a quantity is an individual quantity identified as a multiple of a given reference quantity, designated as the unit.Footnote 21
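The construction summarized in this section can be condensed into a minimal computational sketch; the data, the finite resolution, and the names below are our own illustrative assumptions, and all uncertainties are ignored, as in the text:

```python
# Simplest metrological traceability chain, ignoring uncertainties:
# q_ref := Q[r] (primary standard), Q[r'] ≈ Q[r] (working standard),
# Q[a] ≈ x * Q[r'], hence by transitivity Q[a] ≈ x * q_ref.

# "True" quantities, unknown to the measurer (arbitrary internal units).
TRUE_QUANTITY = {"r": 1.0, "r_prime": 1.0, "a": 2.0}

def ratio(obj, standard):
    """Empirical comparison: the multiple x such that
    Q[obj] ≈ x * Q[standard], at a finite resolution."""
    return round(TRUE_QUANTITY[obj] / TRUE_QUANTITY[standard], 2)

# Calibration of the working standard against the primary standard:
assert ratio("r_prime", "r") == 1.0

# Measurement of a against the working standard:
x = ratio("a", "r_prime")

# Basic Evaluation Equation: Q[a] ≈ x * q_ref, e.g., L[a] ≈ 2 m.
print(f"Q[a] = {x} q_ref")  # -> Q[a] = 2.0 q_ref
```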

The analysis in Sect. 5.3.2, which led us to interpret a relation such as Q[ai] ≈ Q[aj] as including expressions with different senses but possibly the same individual length as their referent, can be now straightforwardly extended to scale transformations and Basic Evaluation Equations:

  • in the scale transformation qref = k qref* the expressions “qref” and “k qref*” have different senses but the same individual length as referentFootnote 22;

  • in the Basic Evaluation Equation Q[a] ≈ x qref, for a given x, the expressions “Q[a]” and “x qref” have different senses but could have the same individual length as their referent.

The concept system about <quantity> can then be depicted as in Fig. 6.7.

Fig. 6.7 The concept system about <quantity> (top) and an example (bottom), as a specialization of Fig. 5.2

As discussed in Sect. 6.5.1, there is nothing arbitrary in the fact that an individual quantity q is identified as the quantity Q[a] of an object a. Once again, this shows that the Basic Evaluation Equation L[a] = 1.2345 m conveys the information that there is an individual length \({\ell}\) such that both the length L[a] of rod a and the value 1.2345 m are claimed to be instances of \({\ell}\). This allows us to propose a short discussion, in Box 6.1, about the delicate subject of true values of quantities.

Box 6.1 True values of quantities

Plausibly due to its explicit reference to truth, the idea that the quantity of an object may have a true value, which measurement aims at discovering, has caused controversies and confusion in measurement science for decades, if not centuries. It is a position that has appeared so deeply entangled with philosophical presuppositions about the actual existence of an observer-independent reality that a mutual understanding seemed impossible without preliminary agreement about an underlying ontology. Consider two authoritative, traditional examples of positions that jointly illustrate this controversy. According to Ernest Doebelin “when we measure some physical quantity with an instrument and obtain a numerical value, usually we are concerned with how close this value may be to the ‘true’ value” (1966). Vice versa, Churchill Eisenhart wrote about his “hope that the traditional term ‘true value’ will be discarded in measurement theory and practice, and replaced by some more appropriate term such as ‘target value’ that conveys the idea of being the value that one would like to obtain for the purpose in hand, without any implication that it is some sort of permanent constant preexisting and transcending any use that we may have for it.” (1962).

While acknowledging that references to truth are usually laden with philosophical presuppositions, our understanding of what values of properties are and of the epistemic role of the Basic Evaluation Equation allows us to propose a simple interpretation of <true value>.

As a preliminary note, it should be clear that truth or falsehood do not apply, properly, to values: asking whether, say, 1.2345 m is true or false is meaningless. Rather, the claim that 1.2345 m is a true value refers to its relation with a property of an object, say, the length of rod a, being thus just a shorthand for the assertion that the Basic Evaluation Equation

$$L\left[ a \right] = 1.2345\,{\text{m}}$$

is true. Of course, there can be several reasons that may make it hard or even impossible to achieve knowledge of the actual truth or falsehood of this equation, and the equation as such could be ill defined.

These issues may be left aside in a first, principled analysis about <true value>, which maintains in particular the distinction between the existence of true values and the possibility of our knowledge of them. Indeed, the VIM notes that true values are “in principle and in practice, unknowable” (JCGM, 2012: 2.11, Note 1), but this may be simply considered as an instance of the recognition that we cannot be definitively certain of anything related to empirical facts, like values of empirical properties of objects.

If true values are interpreted in the context of Basic Evaluation Equations, the conclusion does not seem to be particularly controversial: as already discussed, in Sect. 5.1.3 and elsewhere, asking whether 1.2345 m is the true value of the length of rod a is the question of whether there is an individual length that is at the same time the length of rod a and 1.2345 times the metre. Moreover, if values of properties and evaluation scales are considered as classification tools, as further discussed also in Box 6.2, the truth of a Basic Evaluation Equation is about the correctness of the classification of the property of the given object in the class identified by the given value. From this perspective, assessing the truth of a Basic Evaluation Equation is a sensible and useful task, basically free of philosophical preconditions: indeed, the idea that an entity—the property of an object in this case—can be objectively classified after a classification criterion has been set does not require one to accept any realist assumption about properties of objects or values of properties.

This interpretation can then be generalized, by refining our understanding of the two lengths involved in the Basic Evaluation Equation above and their relations. Let us suppose that our knowledge of the structure of rod a leads us to model it as having a unique length at the scale of millimetres, so that L[a] = 1.234 m would be either true or false as such, but not at a finer scale. Were we for some reason instead expected to report a measured value for L[a] at the scale of tenths of millimetres, as above, we should admit that the measurand has to be defined with a non-null definitional uncertainty, as exemplified in Box 2.3, thus acknowledging that in this case L[a] is not really one length, but several—plausibly an interval—of them. As a consequence, the Basic Evaluation Equation above, if true, should actually be meant as something like

$$L\left[ a \right] \ni 1.2345\,{\text{m}}$$

i.e., 1.2345 times the metre is one of the lengths that rod a has, as the measurand L[a] is defined. While this seems to be a peculiar conclusion, consider that values of quantities are sometimes conceived of as operating in mathematical models assuming the continuity (and the differentiability, etc.) of the involved functions, so that the Basic Evaluation Equation above is actually treated as if it were L[a] = 1.234500000... m. However, macroscopic objects cannot have a length classified in a scale that is infinitely specific, i.e., by values with infinitely many significant digits. Hence, sooner or later the interpretation of Basic Evaluation Equations as memberships, instead of equalities, may become appropriate, where the measurand is then defined with a non-null definitional uncertainty with respect to a given scale and a value is true if it is one of the lengths of the subset / interval admitted by the definitional uncertainty. This is a plausible account of what is behind the VIM’s statement that “there is not a single true quantity value but rather a set of true quantity values” (JCGM, 2012: 2.11, Note 1).

The previous construction, which has led us to reach a conclusion about what values of quantities are, explicitly relies on the additivity of length. In the next two sections we discuss how this conclusion generalizes to non-additive cases. We discuss here values of non-additive quantities, in particular those represented on interval scales, while reserving a discussion of the most general case of values of possibly non-quantitative properties to Sect. 6.5.2.

6.3.6 Beyond Additivity: The Example of Temperature

Let us first discuss the case of temperature, as characterized and then measured in thermometric (e.g., Celsius and Fahrenheit) scales. Unlike length, temperature is not an additive quantity: that is, we do not know how to combine bodies by temperature so that the temperature of the body obtained by combining two bodies at the same temperature is twice the temperature of each of the combined bodies. This could be what led Campbell to conclude that “the scale of temperature of the mercury in glass Centigrade thermometer is quite as arbitrary as that of the instrument with the random marks” (1920: p. 359), so that “the international scale of temperature is as arbitrary as Mohs’ scale of hardness” (p. 400). Were this correct, values of temperature, such as 23.4 °C, would be only identifiers for ordered classes of indistinguishable temperatures, as are values of Mohs’ hardness, so that we could assess that the temperature identified as 23.4 °C is higher than the temperature identified as 23.3 °C, but not that the difference between the temperatures identified as 23.4 °C and 23.3 °C is the same as the difference between the temperatures identified as 22.9 °C and 22.8 °C. Our question is then: what is a value of temperature?

The starting point is the same as in the case of length: we assume to be able to compare bodies by their temperature so as to assess whether two given bodies have indistinguishable temperatures (in analogy with the comparison depicted in Fig. 6.2) or whether one body has a greater temperature than the other.Footnote 23

On this basis, a (non-arbitrary) scale of temperature (and therefore values of temperature) can be constructed through an empirical procedure, though, admittedly, not as simply as the one for length. As in the case of length, all assumptions that follow relate to empirical properties of objects, and non-idealities in the comparisons of such properties are not taken into account.

Let us consider a sequence ai, i = 1, 2, …, of amounts of gas of the same substance, where the i-th amount has the known mass M[ai] = mi and is thermally homogeneous, at the unknown temperature Θ[ai] = θi.Footnote 24 Let us suppose that any two amounts of gas ai and aj can be combined into a single amount ai,j, such that mi,j = mi + mj. It is assumed that ai,j reaches thermal homogeneity and that its temperature θi,j is only a function of θi, mi, θj, and mj (but of course the non-additivity of temperature is such that θi,j ≠ θi + θj). Finally, let us suppose that the temperatures of any two amounts of gas can be empirically compared by equality and by order, i.e., whether θi = θj or θi < θj or θj < θi. The hypothesis that temperature is an intensive property (see Sect. 1.2.1) can be tested through some preliminary checks:

  • for any two amounts ai and aj, if θi = θj then θi,j = θi = θj, i.e., thermal homogeneity does not depend on mass;

  • for any two amounts ai and aj, if θi < θj then θi < θi,j < θj, i.e., thermal composition is internal independently of mass;

  • for any three amounts ai, aj, and ak, if θi < θj < θk and mj ≤ mk then θi,j < θi,k, i.e., thermal composition is monotonic for monotonically increasing mass;

  • for any three amounts ai, aj, and ak, if θi < θj < θk and mj > mk then all cases, θi,j < θi,k, θi,j = θi,k, and θi,j > θi,k, can happen.

The fact that these conditions hold may suggest the hypothesis that

$$\uptheta_{i,j} = \frac{{m_{i} \uptheta_{i} + m_{j} \uptheta_{j} }}{{m_{i} + m_{j} }}$$

i.e., temperatures compose by weighted average, where the weights are the masses of the composing amounts of gas. For testing this hypothesis, let us assume that three amounts of gas, ai, aj, and ak, are given such that their masses mi, mj, and mk are known and can be freely changed, and that θi < θj and θi < θk. Let us now suppose that aj and ak are independently composed with ai, and that mi, mj, and mk are chosen so that θi,j = θi,k, and therefore, under the hypothesis that temperatures compose by weighted average, that

$$\frac{{m_{i} \uptheta_{i} + m_{j} \uptheta_{j} }}{{m_{i} + m_{j} }} = \frac{{m_{i} \uptheta_{i} + m_{k} \uptheta_{k} }}{{m_{i} + m_{k} }}$$

What is obtained is a system with two degrees of freedom, in which one of the three unknown temperatures θi, θj, and θk is a function of the other two temperatures and of the three masses, i.e., θk = f(θi, θj, mi, mj, mk). Were values arbitrarily assigned to identify θi and θj (for example 0 °X and 1 °X for an X scale with values in degrees X), a value for θk could be computed. By choosing the two temperatures θi and θj and setting the corresponding values, and repeating the same process with different masses mi, mj, mk and a different temperature θk, other values of the X scale would be obtained, and the hypothesis of weighted average validated.
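Solving the equality above for θk gives θk = [θi,j (mi + mk) − mi θi] / mk. The following minimal sketch, with the masses and the two fixed values chosen by us purely for illustration, computes a new value on the X scale and checks its consistency:

```python
# Testing the weighted-average hypothesis: tune the masses so that
# composing a_i with a_j and composing a_i with a_k give the same
# temperature, then solve for theta_k (all numbers illustrative).

def mix(theta_1, m_1, theta_2, m_2):
    """Hypothesized composition: mass-weighted average of temperatures."""
    return (m_1 * theta_1 + m_2 * theta_2) / (m_1 + m_2)

def solve_theta_k(theta_i, theta_j, m_i, m_j, m_k):
    """The temperature theta_k that makes theta_ik equal to theta_ij."""
    theta_ij = mix(theta_i, m_i, theta_j, m_j)
    return (theta_ij * (m_i + m_k) - m_i * theta_i) / m_k

# Arbitrarily assigned values identifying theta_i and theta_j:
t_i, t_j = 0.0, 1.0  # 0 °X and 1 °X on the X scale

# With illustrative masses, a value for theta_k is computed:
t_k = solve_theta_k(t_i, t_j, m_i=1.0, m_j=2.0, m_k=1.0)

# Consistency check: composing a_i with a_k reproduces theta_ij.
assert abs(mix(t_i, 1.0, t_k, 1.0) - mix(t_i, 1.0, t_j, 2.0)) < 1e-12
print(t_k)  # 4/3, i.e., about 1.333 °X
```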

However, as discussed in Sect. 1.2.1, historically a key step forward was the discovery of thermal expansion, i.e., that some bodies change their volume when their temperature changes. In metrological terms, such bodies can be exploited as transducers of temperature (see Sect. 2.3). To make a long story short, the refined treatment of these bodies—in devices that we would consider today (uncalibrated) thermometers—corroborated the empirical hypotheses that, within given ranges of volumes of given bodies,

  • for a sufficiently large set {ai} of bodies the temperature Θ[ai] of each body in the set and its volume V[ai] are causally connected, as modeled by a function f, V[ai] = f(Θ[ai]),

  • such that changes in temperature of each body in the set produce changes in its volume,

  • and that, for each body in the set, differences in volume correspond to differences in temperature in such a way that equal differences of volume are produced by equal differences of temperature, i.e., if V = f(Θ) and v1−v2 = v3−v4 then it is because θ1−θ2 = θ3−θ4.Footnote 25

While this development so far involves only abstract individual properties—temperatures and volumesFootnote 26—possibly identified as properties of objects, on this basis the construction of a scale of temperatures, and therefore the introduction of values of temperature, is a relatively trivial task. According to the traditional procedure,

  • two distinct temperatures are identified, θ1 and θ2, each of them being the common, constant temperature of a class of objects, θ1 = Θ[R1] and θ2 = Θ[R2], in analogy with what was discussed in Sect. 6.3.4 about the speed of light; θ1 and θ2 could be the temperatures of the freezing point of water and the boiling point of water in appropriate conditions, respectively;

  • the scale built from θ1 and θ2 is given a name, say °C, and a number in the scale is conventionally assigned to each of θ1 and θ2, thus identifying them with values, for example 0 °C := θ1 and 100 °C := θ2;

  • according to the hypothesis that equal differences of volume are produced by equal differences of temperature, appropriate numbers in the scale are assigned to all other temperatures: for example, if f(θ3) = [f(θ1) + f(θ2)] / 2, then [0 °C + 100 °C] / 2 = 50 °C := θ3.
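A minimal sketch of this traditional procedure, with the two fixed-point volume readings invented purely for illustration:

```python
# Construction of a thermometric scale from two fixed points,
# assuming equal differences of volume correspond to equal
# differences of temperature (volume readings are illustrative).

V_FREEZING = 10.0  # volume f(theta_1) at the freezing point of water
V_BOILING = 11.0   # volume f(theta_2) at the boiling point of water

def celsius(volume):
    """Assign a value on the scale: 0 °C := theta_1, 100 °C := theta_2,
    and all other temperatures by linear interpolation on volume."""
    return 100.0 * (volume - V_FREEZING) / (V_BOILING - V_FREEZING)

# f(theta_3) halfway between the fixed points -> 50.0 °C
print(celsius(10.5))
```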

The conclusion is then that values of temperature are individual temperatures identified as elements in such a scale.

6.3.7 Beyond Additivity: The Example of Reading Comprehension Ability

Let us now discuss the case of reading comprehension ability (RCA), as characterized and then measured by reading tests. Like temperature and unlike length, RCA is not an additive quantity: that is, we do not know how to combine readers by RCA so that the RCA of a hypothetical “synthetic reader” is the sum of the RCAs of each of the combined readers. As above, our question is then: what is a value of RCA such as, say, 150 RCA units? The starting point is the same as in the case of length and temperature: we assume that we can compare readers by their RCA so as to assess whether two given readers have indistinguishable RCAs (in analogy with the comparison depicted in Fig. 6.2). For example, the two readers could be asked to discuss the contents of a text passage with a human judge, and the judge could then rate the readers’ relative RCAs. Now, unaided human judges may not have sufficient resolution to discriminate RCA beyond rough ordinal classes (e.g., very little comprehension, text comprehension, literal comprehension, inferential comprehension, etc.), so that one could, subject to the assumption that one used the same human judge, consider that RCA is at most an ordinal property. Apart from concerns that this may be assuming a weaker scale of RCA than possible, there are clearly serious issues of subjectivity at play in this situation: did the judge ask the same questions of the two readers, did the judge rate the responses to the questions “fairly”, and would a different human judge be consistent with this one?

A key step forward was the implementation of standardized reading tests (Kelly, 1916; see Sects. 1.2.2 and 3.3.1), where readers would (i) read a text passage and (ii) answer a fixed set of questions (called in this context “items”Footnote 27) about the contents of the passage; and then (iii) their answers would be judged as correct or incorrect, and (iv) readers would be given sum-scores (e.g., the total number of items that they answered correctly) on the reading comprehension test. Here readers who had the same sum-score would be indistinguishable with respect to their RCAs as measured by that test. Again, this would result in an ordinal scale (i.e., the readers who scored 0, the readers who scored 1, …, the readers who scored K, for a test composed of K items), though, depending on the number of items in the set, there would be a finer grain size than in the previous paragraph (i.e., as many levels as there are different sum-scores). This approach does address some of the subjectivity issues raised by the previous approach: the same questions are asked of each reader, and, with a suitable standardized mode of item response scoring, the variations due to different human judges can be reduced, if not eliminated altogether. However, what is not directly addressed are the issues of (a) the selection of text passages, and (b) the selection of questions about those passages. Suppose, however, that one was prepared to overlook these last two issues: one might convince oneself that the specific text passages and questions included in the test were acceptably suitable for all the applications that were envisaged for the reading comprehension test. In that case, one could adopt a norm-referenced approach to developing a scale (see Sect. 6.3.4), where the cumulative percentages of readers from a sample from a given reference-population (say, Grade 6 readers from X state in the year 20YZ) were used to establish a mapping from the RCA scores on the test to percentiles of the sample. This makes possible so-called equipercentile equating to the (similarly calculated) results of other reading comprehension tests.
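A minimal sketch of sum-scoring and the norm-referenced mapping to percentiles; the scored responses below are invented purely for illustration:

```python
# Sum-scores on a K-item reading test, and a norm-referenced mapping
# from sum-scores to percentiles of a reference sample (data invented).

# Scored responses (1 = correct, 0 = incorrect) of a reference sample.
responses = [
    [1, 1, 0, 1, 0],  # reader 1: sum-score 3
    [1, 0, 0, 0, 0],  # reader 2: sum-score 1
    [1, 1, 1, 1, 0],  # reader 3: sum-score 4
    [1, 1, 0, 0, 0],  # reader 4: sum-score 2
]

sum_scores = [sum(r) for r in responses]

def percentile(score, sample):
    """Cumulative percentage of the sample at or below this score."""
    return 100.0 * sum(s <= score for s in sample) / len(sample)

# Two readers with the same sum-score are indistinguishable
# with respect to their RCAs as measured by this test.
print(percentile(3, sum_scores))  # -> 75.0

# The same mapping computed for another test enables equipercentile
# equating: scores on the two tests are matched via equal percentiles.
```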

Thus, at this point in the account, the conclusion is that values of RCA are individual abilities identified as elements in an ordinal scale. It is interesting to note that the sum-scores which serve as the indexes for the ranks can also be thought of as frequencies: a sum-score s, out of K items in total, is a frequency, namely the number of items answered correctly. It can also be seen, as s/K, as a relative frequency (proportion) of the total number of items K, and, of course, relative frequencies are often interpreted as probabilities (though this move is not philosophically innocent; see, e.g., Holland, 1990, and Borsboom et al., 2003). That is, given this set of K items, the proportion of items that the reader did get correct can be read as an estimate of the average probability that the reader gets an item correct.

To see how this makes sense, one must backtrack to our conception of RCA, as follows.

  • We label as RC-micro an event (Wilson et al., 2019) involving a reader’s moment-by-moment understanding of a piece of text. This is related to Kintsch’s concept of the textbase in his Construction/Integration (CI) model (Kintsch, 2004), and refers to all component skills such as decoding (also known as word recognition); these typically proceed from a finer to a coarser lexical granularity, i.e., a reader builds meaning from text, starting from small units (letters, sounds, words, etc.) and moving to progressively larger units. Most competent readers are not even conscious of the events at the lowest levels of granularity, unless, of course, the reader comes across a word that she does not recognize and has to go back to sounding it out letter by letter (i.e., grapheme by grapheme). Thus, each of these reading comprehension events can be thought of as a micro-level event that is composed of a cascade of other more basic micro-level events, and is itself contained in other micro-level events.

  • In contrast, we label as RC-macro the events which integrate all the micro-level events that occur for a reader in the process of reading the text passage, and may integrate other conceptions beyond those, including thoughts about other texts and other ideas. This is related to the situation aspect of Kintsch’s CI model, which is integrated with the textbase to form a deeper understanding of the text, which is what will be stored in long-term memory. Here we might compare this to temperature, where the micro-events can be seen as the motions of individual molecules, each with properties of speed and direction in three-dimensional space (i.e., these would be seen as constituents of kinetic energy, a quantity different from temperature), which we cannot directly observe and in which we are usually not interested. In contrast, the macro-level property of temperature is the integration over all of these molecular motions inside a certain body, which is what we are indeed interested in.

  • This leads us to reading comprehension ability, which is the overall disposition of an individual reader to comprehend texts.

  • Then, when test developers construct an RCA test, they (a) sample text passages from a body of texts, (b) design an item-universe (i.e., the population of items:Footnote 28 Guttman, 1944) of questions (items) that challenge a reader’s RCA concerning parts of the text (including, of course, possibly whole texts, and across different texts) and take a sample from that universe, and (c) establish rules for deciding whether the answers to the sampled questions are correct or not, resulting in a vector of judgments of responses. This, then, is the transduction from RCA to a vector of scored responses to the items.

  • In the tradition of classical test theory, as described above, the items are viewed as being “interchangeable” in the sense of being randomly sampled from the item-universe, and hence the information in the vector can be summarized as the score s or, equivalently, as the relative frequency s/K with which the reader (on average) gets an item correct.

  • Alternatively, the indication could be seen as the vector of responses, thereby preserving the information about individual items (such as their difficulty) and allowing the probability of the reader getting each of the items correct to be modeled; this is the direction followed below.

In addition, generalizability must be considered: basing the measurement of RCA on a specific test is too limiting for practical application. This was recognized by Louis Thurstone, a historically important figure in psychological and educational measurement (1928: p. 547):

A measuring instrument must not be seriously affected in its measuring function by the object of measurement. To the extent that its measuring function is so affected, the validity of the instrument is impaired or limited. If a yardstick measured differently because of the fact that it was a rug, a picture, or a piece of paper that was being measured, then to that extent the trustworthiness of that yardstick as a measuring device would be impaired. Within the range of objects for which the measuring instrument is intended, its function must be independent of the object of measurement.

To contextualize this, suppose that the RCA of readers is to be assessed using a set of items designed for reading comprehension which can be scored, as above, only as correct or incorrect. We must then ask: what is required so that the comparison of two readers m and n (in terms of their RCA) is independent of the difficulty of the items that are used to elicit evidence of their relative RCAs?

Furthermore, assume that the test is composed of a set I of items. Now, two readers m and n can be observed to differ only when they answer an item differently. For any such pair of readers, m and n, there will be a set of items for which they are both correct, call it Ic, and a set for which they are both incorrect, Ii. Then the set of items on which they differ will be Id, which is I with Ic and Ii removed—and suppose that the number of items in Id is D. Suppose further that the number of items that reader m gets correct in the reduced set Id is sm, and define sn similarly. Then sm + sn = D, and sm/D is the relative frequency of m answering an item correctly while n simultaneously answers it incorrectly; likewise, sn/D is the relative frequency of n answering an item correctly while m simultaneously answers it incorrectly. Thus the RCAs of m and n (in terms of their success rates) can be compared by comparing sm/D with sn/D. By interpreting these relative frequencies as probabilities, they become P(m=correct, n=incorrect) and P(m=incorrect, n=correct), and they can be compared using their ratio

$$\frac{{P\left( {m = {\text{correct}},\,n = {\text{incorrect}}} \right)}}{{P\left( {m = {\text{incorrect}},\,n = {\text{correct}}} \right)}}$$

Now, suppose that Pmi is the probability that person m responds correctly to item i (and analogously Pni for person n), so that this expression can be written somewhat more compactly

$$\frac{{P_{mi} \left( {1 - P_{ni} } \right)}}{{\left( {1 - P_{mi} } \right)P_{ni} }}$$

under the assumption of local independence, and with the observation that, since there are only two possible responses, their probabilities must sum to 1.0.
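To make the comparison concrete, the following minimal sketch computes the sample analogue of this ratio from two hypothetical scored response vectors; the data and the function name are ours, for illustration only.

```python
import numpy as np

def differing_items_ratio(resp_m, resp_n):
    """Compare two readers only on the set I_d of items they answer
    differently; returns (s_m/D) / (s_n/D), i.e., s_m / s_n."""
    resp_m, resp_n = np.asarray(resp_m), np.asarray(resp_n)
    differ = resp_m != resp_n                 # I_d: items answered differently
    D = int(differ.sum())
    s_m = int((resp_m[differ] == 1).sum())    # m correct, n incorrect
    s_n = D - s_m                             # n correct, m incorrect
    return s_m / s_n

# Hypothetical scored response vectors of readers m and n over 8 items
m = [1, 1, 1, 0, 1, 0, 1, 1]
n = [1, 0, 1, 0, 0, 1, 0, 1]
print(differing_items_ratio(m, n))  # 3 "wins" for m, 1 for n -> 3.0
```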

Returning now to Thurstone’s admonition, this can be translated in this context to the requirement that the equation

$$\frac{{P_{mi} \left( {1 - P_{ni} } \right)}}{{\left( {1 - P_{mi} } \right)P_{ni} }} = \frac{{P_{mj} \left( {1 - P_{nj} } \right)}}{{\left( {1 - P_{mj} } \right)P_{nj} }}$$
(6.1)

should hold for any choice of items i and j. It takes just a few lines of algebra to show that

$$P_{ni} = \frac{{\exp \left( {\theta_{n} - \delta_{i} } \right)}}{{1 + \exp \left( {\theta_{n} - \delta_{i} } \right)}}$$
(6.2)

where θn is reader n’s RCA, and δi is item i’s reading difficulty. In fact, with the probability function in Eq. (6.2), both sides of Eq. (6.1) reduce to exp(θm − θn); that is, the item difficulties, δi and δj, are no longer present, which confirms that the comparison does not depend on the specific items used for it, as Thurstone demanded. Note that the RCAs and item difficulties are on an interval scale (by construction). Of course, in order for the item difficulties to be eliminated from the equation, the item difficulties and the RCAs must conform to this probability model in the sense of statistical fit, and hence this is an empirical matter that must be examined for each instrument and human population. The surprising finding about the function in Eq. (6.2) is that, under quite mild conditions, it is the only such function involving these parameters; this result is due to Georg Rasch, hence the function is called the “Rasch” model (Rasch, 1960/1980).
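This item-independence is easy to check numerically. In the following sketch the reader abilities and item difficulties are hypothetical values in logits; under Eq. (6.2) the ratio of Eq. (6.1) comes out the same for every item, namely exp(θm − θn).

```python
import math

def p_correct(theta, delta):
    """Rasch model, Eq. (6.2): probability of a correct response."""
    return math.exp(theta - delta) / (1 + math.exp(theta - delta))

def comparison_ratio(theta_m, theta_n, delta):
    """The ratio P_mi(1 - P_ni) / ((1 - P_mi) P_ni) for one item."""
    p_m = p_correct(theta_m, delta)
    p_n = p_correct(theta_n, delta)
    return (p_m * (1 - p_n)) / ((1 - p_m) * p_n)

theta_m, theta_n = 1.3, 0.4            # hypothetical reader RCAs (logits)
for delta in (-1.0, 0.0, 2.5):         # three items of unequal difficulty
    print(round(comparison_ratio(theta_m, theta_n, delta), 6))
# each line prints 2.459603, i.e., exp(1.3 - 0.4): the item difficulties cancel
```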

The actual numbers obtained for θn and δi are termed “logits” (i.e., log of the odds, or log-odds units),Footnote 29 and are typically used to generate property values in a way similar to what is done for temperature units: the logits are on an interval scale, so what is needed are two fixed and socially-accessible points. One standard way is to assign two relatively extreme values: for example, one might decide that, for a given population of readers, say State X for year 20YZ, the 100.0 point would be the mean of the logits for readers in Grade 1, while a higher value, say 500.0, would be chosen as the mean for students in Grade 12: this would be suitable, for example, for a reading test used in a longitudinal context (for an expanded discussion, see, e.g., Briggs, 2019).

A second, similar way, perhaps more suitable for test applications focused on particular grades, would be to allocate a value, say 500, as the mean for readers in Grade 6, and 100 as the standard deviation for the same students. Some applications also use the raw logits from their analyses—this effectively embeds the interpretation of the units in a given sample, which may be acceptable in some research and development situations, but would be difficult to justify in a broadly-used application. There is also a more traditional set of practices that (a) use an ordinal approach to classify readers into a sequence of reading performance categories, and (b) adopt a “norm-referenced” approach that carries out techniques similar to those just described, but using the raw scores from the reading tests rather than estimates from a psychometric model.
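Both anchoring conventions are linear transformations of the logits, which is legitimate on an interval scale. A minimal sketch, with hypothetical calibration values:

```python
import statistics

def anchor_two_points(logits, mean_low, mean_high, low=100.0, high=500.0):
    """Fix the scale at two points: mean_low -> low, mean_high -> high
    (e.g., Grade 1 mean logit -> 100.0, Grade 12 mean logit -> 500.0)."""
    slope = (high - low) / (mean_high - mean_low)
    return [low + slope * (x - mean_low) for x in logits]

def anchor_mean_sd(logits, target_mean=500.0, target_sd=100.0):
    """Fix the scale by a reference group's mean and standard deviation
    (e.g., Grade 6 readers -> mean 500, SD 100)."""
    mu, sd = statistics.mean(logits), statistics.pstdev(logits)
    return [target_mean + target_sd * (x - mu) / sd for x in logits]

# Hypothetical calibration: Grade 1 mean = -2.0 logits, Grade 12 mean = +3.0
print(anchor_two_points([-2.0, 0.5, 3.0], mean_low=-2.0, mean_high=3.0))
# [100.0, 300.0, 500.0]
```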

The conclusion is then that values of RCA are individual abilities identified as elements in a log-odds (interval) scale based on ratios of probabilities (see Freund, 2019, for a discussion of these types of scales).

6.4 The Epistemic Role of Basic Evaluation Equations

The conclusion reached in the previous section has an important implication for an ontology of quantities, and of properties more generally, as developed further in the following section. A Basic Evaluation Equation such as Q[a] ≈ x qref reports not just an attribution or a representation, but the claim of an indistinguishability, and in the form Q[a] = x qref the claim of an equality, of individual quantities: if it is true, it informs us that two individual quantities that were known according to different criteria—as the property of an object and the multiple of a unit respectively—are in fact one and the same. In detail:

  • before the relation is evaluated, we are able to identify an individual quantity q as a quantity of an object a, Q[a], and a set of individual quantities qx, each as a value x qref, for a given qref and a number x varying in a given set; q and each qx are quantities of the same kind;

  • as a result of the evaluation, a number x is found such that the hypothesis is supported that the individual quantity q and the individual quantity qx, that were known according to different criteria, are in fact one and the same, i.e., that Q[a] and x qref are identifiers of the same individual quantity.

As a consequence, Basic Evaluation Equations are ontologically irrelevant: if they are true, they simply instantiate the tautology that an individual quantity is equal to itself.Footnote 30 But, of course, their widespread use is justified by their epistemic significance: if they are true, they inform us that two individual quantities that were known according to different criteria are in fact one and the same.

This gives us the tool to interpret a common way of presenting measurement: “the input to the measurement system is the true value of the variable [and] the system output is the measured value of the variable [, so that] in an ideal measurement system, the measured value would be equal to the true value” (Bentley, 2005: p. 3), with the consequence that “in a deterministic ideal case [it] results in an identity function” (Rossi, 2006: p. 40). Let us concede that in the deterministic ideal case the Basic Evaluation Equation is not just an indistinguishability but an equality, Q[a] = x qref. Nevertheless, the position exemplified in these quotes confuses the ontological and epistemic layers: for those who already know that Q[a] and x qref are the same quantity, the relation is an ontological identity, as the evening star = the morning star is for current astronomers (see Sect. 5.3.2). And in fact those who already know that Q[a] and x qref are the same quantity would have no reason for measuring Q[a]. But measurement is aimed at acquiring information on a measurand, not at identically transferring values of quantities through measuring chains, as is implicitly supposed in the quotations above.

Indeed, the idea of deterministic ideal measurement as an identity function becomes understandable if it is applied not to measurement, but to transmission systems, as mentioned in Sect. 4.2.1. It is indeed correct to model a transmission system in such a way that the input to the transmission system is the value of the variable and the system output is the transmitted value of the variable, so that “in an ideal transmission system, the transmitted value would be equal to the input value” (by paraphrasing Rossi’s quote above). If values were in fact empirical features of phenomena, measuring instruments could be interpreted as special transmission channels, aimed at transferring values in such a way that the transmission is performed without errors and therefore as an identity function. But a basic difference between transmission and measurement is manifest: the input to a transmission system is a value, explicitly provided by an agent who or which operates on purpose by encoding the value into the quantity of an object and sending the quantity through a channel, the purpose of which is to faithfully transfer this input. In this case, the value transmitted along the channel by the agent via the encoded quantity is in principle perfectly knowable. No such agent exists in the case of measurement, which requires a radically different description, in which values of quantities are the output, and not the input, of the process.Footnote 31

6.5 Generalizing the Framework to Non-quantitative Properties

The ontological and epistemological analysis proposed so far has been focused on quantities, although, as we have exemplified, much can also be done with non-additive quantities. Consistently with the VIM, we have assumed that quantities are specific kinds of properties (JCGM, 2012: 1.1), and therefore we need to work on the relation between quantities and properties in order to explore whether and how the ontology and epistemology introduced so far can be applied to properties in general. Concretely, the issue is whether Basic Evaluation Equations can involve non-quantitative properties and, if so, what the key differences are between quantitative and non-quantitative Basic Evaluation Equations.

According to a standard view in philosophy of science, developed in particular within the neopositivist tradition by Rudolf Carnap (1966) and Carl Gustav Hempel (1952), “the concepts of science, as well as those of everyday life, may be conveniently divided into three main groups: classificatory, comparative, and quantitative.” (Carnap, 1966: p. 51). The VIM at least implicitly assumed this classification and adapted it to properties, defined to be either quantities or nominal properties, where the former are defined to be either quantities with unit (peculiarly, <quantity with unit> is not explicitly defined, nor given a term) or ordinal quantities. Hence according to the VIM the basic distinction is between being quantitative and non-quantitative (Dybkaer, 2013), where the demarcation criterion is <to have magnitude>: quantities are properties that have magnitude (including ordinal quantities, then), and nominal properties are properties that have no magnitude. This concept system is depicted in Fig. 6.8.

Fig. 6.8 A traditional classification of concepts (left), and its implementation in the VIM (right)

The VIM does not define what a magnitude is, but a plausible hypothesis is that “magnitude” can be generically interpreted there as “amount” or “size”, so that for example the height of human beings is a quantity because it is a property that we have in amounts. This stands in contrast with properties such as blood type, which is only classificatory because we do not have “amounts of blood type”. Accordingly, the phrase “the magnitude of the height of a given human being” refers to “the amount of height” of that human being: it is then, plausibly, an individual length (for a more detailed analysis of the elusive concept <to have magnitude>, see Giordani & Mari, 2012).

The simplicity of the VIM’s account is attractive, but the distinction between quantitative and non-quantitative properties deserves some more analysis here, also due to its traditional connection with the discussion about the conditions of measurability, as mentioned in Sect. 3.4.2.Footnote 32

As discussed in Chap. 4, several interpretations of measurement have been proposed, but a long tradition rooted in Euclid has connected the measurability of a property with its being quantitative. The VIM keeps to this tradition in stating that “measurement does not apply to nominal properties” (JCGM, 2012: 2.1 Note 1). As also discussed in Sect. 1.1.1 and at more length in Sect. 4.2.3, tensions related to this issue helped motivate the formation of a committee of physicists and psychologists appointed by the British Association for the Advancement of Science and charged with evaluating the possibility of providing quantitative estimates of sensory events (see the discussion in Rossi, 2007; a more detailed analysis is in Michell, 1999: ch. 6). We might envision two distinct but complementary paths toward resolution of these tensions:

  • one is about the possibility of providing meaningful quantitative information for properties which are not directly or indirectly evaluated (or evaluable) by means of additive operationsFootnote 33;

  • the other is about the appropriateness of broadening the scope of measurement so as to include the evaluation of non-quantitative properties.

From the beginning, both of these paths have been biased by the prestige of measurement, as witnessed by the key role attributed by some to Otto Hölder’s (1901) paper, a mathematical work whose title has been translated into English as “The axioms of quantity and the theory of measurement”, and about which Michell asserted that “we now know precisely why some attributes are measurable and some not: what makes the difference is possession of quantitative structure” (p. 59). In the same vein Jan De Boer claimed that “Helmholtz and the mathematician Hölder are usually seen as the initiators of the axiomatic treatment of what is often called the theory of measurement” (1995: p. 407). But even just a glance at the scope of Hölder’s axioms shows that they do not relate to any experimental process, as would be expected from Helmholtz’s own words—“the most fruitful, most certain, and most exact of all known scientific methods” (1887: p. 1)—and confirmed by the way the VIM defines <measurement>: a “process of experimentally obtaining one or more quantity values that can reasonably be attributed to a quantity” (JCGM, 2012: 2.1). Indeed, Hölder himself admitted that “by ‘axioms of arithmetic’ has been meant what I prefer to call ‘axioms of quantity’” (p. 237), so that measurement is involved in them only insofar as “the theory of the measurement” is equated to “the modern theory of proportion” (p. 241), thus confirming the purely mathematical nature of the treatment.

In Sect. 3.4.2 we argued that this superposition of the conditions of being measurable and being quantitative derives from a confusion between <measurement>, an empirical concept, and <measure>, a mathematical concept. The conclusion is simple to state: what is to be found in Euclid’s Elements and what Hölder considered “the modern theory of proportion” is not a theory of measurement but a theory of measure, where measures are taken to be continuous quantities, so that, despite their lexical similarity, <measurement> and <measure> need to be maintained as distinct concepts (Bunge, 1973). From this one might of course assume that only properties modeled as measures are measurable, but this is an assumption, not a (logical, epistemological, or ontological) necessity: it is what we take to be the position of what Michell (1990) calls “the classical theory of measurement”, as rooted in Euclid’s geometry, but his sharp tenet that “without ratios of magnitudes there is no measurement” (p. 16) cannot be maintained without this strong and basically arbitrary assumption.

The problems generated by this confusion are not just lexical or semantic. A well-grounded distinction between quantitative and non-quantitative properties would be a key target, at least as a means to identify and justify possible differences in inferential processes and their results, and therefore in the kind of information they produce. The basic intuition about the distinction remains, e.g., that individuals can be compared in such a way that the height of a person can be one third greater than the height of another, or a difference on an interval scale can be one third greater than another difference, whereas the blood type of a person cannot be one third greater than the blood type of another. This intuition needs a persuasive explanation, which ultimately would be beneficial for a better identification of the conditions of measurability. In fact, the mentioned confusion is a good reason for developing this analysis as a key component of an ontology and an epistemology of properties: once an appropriate classification of types of properties has been established, whether only quantities are measurable might be thought of as simply an arbitrary lexical choice (Mari et al., 2017).Footnote 34

A basic framework on this matter was proposed by Stanley Smith Stevens (1946), with his well-known classification of what he called “scale types”, and since then his distinction between nominal, ordinal, interval, and ratio scales has been widely adopted (for an early, extended and clear presentation, see Siegel, 1956: ch. 3), and variously refined.Footnote 35 Such a framework was conceived as dealing with scales of measurement, given the perspective that “measurement, in the broadest sense, is […] the assignment of numerals to objects or events according to rules” (p. 677), explicitly drawing from Campbell’s seminal representationalist statement that “measurement is the assignment of numerals to represent properties” (Campbell, 1920: p. 267; see also the related discussion in Sect. 4.2). From the perspective of the present analysis Stevens’ “broadest sense” is indeed too broad, if considered to be specifically related to measurement. Rather, what is interesting in his classification is more correctly understood by considering it as related to scales of property evaluation, thus disentangled from issues about measurability. We have then to deal with two interrelated issues:

  • to which entities does the feature of being quantitative or non-quantitative apply?

  • how should the condition of being quantitative or non-quantitative be defined?

But are the terms “nominal”, “ordinal”, and so forth best understood as referring to types of properties, or of evaluations? And, in consequence, how should such types be defined?

6.5.1 The Scope of the Quantitative/non-Quantitative Distinction

The first question for us to consider is about the scope of the classification into nominal, ordinal, interval, and ratio—let us call it NOIR, from the initials of the four adjectives—that is: what does it apply to, and therefore what does NOIR classify? At least two positions are possible. According to one, NOIR is about assignments of informational entities to objects: nominal, for example, is a feature of the way numerals (in Stevens’ lexicon, i.e., “names for classes”, 1946: p. 679) are assigned to individuals with respect to their blood type. This is how Stevens introduced it, thus considering, for example, blood type to be evaluated on a scale that is nominal. According to another position, NOIR is about the properties themselves: nominal, for example, is a feature of blood type. This is how, for example, the VIM uses it, thus considering blood type to be a nominal property. Hence, given a Basic Evaluation Equation such as

$$blood\,type\left[ {individual\,x} \right] = {\text{A}}\,{\text{in}}\,{\text{the}}\,{\text{ABO}}\,{\text{system}}$$

being nominal is considered to be eitherFootnote 36

  • a feature of the evaluation that produces the equation, according to the first position, or

  • a feature of the general property that is involved in the equation, according to the second position.

By interpreting them in a representational context, Michell presents these two positions as being about internal and external representations, respectively (Michell, 1999: pp. 165–166, from which the quotations that follow are taken—for consistency, everywhere the term “attribute” used by Michell has been substituted with “property”). According to Michell, an internal representation

occurs when the [property] represented, or its putative structure, is logically dependent upon the numerical assignments made, in the sense that had the numerical assignments not been made, then either the [property] would not exist or some component of its structure would be absent.

Thus, it is internal to the evaluation. An external representation is instead

one in which the structure of some [property] of the objects or events is identified independently of any numerical assignments and then, subsequently, numerical assignments are made to represent that [property]’s structure

where the adjective “external” is explained by Michell as the hypothesis that the property

exists externally to (or independently of) any numerical assignments, in the sense that even if the assignments were never made, the [property] would still be there and possess exactly the same structure.

Thus, it is external to the evaluation. In summary (Giordani & Mari, 2012: p. 446),

  • an internal representation is an evaluation that induces a structure, whereas

  • an external representation is an evaluation that preserves a structure.

The examples proposed by Michell are interesting, and useful for better understanding what is at stake with this distinction. He exemplified external representations (i.e., such that NOIR is a feature of properties) by means of hardness:

Minerals can be ordered according to whether or not they scratch one another when rubbed together. The relation, x scratches y, between minerals, is transitive and asymmetric and these [features] can be established prior to any numerical assignments being made.

The idea is then that once a property-related criterion of comparison has been identified (in this case, mutual scratching), the outcomes of property-related comparisons do not depend on the way they are represented: the conclusion would be that hardness is ordinal (or, more correctly, that hardness is at least ordinal). As the example suggests, this seems to be based on the assumption that, for an external representation to be possible, properties of objects must be empirically comparable according to some given conditions, and the outcome of the comparison must be observable, as in the paradigmatic case of mass via a two pan balance. This condition was embedded in the representational theories of measurement under the assumption that the availability of an empirical relational system is a precondition of measurement.

Michell proposes two examples of internal representations (i.e., such that NOIR is a feature of representations rather than properties). The first one is about

an extreme case [...] of assigning different numbers to each of a class of identical things (say, white marbles) and on that basis defining a [property]. The [property] represented by such assignments would not be logically independent of them and, so, had they not been made, the [property] would not exist.

This is indeed the extreme case of an assignment claimed to be a representation but that does not represent anything, being only a means of object identification: it is not even a property evaluation, given that there is no property to evaluate, in the specific sense that a Basic Evaluation Equation cannot be written because there is no general property of the considered objects to be evaluated.Footnote 37 We may then safely ignore this case, and consider the second, “less extreme” example,

where an independent [property] may exist, but the structure that it is taken to have depends upon numerical assignments made. For example, people may be assigned numbers according to nationality (say, Australian, 1; French, 2; American, 3; Belgian, 4; etc.) and then the [property] of nationality may be taken to have the ordinal structure of the numbers assigned. In this case, had numerical assignments not been made, the [property] (nationality) would still exist but the supposed ordinal structure would not.

This is a case in which Stevens’ framework proves to be non-trivially applicable. While it is always possible to adopt numbers as representational means, the numerical relations do not necessarily relate to empirical relations among the objectsFootnote 38: in this case, although it is representable by means of ordered entities, nationality is not itself ordinal.Footnote 39 These two examples show why we do not see the category of internal representations as relevant to measurement.

Hence, in our view, the evaluated property exists and has features that are independent of its possible representations: an evaluation is expected to preserve the structure of the property, not to induce a structure on the property.Footnote 40

Given the controversial nature of Stevens’ framework, it may be worth noting that this has nothing to do with setting constraints on ways of representation and of related data processing, such as, say, proscribing the computation of the mean value of a set of numbers that encode nationalities. Along the same lines as Lord (1953), Velleman and Wilkinson (1993) emphasized the importance of not imposing such constraints, given that “experience has shown in a wide range of situations that the application of proscribed statistics to data can yield results that are scientifically meaningful, useful in making decisions, and valuable as a basis for further research” (p. 68) (“proscribed statistics” are those statistics that are not “permissible” in the vocabulary of Stevens, 1946: p. 678). In fact, measurement science does include some “proscriptions”, such as the condition of dimensional analysis that only values of quantities of the same kind can be added. Nevertheless, the idea that through data analysis something can be discovered also about the structure of the evaluated properties is not problematic per se. The point is that if the property under consideration is evaluated based on (for example) purely ordinal comparisons (as in the case of hardness), the values that are obtained cannot be expected to convey more than ordinal information, exactly as prescribed by Stevens’ framework and its refinements (an example of which is mentioned in Footnote 32). In this view, what Stevens introduced as the set of “permissible” functions is better understood as a matter of algebraic invariance and meaningfulness under scale transformation (Narens, 2002), and therefore of uniqueness of the scale itself.

A summary can be presented simply as follows:

  • the representation of properties of objects, or the representation of objects as such, is an unconstrained process, and anything could in principle be used as a means of representation;

  • the evaluation of properties of objects is a process that is expected to produce values of the evaluated properties;

  • the measurement of properties of objects is a specific kind of evaluation.

From this point of view, we consider the emphasis on representation that has usually accompanied NOIR to be misleading: the position that assignments are representations that do not represent anything (Michell’s “internal representations”) is void, and the interesting question is instead whether NOIR is about

  • ways of evaluating properties, or

  • properties as such,

where in both cases the claim is that there is a property under consideration, having structural features which do not depend on whether or how it is represented. While Stevens, who was inclined toward operationalism, was candid about this alternative—“the type of scale achieved depends upon the character of the basic empirical operations performed” (Stevens, 1946: p. 677)—and consistently considered NOIR a feature of scales, we still have to explore this subject further.

6.5.2 From Values of Quantities to Values of Properties

We have assumed so far that the concept <evaluation> applies not only to quantities but also, and more generally, to properties. This has been the justification for adopting the same structure for the Basic Evaluation Equation for both quantitative cases, e.g.,

$$length\left[ {rod\,a} \right] = {1}.{2345}\,{\text{m}}$$

and

$$\begin{aligned} & reading\,comprehension\,ability\left[ {individual\,b} \right] \\ & = {\text{1}}.{\text{23}}\,{\text{logits }}({\text{on}}\,{\text{a}}\,{\text{specific}}\,{\text{RCA}}\,{\text{scale)}} \\ \end{aligned}$$

and non-quantitative cases, e.g.,

$$blood\,type\left[ {individual\,c} \right] = {\text{A}}\,{\text{in}}\,{\text{the}}\,{\text{ABO}}\,{\text{system}}$$

Hence in the generic structure

$${\text{property}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{property}}$$

A in the ABO system is an example of a value of a property, just as 1.2345 m is an example of a value of a quantity. While it is acknowledged (for example by the VIM) that quantities are specific types of properties, whether values of quantities can be generalized and thus applied to non-quantitative properties is a much less considered subject, as is the related issue of what a value of a property is. For example, the VIM defines <value of a quantity> (JCGM, 2012: 1.19), but does not define <value of a property> even though it deals with non-quantitative properties, termed “nominal properties” (JCGM, 2012: 1.30). Hence the problem for us to consider here is whether the Basic Evaluation Equation is meaningful only in the specific case of quantities, i.e., in the case

$${\text{quantity}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{quantity}}$$

or is also able to convey knowledge about non-quantitative properties.

Our construction of values of quantitative properties, presented in Sect. 6.3, relies on their empirical additivity, in the example of length, or on the invariance of their empirical difference, in the examples of temperature and reading comprehension ability, conditions which do not hold for non-quantitative properties. Our concept of shape, for example, is indeed such that the ideas of “adding objects by their shape” or “subtracting objects by their shape” are meaningless, in the sense that the shape of an object obtained by somehow composing two other objects is in general not additively related to the shapes of the composing objects (in fact, shapes are not even ordered: for example, it is meaningless to say that a cube is more or less than a cylinder). Hence, the interpretation of the Basic Evaluation Equation for quantities, according to the Q-notation, such that Q[a]/[Q] is a number (see Sect. 5.3.4) does not apply to non-quantitative properties: there are no “shape units” in this case, nor can shapes be compared by their ratio.

At a more fundamental level, however, the idea of conveying information on properties like blood types of individuals or shapes of objects by means of “values of blood type” and “values of shape” maintains its meaning and relevance, as when we say that a given rod is cubic or cylindrical, phraseological means for “whose shape is cube” and “whose shape is cylinder”, which immediately leads to a formalization such as shape[rod a] = cube and shape[rod a] = cylinder. Given their analogy in structure with length[rod a] = 1.2345 m, where 1.2345 m is a value of length, the conclusion seems to be that cube, cylinder, and so forth may be considered to be values of shape.

However, this is not completely correct, given an important difference between the two cases. Indeed, the value 1.2345 m includes, via the unit metre and the reported number of significant digits, information on the set of possible values from which 1.2345 m has been chosen, i.e., the set is the non-negative multiples of the metre: it is 1.2345 m and not 1.2346 m, and so on; it is 1.2345 m but we are unable to distinguish between 1.2345 m and 1.23451 m, and so on. Choosing a unit and a (finite) number of significant digits corresponds to introducing a classification on the set of the lengths, in which each value identifies one class. Hence, selecting a value of length conveys both the information that (i) the class of lengths identified by that value has been selected, and (ii) all other classes (identified by all other multiples of the unit) have not been selected. In a relation such as shape[rod a] = cube this second component is missing.Footnote 41

In order to improve the structural analogy between length[rod a] = 1.2345 m and shape[rod a] = cube, the set of the possible shapes, of which cube is one element, needs to be declared as part of the equation: it might be, e.g., {cube, any other shape} or {cube, cylinder, cone, sphere, any other shape}, thus showing that the report that rod a is cubic conveys different information in the two cases. We may call the set of the possible shapes a reference set, R, so that an example of the Basic Evaluation Equation in the case of a nominal property such as shape isFootnote 42

$$shape\left[ {rod\,a} \right] = {\text{cube}}\,{\text{in}}\,{\text{R}}$$

where cube in R is then the example of a value of a property. Indeed, the same structure may also be used for quantities, e.g.,

$$length\left[ {rod\,a} \right] = {1}.{2345}\,{\text{in}}\,{\text{metres}}$$

i.e., the value is 1.2345 in the classification of lengths generated by the metre and its multiples, but in this case additivity of length permits the more informative interpretation that the class identified as 1.2345 in that classification corresponds to a length which is 1.2345 times the metre.

Hence the concept <value> is not bound to quantities: non-quantitative properties also have values, and any such value is an individual property identified as an element of a given classification of comparable individual properties,Footnote 43 such that if the classification changes, and therefore a different reference set is used, another value may be obtained for the same property under evaluation. Under these conditions the previous considerations about values of quantities can be correctly generalized to values of properties: first, choosing a set of values for blood type or shape corresponds to introducing a classification on the set of the blood types or the shapes, in which each value identifies one class, and, second, Basic Evaluation Equations also apply to non-quantitative properties and, if true, they convey much richer information than just representation: they state that the property of an object and the value of a property are the same individual property.
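The point that a value carries its reference set with it can be rendered as a small data-structure sketch; the type and the names used here are illustrative inventions, not an established formalism:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PropertyValue:
    """A value of a property: an element of a declared reference set,
    i.e., of a classification of comparable individual properties."""
    element: str
    reference_set: frozenset

    def __post_init__(self):
        assert self.element in self.reference_set

R1 = frozenset({"cube", "any other shape"})
R2 = frozenset({"cube", "cylinder", "cone", "sphere", "any other shape"})

# "cube in R1" and "cube in R2" are different values: they select one class
# out of different classifications, and so convey different information.
assert PropertyValue("cube", R1) != PropertyValue("cube", R2)
```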

Box 6.2 Evaluation scales

In this context the question of what a scale is—we prefer to refer here to the more generic concept <evaluation scale> rather than to <measurement scale>, given that what follows is not only and specifically about measurement—can be straightforwardly discussed.

Let us consider the concrete case of Mohs’ scale of mineral hardness. The observation that any pair of minerals x and y can be put in interaction so that either one scratches the surface of the other or neither scratches the surface of the other is accounted for as a relation between their hardnesses H: H[x] > H[y], or H[x] ≈ H[y], or H[y] > H[x]. Ten equivalence classes, C1, C2, …, C10, were thus empirically identified, such that if H[x]∈Ci and H[y]∈Cj, and i > j, then H[x] > H[y]. However, dealing with equivalence classes for conveying information about properties of objects is inconvenient: rather, each class can be uniquely identified via a one-to-one mapping f from the set of the equivalence classes to a set of identifiers (for example, the set of natural numbers), where the condition of injectivity guarantees that the information about hardness-related distinguishability is not lost in the mapping. Furthermore, since mutual scratching induces an ordering on the set of the equivalence classes Ci, the mapping f may be defined so as to maintain such structural information, i.e., if H[x] > H[y] and H[x]∈Ci and H[y]∈Cj, then f(Ci) > f(Cj), where H[x] > H[y] is an empirical relation about properties (hardnesses, in this example) of objects and f(Ci) > f(Cj) is an informational relation about identifiers of equivalence classes. The two conditions of injectivity and structure preservation make f an isomorphism (surjectivity is an immaterial condition here): it is a scale, i.e., an isomorphism from equivalence classes of property-related indistinguishability to class identifiers.

For a given set of equivalence classes {Ci} established for a property P, the condition that the scale f be an isomorphism constrains the set of class identifiers, i.e., the range of f, up to isomorphism. In the example, the range of f is usually taken to be the set of the first 10 natural numbers, so that f(equivalence class of talc hardnesses):= 1, f(equivalence class of gypsum hardnesses):= 2, and so on, but any other ordered set of 10 elements could equally be chosen, say the sequence 〈a, b, …, j〉, where the mapping 1 → a, 2 → b, … is usually called a scale transformation, though a better term is scale-identifier transformation, given that the empirical component of the scale is left untouched and only the scale identifiers are (isomorphically) changed.

Note that the definition of a scale is a normative statement that establishes which equivalence class is identified by which identifier. As such, it is not an equation (and therefore not a Basic Evaluation Equation), and it is neither true nor false. In the example above, it is written then

$$\forall i \in \left\{ {1, 2, \ldots, 10} \right\},\; f\left( {C_{i} } \right) := i$$

(where the notation “x:= y” means that x is defined to be y, not that x and y are discovered to be equal). From this characterization it is simple to see why for the same general property one can construct scales that are distinct (in the specific sense of being non-isomorphic):

• the criterion that defines the equivalence classes Ci could be changed, so that a new set of equivalence classes implies a new mapping f; for example, in the case of hardness, refining the classes could lead to non-integer identifiers like 4.5 for steel;

• the structure that needs to be preserved by the mapping f could be changed, so that a new isomorphism has to be obtained that preserves an algebraically stronger or weaker structure; an example of the first case is the historical development of the measurement of temperature when the absolute zero was discovered, allowing measurement of temperatures in the thermodynamic (Kelvin) scale and not only in a thermometric (Celsius, Fahrenheit) scale; an example of the second case is the daily measurement of temperature whenever measured values are reported in a thermometric scale, thus discarding the available information about the “natural”, absolute zero.

From a scale f a function g can immediately be derived—“lifted”, in the algebraic jargon—mapping properties of objects belonging to equivalence classes to class identifiers: if P[x]∈C, then g(P[x]):= f(C) (for example, since f(equivalence class of talc hardnesses):= 1, then g(hardness of a given sample of talc) = 1, i.e., in the Mohs scale the hardness of any sample of talc is identified by 1, and so on).Footnote 44 Of course, properties of distinct objects may be indistinguishable from each other, i.e., may belong to the same equivalence class, and therefore may be associated with the same identifier via the function g. Hence, while f is one-to-one, g is many-to-one, and therefore a homomorphism, that may be called a scale-based representation of the properties of objects. In contrast with the statement f(Ci):= i considered above, a relation like g(P[x]) = i is typically not about constructing a scale but, after a scale f has been constructed, about using it, with the aim of identifying the property P[x] as an element of an equivalence class, the one identified by i. As such, it is an equation, either true or false, depending on whether P[x]∈Ci actually holds, given that f(Ci):= i.
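As an illustration, a minimal sketch of the Mohs construction; the dictionaries standing in for the empirically established classification are hypothetical:

```python
# f: equivalence classes (named here by a reference mineral) -> identifiers
classes = ["talc", "gypsum", "calcite", "fluorite", "apatite",
           "orthoclase", "quartz", "topaz", "corundum", "diamond"]
f = {c: i for i, c in enumerate(classes, start=1)}   # f(C_i) := i, one-to-one

# Hypothetical empirical classification: which class each sample's hardness
# falls in, as would be established by mutual-scratching comparisons
sample_class = {"sample 1": "talc", "sample 2": "quartz", "sample 3": "talc"}

def g(sample):
    """Lifted, many-to-one representation: g(P[x]) := f(C) when P[x] is in C."""
    return f[sample_class[sample]]

print(g("sample 1"), g("sample 3"))   # 1 1: indistinguishable hardnesses
print(g("sample 2"))                  # 7: the class of quartz hardnesses

# A scale-identifier transformation changes only the identifiers:
to_letter = dict(zip(range(1, 11), "abcdefghij"))
print(to_letter[g("sample 2")])       # 'g': same class, new identifier
```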

In summary, a scale is built under the assumption that some relations (at least indistinguishability, but possibly order and more) among comparable properties of objects are given, and is introduced to organize and present in a standard way the information about such relations. As a consequence, if these relations change, the scale could be changed in turn.

6.5.3 Property Evaluation Types

Given this broad characterization of what values of properties are, it is now clear that the four conditions that in Chap. 2 we have proposed as necessary for a process to be considered a measurement can also be fulfilled by the evaluation of a non-quantitative property: it may be a process that is empirical (Sect. 2.2.1) and designed on purpose (Sect. 2.2.2), whose input is a property of an object (Sect. 2.2.3), and that produces information in the form of values of that property (Sect. 2.2.4).

However, as previously noted, such conditions are not claimed to be also sufficient. In other words, since measurement is a property evaluation but not all property evaluations are measurements, the fact that conditions that are necessary for measurement apply to the evaluation of non-quantitative properties is still not sufficient to conclude that non-quantitative properties are measurable. While Chap. 7 is devoted to proposing our account of the structural conditions that characterize measurement, it is time now to come back to the issue of whether NOIR is about ways of evaluating properties or about properties as such.

The question of the scope of NOIR, as elaborated in Sect. 6.5.1, is in fact about the alternative between a more modest instrumentalist, epistemological position, which assumes that we can only characterize evaluations (and more specifically measurements) rather than properties as such, and a stronger realist, ontological position, according to which we can instead say something about properties themselves, plausibly also on the basis of what we learn in the process of evaluating them. Of course, the more modest position is also safer, and seems to be more consistent with falsificationism (Popper, 1959) and better able to take into account the fact that scientific revolutions (Kuhn, 1969) can annihilate bodies of knowledge that were deemed to be established: given the always revisable status of our hypotheses about the empirical world—as illustrated, for example, by the historically well-known cases of phlogiston and the caloric—wouldn’t it be wiser to renounce any ontological claim about the structure of properties as such?

Let us explore the issue in the light of the assumption of two conditions of information consistency for an evaluation (Giordani & Mari, 2012: p. 446):

(C1) for each relation among properties of objects there is a relation among values such that the evaluation preserves all property relations: this guarantees that the information empirically acquired is maintained by the evaluation;

(C2) only relations among values that correspond to relations among properties of objects are exploited while dealing with values: this guarantees that the information conveyed by values is actually about the evaluated properties.

In summary, values should convey all and only the information available on the evaluated properties. This is plausible, to say the least: given that values are what an evaluation produces, they should report everything that was produced, and that otherwise would be lost (C1), but nothing more than that, to prevent unjustified inferences (C2). These two conditions deserve some more consideration.

Condition (C1) seems obvious, particularly in the context of representational theories of measurement where it may be considered the premise of representation theorems.Footnote 45 For example, if the property of an object is compared with the property of another object and the former is observed to be greater than the latter, the value of the former should be greater than the value of the latter. However, the meaning of (C1) is based on the non-trivial acknowledgment that properties of objects may also be compared independently of their evaluation, and therefore that the comparison has features which are independent of the evaluation. The condition that the property of one object is greater than the property of another object might be in some sense observable, and in this case does not require such properties to be evaluated. This gives support to the position that NOIR is a feature not only of the ways in which properties are evaluated, but of properties as such, via what we know about the ways in which they can be compared.Footnote 46 In Michell’s words, “the existence of the empirical relations numerically represented must be logically independent of the numerical assignments made. That is, these empirical relations must be such that it is always possible (in principle, at least) to demonstrate their existence without first making numerical assignments” (Michell, 1999: p. 167). For sure, any such ontic claim may be updated, and in particular improved—for example when a metric is discovered to apply to what was previously considered to be a non-quantitative property—but this is just in agreement with the general understanding that empirical knowledge is always revisable.

Condition (C2) has more complex implications: how can we be sure that a relation among values does not correspond to a still-unobserved relation among properties of objects? The point here is not about accepting or refusing “proscriptions”, in the sense of Velleman and Wilkinson (1993) and as already discussed in Sect. 6.5.1, but about acknowledging that through evaluation some features of properties might be discovered. For example, historically, the idea that temperature can be evaluated on an interval scale was formulated as the result of its evaluation by means of thermometers, not via the comparison of temperatures of objects in terms of their distances/intervals. As documented by Chang (2004), a crucial problem was in the confirmation of the preliminary hypothesis that the evaluation is linear (in this case, that thermometers have a linear behavior in transducing temperatures to lengths), so that divisions in the scale of values (in this case, of length in the capillary) can be treated as evidence of correspondingly proportional divisions in the scale of properties of objects (in this case, of temperatures).Footnote 47 Such an inference is then justified on the basis of the structure of the evaluation, in the case of thermometers realized by the transduction effect of thermal expansion. And thus, for a property already known to be comparable in terms of order, appropriate conditions on the way the property is evaluated may help justify the hypothesis that distances/intervals, and therefore units (though without a “natural” zero), are also meaningful. Such a general characterization is not limited to physical properties: indeed, this can be understood as the rationale of simultaneous conjoint measurement (Luce & Tukey, 1964) and Rasch measurement (Rasch, 1960), as also discussed in Sect. 4.4.1Footnote 48: the fact that the evaluation fulfills given conditions leads one to infer that the evaluated property may have a structure richer than the observed one.

The attribution of an unobserved feature to a property is clearly an important and consequential move. While according to condition (C1) NOIR would be considered a feature of properties, known through their means of comparison, condition (C2) suggests a more cautious position, that NOIR is explicitly a feature of evaluations, and only in a derived and more hypothetical way a feature of evaluated properties. That is why we propose that NOIR are examples of Property Evaluation Types (Giordani & Mari, 2012). This is along the same lines as Stevens’ “types of scales of measurement”, but with the acknowledgment that such types are more generally features of evaluations, and not only of measurements. This position allows us to take into account the fact that the same property may be evaluated by means of evaluations of different types,Footnote 49 so that the usual property-related terms—“nominal property”, “ordinal property”, etc.—are meant as shorthands for something like “property that at the current state of knowledge is known to be evaluable on a nominal scale at best”, and so on. Even the very distinction between quantitative and non-quantitative properties has, then, this same quality: as the historical development of the measurement of temperature shows, a property that we can evaluate only in a non-quantitative way today might tomorrow also become evaluable quantitatively.

On this basis, we may finally devote some consideration to our most fundamental problem here: the conditions of existence of general properties.

6.6 About the Existence of General Properties

A basic commitment at the core of our perspective on measurement is that it is both an empirical and an informational process, aimed at producing information about the world, and more specifically, about properties of objects. A direct consequence of this view is that a property cannot be measured if it does not exist as part of the empirical world; that is, the empirical existence of a property is a necessary, though not sufficient, condition for its measurability (Mari et al., 2018). This statement may seem so obvious as to approach banality, but it has some less obvious features and consequences worthy of further exploration. In particular, one may ask: how can we know that a property exists? Stated alternatively, under what conditions is a claim about the existence of a property justified? And, more specifically, what does a claim of existence of a general property assume?

This section is dedicated to an analysis of this question, beginning with some conceptual house cleaning, related to the distinction between empirical properties and mathematical variables.

6.6.1 Properties and Variables

We have proposed that empirical properties are associated with modes of empirical interaction of objects with their environments. To help sharpen up this statement, let us consider the distinction between empirical properties and mathematical variables. An (existing) empirical property can, in principle, be modeled by a mathematical variable; indeed, this is one of the primary activities involved in a measurement process, as described in more detail in the following chapter.Footnote 50 However, it would be fallacious to conflate empirical properties and mathematical variables, or to assume that the presence of either implies the existence of the other: there can be empirical properties without corresponding mathematical models (for example, because we are unaware of the very existence of such properties; e.g., blood type prior to 1900), and there can be mathematical variables without corresponding empirical properties (for example, the variables in generic mathematical equations such as y = mx + b).

Although this distinction may seem obvious when presented in these terms, conventions in terminology and modes of discourse may sometimes obfuscate it, as when the term “variable” is used to refer both to an empirical property and a mathematical variable (which is common in the literature on “latent variable modeling”, for example; see McGrane & Maul, 2020), or when, as described in the GUM, “for economy of notation […] the same symbol is used for the [property] and for the random variable that represents the possible outcome of an observation of that [property]” (JCGM, 2008: 4.1.1).

As a consequence, it cannot be assumed out of hand that any given feature of a mathematical variable is shared by the empirical property that the variable claims to model. For example, some physical quantities are customarily and effectively modeled as real-valued functions—a precondition for modeling the dynamics of such quantities by means of differential equations—but assuming that all features of real numbers apply to the quantities they purport to model could, for example, lead to the conclusion that a given quantity is dense in the way that real numbers are, which in many cases is known to be false, as in the case of quantized quantities such as electrical charge. Analogously, properties are customarily and effectively modeled as continuous random variables for a variety of purposes, but, again, this does not guarantee that all features of continuous random variables hold true for the modeled properties (see also, e.g., Borsboom, 2006; McGrane & Maul, 2020), even for models that fit the data according to commonly-accepted criteria (see, e.g., Maraun, 1998, 2007; Maul, 2017; Michell, 2000, 2004).
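
The case of electric charge makes the point about density concrete: any observable charge q is an integer multiple of the elementary charge e,

$$q = n\,e, \qquad n \in \mathbb{Z}, \qquad e \approx 1.602 \times 10^{-19}\,\text{C}$$

so that between ne and (n + 1)e there is no further possible value of the quantity, whereas between any two distinct real numbers there are infinitely many others: the real-valued model has features that the modeled quantity lacks.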

With respect to the confusion between a knowable entity and what we know of it (i.e., the concept that we have of it), a particularly pernicious case is that of properties considered to be, in some sense, constructed, as was previously discussed in Sect. 4.5: one might infer from the fact that "concepts such as compassion and prejudice are […] created from […] the conceptions of all those who have ever used these terms" that they therefore "cannot be observed directly or indirectly, because they don't exist" (Babbie, 2013: p. 167). This fallaciously conflates the concepts we have of psychosocial properties such as compassion with the empirical referents of those concepts.Footnote 51 That is, if compassion, prejudice, and other psychosocial properties have ever been measured, what was measured was a property (of an individual or a group), rather than a concept of (or term for) that property.

Thus, whether a given property is defined in purely physical terms or not, the critical question is how we know that a property exists, and therefore that it meets at least the most basic criterion for measurability. What, in other words, justifies one’s belief in the existence of a property?

6.6.2 Justifications for the Existence of Properties

There are many ways in which a claim about the existence of an empirical property could be justified, but given the empirical nature of the properties in question, all such justifications must share some form of observational evidentiary basis. Here we propose what we take to be a minimal, pragmatic approach to the justification of the existence of properties, based on our ability to identify their modes of interaction with their environments.Footnote 52

A core aspect of the justification for a claim about the existence of a property is, simply, the observation that an object interacts with its environment in particular ways. The term “interaction” can itself be interpreted in a variety of ways, but in the context of measurement science (see in particular Sects. 2.3, 3.1 and 4.3.4), a starting point is the observation of what we have referred to as a transduction effect, i.e., an empirical process that produces variations of a (response, effect, output) property as effects of variations of one or more (stimulus, cause, input) properties.

One could argue that an even earlier starting point is simply the observation of variation in some empirical phenomenon (event, process, etc.). If we may help ourselves to the assumption that there are no uncaused events (at least not on a sufficiently broad conception of causality; for general discussions, see, e.g., Beebee et al., 2009; see also Markus & Borsboom, 2013), then from the observation of an event one may infer the existence of causal influences, though of course one may initially know little or nothing about the nature of these influences.Footnote 53 Progressively, through empirical interaction with relevant phenomena, we may arrive at a state of knowledge and technology such that a transduction effect can be dependably reproduced under specified conditions, which brings us back to the “starting point” referenced in the previous paragraph. Such a transduction effect may become the basis of a direct method of measurement (see Sect. 7.3): through the calibration of the transducer, the values of the output property (i.e., the instrument indication) are functionally related to values of the input property (i.e., the property under measurement). For example, temperatures can be measured by means of differences of expansion of mercury in a glass tube, and reading comprehension abilities can be measured by means of differences in patterns of responses to questions about a particular text. Such cases presuppose the observability of a property Y (e.g., shape, color, pattern of responses to test questions), whose differences are accounted for as being causally dependent on differences in the property under consideration P, via an inference of the kind Y = f(P), where f is the function that models the cause-effect relation: the property P is the cause of observed changes of Y, and therefore it exists.
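
As an idealized sketch of such a calibrated transduction (assuming, for illustration only, a linear expansion model), the length L of a mercury column relates to the temperature T, here playing the roles of Y and P respectively, as

$$L = f\left( T \right) = L_{0} \left( 1 + \alpha T \right), \qquad \text{so that} \qquad T = \frac{L - L_{0}}{L_{0}\,\alpha}$$

where L0 and α are parameters fixed by calibration: observing an indication L then licenses an inference to a value of T, on the hypothesis that the model f is adequate.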

All this said, if a property P is only known as the cause of observable effects in a single empirical situation (a particular experimental setup, for instance), that is, if there is only a single known transduction effect of which instances of P are the input, and the transduction itself is understood only at a black-box level, then knowledge of P is obviously highly limited. Such a situation might be associated with an operationalist perspective on measurement, and would thus inherit the limitations of that perspective (see Sect. 4.2.2); or it might simply mark a very early phase in the identification of an empirical property, setting the stage for investigations of the causal relevance of the property beyond this single transduction effect. Indeed, absent multiple, independent sources of knowledge about P, in particular about its role in networks of relationships with other phenomena (properties, outcomes, events, etc.), knowledge about P might be considered vacuous or trivial.

For example, a claim about the existence of hardness as a property of physical objects can be justified in a simple way by the observation that one object scratches another: hardness (P) is what causes (f) observable scratches (Y) to appear, given an appropriate experimental setup. Were this the only source of knowledge about hardness, the correct name for P would arguably be something like "the property that causes the effect Y", e.g., "the ability to produce scratches", rather than a label as semantically rich as "hardness". But, of course, this is not the only way in which hardness is known: even simple lived experience can corroborate our common sense about the ways in which objects made of different materials interact with one another, and this is further corroborated by alternative methods for measuring hardness, such as the observation of indentations under specified conditions. In other words, we have access to knowledge about the property of hardness also independently of that particular cause-effect relationship f, and this knowledge is consistent with what f models as the cause of Y. This shows that the procedure of checking which objects scratch which other objects does not define hardness, but instead may become a method for evaluating it.Footnote 54
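
To make the minimal, single-effect situation explicit (a sketch in the notation of this framework, not a standard formalization): were scratching the only known effect of hardness, comparisons of hardness would be exhausted by the scratch relation, i.e.,

$$H\left[ a \right] > H\left[ b \right] \quad \text{whenever } a \text{ scratches } b$$

so that H would be known, at best, on an ordinal scale, and only through this one transduction.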

Thus, as investigations reveal functional relations connecting P to multiple phenomena (properties, outcomes, events, etc.) whose existence can be assessed independently of such relations, P becomes part of a system of interrelated properties, sometimes called a nomic network.Footnote 55 The identification of such relations (referred to in the VIM as a “set of quantities [or more generally, properties] together with a set of noncontradictory equations relating those quantities”, JCGM, 2012: 1.3) is important not only because it expands the explanatory and predictive value of knowledge of P,Footnote 56 but also for two additional reasons specifically related to measurement. The first is that such knowledge may suggest alternative methods for directly measuring a given property: for example, temperatures could also be measured by means of differences of electric potential via the thermoelectric effect, and reading comprehension abilities could also be measured by observing how well an individual is able to carry out a set of instructions after having read a relevant text. This corresponds to the minimal example of a nomic network as shown in Fig. 6.9, in which the three properties P, Y, and Z are connected via the two functions Y = f(P) and Z = g(P).Footnote 57 The causal relationship between P and either Y or Z—or both—could be used as the basis for a direct measurement of P. This kind of relationship of Y and Z to P is referred to as reflective in the context of latent variable modeling (see, e.g., Edwards & Bagozzi, 2000).

Fig. 6.9 A simple nomic network laying the groundwork for the direct measurement of P through multiple means (where P → Y means that P is the cause of Y). [Diagram: arrows from P pointing to Y and to Z]
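
In the linear notation common in the latent variable modeling literature (an illustrative sketch under that literature's assumptions, not a claim about any particular property), the reflective structure of Fig. 6.9 might be written

$$Y = \lambda_{Y}\,P + \varepsilon_{Y}, \qquad Z = \lambda_{Z}\,P + \varepsilon_{Z}$$

where the loadings λY and λZ play the role of the functions f and g, and the error terms acknowledge that neither transduction is perfectly dependable; calibration of either relation could then ground a direct measurement of P.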

The second measurement-related reason for the importance of knowledge of such functional relations is that they may become the basis for indirect methods of measurement (see Sect. 7.2), in which the results of prior direct measurements are used as input properties for the computation of a value of the output property (i.e., the measurand), as, for example, when densities are measured by computing ratios of measured values of masses and volumes. Here the property P whose existence is questioned is a function of other properties, say Y and Z, whose existence is already accepted, as depicted in Fig. 6.10. This kind of relationship of Y and Z to P is referred to as formative in the context of latent variable modeling (again see Edwards & Bagozzi, 2000).

Fig. 6.10 A simple example of a nomic network laying the groundwork for the indirect measurement of P. [Diagram: arrows from Y and from Z pointing to P, with a further arrow from P pointing to unspecified other properties]
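
A worked instance of the density example, with illustrative numbers only: if prior direct measurements yield a mass m = 2.70 kg and a volume V = 1.00 × 10⁻³ m³ for a given body, the value of the measurand is computed as

$$\rho = \frac{m}{V} = \frac{2.70\,\text{kg}}{1.00 \times 10^{-3}\,\text{m}^{3}} = 2.70 \times 10^{3}\,\text{kg/m}^{3}$$

with no new transduction involving ρ itself: the output value is obtained by computation from the input measurement results.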

A clarification is in order on this matter: if a property is only known through a single function of other properties, in which case the functional relation P = f(Y, Z) would serve as the definition of a previously unknown entity P, there would be no basis for claiming that P is an independently existing empirical property; rather, what is calculated by f would simply be a variable that summarizes (some of) the available information about the properties Y and Z (as, again, is the case for hage, defined as the product of the height and age of a human being; Ellis, 1968: p. 31). Summaries can, of course, have substantial utility, but as per the previous discussion of the distinction between empirical properties and mathematical variables, mathematical creativity is in itself insufficient for the generation of new empirical properties. As before, it is the availability of independent sources of knowledge about the property in question that lends credence and importance to claims regarding its existence, as is the case with force: although F = ma may be considered a definition of force, there are in fact means of knowing force independently of (but consistent with) Newton's second law, for example via Coulomb's law, which connects force to quantity of electric charge.
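
The two independent routes to force mentioned here can be set side by side (in SI form, with k = 1/(4πε₀)):

$$F = m\,a \qquad \text{and} \qquad F = k\,\frac{q_{1}\,q_{2}}{r^{2}}$$

That the same quantity F appears in both relations, and that the two can be checked against each other experimentally, is what distinguishes force from a mere computational summary such as hage.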

In sum, our approach to the justification of claims about the existence of properties is consistent with the philosophical perspective sketched in Sect. 4.5, which we described as pragmatic realism or model-based realism. The approach is realist, insofar as it focuses on justification for claims regarding the existence of empirical properties, and by so doing helps clarify the distinction between empirical properties and mathematical variables, and more generally the interface between the empirical world and the informational world; this also helps set the stage for a clear distinction between measurement and computation, discussed further in the following chapter. The approach is pragmatic, insofar as the emphasis of the proposed criteria for evaluating our beliefs about the existence of properties is on the practical consequences of those beliefs; this is consistent with the familiar refrain of pragmatic philosophers that “a difference that makes no difference is no difference”, or, as put more specifically by Heil, “a property that made no difference to the causal powers of its possessors would, it seems, be a property the presence of which made no difference at all” (2003: p. 77). Finally, the approach is model-based, insofar as the role of models (of general properties, measurands, environments, and the measurement process) is given primacy: this is the topic to which the following chapter is devoted.