6.1 Introduction

We have proposed that properties of objects are associated with modes of empirical interaction of objects with their environments, with the acknowledgment that this interaction makes objects experimentally comparable with one another. Thus we ground our framework upon the assumption that through properties we account for the relational behavior of objects. As already mentioned, this does not mean that we consider a property to exist only if an interaction is observed: our position is simply that observed interactions among objects can be accounted for in terms of their properties. We also accept that the description of an interaction among objects in terms of given properties is always revisable: there must be properties there, but they are not necessarily as we describe them.

As presented in the previous chapter, the framework we are developing is grounded on individual properties, such as lengths and reading comprehension abilities, which we take to be universal entities (see Sect. 5.3.2) that can be instantiated by, or more generically identified as, properties of given objects, such as the lengths of given rods and the reading comprehension abilities of given individuals.Footnote 1 A basic relation of the framework is then

a property of an object identifies an individual property

or more shortly

a property of an object is an individual property

so that, for example, there is a length that a given rod has, i.e., the length of that rod (the property of an object) is that length (an individual property), and there is a reading comprehension ability that a given individual has, i.e., the reading comprehension ability of that individual (the property of an object) is that reading comprehension ability (an individual property).

Each property of an object identifies an individual property: individual properties can be handled mathematically, for example by checking which of two lengths is greater, whereas relations between properties of objects must be investigated empirically. Accordingly, when a property of an object appears in a formal relation, such as a mathematical equation or a logical inference, the actual reference is to the corresponding individual property. This applies in particular to the relation of indistinguishability of properties of objects: as already pointed out, the observation that two properties of objects, P[ai] and P[aj], are indistinguishable, P[ai] ≈ P[aj], is interpreted by assuming that P[ai] and P[aj] either identify the same individual property or identify distinct but empirically indistinguishable individual properties. Since in general it is not possible to ascertain which of these situations is true, the customary notation P[ai] = P[aj] is just a convenient shorthand, acceptable whenever the relation is assumed to be transitive (see Sect. 5.2.6 on the implications of this assumption).

As discussed in Sect. 2.2.3, comparable individual properties are said to be of the same kind (JCGM, 2012: 1.2). Kinds of properties are abstract entities that are reified by assuming the existence of corresponding general properties, so that the adjectives “long”, “heavy”, etc. are replaced by the nouns “length”, “weight”, etc., and a relation such as

$$long\left[ a \right] \, \approx long\left[ b \right]$$

as in Sect. 5.2.6, is more customarily written

$$length\left[ a \right] \, \approx length\left[ b \right]$$

Each individual property is then an instance of a general property, and two individual properties are comparable only if they are instances of the same general property. Again, the examples are obvious: any given length is an instance of length, any given reading comprehension ability is an instance of reading comprehension ability, and so on. A second relation of the framework is then

an individual property is an instance of a general property

These relations are depicted in Fig. 6.1.

Fig. 6.1 Relations between properties of objects, individual properties, and general properties

While such a conceptualization might appear redundant, it is not hard to show that:

  • properties of objects are not identical to individual properties: properties of objects are in fact features of objects and as such have a spatiotemporal location and can be (individual) measurands, and neither individual properties nor general properties share this feature; however, some features, such as being comparable with respect to a given relation, are characteristic of individual properties and are inherited by properties of objects; for example, we can say that the individual length \({\ell}_1\) is greater than the individual length \({\ell}_2\) if they are identified as the length of rod a and of rod b respectively and rod a has been empirically discovered to be longer than rod b;

  • individual properties are not identical to general properties: individual properties can be comparable with each other, and general properties do not share this feature; however, some features, such as being a quantitative or a qualitative property, being a physical property or a psychosocial property, etc., are characteristic of general properties and are inherited by their instances; for example, a given length is a physical quantity because length is a physical quantity.

This provides a pragmatic justification of the structure illustrated in Fig. 6.1.Footnote 2

On this basis, we defer to Sects. 6.5 and 6.6 a more specific analysis of general properties, in particular of their categorization into types, such as nominal, ordinal, and so forth. In the sections that follow, we continue to develop this framework grounded on individual properties, by introducing values of properties and first focusing on values of quantities.

6.2 Towards Values of Properties

A Basic Evaluation Equation, in its simplest version in which uncertainty is not taken into account, is

$${\text{property}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{property}}$$

When Norman Campbell famously stated that “the object of measurement is to enable the powerful weapon of mathematical analysis to be applied to the subject matter of science” (1920: p. 267), it is plausible that he was indeed referring to this kind of equation, and expressly to the specific case

$${\text{quantity}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{quantity}}$$

which, when written in the Q-notation (see Sect. 5.1), enables “the powerful weapon of mathematical analysis” by explicitly including numbers in the equation, e.g.,

$$L\left[ a \right] = 1.2345\,{\text{m}}$$

as multipliers of units (henceforth we write “L[a]” as a shorthand for “length[rod a]”). Analogous is the case of the modified notation

$$L_{\text{in metres}} \left[ a \right] = 1.2345$$

as in some formalizations, such as those adopted in representational theories of measurement (see, e.g., Krantz et al., 1971, and also Kyburg, 1984: p. 17). Through Basic Evaluation Equations, values of properties, and thus values of quantities in particular, are indeed the mathematical counterparts of empirical properties of objects. Values play the fundamental role of providing the information that is the reason for which measurement is performed: before measurement the measurand is known only as a property of an object; after measurement we also know a value for it. (Once again, references to uncertainty are important for a more complete presentation of measurement, but are not relevant here.) Once the relation is accepted as dependable, the value can be mathematically manipulated in place of experimentally operating on the property of the object. As a trivial example, if L[a] = 1.2345 m and L[b] = 2.3456 m then from the mathematical fact that 1.2345 m < 2.3456 m we can immediately infer that L[a] < L[b].
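This substitution of mathematical manipulation for empirical comparison can be illustrated with a minimal code sketch; the `Quantity` class below and its tolerance-free comparison are our own illustrative constructions, not part of the framework:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    """A value of a quantity: a numerical value paired with a unit."""
    numerical_value: float
    unit: str

    def __lt__(self, other):
        # Comparison is meaningful only within the same scale:
        # 1.2345 m and 2.3456 m are multiples of the same unit.
        if self.unit != other.unit:
            raise ValueError("cannot compare values on different scales")
        return self.numerical_value < other.numerical_value

# L[a] = 1.2345 m and L[b] = 2.3456 m
L_a = Quantity(1.2345, "m")
L_b = Quantity(2.3456, "m")

# From 1.2345 m < 2.3456 m we infer L[a] < L[b]
# without any further empirical comparison of the rods.
print(L_a < L_b)  # True
```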

An analysis of the nature and the role of values of properties is then a core component for the development of a measurement-related ontology and epistemology of properties. Let us start by considering the specific case of quantities and their values.Footnote 3 (Henceforth we occasionally use the short term “value”, rather than “value of a property” or “value of a quantity”, if this does not create ambiguity.)

6.2.1 Values of Properties: What They Are not

Values of properties have such a critical role that it is perhaps not surprising that there are multiple and even incompatible positions on what they are. According to two common stereotypes, they are expressions, or they are symbols. Let us start our analysis by showing that neither of these positions is correct. These stereotypes are usually related to quantitative properties rather than properties as such: hence, pars pro toto, we refer to quantities in the discussion that follows.

First, for example, according to the first edition of the VIM the value of a quantity is “the expression of a quantity in terms of a number and an appropriate unit of measurement” (ISO, 1984: 1.17, emphasis added). The first definition that the Oxford English Dictionary (OED) gives of <expression> is “things that people say, write or do in order to show their feelings, opinions and ideas”: thus, in general usage, according to the OED expressions (including mathematical expressions) are linguistic entities, i.e., in the sense of terminology, neither concepts nor objects (see Sect. 2.1). But it should be clear that here, in discussing measurement, values are not linguistic entities. Consider the difference, e.g., between the rod a, which is an object, and the five-character (space included) term “rod a”, which is a linguistic entity: the object has a weight, a color, etc., whereas the term does not. The term “rod a” refers to a given rod, but is not that rod. Analogously, values are communicated by means of terms but they are not terms.Footnote 4 And in fact the same value, e.g., 1.2345 m, can be expressed linguistically in multiple ways, e.g., “one point two … metres”, “1.2345 m”, and “1,2345 m” (for most non-English speaking people), showing that 1.2345 m and “1.2345 m” are different entities. Certainly, values must somehow be expressed by means of linguistic entities to be communicated, but they are not, in themselves, expressions.

Second, sometimes values are said to be symbols, or identifiers, which stand for or represent objects or quantities of objects.Footnote 5 Of course, values may well be used as such, but this does not solve the problem of what they are. Indeed, stating that x is a symbol of y does not say anything about what x is. In this sense, Napoleon can be a symbol of political power, and a sphere can be a symbol of perfection, but this does not change the fact that Napoleon was a human being and a sphere is a geometric object. “To be a symbol” is just convenient shorthand for “to be used as a symbol”. Hence values may be used as symbols to represent quantities of objects, but a definition of <value of a quantity> phrased as “symbol such that…” is ontologically vacuous.

6.2.2 Values of Properties Cannot Be Discarded in Contemporary Measurement

At this point we need to face the possible objection that values are not needed at all, and therefore our whole current problem can be dismissed as immaterial. At least two analogous arguments can be made in support of this position.

One argument is that most equations and the related explanations that appear in the literature on, for example, physics do not even mention units: while often introduced as relations among general quantities (e.g., F = ma), physical laws are also interpreted as equations that relate numerical values of such quantities, under the assumption that their units are consistently chosen in a system of units. Hence it would seem that, after a system of units has been chosen, values can be discarded: one need only report numbers instead of values (e.g., 1.2345 instead of 1.2345 m) to convey information about quantities of objects.

The second argument against values of properties starts from the supposition that measurement produces numbers rather than values. As mentioned above, this seems to be assumed in particular by representational theories of measurement (see, e.g., Krantz et al., 1971), which usually formalize measurement as a mapping from objects, or sometimes properties of objects (see also Sect. 5.2.5), to numbersFootnote 6 by maintaining the unit implicit in the mapping, thus re-writing, e.g., L[a] = 1.2345 m as Lin_metres[a] = 1.2345. This seems to be a reinterpretation of Russell’s well-known assertion that “Measurement of magnitudes is, in its most general sense, any method by which a unique and reciprocal correspondence is established between all or some of the magnitudes of a kind and all or some of the numbers, integral, rational, or real, as the case may be” (1903: p. 176). Indeed, the Q-notation (see Sect. 5.1)

$$Q\left[ a \right] = \left\{ {Q\left[ a \right]} \right\}\,\left[ Q \right]$$

is equivalent to

$$Q\left[ a \right]/\left[ Q \right] = \left\{ {Q\left[ a \right]} \right\}$$

where then L[a] / m is what Lin_metres[a] is actually meant to be. Since in this relation values of quantities seem to have disappeared, it might be concluded that they are only related to the way knowledge is represented and therefore that they can be avoided by an appropriate choice of the representation.
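For concreteness, the equivalence can be instantiated with the length example above (a restatement of relations already given, with the same numbers):

$$L\left[ a \right] = \left\{ {L\left[ a \right]} \right\}\,\left[ L \right] = 1.2345\,{\text{m}}\quad {\text{and}}\quad L\left[ a \right]/{\text{m}} = \left\{ {L\left[ a \right]} \right\} = 1.2345$$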

As we see them, both of these arguments are correct in their premises, but their conclusions are problematic: the fact that in specific cases values can actually be discarded, in favor of dealing with numbers only, is really just a sort of shorthand and does not imply that this is always the case. Rather, there are good reasons for the customary choice of writing the Basic Evaluation Equation in terms of values instead of numbers. The difference between values of quantities and numerical values is that only the former contain information on the metrological context: “1.2345 m” means <1.2345 in the context of the scale generated by the metre>. Reporting only a numerical value, such as 1.2345, loses the reference to such a context, which is crucial for guaranteeing the metrological traceability of measurement data.

Assertions such as Russell’s hide the issue by implicitly assuming that the metrological context is given and is entirely embedded in the definition of the general quantity under measurement, as if a “natural unit of length” were unproblematically available, allowing us to measure the “natural length” of any object by a number, interpreted as the multiple of such a “natural unit” and conveying the information of the traceability to such a unit. It is in fact as if measurement could always be, in its structure, the counting of “natural units”.

But unless and until such “natural units” for all relevant quantities are agreed upon and socially accepted,Footnote 7 it is convenient, and essential, for Basic Evaluation Equations, and measurement results, to contain information on their metrological context, as provided by values, which thus play a critical role in effective communication of the information acquired by means of measurement.

On this basis, let us continue our exploration of what values of quantities are.

6.3 Constructing Values of Quantities

While the concept <value of a property> might appear unusual (as an example, the VIM does not define it), values of quantities are widely used, uncontroversially recognized as multiples of units. Even those who are doubtful about the nature of values of quantities, as discussed above, accept that 1.2345 m and 2.34 kg are examples of them. In order to properly introduce values of properties in our framework, let us then start from values of quantities, by exploiting the familiar additive structure of quantities such as length. What follows is a construction by example, rather than a definition.

6.3.1 Operating on (Additive) Quantities of Objects

Let us consider two rods, r and r’, in the experimental situation depicted in Fig. 6.2.

Fig. 6.2 Constructing values of quantities: first step (quantity-related comparison)

This situation is usually described as

the rods r and r' have the same length

or

the length of r is the same as the length of r'

and therefore

$$L\left[ r \right] \, \approx L\left[ {r^{\prime}} \right]$$

thus highlighting, more explicitly than L[r] = L[r’], that this is an experimental relation and therefore such a sameness is operationally a length-related indistinguishability.Footnote 8

Moreover, let us then assume that, at least for objects such as rods, length is an empirically additive quantity,Footnote 9 so that there exists a length-related concatenation operation ⊕ (hence the symbol “⊕” is used to denote an operation that applies to lengths of objects, not numbers) and the situation depicted in Fig. 6.3 is described as

the length of a is indistinguishable from the length of the length-related concatenation of r and r'

Fig. 6.3 Constructing values of quantities: second step (quantity-related concatenation)

or

$$L\left[ a \right] \, \approx L\left[ r \right] \, \oplus L\left[ {r^{\prime}} \right]$$

Since L[r] ≈ L[r’], this relation can also be written as

$$L\left[ a \right] \, \approx L\left[ r \right] \, \oplus L\left[ r \right]$$

and therefore

$$L\left[ a \right] \approx 2\,L\left[ r \right]$$

for short, where more generally n L[r], for any integer n > 0, denotes the length of n concatenated copies of L[r].Footnote 10 This principle can then be extended also to non-integer relations between L[a] and L[r], by considering, together with iteration, L[a] ≈ n L[r], the inverse operation of partition (as the terms “iteration” and “partition” are used by Weyl, 1949: p. 30), such that L[r] is assumed to be constituted of n’ indistinguishable lengths L[c], so that L[r] ≈ n’ L[c]. By combining the two operations, a length is obtained as n/n’ L[r]. By varying the ratio n/n’ a set of lengths is thus obtained, and while the construction starts from the length of a given object, r, each entity n/n’ L[r] is a length constructed without an object that bears it: what sort of entities are they, then? While leaving this question open for the moment, let us point out that all these relations involve only quantities of objects, and are obtained by experimentally comparing objects.
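As a worked instance of the combined construction, with illustrative numbers of our choosing: if partition gives L[r] ≈ 3 L[c] and iteration gives L[a] ≈ 7 L[c], then

$$L\left[ a \right] \approx 7\,L\left[ c \right] \approx \frac{7}{3}\,L\left[ r \right]$$

so that L[a] is identified as the rational multiple n/n’ = 7/3 of L[r], still without any reference to values.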

Suppose now that the length L[r] is agreed to be taken as a reference quantity and given an identifier for convenience, say “\({\ell}_{\text{ref}}\)” (or, for example, “metre”). The reference length \({\ell}_{\text{ref}}\) is then defined as the length of the object r

$$\ell_{\text{ref}} := L\left[ r \right]$$

and r can be called a reference object. The indistinguishability relation L[a] ≈ 2 L[r] can then also be written as

$$L\left[ a \right] \approx 2\,\ell_{\text{ref}}$$

This shows that the following relations

$$L\left[ a \right] \approx L\left[ r \right] \oplus L\left[ {r^{\prime}} \right]$$
$$L\left[ a \right] \approx L\left[ r \right] \oplus L\left[ r \right]\quad \left( {\text{provided that }}L\left[ r \right] \approx L\left[ {r^{\prime}} \right] \right)$$
$$L\left[ a \right] \approx 2\,L\left[ r \right]\quad \left( {\text{a shorthand of the previous relation}} \right)$$
$$L\left[ a \right] \approx 2\,\ell_{{{\text{ref}}}} \quad \left( {\text{according to the definition of }}\ell_{{{\text{ref}}}} \right)$$

all refer to the same empirical situation and only differ in the way the information is conveyed: in terms of the distinction between senses and referents of expressions (as explained in Sect. 5.3.2), the senses of the involved expressions are different, but their referent is always the same. All these relations—including the last one—involve lengths, and the difference between the length L[r] ⊕ L[r’] and the length 2 \({\ell}_{\text{ref}}\) is only about how such lengths are identified.

The fact that this construction has been developed with no references to values is important. As Alfred Lodge noted (1888: p. 281):

the fundamental equations of mechanics and physics express relations among quantities, and are independent of the mode of measurement of such quantities; much as one may say that two lengths are equal without inquiring whether they are going to be measured in feet or metres; and indeed, even though one may be measured in feet and the other in metres. Such a case is, of course, very simple, but in following out the idea, and applying it to other equations, we are led to the consideration of product and quotients of concrete quantities, and it is evident that there should be some general method of interpreting such products and quotients in a reasonable and simple manner.

With this acknowledgment we may restart our construction.

A rod a can now be calibrated in terms of its length with respect to \({\ell}_{\text{ref}}\) by aligning the left ends of a and r and placing a mark on the rod a at the other end of the rod r. Additional marks can be placed on the rod a, using geometrical methods that implement the iteration and partition methods mentioned above, to denote multiples of \({\ell}_{\text{ref}}\), as depicted in Fig. 6.4.

Fig. 6.4 Constructing values of quantities: third step (quantity-related comparison with an object calibrated with respect to a reference quantity)

Common measuring instruments of length, such as metre sticks and tape measures, are constructed and then calibrated in this way: indeed, the rod a can be placed against other objects to establish where their lengths align with corresponding marks on the rod a itself. Hence, the rod a realizes a sequence of lengths. The length L[b] of an object b can now be compared with the lengths marked on the rod a and thus reported as a multiple x of \({\ell}_{\text{ref}}\),

$$L\left[ b \right] \, \approx x\,\ell_{{{\text{ref}}}}$$

where then x = n/n’, for given n and n’, as depicted in Fig. 6.5.

Fig. 6.5 The comparison of the length L[b] with the lengths marked on the rod a

6.3.2 On Reference Objects and Reference Quantities

Let us now focus on the indistinguishability relation

$$L\left[ b \right] \, \approx x\ell_{{{\text{ref}}}}$$

which holds for a given x = n/n’. The only difference between L[b] ≈ x \({\ell}_{\text{ref}}\) and L[b] ≈ x L[r] is related to the way in which the quantity in the right-hand side of the two relations is referenced, by an identifier of the quantity, \({\ell}_{\text{ref}}\), or by addressing an object, r, with respect to one of its properties, L. Since changing the way in which a quantity is referenced (“the length of the object we agreed to designate as r”, “L[r]”, “\({\ell}_{\text{ref}}\)”, or whatever else) does not change the quantity, one might conclude that this is just an arbitrary lexical choice. While in principle this is correct, there is a subtle point here related to the way we usually deal with identifiers: for the relation (identifier, identified entity) to be useful, it needs to hold in a stable way. This is why entities whose time-variance is acknowledged are identified by means of identifiers indexed by a time-related variable, as in the case of the length L[b, t] of the object b at the time t (see also Sect. 5.2.5). Conversely, if the identifier does not include a reference to time then the identification is supposed to remain valid only on the condition that the identified entity does not change over time. For example, the date of birth of a given person b can be identified as birthday[b], while her height at a given time t as height[b, t]: in this way we acknowledge that birthday is time invariant, whereas height is time variant.

In this sense, the definition \({\ell}_{\text{ref}}\):= L[r], where the identifier “\({\ell}_{\text{ref}}\)” is not indexed with time, assumes that the length L[r] is time invariant. Since quantities of objects are instead usually subject to variations, this is a strong assumption: of course, assigning a name to the quantity of an object does not make it stable.Footnote 11

The consequence of choosing the length of an object r as a reference length \({\ell}_{\text{ref}}\), thus under the condition of its stability, is that \({\ell}_{\text{ref}}\) can also be considered to be the length of any other sufficiently stable object having the same length as r. This allows the assessment of L[a] ≈ x \({\ell}_{\text{ref}}\) not only by means of L[a] ≈ x L[r] but also by means of L[a] ≈ x L[r’], for any sufficiently stable r’ in a class of objects such that L[r’] ≈ L[r]. Hence the choice of referring to a length through an identifier as “\({\ell}_{\text{ref}}\)” (for example “metre”—note: it is not “metre in a given time t”) assumes that the referenced length is both space and time invariant: according to the conceptual framework introduced in Sect. 6.1, it is an individual length, identified by L[r], L[r’], … but abstracted from any particular object.Footnote 12

6.3.3 Alternative Reference Quantities and Their Relations, i.e., Scale Transformations

As remarked, the only condition for having singled out r as a reference object is that its length is stable. Hence nothing precludes the independent choice of an alternative reference object, r*, whose length L[r*] is distinguishable from L[r] and defines a new reference length (for example the foot instead of the metre):

$$\ell_{\text{ref}^*} := L\left[ {r^*} \right]$$

A new rod a* can now be calibrated with respect to \({\ell}_{\text{ref*}}\), exactly as was done before for the rod a with respect to \({\ell}_{\text{ref}}\), so that the same object b could be compared in its length with both rod a and rod a*. Different relations of indistinguishability are then obtained, L[b] ≈ x \({\ell}_{\text{ref}}\) and L[b] ≈ x’ \({\ell}_{\text{ref*}}\), with x ≠ x’, as exemplified in Fig. 6.6.

Fig. 6.6 The comparison of the length L[b] with the lengths marked on the rods a and a*

The lengths marked in this way on rods a and a* can be compared, which is particularly interesting because such lengths are indexed by numbers, attributed according to the hypothesis of empirical additivity, such that the length 2 \({\ell}_{\text{ref}}\) is L[r] ⊕ L[r] and so on. Hence, the hypothesis that the lengths marked on two rods have been additively constructed can be experimentally validated, by finding the factor k such that \({\ell}_{\text{ref}}\)k \({\ell}_{\text{ref*}}\) (in the example in Fig. 6.6, k = 0.5) and then checking whether 2 \({\ell}_{\text{ref}}\) ≈ 2k \({\ell}_{\text{ref*}}\), 3 \({\ell}_{\text{ref}}\) ≈ 3k \({\ell}_{\text{ref*}}\), and so on. Such a systematic validation provides a justification for the specific hypothesis that the two lengths \({\ell}_{\text{ref}}\) and k \({\ell}_{\text{ref*}}\) are in fact equal, \({\ell}_{\text{ref}}\) = k \({\ell}_{\text{ref*}}\), and not just indistinguishable, and therefore that the scale transformation, from multiples of \({\ell}_{\text{ref}}\) to multiples of \({\ell}_{\text{ref*}}\) or vice versa, can be performed as a mathematical operation.Footnote 13
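The systematic validation just described can be sketched computationally; in the following minimal sketch the simulated lengths, the tolerance, and the function names are our own illustrative assumptions:

```python
# Validation of the additive construction of two scales:
# if l_ref ≈ K * l_ref_star, additivity requires that
# n * l_ref ≈ (n * K) * l_ref_star for every multiple n.

K = 0.5            # factor found by comparing the two reference lengths
TOLERANCE = 0.001  # idealized resolution of the empirical comparison

def in_ref_units(n):
    """Length of the mark n * l_ref, taking l_ref itself as yardstick."""
    return n * 1.0

def in_ref_star_units(m):
    """Length of the mark m * l_ref_star, in the same yardstick:
    since l_ref = K * l_ref_star, l_ref_star = l_ref / K."""
    return m * (1.0 / K)

def indistinguishable(x, y):
    return abs(x - y) <= TOLERANCE

# Check 2 l_ref ≈ 2K l_ref_star, 3 l_ref ≈ 3K l_ref_star, and so on.
for n in (2, 3, 4):
    assert indistinguishable(in_ref_units(n), in_ref_star_units(n * K))
```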

6.3.4 Generalizing the Strategy of Definition of Reference Quantities

The definition of reference quantities as quantities of objects (sometimes called “prototypes” or “artifacts” when they are physical objects) that are hypothesized to be stable is conceptually simple, and is typically the starting point of the development of a unit. For example, in 1889 the first General Conference of Weights and Measures (CGPM) asserted that “the Prototype of the metre chosen by the CIPM […] at the temperature of melting ice shall henceforth represent the metric unit of length.” (BIPM, 2019: Appendix 1), where the mentioned prototype of the metre was a specially manufactured metallic rod. But this strategy has some drawbacks that have become more and more apparent with the progressive globalization of measurement science and its applications:

  • First, both physical and non-physical objects at the anthropometric scale are usually not completely stable, with the consequence that, once the definition \({\ell}_{\text{ref}}\):= L[r] is given, for any object a if L[a] ≈ x \({\ell}_{\text{ref}}\) and L[r] changes due to the instability of r, then after that change L[a] ≈ x’ \({\ell}_{\text{ref}}\), with x’ ≠ x, even if L[a] did not change: the numerical representation of a quantity has changed even though the quantity itself did not;Footnote 14

  • Second, having a reference quantity defined as the quantity of one object implies that all traceability chains must start from that object, in the sense that all measuring instruments for that quantity must be directly or indirectly calibrated against that reference object: this is operationally inconvenient and may generate political struggles, given the power that the situation confers to the owner of the object.

Alternative strategies to define stable and accessible reference quantities may be, and in fact have been, envisaged to avoid or at least reduce these flaws. Such alternative strategies are particularly required where the quantities intended to be measured are properties of human beings, which, if the steps described above were to be followed, would imply that, in principle, reference objects should be certain individual humans, a situation that of course is not usually appropriate for several reasons.Footnote 15

Rather than selecting specific objects, a representative sample of objects—hence persons, in some cases—could be then selected, their property of interest somehow evaluated, and the reference quantity defined as a statistic (e.g., the mean or the median) of the obtained values. This makes the reference quantity depend on the selected sample, and therefore in principle it is not entirely stable if new samples are taken or if characteristics of the sampled population change. (In psychometrics, evaluations performed according to this strategy are called “norm-referenced”, to emphasize the normative role of the sample that defines the reference quantity; see Glaser, 1963.Footnote 16)

Another possible strategy for dealing with these issues may be based on the consideration that according to the best available theories there is a class of objects, R = {ri}, that when put in given conditions invariantly have the same quantity of interest, which in those conditions is then a constant.Footnote 17 Defining the reference quantity as a constant quantity characteristic of a class of objects guarantees both its stability and accessibility. And if the identified constant were too far from the anthropometric scale to be suitable, the reference quantity could be defined as an appropriate multiple or submultiple of the constant, so as to maintain a principle of continuity, such that different definitions could subsequently be adopted while ensuring that the defined reference quantity remains the same. For example, in 1960 the 11th General Conference of Weights and Measures redefined the metre as “the length equal to 1 650 763.73 wavelengths in vacuum of the radiation corresponding to the transition between the levels 2p10 and 5d5 of the krypton 86 atom” (BIPM, 2019: Appendix 1). The critical point of this definition is the assumption that the wavelength of the chosen radiation is constant, whereas the numerical value, 1 650 763.73, was only chosen for the purpose of guaranteeing that the metre remained the same length despite the change of its definition.Footnote 18

By exploiting the functional relations that are known to hold among quantities, a more sophisticated version of this strategy allows for the definition of a reference quantity as a function of constants of different kinds, and possibly of previously defined reference quantities. For example, according to Einstein’s theory of relativity the speed of light in vacuum is constant, and so the class of all light beams in vacuum is such that the length of their path in a given time interval is also constant. By exploiting the relation

$$\text{length} = \text{speed} \,\times \text{time duration}$$

among general quantities, the definition is then

$$\ell_{\text{ref}} := S\left[ \text{R} \right]\,\Delta T$$

where S[R] is the speed S of light in vacuum (R being then intended as the class of all light beams in vacuum) and ΔT is the chosen time interval. This is in fact how in 1983 the 17th General Conference of Weights and Measures defined the metre: “the length of the path travelled by light in vacuum during a time interval of 1/299 792 458 of a second” (BIPM, 2019: Appendix 1). Once again, the appropriate choice of the numerical value, 1/299 792 458, was the condition of validity of the principle of continuity.
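The principle of continuity can be checked by direct arithmetic: with the fixed numerical value of the speed of light, the defined reference length is

$$\ell_{\text{ref}} = S\left[ \text{R} \right]\,\Delta T = 299\,792\,458\;{\text{m}}\,{\text{s}}^{-1} \times \frac{1}{299\,792\,458}\;{\text{s}} = 1\;{\text{m}}$$

so that the metre as defined in 1983 is the same length as the previously defined metre.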

With the aim of emphasizing the role of the defining constant quantity S[R], this definition can be rephrased as

$${\text{the reference length }}\ell_{\text{ref}}{\text{ is such that }}S\left[ \text{R} \right] = \ell_{\text{ref}}\,\Delta T^{-1}$$

and this is in fact what became the definition of the metre in 2019 as a result of the 26th General Conference of Weights and Measures: “The metre (…) is defined by taking the fixed numerical value of the speed of light in vacuum c to be 299 792 458 when expressed in the unit m s–1, where the second is defined in terms of the caesium frequency ∆νCs.” (BIPM, 2019: 2.3.1).

Given the condition of the correctness of the theory that considers a quantity constant for a class of objects, this generalization produces three important benefits, by making the unit

  • independent of the conditions of stability of a single object,

  • more widely accessible (in principle, everyone with access to one object of the class can realize the definition of the unit, and therefore operate as the root of a traceability chain), and

  • definable in terms of quantities of kinds other than that of the unit, given the condition that all relevant quantities are related in a system of quantities.

This developmental path, where a unit is defined as

  1. the quantity of a given object (prototype-based definition), then

  2. a statistic of the quantities of a representative set of objects (norm-referenced definition), then

  3. the quantity considered to be constant for a class of objects (constant-based definition, as in the 1960 definition of the metre), then

  4. a quantity functionally related to the quantity/ies considered to be constant for a class of objects (functional constant-based definition, as in the 1983 definition of the metre), with the relation possibly stated in inverse form (inverse functional constant-based definition, as in the 2019 definition of the metre),

may be interpreted as a blueprint of the options for the definition of reference quantities: in fact, a lesson learned from history.

6.3.5 Values of Quantities: What They Are

Let us summarize the main features of the construction proposed in the previous sections. In the special case of an empirically additive general quantity Q, the quantities Q[ai] of objects ai can be concatenated so that the concatenation Q[ai] ⊕ Q[aj] can be empirically indistinguishable from a quantity Q[ak], that is, Q[ai] ⊕ Q[aj] ≈ Q[ak].Footnote 19 On this basis an object r having the quantity Q can be singled out with the conditions that it is sufficiently Q-stable and that Q-related copies of it are available. This allows for the identification of the individual quantity Q[r] not only as “Q[r]”—i.e., the quantity Q of the object r—but also through a time-independent identifier “qref” (“\({\ell}_{\text{ref}}\)” in the example above). This also allows for reporting of the information on a quantity Q[ai] in terms of its indistinguishability from a multiple x of qref, Q[ai] ≈ x qref. Furthermore, other such reference objects r* can be chosen, and the scale transformation qref = k qref* can be experimentally tested, for a given k that depends on qref and qref*.

While everything that has been done in this construction is related to quantities of objects, the conclusions apply to what are commonly acknowledged to be values of quantities, and in fact the indistinguishability

$$Q\left[ {a_{i} } \right] \approx x{\text{q}}_{{{\text{ref}}}}$$

can be interpreted as a Basic Evaluation Equation

$${\text{quantity}}\,{\text{of}}\,{\text{an}}\,{\text{object}} \approx {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{quantity}}$$

as well, as follows:

  • an individual quantity qref is singled out as a quantity unit (e.g., the metre); qref may be defined as the quantity Q[r] of an object r or, being defined in some other way as discussed in Sect. 6.3.4, may be realized by some object r; in either case r is a measurement standard, and possibly in particular the/a primary standard;

  • the individual quantities x qref (e.g., 2 m) are values of quantities, being, by construction, the multiples of qref obtained by means of the concatenation of the chosen unit;

  • working standards r’ can be calibrated against the primary standard r, Q[r’] ≈ Q[r] (ignoring calibration uncertainty), so that the quantity Q[a] of an object a can be compared with Q[r’]; hence the inference that leads, by transitivity,Footnote 20 from qref = Q[r], Q[r’] ≈ Q[r], and Q[a] ≈ x Q[r’] to Q[a] ≈ x qref is the simplest case of a metrological traceability chain (JCGM, 2012: 2.42);

  • the relation Q[a] ≈ x qref for a given x (e.g., L[a] ≈ 2 m) is a Basic Evaluation Equation, and thanks to this traceability it may be a measurement result for the measurand Q[a] (ignoring measurement uncertainty);

  • hence the relations

the quantity of a given object is indistinguishable from a multiple of the quantity of another object

(e.g., L[a] ≈ 2 L[r]) and

the quantity of a given object is a value of a quantity

(e.g., L[a] ≈ 2 \({\ell}_{\text{ref}}\) (or L[a] ≈ 2 m, which indeed is commonly read “the length of a is 2 metres”))

refer to the same empirical situation, the difference being in the way the two relations convey the information about the individual quantities involved.

The conclusion is then obvious: a value of a quantity is an individual quantity identified as a multiple of a given reference quantity, designated as the unit.Footnote 21
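The construction summarized in this section can be condensed into a minimal computational sketch; the data, the finite resolution, and the names below are our own illustrative assumptions, and all uncertainties are ignored, as in the text:

```python
# Simplest metrological traceability chain, ignoring uncertainties:
# q_ref := Q[r] (primary standard), Q[r'] ≈ Q[r] (working standard),
# Q[a] ≈ x * Q[r'], hence by transitivity Q[a] ≈ x * q_ref.

# "True" quantities, unknown to the measurer (arbitrary internal units).
TRUE_QUANTITY = {"r": 1.0, "r_prime": 1.0, "a": 2.0}

def ratio(obj, standard):
    """Empirical comparison: the multiple x such that
    Q[obj] ≈ x * Q[standard], at a finite resolution."""
    return round(TRUE_QUANTITY[obj] / TRUE_QUANTITY[standard], 2)

# Calibration of the working standard against the primary standard:
assert ratio("r_prime", "r") == 1.0

# Measurement of a against the working standard:
x = ratio("a", "r_prime")

# Basic Evaluation Equation: Q[a] ≈ x * q_ref, e.g., L[a] ≈ 2 m.
print(f"Q[a] = {x} q_ref")  # -> Q[a] = 2.0 q_ref
```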

The analysis in Sect. 5.3.2, which led us to interpret a relation such as Q[ai] ≈ Q[aj] as including expressions with different senses but possibly the same individual length as their referent, can be now straightforwardly extended to scale transformations and Basic Evaluation Equations:

  • in the scale transformation qref = k qref* the expressions “qref” and “k qref*” have different senses but the same individual length as referentFootnote 22;

  • in the Basic Evaluation Equation Q[a] ≈ x qref, for a given x, the expressions “Q[a]” and “x qref” have different senses but could have the same individual length as their referent.

The concept system about <quantity> can then be depicted as in Fig. 6.7.

Fig. 6.7 The concept system about <quantity> (top) and an example (bottom), as a specialization of Fig. 5.2

As discussed in Sect. 6.5.1, there is nothing arbitrary in the fact that an individual quantity q is identified as the quantity Q[a] of an object a. Once again, this shows that the Basic Evaluation Equation L[a] = 1.2345 m conveys the information that there is an individual length \({\ell}\) such that both the length L[a] of rod a and the value 1.2345 m are claimed to be instances of \({\ell}\). This allows us to propose a short discussion, in Box 6.1, about the delicate subject of true values of quantities.

Box 6.1 True values of quantities

Plausibly due to its explicit reference to truth, the idea that the quantity of an object may have a true value, which measurement aims at discovering, has caused controversies and confusion in measurement science for decades, if not centuries. It is a position that has appeared so deeply entangled with philosophical presuppositions about the actual existence of an observer-independent reality that a mutual understanding seemed impossible without preliminary agreement about an underlying ontology. Consider two authoritative, traditional examples of positions that jointly illustrate this controversy. According to Ernest Doebelin “when we measure some physical quantity with an instrument and obtain a numerical value, usually we are concerned with how close this value may be to the ‘true’ value” (1966). Vice versa, Churchill Eisenhart wrote about his “hope that the traditional term ‘true value’ will be discarded in measurement theory and practice, and replaced by some more appropriate term such as ‘target value’ that conveys the idea of being the value that one would like to obtain for the purpose in hand, without any implication that it is some sort of permanent constant preexisting and transcending any use that we may have for it.” (1962).

While acknowledging that references to truth are usually laden with philosophical presuppositions, our understanding of what values of properties are and of the epistemic role of the Basic Evaluation Equation allows us to propose a simple interpretation of <true value>.

As a preliminary note, it should be clear that truth or falsehood do not apply, properly, to values: asking whether, say, 1.2345 m is true or false is meaningless. Rather, the claim that 1.2345 m is a true value refers to its relation with a property of an object, say, the length of rod a, being thus just a shorthand for the assertion that the Basic Evaluation Equation

$$L\left[ a \right] = 1.2345\,{\text{m}}$$

is true. Of course, there can be several reasons that may make it hard or even impossible to achieve knowledge of the actual truth or falsehood of this equation, and the equation as such could be ill defined.

These issues may be left aside in a first, principled analysis about <true value>, which maintains in particular the distinction between the existence of true values and the possibility of our knowledge of them. Indeed, the VIM notes that true values are “in principle and in practice, unknowable” (JCGM, 2012: 2.11, Note 1), but this may be simply considered as an instance of the recognition that we cannot be definitively certain of anything related to empirical facts, like values of empirical properties of objects.

If true values are interpreted in the context of Basic Evaluation Equations, the conclusion does not seem to be particularly controversial: as already discussed, in Sect. 5.1.3 and elsewhere, asking whether 1.2345 m is the true value of the length of rod a is the question of whether there is an individual length that is at the same time the length of rod a and 1.2345 times the metre. Moreover, if values of properties and evaluation scales are considered as classification tools, as further discussed also in Box 6.2, the truth of a Basic Evaluation Equation is about the correctness of the classification of the property of the given object in the class identified by the given value. From this perspective, assessing the truth of a Basic Evaluation Equation is a sensible and useful task, basically free of philosophical preconditions: indeed, the idea that an entity—the property of an object in this case—can be objectively classified after a classification criterion has been set does not require one to accept any realist assumption about properties of objects or values of properties.

This interpretation can then be generalized, by refining our understanding of the two lengths involved in the Basic Evaluation Equation above and their relations. Let us suppose that our knowledge of the structure of rod a leads us to model it as having a unique length at the scale of millimetres, so that L[a] = 1.234 m would be either true or false as such, but not at a finer scale. Were we for some reason instead expected to report a measured value for L[a] at the scale of tenths of millimetres, as above, we should admit that the measurand has to be defined with a non-null definitional uncertainty, as exemplified in Box 2.3, thus acknowledging that in this case L[a] is not really one length, but several—plausibly an interval—of them. As a consequence, the Basic Evaluation Equation above, if true, should actually be meant as something like

$$L\left[ a \right] \ni 1.2345\,{\text{m}}$$

i.e., 1.2345 times the metre is one of the lengths that rod a has, as the measurand L[a] is defined. While this seems to be a peculiar conclusion, consider that values of quantities are sometimes conceived of as operating in mathematical models assuming the continuity (and the differentiability, etc.) of the involved functions, so that the Basic Evaluation Equation above is actually treated as if it were L[a] = 1.234500000... m. However, macroscopic objects cannot have a length classified in a scale that is infinitely specific, i.e., by values with infinitely many significant digits. Hence, sooner or later the interpretation of Basic Evaluation Equations as memberships, instead of equalities, may become appropriate, where the measurand is then defined with a non-null definitional uncertainty with respect to a given scale and a value is true if it is one of the lengths of the subset / interval admitted by the definitional uncertainty. This is a plausible account of what is behind the VIM’s statement that “there is not a single true quantity value but rather a set of true quantity values” (JCGM, 2012: 2.11, Note 1).

The previous construction, which has led us to reach a conclusion about what values of quantities are, explicitly relies on the additivity of length. In the next two sections we discuss how this conclusion generalizes to non-additive cases. We discuss here values of non-additive quantities, in particular those represented on interval scales, while reserving a discussion of the most general case of values of possibly non-quantitative properties to Sect. 6.5.2.

6.3.6 Beyond Additivity: The Example of Temperature

Let us first discuss the case of temperature, as characterized and then measured in thermometric (e.g., Celsius and Fahrenheit) scales. Unlike length, temperature is not an additive quantity: that is, we do not know how to combine bodies by temperature so that the temperature of the body obtained by combining two bodies at the same temperature is twice the temperature of each of the combined bodies. This could be what led Campbell to conclude that “the scale of temperature of the mercury in glass Centigrade thermometer is quite as arbitrary as that of the instrument with the random marks” (1920: p. 359), so that “the international scale of temperature is as arbitrary as Mohs’ scale of hardness” (p. 400). Were this correct, values of temperature, such as 23.4 °C, would be only identifiers for ordered classes of indistinguishable temperatures, as are values of Mohs’ hardness, so that we could assess that the temperature identified as 23.4 °C is higher than the temperature identified as 23.3 °C, but not that the difference between the temperatures identified as 23.4 °C and 23.3 °C is the same as the difference between the temperatures identified as 22.9 °C and 22.8 °C. Our question is then: what is a value of temperature?

The starting point is the same as in the case of length: we assume to be able to compare bodies by their temperature so as to assess whether two given bodies have indistinguishable temperatures (in analogy with the comparison depicted in Fig. 6.2) or whether one body has a greater temperature than the other.Footnote 23

On this basis, a (non-arbitrary) scale of temperature (and therefore values of temperature) can be constructed through an empirical procedure, though, admittedly, not as simply as the one for length. As in the case of length, all assumptions that follow relate to empirical properties of objects, and non-idealities in the comparisons of such properties are not taken into account.

Let us consider a sequence ai, i = 1, 2, …, of amounts of gas of the same substance, where the i-th amount has the known mass M[ai] = mi and is thermally homogeneous, at the unknown temperature Θ[ai] = θi.Footnote 24 Let us suppose that any two amounts of gas ai and aj can be combined into a single amount ai,j, such that mi,j = mi + mj. It is assumed that ai,j reaches thermal homogeneity and that its temperature θi,j is only a function of θi, mi, θj, and mj (but of course the non-additivity of temperature is such that θi,j ≠ θi + θj). Finally, let us suppose that the temperatures of any two amounts of gas can be empirically compared by equality and by order, i.e., whether θi = θj or θi < θj or θj < θi. The hypothesis that temperature is an intensive property (see Sect. 1.2.1) can be tested through some preliminary checks:

  • for any two amounts ai and aj, if θi = θj then θi,j = θi = θj, i.e., thermal homogeneity does not depend on mass;

  • for any two amounts ai and aj, if θi < θj then θi < θi,j < θj, i.e., thermal composition is internal independently of mass;

  • for any three amounts ai, aj, and ak, if θi < θj < θk and mj ≤ mk then θi,j < θi,k, i.e., thermal composition is monotonic for monotonically increasing mass;

  • for any three amounts ai, aj, and ak, if θi < θj < θk and mj > mk then all cases, θi,j < θi,k, θi,j = θi,k, and θi,j > θi,k, can happen.

The fact that these conditions hold may suggest the hypothesis that

$$\uptheta_{i,j} = \frac{{m_{i} \uptheta_{i} + m_{j} \uptheta_{j} }}{{m_{i} + m_{j} }}$$

i.e., temperatures compose by weighted average, where the weights are the masses of the composing amounts of gas. For testing this hypothesis, let us assume that three amounts of gas, ai, aj, and ak, are given such that their masses mi, mj, and mk are known and can be freely changed, and that θi < θj and θi < θk. Let us now suppose that aj and ak are independently composed with ai, and that mi, mj, and mk are chosen so that θi,j = θi,k, and therefore, under the hypothesis that temperatures compose by weighted average, that

$$\frac{{m_{i} \uptheta_{i} + m_{j} \uptheta_{j} }}{{m_{i} + m_{j} }} = \frac{{m_{i} \uptheta_{i} + m_{k} \uptheta_{k} }}{{m_{i} + m_{k} }}$$

What is obtained is a system with two degrees of freedom, in which one of the three unknown temperatures θi, θj, and θk is a function of the other two temperatures and of the three masses, i.e., θk = f(θi, θj, mi, mj, mk). Were values arbitrarily assigned to identify θi and θj (for example 0 °X and 1 °X for an X scale with values in degrees X), a value for θk could be computed. By choosing the two temperatures θi and θj and setting the corresponding values, and repeating the same process with different masses mi, mj, mk and a different temperature θk, other values of the X scale would be obtained, and the hypothesis of weighted average validated.
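Solving the equality above for θk gives θk = [θi,j (mi + mk) − mi θi] / mk. The following minimal sketch, with the masses and the two fixed values chosen by us purely for illustration, computes a new value on the X scale and checks its consistency:

```python
# Testing the weighted-average hypothesis: tune the masses so that
# composing a_i with a_j and composing a_i with a_k give the same
# temperature, then solve for theta_k (all numbers illustrative).

def mix(theta_1, m_1, theta_2, m_2):
    """Hypothesized composition: mass-weighted average of temperatures."""
    return (m_1 * theta_1 + m_2 * theta_2) / (m_1 + m_2)

def solve_theta_k(theta_i, theta_j, m_i, m_j, m_k):
    """The temperature theta_k that makes theta_ik equal to theta_ij."""
    theta_ij = mix(theta_i, m_i, theta_j, m_j)
    return (theta_ij * (m_i + m_k) - m_i * theta_i) / m_k

# Arbitrarily assigned values identifying theta_i and theta_j:
t_i, t_j = 0.0, 1.0  # 0 °X and 1 °X on the X scale

# With illustrative masses, a value for theta_k is computed:
t_k = solve_theta_k(t_i, t_j, m_i=1.0, m_j=2.0, m_k=1.0)

# Consistency check: composing a_i with a_k reproduces theta_ij.
assert abs(mix(t_i, 1.0, t_k, 1.0) - mix(t_i, 1.0, t_j, 2.0)) < 1e-12
print(t_k)  # 4/3, i.e., about 1.333 °X
```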

However, as discussed in Sect. 1.2.1, historically a key step forward was the discovery of thermal expansion, i.e., that some bodies change their volume when their temperature changes. In metrological terms, such bodies can be exploited as transducers of temperature (see Sect. 2.3). To make a long story short, the refined treatment of these bodies—in devices that we would consider today (uncalibrated) thermometers—corroborated the empirical hypotheses that, within given ranges of volumes of given bodies,

  • for a sufficiently large set {ai} of bodies the temperature Θ[ai] of each body in the set and its volume V[ai] are causally connected, as modeled by a function f, V[ai] = f(Θ[ai]),

  • such that changes in temperature of each body in the set produce changes in its volume,

  • and that, for each body in the set, differences in volume correspond to differences in temperature in such a way that equal differences of volume are produced by equal differences of temperature, i.e., if V = f(Θ) and v1−v2 = v3−v4 then it is because θ1−θ2 = θ3−θ4.Footnote 25

While this development so far involves only abstract individual properties—temperatures and volumesFootnote 26—possibly identified as properties of objects, on this basis the construction of a scale of temperatures, and therefore the introduction of values of temperature, is a relatively trivial task. According to the traditional procedure,

  • two distinct temperatures are identified, θ1 and θ2, each of them being the common, constant temperature of a class of objects, θ1 = Θ[R1] and θ2 = Θ[R2], in analogy with what was discussed in Sect. 6.3.4 about the speed of light; θ1 and θ2 could be the temperatures of the freezing point of water and the boiling point of water in appropriate conditions, respectively;

  • the scale built from θ1 and θ2 is given a name, say °C, and a number in the scale is conventionally assigned to each of θ1 and θ2, thus identifying them with values, for example 0 °C := θ1 and 100 °C := θ2;

  • according to the hypothesis that equal differences of volume are produced by equal differences of temperature, appropriate numbers in the scale are assigned to all other temperatures: for example, if f(θ3) = [f(θ1) + f(θ2)] / 2, then [0 °C + 100 °C] / 2 = 50 °C := θ3.
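A minimal sketch of this traditional procedure, with the two fixed-point volume readings invented purely for illustration:

```python
# Construction of a thermometric scale from two fixed points,
# assuming equal differences of volume correspond to equal
# differences of temperature (volume readings are illustrative).

V_FREEZING = 10.0  # volume f(theta_1) at the freezing point of water
V_BOILING = 11.0   # volume f(theta_2) at the boiling point of water

def celsius(volume):
    """Assign a value on the scale: 0 °C := theta_1, 100 °C := theta_2,
    and all other temperatures by linear interpolation on volume."""
    return 100.0 * (volume - V_FREEZING) / (V_BOILING - V_FREEZING)

# f(theta_3) halfway between the fixed points -> 50.0 °C
print(celsius(10.5))
```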

The conclusion is then that values of temperature are individual temperatures identified as elements in such a scale.

6.3.7 Beyond Additivity: The Example of Reading Comprehension Ability

Let us now discuss the case of reading comprehension ability (RCA), as characterized and then measured by reading tests. Like temperature and unlike length, RCA is not an additive quantity: that is, we do not know how to combine readers by RCA so that the RCA of a hypothetical “synthetic reader” is the sum of the RCAs of each of the combined readers. As above, our question is then: what is a value of RCA such as, say, 150 RCA units? The starting point is the same as in the case of length and temperature: we assume that we can compare readers by their RCA so as to assess whether two given readers have indistinguishable RCAs (in analogy with the comparison depicted in Fig. 6.2). For example, the two readers could be asked to discuss the contents of a text passage with a human judge, and the judge could then rate the readers’ relative RCAs. Now, unaided human judges may not have sufficient resolution to discriminate RCA beyond rough ordinal classes (e.g., very little comprehension, text comprehension, literal comprehension, inferential comprehension, etc.), so that one could, subject to the assumption that one used the same human judge, consider that RCA is at most an ordinal property. Apart from concerns that this may be assuming a weaker scale of RCA than possible, there are clearly serious issues of subjectivity at play in this situation: did the judge ask the same questions of the two readers, did the judge rate the responses to the questions “fairly”, and would a different human judge be consistent with this one?

A key step forward was the implementation of standardized reading tests (Kelly, 1916; see Sects. 1.2.2 and 3.3.1), where readers would (i) read a text passage and (ii) answer a fixed set of questions (called in this context “items”Footnote 27) about the contents of the passage; and then (iii) their answers would be judged as correct or incorrect, and (iv) readers would be given sum-scores (e.g., the total number of items that they answered correctly) on the reading comprehension test. Here readers who had the same sum-score would be indistinguishable with respect to their RCAs as measured by that test. Again, this would result in an ordinal scale (i.e., the readers who scored 0, the readers who scored 1, …, the readers who scored K, for a test composed of K items), though, depending on the number of items in the set, there would be a finer grain size than in the previous paragraph (i.e., as many levels as there are different sum-scores). This approach does address some of the subjectivity issues raised by the previous approach: the same questions are asked of each reader, and, with a suitable standardized mode of item response scoring, the variations due to different human judges can be reduced, if not eliminated altogether. However, what is not directly addressed are the issues of (a) the selection of text passages, and (b) the selection of questions about those passages. Suppose, however, that one was prepared to overlook these last two issues: one might convince oneself that the specific text passages and questions included in the test were acceptably suitable for all the applications that were envisaged for the reading comprehension test. In that case, one could adopt a norm-referenced approach to developing a scale (see Sect. 6.3.4), where the cumulative percentages of readers from a sample from a given reference-population (say, Grade 6 readers from X state in the year 20YZ) were used to establish a mapping from the RCA scores on the test to percentiles of the sample. This makes possible so-called equipercentile equating to the (similarly calculated) results of other reading comprehension tests.
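A minimal sketch of sum-scoring and the norm-referenced mapping to percentiles; the scored responses below are invented purely for illustration:

```python
# Sum-scores on a K-item reading test, and a norm-referenced mapping
# from sum-scores to percentiles of a reference sample (data invented).

# Scored responses (1 = correct, 0 = incorrect) of a reference sample.
responses = [
    [1, 1, 0, 1, 0],  # reader 1: sum-score 3
    [1, 0, 0, 0, 0],  # reader 2: sum-score 1
    [1, 1, 1, 1, 0],  # reader 3: sum-score 4
    [1, 1, 0, 0, 0],  # reader 4: sum-score 2
]

sum_scores = [sum(r) for r in responses]

def percentile(score, sample):
    """Cumulative percentage of the sample at or below this score."""
    return 100.0 * sum(s <= score for s in sample) / len(sample)

# Two readers with the same sum-score are indistinguishable
# with respect to their RCAs as measured by this test.
print(percentile(3, sum_scores))  # -> 75.0

# The same mapping computed for another test enables equipercentile
# equating: scores on the two tests are matched via equal percentiles.
```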

Thus, at this point in the account, the conclusion is that values of RCA are individual abilities identified as elements in an ordinal scale. It is interesting to note that the sum-scores which serve as the indexes for the ranks can also be thought of as frequencies: a sum-score s, out of K items in total, is a frequency, namely the number of items answered correctly. It can also be seen, as s/K, as a relative frequency (proportion) of the total number of items K, and, of course, relative frequencies are often interpreted as probabilities (though this move is not philosophically innocent; see, e.g., Holland, 1990, and Borsboom et al., 2003). That is, given this set of K items, the proportion of items that the reader did get correct can be read as an estimate of the average probability that the reader gets an item correct.

To see how this makes sense, one must backtrack to our conception of RCA, as follows.

  • We label as RC-micro an event (Wilson et al., 2019) involving a reader’s moment-by-moment understanding of a piece of text. This is related to Kintsch’s concept of the textbase in his Construction/Integration (CI) model (Kintsch, 2004), and refers to all component skills such as decoding (also known as word recognition); these typically proceed from a finer to a coarser lexical granularity, i.e., a reader builds meaning from text, starting from small units (letters, sounds, words, etc.) and moving to progressively larger units. Most competent readers are not even conscious of the events at the lowest levels of granularity, unless, of course, the reader comes across a word that she does not recognize and has to go back to sounding it out letter by letter (i.e., grapheme by grapheme). Thus, each of these reading comprehension events can be thought of as a micro-level event that is composed of a cascade of other more basic micro-level events, and is itself contained in other micro-level events.

  • In contrast, we label as RC-macro the events which integrate all the micro-level events that occur for a reader in the process of reading the text passage, and may integrate other conceptions beyond those, including thoughts about other texts and other ideas. This is related to the situation aspect of Kintsch’s CI model, which is integrated with the textbase to form a deeper understanding of the text, which is what will be stored in long-term memory. Here we might compare this to temperature, where the micro-events can be seen as the motions of individual molecules, each with properties of speed and direction in three-dimensional space (i.e., these would be seen as constituents of kinetic energy, a quantity different from temperature), which we cannot directly observe and in which we are usually not interested. In contrast, the macro-level property of temperature is the integration over all of these molecular motions inside a certain body, which is what we are indeed interested in.

  • This leads us to reading comprehension ability, which is the overall disposition of an individual reader to comprehend texts.

  • Then, when test developers construct an RCA test, they (a) sample text passages from a body of texts, (b) design an item-universe (i.e., the population of items:Footnote 28 Guttman, 1944) of questions (items) that challenge a reader’s RCA concerning parts of the text (including, of course, possibly whole texts, and across different texts) and take a sample from that universe, and (c) establish rules for deciding whether the answers to the sampled questions are correct or not, resulting in a vector of judgments of responses. This, then, is the transduction from RCA to a vector of scored responses to the items.

  • In the tradition of classical test theory, as described above, the items are viewed as being “interchangeable” in the sense of being randomly sampled from the item-universe, and hence the information in the vector can be summarized as the score s or, equivalently, as the relative frequency s/K with which the reader (on average) gets an item correct.

  • Alternatively, the indication could be seen as the vector of responses, thereby preserving the information about individual items (such as their difficulty) and allowing the probability of the reader getting each of the items correct to be modeled; this is the direction followed below.

In addition, generalizability must be considered: basing the measurement of RCA on a specific test is too limiting for practical application. This was recognized by Louis Thurstone, a historically important figure in psychological and educational measurement (1928: p. 547):

A measuring instrument must not be seriously affected in its measuring function by the object of measurement. To the extent that its measuring function is so affected, the validity of the instrument is impaired or limited. If a yardstick measured differently because of the fact that it was a rug, a picture, or a piece of paper that was being measured, then to that extent the trustworthiness of that yardstick as a measuring device would be impaired. Within the range of objects for which the measuring instrument is intended, its function must be independent of the object of measurement.

To contextualize this, suppose that the RCA of readers is to be assessed using a set of items designed for reading comprehension which can be scored, as above, only as correct or incorrect. We must then ask: what is required so that the comparison of two readers m and n (in terms of their RCA) is independent of the difficulty of the items that are used to elicit evidence of their relative RCAs?

Furthermore, assume that the test is composed of a set I of items. Now, two readers m and n can be observed to differ only when they answer an item differently. For any such pair of readers, m and n, there will be a set of items for which they are both correct, call it Ic, and a set for which they are both incorrect, Ii. Then the set of items on which they differ will be Id, which is I with Ic and Ii removed—and suppose that the number of items in Id is D. Suppose further that the number of items that reader m gets correct in the reduced set Id is sm, and define sn similarly. Then sm + sn = D, and sm/D is the relative frequency of m answering an item correctly while n simultaneously answers it incorrectly; likewise, sn/D is the relative frequency of n answering an item correctly while m simultaneously answers it incorrectly. Thus the RCAs of m and n (in terms of their success rates) can be compared by comparing sm/D with sn/D. By interpreting these relative frequencies as probabilities, they become P(m=correct, n=incorrect) and P(m=incorrect, n=correct), and they can be compared using their ratio

$$\frac{{P\left( {m = {\text{correct}},\,n = {\text{incorrect}}} \right)}}{{P\left( {m = {\text{incorrect}},\,n = {\text{correct}}} \right)}}$$

Now, suppose that Pmi is the probability that person m responds correctly to item i (and analogously Pni for person n), so that this expression can be written somewhat more compactly

$$\frac{{P_{mi} \left( {1 - P_{ni} } \right)}}{{\left( {1 - P_{mi} } \right)P_{ni} }}$$

under the assumption of local independence, and with the observation that, since there are only two possible responses, their probabilities must sum to 1.0.
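To make the comparison concrete, the following minimal sketch computes the sample analogue of this ratio from two hypothetical scored response vectors; the data and the function name are ours, for illustration only.

```python
import numpy as np

def differing_items_ratio(resp_m, resp_n):
    """Compare two readers only on the set I_d of items they answer
    differently; returns (s_m/D) / (s_n/D), i.e., s_m / s_n."""
    resp_m, resp_n = np.asarray(resp_m), np.asarray(resp_n)
    differ = resp_m != resp_n                 # I_d: items answered differently
    D = int(differ.sum())
    s_m = int((resp_m[differ] == 1).sum())    # m correct, n incorrect
    s_n = D - s_m                             # n correct, m incorrect
    return s_m / s_n

# Hypothetical scored response vectors of readers m and n over 8 items
m = [1, 1, 1, 0, 1, 0, 1, 1]
n = [1, 0, 1, 0, 0, 1, 0, 1]
print(differing_items_ratio(m, n))  # 3 "wins" for m, 1 for n -> 3.0
```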

Returning now to Thurstone’s admonition, this can be translated in this context to the requirement that the equation

$$\frac{{P_{mi} \left( {1 - P_{ni} } \right)}}{{\left( {1 - P_{mi} } \right)P_{ni} }} = \frac{{P_{mj} \left( {1 - P_{nj} } \right)}}{{\left( {1 - P_{mj} } \right)P_{nj} }}$$
(6.1)

should hold for any choice of items i and j. It takes just a few lines of algebra to show that

$$P_{ni} = \frac{{\exp \left( {\theta_{n} - \delta_{i} } \right)}}{{1 + \exp \left( {\theta_{n} - \delta_{i} } \right)}}$$
(6.2)

where θn is reader n’s RCA, and δi is item i’s reading difficulty. In fact, with the probability function in Eq. (6.2), both sides of Eq. (6.1) reduce to exp(θm − θn); that is, the item difficulties, δi and δj, are no longer present, which confirms that the comparison does not depend on the specific items used for it, as Thurstone demanded. Note that the RCAs and item difficulties are on an interval scale (by construction). Of course, in order for the item difficulties to be eliminated from the equation, the item difficulties and the RCAs must conform to this probability model in the sense of statistical fit, and hence this is an empirical matter that must be examined for each instrument and human population. The surprising finding about the function in Eq. (6.2) is that, under quite mild conditions, it is the only such function involving these parameters; this result is due to Georg Rasch, hence the function is called the “Rasch” model (Rasch, 1960/1980).
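This item-independence is easy to check numerically. In the following sketch the reader abilities and item difficulties are hypothetical values in logits; under Eq. (6.2) the ratio of Eq. (6.1) comes out the same for every item, namely exp(θm − θn).

```python
import math

def p_correct(theta, delta):
    """Rasch model, Eq. (6.2): probability of a correct response."""
    return math.exp(theta - delta) / (1 + math.exp(theta - delta))

def comparison_ratio(theta_m, theta_n, delta):
    """The ratio P_mi(1 - P_ni) / ((1 - P_mi) P_ni) for one item."""
    p_m = p_correct(theta_m, delta)
    p_n = p_correct(theta_n, delta)
    return (p_m * (1 - p_n)) / ((1 - p_m) * p_n)

theta_m, theta_n = 1.3, 0.4            # hypothetical reader RCAs (logits)
for delta in (-1.0, 0.0, 2.5):         # three items of unequal difficulty
    print(round(comparison_ratio(theta_m, theta_n, delta), 6))
# each line prints 2.459603, i.e., exp(1.3 - 0.4): the item difficulties cancel
```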

The actual numbers obtained for θn and δi are termed “logits” (i.e., log of the odds, or log-odds units),Footnote 29 and are typically used to generate property values in a way similar to what is done for temperature units: the logits are on an interval scale, so what is needed are two fixed and socially-accessible points. One standard way is to assign two relatively extreme values: for example, one might decide that, for a given population of readers, say State X for year 20YZ, the 100.0 point would be the mean of the logits for readers in Grade 1, while a higher value, say 500.0, would be chosen as the mean for students in Grade 12: this would be suitable, for example, for a reading test used in a longitudinal context (for an expanded discussion, see, e.g., Briggs, 2019).

A second, similar way, perhaps more suitable for test applications focused on particular grades, would be to allocate a value, say 500, as the mean for readers in Grade 6, and 100 as the standard deviation for the same students. Some applications also use the raw logits from their analyses—this effectively embeds the interpretation of the units in a given sample, which may be acceptable in some research and development situations, but would be difficult to justify in a broadly-used application. There is also a more traditional set of practices that (a) use an ordinal approach to classify readers into a sequence of reading performance categories, and (b) adopt a “norm-referenced” approach that carries out techniques similar to those just described, but using the raw scores from the reading tests rather than estimates from a psychometric model.
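Both anchoring conventions are linear transformations of the logits, which is legitimate on an interval scale. A minimal sketch, with hypothetical calibration values:

```python
import statistics

def anchor_two_points(logits, mean_low, mean_high, low=100.0, high=500.0):
    """Fix the scale at two points: mean_low -> low, mean_high -> high
    (e.g., Grade 1 mean logit -> 100.0, Grade 12 mean logit -> 500.0)."""
    slope = (high - low) / (mean_high - mean_low)
    return [low + slope * (x - mean_low) for x in logits]

def anchor_mean_sd(logits, target_mean=500.0, target_sd=100.0):
    """Fix the scale by a reference group's mean and standard deviation
    (e.g., Grade 6 readers -> mean 500, SD 100)."""
    mu, sd = statistics.mean(logits), statistics.pstdev(logits)
    return [target_mean + target_sd * (x - mu) / sd for x in logits]

# Hypothetical calibration: Grade 1 mean = -2.0 logits, Grade 12 mean = +3.0
print(anchor_two_points([-2.0, 0.5, 3.0], mean_low=-2.0, mean_high=3.0))
# [100.0, 300.0, 500.0]
```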

The conclusion is then that values of RCA are individual abilities identified as elements in a log-odds (interval) scale based on ratios of probabilities (see Freund, 2019, for a discussion of these types of scales).

6.4 The Epistemic Role of Basic Evaluation Equations

The conclusion reached in the previous section has an important implication for an ontology of quantities, and of properties more generally, as developed further in the following section. A Basic Evaluation Equation such as Q[a] ≈ x qref reports not just an attribution or a representation, but the claim of an indistinguishability, and in the form Q[a] = x qref the claim of an equality, of individual quantities: if it is true, it informs us that two individual quantities that were known according to different criteria—as the property of an object and the multiple of a unit respectively—are in fact one and the same. In detail:

  • before the relation is evaluated, we are able to identify an individual quantity q as a quantity of an object a, Q[a], and a set of individual quantities qx, each as a value x qref, for a given qref and a number x varying in a given set; q and each qx are quantities of the same kind;

  • as a result of the evaluation, a number x is found such that the hypothesis is supported that the individual quantity q and the individual quantity qx, that were known according to different criteria, are in fact one and the same, i.e., that Q[a] and x qref are identifiers of the same individual quantity.

As a consequence, Basic Evaluation Equations are ontologically irrelevant: if they are true, they simply instantiate the tautology that an individual quantity is equal to itself.Footnote 30 But, of course, their widespread use is justified by their epistemic significance: if they are true, they inform us that two individual quantities that were known according to different criteria are in fact one and the same.

This gives us the tool to interpret a common way of presenting measurement: “the input to the measurement system is the true value of the variable [and] the system output is the measured value of the variable [, so that] in an ideal measurement system, the measured value would be equal to the true value” (Bentley, 2005: p. 3), with the consequence that “in a deterministic ideal case [it] results in an identity function” (Rossi, 2006: p. 40). Let us concede that in the deterministic ideal case the Basic Evaluation Equation is not just an indistinguishability but an equality, Q[a] = x qref. Nevertheless, the position exemplified in these quotes confuses the ontological and epistemic layers: for those who already know that Q[a] and x qref are the same quantity, the relation is an ontological identity, as the evening star = the morning star is for current astronomers (see Sect. 5.3.2). And in fact those who already know that Q[a] and x qref are the same quantity would have no reason for measuring Q[a]. But measurement is aimed at acquiring information on a measurand, not at identically transferring values of quantities through measuring chains, as is implicitly supposed in the quotations above.

Indeed, the idea of deterministic ideal measurement as an identity function becomes understandable if it is applied not to measurement, but to transmission systems, as mentioned in Sect. 4.2.1. It is indeed correct to model a transmission system in such a way that the input to the transmission system is the value of the variable and the system output is the transmitted value of the variable, so that “in an ideal transmission system, the transmitted value would be equal to the input value” (by paraphrasing Rossi’s quote above). If values were in fact empirical features of phenomena, measuring instruments could be interpreted as special transmission channels, aimed at transferring values in such a way that the transmission is performed without errors and therefore as an identity function. But a basic difference between transmission and measurement is manifest: the input to a transmission system is a value, explicitly provided by an agent who or which operates on purpose by encoding the value into the quantity of an object and sending the quantity through a channel, the purpose of which is to faithfully transfer this input. In this case, the value transmitted along the channel by the agent via the encoded quantity is in principle perfectly knowable. No such agent exists in the case of measurement, which requires a radically different description, in which values of quantities are the output, and not the input, of the process.Footnote 31

6.5 Generalizing the Framework to Non-quantitative Properties

The ontological and epistemological analysis proposed so far has been focused on quantities, although, as we have exemplified, much can also be done with non-additive quantities. Consistently with the VIM, we have assumed that quantities are specific kinds of properties (JCGM, 2012: 1.1), and therefore we need to work on the relation between quantities and properties in order to explore whether and how the ontology and epistemology introduced so far can be applied to properties in general. Concretely, the issue is whether Basic Evaluation Equations can involve non-quantitative properties and, if so, what the key differences are between quantitative and non-quantitative Basic Evaluation Equations.

According to a standard view in philosophy of science, developed in particular within the neopositivist tradition by Rudolf Carnap (1966) and Carl Gustav Hempel (1952), “the concepts of science, as well as those of everyday life, may be conveniently divided into three main groups: classificatory, comparative, and quantitative.” (Carnap, 1966: p. 51). The VIM at least implicitly assumed this classification and adapted it to properties, defined to be either quantities or nominal properties, where the former are defined to be either quantities with unit (peculiarly, <quantity with unit> is not explicitly defined, nor given a term) or ordinal quantities. Hence according to the VIM the basic distinction is between being quantitative and non-quantitative (Dybkaer, 2013), where the demarcation criterion is <to have magnitude>: quantities are properties that have magnitude (including ordinal quantities, then), and nominal properties are properties that have no magnitude. This concept system is depicted in Fig. 6.8.

Fig. 6.8 A traditional classification of concepts (left), and its implementation in the VIM (right)

The VIM does not define what a magnitude is, but a plausible hypothesis is that “magnitude” can be generically interpreted there as “amount” or “size”, so that for example the height of human beings is a quantity because it is a property that we have in amounts. This stands in contrast with properties such as blood type, which is only classificatory because we do not have “amounts of blood type”. Accordingly, the phrase “the magnitude of the height of a given human being” refers to “the amount of height” of that human being: it is then, plausibly, an individual length (for a more detailed analysis of the elusive concept <to have magnitude>, see Giordani & Mari, 2012).

The simplicity of the VIM’s account is attractive, but the distinction between quantitative and non-quantitative properties deserves some more analysis here, also due to its traditional connection with the discussion about the conditions of measurability, as mentioned in Sect. 3.4.2.Footnote 32

As discussed in Chap. 4, several interpretations of measurement have been proposed, but a long tradition rooted in Euclid has connected the measurability of a property with its being quantitative. The VIM keeps to this tradition in stating that “measurement does not apply to nominal properties” (JCGM, 2012: 2.1 Note 1). As also discussed in Sect. 1.1.1 and at more length in Sect. 4.2.3, tensions related to this issue helped motivate the formation of a committee of physicists and psychologists appointed by the British Association for the Advancement of Science and charged with evaluating the possibility of providing quantitative estimates of sensory events (see the discussion in Rossi, 2007; a more detailed analysis is in Michell, 1999: ch. 6). We might envision two distinct but complementary paths toward resolution of these tensions:

  • one is about the possibility of providing meaningful quantitative information for properties which are not directly or indirectly evaluated (or evaluable) by means of additive operationsFootnote 33;

  • the other is about the appropriateness of broadening the scope of measurement so as to include the evaluation of non-quantitative properties.

From the beginning, both of these paths have been biased by the prestige of measurement, as witnessed by the key role attributed by some to Otto Hölder’s (1901) paper, a mathematical work whose title has been translated into English as “The axioms of quantity and the theory of measurement”, and about which Michell asserted that “we now know precisely why some attributes are measurable and some not: what makes the difference is possession of quantitative structure” (p. 59). In the same vein Jan De Boer claimed that “Helmholtz and the mathematician Hölder are usually seen as the initiators of the axiomatic treatment of what is often called the theory of measurement” (1995: p. 407). But even just a glance at the scope of Hölder’s axioms shows that they do not relate to any experimental process, as would be expected from Helmholtz’s own words—“the most fruitful, most certain, and most exact of all known scientific methods” (1887: p. 1)—and confirmed by the way the VIM defines <measurement>: a “process of experimentally obtaining one or more quantity values that can reasonably be attributed to a quantity” (JCGM, 2012: 2.1). Indeed, Hölder himself admitted that “by ‘axioms of arithmetic’ has been meant what I prefer to call ‘axioms of quantity’” (p. 237), so that measurement is involved in them only insofar as “the theory of the measurement” is equated to “the modern theory of proportion” (p. 241), thus confirming the purely mathematical nature of the treatment.

In Sect. 3.4.2 we argued that this superposition of the conditions of being measurable and being quantitative derives from a confusion between <measurement>, an empirical concept, and <measure>, a mathematical concept. The conclusion is simple to state: what is to be found in Euclid’s Elements and what Hölder considered “the modern theory of proportion” is not a theory of measurement but a theory of measure, where measures are taken to be continuous quantities, so that, despite their lexical similarity, <measurement> and <measure> need to be maintained as distinct concepts (Bunge, 1973). From this one might of course assume that only properties modeled as measures are measurable, but this is an assumption, not a (logical, epistemological, or ontological) necessity: it is what we take to be the position of what Michell (1990) calls “the classical theory of measurement”, as rooted in Euclid’s geometry, but his sharp tenet that “without ratios of magnitudes there is no measurement” (p. 16) cannot be maintained without this strong and basically arbitrary assumption.

The problems generated by this confusion are not just lexical or semantic. A well-grounded distinction between quantitative and non-quantitative properties would be a key target, at least as a means to identify and justify possible differences in inferential processes and their results, and therefore in the kind of information they produce. The basic intuition about the distinction remains, e.g., that individuals can be compared in such a way that the height of a person can be one third greater than the height of another, or a difference on an interval scale can be one third greater than another difference, whereas the blood type of a person cannot be one third greater than the blood type of another. This intuition needs a persuasive explanation, which ultimately would be beneficial for a better identification of the conditions of measurability. In fact, the mentioned confusion is a good reason for developing this analysis as a key component of an ontology and an epistemology of properties: once an appropriate classification of types of properties has been established, whether only quantities are measurable might be thought of as simply an arbitrary lexical choice (Mari et al., 2017).Footnote 34

A basic framework on this matter was proposed by Stanley Smith Stevens (1946), with his well-known classification of what he called “scale types”, and since then his distinction between nominal, ordinal, interval, and ratio scales has been widely adopted (for an early, extended and clear presentation, see Siegel, 1956: ch. 3), and variously refined.Footnote 35 Such a framework was conceived as dealing with scales of measurement, given the perspective that “measurement, in the broadest sense, is […] the assignment of numerals to objects or events according to rules” (p. 677), explicitly drawing from Campbell’s seminal representationalist statement that “measurement is the assignment of numerals to represent properties” (Campbell, 1920: p. 267; see also the related discussion in Sect. 4.2). From the perspective of the present analysis Stevens’ “broadest sense” is indeed too broad, if considered to be specifically related to measurement. Rather, what is interesting in his classification is more correctly understood by considering it as related to scales of property evaluation, thus disentangled from issues about measurability. We have then to deal with two interrelated issues:

  • to which entities does the feature of being quantitative or non-quantitative apply?

  • how should the condition of being quantitative or non-quantitative be defined?

But are the terms “nominal”, “ordinal”, and so forth best understood as referring to types of properties, or of evaluations? And, in consequence, how should such types be defined?

6.5.1 The Scope of the Quantitative/non-Quantitative Distinction

The first question for us to consider is about the scope of the classification into nominal, ordinal, interval, and ratio—let us call it NOIR, from the initials of the four adjectives—that is: what does it apply to, and therefore what does NOIR classify? At least two positions are possible. According to one, NOIR is about assignments of informational entities to objects: nominal, for example, is a feature of the way numerals (in Stevens’ lexicon, i.e., “names for classes”, 1946: p. 679) are assigned to individuals with respect to their blood type. This is how Stevens introduced it, thus considering, for example, blood type to be evaluated on a scale that is nominal. According to another position, NOIR is about the properties themselves: nominal, for example, is a feature of blood type. This is how, for example, the VIM uses it, thus considering blood type to be a nominal property. Hence, given a Basic Evaluation Equation such as

$$blood\,type\left[ {individual\,x} \right] = {\text{A}}\,{\text{in}}\,{\text{the}}\,{\text{ABO}}\,{\text{system}}$$

being nominal is considered to be eitherFootnote 36

  • a feature of the evaluation that produces the equation, according to the first position, or

  • a feature of the general property that is involved in the equation, according to the second position.

By interpreting them in a representational context, Michell presents these two positions as being about internal and external representations, respectively (Michell, 1999: pp. 165–166, from which the quotations that follow are taken—for consistency, everywhere the term “attribute” used by Michell has been substituted with “property”). According to Michell, an internal representation

occurs when the [property] represented, or its putative structure, is logically dependent upon the numerical assignments made, in the sense that had the numerical assignments not been made, then either the [property] would not exist or some component of its structure would be absent.

Thus, it is internal to the evaluation. An external representation is instead

one in which the structure of some [property] of the objects or events is identified independently of any numerical assignments and then, subsequently, numerical assignments are made to represent that [property]’s structure

where the adjective “external” is explained by Michell as the hypothesis that the property

exists externally to (or independently of) any numerical assignments, in the sense that even if the assignments were never made, the [property] would still be there and possess exactly the same structure.

Thus, it is external to the evaluation. In summary (Giordani & Mari, 2012: p. 446),

  • an internal representation is an evaluation that induces a structure, whereas

  • an external representation is an evaluation that preserves a structure.

The examples proposed by Michell are interesting, and useful for better understanding what is at stake with this distinction. He exemplified external representations (i.e., such that NOIR is a feature of properties) by means of hardness:

Minerals can be ordered according to whether or not they scratch one another when rubbed together. The relation, x scratches y, between minerals, is transitive and asymmetric and these [features] can be established prior to any numerical assignments being made.

The idea is then that once a property-related criterion of comparison has been identified (in this case, mutual scratching), the outcomes of property-related comparisons do not depend on the way they are represented: the conclusion would be that hardness is ordinal (or, more correctly, that hardness is at least ordinal). As the example suggests, this seems to be based on the assumption that, for an external representation to be possible, properties of objects must be empirically comparable according to some given conditions, and the outcome of the comparison must be observable, as in the paradigmatic case of mass via a two pan balance. This condition was embedded in the representational theories of measurement under the assumption that the availability of an empirical relational system is a precondition of measurement.

Michell proposes two examples of internal representations (i.e., such that NOIR is a feature of representations rather than properties). The first one is about

an extreme case [...] of assigning different numbers to each of a class of identical things (say, white marbles) and on that basis defining a [property]. The [property] represented by such assignments would not be logically independent of them and, so, had they not been made, the [property] would not exist.

This is indeed the extreme case of an assignment claimed to be a representation but that does not represent anything, being only a means of object identification: it is not even a property evaluation, given that there is no property to evaluate, in the specific sense that a Basic Evaluation Equation cannot be written because there is no general property of the considered objects to be evaluated.Footnote 37 We may then safely ignore this case, and consider the second, “less extreme” example,

where an independent [property] may exist, but the structure that it is taken to have depends upon numerical assignments made. For example, people may be assigned numbers according to nationality (say, Australian, 1; French, 2; American, 3; Belgian, 4; etc.) and then the [property] of nationality may be taken to have the ordinal structure of the numbers assigned. In this case, had numerical assignments not been made, the [property] (nationality) would still exist but the supposed ordinal structure would not.

This is a case in which Stevens’ framework proves to be non-trivially applicable. While it is always possible to adopt numbers as representational means, the numerical relations do not necessarily relate to empirical relations among the objectsFootnote 38: in this case, although it is representable by means of ordered entities, nationality is not itself ordinal.Footnote 39 These two examples show why we do not see the category of internal representations as relevant to measurement.

Hence, in our view, the evaluated property exists and has features that are independent of its possible representations: an evaluation is expected to preserve the structure of the property, not to induce a structure on the property.Footnote 40

Given the controversial nature of Stevens’ framework, it may be worth noting that this has nothing to do with setting constraints on ways of representation and of related data processing, such as, say, proscribing the computation of the mean value of a set of numbers that encode nationalities. Along the same lines as Lord (1953), Velleman and Wilkinson (1993) emphasized the importance of not imposing such constraints, given that “experience has shown in a wide range of situations that the application of proscribed statistics to data can yield results that are scientifically meaningful, useful in making decisions, and valuable as a basis for further research” (p. 68) (“proscribed statistics” are those statistics that are not “permissible” in the vocabulary of Stevens, 1946: p. 678). In fact, measurement science does include some “proscriptions”, such as the condition of dimensional analysis that only values of quantities of the same kind can be added. Nevertheless, the idea that through data analysis something can be discovered also about the structure of the evaluated properties is not problematic per se. The point is that if the property under consideration is evaluated based on (for example) purely ordinal comparisons (as in the case of hardness), the values that are obtained cannot be expected to convey more than ordinal information, exactly as prescribed by Stevens’ framework and its refinements (an example of which is mentioned in Footnote 32). In this view, what Stevens introduced as the set of “permissible” functions is better understood as a matter of algebraic invariance and meaningfulness under scale transformation (Narens, 2002), and therefore of uniqueness of the scale itself.

A summary can be presented simply as follows:

  • the representation of properties of objects, or the representation of objects as such, is an unconstrained process, and anything could in principle be used as a means of representation;

  • the evaluation of properties of objects is a process that is expected to produce values of the evaluated properties;

  • the measurement of properties of objects is a specific kind of evaluation.

From this point of view, we consider the emphasis on representation that has usually accompanied NOIR to be misleading: the position that assignments are representations that do not represent anything (Michell’s “internal representations”) is void, and the interesting question is instead whether NOIR is about

  • ways of evaluating properties, or

  • properties as such,

where in both cases the claim is that there is a property under consideration, having structural features which do not depend on whether or how it is represented. While Stevens, who was inclined toward operationalism, was candid about this alternative—“the type of scale achieved depends upon the character of the basic empirical operations performed” (Stevens, 1946: p. 677)—and consistently considered NOIR a feature of scales, we still have to explore this subject further.

6.5.2 From Values of Quantities to Values of Properties

We have assumed so far that the concept <evaluation> applies not only to quantities but also, and more generally, to properties. This has been the justification for adopting the same structure for the Basic Evaluation Equation for both quantitative cases, e.g.,

$$length\left[ {rod\,a} \right] = {1}.{2345}\,{\text{m}}$$

and

$$\begin{aligned} & reading\,comprehension\,ability\left[ {individual\,b} \right] \\ & = {\text{1}}.{\text{23}}\,{\text{logits }}({\text{on}}\,{\text{a}}\,{\text{specific}}\,{\text{RCA}}\,{\text{scale)}} \\ \end{aligned}$$

and non-quantitative cases, e.g.,

$$blood\,type\left[ {individual\,c} \right] = {\text{A}}\,{\text{in}}\,{\text{the}}\,{\text{ABO}}\,{\text{system}}$$

Hence in the generic structure

$${\text{property}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{property}}$$

A in the ABO system is an example of a value of a property, just as 1.2345 m is an example of a value of a quantity. While it is acknowledged (for example by the VIM) that quantities are specific types of properties, whether values of quantities can be generalized and thus applied to non-quantitative properties is a much less considered subject, as is the related issue of what a value of a property is. For example, the VIM defines <value of a quantity> (JCGM, 2012: 1.19), but does not define <value of a property> even though it deals with non-quantitative properties, termed “nominal properties” (JCGM, 2012: 1.30). Hence the problem for us to consider here is whether the Basic Evaluation Equation is meaningful only in the specific case of quantities, i.e., in the case

$${\text{quantity}}\,{\text{of}}\,{\text{a}}\,{\text{given}}\,{\text{object}} = {\text{value}}\,{\text{of}}\,{\text{a}}\,{\text{quantity}}$$

or is also able to convey knowledge about non-quantitative properties.

Our construction of values of quantitative properties, presented in Sect. 6.3, relies on their empirical additivity, in the example of length, or on the invariance of their empirical difference, in the examples of temperature and reading comprehension ability, conditions which do not hold for non-quantitative properties. Our concept of shape, for example, is indeed such that the ideas of “adding objects by their shape” or “subtracting objects by their shape” are meaningless, in the sense that the shape of an object obtained by somehow composing two other objects is in general not additively related to the shapes of the composing objects (in fact, shapes are not even ordered: for example, it is meaningless to say that a cube is more or less than a cylinder). Hence, the interpretation of the Basic Evaluation Equation for quantities, according to the Q-notation, such that Q[a]/[Q] is a number (see Sect. 5.3.4) does not apply to non-quantitative properties: there are no “shape units” in this case, nor can shapes be compared by their ratio.

At a more fundamental level, however, the idea of conveying information on properties like blood types of individuals or shapes of objects by means of “values of blood type” and “values of shape” maintains its meaning and relevance, as when we say that a given rod is cubic or cylindrical, phraseological means for “whose shape is cube” and “whose shape is cylinder”, which immediately leads to a formalization such as shape[rod a] = cube and shape[rod a] = cylinder. Given their analogy in structure with length[rod a] = 1.2345 m, where 1.2345 m is a value of length, the conclusion seems to be that cube, cylinder, and so forth may be considered to be values of shape.

However, this is not completely correct, given an important difference between the two cases. Indeed, the value 1.2345 m includes, via the unit metre and the reported number of significant digits, information on the set of possible values from which 1.2345 m has been chosen, i.e., the set is the non-negative multiples of the metre: it is 1.2345 m and not 1.2346 m, and so on; it is 1.2345 m but we are unable to distinguish between 1.2345 m and 1.23451 m, and so on. Choosing a unit and a (finite) number of significant digits corresponds to introducing a classification on the set of the lengths, in which each value identifies one class. Hence, selecting a value of length conveys both the information that (i) the class of lengths identified by that value has been selected, and (ii) all other classes (identified by all other multiples of the unit) have not been selected. In a relation such as shape[rod a] = cube this second component is missing.Footnote 41

In order to improve the structural analogy between length[rod a] = 1.2345 m and shape[rod a] = cube, the set of the possible shapes, of which cube is one element, needs to be declared as part of the equation: it might be, e.g., {cube, any other shape} or {cube, cylinder, cone, sphere, any other shape}, thus showing that the report that rod a is cubic conveys different information in the two cases. We may call the set of the possible shapes a reference set, R, so that an example of the Basic Evaluation Equation in the case of a nominal property such as shape isFootnote 42

$$shape\left[ {rod\,a} \right] = {\text{cube}}\,{\text{in}}\,{\text{R}}$$

where cube in R is then the example of a value of a property. Indeed, the same structure may also be used for quantities, e.g.,

$$length\left[ {rod\,a} \right] = {1}.{2345}\,{\text{in}}\,{\text{metres}}$$

i.e., the value is 1.2345 in the classification of lengths generated by the metre and its multiples, but in this case additivity of length permits the more informative interpretation that the class identified as 1.2345 in that classification corresponds to a length which is 1.2345 times the metre.

Hence the concept <value> is not bound to quantities: non-quantitative properties also have values, and any such value is an individual property identified as an element of a given classification of comparable individual properties,Footnote 43 such that if the classification changes, and therefore a different reference set is used, another value may be obtained for the same property under evaluation. Under these conditions the previous considerations about values of quantities can be correctly generalized to values of properties: first, choosing a set of values for blood type or shape corresponds to introducing a classification on the set of the blood types or the shapes, in which each value identifies one class, and, second, Basic Evaluation Equations also apply to non-quantitative properties and, if true, they convey much richer information than just representation: they state that the property of an object and the value of a property are the same individual property.
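The point that a value carries its reference set with it can be rendered as a small data-structure sketch; the type and the names used here are illustrative inventions, not an established formalism:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PropertyValue:
    """A value of a property: an element of a declared reference set,
    i.e., of a classification of comparable individual properties."""
    element: str
    reference_set: frozenset

    def __post_init__(self):
        assert self.element in self.reference_set

R1 = frozenset({"cube", "any other shape"})
R2 = frozenset({"cube", "cylinder", "cone", "sphere", "any other shape"})

# "cube in R1" and "cube in R2" are different values: they select one class
# out of different classifications, and so convey different information.
assert PropertyValue("cube", R1) != PropertyValue("cube", R2)
```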

Box 6.2 Evaluation scales

In this context the question of what a scale is—we prefer to refer here to the more generic concept <evaluation scale> rather than to <measurement scale>, given that what follows is not only and specifically about measurement—can be straightforwardly discussed.

Let us consider the concrete case of Mohs’ scale of mineral hardness. The observation that any pair of minerals x and y can be put in interaction so that either one scratches the surface of the other or neither scratches the surface of the other is accounted for as a relation between their hardnesses H: H[x] > H[y], or H[x] ≈ H[y], or H[y] > H[x]. Ten equivalence classes, C1, C2, …, C10, were thus empirically identified, such that if H[x]∈Ci and H[y]∈Cj, and i > j, then H[x] > H[y]. However, dealing with equivalence classes for conveying information about properties of objects is inconvenient: rather, each class can be uniquely identified via a one-to-one mapping f from the set of the equivalence classes to a set of identifiers (for example, the set of natural numbers), where the condition of injectivity guarantees that the information about hardness-related distinguishability is not lost in the mapping. Furthermore, since mutual scratching induces an ordering on the set of the equivalence classes Ci, the mapping f may be defined so as to maintain such structural information, i.e., if H[x] > H[y] and H[x]∈Ci and H[y]∈Cj, then f(Ci) > f(Cj), where H[x] > H[y] is an empirical relation about properties (hardnesses, in this example) of objects and f(Ci) > f(Cj) is an informational relation about identifiers of equivalence classes. The two conditions of injectivity and structure preservation make f an isomorphism (surjectivity is an immaterial condition here): it is a scale, i.e., an isomorphism from equivalence classes of property-related indistinguishability to class identifiers.

For a given set of equivalence classes {Ci} established for a property P, the condition that the scale f be an isomorphism constrains the set of class identifiers, i.e., the range of f, up to isomorphism. In the example, the range of f is usually taken to be the set of the first 10 natural numbers, so that f(equivalence class of talc hardnesses):= 1, f(equivalence class of gypsum hardnesses):= 2, and so on, but any other ordered set of 10 elements could equally be chosen, say the sequence 〈a, b, …, j〉, where the mapping 1 → a, 2 → b, … is usually called a scale transformation, though a better term is scale-identifier transformation, given that the empirical component of the scale is left untouched and only the scale identifiers are (isomorphically) changed.

Note that the definition of a scale is a normative statement that establishes which equivalence class is identified by which identifier. As such, it is not an equation (and therefore not a Basic Evaluation Equation), and it is neither true nor false. In the example above, it is written then

$$\forall i \in \left\{ {1, 2, \ldots, 10} \right\},\; f\left( {C_{i} } \right) := i$$

(where the notation “x:= y” means that x is defined to be y, not that x and y are discovered to be equal). From this characterization it is simple to see why for the same general property one can construct scales that are distinct (in the specific sense of being non-isomorphic):

• the criterion that defines the equivalence classes Ci could be changed, so that a new set of equivalence classes implies a new mapping f; for example, in the case of hardness, refining the classes could lead to non-integer identifiers like 4.5 for steel;

• the structure that needs to be preserved by the mapping f could be changed, so that a new isomorphism has to be obtained that preserves an algebraically stronger or weaker structure; an example of the first case is the historical development of the measurement of temperature when the absolute zero was discovered, allowing measurement of temperatures in the thermodynamic (Kelvin) scale and not only in a thermometric (Celsius, Fahrenheit) scale; an example of the second case is the daily measurement of temperature whenever measured values are reported in a thermometric scale, thus discarding the available information about the “natural”, absolute zero.

From a scale f a function g can immediately be derived—“lifted”, in the algebraic jargon—mapping properties of objects belonging to equivalence classes to class identifiers: if P[x]∈C, then g(P[x]):= f(C) (for example, since f(equivalence class of talc hardnesses):= 1, then g(hardness of a given sample of talc) = 1, i.e., in the Mohs scale the hardness of any sample of talc is identified by 1, and so on).Footnote 44 Of course, properties of distinct objects may be indistinguishable from each other, i.e., may belong to the same equivalence class, and therefore may be associated with the same identifier via the function g. Hence, while f is one-to-one, g is many-to-one, and therefore a homomorphism, that may be called a scale-based representation of the properties of objects. In contrast with the statement f(Ci):= i considered above, a relation like g(P[x]) = i is typically not about constructing a scale but, after a scale f has been constructed, about using it, with the aim of identifying the property P[x] as an element of an equivalence class, the one identified by i. As such, it is an equation, either true or false, depending on whether P[x]∈Ci actually holds, given that f(Ci):= i.
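As an illustration, a minimal sketch of the Mohs construction; the dictionaries standing in for the empirically established classification are hypothetical:

```python
# f: equivalence classes (named here by a reference mineral) -> identifiers
classes = ["talc", "gypsum", "calcite", "fluorite", "apatite",
           "orthoclase", "quartz", "topaz", "corundum", "diamond"]
f = {c: i for i, c in enumerate(classes, start=1)}   # f(C_i) := i, one-to-one

# Hypothetical empirical classification: which class each sample's hardness
# falls in, as would be established by mutual-scratching comparisons
sample_class = {"sample 1": "talc", "sample 2": "quartz", "sample 3": "talc"}

def g(sample):
    """Lifted, many-to-one representation: g(P[x]) := f(C) when P[x] is in C."""
    return f[sample_class[sample]]

print(g("sample 1"), g("sample 3"))   # 1 1: indistinguishable hardnesses
print(g("sample 2"))                  # 7: the class of quartz hardnesses

# A scale-identifier transformation changes only the identifiers:
to_letter = dict(zip(range(1, 11), "abcdefghij"))
print(to_letter[g("sample 2")])       # 'g': same class, new identifier
```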

In summary, a scale is built under the assumption that some relations (at least indistinguishability, but possibly order and more) among comparable properties of objects are given, and is introduced to organize and present in a standard way the information about such relations. As a consequence, if these relations change, the scale could be changed in turn.

6.5.3 Property Evaluation Types

Given this broad characterization of what values of properties are, it is now clear that the four conditions that in Chap. 2 we have proposed as necessary for a process to be considered a measurement can also be fulfilled by the evaluation of a non-quantitative property: it may be a process that is empirical (Sect. 2.2.1) and designed on purpose (Sect. 2.2.2), whose input is a property of an object (Sect. 2.2.3), and that produces information in the form of values of that property (Sect. 2.2.4).

However, as previously noted, such conditions are not claimed to be also sufficient. In other words, since measurement is a property evaluation but not all property evaluations are measurements, the fact that conditions that are necessary for measurement apply to the evaluation of non-quantitative properties is still not sufficient to conclude that non-quantitative properties are measurable. While Chap. 7 is devoted to proposing our account of the structural conditions that characterize measurement, it is time now to come back to the issue of whether NOIR is about ways of evaluating properties or about properties as such.

The question of the scope of NOIR, as elaborated in Sect. 6.5.1, is in fact about the alternative between a more modest instrumentalist, epistemological position, which assumes that we can only characterize evaluations (and more specifically measurements) rather than properties as such, and a stronger realist, ontological position, according to which we can instead say something about properties themselves, plausibly also on the basis of what we learn in the process of evaluating them. Of course, the more modest position is also safer, and seems to be more consistent with falsificationism (Popper, 1959) and better able to take into account the fact that scientific revolutions (Kuhn, 1969) can annihilate bodies of knowledge that were deemed to be established: given the always revisable status of our hypotheses about the empirical world—as illustrated, for example, by the historically well-known cases of phlogiston and the caloric—wouldn’t it be wiser to renounce any ontological claim about the structure of properties as such?

Let us explore the issue in the light of the assumption of two conditions of information consistency for an evaluation (Giordani & Mari, 2012: p. 446):

(C1) for each relation among properties of objects there is a relation among values such that the evaluation preserves all property relations: this guarantees that the information empirically acquired is maintained by the evaluation;

(C2) only relations among values that correspond to relations among properties of objects are exploited while dealing with values: this guarantees that the information conveyed by values is actually about the evaluated properties.

In summary, values should convey all and only the information available on the evaluated properties. This is plausible, to say the least: given that values are what an evaluation produces, they should report everything that was produced, and that otherwise would be lost (C1), but nothing more than that, to prevent unjustified inferences (C2). These two conditions deserve some more consideration.

Condition (C1) seems obvious, particularly in the context of representational theories of measurement where it may be considered the premise of representation theorems.Footnote 45 For example, if the property of an object is compared with the property of another object and the former is observed to be greater than the latter, the value of the former should be greater than the value of the latter. However, the meaning of (C1) is based on the non-trivial acknowledgment that properties of objects may also be compared independently of their evaluation, and therefore that the comparison has features which are independent of the evaluation. The condition that the property of one object is greater than the property of another object might be in some sense observable, and in this case does not require such properties to be evaluated. This gives support to the position that NOIR is a feature not only of the ways in which properties are evaluated, but of properties as such, via what we know about the ways in which they can be compared.Footnote 46 In Michell’s words, “the existence of the empirical relations numerically represented must be logically independent of the numerical assignments made. That is, these empirical relations must be such that it is always possible (in principle, at least) to demonstrate their existence without first making numerical assignments” (Michell, 1999: p. 167). For sure, any such ontic claim may be updated, and in particular improved—for example when a metric is discovered to apply to what was previously considered to be a non-quantitative property—but this is just in agreement with the general understanding that empirical knowledge is always revisable.

Condition (C2) has more complex implications: how can we be sure that a relation among values does not correspond to a still-unobserved relation among properties of objects? The point here is not about accepting or refusing “proscriptions”, in the sense of Velleman and Wilkinson (1993) and as already discussed in Sect. 6.5.1, but about acknowledging that through evaluation some features of properties might be discovered. For example, historically, the idea that temperature can be evaluated on an interval scale was formulated as the result of its evaluation by means of thermometers, not via the comparison of temperatures of objects in terms of their distances/intervals. As documented by Chang (2004), a crucial problem was in the confirmation of the preliminary hypothesis that the evaluation is linear (in this case, that thermometers have a linear behavior in transducing temperatures to lengths), so that divisions in the scale of values (in this case, of length in the capillary) can be treated as evidence of correspondingly proportional divisions in the scale of properties of objects (in this case, of temperatures).Footnote 47 Such an inference is then justified on the basis of the structure of the evaluation, in the case of thermometers realized by the transduction effect of thermal expansion. And thus, for a property already known to be comparable in terms of order, appropriate conditions on the way the property is evaluated may help justify the hypothesis that distances/intervals, and therefore units (though without a “natural” zero), are also meaningful. Such a general characterization is not limited to physical properties: indeed, this can be understood as the rationale of simultaneous conjoint measurement (Luce & Tukey, 1964) and Rasch measurement (Rasch, 1960), as also discussed in Sect. 4.4.1Footnote 48: the fact that the evaluation fulfills given conditions leads one to infer that the evaluated property may have a structure richer than the observed one.

The attribution of an unobserved feature to a property is clearly an important and consequential move. While according to condition (C1) NOIR would be considered a feature of properties, known through their means of comparison, condition (C2) suggests a more cautious position, that NOIR is explicitly a feature of evaluations, and only in a derived and more hypothetical way a feature of evaluated properties. That is why we propose that NOIR are examples of Property Evaluation Types (Giordani & Mari, 2012). This is along the same lines as Stevens’ “types of scales of measurement”, but with the acknowledgment that such types are more generally features of evaluations, and not only of measurements. This position allows us to take into account the fact that the same property may be evaluated by means of evaluations of different types,Footnote 49 so that the usual property-related terms—“nominal property”, “ordinal property”, etc.—are meant as shorthands for something like “property that at the current state of knowledge is known to be evaluable on a nominal scale at best”, and so on. Even the very distinction between quantitative and non-quantitative properties has, then, this same quality: as the historical development of the measurement of temperature shows, a property that we can evaluate only in a non-quantitative way today might tomorrow also become evaluable quantitatively.

On this basis, we may finally devote some consideration to our most fundamental problem here: the conditions of existence of general properties.

6.6 About the Existence of General Properties

A basic commitment at the core of our perspective on measurement is that it is both an empirical and an informational process, aimed at producing information about the world, and more specifically, about properties of objects. A direct consequence of this view is that a property cannot be measured if it does not exist as part of the empirical world; that is, the empirical existence of a property is a necessary, though not sufficient, condition for its measurability (Mari et al., 2018). This statement may seem so obvious as to approach banality, but it has some less obvious features and consequences worthy of further exploration. In particular, one may ask: how can we know that a property exists? Stated alternatively, under what conditions is a claim about the existence of a property justified? And, more specifically, what does a claim of existence of a general property assume?

This section is dedicated to an analysis of this question, beginning with some conceptual house cleaning, related to the distinction between empirical properties and mathematical variables.

6.6.1 Properties and Variables

We have proposed that empirical properties are associated with modes of empirical interaction of objects with their environments. To help sharpen up this statement, let us consider the distinction between empirical properties and mathematical variables. An (existing) empirical property can, in principle, be modeled by a mathematical variable; indeed, this is one of the primary activities involved in a measurement process, as described in more detail in the following chapter.Footnote 50 However, it would be fallacious to conflate empirical properties and mathematical variables, or to assume that the presence of either implies the existence of the other: there can be empirical properties without corresponding mathematical models (for example, because we are unaware of the very existence of such properties; e.g., blood type prior to 1900), and there can be mathematical variables without corresponding empirical properties (for example, the variables in generic mathematical equations such as y = mx + b).

Although this distinction may seem obvious when presented in these terms, conventions in terminology and modes of discourse may sometimes obfuscate it, as when the term “variable” is used to refer both to an empirical property and a mathematical variable (which is common in the literature on “latent variable modeling”, for example; see McGrane & Maul, 2020), or when, as described in the GUM, “for economy of notation […] the same symbol is used for the [property] and for the random variable that represents the possible outcome of an observation of that [property]” (JCGM, 2008: 4.1.1).

As a consequence, it cannot be assumed out of hand that any given feature of a mathematical variable is shared by the empirical property that the variable claims to model. For example, some physical quantities are customarily and effectively modeled as real-valued functions—a precondition for modeling the dynamics of such quantities by means of differential equations—but assuming that all features of real numbers apply to the quantities they purport to model could, for example, lead to the conclusion that a given quantity is dense in the way that real numbers are, which in many cases is known to be false, as in the case of quantized quantities such as electrical charge. Analogously, properties are customarily and effectively modeled as continuous random variables for a variety of purposes, but, again, this does not guarantee that all features of continuous random variables hold true for the modeled properties (see also, e.g., Borsboom, 2006; McGrane & Maul, 2020), even for models that fit the data according to commonly-accepted criteria (see, e.g., Maraun, 1998, 2007; Maul, 2017; Michell, 2000, 2004).
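
The case of electric charge makes the point about density concrete: any observable charge q is an integer multiple of the elementary charge e,

$$q = n\,e, \qquad n \in \mathbb{Z}, \qquad e \approx 1.602 \times 10^{-19}\,\text{C}$$

so that between ne and (n + 1)e there is no further possible value of the quantity, whereas between any two distinct real numbers there are infinitely many others: the real-valued model has features that the modeled quantity lacks.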

With respect to the confusion between a knowable entity and what we know of it (i.e., the concept that we have of it), a particularly pernicious case is that of properties considered to be, in some sense, constructed, as was previously discussed in Sect. 4.5: one might infer from the fact that "concepts such as compassion and prejudice are […] created from […] the conceptions of all those who have ever used these terms" that they therefore "cannot be observed directly or indirectly, because they don't exist" (Babbie, 2013: p. 167). This fallaciously conflates the concepts we have of psychosocial properties such as compassion with the empirical referents of those concepts.Footnote 51 That is, if compassion, prejudice, and other psychosocial properties have ever been measured, what was measured was a property (of an individual or a group), rather than a concept of (or term for) that property.

Thus, whether a given property is defined in purely physical terms or not, the critical question is how we know that a property exists, and therefore that it meets at least the most basic criterion for measurability. What, in other words, justifies one’s belief in the existence of a property?

6.6.2 Justifications for the Existence of Properties

There are many ways in which a claim about the existence of an empirical property could be justified, but given the empirical nature of the properties in question, all such justifications must share some form of observational evidentiary basis. Here we propose what we take to be a minimal, pragmatic approach to the justification of the existence of properties, based on our ability to identify their modes of interaction with their environments.Footnote 52

A core aspect of the justification for a claim about the existence of a property is, simply, the observation that an object interacts with its environment in particular ways. The term “interaction” can itself be interpreted in a variety of ways, but in the context of measurement science (see in particular Sects. 2.3, 3.1 and 4.3.4), a starting point is the observation of what we have referred to as a transduction effect, i.e., an empirical process that produces variations of a (response, effect, output) property as effects of variations of one or more (stimulus, cause, input) properties.

One could argue that an even earlier starting point is simply the observation of variation in some empirical phenomenon (event, process, etc.). If we may help ourselves to the assumption that there are no uncaused events (at least not on a sufficiently broad conception of causality; for general discussions, see, e.g., Beebee et al., 2009; see also Markus & Borsboom, 2013), then from the observation of an event one may infer the existence of causal influences, though of course one may initially know little or nothing about the nature of these influences.Footnote 53 Progressively, through empirical interaction with relevant phenomena, we may arrive at a state of knowledge and technology such that a transduction effect can be dependably reproduced under specified conditions, which brings us back to the “starting point” referenced in the previous paragraph. Such a transduction effect may become the basis of a direct method of measurement (see Sect. 7.3): through the calibration of the transducer, the values of the output property (i.e., the instrument indication) are functionally related to values of the input property (i.e., the property under measurement). For example, temperatures can be measured by means of differences of expansion of mercury in a glass tube, and reading comprehension abilities can be measured by means of differences in patterns of responses to questions about a particular text. Such cases presuppose the observability of a property Y (e.g., shape, color, pattern of responses to test questions), whose differences are accounted for as being causally dependent on differences in the property under consideration P, via an inference of the kind Y = f(P), where f is the function that models the cause-effect relation: the property P is the cause of observed changes of Y, and therefore it exists.
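
As an idealized sketch of such a calibrated transduction (assuming, for illustration only, a linear expansion model), the length L of a mercury column relates to the temperature T, here playing the roles of Y and P respectively, as

$$L = f\left( T \right) = L_{0} \left( 1 + \alpha T \right), \qquad \text{so that} \qquad T = \frac{L - L_{0}}{L_{0}\,\alpha}$$

where L0 and α are parameters fixed by calibration: observing an indication L then licenses an inference to a value of T, on the hypothesis that the model f is adequate.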

All this said, if a property P is only known as the cause of observable effects in a single empirical situation (a particular experimental setup, for instance), that is, if there is only a single known transduction effect of which instances of P are the input, and the transduction itself is understood only at a black-box level, then knowledge of P is obviously highly limited. Such a situation might be associated with an operationalist perspective on measurement, and would thus inherit the limitations of that perspective (see Sect. 4.2.2); or it might simply mark a very early phase in the identification of an empirical property, setting the stage for investigations of the causal relevance of the property beyond this single transduction effect. Indeed, absent multiple, independent sources of knowledge about P, in particular about its role in networks of relationships with other phenomena (properties, outcomes, events, etc.), knowledge about P might be considered vacuous or trivial.

For example, a claim about the existence of hardness as a property of physical objects can be justified in a simple way by the observation that one object scratches another: hardness (P) is what causes (f) observable scratches (Y) to appear, given an appropriate experimental setup. Were this the only source of knowledge about hardness, the correct name for P would arguably be something like "the property that causes the effect Y", e.g., "the ability to produce scratches", rather than a label as semantically rich as "hardness". But, of course, this is not the only way in which hardness is known: even simple lived experience can corroborate our common sense about the ways in which objects made of different materials interact with one another, and this is further corroborated by alternative methods for measuring hardness, such as the observation of indentations under specified conditions. In other words, we have access to knowledge about the property of hardness also independently of that particular cause-effect relationship f, and this knowledge is consistent with what f models as the cause of Y. This shows that the procedure of checking which objects scratch which other objects does not define hardness, but instead may become a method for evaluating it.Footnote 54
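
To make the minimal, single-effect situation explicit (a sketch in the notation of this framework, not a standard formalization): were scratching the only known effect of hardness, comparisons of hardness would be exhausted by the scratch relation, i.e.,

$$H\left[ a \right] > H\left[ b \right] \quad \text{whenever } a \text{ scratches } b$$

so that H would be known, at best, on an ordinal scale, and only through this one transduction.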

Thus, as investigations reveal functional relations connecting P to multiple phenomena (properties, outcomes, events, etc.) whose existence can be assessed independently of such relations, P becomes part of a system of interrelated properties, sometimes called a nomic network.Footnote 55 The identification of such relations (referred to in the VIM as a “set of quantities [or more generally, properties] together with a set of noncontradictory equations relating those quantities”, JCGM, 2012: 1.3) is important not only because it expands the explanatory and predictive value of knowledge of P,Footnote 56 but also for two additional reasons specifically related to measurement. The first is that such knowledge may suggest alternative methods for directly measuring a given property: for example, temperatures could also be measured by means of differences of electric potential via the thermoelectric effect, and reading comprehension abilities could also be measured by observing how well an individual is able to carry out a set of instructions after having read a relevant text. This corresponds to the minimal example of a nomic network as shown in Fig. 6.9, in which the three properties P, Y, and Z are connected via the two functions Y = f(P) and Z = g(P).Footnote 57 The causal relationship between P and either Y or Z—or both—could be used as the basis for a direct measurement of P. This kind of relationship of Y and Z to P is referred to as reflective in the context of latent variable modeling (see, e.g., Edwards & Bagozzi, 2000).

Fig. 6.9 A simple nomic network laying the groundwork for the direct measurement of P through multiple means (where P → Y means that P is the cause of Y). [Diagram: arrows from P pointing to Y and to Z]
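
In the linear notation common in the latent variable modeling literature (an illustrative sketch under that literature's assumptions, not a claim about any particular property), the reflective structure of Fig. 6.9 might be written

$$Y = \lambda_{Y}\,P + \varepsilon_{Y}, \qquad Z = \lambda_{Z}\,P + \varepsilon_{Z}$$

where the loadings λY and λZ play the role of the functions f and g, and the error terms acknowledge that neither transduction is perfectly dependable; calibration of either relation could then ground a direct measurement of P.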

The second measurement-related reason for the importance of knowledge of such functional relations is that they may become the basis for indirect methods of measurement (see Sect. 7.2), in which the results of prior direct measurements are used as input properties for the computation of a value of the output property (i.e., the measurand), as, for example, when densities are measured by computing ratios of measured values of masses and volumes. Here the property P whose existence is questioned is a function of other properties, say Y and Z, whose existence is already accepted, as depicted in Fig. 6.10. This kind of relationship of Y and Z to P is referred to as formative in the context of latent variable modeling (again see Edwards & Bagozzi, 2000).

Fig. 6.10 A simple example of a nomic network laying the groundwork for the indirect measurement of P. [Diagram: arrows from Y and from Z pointing to P, with a further arrow from P pointing to unspecified other properties]
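
A worked instance of the density example, with illustrative numbers only: if prior direct measurements yield a mass m = 2.70 kg and a volume V = 1.00 × 10⁻³ m³ for a given body, the value of the measurand is computed as

$$\rho = \frac{m}{V} = \frac{2.70\,\text{kg}}{1.00 \times 10^{-3}\,\text{m}^{3}} = 2.70 \times 10^{3}\,\text{kg/m}^{3}$$

with no new transduction involving ρ itself: the output value is obtained by computation from the input measurement results.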

A clarification is in order on this matter: if a property is only known through a single function of other properties, in which case the functional relation P = f(Y, Z) would serve as the definition of a previously unknown entity P, there would be no basis for claiming that P is an independently existing empirical property; rather, what is calculated by f would simply be a variable that summarizes (some of) the available information about the properties Y and Z (as, again, is the case for hage, defined as the product of the height and age of a human being; Ellis, 1968: p. 31). Summaries can, of course, have substantial utility, but as per the previous discussion of the distinction between empirical properties and mathematical variables, mathematical creativity is in itself insufficient for the generation of new empirical properties. As before, it is the availability of independent sources of knowledge about the property in question that lends credence and importance to claims regarding its existence, as is the case with force: although F = ma may be considered a definition of force, there are in fact means of knowing force independently of (but consistent with) Newton's second law, for example via Coulomb's law, which connects force to quantity of electric charge.
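
The two independent routes to force mentioned here can be set side by side (in SI form, with k = 1/(4πε₀)):

$$F = m\,a \qquad \text{and} \qquad F = k\,\frac{q_{1}\,q_{2}}{r^{2}}$$

That the same quantity F appears in both relations, and that the two can be checked against each other experimentally, is what distinguishes force from a mere computational summary such as hage.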

In sum, our approach to the justification of claims about the existence of properties is consistent with the philosophical perspective sketched in Sect. 4.5, which we described as pragmatic realism or model-based realism. The approach is realist, insofar as it focuses on justification for claims regarding the existence of empirical properties, and by so doing helps clarify the distinction between empirical properties and mathematical variables, and more generally the interface between the empirical world and the informational world; this also helps set the stage for a clear distinction between measurement and computation, discussed further in the following chapter. The approach is pragmatic, insofar as the emphasis of the proposed criteria for evaluating our beliefs about the existence of properties is on the practical consequences of those beliefs; this is consistent with the familiar refrain of pragmatic philosophers that “a difference that makes no difference is no difference”, or, as put more specifically by Heil, “a property that made no difference to the causal powers of its possessors would, it seems, be a property the presence of which made no difference at all” (2003: p. 77). Finally, the approach is model-based, insofar as the role of models (of general properties, measurands, environments, and the measurement process) is given primacy: this is the topic to which the following chapter is devoted.