Rasch (1977) addressed science and objectivity in his last published paper: On Specific Objectivity: An Attempt at Formalizing the Request for Generality and Validity of Scientific Statements. In Section III (p. 68), he rhetorically asked, “What is Science?” His answer specified two conditions. Science is:

  1. Making comparisons, and

  2. Making these comparisons objectively.

Rasch (1977) further explained these two conditions.

Two features seem indispensible in scientific statements. They deal with comparisons, and the comparisons should be objective. However, to complete these requirements I have to specify the kind of comparisons and the precise meaning of objectivity. (p. 68, our italics)

Rasch (1977) follows the above quote by another one taken from an earlier paper (Rasch, 1967) but with each part given an important heading:

Specifying Comparisons. Consider a class of 'objects' to be mutually compared. The sense in which they should be compared is specified through a class of 'agents,' to each of which each object may be 'exposed.' On each exposure an 'observation'—quantitative or qualitative—is made. The whole set of such observations made when a finite number of objects O1,…,On are exposed to a finite number of agents A1,…,Ak form the data from which comparisons of the O’s…reactions to such agents as the A’s can be inferred.

Specifying Objectivity. Now, within this framework, which I have taken from psychophysics, the ‘objectivity’ of a comparative statement on, say two objects, O1 and O2 is taken to mean that although being based upon the whole matrix of data it should be independent of which set of agents A1,…,Ak out of the available class were actually used for the comparative purposes, and also of which objects…other than O1 and O2 were also exposed to the set of agents chosen.

Specific objectivity, some general properties. In order to distinguish this type of objectivity from other use of the same word I shall call it 'specific objectivity,' and in passing I beg you notice the relativity of this concept: it refers only to the framework specified by the class of objects, the class of agents and the kind of observations which define the comparison. (pp. 2–3, our italics)

Rasch (1967) adds this important qualification:

...the objects and/or agents are subject to comparison…the data themselves are not directly compared, they only serve as the instruments for the comparison aimed at. The consequences of introducing these two concepts: (specific) comparisons and specific objectivity, completed by the requirements that a comparison is always possible and its result always unambiguous, are really overwhelming. (pp. 2-3, our italics)

The above quotes serve to introduce what is required in the method of comparison as well as the consequences of making such comparisons. It is not the data, but what it stands for, that is the goal of science. The essential conditions are two: (1) specify 'the kind of comparisons' and (2) specify 'the precise meaning of objectivity.'

These stipulations suggest the following paradigm:

Propositions and their specifications:

  1. Scientific statements: specified by the kind of comparisons made.

  2. Objective comparisons: specified by the precise meaning and achievement of specific objectivity.

Scientific statements are the consequences of specifying comparisons. Objective comparisons are the consequences emanating from comparisons exemplifying the precise meaning of specific objectivity. Rasch (1977) draws attention to the ubiquity of observations made by comparisons:

...comparisons form an essential part of our recognition of our surroundings: we are ceaselessly faced with different possibilities for action, among which we have to choose just one, a choice that requires that we compare them. This holds both in everyday life and in scientific studies. (p. 68)

These remarks appear general, obvious, and rather simple. Every day finds us comparing prices and products or deciding among activities. Rasch’s observations on making comparisons in everyday life or pursuing science are self-evident, but clearly essential. However, a careful examination of the exact role of making comparisons is required.

Rasch (1977) next responds to those who suggest measurement as primary in making comparisons or conducting science:

That science should require observations to be measurable quantities is a mistake, of course; even in physics observations may be qualitative—as in the last analysis they always are. (p. 2)

This statement is vitally important in the social sciences because it challenges two common beliefs.

First, Rasch dismisses the notion of measurement as a prerequisite for science to proceed. His quote echoes Louis Guttman's comments of a similar nature:

I have avoided the term 'measurement' in all my writings and teachings. I have found it neither useful nor necessary.... No fixed a priori collection of abstract, contentless techniques or principles can be universally appropriate for scientific progress. (1971, pp. 330–331)

Second, Rasch raises qualitative observations to the prominence rightly deserved. Guttman shared this view also.

The basic data of most mental tests are qualitative, yet no treatment is given of the theory of such qualitative data. (in Levy, 1994, p. 324)

Guttman (1971) argued that his approach to science is via “…hypothesis construction for aspects of a universe of observations recorded” (p. 333). He eschewed measurement while promoting better investigations driven by theory. Guttman's remarks further suggest interesting comparisons between these two iconoclasts of traditional measurement practice. Andrich (1982, 1985, 1988) has made important psychometric comparisons of Guttman and Rasch perspectives. Engelhard (2008) has examined both perspectives critical to invariant measurement.

A contemporary expression of many of these same issues has been given by Rein Taagepera (2008), whose book Making Social Sciences More Scientific pursues these same goals (among others) by refocusing attention on constructing predictive models. The need to repeat calls for prediction shows how important these matters are, and how neglected they have been in practice.

The important issue developed from Rasch thus far is that only specific comparisons, grounded in substance/objects, produce generalities and laws. It is the properties of these generalities and laws that we employ for guidance and direction. Data and its analyses are only a means to this end, and should not be mistaken for the conclusion. In practice, this means that what is derived from data must reach beyond to become a generalized outcome. Whether it does or not is critical.

1 The Strategy of Comparison

Comparison is driven by theory and experiment. Basic statements about making comparisons may initially appear simplistic. What makes these elementary concerns worthy of further investigation? Why such attention to making comparisons? Rasch illustrates his points with two examples. The first example involves a comparison of ashtrays. His choice of this example is unfortunate for these times, but illustrative nevertheless. Data from this experiment is produced by dropping heavy and light ashtrays from six different heights. It is reported as shown in Fig. 1.

Fig. 1 Ashtray breakage by falling distance

Survival intact is denoted +, and breakage is denoted -. While the experiment is simple, Rasch uses it to stress that qualitative comparisons and qualitative methodology are fundamental to science. This example, like any other, requires confronting each object with an action to produce a reaction: a tripartite condition. But objects and actions must be systematically arranged to allow a valid observation to be made. A theory is stimulated by a thought experiment, an intuitive insight, or a historical review of relevant literature.

The experiment consists of a tripartite condition of object, agent, and resulting reaction. In a psychometric application this triplet could be person, item, and response. Rasch describes the process:

  1. To determine whether an object has a certain property one must do something to the object, confront it with some action or different actions liable to create one of a number of reactions.

  2. If knowledge gained in this way is to be used in making a choice it must be obtained for several objects of the kind in question so that a comparison becomes possible (Rasch, 1977, p. 69).

Rasch (1977) introduces a second example, which shows the sequential development of the gas equations (IV:1–IV:15) and is not discussed here. The essential points that Rasch draws from the more rigorous exposition of this second example (pp. 71–73) are the following:

  1. Varying the two parameters of temperature and pressure introduces a stepwise series of systematic experiments,

  2. The results show “linearity to be a pervading trait,”

  3. This results in “a law of greater generality,”

  4. The sequence progresses from observations of volume, pressure and temperature and passes through a series of stepwise comparisons leading to summary equations.

Rasch concluded:

This procedure, it seems, can be taken as a prototype of the experimental charting of a complex field…only through systematic comparisons—experimental or observational—is it possible to formulate empirical laws of sufficient generality to be—speaking frankly—of real value, whether for furthering theoretical knowledge or for practical purposes…I see systematic comparisons as a central tool in our investigation of the outer world. (p. 74, our italics)

Rasch gives great importance to building a sequence of experiments. It is not a single critical experiment that typifies science, although that has occurred at times. It is a series of them, each of which builds upon the other. Parameters may be experimentally manipulated to observe outcomes, and confirm or deny theoretical predictions. The way that Rasch demonstrates development of the gas law brings a theory to an encompassing state built from and supported by a succession of critical experiments whose hypotheses are sequential, stepwise, and linear. This process constitutes Rasch's causal model. We have earlier stipulated, “The Rasch model in concert with a substantive theory is a powerful tool for discovering and testing the adequacy of formulations” (Burdick et al., 2006, p. 1059), and specified, “It takes a substantive theory to unambiguously distinguish between a latent variable and an index” (Stenner et al., 2008, p. 1177). Rasch was clearly proposing that comparisons be made in the context of substantive theory driving experimental studies. Only in this context can claims against latent variables be made following the specification outlined above.

In Comparisons and Specific Objectivity, Rasch (1977) generalizes and formalizes the implications of his two examples (p. 75). He begins with explanations and equations more rigorously defining comparison and specific objectivity that are abstracted below but sequenced by Rasch's order of them:

...two collections of elements O and A denoted here objects and agents…single elements and indices Ov and A1…enter into a well-defined contact C…every contact an outcome R.

…the three collections of elements…the frame of reference

$$ F = \left[ {O,A,R} \right] $$
(1)

the concept of comparison…contact C determines outcome R as a function of object O and agent A

$$ R = r\left( {O,A} \right). $$
(2)

Within a specified determinate frame of reference a comparison between two objects O1 and O2—with regard to their reactions to contact with the agent A—is defined as a statement about them which is based solely on those reactions.

$$ R_{1} = r\left( {O_{1} ,A} \right),R_{2} = r\left( {O_{2} ,A} \right) $$
(3)

…this comparing function u (R1, R2) form a collection U

…the elements…may be qualitative

…inserting function U in

$$ U\left( {R_{1} , \, R_{2} } \right) = u\left( {r\left( {O_{1} ,A} \right), \, r\left( {O_{2} ,A} \right)} \right) $$
(4)

and a statement about O1 and O2…will depend on A, the agent.

…the comparing function U(R1, R2) as a function of O1 and O2 conditioned by A

$$ U\left( {r\left( {O_{1} ,A} \right), \, r\left( {O_{2} , \, A} \right)} \right) = u\left( {O_{1} ,O_{2} |A} \right) $$
(5)

as a comparator for O1 and O2, conditioned by agent A (in analogy to the concept of conditional probability).

Statements dependent on the agent [object] are said by Rasch to be local comparisons.

Rasch then clarifies what is and is not objective, and what is local or specific, within the frame of reference:

Local comparisons are…useful as pointers…a comparing statement.

…for O1 and O2 to be more than locally valid, the comparator must be independent of which ai from A has been used to produce the reaction

... if the condition is fulfilled ... denote the comparison between these two objects as global for agents:

$$ U\left( {R_{1} , \, R_{2} } \right) = u\left( {r\left( {O_{1} ,A} \right), \, r\left( {O_{2} ,A} \right)} \right) = u\left( {O_{1} ,O_{2} } \right). $$
(6)

…if this globality within A holds for any two objects O1 and O2 in O [then]

…the pairwise comparison defined by (4) is specifically objective within frame of reference F.

The term ‘objectivity’ refers to the fact that the result of any comparison of two objects within O is independent of the choice of the agent ai within A and also of the other elements in the collection of objects O; in other words: independent of everything else within the frame of reference, than the two objects which are to be compared and their observed reactions.

…the qualification “specific” is added because the objectivity of these comparisons is restricted to the frame of reference

F in [1] …denoted as the frame of reference for the specifically objective comparisons in question. (pp. 75–77)

Rasch makes very clear,

... specific objectivity is not an absolute concept, it is related to the specific frame of reference ...this definition concerns only comparisons of objects, but within the same frame of reference it can be applied to comparisons of agents as well. (p. 77)

The philosophic issues encompassing the history of objectivity appear circumspectly avoided in these statements. Rasch specifies,

The concept has therefore not been carved out in a conceptual analysis, but on the contrary its necessity has appeared in my practical [statistical] activity. (p. 58)

Why avoid the philosophic issue of causation? Speculation suggests it was not in good taste at the time he was writing to speak of causation in general. The zeitgeist did not appear to support objectivity and causality in this context. The consequences of quantum mechanics and the influence of logical positivism may have kept Rasch from wanting his methods contaminated by any digression into philosophy, or to have metaphysical issues injected into his systematic discourse. The strategy seems very clear in light of the quotes given in the opening paragraphs of this paper. Rasch wants to make his case without contending with excessive philosophical baggage concerning causality and objectivity. To illustrate the zeitgeist Rasch was avoiding, consider the opening sentence of Waismann's “The Decline and Fall of Causality,” Chapter V of Turning Points in Physics (1961):

1927 is a landmark in the evolution of physics—the year which saw the obsequies of the notion of causality. (p. 84)

Hoover (2004) offers another illustration:

[in reference to causality] ... a dip of about twenty percent in the occurrence of the causal family from the 1950s [for ‘causally', 'causality', or 'causation' in econometric literature]. (p. 152)

Rasch is by no means conservative regarding the promotion of objectivity via theory/experiments constructed with a view to exploring results based upon engineered methods. He employs the word “law” in a favorable connotation more than a dozen times throughout his 1977 paper. While Rasch does not venture into “causation” stated explicitly, he does so implicitly via employment of comparison as described above. It seems clear, except for evading the philosophic realm of causality, Rasch advocates a causal mode of investigation by means of comparisons founded upon hypotheses and data.

It is important to observe his distinction between “indicators” and “specific objectivity”. Rasch (1977) specifies,

... if this globality within A holds for any two objects O1 and O2 in O ... the pairwise comparison defined by (4) is specifically objective within the frame of reference F.

The term 'objectivity' refers to the fact that the result of any comparison of two objects within O is independent of the choice of the agent ai within A and also of the other elements in the collection of objects O; in other words: independent of everything else within the frame of reference, than the two objects which are to be compared and their observed reactions. (p. 76)

The essential point rests upon the comparison of two objects (or agents) independent of agent (or object) in the collection of objects (or agents) and their observed reactions within a specified frame of reference. As indicated earlier in the quotes given above, not all comparisons meet the conditions of specific objectivity. A key issue is distinguishing “... those statements dependent on the agent (object),” specified by Rasch to be “local comparisons” from those which emanate from “specific objectivity.”

Returning to Rasch's ashtray example, we find no differences between the two ashtrays dropped from heights H1 and H2 because both survive the fall. Nor does any difference result from heights H5 and H6, because neither ashtray survives. The first two heights, H1 and H2, allow no comparison to be made, and the same holds for H5 and H6. But every result comes from making comparisons without knowledge of the heights or the composition of the ashtrays. This is a subtle but critical point in Rasch's methodology of comparisons. Nothing is required beyond gross descriptions of the ashtrays' composition/construction, and a sequence of heights for these drops. Rasch (1977) concludes:

Comparing ashtrays at H3 and H4 shows the two middle distances with the heavier ones surviving breakage while the lighter ones do not…objects type—2 elements (heavy, light); agent distances—6 elements (each one ordered above the other); and reaction elements—2 results (survived breakage, did not survive breakage). (p. 69)

These specified elements reiterate Rasch's earlier contention that it is the qualities which are engaged while measurement quantities are not required.

What is required, ... is to do something specific to the object, confront it with some action or different actions liable to create one of a number of reactions. If knowledge gained in this way is to be used in making a choice it must be obtained for several objects of the same kind in question so that a comparison becomes possible. (p. 69)

Rasch (1977) draws these inferences from his ashtray experiment.

A possible comparing function could be the assertion ‘No. 1 is more solid than No. 2,' defined operationally by the sequence of reactions + -, that is, the first one holds and the second one breaks. This comparison is not global, it has the value 'true' for the intermediate falling distances and 'false' for the others. (p. 77)

Another comparing function is ‘No. 1 is at least as solid as No. 2,' defined operationally by the observed reactions ++ or +-: either they both hold or they both break or only No. 1 holds. This comparison is global within the frame of reference of the described experiment and can even be expected to be global also if more ashtrays and more falling distances are included in the frame of reference. (pp. 77–78)

These inferences yield two conclusions about the ashtrays:

  1. The statement “Ashtray No. 1 is more solid than No. 2” does not hold across all heights. It is not global because it is true for some heights, but not all. A general statement of this kind cannot be made for the two types of ashtrays over all heights.

  2. The statement “Ashtray No. 1 is at least as solid as No. 2” can be supported as global: No. 1 (heavy) equals or exceeds No. 2 (light) at heights 1 to 4, and both break at heights 5 and 6.
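The two conclusions above can be checked mechanically. The following is a minimal sketch, assuming the reaction pattern described in the text (both ashtrays survive at H1 and H2, only the heavy one survives at H3 and H4, both break at H5 and H6). The function and variable names are illustrative and do not come from Rasch.

```python
# Reactions reconstructed from the description in the text:
# '+' = survives intact, '-' = breaks.
HEIGHTS = ["H1", "H2", "H3", "H4", "H5", "H6"]
REACTIONS = {
    "heavy": dict(zip(HEIGHTS, "++++--")),  # ashtray No. 1
    "light": dict(zip(HEIGHTS, "++----")),  # ashtray No. 2
}


def more_solid(r1, r2):
    """'No. 1 is more solid than No. 2': only the pattern +- counts."""
    return (r1, r2) == ("+", "-")


def at_least_as_solid(r1, r2):
    """'No. 1 is at least as solid as No. 2': every pattern except -+."""
    return (r1, r2) != ("-", "+")


def evaluate(comparator, obj1, obj2):
    """Apply a comparing function at every height (agent)."""
    return {h: comparator(REACTIONS[obj1][h], REACTIONS[obj2][h]) for h in HEIGHTS}


def is_global(comparator, obj1, obj2):
    """A comparison is global if its value is the same for every agent."""
    return len(set(evaluate(comparator, obj1, obj2).values())) == 1


local_result = evaluate(more_solid, "heavy", "light")
print(local_result)                                    # value per height
print(is_global(more_solid, "heavy", "light"))         # local comparison
print(is_global(at_least_as_solid, "heavy", "light"))  # global comparison
```

Note that the inputs are purely qualitative: the sketch uses only the ordering of the heights and the two reaction symbols, with no units for height or mass.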

Rasch next delineates specific objectivity as separate from consideration of general objectivity:

...specific objectivity is not to be expected from an arbitrarily chosen comparing function of u(R1, R2)

...bifactorial frames of reference...[where] every reaction is characterized by a so-called scalar parameter...characteristic of object, agent or reaction. [Given…]

…parameters Ov, Ai, and Rvi…denoted by ωv, αi, ξvi .

…the reaction is assumed to be uniquely determined by object and agent [in]…

ξvi = ϱ(ωv, αi) the parametric reaction function.

The condition corresponding to equation (6) for specific objectivity of comparisons of objects… ωλ and ωv is:

$$ u\left( {\rho \left( {\omega_{\lambda } ,\alpha } \right),\rho \left( {\omega_{v} ,\alpha } \right)} \right) = v\left( {\omega_{\lambda } ,\omega_{v} } \right) $$

Under these conditions a decisive statement can be made on the properties of the reaction function ϱ(ωv, αi) that are necessary for establishing specific objectivity of comparisons of objects within the framework F. (p. 78)
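A short worked case may help fix ideas. It is an illustration only: the additive reaction function and the difference comparator below are assumptions chosen for simplicity, not forms taken from the quoted passage. Suppose

$$ \rho \left( {\omega ,\alpha } \right) = \omega + \alpha ,\qquad u\left( {\xi_{1} ,\xi_{2} } \right) = \xi_{1} - \xi_{2} . $$

Then, for any agent parameter α,

$$ u\left( {\rho \left( {\omega_{\lambda } ,\alpha } \right),\rho \left( {\omega_{v} ,\alpha } \right)} \right) = \left( {\omega_{\lambda } + \alpha } \right) - \left( {\omega_{v} + \alpha } \right) = \omega_{\lambda } - \omega_{v} . $$

The agent parameter cancels, so the comparison depends only on the two object parameters. In this assumed case the comparison of objects is specifically objective within the frame of reference.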

Rasch advocates strategies of comparison designed to tease out knowledge. His first example builds the case. The outcomes from the ashtray experiment can be arranged in a 2 × 2 table as shown in Fig. 2.

Fig. 2 Light vs. heavy ashtray breakage

We already know enough about ashtrays and height to dismiss his simple example as unnecessary, but then we would miss the point of his generalizations that follow. Rasch’s experiment establishes these major points.

  1. Systematic comparisons made under specified conditions produce stochastically consistent results.

  2. It becomes insightful science to systematically arrange each encounter between an object and an agent, and then observe the result, the classic experiment for determining an outcome.

  3. Given enough such experiments, wisely contrived, we can often predict outcomes whenever we gain an understanding of the matter under study, and our theory is sound. But sometimes our predictions are surprising, and wrong!

There is a clear difference between the heavy and light ashtrays in the results for H3 and H4 because the heavy one survives, and the light one does not. This establishes the critical point Rasch illustrates with this experiment: comparisons are fundamental to measurement. Rasch also reminds us the results were produced by qualitative observations, and he indicates this is frequently the case in scientific investigations in the physical sciences. There has been no requirement or need for prior quantitative measures in order to produce these results. We might wish to bring further clarification and more sophistication to this experiment by refining the conditions (introducing different heights with different ashtrays) and observing the results, but specific units for height and mass are not required. Order suffices.

2 Constructing Experimental Comparisons

Rasch argues that the process and strategy of constructing comparisons is the essence of scientific methodology. In his ashtray experiment every object comes in contact with every agent via a frame of reference. Each interaction produces some outcome resulting from this intersecting occurrence of agents with objects. Outcomes are recorded by one of two qualitative values in a dichotomous frame of reference defined by ashtrays and heights.

A 2 × 2 frame of reference permits these comparisons to be ordered in a systematic way. The outcome may be qualitatively the same or different as in the ashtray example. Further comparisons are made possible by progressing through a larger data frame, and subsequently aggregating all such useful comparisons regarding height and ashtrays. Order remains fundamental regardless of how complex the comparison framework grows.

Rasch (1977) characterizes pair-wise comparisons of objects (or agents) as “specifically objective within the frame of reference” (p. 77, our emphasis), and he delineates the process:

The term 'objectivity' refers to the fact that the result of any comparison of two objects within O is independent of the choice of the agent ai within A and also of the other elements in the collection of objects O, in other words: independent of everything else within the frame of reference than the two objects which are to be compared and their observed reactions. And the qualification 'specific' is added because the objectivity of these comparisons is restricted to the frame of reference F. This is therefore denoted as the frame of reference for the specifically objective comparisons in question. This also makes clear that specific objectivity is not an absolute concept, it is related to the specified frame of reference. (p. 77, our emphasis)

Designating ωv, αi, and ξvi as parameters for Ov, Ai, and Rvi gives ξvi = ϱ(ωv, αi) as the parametric reaction function. The condition for the specific objectivity of comparisons for objects ωλ and ωv is:

$$ u\left( {\rho \left( {\omega_{\lambda } ,\alpha } \right),\rho \left( {\omega_{v} ,\alpha } \right)} \right) = v\left( {\omega_{\lambda } ,\omega_{v} } \right) $$

Rasch then formulates his main theorem of specific objectivity:

Let objects and agents in the bifactorial determinate frame of reference be characterizable by scalar parameters ω and α and reactions by a scalar reaction function of 'convenient' mathematical properties:

$$ \xi = \rho \left( {\omega ,\alpha } \right) $$

with three monotonic functions:

$$ \omega^{\prime} = \phi \left( \omega \right),\alpha^{\prime} = \psi \left( \alpha \right),\xi^{\prime} = \chi \left( \xi \right) $$

transforming the scalar reaction function to an additive one:

$$ \xi^{\prime} = \omega^{\prime} + \alpha^{\prime} $$

…a necessary and sufficient condition for specifically objective comparability of objects as well as agents. (p. 79)
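The role of the monotonic transformations can be illustrated numerically. The sketch below is an assumption-laden example, not Rasch's own: it uses a hypothetical multiplicative reaction function, which the logarithm turns into an additive one, and a hypothetical comparator (the difference of log reactions). All names and parameter values are invented for illustration.

```python
import math


def multiplicative(omega, alpha):
    """Hypothetical reaction function xi = omega * alpha (log makes it additive)."""
    return omega * alpha


def shifted(omega, alpha):
    """Hypothetical reaction function xi = omega + alpha (log does NOT make it additive)."""
    return omega + alpha


def log_difference(reaction_fn, omega_a, omega_b, alpha):
    """Comparator: difference of log-transformed reactions of two objects to one agent."""
    xi_a = math.log(reaction_fn(omega_a, alpha))
    xi_b = math.log(reaction_fn(omega_b, alpha))
    return xi_a - xi_b


def spread_over_agents(reaction_fn, omega_a, omega_b, alphas):
    """Range of the comparator across agents; zero (up to rounding) means agent-free."""
    values = [log_difference(reaction_fn, omega_a, omega_b, a) for a in alphas]
    return max(values) - min(values)


ALPHAS = [0.5, 1.0, 2.0, 8.0]  # invented agent parameters
spread_mult = spread_over_agents(multiplicative, 4.0, 2.0, ALPHAS)
spread_shift = spread_over_agents(shifted, 4.0, 2.0, ALPHAS)
print(f"multiplicative frame: spread = {spread_mult:.3g}")
print(f"shifted frame:        spread = {spread_shift:.3g}")
```

For the multiplicative function the comparator reduces to log(ωa/ωb) whatever agent is used. For the second function the same comparator varies with the agent, which shows only that this comparator is local there: that function is already additive in its raw form, so the plain difference of reactions would serve as a global comparator for it. The example is meant to suggest why the theorem is stated in terms of the existence of suitable transformations rather than any one fixed comparator.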

The model now becomes the source of objective measurement and not the details of the data. Hence, the sometimes used phrase, “When data fit the Rasch model ...” might be better expressed “The Rasch model has (1) identified a fit of data to the model across a frame of reference implied by this experiment, or (2) identified a lack of fit between the same.”

Rasch (1960) had earlier alluded to this same condition:

It is tempting, therefore, in the cases with deviations of one sort or other to ask whether it is the model or the test that has gone wrong...the question is meaningful...the applicability of the model must have something to do with the construction of the test. (p. 51, our emphasis)

The model confirms or identifies inconsistencies by making experimental comparisons of agents, objects, and outcomes from the data. Inconsistent comparisons in an experiment suggest theory, data or both are suspect. Rasch’s penchant for constructing data plots as a “check on the model” indicates his awareness of the importance of quality control. We further draw attention to the role of construction or engineering as critical. Every experiment is a fabrication in the sense of manufacturing a desired outcome from theory and data. To the degree that we succeed we know what is required to produce the desired outcome. Failure or deviations indicate full knowledge is lacking.

In such instances, theory, data or both require further investigation. This is an important point for applying the model to data. To use an illustration from physics, as Rasch often did, Newton's law of motion rests upon his model and not on data. It is the law and not the data to which we attend. Furthermore, we recognize data is fraught with contamination from many sources of error. Hence, we generalize from Newton’s model of force/mass—his abstraction produced from contaminated data. Continuous validation has supported his theory until the advent of quantum physics required a new viewpoint, but even this evolutionary change has not obviated Newton's law.

Rasch’s example began with six heights, but only two of them (H3 and H4) provided unique and key information. With a supply of additional ashtrays made of different types (shapes, composition, etc.) we can proceed to make every two-way comparison of different ashtrays dropped from various heights until we have exhausted all the two-way (height by ashtray) comparisons useful for this crash test. Summarizing the findings provides types of ashtrays arranged by their capacity to withstand breakage according to height employed. A simple comparison of two similar ashtrays dropped from different heights (or two different ones dropped from the same height) may be extended as far as desired. We do not need to physically order the ashtrays, although we could do so, because a record of success and failure is sufficient. A summary of all the two-by-two comparisons will produce an ordered arrangement of all the ashtrays by all the heights employed. This process results in a durability/survival variable identifying ashtrays from the most fragile to the most durable, and from the lowest to highest heights in the frame of reference deemed useful for the experiment. We can also move from a two-way comparison to a multi-variable frame if desired.
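The step from a record of success and failure to an ordered arrangement can be sketched as follows. The ashtray names and outcomes are invented for illustration, and the ordering rule (count how many ashtrays each one is at least as durable as) is one simple assumption among several possible ones.

```python
# Hypothetical record: True = survived, False = broke, for six ordered heights.
RECORD = {
    "glass":   [True, True, False, False, False, False],
    "ceramic": [True, True, True, False, False, False],
    "steel":   [True, True, True, True, True, False],
    "plastic": [True, False, False, False, False, False],
}


def at_least_as_durable(a, b):
    """True if ashtray a survives at every height where ashtray b survives."""
    return all(ra or not rb for ra, rb in zip(RECORD[a], RECORD[b]))


def durability_order(names):
    """Order ashtrays from most fragile to most durable using pairwise comparisons.

    Raises ValueError if some pair cannot be ordered either way, i.e. if the
    pairwise comparison is not global over the heights used.
    """
    for a in names:
        for b in names:
            if not (at_least_as_durable(a, b) or at_least_as_durable(b, a)):
                raise ValueError(f"{a} and {b} are not consistently comparable")
    # Rank each ashtray by how many ashtrays it is at least as durable as.
    return sorted(names, key=lambda n: sum(at_least_as_durable(n, m) for m in names))


order = durability_order(list(RECORD))
print(order)
```

Only the qualitative outcomes and the order of the heights enter the calculation. Any numerals later attached to the resulting positions would summarize the comparisons rather than precede them.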

Any expansion embodies the essence of experimentation guided by theory. It produces an ever encompassing frame of reference for determining durability for a variety of ashtrays dropped from a variety of heights. We can confirm previous predictions from theory as well as make further predictions about unexamined ashtrays and unexamined heights. As we proceed, it may be possible to derive other predictive hypotheses regarding what is expected to occur. A theory of ashtray composition dropped from various heights will probably produce additional predictive insight. Outcomes may be increasingly predictive compared to what was known initially. Theory regarding ashtray composition and heights may become increasingly better understood. Essential to the process is confirmation by cross-validation. If desired, we can describe the results by numeric values noting once more that such “measures” follow comparison.

Campbell (1921) offers a relevant remark,

If measurement is really to mean anything, there must be some important resemblance between the property measured, on the one hand, and the numerals assigned to represent it on the other hand. In fundamental measurement this resemblance (or the most important part of it) arises from the fact that the property that is susceptible to addition is following the same rules as that of numbers. There is left resemblance in respect of “order.” ...Order then is characteristic of numerals; it is also characteristic of the properties represented by numerals. This is the feature which makes “measurement” significant. (pp. 126–127, our emphasis)

The key to measurement in psychometrics is (1) making systematic comparisons (items, persons, judges, etc.) and (2) making these comparisons objectively. Order becomes paramount. Assigning numerals in such cases is really an afterthought stemming from the initial pair-wise comparisons of ashtrays so as to systematize the findings and quantify the results. The numerals/numbers assigned and utilized become the summary of the experimental findings. They follow the experiment, but do not dictate it. Only when the results are valid and confirmed can we successfully apply/substantiate numeric/algebraic abstractions, and not the other way around. We construct measurement from results of the theory/experiment. The Celsius and Fahrenheit scales assign different numerals, yet a simultaneous measurement on both registers one and the same temperature.

In Rasch's words (1977),

“Objectivity is achieved when a comparison of any two objects is independent of everything else within the frame of reference other than the two objects which are to be compared and their observed reactions.” (p. 77)

Rasch (1977) also shares his emphasis upon comparison with the Scottish philosopher Hume (1949), who wrote,

All kinds of reasoning consist in nothing but a comparison, and a discovery of those relations, either constant or inconstant, which two or more objects bear to each other. (p. 77, italics in original)

Making systematic comparisons constitutes the fundamental process by which we determine essential differences. Systematic comparisons that possess transitivity produce order. Measures considered to be quantitative actually result from qualitative ordering. Comparison first makes clear how any two objects or agents relate to each other by showing whether one is “more” than the other. This is the ground of measurement. Hume stated the importance of comparison as a logico-philosophical deductive principle. Rasch specified it using a simple inductive example to deliver a mathematical generalization. Generalizing from the order established by comparison provides a ground for measurement. Systematic comparisons produce order and understanding without the necessity of any a priori measurement scheme. Ricoeur (1977) speaks to their value and power:

... the power of making things visible, alive, actual is inseparable from either a logical relation of proportion or a comparison… Thus one and the same strategy of discourse puts into play the logical force of analogy and of comparison—the power to set things before the eyes, the power to speak of the inanimate as if alive, ultimately the capacity to signify active reality. (pp. 34–35)

The two-way frame is the data analysis procedure for determining order from the results of systematic comparisons among the tripartite elements of agent, object and outcome. Comparisons derived from this two-way frame are objective whenever “they are independent of everything else within the frame of reference.” We restate this principle by some alternate expressions:

  1. The frame of reference provides the basis for making objective comparisons.

  2. Comparisons produce order in a frame of reference guided by theory.

  3. Order (for agents, objects and results) is demonstrable (or not) by the frame of reference.

  4. Comparisons are specifically objective when made within the frame of reference (2 and 3).
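
The idea that transitive pair-wise comparisons produce order, and that the frame either demonstrates order or fails to, can be illustrated with a minimal Python sketch. It is not taken from Rasch's paper. The function name `order_from_comparisons`, the ashtray labels, and the choice to count "wins" are all assumptions made for illustration only.

```python
from itertools import permutations

def order_from_comparisons(objects, beats):
    """Rank objects from 'most' to 'least' using only pair-wise comparisons.

    `beats` is a set of (a, b) pairs meaning "a is more than b".
    Returns None when the comparisons are incomplete or intransitive,
    i.e. when no order is demonstrable.
    """
    # Completeness: every distinct pair must be decided in exactly one direction.
    for a in objects:
        for b in objects:
            if a != b and ((a, b) in beats) == ((b, a) in beats):
                return None
    # Transitivity: a > b and b > c must imply a > c.
    for a, b, c in permutations(objects, 3):
        if (a, b) in beats and (b, c) in beats and (a, c) not in beats:
            return None
    # For a complete, transitive relation, counting wins recovers the order.
    wins = {o: sum((o, other) in beats for other in objects) for o in objects}
    return sorted(objects, key=lambda o: wins[o], reverse=True)

# Hypothetical ashtrays compared for durability (labels are illustrative).
ashtrays = ["glass", "ceramic", "steel"]
consistent = {("steel", "ceramic"), ("ceramic", "glass"), ("steel", "glass")}
cyclic = {("steel", "ceramic"), ("ceramic", "glass"), ("glass", "steel")}

ranked = order_from_comparisons(ashtrays, consistent)   # an order exists
no_order = order_from_comparisons(ashtrays, cyclic)     # None: intransitive
```

No numerals are needed to obtain the ranking; the win counts serve only as a bookkeeping device once the comparisons have been shown to be transitive.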

In a deterministic framework the consequences of comparison remain categorical (Guttman). In a stochastic framework the consequences of comparison are probabilistic (Rasch). Not every drop of a specified ashtray from a specified height produces exactly the same outcome. There will be a distribution of errors surrounding each comparison. But if the experimental conditions are the same, then a probabilistic result provides the answer. Residual analysis and misfit analysis become the important tools for evaluating the outcome of every comparison. Rasch did not address quality control directly in his 1977 paper, but he gave careful consideration to checking the fit of the model to the data, as shown by his constant attention to making plots and graphs. He relied on data plots, not correlations, to show confirmation or identify deviance. This remains good advice today.
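
The stochastic case can be sketched in Python using the familiar dichotomous Rasch model, in which the probability of a positive outcome depends only on the difference between an object measure and an agent measure. The sketch rests on assumptions: the measures are hypothetical values in logits, the sign convention (object minus agent) is one common choice, and the function names are illustrative. It shows that the comparison of two objects, expressed as a difference in log-odds, is the same whichever agent is used, and how a standardized residual is formed for a single observation.

```python
import math

def rasch_probability(object_measure, agent_measure):
    """Probability of a positive outcome under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(object_measure - agent_measure)))

def log_odds(p):
    """Natural log of the odds p / (1 - p)."""
    return math.log(p / (1.0 - p))

def comparison(measure_1, measure_2, agent_measure):
    """Log-odds difference between two objects exposed to the same agent."""
    p1 = rasch_probability(measure_1, agent_measure)
    p2 = rasch_probability(measure_2, agent_measure)
    return log_odds(p1) - log_odds(p2)

def standardized_residual(observed, expected):
    """(x - P) / sqrt(P * (1 - P)) for a dichotomous observation x in {0, 1}."""
    return (observed - expected) / math.sqrt(expected * (1.0 - expected))

# Hypothetical measures in logits (illustrative values only).
o1, o2 = 1.5, 0.5
agents = [-2.0, 0.0, 3.0]

# The comparison of o1 with o2 does not depend on which agent is used.
differences = [comparison(o1, o2, a) for a in agents]

# Residual for one observed positive outcome where the model expects P = 0.5.
residual = standardized_residual(1, rasch_probability(0.0, 0.0))
```

Each entry of `differences` equals `o1 - o2` up to floating-point rounding, because the agent measure cancels from the log-odds difference. Plotting residuals such as `residual` across the whole frame is in keeping with the preference for data plots noted above.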

The conclusions to be drawn from Rasch’s exposition are as follows:

  1. We construct science by making comparisons. These comparisons must be made by following a procedure leading to specific objectivity. Theory guides this process, but experimentation determines the outcome of hypotheses. This strategy assures clarity in the process and allows replications to confirm or refute the results.

  2. The two-way frame of reference specifies the agent, object and resultant outcomes in a predictive guise.

  3. The two-way frame of reference arranges and subsequently summarizes the comparisons. These comparisons are fundamental to what Rasch designated as specific objectivity.

  4. Measurement follows from the results of qualitative comparisons that have been constructed in a systematic way using order as the fundamental characteristic.

3 Additive Conjoint Measurement

Rasch’s experimental conditions demonstrate the first of two specifications essential for additivity, i.e., the rows and columns of a data matrix are monotonic for the data (Krantz et al., 1971). The variables of the ashtray experiment, namely composition, height and result, were ordered by a two-way frame of reference. Continued experimentation could sustain and expand the range, thereby extending the frame of reference.

A second property, double cancellation, identifies departures from additivity (Krantz et al., 1971). Luce and Tukey (1964, p. 3) show a simple way to graph the effects of two factors for demonstrating this property. Borsboom (2005) sees double cancellation as a consequence of additivity which “brings out the similarity between additive conjoint measurement and the Rasch model.” He indicates that the condition of independence is “similar to what Rasch called parameter separation” (p. 97).

Perline et al. (1979) presented data to argue Rasch’s psychometric model is a special case of additive conjoint measurement. Additivity is ascertained by examining these two necessary ordinal properties. Coombs et al. (1970) declare all independent variables are measured on an interval scale with a common unit when additivity exists.
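
The two ordinal properties can be checked directly on a small two-way table. The following Python sketch is not the procedure used by Perline et al. (1979). It assumes a complete matrix of cell values (for example, observed proportions), and the function names and both example matrices are hypothetical. Independence is taken here to mean that the row order is the same in every column and the column order is the same in every row. Double cancellation is taken in its usual form: for rows a, b, c and columns x, y, z, if cell (a, y) ≥ cell (b, x) and cell (b, z) ≥ cell (c, y), then cell (a, z) ≥ cell (c, x).

```python
from itertools import permutations, product

def satisfies_independence(m):
    """True if no pair of rows (or columns) reverses order across the table."""
    rows, cols = range(len(m)), range(len(m[0]))
    for a, b in permutations(rows, 2):
        # Row a may not be above row b in one column and below it in another.
        if any(m[a][x] > m[b][x] for x in cols) and any(m[a][x] < m[b][x] for x in cols):
            return False
    for x, y in permutations(cols, 2):
        # Column x may not be above column y in one row and below it in another.
        if any(m[a][x] > m[a][y] for a in rows) and any(m[a][x] < m[a][y] for a in rows):
            return False
    return True

def satisfies_double_cancellation(m):
    """True if the double cancellation condition holds for all triples."""
    rows, cols = range(len(m)), range(len(m[0]))
    for a, b, c in product(rows, repeat=3):
        for x, y, z in product(cols, repeat=3):
            if m[a][y] >= m[b][x] and m[b][z] >= m[c][y] and not m[a][z] >= m[c][x]:
                return False
    return True

# Hypothetical additive table: cell = row effect + column effect.
row_effects, col_effects = [0, 1, 2], [0, 1, 3]
additive = [[r + c for c in col_effects] for r in row_effects]

# Hypothetical table with monotonic rows and columns that is not additive.
non_additive = [[1, 4, 5],
                [3, 6, 9],
                [7, 8, 10]]

additive_ok = satisfies_independence(additive) and satisfies_double_cancellation(additive)
monotone_only = satisfies_independence(non_additive)
cancels = satisfies_double_cancellation(non_additive)
```

The second table shows why both properties are examined: its rows and columns are perfectly ordered, yet double cancellation fails, so no additive representation exists for it.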

The essential point of Perline et al. (1979) in their analysis of two studies was that Rasch model estimates (when data are orderly) correspond to the conditions for additive conjoint measurement, and that this correspondence can be demonstrated even when data fit is less than ideal.

Borsboom (2005) argues,

...the Rasch model has little to do with fundamental measurement. In fact, the only things that conjoint measurement and Rasch models have in common, in this interpretation is additivity. (p. 132)

Indirectly, this simply confirms the comparability of Rasch estimates of person and item parameters to additive conjoint measurement. The key property of additive conjoint measurement rests upon order determined from making comparisons, which returns us once more to the points made from Rasch's ashtray example.

Newby et al. (2002) provide a clear statement of the mathematical relationship between additive conjoint measurement (ACM) and the Rasch model saying,

... it is not the case that, given any data, the Rasch model will provide a natural numerical representation of the data that is comparable with ACM. (p. 350)

Why should the Rasch model address any data except to show whether or not the conclusions drawn from making constructive comparisons are justified within the frame of reference? Our point is that a so-called Rasch model analysis does not cleanse data. Nor does it signify the conclusion of data analysis. Further analysis must be conducted in the context of theory with clearly stated hypotheses. Removed from the context of theory, no analysis sui generis provides substantive answers except incidentally (Stenner et al., 2008). For example, no statistical transformation of so-called raw scores to IQ values, etc., improves a substantive understanding of intelligence or cognition.

Comparisons are the experimental result or consequence of an interaction of agents and objects guided by theory and hypotheses (Brogden, 1977). Comparison remains the key to any investigation. Content and substance are embodied in the comparisons selected.

Are the results obtained from any Rasch analyses descriptive, or are they predictive? Does the ashtray frame of reference only describe, or can it predict? Does what we learn from dropping a variety of ashtrays from a variety of heights yield merely descriptive observations? Can these results predict stochastically suggested outcomes? Rasch answered these questions by the quotes we provided earlier.

Concerning the parole data from Perline et al. (1979), we ask: can causality be ascribed to the outcome emanating from this Rasch frame of reference? What do we learn from the parole data that provides a prediction? Is it a truly causal matter? Can we ascertain any information from an experimental manipulation of the tripartite object, agent, and outcome in a frame of reference to produce expected outcomes thought to be causal? If so, the variable moves beyond mere description to prediction and the consequences move to a higher plane (Stenner et al., 2008). But how do we truly ascertain prediction and causality in Rasch measurement?

Our answer is that the typical Rasch analysis is largely descriptive: without a causal connection or experimental manipulation, it lacks evidence of causality. A Rasch analysis addresses order in the data, and may describe association or correlation, but it is not a direct instance of causal inference. It may be suggestive, but it is not confirmatory. This lack of evidence points to a crucial next step because, as Stenner et al. (2008) argue,

There is no single piece of evidence more important to a construct's definition than the causal relationship between the construct and its indicators (p. 1153).

With respect to the parole data we ask, how can these nine items be placed in an experimental framework so as to examine their usefulness and predictive validity? The answer is to put these comparisons in a context for testing causality. The Rubin-Holland (Holland, 1980, 1986) framework for demonstrating causal inference specifies no causation without manipulation. Rasch analysis has produced considerable understanding of the variables exemplified in the resulting map of items, persons, and results from the two-way framework. The results remain descriptive unless contained within an experimental context guided by theory. Supporting any experiment should be a theory guiding inferences about what produced these outcomes. In the matter of parole, for example, how is granting or not granting parole to be experimentally controlled? How will outcome be determined? The hallmark of success is formulation of a specification equation to predict and validate successful manipulation of the variables required in a probabilistic causal model portrayed in the measuring variable (Pearl, 2000).

We therefore specify a process model for classifying Rasch investigations as shown in Fig. 3.

Fig. 3 Strategies of comparison and order guided by theory leading to specific objectivity

  1. The comparison process begins with an idea, a conjecture, or history which develops into a thought experiment.

  2. The experiment is designed and conducted:

     Input → Manipulation → Output

  3. An equation may be formulated and specified:

     Domain → Function → Range

  4. Data plots, the analysis of misfit and residuals provide information and data/model control.

  5. From these strategies come the properties of comparison and order guided by theory which produce the essential ingredients for achieving specific objectivity.