1 The Atomic Model

This chapter examines the role of the Rasch (1960) model for dichotomous data from the perspective of first principles concerning the measurement of psychometric constructs. The chapter therefore begins with an atomic model for a single, dichotomous item. The descriptive term atomic implies that single-item models appear as basic units in more elaborate so-called molecular models that incorporate multiple items. Typically, one of the two outcomes for the dichotomous item is the favored or successful response, and this outcome is denoted as 1. The atomic model defines a construct on persons by specifying that each person has a probability of responding 1 to the item, and this probability varies from person to person. This success probability can, if we choose, be taken as the person parameter, that is, the true measure of a person on the construct defined by the atomic model. The variability from person to person is a sine qua non for the construct-defining property of the atomic model. A model for the tossing of a coin, in which each person has the same probability of obtaining the favored outcome (i.e., heads), does not define a construct on persons (Wood, 1978).

Two important observations can be made about the atomic model and its associated construct: (1) the quantity that represents the construct is a latent variable (i.e., it is not directly observable as data); and (2) the essential character of this latent variable is ordinal. We consider the implications of both of these observations in some detail, beginning with the latter.

The ordinal character referred to in the second observation above does not imply that the atomic model has only order relationships among the objects of measurement (i.e., persons). The success probabilities are a valid numerical representation of the construct. Ordinal character instead implies that any monotonic transformation of the success probabilities would be as valid a representation of the construct as the success probabilities themselves. Thus, the probabilities could be converted to logits or probits, perhaps followed by multiplicative rescaling. Transforming the probabilities from an atomic model into logits yields a Rasch model with the difficulty for the single item set to zero. Setting the difficulty parameter to something other than zero in the Rasch model for a single item produces an alternative monotonic transformation. If that transformation is followed by a multiplicative rescaling, the result is the two-parameter model for a single item with specified location and discrimination parameters (Birnbaum, 1968). Because of the ordinal character of the construct obtained from the atomic model, there is no inherent reason to prefer any particular monotonic transformation of the probabilities over any other. Within the confines of the atomic model, the choice of a numerical structure to represent the order relationships is a matter of personal preference.
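
To make the ordinal-equivalence point concrete, the following sketch (illustrative Python, with hypothetical probability values and an arbitrary location and scaling) applies several monotonic transformations to the same set of success probabilities and confirms that each induces the same ordering of persons.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical success probabilities for five persons on a single dichotomous item.
p = np.array([0.15, 0.30, 0.50, 0.70, 0.90])

logits = np.log(p / (1 - p))       # logit transform: Rasch model with item difficulty fixed at 0
probits = norm.ppf(p)              # probit transform
two_param = 0.5 + logits / 1.2     # shifted and rescaled logits: a two-parameter reparameterization

# Every strictly monotonic transformation induces the same ordering of persons,
# which is all the atomic model can pin down about the construct.
for scores in (logits, probits, two_param):
    assert np.array_equal(np.argsort(scores), np.argsort(p))
```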

The first of the two important observations has critical implications for measurement. When the objective is measurement, it is not enough to define the construct in terms of a latent variable. The sine qua non for measurement is the ability to use observable data to discriminate reliably between objects that differ on the construct by an amount large enough to matter. In almost all situations, the one bit of information produced by the observable response to a single dichotomous item does not yield the reliable discrimination required for measurement. Reliable discrimination requires replication, and that in turn requires we broaden the atomic model to accommodate more items.

Some of the previously introduced concepts (e.g., a construct difference that is considered large enough to matter) as well as concepts to be introduced subsequently are context-dependent. It is helpful to have a prototype example to which to refer when discussing these context-dependent concepts. As such an example, consider an item from a reading test in which a passage is presented followed by a task, scored as correct or incorrect, to assess the reader's understanding of the passage. The construct is reader ability. We assume that better readers have higher probabilities of succeeding at the assessment task.

The inadequacy of the discrimination obtainable from a single dichotomous item is easy to see in this example. The difference between a reader with a 40% chance of success and a reader with an 80% chance of success is almost certainly large enough to matter. There is a less than even chance of successful detection of this difference with the single item. Hence, replication is needed.

The purpose of replication is to generate new information that can be used to estimate the value of the latent variable with less uncertainty. In many other dichotomous experiments (e.g., coin-tossing), the experiment can be repeated and the second outcome assumed to be independent of the first, but the assumption of independence for a repeat encounter is not appropriate in the current context. If a reader is presented again with the same passage and the same assessment task, the same outcome is virtually assured to occur. When the second outcome is independent of the first, it provides new information that can be used to reduce uncertainty about the latent variable; when the second outcome is perfectly predictable from the first, no new information is produced. This dependence problem can be overcome by introducing a new passage and task as the replication, but in this situation, the single-item atomic model is no longer adequate. There are now at least two atoms to consider. Continuing the metaphor, a model that represents two or more dichotomous items will be called molecular.

2 Molecular Models

With two dichotomous items, call them Item A and Item B, each could be represented by its own atomic model. Thus, each person has two probabilities, pA and pB, that represent the probability of a correct response on Item A and Item B, respectively. These two latent variables could potentially represent two distinct constructs. A third possible construct, represented by the latent variable equal to the sum of the two probabilities, comes readily to mind.

The replication needed to achieve measurement requires multiple items whose atomic models represent the same construct. What conditions must be imposed on the molecular model to ensure that its constituent atomic models define the same construct? The answer lies in the essential ordinal character of the construct obtained from the atomic model.

In the two-atom molecular model, the set of success probabilities for Item A generates a rank ordering of persons that allows the possibility of ties. A similar statement applies to Item B. The two atomic models define the same construct if and only if the success probabilities for the two items generate the same rank ordering of persons, including ties. A set of two or more items whose atomic models define the same construct is said to satisfy the unidimensionality condition. It is worth noting that unidimensionality may depend on the population of persons. A set of items that satisfies unidimensionality for a given population may not be unidimensional when the population is extended.

When the unidimensionality condition is satisfied for a two-atom molecular model, the constructs derived from the two atomic models and the construct obtained from the sum of the two success probabilities are all identical. The latent variables that arise from the two atomic models are related by a strictly monotonic transformation. The proof of this assertion is straightforward. Define a function h as follows:

If a person has success probabilities pA and pB, then h(pA) = pB.

The preservation of ties implies that h is unambiguously defined. The preservation of rank order implies that h is strictly monotonic. The ordinal character of the construct implies that any strictly monotonic transformation of a valid numerical representation is another valid numerical representation of the same construct.
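
A minimal sketch of the unidimensionality check, using hypothetical success probabilities for Items A and B, illustrates both the comparison of rank orderings (ties included) and the resulting mapping h.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical success probabilities for six persons on Items A and B.
p_A = np.array([0.20, 0.35, 0.35, 0.60, 0.80, 0.95])
p_B = np.array([0.10, 0.25, 0.25, 0.50, 0.75, 0.90])

# Unidimensionality: the two items rank the persons identically, ties included.
same_ordering = np.array_equal(rankdata(p_A), rankdata(p_B))
print(same_ordering)   # True: the two atomic models define the same construct

# The mapping h with h(p_A) = p_B is then unambiguous and strictly increasing.
h = dict(zip(p_A, p_B))
print(sorted(h.items()))
```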

More than two items of replication are generally necessary to achieve the measurement objective. A multi-item instrument can provide adequate replication if the items satisfy the unidimensionality condition (Stout, 1990). The question is whether unidimensionality is a reasonable assumption in a given context. For an answer in the context of the prototype example, consider a test of reader ability that consists of 40 dichotomous items. The assumption of unidimensionality asserts that examinees can be ordered by reading ability such that for every item on the test, more able readers have higher success probabilities than less able readers.

3 Parameterizations

As mentioned above, any monotonic transformation of the success probabilities in an atomic model can be used as a numerical representation of the person parameter for the construct defined by that model. Let θ = θ(p) be the result of applying a monotonic transformation to a success probability p. The change from p to θ can be regarded as a reparameterization of the atomic model. The inverse function p = p(θ), which is also monotonic, is called the item characteristic function. Note that the form of the item characteristic curve depends on the reparameterization, and the selection of the reparameterizing monotonic transformation is arbitrary. If success probabilities are converted to logits, the item characteristic curve will be the logistic ogive. If success probabilities are converted to probits, the item characteristic curve will be the normal ogive.
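
The dependence of the item characteristic curve on the chosen reparameterization can be illustrated with a short sketch (hypothetical values; the logistic and normal distribution functions from scipy stand in for the inverses of the logit and probit transformations).

```python
import numpy as np
from scipy.stats import norm
from scipy.special import expit

# The item characteristic function is the inverse of the chosen reparameterization.
theta = np.linspace(-3, 3, 7)

p_logistic = expit(theta)    # theta defined as the logit of p:  logistic ogive
p_normal = norm.cdf(theta)   # theta defined as the probit of p: normal ogive

# Same underlying ordinal construct, different curve shapes under different parameterizations.
print(np.round(p_logistic, 3))
print(np.round(p_normal, 3))
```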

In a molecular model that incorporates multiple items, each person has multiple success probabilities, one for each item in the model. If the model satisfies the unidimensionality condition, it unambiguously determines a single construct of ordinal character. When a parameterization of that construct is selected, each person can be mapped to the construct with a parametric value that determines all of that person's success probabilities.

For example, the construct from a unidimensional molecular model could be parameterized in terms of the success probabilities for a single canonical item, say Item A. For any other item in the model, say Item B, there is a monotonic function hB that satisfies pB = hB(pA) for all persons. For this parameterization, the item characteristic curve for Item A will be a straight line from the origin to the point (1, 1). For Item B, the item characteristic curve need not be a straight line but will start at the origin and increase monotonically until it reaches (1, 1). If instead values for a latent variable θ are obtained by transforming the success probabilities for Item A into logits or probits, the item characteristic curve pA(θ) for A will be the logistic or normal ogive, and Item B's characteristic curve will be hB(pA(θ)).
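
The following sketch illustrates this canonical-item parameterization with a hypothetical link function hB (the squaring function is purely illustrative); under the canonical parameterization Item A's curve is the identity line, and under a logit reparameterization both curves become functions of θ.

```python
import numpy as np
from scipy.special import expit

# Parameterize the construct by the success probability on a canonical Item A.
p_A = np.linspace(0.0, 1.0, 11)

# A hypothetical strictly increasing link h_B relating Item B to Item A (illustrative only).
def h_B(p):
    return p ** 2            # maps 0 to 0 and 1 to 1, so the curve runs from (0, 0) to (1, 1)

icc_A = p_A                  # identity: a straight line from the origin to (1, 1)
icc_B = h_B(p_A)             # monotone, not necessarily straight

# Under the alternative parameterization theta = logit(p_A), the curves become
# p_A(theta) = expit(theta) and h_B(p_A(theta)).
theta = np.linspace(-4, 4, 11)
icc_A_theta = expit(theta)
icc_B_theta = h_B(expit(theta))
print(np.round(icc_B_theta, 3))
```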

4 Unidimensionality, Replication, and Measurement

At this point it is appropriate to return to the prototype example to examine the progress toward the goal of achieving measurement. Two questions should be raised when a new item is included in the assessment instrument: (1) Is the new item on the construct? and (2) Does its inclusion provide new information about the person's location on the construct (i.e., the person parameter)? To achieve our goal of measurement, we need affirmative answers to both questions. The first question can be answered yes when the unidimensionality condition holds, and the second can be answered yes if the new item has the property of local independence with the other items in the instrument. Local independence occurs when the responses to two items by the same person are statistically independent. In a model in which the person parameter is not represented as a random variable, local independence is identical to statistical independence. If the person parameter is a random variable, as in some Bayesian models, local independence is a conditional independence, given the person parameter. In either case, the addition of new items with the local independence property produces replication that leads to a reduction in the standard error of measurement.
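
As a rough illustration of how locally independent replication reduces the standard error of measurement, the following sketch uses the logit parameterization with hypothetical item difficulties and sums the item information for tests of increasing length.

```python
import numpy as np
from scipy.special import expit

# A sketch of how locally independent, on-construct items reduce the standard error of
# measurement (SEM), using the logit parameterization and hypothetical item difficulties.
theta = 0.5
rng = np.random.default_rng(0)

for n_items in (5, 10, 20, 40):
    b = rng.uniform(-2, 2, size=n_items)    # hypothetical item difficulties
    p = expit(theta - b)                    # this person's success probabilities
    information = np.sum(p * (1 - p))       # test information adds up under local independence
    sem = 1.0 / np.sqrt(information)
    print(n_items, round(sem, 3))
```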

Achieving the measurement objective requires replication, but how much replication is required to achieve reliable discrimination between objects that differ by an amount large enough to matter? The answer depends on the context. The finer the differences to be discriminated and the higher the desired level of confidence in the discrimination, the more replication is needed. In the prototype example, if a valid reading test does not have sufficient reliability for the purpose at hand, it may be possible to find more items with affirmative answers to Questions #1 and #2 that could be appended to the test to increase the reliability to the desired level.
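
One common way to quantify how much additional replication is needed is the Spearman-Brown prophecy formula, which is not developed in this chapter but is sketched below with illustrative reliability values.

```python
# A back-of-the-envelope sketch using the Spearman-Brown prophecy formula (not from the
# chapter) with illustrative reliabilities, to show how the required replication grows
# with the target reliability.
current_reliability = 0.80
target_reliability = 0.90

# Lengthening factor k for parallel, on-construct items:
k = (target_reliability * (1 - current_reliability)) / (
    current_reliability * (1 - target_reliability))
print(round(k, 2))   # 2.25: a 40-item test would need roughly 90 comparable items
```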

At this stage, the unidimensional molecular model can be enhanced with a replication capability by assuming an inexhaustible supply of new on-construct items. Without making further assumptions, what are the measurement implications of this replication-enhanced, unidimensional model? The answer is that reliable discrimination can be achieved between two objects whose construct values differ, but only to the extent of determining the order relationship between the two objects. In the prototype example, suppose John and Alice have different reading abilities. If replication is available via an inexhaustible supply of reading items, but all that can be said of these items is that they have local independence and are on construct (the unidimensionality assumption), then a test that consists of a sufficient number of these items can determine who is the better reader but not by how much. Being on construct (i.e., ordering persons identically) is not a stringent enough assumption about items to yield any information about persons beyond their order relationships. Measurement on an interval scale requires more specificity in the assumptions about the atomic models for the items.
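
A small simulation sketch (hypothetical item probabilities) illustrates the point: with enough on-construct items the ordering of John and Alice is recovered reliably, but the size of the score gap remains tied to the arbitrary choice of items.

```python
import numpy as np

# A simulation sketch: with only unidimensionality and local independence, enough items
# settle who is the better reader, but the size of the score gap remains scale-dependent.
rng = np.random.default_rng(1)
n_items = 400

# Hypothetical item-level success probabilities; Alice exceeds John on every item.
p_john = rng.uniform(0.30, 0.50, size=n_items)
p_alice = p_john + 0.15

john_score = rng.binomial(1, p_john).sum()
alice_score = rng.binomial(1, p_alice).sum()
print(alice_score > john_score)   # with enough items, the ordering is detected reliably

# The numerical gap depends on the arbitrary item probabilities, so it says nothing
# about "how much better" on an interval scale.
print(alice_score - john_score)
```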

5 The Stringency Construct for Model Specifications

It is possible and often useful to contemplate a stringency construct defined for statistical models. Ordinal position on this construct is determined by the stringency of the assumptions incorporated into the statistical model. These assumptions are called the model's specification, and a so-called tight specification has more stringent assumptions than a loose one. The molecular model that asserts only that there is some encounter-specific probability of success associated with each encounter between a person and an item is a loose specification—the loosest under consideration. Incorporating local independence and unidimensionality tightens the specification but only enough to allow ordinal measurement even though the model is based on a numerical latent variable. The Rasch model is clearly an even tighter specification because it assumes more about the items' atomic models.

If we denote by ULI the model that assumes only unidimensionality and local independence, there is a stringency difference between ULI and the Rasch model. Two questions naturally arise: (1) What further tightening assumptions must be added to the ULI model to yield the Rasch model specification? and (2) Are there any important model specifications located between ULI and the Rasch model on the stringency construct?

The answer to the second question is yes. When the roles of persons and items are reversed in the molecular model, the success probabilities associated with a person can constitute an atomic model for a construct on items. There are as many such atomic models as there are persons. Application of the assumption of unidimensionality to these atomic models tightens the specification. Models can satisfy person unidimensionality without satisfying item unidimensionality. Models that satisfy both unidimensionality conditions are called doubly monotonic models. As the discussion in the next section demonstrates, the doubly monotonic model specification is tighter than person unidimensionality but looser than the Rasch model specification.

6 Doubly Monotonic Models

In the prototype example, it may happen that for each reader, the success probability is lower for Item B than for Item A. As a possible explanation for this feature, perhaps the passage in Item B presents more of a challenge to comprehension than does the passage for Item A (e.g., more complex syntax and more difficult vocabulary). This suggests the possibility that a text readability construct for items might be obtainable from a molecular model that incorporates multiple persons and items. So far, persons have been the objects of measurement and items have been regarded as instruments for measuring the person construct. These roles can be reversed. Consider a molecular model with multiple persons and items and a success probability for each encounter between a person and an item. There is an atomic model associated with each person, and each of these atomic models defines a construct on items. The unidimensionality condition will be satisfied if the rank order of items is consistent for all persons’ atomic models. In the prototype example, this unidimensionality of the persons’ atomic models implies that the text readability rank of two passages will not depend on who happens to be reading them.

A molecular model with multiple items and persons has two sets of atomic models: one set provides rankings of persons for each item and the other provides rankings of items for each person. If unidimensionality is satisfied for both sets, it is called a doubly monotonic model. If a doubly monotonic model is capable of replication, then ordinal measurement is enabled for persons if enough items are included and for items if enough persons are included.

When a molecular model has double monotonicity, the item characteristic curves do not cross regardless of the parameterization. Conversely, if a unidimensional model does not have double monotonicity, the item characteristic curves will be monotonic, but at least one pair of curves will cross. In a doubly monotonic model, the person characteristic curves also do not cross. If the atomic models for persons define a unidimensional construct on items, the person characteristic curves will be monotonic. If there is any crossing of these person characteristic curves, the molecular model will not define a unidimensional construct on persons.
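
For a finite array of success probabilities, double monotonicity can be checked directly, as in the following sketch with a hypothetical persons-by-items array.

```python
import numpy as np

# A hypothetical persons-by-items array of success probabilities, with persons listed
# from least to most able on Item 1.
P = np.array([
    [0.20, 0.15, 0.10],
    [0.40, 0.30, 0.25],
    [0.70, 0.60, 0.50],
    [0.90, 0.85, 0.80],
])

# Person unidimensionality: every column increases down the rows, so every item orders
# the persons identically and the item characteristic curves do not cross.
person_unidim = bool(np.all(np.diff(P, axis=0) > 0))

# Item unidimensionality: every row decreases left to right, so every person orders
# the items identically and the person characteristic curves do not cross.
item_unidim = bool(np.all(np.diff(P, axis=1) < 0))

print(person_unidim and item_unidim)   # True: the array is consistent with double monotonicity
```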

7 Numerical Conjoint Measurement Models

The Rasch model is an example of a numerical conjoint measurement model. With the Rasch model, when the success probabilities are transformed to logits, the result is an array of numbers that can be expressed as differences between a person parameter that does not change from item to item and an item parameter that does not change from person to person. In other words, the interaction in the array of success probabilities can be removed by transforming them to logits. This feature, existence of a monotonic transformation to remove the interaction from the array of success probabilities, characterizes a numerical conjoint measurement model. A numerical conjoint measurement model is necessarily doubly monotonic.
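
A brief sketch with hypothetical person and item values illustrates the interaction-removal property: a Rasch-type array of success probabilities, once transformed to logits, decomposes additively into person and item terms.

```python
import numpy as np
from scipy.special import expit

# Hypothetical person and item parameters on the logit scale.
theta = np.array([-1.0, 0.0, 0.5, 1.5])     # persons
b = np.array([-0.5, 0.2, 1.0])              # items

# Rasch-type persons-by-items array of success probabilities.
P = expit(theta[:, None] - b[None, :])

# Transforming the probabilities to logits removes the person-by-item interaction:
logits = np.log(P / (1 - P))
print(np.allclose(logits, theta[:, None] - b[None, :]))   # True: additive, interaction-free

# Parallelism: each item's logit column differs from every other by a constant shift.
print(np.round(logits[:, 0] - logits[:, 1], 6))            # constant across persons
```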

The absence of interaction means that the transformed success probabilities can be expressed as differences between a latent variable for persons and a latent variable for items. Consequently, the constructs for persons and items (e.g., reader ability and text readability in the prototype example) can be expressed on a common interval scale. The absence of interaction also implies that item and person characteristic curves are horizontally parallel, that is, any item characteristic curve can be transformed to any other by moving it either left or right. A similar statement applies to person characteristic curves. The parallelism implies a trade-off: a given difference between two reader measures can be exchanged for an identical difference between two text measures while holding the success probability (i.e., the comprehension rate) constant.

The Rasch model specification is not the only one with the numerical conjoint measurement property. For every numerical conjoint measurement model, there is a monotonic transformation that removes the interaction from the array of success probabilities. An important feature of the logistic transformation, and thus a feature unique to the Rasch model, is the property that the raw scores for persons and the item p values are sufficient statistics for estimating the respective latent variables.

This is not to say that the Rasch model is an instantiation of the theory of conjoint measurement (Luce & Tukey, 1964; Krantz, Luce, Suppes, & Tversky, 1971). The Rasch model is not concerned with the ordinal and equivalence relations necessary and sufficient for additive representation (i.e., those entailed by the hierarchy of cancellation axioms; Scott, 1964).

8 The Score Sufficiency Condition and Its Implications

Sufficiency is an important technical term in the language of statistical inference. It is especially important in the current context because of its implications with respect to the Rasch model, which is unique in having the property that the raw score—the total number of correct responses—is a sufficient statistic for estimation of the person parameter. Raw score sufficiency means that, once we know the total number of correct responses, we can learn nothing more about the person parameter from the response pattern.

The precise statement of this result is as follows. Assume that the multi-item, multi-person molecular model is unidimensional for persons and that local independence holds. If, in addition, the raw score is a sufficient statistic for the person parameter, then the molecular model is necessarily a Rasch model. In other words, in the context of a unidimensional molecular model with local independence, the reparameterization obtained by transforming the success probabilities to logits produces horizontally parallel item characteristic curves.

The proof is rather straightforward. Let Item A and Item B represent two arbitrarily selected items, and let pA(θ) and pB(θ) denote the success probabilities for these items expressed as an arbitrary monotonic function of a person parameter θ. Raw score sufficiency implies that the conditional probability distribution of response patterns with the same raw score does not depend on the person parameter. Consider a response pattern in which Item A is answered correctly but Item B is not, in comparison to a pattern in which the only change is to reverse these two responses (i.e., Item B correct, Item A not). These two patterns have the same raw score. The ratio of the probabilities of these patterns in the conditional distribution is the same as their ratio in the unconditional distribution.

If this ratio is denoted by R, it is defined by the equation:

$$ p_{A} \left( 1 - p_{B} \right) = R \left( 1 - p_{A} \right) p_{B} , \tag{1} $$

where the common factors for the responses to other items have been canceled out. Dividing this equation by (1 - pA)(1 - pB) and taking logarithms yields:

$$ \ln\frac{p_{A}}{1 - p_{A}} = \ln\frac{p_{B}}{1 - p_{B}} + \ln R . $$

Because the ratio R applies to the conditional as well as to the unconditional distribution and the unconditional distribution is independent of the person parameter, R cannot depend on the person parameter θ. This implies that the item characteristic curves are horizontally parallel when the success probabilities are transformed to logits. That is the defining characteristic of the Rasch model.
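
The argument can be verified numerically. In the sketch below (hypothetical item parameters), the pattern-probability ratio R is constant in θ when the two items share a common discrimination, as the Rasch model requires, and varies with θ when the discriminations differ, so that sufficiency of the raw score fails.

```python
import numpy as np
from scipy.special import expit

def pattern_ratio(theta, a_A=1.0, a_B=1.0, b_A=-0.3, b_B=0.8):
    """Ratio R of P(A correct, B incorrect) to P(A incorrect, B correct) at ability theta."""
    p_A = expit(a_A * (theta - b_A))
    p_B = expit(a_B * (theta - b_B))
    return (p_A * (1 - p_B)) / ((1 - p_A) * p_B)

thetas = np.array([-2.0, 0.0, 2.0])

# Rasch case (equal discriminations): R is the same at every theta, so the raw score
# exhausts the information about the person parameter.
print(np.round(pattern_ratio(thetas), 4))

# Non-Rasch case (unequal discriminations): R varies with theta, so the response
# pattern carries information beyond the raw score.
print(np.round(pattern_ratio(thetas, a_A=1.5, a_B=0.7), 4))
```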

9 Tightening via Theory

Although the Rasch model, in which person abilities and item difficulties are parameters to be estimated from data, is the tightest model yet considered, further tightening of the model specification is possible and desirable. As an alternative to a data-based empirical approach for the estimation of item difficulties, a theory might be used to predict an item's difficulty from characteristics of the item. In the prototype example, a readability formula for the text passage might be used as a predictor of item difficulty.

Tightening a model's specifications is, however, a two-edged sword. This process increases the number of ways in which a model can be wrong, which can hamper a model's usefulness. On the other hand, tightening can enhance a model's capability for measurement, which adds value to the model. What then is the enhanced capability afforded by a theory of item difficulty? For an answer to this question, consider the prototype example and the enhancement that occurs at various stages of tightening the model specification.

In the prototype example, the assumption of unidimensionality under replication allows the comparison between John and Alice as to who is the better reader. That specification is too loose to allow a data-based answer to the question of how much better. The Rasch model specification with person and item parameters to be estimated is tighter still, and with the assumption of adequate replication for both items and persons, this model can answer the question of how much better Alice is than John. The answer to this question does not change when other items that satisfy the specification are used instead of the original items. The capability of providing a measure of the difference in reading ability between John and Alice independent of the items (qua instrument) used to effect the measurement is called specific objectivity (Rasch, 1977).

How well does Alice read? This is a question about Alice's reading ability apart from any comparison with the reading ability of John or anyone else. Suppose the only data available from which to infer an answer to this question are Alice's responses to a set of dichotomous reading items. The Rasch model specification with undetermined item difficulties does not have the capability to provide an answer. When the Rasch model is tightened by using theory to determine item difficulties, the question can be answered using only the data from Alice's responses. The enhancement provided by the theory-based determination of item difficulties is substantial. Alice now has an absolute measure on the reading ability construct, as distinct from a measure relative to another person, and this measure is independent of the items (qua instrument) used to obtain it.
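
A sketch of this absolute measurement is given below: with hypothetical theory-based item difficulties treated as known, Alice's measure is obtained from her responses alone by maximizing the Rasch likelihood.

```python
import numpy as np
from scipy.special import expit
from scipy.optimize import minimize_scalar

# A sketch of answering "How well does Alice read?" from her responses alone, assuming
# hypothetical theory-based item difficulties (in logits) are already known.
b = np.array([-1.2, -0.5, 0.0, 0.4, 0.9, 1.5, 2.0])
responses = np.array([1, 1, 1, 0, 1, 0, 0])          # Alice's hypothetical item responses

def negative_log_likelihood(theta):
    p = expit(theta - b)
    return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))

theta_hat = minimize_scalar(negative_log_likelihood, bounds=(-6, 6), method="bounded").x
print(round(theta_hat, 3))   # Alice's absolute measure on the theory-anchored scale
```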

The use of substantive theory in the form of a construct specification equation (Stenner et al., 1983) adds stringency to the model specification. The specification equation has several other uses. First, it explains the variation detected by an instrument: the specification equation includes just those features of the measurement context that cause variation in success probabilities. In the prototype example, the construct theory states that as we move up the scale, we will encounter text that places higher syntactic and semantic demands on the reader, and the specification equation includes proxies for these two text features. As these text features are manipulated, the theory predicts changes in the observed item difficulties. Some argue that there is no more compelling validity evidence than causal control over the variation an instrument detects. Second, the specification equation brings nontest behavior into the measurement frame of reference. In the Lexile Framework for Reading, books are imagined to be tests with theoretical calibrations provided by the specification equation. The Rasch model is solved for the reader ability given an arbitrary but useful relative raw score of 75% and the theoretical item calibrations. The resulting reader measure required to answer correctly 75% of the virtual test items is the text readability measure assigned to the book. Thus, an important nontest behavior, comprehension of a particular book by a particular reader, can be forecasted. Third, the specification equation can generate item calibrations for reading items that have been built by a software program. This application enables one-off instruments to be used with each examinee and then disposed of, as with disposable (single-use) thermometers. The specification equation and item engineering rules maintain the unit from instrument to instrument (Stenner et al., 2006).
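
The second use can be sketched numerically: given hypothetical theoretical calibrations for a book's virtual test items, the Rasch model is solved for the reader ability at which the expected success rate equals 75%.

```python
import numpy as np
from scipy.special import expit
from scipy.optimize import brentq

# Hypothetical theoretical calibrations (logits) for the "virtual test" built from a
# book's text, as might be produced by a construct specification equation.
b = np.array([-0.6, -0.2, 0.1, 0.4, 0.9, 1.3])

# Solve the Rasch model for the reader ability at which expected success is 75%.
def expected_success_gap(theta):
    return expit(theta - b).mean() - 0.75

theta_75 = brentq(expected_success_gap, -6, 6)
print(round(theta_75, 3))   # the text readability measure assigned to the book (in logits)
```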

10 Applying the Framework

There are important issues to consider when formulating and evaluating models for the purpose of effecting measurement from dichotomous data. Dimensionality, differential item functioning, and the number of parameters needed to represent item characteristics are examples of such issues. The developments presented in this chapter can provide a framework for addressing these issues when we formulate and evaluate models.

As an example of the framework in action, suppose the items on a test of verbal ability have been organized into subtests labeled comprehension and vocabulary. Do these subtests measure two distinct person constructs, or is the full test unidimensional? The framework provides a basis for the conduct of data analysis to answer this question. At issue is whether the ordering of persons is the same for the two subtests. Unidimensionality within each subtest and the ability to replicate are the only assumptions needed for measurement to provide the answer. In practice, however, there could be a reason to tighten the specification. Observed between-subtest differences in the rankings of persons could occur despite overall unidimensionality as a result of measurement error. Assessing the statistical significance of the departure from overall unidimensionality may therefore require assumptions about the interrelationships among the items' atomic models that are more stringent than within-subtest unidimensionality.
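
A rough simulation sketch of such an analysis is given below; the data are generated from a single hypothetical construct, so the two subtest orderings of persons should agree up to measurement error.

```python
import numpy as np
from scipy.special import expit
from scipy.stats import spearmanr

# A rough simulation sketch: if both subtests tap the same construct, the person
# orderings they produce should agree up to measurement error.
rng = np.random.default_rng(2)
ability = rng.normal(size=200)                                   # hypothetical common construct
comp = rng.binomial(1, expit(ability[:, None] - rng.uniform(-1, 1, 15)))
vocab = rng.binomial(1, expit(ability[:, None] - rng.uniform(-1, 1, 15)))

rho, _ = spearmanr(comp.sum(axis=1), vocab.sum(axis=1))
print(round(rho, 3))   # a high rank correlation is consistent with overall unidimensionality
```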

This is just one example. Other important questions can be similarly addressed from the perspective of the framework.

11 Summary and Conclusion

A typical psychometric model represents the encounter between a person and a dichotomous item as a probability. For a given item, the probabilities associated with persons define a construct that is ordinal in character, that is, the probabilities can be arbitrarily subjected to a monotonic transformation without changing the character of the construct. Although a construct may be defined by the model for a single item, the single-item model is insufficient for measurement, which requires replication, and replication requires multiple items that all define the same construct. Because single-item models are the basic building blocks for multiple-item models, it is natural to use the terms atomic and molecular for single-item and multiple-item models, respectively.

Molecular models involve assumptions about the relationships between the atomic models of the items, and these assumptions vary as to their stringency. This variation enables us to locate model specifications on an ordinal stringency construct where less stringent specifications are described as loose and more stringent ones as tight. Tighter specifications have less data-fitting flexibility but compensate with features that enhance their usefulness for measurement. The Rasch model is a tight specification that enables conjoint measurement. Although other, equally tight specifications enable conjoint measurement, the Rasch model is the only one for which the raw scores and item p-values provide all of the information that is relevant to the measurement of persons and items on the same scale. When the Rasch model is further tightened with item difficulties specified by theory, each person can be measured on an absolute scale with a measurement derived from single-use items.

For a given measurement application, the choice of model is likely to depend on particulars of the context. The concepts of atomic and molecular models and the location of a model specification on the stringency construct provide a framework that can help guide the choice of a model. The framework can also help to guide analyses of issues such as multidimensionality and differential item functioning that can threaten the validity of model-based inferences.