
Defining, Measuring, and Scaling Affective Constructs


Abstract

Affective characteristics, such as attitudes, self-efficacy, and values, are impossible to observe directly in other humans; they are latent constructs. Latent constructs are variables that we cannot observe directly; instead, we infer their existence through observed variables. This chapter introduces the theoretical concept of the latent variable and the important role it plays in understanding affective characteristics. In addition, we describe the operationalization of a latent construct, the process in which instrument developers make both substantive and methodological decisions about the treatment of the observed variables that they are using to model the affective characteristic. This chapter aims to aid the instrument developer in the question or item construction process by highlighting the numerous theoretical and empirical implications of the construct definition, measurement, and scaling choices available. It provides both historical perspectives and a review of recent research within the areas of scaling, item construction, and response scale construction.

The parameter is what we aim to estimate; the corresponding statistic represents our current best estimate of it. Just so, the trait is what we aim to understand, and the corresponding construct represents our current best understanding of it.

Jane Loevinger (1957, p. 642)
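
To make the latent-variable idea concrete, consider a small simulation, shown below as a minimal sketch (Python with NumPy; the trait, loadings, and sample size are illustrative inventions, not values from the chapter). An unobservable trait drives several observed indicators, and the indicators correlate with one another only because they share that latent cause:

    import numpy as np

    rng = np.random.default_rng(42)
    n = 1000  # respondents

    # The latent construct (e.g., attitude toward school): unobservable,
    # standardized to mean 0 and SD 1.
    theta = rng.normal(0, 1, n)

    # Four observed indicators, each a noisy reflection of the trait.
    # Each loading gives the strength of the item-construct relationship.
    loadings = np.array([0.8, 0.7, 0.6, 0.5])
    noise = rng.normal(0, 1, (n, 4)) * np.sqrt(1 - loadings**2)
    items = theta[:, None] * loadings + noise

    # The model-implied correlation between items i and j is
    # loadings[i] * loadings[j]; the observed matrix recovers this.
    print(np.corrcoef(items, rowvar=False).round(2))

In an instrument-development setting the logic runs in reverse: only the item responses are observed, and the latent trait and the loadings are inferred from the pattern of intercorrelations among the items.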

Notes

  1.

    It should also be noted that measures can be viewed as causes of latent constructs (Bagozzi and Fornell 1982; Blalock 1964, 1971; Bollen and Lennox 1991).

  2.

    Edwards and Bagozzi (2000, p. 156) stress that a measure refers not to the instrument used to gather data or to the act of collecting data, but to the score generated by these procedures.

  3.

    Thurstone also developed a technique that used paired comparisons. After the set of items had been scaled by the judges, items were paired with other items with similar scale values, and sets of paired comparisons were developed. In some cases, each item was paired with all other items from other scales on the instrument, and respondents were asked to select the item from the pair that best described the target object. Thus, readers should be aware that some references to Thurstone scaling are actually references to Thurstone's method of paired comparisons.

  4.

    Up to this point, scale has been used to represent a cluster of items on a particular instrument. For the Semantic Differential technique, Osgood uses the term scale to represent a single item.

  5.

    Some researchers do include a few scales from the potency or activity dimensions to see where these scales load in a factor analysis of the total set of scales. In this situation, the potency and activity scales function as marker scales to facilitate interpretation of the main factor structure.

  6.

    Barnette (1999) found that a 5% pattern of nonattending responses can have strong effects on coefficient alpha (usually in the positive direction). This can lead instrument designers to conclude that the data are more internally consistent than they actually are, as the sketch below illustrates.
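
    As a minimal illustration of this effect (Python with NumPy; the simulation design is a sketch under assumed item parameters, not Barnette's actual procedure), coefficient alpha can be computed directly from item and total-score variances, and replacing 5% of respondents with a constant, nonattending response pattern typically pushes the estimate upward:

        import numpy as np

        def cronbach_alpha(items):
            # alpha = k/(k-1) * (1 - sum of item variances / total-score variance)
            items = np.asarray(items, dtype=float)
            k = items.shape[1]
            item_vars = items.var(axis=0, ddof=1).sum()
            total_var = items.sum(axis=1).var(ddof=1)
            return k / (k - 1) * (1 - item_vars / total_var)

        rng = np.random.default_rng(1)
        n, k = 1000, 10
        trait = rng.normal(size=n)
        raw = trait[:, None] * 0.5 + rng.normal(size=(n, k))
        items = np.clip(np.round(raw + 3), 1, 5)  # 5-point Likert items
        print(cronbach_alpha(items))              # attentive sample

        items[: n // 20] = 5                      # 5% mark "5" for every item
        print(cronbach_alpha(items))              # typically higher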

  7.

    This advice is not necessarily consistent when considering items across instruments. Work by Chang (1997), for example, suggests that as long as the number of scale points used in the instrument is consistent, changing the labeling of the anchors in the scale from, say, 1 = disagree, 2 = somewhat disagree, 3 = somewhat agree, 4 = agree to 1 = strongly disagree, 2 = disagree, 3 = agree, 4 = strongly agree does not add to the observed variance. One potential implication is that instrument developers need not be "overly concerned" with the practice of using different labels to anchor the Likert response scale for items in different instruments. Still, pilot studies of different formats are always good insurance during the process of instrument development.

  8.

    It should also be noted that extremely long or complex item stems, which we discussed earlier in the chapter, can overburden the cognitive optimizing processes of respondents, causing them to engage in satisficing behavior.

  9.

    Initially, the validity coefficients seem low, but each needs to be considered in light of the alpha reliabilities for the respective scales (in parentheses on the main diagonal). The low number of items used for both the normative and ipsative scales appears to result in low reliability levels, except for the normative GRD scale. The maximum validity coefficient is the square root of the product of the reliabilities of the two scales. For example, while the correlation between IFAV and NFAV was 0.41, the maximum correlation possible was \( \sqrt{(0.45)(0.45)} \), or 0.45. The maximum correlation possible between the IGRD and NGRD scales is approximately \( \sqrt{(0.75)(0.55)} \), or 0.64, and the correlation reported is only 0.36. For these scales, the normative and ipsative measures using the same items are not highly related. The diagonal validity values are higher than their row and column counterparts in the dashed-line triangles (MTMM). It is difficult to interpret the values in the dashed triangles, since they partially represent the ipsative scales, which reflect both the occupational-values content and the ipsative-scale properties.
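
    The ceiling invoked in this note comes from classical test theory: an observed correlation between two scales cannot exceed the square root of the product of their reliabilities. A quick check of the arithmetic quoted above (Python; the reliability values are the ones given in this note):

        import math

        def max_correlation(rel_x, rel_y):
            # Upper bound on the observed correlation between two scales,
            # given their reliabilities.
            return math.sqrt(rel_x * rel_y)

        print(max_correlation(0.45, 0.45))  # IFAV-NFAV ceiling: 0.45
        print(max_correlation(0.75, 0.55))  # IGRD-NGRD ceiling: ~0.64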

  10.

    It should be noted that some recent research suggests there is little or no effect of item ordering on the internal consistency of the data. A study by Sparfeldt et al. (2006) found similar factorial structures for groups of high school students presented with items in a blocked order and in a traditional randomized order.

  11.

    In the response-window evaluative priming task, participants respond to words that have a negative or positive evaluative meaning by pressing one key for good words and another key for bad words. Immediately preceding each word, a white or black face appears for 200 ms, and participants are required to respond within 200–600 ms (Draine and Greenwald 1998).

  12.

    The response-window IAT is identical to the IAT except that it requires participants to respond within 225–675 ms of the stimulus presentation.

  13.

    Cunningham et al. (2001) report that estimates of Cronbach's alpha indicated that more than 30% of the observed variance in the measures was due to random error (i.e., alpha values below .70).

  14.

    Cunningham et al. (2001) also note that "although multiple measures of implicit and explicit attitudes are robustly correlated, the two kinds of attitude measures tap unique sources of variance; a single-factor [confirmatory factor analysis] solution does not fit the data" (p. 170).

  15.

    A recent meta-analysis of IAT studies examined six criterion categories (interpersonal behavior, person perception, policy preferences, microbehaviors, reaction times, and brain activity) for two versions of the IAT (stereotype and attitude IATs), three strategies for measuring explicit bias (feeling thermometers, multi-item explicit measures such as the Modern Racism Scale, and ad hoc measures of intergroup attitudes and stereotypes), and four criterion-scoring methods (computed majority–minority difference scores, relative majority–minority ratings, minority-only ratings, and majority-only ratings). The meta-analysis suggested that IATs were poor predictors of every criterion category; the only exception to these findings was brain activity. Ultimately, the researchers found that the IATs performed no better than simple explicit measures for these same criteria (Oswald, F. L., Mitchell, G., Blanton, H., Jaccard, J., & Tetlock, P. E. (in press). Predicting ethnic and racial discrimination: A meta-analysis of IAT criterion studies. Journal of Personality and Social Psychology).

References

  • Aaker, D. A., Kumar, V., & Day, G. S. (2004). Marketing research. New York: Wiley.

  • Ajzen, I. (1988). Attitudes, personality, and behavior. Chicago: Dorsey Press.

  • Ajzen, I., & Fishbein, M. (1970). The prediction of behavior from attitudinal and normative variables. Journal of Experimental Social Psychology, 6, 466–487.

  • Anastasi, A. (1982). Psychological testing (5th ed.). New York: Macmillan.

  • Andersen, E. B. (1977). Sufficient statistics and latent trait models. Psychometrika, 42, 69–81.

  • Anderson, J. C., & Gerbing, D. W. (1991). Predicting the performance of measures in a confirmatory factor analysis with a pretest assessment of their substantive validities. Journal of Applied Psychology, 76, 732–740.

  • Anderson, L. W. (1981). Assessing affective characteristics in the schools. Boston: Allyn and Bacon.

  • Anderson, L. W., & Bourke, S. F. (2000). Assessing affective characteristics in the schools (2nd ed.). Mahwah: Erlbaum.

  • Andrich, D. (1978a). Application of a psychometric model to ordered categories which are scored with successive integers. Applied Psychological Measurement, 2, 581–594.

  • Andrich, D. (1978b). A rating formulation for ordered response categories. Psychometrika, 43, 561–573.

  • Andrich, D. (1978c). Scaling attitude items constructed and scored in the Likert tradition. Educational and Psychological Measurement, 38, 665–680.

  • Andrich, D. (2004). Controversy and the Rasch model: A characteristic of incompatible paradigms? Medical Care, 42, 1–16.

  • Antonak, R. F., & Livneh, H. (1995). Development, psychometric analysis, and validation of an error-choice test to measure attitude towards persons with epilepsy. Rehabilitation Psychology, 40(1), 25–38.

  • Bagozzi, R. P., & Fornell, C. (1982). Theoretical concepts, measurements, and meaning. In C. Fornell (Ed.), A second generation of multivariate analysis (Vol. 1, pp. 24–38). New York: Praeger.

  • Bandalos, D. L., & Enders, C. K. (1996). The effects of nonnormality and number of response categories on reliability. Applied Measurement in Education, 9(2), 151–160.

  • Bandura, A. (2006). Guide for constructing self-efficacy scales. In F. Pajares & T. Urdan (Eds.), Self-efficacy beliefs of adolescents (pp. 307–337). Greenwich: Information Age Publishing.

  • Bargh, J. A., Chaiken, S., Govender, R., & Pratto, F. (1992). The generality of the automatic attitude activation effect. Journal of Personality and Social Psychology, 62(6), 893–912.

  • Barnette, J. J. (1996). Responses that may indicate nonattending behaviors in three self-administered educational attitude surveys. Research in the Schools, 3(2), 49–59.

  • Barnette, J. J. (1999). Nonattending respondent effects on internal consistency of self-administered surveys: A Monte Carlo simulation study. Educational and Psychological Measurement, 59, 38–46.

  • Barnette, J. J. (2000). Effects of stem and Likert response option reversals on survey internal consistency: If you feel the need, there is a better alternative to using those negatively worded stems. Educational and Psychological Measurement, 60, 361–370.

  • Baron, H. (1996). Strengths and limitations of ipsative measurement. Journal of Occupational and Organizational Psychology, 69, 49–56.

  • Beatty, P. C., & Willis, G. B. (2007). Research synthesis: The practice of cognitive interviewing. Public Opinion Quarterly, 71, 287–311.

  • Beauducel, A., & Herzberg, P. Y. (2006). On the performance of maximum likelihood versus means and variance adjusted least squares estimation in CFA. Structural Equation Modeling: A Multidisciplinary Journal, 13(2), 186–203.

  • Bendixen, M., & Sandler, M. (1994). Converting verbal scales to interval scales using correspondence analysis. Johannesburg: University of Witwatersrand.

  • Benson, J., & Hocevar, D. (1985). The impact of item phrasing on the validity of attitude scales for elementary school children. Journal of Educational Measurement, 22, 231–240.

  • Bishop, G. F., Oldendick, R. W., & Tuchfarber, A. J. (1982). Political information processing: Question order and context effects. Political Behavior, 4(2), 177–200.

  • Blalock, H. M. (1964). Causal inferences in nonexperimental research. Chapel Hill: University of North Carolina Press.

  • Blalock, H. M. (Ed.). (1971). Causal models in the social sciences. Chicago: Aldine.

  • Blanton, H., & Jaccard, J. (2006). Arbitrary metrics in psychology. American Psychologist, 61(1), 27–41.

  • Bollen, K. A. (1989). Structural equations with latent variables. New York: Wiley.

  • Bollen, K. A. (2002). Latent variables in psychology and the social sciences. Annual Review of Psychology, 53, 605–634.

  • Bollen, K. A., & Lennox, R. (1991). Conventional wisdom on measurement: A structural equation perspective. Psychological Bulletin, 110(2), 305–314.

  • Bond, T. G. (2004). Validity and assessment: A Rasch measurement perspective. Metodologia de las Ciencias del Comportamiento, 5(2), 179–194.

  • Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Mahwah: Lawrence Erlbaum.

  • Borsboom, D. (2003). The theoretical status of latent variables. Psychological Review, 110, 203–219.

  • Borsboom, D. (2005). Measuring the mind: Conceptual issues in contemporary psychometrics. Cambridge: Cambridge University Press.

  • Bowen, C. C., Martin, B. A., & Hunt, S. T. (2002). A comparison of ipsative and normative approaches for ability to control faking in personality questionnaires. International Journal of Organizational Analysis, 10, 240–259.

  • Bradburn, N., Sudman, S., & Wansink, B. (2004). Asking questions: The definitive guide to questionnaire design. San Francisco: Jossey-Bass.

  • Burns, A. C., & Bush, R. F. (2000). Marketing research. Upper Saddle River: Prentice Hall.

  • Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

  • Carleton, R. N., McCreary, D., Norton, P. J., & Asmundson, G. J. G. (2006). The brief fear of negative evaluation scale, revised. Depression and Anxiety, 23, 297–303.

  • Chae, S., Kang, U., Jeon, E., & Linacre, J. M. (2000). Development of computerized middle school achievement test [in Korean]. Seoul: Komesa Press.

  • Chambers, C. T., & Johnston, C. (2002). Developmental differences in children's use of rating scales. Journal of Pediatric Psychology, 27(1), 27–36.

  • Chan, W., & Bentler, P. M. (1993). The covariance structure analysis of ipsative data. Sociological Methods and Research, 22, 214–247.

  • Chang, L. (1997). Dependability of anchoring labels of Likert-type scales. Educational and Psychological Measurement, 57(5), 800–807.

  • Christiansen, N. D., Burns, G. N., & Montgomery, G. E. (2005). Reconsidering forced-choice item formats for applicant personality assessment. Human Performance, 18, 267–307.

  • Cialdini, R. B. (2001). Influence: Science and practice (4th ed.). Boston: Allyn & Bacon.

  • Cicchetti, D. V., Showalter, D., & Tyrer, P. J. (1985). The effect of number of rating scale categories on levels of interrater reliability: A Monte Carlo investigation. Applied Psychological Measurement, 9(1), 31–36.

  • Clemans, W. V. (1966). An analytical and empirical examination of some properties of ipsative measures (Psychometric Monographs No. 14). Princeton: Psychometric Corporation.

  • Closs, S. J. (1996). On the factoring and interpretation of ipsative data. Journal of Occupational and Organizational Psychology, 69, 41–47.

  • Conner, M., Norman, P., & Bell, R. (2002). The theory of planned behavior and healthy eating. Health Psychology, 21, 194–201.

  • Cook, T. D., & Campbell, D. T. (1979). Quasi-experimentation: Design and analysis issues for field settings. Chicago: Rand-McNally.

  • Coopersmith, S. (1967/1989). The antecedents of self-esteem. San Francisco: Freeman.

  • Crocker, L., & Algina, J. (2006). Introduction to classical and modern test theory. Pacific Grove: Wadsworth.

  • Cronbach, L. J. (1950). Further evidence on response sets and test design. Educational and Psychological Measurement, 10, 3–31.

  • Cunningham, W. A., Preacher, K. J., & Banaji, M. R. (2001). Implicit attitude measures: Consistency, stability, and convergent validity. Psychological Science, 12(2), 163–170.

  • De Houwer, J. (2006). What are implicit measures and why are we using them? In R. W. Wiers & A. W. Stacy (Eds.), The handbook of implicit cognition and addiction (pp. 11–28). Thousand Oaks: Sage.

  • DeVellis, R. F. (1991). Scale development: Theory and applications (Applied Social Research Methods Series, Vol. 40). Newbury Park: Sage.

  • Dilchert, S., Ones, D. S., Viswesvaran, C., & Deller, J. (2006). Response distortion in personality measurement: Born to deceive, yet capable of providing valid self-assessments? Psychology Science, 48, 209–225.

  • Dillman, D. A., Smyth, J. D., & Christian, L. M. (2009). Internet, mail, and mixed-mode surveys: The tailored design method (3rd ed.). Hoboken: Wiley.

  • Dillon, W. R., Madden, T. J., & Firtle, N. H. (1993). Essentials of marketing research. Homewood: Irwin.

  • DiStefano, C. (2002). The impact of categorization with confirmatory factor analysis. Structural Equation Modeling: A Multidisciplinary Journal, 9(3), 327–346.

  • DiStefano, C., & Motl, R. W. (2006). Further investigating method effects associated with negatively worded items on self-report surveys. Structural Equation Modeling: A Multidisciplinary Journal, 13, 440–464.

  • Dolan, C. V. (1994). Factor analysis of variables with 2, 3, 5 and 7 response categories: A comparison of categorical variable estimators using simulated data. British Journal of Mathematical and Statistical Psychology, 47, 309–326.

  • Donovan, J. J., Dwight, S. A., & Hurtz, G. M. (2003). An assessment of the prevalence, severity, and verifiability of entry-level applicant faking using the randomized response technique. Human Performance, 16, 81–106.

  • Draine, S. C., & Greenwald, A. G. (1998). Replicable unconscious semantic priming. Journal of Experimental Psychology: General, 127, 286–303.

  • DuBois, B., & Burns, J. A. (1975). An analysis of the meaning of the question mark response category in attitude scales. Educational and Psychological Measurement, 35, 869–884.

  • Duncan, O. D. (1984). Notes on social measurement: Historical and critical. New York: Russell Sage Foundation.

  • Dwight, S. A., & Donovan, J. J. (2003). Do warnings not to fake reduce faking? Human Performance, 16(1), 1–23.

  • Edwards, A. L. (1957). Techniques of attitude scale construction. New York: Appleton-Century-Crofts.

  • Edwards, J. R., & Bagozzi, R. P. (2000). On the nature and direction of relationships between constructs and measures. Psychological Methods, 5, 155–174.

  • Eys, M. A., Carron, A. V., Bray, S. R., & Brawley, L. R. (2007). Item wording and internal consistency of a measure of cohesion: The Group Environment Questionnaire. Journal of Sport & Exercise Psychology, 29, 395–402.

  • Fabiani, M., Gratton, G., & Coles, M. G. H. (2000). Event-related brain potentials: Methods, theory, and applications. In J. T. Cacioppo, L. Tassinary, & G. Berntson (Eds.), Handbook of psychophysiology (2nd ed., pp. 53–84). New York: Cambridge University Press.

  • Fabrigar, L., McDougall, B. L., & Krosnick, J. A. (2005). Attitude measurement: Techniques for measuring the unobservable. In T. C. Brock & M. C. Green (Eds.), Persuasion: Psychological insights and perspectives (2nd ed.). Thousand Oaks: Sage.

  • Fazio, R. H., & Olson, M. A. (2003). Implicit measures in social cognition research: Their meaning and use. Annual Review of Psychology, 54, 297–327.

  • Fazio, R. H., Sanbonmatsu, D. M., Powell, M. C., & Kardes, F. R. (1986). On the automatic activation of attitudes. Journal of Personality and Social Psychology, 50, 229–238.

  • Fazio, R. H., Jackson, J. R., Dunton, B. C., & Williams, C. J. (1995). Variability in automatic activation as an unobtrusive measure of racial stereotypes: A bona fide pipeline? Journal of Personality and Social Psychology, 69, 1013–1027.

  • Fishbein, M., & Ajzen, I. (1975). Belief, attitude, intention, and behavior: An introduction to theory and research. Reading: Addison-Wesley.

  • Fishbein, M., & Ajzen, I. (2010). Predicting and changing behavior: The reasoned action approach. New York: Taylor and Francis Group.

  • Frantom, C. G. (2001). Paternalism and the myth of perfection: Test and measurement of a theory underlying physicians' perceptions of patient autonomy. Unpublished doctoral dissertation, University of Denver, Denver.

  • Glaser, B. G., & Strauss, A. L. (1967). The discovery of grounded theory: Strategies for qualitative research. Chicago: Aldine.

  • Gordon, L. V. (1960). SRA manual for survey of interpersonal values. Chicago: Science Research Associates.

  • Green, S. B., & Hershberger, S. L. (2000). Correlated errors in true score models and their effect on coefficient alpha. Structural Equation Modeling, 7(2), 251–270.

  • Green, S. B., & Yang, Y. (2009). Commentary on coefficient alpha: A cautionary tale. Psychometrika, 74(1), 121–135.

  • Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The Implicit Association Test. Journal of Personality and Social Psychology, 74, 1464–1480.

  • Greenwald, A. G., Nosek, B. A., & Banaji, M. R. (2003). Understanding and using the Implicit Association Test: An improved scoring algorithm. Journal of Personality and Social Psychology, 85, 197–216.

  • Guilford, J. P. (1952). When not to factor analyze. Psychological Bulletin, 49, 31.

  • Hair, J. F., Bush, R. P., & Ortinau, D. J. (2006). Marketing research. Boston: McGraw Hill.

  • Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals of item response theory. Newbury Park: Sage.

  • Hammond, K. R. (1948). Measuring attitudes by error-choice: An indirect method. Journal of Abnormal and Social Psychology, 43(1), 38–48.

  • Hardy, B., & Ford, L. (2012). When often becomes always, and sometimes becomes never: Miscomprehension in surveys. Paper presented at the Academy of Management Annual Meeting, Boston, 3–7 August 2012.

  • Harman, H. H. (1960). Modern factor analysis. Chicago: University of Chicago Press.

  • Harter, J. K. (1997). The psychometric utility of the midpoint on a Likert scale. Dissertation Abstracts International, 58, 1198.

  • Hicks, L. E. (1970). Some properties of ipsative, normative, and forced-choice normative measures. Psychological Bulletin, 74(3), 167–184.

  • Hockenberry, M. J., & Wilson, D. (2009). Wong's essentials of pediatric nursing (8th ed.). St. Louis: Mosby.

  • Hough, L. M. (1998). The effects of intentional distortion in personality measurement and evaluation of suggested palliatives. Human Performance, 11, 209–244.

  • Jaccard, J., & Jacoby, J. (2010). Theory construction and model-building skills: A practical guide for social scientists. New York: Guilford Press.

  • Johnson, D. R., & Creech, J. C. (1983). Ordinal measures in multiple indicator models: A simulation study of categorization error. American Sociological Review, 48, 398–407.

  • Jones, J. W., & Dages, K. D. (2003). Technology trends in staffing and assessment: A practice note. International Journal of Selection and Assessment, 11, 247–252.

  • Jöreskog, K. G., & Sörbom, D. (1979). Advances in factor analysis and structural equation models. Cambridge: Abt Books.

  • Kahn (1974). Instructor evaluation using the Thurstone technique. Unpublished manuscript, University of Connecticut, Storrs.

  • Kalton, G., Roberts, J., & Holt, D. (1980). The effects of offering a middle response option with opinion questions. The Statistician, 29, 11–24.

  • Kelloway, E. K., Loughlin, C., Barling, J., & Nault, A. (2002). Self-reported counterproductive behaviors and organizational citizenship behaviors: Separate but related constructs. International Journal of Selection and Assessment, 10, 143–151.

  • Kennedy, R., Riquier, C., & Sharp, B. (1996). Practical applications of correspondence analysis to categorical data in market research. Journal of Targeting, Measurement and Analysis for Marketing, 5, 56–70.

  • Kidder, L. H., & Campbell, D. T. (1970). The indirect testing of social attitudes. In G. F. Summers (Ed.), Attitude measurement (pp. 333–385). Chicago: Rand McNally.

  • Krosnick, J. A. (1991). Response strategies for coping with the cognitive demands of attitude measures in surveys. Applied Cognitive Psychology, 5, 213–236.

  • Krosnick, J. A. (1999). Survey research. Annual Review of Psychology, 50, 537–567.

  • Krosnick, J. A., & Alwin, D. F. (1987). An evaluation of a cognitive theory of response-order effects in survey measurement. Public Opinion Quarterly, 51(2), 201–219.

  • Krosnick, J. A., & Fabrigar, L. R. (1997). Designing rating scales for effective measurement in surveys. In L. Lyberg et al. (Eds.), Survey measurement and process quality. New York: Wiley.

  • Krosnick, J. A., & Presser, S. (2010). Question and questionnaire design. In J. D. Wright & P. V. Marsden (Eds.), Handbook of survey research (2nd ed., pp. 263–313). West Yorkshire: Emerald Group.

  • Likert, R. (1932). A technique for the measurement of attitudes. Archives of Psychology, 140, 152.

  • Loevinger, J. (1957). Objective tests as instruments of psychological theory. Psychological Reports, 3, 635–694.

  • Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading: Addison-Wesley.

  • Lozano, L. M., García-Cueto, E., & Muñiz, J. (2008). Effect of the number of response categories on the reliability and validity of rating scales. Methodology, 4, 73–79.

  • Lynch (1973). Multidimensional measurement with the D statistic and semantic differential. Unpublished manuscript, Northeastern University, Boston.

  • MacCallum, R. C., & Austin, J. T. (2000). Applications of structural equation modeling in psychological research. Annual Review of Psychology, 51, 201–226.

  • Marsh, H. W. (1986). Negative item bias in rating scales for preadolescent children: A cognitive-developmental phenomenon. Developmental Psychology, 22, 37–49.

  • Marsh, H. W., & Shavelson, R. J. (1985). Self-concept: Its multifaceted, hierarchical structure. Educational Psychologist, 20(3), 107–123.

  • Marsh, H. W., Byrne, B. M., & Shavelson, R. J. (1988). A multifaceted academic self-concept: Its hierarchical structure and its relation to academic achievement. Journal of Educational Psychology, 80(3), 366–380.

  • Martin, B. A., Bowen, C. C., & Hunt, S. T. (2002). How effective are people at faking personality questionnaires? Personality and Individual Differences, 32, 247–256.

  • Martin, C. L., & Nagao, D. H. (1989). Some effects of computerized interviewing on job applicant responses. Journal of Applied Psychology, 74, 72–80.

  • McCloy, R. A., Heggestad, E. D., & Reeve, C. L. (2005). A silk purse from the sow's ear: Retrieving normative information from multidimensional forced-choice items. Organizational Research Methods, 8, 222–248.

  • McCoach, D. B., & Adelson, J. (2010). Dealing with dependence (Part I): Understanding the effects of clustered data. Gifted Child Quarterly, 54, 152–155.

  • McConahay, J. B. (1986). Modern racism, ambivalence, and the modern racism scale. In J. F. Dovidio & S. L. Gaertner (Eds.), Prejudice, discrimination and racism (pp. 91–125). Orlando: Academic Press.

  • McDonald, J. L. (2004). The optimal number of categories for numerical rating scales. Dissertation Abstracts International, 65, 5A (UMI No. 3134422).

  • McFarland, L. A., & Ryan, A. M. (2000). Variance in faking across noncognitive measures. Journal of Applied Psychology, 85, 812–821.

  • McMorris, R. (1971). Normative and ipsative measures of occupational values. Paper presented at the annual meeting of the Northeastern Educational Research Association, Ellenville.

  • Meade, A. W. (2004). Psychometric problems and issues involved with creating and using ipsative measures for selection. Journal of Occupational and Organizational Psychology, 77, 531–552.

  • Mehrens, W., & Lehmann, I. (1983). Measurement and evaluation in education and psychology (3rd ed.). New York: Holt, Rinehart & Winston.

  • Meijer, R. R., & Nering, M. L. (1999). Computerized adaptive testing: Overview and introduction. Applied Psychological Measurement, 23, 187–194.

  • Melnick, S. A. (1993). The effects of item grouping on the reliability and scale scores of an affective measure. Educational and Psychological Measurement, 53(1), 211–216.

  • Melnick, S. A., & Gable, R. K. (1990). The use of negative item stems: A cautionary note. Educational Research Quarterly, 14(3), 31–36.

  • Messick, S. (1995). Validity of psychological assessment: Validation of inferences from persons' responses and performances as scientific inquiry into score meaning. American Psychologist, 50(9), 741–749.

  • Milgram, S. (1969). Comment on a failure to validate the lost letter technique. Public Opinion Quarterly, 33, 263–264.

  • Milgram, S., Mann, L., & Harter, S. (1965). The lost-letter technique. Public Opinion Quarterly, 29, 437–438.

  • Moore, D. W. (2002). Measuring new types of question-order effects: Additive and subtractive. Public Opinion Quarterly, 66, 80–91.

  • Murphy, S. T., & Zajonc, R. B. (1993). Affect, cognition, and awareness: Affective priming with optimal and suboptimal stimulus exposures. Journal of Personality and Social Psychology, 64, 723–739.

  • Muthén, L. K., & Muthén, B. O. (2012). Mplus user's guide (7th ed.). Los Angeles: Muthén & Muthén.

  • Nagel, E. (1931). Measurement. Erkenntnis, 2(1), 313–335.

  • Narayan, S., & Krosnick, J. A. (1996). Education moderates some response effects in attitude measurement. Public Opinion Quarterly, 60, 58–88.

  • Netemeyer, R. G., Bearden, W. O., & Sharma, S. (2003). Scaling procedures: Issues and applications. Thousand Oaks: Sage.

  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.

  • O'Muircheartaigh, C. A., Krosnick, J. A., & Helic, A. (1999). Middle alternatives, acquiescence, and the quality of questionnaire data. Paper presented at the annual meeting of the American Association for Public Opinion Research, Fort Lauderdale.

  • Ochieng, C. O. (2001). Implications of using Likert data in multiple regression analysis. Unpublished doctoral dissertation, University of British Columbia.

  • Osgood, C. E. (1952). The nature and measurement of meaning. Psychological Bulletin, 49, 197–237.

  • Osgood, C. E., Suci, G. J., & Tannenbaum, P. H. (1957). The measurement of meaning. Urbana: University of Illinois Press.

  • Ostrom, T. M., & Gannon, K. M. (1996). Exemplar generation: Assessing how respondents give meaning to rating scales. In N. Schwarz & S. Sudman (Eds.), Answering questions (pp. 293–318). San Francisco: Jossey-Bass.

  • Pajares, F., Hartley, J., & Valiante, G. (2001). Response format in writing self-efficacy assessment: Greater discrimination increases prediction. Measurement and Evaluation in Counseling and Development, 33, 214–221.

  • Pappalardo, S. J. (1971). An investigation of the efficacy of "in-basket" and "role-playing" variations of simulation technique for use in counselor education. Unpublished doctoral dissertation, State University of New York, Albany.

  • Paulhus, D. L. (1991). Measurement and control of response bias. In J. P. Robinson, P. R. Shaver, & L. S. Wrightsman (Eds.), Measures of personality and social psychological attitudes (pp. 17–59). San Diego: Academic Press.

  • Pett, M. A., Lackey, N. R., & Sullivan, J. J. (2003). Making sense of factor analysis: The use of factor analysis for instrument development in health care research. Thousand Oaks: Sage.

  • Pilotte, W. J. (1991). The impact of mixed item stems on the responses of high school students to a computer anxiety scale. Unpublished doctoral dissertation, University of Connecticut, Storrs.

  • Pilotte, W. J., & Gable, R. K. (1990). The impact of positive and negative item stems on the validity of a computer anxiety scale. Educational and Psychological Measurement, 50, 603–610.

  • Preston, C. C., & Colman, A. M. (2000). Optimal number of response categories in rating scales: Reliability, validity, discriminating power, and respondent preferences. Acta Psychologica, 104, 1–15.

  • Rhemtulla, M., Brosseau-Liard, P., & Savalei, V. (2010). How many categories is enough to treat data as continuous? A comparison of robust continuous and categorical SEM estimation methods under a range of non-ideal situations. Retrieved from http://www2.psych.ubc.ca/~mijke/files/HowManyCategories.pdf

  • Richman, W., Kiesler, S., Weisband, S., & Drasgow, F. (1999). A meta-analytic study of social desirability distortion in computer-administered questionnaires, traditional questionnaires, and interviews. Journal of Applied Psychology, 84, 754–775.

  • Roberts, J. S., Laughlin, J. E., & Wedell, D. H. (1999). Validity issues in the Likert and Thurstone approaches to attitude measurement. Educational and Psychological Measurement, 59(2), 211–233.

  • Robinson, J. P., Shaver, P. R., & Wrightsman, L. S. (1991). Measures of personality and social psychological attitudes. San Diego: Academic Press.

  • Rodebaugh, T. L., Woods, C. M., Thissen, D., Heimberg, R. G., Chambless, D. L., & Rapee, R. M. (2004). More information from fewer questions: The factor structure and item properties of the original and brief fear of negative evaluation scales. Psychological Assessment, 16, 169–181.

  • Rosenberg, M. J. (1956). Cognitive structure and attitudinal affect. The Journal of Abnormal and Social Psychology, 53, 367–372.

  • Rossi, P. H., Wright, J. D., & Anderson, A. B. (1983). Handbook of survey research. New York: Academic Press.

  • Roszkowski, M., & Soven, M. (2010). Shifting gears: Consequences of including two negatively worded items in the middle of a positively worded questionnaire. Assessment & Evaluation in Higher Education, 35(1), 117–134.

  • Rothstein, M. G., & Goffin, R. D. (2006). The use of personality measures in personnel selection: What does current research support? Human Resource Management Review, 16, 155–180.

  • Saville, P., & Willson, E. (1991). The reliability and validity of normative and ipsative approaches in the measurement of personality. Journal of Occupational Psychology, 64, 219–238.

  • Schriesheim, C. A., Eisenbach, R. J., & Hill, K. D. (1991). The effect of negation and polar opposite item reversals on questionnaire reliability and validity: An experimental investigation. Educational and Psychological Measurement, 51, 67–78.

  • Schriesheim, C. A., Powers, K. J., Scandura, T. A., Gardiner, C. C., & Lankau, M. J. (1993). Improving construct measurement in management research: Comments and a quantitative approach for assessing the theoretical content adequacy of paper-and-pencil survey-type instruments. Journal of Management, 19, 385–417.

  • Schuman, H., & Presser, S. (1996). Questions and answers in attitude surveys: Experiments on question form, wording, and context. Thousand Oaks: Sage.

  • Scott, W. A. (1968). Comparative validities of forced-choice and single-stimulus tests. Psychological Bulletin, 70(4), 231–244.

  • Simon, H. A. (1957). Models of man: Social and rational. New York: Wiley.

  • Snider, J. G., & Osgood, C. E. (1969). Semantic differential technique: A sourcebook. Chicago: Aldine.

  • Sparfeldt, J. R., Schilling, S. R., Rost, D. H., & Thiel, A. (2006). Blocked versus randomized format of questionnaires: A confirmatory multigroup analysis. Educational and Psychological Measurement, 66(6), 961–974.

  • Stark, S., Chernyshenko, O. S., & Drasgow, F. (2005). An IRT approach to constructing and scoring pairwise preference items involving stimuli on different dimensions: The multiunidimensional pairwise-preference model. Applied Psychological Measurement, 29, 184–203.

  • Stark, S., Chernyshenko, O. S., Drasgow, F., & Williams, B. A. (2006). Examining assumptions about item responding in personality assessment: Should ideal point methods be considered for scale development and scoring? Journal of Applied Psychology, 91, 25–39.

  • Steenkamp, J. E. M., & Baumgartner, H. (1995). Development and cross-national validation of a short-form of CSI as a measure of optimum stimulation level. International Journal of Research in Marketing, 12, 97–104.

  • Stevens, S. S. (1946). On the theory of scales of measurement. Science, 103, 677–680.

  • Strack, F., Schwarz, N., & Gschneidinger, E. (1985). Happiness and reminiscing: The role of time perspective, affect, and mode of thinking. Journal of Personality and Social Psychology, 49, 1460–1469.

  • Sudman, S., Bradburn, N. M., & Schwarz, N. (1996). Thinking about answers: The application of cognitive processes to survey methodology. San Francisco: Jossey-Bass.

  • Tenopyr, M. L. (1968). Internal consistency of ipsative scores: The "one reliable scale" phenomenon. Paper presented at the 76th annual convention of the American Psychological Association, San Francisco.

  • Thurstone, L. L. (1927). The method of paired comparisons for social values. Journal of Abnormal and Social Psychology, 21, 384–400.

  • Thurstone, L. L. (1928). Attitudes can be measured. American Journal of Sociology, 33, 529–554.

  • Thurstone, L. L. (1931a). The measurement of attitudes. Journal of Abnormal and Social Psychology, 26, 249–269.

  • Thurstone, L. L. (1931b). The measurement of change in social attitudes. Journal of Social Psychology, 2, 230–235.

  • Thurstone, L. L. (1946). Comment. American Journal of Sociology, 52, 39–50.

  • Thurstone, L. L., & Chave, E. (1929). The measurement of attitude. Chicago: University of Chicago Press.

  • Tourangeau, R., & Rasinski, K. A. (1988). Cognitive processes underlying context effects in attitude measurement. Psychological Bulletin, 103, 299–314.

  • Tourangeau, R., Couper, M. P., & Conrad, F. (2007). Colors, labels, and interpretive heuristics for response scales. Public Opinion Quarterly, 71(1), 91–112.

  • Vasilopoulos, N. L., Cucina, J. M., & McElreath, J. M. (2005). Do warnings of response verification moderate the relationship between personality and cognitive ability? Journal of Applied Psychology, 90, 306–322.

  • Veres, J. G., Sims, R. R., & Locklear, T. S. (1991). Improving the reliability of Kolb's revised LSI. Educational and Psychological Measurement, 51, 143–150.

  • Viswesvaran, C., & Ones, D. S. (1999). Meta-analyses of fakability estimates: Implications for personality measurement. Educational and Psychological Measurement, 59(2), 197–210.

  • Watson, D. (1992). Correcting for acquiescent response bias in the absence of a balanced scale: An application to class consciousness. Sociological Methods & Research, 21, 52–88.

  • Weisberg, H. F., Krosnick, J. A., & Bowen, B. D. (1996). An introduction to survey research, polling, and data analysis (3rd ed.). Newbury Park: Sage.

  • Weng, L. J. (2004). Impact of the number of response categories and anchor labels on coefficient alpha and test-retest reliability. Educational and Psychological Measurement, 64, 956–972.

  • Whitley, B. E., & Kost, C. R. (1999). College students' perceptions of peers who cheat. Journal of Applied Social Psychology, 29, 1732–1760.

  • Wilson, M. (2005). Constructing measures: An item response modeling approach. Mahwah: Erlbaum.

  • Wilson, T. D., Lindsey, S., & Schooler, T. Y. (2000). A model of dual attitudes. Psychological Review, 107, 101–126.

  • Wirth, R. J., & Edwards, M. C. (2007). Item factor analysis: Current approaches and future directions. Psychological Methods, 12, 58–79.

  • Wright, B. D. (1984). Despair and hope for educational measurement. Contemporary Education Review, 3(1), 281–288.

  • Wright, B. D. (1997). A history of social science measurement. Educational Measurement: Issues and Practice, 16(4), 33–45.

  • Wright, B. D. (1999). Fundamental measurement for psychology. In S. E. Embretson & S. L. Hershberger (Eds.), The new rules of measurement: What every educator and psychologist should know (pp. 65–104). Hillsdale: Lawrence Erlbaum.

  • Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press.

  • Zaller, J. R., & Feldman, S. (1992). A simple theory of the survey response: Answering questions versus revealing preferences. American Journal of Political Science, 36, 579–616.


Author information

Correspondence to D. Betsy McCoach.


Copyright information

© 2013 Springer Science+Business Media New York

About this chapter

Cite this chapter

McCoach, D.B., Gable, R.K., Madura, J.P. (2013). Defining, Measuring, and Scaling Affective Constructs. In: Instrument Development in the Affective Domain. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-7135-6_2
