Correcting the t statistic for measurement error


Studies in marketing often involve the application of multi-item scales to measure latent constructs. Once the psychometric properties of a scale have been assessed, responses to individual scale items are often summed to form a composite score, which is then compared across groups using a statistical test such as the t test. In this note, we draw researchers’ attention to the often overlooked fact that the t statistic is attenuated by measurement error. As a solution, we propose the disattenuated t statistic and discuss how it would increase the accuracy of estimates and affect decisions in the marketing discipline.
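The attenuation described in the abstract can be illustrated with a minimal sketch under classical test theory. The assumptions here are ours, not the article's: measurement error has zero mean and is unrelated to true scores, reliability is the ratio of true to observed variance, and reliability is equal and known in both groups. The helper names (`pooled_t`, `observed_sd`, `naive_disattenuated_t`) are hypothetical, and the square-root correction is the textbook one for a standardized difference; the article's own disattenuated t statistic may be defined differently.

```python
import math


def pooled_t(mean1, mean2, sd1, sd2, n1, n2):
    """Two-sample pooled-variance t statistic from summary statistics."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error


def observed_sd(true_sd, reliability):
    """SD of observed scores, assuming reliability = true var / observed var."""
    return true_sd / math.sqrt(reliability)


def naive_disattenuated_t(t_observed, reliability):
    """Illustrative correction (an assumption, not the article's formula)."""
    return t_observed / math.sqrt(reliability)


# Hypothetical scenario: true group means 0.5 and 0.0, true SD 1.0, n = 50 each.
true_sd, n = 1.0, 50
t_true = pooled_t(0.5, 0.0, true_sd, true_sd, n, n)

for reliability in (1.0, 0.9, 0.7, 0.5):
    sd_obs = observed_sd(true_sd, reliability)
    # Error inflates the SD but leaves the expected mean difference unchanged.
    t_obs = pooled_t(0.5, 0.0, sd_obs, sd_obs, n, n)
    t_corrected = naive_disattenuated_t(t_obs, reliability)
    print(f"reliability={reliability:.1f}  t_obs={t_obs:.3f}  "
          f"t_corrected={t_corrected:.3f}")
```

In this scenario the error-free t statistic is 2.5, and at a reliability of 0.5 the observed value falls to about 1.77. The sketch works with population-level summary values only; it does not address how the sampling distribution or degrees of freedom behave when reliability is estimated from data.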


  1. The F statistic, which is more appropriate for comparing means across three or more groups, is also likely to be attenuated by measurement error. However, the focus of this research is on the two-group case and how to disattenuate the t statistic.

  2. For the sake of clarity, the group superscript “g” and respondent subscript “i” are not shown in the equation.

  3. Parenthetically, reliability is itself affected by the number of scale items. From Eq. 3, when the error variance is held constant across scale items, increasing the number of scale items (p) decreases the total variance, and a smaller total variance yields a larger t statistic. Longer scales therefore tend to produce higher t statistics and shorter scales lower ones because, all else being equal, longer scales have higher reliability. However, it is not the objective of this paper to discuss the merits of shorter vs. longer scales in terms of number of scale items; interested readers should consult Bergkvist and Rossiter (2007), Drolet and Morrison (2001), Kardes et al. (1993), and Rossiter (2002) for further insights into that topic.

  4. In computing the summed scores, we assigned equal weight to each variable, as is common practice. However, as suggested by one of the reviewers, summed scores can also be obtained by weighting the variables by their loadings. For both examples, we estimated the loadings under the constraint that they be equal across groups, to meet the metric equivalence requirement, and used the resulting loadings to compute the weighted summed scores. Although the attenuated and disattenuated t values differed slightly under the reviewer-suggested procedure, the conclusions did not change.
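Two points in the notes above lend themselves to a short sketch. The first function uses the Spearman-Brown formula, a standard result for parallel items, to show how reliability rises with the number of items p; it stands in for the article's Eq. 3, which is not reproduced here. The second function contrasts unit-weighted and loading-weighted summed scores. The function names, the item reliability of 0.4, the responses, and the loadings are all hypothetical values chosen for illustration.

```python
def spearman_brown(item_reliability, p):
    """Reliability of a p-item composite, assuming parallel items."""
    return p * item_reliability / (1 + (p - 1) * item_reliability)


def summed_score(responses, weights=None):
    """Weighted sum of one respondent's item responses.

    With weights=None every item gets weight 1 (the unit-weighted composite).
    """
    if weights is None:
        weights = [1.0] * len(responses)
    if len(weights) != len(responses):
        raise ValueError("weights and responses must have the same length")
    return sum(w * x for w, x in zip(weights, responses))


# Longer scales have higher reliability, all else being equal.
for p in (1, 3, 6, 12):
    print(f"p={p:2d}  composite reliability={spearman_brown(0.4, p):.3f}")

# Hypothetical three-item response and hypothetical loadings.
responses = [4, 5, 3]
loadings = [0.8, 0.7, 0.6]
print("unit-weighted score:   ", summed_score(responses))
print("loading-weighted score:", summed_score(responses, loadings))
```

Weighted and unit-weighted scores sit on different scales, so only the resulting test statistics, not the raw scores, are comparable across the two procedures.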


  • Abelson, R. P. (1995). Statistics as principled argument. Hillsdale: Erlbaum.

  • Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park: Sage.

  • Bergkvist, L., & Rossiter, J. R. (2007). The predictive validity of multiple-item versus single-item measures of the same constructs. Journal of Marketing Research, 44(May), 175–184.

  • Bobko, P., Roth, P. L., & Bobko, C. (2001). Correcting the effect size of d for range restriction and unreliability. Organizational Research Methods, 4(1), 46–61.

  • Bruner, G. C., & Hensel, P. J. (1993). Multi-item scale usage in marketing journals: 1980 to 1989. Journal of the Academy of Marketing Science, 21(Fall), 339–343.

  • Capraro, M. M., Capraro, R. M., & Henson, R. K. (2001). Measurement error of scores on the mathematics anxiety rating scale across studies. Educational and Psychological Measurement, 61, 373–386.

  • Choi, J., Fan, W., & Hancock, G. R. (2009). A note on confidence intervals for two-group latent mean effect size measures. Multivariate Behavioral Research, 44, 396–406.

  • Churchill, G. A., & Peter, J. P. (1984). Research design effects on the reliability of rating scales: A meta-analysis. Journal of Marketing Research, 21(November), 360–375.

  • Cochran, W. G. (1968). Errors of measurement in statistics. Technometrics, 10, 637–666.

  • Diamantopoulos, A., & Siguaw, J. A. (2000). Introducing LISREL. London: Sage.

  • Drolet, A. L., & Morrison, D. G. (2001). Do we really need multiple-item measures in service research? Journal of Service Research, 3(February), 196–204.

  • Durvasula, S., Andrews, J. C., Lysonski, S., & Netemeyer, R. (1993). Assessing the cross-national applicability of consumer behavior models: A model of attitude toward advertising in general. Journal of Consumer Research, 19(March), 626–636.

  • Fan, X. (2003). Two approaches for correcting correlation attenuation caused by measurement error: Implications for research practice. Educational and Psychological Measurement, 63(6), 915–930.

  • Hewett, K., Money, R. B., & Sharma, S. (2002). An exploration of the moderating role of buyer corporate culture in industrial buyer–seller relationships. Journal of the Academy of Marketing Science, 30(3), 229–239.

  • Kardes, F. R., Allen, C. T., & Pontes, M. J. (1993). Effects of multiple measurement operations on consumer judgment: Measurement reliability or reactivity? Advances in Consumer Research, 20, 280–283.

  • Malhotra, N. K. (2007). Marketing research: An applied orientation (5th ed.). Upper Saddle River: Prentice Hall.

  • Nunnally, J. C. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill.

  • Nunnally, J. C., & Bernstein, I. H. (1994). Psychometric theory (3rd ed.). New York: McGraw-Hill.

  • Peterson, R. A. (1994). A meta-analysis of Cronbach’s coefficient alpha. Journal of Consumer Research, 21, 381–391.

  • Rossiter, J. R. (2002). The C-OAR-SE procedure for scale development in marketing. International Journal of Research in Marketing, 19(December), 305–335.

  • Shadish, W. R., Cook, T. D., & Campbell, D. T. (2001). Experimental and quasi-experimental designs for generalized causal inference. Boston: Houghton Mifflin.

  • Steenkamp, J.-B. E. M., & Baumgartner, H. (1998). Assessing measurement invariance in cross-national consumer research. Journal of Consumer Research, 25(1), 78–90.


The authors would like to thank Professor Terence A. Shimp, the Editor, and the two anonymous reviewers for their helpful comments on an earlier version of this manuscript.

Author information


Corresponding author

Correspondence to Subhash Sharma.

About this article

Cite this article

Durvasula, S., Sharma, S. & Carter, K. Correcting the t statistic for measurement error. Mark Lett 23, 671–682 (2012).


  • Methodology, testing of group means
  • Disattenuated t statistic
  • Composite scale scores
  • t statistic