Albert, J. H. (1992). Bayesian estimation of normal ogive response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251–269.
Albert, J. H., & Chib, S. (1993). Bayesian analysis of binary and polychotomous response data. Journal of the American Statistical Association, 88, 669–679.
Bradlow, E. T., & Zaslavsky, A. M. (1997). Case influence analysis in Bayesian inference. Journal of Computational and Graphical Statistics, 6(3), 314–331.
Bradlow, E. T., & Zaslavsky, A. M. (1999). A hierarchical latent variable model for ordinal customer satisfaction survey data with “no answer” responses. Journal of the American Statistical Association, 94(445), 43–52.
Gelfand, A. E., & Smith, A. F. M. (1990). Sampling-based approaches to calculating marginal densities. Journal of the American Statistical Association, 85, 398–409.
Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–511.
Hulin, C. L., Drasgow, F., & Parsons, L. K. (1983). Item response theory. Homewood, IL: Dow-Jones-Irwin.
Lord, F. M., & Novick, M. R. (1968). Statistical theories of mental test scores. Reading, MA: Addison-Wesley.
McDonald, R. P. (1981). The dimensionality of tests and items. British Journal of Mathematical and Statistical Psychology, 34, 100–117.
McDonald, R. P. (1982). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6, 379–396.
Mislevy, R. J., & Bock, R. D. (1983). BILOG: Item and test scoring with binary logistic models [computer program]. Mooresville, IN: Scientific Software.
Rosenbaum, P. R. (1988). Item bundles. Psychometrika, 53, 349–359.
Sireci, S. G., Wainer, H., & Thissen, D. (1991). On the reliability of testlet-based tests. Journal of Educational Measurement, 28, 237–247.
Stout, W. F. (1987). A nonparametric approach for assessing latent trait dimensionality. Psychometrika, 52, 589–617.
Stout, W. F. (1990). A new item response theory modeling approach with applications to unidimensional assessment and ability estimation. Psychometrika, 55, 293–326.
Stout, W., Habing, B., Douglas, J., Kim, H. R., Roussos, L., & Zhang, J. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331–354.
Tanner, M. A., & Wong, W. H. (1987). The calculation of posterior distributions by data augmentation. Journal of the American Statistical Association, 82, 528–540.
Wainer, H. (1995). Precision and differential item functioning on a testlet-based test: The 1991 Law School Admissions Test as an example. Applied Measurement in Education, 8(2), 157–187.
Wainer, H., & Kiely, G. (1987). Item clusters and computerized adaptive testing: A case for testlets. Journal of Educational Measurement, 24, 185–202.
Wainer, H., & Thissen, D. (1996). How is reliability related to the quality of test scores? What is the effect of local dependence on reliability? Educational Measurement: Issues and Practice, 15(1), 22–29.
Yen, W. (1993). Scaling performance assessments: Strategies for managing local item dependence. Journal of Educational Measurement, 30, 187–213.
Zhang, J. (1996). Some fundamental issues in item response theory with applications. Unpublished doctoral dissertation, University of Illinois at Urbana-Champaign.
Zhang, J., & Stout, W. F. (1999). Conditional covariance structure of generalized compensatory multidimensional items. Psychometrika, 64, 129–152.