A course in the theory of mental tests

Abstract

An outline for a course in test theory is presented, together with a list of assignments, problems, and a bibliography. The course has been given in the Psychology Department of the University of Chicago. The material is presented in outline form at this time for two reasons: the growing use of psychological tests for the classification of military personnel has increased the need for training in test theory, and much of the material for such a course must be selected from a wide array of articles in the literature. It is offered so that an organized body of instructional material may be readily available to those interested.

Bibliography

  1. Adkins, Dorothy C. A comparative study of methods of selecting items. Dissertation on file in library of Ohio State University. Abstract, Psychology Library.

  2. Adkins, Dorothy C., and Toops, Herbert A. 1937. Simplified formulas for item selection and construction. Psychometrika, 2, 165–171.

  3. Ayres, Leonard P. 1911. A scale for measuring the quality of handwriting of school children. New York: Publication on Measurement in Education, Division of Education, Russell Sage Fund Bulletin No. 113.

  4. Babitz, Milton, and Keys, Noel. 1940. A method for approximating the average intercorrelation coefficient by correlating the parts with the sum of the parts. Psychometrika, 5, 283–288.

  5. Board of Examinations, The University of Chicago. 1937. Manual of Examination Methods. Second Edition. Chicago: Univ. Chicago Bookstore. Pp. 177.

  6. Boring, E. G. 1919. Mathematical vs. scientific significance. Psychol. Bull., 16, 335–338.

  7. —— 1920. The logic of the normal law of error in mental measurement. Amer. J. Psychol., 31, 1–33.

  8. Bradford, Leland P. 1940. The effect of practice upon standard errors of estimate. Psychol. Monogr., 52, No. 3, 56–71.

  9. Brown, William, and Thomson, Godfrey. 1921. Essentials of mental measurement. Cambridge Univ. Press. Pp. viii + 216.

  10. Buros, Oscar K. 1936. Educational, Psychological, and Personality Tests of 1933, 1934, and 1935. New Brunswick, New Jersey: School of Education, Rutgers University. Pp. 83. Reviews 1–503.

  11. —— 1937. Educational, Psychological, and Personality Tests of 1936. New Brunswick, New Jersey: School of Education, Rutgers University. Pp. 141. Reviews 504–868.

  12. —— 1938. The 1938 Mental Measurements Yearbook. New Brunswick, New Jersey: School of Education, Rutgers University. Pp. xiv + 415. Reviews 869–1181.

  13. —— 1941. The 1940 Mental Measurements Yearbook. New Brunswick, New Jersey: School of Education, Rutgers University. Pp. xxi + 674. Reviews 1182–1684.

  14. Burt, Cyril. 1936. Supplement. In “The Marks of Examiners” by Hartog, P. J., and Rhodes, E. C. London: Macmillan and Company. Pp. xix + 344.

  15. Douglass, H. R. 1934. Some observations and data on certain methods of measuring the predictive significance of the Pearson product-moment coefficient of correlation. J. educ. Psychol., 25, 225–232.

  16. Dressel, Paul L. 1940. Some remarks on the Kuder-Richardson reliability coefficient. Psychometrika, 5, 305–310.

  17. Dunlap, Jack W. 1936 (a). Note on the computation of bi-serial correlation in item evaluation. Psychometrika, 1, 51–58.

  18. —— 1936 (b). Nomograph for computing bi-serial correlations. Psychometrika, 1, 59–60.

  19. Dunlap, Jack and Kurtz, A. K. 1932. Handbook of statistical nomographs and formulas. New York: World Book Company. Pp. vii + 163.

  20. Edgerton, H. A. and Toops, H. A. 1928. A formula for finding the average inter-correlation coefficient for unranked raw scores without solving any of the individual intercorrelations. J. educ. Psychol., 19, 131–138.

  21. Edgerton, H. A. and Kolbe, Laverne E. 1936. The method of minimum variation for the combination of criteria. Psychometrika, 1, 183–187.

  22. Englehart, Max D. 1942. Unique types of achievement test exercises. Psychometrika, 7, 103–115.

  23. Flanagan, John C. 1936. A short method for selecting the best combination of test items for a particular purpose. Psychol. Bull., 33, 603–604.

  24. ——. 1939. Scaled scores. New York: Cooperative Test Service.

  25. Freeman, Frank N. 1917. A critique of the Yerkes-Bridges-Hardwick comparison of the Binet-Simon and point scales. Psychol. Rev., 24, 484.

  26. ——. 1939. Mental tests: Their history, principles, and applications. Cambridge, Mass.: The Riverside Press, Rev.

  27. Frisch, Ragnar. 1934. Statistical confluence analysis by means of complete regression systems. Oslo.

  28. Garrett, Henry E. 1943. The discriminant function and its use in psychology. Psychometrika, 8, 65–79.

  29. Guilford, J. P. 1936 (a). The determination of item difficulty when chance success is a factor. Psychometrika, 1, 259–264.

  30. ——. 1936 (b). Psychometric methods. New York: McGraw-Hill.

  31. ——. 1937. The psychophysics of mental test difficulty. Psychometrika, 2, 121–133.

  32. Gulliksen, Harold. 1936. The content reliability of a test. Psychometrika, 1, 189–194.

  33. Hawkes, H. E., Lindquist, E. F., and Mann, C. R. 1936. The construction and use of achievement examinations. Boston: Houghton-Mifflin Company.

  34. Hildreth, G. H. 1939. A bibliography of mental tests and rating scales. 2nd Ed. New York: The Psychological Corporation. Pp. xxiv + 295.

  35. Holmes, Henry W. 1917. A descriptive bibliography of measurement in elementary subjects. Cambridge, Mass.: Harvard Univ. Press.

  36. Holzinger, Karl J., and Clayton, Blythe. 1925. Further experiments in the application of Spearman's prophecy formula. J. educ. Psychol., 16, 289–299.

  37. Horst, Paul. 1934 (a). Item selection by the method of successive residuals. J. exper. Educ., 2, 254–263.

  38. ——. 1934 (b). Increasing the efficiency of selection tests. The Personnel Journal, 12, 254–259.

  39. ——. 1936 (a). Obtaining a composite measure from different measures of the same attributes. Psychometrika, 1, 53–60.

  40. ——. 1936 (b). Item selection by means of a maximizing function. Psychometrika, 1, 229–244.

  41. Hull, Clark L. 1928. Aptitude testing. New York: World Book Company. Pp. xiv + 535.

  42. Kelley, Truman L. 1927. Interpretation of educational measurements. New York: World Book Company.

  43. ——. 1924. Statistical methods. New York: Macmillan Company. Pp. xi + 389.

  44. ——. 1923. The principles and techniques of mental measurement. Amer. J. Psychol., 34, 408–432.

  45. Kuder, G. F. 1937. Nomograph for point biserial r, biserial r, and fourfold correlations. Psychometrika, 2, 135–138.

  46. Kuder, G. F., and Richardson, M. W. 1937. The theory of the estimation of test reliability. Psychometrika, 2, 151–160.

  47. Lee, J. M., and Symonds, P. M. 1934. New type or objective tests: a summary of investigations (Oct. 1931–Oct. 1933). J. educ. Psychol., 25, 161–184.

  48. Lentz, T. F., Hirshstein, Bertha, and Finch, J. H. 1932. Evaluation of methods of evaluating test items. J. educ. Psychol., 23, 344–350.

  49. Lindquist, E. F. 1940. Statistical analysis in educational research. New York: Houghton Mifflin Co.

  50. Long, John A., Sandiford, Peter, et al. 1935. The validation of test items. Bull. Dept. Educ. Res., Ontario Coll. Educ., No. 3, 126 pages.

  51. McCall, W. A. 1922. How to measure in education. New York: The Macmillan Company. Pp. xii + 416.

  52. Merrill, Walter W., Jr. 1937. Sampling theory in item analysis. Psychometrika, 2, 215–224.

  53. Monroe, Paul (Editor). 1939. Conference on examinations at Dinard, France, Sept. 16–19, 1938. New York: Bureau of Publications, Teachers College, Columbia University. Pp. xiii + 330.

  54. Monroe, Walter S. 1923. The theory of educational measurements. New York: Houghton-Mifflin Company.

  55. Monroe, Walter S. 1934. A note on efficiency of prediction. J. educ. Psychol., 25, 547–548.

  56. Moore, Clarence Carl. 1940. The rights-minus wrongs method of correcting chance factors in the T-F examination. J. genet. Psychol., 57, 317–326.

  57. Monroe, Walter S. and Englehart, Max D. 1936. Scientific study of educational problems. New York: The Macmillan Company.

  58. Mosier, Charles I. 1936. A note on item analysis and the criterion of internal consistency. Psychometrika, 1, 275–282.

  59. ——. 1940. Psychophysics and mental test theory: fundamental postulates and elementary theorems. Psychol. Rev., 47, 355–366.

  60. National Society for the Study of Education. 1918. 17th Yearbook, Part II. Bloomington, Ill.: Public School Publishing Company.

  61. Orleans, Jacob S. 1937. Measurement in education. New York: Thomas Nelson and Sons. Pp. xvi + 461.

  62. Otis, A. S. 1922 (a). The method for finding the correspondence between scores in two tests. J. educ. Psychol., 13, 529–545.

  63. ——. 1922 (b). A method of inferring the change in a coefficient of correlation resulting from a change in the heterogeneity of the group. J. educ. Psychol., 13, 293–294.

  64. Otis, A. S., and Knollin, H. E. 1921. The reliability of the Binet scale and of pedagogical scales. J. educ. Research, 4, 121–142.

  65. Richardson, M. W. 1935. Abac for computing tetrachoric coefficients in item analysis. Chicago: Univ. Chicago Board of Examination.

  66. ——. 1936 (a). Notes on the rationale of item analysis. Psychometrika, 1, 69–76.

  67. ——. 1936 (b). The relation of difficulty to the differential validity of a test. Psychometrika, 1, 33–49.

  68. Richardson, M. W., and Adkins, Dorothy C. 1938. A rapid method of selecting test items. J. educ. Psychol., 29, 547–552.

  69. Richardson, M. W., and Kuder, G. F. 1939. The computation of test reliability by the method of rational equivalence. J. educ. Psychol., 30, 681–687.

  70. Ruch, G. M. and Stoddard, G. P. 1927. Tests and measurements in high-school instruction. New York: World Book Company.

  71. Ruch, G. M., Ackerson, L., and Jackson, J. P. 1926. An empirical study of the Spearman-Brown formula as applied to educational test material. J. educ. Psychol., 17, 309–313.

  72. Ruger, Georgie J. 1918. Bibliography of psychological tests. New York: Bureau of Educational Measurements.

  73. Rulon, Phillip J. 1939. A simplified procedure for determining the reliability of a test by split halves. Harvard educ. Rev., 9, 99–103.

  74. Segel, David. 1933. A note of an error made in investigations of homogeneous grouping. J. educ. Psychol., 24, 64–66.

  75. Smith, B. O. 1938. Logical aspects of educational measurement. New York: Columbia Univ. Press.

  76. Spearman, Charles. 1910. Correlation from faulty data. Brit. J. Psychol., 3, 271–295.

  77. ——. 1907. Demonstration of formulae for true measurement of correlation. Amer. J. Psychol., 18, 161–169.

  78. Stalnaker, J. M. 1938. Weighting questions in the essay-type examination. J. educ. Psychol., 29, 481–490.

  79. Stalnaker, J. M. and Richardson, M. W. 1933. A note concerning the combination of test scores. J. gen. Psychol., 8, 460–463.

  80. Starch, D., and Elliot, E. C. 1912. Reliability of grading high-school work in English. School Review, September, 442–457.

  81. --. 1913. Reliability of grading high-school work in mathematics. School Review, April, 254–259.

  82. --. 1913. Reliability of grading high-school work in history. School Review, December, 676–681.

  83. Stern, William. 1914. The psychological methods of testing intelligence. Baltimore: Warwick and York.

  84. Swineford, F. 1936. Validity of test items. J. educ. Psychol., 27, 68–78.

  85. Thorndike, E. L. 1922. On finding equivalent scores in tests of intelligence. J. appl. Psychol., 6, 29–33.

  86. Thurstone, L. L. 1919. A method for scoring tests. Psychol. Bull., 16, 235–240.

  87. —— 1925. A method of scaling psychological and educational tests. J. educ. Psychol., 16, 433–451.

  88. —— 1926. The mental age concept. Psychol. Rev., 33, 268–278.

  89. —— 1928. The absolute zero in intelligence measurement. Psychol. Rev., 35, 175–197.

  90. —— 1931. The reliability and validity of tests. Ann Arbor, Mich.: Edwards Brothers. Planographed.

  91. —— 1924. Fundamentals of statistics. New York: The Macmillan Company. Pp. xvi + 237.

  92. Thurstone, T. G. 1932. The difficulty of a test and its diagnostic value. J. educ. Psychol., 23, 335–343.

  93. Toops, H. A., and Symonds, P. M. 1922. What shall we expect of the A. Q.? J. educ. Psychol., 13, 513–528.

  94. Travers, R. M. W. 1939. The use of a discriminant function in the treatment of psychological group differences. Psychometrika, 4, 25–32.

  95. Walker, Helen M. 1929. Studies in the history of statistical method. Baltimore: The Williams and Wilkins Company. Pp. 186.

  96. Whipple, Guy M. 1910. Manual of mental and physical tests. Baltimore: Warwick and York. Vols. I and II.

  97. Wilks, S. S. 1938. Weighting systems for linear functions of correlated variables when there is no dependent variable. Psychometrika, 3, 23–40.

  98. Yerkes, R. M., Bridges, J. W., and Hardwick, R. S. 1915. A point scale for measuring mental ability. Baltimore: Warwick and York.

Additional information

On leave from the University of Chicago for a government research project at the College Entrance Examination Board, Princeton, New Jersey.

Gulliksen, H. A course in the theory of mental tests. Psychometrika 8, 223–245 (1943). https://doi.org/10.1007/BF02288706
