Journal of Gambling Studies, Volume 28, Issue 4, pp 573–605

The Validation of Screening Tests: Meet the New Screen same as the Old Screen?

  • Blase Gambino
Original Paper


This report examines the process of validating new screening tests designed to detect the problem gambler in research and practice settings. A hierarchical, or phases-of-evaluation, model is presented as a conceptual framework to describe the basic features of the validation process and its implications for the application and interpretation of test results. The report describes a number of threats to validity in the form of sources of unintended bias that, when unrecognized, may lead to incorrect interpretations of study results and to incorrect conclusions about the usefulness of new screening tests. Examples drawn from the literature on problem gambling illustrate some of the more important concepts, including spectrum bias and clinical variation in test accuracy. The concept of zones of severity and the bias inherent in selecting criterion thresholds are reviewed. A definition of the reference, or study gold, standard is provided. The use of two-stage designs, which apply reference standards efficiently to determine indices of accuracy and prevalence, is recommended for establishing validity.


Keywords: Validation · Screening tests · Unintended bias · Phases of evaluation research · Gold standards · Reference standards · Severity continuum
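The two-stage design the abstract recommends can be sketched numerically: screen everyone with the new test in phase one, verify only a subsample against the reference standard in phase two (oversampling screen-positives), then weight each verified case by the inverse of its sampling fraction to recover unbiased estimates of prevalence, sensitivity, and specificity. The function and all numbers below are hypothetical illustrations, not figures from the article.

```python
# Hypothetical sketch of a two-phase (double-sampling) validation design.
# Phase 1: screen N subjects with the new test.
# Phase 2: verify a stratified subsample with the reference standard.

def two_phase_estimates(n_pos, n_neg, f_pos, f_neg, tp, fp, fn, tn):
    """Inverse-probability-weighted estimates of accuracy and prevalence.

    n_pos, n_neg : phase-1 counts of screen-positives / screen-negatives
    f_pos, f_neg : verification sampling fractions in each screen stratum
    tp, fp       : verified screen-positives that are true / false positives
    fn, tn       : verified screen-negatives that are false / true negatives
    """
    # Weight each verified subject by the inverse of its sampling fraction,
    # so the verified subsample stands in for the full screened cohort.
    w_tp, w_fp = tp / f_pos, fp / f_pos
    w_fn, w_tn = fn / f_neg, tn / f_neg
    n = n_pos + n_neg
    prevalence = (w_tp + w_fn) / n
    sensitivity = w_tp / (w_tp + w_fn)
    specificity = w_tn / (w_tn + w_fp)
    return prevalence, sensitivity, specificity

# Example: 1,000 screened, 100 screen-positive; verify every positive
# (f_pos = 1.0) but only 10% of negatives (f_neg = 0.1).
prev, sens, spec = two_phase_estimates(
    n_pos=100, n_neg=900, f_pos=1.0, f_neg=0.1,
    tp=80, fp=20, fn=2, tn=88)
# Weighted back up, the 2 verified false negatives represent 20 missed
# cases, so prevalence = 0.10 and sensitivity = 0.80 rather than the
# naive (verification-biased) 80/82 one would get from the verified
# subsample alone.
```

Naively computing sensitivity only among verified subjects inflates it (verification bias); the reweighting step is what makes the partial use of the reference standard efficient yet unbiased.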



Copyright information

© Springer Science+Business Media, LLC 2011

Authors and Affiliations

  1. American Academy of Health Care Providers in the Addictive Disorders, Boston, USA