What sample sizes for reliability and validity studies in neurology?

Hobart, Jeremy C.; Cano, Stefan J.; Warner, Thomas T.; Thompson, Alan J.

doi:10.1007/s00415-012-6570-y

Original Communication
Published: 24 June 2012

Journal of Neurology, volume 259, pages 2681–2694 (2012)

Jeremy C. Hobart¹,⁴, Stefan J. Cano¹, Thomas T. Warner² & Alan J. Thompson³

Abstract

Rating scales are increasingly used in neurologic research and trials. A key question relating to their use across the range of neurologic diseases, both common and rare, is what sample sizes provide meaningful estimates of reliability and validity. Here, we address two questions: (1) to what extent does sample size influence the stability of reliability and validity estimates; and (2) to what extent does sample size influence the inferences made from reliability and validity testing? We examined data from two studies. In Study 1, we retrospectively reduced the total sample randomly and nonrandomly by decrements of approximately 50 % to generate sub-samples from n = 713–20. In Study 2, we prospectively generated sub-samples from n = 20–320, by entry time into study. In all samples we estimated reliability (internal consistency, item total correlations, test–retest) and validity (within scale correlations, convergent and discriminant construct validity). Reliability estimates were stable in magnitude and interpretation in all sub-samples of both studies. Validity estimates were stable in samples of n ≥ 80, for 75 % of scales in samples of n = 40, and for 50 % of scales in samples of n = 20. In this study, sample sizes of a minimum of 20 for reliability and 80 for validity provided estimates highly representative of the main study samples. These findings should be considered provisional and more work is needed to determine if these estimates are generalisable, consistent, and useful.
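The sub-sampling idea described in the abstract can be illustrated with a minimal sketch. It assumes simulated item responses (not the study data), nested random sub-samples obtained by repeated halving, and Cronbach's alpha as the only reliability statistic. The function names, the ten-item scale, and the data-generating model are hypothetical choices made for illustration. The nonrandom and entry-time sub-sampling and the validity analyses reported by the authors are not reproduced here.

```python
import random
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])
    # Sample variance of each item across respondents.
    item_vars = [statistics.variance([r[j] for r in rows]) for j in range(k)]
    # Sample variance of the summed scale score.
    total_var = statistics.variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def halving_sizes(n_total, n_min):
    """Sample sizes obtained by repeatedly halving n_total, stopping above n_min."""
    sizes = []
    n = n_total
    while n >= n_min:
        sizes.append(n)
        n //= 2
    return sizes


def simulate_responses(n, k, rng):
    """Hypothetical data: each item is a shared trait plus independent noise."""
    rows = []
    for _ in range(n):
        trait = rng.gauss(0, 1)
        rows.append([trait + rng.gauss(0, 1) for _ in range(k)])
    return rows


def alpha_by_sample_size(rows, sizes, rng):
    """Alpha in nested random sub-samples of the given sizes."""
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return {n: cronbach_alpha(shuffled[:n]) for n in sizes}


rng = random.Random(0)  # fixed seed so the sketch is repeatable
full_sample = simulate_responses(713, 10, rng)  # 713 matches the largest n in Study 1
sizes = halving_sizes(713, 20)
results = alpha_by_sample_size(full_sample, sizes, rng)
for n, alpha in results.items():
    print(f"n = {n:4d}  alpha = {alpha:.3f}")
```

Comparing the printed alpha values across rows gives a rough sense of how stable the estimate is as n shrinks. With integer halving the smallest sub-sample here is 22 rather than exactly 20, which is in keeping with the "approximately 50 %" decrements described above.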

Conflicts of interest

The authors declare that they have no conflict of interest related to this research.

Author information

Authors and Affiliations

Clinical Neurology Research Group, Peninsula College of Medicine and Dentistry, Tamar Science Park, Room N13 ITTC Building, Davy Road, Plymouth, UK
Jeremy C. Hobart & Stefan J. Cano
Department of Clinical Neurosciences, UCL Institute of Neurology, Royal Free Campus, London, UK
Thomas T. Warner
Department of Brain Repair and Rehabilitation, UCL Institute of Neurology, Queen Square, London, UK
Alan J. Thompson
Department of Clinical Neuroscience, Peninsula College of Medicine and Dentistry, Tamar Science Park, Room N16 ITTC Building, Davy Road, Plymouth, Devon, PL6 8BX, UK
Jeremy C. Hobart

Corresponding author

Correspondence to Jeremy C. Hobart.

About this article

Cite this article

Hobart, J.C., Cano, S.J., Warner, T.T. et al. What sample sizes for reliability and validity studies in neurology? J Neurol 259, 2681–2694 (2012). https://doi.org/10.1007/s00415-012-6570-y

Received: 08 March 2012
Revised: 20 May 2012
Accepted: 21 May 2012
Published: 24 June 2012
Issue Date: December 2012
DOI: https://doi.org/10.1007/s00415-012-6570-y
