Statistical Inference for Non-inferiority of a Diagnostic Procedure Compared to an Alternative Procedure, Based on the Difference in Correlated Proportions from Multiple Raters

  • Hiroyuki Saeki
  • Toshiro Tango


In a clinical trial designed to show the non-inferiority of a diagnostic procedure, efficacy is generally evaluated on the basis of results from multiple raters who interpret and report their findings independently. Although the results from multiple raters can be reduced to those of a single rater by using consensus evaluations or majority votes, this approach is not recommended for the primary evaluation. All results from the multiple independent raters should therefore be used in the analysis. This chapter presents a non-inferiority test, a confidence interval, and a sample size formula for inference on the difference in correlated proportions between two diagnostic procedures evaluated by multiple raters. We illustrate the methods with data from studies of diagnostic procedures for oesophageal carcinoma infiltrating the tracheobronchial tree and for aneurysm in patients with acute subarachnoid haemorrhage.
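For orientation only, the quantity at the centre of the abstract, a difference in correlated proportions, can be sketched for the simplest case of a single rater. The snippet below is not the multiple-rater method of this chapter: it is a plain Wald-type statistic for paired binary data, and the function names, the example readings, and the margin of 0.1 are all hypothetical choices made for illustration.

```python
from math import sqrt

def paired_difference(new, std):
    """Difference in correlated proportions for one rater.

    new, std: 0/1 results of the new and the standard procedure
    on the same patients (hypothetical input format).
    """
    n = len(new)
    # Discordant pairs: b = new positive / standard negative,
    #                   c = new negative / standard positive.
    b = sum(1 for x, y in zip(new, std) if x == 1 and y == 0)
    c = sum(1 for x, y in zip(new, std) if x == 0 and y == 1)
    return (b - c) / n, b, c, n

def wald_noninferiority_z(new, std, margin):
    """Wald-type z statistic for H0: p_new - p_std <= -margin.

    Assumption: single rater, large-sample Wald variance; this is
    an illustrative stand-in, not the test proposed in the chapter.
    """
    diff, b, c, n = paired_difference(new, std)
    var = ((b + c) - (b - c) ** 2 / n) / n ** 2
    return (diff + margin) / sqrt(var)

# Made-up readings for ten patients.
new_proc = [1, 1, 1, 0, 1, 0, 1, 1, 0, 1]
std_proc = [1, 0, 1, 1, 1, 0, 1, 0, 0, 1]
diff, b, c, n = paired_difference(new_proc, std_proc)
z = wald_noninferiority_z(new_proc, std_proc, margin=0.1)
```

Large positive values of `z` would favour non-inferiority in this single-rater setting. With several raters reading the same patients, the results are correlated within each patient, which is why a dedicated method is needed.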


Keywords: Magnetic Resonance Angiography; Digital Subtraction Angiography; Majority Vote; Oesophageal Carcinoma; Consensus Evaluation



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  1. FUJIFILM RI Pharma Co. LTD., Tokyo, Japan
  2. Center for Medical Statistics, Tokyo, Japan
