Analytical Solution for the Optimal Addition of an Item to a Composite of Scores for Maximum Reliability

  • Carlos A. Ferrer
  • Idileisy Torres-Rodríguez
  • Alberto Taboada-Crispi
  • Elmar Nöth
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11896)


This paper derives the optimal weight to assign to an item so that it maximally increases the reliability of the aggregate, where the aggregate serves as the best estimate of the underlying true repeating pattern. The approach differs from previous solutions in three respects: it is analytical, it is based on the signal-to-noise ratio (SNR) rather than on the reliability itself, and it can visually inform the researcher about the relevance of the weighting strategy and the resulting gains in SNR. Optimal weighting of repetitive phenomena is useful not only in the behavioral sciences but also in many engineering fields. Depending on the field, its uses may include selecting or discarding raters, judges, repetitions, or epochs.
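To make the idea concrete, the following is a minimal sketch under simplifying assumptions that are not taken from the paper: the composite and the new item carry the same true signal with the same gain, and their noise terms are uncorrelated. Under those assumptions the SNR of "composite + w · item" is (1 + w)² · s / (n_c + w² · n_x), which is maximized at w = n_c / n_x, and reliability relates to SNR as SNR / (1 + SNR). All function and variable names are hypothetical and chosen for illustration only; this is not the authors' derivation.

```python
def combined_snr(w, signal_var, noise_var_composite, noise_var_item):
    """SNR of (composite + w * item).

    Assumes (illustrative only) that both share the same true signal
    with unit gain and that their noise terms are uncorrelated.
    """
    signal_power = (1.0 + w) ** 2 * signal_var
    noise_power = noise_var_composite + w ** 2 * noise_var_item
    return signal_power / noise_power


def optimal_weight(noise_var_composite, noise_var_item):
    """Weight maximizing combined_snr under the assumptions above."""
    return noise_var_composite / noise_var_item


def reliability_from_snr(snr):
    """Classical relation: true variance / observed variance."""
    return snr / (1.0 + snr)


# Hypothetical numbers: a fairly clean composite and a noisy new item.
signal_var = 1.0
noise_var_composite = 0.25
noise_var_item = 1.0

snr_before = signal_var / noise_var_composite
snr_unit = combined_snr(1.0, signal_var, noise_var_composite, noise_var_item)
w_opt = optimal_weight(noise_var_composite, noise_var_item)
snr_opt = combined_snr(w_opt, signal_var, noise_var_composite, noise_var_item)

for label, snr in [("before", snr_before), ("unit weight", snr_unit), ("optimal", snr_opt)]:
    print(f"{label:12s} SNR={snr:.3f} reliability={reliability_from_snr(snr):.3f}")
```

With these illustrative values, adding the noisy item with unit weight lowers the SNR below that of the original composite, while the optimal weight raises it. In this simplified setting the optimally weighted SNR equals the sum of the two individual SNRs.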


Keywords: Reliability, Signal-to-Noise Ratio, Composites, Ensemble Averages



This work was partially supported by an Alexander von Humboldt Foundation Fellowship granted to one of the authors (Ref 3.2-1164728-CUB-GF-E).



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Informatics Research Center, Central University “Marta Abreu” de las Villas, Santa Clara, Cuba
  2. Pattern Recognition Lab, Friedrich Alexander University Erlangen-Nuremberg, Erlangen, Germany
