
Discordancy Tests for Univariate Data

  • Surendra P. Verma
Chapter

Abstract

In this chapter, we present statistical tests called discordancy tests (Barnett and Lewis in Outliers in statistical data. Wiley, Chichester, 1994), which are of great importance for the appropriate handling of experimental data. The outdated pre-1950 population-based procedure commonly encountered in most books has been shown to be invalid and probably statistically incorrect for handling finite-sized experimental data. The statistically correct post-1950 discordancy tests consist of single-outlier, multiple-outlier, recursive, and robust tests, of which the best combination was found to be three conventional and two new recursive tests with prior application of the corresponding single-outlier tests. The recommended procedure offers the best combination of the highest test performance (power of test) and the lowest swamping and masking effects. Critical values play an essential role in the functioning of these tests; precise and accurate values were simulated and incorporated into new computer programs. From the discussion of Type I and Type II errors, it is argued that the statistical tests should be applied at the strict 99% confidence level, and not at the 95% level that is common in most books and research publications. Similarly, to reduce these errors, sample sizes should be increased as much as possible. The proposed methodology is illustrated with the compositional data of the geochemical reference material (GRM) BHVO-1. The mean, standard deviation, and related statistical parameters should be calculated only after the application of appropriate discordancy tests. Finally, the parameter z is briefly explained.
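
To make the idea of a single-outlier discordancy test and its simulated critical value concrete, the following is a minimal Python sketch. It is an illustration under stated assumptions, not the chapter's recommended procedure or software: it uses a Grubbs-type statistic (the largest absolute deviation from the mean divided by the sample standard deviation) and approximates the 99% critical value by a small Monte Carlo simulation of normal samples. The function names, the number of trials, and the seed are hypothetical choices made for this example, and the simulated critical value is far less precise than tabulated values.

```python
import random
import statistics


def grubbs_statistic(sample):
    """Grubbs-type statistic: max |x_i - mean| / s, with s the sample SD."""
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)  # uses n - 1 in the denominator
    return max(abs(x - mean) for x in sample) / sd


def simulated_critical_value(n, confidence=0.99, trials=20000, seed=0):
    """Approximate the critical value from simulated normal samples of size n.

    Assumption for this sketch: a plain empirical quantile of the simulated
    statistics is good enough for illustration.
    """
    rng = random.Random(seed)
    simulated = sorted(
        grubbs_statistic([rng.gauss(0.0, 1.0) for _ in range(n)])
        for _ in range(trials)
    )
    index = min(trials - 1, int(confidence * trials))
    return simulated[index]


def is_discordant(sample, confidence=0.99, trials=20000, seed=0):
    """True if the most extreme observation exceeds the simulated critical value."""
    critical = simulated_critical_value(len(sample), confidence, trials, seed)
    return grubbs_statistic(sample) > critical


if __name__ == "__main__":
    # Hypothetical measurements with one suspicious value (15.0).
    data = [10.1, 9.9, 10.0, 10.2, 9.8, 10.05, 9.95, 10.1, 9.9, 15.0]
    print(is_discordant(data))
```

In a workflow of the kind the abstract describes, the mean and standard deviation would be reported only after such a test has been applied and any discordant observation has been dealt with. A sketch like this tests one extreme value at a time and does not address the masking and swamping effects that motivate the multiple-outlier and recursive tests.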

References

  1. Ando, A., Kamioka, H., Terashima, S., & Itoh, S. (1989). 1988 values for GSJ rock reference samples, “igneous rock series”. Geochemical Journal, 23, 143–148.
  2. Barnett, V., & Lewis, T. (1994). Outliers in statistical data. Chichester: Wiley.
  3. Chen, E. H. (1971). The power of the Shapiro-Wilk W test for normality in samples from contaminated normal distributions. Journal of the American Statistical Association, 66, 760–762.
  4. Chhikara, R. S., & Feiveson, A. L. (1980). Extended critical value of extreme studentized deviate test statistics for detecting multiple outliers. Communications in Statistics - Theory and Methods, B9, 155–166.
  5. Croux, C., & Rousseeuw, P. J. (1992). Time-efficient algorithms for two highly robust estimators of scale. Computational Statistics, 2, 411–428.
  6. Dawson, R. (2011). How significant is a boxplot outlier? Journal of Statistics Education, 19, 1–13.
  7. Díaz-González, L., & Cruz-Huicochea, R. (2013). Aplicación de las pruebas estadísticas de discordancia y significancia en la comparación del vulcanismo dacítico de la parte central de Cinturón Volcánico Mexicano. Nova Scientia, 6, 158–178.
  8. Dixon, W. J. (1950). Analysis of extreme values. The Annals of Mathematical Statistics, 21, 488–506.
  9. Dixon, W. J. (1951). Ratios involving extreme values. The Annals of Mathematical Statistics, 22, 68–78.
  10. Dybczynski, R. (1980). Comparison of the effectiveness of various procedures for the rejection of outlying results and assigning consensus values in interlaboratory programs involving determination of trace elements or radionuclides. Analytica Chimica Acta, 117, 53–70.
  11. Gladney, E. S. (1981). Comparison of methods for calculation of recommended elemental concentrations for Canadian certified reference materials project rock standards SY-2, SY-3, and MRG-1 (pp. 4). Los Alamos, New Mexico: Los Alamos Scientific Laboratory, University of California.
  12. Gladney, E. S., & Burns, C. E. (1984). 1982 compilation of elemental concentration data for the United States Geological Survey's geochemical exploration reference samples GXR-1 to GXR-6. Geostandards Newsletter, 8, 119–154.
  13. Gladney, E. S., & Roelandts, I. (1988). 1987 compilation of elemental concentration data for USGS BHVO-1, MAG-1, QLO-1, RGM-1, SCo-1, SDC-1, SGR-1, and STM-1. Geostandards Newsletter, 12, 253–262.
  14. Gladney, E. S., Jones, E. A., Nickell, E. J., & Roelandts, I. (1992). 1988 compilation of elemental concentration data for USGS AGV-1, GSP-1 and G-2. Geostandards Newsletter, 16, 111–300.
  15. González-Ramírez, R., Díaz-González, L., & Verma, S. P. (2009). Eficiencia relativa de 15 pruebas de discordancia con 33 variantes aplicadas al procesamiento de datos geoquímicos. Revista Mexicana de Ciencias Geológicas, 26, 501–515.
  16. Grubbs, F. E. (1950). Sample criteria for testing outlying observations. The Annals of Mathematical Statistics, 21, 27–58.
  17. Grubbs, F. E. (1969). Procedures for detecting outlying observations in samples. Technometrics, 11, 1–21.
  18. Grubbs, F. E., & Beck, G. (1972). Extension of sample sizes and percentage points for significance tests of outlying observations. Technometrics, 14, 847–854.
  19. Guevara, M., Verma, S. P., & Velasco-Tapia, F. (2001). Evaluation of GSJ intrusive rocks JG1, JG2, JG3, JG1a, and JGb1 by an objective outlier rejection statistical procedure. Revista Mexicana de Ciencias Geológicas, 18, 74–88.
  20. Hawkins, D. M. (1979). Fractiles of an extended multiple outlier test. Journal of Statistical Computation and Simulation, 8, 227–236.
  21. Hawkins, D. M., & Perold, A. F. (1977). On the joint distribution of left- and right-sided outlier statistics. Utilitas Mathematica, 12, 129–143.
  22. Hayes, K., & Kinsella, T. (2003). Spurious and non-spurious power in performance criteria for tests of discordancy. The Statistician, 52, 69–82.
  23. Hayes, K., Kinsella, A., & Coffey, N. (2007). A note on the use of outlier criteria in Ontario laboratory quality control schemes. Clinical Biochemistry, 40, 147–152.
  24. Iglewicz, B., & Hoaglin, D. C. (1993). How to detect and handle outliers. Milwaukee, WI: ASQC Quality Press.
  25. Iglewicz, B., & Martínez, J. (1982). Outlier detection using robust measures of scale. Journal of Statistical Computation and Simulation, 15, 285–293.
  26. Imai, N., Terashima, S., Itoh, S., & Ando, A. (1995a). 1994 compilation of analytical data for minor and trace elements in seventeen GSJ geochemical reference samples, “igneous rock series”. Geostandards Newsletter, 19, 135–213.
  27. Imai, N., Terashima, S., Itoh, S., & Ando, A. (1995b). 1994 compilation values for GSJ reference samples, “Igneous rock series”. Geochemical Journal, 29, 91–95.
  28. Itoh, S., Terashima, S., Imai, N., Kamioka, H., Mita, N., & Ando, A. (1993). 1992 compilation of analytical data for rare-earth elements, scandium, yttrium, zirconium and hafnium. Geostandards Newsletter, 17, 5–79.
  29. Jain, R. B. (1981a). Detecting outliers: Power and some other considerations. Communications in Statistics - Theory and Methods, 10, 2299–2314.
  30. Jain, R. B. (1981b). Percentage points of many-outlier detection procedures. Technometrics, 23, 71–75.
  31. Jain, R. B., & Pingel, L. A. (1981a). On the robustness of recursive outlier detection procedures to nonnormality. Communications in Statistics - Theory and Methods, 10, 1323–1334.
  32. Jain, R. B., & Pingel, L. A. (1981b). A procedure for estimating the number of outliers. Communications in Statistics - Theory and Methods, 10, 1029–1041.
  33. Jain, J. C., Pingel, L. A., & Davidson, J. L. (1982). A unified approach for estimation and detection of outliers. Communications in Statistics - Theory and Methods, 11, 2953–2976.
  34. Maronna, R. A., & Zamar, R. H. (2002). Robust estimates of location and dispersion for high-dimensional datasets. Technometrics, 44, 307–317.
  35. Marroquín-Guerra, S. G., Velasco-Tapia, F., & Díaz-González, L. (2009). Evaluación estadística de Materiales de Referencia Geoquímica del Centre de Recherches Pétrographiques et Géochimiques (Francia) aplicando un esquema de detección y eliminación de valores desviados. Revista Mexicana de Ciencias Geológicas, 26, 530–542.
  36. Miller, J. N., & Miller, J. C. (2005). Statistics and chemometrics for analytical chemistry (5th ed.). Essex CM20 2JE, England: Pearson Prentice Hall.
  37. Miller, J. N., & Miller, J. C. (2010). Statistics and chemometrics for analytical chemistry (6th ed.). Essex CM20 2JE, England: Pearson Prentice Hall.
  38. Pandarinath, K. (2009). Evaluation of geochemical sedimentary reference materials of the Geological Survey of Japan (GSJ) by an objective outlier rejection statistical method. Revista Mexicana de Ciencias Geológicas, 26, 638–646.
  39. Pearson, E. S., & Chandra Sekar, C. (1936). The efficiency of statistical tools and a criterion for the rejection of outlying observations. Biometrika, 28, 308–320.
  40. Pearson, E. S., & Hartley, H. O. (1966). Biometrika tables for statisticians. Cambridge: University Press.
  41. Rosales Rivera, M. (2018). Desarrollo de herramientas estadísticas computacionales con nuevos valores críticos generados por simulación computacional. In Instituto de Investigación en Ciencias Básicas y Aplicadas, Centro de Investigación en Ciencias (pp. 105). Cuernavaca, Morelos, Mexico: Universidad Autónoma del Estado de Morelos.
  42. Rosales-Rivera, M., Díaz-González, L., & Verma, S. P. (2014). Comparative performance of thirteen single outlier discordancy tests from Monte Carlo simulations. In IAMG16: Geostatistical and Geospatial Approaches for the Characterization of Natural Resources in the Environment: Challenges, Processes and Strategies, pp. 4. New Delhi: International Association of Mathematical Geology.
  43. Rosales-Rivera, M., Díaz-González, L., & Verma, S. P. (2018). A new online computer program (BiDASys) for ordinary and uncertainty weighted least-squares linear regressions: case studies from food chemistry. Revista Mexicana de Ingeniería Química, 17, 507–522.
  44. Rosales-Rivera, M., Díaz-González, L., & Verma, S. P. (2019). Evaluation of nine USGS reference materials for quality control through Univariate Data Analysis System, UDASys3. Arabian Journal of Geosciences, 12, 40. https://doi.org/10.1007/s12517-018-4220-0
  45. Rosner, B. (1975). On the detection of many outliers. Technometrics, 17, 221–227.
  46. Rosner, B. (1977). Percentage points for the RST many outlier procedure. Technometrics, 19, 307–312.
  47. Rousseeuw, P. J., & Croux, C. (1993). Alternatives to the median absolute deviation. Journal of the American Statistical Association, 88, 1273–1283.
  48. Royston, J. P. (1989). Correcting the Shapiro-Wilk W for ties. Journal of Statistical Computation and Simulation, 31, 237–249.
  49. Shapiro, S. S., & Wilk, M. B. (1965). An analysis of variance test for normality (complete samples). Biometrika, 52, 591–611.
  50. Shapiro, S. S., Wilk, M. B., & Chen, H. J. (1968). A comparative study of various tests for normality. Journal of the American Statistical Association, 63, 1343–1371.
  51. Tietjen, G. L., & Moore, R. H. (1972). Some Grubbs-type statistics for the detection of several outliers. Technometrics, 14, 583–597.
  52. Velasco-Tapia, F., Guevara, M., & Verma, S. P. (2001). Evaluation of concentration data in geochemical reference materials. Chemie der Erde, 61, 69–91.
  53. Verma, S. P. (1997). Sixteen statistical tests for outlier detection and rejection in evaluation of international geochemical reference materials: example of microgabbro PM-S. Geostandards Newsletter: The Journal of Geostandards and Geoanalysis, 21, 59–75.
  54. Verma, S. P. (1998). Improved concentration data in two international geochemical reference materials (USGS basalt BIR-1 and GSJ peridotite JP-1) by outlier rejection. Geofísica Internacional, 37, 215–250.
  55. Verma, S. P. (2005). Estadística básica para el manejo de datos experimentales: aplicación en la Geoquímica (Geoquimiometría). México, D.F.: UNAM.
  56. Verma, S. P. (2009). Evaluation of polynomial regression models for the Student t and Fisher F critical values, the best interpolation equations from double and triple natural logarithm transformation of degrees of freedom up to 1000, and their applications to quality control in science and engineering. Revista Mexicana de Ciencias Geológicas, 26, 79–92.
  57. Verma, S. P. (2012). Geochemometrics. Revista Mexicana de Ciencias Geológicas, 29, 276–298.
  58. Verma, S. P. (2016). Análisis estadístico de datos composicionales. CDMX: Universidad Nacional Autónoma de México.
  59. Verma, S. P., & Cruz-Huicochea, R. (2013). Alternative approach for precise and accurate Student’s t critical values and application in geosciences. Journal of Iberian Geology, 39, 31–56.
  60. Verma, S. P., & Díaz-González, L. (2012). Application of the discordant outlier detection and separation system in the geosciences. International Geology Review, 54, 593–614.
  61. Verma, S. P., & Quiroz-Ruiz, A. (2006a). Critical values for six Dixon tests for outliers in normal samples up to sizes 100, and applications in science and engineering. Revista Mexicana de Ciencias Geológicas, 23, 133–161.
  62. Verma, S. P., & Quiroz-Ruiz, A. (2006b). Critical values for 22 discordancy test variants for outliers in normal samples up to sizes 100, and applications in science and engineering. Revista Mexicana de Ciencias Geológicas, 23, 302–319.
  63. Verma, S. P., & Quiroz-Ruiz, A. (2008). Critical values for 33 discordancy test variants for outliers in normal samples of very large sizes from 1,000 to 30,000 and evaluation of different regression models for the interpolation of critical values. Revista Mexicana de Ciencias Geológicas, 25, 369–381.
  64. Verma, S. P., & Quiroz-Ruiz, A. (2011). Corrigendum to critical values for 22 discordancy test variants for outliers in normal samples up to sizes 100, and applications in science and engineering [Revista Mexicana de Ciencias Geológicas, 23, 302–319 (2006)]. Revista Mexicana de Ciencias Geológicas, 28, 202.
  65. Verma, S. P., & Rivera-Gómez, M. A. (2013). Computer programs for the classification and nomenclature of igneous rocks. Episodes, 36, 115–124.
  66. Verma, S. P., Orduña-Galván, L. J., & Guevara, M. (1998). SIPVADE: A new computer programme with seventeen statistical tests for outlier detection in evaluation of international geochemical reference materials and its application to Whin Sill dolerite WS-E from England and soil-5 from Peru. Geostandards Newsletter: The Journal of Geostandards and Geoanalysis, 22, 209–234.
  67. Verma, S. P., Quiroz-Ruiz, A., & Díaz-González, L. (2008). Critical values for 33 discordancy test variants for outliers in normal samples up to sizes 1000, and applications in quality control in earth sciences. Revista Mexicana de Ciencias Geológicas, 25, 82–96.
  68. Verma, S. P., Díaz-González, L., & González-Ramírez, R. (2009). Relative efficiency of single-outlier discordancy tests for processing geochemical data on reference materials and application to instrumental calibration by a weighted least-squares linear regression model. Geostandards and Geoanalytical Research, 33, 29–49.
  69. Verma, S. P., Cruz-Huicochea, R., & Díaz-González, L. (2013). Univariate data analysis system: Deciphering mean compositions of island and continental arc magmas, and influence of underlying crust. International Geology Review, 55, 1922–1940.
  70. Verma, S. P., Díaz-González, L., Rosales-Rivera, M., & Quiroz-Ruiz, A. (2014). Comparative performance of four single extreme outlier discordancy tests from Monte Carlo simulations. Scientific World Journal, 2014, p. 27. Article ID 746451. https://doi.org/10.1155/2014/746451
  71. Verma, S. P., Torres-Sánchez, D., Velasco-Tapia, F., Subramanyam, K. S. V., Manikyamba, C., & Bhutani, R. (2016a). Geochemistry and petrogenesis of extension-related magmas close to the volcanic front of the central part of the Trans-Mexican Volcanic Belt. Journal of South American Earth Sciences, 72, 126–136.
  72. Verma, S. P., Pandarinath, K., & Rivera-Gómez, M. A. (2016b). Evaluation of the ongoing rifting and subduction processes in the geochemistry of magmas from the western part of the Mexican Volcanic Belt. Journal of South American Earth Sciences, 66, 125–148.
  73. Verma, S. P., Díaz-González, L., Pérez-Garza, J. A., & Rosales-Rivera, M. (2016c). Quality control in geochemistry from a comparison of four central tendency and five dispersion estimators and example of a geochemical reference material. Arabian Journal of Geosciences, 9, 740.
  74. Verma, S. P., Rosales-Rivera, M., Díaz-González, L., & Quiroz-Ruiz, A. (2017a). Improved composition of Hawaiian basalt BHVO-1 from the application of two new and three conventional recursive discordancy tests. Turkish Journal of Earth Sciences, 26, 331–353.
  75. Verma, S. P., Díaz-González, L., Pérez-Garza, J. A., & Rosales-Rivera, M. (2017b). Erratum to: Quality control in geochemistry from a comparison of four central tendency and five dispersion estimators and example of a geochemical reference material. Arabian Journal of Geosciences, 10, 24.

Copyright information

© Springer Nature Singapore Pte Ltd. 2020

Authors and Affiliations

  1. Instituto de Energías Renovables, Universidad Nacional Autónoma de México, Temixco, Mexico
