Abstract
Measurement error is present in all quantitative studies, and ensuring proper biological inference requires that the effects of measurement error are fully scrutinized, understood, and to the extent possible, minimized. For morphometric data, measurement error is often evaluated from descriptive statistics that find ratios of subject or within-subject variance to total variance for a set of data comprising repeated measurements on the same research subjects. These descriptive statistics do not typically distinguish between random and systematic components of measurement error, even though the presence of the latter (even in small proportions) can have consequences for downstream biological inferences. Furthermore, merely sampling from subjects that are quite morphologically dissimilar can give the incorrect impression that measurement error (and its negative effects) is unimportant. We argue that a formal hypothesis-testing framework for measurement error in morphometric data is lacking. We propose a suite of new analytical methods and graphical tools that more fully interrogate measurement error, by disentangling its random and systematic components, and evaluating any group-specific systematic effects. Through the analysis of simulated and empirical data sets we demonstrate that our procedures properly parse components of measurement error, and characterize the extent to which they permeate variation in a sample of observations. We further confirm that traditional approaches with repeatability statistics are unable to discern these patterns, improperly assuaging potential concerns. We recommend that the approaches developed here become part of the current analytical paradigm in geometric morphometric studies. The new methods are made available in the RRPP and geomorph R-packages.
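The point about morphologically dissimilar subjects can be illustrated with a minimal sketch. It assumes a univariate trait, a balanced design, and the standard one-way ANOVA intraclass correlation as the repeatability statistic; the function name `repeatability` and all data values are hypothetical, and this is not the procedure implemented in RRPP or geomorph. The replicate errors are identical in both data sets; only the spread among subjects differs.

```python
from statistics import mean

def repeatability(measurements):
    """One-way ANOVA intraclass correlation for a balanced design.

    `measurements` is a list of subjects; each subject is a list of
    replicate measurements of the same univariate trait.
    """
    n = len(measurements)       # number of subjects
    r = len(measurements[0])    # replicates per subject
    grand = mean(x for subj in measurements for x in subj)
    subj_means = [mean(subj) for subj in measurements]
    ss_among = r * sum((m - grand) ** 2 for m in subj_means)
    ss_within = sum((x - m) ** 2
                    for subj, m in zip(measurements, subj_means)
                    for x in subj)
    ms_among = ss_among / (n - 1)
    ms_within = ss_within / (n * (r - 1))
    return (ms_among - ms_within) / (ms_among + (r - 1) * ms_within)

# Hypothetical replicate errors, reused unchanged in both data sets.
errors = [(-0.5, 0.5), (0.4, -0.4), (-0.3, 0.3), (0.5, -0.5)]

# Subjects 0.5 units apart versus 5 units apart (assumed values).
similar = [[10 + 0.5 * i + e for e in err] for i, err in enumerate(errors)]
dissimilar = [[10 + 5.0 * i + e for e in err] for i, err in enumerate(errors)]

icc_similar = repeatability(similar)
icc_dissimilar = repeatability(dissimilar)
print(round(icc_similar, 3), round(icc_dissimilar, 3))
```

The second value is far closer to 1 than the first, although the amount of measurement error is exactly the same, which is why a high repeatability alone says little about whether measurement error matters.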
Data Availability
All data from simulation experiments can be generated with scripts in the Supplementary material. Data for the empirical example can be found at the Dryad Digital Repository: https://doi.org/10.5061/dryad.t9888.
Change history
06 March 2024
A Correction to this paper has been published: https://doi.org/10.1007/s11692-024-09632-9
Notes
If only one machine was the cause of inconsistency, it would be clear which machine it was, regardless of the exactness of any machine to produce the true configuration.
Often, the terms “Procrustes residuals” and “Procrustes coordinates” are used almost interchangeably. Procrustes coordinates are the mean configuration after GPA, plus the Procrustes residuals, which are the deviations of configuration-specific coordinates from the mean. Either can be used in most analyses, producing the same results, as the mean shape would be constant for every research observation.
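A small sketch of why either form gives the same results: subtracting a constant mean configuration from every observation leaves all distances among observations unchanged. The coordinates below are hypothetical stand-ins for GPA-aligned configurations (three landmarks in two dimensions, flattened), and the helper names are illustrative only.

```python
# Hypothetical aligned configurations, flattened as (x1, y1, x2, y2, x3, y3).
coords = [
    [0.00, 0.00, 1.00, 0.05, 0.50, 0.90],
    [0.02, -0.01, 0.98, 0.00, 0.52, 0.88],
    [-0.01, 0.01, 1.02, 0.02, 0.49, 0.93],
]

n = len(coords)
# Mean configuration: the column-wise average over observations.
mean_shape = [sum(col) / n for col in zip(*coords)]

# Residuals: each configuration minus the mean configuration.
residuals = [[v - m for v, m in zip(cfg, mean_shape)] for cfg in coords]

def sq_dist(a, b):
    """Squared Euclidean distance between two flattened configurations."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def pairwise(configs):
    """Squared distances for all pairs of configurations."""
    return [sq_dist(configs[i], configs[j])
            for i in range(len(configs))
            for j in range(i + 1, len(configs))]

d_coords = pairwise(coords)
d_resid = pairwise(residuals)

# The two sets of distances agree up to floating-point rounding.
same = all(abs(a - b) < 1e-12 for a, b in zip(d_coords, d_resid))
print(same)
```

Because the mean cancels in every pairwise difference, any analysis based on distances or on variation about the mean is unaffected by the choice.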
Despite the imprecision of the automated digitizer compared to the researcher, the configurations it produces are accurate with respect to the researcher’s.
There is not a strict need for replicate balance in the research design (see “Discussion”). However, issues like heterogeneity of variance among subjects might be more difficult to interpret with replicate imbalance.
It is important to realize that the same strategy (within-subject RRPP) is used to obtain sampling distributions, whether Roy’s maximum root or \(SNR\) is used as the test statistic. Alternative statistics could also be used. Generally, \(P\)-values and \(Z\)-scores will be similar in terms of interpretation but not perfectly rank-correlated unless the test statistics are linear transformations of each other, like \(SNR\) and \(F\). However, alternative sampling distribution strategies are not needed if different statistics are used.
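The linear relationship between \(SNR\) and \(F\) can be sketched in one line. This assumes that \(SNR\) is the ratio of the systematic to the random sums of squares, and the degrees-of-freedom symbols \(df_{systematic}\) and \(df_{random}\) are notation introduced here for illustration.

```latex
\[
F = \frac{SS_{systematic} / df_{systematic}}{SS_{random} / df_{random}}
  = \frac{df_{random}}{df_{systematic}} \, \frac{SS_{systematic}}{SS_{random}}
  = \frac{df_{random}}{df_{systematic}} \, SNR
\]
```

Under that assumption, the degrees of freedom are fixed across permutations, so \(F\) is \(SNR\) multiplied by a positive constant. The two statistics therefore rank the permutations identically and yield the same \(P\)-value.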
References
Adams, D. C. (2014). A method for assessing phylogenetic least squares models for shape and other high-dimensional multivariate data. Evolution, 68, 2675–2688. https://doi.org/10.1111/evo.12463
Adams, D. C., & Collyer, M. L. (2018). Phylogenetic ANOVA: Group-clade aggregation, biological challenges, and a refined permutation procedure. Evolution, 72(6), 1204–1215.
Adams, D. C., & Collyer, M. L. (2019). Comparing the strength of modular signal, and evaluating alternative modular hypotheses, using covariance ratio effect sizes with morphometric data. Evolution, 73, 2352–2367. https://doi.org/10.1111/evo.13867
Adams, D. C., & Collyer, M. L. (2022). Consilience of methods for phylogenetic analysis of variance. Evolution, 76(7), 1406–1419.
Adams, D. C., Collyer, M. L., Kaliontzopoulou, A., & Baken, E. K. (2023). Geometric morphometric analyses of 2D and 3D landmark data, version 4.0.6. R Foundation for Statistical Computing. https://cran.r-project.org/package=geomorph
Adams, D. C., Rohlf, F. J., & Slice, D. E. (2013). A field comes of age: Geometric morphometrics in the 21st century. Hystrix, 24, 7–14.
Anderson, M. J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26(1), 32–46.
Anderson, M. J., & Walsh, D. C. (2013). PERMANOVA, ANOSIM, and the Mantel test in the face of heterogeneous dispersions: What null hypothesis are you testing? Ecological Monographs, 83(4), 557–574.
Arnqvist, G., & Mårtensson, T. (1998). Measurement error in geometric morphometrics: Empirical strategies to assess and reduce its impact on measures of shape. Acta Zoologica Academiae Scientiarum Hungaricae, 44, 73–96.
Bailey, R. C., & Byrnes, J. (1990). A new, old method for assessing measurement error in both univariate and multivariate morphometric studies. Systematic Zoology, 39, 124–130.
Baken, E. K., Collyer, M. L., Kaliontzopoulou, A., & Adams, D. C. (2021). Geomorph 4.0 and gmShiny: Enhanced analytics and a new graphical interface for a comprehensive morphometric experience. Methods in Ecology and Evolution, 12, 2355–2363.
Barbeito-Andrés, J., Anzelmo, M., Ventrice, F., & Sardi, M. L. (2012). Measurement error of 3D cranial landmarks of an ontogenetic sample using computed tomography. Journal of Oral Biology and Craniofacial Research, 2, 77–82. https://doi.org/10.1016/j.jobcr.2012.05.005
Bartko, J. J. (1966). The intraclass correlation coefficient as a measure of reliability. Psychological Reports, 19, 3–11. https://doi.org/10.2466/pr0.1966.19.1.3
Bookstein, F. L. (1991). Morphometric tools for landmark data: Geometry and biology. Cambridge University Press.
Bookstein, F. L. (2015). Integration, disintegration, and self-similarity: Characterizing the scales of shape variation in landmark data. Evolutionary Biology, 42, 395–426. https://doi.org/10.1007/s11692-015-9317-8
Bookstein, F. L., Gunz, P., Mitteroecker, P., Prossinger, H., Schaefer, K., & Seidler, H. (2003). Cranial integration in Homo: Singular warps analysis of the midsagittal plane in ontogeny and evolution. Journal of Human Evolution, 44(2), 167–187. https://doi.org/10.1016/s0047-2484(02)00201-4
Bookstein, F. L., & Mitteroecker, P. (2014). Comparing covariance matrices by relative eigenanalysis, with applications to organismal biology. Evolutionary Biology, 41, 336–350.
Cardini, A. (2019). Integration and modularity in Procrustes shape data: Is there a risk of spurious results? Evolutionary Biology, 46(1), 90–105.
Collyer, M. L., & Adams, D. C. (2013). Phenotypic trajectory analysis: Comparison of shape change patterns in evolution and ecology. Hystrix, the Italian Journal of Mammalogy, 24, 75–83. https://doi.org/10.4404/hystrix-24.1-6298
Collyer, M. L., & Adams, D. C. (2018). RRPP: An R package for fitting linear models to high-dimensional data using residual randomization. Methods in Ecology and Evolution, 9, 1772–1779.
Collyer, M. L., & Adams, D. C. (2023). RRPP: Linear model evaluation with randomized residuals in a permutation procedure, version 1.3.2. R Foundation for Statistical Computing. https://cran.r-project.org/package=RRPP
Collyer, M. L., Baken, E. K., & Adams, D. C. (2022). A standardized effect size for evaluating and comparing the strength of phylogenetic signal. Methods in Ecology and Evolution, 13(2), 367–382.
Collyer, M. L., Sekora, D. J., & Adams, D. C. (2015). A method for analysis of phenotypic change for phenotypes described by high-dimensional data. Heredity, 115(4), 357–365.
Commenges, D. (2003). Transformations which preserve exchangeability and application to permutation tests. Journal of Nonparametric Statistics, 15(2), 171–185.
Conaway, M. A., & Adams, D. C. (2022). An effect size for comparing the strength of morphological integration across studies. Evolution, 76, 2244–2259. https://doi.org/10.1111/evo.14595
von Cramon-Taubadel, N., Frazier, B. C., & Lahr, M. M. (2007). The problem of assessing landmark error in geometric morphometrics: Theory, methods, and modifications. American Journal of Physical Anthropology, 134, 24–35. https://doi.org/10.1002/ajpa.20616
Daboul, A., Ivanovska, T., Bülow, R., Biffar, R., & Cardini, A. (2018). Procrustes-based geometric morphometrics on MRI images: An example of inter-operator bias in 3D landmarks and its impact on big datasets. PLoS ONE, 13, e0197675. https://doi.org/10.1371/journal.pone.0197675
Fisher, R. A. (1950). Statistical methods for research workers (11th ed.). Oliver & Boyd.
Fleiss, J. L., & Shrout, P. E. (1977). The effects of measurement errors on some multivariate procedures. American Journal of Public Health, 67, 1188–1191.
Fox, N. S., Veneracion, J. J., & Blois, J. L. (2020). Are geometric morphometric analyses replicable? Evaluating landmark measurement error and its impact on extant and fossil Microtus classification. Ecology and Evolution, 10, 3260–3275. https://doi.org/10.1002/ece3.6063
Fruciano, C. (2016). Measurement error in geometric morphometrics. Development Genes and Evolution, 226, 139–158. https://doi.org/10.1007/s00427-016-0537-4
Fruciano, C., Celik, M. A., Butler, K., Dooley, T., Weisbecker, V., & Phillips, M. J. (2017). Sharing is caring? Measurement error and the issues arising from combining 3D morphometric datasets. Ecology and Evolution, 7, 7034–7046. https://doi.org/10.1002/ece3.3256
Galimberti, F., Sanvito, S., Vinesi, M. C., & Cardini, A. (2019). Nose-metrics of wild southern elephant seal Mirounga leonina males using image analysis and geometric morphometrics. Journal of Zoological Systematics and Evolutionary Research, 57, 710–720. https://doi.org/10.1111/jzs.12276
Giacomini, G., Scaravelli, D., Herrel, A., Veneziano, A., Russo, D., Brown, R. P., & Meloro, C. (2019). 3D photogrammetry of bat skulls: Perspectives for macro-evolutionary analyses. Evolutionary Biology, 46, 249–259. https://doi.org/10.1007/s11692-019-09478-6
Goodall, C. (1991). Procrustes methods in the statistical analysis of shape. Journal of the Royal Statistical Society: Series B (Methodological), 53(2), 285–321.
Gunz, P., Mitteroecker, P., & Bookstein, F. L. (2005). Semilandmarks in three dimensions. In Developments in primatology: Progress and prospects (pp. 73–98). Kluwer Academic Publishers-Plenum Publishers. https://doi.org/10.1007/0-387-27614-9_3
Haggard, E. A. (1958). Intraclass correlation and the analysis of variance. Dryden Press.
Hand, D. J. (1996). Statistics and the theory of measurement. Journal of the Royal Statistical Society Series A (Statistics in Society), 159, 445–492. https://doi.org/10.2307/2983326
Houle, D., Pélabon, C., Wagner, G. P., & Hansen, T. F. (2011). Measurement and meaning in biology. The Quarterly Review of Biology, 86, 3–34. https://doi.org/10.1086/658408
Klingenberg, C. P. (2010). MorphoJ: An integrated software package for geometric morphometrics. Molecular Ecology Resources, 11, 353–357. https://doi.org/10.1111/j.1755-0998.2010.02924.x
Klingenberg, C. P. (2021). How exactly did the nose get that long? A critical rethinking of the Pinocchio effect and how shape changes relate to landmarks. Evolutionary Biology, 48(1), 115–127.
Klingenberg, C. P., Barluenga, M., & Meyer, A. (2002). Shape analysis of symmetric structures: Quantifying variation among individuals and asymmetry. Evolution, 56, 1909–1920. https://doi.org/10.1111/j.0014-3820.2002.tb00117.x
Klingenberg, C. P., & Gidaszewski, N. A. (2010). Testing and quantifying phylogenetic signals and homoplasy in morphometric data. Systematic Biology, 59, 245–261.
Klingenberg, C. P., & McIntyre, G. S. (1998). Geometric morphometrics of developmental instability: Analyzing patterns of fluctuating asymmetry with Procrustes methods. Evolution, 52, 1363–1375. https://doi.org/10.1111/j.1558-5646.1998.tb02018.x
Konishi, S., Khatri, C. G., & Rao, C. R. (1991). Inferences on multivariate measures of interclass and intraclass correlations in familial data. Journal of the Royal Statistical Society Series B (Methodological), 53, 649–659.
Krantz, D., Luce, D., Suppes, P., & Tversky, A. (1971). Foundations of measurement, volume I: Additive and polynomial representations. Academic Press.
Kreutz, C., Raue, A., Kaschek, D., & Timmer, J. (2013). Profile likelihood in systems biology. FEBS Journal, 280, 2564–2571. https://doi.org/10.1111/febs.12276
Kyburg, H. (1984). Theory and measurement. Cambridge University Press.
Liljequist, D., Elfving, B., & Roaldsen, K. S. (2019). Intraclass correlation—A discussion and demonstration of basic features. PLoS ONE, 14, e0219854. https://doi.org/10.1371/journal.pone.0219854
Luce, R. D., Krantz, D. H., Suppes, P., & Tversky, A. (1990). Foundations of measurement, volume III: Representation, axiomatization, and invariance. Academic Press.
Marcy, A. E., Fruciano, C., Phillips, M. J., Mardon, K., & Weisbecker, V. (2018). Low resolution scans can provide a sufficiently accurate, cost- and time-effective alternative to high resolution scans for 3D shape analyses. PeerJ, 6, e5032. https://doi.org/10.7717/peerj.5032
Menéndez, L. P. (2016). Comparing methods to assess intraobserver measurement error of 3D craniofacial landmarks using geometric morphometrics through a digitizer arm. Journal of Forensic Sciences, 62, 741–746. https://doi.org/10.1111/1556-4029.13301
Mitteroecker, P., & Bookstein, F. L. (2009). The ontogenetic trajectory of the phenotypic covariance matrix, with examples from craniofacial shape in rats and humans. Evolution, 63, 727–737.
Mitteroecker, P., Gunz, P., Bernhard, M., Schaefer, K., & Bookstein, F. L. (2004). Comparison of cranial ontogenetic trajectories among great apes and humans. Journal of Human Evolution, 46, 679–698. https://doi.org/10.1016/j.jhevol.2004.03.006
Mitteroecker, P., & Schaefer, K. (2022). Thirty years of geometric morphometrics: Achievements, challenges, and the ongoing quest for biological meaningfulness. American Journal of Biological Anthropology, 178, 181–210. https://doi.org/10.1002/ajpa.24531
R Core Team. (2023). R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing. https://www.R-project.org/
Rabinovich, S. G. (2005). Measurement errors and uncertainties: Theory and practice (3rd ed.). Springer Nature. https://www.ebook.de/de/product/3897875/semyon_g_rabinovich_measurement_errors_and_uncertainties_theory_and_practice.html
Robinson, C., & Terhune, C. E. (2017). Error in geometric morphometric data collection: Combining data from multiple sources. American Journal of Physical Anthropology, 164, 62–75. https://doi.org/10.1002/ajpa.23257
Rohlf, F. J., & Corti, M. (2000). Use of two-block partial least-squares to study covariation in shape. Systematic Biology, 49, 740–753. https://doi.org/10.1080/106351500750049806
Rohlf, F. J., & Slice, D. E. (1990). Extensions of the Procrustes method for the optimal superimposition of landmarks. Systematic Zoology, 39, 40–59.
Shearer, B. M., Cooke, S. B., Halenar, L. B., Reber, S. L., Plummer, J. E., Delson, E., & Tallman, M. (2017). Evaluating causes of error in landmark-based data collection using scanners. PLoS ONE, 12, e0187452. https://doi.org/10.1371/journal.pone.0187452
Suppes, P., Krantz, D. H., Luce, R. D., & Tversky, A. (1989). Foundations of measurement, volume II: Geometrical, threshold, and probabilistic representations. Academic Press.
Vrdoljak, J., Sanchez, K. I., Arreola-Ramos, R., Huesa, E. G. D., Villagra, A., Avila, L. J., & Morando, M. (2020). Testing repeatability, measurement error and species differentiation when using geometric morphometrics on complex shapes: A case study of Patagonian lizards of the genus Liolaemus (Squamata: Liolaemini). Biological Journal of the Linnean Society, 130, 800–812. https://doi.org/10.1093/biolinnean/blaa079
Yezerinac, S. M., Lougheed, S. C., & Handford, P. (1992). Measurement error and morphometric studies: Statistical power and observer experience. Systematic Biology, 41, 471–482. https://doi.org/10.2307/2992588
Acknowledgements
We thank P.D. Polly and an anonymous reviewer for helpful comments on an earlier version of this paper. We also thank the MorphMet listserv, and A. Cardini in particular, for a discussion of measurement error, which made clear to us that current recommendations regarding measurement error in morphometrics were inadequate and required a rethink. The present paper takes the nascent ideas we expressed in that thread and converts them into fully developed analytical methods. This work was sponsored in part by National Science Foundation Grants DBI-1902694 and DEB-2146220 (to MLC), and DBI-1902511 and DEB-2140720 (to DCA). All analyses in this paper were performed in R (R Core Team, 2023), using the packages geomorph (Adams et al., 2023; Baken et al., 2021) and RRPP (Collyer & Adams, 2018, 2023). The functions measurement.error, plot.measurement.error, focusMEonSubjects, interSubVar and plot.interSubVar in RRPP, and gm.measurement.error in geomorph, contain all new analytical approaches described in this paper.
Author information
Contributions
M.L.C. and D.C.A. wrote the main manuscript. M.L.C. wrote computer scripts, and prepared figures and tables. Both authors reviewed the manuscript.
Ethics declarations
Conflict of interest
The authors declare that they have no known conflicts of interest.
Additional information
The original online version of this article was revised: “In this article, a typo occurs in equation 9, \(\begin{aligned} SS_{replicate}= & {} trace({\textbf{S}}_{subject}) \nonumber \\= & {} trace\left( \left( \hat{{\textbf{Z}}}^T_{sr|s} - \hat{{\textbf{Z}}}_{s}\right) ^T \left( \hat{{\textbf{Z}}}^T_{sr|s} - \hat{{\textbf{Z}}}_{s}\right) \right) , \end{aligned}\), should have read as \(\begin{aligned} SS_{replicate}= & {} trace({\textbf{S}}_{replicate}) \nonumber \\= & {} trace\left( \left( \hat{{\textbf{Z}}}^T_{sr|s} - \hat{{\textbf{Z}}}_{s}\right) ^T \left( \hat{{\textbf{Z}}}^T_{sr|s} - \hat{{\textbf{Z}}}_{s}\right) \right) , \end{aligned}\)”.
About this article
Cite this article
Collyer, M.L., Adams, D.C. Interrogating Random and Systematic Measurement Error in Morphometric Data. Evol Biol 51, 179–207 (2024). https://doi.org/10.1007/s11692-024-09627-6
DOI: https://doi.org/10.1007/s11692-024-09627-6