Evolutionary Biology, Volume 46, Issue 4, pp 303–316

Seeing Distinct Groups Where There are None: Spurious Patterns from Between-Group PCA

  • Andrea Cardini
  • Paul O’Higgins
  • F. James Rohlf
Research Article

Abstract

Using sampling experiments, we found that, when there are fewer groups than variables, between-groups PCA (bgPCA) may suggest surprisingly distinct differences among groups for data in which none exist. While apparently not noticed before, the reasons for this problem are easy to understand. A bgPCA captures the g − 1 dimensions of variation among the g group means, but only a fraction of the \(\sum n_{i} - g\) dimensions of within-group variation (\(n_{i}\) are the sample sizes), when the number of variables, p, is greater than g − 1. This introduces a distortion in the appearance of the bgPCA plots because the within-group variation will be underrepresented, unless the variables are sufficiently correlated so that the total variation can be accounted for with just g − 1 dimensions. The effect is most obvious when sample sizes are small relative to the number of variables, because smaller samples spread out less, but the distortion is present even for large samples. Strong covariance among variables largely reduces the magnitude of the problem, because it effectively reduces the dimensionality of the data and thus enables a larger proportion of the within-group variation to be accounted for within the (g − 1)-dimensional space of a bgPCA. The distortion will still be relevant, though its strength will vary from case to case depending on the structure of the data (p, g, covariances, etc.). These are important problems for a method mainly designed for the analysis of variation among groups when there are very large numbers of variables and relatively small samples. In such cases, users are likely to conclude that the groups they are comparing are much more distinct than they really are. Having many variables but just small sample sizes is a common problem in fields ranging from morphometrics (as in our examples) to molecular analyses.
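
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes an isotropic normal model with no true group differences, equal group sizes, and an unweighted bgPCA computed from the singular value decomposition of the group means. The function names (`bgpca_scores`, `separation_ratio`, `simulate_ratio`) and the parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def bgpca_scores(X, labels):
    """Project all observations onto the principal axes of the group means."""
    groups = np.unique(labels)
    Xc = X - X.mean(axis=0)  # centre on the grand mean
    # Means of the centred data; with equal group sizes (assumed) they sum to zero.
    means = np.vstack([Xc[labels == k].mean(axis=0) for k in groups])
    # Right singular vectors of the means matrix serve as the bgPCA axes;
    # at most g - 1 of them carry variation among the means.
    _, _, Vt = np.linalg.svd(means, full_matrices=False)
    n_axes = min(len(groups) - 1, X.shape[1])
    return Xc @ Vt[:n_axes].T


def separation_ratio(scores, labels):
    """Between-group sum of squares divided by within-group sum of squares."""
    grand = scores.mean(axis=0)
    between = 0.0
    within = 0.0
    for k in np.unique(labels):
        block = scores[labels == k]
        centre = block.mean(axis=0)
        between += len(block) * np.sum((centre - grand) ** 2)
        within += np.sum((block - centre) ** 2)
    return between / within


def simulate_ratio(p, n_per_group=20, g=3, seed=0):
    """Draw g groups from the SAME isotropic normal and return the bgPCA ratio."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((g * n_per_group, p))  # no true group differences
    labels = np.repeat(np.arange(g), n_per_group)
    return separation_ratio(bgpca_scores(X, labels), labels)


if __name__ == "__main__":
    for p in (2, 30, 300):
        print(f"p = {p:4d}  between/within in bgPCA space = {simulate_ratio(p):.3f}")
```

Under these assumptions, the bgPCA axes retain all (g − 1)p expected units of between-group sum of squares but only about (N − g)(g − 1) units of within-group sum of squares, where N is the total sample size. The ratio should therefore grow roughly as p/(N − g). With p = 2 the groups overlap almost completely. With p = 300 and the same samples, the between-group spread in the plot exceeds the within-group spread even though all groups come from one distribution.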

Keywords

Covariance Geometric morphometrics Group separation Isotropic model Spurious clustering 

Notes

Acknowledgements

We are very grateful to Jessica Grisenti, who carefully collected the marmot data for her undergraduate thesis and gave AC permission to use them. The authors appreciate the most helpful comments of Julien Claude who reviewed this paper.

Compliance with Ethical Standards

Conflict of interest

The authors declare that they have no conflicts of interest.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Dipartimento di Scienze Chimiche e Geologiche, Università di Modena e Reggio Emilia, Modena, Italy
  2. Centre for Forensic Anthropology, The University of Western Australia, Crawley, Australia
  3. Department of Anthropology and Department of Ecology and Evolution, Stony Brook University, Stonybrook, USA
  4. Department of Archaeology and Hull York Medical School, University of York, Heslington, UK
