Dimensionality Reduction Techniques for Visualizing Morphometric Data: Comparing Principal Component Analysis to Nonlinear Methods

  • Trina Y. Du
Tools and Techniques

Abstract

Principal component analysis (PCA) is the most widely used dimensionality reduction technique in the biological sciences, and is commonly employed to create 2D visualizations of geometric morphometric data. However, interesting biological information may be lost or misrepresented in these plots due to PCA’s inability to summarize nonlinear dependencies between variables. Nonlinear alternatives exist, but their effectiveness has never been tested on morphometric data. Here, the performance of PCA on the task of visualizing morphometric variation is compared to that of four nonlinear techniques: Sammon Mapping, Isomap, Locally Linear Embedding, and Laplacian Eigenmaps. Performance is assessed on the basis of global and local preservation of pairwise distances for a variety of simulated and empirical datasets. The relative performance of PCA varies as a function of the distribution of variation, complexity, and size of the datasets. Overall, nonlinear methods show superior preservation of small differences between morphologies compared to PCA.
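
As an illustration of what "global and local preservation of pairwise distances" can mean in practice, the following minimal Python sketch projects a dataset to 2D with PCA and scores the result. The abstract does not state which measures the study used, so both scores here are assumptions chosen for illustration: the global score is the Pearson correlation between original and embedded distances, and the local score is the mean fraction of each point's k nearest neighbours that survive the projection. All function names and the synthetic rolled-sheet dataset are hypothetical.

```python
import numpy as np

def pairwise_distances(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def pca_2d(X):
    """Project the rows of X onto their first two principal components."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

def global_preservation(D_high, D_low):
    """Illustrative global score: correlation of all pairwise distances."""
    iu = np.triu_indices_from(D_high, k=1)  # each pair counted once
    return float(np.corrcoef(D_high[iu], D_low[iu])[0, 1])

def local_preservation(D_high, D_low, k=5):
    """Illustrative local score: mean share of k nearest neighbours retained."""
    # Column 0 of each argsort row is the point itself (distance 0), so skip it.
    nn_high = np.argsort(D_high, axis=1)[:, 1:k + 1]
    nn_low = np.argsort(D_low, axis=1)[:, 1:k + 1]
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_high, nn_low)]
    return float(np.mean(overlap))

# Synthetic nonlinear example (an assumption, not data from the study):
# points on a rolled-up sheet in 3D.
rng = np.random.default_rng(0)
t = rng.uniform(0, 3 * np.pi, size=200)
X = np.column_stack([t * np.cos(t), t * np.sin(t), rng.uniform(0, 5, size=200)])

Y = pca_2d(X)
D_high, D_low = pairwise_distances(X), pairwise_distances(Y)
print("global:", global_preservation(D_high, D_low))
print("local: ", local_preservation(D_high, D_low, k=5))
```

The same two scoring functions accept the output of any embedding method, so a nonlinear technique could be compared with PCA by passing its 2D coordinates to `pairwise_distances` in place of `Y`.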

Keywords

Data visualization; Morphological variation; Multivariate data; Theoretical biology

Notes

Acknowledgements

Thanks to D. Fowler and H. Larsson for advice, as well as A. Beauvais-Lacasse and A. Huot for help with coding. I am grateful to the Natural Sciences and Engineering Research Council of Canada (CGS-D) and le Fonds de recherche du Québec - Nature et technologies (BX3) for funding.

Compliance with Ethical Standards

Conflict of interest

The author has no conflicts of interest to declare.

Supplementary material

11692_2018_9464_MOESM1_ESM.xlsx (21 kb)
Supplementary material 1 (XLSX 21 KB)
11692_2018_9464_MOESM2_ESM.pdf (44 kb)
Supplementary material 2 (PDF 43 KB)
11692_2018_9464_MOESM3_ESM.m (8 kb)
Supplementary material 3 (M 8 KB)

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Biology, University of Ottawa, Ottawa, Canada
