Abstract
Suppose that two large, multi-dimensional data sets are each noisy measurements of the same underlying random process, and principal components analysis is performed separately on the data sets to reduce their dimensionality. In some circumstances it may happen that the two lower-dimensional data sets have an inordinately large Procrustean fitting-error between them. The purpose of this manuscript is to quantify this “incommensurability phenomenon”. In particular, under specified conditions, the square Procrustean fitting-error of the two normalized lower-dimensional data sets is (asymptotically) a convex combination (via a correlation parameter) of the Hausdorff distance between the projection subspaces and the maximum possible value of the square Procrustean fitting-error for normalized data. We show how this gives rise to the incommensurability phenomenon, and we employ illustrative simulations and also use real data to explore how the incommensurability phenomenon may have an appreciable impact.
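The quantity discussed above can be illustrated with a minimal numerical sketch. It assumes that "normalized" means scaled to unit Frobenius norm and that the Procrustean fit is taken over orthogonal transformations; under those assumptions the squared fitting error lies between 0 and 2. The simulated process, the noise level, the dimensions, and all function names are hypothetical choices made for illustration and are not taken from the manuscript.

```python
import numpy as np

def pca_project(data, k):
    """Project the rows of `data` onto its top-k principal components."""
    centered = data - data.mean(axis=0)
    # Rows of vt are principal directions, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:k].T

def normalize(x):
    """Scale a matrix to unit Frobenius norm (assumed normalization)."""
    return x / np.linalg.norm(x)

def squared_procrustes_error(x, y):
    """Minimum over orthogonal Q of ||x Q - y||_F^2.

    The minimum equals ||x||^2 + ||y||^2 - 2 * (sum of singular values of x^T y).
    """
    singular_values = np.linalg.svd(x.T @ y, compute_uv=False)
    return np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2 - 2.0 * singular_values.sum()

rng = np.random.default_rng(0)
n, d, k = 500, 10, 2  # hypothetical sample size, ambient and reduced dimensions

# Hypothetical underlying process with decreasing variance per coordinate.
signal = rng.normal(size=(n, d)) * np.linspace(3.0, 1.0, d)

# Two independent noisy measurements of the same process.
x_noisy = signal + 0.5 * rng.normal(size=(n, d))
y_noisy = signal + 0.5 * rng.normal(size=(n, d))

# Separate PCA on each data set, then normalization.
x_low = normalize(pca_project(x_noisy, k))
y_low = normalize(pca_project(y_noisy, k))

error = squared_procrustes_error(x_low, y_low)
print(f"squared Procrustes fitting error: {error:.4f}")
```

Making the variances of the retained and discarded coordinates closer together (for example, a flatter `np.linspace` range) is one way to experiment with how the separately chosen projection subspaces can drift apart and push the error toward its upper bound.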
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0), which permits use, duplication, adaptation, distribution, and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Fishkind, D.E., Shen, C., Park, Y. et al. On the Incommensurability Phenomenon. J Classif 33, 185–209 (2016). https://doi.org/10.1007/s00357-016-9203-9
DOI: https://doi.org/10.1007/s00357-016-9203-9