Embedded Map Projection for Dimensionality Reduction-Based Similarity Search

  • Simone Marinai
  • Emanuele Marino
  • Giovanni Soda
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5342)


We describe a dimensionality reduction method that projects data points into an output space obtained by embedding the Growing Hierarchical Self-Organizing Maps (GHSOM) computed from a training data-set. The dimensionality reduction is used in a similarity search framework that aims to retrieve similar objects efficiently, using the Euclidean distance between high-dimensional feature vectors after they have been projected into the reduced space. This research is motivated by applications in Document Image Retrieval for Digital Libraries. In this paper, we compare the proposed method with other dimensionality reduction techniques, evaluating retrieval performance on three data-sets.
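The general pipeline outlined in the abstract (reduce the feature vectors, then rank by Euclidean distance in the reduced space) can be illustrated with a minimal sketch. This is not the GHSOM embedding proposed in the paper: a random Gaussian projection is swapped in as a stand-in reduction step, and the function names, dimensions (64 reduced to 8), and synthetic data are all assumptions made for illustration.

```python
import math
import random


def make_projection(in_dim, out_dim, seed=0):
    """Build a random Gaussian projection matrix (out_dim rows, in_dim columns).

    Assumption: this is only a stand-in for a learned mapping,
    NOT the GHSOM-based embedding described in the paper.
    """
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]


def project(matrix, vector):
    """Map one high-dimensional vector into the reduced space."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(reduced_index, reduced_query, k=3):
    """Return the indices of the k stored vectors closest to the query."""
    ranked = sorted(range(len(reduced_index)),
                    key=lambda i: euclidean(reduced_index[i], reduced_query))
    return ranked[:k]


# Hypothetical data: 200 feature vectors of dimension 64, reduced to 8.
rng = random.Random(1)
data = [[rng.random() for _ in range(64)] for _ in range(200)]
proj = make_projection(in_dim=64, out_dim=8)
reduced = [project(proj, v) for v in data]

# Query with a stored vector: its own index should rank first (distance zero).
hits = search(reduced, project(proj, data[17]), k=3)
print(hits)
```

The linear scan in `search` keeps the sketch short; in a real retrieval system the reduced vectors would more plausibly be stored in a multidimensional index structure, which is where the lower dimensionality pays off.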


Dimensionality Reduction, Dimensionality Reduction Method, Restrict Boltzmann Machine, Nonlinear Dimensionality Reduction, Best Match Unit
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Simone Marinai (1)
  • Emanuele Marino (1)
  • Giovanni Soda (1)
  1. Dipartimento di Sistemi e Informatica, Università di Firenze, Firenze, Italy
