Abstract
This paper studies the impact of the choice of metric on the learning procedure of Self-Organizing Maps (SOM). In particular, we modified the learning procedure of SOM by replacing the standard Euclidean norm, usually employed to evaluate the similarity between input patterns and nodes of the map, with the more general Minkowski norms: \(\|X\|_p=\left(\sum_{i}|X_i|^p\right)^{\frac{1}{p}}\), for p ∈ ℝ⁺. We then analyzed how the clustering capabilities of SOM change when both prenorms (0 < p < 1) and ultrametrics (p ≫ 1) are considered. This was done using financial data from the Foreign Exchange Market (FOREX), observed at different time scales (from 1 minute to 1 month). The motivation behind the use of this data domain is the relevance of the question addressed, since SOM are often employed to support the decision process of traders. It is therefore of interest to know whether and how the results of SOM can be driven by changes in the distance metric according to which proximities are evaluated. Our main result is that concentration does not seem to be the only factor affecting the effectiveness of the norms (and hence of the clustering procedure); in the case of financial data, the time scale of the observations matters as well.
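The role of p in the matching step can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `minkowski` and `best_matching_unit`, the toy pattern, and the two-node map are all assumptions made for illustration. It only shows how the node selected as closest to an input pattern can depend on the exponent p in the formula above.

```python
from typing import Sequence


def minkowski(x: Sequence[float], y: Sequence[float], p: float) -> float:
    """Minkowski dissimilarity (sum_i |x_i - y_i|^p)^(1/p) for p > 0.

    For 0 < p < 1 this is not a true norm (the triangle inequality fails),
    which corresponds to the 'prenorm' case mentioned in the abstract.
    """
    if p <= 0:
        raise ValueError("p must be positive")
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def best_matching_unit(pattern: Sequence[float],
                       nodes: Sequence[Sequence[float]],
                       p: float) -> int:
    """Index of the node whose weight vector is closest to the pattern."""
    return min(range(len(nodes)),
               key=lambda j: minkowski(pattern, nodes[j], p))


# Hypothetical toy data: one input pattern and a map with two nodes.
pattern = [0.0, 0.0]
nodes = [[3.0, 0.0], [2.0, 2.0]]

bmu_euclidean = best_matching_unit(pattern, nodes, p=2.0)   # Euclidean norm
bmu_prenorm = best_matching_unit(pattern, nodes, p=0.5)     # prenorm, 0 < p < 1
```

On this toy data the Euclidean case selects the second node, while p = 0.5 selects the first: fractional exponents penalize differences spread across several coordinates more heavily than a single large difference. A full SOM would follow this matching step with a neighborhood update of the node weights, which is omitted here.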
Copyright information
© 2010 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Resta, M. (2010). On the Impact of the Metrics Choice in SOM Learning: Some Empirical Results from Financial Data. In: Setchi, R., Jordanov, I., Howlett, R.J., Jain, L.C. (eds) Knowledge-Based and Intelligent Information and Engineering Systems. KES 2010. Lecture Notes in Computer Science, vol 6278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15393-8_65
DOI: https://doi.org/10.1007/978-3-642-15393-8_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15392-1
Online ISBN: 978-3-642-15393-8
eBook Packages: Computer Science, Computer Science (R0)