Abstract
This paper proposes a new method for approximating the normalized information distance with a compressor that is particularly well suited to image data; the approximation is based on a video compressor. The method is used to compute the pairwise distance matrix of all images in each data set considered, and the resulting distance matrix is then clustered with the hierarchical clustering method from the R package. Two different data sets are used to demonstrate the usefulness of the method. The results are very promising and show that a very good clustering of the image data can be obtained.
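The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes the standard normalized compression distance, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C(.) is the compressed length. The paper's compressor is a video compressor; `zlib` is used here only as a freely available stand-in, so the numbers it produces are not the paper's. The function names and the averaging of both concatenation orders are likewise assumptions made for illustration.

```python
import zlib
from typing import List


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input.

    Assumption: zlib stands in for the video compressor used in the paper.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def distance_matrix(items: List[bytes]) -> List[List[float]]:
    """Pairwise NCD matrix with a zero diagonal.

    Assumption: both concatenation orders are averaged so the matrix is
    symmetric, since real compressors do not guarantee C(xy) == C(yx).
    """
    n = len(items)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = 0.5 * (ncd(items[i], items[j]) + ncd(items[j], items[i]))
            dist[i][j] = dist[j][i] = d
    return dist
```

A matrix produced this way could then be handed to a hierarchical clustering routine, for example R's `hclust` after conversion with `as.dist`. Which linkage and which R function the authors used is not stated in the abstract.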
Acknowledgments
We would like to thank the program committee and the anonymous referees for their valuable comments.
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhu, Y., Zeugmann, T. (2016). Image Analysis in a Parameter-Free Setting. In: Abdelrahman, O., Gelenbe, E., Gorbil, G., Lent, R. (eds) Information Sciences and Systems 2015. Lecture Notes in Electrical Engineering, vol 363. Springer, Cham. https://doi.org/10.1007/978-3-319-22635-4_26
DOI: https://doi.org/10.1007/978-3-319-22635-4_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22634-7
Online ISBN: 978-3-319-22635-4
eBook Packages: Engineering, Engineering (R0)