Skip to main content

Content-Based Image Indexing by Data Clustering and Inverse Document Frequency

  • Conference paper
Book cover Beyond Databases, Architectures, and Structures (BDAS 2014)

Part of the book series: Communications in Computer and Information Science ((CCIS,volume 424))

Abstract

In this paper we present an algorithm for creating and searching large image databases. Effective browsing and searching such collections of images based on their content is one of the most important challenges of computer science. In the presented algorithm, the process of inserting data to the database consists of several stages. In the first step interest points are generated from images by e.g. SIFT, SURF or PCA SIFT algorithms. The resulting huge number of key points is then reduced by data clustering, in our case by a novel, parameterless version of the mean shift algorithm. The reduction is achieved by subsequent operation on generated cluster centers. This algorithm has been adapted specifically for the presented method. Cluster centers are treated as terms and images as documents in the term frequency-inverse document frequency (TF-IDF) algorithm. TF-IDF algorithm allows to create an indexed image database and to fast retrieve desired images. The proposed approach is validated by numerical experiments on images with different content.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 39.99
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 54.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Preview

Unable to display preview. Download preview PDF.

Unable to display preview. Download preview PDF.

References

  1. Chang, Y., Wang, Y., Chen, C., Ricanek, K.: Improved image-based automatic gender classification by feature selection. Journal of Artificial Intelligence and Soft Computing Research, 241 (2011)

    Google Scholar 

  2. Cheng, Y.: Mean shift, mode seeking, and clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence 17(8), 790–799 (1995)

    Article  Google Scholar 

  3. Comaniciu, D., Meer, P.: Mean shift: A robust approach toward feature space analysis. IEEE Transactions on Pattern Analysis and Machine Intelligence 24(5), 603–619 (2002)

    Article  Google Scholar 

  4. Cpałka, K.: On evolutionary designing and learning of flexible neuro-fuzzy structures for nonlinear classification. Nonlinear Analysis: Theory, Methods & Applications 71(12), e1659–e1672 (2009)

    Google Scholar 

  5. Derpanis, K.G.: Mean shift clustering. Lecture Notes, http://www.cse.yorku.ca/kosta/CompVis_Notes/mean_shift.pdf (2005)

  6. Evans, C.: Notes on the opensurf library. University of Bristol, Tech. Rep. CSTR-09-001 (January 2009)

    Google Scholar 

  7. Gabryel, M., Korytkowski, M., Scherer, R., Rutkowski, L.: Object detection by simple fuzzy classifiers generated by boosting. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2013, Part I. LNCS, vol. 7894, pp. 540–547. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  8. Gabryel, M., Nowicki, R.K., Woźniak, M., Kempa, W.M.: Genetic cost optimization of the GI/M/1/N finite-buffer queue with a single vacation policy. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2013, Part II. LNCS, vol. 7895, pp. 12–23. Springer, Heidelberg (2013)

    Chapter  Google Scholar 

  9. Gabryel, M., Woźniak, M., Nowicki, R.K.: Creating learning sets for control systems using an evolutionary method. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) SIDE 2012 and EC 2012. LNCS, vol. 7269, pp. 206–213. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  10. Górecki, P., Sopyła, K., Drozda, P.: Ranking by K-means voting algorithm for similar image retrieval. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2012, Part I. LNCS, vol. 7267, pp. 509–517. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  11. Hare, J.S., Samangooei, S., Lewis, P.H.: Efficient clustering and quantisation of sift features: Exploiting characteristics of the sift descriptor and interest region detectors under image inversion. In: Proceedings of the 1st ACM International Conference on Multimedia Retrieval, p. 2. ACM (2011)

    Google Scholar 

  12. Lew, M.S., Sebe, N., Djeraba, C., Jain, R.: Content-based multimedia information retrieval: State of the art and challenges. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP) 2(1), 1–19 (2006)

    Article  Google Scholar 

  13. Lowe, D.G.: Object recognition from local scale-invariant features. In: The Proceedings of the Seventh IEEE International Conference on Computer Vision, vol. 2, pp. 1150–1157. IEEE (1999)

    Google Scholar 

  14. Lowe, D.G.: Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60(2), 91–110 (2004)

    Article  Google Scholar 

  15. Nister, D., Stewenius, H.: Scalable recognition with a vocabulary tree. In: 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 2161–2168. IEEE (2006)

    Google Scholar 

  16. O’Hara, S., Draper, B.A.: Introduction to the bag of features paradigm for image classification and retrieval. arXiv preprint arXiv:1101.3354 (2011)

    Google Scholar 

  17. Przybył, A., Cpałka, K.: A new method to construct of interpretable models of dynamic systems. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds.) ICAISC 2012, Part II. LNCS, vol. 7268, pp. 697–705. Springer, Heidelberg (2012)

    Chapter  Google Scholar 

  18. Ramos, J.: Using tf-idf to determine word relevance in document queries. In: Proceedings of the First Instructional Conference on Machine Learning (2003)

    Google Scholar 

  19. Salton, G., Buckley, C.: Term-weighting approaches in automatic text retrieval. Information Processing & Management 24(5), 513–523 (1988)

    Article  Google Scholar 

  20. Sivic, J., Zisserman, A.: Video google: A text retrieval approach to object matching in videos. In: Proceedings of the Ninth IEEE International Conference on Computer Vision, pp. 1470–1477. IEEE (2003)

    Google Scholar 

  21. Veltkamp, R.C., Tanase, M.: Content-based image retrieval systems: A survey. Rapport no UU-CS-2000-34 (2000)

    Google Scholar 

  22. Wang, J.Z., Li, J., Wiederhold, G.: Simplicity: Semantics-sensitive integrated matching for picture libraries. IEEE Transactions on Pattern Analysis and Machine Intelligence 23(9), 947–963 (2001)

    Article  Google Scholar 

Download references

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Rafał Grycuk .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Grycuk, R., Gabryel, M., Korytkowski, M., Scherer, R. (2014). Content-Based Image Indexing by Data Clustering and Inverse Document Frequency. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds) Beyond Databases, Architectures, and Structures. BDAS 2014. Communications in Computer and Information Science, vol 424. Springer, Cham. https://doi.org/10.1007/978-3-319-06932-6_36

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-06932-6_36

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-06931-9

  • Online ISBN: 978-3-319-06932-6

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics