A Bag-of-Features Algorithm for Applications Using a NoSQL Database

Conference paper
Part of the Communications in Computer and Information Science book series (CCIS, volume 639)

Abstract

In this paper we present a Bag-of-Words (also known as Bag-of-Features) method designed for implementation in NoSQL databases. In developing the algorithm, special attention was paid to ease of implementation and to keeping the number of computations to a minimum, so as to make the fullest use of what the database engine offers. The algorithm is presented using an example of image storage and retrieval. This case requires an additional preprocessing step, during which characteristic image features are extracted, and a clustering algorithm to create a dictionary. We present our own k-means algorithm, which automatically selects the number of clusters. The presented method does not involve any computationally complex classification algorithm; instead, it uses majority voting. This significantly simplifies the computations and makes it possible to use the JavaScript language available in a common NoSQL database.
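The abstract does not spell out how the majority vote is carried out, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that a dictionary of cluster centres already exists (here it is simply given, not produced by the modified k-means), that each visual word is labelled with the class most common among the training features assigned to it, and that a query image is classified by letting each of its features vote for the class of its nearest word. The function names `nearest_word`, `label_words`, and `classify` are hypothetical, and Python is used for brevity rather than the JavaScript the paper targets.

```python
from collections import Counter
import math


def nearest_word(feature, dictionary):
    """Return the index of the dictionary centre closest to the feature (Euclidean distance)."""
    return min(range(len(dictionary)), key=lambda i: math.dist(feature, dictionary[i]))


def label_words(train_features, train_labels, dictionary):
    """Assumption: each visual word takes the class held by most training features assigned to it."""
    votes = [Counter() for _ in dictionary]
    for feature, label in zip(train_features, train_labels):
        votes[nearest_word(feature, dictionary)][label] += 1
    # Words that attracted no training feature stay unlabelled (None).
    return [v.most_common(1)[0][0] if v else None for v in votes]


def classify(image_features, dictionary, word_labels):
    """Each feature votes for the class of its nearest word; the most frequent class wins."""
    votes = Counter()
    for feature in image_features:
        label = word_labels[nearest_word(feature, dictionary)]
        if label is not None:
            votes[label] += 1
    return votes.most_common(1)[0][0] if votes else None
```

Under these assumptions, classification needs only distance computations and counting, which is the kind of work that can be expressed in a database engine's scripting language without a separate classifier.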

Keywords

NoSQL database, Image classification, Bag-of-Features, Modified k-means algorithm

Copyright information

© Springer International Publishing Switzerland 2016

Authors and Affiliations

  1. Institute of Computational Intelligence, Częstochowa University of Technology, Częstochowa, Poland
