Cluster Computing

, Volume 21, Issue 1, pp 797–804 | Cite as

A modified K-means clustering for mining of multimedia databases based on dimensionality reduction and similarity measures

  • Xiaoping Jiang
  • Chenghua LiEmail author
  • Jing Sun


With rapid innovations in digital technology and cloud computing off late, there has been a huge volume of research in the area of web based storage, cloud management and mining of data from the cloud. Large volumes of data sets are being stored, processed in either virtual or physical storage and processing equipments on a daily basis. Hence, there is a continuous need for research in these areas to minimize the computational complexity and subsequently reduce the time and cost factors. The proposed research paper focuses towards handling and mining of multimedia data in a data base which is a mixed composition of data in the form of graphic arts and pictures, hyper text, text data, video or audio. Since large amounts of storage are required for audio and video data in general, the management and mining of such data from the multimedia data base needs special attention. Experimental observations using well known data sets of varying features and dimensions indicate that the proposed cluster based mining technique achieves promising results in comparison with the other well-known methods. Every attribute denoting the efficiency of the mining process have been compared component wise with recent mining techniques in the past. The proposed system addresses effectiveness, robustness and efficiency for a high-dimensional multimedia database.


Multimedia data bases Clustering Mining K means clustering Optimization 



Funding was provided by General Program of the Natural Science Fund of Hubei Province (Grant No. 2014CFB916).


  1. 1.
    Kotsiantis, S., Kanellopoulos, D., Pintelas, P.: Multimedia mining. WSEAS Trans. Syst. 3(10), 3263–3268 (2004)Google Scholar
  2. 2.
    Manjunath, T.N., Hegadi, R.S., Ravikumar, G.K.: A survey on multimedia data mining and its relevance today. Int. J. Comput. Sci. Netw. Secur. 10(11), 165–170 (2010)Google Scholar
  3. 3.
    Bhatt, C.A., Kankanhalli, M.S.: Multimedia data mining: state of the art and challenges. Multimed. Tools Appl. 51, 35–76 (2011)CrossRefGoogle Scholar
  4. 4.
    Bhatt, C., Kankanhalli, M.: Probabilistic temporal multimedia data mining. ACM Trans. Intell. Syst. Technol. vol. 2, no. 2, Article 17 (2011)Google Scholar
  5. 5.
    Kamde, P.M., Algur, S.P.: A survey on web multimedia mining. arXiv:1109.1145 (2011)
  6. 6.
    Wang, D., Kim, Y.-S., Park, S.C., Lee, C.S., Han, Y.K.: Learning based neural similarity metrics for multimedia data mining. Soft Comput. 11(4), 335–340 (2007)CrossRefGoogle Scholar
  7. 7.
    Benjamin, B., Navarro, G.: Probabilistic proximity searching algorithms based on compact partitions. Discret. Algorithms 2(1), 115–134 (2004)MathSciNetCrossRefzbMATHGoogle Scholar
  8. 8.
    Filippone, M., Camastra, F., Masulli, F., Rovetta, S.: A survey of kernel and spectral methods for clustering. Pattern Recognit. 41(1), 176–190 (2008)CrossRefzbMATHGoogle Scholar
  9. 9.
    D’Urso, P., Massari, R., Cappelli, C., De Giovanni, L.: Autoregressive metric-based trimmed fuzzy clustering with an application to PM\(_{10}\) time series. Chemometr. Intell. Lab. Syst. 161, 15–26 (2017)CrossRefGoogle Scholar
  10. 10.
    Nair, B.B., Saravana Kumar, P.K., Sakthivel, N.R., Vipin, U.: Clustering stock price time series data to generate stock trading recommendations: an empirical study. Expert Syst. Appl. 70, 20–36 (2017)CrossRefGoogle Scholar
  11. 11.
    Méndez, E., Lugo, O., Melin, P.: A competitive modular neural network for long-term time series forecasting. In: Melin, P., Castillo, O., Kacprzyk, J. (eds.) Nature-Inspired Design of Hybrid Intelligent Systems, pp. 243–254. Springer International Publishing (2017)Google Scholar
  12. 12.
    Wang, D., Wang, Z., Li, J., Zhang, B., Li, X.: Query representation by structured concept threads with application to interactive video retrieval. J. Vis. Commun. Image Represent. 20, 104–116 (2009)CrossRefGoogle Scholar
  13. 13.
    Berkhin, P.: A survey of clustering data mining techniques. In: Kogan, J., Nicholas, C., Teboulle, M. (eds.) Grouping Multidimensional Data. Recent Advances in Clustering, pp. 25–71, 372, 520. Springer, Berlin (2006)Google Scholar
  14. 14.
    Bagnall, A., Janacek, G.: Clustering time series with clipped data. Mach. Learn. 58(2–3), 151–178 (2005)CrossRefzbMATHGoogle Scholar
  15. 15.
    Mukherjee, Michael Laszlo Sumitra: A Genetic algorithm that exchanges neighbouring centers for K-means clustering. Pattern Recognit. Lett. 28, 2359–2366 (2007)CrossRefGoogle Scholar
  16. 16.
    Roy, D.K., Sharma, L.K.: Genetic K-means clustering algorithm for mixed numeric and categorical data. Int. J. Artif. Intell. Appl. 1(2), 23–28 (2010)Google Scholar
  17. 17.
    Natarajan, R., Sion, R., Phan, T.: A grid-based approach for enterprise-scale data mining. J. Future Gener. Comput. Syst. 23, 48–54 (2007)CrossRefGoogle Scholar
  18. 18.
    Wong, K.-C., Wu, C.-H., Mok, R.K.P., Peng, C., Zhang, Z.: Evolutionary multimodal optimization using the principle of locality. Inf. Sci. J. 194, 138–170 (2012)CrossRefGoogle Scholar
  19. 19.
    Maji, P.: Fuzzy-rough supervised attribute clustering algorithm and classification of microarray data. IEEE Trans. Syst. Man Cybern. Part B 41(1), 222–233 (2011)CrossRefGoogle Scholar
  20. 20.
    Niknam, T., Firouzi, B.B., Nayeripour, M.: An efficient hybrid evolutionary algorithm for cluster analysis. World Appl. Sci. J. 4(2), 300–307 (2008)Google Scholar
  21. 21.
    Belacel, N., Raval, H.B., Punnen, A.P.: Learning multicriteria fuzzy classification method PROAFTN from data. Comput. Oper. Res. 34, 1885–1898 (2007)MathSciNetCrossRefzbMATHGoogle Scholar
  22. 22.
    Ordonez, C.: Integrating K-means clustering with a relational DBMS using SQL. IEEE Trans. Knowl. Data Eng. 18(2), 188–201 (2006)CrossRefGoogle Scholar
  23. 23.
    Santos, J.M., de Sa, J.M., Alexandre, L.A.: LEGClust-a clustering algorithm based on layered entropic sub graph. IEEE Trans. Pattern Anal. Mach. Intell. 30, 62–75 (2008)CrossRefGoogle Scholar
  24. 24.
    Jarrah, M., Al-Quraan, M., Jararweh, Y., Al-Ayyoub, M.: Medgraph: a graph-based representation and computation to handle large sets of images. Multimed. Tools Appl. 76(2), 2769–2785 (2017)CrossRefGoogle Scholar
  25. 25.
    Monbet, V., Ailliot, P.: Sparse vector Markov switching autoregressive models. Application to multivariate time series of temperature. Comput. Stat. Data Anal. 108, 40–51 (2017)MathSciNetCrossRefGoogle Scholar
  26. 26.
    Varley, J.B., Miglio, A., Ha, V.-A., van Setten, M.J., Rignanese, G.-M., Hautier, G.: High-throughput design of non-oxide p-type transparent conducting materials: data mining, search strategy and identification of boron phosphide. Chem. Mater. 29(6), 2568–2573 (2017). doi: 10.1021/acs.chemmater.6b04663 CrossRefGoogle Scholar
  27. 27.
    Olson, D.L., Desheng Dash, W.: Data Mining Models and Enterprise Risk Management. Enterprise Risk Management Models. Springer, Berlin (2017)CrossRefGoogle Scholar
  28. 28.
    Kandoi, G., Leelananda, S.P., Jernigan, R.L., Sen, T.Z.: Predicting protein secondary structure using consensus data mining (CDM) based on empirical statistics and evolutionary information. Methods Mol. Biol. 1484, 35–44 (2017). doi: 10.1007/978-1-4939-6406-2_4 CrossRefGoogle Scholar

Copyright information

© Springer Science+Business Media New York 2017

Authors and Affiliations

  1. 1.College of Electronics and Information Engineering, Hubei Key Laboratory of Intelligent Wireless CommunicationsSouth-Central University for NationalitiesWuhanChina

Personalised recommendations