
Weighted subspace modeling for semantic concept retrieval using Gaussian mixture models

Information Systems Frontiers

An Erratum to this article was published on 23 July 2016

Abstract

In the era of the digital revolution, social media data are growing at an explosive pace. Thanks to the popularity of inexpensive, high-resolution mobile devices and the ubiquitous Internet access provided by mobile carriers, Wi-Fi, etc., enormous numbers of videos and pictures are generated and uploaded every day to social media websites such as Facebook, Flickr, and Twitter. To search and retrieve information efficiently and effectively from such large amounts of multimedia data (structured, semi-structured, or unstructured), many algorithms and tools have been developed. Among them, a variety of data mining and machine learning methods have been proposed and have shown their effectiveness and potential in handling the growing demand to retrieve semantic information from large-scale multimedia data. However, it is well acknowledged that the performance of such multimedia semantic information retrieval is far from satisfactory, owing to challenges such as rare events and data imbalance. In this paper, a novel weighted subspace modeling framework is proposed that is based on the Gaussian Mixture Model (GMM) and is able to retrieve semantic concepts effectively, even from highly imbalanced datasets. Experimental results on two publicly available benchmark datasets, compared against our previous GMM-based subspace modeling method and other prevailing counterparts, demonstrate the effectiveness of the proposed weighted GMM-based subspace modeling framework, which achieves improved retrieval performance in terms of mean average precision (MAP) values.
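
For readers unfamiliar with the evaluation metric named in the abstract, the following is a minimal sketch of how mean average precision is commonly computed. It is not taken from the paper: the function names and the 0/1 relevance-list input format are assumptions made for illustration, and the sketch assumes each concept's full ranked list is scored, so that every relevant item appears in it. The paper may use a cutoff variant.

```python
from typing import Sequence


def average_precision(ranked_relevance: Sequence[int]) -> float:
    """AP for one concept: mean of precision@k over the ranks k holding a relevant item.

    ranked_relevance is a hypothetical input format: 1 if the item at that
    rank is relevant to the concept, 0 otherwise, ordered best score first.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this rank
    # No relevant item in the list: define AP as 0 by convention.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(per_concept: Sequence[Sequence[int]]) -> float:
    """MAP: unweighted mean of the per-concept AP values."""
    if not per_concept:
        return 0.0
    return sum(average_precision(r) for r in per_concept) / len(per_concept)


if __name__ == "__main__":
    # Two toy concepts with made-up relevance judgments.
    rankings = [[1, 0, 1, 0], [0, 1]]
    print(mean_average_precision(rankings))
```

Because AP rewards relevant items that appear near the top of the ranking, it is sensitive to how well a model ranks the few positive instances of a rare concept, which is why it is a common choice for imbalanced retrieval tasks.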




Author information

Corresponding author

Correspondence to Chao Chen.

Additional information

An erratum to this article can be found at http://dx.doi.org/10.1007/s10796-016-9684-4.

About this article

Cite this article

Chen, C., Shyu, ML. & Chen, SC. Weighted subspace modeling for semantic concept retrieval using Gaussian mixture models. Inf Syst Front 18, 877–889 (2016). https://doi.org/10.1007/s10796-016-9660-z

  • DOI: https://doi.org/10.1007/s10796-016-9660-z
