Novel Indexing Strategy and Similarity Measures for Gaussian Mixture Models

  • Linfei Zhou
  • Wei Ye
  • Bianca Wackersreuther
  • Claudia Plant
  • Christian Böhm
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10439)

Abstract

Efficient similarity search for data with complex structures is a challenging task in many modern data mining applications, such as image retrieval, speaker recognition and stock market analysis. A common way to model these data objects is to use Gaussian Mixture Models, which can approximate arbitrary distributions concisely. Indexes are essential for answering queries efficiently. However, because Gaussian Mixture Models differ in their numbers of components, the performance of existing index methods tends to break down. In this paper we propose a novel technique, Normalized Transformation, that reorganizes the index structure to account for the different numbers of components in Gaussian Mixture Models. In addition, Normalized Transformation enables us to derive a set of similarity measures from existing ones that have closed-form expressions. Extensive experiments demonstrate the effectiveness of the proposed technique for Gaussian component-based indexing and the performance of the novel similarity measures for clustering and classification.
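
As a rough illustration of what a closed-form similarity measure between Gaussian Mixture Models can look like, the sketch below computes the Euclidean (L2) distance between two mixtures with diagonal covariances. It relies on the standard identity that the integral of the product of two Gaussians N(x; μ1, Σ1) and N(x; μ2, Σ2) equals N(μ1; μ2, Σ1 + Σ2). This is not the Normalized Transformation or any measure proposed in the paper; the function names, the `(weight, mean, variance)` tuple layout and the toy mixtures are assumptions made only for this example.

```python
import numpy as np

def gauss_overlap(mu1, var1, mu2, var2):
    """Integral of the product of two diagonal-covariance Gaussians.

    Uses the identity: integral N(x; mu1, S1) N(x; mu2, S2) dx = N(mu1; mu2, S1 + S2).
    """
    mu1, var1, mu2, var2 = (np.asarray(a, dtype=float)
                            for a in (mu1, var1, mu2, var2))
    var = var1 + var2
    diff = mu1 - mu2
    log_val = -0.5 * np.sum(np.log(2.0 * np.pi * var) + diff ** 2 / var)
    return float(np.exp(log_val))

def gmm_inner(gmm_a, gmm_b):
    """Closed-form integral of the product of two GMM densities.

    Each GMM is a list of (weight, mean, variance) tuples (hypothetical layout).
    """
    return sum(wa * wb * gauss_overlap(ma, va, mb, vb)
               for wa, ma, va in gmm_a
               for wb, mb, vb in gmm_b)

def gmm_l2_distance(gmm_a, gmm_b):
    """Euclidean (L2) distance between two GMM densities, with no sampling."""
    sq = (gmm_inner(gmm_a, gmm_a)
          - 2.0 * gmm_inner(gmm_a, gmm_b)
          + gmm_inner(gmm_b, gmm_b))
    return float(np.sqrt(max(sq, 0.0)))  # clamp tiny negative rounding errors

# Toy mixtures with different numbers of components (made-up values).
gmm_x = [(0.5, [0.0, 0.0], [1.0, 1.0]), (0.5, [3.0, 3.0], [0.5, 0.5])]
gmm_y = [(1.0, [1.0, 1.0], [2.0, 2.0])]

print(gmm_l2_distance(gmm_x, gmm_x))
print(gmm_l2_distance(gmm_x, gmm_y))
```

The distance is a double sum over component pairs, so it is defined for mixtures with different numbers of components. The difficulty the abstract describes concerns indexing such objects, not evaluating the measure itself.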

Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Linfei Zhou (1)
  • Wei Ye (1)
  • Bianca Wackersreuther (1)
  • Claudia Plant (2)
  • Christian Böhm (1, corresponding author)
  1. Ludwig-Maximilians-Universität München, Munich, Germany
  2. University of Vienna, Vienna, Austria
