Wireless Personal Communications, Volume 102, Issue 3, pp 2307–2329

Variational Bayesian Inference for Infinite Dirichlet Mixture Towards Accurate Data Categorization

  • Yuping Lai
  • Wenda He
  • Yuan Ping (Email author)
  • Jinshuai Qu
  • Xiufeng Zhang

Abstract

In this paper, we present a variational Bayesian learning approach for the infinite Dirichlet mixture model (VarInDMM), which inherits that model's proven effectiveness in modeling proportional data. Because it is based on the Dirichlet process mixture model, VarInDMM can be interpreted as a mixture model with a countably infinite number of components, and it can determine the optimal number of components from the observed data. By introducing an extended variational inference framework, we further obtain an analytically tractable solution for estimating the posterior distributions of the mixture model's parameters. Experimental results on both synthetic and real data demonstrate its good performance on object categorization and text categorization.
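As a rough illustration of the "countably infinite number of components" idea, the sketch below draws mixture weights with a truncated stick-breaking construction and samples proportional data from Dirichlet components. This is not the authors' VarInDMM algorithm: the stick-breaking representation, the truncation level `T`, the concentration value `alpha`, the randomly chosen component parameters, and all function names are assumptions made here for illustration only.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process (assumed construction)."""
    weights = []
    remaining = 1.0  # length of the stick that has not been assigned yet
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)  # fraction broken off the remaining stick
        weights.append(remaining * v)
        remaining *= 1.0 - v
    weights.append(remaining)  # last component absorbs the leftover mass
    return weights


def sample_dirichlet(concentration, rng):
    """Draw one proportional vector by normalizing independent Gamma draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in concentration]
    total = sum(gammas)
    return [g / total for g in gammas]


def sample_mixture(weights, component_params, n, rng):
    """Draw n (component index, proportional vector) pairs from the mixture."""
    samples = []
    for _ in range(n):
        k = rng.choices(range(len(weights)), weights=weights)[0]
        samples.append((k, sample_dirichlet(component_params[k], rng)))
    return samples


rng = random.Random(0)
T = 20  # hypothetical truncation level standing in for "infinite"
weights = stick_breaking_weights(alpha=1.0, truncation=T, rng=rng)
# Hypothetical 3-dimensional Dirichlet parameters, one vector per component.
params = [[rng.uniform(0.5, 10.0) for _ in range(3)] for _ in range(T)]
data = sample_mixture(weights, params, 200, rng)
used = {k for k, _ in data}
print(f"{len(used)} of {T} components generated at least one sample")
```

With a small `alpha`, most of the weight tends to fall on the first few components, so only a subset of the `T` available components produces data. This loosely mirrors how a Dirichlet process mixture lets the data decide how many components are effectively in use.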

Keywords

Infinite mixture model; Dirichlet distribution; Text categorization; Nonparametric Bayesian statistics; Extended variational inference

Notes

Acknowledgements

This work was supported by the National Natural Science Foundation of China under Grant No. 513335004, the Program for Science & Technology Innovation Talents in Universities of Henan Province under Grant No. 18HASTIT022, the Plan for Scientific Innovation Talent of He’nan Province under Grant No. 184100510012, the Foundation for University Key Teacher of Henan Province under Grant No. 2016GGJS-141, the Foundation of Henan Educational Committee under Grant Nos. 16A520025 and 18A520047, the Open Project Foundation of Information Technology Research Base of Civil Aviation Administration of China under Grant No. CAAC-ITRB-201702, the Yunnan Provincial Department of Education Science Research Fund Project under Grant No. 2017ZDX045, the Heilongjiang Natural Science Foundation under Grant No. H2016100, and the Innovation Scientists and Technicians Troop Construction Projects of He’nan Province.

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Yuping Lai (1, 2)
  • Wenda He (1)
  • Yuan Ping (3, 4), Email author
  • Jinshuai Qu (5)
  • Xiufeng Zhang (6)

  1. College of Computer Science and Technology, North China University of Technology, Beijing, China
  2. Beijing Key Laboratory on Integration and Analysis of Large-scale Stream Data, Beijing, China
  3. School of Information Engineering, Xuchang University, Xuchang, China
  4. Information Technology Research Base of Civil Aviation Administration of China, Civil Aviation University of China, Tianjin, China
  5. University Key Laboratory of Wireless Sensor Networks in Yunnan Province, Yunnan Minzu University, Kunming, China
  6. National Research Center for Rehabilitation Technical Aids, Beijing, China
