
Nonparametric Localized Feature Selection via a Dirichlet Process Mixture of Generalized Dirichlet Distributions

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 7665)

Abstract

In this paper, we propose a novel Bayesian nonparametric statistical approach to simultaneous clustering and localized feature selection for unsupervised learning. The proposed model is based on a Dirichlet process mixture of generalized Dirichlet (GD) distributions, which can also be seen as an infinite GD mixture model. Owing to the nature of the Bayesian nonparametric approach, the problems of overfitting and underfitting are avoided. Moreover, the number of clusters does not have to be determined in advance, because the model assumes an infinite number of clusters. In our approach, the model parameters and the local feature saliencies are estimated simultaneously by variational inference. We report experimental results of applying our model to two challenging clustering problems: web pages, and tissue samples described by gene expression data.
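
To make the phrase "infinite GD mixture model" concrete, the following is a minimal generative sketch, not the authors' method. It assumes the standard stick-breaking representation of a Dirichlet process, truncated at a finite level, and the usual construction of a generalized Dirichlet vector from independent Beta variables. The function names, the truncation level, and all parameter values are hypothetical, and the sketch covers neither the feature saliency part of the model nor the variational inference used in the paper.

```python
import random


def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking weights for a Dirichlet process.

    Each v_k ~ Beta(1, alpha); weight k is v_k times the stick left over
    after the first k-1 breaks. The last weight absorbs the remainder so
    that the truncated weights sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        v = rng.betavariate(1.0, alpha)
        weights.append(v * remaining)
        remaining *= 1.0 - v
    weights.append(remaining)
    return weights


def sample_generalized_dirichlet(a, b, rng):
    """Draw one generalized Dirichlet vector from independent Beta draws.

    With z_d ~ Beta(a_d, b_d), set x_1 = z_1 and
    x_d = z_d * prod_{j<d} (1 - z_j); the entries sum to at most one.
    """
    x = []
    remaining = 1.0
    for a_d, b_d in zip(a, b):
        z = rng.betavariate(a_d, b_d)
        x.append(z * remaining)
        remaining *= 1.0 - z
    return x


# Hypothetical settings, chosen only for illustration.
rng = random.Random(0)
weights = stick_breaking_weights(alpha=1.0, truncation=10, rng=rng)

# Pick a mixture component according to the weights, then draw a data point.
# A full model would hold one (a, b) parameter pair per component.
component = rng.choices(range(len(weights)), weights=weights)[0]
point = sample_generalized_dirichlet([2.0, 3.0, 1.5], [4.0, 2.0, 2.5], rng)
```

Because the weights decay as more of the stick is broken off, only a few components carry noticeable mass in practice, which is the sense in which the number of clusters need not be fixed beforehand.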




Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Fan, W., Bouguila, N. (2012). Nonparametric Localized Feature Selection via a Dirichlet Process Mixture of Generalized Dirichlet Distributions. In: Huang, T., Zeng, Z., Li, C., Leung, C.S. (eds) Neural Information Processing. ICONIP 2012. Lecture Notes in Computer Science, vol 7665. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34487-9_4


  • DOI: https://doi.org/10.1007/978-3-642-34487-9_4

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34486-2

  • Online ISBN: 978-3-642-34487-9

  • eBook Packages: Computer Science, Computer Science (R0)
