Nonparametric Localized Feature Selection via a Dirichlet Process Mixture of Generalized Dirichlet Distributions
In this paper, we propose a novel Bayesian nonparametric statistical approach to simultaneous clustering and localized feature selection for unsupervised learning. The proposed model is based on a mixture of Dirichlet processes with generalized Dirichlet (GD) distributions, which can also be seen as an infinite GD mixture model. Due to the nature of the Bayesian nonparametric approach, the problems of overfitting and underfitting are prevented. Moreover, the determination of the number of clusters is sidestepped by assuming an infinite number of clusters. In our approach, the model parameters and the local feature saliencies are estimated simultaneously by variational inference. We report experimental results of applying our model to two challenging clustering problems involving web pages and tissue samples containing gene expressions.
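To make the infinite-mixture idea concrete, the following minimal sketch (not the paper's implementation) shows the truncated stick-breaking construction of Dirichlet process mixture weights, the standard device used by variational inference to work with a nominally infinite number of clusters. The function name, the concentration value `alpha=2.0`, and the truncation level are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def stick_breaking_weights(alpha, truncation, rng):
    """Truncated stick-breaking construction of DP mixture weights.

    Each weight is a fraction v_k of the stick length left after the
    first k-1 breaks; truncating and forcing the last v to 1 makes the
    weights sum to one, as assumed in truncated variational inference.
    """
    v = rng.beta(1.0, alpha, size=truncation)  # Beta(1, alpha) breaks
    v[-1] = 1.0  # close off the stick at the truncation level
    # Length of stick remaining before each break: prod_{j<k} (1 - v_j)
    remaining = np.concatenate([[1.0], np.cumprod(1.0 - v[:-1])])
    return v * remaining

rng = np.random.default_rng(0)
w = stick_breaking_weights(alpha=2.0, truncation=20, rng=rng)
# w is a length-20 probability vector; most mass typically falls on the
# first few components, which is how the DP sidesteps choosing the
# number of clusters in advance.
```

In the variational treatment, the Beta draws are replaced by Beta variational posteriors over the stick proportions, but the weight construction is the same.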
Keywords: Mixture Models · Clustering · Dirichlet Process · Nonparametric Bayesian · Generalized Dirichlet · Localized Feature Selection · Variational Inference