Abstract
This article analyzes the structure of mixed-membership models by decomposing it into elementary blocks and examining their numerical properties. By associating these model structures with structures known or assumed in the data, we propose how models can be constructed in a controlled way, using the numerical properties of the data likelihood and the Gibbs full conditionals as predictors of model behavior. To illustrate this “bottom-up” design method, we construct example models that may be used for expertise finding from labeled data.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Heinrich, G. (2011). Typology of Mixed-Membership Models: Towards a Design Method. In: Gunopulos, D., Hofmann, T., Malerba, D., Vazirgiannis, M. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2011. Lecture Notes in Computer Science, vol 6912. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-23783-6_3
DOI: https://doi.org/10.1007/978-3-642-23783-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-23782-9
Online ISBN: 978-3-642-23783-6
eBook Packages: Computer Science, Computer Science (R0)