Abstract
In many unsupervised learning problems, data are available in several different representations, often referred to as views. By leveraging information from multiple views, we can obtain a clustering that is more robust and accurate than one obtained from any individual view. We propose a novel algorithm that is based on neighborhood co-regularization of the clustering hypotheses and that searches for a solution that is consistent across the different views. In our empirical evaluation on publicly available datasets, the proposed method outperforms several state-of-the-art clustering algorithms. Furthermore, applying our method to recently collected biomedical data leads to new insights that are critical for future research on the determinants of the cervicovaginal microbiome and on the cervicovaginal microbiome as a risk factor for the transmission of HIV. These insights could influence the interpretation of the clinical presentation of women with bacterial vaginosis and the resulting treatment decisions.
E. Tsivtsivadze and H. Borgdorff contributed equally to this work.
Acknowledgments
We acknowledge support from the Netherlands Organization for Scientific Research (grant number 639.023.604). Funding for the cervicovaginal microbiome study was received from the European and Developing Countries Clinical Trials Partnership (EDCTP), European Commission 7th Framework CHAARM project and the Aids Fonds Netherlands (grant number 201102). The views expressed in this paper are those of the authors and do not necessarily represent the official position of EDCTP, the EU, or the Aids Fonds Netherlands.
Appendix
Given the matrix formulation of our optimization problem, we can find the following closed form for the solution. Taking the partial derivative of \(J(W_i)\) with respect to \(\mathbf {w}^{(v)}_i\) we get
By defining \(G^\nu = 2\nu (M -1) X^{(v)}_i X^{(v)T}_i \), \(G^\lambda = \lambda X^{(v)T}_i\) and \(G = X^{(v)}_i X^{(v)T}_i \), we can rewrite the above term as
At the optimum we have \(\frac{\partial }{\partial \mathbf {w}^{(v)}_i}J(W_i)=0\) for all views, thus we get the exact solution by solving
with respect to \(\mathbf {w}^{(1)}_i,\ldots ,\mathbf {w}^{(M)}_i\). Note that the left-hand side matrix is positive definite and therefore invertible.
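The idea of stacking the per-view stationarity conditions into one block linear system can be illustrated with a minimal sketch. It assumes a generic co-regularized least-squares objective, \(\sum_v \left( \Vert \mathbf{y} - X^{(v)T}\mathbf{w}^{(v)}\Vert^2 + \lambda \Vert \mathbf{w}^{(v)}\Vert^2 \right) + \nu \sum_v \sum_{u \ne v} \Vert X^{(v)T}\mathbf{w}^{(v)} - X^{(u)T}\mathbf{w}^{(u)}\Vert^2\), which is not necessarily the exact \(J(W_i)\) used here. The function name `solve_coregularized_ls`, the target vector `y`, and the convention that each view is a features-by-samples matrix are assumptions made for illustration only.

```python
import numpy as np

def solve_coregularized_ls(views, y, lam=1.0, nu=0.1):
    """Solve an ASSUMED co-regularized least-squares problem in closed form.

    views : list of M arrays, view v has shape (d_v, n)  (features x samples)
    y     : array of shape (n,), hypothetical regression targets
    lam   : ridge regularization weight (lambda)
    nu    : co-regularization weight penalizing disagreement between views
    Returns a list with one weight vector per view.
    """
    M = len(views)
    dims = [X.shape[0] for X in views]
    offsets = np.concatenate(([0], np.cumsum(dims)))
    total = int(offsets[-1])

    A = np.zeros((total, total))  # block matrix of the stacked linear system
    b = np.zeros(total)           # stacked right-hand side

    for v, Xv in enumerate(views):
        rv = slice(offsets[v], offsets[v + 1])
        G = Xv @ Xv.T
        # Diagonal block: data term + ridge term + co-regularization term,
        # where the last one equals 2 * nu * (M - 1) * X X^T.
        A[rv, rv] = G + lam * np.eye(dims[v]) + 2.0 * nu * (M - 1) * G
        b[rv] = Xv @ y
        # Off-diagonal blocks couple view v with every other view u.
        for u, Xu in enumerate(views):
            if u != v:
                ru = slice(offsets[u], offsets[u + 1])
                A[rv, ru] = -2.0 * nu * (Xv @ Xu.T)

    # For lam > 0 the assumed objective is strictly convex, so A is
    # symmetric positive definite and the system has a unique solution.
    w = np.linalg.solve(A, b)
    return [w[offsets[v]:offsets[v + 1]] for v in range(M)]
```

Under these assumptions, setting `nu=0` decouples the views and each block reduces to ordinary ridge regression, which gives a simple sanity check for the block structure.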
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Tsivtsivadze, E., Borgdorff, H., Wijgert, J.v., Schuren, F., Verhelst, R., Heskes, T. (2013). Neighborhood Co-regularized Multi-view Spectral Clustering of Microbiome Data. In: Zhou, ZH., Schwenker, F. (eds) Partially Supervised Learning. PSL 2013. Lecture Notes in Computer Science, vol 8183. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40705-5_8
DOI: https://doi.org/10.1007/978-3-642-40705-5_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40704-8
Online ISBN: 978-3-642-40705-5
eBook Packages: Computer Science, Computer Science (R0)