In many problem domains data may come from multiple sources (or views), such as video and audio from a camera or text on and links to a web page. These multiple views of the data are often not directly comparable to one another, and thus a principled method for their integration is warranted. In this paper we develop a new algorithm to leverage information from multiple views for unsupervised clustering by constructing a custom kernel. We generate a multipartite graph (with the number of parts given by the number of views) that induces a kernel we then use for spectral clustering. Our algorithm can be seen as a generalization of co-clustering and spectral clustering and a relative of Kernel Canonical Correlation Analysis. We demonstrate the algorithm on four data sets: an illustrative artificial data set, synthetic fMRI data, voxels from an fMRI study, and a collection of web pages. Finally, we compare its performance to common alternatives.
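The abstract does not spell out the construction, so the following is only a rough sketch of the general idea for the two-view case: build a bipartite graph between the two views, then cluster spectrally from its normalized weight matrix. The cross-view weight `A1 @ A2`, the Gaussian bandwidth `sigma`, the sign-based two-way split, and all function names are assumptions made for illustration, not the paper's definitions.

```python
import numpy as np

def gaussian_affinity(X, sigma=1.0):
    """Within-view Gaussian affinity between all pairs of rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def two_view_bipartite_labels(X1, X2, sigma=1.0):
    """Two-way spectral split of a bipartite graph linking two views.

    Assumption: the weight between sample i (seen in view 1) and
    sample j (seen in view 2) is the sum over shared samples k of
    A1[i, k] * A2[k, j], i.e. the matrix product A1 @ A2.
    """
    A1 = gaussian_affinity(X1, sigma)
    A2 = gaussian_affinity(X2, sigma)
    W = A1 @ A2                      # cross-view (bipartite) weights
    d_row = W.sum(axis=1)            # degrees of the view-1 nodes
    d_col = W.sum(axis=0)            # degrees of the view-2 nodes
    Wn = W / np.sqrt(d_row)[:, None] / np.sqrt(d_col)[None, :]
    U, _, _ = np.linalg.svd(Wn)
    # The second left singular vector, rescaled by the degrees,
    # is roughly piecewise constant over two well-separated groups.
    v = U[:, 1] / np.sqrt(d_row)
    return (v > 0).astype(int)

# Toy data: the same 40 samples observed in a 2-D and a 3-D view.
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
X1 = rng.normal(scale=0.5, size=(40, 2)) + np.outer(truth, [4.0, 0.0])
X2 = rng.normal(scale=0.5, size=(40, 3)) + np.outer(truth, [0.0, 4.0, 0.0])
labels = two_view_bipartite_labels(X1, X2)
# Cluster labels are only defined up to swapping 0 and 1.
agreement = max(np.mean(labels == truth), np.mean(labels != truth))
```

Because the two views are never compared coordinate by coordinate, they can have different dimensions; only the within-view affinities are combined. With more than two clusters one would typically keep several singular vectors and run k-means on the rows instead of taking a sign.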
Editors: Nicolo Cesa-Bianchi, David R. Hardoon, and Gayle Leen.
This work is supported by NSF CAREER grant IIS-0133996, NSF IGERT grant DGE-0333451, and NSF grant CBET-0756828.
de Sa, V.R., Gallagher, P.W., Lewis, J.M. et al. Multi-view kernel construction. Mach Learn 79, 47–71 (2010). https://doi.org/10.1007/s10994-009-5157-z