On context-aware co-clustering with metadata support

Journal of Intelligent Information Systems

Abstract

In traditional co-clustering, the only basis for the clustering task is a given relationship matrix, which describes the strengths of the relationships between pairs of elements from two different domains. Relying on this single input matrix, co-clustering discovers relationships that hold among groups of elements from the two input domains. In many real-life applications, however, additional background knowledge or metadata about one or both of the input domains may be available and, if leveraged properly, such metadata can play a significant role in the effectiveness of the co-clustering process. How the additional metadata affects co-clustering depends on how the process is modified to be context-aware. In this paper, we propose, compare, and evaluate three alternative strategies (metadata-driven, metadata-constrained, and metadata-injected co-clustering) for embedding available contextual knowledge into the co-clustering process. Experimental results show that it is possible to leverage the available metadata to discover contextually relevant co-clusters without significant overhead in terms of information-theoretic co-cluster quality or execution cost.

Notes

  1. http://archive.ics.uci.edu/ml/datasets/Bag+of+Words.

  2. http://www.dmoz.org

  3. The matrix is re-normalized after the combination function is applied, so that information-theoretic co-clustering, which treats the values in the matrix as a probability distribution, can be applied. Because of this re-normalization, the combination function sum() is equivalent to average(): the two differ only by a scaling factor of 2, which is absorbed by the re-normalization.
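The equivalence stated in note 3 can be checked numerically. The sketch below is only an illustration under stated assumptions: it assumes the combination function merges two non-negative matrices of the same shape cell by cell, and the matrices `relationship` and `metadata`, as well as the helper names `normalize` and `combine`, are hypothetical and not taken from the paper.

```python
def normalize(matrix):
    """Scale a non-negative matrix so that its entries sum to 1."""
    total = sum(sum(row) for row in matrix)
    if total <= 0:
        raise ValueError("matrix must have positive total mass")
    return [[value / total for value in row] for row in matrix]


def combine(a, b, func):
    """Apply a cell-wise combination function to two equally shaped matrices."""
    return [
        [func(x, y) for x, y in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Hypothetical toy inputs (assumed for illustration only).
relationship = [[4.0, 0.0, 1.0], [0.0, 3.0, 2.0]]
metadata = [[1.0, 1.0, 0.0], [0.0, 2.0, 2.0]]

# Combine with sum() and with average(), then re-normalize each result.
by_sum = normalize(combine(relationship, metadata, lambda x, y: x + y))
by_avg = normalize(combine(relationship, metadata, lambda x, y: (x + y) / 2))

# The factor of 2 cancels in the re-normalization, so the results coincide.
same = all(
    abs(s - a) < 1e-12
    for row_s, row_a in zip(by_sum, by_avg)
    for s, a in zip(row_s, row_a)
)
print(same)
```

More generally, any two combination functions that differ only by a positive constant factor yield the same re-normalized matrix, since the constant appears in both the numerator and the total.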

Author information

Corresponding author

Correspondence to Claudio Schifanella.

Additional information

This work is partially supported by NSF Grant NSF-III1016921, “One Size Does Not Fit All: Empowering the User with User-Driven Integration.”

About this article

Cite this article

Schifanella, C., Sapino, M.L. & Candan, K.S. On context-aware co-clustering with metadata support. J Intell Inf Syst 38, 209–239 (2012). https://doi.org/10.1007/s10844-011-0151-x

  • DOI: https://doi.org/10.1007/s10844-011-0151-x
