Unsupervised Sparse Matrix Co-clustering for Marketing and Sales Intelligence

  • Conference paper

Part of the Lecture Notes in Computer Science book series (LNAI, volume 7301)

Abstract

Business intelligence focuses on the discovery of useful retail patterns by combining historical and prognostic data. The ultimate goal is the orchestration of more targeted sales and marketing efforts. A frequent analytic task is the discovery of associations between customers and products. Matrix co-clustering techniques represent a common abstraction for solving this problem. We identify shortcomings of previous approaches, such as the need to specify the number of co-clusters explicitly and the common assumption that the matrix has a block-diagonal form. We address both of these issues and present techniques for automated matrix co-clustering. We formulate the problem as a recursive bisection on Fiedler vectors in conjunction with an eigengap-driven termination criterion. Our technique does not assume a perfect block-diagonal matrix structure after reordering. We explore and identify off-diagonal cluster structures by devising a Gaussian-based density estimator. Finally, we show how to explicitly couple co-clustering with product recommendations, using real-world business intelligence data. The final outcome is a robust co-clustering algorithm that can automatically discover both disjoint and overlapping cluster structures, even in the presence of noisy observations.
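
As an illustration of the recursive bisection idea described in the abstract, the following is a minimal sketch, not the authors' implementation. It treats the customer-product matrix as a bipartite graph, splits rows and columns by the sign of the Fiedler vector of the normalized Laplacian, and recurses. The function names `fiedler_bisect` and `cocluster`, the threshold `tau`, and the stopping rule itself are assumptions made for this example: the rule simply stops when the gap between the two smallest eigenvalues (which equals the second-smallest eigenvalue, since the smallest is 0) exceeds `tau`, as a stand-in for the paper's eigengap-driven criterion. The Gaussian-based search for off-diagonal clusters is not covered.

```python
import numpy as np

def fiedler_bisect(A):
    """Return (lambda_2, row_side, col_side) for the bipartite graph of A."""
    m, n = A.shape
    # Bipartite adjacency: rows (customers) and columns (products) are vertices.
    W = np.block([[np.zeros((m, m)), A], [A.T, np.zeros((n, n))]])
    d = np.maximum(W.sum(axis=1), 1e-12)   # degrees; guard against empty rows/columns
    s = 1.0 / np.sqrt(d)
    L = np.eye(m + n) - s[:, None] * W * s[None, :]   # normalized Laplacian
    vals, vecs = np.linalg.eigh(L)         # eigenvalues in ascending order
    # Project out the trivial eigenvector D^{1/2} 1, so a usable Fiedler
    # vector is obtained even when the eigenvalue 0 is repeated.
    u = np.sqrt(d) / np.linalg.norm(np.sqrt(d))
    cand = vecs[:, :2] - np.outer(u, u @ vecs[:, :2])
    f = cand[:, np.argmax(np.linalg.norm(cand, axis=0))]
    side = (f * s) >= 0                    # sign of the rescaled Fiedler vector
    return vals[1], side[:m], side[m:]

def cocluster(A, rows=None, cols=None, tau=0.5):
    """Recursively bisect A; returns a list of (row_indices, col_indices)."""
    if rows is None:
        rows, cols = np.arange(A.shape[0]), np.arange(A.shape[1])
    if len(rows) < 2 or len(cols) < 2:
        return [(rows.tolist(), cols.tolist())]
    lam2, r, c = fiedler_bisect(A[np.ix_(rows, cols)])
    # Assumed stopping rule: the block is already well connected (large
    # lambda_2), or the split leaves one side without rows or columns.
    if lam2 > tau or r.all() or not r.any() or c.all() or not c.any():
        return [(rows.tolist(), cols.tolist())]
    return (cocluster(A, rows[r], cols[c], tau)
            + cocluster(A, rows[~r], cols[~c], tau))

# Toy data: two dense customer-product blocks plus one noisy purchase.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
A[0, 5] = 1.0
clusters = cocluster(A)
```

On this toy matrix the number of co-clusters is not supplied as input; the recursion ends once each block is densely connected. The order of the returned co-clusters depends on the arbitrary sign of the eigenvector, so callers should not rely on it.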

Keywords

  • Bipartite Graph
  • Input Matrix
  • Business Intelligence
  • Product Recommendation
  • Spectral Graph Theory

These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

  • DOI: 10.1007/978-3-642-30217-6_49
  • Chapter length: 13 pages

Copyright information

© 2012 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zouzias, A., Vlachos, M., Freris, N.M. (2012). Unsupervised Sparse Matrix Co-clustering for Marketing and Sales Intelligence. In: Tan, P.-N., Chawla, S., Ho, C.K., Bailey, J. (eds) Advances in Knowledge Discovery and Data Mining. PAKDD 2012. Lecture Notes in Computer Science, vol 7301. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30217-6_49

  • DOI: https://doi.org/10.1007/978-3-642-30217-6_49

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-30216-9

  • Online ISBN: 978-3-642-30217-6

  • eBook Packages: Computer Science, Computer Science (R0)