Additive Biclustering: A Comparison of One New and Two Existing ALS Algorithms

Journal of Classification

Abstract

The additive biclustering model for two-way two-mode object-by-variable data implies overlapping clusterings of both the objects and the variables, together with a weight for each bicluster (i.e., each pair of an object cluster and a variable cluster). In data analysis, an additive biclustering model is fitted to given data by minimizing a least squares loss function. To this end, two alternating least squares (ALS) algorithms may be used: (1) PENCLUS, and (2) Baier’s ALS approach. However, both algorithms suffer from some inherent limitations, which may hamper their performance. As a way out, this paper presents a new ALS algorithm based on theoretical results regarding the optimal design of ALS algorithms. In a simulation study, this algorithm is shown to outperform the existing ALS approaches.
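As a rough illustration of the least squares criterion mentioned above, the following sketch assumes the common formulation in which the data matrix X is approximated by A W B', with A a binary object-by-cluster membership matrix, B a binary variable-by-cluster membership matrix, and W the matrix of bicluster weights. This notation, the function names, and the toy data are assumptions made for illustration; the abstract does not spell them out, and the code does not reproduce any of the three algorithms compared in the paper. It only shows the loss and the conditionally optimal weights for fixed memberships, which is the kind of sub-step an ALS scheme would alternate with membership updates.

```python
import numpy as np

def biclustering_loss(X, A, W, B):
    """Least squares loss ||X - A W B'||^2 (sum of squared residuals)."""
    residual = X - A @ W @ B.T
    return float(np.sum(residual ** 2))

def update_weights(X, A, B):
    """Weights minimizing the loss for fixed A and B: W = pinv(A) X pinv(B)'."""
    return np.linalg.pinv(A) @ X @ np.linalg.pinv(B).T

# Hypothetical toy example: 4 objects, 3 variables, 2 clusters per mode.
# Rows with two ones belong to both clusters (overlap); an all-zero row
# belongs to no cluster.
A = np.array([[1, 0], [1, 1], [0, 1], [0, 0]], dtype=float)
B = np.array([[1, 0], [0, 1], [1, 1]], dtype=float)
W_true = np.array([[2.0, -1.0], [0.5, 3.0]])

X = A @ W_true @ B.T            # noise-free data generated from the model
W_hat = update_weights(X, A, B)  # recover the weights given the memberships
loss = biclustering_loss(X, A, W_hat, B)
```

Because the toy data are noise-free and both membership matrices have full column rank, the weight update recovers `W_true` and the loss is numerically zero; with noisy data the same update gives the best-fitting weights for the given memberships.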

References

  • BAIER, D., GAUL, W., and SCHADER, M. (1997), “Two-Mode Overlapping Clustering with Applications to Simultaneous Benefit Segmentation and Market Structuring,” in Classification and Knowledge Organization, eds. R. Klar, and K. Opitz, Berlin, Germany: Springer, pp. 557–566.

  • BOTH, M., and GAUL, W. (1987), “Ein Vergleich Zweimodaler Clusteranalyseverfahren,” Methods of Operations Research, 57, 593–605.

  • BOTH, M., and GAUL, W. (1985), “PENCLUS: Penalty Clustering for Marketing Applications,” Discussion Paper No. 82, Institution of Decision Theory and Operations Research, University of Karlsruhe.

  • CEULEMANS, E., and KIERS, H.A.L. (2006), “Selecting Among Three-Mode Principal Component Models of Different Types and Complexities: A Numerical Convex Hull Based Method,” British Journal of Mathematical and Statistical Psychology, 59, 133–150.

  • CHATURVEDI, A., and CARROLL, J.D. (1994), “An Alternating Combinatorial Optimization Approach to Fitting the INDCLUS and Generalized INDCLUS Models,” Journal of Classification, 11, 155–170.

  • COLLINS, L.M., and DENT, C.W. (1988), “Omega: A General Formulation of the Rand Index of Cluster Recovery Suitable for Non-Disjoint Solutions,” Multivariate Behavioral Research, 23, 231–242.

  • DE LEEUW, J. (1994), “Block-Relaxation Algorithms in Statistics,” in Information Systems and Data Analysis, eds. H.-H. Bock, W. Lenski, and M.M. Richter, Berlin: Springer-Verlag, pp. 308–325.

  • DE SARBO, W.S. (1982), “Gennclus: New Models for General Nonhierarchical Clustering Analysis,” Psychometrika, 47, 449–475.

  • ECKES, T., and ORLIK, P. (1993), “An Error Variance Approach to Two-Mode Hierarchical Clustering,” Journal of Classification, 10, 51–74.

  • FAITH, J.J., HAYETE, B., JOSHUA, T., THADEN, J.T., MONGO, I.M., WIERZBOWSKI, J., COTTAREL, G., KASIF, S., COLLINS, J.J., and GARDNER, T.S. (2007), “Large-Scale Mapping and Validation of Escherichia Coli Transcriptional Regulation from a Compendium of Expression Profiles,” PLoS Biology, 5(1), 54–66.

  • GARA, M., ROSENBERG, S., and GOLDBERG, L. (1992), “DSM-IIIR as a Taxonomy: A Cluster Analysis of Diagnoses and Symptoms,” Journal of Nervous and Mental Disease, 180, 11–19.

  • GASCH, A.P., SPELLMAN, P.T., KAO, C.M., CARMEL-HAREL, O., EISEN, M.B., STORZ, G., BOTSTEIN, D., and BROWN, P.O. (2000), “Genomic Expression Programs in the Response of Yeast Cells to Environmental Changes,” Molecular Biology of the Cell, 11, 4241–4257.

  • GAUL, W., and SCHADER, M. (1996), “A New Algorithm for Two-Mode Clustering,” in Data Analysis and Information Systems: Statistical and Computational Approaches, eds. H.-H. Bock, and W. Polasek, Berlin, Germany: Springer, pp. 15–23.

  • GREENACRE, M.J. (1988), “Clustering the Rows and Columns of a Contingency Table,” Journal of Classification, 5, 39–51.

  • HAND, D., and KRZANOWSKI, W. (2005), “Optimizing K-means Clustering Results with Standard Software Packages,” Computational Statistics and Data Analysis, 49, 969–973.

  • HARTIGAN, J.A. (1976), “Modal Blocks in Dentition of West Coast Mammals,” Systematic Zoology, 25, 149–160.

  • LAZZERONI, L., and OWEN, A. (2002), “Plaid Models for Gene Expression Data,” Statistica Sinica, 12, 61–86.

  • MADEIRA, S.C., and OLIVEIRA, A.L. (2004), “Biclustering Algorithms for Biological Data Analysis: A Survey,” IEEE/ACM Transactions on Computational Biology and Bioinformatics, 1, 24–45.

  • MEZZICH, J.E., and SOLOMON, H. (1980), Taxonomy and Behavioral Science: Comparative Performance of Grouping Methods, London: Academic Press.

  • MIRKIN, B., ARABIE, P., and HUBERT, L.J. (1995), “Additive Two-Mode Clustering: The Error-Variance Approach Revisited?,” Journal of Classification, 12, 243–263.

  • SCHEPERS, J., CEULEMANS, E., and VAN MECHELEN, I. (2008), “Selecting Among Multi-Mode Partitioning Models of Different Complexities: A Comparison of Four Model Selection Criteria,” Journal of Classification, 25, 67–85.

  • SCHEPERS, J., and VAN MECHELEN, I. (2011), “A Two-Mode Clustering Method to Capture the Nature of the Dominant Interaction Pattern in Large Profile Data Matrices,” Psychological Methods, 16, 361–371.

  • SEGAL, E., SHAPIRA, M., REGEV, A., PE’ER, D., BOTSTEIN, D., KOLLER, D., and FRIEDMAN, N. (2003), “Module Networks: Identifying Regulatory Modules and Their Condition-Specific Regulators from Gene Expression Data,” Nature Genetics, 34, 166–176.

  • SPELLMAN, P.T., SHERLOCK, G., ZHANG, M.Q., IYER, V.R., ANDERS, K., EISEN, M.B., BROWN, P.O., BOTSTEIN, D., and FUTCHER, B. (1998), “Comprehensive Identification of Cell Cycle-Regulated Genes of the Yeast Saccharomyces Cerevisiae by Microarray Hybridization,” Molecular Biology of the Cell, 9, 3273–3297.

  • STEINLEY, D., and BRUSCO, M.J. (2007), “Initializing K-means Batch Clustering: A Critical Evaluation of Several Techniques,” Journal of Classification, 24, 99–121.

  • TURNER, H., BAILEY, T., and KRZANOWSKI, W. (2005), “Improved Biclustering of Microarray Data Demonstrated Through Systematic Performance Tests,” Computational Statistics and Data Analysis, 48, 235–254.

  • VAN MECHELEN, I., BOCK, H.-H., and DE BOECK, P. (2004), “Two-Mode Clustering Methods: A Structured Overview,” Statistical Methods in Medical Research, 13, 363–394.

  • VAN MECHELEN, I., and DE BOECK, P. (1989), “Implicit Taxonomy in Psychiatric Diagnosis: A Case Study,” Journal of Social and Clinical Psychology, 8, 276–287.

  • VAN MECHELEN, I., and DE BOECK, P. (1990), “Projection of a Binary Criterion into a Model of Hierarchical Classes,” Psychometrika, 55, 677–694.

  • WILDERJANS, T. F., CEULEMANS, E., and VAN MECHELEN, I. (2008), “The CHIC Model: A Global Model for Coupled Binary Data,” Psychometrika, 73, 729–751.

  • WILDERJANS, T. F., CEULEMANS, E., and VAN MECHELEN, I. (2012), “The SIMCLAS Model: Simultaneous Analysis of Coupled Binary Data Matrices with Noise Heterogeneity Between and Within Data Blocks,” Psychometrika, 77, 724–740.

  • WILDERJANS, T. F., CEULEMANS, E., VAN MECHELEN, I., and DEPRIL, D. (2011), “ADPROCLUS: A Graphical User Interface for Fitting Additive Profile Clustering Models to Object by Variable Data Matrices,” Behavior Research Methods, 43, 56–65.

  • WILDERJANS, T. F., DEPRIL, D., and VAN MECHELEN, I. (2012), “Block-Relaxation Approaches for Fitting the INDCLUS Model,” Journal of Classification, 29, 277–296.

Author information

Corresponding author

Correspondence to Tom F. Wilderjans.

About this article

Cite this article

Wilderjans, T.F., Depril, D. & Van Mechelen, I. Additive Biclustering: A Comparison of One New and Two Existing ALS Algorithms. J Classif 30, 56–74 (2013). https://doi.org/10.1007/s00357-013-9120-0

  • DOI: https://doi.org/10.1007/s00357-013-9120-0
