Combinatorial Optimization Algorithms to Mine a Sub-Matrix of Maximal Sum

Branders, Vincent; Schaus, Pierre; Dupont, Pierre

doi:10.1007/978-3-319-78680-3_5

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 10785)

Included in the following conference series:

International Workshop on New Frontiers in Mining Complex Patterns

Abstract

Biclustering techniques have been widely used to identify homogeneous subgroups within large data matrices, such as subsets of genes similarly expressed across subsets of patients. Mining a max-sum sub-matrix is a related but distinct problem in which one looks for a (not necessarily contiguous) rectangular sub-matrix with a maximal sum of its entries. Le Van et al. [7] already illustrated its applicability to gene expression analysis and addressed it with a constraint programming (CP) approach combined with large neighborhood search (LNS). In this work, we exhibit some key properties of this \(\mathcal{NP}\)-hard problem and define a bounding function such that larger problems can be solved in reasonable time. The use of these properties results in an improved CP-LNS implementation, which is evaluated here. Two additional algorithms are also proposed in order to exploit the highlighted characteristics of the problem: a CP approach with a global constraint (CPGC) and a mixed integer linear programming (MILP) formulation. Practical experiments conducted on both synthetic and real gene expression data exhibit the characteristics of these approaches and their relative benefits over the CP-LNS method. Overall, the CPGC approach tends to be the fastest to produce a good solution. Yet, the MILP formulation is arguably the easiest to formulate and can also be competitive.
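To make the problem statement concrete, the following Python sketch solves a tiny max-sum sub-matrix instance by exhaustive search. It is only an illustration of the problem definition, not the CP-LNS, CPGC, or MILP methods of the chapter. The function names `best_rows_for_columns` and `max_sum_submatrix` and the example matrix are hypothetical. The sketch relies on one elementary observation: once the set of columns is fixed, the best choice of rows is exactly the set of rows whose sum over those columns is positive.

```python
from itertools import combinations


def best_rows_for_columns(matrix, cols):
    """For a fixed set of columns, keep every row with a positive partial sum."""
    rows, total = [], 0
    for i, row in enumerate(matrix):
        partial = sum(row[j] for j in cols)
        if partial > 0:  # a row helps only if it adds a positive amount
            rows.append(i)
            total += partial
    return rows, total


def max_sum_submatrix(matrix):
    """Enumerate all column subsets (exponential time: tiny inputs only)."""
    n_cols = len(matrix[0]) if matrix else 0
    best = ([], [], 0)  # the empty selection has sum 0
    for k in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), k):
            rows, total = best_rows_for_columns(matrix, cols)
            if total > best[2]:
                best = (rows, list(cols), total)
    return best


# Hypothetical 3x3 example: the optimum skips the middle row and column,
# so the selected sub-matrix is not contiguous.
example = [
    [2, -1, 3],
    [-4, 5, -2],
    [1, -3, 4],
]
print(max_sum_submatrix(example))  # ([0, 2], [0, 2], 10)
```

The enumeration grows exponentially with the number of columns, which is consistent with the problem being \(\mathcal{NP}\)-hard and is why bounding functions and dedicated solvers are needed for realistic gene expression matrices.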

Notes

1. Essentially by considering the rows and columns of the matrix as the two sets of nodes of a bipartite graph.

References

1. Atzmueller, M.: Subgroup discovery. Wiley Interdiscipl. Rev. Data Mining Knowl. Discov. 5(1), 35–49 (2015)
2. Bentley, J.: Programming pearls: algorithm design techniques. Commun. ACM 27(9), 865–873 (1984)
3. Cheng, Y., Church, G.M.: Biclustering of expression data. In: ISMB, vol. 8, pp. 93–103 (2000)
4. Dawande, M., Keskinocak, P., Tayur, S.: On the biclique problem in bipartite graphs (1996)
5. Fanaee-T, H., Gama, J.: Eigenspace method for spatiotemporal hotspot detection. Expert Syst. 32(3), 454–464 (2015)
6. Herrera, F., Carmona, C.J., González, P., del Jesus, M.J.: An overview on subgroup discovery: foundations and applications. Knowl. Inf. Syst. 29(3), 495–525 (2011)
7. Le Van, T., van Leeuwen, M., Nijssen, S., Fierro, A.C., Marchal, K., De Raedt, L.: Ranked tiling. In: Calders, T., Esposito, F., Hüllermeier, E., Meo, R. (eds.) ECML PKDD 2014. LNCS (LNAI), vol. 8725, pp. 98–113. Springer, Heidelberg (2014). https://doi.org/10.1007/978-3-662-44851-9_7
8. López-Ibánez, M., Stützle, T.: Automatically improving the anytime behaviour of optimisation algorithms. Eur. J. Oper. Res. 235(3), 569–582 (2014)
9. Madeira, S.C., Oliveira, A.L.: Biclustering algorithms for biological data analysis: a survey. IEEE/ACM Trans. Comput. Biol. Bioinform. (TCBB) 1(1), 24–45 (2004)
10. Nemhauser, G.L., Wolsey, L.A.: Integer programming and combinatorial optimization. Wiley, Chichester (1988). Nemhauser, G.L., Savelsbergh, M.W.P., Sigismondi, G.S.: Constraint classification for mixed integer programming formulations. COAL Bull. 20, 8–12 (1992)
11. OscaR Team: OscaR: Scala in OR (2012). https://bitbucket.org/oscarlib/oscar
12. Parker, J.S., Mullins, M., Cheang, M.C., Leung, S., Voduc, D., Vickery, T., Davies, S., Fauron, C., He, X., Hu, Z., et al.: Supervised risk predictor of breast cancer based on intrinsic subtypes. J. Clin. Oncol. 27(8), 1160–1167 (2009)
13. Perou, C.M., Sørlie, T., Eisen, M.B., van de Rijn, M., Jeffrey, S.S., Rees, C.A., Pollack, J.R., Ross, D.T., Johnsen, H., Akslen, L.A., et al.: Molecular portraits of human breast tumours. Nature 406(6797), 747–752 (2000)
14. Pio, G., Ceci, M., D’Elia, D., Loglisci, C., Malerba, D.: A novel biclustering algorithm for the discovery of meaningful biological correlations between microRNAs and their target genes. BMC Bioinform. 14(7), S8 (2013)
15. Pio, G., Ceci, M., Malerba, D., D’Elia, D.: ComiRNet: a web-based system for the analysis of miRNA-gene regulatory networks. BMC Bioinform. 16(9), S7 (2015)
16. Pontes, B., Giráldez, R., Aguilar-Ruiz, J.S.: Biclustering on expression data: a review. J. Biomed. Inform. 57, 163–180 (2015)
17. de Saint-Marcq, V.l.C., Schaus, P., Solnon, C., Lecoutre, C.: Sparse-sets for domain implementation. In: CP Workshop on Techniques foR Implementing Constraint programming Systems (TRICS), pp. 1–10 (2013)
18. Takaoka, T.: Efficient algorithms for the maximum subarray problem by distance matrix multiplication. Electron. Not. Theoret. Comput. Sci. 61, 191–200 (2002)
19. Tamaki, H., Tokuyama, T.: Algorithms for the maximum subarray problem based on matrix multiplication. In: SODA 1998, pp. 446–452 (1998)
20. Yang, J., Wang, H., Wang, W., Yu, P.: Enhanced biclustering on expression data. In: Proceedings of the Third IEEE Symposium on Bioinformatics and Bioengineering, pp. 321–327. IEEE (2003)

Author information

Authors and Affiliations

ICTEAM/INGI, Machine Learning Group, Université catholique de Louvain, Place Sainte Barbe 2, 1348, Louvain-la-Neuve, Belgium
Vincent Branders, Pierre Schaus & Pierre Dupont

Corresponding author

Correspondence to Vincent Branders.

Editor information

Editors and Affiliations

University of Bari Aldo Moro, Bari, Italy
Annalisa Appice
University of Bari Aldo Moro, Bari, Italy
Corrado Loglisci
CNR, Rende, Italy
Giuseppe Manco
CNR, Rende, Italy
Elio Masciari
University of North Carolina, Charlotte, North Carolina, USA
Zbigniew W. Ras

About this paper

Cite this paper

Branders, V., Schaus, P., Dupont, P. (2018). Combinatorial Optimization Algorithms to Mine a Sub-Matrix of Maximal Sum. In: Appice, A., Loglisci, C., Manco, G., Masciari, E., Ras, Z. (eds) New Frontiers in Mining Complex Patterns. NFMCP 2017. Lecture Notes in Computer Science, vol 10785. Springer, Cham. https://doi.org/10.1007/978-3-319-78680-3_5

DOI: https://doi.org/10.1007/978-3-319-78680-3_5
Published: 24 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-78679-7
Online ISBN: 978-3-319-78680-3
eBook Packages: Computer Science, Computer Science (R0)
