Orthogonal joint sparse NMF for microarray data analysis

Journal of Mathematical Biology

Abstract

3D microarrays, generally known as gene-sample-time microarrays, combine the information collected at different time points by 2D microarrays that measure gene expression levels across different samples. Their analysis is useful in several biomedical applications, such as monitoring patients' responses to a dose or drug treatment over time in pharmacogenomics studies. Many statistical and data analysis tools have been used to extract useful information from such data. In particular, nonnegative matrix factorization (NMF), with its natural nonnegativity constraints, has demonstrated its ability to extract from 2D microarrays relevant information on the specific genes involved in a particular biological process. In this paper, we propose a new NMF model, namely Orthogonal Joint Sparse NMF, to extract relevant information from 3D microarrays containing the time evolution of a 2D microarray, by adding constraints that enforce important biological properties useful for further biological analysis. We develop multiplicative update rules that decrease the objective function monotonically, and compare our approach to state-of-the-art NMF algorithms on both synthetic and real data sets.
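To illustrate what multiplicative update rules with a monotonically decreasing objective look like, the following is a minimal sketch of the classical multiplicative updates for plain NMF with the squared Frobenius-norm objective. It is not the Orthogonal Joint Sparse NMF algorithm, whose update rules are not given in this abstract. The function names (`nmf_mu`, `frob_error`), the random initialization, the iteration count, and the small constant `eps` are all assumptions made for illustration, and plain Python lists are used only to keep the sketch dependency-free.

```python
import random

def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    Bt = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def frob_error(X, W, H):
    """Squared Frobenius norm of the residual X - WH."""
    WH = matmul(W, H)
    return sum((x - y) ** 2 for rx, ry in zip(X, WH) for x, y in zip(rx, ry))

def nmf_mu(X, rank, n_iter=200, eps=1e-12, seed=0):
    """Plain NMF (X ~ WH, W >= 0, H >= 0) via classical multiplicative updates.

    Illustrative only: this is standard NMF, not OJSNMF.
    """
    rng = random.Random(seed)
    m, n = len(X), len(X[0])
    # Strictly positive random initialization (an assumption of this sketch).
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    errors = [frob_error(X, W, H)]
    for _ in range(n_iter):
        # H <- H * (W^T X) / (W^T W H), elementwise
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * a / (b + eps) for h, a, b in zip(rh, ra, rb)]
             for rh, ra, rb in zip(H, num, den)]
        # W <- W * (X H^T) / (W H H^T), elementwise
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (b + eps) for w, a, b in zip(rw, ra, rb)]
             for rw, ra, rb in zip(W, num, den)]
        errors.append(frob_error(X, W, H))
    return W, H, errors

# Toy nonnegative "genes x samples" matrix (made-up values).
X = [[1.0, 2.0, 0.0],
     [2.0, 4.0, 1.0],
     [0.0, 1.0, 3.0],
     [1.0, 3.0, 3.0]]
W, H, errors = nmf_mu(X, rank=2)
print("first error:", errors[0], "last error:", errors[-1])
```

Because each update multiplies the current factor by a ratio of nonnegative quantities, nonnegativity is preserved without any projection step, and the recorded `errors` sequence should be non-increasing up to floating-point rounding.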

Notes

  1. A monomial matrix has exactly one non-zero element in each row and each column.

  2. The data set is available online as supplementary material at https://doi.org/10.1371/journal.pbio.0030002.sd001.

Acknowledgements

This work has been supported in part by the GNCS (Gruppo Nazionale per il Calcolo Scientifico) of Istituto Nazionale di Alta Matematica Francesco Severi, P.le Aldo Moro, Roma, Italy. NG acknowledges the support of the European Research Council (ERC starting Grant No. 679515). We thank Angelina Boccarelli (Department of Biomedical Science and Human Oncology, Medical School, University of Bari, Italy) for her useful biological support during the construction of the model and her valuable suggestions for the biological interpretation. We also thank Jérémy Cohen (IRISA, Rennes) for his suggestion to compare OJSNMF with NMF applied to the unfolded tensor.

Author information

Corresponding author

Correspondence to Flavia Esposito.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Esposito, F., Gillis, N. & Del Buono, N. Orthogonal joint sparse NMF for microarray data analysis. J. Math. Biol. 79, 223–247 (2019). https://doi.org/10.1007/s00285-019-01355-2

  • DOI: https://doi.org/10.1007/s00285-019-01355-2
