Abstract
The 3D microarrays, generally known as gene-sample-time microarrays, couple the information on different time points collected by 2D microarrays that measure gene expression levels among different samples. Their analysis is useful in several biomedical applications, like monitoring dose or drug treatment responses of patients over time in pharmacogenomics studies. Many statistical and data analysis tools have been used to extract useful information. In particular, nonnegative matrix factorization (NMF), with its natural nonnegativity constraints, has demonstrated its ability to extract from 2D microarrays relevant information on specific genes involved in the particular biological process. In this paper, we propose a new NMF model, namely Orthogonal Joint Sparse NMF, to extract relevant information from 3D microarrays containing the time evolution of a 2D microarray, by adding additional constraints to enforce important biological proprieties useful for further biological analysis. We develop multiplicative updates rules that decrease the objective function monotonically, and compare our approach to state-of-the-art NMF algorithms on both synthetic and real data sets.
Similar content being viewed by others
Notes
A monomial matrix has exactly one non-zero element in each row
The data set is available online as a supplementary material of https://doi.org/10.1371/journal.pbio.0030002.sd001.
References
Alter O, Brown PO, Botstein D (2000) Singular value decomposition for genome-wide expression data processing and modeling. Proc Natl Acad Sci 97(18):10101–10106
Baranzini SE, Mousavi P, Rio J, Caillier SJ, Stillman A, Villoslada P, Wyatt MM, Comabella M, Greller LD, Somogyi R et al (2004) Transcription-based prediction of response to IFN\(\beta \) using supervised computational methods. Plos Biol 3(1):e2
Boccarelli A, Esposito F, Coluccia M, Frassanito MA, Vacca A, Del Buono N (2018) Improving knowledge on the activation of bone marrow fibroblasts in mgus and mm disease through the automatic extraction of genes via a nonnegative matrix factorization approach on gene expression profiles. J Transl Med 16(1):217
Boivin N, Baillargeon J, Doss PMIA, Roy AP, Rangachari M (2015) Interferon-\(\beta \) suppresses murine th1 cell function in the absence of antigen-presenting cells. PLOS ONE 10(4):1–17
Borgwardt KM, Vishwanathan S, Kriegel HP (2006) Class prediction from time series gene expression profiles using dynamical systems kernels. Biocomputing. World Scientific, Singapore, pp 547–558
Boutsidis C, Gallopoulos E (2008) SVD based initialization: a head start for nonnegative matrix factorization. Pattern Recognit 41(4):1350–1362
Boven L, Montagne L, Nottet H, De Groot C (2000) Macrophage inflammatory protein-1\(\alpha \) (MIP-1\(\alpha \)), MIP-1\(\beta \), and RANTES mRNA semiquantification and protein expression in active demyelinating multiple sclerosis (MS) lesions. Clin Exp Immunol 122(2):257–263
Brunet JP, Tamayo P, Golub TR, Mesirov JP (2004) Metagenes and molecular pattern discovery using matrix factorization. Proc Natl Acad Sci 101(12):4164–4169
Carmona-Saez P, Pascual-Marqui RD, Tirado F, Carazo JM, Pascual-Montano A (2006) Biclustering of gene expression data by non-smooth non-negative matrix factorization. BMC Bioinform 7(1):1
Casalino G, Del Buono N, Mencar C (2014) Subtractive clustering for seeding non-negative matrix factorizations. Inf Sci 257:369–387
Cheung VC, Devarajan K, Severini G, Turolla A, and Bonato P (2015) Decomposing time series data by a non-negative matrix factorization algorithm with temporally constrained coefficients. In 2015 37th annual international conference of the IEEE on engineering in medicine and biology society (EMBC), pp 3496–3499
Cichocki A, Zdunek R, Phan AH, Amari SI (2009) Nonnegative matrix and tensor factorizations: applications to exploratory multi-way data analysis and blind source separation. Wiley, New York
Crescenzi M, Giuliani A (2001) The main biological determinants of tumor line taxonomy elucidated by a principal component analysis of microarray data. FEBS Lett 507(1):114–118
Dai JJ, Lieu L, Rocke D (2006) Dimension reduction for classification with gene expression microarray data. Stat Appl Genet Mol Biol 5(1):1–21
Del Buono N, Esposito F, Fumarola F, Boccarelli A, Coluccia M (2016) Breast cancer’s microarray data: pattern discovery using nonnegative matrix factorizations. Machine learning, optimization, and big data. Springer, Berlin, pp 281–292
Dhillon IS and Sra S (2005) Generalized nonnegative matrix approximations with Bregman divergences. In NIPS, vol 18
Ding C, He X, and Simon H (2005) On the equivalence of nonnegative matrix factorization and spectral clustering. In Proceedings of the 2005 SIAM international conference on data mining, pp 606–610. SIAM
Du Mg, Zhang SW, and Wang H (2009) Tumor classification using high-order gene expression profiles based on multilinear ICA. Adv Bioinform. https://doi.org/10.1155/2009/926450
Esposito F, Del Buono N (2017) Exploring hidden information in sparse NMF. Technical Report 8, University of Bari, Department of Mathematics
Farias RC, Cohen JE, Comon P (2016) Exploring multimodal data fusion through joint decompositions with flexible couplings. IEEE Trans Signal Process 64(18):4830–4844
Gade-Andavolu R, Comings DE, MacMurray J, Vuthoori RK, Tourtellotte WW, Nagra RM, Cone LA (2004) RANTES: a genetic risk marker for multiple sclerosis. Mult Scler J 10(5):536–539
Gillis N (2012) Sparse and Unique nonnegative matrix factorization through data preprocessing. J Mach Learn Res 13:3349–3386
Gillis N, Glineur F (2012) Accelerated multiplicative updates and hierarchical ALS algorithms for nonnegative matrix factorization. Neural Comput 24(4):1085–1105
Glaab E, Garibaldi JM, Krasnogor N (2011) Integrative analysis of large-scale biological data sets. Nat Precedings. https://doi.org/10.1038/npre.2011.5598.1
He Z, Xie S, Zdunek R, Zhou G, Cichocki A (2011) Symmetric nonnegative matrix factorization: algorithms and applications to probabilistic clustering. IEEE Trans Neural Netw 22(12):2117–2131
Hoyer PO (2004) Non-negative Matrix factorization with sparseness constraints. J Mach Learn Res 457–1469
Huang YM, Hussien Y, Jin YP, Söderstrom M, Link H (2001) Multiple sclerosis: deficient in vitro responses of blood mononuclear cells to IFN-\(\beta \). Acta Neurol Scand 104(5):249–256
Hutchins LN, Murphy SM, Singh P, Graber JH (2008) Position-dependent motif characterization using non-negative matrix factorization. Bioinformatics 24:2684–2690
Kim H, Park H (2007a) Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics 23(12):1495–1502
Kim H, Park H (2007b) Sparse non-negative matrix factorizations via alternating non-negativity-constrained least squares for microarray data analysis. Bioinformatics 23(12):1495–1502
Kim PM, Tidor B (2003) Subsystem identification through dimensionality reduction of large-scale gene expression data. Genome Res 13(7):1706–1718
Kong W, Mou X, Hu X (2011) Exploring matrix factorization techniques for significant genes identification of Alzheimer’s disease microarray gene expression data. BioMed Cent BMC Bioinform 12:S7
Kong W, Vanderburg CR, Gunshin H, Rogers JT, Huang X (2008) A review of independent component analysis application to microarray gene expression data. BioTechniques 45(5):501–520
Kouskoumvekaki I, Shublaq N, Brunak S (2013) Facilitating the use of large-scale biological data and tools in the era of translational bioinformatics. Brief Bioinform 15(6):942–952
Lee DD and Seung HS (2000) Algorithms for non-negative matrix factorization. In Proceedings of the advances in neural information processing systems conference, vol 3, pp 556–562. MIT Press
Li Y and Ngom A (2010) Non-negative matrix and tensor factorization based classification of clinical microarray gene expression data. In 2010 IEEE international conference on bioinformatics and biomedicine (BIBM), pp 438–443. IEEE
Li Y and Ngom A (2011) Classification of clinical gene-sample-time microarray expression data via tensor decomposition methods. In: Rizzo R, Lisboa PJG (eds) Computational intelligence methods for bioinformatics and biostatistics. Springer, Berlin, pp 275–286
Li Z, Wu X, Peng H (2010) Nonnegative matrix factorization on orthogonal subspace. Pattern Recognit Lett 31(9):905–911
Liao JC, Boscolo R, Yang YL, Tran LM, Sabatti C, Roychowdhury VP (2003) Network component analysis: reconstruction of regulatory signals in biological systems. Proc Natl Acad Sci 100(26):15522–15527
Liu W, Yuan K, Ye D (2008) Reducing microarray data via nonnegative matrix factorization for visualization and clustering analysis. J Biomed Inform 41(4):602–606
Liu W, Zheng N, and Lu X (2003) Non-negative matrix factorization for visual coding. In Proceedings of 2003 IEEE international conference on acoustics, speech, and signal processing, 2003 (ICASSP’03), vol 3, pp 3–293. IEEE
Mairal J, Bach F, and Ponce J (2014) Sparse Modeling for Image and Vision Processing. arXiv preprint arXiv:1411.3230
Marckmann S, Wiesemann E, Hilse R, Trebst C, Stangel M, Windhagen A (2004) Interferon-\(\beta \) up-regulates the expression of co-stimulatory molecules CD80, CD86 and CD40 on monocytes: significance for treatment of multiple sclerosis. Clin Exp Immunol 138(3):499–506
Moschetta M, Basile A, Ferrucci A, Frassanito MA, Rao L, Ria R, Solimando AG, Giuliani N, Angelina B, Fumarola F, Coluccia M, Rossini B, Ruggieri S, Nico B, Maiorano E, Ribatti D, Roccaro AM, Vacca A (2013) Novel targeting of phospho-cMET overcomes drug resistance and induces antitumor activity in multiplle myeloma. Clin Cancer Res 19(16):4371–82
Nikulin V and Huang TH (2012) Unsupervised dimensionality reduction via gradient-based matrix factorization with two adaptive learning rates. In Proceedings of ICML workshop on unsupervised and transfer learning, pp. 181–194
Omberg L, Golub GH, Alter O (2007) A tensor higher-order singular value decomposition for integrative analysis of DNA microarray data from different studies. Proc Natl Acad Sci 104(47):18371–18376
Pompili F, Gillis N, Absil PA, Glineur F (2014) Two algorithms for orthogonal nonnegative matrix factorization with application to clustering. Neurocomputing 141:15–25
Racke MK, Yang Y, Lovett-Racke AE (2014) Is T-bet a potential therapeutic target in multiple sclerosis? J Interferon Cytokine Res 34(8):623–632
Takahashi N, Hibi R (2014) Global convergence of modified multiplicative updates for nonnegative matrix factorization. Comput Optim Appl 57(2):417–440
Vandenbroeck K, Alloza I, Swaminathan B, Antigüedad A, Otaegui D, Olascoaga J, Barcina MG, De Las Heras V, Bartolomé M, Fernández-Arquero M et al (2011) Validation of IRF5 as multiple sclerosis risk gene: putative role in interferon beta therapy and human herpes virus-6 infection. Genes Immun 12(1):40
Veganzones MA, Cohen JE, Farias RC, Chanussot J, Comon P (2016) Nonnegative tensor cp decomposition of hyperspectral data. IEEE Trans Geosci Remote Sens 54(5):2577–2588
Wall ME, Rechtsteiner A, and Rocha LM (2003) Singular value decomposition and principal component analysis. In: Berrar DP, Dubitzky W, Granzow M (eds) A practical approach to microarray data analysis. Springer, Berlin, pp 91–109
Wiesemann E, Deb M, Trebst C, Hemmer B, Stangel M, Windhagen A (2008) Effects of interferon-\(\beta \) on co-signaling molecules: upregulation of CD40, CD86 and PD-l2 on monocytes in relation to clinical response to interferon-\(\beta \) treatment in patients with multiple sclerosis. Multiple Scler J 14(2):166–176
Yang Z, Michailidis G (2015) A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data. Bioinformatics 32(1):1–8
Zhang A (2006) Advanced analysis of gene expression microarray data, vol 1. World Scientific, Singapore
Acknowledgements
This work has been supported in part by the GNCS (Gruppo Nazionale per il Calcolo Scientifico) of Istituto Nazionale di Alta Matematica Francesco Severi, P.le Aldo Moro, Roma, Italy. NG acknowledges the support of the European Research Council (ERC starting Grant No. 679515). We thank Angelina Boccarelli (Department of Biomedical Science and Human Oncology, Medical School, University of Bari, Italy) for her useful biological support during the construction of the model and her precious suggestions for the biological interpretation. We also thank Jérémy Cohen (IRISA, Rennes) for his suggestion to compare OJSNMF with NMF applied on the unfolded tensor.
Author information
Authors and Affiliations
Corresponding author
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Esposito, F., Gillis, N. & Del Buono, N. Orthogonal joint sparse NMF for microarray data analysis. J. Math. Biol. 79, 223–247 (2019). https://doi.org/10.1007/s00285-019-01355-2
Received:
Revised:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s00285-019-01355-2