
Non-intrusive Sparse Subspace Learning for Parametrized Problems


We discuss the use of hierarchical collocation to approximate the numerical solution of parametric models. With respect to traditional projection-based reduced order modeling, the use of collocation enables a non-intrusive approach based on sparse adaptive sampling of the parametric space. This allows one to recover the low-dimensional structure of the parametric solution subspace while also learning the functional dependency on the parameters in explicit form. A sparse low-rank approximate tensor representation of the parametric solution can be built through an incremental strategy that only needs access to the output of a deterministic solver. Non-intrusiveness makes this approach straightforwardly applicable to challenging problems characterized by nonlinearity or non-affine weak forms. As we show in the various examples presented in the paper, the method can be interfaced with no particular effort to existing third-party simulation software, making the proposed approach particularly appealing and well suited to practical engineering problems of industrial interest.




References

1. Zorriassatine F, Wykes C, Parkin R, Gindy N (2003) A survey of virtual prototyping techniques for mechanical product development. Proc Inst Mech Eng Part B J Eng Manuf 217(4):513–530
2. Oden JT, Belytschko T, Fish J, Hughes TJR, Johnson C, Keyes D, Laub A, Petzold L, Srolovitz D, Yip S (2006) Simulation-based engineering science: revolutionizing engineering science through simulation. Report of the NSF Blue Ribbon Panel on Simulation-Based Engineering Science. National Science Foundation, Arlington
3. Glotzer SC, Kim S, Cummings PT, Deshmukh A, Head-Gordon M, Karniadakis G, Petzold L, Sagui C, Shinozuka M (2009) International assessment of research and development in simulation-based engineering and science. Panel report. World Technology Evaluation Center Inc, Baltimore
4. Bellman RE (2003) Dynamic programming. Courier Dover Publications, New York (republished edition)
5. Montgomery DC (2013) Design and analysis of experiments, 8th edn. Wiley, Hoboken
6. Chong EKP, Zak SH (2013) An introduction to optimization, 4th edn. Wiley Series on Discrete Mathematics and Optimization. Wiley, Hoboken
7. Antoulas A, Sorensen DC, Gugercin S (2001) A survey of model reduction methods for large-scale systems. Contemp Math 280:193–220
8. Bialecki RA, Kassab AJ, Fic A (2005) Proper orthogonal decomposition and modal analysis for acceleration of transient FEM thermal analysis. Int J Numer Method Eng 62(6):774–797
9. Quarteroni A, Manzoni A, Negri E (2015) Reduced basis methods for partial differential equations: an introduction. Modeling and Simulation in Science, Engineering and Technology, 1st edn. Springer, Basel
10. Huynh DBP, Rozza G, Sen S, Patera AT (2007) A successive constraint linear optimization method for lower bounds of parametric coercivity and inf-sup stability constants. C R Math 345(8):473–478
11. Daversin C, Prud'homme C (2015) Simultaneous empirical interpolation and reduced basis method for non-linear problems. C R Math 353(12):1105–1109
12. Barrault M, Maday Y, Nguyen NC, Patera AT (2004) An "empirical interpolation method": application to efficient reduced-basis discretization of partial differential equations. C R Math 339(9):667–672
13. Farhat C, Chapman T, Avery P (2015) Structure-preserving, stability, and accuracy properties of the energy-conserving sampling and weighting method for the hyper reduction of nonlinear finite element dynamic models. Int J Numer Method Eng 102:1077–1110
14. Chaturantabut S, Sorensen DC (2010) Nonlinear model order reduction via discrete empirical interpolation. SIAM J Sci Comput 32(5):2737–2764
15. Grepl MA, Maday Y, Nguyen NC, Patera AT (2007) Efficient reduced-basis treatment of non-affine and nonlinear partial differential equations. ESAIM Math Model Numer Anal 41(3):575–605
16. Maday Y, Nguyen NC, Patera AT, Pau SH (2009) A general multipurpose interpolation procedure: the magic points. Commun Pure Appl Anal 8(1):383–404
17. Ryckelynck D (2005) A priori hyperreduction method: an adaptive approach. J Comput Phys 202(1):346–366
18. Amsallem D, Farhat C (2011) An online method for interpolating linear parametric reduced-order models. SIAM J Sci Comput 33(5):2169–2198
19. Hernández J, Caicedo MA, Ferrer A (2017) Dimensional hyper-reduction of nonlinear finite element models via empirical cubature. Comput Method Appl Mech Eng 313:687–722
20. Rapún ML, Terragni F, Vega JM (2017) LUPOD: collocation in POD via LU decomposition. J Comput Phys 335:1–20
21. Kumar D, Raisee M, Lacor C (2016) An efficient non-intrusive reduced basis model for high dimensional stochastic problems in CFD. Comput Fluids 138:67–82
22. Prulière E, Chinesta F, Ammar A (2010) On the deterministic solution of multidimensional parametric models using the proper generalized decomposition. Math Comput Simul 81(4):791–810
23. Hackbusch W (2012) Tensor spaces and numerical tensor calculus, 1st edn. Springer Series in Computational Mathematics. Springer, Berlin
24. Grasedyck L, Kressner D, Tobler C (2013) A literature survey of low-rank tensor approximation techniques. GAMM-Mitt 36:53–78. arXiv:1302.7121
25. Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3):455–500
26. Ammar A, Mokdad B, Chinesta F, Keunings R (2007) A new family of solvers for some classes of multidimensional partial differential equations encountered in kinetic theory modeling of complex fluids. Part II: transient simulation using space-time separated representations. J Non-Newtonian Fluid Mech 144(2–3):98–121
27. Nouy A (2010) A priori model reduction through proper generalized decomposition for solving time-dependent partial differential equations. Comput Method Appl Mech Eng 199:1603–1626
28. Chinesta F, Leygue A, Bordeu F, Aguado JV, Cueto E, Gonzalez D, Alfaro I, Ammar A, Huerta A (2013) PGD-based computational vademecum for efficient design, optimization and control. Arch Comput Methods Eng 20(1):31–59
29. Chinesta F, Ladevèze P, Cueto E (2011) A short review on model order reduction based on proper generalized decomposition. Arch Comput Methods Eng 18(4):395–404
30. Ammar A, Chinesta F, Díez P, Huerta A (2010) An error estimator for separated representations of highly multidimensional models. Comput Method Appl Mech Eng 199(25–28):1872–1880
31. Uschmajew A (2012) Local convergence of the alternating least squares algorithm for canonical tensor approximation. SIAM J Matrix Anal Appl 33(2):639–652
32. Quesada C, Alfaro I, Gonzalez D, Cueto E, Chinesta F (2014) PGD-based model reduction for surgery simulation: solid dynamics and contact detection. Lect Notes Comput Sci 8789:193–202
33. Aguado JV, Borzacchiello D, Ghnatios C, Lebel F, Upadhyay R, Binetruy C, Chinesta F (2017) A simulation app based on reduced order modeling for manufacturing optimization of composite outlet guide vanes. Adv Model Simul Eng Sci 4(1):1
34. Borzacchiello D, Aguado JV, Chinesta F (2016) Reduced order modelling for efficient numerical optimisation of a hot-wall chemical vapour deposition reactor. Int J Numer Method Heat Fluid Flow 27(4). doi:10.1108/HFF-04-2016-0153
35. Ghnatios C, Masson F, Huerta A, Leygue A, Cueto E, Chinesta F (2012) Proper generalized decomposition based dynamic data-driven control of thermal processes. Comput Method Appl Mech Eng 213–216:29–41
36. Cohen A, DeVore R (2015) Approximation of high-dimensional parametric PDEs. Acta Numer 24:1–159
37. Bachmayr M, Cohen A, Dahmen W (2016) Parametric PDEs: sparse or low-rank approximations? arXiv:1607.04444
38. Boyd JP (2001) Chebyshev and Fourier spectral methods. Courier Corporation, Ann Arbor
39. Candès E, Romberg J (2007) Sparsity and incoherence in compressive sampling. Inverse Probl 23(3):969
40. Gilbert A, Indyk P (2010) Sparse recovery using sparse matrices. Proc IEEE 98(6):937–947
41. Donoho DL (2006) Compressed sensing. IEEE Trans Inform Theor 52(4):1289–1306
42. Chen SS, Donoho DL, Saunders MA (2001) Atomic decomposition by basis pursuit. SIAM Rev 43(1):129–159
43. Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Series B Methodol 58:267–288
44. Zou H, Hastie T (2005) Regularization and variable selection via the elastic net. J R Stat Soc Series B Stat Methodol 67(2):301–320
45. Brunton SL, Proctor JL, Kutz JN (2016) Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proc Natl Acad Sci USA 113(15):3932–3937
46. Bungartz HJ, Griebel M (2004) Sparse grids. Acta Numer 13:147–269
47. Nobile F, Tempone R, Webster CG (2008) A sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J Numer Anal 46(5):2309–2345
48. Pflüger D, Peherstorfer B, Bungartz HJ (2010) Spatially adaptive sparse grids for high-dimensional data-driven problems. J Complex 26(5):508–522
49. Gerstner T, Griebel M (2003) Dimension-adaptive tensor-product quadrature. Computing 71(1):65–87
50. Nobile F, Tempone R, Webster CG (2008) An anisotropic sparse grid stochastic collocation method for partial differential equations with random input data. SIAM J Numer Anal 46(5):2411–2442
51. Golub GH, Van Loan CF (1996) Matrix computations, 3rd edn. The Johns Hopkins University Press, Baltimore
52. Halko N, Martinsson PG, Tropp JA (2011) Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. SIAM Rev 53(2):217–288
53. Harshman RA (1970) Foundations of the PARAFAC procedure: models and conditions for an "explanatory" multi-modal factor analysis. UCLA Working Papers in Phonetics
54. Carroll JD, Chang J (1970) Analysis of individual differences in multidimensional scaling via an n-way generalization of "Eckart-Young" decomposition. Psychometrika 35(3):283–319
55. Li Y (2004) On incremental and robust subspace learning. Pattern Recognit 37(7):1509–1518
56. Zhao H, Yuen PC, Kwok JT (2006) A novel incremental principal component analysis and its application for face recognition. IEEE Trans Syst Man Cybern Part B 36(4):873–886
57. Sobral A, Baker CG, Bouwmans T, Zahzah E (2014) Incremental and multi-feature tensor subspace learning applied for background modeling and subtraction. In: International Conference on Image Analysis and Recognition. Springer, New York, pp 94–103
58. Brand M (2002) Incremental singular value decomposition of uncertain data with missing values. In: Computer Vision—ECCV 2002, pp 707–720
59. Quarteroni A, Rozza G (2007) Numerical solution of parametrized Navier-Stokes equations by reduced basis methods. Numer Methods Partial Differ Equ 23(4):923–948
60. Canuto C, Hussaini MY, Quarteroni A, Zang TA Jr (2012) Spectral methods in fluid dynamics. Springer, Berlin
61. De Lathauwer L, De Moor B, Vandewalle J (2000) A multilinear singular value decomposition. SIAM J Matrix Anal Appl 21(4):1253–1278
62. Smoljak SA (1963) Quadrature and interpolation formulae on tensor products of certain function classes. Dokl Akad Nauk SSSR 148(5):1042–1045 (Transl.: Soviet Math Dokl 4:240–243, 1963)
63. Gerstner T, Griebel M (1998) Numerical integration using sparse grids. Numer Algorithm 18(3):209–232
64. Dauge M, Stevenson R (2010) Sparse tensor product wavelet approximation of singular functions. SIAM J Math Anal 42(5):2203–2228
65. Garcke J (2007) A dimension adaptive sparse grid combination technique for machine learning. ANZIAM J 48:725–740
66. Dũng D, Temlyakov VN, Ullrich T (2016) Hyperbolic cross approximation. arXiv:1601.03978
67. Quarteroni A, Rozza G, Manzoni A (2011) Certified reduced basis approximation for parametrized partial differential equations and applications. J Math Indus 1(1):3
68. Bordeu E (2013) PXDMF: a file format for separated variables problems, version 1.6. Technical report, Ecole Centrale de Nantes
69. Chen P, Quarteroni A, Rozza G (2014) Comparison between reduced basis and stochastic collocation methods for elliptic problems. J Sci Comput 59(1):187–216
70. Peherstorfer B, Zimmer S, Bungartz HJ (2012) Model reduction with the reduced basis method and sparse grids. In: Sparse Grids and Applications. Springer, Berlin, pp 223–242



Acknowledgements

The authors would like to acknowledge Jean-Louis Duval, Jean-Christophe Allain and Julien Charbonneaux from the ESI Group for their support and for providing the data for the crash and stamping simulations.

Author information



Corresponding author

Correspondence to Domenico Borzacchiello.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed Consent

All the authors were informed and provided their consent.

Research Involving Human and Animal Participants

The research involves neither human participants nor animals.

Appendix: Incremental Randomized Singular Value Decomposition

Randomized Singular Value Decomposition

Suppose we are given the input data matrix \(\mathbf {S}\), which in our case contains the hierarchical surpluses, and assume that a rank-r approximation, as in Eq. (7), is to be computed. There are two main ideas behind the rsvd:

  • The first is to realize that a partial SVD can be readily computed provided that we are able to construct a low-dimensional subspace that captures most of the range of the input data matrix.

  • The second is to observe that such a low-dimensional subspace can be computed very efficiently using a random sensing method.

Let us start with the first of the aforementioned ideas. Suppose that we are given a matrix \(\mathbf {Q}_{|m\times p}\) with \(r\le p\ll n\) orthonormal columns such that the range of \(\mathbf {S}\) is well captured, i.e.:

$$\begin{aligned} \left|\left|\mathbf {S} - \mathbf {Q}\mathbf {Q}^*\mathbf {S}\right|\right|\le \varepsilon , \end{aligned}$$

for some arbitrarily small given tolerance \(\varepsilon\). Then, the input data is restricted to the subspace generated by its columns, that is: \(\mathbf {B}_{|p\times n} = \mathbf {Q}^*\mathbf {S}\). Observe at this point that we implicitly have the factorization \(\mathbf {S}\approx \mathbf {Q}\mathbf {B}\). Next, we compute an SVD factorization of the matrix \(\mathbf {B}=\widetilde{\mathbf {U}}\widetilde{\mathbf {\Sigma }}\widetilde{\mathbf {V}}^*\), where the factors are defined as \(\widetilde{\mathbf {U}}_{|p\times p}\), \(\widetilde{\mathbf {\Sigma }}_{|p\times p}\) and \(\widetilde{\mathbf {V}}_{|n\times p}\). This operation is much less expensive than performing the SVD on the initial data set because the problem is now restricted to the p rows of \(\mathbf {B}.\) Finally, in order to recover the r dominant components, we define an extractor matrix \(\mathbf {P}_{|p\times r}\) and set: \(\mathbf {U} = \mathbf {Q}\widetilde{\mathbf {U}}\mathbf {P}\), \(\mathbf {\Sigma } = \mathbf {P}^t\widetilde{\mathbf {\Sigma }}\mathbf {P}\) and \(\mathbf {V} = \widetilde{\mathbf {V}}\mathbf {P}\). In summary, given \(\mathbf {Q}\), it is straightforward to compute an SVD decomposition at a relatively low cost \(\mathcal {O}(mnp+(m+n)p^2)\).
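The restriction-and-lift step described above can be sketched in a few lines of NumPy. This is a minimal illustration under the notation of the text, not the authors' implementation; the helper name `svd_from_basis` is ours.

```python
import numpy as np

def svd_from_basis(S, Q, r):
    """Recover a rank-r SVD of S from an orthonormal basis Q of its range."""
    # Restrict the data to the subspace spanned by Q: B = Q* S  (p-by-n)
    B = Q.conj().T @ S
    # The SVD of the small matrix B is cheap compared with an SVD of S itself
    Ut, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Lift the left factor back and extract the r dominant components
    # (this extraction plays the role of the matrix P in the text)
    return Q @ Ut[:, :r], s[:r], Vt[:r, :].conj().T
```

When Q captures the range of S up to the tolerance \(\varepsilon\), the triple returned here reconstructs S to that same accuracy.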

Now we address the second question, that is, how to compute the matrix \(\mathbf {Q}\). We first draw a random Gaussian test matrix, \(\mathbf {\Omega }_{|n\times p}\). Then, we generate samples from the data matrix, i.e. \(\mathbf {Y}_{|m\times p}=\mathbf {S}\mathbf {\Omega }\). Observe that if the rank of the input data matrix were exactly r, the columns of \(\mathbf {Y}\) would form a linearly independent set spanning exactly the range of \(\mathbf {S},\) provided that we set \(p=r\). Since in general the true rank will be greater than r, we must introduce an oversampling parameter by setting \(p=r+\alpha\). This produces a matrix \(\mathbf {Y}\) whose range has a much better chance of approximating the range of the input data matrix well. Finally, \(\mathbf {Q}\) can be obtained from the orthogonalization of \(\mathbf {Y}\). In fact, it can be shown that the following error bound is satisfied

$$\begin{aligned} ||\mathbf {S} - \mathbf {Q}\mathbf {Q}^*\mathbf {S}||\le \left[ 1+9\sqrt{p\min \{ m,n\}} \right] \sigma _{r+1}, \end{aligned}$$

with probability on the order of \(1-\mathcal {O}(\alpha ^{-\alpha })\). That is, the failure probability decreases super-exponentially with the oversampling parameter [52].

Remark 1

(On the optimal decomposition) Observe that the standard SVD produces \(\mathbf {Q}_{|m\times r}\) such that

$$\begin{aligned} \left|\left|\mathbf {S} - \mathbf {Q}\mathbf {Q}^*\mathbf {S}\right|\right|= \sigma _{r+1}, \end{aligned}$$

but at a higher cost \(\mathcal {O}(mn\min \{m,n\})\).

A prototype version of the randomized SVD is given in Algorithm 4.
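Since Algorithm 4 is not reproduced here, the two ideas above (random sensing followed by restriction) can be combined into a minimal NumPy sketch. The function name, its signature and the default oversampling value are our assumptions, not the authors' code.

```python
import numpy as np

def rsvd(S, r, alpha=5, rng=None):
    """Sketch of a randomized SVD in the spirit of Algorithm 4 (cf. [52])."""
    rng = np.random.default_rng() if rng is None else rng
    n = S.shape[1]
    p = r + alpha                        # oversampled target rank
    Omega = rng.standard_normal((n, p))  # Gaussian test matrix
    Y = S @ Omega                        # random samples of the range of S
    Q, _ = np.linalg.qr(Y)               # orthonormal basis for the sampled range
    B = Q.conj().T @ S                   # restriction to the subspace, p-by-n
    Ut, s, Vt = np.linalg.svd(B, full_matrices=False)
    # Lift back and keep the r dominant components
    return Q @ Ut[:, :r], s[:r], Vt[:r, :].conj().T
```

The dominant costs are the two products `S @ Omega` and `Q.conj().T @ S`, each \(\mathcal {O}(mnp)\), matching the complexity stated below.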


Neglecting the cost of generating the Gaussian random matrix \(\mathbf {\Omega }\), the cost of generating the matrix \(\mathbf {Q}\) is on the order of \(\mathcal {O}(mnp + mp^2)\) flops. Consequently, the computational cost of the entire rsvd procedure remains \(\mathcal {O}(mnp+(m+n)p^2).\) The algorithmic performance of the rsvd can be further improved by introducing a number of refinements, at the price of slightly worsening the error bounds. In particular, the most expensive steps in the rsvd algorithm consist in forming the matrices \(\mathbf {Y}\) and \(\mathbf {B},\) which require on the order of \(\mathcal {O}(mnp)\) flops. The first can be reduced to \(\mathcal {O}(mn\log (p))\) by giving some structure to the random matrix \(\mathbf {\Omega }\), while the second can be reduced to \(\mathcal {O}((m+n)p^2)\) via row extraction techniques, which leaves a total cost of \(\mathcal {O}(mn\log (p)+(m+n)p^2)\). The interested reader can find further details on these refinements, as well as on their impact on the error bounds, in [52].

Incremental Randomized SVD

In this section we present an incremental variant of the randomized SVD algorithm discussed in Appendix “Randomized Singular Value Decomposition”. The objective is twofold: (i) to learn a subspace for the hierarchical surpluses as they are streamed from the sparse sampling procedure; and (ii) to do so at a computational cost that scales reasonably with the number of samples.

Let us assume that we want to compute a rank-r approximation of some streamed data, and that we have chosen an oversampling parameter \(\alpha\) such that \(p=r+\alpha\), as in Appendix “Randomized Singular Value Decomposition”. Let us denote by \(\mathbf {S}_{0|m\times n}\) the old data matrix, whereas \(\mathbf {S}_{|m\times n^\prime }\) contains the new data columns, so that the total data is now \(\mathbf {S}_{1|m\times (n+n^\prime )} = [\mathbf {S}_0 \,|\, \mathbf {S} ]\). We would like to compute an approximate SVD decomposition \(\mathbf {S}_1\approx \mathbf {U}_1\mathbf {\Sigma }_1\mathbf {V}_1^*\) at a cost that is roughly independent of n, the number of columns of the old data. For the sake of completeness, recall that \(\mathbf {U}_{1|m\times p}\), \(\mathbf {\Sigma }_{1|p\times p}\) and \(\mathbf {V}_{1|(n+n^\prime )\times p}\).

In order to do so, suppose that we are given a non-truncated SVD approximation of the old data, i.e. \(\mathbf {S}_0\approx \mathbf {U}_0\mathbf {\Sigma }_0\mathbf {V}_0^*,\) with \(\mathbf {U}_{0|m\times p}\), \(\mathbf {\Sigma }_{0|p\times p}\) and \(\mathbf {V}_{0|n\times p}\). Suppose that we also have at our disposal the matrix of random samples \(\mathbf {Y}_{0|m\times p}\). Then, in order to account for the new data we only need to generate a random Gaussian test matrix \(\mathbf {\Omega }_{|n^\prime \times p}\) and perform a small product that only involves the new data:

$$\begin{aligned} \mathbf {Y}_1 = \mathbf {Y}_0 + \mathbf {S}\mathbf {\Omega }. \end{aligned}$$

The matrix \(\mathbf {Q}_{1|m\times p}\) can be obtained from the orthogonalization of \(\mathbf {Y}_1\) at a cost that remains stable, as it depends on neither n nor \(n^\prime\). Next, the input data has to be restricted to the range of \(\mathbf {Q}_1\). Recalling that we already have a non-truncated SVD approximation of the old data:

$$\begin{aligned} \mathbf {B}_1 \approx \mathbf {Q}_1^* \left[ \begin{array}{c|c} \mathbf {U}_0 \mathbf {\Sigma }_0 \mathbf {V}_0^*&\mathbf {S} \end{array} \right] = \underbrace{ \left[ \begin{array}{c|c} \mathbf {Q}_1^*\mathbf {U}_0\mathbf {\Sigma }_0&\mathbf {Q}_{1}^* \mathbf {S} \end{array} \right] }_{\widetilde{\mathbf {B}}} \left[ \begin{array}{c|c} \mathbf {V}_{0}^* &{} \mathbf {0} \\ \hline \mathbf {0} &{} \mathbf {I}_{n^\prime \times n^\prime } \\ \end{array} \right] , \end{aligned}$$

where \(\mathbf {I}_{n^\prime \times n^\prime }\) is the identity matrix of size \(n^\prime\). Similarly to Appendix “Randomized Singular Value Decomposition”, observe that Eq. (32) yields a factorization \(\mathbf {S}_1\approx \mathbf {Q}_1\widetilde{\mathbf {B}}\). Hence, if we compute an SVD decomposition of the factor \(\widetilde{\mathbf {B}}\),

$$\begin{aligned} \widetilde{\mathbf{B}} = \widetilde{\mathbf{U}}\mathbf {\Sigma }_1\widetilde{\mathbf{V}}^*, \quad \text {with}\, \widetilde{\mathbf{U}}_{|p\times p},\mathbf {\Sigma }_{1|p\times p}\,\text {and}\,\widetilde{\mathbf{V}}_{|(p+n^\prime )\times p}, \end{aligned}$$

we can conclude the algorithm by setting:

$$\begin{aligned} \mathbf {U}_1 = \mathbf {Q}_1\widetilde{\mathbf {U}} \quad \text {and} \quad \mathbf {V}_1 = \left[ \begin{array}{c|c} \mathbf {V}_{0} &{} \mathbf {0} \\ \hline \mathbf {0} &{} \mathbf {I}_{n^\prime \times n^\prime } \\ \end{array} \right] \widetilde{\mathbf {V}}. \end{aligned}$$

A prototype version of the incremental randomized SVD is given in Algorithm 5.
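For illustration, one update step of the incremental procedure can be sketched in NumPy as follows. This is a hedged sketch in the spirit of Algorithm 5 described above; the function name, its signature and the use of dense matrices are our assumptions, not the authors' code.

```python
import numpy as np

def irsvd_update(U0, s0, V0, Y0, S_new, rng=None):
    """One step of the incremental randomized SVD: fold the new columns
    S_new (m-by-n') into the non-truncated SVD (U0, s0, V0) of the old data,
    reusing the old range samples Y0 (m-by-p)."""
    rng = np.random.default_rng() if rng is None else rng
    p = U0.shape[1]
    n, nprime = V0.shape[0], S_new.shape[1]
    # Update the range samples with a product involving only the new data
    Omega = rng.standard_normal((nprime, p))
    Y1 = Y0 + S_new @ Omega
    Q1, _ = np.linalg.qr(Y1)
    # Restricted factor B~ = [Q1* U0 Sigma0 | Q1* S], of size p-by-(p+n')
    Bt = np.hstack([Q1.conj().T @ U0 @ np.diag(s0), Q1.conj().T @ S_new])
    Ut, s1, Vt = np.linalg.svd(Bt, full_matrices=False)
    # Lift the left factor and update the right singular vectors
    U1 = Q1 @ Ut
    W = np.block([[V0, np.zeros((n, nprime))],
                  [np.zeros((nprime, p)), np.eye(nprime)]])
    V1 = W @ Vt.conj().T
    return U1, s1, V1, Y1
```

Only the last product, updating the right singular vectors, depends on n; every other step involves matrices of size m-by-p or smaller, consistent with the cost analysis below.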


Observe that the cost of the irsvd algorithm is driven by \(\mathcal {O}((m+n)p^2)\) when choosing \(n^\prime \sim p\), while if one chooses \(n^\prime \gg p\), the cost matches that of the standard rsvd, that is, \(\mathcal {O}(mn^\prime p)\). A more detailed analysis of the flop count indicates that, in fact, the only dependence of the algorithm on n is due to the cost of updating the right singular vectors in Eq. (34). On the other hand, the reader should keep in mind that, for the applications targeted in this paper, the number of rows of the input dataset (degrees of freedom after discretization of a PDE) is at least one or two orders of magnitude bigger than the number of columns (solution snapshots). As a consequence, the cost of the irsvd turns out to be roughly independent of n. A final consideration that should not be neglected is that, for data sets that do not fit in core memory, the cost of transferring data from slow memory dominates the cost of the arithmetic. This can generally be avoided with the incremental algorithm presented in this section.

A Numerical Example: Order Reduction Applied to the Lid-driven Cavity Problem

In this section, we provide numerical evidence on the performance of the irsvd, as described in Sect. 1. In particular, we apply the irsvd to the set of hierarchical surpluses, \(\mathbf {S}_\text {ldc}\), coming from the solution of the lid-driven cavity problem, as described in Sect. 2.4. The size of the data matrix is \(m=14{,}082\) rows and \(n=513\) columns. A deliberately large value of the oversampling parameter is taken, \(\alpha =r\) (i.e. \(p=2\,r\)).

Firstly, we show that the low-rank singular value decomposition given by the irsvd tends to both the standard svd and the rsvd as the number of processed columns approaches the number of columns of the entire dataset. To that end, we choose a rank \(r=20\) and a fixed bandwidth \(n^\prime = 5\). Figure 12 shows the evolution of the singular values as the number of processed columns increases. It can be noticed that, in fact, after a relatively low number of columns has been processed (say 20), the singular values are already very close to the reference ones. This is simply because, when coupling the irsvd with the hierarchical sampling, the surpluses that come from higher hierarchical levels are naturally associated with the first singular vectors. On the contrary, lower hierarchical levels yield smaller surpluses, as the hierarchical sampling method converges. When the entire dataset is processed, the irsvd yields an SVD that matches the standard one, see Fig. 12f.

Fig. 12

Singular values evolution in terms of the cumulated number of columns processed by the irsvd. Comparison is made against reference results given by standard svd and rsvd, for rank \(r=20\) and bandwidth \(n^\prime =5\)

In order to further assess the convergence of the irsvd towards the standard svd decomposition, the energy error between the two decompositions is measured:

$$\begin{aligned} \varepsilon _r = \sqrt{\frac{\sum _{i=1}^{r}{\left( \sigma _i^{\texttt {irsvd}}-\sigma _i^{\texttt {svd}}\right) ^2}}{\sum _{i=1}^{r}{\left( \sigma _i^{\texttt {svd}}\right) ^2}}}, \end{aligned}$$

for a given rank r. Figure 13 shows the evolution of \(\varepsilon _r\) for several bandwidths. It can be observed that the bandwidth hardly influences the convergence results.

Fig. 13

Energy error measuring the convergence of the singular values, for fixed rank \(r=20\), as a function of the number of processed columns for different values of the bandwidth \(n^\prime\)
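The energy error defined above is straightforward to compute from the two sets of singular values. The following one-liner is a direct transcription of the formula (the helper name is ours):

```python
import numpy as np

def energy_error(s_irsvd, s_svd, r):
    """Relative energy error between irsvd and svd singular values, rank r."""
    d = s_irsvd[:r] - s_svd[:r]
    return np.sqrt(np.sum(d**2) / np.sum(s_svd[:r]**2))
```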

Next, the computational cost of the irsvd must be assessed. Figure 14a shows the runtime (denoted by \(\tau\)) of the irsvd, i.e. Algorithm 5, as a function of the bandwidth. The runtime is computed as the average of three executions. The results confirm that, as discussed in Sect. 1, the computational cost is independent of the bandwidth size. Besides, it can be observed that larger ranks yield longer runtimes. In fact, the computational complexity should depend quadratically on the rank. This quadratic scaling is confirmed by Fig. 14b, which shows the normalized rank \(\widetilde{r}=r/r_0\) (with \(r_0=10\)) against the normalized runtime \(\widetilde{\tau }=\tau /\tau _0\), where \(\tau _0\) is the runtime associated with \(r_0\). It can be seen that for all bandwidths the normalized runtime scales super-linearly with the normalized rank (linear scaling is depicted for reference).

Fig. 14

Assessment of computational performances of irsvd: bandwidth independence and rank scaling

Finally, it is worth highlighting that in many practical applications the cost of the irsvd turns out to be independent of n, the total number of columns of the data set. This is simply because usually \(m\gg n\), and then the computational complexity reduces to \(\mathcal {O}(mp^2)\). In other words, the cost only starts being influenced by n when \(n\sim m\). Figure 15 shows the runtime of each irsvd call, averaged over three runs. For the sake of clarity, runtimes have been normalized by their mean value, while the vertical axis scale is chosen so that \(\pm 50\%\) deviations from the mean can be observed. The results show that the runtime deviates very little from the mean. Moreover, the cost of each call remains fairly constant as the number of processed columns increases, which confirms the discussion above.

Fig. 15

Assessment of computational performances of irsvd: data size independence


Cite this article

Borzacchiello, D., Aguado, J.V. & Chinesta, F. Non-intrusive Sparse Subspace Learning for Parametrized Problems. Arch Computat Methods Eng 26, 303–326 (2019).
