Abstract
This paper investigates whether the Jacobi preconditioner can be made faster by including a few additional outlying diagonals, in the context of mesh-based implicit clothing simulation. For a given \(N \times N\) block system matrix \({\mathbf {A}}\), we investigate two regularly striped (RS) preconditioners: (1) 3-RS, which has two symmetric outlying diagonals at offset \(\frac{N}{2}\) from the main diagonal, and (2) 5-RS, which has four symmetric outlying diagonals at offsets \(\frac{N}{3}\) and \(\frac{2N}{3}\) from the main diagonal. We find that both the 3-RS and 5-RS preconditioners consistently outperform the Jacobi preconditioner. Based on the measured loop iteration counts and running times, we recommend 5-RS over 3-RS.
Notes
If the order of the vertices in Line 2 of Algorithm 2 is random and Line 4 is omitted, nonzero elements tend to appear at off-stencil positions, as shown in Fig. 4b. This increases the bandwidth of the matrix and degrades cache locality, which in turn increases the cost of the SpMV. We can prevent this bandwidth increase by applying the Cuthill–McKee algorithm [12] at Line 4, which has the effect of ordering the vertices in breadth-first order. (The Cuthill–McKee algorithm has been adopted in various other works to reduce bandwidth [1, 10].) Its effectiveness can be seen by comparing Fig. 4b and c.
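The effect of such a reordering can be illustrated with a minimal sketch. The code below is a plain breadth-first Cuthill–McKee ordering written for illustration only, not the implementation used in the paper; the function names `cuthill_mckee` and `bandwidth` and the toy chain of vertices with scrambled labels are assumptions introduced here.

```python
from collections import deque


def cuthill_mckee(adjacency):
    """Breadth-first vertex ordering that visits low-degree vertices first."""
    n = len(adjacency)
    degree = [len(neighbours) for neighbours in adjacency]
    visited = [False] * n
    order = []
    # Seed each connected component from its lowest-degree unvisited vertex.
    for start in sorted(range(n), key=lambda v: degree[v]):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbours in order of increasing degree.
            for w in sorted(adjacency[v], key=lambda u: degree[u]):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    return order


def bandwidth(adjacency, order):
    """Largest |row - column| over the nonzeros implied by the edges."""
    position = {v: k for k, v in enumerate(order)}
    return max(
        (abs(position[v] - position[w])
         for v, neighbours in enumerate(adjacency) for w in neighbours),
        default=0,
    )


# Toy example (hypothetical): a chain of 8 vertices with scrambled labels.
chain = [3, 7, 0, 5, 2, 6, 1, 4]
adjacency = [[] for _ in chain]
for a, b in zip(chain, chain[1:]):
    adjacency[a].append(b)
    adjacency[b].append(a)

original_order = list(range(len(chain)))
reordered = cuthill_mckee(adjacency)
print(bandwidth(adjacency, original_order))  # → 7
print(bandwidth(adjacency, reordered))       # → 1
```

On this chain the breadth-first ordering recovers the natural vertex sequence, so every nonzero lands next to the main diagonal. A real cloth mesh will not reach bandwidth 1, but the same mechanism pulls nonzeros toward the diagonal.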
When constructing the 3-RS preconditioner, the displacement d of the outlying diagonal should be N/2. What if N is odd? We handle this case with a simple implementation trick: we use \(Q = \lfloor (N+1)/2 \rfloor \) instead of \(Q = N/2\) in Line 5 of Algorithm 2. If N is even, \(\lfloor (N+1)/2 \rfloor \) yields the originally intended d. If N is odd, \(\lfloor (N+1)/2 \rfloor \) yields the smallest d that does not introduce more than two diagonals when performing regular striping. Similarly, when constructing the 5-RS preconditioner, we used \(Q = \lfloor (N+2)/3 \rfloor \) and \(R = \lfloor 2(N+2)/3 \rfloor \) in Line 5 of Algorithm 3.
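The floor expressions above translate directly into integer division. The following is a minimal sketch that evaluates them exactly as written in this note; the function name `rs_offsets` and its tuple return value are assumptions made for illustration.

```python
def rs_offsets(n_blocks, k):
    """Offsets of the outlying diagonals for a k-RS preconditioner (k = 3 or 5).

    Evaluates the floor expressions from the note literally:
    Q = floor((N+1)/2) for 3-RS; Q = floor((N+2)/3), R = floor(2(N+2)/3) for 5-RS.
    """
    if k == 3:
        return ((n_blocks + 1) // 2,)
    if k == 5:
        return ((n_blocks + 2) // 3, (2 * (n_blocks + 2)) // 3)
    raise ValueError("only k = 3 and k = 5 are covered by this sketch")


for n in (8, 9, 10):
    print(n, rs_offsets(n, 3), rs_offsets(n, 5))
# → 8 (4,) (3, 6)
# → 9 (5,) (3, 7)
# → 10 (5,) (4, 8)
```

For even N the 3-RS offset equals N/2, and for odd N it is rounded up. Note that, evaluated literally, the expression for R gives 7 rather than 2N/3 = 6 when N = 9, so the case where N is a multiple of 3 is worth checking against Algorithm 3 before relying on this sketch.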
References
1. Ainsley, S., Vouga, E., Grinspun, E., Tamstorf, R.: Speculative parallel asynchronous contact mechanics. ACM Trans. Graph. (TOG) 31(6), 1–8 (2012)
2. Ament, M., Knittel, G., Weiskopf, D., Strasser, W.: A parallel preconditioned conjugate gradient solver for the Poisson problem on a multi-GPU platform. In: 18th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), pp. 583–592. IEEE (2010)
3. Ascher, U.M., Boxerman, E.: On the modified conjugate gradient method in cloth simulation. Vis. Comput. 19(7–8), 526–531 (2003)
4. Baraff, D., Witkin, A.: Large steps in cloth simulation. In: Proceedings of the 25th Annual Conference on Computer Graphics and Interactive Techniques, pp. 43–54. ACM (1998)
5. Berkeley garment library. http://graphics.berkeley.edu/resources/GarmentLibrary/
6. Benzi, M.: Preconditioning techniques for large linear systems: a survey. J. Comput. Phys. 182(2), 418–477 (2002)
7. Benzi, M., Tůma, M.: A comparative study of sparse approximate inverse preconditioners. Appl. Numer. Math. 30(2), 305–340 (1999). https://doi.org/10.1016/S0168-9274(98)00118-4
8. Bernstein, D.S.: Matrix Mathematics: Theory, Facts, and Formulas, 2nd edn. Princeton University Press, Princeton (2009)
9. Boxerman, E., Ascher, U.: Decomposing cloth. In: Proceedings of the 2004 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 153–161. Eurographics Association (2004)
10. Chentanez, N., Alterovitz, R., Ritchie, D., Cho, L., Hauser, K.K., Goldberg, K., Shewchuk, J.R., O’Brien, J.F.: Interactive simulation of surgical needle insertion and steering. ACM Trans. Graph. (2009). https://doi.org/10.1145/1531326.1531394
11. Choi, K.J., Ko, H.S.: Stable but responsive cloth. In: ACM SIGGRAPH 2005 Courses, p. 1. ACM (2005)
12. Cuthill, E., McKee, J.: Reducing the bandwidth of sparse symmetric matrices. In: Proceedings of the 1969 24th National Conference, pp. 157–172 (1969)
13. Grinspun, E., Hirani, A.N., Desbrun, M., Schröder, P.: Discrete shells. In: Proceedings of the 2003 ACM SIGGRAPH/Eurographics Symposium on Computer Animation, pp. 62–67. Eurographics Association (2003)
14. Hauth, M., Etzmuß, O., Straßer, W.: Analysis of numerical methods for the simulation of deformable models. Vis. Comput. 19(7–8), 581–600 (2003)
15. Saad, Y.: Iterative Methods for Sparse Linear Systems, vol. 82. SIAM (2003)
16. Shewchuk, J.R., et al.: An Introduction to the Conjugate Gradient Method Without the Agonizing Pain (1994)
17. Sideris, C., Kapadia, M., Faloutsos, P.: Parallelized incomplete Poisson preconditioner in cloth simulation. In: International Conference on Motion in Games, pp. 389–399. Springer (2011)
18. Smolarski, D.C.: Diagonally-striped matrices and approximate inverse preconditioners. J. Comput. Appl. Math. 186(2), 416–431 (2006)
19. Swesty, F.D., Smolarski, D.C., Saylor, P.E., et al.: A comparison of algorithms for the efficient solution of the linear systems arising from multigroup flux-limited diffusion problems. Astrophys. J. Suppl. Ser. 153(1), 369 (2004)
20. Ye, J., Webber, R.E., Wang, Y.: A reduced unconstrained system for the cloth dynamics solver. Vis. Comput. 25(10), 959–971 (2009)
Acknowledgements
This research was supported by the R&D program for Advanced Integrated-intelligence for IDentification (AIID) through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2018M3E3A1057288), and by ASRI (Automation and Systems Research Institute at Seoul National University).
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
Supplementary material 1 (mp4 14732 KB)
A Calculating the inverse of the block k-RS preconditioner
In this section, we show how to calculate the inverse of the block k-RS preconditioner. A key point of the derivation is the use of the footprint. For a block k-RS preconditioner \({\mathbf {P}}\), when multiplying \({\mathbf {P}}\) and \({\mathbf {P}}^{-1}\), due to the sparsity pattern of \({\mathbf {P}}\) and \({\mathbf {P}}^{-1}\) (see [18]), it turns out that only footprint(i, i) of \({\mathbf {P}}\) is multiplied with footprint(i, i) of \({\mathbf {P}}^{-1}\). Let us denote footprint(i, i) of \({\mathbf {P}}\) and \({\mathbf {P}}^{-1}\) as \(\hat{{\mathbf {P}}}_{(i,i)}\) and \(\hat{{\mathbf {P}}}^{-1}_{(i,i)}\), respectively.
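To make the decoupling concrete, the following minimal sketch looks only at block positions for the 3-RS case with even N. It assumes that footprint(i, i) consists of the four block positions formed by rows and columns i and i + N/2; the function names `rs3_pattern` and `footprint_positions` are hypothetical.

```python
def rs3_pattern(n_blocks):
    """Nonzero block positions of a 3-RS matrix (n_blocks assumed even)."""
    half = n_blocks // 2
    return {(r, c) for r in range(n_blocks) for c in range(n_blocks)
            if abs(r - c) in (0, half)}


def footprint_positions(i, n_blocks):
    """Block positions touched by rows/columns i and i + n_blocks/2 (assumed footprint)."""
    idx = (i, i + n_blocks // 2)
    return {(r, c) for r in idx for c in idx}


n_blocks = 6
pattern = rs3_pattern(n_blocks)
footprints = [footprint_positions(i, n_blocks) for i in range(n_blocks // 2)]

# The footprints tile the sparsity pattern exactly and do not overlap.
covers_pattern = set().union(*footprints) == pattern
pairwise_disjoint = all(a.isdisjoint(b)
                        for k, a in enumerate(footprints)
                        for b in footprints[k + 1:])
print(covers_pattern, pairwise_disjoint)  # → True True
```

Under this assumption the 3-RS matrix is, up to a symmetric permutation of its block rows and columns, block diagonal with N/2 independent \(2 \times 2\) block systems, which is why each footprint can be inverted on its own.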
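For the case of the block 3-RS preconditioner \({\mathbf {P}}\),
\[ \hat{{\mathbf {P}}}_{(i,i)} = \begin{bmatrix} {\mathbf {P}}_{(i,i)} & {\mathbf {P}}_{(i,i+\frac{N}{2})} \\ {\mathbf {P}}_{(i+\frac{N}{2},i)} & {\mathbf {P}}_{(i+\frac{N}{2},i+\frac{N}{2})} \end{bmatrix} = \begin{bmatrix} {\mathbf {A}} & {\mathbf {B}} \\ {\mathbf {C}} & {\mathbf {D}} \end{bmatrix}, \tag{19} \]
where \({\mathbf {P}}_{(a,b)}\) represents block (a, b) of \({\mathbf {P}}\). We must have
\[ \hat{{\mathbf {P}}}_{(i,i)} \, \hat{{\mathbf {P}}}^{-1}_{(i,i)} = {\mathbf {I}} \tag{20} \]
for \(i = 0,\dots , \frac{N}{2} - 1\). Replacing \(\hat{{\mathbf {P}}}_{(i,i)}\) of Eq. 20 with Eq. 19 and comparing the left- and right-hand sides of Eq. 20, it follows that
\[ \hat{{\mathbf {P}}}^{-1}_{(i,i)} = \begin{bmatrix} {\mathbf {A}}^{-1} + \mathbf {FCA}^{-1} & -{\mathbf {F}} \\ -\mathbf {ECA}^{-1} & {\mathbf {E}} \end{bmatrix}, \tag{21} \]
where \({\mathbf {E}} = ({\mathbf {D}}-\mathbf {CA}^{-1}{\mathbf {B}})^{-1}\) and \({\mathbf {F}} = {\mathbf {A}}^{-1}\mathbf {BE}\). (See [8] for the detailed derivation.)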
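The role of \({\mathbf {E}}\) and \({\mathbf {F}}\) can be checked numerically. The sketch below assumes that the footprint is the \(2 \times 2\) block matrix with blocks \({\mathbf {A}}, {\mathbf {B}}\) in the first block row and \({\mathbf {C}}, {\mathbf {D}}\) in the second, and that its inverse has blocks \({\mathbf {A}}^{-1} + \mathbf {FCA}^{-1}\), \(-{\mathbf {F}}\), \(-\mathbf {ECA}^{-1}\) and \({\mathbf {E}}\) (the standard Schur-complement form). All helper names and the sample \(2 \times 2\) blocks are hypothetical; exact fractions are used so the check is free of rounding error.

```python
from fractions import Fraction


def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]


def matadd(X, Y, sign=1):
    """Entrywise X + sign * Y."""
    return [[x + sign * y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def negate(X):
    return [[-x for x in row] for row in X]


def inverse(M):
    """Gauss-Jordan inverse in exact arithmetic; assumes M is invertible."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def footprint_inverse(A, B, C, D):
    """Blocks of [[A, B], [C, D]]^{-1}, built from E and F as defined in the text."""
    A_inv = inverse(A)
    E = inverse(matadd(D, matmul(matmul(C, A_inv), B), sign=-1))  # (D - C A^-1 B)^-1
    F = matmul(matmul(A_inv, B), E)                               # A^-1 B E
    top_left = matadd(A_inv, matmul(matmul(F, C), A_inv))
    bottom_left = negate(matmul(matmul(E, C), A_inv))
    return top_left, negate(F), bottom_left, E


def assemble(top_left, top_right, bottom_left, bottom_right):
    """Stack four equally sized blocks into one matrix."""
    top = [l + r for l, r in zip(top_left, top_right)]
    bottom = [l + r for l, r in zip(bottom_left, bottom_right)]
    return top + bottom


# Hypothetical 2 x 2 blocks, chosen only so that A and D - C A^-1 B are invertible.
A = [[4, 1], [1, 3]]
B = [[1, 0], [2, 1]]
C = [[1, 2], [0, 1]]
D = [[5, 1], [1, 4]]

P_hat = assemble(A, B, C, D)
P_hat_inv = assemble(*footprint_inverse(A, B, C, D))
identity = [[Fraction(int(r == c)) for c in range(4)] for r in range(4)]
print(matmul(P_hat, P_hat_inv) == identity)  # → True
```

The only inverses taken are of \({\mathbf {A}}\) and of the Schur complement \({\mathbf {D}}-\mathbf {CA}^{-1}{\mathbf {B}}\), each the size of a single block, which is what keeps the per-footprint cost small.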
For the case of the block k-RS preconditioner with \(k>3\), the above procedure can be applied recursively. For example, if \({\mathbf {P}}\) is a 5-RS preconditioner, as shown in Fig. 10, it can be decomposed into the 3-RS preconditioner \({\mathbf {A}}\) and three matrices \(\mathbf {B, C, D}\). Then, \({\mathbf {P}}^{-1}\) can be calculated according to Eq. 21.
Cite this article
Lee, SB., Lee, KW. & Ko, HS. Regularly striped preconditioner for implicit clothing simulation. Vis Comput 38, 2827–2838 (2022). https://doi.org/10.1007/s00371-021-02158-7
DOI: https://doi.org/10.1007/s00371-021-02158-7