1 Introduction

Physics simulation is an active area of research in computer graphics due to the prominence of digital characters and artificially generated natural phenomena in the VFX industry [13, 20, 21, 38, 41, 44]. Numerical simulation of any kind of physical effect requires the governing laws to be discretized on a mesh so that the differential operators can be computed. Subsequently, for temporal evolution, the backward Euler scheme (or its variants) is the generally accepted standard for stability reasons [10, 47]. This makes the linear solve the bottleneck in physics simulation, and thus, substantial effort has been invested by numerous research groups in designing fast methods, either through the development of faster solvers [1, 2, 11, 34, 57, 58] or through the development of novel discretization schemes [35, 37, 49, 50].

In contrast to prior works, in this paper we propose a method that can potentially accelerate any other method for physics simulation. We achieve this by paying close attention to the interaction of the simulation software with the underlying hardware and identifying a key bottleneck: the numerical solution of linear systems is a memory-bound (as opposed to compute-bound) operation. This is because memory bandwidth on modern workstations is roughly \(4-5\times \) lower than the available compute bandwidth. Every linear equation requires a handful of variables to be read from memory, some basic arithmetic operations (e.g., multiplication by a scalar coefficient, addition/subtraction) to be performed on them, and the result to be written back to memory. The arithmetic operations themselves are comparatively fast (most are natively supported in hardware), and they are far outnumbered by the read/write operations involved in each linear equation.
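As a rough illustration (our sketch, not from the paper), consider the work done for one row of a sparse matrix–vector product in CSR format, the basic "linear equation" processed by an iterative solver. Each nonzero entry costs roughly three memory reads but only two floating-point operations, so the loop is limited by memory bandwidth rather than compute:

```python
# One row of a CSR sparse matrix-vector product. For k nonzeros in the row,
# the loop performs ~2k floating-point operations but ~3k memory reads
# (column index, matrix coefficient, and solution entry per nonzero).
def spmv_row(indptr, indices, data, x, i):
    acc = 0.0
    for p in range(indptr[i], indptr[i + 1]):
        j = indices[p]        # read: column index
        a = data[p]           # read: matrix coefficient
        acc += a * x[j]       # read: x[j]; compute: one multiply, one add
    return acc                # write: one result per row
```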

To address this issue, some authors have increased the density of computation performed per memory read by using rather intricate matrix factorizations [34]. However, as observed by the authors themselves [34], such an approach only pays off when the problem size is large enough to not fit in the aggregate memory of all available GPUs. Another approach stems from the observation that variables are read from memory into the system cache. If the variables in a linear equation are all present in the cache (i.e., a cache hit), then the operations can be executed much faster than when some of the required variables are absent from the cache (i.e., a cache miss) and need to be read from memory. Thus, researchers have designed grid-based data structures [36, 42] that place the variables of a linear equation closer together in memory. While this is possible for grid-based data structures, which exhibit regular connectivity patterns, it is generally assumed that such optimizations cannot be performed for triangle/tetrahedral mesh-based data structures, where the connectivity pattern is irregular.

For mesh-based data structures, some authors have investigated reordering optimizations at the level of the compiler for better efficiency [55]. Other researchers have investigated delayed updates to matrix factorizations [23, 24] as a means to faster computation. In contrast to these prior works, which require more involved computations, our proposed approach is much simpler to implement and is based on spectral reordering [16, 32, 53], a technique that is well known in the geometry processing community but, to the best of our knowledge, has not been exploited for accelerating computations in physics simulation. Specifically, this approach requires computing the eigenvector of the graph Laplacian corresponding to its second-smallest eigenvalue, also known as the Fiedler vector, which can be computed very efficiently thanks to “off-the-shelf” Algebraic Multigrid-preconditioned Conjugate Gradient (AMGPCG) solvers. Most notably, this approach is cognizant of the connectivity pattern used for generating the linear system and reorders variables to respect this connectivity pattern, as opposed to the predominant practice of using fixed connectivity patterns for mesh layouts regardless of the linear system.

We observe that our spectral reordering approach can speed up simulations of elastic solids by at least \(10\%\), and up to nearly \(30\%\), as described in our experimental results in Sect. 6. This can lead to considerable savings given that even modest simulations require at least 240 time steps. Moreover, our proposed method is general, as we show its applicability to two different approaches for simulating elastic solids [6, 37].

The rest of this paper is organized as follows: Sect. 2 discusses related prior work, Sect. 3 describes our core technical approach, Sect. 4 provides details behind our numerical implementation, Sect. 5 describes the elasticity simulators that we used, Sect. 6 describes our experimental results, and finally, Sect. 7 describes our conclusion and avenues for future work.

2 Related work

Cache-efficient algorithms fall into two types: cache-aware, which depend on knowing specific parameters of the cache, such as block size, and cache-oblivious, which improve performance by reducing cache misses without using any specific information about the cache. Some of the first cache-oblivious algorithms were introduced by Frigo et al. [19]. We refer the reader to [14] for an overview of cache-oblivious algorithms.

While cache-efficient algorithms are designed on a case-by-case basis, cache-efficient data structures are general-purpose, allowing for improvements across multiple algorithms. The simplest cache-efficient data structure is a reordering of an existing data structure. For example, Yoon et al. [53] reorder the vertices of meshes for cache efficiency using a metric-minimization method. Their metric measures the distance between the indices of vertices that are connected by edges. They use repeated local reorderings to minimize this metric and focus on very large meshes, from millions to hundreds of millions of vertices. They apply their method to speed up three separate applications: large-scale rendering, collision detection, and isocontour extraction.

The mesh itself is not the only data structure that can benefit from reordering. Yoon and Manocha [54] use probabilistic models to cluster and reorder the bounding volume hierarchy. Their method minimizes cache misses in collision detection for rigid body simulations.

Isenburg and Lindstrom [26] introduce a streaming format for storing very large meshes. Similar to our method, they utilize the Fiedler vector to reorder the mesh nodes, minimizing the separation between vertices that belong to the same face. Hoppe [25] used indexed triangle strips to make rendering more efficient; to increase efficiency further, he reordered the triangle faces into cache-efficient layouts using a greedy triangle stripification algorithm and repeated local optimizations. Gopi and Eppstein [22] developed triangle stripping further, using a perfect matching algorithm on a graph that represents the mesh’s faces as nodes. Setaluri et al. [43] speed up fluid dynamics simulations by ordering the cells of 2D and 3D grids in memory using Morton ordering, a type of space-filling curve.

The Fiedler vector is part of a wider array of methods for processing functions on graphs, commonly known as spectral methods. Pioneered by Taubin [51], these methods bring insights from signal processing to 2D and 3D geometric data, including meshes. Spectral methods share several commonalities. First, they treat geometric data as graphs. Second, they focus on the graph Laplacian rather than the adjacency matrix. Third, they make use of the eigenvectors and eigenvalues of the graph Laplacian, in direct analogy to the decomposition of functions in Fourier analysis. See [40, 56] for surveys of spectral methods for geometry processing. There are numerous applications of spectral methods. Levy [32] interprets the eigenvectors of the graph Laplacian (understood as a discrete Laplace-Beltrami operator) as a basis for a function space defined on meshes. He uses this basis for approximating functions on meshes, demonstrating applications to pose transfer, segmentation, and parameterization. Barnard et al. [7] and Alpert et al. [3] use eigenvectors of the graph Laplacian for two different applications: reordering the graph’s sparse adjacency matrix and partitioning the graph. Karni et al. [27] use spectral methods for compressing meshes.

Since its introduction [17, 18], the Fiedler vector has been used extensively in spectral methods, including graph drawing [28, 29], optimizing streaming formats for meshes [26], shape characterization [4, 30, 31], and graph segmentation and partitioning, where the graph can represent meshes [33, 48, 52] or images [45].

3 Technical approach

Our choice of using the Fiedler vector for mesh reordering is inspired by its use for efficient graph drawing [28, 29]. A mesh can be represented as a graph, where vertices are mesh nodes and an edge connects any two vertices if they are part of the same mesh element. For both graph drawing and cache-efficient mesh processing, the goal is to minimize the distance between vertices that are connected by edges. As described in [29], one way to ensure this property is by giving the vertices coordinates defined by the Fiedler vector (see Fig. 1). Here, we provide a brief overview of the argument in [29]. Let \({\varvec{x}}\) denote the vector of coordinates for all vertices, where each vertex coordinate is a real number. Hall’s energy for a graph of n nodes is defined as follows:

$$\begin{aligned} E = \frac{1}{2} \sum _{i,j}^n w_{i,j} (x_i - x_j)^2 \end{aligned}$$
(1)

where \(x_i\) and \(x_j\) are the coordinates of nodes i and j (with \(i \ne j\)), and \(w_{i,j}\) is the weight of the edge connecting them. Intuitively, minimizing Hall’s energy amounts to finding nodal coordinates that minimize the weighted squared distances between connected nodes.

Fig. 1 The Fiedler vector produces a natural ordering of the nodes of a tetrahedral mesh by closely following its shape

Hall’s energy can be restated using the Laplacian \({\varvec{L}}\) of the graph. \({\varvec{L}}\) is a sparse symmetric \(n\times n\) matrix. Given vertices i and j, if there is an edge between them, then the entry of \({\varvec{L}}\) in the \(i^{th}\) row and \(j^{th}\) column is \(-w_{i,j}\). Each diagonal entry (i, i) is set to the sum of the weights of the edges incident to vertex i. Formally,

$$\begin{aligned} L_{i,j} = \begin{cases} \sum _{k} w_{i,k} & \text {if } i = j\\ -w_{i,j} & \text {if there is an edge between } i \text { and } j\\ 0 & \text {otherwise} \end{cases} \end{aligned}$$
(2)

Using the Laplacian matrix in equation (2), Hall’s energy can be written as \(E = \varvec{x^T L x}\). The trivial solution, \({\varvec{x}} = 0\), can be eliminated by requiring the coordinate vector’s norm to be nonzero, i.e., \(\varvec{x^Tx} = c\) for some \(c > 0\). As the coordinates can be arbitrarily scaled, the constant c can be set to 1: \(\varvec{x^Tx} = 1\). This constraint can be accommodated in the minimization problem using a Lagrange multiplier: \(\min _{\varvec{x}} (\varvec{x^TLx} - \lambda (\varvec{x^Tx} - 1))\). The analytical solution can be found by computing the derivative with respect to \({\varvec{x}}\) and setting it to 0.
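Carrying out this step explicitly (a standard matrix-calculus identity that uses the symmetry of \({\varvec{L}}\)) gives

$$\begin{aligned} \frac{\partial }{\partial {\varvec{x}}} \left( \varvec{x^TLx} - \lambda (\varvec{x^Tx} - 1) \right) = 2\varvec{Lx} - 2\lambda {\varvec{x}} = 0, \end{aligned}$$

which, after dividing by 2, is the eigenvalue problem: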

$$\begin{aligned} \varvec{Lx} = \lambda {\varvec{x}} \end{aligned}$$
(3)

Since \(E = \varvec{x^T Lx}\) and \(\varvec{Lx} = \lambda {\varvec{x}}\), it follows that \(E = \varvec{x^T Lx} = \lambda \varvec{x^T x}\). Since we set \(\varvec{x^Tx} = 1\), we have \(\lambda \varvec{x^T x} = \lambda \), so \(E = \lambda \). Thus, the eigenvalue is the Hall energy.

The minimizer of this eigenvalue problem is trivial: the constant vector with \(x_i = 1/\sqrt{n}\) for \(1\le i\le n\) (so that \(\varvec{x^Tx} = 1\)), whose eigenvalue, and thus the Hall energy, is 0. This solution gives the same coordinate to every node, which is not useful. The eigenvector with the second-smallest eigenvalue is known as the Fiedler vector. Among the non-trivial solutions, it has the smallest eigenvalue and, thus, the smallest energy. Per the intuitive interpretation of the Hall energy mentioned above, the Fiedler vector gives the node coordinates for which the distance between connected nodes is minimized.
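As a tiny worked example (ours, not from the paper), consider the path graph with four nodes connected in a chain 0-1-2-3 with unit weights. The Fiedler vector varies monotonically along the chain, so sorting the nodes by their Fiedler values recovers the chain order:

```python
# Fiedler vector of the 4-node path graph 0-1-2-3 with unit edge weights.
import numpy as np

L = np.array([[ 1, -1,  0,  0],
              [-1,  2, -1,  0],
              [ 0, -1,  2, -1],
              [ 0,  0, -1,  1]], dtype=float)

vals, vecs = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = vecs[:, 1]             # eigenvector of the second-smallest eigenvalue
print(np.argsort(fiedler))       # [0 1 2 3] or [3 2 1 0], up to eigenvector sign
```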

Fig. 2 A diagram illustrating how reordering helps reduce cache misses. The example mesh is triangular. A single triangular mesh element is shown, along with its location and the locations of its vertices in main memory. The bottom-left vertex of the triangle is loaded into the cache, alongside the vertices that neighbor it in main memory. Top: in the original ordering of the vertices, geometrically neighboring vertices are not loaded into the cache, so accessing them results in expensive cache misses. Bottom: the reordered vertices are sorted such that geometrically neighboring vertices are also neighbors in memory. When the program accesses a vertex’s neighbors, it finds them in the cache and avoids loading them again from main memory

The Fiedler vector was used for graph drawing in [29], where it defines a coordinate along a single axis for each vertex. For drawing graphs in 2D and 3D, the eigenvectors corresponding to the third- and fourth-smallest eigenvalues can be used to obtain the additional vertex coordinates. For our elasticity simulations, the Fiedler vector alone is sufficient, but the values it defines need to be converted to a ranking, i.e., an ordering of the mesh nodes. Rankings are computed by sorting the Fiedler values of the vertices: the node with the smallest value is ranked first, the node with the second-smallest value is ranked second, and so on. The rankings define the order of the vertices in memory. As connected vertices have similar values in the Fiedler vector, they are close together in the ranking and thus close together in main memory. So, when the simulator accesses them, they are likely to be fetched into the cache together, making cache misses less likely. See Fig. 2 for an illustration of how reordering a mesh’s vertices can reduce the number of cache misses.
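As a minimal sketch of this ranking step (assuming `fiedler` holds one value per vertex and `verts`/`tets` store the vertex data and tetrahedra of the mesh; the names are illustrative, not from the paper's code), two argsorts suffice: one to sort the vertices, and one to invert the permutation so that elements can be relabeled:

```python
import numpy as np

order = np.argsort(fiedler)           # order[k] = old index of the k-th ranked vertex
rank = np.empty_like(order)
rank[order] = np.arange(len(order))   # rank[v] = new index of old vertex v

verts_reordered = verts[order]        # vertex data laid out in the new order
tets_relabeled = rank[tets]           # elements now refer to the new vertex indices
```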

4 Numerical implementation

Our code to reorder the meshes was written in Python using the NumPy and SciPy libraries. Given an input mesh, we first compute the Laplacian of the mesh’s graph as a sparse matrix. As there is no special reason to weight node pairs differently, we set \(w_{i,j} = 1\) for every pair of nodes i and j (with \(i \ne j\)) that are part of the same mesh element, and \(w_{i,j} = 0\) otherwise. This form of the Laplacian is typically known as the combinatorial Laplacian. We extract the Fiedler vector from the combinatorial Laplacian using Locally Optimal Block Preconditioned Conjugate Gradients (LOBPCG), a well-known method for finding the largest or smallest eigenvalues of a matrix and their corresponding eigenvectors [15]. We use this solver to find the two smallest eigenvalues and their eigenvectors.

LOBPCG requires an initial guess for the eigenvectors as input. For the first eigenvector, we used its known form: a constant vector, with every entry set to 1/n. For the second eigenvector, we used a random initial guess, with each entry sampled from a uniform distribution between \(-0.5\) and 0.5. For fast convergence of the solve, we used the Algebraic Multigrid smoothed aggregation solver from PyAMG [8] as the preconditioner.
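Putting the pieces together, the sketch below (our reconstruction of the described pipeline, not the paper's code; it assumes the mesh is given as an (m, 4) integer array `tets` of node indices, with n nodes in total) computes the Fiedler vector with SciPy's LOBPCG and a PyAMG smoothed aggregation preconditioner:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lobpcg
import pyamg

def mesh_fiedler_vector(tets, n, seed=0):
    # Combinatorial Laplacian: w_ij = 1 whenever nodes i and j share an element.
    rows, cols = [], []
    for a in range(4):
        for b in range(4):
            if a != b:
                rows.append(tets[:, a])
                cols.append(tets[:, b])
    A = sp.coo_matrix((np.ones(12 * len(tets)),
                       (np.concatenate(rows), np.concatenate(cols))),
                      shape=(n, n)).tocsr()
    A.data[:] = 1.0                           # collapse duplicate entries to weight 1
    L = (sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A).tocsr()

    # Initial guesses: the known constant first eigenvector, and uniform
    # random values in [-0.5, 0.5) for the second.
    rng = np.random.default_rng(seed)
    X = np.column_stack([np.full(n, 1.0 / n),
                         rng.uniform(-0.5, 0.5, n)])

    # AMG smoothed aggregation preconditioner for fast convergence.
    M = pyamg.smoothed_aggregation_solver(L).aspreconditioner()
    vals, vecs = lobpcg(L, X, M=M, largest=False, tol=1e-8, maxiter=500)
    return vecs[:, np.argmax(vals)]           # larger of the two smallest: the Fiedler vector
```

Note that the constant initial guess is already an exact eigenvector of \({\varvec{L}}\), so LOBPCG effectively only needs to resolve the second eigenvector.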

4.1 Reordering mesh elements

In addition to reordering mesh vertices, we also experimented with reordering the mesh elements. For this purpose, we defined a dual graph of the mesh, where each mesh element is a vertex, and any two mesh elements that share at least one node are connected by an edge. Using this definition, we built the combinatorial Laplacian matrix as before (with all edge weights set to 1). We used the same method, LOBPCG preconditioned with a smoothed aggregation Multigrid solver from PyAMG, to find the Fiedler vector of the matrix. The new mesh element ranking was likewise computed in the same way as before.
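A sketch of the dual-graph construction (ours, following the description above; `tets` and n as before) builds the element-to-node incidence matrix and squares it to find elements that share nodes:

```python
import numpy as np
import scipy.sparse as sp

def dual_laplacian(tets, n):
    m = len(tets)
    # Incidence matrix B: B[e, v] = 1 if element e contains node v.
    rows = np.repeat(np.arange(m), 4)
    B = sp.coo_matrix((np.ones(4 * m), (rows, tets.ravel())),
                      shape=(m, n)).tocsr()
    A = (B @ B.T).tocsr()   # A[e, f] > 0 iff elements e and f share a node
    A.setdiag(0)            # no self-loops
    A.eliminate_zeros()
    A.data[:] = 1.0         # unit edge weights, as in the vertex case
    return (sp.diags(np.asarray(A.sum(axis=1)).ravel()) - A).tocsr()
```

The resulting Laplacian can then be handed to the same LOBPCG solve shown above.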

4.2 Mesh orderings

We used meshes of three different objects: a dragon, a bunny, and an armadillo. Their characteristics are described in Sect. 5. For each mesh, we used the above methods to find a Fiedler reordering for its vertices, and another Fiedler reordering for its elements (i.e., its tetrahedra). However, we found that it is possible for many vertices in any given mesh to already be well ordered, and comparing only an original mesh with its Fiedler-reordered version would not properly show the power of reordering for reducing cache misses. Thus, for each shape, we created a scrambled reordering as a point of reference, where the node rankings were determined randomly using NumPy’s shuffle function. All of the orderings for the mesh models are shown in Fig. 3. For each shape, we also created an additional reordering where both the vertices and mesh elements were randomly reordered.

Fig. 3 The various reorderings for the vertices of the three mesh models that we used, shown as contour drawings. Top to bottom: dragon, bunny, armadillo. Left to right: Fiedler-reordered models, original orderings, scrambled orderings

4.3 Edge span

To illustrate the effect of reordering, we utilized the edge span metric. As defined in [53], the span of an edge \((v_i,v_j)\) is the absolute difference of its vertices’ indices, \(|i - j|\). As before, an edge is defined between two vertices if they are part of the same mesh element. The greater the edge span, the higher the chance of cache misses, since the simulator is likely to access both vertices in quick succession. We list the edge spans for the original, Fiedler-reordered, and scrambled orderings of each mesh in Sect. 6.1.
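For reference, a direct (unoptimized) computation of the average edge span under this definition might look like the following sketch (ours, not the paper's code):

```python
import numpy as np

def average_edge_span(tets):
    pairs = set()
    for tet in tets:
        for a in range(4):
            for b in range(a + 1, 4):
                i, j = int(tet[a]), int(tet[b])
                pairs.add((min(i, j), max(i, j)))     # count each edge once
    return float(np.mean([j - i for i, j in pairs]))  # spans are |i - j|
```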

5 Elasticity simulations

For testing the effects of different orderings on the simulation speed, we used three physics libraries for simulating elastic solids: Vega [6], an ADMM solver [37] based on Projective Dynamics [9], and an updated version of the ADMM solver with collision handling [39]. From here on, we will refer to the latter two as ADMM-PD and ADMM-PD-extended, respectively.

We devised six scenarios to test the speedup, two for each solver, as listed in Table 2. Each scenario involves a single object undergoing a motion that results in, or is caused by, a deformation. The full animations can be seen in the supplemental video. We ran each scenario for 240 time steps on an Intel Xeon E5-1620 v4 CPU with 4 cores and a cache size of 10 MB. We timed all of the scenarios using Chrono, a C++ timing library. We did not time the setup for the models. At each time step, we only timed the call to the solver for simulating that time step, and we used the sum over all of the time steps as the computation time for that run.

For each scenario, we used five different orderings as listed in Table 2: original, Fiedler-reordered (vertex-only), VF Fiedler-reordered (both vertices and mesh elements), scrambled (vertex-only), and VF scrambled (both vertices and mesh elements). We ran each scenario three times per reordering and recorded the average and standard deviation of the timings, as described in Sect. 6.2.

Fig. 4 The dragon in the falling scenario. Compared to the baseline scrambled ordering (vertex-only), the original ordering provided a speedup of 272.9 s (14.8%), and the Fiedler reordering (our method) provided a speedup of 492.4 s (26.7%). The Fiedler reordering of both the vertices and mesh elements provided a speedup of 514 s (27.8%)

Fig. 5 The dragon in the constrained falling scenario. Compared to the baseline scrambled ordering (vertex-only), the original ordering provided a speedup of 266.5 s (12.1%), and the Fiedler reordering (our method) provided a speedup of 469.8 s (21.3%). Fiedler reordering of both the vertices and mesh elements provided a speedup of 494.7 s (22.4%)

Fig. 6 The dragon in the stretch scenario. Compared to the baseline scrambled ordering (vertex-only), the original ordering provided a speedup of 19.8 s (4.66%), and the Fiedler reordering (our method) provided a speedup of 51.7 s (12.1%). The Fiedler reordering of both the vertices and mesh elements provided a speedup of 108.2 s (25.5%)

5.1 Vega

Vega [6] is a library that simulates elastic solids using an implicit backward Euler scheme. For the internal solver, we used the Jacobi-preconditioned Conjugate Gradients (PCG) solver that is built into the library. We used Vega for the falling scenarios shown in Figs. 4 and 5. For both scenarios, we used the dragon model that ships with Vega; it has 46,736 vertices and 160,553 tetrahedra. Neither scenario had gravity.

In the falling scenario of Fig. 4, the dragon uses corotational linear elasticity as the constitutive model. It starts at rest in mid-air, and we exert a force of \(-50\) N along each of the three coordinate axes on all vertices during the first time step. No forces are exerted after the first time step. In the constrained falling scenario of Fig. 5, the dragon uses the Saint Venant–Kirchhoff (StVK) constitutive model. We randomly selected a node near the top of the dragon model and constrained it to a fixed location, and we exerted a force of \(-9.8\) N along the Y-axis (i.e., downwards) on all of the mesh vertices.

Fig. 7 The armadillo in the unsquash scenario. Compared to the baseline scrambled ordering (vertex-only), the original ordering provided a speedup of 84.4 s (3.63%), and the Fiedler reordering (our method) provided a speedup of 236.8 s (10.2%). The Fiedler reordering of both the vertices and mesh elements provided a speedup of 273.5 s (11.8%)

Fig. 8 The armadillo falling in a bowl scenario. Compared to the baseline scrambled ordering (vertex-only), the original ordering provided a speedup of 177.6 s (4.1%), and the Fiedler reordering (our method) provided a speedup of 480.7 s (11.1%). The Fiedler reordering of both the vertices and mesh elements provided a speedup of 451.5 s (10.4%)

Fig. 9 The bunny in the bounce scenario. Compared to the baseline scrambled ordering (vertex-only), the original ordering provided a speedup of 262.8 s (8.23%), and the Fiedler reordering (our method) provided a speedup of 426.2 s (13.3%). The Fiedler reordering of both the vertices and mesh elements provided a speedup of 449.2 s (14.1%)

Fig. 10 The log–log histograms of edge spans for the three meshes, showing the number of edges in each bin. Each histogram has a bin size of 50. For the scrambled meshes, there are roughly the same number of edges at all edge spans. The original models have more edges with a smaller edge span. The Fiedler reorderings have the largest number of edges with small edge spans, and no edges with a span greater than 6000. The three models are mostly similar, except that, in comparison with the dragon, the original bunny is closer to its Fiedler-reordered case, and the original armadillo is closer to its scrambled case

5.2 ADMM-PD

ADMM-PD [37] is a generalization of the projective dynamics (PD) framework [9] and supports general constitutive models of elasticity. It uses an implicit scheme based on minimizing an error function closely related to the physical system’s energy. It is computationally efficient because it pre-computes a Cholesky factorization of the system matrix and reuses the Cholesky factors during each time step. We used ADMM-PD for the scenarios shown in Figs. 6 and 7. In both scenarios, the mesh is scaled such that its height is 1 m, and gravity is turned off. The scenario in Fig. 6 uses the same dragon model as the Vega scenarios. The dragon is given a simple spring constitutive model, and the simulation uses 20 ADMM iterations per time step. The dragon’s vertices are split at the midpoint of the mesh along the X-axis: vertices with an x-coordinate greater than or equal to the midpoint are given an initial velocity of 0.1 m/s in the +X direction, and the remaining vertices are given a velocity of 0.1 m/s in the -X direction. The resulting motion stretches the dragon. The scenario in Fig. 7 is directly based on a similar scenario in [37]. We tetrahedralized the armadillo mesh from the Stanford 3D Scanning Repository using TetGen [46]. The model has 45,908 vertices and 151,869 tetrahedra. The armadillo is given a linear tetrahedral strain constitutive model, and the simulation uses 100 ADMM iterations per time step. All of the armadillo’s vertices are initially placed at the same location, so that the entire model is “squashed” into a single point; the model then unsquashes itself into its original shape during the course of the simulation.

5.3 ADMM-PD-extended

ADMM-PD-extended [39] is a revised version of ADMM-PD. It uses the same formulation as ADMM-PD for handling object motion and elastic deformation, but it also handles self-collisions within objects and frictionless object-obstacle collisions. We used the ADMM-PD-extended solver for the two scenarios shown in Figs. 8 and 9, where collisions occur. In both scenarios, we rescale the mesh to a height of 0.07 m. In the scenario shown in Fig. 8, the armadillo falls from a height of 0.4 m onto the rim of a nearly spherical bowl. In the scenario shown in Fig. 9, we use a version of the Stanford bunny mesh tetrahedralized with TetGen [46]. It has 34,833 vertices and 120,001 tetrahedra. The bunny falls from a height of 0.2 m and bounces off the floor.

6 Experimental results

In this section, we provide quantitative details of our method’s performance in our experiments.

6.1 Edge spans

Table 1 shows the average edge span of the connected vertices for the three meshes. As expected, we see a reduction in average edge span between the scrambled and original orderings of the meshes. We see a much larger reduction in the average edge span, of around two orders of magnitude, between the scrambled and the Fiedler orderings.

Table 1 Average edge span for each of the three meshes

Figure 10 shows log–log histograms of the edge spans for the three meshes. The scrambled orderings have roughly the same number of edges for each edge span. The original orderings have somewhat more edges with shorter edge spans, and fewer edges with longer edge spans. In contrast, the Fiedler reordering has almost no edges with a span of more than 8000, and it has the largest number of edges with short edge spans.

Table 2 Average computation time (in seconds) and standard deviation for each of our 6 experimental scenarios to run (taken across 5 runs) for each ordering of the scenario’s shape

6.2 Elasticity simulations

Our timing results for the six scenarios are shown in Table 2. We took the average run-time of three runs for each mesh ordering in each scenario. The standard deviations show that the averages are broadly representative and can be used for comparisons between different mesh orderings; we did not use them for any other purpose. In particular, the results shown in Tables 3 and 4 are based only on the averages.

We chose the scrambled ordering for vertices to be the reference ordering for our comparisons. The number of seconds that the original and Fiedler reorderings save compared to the reference ordering is shown in Table 3. The same speedups are shown in Table 4 as a percentage of the reference ordering. The results show limited speedups as one goes from the scrambled orderings to the original orderings, and larger speedups as one goes from the original orderings to the Fiedler reorderings. The Fiedler orderings provide savings of hundreds of seconds (several minutes) over the original and scrambled orderings in all scenarios except Fig. 6, where the savings are in tens of seconds. Since the reordering process itself takes on the order of seconds to complete, we see that Fiedler reordering pays for itself several times over. Table 4 shows that savings of nearly \(30\%\) over the scrambled version can be achieved with Fiedler reorderings.

We note that when both vertices and mesh elements are reordered, the time savings increase further in most of the scenarios. The amount of savings depends on whether mesh elements are accessed by reference from their vertices, just as the savings from vertex reordering depend on vertices being accessed by reference from their mesh elements. This effect is most pronounced for the stretch scenario shown in Fig. 6, which uses the spring constitutive model; for this constitutive model, edges rather than tetrahedra are the primary elements of computation.

In addition, we noticed that the Fiedler ordering for the bounce scenario in Fig. 9 shows less improvement over the original ordering than in the other scenarios. We believe this is due to the bunny mesh itself: its original ordering, as shown in Figs. 3 and 10, was already partially ordered, leaving less room for improvement from the Fiedler reordering.

Table 3 Average speedups (in seconds) for the original and Fiedler reorderings compared to the scrambled vertices case
Table 4 Speedups for the original and Fiedler reorderings as percentages of the time taken by the scrambled vertices case

6.3 Dependence of simulations on the orderings

When examining the speedups in Tables 3 and 4 obtained by the Fiedler vertex reordering in comparison with the scrambled reordering, we notice that greater speedups are obtained in the Vega simulations than in the ADMM simulations. This is because Vega uses a traditional Newton-based solver for elasticity, where a linear problem is solved using Conjugate Gradients inside each Newton iteration. A number of computations that occur during the linear solve, such as the matrix–vector product, the vector–vector dot product, and vector-norm computation, directly benefit from the Fiedler vertex reordering, leading to speedups. In contrast, the ADMM solver pre-computes the Cholesky factors of the system matrix and subsequently uses those factors during each iteration. Thus, ADMM does not require back-and-forth communication between the matrix/vector data and the geometric data stored on the meshes, leading to smaller speedups overall. However, we notice a nonlinear effect when simultaneously reordering both the vertices and mesh elements. Specifically, we notice a greater speedup in the stretch scenario in Fig. 6 with the VF Fiedler reordering, which was simulated using ADMM, than in the constrained falling scenario in Fig. 5, which was simulated using Vega. Since the difference is relatively small, we defer a more in-depth analysis of this behavior to future work, as we believe such an analysis would require intervention at the systems level, with explicit control over cache-scheduling policies.

7 Conclusion and future work

We proposed to use the Fiedler vector to reorder tetrahedral meshes for greater cache efficiency. We demonstrated the benefits of this reordering for speeding up simulations of elastic solids. Our proposed method is general, as demonstrated by its applicability to Vega [6] and the ADMM solver [37], which use different algorithms for elasticity. Our results show that using cache-efficient mesh orderings can provide significant speedups for physics simulations with negligible overhead.

It would be interesting to apply our method to fluid simulations on tetrahedral meshes [5, 12]. Since fluids exhibit dynamic topology changes, the mesh connectivity changes every time step, requiring the Fiedler reordering to be recomputed each step. However, given the negligible overhead of this computation, we believe our method should still be applicable for speeding up fluid simulations. There may also be opportunities for optimizing the computation of the Fiedler vector, given that the mesh connectivity changes only slightly during each time step.

Another potential avenue is to investigate what makes any given scenario and/or algorithm more or less susceptible to speedups when its meshes are reordered to be cache-efficient. We observed a large variation in speedups across our limited selection of physics engines and scenarios, which indicates that different ways of processing data vary greatly in how they utilize memory caches. Knowing exactly how simulation algorithms utilize the cache would tell us when it is worth reordering the mesh and could provide further insights into increasing simulation efficiency.