High-performance GPU computations in nonlinear dynamics: an efficient tool for new discoveries

The main aim of this paper is to demonstrate the benefit of the application of high-performance computing techniques in the field of non-linear science through two kinds of dynamical systems as test models. It is shown that high-resolution, multi-dimensional parameter scans (in the order of millions of parameter combinations) via an initial value problem solver are an efficient tool to discover new features of dynamical systems that are hard to find by other means. The employed initial value problem solver is an in-house code written in C++ and CUDA C software environments, which can exploit the high processing power of professional graphics cards (GPUs). The first test model is the Keller–Miksis equation, a non-linear oscillator describing the dynamics of a driven single spherical gas bubble placed in an infinite domain of liquid. This equation is important in the field of cavitation and sonochemistry. Here, the high-resolution parameter scans gave us the opportunity to lay down the basis of a non-feedback technique to control multi-stability in which direct selection of the desired attractor is possible. The second test model is related to a pressure relief valve that can exhibit a special kind of impact dynamics called grazing impact. A fine scan of the initial conditions revealed a second focal point of the grazing lines in the initial-condition space that was hidden in previous studies.


Introduction
Non-linear dynamics has received a lot of attention since the discovery of the chaotic Lorenz attractor [1]. It opened a Pandora's box that led to a series of further discoveries of other phenomena such as additional kinds of bifurcations [2,3], multi-stability and its control [4][5][6][7][8], various routes to chaos and its control [9][10][11][12][13], transient chaotic behaviour [14,15] or the characterisation of non-linear resonance phenomena [16][17][18][19][20][21], to name a few. Investigations of a large number of classical low-dimensional equations showed that the above-mentioned phenomena are universal features of non-linear systems. The corresponding emerging theories still play an important role in the qualitative understanding of many real-life phenomena in a large variety of scientific fields, for instance, in climate dynamics [22], social sciences [23], neurobiology [24], fluid dynamics [25], mechanical engineering [26] or in laser physics [27].
Although the aforementioned studies are important, they are usually carried out on low-dimensional systems by performing investigations only in low-dimensional parameter spaces or in the local flow of the state space. That is, their computational requirements are low compared with the capacity of an up-to-date personal computer. However, in order to explore the complex bifurcation structure in parameter space with high resolution [28][29][30][31][32], the necessary computational power can increase by orders of magnitude. For instance, even in a two-dimensional parameter plane (employing an initial value problem (IVP) solver with a resolution of 1000 × 1000), the computational requirements are increased by three orders of magnitude compared to conventional 1D bifurcation plots with the same resolution of 1000. The requirements grow further if other important, "secondary" control parameters are involved or if the application of several initial conditions is mandatory (e.g. to investigate multi-stability). The total number of the parameter combinations can easily blow up to tens or even hundreds of millions; for instance, see our recent paper about control of multi-stability [4].
At first, it might seem impractical to try to solve a two-dimensional problem with high-resolution IVP computations, since many clever techniques exist (e.g. the pseudo-arclength continuation using a boundary value problem (BVP) solver [33]) that can follow the evolution of bifurcation points even in two dimensions fast and easily. Indeed, in this way, valuable information can be obtained about the bifurcation structures [4,20,34,35,36,37]. Nevertheless, these techniques need an already found orbit to initiate the computation. Moreover, they are usually not capable of finding a new set of co-existing solutions. Thus, the BVP computations are always combined with IVP simulations, see the aforementioned references. In the present paper, we demonstrate that the application of parameter scans with quite high resolution using IVP solvers can be the source of new ideas and discoveries. For this purpose, computations are carried out on two quite different test models; for details see Sect. 2.

The history of the choice of the test models
The first test model is the periodically excited Keller-Miksis equation, a second-order ordinary differential equation describing the radial pulsation of a single spherical gas bubble placed in an infinite domain of liquid [38]. During the radial oscillation of the bubble, due to the external forcing, its contraction phase can be so rapid (collapse) that the temperature inside can reach thousands of kelvin, inducing chemical reactions [39][40][41]. Therefore, this model is extensively used in the field of sonochemistry [42][43][44][45][46][47][48][49][50][51] to estimate the collapse strength and the chemical yield of a single bubble. In one of our previous papers [4], we extended the investigation to dual-frequency driving using two harmonic components in the external excitation. Therefore, the number of control parameters was increased to four: two driving amplitudes and two driving frequencies (for simplicity, the phase shift between the components was assumed to be zero). Our purpose was to investigate the effect of dual-frequency driving on the dynamics and the collapse strength of a bubble. The main strategy was to create high-resolution bi-parametric maps in the parameter plane of the amplitudes at several fixed frequency pairs. However, during the evaluation of the results, due to the high resolution of the parameter space, special features of the bifurcation structure could be observed. They helped to reveal that with a special choice of the frequencies, specific periodic orbits can be smoothly transformed into each other; for instance, a period-2 and a period-3 attractor. This observation inspired us to develop a non-feedback technique to control multi-stability, in which direct selection of the desired attractors is possible. To the best knowledge of the authors, such a technique had not been proposed in the literature before. The present study describes how the technique was discovered, via an extension of our original work [4].
The second test model (adapted from [52]) describes the dynamics of a pressure relief valve that can exhibit impact dynamics. It is a system of three first-order ordinary differential equations. Our main purpose was to test the special features of the numerical GPU code for non-smooth dynamical systems and to reproduce some of the results presented in the original paper [52]. For some additional information about the code, the reader is referred to Sect. 3. There is a special type of impact called grazing impact related to the oscillation of the valve body. It means that the valve body approaches the valve seat, makes contact with the valve seat with zero velocity and then moves away from the seat. At a specific parameter set, the sets of initial conditions from which the pressure relief valve exhibits grazing impact are called grazing lines. They have a focal point in the initial condition space, at which an impacting Shil'nikov-like orbit exists. The grazing lines are computed by means of a BVP solver in the paper of Hős and Champneys [52], which was a cumbersome task that needed special care due to the discontinuous trajectories caused by the impact dynamics. According to personal communication with the authors, the assembly of their MATLAB code took weeks. With our GPU accelerated IVP solver, the computation of such grazing lines is reduced from a couple of hours to seconds. Moreover, the high-resolution scan of the initial conditions revealed a second focal point of the grazing lines that had been overlooked before.
It must be stressed that in both cases, the original objective was to investigate the collapse strength of a single bubble or to reproduce some results corresponding to a pressure relief valve. The aforementioned discoveries are the "side effects" of the computations of high-resolution multi-dimensional parameter/initial condition scans.

The GPU accelerated solver: MPGOS
The usage of an IVP solver performing high-resolution parametric scans is sometimes called the "brute force" technique. It is easy to scale up the number of the parameters, their resolution and the number of the initial conditions; however, writing efficient computer code that does the task within a reasonable time is far from obvious. This is especially true in our case, as we intend to employ the high processing power of professional graphics cards (GPUs). It is not trivial to use their massively parallel hardware architecture and to distribute the workload evenly to tens of thousands of parallel threads.
The developed program package (also used here) is called Massively-Parallel-GPU-ODE-Solver (MPGOS). It is written in C++ and CUDA C software environments and is capable of distributing the tasks to multiple GPUs. It supports explicit solvers: the classic Runge-Kutta solver with fixed time-stepping, and the adaptive Runge-Kutta-Cash-Karp method with embedded error estimation of orders 4 and 5. During the simulations of the present study, the adaptive solver is used. Event handling is also incorporated into the program package. It is mandatory for detecting the impacts in the case of the pressure-relief-valve test model. In addition, with specialized user-defined functions, the user can manipulate the trajectories after every successful time step or event detection during the GPU computations. In this way, the impact law can be applied immediately upon the detection of an impact and the integration can be continued. Thus, it is not necessary to stop the integration or perform expensive memory transactions to apply the impact law via the CPU. The code is quite efficient: a simulation is approximately 50 times faster on an Nvidia GeForce GTX Titan Black card (1707 GFLOPS peak performance) than on a four-core Intel Core i7-4790 CPU (115 GFLOPS peak performance) using double precision floating point arithmetic. The parallelisation strategy in the GPU code follows the "per-thread" approach; that is, each GPU thread is associated with a different instance of the investigated system having different initial conditions or parameter sets. In the case of the CPU code, the different instances of the system were distributed amongst the CPU cores via the OpenMP application programming interface (API). A single CPU core solved a single instance of a system at a time. It must be emphasised that the reported speed-up is an estimate obtained with the Keller-Miksis equation introduced in Sect. 4; the achievable reduction of the runtime can depend strongly on the investigated ODE system (handling of special events like impact, or the number of evaluations of transcendental functions or divisions). The detailed description of the code is beyond the scope of the present paper; however, it has to be stressed that such a fast and efficient solver was the key to achieving the aforementioned discoveries. For more details, the interested reader is referred to the official website of the program package, www.gpuode.com, or to its GitHub repository [53]. It is free to use under an MIT license and it has a detailed manual [54] with tutorial examples.
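The adaptive scheme named above can be illustrated with a minimal CPU sketch of one embedded Cash-Karp step and a simple step-size controller. This is not the MPGOS implementation: the function names `rkck_step` and `integrate`, the error norm (maximum absolute component) and the controller constants (safety factor 0.9, growth limits 0.2 and 5) are assumptions chosen for illustration; only the Butcher tableau is the standard Cash-Karp one.

```python
# Cash-Karp embedded Runge-Kutta pair (orders 4 and 5); illustrative sketch only.
C = (0.0, 1/5, 3/10, 3/5, 1.0, 7/8)
A = (
    (),
    (1/5,),
    (3/40, 9/40),
    (3/10, -9/10, 6/5),
    (-11/54, 5/2, -70/27, 35/27),
    (1631/55296, 175/512, 575/13824, 44275/110592, 253/4096),
)
B5 = (37/378, 0.0, 250/621, 125/594, 0.0, 512/1771)               # 5th-order weights
B4 = (2825/27648, 0.0, 18575/48384, 13525/55296, 277/14336, 1/4)  # 4th-order weights


def rkck_step(f, t, y, h):
    """One Cash-Karp step: returns the 5th-order solution and an error estimate."""
    n = len(y)
    k = []
    for i in range(6):
        yi = [y[j] + h * sum(A[i][m] * k[m][j] for m in range(i)) for j in range(n)]
        k.append(f(t + C[i] * h, yi))
    y5 = [y[j] + h * sum(B5[i] * k[i][j] for i in range(6)) for j in range(n)]
    err = max(abs(h * sum((B5[i] - B4[i]) * k[i][j] for i in range(6))) for j in range(n))
    return y5, err


def integrate(f, t0, y0, t1, tol=1e-9, h=1e-2):
    """Adaptive integration from t0 to t1 (assumed controller, not the MPGOS one)."""
    t, y = t0, list(y0)
    while t < t1:
        h = min(h, t1 - t)
        y_new, err = rkck_step(f, t, y, h)
        if err <= tol:                      # accept the step
            t, y = t + h, y_new
        factor = 0.9 * (tol / err) ** 0.2 if err > 0.0 else 5.0
        h *= min(5.0, max(0.2, factor))     # bounded step-size update
    return y
```

In a per-thread GPU version, each thread would run such a loop on its own instance of the system; here a single instance is integrated, e.g. `integrate(lambda t, y: [-y[0]], 0.0, [1.0], 1.0)`.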

Mathematical model of the dual-frequency driven single bubble
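The first test model is the Keller-Miksis equation [38] describing the radial pulsation of a single spherical bubble placed in an infinite domain of liquid. The equation reads as
$$\left(1-\frac{\dot{R}}{c_L}\right)R\ddot{R}+\left(1-\frac{\dot{R}}{3c_L}\right)\frac{3}{2}\dot{R}^2=\left(1+\frac{\dot{R}}{c_L}+\frac{R}{c_L}\frac{d}{dt}\right)\frac{p_L-p_\infty(t)}{\rho_L}, \tag{1}$$
where $R(t)$ is the time dependent bubble radius. The values of the material properties of the employed liquid (water) are $c_L = 1497.3$ m/s (sound speed) and $\rho_L = 997.1$ kg/m$^3$ (density). According to the general, dual-frequency treatment, the pressure far away from the bubble,
$$p_\infty(t)=P_\infty+P_{A1}\sin(\omega_1 t)+P_{A2}\sin(\omega_2 t+\theta), \tag{2}$$
is the sum of a static ambient pressure, $P_\infty$, and periodic components with pressure amplitudes $P_{A1}$ and $P_{A2}$, angular frequencies $\omega_1$ and $\omega_2$, and with a phase shift $\theta$. The connection between the pressures at the bubble interface can be written as
$$p_G+p_V=p_L+\frac{2\sigma}{R}+4\mu_L\frac{\dot{R}}{R}, \tag{3}$$
where the total pressure inside the bubble is the sum of the partial pressures of the non-condensable gas, $p_G$, and the vapour, $p_V = 3166.8$ Pa at ambient temperature of 25 °C. The surface tension is $\sigma = 0.072$ N/m and the liquid dynamic viscosity is $\mu_L = 8.902 \times 10^{-4}$ Pa s. The gas inside the bubble obeys a simple polytropic relationship
$$p_G=\left(P_\infty-p_V+\frac{2\sigma}{R_E}\right)\left(\frac{R_E}{R}\right)^{3\gamma}, \tag{4}$$
where the polytropic exponent is $\gamma = 1.4$ (adiabatic behaviour), $R_E$ is the equilibrium bubble radius (given in μm) and the static pressure is $P_\infty = 1$ bar. System (1)-(4) is written into a dimensionless form by introducing the dimensionless variables
$$\tau=\frac{\omega_1}{2\pi}t, \tag{5}$$
$$y_1=\frac{R}{R_E}, \tag{6}$$
$$y_2=y_1', \tag{7}$$
where the prime denotes the derivative with respect to $\tau$. The equations are rearranged in order to minimize the number of coefficients. The final form is
$$y_1'=y_2, \tag{8}$$
$$y_2'=\frac{N_{KM}}{D_{KM}}, \tag{9}$$
where
$$N_{KM}=(C_0+C_1y_2)\left(\frac{1}{y_1}\right)^{C_{10}}-C_2(1+C_9y_2)-\frac{C_3}{y_1}-C_4\frac{y_2}{y_1}-\left(1-\frac{C_9y_2}{3}\right)\frac{3}{2}y_2^2-\left(C_5\sin(2\pi\tau)+C_6\sin(2\pi C_{11}\tau+C_{12})\right)(1+C_9y_2)-y_1\left(C_7\cos(2\pi\tau)+C_8\cos(2\pi C_{11}\tau+C_{12})\right) \tag{10}$$
and
$$D_{KM}=y_1-C_9y_1y_2+C_4C_9. \tag{11}$$
For completeness and reproducibility, the coefficients are summarised below:
$$C_0=\frac{1}{\rho_L}\left(P_\infty-p_V+\frac{2\sigma}{R_E}\right)\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{12}$$
$$C_1=\frac{1-3\gamma}{\rho_Lc_L}\left(P_\infty-p_V+\frac{2\sigma}{R_E}\right)\frac{2\pi}{R_E\omega_1}, \tag{13}$$
$$C_2=\frac{P_\infty-p_V}{\rho_L}\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{14}$$
$$C_3=\frac{2\sigma}{\rho_LR_E}\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{15}$$
$$C_4=\frac{4\mu_L}{\rho_LR_E^2}\frac{2\pi}{\omega_1}, \tag{16}$$
$$C_5=\frac{P_{A1}}{\rho_L}\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{17}$$
$$C_6=\frac{P_{A2}}{\rho_L}\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{18}$$
$$C_7=R_E\frac{\omega_1P_{A1}}{\rho_Lc_L}\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{19}$$
$$C_8=R_E\frac{\omega_2P_{A2}}{\rho_Lc_L}\left(\frac{2\pi}{R_E\omega_1}\right)^2, \tag{20}$$
$$C_9=\frac{R_E\omega_1}{2\pi c_L}, \tag{21}$$
$$C_{10}=3\gamma, \tag{22}$$
$$C_{11}=\frac{\omega_2}{\omega_1}, \tag{23}$$
$$C_{12}=\theta. \tag{24}$$
The angular frequencies $\omega_1$ and $\omega_2$ are normalized by the linear, undamped eigenfrequency [55]
$$\omega_E=\sqrt{\frac{3\gamma(P_\infty-p_V)}{\rho_LR_E^2}+\frac{2(3\gamma-1)\sigma}{\rho_LR_E^3}} \tag{25}$$
of the unexcited system that defines the relative frequencies as
$$\omega_{R1}=\frac{\omega_1}{\omega_E}, \tag{26}$$
$$\omega_{R2}=\frac{\omega_2}{\omega_E}. \tag{27}$$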
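As an illustration of how the dimensionless model can be evaluated numerically, the following sketch computes a set of coefficients and the right-hand side of the first-order system. It assumes the standard Keller-Miksis formulation non-dimensionalised with $\tau = \omega_1 t / (2\pi)$ and $y_1 = R/R_E$; the coefficient ordering `C[0]`..`C[12]`, the function names, and in particular the default equilibrium radius `RE = 10e-6` m are assumptions made for this example and are not taken from the text.

```python
import math


def km_coefficients(PA1, PA2, wR1, wR2, theta=0.0, RE=10e-6):
    """Coefficients of the dimensionless dual-frequency Keller-Miksis system.

    PA1, PA2 in Pa; wR1, wR2 are relative frequencies; RE (m) is a hypothetical value.
    """
    cL, rhoL = 1497.3, 997.1                 # sound speed (m/s), density (kg/m^3)
    pV, sigma, muL = 3166.8, 0.072, 8.902e-4
    gamma, Pinf = 1.4, 1.0e5                 # polytropic exponent, 1 bar in Pa
    wE = math.sqrt(3 * gamma * (Pinf - pV) / (rhoL * RE**2)
                   + 2 * (3 * gamma - 1) * sigma / (rhoL * RE**3))
    w1, w2 = wR1 * wE, wR2 * wE
    Pg0 = Pinf - pV + 2 * sigma / RE         # equilibrium gas pressure
    s = (2 * math.pi / (RE * w1)) ** 2
    return [
        Pg0 / rhoL * s,
        (1 - 3 * gamma) / (rhoL * cL) * Pg0 * 2 * math.pi / (RE * w1),
        (Pinf - pV) / rhoL * s,
        2 * sigma / (rhoL * RE) * s,
        4 * muL / (rhoL * RE**2) * 2 * math.pi / w1,
        PA1 / rhoL * s,
        PA2 / rhoL * s,
        RE * w1 * PA1 / (rhoL * cL) * s,
        RE * w2 * PA2 / (rhoL * cL) * s,
        RE * w1 / (2 * math.pi * cL),
        3 * gamma,
        w2 / w1,
        theta,
    ]


def km_rhs(tau, y, C):
    """Right-hand side [y1', y2'] of the dimensionless system."""
    y1, y2 = y
    a1 = 2 * math.pi * tau
    a2 = 2 * math.pi * C[11] * tau + C[12]
    N = ((C[0] + C[1] * y2) * y1 ** (-C[10])
         - C[2] * (1 + C[9] * y2) - C[3] / y1 - C[4] * y2 / y1
         - 1.5 * (1 - C[9] * y2 / 3) * y2**2
         - (C[5] * math.sin(a1) + C[6] * math.sin(a2)) * (1 + C[9] * y2)
         - y1 * (C[7] * math.cos(a1) + C[8] * math.cos(a2)))
    D = y1 - C[9] * y1 * y2 + C[4] * C[9]
    return [y2, N / D]
```

A quick consistency check of such a sketch is that the unforced bubble at rest ($y_1 = 1$, $y_2 = 0$, zero pressure amplitudes) is an equilibrium, so the second component of the right-hand side vanishes up to rounding.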

The global Poincaré section
Due to the dual-frequency driving, the external forcing is not purely harmonic. In Eq. (10), the two dimensionless angular frequencies are $2\pi$ and $2\pi C_{11}$, where
$$C_{11}=\frac{\omega_2}{\omega_1}=\frac{\omega_{R2}}{\omega_{R1}} \tag{28}$$
is the frequency ratio. The corresponding periods are $T_1 = 1$ and
$$T_2=\frac{1}{C_{11}}. \tag{29}$$
For simplicity, the relative phase shift between the harmonic components is set to $\theta = C_{12} = 0$. During the computations, the main control parameters are the pressure amplitudes while the frequency combinations are kept fixed. The ratio of the employed frequency pairs is always rational; thus, the dual-frequency driving is still periodic (quasiperiodic forcing is excluded). Its period $T$, which is the least common multiple of $T_1$ and $T_2$, can be used to define the global Poincaré section of the system. That is, the trajectories are sampled at time instances $\tau_n = n \cdot T$ ($n = 0, 1, 2, \ldots$).
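The common period of the driving can be computed exactly for rational relative frequencies. The sketch below is only an illustration: the helper name `global_period` is hypothetical, and it relies on the fact that the least common multiple of two reduced fractions $a/b$ and $c/d$ is $\mathrm{lcm}(a, c)/\gcd(b, d)$.

```python
from fractions import Fraction
from math import gcd


def global_period(wR1, wR2):
    """Smallest T that is an integer multiple of both T1 = 1 and T2 = wR1 / wR2."""
    T1 = Fraction(1)
    T2 = Fraction(wR1) / Fraction(wR2)   # dimensionless period of the second component
    # lcm of reduced fractions a/b and c/d equals lcm(a, c) / gcd(b, d)
    num = T1.numerator * T2.numerator // gcd(T1.numerator, T2.numerator)
    den = gcd(T1.denominator, T2.denominator)
    return Fraction(num, den)
```

For relative frequencies 4 and 3, `global_period(4, 3)` gives 4, that is, four periods of the first component and three periods of the second one.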

The discovery of a non-feedback technique to directly control multi-stability
In order to represent the dynamical properties of a bubble in a four-dimensional parameter space, our strategy is to compute high-resolution bi-parametric plots with the pressure amplitudes $P_{A1}$ and $P_{A2}$ as control parameters applying fixed relative frequency pairs ($\omega_{R1}$, $\omega_{R2}$). The pressure amplitudes are varied between 0 and 5 bar with 501 uniformly distributed values. In order to explore the co-existing attractors, 10 randomly chosen initial conditions are used. In our experience, this was enough to find the most relevant attractors and to draw meaningful conclusions. Thus, a single bi-parametric computation consists of approximately 2.5 million initial value problems ($501 \times 501 \times 10$). In the first part of the investigation, the relative frequencies are selected from a fixed set of values, and bi-parametric computations are performed at every possible relative frequency combination, meaning a total number of 36 frequency pairs (taking into account the symmetry property of the driving). In order to explore the subharmonic resonance region in more detail, an additional series of simulations was performed with every possible combination of a second set of frequency values. This means 22 additional high-resolution bi-parametric plots (taking into account again the symmetry property and the pairs of frequencies already computed during the first computation period). Thus, the overall number of the solved initial value problems is approximately 145 million. At each parameter combination, the first 2048 iterations are regarded as transients and discarded. Then the system is integrated further by an additional 8192 iterations to achieve convergence of averaged quantities like the Lyapunov exponent or the winding number. One iteration means the integration of the system from 0 to the period of the excitation $T$, see Sect. 4.1. To avoid code complexity, the numbers of the iterations mentioned above are the same for all instances of the initial value problems being solved, and they turned out to be enough according to our preliminary calculations. Thus, the convergence of the transients and the average quantities are not monitored. Besides the aforementioned averaged quantities, the period, the maximum bubble radius expansion and the subsequent minimum bubble radius (important to calculate the collapse strength of the bubble oscillation) are also stored. Furthermore, 32 points of the Poincaré section of the last 32 iterations are also recorded. From the various quantities, only the period and the points of the Poincaré section are used in the present study.
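The problem sizes quoted above follow from simple counting, which can be reproduced with a few lines. The variable names are illustrative only; all numbers are the ones stated in the text.

```python
n_amplitudes = 501          # values of each pressure amplitude between 0 and 5 bar
n_initial_conditions = 10   # random initial conditions per parameter pair

ivps_per_plot = n_amplitudes * n_amplitudes * n_initial_conditions
frequency_pairs = 36 + 22   # first series plus the additional subharmonic series
total_ivps = ivps_per_plot * frequency_pairs

iterations_per_ivp = 2048 + 8192   # discarded transients plus evaluated iterations
total_iterations = total_ivps * iterations_per_ivp

print(ivps_per_plot)   # 2510010, i.e. about 2.5 million
print(total_ivps)      # 145580580, i.e. about 145 million
```

Each iteration is itself an integration over one excitation period $T$, so the total number of period-long integrations is `total_iterations`, which is roughly $1.5 \times 10^{12}$.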
Figure 1 shows four typical bi-parametric periodicity diagrams at different relative frequency combinations. The colour code represents the maximum period up to period-6 found at a given parameter set. Chaotic oscillations or orbits with periodicity higher than six occupy the black regions. In the case of co-existing attractors, only the highest period is plotted. Keep in mind that the axes in the figures represent single frequency driving since one of the pressure amplitudes is zero in these cases. The bifurcation structure in many of such diagrams shows extreme complexity, where it is hard to find a clear regularity in the bifurcation patterns, see e.g. the upper panels of Fig. 1. However, at specific frequency combinations, bridge-shaped structures appear connecting periodic segments from the vertical axis to the horizontal axis, or vice versa. Such a bifurcation structure can be clearly seen in the bottom panels of Fig. 1. Consequently, periodic orbits of single frequency driving at different relative frequencies can be transformed into each other via a temporary dual-frequency driving. The bottom-left panel of Fig. 1 is investigated in more detail in the following to give an in-depth description of the phenomenon.
Figure 2 shows a 3D representation of the period-1 orbits (yellow and grey surfaces) corresponding to relative frequencies $\omega_{R1} = 4$ and $\omega_{R2} = 3$, where the second component of the points of the Poincaré section $P(y_2)$ is presented as a function of the pressure amplitudes $P_{A1}$ and $P_{A2}$. Keep in mind that the global Poincaré section is chosen according to the period of the dual-frequency driving $T$, which is different from the periods of the individual components $T_1$ and $T_2$. For the present frequency combination, $T = 4$, $T_1 = 1$ and $T_2 = 4/3 \approx 1.333$ in terms of the dimensionless time $\tau$. That is, the simulation defines every orbit as period-1 that repeats itself after every $\Delta\tau = 4$. Therefore, for single frequency driving using $\omega_{R1} = 4$ ($T = 4T_1$), all period-1 and period-4 orbits are treated as period-1 solutions in the dual-frequency simulations. Similarly, in the case of a single frequency driven system with $\omega_{R2} = 3$ ($T = 3T_2$), all period-1 and period-3 orbits are regarded as period-1 solutions if the dual-frequency Poincaré map is applied. For an exhaustive discussion of the "period reduction" described above, the reader is referred to our previous paper [4].
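The "period reduction" can be expressed compactly. If an orbit has minimal period $p$ with respect to the single-frequency Poincaré map (sampling every $T_i$) and the global section samples every $T = m T_i$, then its period with respect to the global map is $p / \gcd(p, m)$. The sketch below illustrates this counting argument; the function name is hypothetical and the formula is a general property of iterated maps rather than something taken from the paper.

```python
from math import gcd


def period_under_global_map(p, m):
    """Period seen by the global Poincare map when T = m * T_i.

    p: minimal period of the orbit under the single-frequency map (in units of T_i)
    m: number of single-frequency periods contained in the global period T
    """
    return p // gcd(p, m)


# omega_R1 = 4 (T = 4 T_1): period-1 and period-4 orbits both appear as period-1
print(period_under_global_map(1, 4), period_under_global_map(4, 4))
# omega_R2 = 3 (T = 3 T_2): period-1 and period-3 orbits both appear as period-1
print(period_under_global_map(1, 3), period_under_global_map(3, 3))
```

By the same argument, an orbit whose period does not divide $m$ keeps a period larger than one; for example, a period-3 orbit sampled with $m = 4$ is still seen as period-3.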
Let us summarise the colour code in Fig. 2. The red curves represent period-3 orbits using a single frequency Poincaré map if only the second frequency component is active ($\omega_{R2} = 3$, $P_{A1} = 0$). The green curves represent period-4 orbits of single frequency driving (again using a single frequency Poincaré map) with relative frequency $\omega_{R1} = 4$ ($P_{A2} = 0$). Finally, the yellow and grey surfaces and both the red and green curves are the second components of the Poincaré section of period-1 orbits corresponding to the dual-frequency driving (as already discussed above). The surfaces are presented with different colours (yellow and grey) only for better visibility. It can be clearly seen how these surfaces make connections between the period-3 and period-4 orbits related to different relative frequency values. That is, these two kinds of orbits can be transformed into each other via a temporary dual-frequency excitation.
Although the above-described orbits are related to different relative frequency values, a special kind of control of multi-stability can be achieved in this way if the period-3 (red curves) and the period-4 (green curves) attractors have overlapping domains in the frequency-amplitude parameter plane in the case of single frequency driving. However, the transformation works well even if such overlapping domains do not exist. Thus, one can still drive the system from one attractor to another regardless of their co-existence. Observe that such a control technique is a non-feedback method, but the direct selection of the desired attractor is nevertheless possible. Up to now, this was possible only by feedback control techniques [6]. A thorough discussion of the advantages and the drawbacks can be found in our already mentioned previous work [4]; however, only for the transformation between period-2 and period-3 orbits. Therefore, the results presented here indicate that the control technique can be generalised to other pairs of periodic orbits.
It must be emphasized that the high-resolution, multi-dimensional parameter scans have played a vital role in the discovery of the new non-feedback control technique. Since not all the bi-parametric plots show even a sign of the transformation possibility (see e.g. the top panels of Fig. 1), it is very likely that by investigating only a few frequency combinations or by using coarse resolutions for the pressure amplitudes, we would have missed the special bifurcation structure that led to the discovery. Moreover, as the total number of parameter combinations is of the order of a hundred million, high-performance GPU computing was a prerequisite of this success.
Fig. 1 Periodicity diagram of bi-parametric plots with pressure amplitudes as control parameters at different relative frequency pairs. The colour code represents the highest period (up to period-6) found at a given parameter set. Inside the black domains, there are chaotic solutions or orbits having period higher than six. In the case of co-existing attractors, only the highest period is plotted.

The second test case describes the behaviour of a pressure relief valve that can exhibit impact dynamics. The dimensionless governing equations are adopted from [52] and are written as
$$y_1'=y_2, \tag{30}$$
$$y_2'=-\kappa y_2-(y_1+\delta)+y_3, \tag{31}$$
$$y_3'=\beta\left(q-\sqrt{y_3}\,y_1\right), \tag{32}$$
where $y_1$ and $y_2$ are the displacement and the velocity of the valve body, respectively, and the prime denotes the derivative with respect to the dimensionless time. The pressure relief valve is attached to a reservoir chamber in which the dimensionless pressure is $y_3$. The fixed parameters in the system during the computations are as follows: $\kappa = 1.25$ is the damping coefficient, $\delta = 10$ is the precompression parameter, $\beta = 20$ is the compressibility parameter and $q = 0.3$ is the dimensionless flow rate.
In Eqs. (30)-(32), the zero value of the displacement ($y_1 = 0$) means that the valve body is in contact with the seat of the valve. If the velocity of the valve body $y_2$ has a non-zero, negative value at this point, the following impact law is applied:
$$y_2^+=-r\,y_2^-, \tag{33}$$
where the superscripts $-$ and $+$ denote the values immediately before and after the impact. That is, the velocity of the valve body is reversed and scaled by the Newtonian coefficient of restitution $r = 0.8$ that approximates the loss of energy of the impact.
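The valve model and its impact law translate directly into two small functions. The sketch below assumes the commonly quoted form of the dimensionless model from [52] (valve acceleration $-\kappa y_2 - (y_1 + \delta) + y_3$, pressure dynamics $\beta(q - \sqrt{y_3}\,y_1)$) together with the parameter values listed in the text; the function names are hypothetical.

```python
import math

KAPPA = 1.25   # damping coefficient
DELTA = 10.0   # precompression parameter
BETA = 20.0    # compressibility parameter
Q = 0.3        # dimensionless flow rate
R = 0.8        # Newtonian coefficient of restitution


def valve_rhs(y):
    """Right-hand side of the valve model (assumed form, see lead-in)."""
    y1, y2, y3 = y
    return [
        y2,
        -KAPPA * y2 - (y1 + DELTA) + y3,
        BETA * (Q - math.sqrt(y3) * y1),
    ]


def apply_impact(y):
    """Impact law at the seat (y1 = 0): reverse and damp a negative velocity."""
    y1, y2, y3 = y
    if y2 < 0.0:
        y2 = -R * y2
    return [y1, y2, y3]


print(valve_rhs([0.0, 0.4, 8.66]))
print(apply_impact([0.0, -1.0, 5.0]))
```

In a GPU solver of the per-thread type, `apply_impact` corresponds to the user-defined action executed immediately after an event is detected, so that no data has to be transferred to the CPU.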

The discovery of a new focal point of grazing lines
During the oscillation of the valve body of a pressure relief valve, it can exhibit impact dynamics (the valve body is in contact with the valve seat) that can be categorised as follows. The transversal impact has a non-zero velocity during the impact ($y_2 < 0$); that is, it is a "normal" impact. In contrast, the so-called grazing impact occurs when the impact happens with zero velocity ($y_2 = 0$). In this case, the impact law has no real effect as the valve body only touches the valve seat. Figure 3 shows the $y_1$ component of two trajectories that exhibit impacts ($y_1 = 0$). The red dots denote the grazing impacts. The simulations are stopped at the next impact. In both cases, the initial conditions for the first two components are $y_{10} = 0$ and $y_{20} = 0.4$. The only difference is in the third initial condition: $y_{30} = 8.66$ and $y_{30} = 8.58$, depicted also in the figure. The employed parameter set is summarized in Sect. 6. The grazing impact can also be labelled (for a specific initial condition) according to how many transversal impacts there were before. Thus, in Fig. 3, the grazing impacts are denoted as $G_0$ (zero transversal impacts) and $G_2$ (two transversal impacts).
From a theoretical point of view, the generalization of the grazing impacts to the $y_{20}$-$y_{30}$ initial condition plane is an interesting problem. The first component of the initial condition is always set to $y_{10} = 0$. In this way, $G^{(k)}$ denotes a set of points in the $y_{20}$-$y_{30}$ initial condition plane that leads to a grazing impact after $k$ transversal impacts. Throughout this paper, we shall call such a set of points the grazing lines of order $k$. The first seven grazing lines computed by Hős and Champneys [52] are shown in the bottom-right panel of Fig. 4. Their strategy was to use a BVP solver and to employ the pseudo-arclength continuation technique to follow the path of the curves initiated from the results of an IVP solver. This formalism is quite complex, as for a single BVP, one needs to define sub-BVPs for each of the $k + 1$ segments divided by the impacts. These are coupled via the impact law for the internal connections. At one side of the full BVP, the grazing condition has to be prescribed, while at the other side, the condition $y_1 = 0$. Furthermore, the time instances of the intermediate transversal impacts need to be tracked properly as well. The main drawback of this approach is that for different values of $k$, a different set of BVPs has to be set up and solved. These are the main reasons why the total computational time of a single grazing line was as high as several hours (according to personal communications with the authors). In addition, the implementation of the solver took weeks. The main outcome of the results is that the grazing lines are organized as spirals with a single focal point; and at this focal point, a Shil'nikov-like orbit exists with impacts, see again [52].
Another way to compute the grazing lines is to take an IVP solver (like our GPU accelerated solver), solve the system forward in time, stop the integration after $k + 1$ impacts and register the velocity of the endpoint $y_{2E}$. If this velocity is zero, the corresponding initial condition lies on a grazing line of order $k$ denoted as $G^{(k)}$. With a fine resolution of the set of initial conditions in the $y_{20}$-$y_{30}$ plane, the grazing lines can be drawn easily by creating a contour plot of the $y_{2E}$ value. Theoretically, the zero iso-lines represent the corresponding grazing line.
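The IVP-based procedure described above can be sketched for a single initial condition as follows. This is a deliberately simple CPU illustration and not the MPGOS algorithm: it assumes the commonly quoted form of the valve model from [52], uses a fixed-step classical Runge-Kutta scheme instead of the adaptive solver, and locates each impact by bisection on the step size. The function names, the step size `h` and the time limit `t_max` are assumptions.

```python
import math

KAPPA, DELTA, BETA, Q, R = 1.25, 10.0, 20.0, 0.3, 0.8


def rhs(y):
    """Valve model (assumed form); max() guards the square root numerically."""
    y1, y2, y3 = y
    return (y2, -KAPPA * y2 - (y1 + DELTA) + y3,
            BETA * (Q - math.sqrt(max(y3, 0.0)) * y1))


def rk4(y, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, s):
        return tuple(yi + s * ki for yi, ki in zip(y, k))
    k1 = rhs(y)
    k2 = rhs(shifted(k1, h / 2))
    k3 = rhs(shifted(k2, h / 2))
    k4 = rhs(shifted(k3, h))
    return tuple(y[i] + h / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
                 for i in range(3))


def endpoint_velocity(y20, y30, k, h=1e-3, t_max=50.0):
    """Velocity y_2E at the (k+1)-th impact, or None if t_max is reached first."""
    y, t, impacts = (0.0, y20, y30), 0.0, 0
    while t < t_max:
        y_new = rk4(y, h)
        if y_new[0] >= 0.0:              # no impact within this step
            y, t = y_new, t + h
            continue
        lo, hi = 0.0, h                  # bisection for the impact time in [0, h]
        for _ in range(50):
            mid = 0.5 * (lo + hi)
            if rk4(y, mid)[0] >= 0.0:
                lo = mid
            else:
                hi = mid
        hit = rk4(y, lo)
        impacts += 1
        if impacts == k + 1:
            return hit[1]                # velocity at the final impact
        y, t = (0.0, -R * hit[1], hit[2]), t + lo   # apply the impact law
    return None
```

A scan of the $y_{20}$-$y_{30}$ plane would call `endpoint_velocity` once per grid point; in a per-thread GPU implementation each of these calls would be handled by its own thread. Values of the result close to zero indicate initial conditions near a grazing line of order $k$.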
The $G^{(1)}$ curve computed with our GPU-ODE solver is presented in the top-left panel of Fig. 4 via a white-red colour-coded plot. Here the integrations are stopped at the second impact ($k + 1 = 2$). The resolution of the initial conditions is $1024 \times 1024$ and the total computation time is merely 4 s. The pure white colour represents the zero value of $y_{2E}$. The pure red colour means $y_{2E} > 1.5$ m/s. Between $1.5 > y_{2E} > 0$, the transition is uniform in the colour code. Interestingly, the zero values always lie at a discontinuity, see the jump in the colour code labelled by $G^{(1)}$ in the top-left panel of Fig. 4. Accordingly, the grazing lines can be easily identified as a jump in the value of $y_{2E}$. In this sense, the task can be reduced to an edge detection problem; this is beyond the scope of the present study. The computations corresponding to the $G^{(2)}$ and $G^{(5)}$ curves are shown in the top-right and bottom-left panels of Fig. 4, respectively. In the case of $G^{(2)}$, a second focal point already appears in the initial condition plane, which was not observed in the BVP computations of Hős and Champneys [52]. The two focal points are also connected with an additional $G^{(2)}$ curve. Interestingly, the $G^{(1)}$ curve also appears as a discontinuity in the $y_{2E}$ values; however, on either side $y_{2E} \neq 0$. Therefore, this curve can be seen only as a "pale" dark red-light red transition. The reason for the non-zero velocity is that the integration is stopped at the third impact for $G^{(2)}$ instead of at the second one required for the detection of $G^{(1)}$. Nevertheless, an edge detection algorithm can find both the $G^{(1)}$ and $G^{(2)}$ curves from a single computation with $k = 2$. The grazing lines corresponding to $k = 5$ are presented in the bottom-left panel of Fig. 4. Similarly to the case of $k = 2$, all the previous grazing lines ($k = 1, \ldots, 4$) are visible in the figure, making it extremely complex. Thus, to detect the edges properly, a suitably fine resolution is necessary. This is not a problem in our case, as a single computation with one million initial conditions takes only a couple of seconds. Observe that in the bottom-left panel, no further focal points are discovered apart from the second one.
In summary, high-resolution scans of the initial conditions using our GPU accelerated IVP solver have revealed an additional feature (a second focal point) of the grazing lines in the $y_{20}$-$y_{30}$ initial condition plane. This shows that fast "brute force" scanning is nowadays able to discover features that are otherwise not visible or are overlooked, here by the available BVP approach. At first sight, high-performance computation might seem excessive for this problem. Even without using GPUs, the above "brute force" computations can be done within a few hours using MATLAB on a CPU. However, the main message here is that considering the usage of a "brute force" approach can lead to unexpected discoveries. Although in this specific example high-performance computing is not really mandatory, in general, to obtain results within reasonable time for a detailed "brute force" computation, the application of high-performance GPU (and/or CPU) clusters is usually a must.

Fig. 4 Grazing lines computed by the GPU accelerated IVP solver (colour-coded panels) and the BVP solver with the pseudo-arclength continuation technique (bottom-right panel, reprinted with permission from Hős and Champneys [52]). The pure white colour represents the zero value of the velocity of the valve body at an impact (grazing impact). The pure red colour means a velocity of 1.5 m/s or higher. Between 1.5 m/s and 0 m/s, the transition is uniform in the colour code. (Color figure online)

Summary
In this paper, the efficacy of the "brute force" technique combined with high-performance GPU computing is demonstrated through two test cases. The first model, the Keller-Miksis equation, is related to the scientific topic of sonochemistry and bubble dynamics. Apart from mapping the dynamics of bubbles to obtain approximate information about their chemical activity, the bifurcation structure of the high-resolution plots led to the discovery of a new technique to control multi-stability. The second model describes the behaviour of a pressure relief valve that can exhibit non-smooth impact dynamics. Results in the literature revealed that the grazing lines (computed via a boundary value problem solver) in the initial condition plane are organized around a spiral hub. The high-resolution scans of the initial conditions using our GPU accelerated initial value problem solver led to the discovery of a new focal point of the grazing lines. In summary, the "brute force" technique can play an important role in many fields of science, including non-linear dynamics.
In general, the prerequisite for employing high-resolution parameter scans is a fast solver. If high computational capacities are required, a natural choice is the usage of CPU clusters that are available in many research institutes. The advantage of this approach is that highly optimised libraries are available for CPUs. However, GPUs have an outstanding computational capacity/price ratio, which makes them a good alternative to CPUs. Although the parallelisation strategy for parameter scans seems to be straightforward (assign a GPU thread to each parameter combination) and libraries supporting the solution of ODEs on GPUs are already available, there can still be many special issues resulting in a large performance drop.
For instance, the extremely slow CPU-GPU memory transactions need to be avoided at all costs. This can be a cumbersome task, for example, for systems with impact dynamics, where thousands of parallel threads (each having its own instance of the ODE with a specific parameter combination) can encounter an impact at any time. What should the programmer do if a single thread is impacting? One option is to stop the whole computation, apply the impact law on that specific thread and continue the integration process. This can be quite inefficient if CPU computations have to be involved (depending on the interface and data structure of the package used), and there is always an overhead to restart the simulation as well. Thus, an efficient solver has to be able to detect impacts (via event handling) and manipulate the trajectory immediately "on the fly" on the GPU for each thread selectively.
There can be several other issues that may have a negative effect on code performance if GPUs are involved. Thus, scaling up the number of the parameters is easy, but a fast and efficient GPU solver usually needs a clever implementation. Such a detailed discussion is beyond the scope of the present study. Nevertheless, our GPU code is designed to efficiently address the majority of these issues. For more details, the reader is again referred to the manual of the program package [54] and to its website www.gpuode.com or to its GitHub repository [53].

Fig. 2 The second component of the Poincaré section $P(y_2)$ of the period-1 orbits versus the pressure-amplitude parameter plane of the dual-frequency driving. (Color figure online)

Fig. 3 Time series exhibiting transversal and grazing impacts applying different initial conditions.The grazing impacts are marked by the red dots.(Color figure online)