Rounding errors may be beneficial for simulations of atmospheric flow: results from the forced 1D Burgers equation

Inexact hardware can reduce computational cost, due to a reduced energy demand and an increase in performance, and can therefore allow higher-resolution simulations of the atmosphere within the same budget for computation. We investigate the use of emulated inexact hardware for a model of the randomly forced 1D Burgers equation with stochastic sub-grid-scale parametrisation. Results show that numerical precision can be reduced to only 12 bits in the significand of floating-point numbers—instead of 52 bits for double precision—with no serious degradation in results for all diagnostics considered. Simulations that use inexact hardware on a grid with higher spatial resolution show results that are significantly better than simulations in double precision on a coarser grid at similar estimated computing cost. In the second half of the paper, we compare the forcing due to rounding errors to the stochastic forcing of the stochastic parametrisation scheme that is used to represent sub-grid-scale variability in the standard model setup. We argue that the stochastic forcings of stochastic parametrisation schemes can provide a first guess for the upper limit of the magnitude of rounding errors of inexact hardware that can be tolerated by model simulations, and we suggest that rounding errors can be hidden in the distribution of the stochastic forcing. We present an idealised model setup that replaces the expensive stochastic forcing of the stochastic parametrisation scheme with an engineered rounding error forcing and provides results of similar quality. The engineered rounding error forcing can be used to create a forecast ensemble of similar spread to an ensemble based on the stochastic forcing. We conclude that rounding errors do not necessarily degrade the quality of model simulations. Instead, they can be beneficial for the representation of sub-grid-scale variability.

Inexact hardware may allow simulations of the atmosphere at higher spatial resolution within the same budget for computation. This would push parametrisation schemes towards smaller scales. However, rounding errors will influence model simulations.
From the perspective of hardware development, the main motivation behind inexact hardware is to trade a reduction in numerical precision for reduced computational cost and an increase in performance. This trade-off has been discussed in the literature for several years (see, for example, [22,23]). A first approach to inexact hardware is the use of so-called stochastic processors, which allow hardware errors to occur within numerical simulations while power consumption is reduced by lowering the voltage applied to the floating-point unit (see, for example, [16,20]). A second approach is "pruning", for which the physical size of the floating-point unit is reduced by removing parts that are either hardly used or do not have a strong influence on significant bits in the results of floating-point operations [7,17]. Pruning of the floating-point unit can be combined with the use of inexact memory [10]. A third approach is to use hardware that allows reduced-precision floating-point arithmetic, such as field programmable gate arrays (FPGAs). Here, the number of bits used to represent fixed-point or floating-point numbers can be customised to the application (the hardware is "information efficient", see [24]). Recent studies have shown that dataflow engines have huge potential for atmospheric modelling [12,21,29], especially if floating-point precision is reduced in parts of the models. Dataflow engines are based on FPGAs and stream data from memory through the engine, where thousands of operations can be performed within one computing cycle at a relatively low clock frequency.
Studies of emulated inexact hardware in a spectral dynamical core [8,9] showed that the use of reduced floating-point precision or pruning has large potential to reduce the computational cost of atmosphere models. For a spectral dynamical core of an atmosphere model at T85 resolution, numerical precision can be reduced to only 6 bits in the exponent and 8 bits in the significand of floating-point numbers within the model parts that cause about 98 % of the computational cost of a double precision control simulation, with no strong effect on the quality of results and forecast errors [9].
In this paper, we will study the use of inexact hardware in a model that simulates the one-dimensional Burgers equation with stochastic forcing. The model uses the stochastic parametrisation scheme of [5] for turbulent closure. We will perform simulations with emulated reduced precision to identify the minimal level of precision that can be used with no significant increase in model error, and we will provide estimates for the reduction in computational cost that can be expected for reduced precision simulations in comparison with control simulations in double precision.
In the second half of the paper, we will investigate the nature of the forcing due to rounding errors and compare the rounding error distribution against the random noise that is used within the stochastic parametrisation scheme. Parametrisation schemes are used within global atmosphere and ocean models to represent sub-grid-scale processes that cannot be represented explicitly at the available numerical resolution. Unfortunately, these processes include many influential physical phenomena, such as deep convection. Parametrisation schemes are typically developed from physical reasoning and a statistical analysis of observations or high-resolution model output (for example from large eddy simulations). However, the final parameter tuning is often based on "trial and error" adjustment rather than on stringent physical principles or measurements. Parametrisation schemes can only approximate sub-grid-scale behaviour.
Since deterministic parametrisation schemes tend to underestimate sub-grid-scale variability, stochastic parametrisation schemes are gaining more and more momentum in the development of atmosphere and ocean models. Stochastic parametrisation schemes use random numbers to mimic the stochastic nature of forcings from sub-grid-scale processes. They not only improve the ensemble spread, but also correct systematic model errors and represent model uncertainty (see, for example, [25]). Stochastic parametrisations include stochastically perturbed physical parametrisation tendencies [3,27], stochastic kinetic energy backscatter [2,30], cellular automata [1,25], quasi-equilibrium statistical mechanics parametrisations (for example [28]), stochastic differential equations (for example [11,13]) and Markov chains (for example [4,14]). Stochastic forcings are often introduced into the numerical model (for example [2,3,30]) to represent sub-grid-scale variability explicitly. These schemes allow an analysis of forecast uncertainty via an evaluation of the spread of ensemble forecasts.
There is no doubt that it is a difficult challenge to find a suitable distribution for a random noise pattern that represents sub-grid-scale variability realistically. However, the papers above suggest several approaches to finding appropriate stochastic forcings to be used in parametrisation schemes. The random numbers that are used in stochastic parametrisation schemes (e.g. via Gaussian distributed white noise) are typically of very high quality, with a specific mean and standard deviation and with very long repetition times (for example in [5]).
One might argue that this is over-engineered when the large model error in geophysical applications and the rather arbitrary adjustment procedure of parametrisation schemes are taken into account.
The stochastic parametrisation scheme that is used in the Burgers equation model is based on the stochastic mode reduction strategy of [19]. In ongoing research, the same approach is now tested in two- and three-dimensional models of atmospheric dynamics. We use the Burgers model to study the properties of rounding errors and to compare the magnitude of rounding errors to the forcing of the stochastic parametrisation scheme. This setup allows a relatively simple comparison between the stochastic forcing within the stochastic parametrisation scheme and rounding errors due to inexact hardware, in a one-dimensional model that is meaningful for atmospheric modelling. We argue that the stochastic forcing of the stochastic parametrisation scheme can be used to hide rounding errors and that the magnitude of the stochastic forcing can serve as a first guess for the upper limit of the magnitude of rounding errors that allows simulations with no change of model statistics.
There are many publications that show that random numbers can be used in stochastic forcings of stochastic parametrisation schemes to improve atmospheric modelling (see [26] and references therein). On the other hand, rounding errors show almost no correlations in space and time and therefore have similarities to random numbers. We argue that rounding errors can consequently also have a beneficial influence in chaotic models, following the same motivation that is used to justify stochastic forcings in stochastic parametrisation schemes, for example to represent sub-grid-scale variability. Obviously, it is unlikely that the shape of the rounding error forcing will be appropriate to represent sub-grid-scale noise without adaptation. However, if the shape and magnitude of the rounding error forcing do not match the desired distribution, changes to the model make it possible to influence the rounding errors. Two examples: (1) A change of the length of the time step (Δt) will have an influence on the net forcing, similar to the influence of a change of the time step in stochastic differential equations if the different scaling of deterministic and stochastic terms with Δt is ignored. (2) A change of the time-stepping scheme will change the shape of the forcing.
If rounding errors are engineered towards a desired distribution, the random forcings that are derived for stochastic parametrisation schemes and designed to improve model simulations by representing sub-grid-scale variability are obviously promising targets for the rounding error distribution. We will study whether it is possible to represent such a forcing sufficiently well using only rounding errors. Section 2 of this paper introduces the model and the used parametrisation schemes, and Sect. 3 explains the emulator for reduced precision. In Sect. 4, we show that numerical precision can be reduced heavily before rounding errors start to reduce the quality of the used model. In Sect. 5, we compare the possible gain in resolution against the possible increase in model error when inexact hardware is used. In Sect. 6, we compare the stochastic forcing of the stochastic parametrisation scheme against the rounding error forcing at different levels of precision. In Sect. 7, we describe similarities between rounding errors and random noise, engineer rounding errors towards the distribution of the stochastic forcing of the stochastic sub-grid-scale parametrisation scheme and show results for simulations that replace the stochastic forcing by the engineered rounding errors. In Sect. 8, we show that rounding errors can be used to generate ensemble simulations that are similar to ensemble forecasts based on stochastic parametrisation schemes. In Sect. 9, a summary and the conclusions are presented.

1D Burgers equation with stochastic forcing and stochastic sub-grid-scale closure
This section presents the model for simulations of the 1D Burgers equation. We use the stochastic sub-grid-scale parametrisation scheme by [5], which is based on the MTV stochastic mode reduction strategy by Majda, Timofeyev and Vanden-Eijnden [19]. The so-called MTV strategy separates the modes of the system into resolved and unresolved ones based on their characteristic timescales. This information is used to derive deterministic and stochastic terms that correct the dynamics of the resolved modes for interactions with the unresolved modes. In contrast to previous studies that used the MTV method with global basis functions (for example [18] and [11]), [5] applied this strategy locally to a finite-difference grid point model. This allows an application of the method to very high-dimensional systems and the consideration of relatively large numbers of resolved modes (see also [6]).
The 1D Burgers equation with dissipation and random forcing f is given by:

∂u/∂t + u ∂u/∂x = ν ∂²u/∂x² + f,    (1)

where u is velocity and ν is the constant diffusion coefficient. The domain is periodic with length L. Following [5], the domain is divided into a fine grid with 512 equidistant grid points. Additionally, the domain is also divided into 32 equidistant coarse grid cells, such that 16 grid points of the fine grid are situated within each coarse grid cell. The velocity at each fine grid point is represented as the sum of the mean value x_i inside the coarse grid cell and the local variation y_j:

u_j = x_i + y_j.    (2)

We discretise Eq. (1) in space using an energy-conserving finite-difference discretisation for the quadratic nonlinearities and rewrite the discrete model equation using the large- and small-scale variables x_i and y_j. We use a stochastic forcing f that has the shape of large-scale waves with wavenumbers 1, 2 and 3 over the entire domain; the forcing acts on the large-scale parameters x_i only. According to [5], the new set of equations for ẋ_i and ẏ_j [Eqs. (3)-(5)] involves random numbers α_k and φ_k and Δx = L/512. The terms B_αβγ cover the quadratic nonlinearities, and the terms L_αβ result from the dissipation term. For both B_αβγ and L_αβ, the first subscript α denotes the mode onto which they project, while β and γ indicate the modes involved. The exact terms are listed in the "Appendix" (see [5] for more details). The summations in the equations suggest that we consider global interactions; however, the interactions in the used finite-difference scheme are local and the resulting tensors are sparse. We use a third-order Runge-Kutta time-stepping scheme with a time step of Δt = 0.01 and ν = 0.02. We simulate the system either with the full set of large- and small-scale parameters x_i and y_j, or in a parametrised setup for which the small-scale parameters y_j are truncated and the system is reduced to the large-scale parameters x_i.
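As an illustration of the time stepping described above, the following sketch advances an unforced, fully resolved Burgers field on a periodic grid, using a skew-symmetric (energy-conserving) discretisation of the quadratic nonlinearity and a third-order Runge-Kutta step. The particular RK3 variant (SSP), the function names and the omission of the forcing and closure terms are assumptions for illustration; this is not the exact scheme of [5].

```python
import numpy as np

def tendency(u, nu, dx):
    """RHS of the unforced 1D Burgers equation on a periodic grid,
    with a skew-symmetric (energy-conserving) form of the nonlinearity:
    u u_x = (1/3) u u_x + (1/3) (u^2)_x, discretised with central differences."""
    up = np.roll(u, -1)  # u_{j+1}
    um = np.roll(u, 1)   # u_{j-1}
    nonlin = (u * (up - um) + (up**2 - um**2)) / (6.0 * dx)
    diffusion = nu * (up - 2.0 * u + um) / dx**2
    return -nonlin + diffusion

def rk3_step(u, dt, nu, dx):
    """One third-order Runge-Kutta step (SSP variant; the text states
    only 'third order', so this particular choice is an assumption)."""
    u1 = u + dt * tendency(u, nu, dx)
    u2 = 0.75 * u + 0.25 * (u1 + dt * tendency(u1, nu, dx))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * tendency(u2, nu, dx))
```

With 512 grid points, Δt = 0.01 and ν = 0.02 as in the text, repeated calls to `rk3_step` would advance the DNS field; the forcing and the coarse/fine splitting would be added on top. The skew-symmetric form conserves the discrete energy of the inviscid nonlinear term exactly.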
We follow the nomenclature in [5] and denote simulations of the full set of Eqs. (3)-(5) as direct numerical simulations (DNS). To parametrise the y_j variables, we use the stochastic sub-grid-scale closure parametrisation scheme by [5] as standard configuration and a deterministic Smagorinsky closure for comparison. The model with Smagorinsky parametrisation solves a discretised version of Eq. (1) on the coarse grid with an additional Smagorinsky eddy-viscosity term [Eq. (6)] and the Smagorinsky constant C_s = 0.2. We denote simulations that use the Smagorinsky closure with the acronym "smg", following [5].
For the sub-grid-scale parametrisation scheme derived in [5], the full system [Eqs. (3)-(5)] is reduced to an effective equation for the large-scale variables x_i [Eq. (7)]. Here, W^(1) and W^(2) denote vectors with independent Wiener processes. The first three terms of the right-hand side of the equation discretise the 1D Burgers equation without any sub-grid-scale closure. The fourth, fifth and sixth terms represent the deterministic part of the parametrisation scheme with linear, quadratic and cubic corrections. The last two terms form the stochastic part of the parametrisation scheme with additive and multiplicative noise. The specific form of the terms can be found in Appendix D of [5].

Emulated reduced precision
In this section, we provide a short introduction to the representation of real numbers as floating-point numbers and to the emulator used for reduced precision. Today's hardware is typically only capable of representing floating-point numbers in single (32 bits) or double (64 bits) precision. According to the IEEE 754 standard [15], a double precision floating-point number is represented by a sequence of 64 bits, each of which can take the value 0 or 1. The 64 bits comprise a sign bit s, eleven bits for the exponent (c_0, c_1, c_2, ..., c_10) and 52 bits for the significand (m_1, m_2, ..., m_52). For normal numbers, the relationship between a number x and its bit representation is given by

x = (−1)^s · 2^(e−1023) · (1 + Σ_{i=1}^{52} m_i 2^(−i)),  with e = Σ_{i=0}^{10} c_i 2^i.

In this paper, we investigate the use of hardware that allows the use of floating-point numbers at various levels of precision. We focus on a reduced number of bits in the significand. Although there are several approaches towards building such hardware (see Sect. 1), it is still only available to us in the form of FPGAs, which are very difficult to programme. Therefore, we need to emulate reduced precision arithmetic within the numerical simulations. The emulator rounds the result of each floating-point operation to the closest number that can be represented with a reduced number of bits. The resulting rounding error is slightly different from the expected rounding error on real reduced precision floating-point hardware, for which intermediate steps in the calculation of a floating-point operation are also represented by a reduced number of bits and are therefore also affected by rounding errors. Guard bits do, however, keep rounding errors for intermediate steps to a minimum. We therefore expect differences between the emulated and the real hardware rounding to be small.
In simulations with emulated reduced precision, the emulator acts on all floating-point operations within the entire model run. This includes the calculation of the resolved scales, but also the sub-grid-scale parametrisation scheme.
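A minimal sketch of the rounding performed by such an emulator, assuming round-to-nearest on the result of each operation (the function name and the NumPy-based implementation are illustrative, not the emulator used in the paper):

```python
import numpy as np

def round_significand(x, bits):
    """Round x to the nearest float with `bits` explicit significand
    bits (illustrative emulator; the real emulator rounds the result
    of every floating-point operation in the model)."""
    mantissa, exponent = np.frexp(x)   # x = mantissa * 2**exponent, mantissa in [0.5, 1)
    scale = 2.0 ** (bits + 1)          # keep `bits` bits after the implicit leading bit
    return np.ldexp(np.round(mantissa * scale) / scale, exponent)
```

For example, `round_significand(x, 12)` mimics the 12-bit setup studied below, while rounding to 52 bits leaves double precision values unchanged.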

Simulations of the Burgers equation with reduced floating-point precision
This section presents results for simulations of the model with stochastic closure on emulated reduced precision hardware. We compare results of simulations with reduced precision to the control simulation in double precision, using the same diagnostics as [5]. Figure 1 shows the energy spectra, the autocorrelation function and the kurtosis for the sub-grid-scale parametrised simulations with reduced significand, the control simulation with sub-grid-scale parametrisation in double precision, the simulation with Smagorinsky closure and the direct numerical simulation. All simulations are run for 600,000 non-dimensional time units. Table 1 presents results for the variance and the integrated autocorrelation function. While simulations with an 8-bit significand are clearly perturbed, the simulations with more bits (10 and 12) do not show strong differences from the double precision control simulation for any of the diagnostics, except for small changes of the kurtosis for 10 bits in the significand. Differences between the double precision control simulation and the reduced precision simulations with 10 and 12 bits are much smaller than differences between the control simulation and the direct numerical simulation, which suggests that reduced precision will not have a strong influence on the quality of model simulations.
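The diagnostics named above can be computed along the following lines (a sketch with plausible definitions; the exact normalisations used for Fig. 1 and Table 1 are not specified here):

```python
import numpy as np

def energy_spectrum(u):
    """Energy per wavenumber of a periodic 1D field via FFT
    (one common normalisation; the paper's exact choice may differ)."""
    uhat = np.fft.rfft(u) / u.size
    return np.abs(uhat)**2

def autocorrelation(x, max_lag):
    """Temporal autocorrelation function of a 1D time series,
    normalised so that lag 0 gives 1."""
    x = x - x.mean()
    var = np.mean(x * x)
    return np.array([np.mean(x[:x.size - k] * x[k:]) / var
                     for k in range(max_lag + 1)])

def kurtosis(x):
    """Kurtosis (fourth standardised moment; equals 3 for a Gaussian)."""
    x = x - x.mean()
    return np.mean(x**4) / np.mean(x**2)**2
```

Applied to the large-scale variables x_i, these functions would reproduce the type of diagnostics shown in Fig. 1; the integrated autocorrelation of Table 1 would follow by summing the autocorrelation function over lags.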

Performance analysis
In the previous section, we tested how a reduction in precision reduces the quality of simulations. In this section, we argue that the use of inexact hardware can be beneficial for model simulations. We present results to answer the question whether an increase in resolution, made possible by a reduction in power consumption or an increase in performance due to reduced precision, can benefit model simulations in such a way that the reduction in model quality due to reduced precision is balanced. It needs to be checked whether the level of precision that was found to be sufficient for the original model setup is also sufficient if resolution is increased. We consider a setup with floating-point precision reduced to 19 bits (12 bits for the significand, 6 bits for the exponent and one sign bit). The number of bits is significantly reduced in comparison with double precision (52 bits for the significand, 11 bits for the exponent and one sign bit) and single precision (23 bits for the significand, 8 bits for the exponent and one sign bit). A fair estimate for the possible cost reduction with reduced precision is the ratio between the numbers of bits used. The actual reduction will certainly depend on the chosen approach to inexact hardware (see the list in the introduction). If FPGAs are used, the speed-up will be approximately proportional to the ratio of the numbers of bits used. An inexact CPU setup with a pruned floating-point unit and inexact memory might even scale better than the ratio between the numbers of bits [7,10]. We conclude that the reduced precision setup will be more than three times faster than the standard double precision simulation (64/19 ≈ 3.4). Figure 2 shows results for the direct numerical simulations and for simulations with Smagorinsky closure calculated either with double or with reduced precision at two different resolutions.
While the standard setup uses 32 grid points, the "high"-resolution setup uses 64 grid points. Both setups are compared to the direct numerical simulation with 512 grid points, the latter projected onto 32 coarse grid cells. Due to the increase in resolution by a factor of two, we expect the computational cost to increase approximately by a factor of four (a factor of two for the spatial resolution and a factor of two for the number of time steps). Therefore, simulations with reduced precision on the fine grid will be only slightly more expensive than double precision simulations on the coarse grid. However, the results in Fig. 2 show that an increase in resolution improves model simulations significantly between wavenumbers 10 and 16, since the impact of diffusion can be pushed to higher wavenumbers. The energy spectra of the simulation with reduced precision are almost identical to the spectra of the high-resolution simulation with double precision, despite the large difference in computational cost.
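The cost arithmetic of this section can be made explicit (a back-of-the-envelope sketch under the stated assumption that cost scales linearly with the number of bits per number):

```python
# Bit counts from the text.
bits_double = 64   # 1 sign + 11 exponent + 52 significand bits
bits_reduced = 19  # 1 sign + 6 exponent + 12 significand bits
speedup = bits_double / bits_reduced        # approx. 3.4

refinement = 2                              # 32 -> 64 coarse grid points
cost_increase = refinement ** 2             # factor 2 in space, factor 2 in time steps

# Reduced precision on the fine grid vs double precision on the coarse grid:
relative_cost = cost_increase / speedup     # approx. 1.19, i.e. only slightly more expensive
```

The resulting relative cost of roughly 1.2 is what justifies the statement that the high-resolution reduced precision run is "only slightly more expensive" than the coarse double precision run.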
We used simulations with the Smagorinsky scheme for comparison, since it is difficult to make fair estimates of the computational cost if the stochastic sub-grid-scale scheme is used. Since the stochastic closure model, unlike the Smagorinsky model, produces a continuous power-law slope in the inertial range, we do not expect a significant improvement in this spectral range if we run the model with increased resolution but reduced precision. However, we expect a better representation of the statistical moments, as suggested by Tables 2 and 3 of [5], which show that the relative error in the moments decreases with increasing resolution.

Analysis of the rounding error forcing in comparison to the stochastic forcing of the parametrisation scheme
To learn more about the forcing due to rounding errors when using inexact hardware, we study the properties of rounding errors and compare the stochastic forcing of the stochastic parametrisation scheme with rounding errors at different levels of precision for the significand of floating-point numbers. We test whether rounding errors can be hidden in the stochastic forcing of the parametrisation scheme.
In a first approach, we compare, in Fig. 3, the stochastic forcing of the stochastic parametrisation scheme with the noise pattern that is present if a time step is calculated at different levels of precision. To generate the figure, we run the model and perform each time step in three different settings:
1. We calculate the next time step with the parametrised model in double precision [full Eq. (7)].
2. We calculate the same time step with the parametrised model in double precision, omitting the stochastic part of the parametrisation scheme [the terms σ_i^(1) dW_i^(1) and σ_i^(2) dW_i^(2) in Eq. (7)].
3. We perform the same step as in setting 1 with emulated reduced precision.
In each setting, the next time step uses the state vector calculated in setting 1 as initial condition. We can now compare the stochastic forcing of the parametrisation scheme (contribution of the right-hand side for setting 1 minus contribution of the right-hand side for setting 2) with the noise caused by rounding errors added to the stochastic forcing of the parametrisation scheme (contribution of the right-hand side for setting 3 minus contribution of the right-hand side for setting 1) for each time step. Figure 3 shows that rounding errors decrease with an increasing number of bits in the significand, as expected. Changes in the distribution due to rounding errors are hardly visible for 12, 15 and 20 bits in the significand, while the rounding error generates a larger spread of the noise for inexact hardware with 8, 9 or 10 bits in the significand. Table 2 confirms these results, since simulations with ≥12 bits in the significand show a correlation between the reduced precision forcing and the standard random forcing of more than 0.9 and a change in the standard deviation of less than 6%. Figure 4 shows the probability distributions of the forcing pattern for certain intervals of the x variable in Fig. 3.
The distributions of the rounding errors have a reasonable shape close to a Gaussian distribution, and the results confirm that the stochastic forcing is closely matched for 12 bits in the significand, while the distributions are clearly different for 9 or 10 bits in the significand. The distributions with inexact hardware are not centred perfectly around zero, which might cause problems. However, the mean of the entire forcing is small, as seen in Table 2. Based on Table 2, we would expect that model simulations with more than 10 bits in the significand will hardly be affected by rounding errors, since the errors will be hidden within the stochastic forcing of the sub-grid-scale parametrisation, while there might be changes to model results for ≤10 bits in the significand.
This result fits nicely with the conclusions from Fig. 1 and Table 1, which show that model simulations can be disturbed for less than 10 bits in the significand, while the quality of model simulations is comparable for larger numbers of bits. We conclude that rounding errors can be hidden in the noise pattern of stochastic parametrisation schemes if the rounding error forcing is smaller than the stochastic forcing caused by the stochastic parametrisation scheme. If the rounding error forcing does not exceed the stochastic forcing, the quality of the model is hardly influenced by rounding errors. The magnitude of the stochastic forcing can serve as a first guess for the maximal rounding error that is acceptable before model results are changed significantly.
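The three-setting procedure used in this section to separate the two forcings can be sketched as follows; the step functions are placeholders assumed to be supplied by the model implementation:

```python
import numpy as np

def forcing_decomposition(step_full, step_det, step_reduced, x):
    """Separate the stochastic forcing from the rounding-error noise
    for one time step, following the three settings described in the text.

    step_full:    full Eq. (7) in double precision        (setting 1)
    step_det:     Eq. (7) without the stochastic terms    (setting 2)
    step_reduced: full Eq. (7) with reduced precision     (setting 3)
    All three step functions are applied to the same initial state x."""
    x1 = step_full(x)
    x2 = step_det(x)
    x3 = step_reduced(x)
    stochastic_forcing = x1 - x2  # setting 1 minus setting 2
    rounding_noise = x3 - x1      # setting 3 minus setting 1
    # The simulation continues from the setting-1 state in all settings.
    return x1, stochastic_forcing, rounding_noise
```

Accumulating `stochastic_forcing` and `rounding_noise` over many time steps yields the scatter and histogram comparisons of Figs. 3 and 4.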

The use of rounding errors to represent sub-grid-scale variability
In this section, we discuss the properties of rounding errors in general. We show similarities between rounding errors and random noise and exploit these similarities by engineering rounding errors that mimic the stochastic forcing of the sub-grid-scale parametrisation scheme. Finally, we perform simulations that replace the stochastic forcing by the rounding error of inexact hardware. We compare results of these simulations with model simulations of the full stochastic parametrisation scheme in double precision. Figure 5 shows the noise pattern when random input numbers (x-axis) are rounded to either four or ten bits in the significand. Rounding errors show linear patterns (see panel a in Fig. 5), with jumps where one input value is rounded down while its right neighbour is rounded up. The entire process of rounding is deterministic in nature, and it might be misleading to assign properties of random noise to the noise pattern of rounding errors. However, if the input values cover only the range of one exponent and if the distance between the "rounding-up" and "rounding-down" intervals is very small compared to the range of the input variables, the pattern shows no visible difference to additive white noise with a uniform distribution (compare panels b and c in Fig. 5). The linear noise patterns would obviously be recovered if we zoomed into a smaller range of x-values, or if we considered a much larger sample of points. However, we can assume that the rounding error, and therefore the "rounding-up" and "rounding-down" intervals, needs to be relatively small in comparison with the range of the signal to allow reasonable results for model simulations. The range of the rounding error is fixed by the value of the exponent E and the number of bits m that represent the significand of floating-point numbers. The rounding error ζ for a random number is limited by −2^E/2^(m+2) < ζ ≤ 2^E/2^(m+2).
If the range of input values covers several exponents, the maximal rounding error changes with the exponent by a factor of two with each step. The length of the interval in which a specific exponent is valid also changes by a factor of two with each step. A straight line can therefore describe the position of all points at which the maximal absolute rounding error changes with the exponent. This line is described by the equation h(x) = x/2^(m+1) and plotted in Fig. 5d, e. The range of the noise pattern is obviously discontinuous at changes of the exponent. However, the noise pattern has similar properties to multiplicative noise (compare panels d and e with panel f in Fig. 5). According to a profiler, the calculation of the additive and multiplicative noise terms of the stochastic parametrisation scheme [the terms σ_i^(1) dW_i^(1) and σ_i^(2) dW_i^(2) in Eq. (7)] creates approximately two-thirds of the computational cost of the entire model simulation in terms of computing time. It can be assumed that the calculation of the stochastic forcing will not create a significant fraction of the total cost for model simulations of a full atmosphere or ocean model. Still, for the model under investigation, computing time will be reduced significantly if it is possible to replace the stochastic forcing of the sub-grid-scale parametrisation scheme with an engineered rounding error forcing of inexact hardware, in addition to the savings due to the use of inexact hardware itself.
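The behaviour discussed above can be illustrated numerically. The sketch below rounds inputs that share a single exponent (x in [1, 2), i.e. E = 0) and checks that the rounding error stays within half a unit in the last place; the half-ulp is written here as 2^(−(m+1)) for m explicit significand bits, so the exact power of two depends on the bit-counting convention:

```python
import numpy as np

def round_to_bits(x, m):
    """Round to nearest with m explicit significand bits (emulator sketch)."""
    mant, expo = np.frexp(x)
    return np.ldexp(np.round(mant * 2.0 ** (m + 1)) / 2.0 ** (m + 1), expo)

m = 10
rng = np.random.default_rng(1)
x = rng.uniform(1.0, 2.0, 100000)   # all inputs share the exponent E = 0
zeta = round_to_bits(x, m) - x      # rounding error
half_ulp = 2.0 ** (-(m + 1))        # maximal |zeta| under this convention

# Bounded like additive noise with a uniform distribution:
assert np.all(np.abs(zeta) <= half_ulp)
```

For inputs spanning several exponents, the bound grows by a factor of two whenever the exponent increases by one, which produces the multiplicative-noise-like envelope h(x) described in the text.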
If we assume that the stochastic forcing of the sub-grid-scale parametrisation scheme offers a correct representation of sub-grid-scale variability, and if we can replace this forcing by engineered rounding errors, we would show that rounding errors can be used to represent sub-grid-scale variability and can therefore be beneficial for the given model. However, we are not aware of an effective method to create rounding error forcings that mimic arbitrary multiplicative noise terms involving interactions between different modes. Therefore, the rounding error forcing in this paper will never show the same quality as the stochastic forcing of the sub-grid-scale parametrisation, since an important and cost-intensive ingredient in the generation of the noise is missing. However, in the present application, it turned out that omitting those interactions in the rounding error forcing does not play an important role.
To get a better impression of the magnitude and the shape of the noise caused by rounding errors when the calculation of the stochastic noise is not included, we run the model and perform each time step in three different settings:
1. We calculate the next time step with the parametrised model in double precision [full Eq. (7)].
2. We calculate the same time step with the parametrised model in double precision, omitting the stochastic part of the parametrisation scheme [the terms σ_i^(1) dW_i^(1) and σ_i^(2) dW_i^(2) in Eq. (7)].
3. We perform the same step as in setting 2 with emulated reduced precision.
The next time step is based on the new state vector calculated in setting 1 for each setting. We can now compare the stochastic forcing of the parametrisation scheme (contribution of the right-hand side for setting 1 minus contribution of the right-hand side for setting 2) with the noise caused by rounding errors (contribution of the right-hand side for setting 3 minus contribution of the right-hand side for setting 2) for each time step. Figure 6 and Table 3 compare the two different forcings for simulations with different numbers of bits in the significand. Rounding errors decrease with an increasing number of bits, as expected, and are smaller than the stochastic forcing for large numbers of bits in the significand. Figure 7 shows the probability distributions of the forcing pattern for certain intervals of the x variable in Fig. 6. Most of the distributions of the rounding error have a reasonable shape close to a Gaussian distribution. Although the distributions are not always centred around zero, the total mean error averaged over all x is small (see Table 3).
We will engineer the pattern of rounding errors to match the stochastic forcing of the sub-grid-scale parametrisation scheme as closely as possible; in essence, we want the green distribution in Fig. 6 to match the red distribution. As discussed above, rounding errors can behave similarly to both additive and multiplicative noise. Since we cannot reduce the rounding error for a given number of bits in the significand, we start from the setup with 15 bits in the significand, for which the rounding error is much smaller than the stochastic forcing (see Fig. 6). Our approach assumes that the hardware can work with multiple levels of exactness. We perform the following operations to obtain additive noise ζ_a and multiplicative noise ζ_m:

ζ_a = α [ r(x_i + 2.5) − (x_i + 2.5) ],    ζ_m = β [ r(x_i) − x_i ].

The function r(·) denotes rounding to 11 bits in the significand. All other operations are performed with emulated 15 bits in the significand. Both terms would be zero if no rounding errors were present. We add 2.5 to x_i for the additive noise to shift the range of the x_i values so that they share the same exponent. For the multiplicative noise, we can simply round the parameter value to its representation with 11 bits in the significand and subtract the original value of x_i, since the range of the parameters is symmetric around zero and spread over a wide range of exponents. We use α = 0.3 and β = 3.6 to adjust the amplitude of the noise. Figure 8 shows the error patterns that are engineered to fit the stochastic forcing induced by the stochastic parametrisation scheme. The engineered forcing has a standard deviation of 1.05E−4, which is relatively close to the standard deviation of the stochastic forcing (9.27E−5). The mean of the engineered forcing is small as well (−9.62E−7). The engineered noise will not change its behaviour drastically if the prognostic variables x_i leave the range −0.5 < x_i < 0.5, as can be seen in panel b of Fig. 8.
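Read literally, the construction above rounds the shifted value x_i + 2.5 for the additive term and x_i itself for the multiplicative term, scaled by α and β. A sketch under that reading (our reconstruction, not the paper's code; `round_significand` is a hypothetical helper, and the 15-bit hardware is emulated by an extra rounding):

```python
import math

def round_significand(x, bits):
    """Round x to `bits` explicit bits in the significand (round-to-nearest)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    mant, e = math.frexp(x)            # x = mant * 2**e with 0.5 <= |mant| < 1
    q = 2.0 ** (bits + 1)
    return math.ldexp(round(mant * q) / q, e)

ALPHA, BETA = 0.3, 3.6                 # amplitude adjustments quoted in the text

def engineered_noise(x):
    # r(...) rounds down to 11 significand bits; the remaining arithmetic is
    # assumed to run on 15-bit hardware, emulated here by a coarser rounding
    r = lambda v: round_significand(v, 11)
    shifted = round_significand(x + 2.5, 15)   # shift so all x_i share an exponent
    zeta_a = ALPHA * (r(shifted) - shifted)    # additive-like component
    zeta_m = BETA * (r(x) - x)                 # multiplicative-like component
    return zeta_a + zeta_m
```

Both components vanish whenever no rounding occurs (for example at x = 0, since 2.5 is representable with 11 significand bits). Sampling x uniformly in (−0.5, 0.5) gives a noise standard deviation of roughly the magnitude quoted above; the exact value depends on details of the hardware emulation that this sketch simplifies.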
Figure 9 shows the correlation of the stochastic forcing of the sub-grid-scale parametrisation scheme and of the engineered rounding errors in space and time. As expected, no correlation is visible, even to the nearest neighbours, for either forcing. The slightly negative spatial correlation of the stochastic forcing is caused by the constraint of zero mean for the stochastic forcing in each time step.
Finally, we perform model simulations that use the engineered rounding error to replace the stochastic forcing of the stochastic parametrisation scheme. Results for the energy spectra, the autocorrelation function and the kurtosis are plotted in Fig. 10; the values for the variance and the integrated autocorrelation function are given in Table 4. Only the kurtosis is disturbed slightly; the other climate-type diagnostics show good results for the simulations with engineered rounding errors, especially when compared against the differences between the sub-grid-scale parametrised control simulation in double precision and the direct numerical simulation. Figure 10 also shows the mean forecast error in x for simulations on the coarse grid, calculated against direct numerical simulations. The increase in forecast error with engineered rounding errors compared to the simulation with added stochastic forcing seems to be in a reasonable range, given the possible reduction in computational cost. It should be mentioned that a model simulation calculated in double precision with no stochastic forcing provides reasonable results as well (see Fig. 10; Table 4). However, such models can only be used in ensemble methods with perturbed initial conditions. The forecast error is actually increased slightly for simulations with stochastic forcing compared against simulations with no added stochastic forcing.
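The precise error metric is not reproduced in this excerpt; a common choice, stated here only as a plausible form and not as the paper's equation, is the mean absolute deviation from the direct numerical simulation,

```latex
E(t) = \frac{1}{N} \sum_{i=1}^{N} \left| x_i(t) - x_i^{\mathrm{DNS}}(t) \right| ,
```

averaged over the N coarse grid points and, in practice, over many forecast start dates.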

Inexact hardware used to generate ensemble simulations
An important property of stochastic parametrisation schemes is the ability to generate ensembles. Typically, different sets of random numbers are used to calculate the stochastic forcing for different members of an ensemble of simulations, for example by using different seeds in the random number generation. The resulting difference in the stochastic forcing causes a divergence of the ensemble members in time. Since the stochastic forcing is typically adjusted to represent sub-grid-scale variability, the spread of the ensemble can provide important information about predictability and an estimate of the forecast errors, which is very important for numerical weather prediction. For deterministic parametrisation schemes, initial-value perturbations need to be used to generate ensembles. We argue that the use of inexact hardware allows ensemble setups similar to ensemble methods with stochastic parametrisation schemes. If the initial conditions are perturbed only slightly, within the order of magnitude of the rounding errors, the resulting forcing pattern due to rounding has no obvious correlation with the equivalent forcing pattern in an unperturbed simulation. To illustrate this, we consider once more the random numbers that were used to generate the plots of rounding errors for a significand of 4 bits in Fig. 5. We assume that each x represents an initial condition of a simulation. We add a uniformly distributed random number η, with −m ≤ η < m, to x, where m is the maximal amplitude of the rounding error. The result x_2 is taken as the new "initial condition", and a rounding to the closest representation with 4 bits in the significand is performed. Figure 11 shows the rounding error ζ of x_2 rounded to 4 bits in the significand, plotted against the initial value x. We use m ≈ 0.008 for panel a and m ≈ 0.016 for panel b.
The resulting pattern shows no visible difference from white noise within the range of a given exponent, and it appears to be sufficiently uncorrelated with the original rounding error obtained without an initial perturbation, which showed a clear rounding-up/rounding-down pattern (compare Fig. 11 to Fig. 5a and d). In contrast to pure initial-value ensembles, for which only the initial values are perturbed, the effective forcing due to rounding is changed throughout the simulation compared to the equivalent forcing in the unperturbed simulation. We expect two simulations with inexact hardware that use x and x_2 as initial conditions to diverge from each other in the same way as ensemble members that use stochastic parametrisation schemes with different seeds.
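The decorrelation can be checked numerically. The sketch below is a simplified stand-in for the experiment of Fig. 11: `round_significand` is our hypothetical helper, x is sampled from a single binade so that all values share one exponent, and m is set to roughly the maximal 4-bit rounding error there (an assumption consistent with the quoted m ≈ 0.016).

```python
import math, random

def round_significand(x, bits):
    """Round x to `bits` explicit bits in the significand (round-to-nearest)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    mant, e = math.frexp(x)
    q = 2.0 ** (bits + 1)
    return math.ldexp(round(mant * q) / q, e)

def rounding_error(x, bits=4):
    return round_significand(x, bits) - x

def pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    sa = math.sqrt(sum((u - ma) ** 2 for u in a))
    sb = math.sqrt(sum((v - mb) ** 2 for v in b))
    return cov / (sa * sb)

random.seed(0)
m = 0.016                                   # ~ maximal 4-bit rounding error in [0.5, 1)
xs = [random.uniform(0.5, 1.0) for _ in range(2000)]
unperturbed = [rounding_error(x) for x in xs]
perturbed = [rounding_error(x + random.uniform(-m, m)) for x in xs]
decorrelation = pearson(unperturbed, perturbed)   # close to zero
```

Each rounding-error sequence is strongly structured in x (the sawtooth of Fig. 5), but a perturbation of up to one rounding interval shifts every sample to an essentially unrelated position on the sawtooth, so the correlation between the two error sequences collapses.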
We calculate 50 ensembles with 50 ensemble members each. The ensembles are initialised along a simulation of the model at a distance of 6000 nondimensional time units and integrated for 250 nondimensional time units. Figure 12 shows the mean ensemble standard deviation for ensembles based on different ensemble methods. The first ensemble is based solely on the stochastic forcing of the sub-grid-scale parametrisation scheme, with different random seeds for the different ensemble members. All other ensembles have a tiny perturbation, a uniformly distributed random number with −0.0005 < η ≤ 0.0005, added to the initial condition of each prognostic variable x_i. The initial perturbations used hardly influence the model simulations: simulations with the full sub-grid-scale parametrisation scheme and different random seeds behave in the same way with and without initial perturbations, and the sub-grid-scale parametrised simulations with no stochastic forcing show only a small standard deviation, which decreases with time, since the stochastic large-scale forcing is the same for all simulations. For the simulations in which the engineered rounding error replaces the stochastic term of the sub-grid-scale parametrisation, we obtain an ensemble spread of similar shape and of the same order of magnitude as for the sub-grid-scale parametrised simulations with stochastic forcing. If the results of Fig. 12 are compared against the plot of the forecast error in Fig. 10, it can be seen that the ensemble spread is under-dispersive: the forecast error is larger than the ensemble spread. Ensemble forecasts will therefore be over-confident, since the ensemble spread is not a realistic representation of the model error that should be expected.
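The growth of ensemble spread from tiny initial perturbations on inexact hardware can be illustrated with a toy sketch. The chaotic logistic map below merely stands in for the reduced-precision model step (an assumption; it is not the Burgers model or the actual emulator), and `round_significand` is our hypothetical rounding helper:

```python
import math, random

def round_significand(x, bits):
    """Round x to `bits` explicit bits in the significand (round-to-nearest)."""
    if x == 0.0 or not math.isfinite(x):
        return x
    mant, e = math.frexp(x)
    q = 2.0 ** (bits + 1)
    return math.ldexp(round(mant * q) / q, e)

def step_lp(x, bits=12):
    # toy chaotic map in place of one reduced-precision model time step
    return round_significand(3.9 * x * (1.0 - x), bits)

random.seed(1)
# 50 members: one initial state plus uniform perturbations in +-0.0005
members = [0.4 + random.uniform(-0.0005, 0.0005) for _ in range(50)]
spread = []
for _ in range(30):
    members = [step_lp(x) for x in members]
    mean = sum(members) / len(members)
    spread.append(math.sqrt(sum((x - mean) ** 2 for x in members) / len(members)))
# spread grows from the size of the initial perturbation towards saturation
```

Because the map is chaotic and each member experiences its own sequence of rounding errors, the ensemble standard deviation grows from the order of the initial perturbation towards a saturated value, qualitatively mirroring the behaviour shown in Fig. 12.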
Given the under-dispersiveness of the forecasts, we argue that the small differences in ensemble standard deviation between the engineered rounding error forcing and the stochastic forcing are not significant. The use of the engineered rounding error forcing did, however, increase the forecast error slightly.

Summary and conclusion
We investigate the use of emulated inexact hardware to integrate a model of the randomly forced 1D Burgers equation with stochastic sub-grid-scale parametrisation. Results show that numerical precision can be reduced to only 12 bits in the significand of floating-point numbers (instead of 52 bits for double precision) with no serious degradation in results for all diagnostics considered (see Sect. 4). Simulations that use inexact hardware on a grid with higher spatial resolution show results that are significantly better than simulations in double precision on a coarser grid. The reduction in computational cost due to the use of reduced precision is estimated to be more than a factor of three in comparison with double precision simulations, while the use of the higher resolution increases computational cost by approximately only a factor of four (see Sect. 5).
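Combining the two factors quoted above gives a rough cost ratio (our back-of-the-envelope arithmetic, not a measurement from the paper):

```latex
\frac{C_{\text{high-res, inexact}}}{C_{\text{coarse, double}}}
  \;\approx\; \underbrace{4}_{\text{finer grid}}
  \times \underbrace{\tfrac{1}{3}}_{\text{reduced precision}}
  \;\approx\; 1.3 ,
```

so the higher-resolution inexact simulation is estimated to cost about the same as, or only slightly more than, the coarse double-precision simulation, which is what makes the resolution upgrade affordable within the same computational budget.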
We study the properties of the induced rounding errors and show for the 1D Burgers model that the stochastic forcing of the stochastic parametrisation scheme can be used to hide rounding errors, and that the magnitude of the stochastic forcing can serve as a first guess for the upper limit of the magnitude of rounding errors that allows simulations with no significant change of the model statistics (see Sect. 6). We argue that rounding errors can show similarities to additive and multiplicative noise when the interval of interest covers one or several values of the exponents of floating-point numbers, as long as the range of rounding errors is small compared to the considered interval. The rounding error noise is almost uncorrelated for the model under investigation. We use the similarity to additive and multiplicative noise and engineer the rounding errors in model simulations with inexact hardware to fit the distribution of the stochastic forcing of the stochastic parametrisation scheme as closely as possible, by adding a small number of operations to the calculation of each prognostic variable in each time step (see Sect. 7). We use the engineered rounding errors to replace the stochastic forcing of the stochastic parametrisation scheme and obtain model results of a quality similar to that of double precision control simulations with stochastic forcing. We show that rounding errors allow new perspectives for ensemble methods, since a very small initial perturbation, within the order of magnitude of the rounding errors, changes the rounding error forcing over the entire simulation. The rounding error forcing is sufficiently uncorrelated to form an ensemble of similar spread and quality compared to ensemble simulations that are based on stochastic forcings (see Sect. 8).
While the large potential for the use of inexact hardware in atmosphere models was already shown in previous studies [7][8][9], this is the first study to test the use of inexact hardware in a grid-point model instead of a spectral model. We show, in a very idealised setup, that rounding errors can improve numerical simulations in ensemble methods in the same way that stochastic forcings of stochastic parametrisation schemes can. Therefore, rounding errors do not necessarily degrade the quality of models in simulations of a system with chaotic dynamics. Instead, they can be beneficial for the representation of sub-grid-scale variability and can induce a reasonable spread in ensemble simulations. We do not argue that rounding errors will always be beneficial or that no stochastic parametrisation should be used in a model with reduced floating-point precision. Certainly, a first approach to the use of inexact hardware should always try to keep rounding errors as small as possible, and random number generators are much easier to tune towards a desired forcing distribution than rounding errors are. The use of engineered rounding errors certainly has limitations: it is unlikely that rounding error forcings can be tuned to replace very sophisticated stochastic forcings that show correlations in space or time, or a state dependence other than multiplicative noise, with no degradation in results. However, it is likely that a combination of rounding errors and stochastic forcings can be used in these cases. Furthermore, biases in rounding error patterns (see Fig. 4) need to be treated carefully. It is also obvious that the results in this paper are very idealised, especially those with engineered rounding errors, and that their robustness needs to be checked in larger and more realistic model setups.
However, we believe that the increase in accuracy that would be possible when using inexact but ultra-efficient hardware will compensate for the increased model error due to rounding errors for weather and climate models and that an increased variability due to hardware errors can actually be beneficial for simulations of a system, such as the atmosphere, for which sub-grid-scale errors are inherently stochastic.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http:// creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Appendix: Terms of the discrete two-scaled system of the Burgers equation
The following terms define the discretised system of the Burgers model and the sub-grid-scale parametrisation scheme used. No changes were made to the terms derived in [5]. In the following, the index c is the index of the coarse cell in which a small-scale parameter y_j is situated; cr and cl denote the indices of the coarse cells in which the right or the left neighbour of the small-scale parameter is located, and jr and jl mark the indices of the small-scale parameters located closest to a boundary between two coarse grid cells. The terms that define the quadratic interactions are of the form

x_cr^2 + x_c x_cr − x_cl^2 − x_c x_cl.

The following terms result from the discretisation of the diffusion term: