In this chapter, we study the workings of 3DVar and SC-4DVar on the same chaotic Lorenz 1963 system as used with ensemble methods in Chap. 15. We will apply both 3DVar and SC-4DVar sequentially over multiple data-assimilation windows, and we will demonstrate the difference between the filter solution obtained by 3DVar and the recursive SC-4DVar smoother solution. We will also dive deeper into the behavior of SC-4DVar with highly nonlinear and chaotic dynamics and try to understand more of the method’s properties and possible limitations in these cases. After studying the 3DVar and 4DVar methods, we compare them with the ensemble methods used in Chap. 15.

1 Data Assimilation Setup

The governing equations of the Lorenz 1963 system are Eqs. (15.1)–(15.3) in Chap. 15. We use the standard parameter setting of \(\sigma = 10\), \(\rho = 28\), and \(\beta = 8/3\), which leads to chaotic dynamics as depicted in Fig. 16.1. In all simulations the starting point for the true run is \((-10,-10,20)^T\), and the time step is \(\Delta t = 0.01\).
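As a minimal sketch of this setup, the following integrates the Lorenz 1963 equations with the parameters, starting point, and time step stated above. The fourth-order Runge-Kutta scheme and all function names are assumptions made for illustration; the text does not state which time-stepping scheme is used.

```python
import numpy as np

SIGMA, RHO, BETA = 10.0, 28.0, 8.0 / 3.0   # standard chaotic parameter setting
DT = 0.01                                   # time step used in the experiments


def lorenz63(state):
    """Right-hand side of the Lorenz 1963 equations."""
    x, y, z = state
    return np.array([SIGMA * (y - x), x * (RHO - z) - y, x * y - BETA * z])


def rk4_step(state, dt=DT):
    """Advance the state by one time step (RK4 is an assumed scheme)."""
    k1 = lorenz63(state)
    k2 = lorenz63(state + 0.5 * dt * k1)
    k3 = lorenz63(state + 0.5 * dt * k2)
    k4 = lorenz63(state + dt * k3)
    return state + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def integrate(state0, n_steps):
    """Return the trajectory as an array of shape (n_steps + 1, 3)."""
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state0
    for k in range(n_steps):
        traj[k + 1] = rk4_step(traj[k])
    return traj


# True run over one 200-step window, starting from the point given in the text.
truth = integrate(np.array([-10.0, -10.0, 20.0]), 200)
```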

We compute the background error covariance in all experiments by sampling a long run of the model every 16 time steps and calculating the sample covariance matrix. After that, we scale this matrix such that the maximum diagonal entry is 4, resulting in

$$\begin{aligned} {\mathbf {C}}_{\textit{xx}} = \begin{pmatrix} 3.10839873 & 3.10666191 & -0.09539367 \\ 3.10666191 & 4.0 & -0.04713786 \\ -0.09539367 & -0.04713786 & 3.52161065 \end{pmatrix}. \end{aligned} \tag{16.1}$$

Note the strong covariance between the x and the y components, related to the two wings in the x-y plane. The covariances with the z component are much smaller. The z component carries no information on which of the two wings the solution is on, see Fig. 16.1. From its construction, we can see that this covariance matrix contains the climatological correlations between the model variables and not the actual correlations at the start of a specific data assimilation experiment. This is a general weakness of variational methods that only try to find the mode of the posterior pdf, because one then ignores information on the uncertainty. The initial condition of the data-assimilation run is the true state at time zero perturbed by a random vector \({\mathbf {C}}_{\textit{xx}}^{1/2} \boldsymbol{\xi }\) in which \(\boldsymbol{\xi }\) is a random vector with elements drawn from a standard normal distribution.
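The construction of the background covariance and of the perturbed initial condition can be sketched as follows. The length of the long run (2000 samples), the RK4 time stepping, the random seed, and the use of a Cholesky factor as the matrix square root are all assumptions for illustration, so the resulting numbers will only resemble, not reproduce, the matrix given above.

```python
import numpy as np


def tendency(u):
    """Lorenz 1963 right-hand side with sigma=10, rho=28, beta=8/3."""
    return np.array([10.0 * (u[1] - u[0]),
                     u[0] * (28.0 - u[2]) - u[1],
                     u[0] * u[1] - 8.0 / 3.0 * u[2]])


def step(s, dt=0.01):
    """One RK4 step (RK4 is an assumed scheme)."""
    k1 = tendency(s)
    k2 = tendency(s + 0.5 * dt * k1)
    k3 = tendency(s + 0.5 * dt * k2)
    k4 = tendency(s + dt * k3)
    return s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def climatological_covariance(n_samples=2000, stride=16, max_variance=4.0):
    """Sample a long run every `stride` steps and rescale the covariance."""
    s = np.array([-10.0, -10.0, 20.0])
    samples = np.empty((n_samples, 3))
    for i in range(n_samples):
        for _ in range(stride):
            s = step(s)
        samples[i] = s
    cov = np.cov(samples, rowvar=False)            # 3 x 3 sample covariance
    return cov * (max_variance / cov.diagonal().max())


rng = np.random.default_rng(0)                     # seed chosen for reproducibility
C_xx = climatological_covariance()
sqrt_C = np.linalg.cholesky(C_xx)                  # one valid choice of C_xx^{1/2}
x_true0 = np.array([-10.0, -10.0, 20.0])
x_prior0 = x_true0 + sqrt_C @ rng.standard_normal(3)   # perturbed initial state
```

Any factor \(\mathbf{L}\) with \(\mathbf{L}\mathbf{L}^T = {\mathbf {C}}_{\textit{xx}}\) yields perturbations with the intended covariance, which is why the Cholesky factor can stand in for the symmetric square root here.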

Fig. 16.1

The plot illustrates the time evolution of the Lorenz 1963 system. Notice the two wings and the transition zone between these, which is mainly responsible for the chaotic dynamics, as seen by the spaghetti-like connections between the wings

We generate observations by sampling the true state at variable time intervals, with uncorrelated observation errors of standard deviation 1.0 and the identity measurement operator. We perform experiments where we observe either all variables or only the y component. When we only observe one variable, we illustrate how an SC-4DVar system (see Chap. 4) propagates or spreads the information from the measurement in time and among the variables.
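A minimal sketch of this observation-generation step is given below. The function name, its arguments, and the placeholder trajectory in the usage lines are hypothetical; only the error standard deviation of 1.0, the uncorrelated errors, and the identity (or y-only) measurement operator come from the text.

```python
import numpy as np


def make_observations(truth, obs_steps, observed=(0, 1, 2), obs_std=1.0, seed=0):
    """Sample the true trajectory at `obs_steps` and add Gaussian noise.

    truth    : array of shape (n_steps + 1, 3) holding the true trajectory
    observed : indices of the observed variables, e.g. (1,) for y only
    Returns the measurement operator H, the observations y, and R.
    """
    rng = np.random.default_rng(seed)
    H = np.eye(3)[list(observed)]                 # selected rows of the identity
    clean = truth[np.asarray(obs_steps)] @ H.T    # shape (n_obs_times, n_observed)
    y = clean + obs_std * rng.standard_normal(clean.shape)
    R = obs_std**2 * np.eye(len(observed))        # uncorrelated observation errors
    return H, y, R


# Placeholder trajectory standing in for a true model run of 200 steps.
truth = np.arange(603, dtype=float).reshape(201, 3)
H, y, R = make_observations(truth, obs_steps=[100, 200], observed=(1,))
```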

Fig. 16.2

The figure shows a typical 3DVar solution for x, y, and z. Red dots denote the observations for each variable at the end of the window, the black line is the true solution, the blue line the prior, and the purple line the analysis. Note that the 3DVar updates the model state only at the end of the window, and the blue and purple lines are identical before the observation time

Fig. 16.3

A typical SC-4DVar solution for x, y, and z. Red dots denote the observations for each variable at the end of the window. The black line is the true solution, the blue line the prior, and the purple line the analysis. In contrast to 3DVar, SC-4DVar updates the model trajectory over the whole assimilation window

In the following, we will run several experiments. We start by comparing SC-4DVar with 3DVar (see Chap. 6) to appreciate the strengths of a smoother over a filter. After that, we will study the SC-4DVar in more detail.

2 Comparing 3DVar and SC-4DVar

In this experiment, we run the system over 100 time steps with observations of all three variables only at the end of the assimilation window. We show the 3DVar solution in the upper half of Fig. 16.2. The black line is the true solution, and the red dots at the end of the time window are the observations. The prior estimate is the blue line, while the purple line denotes the analysis trajectory. Note that the 3DVar only updates the solution at observation time, so the analysis only differs from the prior solution at the last time point.
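To illustrate the update at observation time, the sketch below uses the closed-form minimizer of the 3DVar cost function, \(\mathbf{x}^a = \mathbf{x}^b + {\mathbf {C}}_{\textit{xx}}\mathbf{H}^T(\mathbf{H}{\mathbf {C}}_{\textit{xx}}\mathbf{H}^T + \mathbf{R})^{-1}(\mathbf{y} - \mathbf{H}\mathbf{x}^b)\), which holds for a linear measurement operator and Gaussian errors. Using this closed form instead of an iterative minimization is an assumption, and the prior state and observation values are made up for the example.

```python
import numpy as np

# Climatological background error covariance given in Sect. 1.
C_XX = np.array([[3.10839873, 3.10666191, -0.09539367],
                 [3.10666191, 4.0, -0.04713786],
                 [-0.09539367, -0.04713786, 3.52161065]])


def analysis_3dvar(x_b, y, H, R, B=C_XX):
    """Closed-form minimizer of the 3DVar cost function for linear H."""
    innovation = y - H @ x_b
    S = H @ B @ H.T + R                       # innovation covariance
    return x_b + B @ H.T @ np.linalg.solve(S, innovation)


# Hypothetical prior state and observation of all three variables.
x_b = np.array([-4.0, -6.0, 18.0])
y = np.array([-2.5, -4.0, 19.0])
x_a = analysis_3dvar(x_b, y, H=np.eye(3), R=np.eye(3))
```

Because the update is applied only at the observation time, the trajectory before that time is left untouched, which is the behavior seen in Fig. 16.2.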

In contrast, the SC-4DVar solution in Fig. 16.3 shows that the smoother solution updates the whole trajectory. The purple line is much closer to the truth at any time point. Note that the SC-4DVar used here only updates the initial condition at time zero and then uses the model to fill out the rest of the trajectory. Hence, the SC-4DVar scheme brings the information from the observation from the end of the assimilation window to the beginning. SC-4DVar computes this backward information propagation by solving the adjoint equations, as we have seen in Chap. 4.
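The idea of estimating only the initial condition can be sketched with a small numerical experiment. Note that the sketch does not use the adjoint equations: it replaces them with a Gauss-Newton iteration whose Jacobian is approximated by finite differences, which is only feasible because the state has three components. The RK4 scheme, the step-halving safeguard, the seed, and all function names are assumptions for illustration.

```python
import numpy as np

# Background error covariance from Sect. 1; observation error std is 1.0.
C_XX = np.array([[3.10839873, 3.10666191, -0.09539367],
                 [3.10666191, 4.0, -0.04713786],
                 [-0.09539367, -0.04713786, 3.52161065]])
L_INV = np.linalg.inv(np.linalg.cholesky(C_XX))   # whitens background misfits
OBS_STD = 1.0


def tendency(u):
    """Lorenz 1963 right-hand side with sigma=10, rho=28, beta=8/3."""
    return np.array([10.0 * (u[1] - u[0]),
                     u[0] * (28.0 - u[2]) - u[1],
                     u[0] * u[1] - 8.0 / 3.0 * u[2]])


def forecast(x0, n_steps, dt=0.01):
    """Integrate with RK4 (an assumed scheme) and return the final state."""
    s = np.array(x0, dtype=float)
    for _ in range(n_steps):
        k1 = tendency(s)
        k2 = tendency(s + 0.5 * dt * k1)
        k3 = tendency(s + 0.5 * dt * k2)
        k4 = tendency(s + dt * k3)
        s = s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return s


def residuals(x0, x_b, y, n_steps):
    """Whitened background misfit stacked on the end-of-window data misfit."""
    return np.concatenate([L_INV @ (x0 - x_b),
                           (forecast(x0, n_steps) - y) / OBS_STD])


def cost(x0, x_b, y, n_steps):
    """Strong-constraint cost: half the squared norm of the residuals."""
    r = residuals(x0, x_b, y, n_steps)
    return 0.5 * r @ r


def sc4dvar(x_b, y, n_steps, n_iter=10, eps=1e-6):
    """Gauss-Newton on the initial condition, with step halving."""
    x0 = np.array(x_b, dtype=float)
    for _ in range(n_iter):
        r = residuals(x0, x_b, y, n_steps)
        jac = np.column_stack([
            (residuals(x0 + eps * e, x_b, y, n_steps) - r) / eps
            for e in np.eye(3)])
        step = np.linalg.lstsq(jac, r, rcond=None)[0]
        if np.linalg.norm(step) < 1e-10:
            break
        alpha = 1.0
        while alpha > 1e-4:                    # accept only cost-reducing steps
            trial = x0 - alpha * step
            if cost(trial, x_b, y, n_steps) < 0.5 * r @ r:
                x0 = trial
                break
            alpha *= 0.5
    return x0


rng = np.random.default_rng(1)                 # seed chosen for reproducibility
x_true0 = np.array([-10.0, -10.0, 20.0])
y = forecast(x_true0, 100) + OBS_STD * rng.standard_normal(3)
x_b = x_true0 + np.linalg.cholesky(C_XX) @ rng.standard_normal(3)
x_a = sc4dvar(x_b, y, n_steps=100)             # analysed initial condition
```

The analysis trajectory over the window then follows by integrating the model from `x_a`, so the update is felt at every time point even though only the initial state is adjusted.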

Fig. 16.4

Typical SC-4DVar solution when the solution changes wings in the Lorenz 1963 system. Red dots denote the observations for each variable at the end of the window, the black line is the true solution, the blue line the prior, and the purple line the SC-4DVar analysis. Note that the SC-4DVar updates the model trajectory over the whole assimilation window but is unable to find the true trajectory

Fig. 16.5

Typical SC-4DVar solution when the true trajectory changes wings in the Lorenz 1963 system, as in Fig. 16.4, but now with 4 times as many observations spread out over the assimilation window (red dots). The black line is the true solution, the blue line the prior, and the purple line the SC-4DVar analysis

Fig. 16.6

SC-4DVar solution when the true trajectory changes wings in the Lorenz 1963 system with only the y variable observed at only 2 times in the assimilation window (red dots). The black line is the true solution, the blue line the prior, and the purple line the SC-4DVar analysis. Note the strong performance of the SC-4DVar

3 Sensitivity to Observation Density in SC-4DVar

We will now examine the sensitivity of the SC-4DVar solution to the observation density in the assimilation window. We start by extending the assimilation window to 200 time steps, still only observing the state at the end of this window. As can be seen in Fig. 16.4, this is a challenging problem for SC-4DVar. In trying to fit the observations in the first two variables, the solution is worse for the third variable. The problem is complicated because the model trajectory passes through the unstable region where the two wings meet. In this region, the model evolution is very sensitive, and small perturbations will make the model solution go to one or the other wing. Since we assume no model errors, this strong sensitivity is carried over directly to the initial conditions, which the SC-4DVar is trying to estimate. This strong sensitivity manifests itself via multiple minima of the SC-4DVar cost function. We will elaborate on the appearance of local minima in the cost function, that is, multiple modes in the posterior pdf, in a later section. Finally, we should mention that if the truth stays in the stable regime in one of the wings of the attractor for a long time, the SC-4DVar can follow that solution over multiple oscillations.

The situation improves if we add more observations over the assimilation window. Figure 16.5 demonstrates that if we observe this system every 50 time steps, SC-4DVar can find the initial condition that follows the truth quite well for 200 time steps. This means that the extra observations remove the multiple minima in the cost function, at least for the present prior initial conditions, as we will elaborate on in Sect. 16.5.

4 3DVar and SC-4DVar with Partial Observations

We now run an experiment in which we only observe the y component and compare this to the case where we observe all three variables. We sample the observations every 100 time steps in a 200-time-step assimilation window to make this a challenge.

We show the results from only observing the y variable in Fig. 16.6. The results from observing all three variables, in contrast, are indistinguishable from those displayed in Fig. 16.5 once the observations at 50 and 150 time steps are disregarded. It is remarkable how well the SC-4DVar performs. To put this in perspective, we also compare these results with running 3DVar in this configuration in Fig. 16.7.

First, we notice that the 3DVar has to make a few strong adjustments to stay close to the true evolution of the system. The x variable is strongly updated in the right direction even though we only observe y, because of the 3DVar prior covariance matrix, which has a strong covariance between x and y, see Eq. (16.1). But the SC-4DVar is truly remarkable. It takes the influence from the observations of y at 100 and 200 time steps and brings those back to the initial condition at time zero via the adjoint equations. It updates all the variables and reruns the model over the window, providing perfect updates for x and z.
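To make the size of this cross-variable update concrete, consider the closed-form gain for a linear measurement operator, which we assume describes the 3DVar update here. With only y observed, \(\mathbf{H} = (0\;\,1\;\,0)\), an observation error variance of \(\sigma_o^2 = 1\) (the symbol \(\sigma_o\) is introduced only for this example), and \({\mathbf {C}}_{\textit{xx}}\) from Eq. (16.1), the gain is the second column of \({\mathbf {C}}_{\textit{xx}}\) divided by the innovation variance:

$$\begin{aligned} \mathbf{K} = {\mathbf {C}}_{\textit{xx}}\mathbf{H}^T \left( \mathbf{H}{\mathbf {C}}_{\textit{xx}}\mathbf{H}^T + \sigma_o^2 \right)^{-1} = \frac{1}{4.0 + 1.0} \begin{pmatrix} 3.10666191 \\ 4.0 \\ -0.04713786 \end{pmatrix} \approx \begin{pmatrix} 0.621 \\ 0.800 \\ -0.009 \end{pmatrix}. \end{aligned}$$

Under these assumptions, a unit innovation in y moves x by about 0.62 and y by 0.80, while z is left almost unchanged, consistent with the strong x adjustments and weak z adjustments described above.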

Fig. 16.7

3DVar solution when the true trajectory changes wings in the Lorenz 1963 system with only the y variable observed at only 2 times in the assimilation window (red dots). The black line is the true solution, the blue line the prior, and the purple line the 3DVar analysis. Note the strong adjustments of the 3DVar at observation times

Fig. 16.8

Strong constraint penalty function for the Lorenz model as a function of the initial x-value, keeping y and z constant, when using data in the intervals \(t\in [0,2]\) (blue), \(t\in [0,4]\) (red), and \(t\in [0,6]\) (green), from Evensen (2009b)

5 Sensitivity to the Length of Assimilation Window

A chaotic system such as Lorenz 1963 displays extreme sensitivity to small perturbations in initial conditions. (This extreme sensitivity is one of the definitions of chaos in the first place.) Many geophysical systems, such as the atmosphere, ocean, and climate systems, are chaotic, which is why the present experiments are important. It will then come as no surprise that the SC-4DVar, which only updates the initial conditions, will be very sensitive to the nonlinearities in the likelihood, the actual realization of the measurement error, the data-assimilation window length, and the prior. We will discuss some of these sensitivities below.

It is well known that the sensitivity grows with the length of the assimilation window, as we have seen in previous sections. Figure 16.8 shows the shape of the cost function plotted as a function of the x variable at the initial time for three different assimilation-window lengths. This cost function is the one that the SC-4DVar will minimize. The details depend on the prior initial condition and measurements and their error covariances, but the figure demonstrates the point. We see that, in this case, even for an assimilation window of length two non-dimensional time units, which corresponds to 200 time steps, multiple minima appear. And when the assimilation-window length increases to six, the cost function is very irregular indeed, with hundreds of local minima.
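A cross-section of this kind can be explored with the sketch below, which evaluates the strong-constraint cost for many initial x-values while keeping y and z fixed, and counts the local minima along the section. It does not reproduce the setup of Fig. 16.8: the reference state, the prior mean placed at the true initial state, the observation of all variables every 25 steps, the RK4 scheme, and the seed are all assumptions chosen for illustration.

```python
import numpy as np

# Background error covariance from Sect. 1.
C_XX = np.array([[3.10839873, 3.10666191, -0.09539367],
                 [3.10666191, 4.0, -0.04713786],
                 [-0.09539367, -0.04713786, 3.52161065]])
C_INV = np.linalg.inv(C_XX)


def rk4(s, dt=0.01):
    """One RK4 step (assumed scheme) for a batch of states of shape (n, 3)."""
    def f(u):
        x, y, z = u[:, 0], u[:, 1], u[:, 2]
        return np.stack([10.0 * (y - x), x * (28.0 - z) - y,
                         x * y - 8.0 / 3.0 * z], axis=1)
    k1 = f(s)
    k2 = f(s + 0.5 * dt * k1)
    k3 = f(s + 0.5 * dt * k2)
    k4 = f(s + dt * k3)
    return s + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def cost_section(x_values, x_ref, n_steps, obs_every=25, obs_std=1.0, seed=0):
    """Cost versus initial x, with y and z fixed at the values in x_ref.

    x_ref serves both as the true initial state and as the prior mean.
    """
    rng = np.random.default_rng(seed)
    x_ref = np.asarray(x_ref, dtype=float)
    starts = np.tile(x_ref, (len(x_values), 1))
    starts[:, 0] = x_values
    dx = starts - x_ref
    cost = 0.5 * np.einsum("ni,ij,nj->n", dx, C_INV, dx)   # background term
    truth, s = x_ref[None, :], starts
    for k in range(1, n_steps + 1):
        truth, s = rk4(truth), rk4(s)
        if k % obs_every == 0:                              # observation time
            y = truth + obs_std * rng.standard_normal((1, 3))
            cost += 0.5 * np.sum((s - y) ** 2, axis=1) / obs_std**2
    return cost


def count_local_minima(c):
    """Number of interior grid points lower than both neighbours."""
    return int(np.sum((c[1:-1] < c[:-2]) & (c[1:-1] < c[2:])))


x_ref = np.array([-10.0, -10.0, 20.0])
x_values = np.linspace(-15.0, -5.0, 801)
for window in (200, 400, 600):                  # window lengths 2, 4, and 6
    section = cost_section(x_values, x_ref, window)
    print(window, count_local_minima(section))
```

Because the random generator is reseeded for every call, the longer windows reuse the observations of the shorter ones and only add later data, which mirrors the nested intervals used in Fig. 16.8.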

The question then becomes how it is possible for SC-4DVar, which is a gradient method, to do a reasonable job on this system in the first place. The answer is threefold. First, the blue curve, for the assimilation window of length two, shows that the global minimum is at \(x=1.8\). If the prior mean were close to that, say at \(x=2.8\), the SC-4DVar would find the global minimum. An initial error of the order of one, as in this case, or smaller, is not uncommon. However, if we started with a similar initial error of one, but now at \(x=0.8\), the 4DVar solution would move off to the left, and it would find not the global minimum but the local minimum at \(x=-1.3\). This discussion shows that the first guess, typically the prior mean, plays a significant role in chaotic systems.

Another reason for the excellent performance of SC-4DVar is that Fig. 16.8 does not show the full cost function in the 3-dimensional solution space. The local minima shown in this cross-section may be connected by “valleys” in 3-dimensional space. In that case, the local minima displayed here might not be actual local minima. Of course, with longer assimilation windows such as four and six, it is implausible that there will not be many local minima.

The final reason why 4DVar often gives reasonable answers is the width of the prior pdf. In Fig. 16.8 the prior was relatively wide. One can imagine that a much narrower prior will smooth out the cost function because many of the local minima cannot be reached within the set assimilation-window length. A clear example comes from numerical weather prediction. ECMWF is the only operational center that performs an SC-4DVar over a 12-hour window, while the other centers use 6-hour windows. The reason is twofold: the highly accurate prior in the 4DVar system and the superior treatment of the complex satellite observations.

One could now pose the question why we want to run long-window SC-4DVar. The main reasons are the accurate covariance between variables and the number of observations. A longer assimilation window means that we use more future measurements to find the best estimate. Remember that the SC-4DVar uses the adjoint technique, which tells us how small perturbations around a fully nonlinear model trajectory move through the assimilation window. Hence, we can interpret this as having a space-time covariance matrix around a fully nonlinear model trajectory. This space-time covariance matrix adapts to the system’s local space-time dynamics through its connection with the nonlinear model trajectory. As a result, as long as the perturbations remain small, this implicit prior covariance matrix over the whole assimilation window is superior to anything we can generate otherwise. For instance, in any ensemble smoother, we would typically need the localization of the ensemble space-time covariance matrix, which we avoid here.

However, it is good to remember that the above statement is only valid when the linearization is accurate, that is, when the prior remains close enough to the true system, both in terms of its mean and a small uncertainty. In a chaotic system, the time window over which the linearization remains valid is always finite, and the only fundamental way to avoid this issue is to include model errors. Model errors allow for adaptation of the model trajectory within the assimilation window, removing part of the sensitivity to the initial conditions.

A practical way to partly remove the sensitivity to the initial conditions over the assimilation window is to divide the assimilation window up into smaller pieces. We first run an SC-4DVar over the first piece, likely resulting in a reasonable initial condition estimate because of the shorter window length. Then this more accurate estimate is used as the first guess in a window that is twice as long. Since the first guess is better, we again can assume a more precise 4DVar solution over this longer window. We can repeat this procedure until we finally cover the whole original assimilation window. This idea by Pires et al. (1996) significantly improves the results of SC-4DVar in small-dimensional and highly nonlinear systems.
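The window-doubling procedure can be sketched as a simple driver loop. Here `solve_window` is a hypothetical user-supplied SC-4DVar solver that takes a first guess and a window length and returns the estimated initial condition; the function names and the handling of a final window that is not an exact doubling are assumptions for illustration.

```python
def quasi_static_windows(n_first, n_total):
    """Window lengths that double from n_first until n_total is covered."""
    lengths, n = [], n_first
    while n < n_total:
        lengths.append(n)
        n *= 2
    lengths.append(n_total)          # final pass over the full window
    return lengths


def quasi_static_4dvar(solve_window, first_guess, n_first, n_total):
    """Solve over growing windows, recycling each analysis as first guess.

    solve_window(first_guess, n_steps) is a hypothetical SC-4DVar solver
    returning the estimated initial condition for that window length.
    """
    x0 = first_guess
    for n_steps in quasi_static_windows(n_first, n_total):
        x0 = solve_window(x0, n_steps)
    return x0


# Window lengths visited for a 200-step window starting from 50 steps.
print(quasi_static_windows(50, 200))   # prints [50, 100, 200]
```

Every window starts at time zero, so each pass refines the same initial condition while gradually exposing the minimization to later observations.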

Another practical solution employed by many operational weather prediction centers is to run the initial outer loop iterations of the SC-4DVar with a reduced model resolution and a reduced observation set. The resolution and observation density increase in later outer loop iterations, slowly bringing in more nonlinearity while starting each outer-loop iteration from a more accurate first guess. This approach leads to a much more linear data-assimilation problem that is suitable for SC-4DVar.

Fig. 16.9

SC-4DVar results over 10 assimilation windows of 200 time steps each. Truth (black), prior (blue) and analysis (purple). The solution is very accurate, and the black line is almost completely covered by the purple line

Fig. 16.10

RMSE versus time for SC-4DVar run over 10 assimilation windows. The blue line is the background error and the purple line is the analysis error

6 SC-4DVar with Multiple Assimilation Windows

We will now show results from running SC-4DVar over multiple data assimilation windows. Figure 16.9 shows results from 10 assimilation windows observing all three variables every 50 time steps in an assimilation window of 200 time steps. With this observation frequency, observing only the y variable did not work well after two assimilation windows, showing that one has to be careful with results from short data-assimilation experiments.

Figure 16.10 demonstrates that the root-mean-square error (RMSE) of the analysis is small, of the order of the observation error standard deviation of 1, as expected. The background error fluctuates dramatically around 100 and 1400 time steps, typically related to the end of a forecast window.
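For reference, an RMSE time series of this kind can be computed as sketched below. We assume here that the error is averaged over the three state variables at each time; the text does not spell out the exact definition used for Fig. 16.10, and the numbers in the usage lines are made up.

```python
import numpy as np


def rmse_per_time(estimate, truth):
    """RMSE over the state variables at each time.

    Both inputs have shape (n_times, 3); the result has shape (n_times,).
    """
    return np.sqrt(np.mean((estimate - truth) ** 2, axis=1))


# Made-up example: unit error in every variable, then an error of 3 in x only.
truth = np.zeros((2, 3))
estimate = np.array([[1.0, 1.0, 1.0],
                     [3.0, 0.0, 0.0]])
errors = rmse_per_time(estimate, truth)
```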

Finally, in Fig. 16.11 we zoom in on the solution at a transition between two assimilation windows when we observe only variable y. We see that even the analysis is not smooth at this transition. This result should not come as a surprise because the minimizations at each side of the transition see a different set of observations. Such jumps are standard in SC-4DVar solutions and are present in atmospheric reanalyses. They arise because one cannot run an SC-4DVar over too long a window because of the sensitivity to initial conditions in a chaotic system such as the atmosphere or the ocean. To avoid these jumps at the end of assimilation windows, one would have to run a weak-constraint 4DVar, in which the influence of the initial conditions becomes negligible after some time. Indeed, the oceanographic ECCO system does provide smoother solutions with assimilation windows of 50 years or more, albeit at relatively low spatial resolution.

Fig. 16.11

Zoom-in on the analysis solution at the boundary of two assimilation windows. Note that the total solution is not smooth in time because the two solutions on either side of time \(t=4\), corresponding to 400 time steps, are from two independent minimizations

7 A Comparison with Ensemble Methods

Finally, we make a comparison with ensemble methods. We can directly compare the 3DVar to an EnKF. The main difference is that the 3DVar uses a climatological prior covariance matrix, and the EnKF an ensemble-based dynamical prior covariance matrix at each observation time. If the ensemble is large enough, the latter will be more accurate as it contains flow information, while the former does not. As an example, just after the solution passes the transition region where it is decided which wing the system will be in, the uncertainty is high, and the EnKF ensemble spread is large, see the red line in the lower panel of Fig. 15.2. The enhanced uncertainty allows for the following observation to firmly pull the ensemble to the correct wing, as seen in the upper panel of that figure.

Conversely, when the solution circles around in one of the wings, the error growth is small and the ensemble spread remains small, demonstrating that the uncertainty in the solution is small. The 3DVar prior covariance matrix, in contrast, is some average of these situations. The climatological variance of the x variable is 3.1, while, in the EnKF, it is of order 1.5. We see that the 3DVar covariance values are quite large and hence conservative. This situation is similar for a 4DVar, and neither of these methods will perform well without these conservative prior covariances.

We should compare 4DVar with the ES and the EnKS or their iterative variants. One apparent issue is that the SC-4DVar struggles with window lengths larger than two, that is, 200 time steps, while in the previous chapter, we saw in Fig. 15.1 that the ES manages to follow the truth quite well, placing the solution in the correct wing all the time, for 4000 time steps. The ES solution was imperfect, and analysis errors were significant but still consistent with the ensemble spread. The main difference is that the SC-4DVar only adjusts the initial conditions, while the ES updates the whole model trajectory at every time step. Hence, the ES can follow the observations quite well.

The EnKS and the SC-4DVar perform quite similarly as long as the assimilation windows for the latter are not too long, typically two non-dimensional time units or shorter for the Lorenz 1963 system. Since SC-4DVar decouples the analysis windows, jumps will occur between assimilation windows, as shown in Fig. 16.11. Inside a window, the solution is perfectly smooth, following the model equations exactly. In contrast, the EnKS update also affects the previous assimilation windows, and we obtain a smooth solution from one assimilation window to the next, but the model equations are only followed exactly between updates.