An efficient ensemble Kalman Filter implementation via shrinkage covariance matrix estimation: exploiting prior knowledge

In this paper, we propose an efficient and practical implementation of the ensemble Kalman filter via shrinkage covariance matrix estimation. Our filter implementation combines the information brought by an ensemble of model realizations with that based on our prior knowledge about the dynamical system of interest. We perform the combination of both sources of information via optimal shrinkage factors. The method exploits the rank-deficiency of ensemble covariance matrices to provide an efficient and practical implementation of the analysis step in EnKF-based formulations. Localization and inflation aspects are discussed as well. Experimental tests are performed to assess the accuracy of our proposed filter implementation by employing an Advection Diffusion Model and an Atmospheric General Circulation Model. The experimental results reveal that our proposed filter implementation can mitigate the impact of sampling noise and, moreover, reduce the impact of spurious correlations during assimilation steps.


Introduction
A dynamical system approximately evolves according to some imperfect numerical model:

x_k ≈ M(x_{k−1}),  (1)

where n and m are the model resolution and the number of observations, respectively, and M : R^{n×1} → R^{n×1} is an imperfect model operator which mimics the behavior of a highly non-linear system such as the ocean and/or the atmosphere. In this representation, the model operator maps the state variable into a sequence of time-step realizations of the behavior of the dynamical system. In most cases, the operator includes a control variable that relates external inputs to the system and allows for the representation of the interactions between the system and the external world. The state variable may or may not be directly measurable and acts as the memory of the system. As seen in Eq. 1, the past behavior of the system affects its future development, but a lack of representation of the state variable may be a pitfall for the full representation of the real world. The relationship between the state space and the real noisy observation y ∈ R^{m×1} is a useful tool for the proper understanding and representation of the full system.
Controllability is a property of a dynamical system that measures the ability of a particular control input to manipulate all the states of the system, taking them from a point A to a point B in finite time. Observability, on the other hand, measures the ability of a particular sensor configuration to supply all the information necessary to estimate all the states of the system. State estimation and parameter estimation are typically the main concerns in control and systems theory; they are required for proper control law design and are mandatory for the full observability of the system.
When there is a lack of observability, the problem of state and parameter estimation arises, and it can be addressed by solving the optimal filtering problem. This requires an analytical solution of Bayes' theorem by means of the Kushner or Zakai equations, which are not feasible for non-linear and non-Gaussian systems; they are most often approximated via particle filters [34,35]. The linear and Gaussian case is solved by the well-known Kalman filter, and its extensions to non-linear Gaussian cases can be found extensively in the literature. For large-scale systems, completing the full observability of the system is not straightforward because of the curse of dimensionality, and more sophisticated solutions to the optimal filtering problem have been derived.
Sequential Data Assimilation (DA) is a statistical process that optimally combines the information brought by an imperfect numerical forecast x^b ∈ R^{n×1} and a real noisy observation y ∈ R^{m×1} [2,9] to estimate the actual state x^* ∈ R^{n×1} of a dynamical system such as Eq. 1. When Gaussian assumptions are made on prior and observational errors, via Bayes' rule the posterior estimate has the form:

x^a = x^b + A · H^T · R^{−1} · d,  (2)

where B ∈ R^{n×n} is the background error covariance matrix, d = y − H(x^b) ∈ R^{m×1} is the vector of innovations (on the observations), H(x) : R^{n×1} → R^{m×1} is the observation operator (which maps vector states to observations) with linearization H(x) ≈ H(x^b) + H · (x − x^b), H ∈ R^{m×n} is the Jacobian of H(x) at x^b, the information matrix reads:

A^{−1} = B^{−1} + H^T · R^{−1} · H ∈ R^{n×n},  (3)

and R ∈ R^{m×m} is the estimated data-error covariance matrix. In practice, an ensemble of model realizations can be employed to estimate the parameters x^b and B of the prior error distribution. However, given the computational cost of a single model propagation, ensemble sizes are constrained to the order of hundreds, while the dimension of the underlying error distribution is in the millions. Consequently, sampling errors impact the quality of analysis innovations: ensemble covariances are rank-deficient and, even more, ill-conditioned [1,32]. Thus, spurious correlations among distant model components develop in the ensemble covariance [29]. Localization methods are commonly employed during assimilation steps to mitigate the impact of sampling noise. In this context, well-known methods are covariance matrix localization, precision matrix localization, spatial domain localization, and observation impact localization. The selection of one method over the others relies on computational aspects. Yet another manner to mitigate the impact of spurious correlations is based on Shrinkage Covariance Matrix Estimation.
In this family of covariance matrix estimators, the background error covariance matrix is estimated as the convex combination of a target matrix T ∈ R^{n×n} and the ensemble covariance P^b ∈ R^{n×n}:

B̂ = γ · T + (1 − γ) · P^b, with γ ∈ [0, 1].  (4)

The current literature proposes ensemble-based formulations via the covariance estimator (4) in which: 1. the target matrix T is diagonal (no prior structure is assumed for B), and the weight γ is optimally computed via loss functions [30,31], or 2. the target matrix T is static (i.e., it retains climatological information), and the weight γ is ranged in γ ∈ [0, 1] [43,44].
We exploit the opportunity to combine our prior knowledge about the structure of B, the information brought by samples from the model dynamics, and an optimal estimation of γ. In this manner, we obtain a covariance matrix estimator of B that optimally combines all sources of information. While several techniques have been proposed to reduce spurious correlations, most of them are designed for a specific problem, and it is not possible to generalize them to other DA implementations [10,24]. We are looking for a robust and generalizable manner to include previous knowledge of the system in a large-scale Chemical Transport Model (CTM) for air-quality purposes. Data assimilation is not necessarily the most popular technique for incorporating reality into a CTM: some practitioners prefer to spend more time developing emission inventories rather than incorporating ground data, satellite information, or vertical measurements. Nevertheless, some applications have been developed recently for the CTM LOTOS-EUROS [10,17,24], in which the particular advection and diffusion dynamics govern and condition the emission and deposition processes. For the north of South America, the highly non-linear and chaotic behavior of weather dynamics, combined with a complex topography and a lack of emission inventories, is indeed a challenge for the description and forecasting of air pollution. Figure 1 shows an example based on the high-resolution application of the LOTOS-EUROS model to study the behavior of PM 10 and PM 2.5 over the Metropolitan Area of the Aburrá Valley in Colombia [23].
Here, it is possible to see how the complex topography, which is not well captured by the meteorology and the model, conditions the dynamical relations between the states. Traditional localization techniques such as covariance localization, used to avoid spurious correlations, are neither suitable nor directly applicable to this problem. The idea of designing a covariance matrix into which knowledge of the system can be integrated during the DA process comes from this application and its related difficulties with current localization techniques. This paper is organized as follows: Section 2 discusses well-known issues in ensemble-based data assimilation and how to overcome them. In Section 3, we propose an efficient ensemble-based method via Shrinkage Covariance Matrix Estimation which accounts for prior knowledge about the background error correlations; localization and inflation aspects are discussed as well. Experimental tests are performed in Section 4. Two models are employed during the experiments: the Advection Diffusion Model and the highly non-linear model SPEEDY. Conclusions from this research are stated in Section 5.

Preliminaries
In order to state the value of the current contribution, several questions must be answered to demonstrate the feasibility of the new data assimilation technique in an operational fashion [46]: Does the new method provide guidance that is of higher quality or more use than existing methods? Is the potential benefit of running the new technique cost-effective? Is the new method sufficient with respect to old methods? In this section, we discuss ensemble-based data assimilation methods and how they can be implemented in current operational settings. These concepts are necessary to develop our filter formulation.

Ensemble-based data assimilation
In ensemble-based data assimilation, an ensemble of model realizations

X^b = [x^b[1], x^b[2], …, x^b[N]] ∈ R^{n×N}  (5)

is employed to estimate the parameters x^b and B of the prior error distribution, where x^b[e] ∈ R^{n×1} is the e-th ensemble member, for 1 ≤ e ≤ N, and N stands for the ensemble size. Hence:

x^b ≈ x̄^b = (1/N) · Σ_{e=1}^{N} x^b[e] ∈ R^{n×1},  (6)

and

B ≈ P^b = (1/(N−1)) · ΔX · ΔX^T ∈ R^{n×n},  (7)

where

ΔX = X^b − x̄^b · 1^T ∈ R^{n×N}  (8)

is the matrix of member deviations, x̄^b is the ensemble mean, P^b is the ensemble covariance, and 1 is a vector of consistent dimension whose components are all ones.
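For concreteness, the ensemble mean, deviation matrix, and ensemble covariance of Eqs. 6-8 can be computed as in the following minimal NumPy sketch (function and variable names are illustrative, not part of any fixed API):

```python
import numpy as np

def ensemble_statistics(Xb):
    """Ensemble mean, deviations, and covariance from members stored as columns.

    Xb has shape (n, N): n state components, N ensemble members.
    Returns the ensemble mean (n,), the deviation matrix ΔX (n, N),
    and the ensemble covariance P^b (n, n).
    """
    n, N = Xb.shape
    xb_mean = Xb.mean(axis=1)          # x̄^b = (1/N) Σ_e x^b[e]
    dX = Xb - xb_mean[:, None]         # ΔX = X^b − x̄^b · 1^T
    Pb = dX @ dX.T / (N - 1)           # P^b = ΔX ΔX^T / (N−1)
    return xb_mean, dX, Pb
```

Note that, since the deviation columns sum to zero, P^b has rank at most N − 1, which is the rank-deficiency discussed above.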
Once an observation is available, the posterior state can be computed via the stochastic Ensemble Kalman Filter (EnKF) [9]:

X^a = X^b + P^b · H^T · [H · P^b · H^T + R]^{−1} · D ∈ R^{n×N},  (9)

where the e-th column of the innovation matrix on the synthetic observations D ∈ R^{m×N} reads

d[e] = y + ε[e] − H · x^b[e], with ε[e] ∼ N(0, R).  (10)
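The stochastic EnKF analysis step with perturbed observations can be sketched as follows. This is a dense, illustrative implementation that does not scale to large n; all names are our own:

```python
import numpy as np

def enkf_analysis(Xb, H, R, y, rng=None):
    """Stochastic EnKF analysis step (perturbed observations).

    Xb: (n, N) forecast ensemble; H: (m, n) linear observation operator;
    R: (m, m) observation-error covariance; y: (m,) observation.
    Implements X^a = X^b + P^b H^T (H P^b H^T + R)^{-1} D.
    """
    rng = np.random.default_rng() if rng is None else rng
    n, N = Xb.shape
    m = y.size
    dX = Xb - Xb.mean(axis=1, keepdims=True)
    Pb = dX @ dX.T / (N - 1)
    # Synthetic (perturbed) observations: d[e] = y + eps[e] − H x^b[e]
    eps = rng.multivariate_normal(np.zeros(m), R, size=N).T
    D = y[:, None] + eps - H @ Xb
    S = H @ Pb @ H.T + R
    return Xb + Pb @ H.T @ np.linalg.solve(S, D)
```

With a small observation error covariance, the analysis ensemble mean is pulled strongly toward the observed components of y.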

Covariance Matrix Localization
For small ensemble sizes, sampling errors can impact the quality of the covariances in Eq. 7. As a direct consequence, problems such as filter divergence and long-range spurious correlations can appear [1]. Localization is based on the assumption that, for most geophysical systems, two distant parts of the system are independent. The two main localization methods are domain localization and covariance localization. Domain localization, also called local analysis, instructs that instead of performing a global analysis for the complete domain, a local analysis can be applied using just local observations [3]. Covariance localization cuts off longer-range correlations in the error covariances at a specified distance [13,33]. The localization is performed through the Schur (element-wise) product, denoted by ∘. Positive definite covariance matrices can be built with the Gaspari-Cohn function G(d) [11], a fifth-order piecewise polynomial with compact support. A cutoff matrix can then be defined by

[ρ]_{i,j} = G(d(i, j)/r),

where d(i, j) is the distance between the i-th and j-th model components and r is a length scale called the localization radius [3,37]. The regularized ρ ∘ P^b is used as a replacement for P^b.
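The covariance localization described above can be sketched as follows, using the standard fifth-order Gaspari-Cohn function and one-dimensional coordinates for simplicity (function names are illustrative):

```python
import numpy as np

def gaspari_cohn(z):
    """Gaspari-Cohn fifth-order piecewise function G(z), z = d/c >= 0.

    G(0) = 1 and G(z) = 0 for z >= 2 (compact support).
    """
    z = np.atleast_1d(np.abs(np.asarray(z, dtype=float)))
    g = np.zeros_like(z)
    m1 = z <= 1.0
    m2 = (z > 1.0) & (z <= 2.0)
    z1 = z[m1]
    g[m1] = 1 - (5/3)*z1**2 + (5/8)*z1**3 + 0.5*z1**4 - 0.25*z1**5
    z2 = z[m2]
    g[m2] = (4 - 5*z2 + (5/3)*z2**2 + (5/8)*z2**3
             - 0.5*z2**4 + (1/12)*z2**5 - 2/(3*z2))
    return g

def localize(Pb, coords, radius):
    """Schur (element-wise) product rho ∘ P^b with rho built from G."""
    d = np.abs(coords[:, None] - coords[None, :])   # pairwise 1-D distances
    rho = gaspari_cohn(d / radius)
    return rho * Pb
```

Correlations at distances beyond twice the localization radius are cut to exactly zero, which removes long-range spurious correlations from the sample covariance.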

Shrinkage covariance matrix estimation
A more robust family of covariance estimators for the DA case n ≫ N are the shrinkage-based estimators [8,42]. These estimators follow the form [22]:

B̂ = α · T + (1 − α) · P^b,  (12)

where α ∈ [0, 1], and T ∈ R^{n×n} is known as the target matrix. The resulting estimator is a convex combination of the ensemble covariance matrix and the pre-defined matrix T. When there is no available information about the structure of B, an alternative for T is [31]:

T = (tr(P^b)/n) · I,  (13)

where I ∈ R^{n×n} is the identity matrix. The value of α is chosen to minimize the loss function

L(α) = E[ ‖B̂ − B‖²_F ],  (14)

where ‖·‖_F represents the Frobenius norm. For target matrices of the form (13), a distribution-free formulation for the optimal α*_LW is proposed by Ledoit and Wolf in [20]:

α*_LW = min( [ (1/N²) · Σ_{e=1}^{N} ‖x[e] · (x[e])^T − P^b‖²_F ] / [ tr((P^b)²) − tr²(P^b)/n ], 1 ),  (15)

where x[e] ∈ R^{n×1} denotes the e-th column of the matrix (8). Based on the LW estimator, for Gaussian samples, the Rao-Blackwell Ledoit and Wolf (RBLW) estimator is proposed.
In the RBLW estimator, the optimal weight is defined by:

α*_RBLW = min( [ ((N−2)/N) · tr((P^b)²) + tr²(P^b) ] / [ (N+2) · ( tr((P^b)²) − tr²(P^b)/n ) ], 1 ).  (16)

An EnKF implementation which exploits the special structure of this estimator is the EnKF based on the RBLW estimator (EnKF-RBLW), wherein the posterior ensemble can be built as follows [30,31]:

X^a = X^b + B̂ · H^T · [H · B̂ · H^T + R]^{−1} · D, with B̂ = α*_RBLW · T + (1 − α*_RBLW) · P^b.  (17)

Since numerical models can be highly non-linear, Gaussian assumptions on prior members are commonly broken. This assumption can be relaxed in the EnKF context by employing, for instance, the LW estimator for the estimation of background error covariance matrices during assimilation steps [28]. Besides, different prior structures can be treated in T to enrich the covariance matrix estimation, that is, to account for prior information about the dynamical system.
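A sketch of RBLW shrinkage toward a scaled-identity target might look as follows. The closed-form weight follows the formula of Chen, Wiesel, and Hero; the 1/N normalization of the sample covariance matches their derivation, and all function names are our own:

```python
import numpy as np

def rblw_shrinkage(X):
    """Rao-Blackwellized Ledoit-Wolf shrinkage toward a scaled identity.

    X: (n, N) samples stored as columns (e.g., centered ensemble deviations).
    Returns (alpha, B_hat) with B_hat = alpha*T + (1-alpha)*S and
    T = (tr(S)/n) * I.
    """
    n, N = X.shape
    S = X @ X.T / N                    # sample covariance (1/N normalization)
    trS = np.trace(S)
    trS2 = np.trace(S @ S)
    num = (N - 2) / N * trS2 + trS**2
    den = (N + 2) * (trS2 - trS**2 / n)
    alpha = min(1.0, num / den)        # clip the optimal weight to [0, 1]
    T = (trS / n) * np.eye(n)
    return alpha, alpha * T + (1 - alpha) * S
```

Note that for any alpha > 0 the shrunk matrix is full rank even when n ≫ N, which is precisely why shrinkage repairs the rank deficiency of the ensemble covariance.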

An ensemble Kalman Filter via shrinkage covariance matrix estimation and prior knowledge
In this Section, a novel EnKF implementation that incorporates prior knowledge of the background error covariance matrix in a practical manner to improve the DA process is presented. The method is based on a shrinkage estimator using a general target matrix. An efficient and totally parallelizable implementation of the method for high-dimensional systems is also proposed.

Filter derivation
As mentioned above, shrinkage-based covariance matrix estimators which allow the use of a target matrix T to structure the covariance matrix are limited to a target matrix with identity structure [31,39]. Although an identity target structure can reduce the spurious correlations caused by the ill-conditioned approximation of the error covariance matrix [7,21,30], the assumption of a covariance structure without correlation between the states is not always valid or desirable. Using a general target matrix enables the incorporation of prior information about the system into the error covariance matrix. This prior information can be information about the system physics, for instance parameters, topography, transport phenomena, and environmental information, or knowledge about the covariance structure coming from experts or previous experiments. A closed formulation to calculate the weight value α using a general target matrix T_KA is proposed in [39,45]:

α_KA = min( [ (1/N²) · Σ_{e=1}^{N} ‖x[e] · (x[e])^T − P^b‖²_F ] / ‖P^b − T_KA‖²_F , 1 ),  (18a)

and the KA (Knowledge-Aided) estimator is obtained using (18a) in

B̂_KA = α_KA · T_KA + (1 − α_KA) · P^b.  (18b)

It is important to note that no assumptions about the structure of T_KA are made to calculate α_KA. This approach can be seen as an extension of that in [6,21] to a general target matrix and is also usable for the complex-valued data case. Similar to the EnKF-RBLW, an implementation of the EnKF can be obtained using the KA shrinkage-based estimator presented in Eq. 18:

X^a = X^b + B̂_KA · H^T · [H · B̂_KA · H^T + R]^{−1} · D.  (19)

Since the target matrix T_KA in the EnKF-KA does not necessarily have identity structure, information about the dynamical system can be integrated into the data assimilation process. The prior information is directly related to the error covariance of the model states; this means that it is possible to integrate information about the system and to guide the dynamical relationship between the states and the relation between states and observations.
Although there are no restrictions on the structure of T_KA, it is important to remark that T_KA is still a covariance matrix, so all the related conditions have to be satisfied. Section 4 shows examples of how to select T_KA properly.
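A hedged sketch of the KA estimator with a general target matrix follows. The plug-in weight below uses the Ledoit-Wolf-style construction with a general target and is our own reading of Eq. 18a; the exact closed form in the cited references may differ in detail:

```python
import numpy as np

def ka_shrinkage(dX, T):
    """Knowledge-aided shrinkage with a general target matrix T.

    dX: (n, N) ensemble deviations; T: (n, n) target covariance encoding
    prior knowledge about the system (any valid covariance structure).
    Returns (alpha_KA, B_hat) with B_hat = alpha*T + (1-alpha)*P^b.
    """
    n, N = dX.shape
    P = dX @ dX.T / (N - 1)
    # Numerator: plug-in estimate of the sampling error E||P^b - B||_F^2
    num = sum(np.linalg.norm(np.outer(dX[:, e], dX[:, e]) - P, 'fro')**2
              for e in range(N)) / N**2
    # Denominator: squared distance between the sample covariance and target
    den = np.linalg.norm(P - T, 'fro')**2
    alpha = min(1.0, num / den)
    return alpha, alpha * T + (1 - alpha) * P
```

As in the text, no structural assumption on T is required; the weight only involves Frobenius norms and is cheap to evaluate on local domains.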

Domain localization
Both of the most popular localization concepts can be applied in the EnKF-KA approach: covariance localization [13,14] and local domain analysis [32]. We explore the implementation of local domain analysis due to its advantages not only in spurious-correlation mitigation but also in implementation. Since the main idea of the EnKF-KA is to incorporate prior information of the system into the DA framework, this information inherently has to be stored and available throughout the DA process. In high-dimensional applications, it is not convenient, and in some cases prohibitive, to store a matrix of the dimension of T_KA ∈ R^{n×n} and to calculate P^b ∈ R^{n×n} directly. This is where the concept of local domains is crucial for implementing the EnKF-KA for high-dimensional systems. In local domains, a box of radius r components around the state of interest is created, and just the states and observations within this box (local domain) are used in the analysis step [16,32,38]. This process is repeated for all state components, performing multiple local analyses (in a smaller dimension) instead of a unique global analysis (in a higher dimension). Another advantage of this implementation is that it facilitates parallelization of the analysis, since each local analysis can be performed on an independent core [12,31]. The implementation of the EnKF-KA using local domain analysis is summarized in the following steps: 1. A local domain of radius r is created for every model component. The k-th local domain is formed by n_r state components (n_r ≪ n) and m_r observations. Domain decomposition is applied so that boundary information is shared across neighboring domains; in this manner, we preserve the continuous dynamics of physical variables such as temperature, wind components, and pressure. Figure 2 illustrates this strategy. The background ensemble and the analysis ensemble within the box are denoted by X^b_k ∈ R^{n_r×N} and X^a_k ∈ R^{n_r×N}, respectively.
The model error covariance within the box is denoted by B_k ∈ R^{n_r×n_r}, the local observation by y_k ∈ R^{m_r×1} with observation covariance R_k ∈ R^{m_r×m_r}, and the local innovation matrix by D_k ∈ R^{m_r×N}. 2. Compute the local sample covariance matrix P^b_k ∈ R^{n_r×n_r}. 3. Define the local target matrix T_k ∈ R^{n_r×n_r}. This step requires previous knowledge of the model dynamics. Knowledge is understood as the human experience in front of a large-scale model used to represent reality. Large-scale models for atmospheric dynamics, weather, water and ocean, or reservoir modeling are normally used by experts in their fields. Even if the data to be assimilated are measured, some details and specifications are not captured by the model or included in it; another possible cause is that the spatial-temporal resolution chosen for the numerical solution of the equations does not allow capturing intrinsic relationships between the states. We suggest a matrix T_k built on the basis of that specific knowledge. Although T_k must meet all requirements of a covariance matrix, the main contribution is that T_k need not satisfy any requirement about its structure and can also change dynamically. 4. Compute the local shrinkage weight α_KA via (18a) and form the local estimator B̂_k = α_KA · T_k + (1 − α_KA) · P^b_k.
5. Perform the local analysis step. 6. Once all the local analyses are performed, map them to the global domain to obtain the global analysis state; this does not mean performing a new global analysis. In [32], two mapping approaches are proposed. The first uses only the analysis result at the center point of each local region to form the global analysis vector. The second uses the average of all the local analyses in which a grid cell is involved to obtain the global analysis.
Note that with a correct selection of r, the matrix computations in each local domain are inexpensive, so Eq. 20 can be computed efficiently for high-dimensional systems.
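The local-domain procedure above can be sketched as the following skeleton, using one-dimensional coordinates, the center-point mapping strategy of [32], and an injected `analyze` callback standing in for steps 2-5 (all names are illustrative, not a fixed API):

```python
import numpy as np

def local_domain_analysis(Xb, coords, obs_coords, y, H, R_diag, r, analyze):
    """Local-analysis skeleton: one independent analysis per state component.

    For each state component k, states and observations within radius r are
    selected and `analyze` (any local EnKF update, e.g. the EnKF-KA step) is
    called on that sub-problem; only the center-point row of each local
    analysis is mapped back to the global analysis.
    """
    n, N = Xb.shape
    Xa = Xb.copy()
    for k in range(n):                               # embarrassingly parallel loop
        s = np.abs(coords - coords[k]) <= r          # local state indices
        o = np.abs(obs_coords - coords[k]) <= r      # local observation indices
        if not o.any():
            continue                                 # nothing to assimilate here
        Xa_loc = analyze(Xb[s], H[np.ix_(o, s)],
                         np.diag(R_diag[o]), y[o])
        Xa[k] = Xa_loc[int(s[:k].sum())]             # center-point row only
    return Xa
```

Because each iteration touches only n_r states and m_r observations, the loop body is cheap and each local domain can indeed run on an independent core.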

Inflation aspects
In the context of the EnKF-KA, covariance inflation can be efficiently performed by increasing the dispersion of the matrix (8) by an inflation factor β_inf:

ΔX ← β_inf · ΔX, so that P^b ← β²_inf · P^b,

and by noting that tr(β²_inf · P^b) = β²_inf · tr(P^b), where tr(·) denotes the trace of a matrix. Consequently, covariance inflation enters the optimal factor (18a) simply by replacing P^b with β²_inf · P^b.
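Multiplicative inflation as described above amounts to scaling the member deviations about the ensemble mean (a minimal sketch; the function name is illustrative):

```python
import numpy as np

def inflate(Xb, beta):
    """Multiplicative covariance inflation.

    Scales the deviations ΔX by beta, leaving the ensemble mean unchanged,
    so that the implied ensemble covariance becomes beta**2 * P^b
    (and its trace scales by beta**2, as noted in the text).
    """
    xm = Xb.mean(axis=1, keepdims=True)
    return xm + beta * (Xb - xm)
```

With the inflation factor β_inf = 1.3 used later in the SPEEDY experiments, the ensemble covariance is boosted by a factor of 1.69.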

Results with an advection diffusion model
This section illustrates the proposed EnKF-KA over a simple advection-diffusion process. The advection-diffusion equation governs the changes of a conservative property, such as a concentration, in a fluid environment [36,41]. It has been used as a simple model to study the behavior and transport of pollutants in the atmosphere. In two dimensions, the horizontal changes in the concentration of a given pollutant C in the atmosphere can be approximated as:

∂C/∂t = −v_x · ∂C/∂x − u_y · ∂C/∂y + D_x · ∂²C/∂x² + D_y · ∂²C/∂y²,

where v_x and u_y are the north-south and west-east wind velocities, respectively, and D_x and D_y are the diffusion coefficients. The experimental settings are as follows: -The background error statistics are Gaussian, with ρ_b = 0.05. -We propose three ensemble sizes for the experiments, N ∈ {10, 50, 100}. -The assimilation window consists of M = 1000 time steps. Two observation periods are proposed for the tests: every time step and every ten time steps. We denote by δt ∈ {1, 10} the elapsed time between two observations. -The observational error statistics are Gaussian, with ρ_o = 0.001.
-The performance at each step is measured by the L2 error norm ‖x^a − x^*‖_2, where x^* and x^a are the reference and the analysis solution, respectively. -The Root-Mean-Square-Error (RMSE) is used as a measure of performance, on average, over a given assimilation window:

RMSE = (1/M) · Σ_{k=1}^{M} sqrt( (1/n) · ‖x^a_k − x^*_k‖²_2 ),

where x^a_k denotes the analysis at time step k. -The percentage of non-converged experiments (PNCE) is calculated for all scenarios.
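For illustration, one explicit finite-difference step of the two-dimensional advection-diffusion equation can be sketched as follows. Central differences and periodic boundaries are our own choices for the sketch, not necessarily the discretization used in the experiments:

```python
import numpy as np

def advect_diffuse_step(C, vx, uy, Dx, Dy, dx, dy, dt):
    """One explicit step of dC/dt = -vx dC/dx - uy dC/dy + Dx d2C/dx2 + Dy d2C/dy2.

    C: 2-D concentration field (x along axis 0, y along axis 1).
    Central differences in space, forward Euler in time, periodic boundaries.
    """
    Cx = (np.roll(C, -1, 0) - np.roll(C, 1, 0)) / (2 * dx)     # dC/dx
    Cy = (np.roll(C, -1, 1) - np.roll(C, 1, 1)) / (2 * dy)     # dC/dy
    Cxx = (np.roll(C, -1, 0) - 2 * C + np.roll(C, 1, 0)) / dx**2
    Cyy = (np.roll(C, -1, 1) - 2 * C + np.roll(C, 1, 1)) / dy**2
    return C + dt * (-vx * Cx - uy * Cy + Dx * Cxx + Dy * Cyy)
```

With periodic boundaries this scheme conserves the total mass of the tracer, which is consistent with the conservative property the equation models.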
The idea is to incorporate the physical restrictions that the model does not capture (in this case, the artificial valley) via the EnKF-KA. If we use a standard distance-based localization for a state inside the valley to cut off information coming from distant observations, the process will include observations both inside and outside the valley. With the EnKF-KA, we try to cut off observations that are outside the valley, even if they are at the same distance, as represented in Fig. 4. This is achieved by incorporating the physical restrictions (the topography of the domain of interest) into the covariance estimation through the target matrix T_KA. The target matrix is built starting from a Gaspari-Cohn function [11] and reducing to zero the covariance between states inside and outside the valley. After this process, it is very important to test whether the final T_KA is still a positive semidefinite matrix. Note that the final covariance between states inside and outside the valley will not necessarily be zero, because the final covariance matrix is a convex combination of T_KA and P^b. Figure 5 shows an example of a T_KA matrix obtained using the proposed process for an influence radius r = 4.
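The construction just described, starting from a compactly supported base correlation, zeroing cross-valley covariances, and then checking positive semidefiniteness, can be sketched as follows (one-dimensional coordinates; the `gc` kernel argument and all names are illustrative):

```python
import numpy as np

def build_target_matrix(coords, inside_valley, radius, gc):
    """Target matrix from a base correlation, masked by the valley topography.

    coords: (n,) 1-D component coordinates; inside_valley: (n,) boolean mask;
    gc: a correlation kernel G(z) (e.g., Gaspari-Cohn). Covariances between
    states inside and outside the valley are set to zero, and the resulting
    matrix is checked for (numerical) positive semidefiniteness.
    """
    d = np.abs(coords[:, None] - coords[None, :])
    T = gc(d / radius)
    cross = inside_valley[:, None] != inside_valley[None, :]
    T[cross] = 0.0                      # cut covariance across the valley wall
    # The masked matrix may lose positive semidefiniteness: verify it.
    eigmin = np.linalg.eigvalsh(T).min()
    return T, eigmin >= -1e-10
```

For this particular inside/outside partition, the masking makes T block diagonal (after permutation), and principal submatrices of a positive semidefinite base kernel remain positive semidefinite, so the check passes; for more irregular masks the check can genuinely fail, which is why the text insists on testing it.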
The performance of the EnKF-KA is compared with the shrinkage-based EnKF-RBLW and the standard EnKF using covariance localization (EnKF-CL) with r = 1 (other influence radii were tested, but r = 1 presented the best performance), under the experimental setup presented above. A total of 20 experiments are performed for each scenario. The target matrix T_KA is built from a Gaspari-Cohn function with r = 1, following the described process for including the physical restrictions of the valley. The magnitude of T_KA is computed according to the average of the trace of P^b. Figure 6 shows the dynamical evolution of the L2 norm for different scenarios. Figure 7 presents the values of the average RMSE for all the experiment scenarios and the PNCE of the EnKF-CL for the different ensemble sizes. For the EnKF-RBLW and the EnKF-KA, PNCE = 0% in all cases.
As shown in Figs. 6 and 7, the EnKF-KA presents lower error rates than the EnKF-RBLW and the EnKF-CL in almost all scenarios. This shows how the integration of physical restrictions can help the data assimilation process. It is interesting to evaluate the scenarios with a smaller number of ensemble members, where the differences among the three algorithms are more considerable. The RMSE of the EnKF-KA in these scenarios is much lower than that of the EnKF-CL, showing that shrinkage-based estimators are more robust than the sample covariance matrix when n ≫ N. Since the ensemble statistics approximate the mean and the covariance of the state, the ensemble spread should describe the system uncertainty [15,40]. If the filter estimates the state uncertainty correctly, the ensemble spread should match the RMSE when there are no model errors [27]. (Fig. 4: Comparison of the distance-based localization approach vs. the EnKF-KA. In the EnKF-KA, the influence region is based on the distance and on the information about the system. The blue square represents the analyzed state, the blue shadow the influence region, and the yellow circles the observations. a Distance-based localization. b EnKF-KA.) Figure 8 shows the ensemble spread of each algorithm across assimilation steps for a specific experiment. It can be seen how all the algorithms reduce the ensemble spread after a few assimilation steps, reducing the system uncertainty levels. The Free-Run keeps similar uncertainty values over time because no new information is incorporated. Finally, the EnKF-KA presents the lowest spread values, matching the lowest RMSE values, which means that the EnKF-KA can correctly reproduce the system uncertainty and improve estimation accuracy.
Figure 9 presents the time evolution of the states at four different spatial locations for one experiment scenario. It is evident that the EnKF-KA reproduces locations at the border of the artificial valley more accurately than the other methods, showing the effect of the information incorporated through T_KA.
An aspect that is important to remark on is the value of α_KA for different ensemble sizes. The mean α_KA values for ensemble sizes N = 10, N = 50, and N = 100 are ᾱ_10 = 0.698, ᾱ_50 = 0.591, and ᾱ_100 = 0.508, respectively. With a small number of ensemble members, the poor estimation of the covariance through the sample covariance matrix produces a higher value of α_KA, giving more weight to the target matrix than when the ensemble size, and with it the quality of the sample covariance estimate, is larger.

Results with the SPEEDY model

SPEEDY (Simplified Parameterizations, primitivE-Equation DYnamics) is an Atmospheric General Circulation Model [5,25], which helps us to study the performance of the EnKF-KA method in a highly non-linear model scenario. The model consists of seven numerical layers, and at each one a T-30 model resolution is employed (96 × 48 grid components) [19,26]. There are five physical variables at each numerical grid point: the temperature T (K), the zonal u and meridional v wind components (m/s), the specific humidity Q (g/kg), and the pressure ρ (hPa). We employ all physical variables in our data assimilation process. Note that the model dimension in our settings reads n = 133,632. During our experiments, we consider ensemble sizes of N = 10 and N = 20; this applies to all numerical scenarios. Note that the model dimension is then roughly 13,363 and 6,682 times larger than the ensemble sizes (n ≫ N), which corresponds to current DA operational settings. We follow the experimental settings presented in [18,28]: -Long-term numerical integrations are applied to build the reference solution as well as the initial background ensemble (two years of numerical simulation). We start with a system in equilibrium and, after adding a small perturbation, perform the numerical integration.
-The experiments do not account for model errors.
-Standard deviations of observational errors are detailed in Table 1. -We employ a highly sparse observational network. The observation coverage is 9% of the spatial resolution. This linear observation operator is shown in Fig. 10. Note that this is an irregularly distributed, realistic observational network. -The inflation factor is β inf = 1.3 for all experiments.
-We set up a total simulation time of two months with observations frequencies about 6 and 12 hours. We expect the non-linear dynamics of the SPEEDY model to impact the quality of analysis states as the observation frequency decreases.

Evolution of analysis errors among assimilation steps
As we can see in Figs. 13 and 14, the initial errors decrease as observations are assimilated at each analysis step using the proposed method, for observation frequencies of 6 and 12 hours. It should be noted that the observation frequency affects the estimation quality but not the convergence of the EnKF-KA with the configuration of this experiment. Tables 2 and 3 show the analysis RMSE of the EnKF-KA and the EnKF-RBLW using 6- and 12-hour observation frequencies and ensemble sizes of N = 10 and N = 20. The RMSE values are computed over 60 days with an initial spin-up period of ten days. As can be seen, the analysis states of the EnKF-KA improve on the results of the EnKF-RBLW.

Uncertainty analysis
For sequential data assimilation based on the Ensemble Kalman Filter, it is known that if the ensemble spread becomes very small or very large, the filter falls into divergence; the ensemble spread can also be used to explore the uncertainty associated with the initial condition. Figure 15 shows the mean ensemble variance across assimilation steps for the u, v, T, and Q variables at the 500 hPa pressure level. As expected, the ensemble variance decreases as the EnKF-KA is used in the analysis step. This means that the uncertainty decreases as observations are assimilated. It should be noted that a covariance inflation factor of 1.3 was used in the experiment. In the same way, Fig. 16 shows samples of components taken from each of the model's physical variables. It is possible to see how the differences between ensemble members decrease through the assimilation steps.

CPU-time of analysis steps
Statistics of CPU times are computed across all analysis steps for both filters. The reported times are shown in Table 4, where the average and the variance of the elapsed time for the analysis-step computations are given in seconds. (Tables 2 and 3 captions: as the frequency of observations is decreased, the EnKF-KA formulation improves on the results of the EnKF-RBLW method, for ensemble sizes of N = 10 and N = 20, respectively.) The forecast step was run in parallel on a CPU with four cores; this means that up to four ensemble members were forecast simultaneously.

Conclusions
An efficient and practical implementation of the EnKF based on shrinkage covariance matrix estimation (EnKF-KA) was proposed in this paper. The proposed filter implementation exploits the information brought by an ensemble of model realizations (numerical model dynamics) and our prior knowledge about the actual dynamical system (i.e., the prior structure of background error correlations). The EnKF-KA uses a target matrix with a general structure, representing a novel approach compared with current shrinkage-based estimators that use an identity matrix as the target. An efficient implementation for large systems is presented, taking advantage of local domain decomposition. Experimental tests are performed using an advection-diffusion model and an Atmospheric General Circulation Model. In both cases, the proposed method outperforms the EnKF based on shrinkage covariance estimation without prior information about error correlations, as well as the standard EnKF using covariance localization. The results support the idea that it is possible to use prior information and knowledge of the system to improve current ensemble-based DA methods.

Conflict of Interests
The authors declare that they have no conflict of interest.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.