SCR-Filter Model Order Reduction (2): Proper Orthogonal Decomposition and Artificial Neural Network

Catalysed diesel particulate filters (DPF) have been described as multifunctional reactor systems. Integration of selective catalytic reduction (SCR) functionality in the DPF creates an SCR-in-DPF system that achieves nitrogen oxides (NOx) treatment along with particulate matter (PM) collection. The physical and chemical aspects of the integrated SCR-filter complicate system modelling. The goal of this work is to develop a low-complexity model of the SCR-filter system which retains high fidelity. A high-fidelity model of the SCR-coated filter has been developed and validated. The performance of the model was described in a previous paper. Model complexity reduction is attempted in this paper. The objective is to achieve simulation times that can support the deployment of the model for online system control in an engine control unit. Two approaches were taken for the SCR-coated filter model order reduction (MOR): a “grey-box” approach via proper orthogonal decomposition (POD) and a “black-box” approach via artificial neural network (ANN) function approximation. The POD method is shown to deliver a significant MOR while maintaining a high degree of fidelity but with less than 5% improvement in simulation time. The ANN method delivers a substantial MOR with a reduction of three orders of magnitude in simulation time. The accuracy of the ANN model is satisfactory, with good generalisation to new test data, but noticeably inferior to that of the POD method.


Introduction
Diesel engines offer superior performance in fuel economy compared with gasoline engines [1], but the simultaneous control of soot/particulates (PM) and nitrogen oxides (NOx) is challenging. Integration of the selective catalytic reduction (SCR) catalyst within a diesel particulate filter (DPF) monolith is an emerging technology for simultaneous control of soot and NOx emissions.
Modelling of the integrated SCR-filter unit is complicated. The interaction of physical and chemical phenomena within different phases of the monolith material, over different timescales, makes it difficult to develop an adequate representation of the SCR-filter system. The application of the system model in real-time online model-based controls further motivates the need to develop simple but adequately representative models.
A model of the SCR-filter system has been developed and validated in [2]. The model was primarily validated against the published data in Schrade et al. [3]. The model describes the data well; details are provided in [2]. A new implementation of the model delivered faster than real-time solution times. However, the full-order model is not easily implemented for online control in an engine control unit (ECU). Therefore, model order reduction is attempted in this work.
Two model order reduction (MOR) techniques are implemented in this paper: (a) proper orthogonal decomposition (POD) and (b) artificial neural networks (ANN).
The POD method is a multi-variate statistical method that aims to obtain a compact representation of data. In the POD method, large-scale system data is decomposed into its characteristic modes (or eigenvectors). A Galerkin projection of the large-scale data on the subspace spanned by the largest (or dominant) eigenvectors can be used to derive a lower dimensional surrogate of the original large-scale system.
Note: the SCR-coated filter is also referred to as SCRF, SDPF or SCR-in-DPF. This is the second of two papers on the model order reduction of integrated SCR filter systems.
The motivation for using the POD approach is that the low-order model directly derives from the high-fidelity model (or experimental data), thereby retaining the physics of the system.
POD has been applied to engine research to study turbulence and cyclic variation of flow and combustion properties in internal combustion engines [4,5]. The method is used extensively in model reduction in CFD applications [6]. The Galerkin technique has been applied in exhaust after-treatment system (EATS) modelling in the context of mean weighted residuals method of solution of differential equations [7,8].
The POD approach was applied to the SCR-filter model order reduction in [9]. In that work, the SCR-filter model was not complete and had not been validated. This work improves on the presentation in [9] with the full-order model completed and validated. A change in the representation of diffusion transport in the model led to a different implementation of the SCR-coated filter model and the POD method resulting in different observations on the performance of the POD method in this work compared to [9].
The alternative approach to model order reduction applied in this work is ANN. In this approach, data was obtained from the full-order model developed in [2] and a black-box function approximation is derived to represent the model performance.
The motivation for this approach is twofold: (a) the ANN method is well suited for nonlinear model function approximation [10][11][12]; and (b) ANN can deliver the model simplicity and adequate representation required for deploying the SCR-coated filter model in an online application.
Neural networks have been applied in EATS modelling [13][14][15][16][17][18][19]. In [19], an ANN was used to approximate the SCR catalytic converter to accelerate ammonia dosing optimisation. This is similar to our objective here, namely, to use a neural network to approximate the full-order SCR-filter model for deployment in a model-based control application.
The POD and ANN techniques were employed for MOR in this work. We found that the POD delivered a reduced-order model with a higher degree of fidelity to the HFM but with limited improvement in simulation times. The ANN delivered a reduced-order model which achieved a more significant reduction in simulation time, on the scale of three orders of magnitude, but with a poorer level of fidelity to the reference HFM.
A note on terminology: "MOR" in this work refers to the process of reducing model complexity for control-oriented application. The success of our MOR technique is assessed on two factors: fidelity and simulation efficiency. Model fidelity measures how close the reduced model is to the original full-order model. The root mean square error (RMSE) metric is used as an indicator of model fidelity. Simulation efficiency measures the change (mostly improvement) in model simulation times.
The rest of the paper is organised as follows. The POD method is introduced in the next section with some background on the application of the POD method to SCR-filter modelling. This is followed by the results of the POD applied to a sample problem. The ANN method is then introduced. Some of the considerations taken in developing ANN for the SCR-filter model application are presented. The results of the ANN method applied to a sample problem are then presented, followed by some concluding remarks.

Proper Orthogonal Decomposition
POD, also known as principal component analysis (PCA) or Karhunen-Loève decomposition (KLD), is a statistical method which achieves model order reduction (MOR) by extracting the dominant modes of the system and using those modes to devise a lower dimensional approximation. The main idea of the POD method is to decompose system data obtained from experiment or numerical simulation into a linear combination of basis functions (POD modes) and associated temporal coefficients. Model order reduction is then achieved by Galerkin projection of the full-order model on the subspace spanned by the lower-dimensional POD modes.
The motivation for using the POD approach is that the low-order model directly derives from the high-fidelity model (or experimental data), thereby retaining the physics of the system. The derivation of the POD modes from experimental or numerical data makes this model order reduction approach "grey-box".
The mathematical background of the POD method is available in several excellent works [20][21][22][23]; a focused summary of the mathematical background as it applies to this system is presented in [9]. This section extends the approach from [9] for the 1D + 1D SCR-in-DPF model with more than two physical state variables.

The Intuitive Basis of the POD Method
Consider the solution to a simple equation to describe the transport of heat in a slab that has one end exposed to a heat source. The distribution of temperature within the slab is a series of exponential temperature profiles which progressively spreads throughout the entire slab provided the heat source is retained. The system dynamics is composed of a repeating pattern of temperature transport profile which spreads over time within the slab. The repeating patterns can be called "basis functions" [23].
The temporal variation of the physical phenomena (either due to its initial value or the influence of source terms) can be captured in temporal coefficients which, along with the fixed repeating patterns, can describe the underlying phenomena. Thus, it is possible to expand a composite variable g(x, t) into a set of spatial basis functions Φ_k(x) and temporal coefficients a_k(t) (Eq. 1):
g(x, t) ≈ Σ_k a_k(t) Φ_k(x)    (1)
The idea of decomposing physical quantities in expansions of the form of Eq. 1 was first proposed by Joseph Fourier in his memoir On the Propagation of Heat in Solid Bodies, written in 1807 [24]. There he proposed to expand an arbitrary function in a series of trigonometric basis functions. The well-known Fourier series are examples of Eq. 1, where the repeating patterns consist of trigonometric functions. The main difference, however, is that the Fourier patterns or Fourier "basis functions" are not derived from the data. The trigonometric functions are intended to approximate any arbitrary function. In POD, the basis functions are derived from the system data [23].
The intuition of the POD method is that for even the most complex physical phenomena, fundamental patterns (characteristic modes) can be identified from the system data. The singular value decomposition (SVD) routine performs an operation similar to extracting the fundamental recurring patterns in a dataset.
The evolution of the species concentration in an SCR-filter system has repeating spatial components which are identified as the basis functions by the SVD routine; the associated temporal coefficients track any transient component of the system. The combination of the spatial basis functions and the temporal coefficients estimated from the POD-Galerkin method (Eq. 9) approximates the system data without much loss of fidelity. The aim of the POD method is to use only a few of the spatial basis functions from the SVD output to reconstruct the original dataset without much loss of fidelity.
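The extraction of spatial basis functions and temporal coefficients described above can be sketched with a thin SVD. This is a minimal illustration, not the implementation used in this work: the function name `pod_basis` and the synthetic snapshot data (a filling profile plus a decaying inlet-skewed slope) are assumptions made for the example.

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """Return the first n_modes POD modes, their temporal coefficients
    and the singular values.

    snapshots: array of shape (N, T), one column per time sample.
    """
    # Thin SVD: columns of U are spatial modes ordered by singular value.
    U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    modes = U[:, :n_modes]           # spatial basis functions, shape (N, p)
    coeffs = modes.T @ snapshots     # temporal coefficients, shape (p, T)
    return modes, coeffs, s

# Hypothetical data: a flat profile that fills up over time plus a
# decaying slope that loads the inlet end first (rank 2 by construction).
x = np.linspace(0.0, 1.0, 50)
t = np.linspace(0.0, 10.0, 200)
fill = 0.7 * (1.0 - np.exp(-t))
skew = 0.2 * np.exp(-2.0 * t)
snapshots = np.outer(np.ones_like(x), fill) + np.outer(0.5 - x, skew)

modes, coeffs, s = pod_basis(snapshots, n_modes=2)
reconstruction = modes @ coeffs      # expansion in modes and coefficients
max_error = np.max(np.abs(snapshots - reconstruction))
```

Because the synthetic data is built from two spatial patterns, two modes reproduce it to round-off; real system data would need as many modes as the chosen variation energy level requires.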

POD Application to SCR-Filter Model
The first step in the application of the POD method is to obtain an ensemble of data describing the system. The data can be obtained from experiments or from numerical simulation of the full-order model. The POD method in this work was applied using numerical data obtained from a validated high-fidelity model of the SCR-coated filter system [2]. For each of the cases considered, the numerical model is solved at the specified inlet conditions to generate the system ensemble data for the POD basis. In this presentation, the numerical model solution is referred to as the "high-fidelity model" or HFM, and the low-order POD-derived model is referred to as the "POD-model".
The wall temperature, the ammonia surface coverage (ASC) and the mass of soot retained in the system are selected as the state variables on which to apply the POD method because they are the explicit time-dependent variables in our system. (For ease of description, only the POD basis associated with the ammonia surface coverage on catalyst site S1 is described in the published results.) The remaining (quasi-steady state) equations are solved at each time step with the values of the state variables obtained from the POD model evolution.
Let θ, T_s, m_p ∈ ℝ^(N×T) be the solutions obtained from the HFM for ASC, wall temperature and soot mass retained, respectively, at N discrete units of the wall layer along the length of the catalyst, over the time period [0, T]. The soot mass variable m_p combines the deep-bed soot in the wall and the soot in the cake layer. The POD modes associated with the ASC, wall temperature and soot mass data ensembles are extracted by SVD of the data. Let Φ_θ, Φ_Ts, Φ_mp ∈ ℝ^(N×p) be the orthonormal matrices of the first p POD modes, where p is the number of modes needed to capture a given level of variation energy from the data ensemble. The Galerkin projection of the HFM on the subspace spanned by the POD modes then proceeds as follows. The time-explicit component of the SCR-coated filter model equations [2] can be written compactly as a first-order initial value problem in the state X (standing for θ, T_s or m_p), whose right-hand side depends on X and on a parametric vector containing the other system variables.
Using the POD modes Φ_X as basis functions and the approximation X̂ = Φ_X a_X of X (Eq. 7), the POD-Galerkin projection (Eq. 9) is the reduced initial value problem obtained by projecting the compact model equation onto the POD modes, with the initial condition a_X(0) = Φ_X^T X(0). a_X(t) is solved over time by Eq. 9, and X̂(t) is estimated as X̂ = Φ_X a_X from Eq. 7. The other system variables are then obtained based on the approximated X̂ over time.
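In generic notation, the projection step can be sketched as follows. This is the standard POD-Galerkin form under stated assumptions, not a transcription of the model equations in [2]: the symbols F (right-hand side of the time-explicit model equations) and u (parametric vector of the other system variables) are introduced here for illustration only, and the tags follow the equation numbers cited in the text.

```latex
\begin{align}
\frac{dX}{dt} &= F(X, u), \qquad X \in \{\theta,\ T_s,\ m_p\} \notag \\
X(t) \approx \hat{X}(t) &= \Phi_X\, a_X(t) \tag{7} \\
\frac{da_X}{dt} &= \Phi_X^{T}\, F\!\left(\Phi_X a_X,\ u\right),
\qquad a_X(0) = \Phi_X^{T} X(0) \tag{9}
\end{align}
```

The third line follows from substituting the approximation into the first line and multiplying from the left by the transpose of the mode matrix, using the orthonormality of the POD modes. The reduced state a_X has p entries instead of N, but F is still evaluated on the reconstructed N-dimensional state.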

Steady State Application
The steady-state NOx conversion case reported in Schrade et al. [3] is used to evaluate the POD methodology. Our high-fidelity model presented in [2] was validated with data from [3]; therefore it is considered a suitable reference against which to benchmark the performance of our POD model. Interested readers are referred to [3] for the details of the experimental set-up and overall context for the case. Table 1 presents the specification of the SCR-coated filter sample and the parameters of the steady-state NOx conversion experiment [3]. The feed gas NH3 and NOx concentration trend is presented in Fig. 1.
The POD method was applied to capture two variation energy levels: 85% and 99%. This was done to indicate how the POD results approach the HFM as the variation energy level approaches 100%.
The POD results are presented in the following sections.

POD Modes
The spectrum of the singular values obtained from the decomposition of a state variable (ammonia surface coverage for sites S1 and S2 [2]) is presented in Fig. 2. The results show that the spectrum is dominated by a few singular values, so most of the variation energy can be captured by the corresponding modes. For example, in Fig. 2b, one POD mode captures 85% of the variation energy and two POD modes capture 99% of the variation energy.
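Choosing the number of modes for a target variation energy level can be sketched as below. The relative energy of a mode is taken here as its squared singular value divided by the sum of squares, which is a common convention but an assumption with respect to this work; the function name and the singular values are hypothetical.

```python
import numpy as np

def modes_for_energy(singular_values, target):
    """Smallest number of modes whose cumulative relative energy
    reaches the target fraction (for example 0.85 or 0.99)."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    # First index at which the cumulative energy is >= target.
    index = int(np.searchsorted(cumulative, target))
    # Guard against round-off leaving the last entry just below 1.0.
    return min(index + 1, len(cumulative))

# Hypothetical spectrum dominated by the first two singular values.
s = np.array([10.0, 4.0, 1.0, 0.2])
p85 = modes_for_energy(s, 0.85)
p99 = modes_for_energy(s, 0.99)
```

With these illustrative values, one mode reaches the 85% level and two modes reach the 99% level, the same pattern as reported for Fig. 2b.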
The POD modes selected to capture the given variation energy level are then deployed to reduce the model order. Figure 3 provides some insight into the nature of the first two POD basis functions used to describe this steady-state application. The plots are presented for the ammonia surface coverage (ASC) variable in site S1.

POD Basis Functions
Notice first in Fig. 3a that the plots show the filling of the wall with ammonia over time, with ASC settling at around 0.7 at 5000 s. This is the HFM output data that is used to generate the POD basis functions. Figure 3b shows this variable averaged over the wall domain over time. Notice that the wall starts empty of ammonia and gradually fills up to circa 0.7 at time 5000 s. (Note that ammonia is supplied in excess to this system with ANR at circa 1.25; therefore, there is no apparent reduction in surface coverage over time as the NO2/NOx ratio changes.) The first two POD basis functions are shown in Fig. 3c. The first POD basis function Φ 1θS1 appears like an average of the profiles X θS1 over time. It is a vector with a value of circa 0.1, flat across the wall domain. The ASC is relatively flat for most of the time in this example, lending further support to the interpretation of the first POD mode as a type of average of the system data.
The second POD basis function Φ 2θS1 slopes downward from the inlet end of the filter wall towards the outlet end. Note that some of the values of the second POD basis function are negative, which underscores the need to apply caution in any direct physical interpretation of the POD basis functions [22].
Although it can be risky to provide a physical interpretation of POD basis functions, these two functions align with the function of the SCR system. The conversion depends on the total or average coverage, which is represented by the constant value in the first basis function Φ 1θS1. However, the ammonia slip depends on the spatial distribution of the coverage, which is typically skewed to the front. The second POD basis function Φ 2θS1 is a bias-free linear function, which represents this skewing of the spatial distribution. It is worth noting that these two basis functions are reminiscent of the Legendre polynomials, which are known to be good generic basis functions for systems with strong spatial correlation like this one.
According to Eq. 1, the system data is decomposed into spatial basis functions and temporal coefficients. The associated temporal coefficients estimated from Eq. 9 are shown in Fig. 3d for the first two POD functions. It is seen that the first temporal coefficient a 1X θS1 retains the form of the averaged X θS1. The second temporal coefficient a 2X θS1 quickly approaches zero, reducing the contribution of the negative values of the second POD basis function to the resulting system approximation. It is this property of the POD modes, whereby the combination of the first few POD basis functions and associated temporal coefficients produces results that appear similar to the source system data, that makes the POD method attractive as a MOR technique. Additional improvement in data approximation is achieved with more POD basis functions, by definition.

Output Concentration
The model output NO x and NH 3 concentration data is shown in Fig. 4 along with the results from a POD-reduced model at 85% and 99% variation energy level. Results from POD modes at 85% variation energy level are labelled "POD 1 ", while the results from POD modes at 99% variation energy level are labelled "POD 2 ".
The outlet concentration plots show that the POD models agree closely with the HFM. The POD results at 99% energy are closer to the HFM results than those of the 85% POD model for the NOx and NH3 concentrations. This is expected from the definition of the POD approach.
The performance of the POD model is evaluated by the root mean square error (RMSE) defined as the square root of the MSE in Eq. 12. The RMSE summary data for the POD model in steady-state application is presented in Table 2.
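As a minimal sketch of this metric, assuming the usual definition of MSE as the mean of the squared differences between the HFM and reduced-model outputs, the RMSE can be computed as follows. The function name and the sample values are illustrative only and are not results from this work.

```python
import numpy as np

def rmse(reference, approximation):
    """Root mean square error between a reference signal (for example
    the HFM output) and an approximation (for example the POD output)."""
    reference = np.asarray(reference, dtype=float)
    approximation = np.asarray(approximation, dtype=float)
    return float(np.sqrt(np.mean((reference - approximation) ** 2)))

# Hypothetical outlet concentrations in ppm.
hfm_out = np.array([100.0, 102.0, 98.0, 101.0])
pod_out = np.array([101.0, 100.0, 99.0, 101.0])
error_ppm = rmse(hfm_out, pod_out)
```

The result has the same units as the signals, which is why the RMSE values can be compared directly with inlet concentrations.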
The performance of the POD model is good in the steady-state application. The RMSE of outlet NOx is at most 2 ppm, which is circa 1% of the inlet NOx concentration. The RMSE of outlet NH3 is at most 14 ppm, which is circa 5% of the inlet NH3 concentration. This demonstrates the ability of the POD model to approximate the HFM with little loss of fidelity in the steady-state application.
The performance of the POD model for other relevant parameters is summarised in Table 3.
The results show that the performance of the POD model is satisfactory with low RMSE across the relevant parameters. The RMSEs are lower for the case POD 2 as expected because more system variation energy is captured to develop that POD model.

Simulation Time
The evidence of model order reduction is the reduction in simulation time. The simulations are carried out on a PC with the following specification: Intel i5 4460 processor, 3.2 GHz CPU, 32GB RAM running MATLAB R2018b software. The same time step is specified for the HFM and POD models.
The simulation time obtained in the steady-state application is presented in Table 4. The real-time factor (RTF) is the ratio of simulation time to real time. A model faster than real time has RTF < 1, and RTF > 1 indicates that the model is slower than real time. The system real runtime is 500 s (model run for 10,000 s in 20 s time steps).
The results in Table 4 show that our model is faster than real time by one order of magnitude. This is true for the full-order high-fidelity model and, of course, for the POD models.
The simulation time results also indicate that there is a measure of model order reduction achieved with the POD model. This is seen in the 4% reduction in simulation time for the POD model with 85% variation energy. A simulation time saving of 2% is also seen in the POD model with 99% variation energy.
The savings in simulation time are below expectation. The modest time savings indicate that most of the simulation time is not spent in the part of the problem whose dimension has been reduced by the POD method. The majority of the simulation time (about 80%) is spent on solving the inlet channel-wall layer-outlet channel coupled nonlinear boundary value differential equations [2].
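The real-time factor defined above is a simple ratio, sketched here with hypothetical timings that are not taken from Table 4.

```python
def real_time_factor(simulation_time_s, real_time_s):
    """Ratio of wall-clock simulation time to the real time simulated."""
    if real_time_s <= 0:
        raise ValueError("real_time_s must be positive")
    return simulation_time_s / real_time_s

# Hypothetical timing: 50 s of computation for 500 s of real time.
rtf = real_time_factor(50.0, 500.0)
faster_than_real_time = rtf < 1.0
```

An RTF of this size corresponds to a model that runs one order of magnitude faster than real time.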
The POD method is applied to the state variables with explicit-time terms in their model equations. The POD excels at reducing the simulation effort for the evolution of the state variables in time by solving a reduced-order initial value differential equation. In this application, the POD method achieves almost 10% reduction in simulation time for this component of the system. The time savings are dwarfed, however, by the invariant simulation requirements of the principal component of the model solution: the inlet channel-wall layer-outlet channel coupled nonlinear boundary value differential equations [2].
Fig. 3 Further insight into the nature of the POD basis functions for the steady-state application. Plot (a) is the HFM output for ASC on active catalyst site S1 over time. Plot (b) is averaged ASC over the wall domain over time. Plot (c) is the first two POD basis functions. The first POD basis function appears like an average of the ASC data over time, a static spatial representation of the data. Plot (d) shows the associated temporal coefficients. System data is approximated by the POD basis functions and associated temporal coefficients; better approximation is achieved by using more POD basis functions by definition
In the steady-state application, the POD method can deliver a modest order reduction of the SCR-coated filter model which retains a high degree of fidelity. However, because of the non-linear nature of the system equations, the simulation of the model must be calculated via the reconstruction of the full state; therefore, only a modest reduction (2-4%) in computation time is achieved. The low RMSE demonstrates the high fidelity of the reduced-order model.
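The need to reconstruct the full state at every step can be seen in a toy POD-Galerkin example. This is a sketch under stated assumptions: the storage kinetics in `rhs`, the concentration profile `c`, the explicit Euler integrator and all parameter values are hypothetical and are not the SCR-filter model of [2].

```python
import numpy as np

N, steps, dt = 40, 2000, 0.01
x = np.linspace(0.0, 1.0, N)
c = np.exp(-3.0 * x)     # hypothetical gas concentration, skewed to the inlet

def rhs(theta):
    """Toy nonlinear storage kinetics on the full N-dimensional state."""
    return 2.0 * c * (1.0 - theta) - 0.5 * theta ** 2

def integrate_full(theta0):
    theta = theta0.copy()
    history = [theta.copy()]
    for _ in range(steps):
        theta = theta + dt * rhs(theta)          # explicit Euler step
        history.append(theta.copy())
    return np.array(history).T                   # shape (N, steps + 1)

def integrate_reduced(theta0, modes):
    a = modes.T @ theta0                         # reduced initial condition
    history = [modes @ a]
    for _ in range(steps):
        full_state = modes @ a                   # reconstruct N values
        a = a + dt * (modes.T @ rhs(full_state)) # project the full RHS
        history.append(modes @ a)
    return np.array(history).T

snapshots = integrate_full(np.zeros(N))
U, s, _ = np.linalg.svd(snapshots, full_matrices=False)
modes = U[:, :3]                                 # keep three POD modes
approx = integrate_reduced(np.zeros(N), modes)
rmse = float(np.sqrt(np.mean((snapshots - approx) ** 2)))
```

The reduced model advances only three coefficients, yet each step still evaluates `rhs` on all N points. With a nonlinear right-hand side the cost per step therefore does not shrink in proportion to the number of retained modes.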

Transient State Application
The performance of the POD methodology in a transient state application is presented in this section. The transient state application is based on reference [9].
Transient input conditions are obtained from a sample World Harmonized Transient Cycle (WHTC) reported in ref. [25]. The engine-out NOx emission concentrations and the exhaust gas temperatures are obtained from the reference. Details of the test engine conditions can be found in ref. [25]. The exhaust gas flow rate was constant for the simulation. In the experiment, liquid urea solution was injected; thus, the input NH3 gas concentration was not measured. For our analysis, NH3 gas is injected at ANR = 1.0. The inlet NOx and NH3 concentrations and exhaust gas temperatures are presented in Fig. 5. The filter substrate of Table 1 is retained for this application.
The performance of the POD model in a transient state application is discussed in the following sections.

POD Modes
The fundamental idea exploited in the POD method is that in a high-dimensional data ensemble of independent variables, the dominant components of the data can be captured with surprisingly few orthogonal modes [26]. In the transient mode, the singular value decomposition of the ammonia surface coverage shows that only a few singular modes describe most of the variation energy in the system. This is consistent with the observation from applying the POD to the steady-state case.
In Fig. 6b, it is seen that the relative energy of a few POD modes dominates over other POD modes. In this example, two modes are needed to capture 85% of the variation energy in the ammonia surface coverage data and four POD modes can capture 99% of the variation energy. Therefore, the POD method is suited for model order reduction in the transient state application.

POD Basis Functions
In a similar analysis to the steady-state application, Fig. 7 presents an insight into the nature of the POD basis functions for the transient state application. Figure 7a shows the profile of ammonia surface coverage on active site S1 over the wall at different times in the simulation. Due to the transient state scenario, the ASC is not monotonically increasing as in the steady-state scenario. The variation in ASC can be seen in Fig. 7b. The variation in ASC is due to variation in inlet species concentration and consumption/storage of species within the wall.
While it can be difficult to offer any clear physical interpretation of the POD basis functions as shown in Fig. 7c, the first POD basis function Φ 1θS1 can be considered essentially constant, representing the average of the ammonia surface coverage data ensemble. The other POD basis functions are somewhat similar to the Legendre polynomials, but they are showing more variation towards the front of the catalyst, where higher gradients are to be expected. With increasing order, the basis functions show increasing variance over distance, helping to resolve finer variations in coverage.
The temporal coefficients a associated with the POD basis functions are shown in Fig. 7d, with the first coefficient a 1X θS1 showing a profile similar in form to the average ammonia surface coverage over the wall domain (Fig. 7b). This supports the interpretation of the first POD basis function as a form of system data average. The other temporal coefficients are to be interpreted as the statistical complements of the POD basis functions that enable an approximation of the system data to be achieved by the POD method.

Output Conversion
The performance of the POD model is investigated here with the cumulative NOx conversion and cumulative NH3 slip over the simulation period. Figure 8 presents the cumulative NOx conversion and NH3 slip for the high-fidelity model (HFM) and two POD models: POD 1, developed with POD modes which capture 85% of the variation energy, and POD 2, developed with POD modes which capture 99% of the variation energy.
The POD model results show close alignment with the HFM result for NOx conversion and NH3 slip. The performance of POD 2, which captures 99% of the variation energy, exceeds that of POD 1, which is designed to capture 85% of the variation energy. This is expected by definition of the POD method. Table 5 presents the RMSE of the outlet NOx and NH3 concentrations. The performance of the POD model is good in the transient state application. The RMSE in NOx outlet concentration is circa 3% of the mean inlet NOx concentration for POD 1 and less than 0.3% for POD 2. The NH3 RMSE is circa 12% of the mean NH3 inlet concentration for POD 1 and about 0.4% for POD 2. The low RMSE, particularly for POD 2 where 99% of the variation energy is captured, demonstrates that the reduced-order model retains a high degree of fidelity for this transient case application. Table 6 shows the RMSE for other relevant system parameters.
The results show that the performance of the POD model is satisfactory with low RMSE across the relevant parameters. The RMSEs are lower for the case POD 2 as expected because more system variation energy is captured to develop that POD model. The magnitude of the RMSEs for both POD models is within 1-11% of the average HFM outputs, which further underscores that the RMSEs are low across the relevant parameters.
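The cumulative conversion used in this comparison is not defined explicitly in the text. One plausible definition, assuming constant exhaust flow and uniform time steps so that concentrations can be summed directly, is sketched below; the function name and the sample values are hypothetical.

```python
import numpy as np

def cumulative_conversion(inlet_ppm, outlet_ppm):
    """Running conversion 1 - (cumulative outlet / cumulative inlet).

    Assumes constant flow rate and uniform sampling, and a non-zero
    inlet concentration at the first sample.
    """
    inlet = np.cumsum(np.asarray(inlet_ppm, dtype=float))
    outlet = np.cumsum(np.asarray(outlet_ppm, dtype=float))
    return 1.0 - outlet / inlet

# Hypothetical inlet and outlet NOx traces in ppm.
inlet = np.array([200.0, 400.0, 400.0])
outlet = np.array([100.0, 50.0, 50.0])
conversion = cumulative_conversion(inlet, outlet)
```

Under these assumptions the last entry is the conversion over the whole cycle, and the same running sums applied to the outlet NH3 trace would give a cumulative slip.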

Simulation Time
The simulation times recorded for the HFM and the POD models for the transient state application are presented in Table 7. The simulation real runtime is 391 s (model run for 1955 s in 5 s time steps). Consistent with the steady-state application, the models achieve faster than real-time simulation, as shown by their real-time factor RTF < 1.
The results show poor performance in model reduction: there is only a marginal reduction in simulation time, less than 1%, for the POD 2 model, and no apparent improvement in simulation time with the POD 1 model. Consistent with the steady-state application, much of the simulation time is spent in solving the coupled non-linear boundary value problem (BVP) of the inlet channel, wall layer and outlet channel species transport balance. See Fig. 9 for a split of the simulation time. The POD method is applied to the state variables with explicit-time terms in their model equations. The POD model delivers improvement in the "Other" component of the simulation time, but not enough to improve overall simulation time. In the case of POD 1, the increase in the time taken to solve the BVP component of the model (due to larger initial deviation and more iterations to reach desired error tolerances) altogether reverses any benefit in the POD application.
Although the POD method delivers a high-fidelity low-order approximation of the full-order model, it has not achieved any significant reduction in simulation time. So, while theoretically valid, its practical use remains limited.

Artificial Neural Networks
The second approach to model order reduction applied in this work is artificial neural networks (ANN). In this approach, data was obtained from the high-fidelity model presented in reference [2] and a black-box approximation of the HFM is derived.
The motivation for this approach is twofold: (a) the ANN method is well suited for nonlinear model function approximation [10][11][12]; and (b) ANN can deliver the model simplicity and adequate representation required for deploying the SCR-coated filter model in an online application.
Neural networks have been applied in exhaust after-treatment system (EATS) modelling [13][14][15][16][17][18][19]. Chi in [13] used ANN to simulate the monolith bed temperature for more accurate estimation of the effect of radial mal-distribution of energy on reaction and, by extension, EATS performance. ANNs have been used to facilitate the application of NOx virtual sensors for emissions monitoring and online on-board diagnostics [14,15]. Brahma et al. [16] coupled neural networks with phenomenological soot modelling to improve the performance of a DPF unit within a broader EATS control design and optimisation framework. ANNs have also been used to evaluate the relative performance of different SCR catalyst formulations on NOx conversion [17,18]. In this work, we apply ANN to the problem of approximating the SCR-filter model for control-oriented application. The authors are not aware of any previous application of the ANN method in order reduction of the SCR-filter model. The performance of the ANN method is compared against an alternative MOR technique.
The process of developing our ANN is as follows. Firstly, data was collected over the desired range of operating conditions. Thereafter, a multilayer neural network was created and trained with 85% of the data collected. Cross-validation was carried out with 10% of the training data, and model testing was with the remaining 15% of the data. This section provides further remarks on the development process of the ANNs.
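The data partition described above can be sketched as an index split. Random shuffling and the fixed seed are assumptions made for this example, since the text does not say whether samples were split randomly or in time order; the function name is hypothetical.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.85, val_frac_of_train=0.10, seed=0):
    """Split sample indices into training, validation and test sets.

    The validation set is carved out of the training portion, so the
    three sets are disjoint and together cover all samples.
    """
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)             # assumed random split
    n_train_total = int(round(train_frac * n_samples))
    n_val = int(round(val_frac_of_train * n_train_total))
    val = order[:n_val]
    train = order[n_val:n_train_total]
    test = order[n_train_total:]
    return train, val, test

train_idx, val_idx, test_idx = split_indices(1000)
```

For 1000 samples this gives 765 training, 85 validation and 150 test indices. For strongly time-correlated data a contiguous split may be preferable to shuffling.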

System Data
System data was collected from the validated model of the SCR-coated filter developed in reference [2]. Table 8 presents the input and output data. The system state variables are also presented. The state variables are those with time-explicit terms in the SCR-coated filter model. In this model, the state variables drive the evolution of system properties over time. In Table 8, N is the number of discrete units along the filter channel, w is the number of discrete units across the wall, and T is the number of data samples (Fig. 17).
Note that the lumped parameter approach of Kladopoulou et al. [27] is applied to the wall temperature dynamics in the HFM [2]. This approach was taken to simplify the energy model in line with the objective of model development for system control. In our experimental set-up for model validation, thermocouples were inserted at various positions in the monolith to cover a wide range of radial and axial positions. A volume-weighted average of the temperature data is correlated against the wall temperature predictions of the lumped model, with satisfactory performance, in a similar way to the work of Kladopoulou et al. [27]. Figure 10 shows a schematic of the artificial neural network system showing input, output and state variables.
Data was generated by stepwise random excitation of the inputs between defined lower and upper limits. The limits of each variable are defined to cover the range expected during model operation. The HFM was validated against the data of Schrade et al. [3] in this work; therefore, the range of our input parameters was dictated by the range of the experimental data reported in [3]. It is possible to extend our model beyond this application if good-quality data is available for model calibration. The data is discrete, generated at 1 s time steps. The dimension of each variable is included in Table 8, with the number of datapoints equivalent to T. In order to excite the system state variables sufficiently within the operational range, the input signal is held for 100 s between excitation levels. For comparison, the maximum residence time for the Schrade et al. system [3] is circa 6 s. Figure 11 shows the input signals in a matrix plot. The diagonal of the matrix plot shows histograms of the input signals over the range specified. The histogram of each input variable shows that the variable is drawn from a uniform distribution within the specified range. The only exception is the mass of urea (plot in diagonal #4), which is skewed left to match the NH 3 /NO x ratio (ANR) specified for the analysis. The plot highlights that the input excitation sufficiently covered the range of interest.
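The stepwise random excitation described above can be sketched as follows. This is a minimal illustration rather than the code used in this work: the function name, the bounds (450 to 750, standing in for a hypothetical inlet temperature in K) and the random seed are assumptions. Only the 1 s time step, the 100 s hold time, the uniform draw and the 100,000-sample length come from the text. The skewed urea signal is not reproduced here.

```python
import numpy as np

def stepwise_random_signal(lower, upper, n_samples, hold=100, seed=0):
    """Piecewise-constant signal: a new uniform level every `hold` samples."""
    rng = np.random.default_rng(seed)
    n_levels = -(-n_samples // hold)  # ceiling division
    levels = rng.uniform(lower, upper, size=n_levels)
    # Repeat each level `hold` times (1 s per sample), then trim to length.
    return np.repeat(levels, hold)[:n_samples]

# Hypothetical bounds for a single input variable (assumption, not from Table 8).
signal = stepwise_random_signal(lower=450.0, upper=750.0, n_samples=100_000)
print(signal.shape)  # (100000,)
```

One such signal would be generated per input variable, each with its own limits, so that the state variables settle at each level before the next step is applied.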
The size of the data generated is 100,000 samples. Eighty-five percent of this data was used for training the ANN. Ten percent of the training data was used for cross-validation. The remaining dataset was used to test the trained ANN model. A new dataset with 100,000 samples was used to further test model performance.
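The partition described above can be written out as a short sketch. The function name and the random shuffling are assumptions. The text gives only the fractions and does not state whether samples were shuffled or split in time order.

```python
import numpy as np

def split_indices(n_samples, train_frac=0.85, val_frac_of_train=0.10, seed=0):
    """Return (train, validation, test) index arrays for the stated fractions."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)        # shuffling is an assumption
    n_train_total = int(round(train_frac * n_samples))
    n_val = int(round(val_frac_of_train * n_train_total))
    val = order[:n_val]                       # 10% of the training portion
    train = order[n_val:n_train_total]        # rest of the training portion
    test = order[n_train_total:]              # remaining 15%
    return train, val, test

train_idx, val_idx, test_idx = split_indices(100_000)
print(len(train_idx), len(val_idx), len(test_idx))  # 76500 8500 15000
```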

ANN Architecture
The ANN architecture is a classic multi-layer perceptron comprising input nodes, a single hidden layer, an output layer and output nodes. See Fig. 12 for a schematic of the network.
The size of the input node is shown as 8 + X, the size of the output node is shown as 6 + X, where 8 is the number of input parameters and 6 is the number of output parameters as per Table 8, and X is the number of the state variables after dimension reduction from the original dimension of N + 1 + 3Nw. The approach to dimension reduction is presented in Section 4.4.
The number of hidden layers and the number of neurons in each hidden layer are selected to provide adequate function representation without over-fitting. One hidden layer of 40 neurons was found to be acceptable for this exercise.
The transfer function for the hidden layer is the hyperbolic tangent sigmoid (tansig), because it is a good choice for many nonlinear functions [17]. The transfer function for the output layer is linear (purelin). The transfer function equations are given in Eqs. 10 and 11, respectively.
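For illustration, the forward pass of such a network can be sketched with the standard definitions of the two transfer functions, tansig(n) = 2/(1 + exp(−2n)) − 1 (numerically equal to tanh) and purelin(n) = n. The weights below are random placeholders rather than trained values. The layer sizes (13 inputs, 40 hidden neurons, 11 outputs) are those quoted in this paper for the reduced network.

```python
import numpy as np

def tansig(n):
    """Hyperbolic tangent sigmoid; mathematically identical to tanh(n)."""
    return 2.0 / (1.0 + np.exp(-2.0 * n)) - 1.0

def purelin(n):
    """Linear (identity) transfer function."""
    return n

def mlp_forward(x, W1, b1, W2, b2):
    """Single-hidden-layer perceptron: tansig hidden layer, purelin output."""
    hidden = tansig(W1 @ x + b1)
    return purelin(W2 @ hidden + b2)

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 13, 40, 11
# Placeholder (untrained) parameters, for shape checking only.
W1 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_out, n_hidden))
b2 = np.zeros(n_out)

x = rng.uniform(-1.0, 1.0, size=n_in)   # one normalised input vector
y = mlp_forward(x, W1, b1, W2, b2)
print(y.shape)  # (11,)
```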
During network training, the weights and biases associated with each neuron are determined so as to minimise the mean sum of squared errors (MSE) between the model output (y o ) and the target data (y t ). The expression for the MSE is given in Eq. 12, where ŷ t and ŷ o are the normalised datasets (see footnote 3), designed to exclude any influence of the relative magnitude of the system variables on the overall MSE.
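The effect of normalisation on the MSE can be seen in a small numerical sketch. It assumes a min-max scaling to [−1, 1] computed per output variable from the target data. The function names and sample values are illustrative only.

```python
import numpy as np

def minmax_fit(data):
    """Per-column minimum and maximum of the reference (target) data."""
    return data.min(axis=0), data.max(axis=0)

def minmax_apply(data, lo, hi):
    """Scale each column to [-1, 1] using the fitted limits."""
    return 2.0 * (data - lo) / (hi - lo) - 1.0

def normalised_mse(y_target, y_output, lo, hi):
    """Mean of squared errors computed on the normalised variables."""
    err = minmax_apply(y_target, lo, hi) - minmax_apply(y_output, lo, hi)
    return float(np.mean(err ** 2))

# Two hypothetical outputs with very different magnitudes.
y_t = np.array([[0.0, 100.0], [10.0, 300.0], [5.0, 200.0]])
y_o = y_t + np.array([0.5, 10.0])   # 5% of each column's range
lo, hi = minmax_fit(y_t)
mse = normalised_mse(y_t, y_o, lo, hi)
print(mse)  # close to 0.01; both columns contribute equally
```

Without normalisation, the second column, whose absolute error is 20 times larger, would dominate the MSE.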
Network training is achieved via the back-propagation method using the Levenberg-Marquardt algorithm [28]. The ANN training is implemented in the Deep Learning Toolbox of MATLAB R2018b.

Static vs. Dynamic Networks
Many ANN structures used for engineering applications are either static (feedforward) neural networks or dynamic (recurrent) neural networks with feedback. In the static ANN (Fig. 13) neurons respond instantaneously to input signals and are well suited for pattern recognition applications where both input and output variables represent spatial patterns independent of time [29].
Dynamic networks (Fig. 14) employ feedback between the neurons of a layer and/or between the layers of the network. The dynamic network more closely represents the working of the biological neurons with the feedback denoting local memory attributes [29].
Comparatively, static ANNs are conditionally stable and better at function approximation in real-time applications [30]. Static ANNs do not take into account time delays and are weaker in modelling nonlinear systems [29].
Dynamic ANNs, on the other hand, exploit feedback and time delays to achieve better performance in nonlinear systems in general and are particularly appropriate for system identification, control and filtering applications [29]. Dynamic ANNs are more complex, require larger training datasets and take longer to train [30].
In this work, we implemented both the static and dynamic networks to compare their performance for this application. Further details on the specific architecture of the static and dynamic networks are provided as follows; however, only the results of the dynamic ANN are reported thereafter due to space constraints.

Static Network
The static network architecture is presented as a schematic in Fig. 15. In the static network, the state variables are set as inputs and their time derivatives are the outputs. These are connected via the Simulink integrator block to simulate feedback.

Dynamic Network
The dynamic recurrent network is presented as a schematic in Fig. 16. The state variables at time t are set as inputs to the dynamic network, while the outputs are the state variables at time t + 1. The feedback loop is closed by a unit-delay block "D".
Unlike the static network above, which approximates the derivative of the state, this one approximates the state itself for the next time step. In a linear interpretation, there is no difference between the two approaches. But since the network is non-linear, this approach tends to be inherently more stable.
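The structural difference between the two feedback arrangements can be sketched on a scalar toy system, dx/dt = −a·x + u. Everything in the sketch is an assumption made for illustration. Analytical functions stand in for the trained networks, explicit Euler integration stands in for the Simulink integrator block, and the values of `A` and `DT` are arbitrary.

```python
import numpy as np

A, DT = 0.5, 1.0  # hypothetical decay rate (1/s) and time step (s)

def derivative_model(x, u):
    """Stand-in for the static network: maps (state, input) to dx/dt."""
    return -A * x + u

def next_state_model(x, u):
    """Stand-in for the dynamic network: maps (state, input) to x at t + 1."""
    decay = np.exp(-A * DT)
    return decay * x + (1.0 - decay) * u / A

def rollout_static(x0, inputs):
    """Feedback through an integrator (explicit Euler used here)."""
    x, history = x0, []
    for u in inputs:
        x = x + DT * derivative_model(x, u)
        history.append(x)
    return np.array(history)

def rollout_dynamic(x0, inputs):
    """Feedback through a unit delay: the output becomes the next input."""
    x, history = x0, []
    for u in inputs:
        x = next_state_model(x, u)
        history.append(x)
    return np.array(history)

u_step = np.ones(100)                      # constant input held for 100 s
static_traj = rollout_static(0.0, u_step)
dynamic_traj = rollout_dynamic(0.0, u_step)
print(static_traj[-1], dynamic_traj[-1])   # both approach u / A = 2.0
```

In the static arrangement, the accuracy and stability of the rollout also depend on the integrator and its step size, whereas the unit-delay arrangement applies the learned map directly at each step.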

State Variable Dimension Reduction
The HFM was solved over an inlet channel, wall layer and outlet channel representation of the SCR-coated filter. The solution domain was discretised into N units in the axial dimension and w units in the wall dimension [2]. Figure 17 shows the solution domain.
The size of the ANN input node is 8 + N + 1 + 3Nw, which for N = 10 and w = 5 results in 169 input nodes. The size of the output node is 6 + N + 1 + 3Nw, which is 167 for the same N and w discrete units.
Fig. 12 Schematic of the ANN architecture. A typical network contains input, hidden and output layers.
Footnote 3: min-max normalisation scales the data to [−1, 1].
The dimension of the state variables within the input and output nodes was reduced to make the training of the ANN memory efficient. Table 9 shows the lumping technique used for each state variable. The reduced dimension of the input node is 13 and that of the output node is 11. This approach achieves a significant reduction in the number of state variables and therefore in the model order. Unlike the POD basis described in Section 2, the chosen basis is not ideal, but it has a clear physical interpretation.
It is noted that this state variable dimension reduction step results in what can be considered a zero-dimensional (0D) model; this is coincidental. The aim in this section was to lump the state variables in a manner that supports the ANN framework, so that model training is memory efficient. The physical interpretation of the lumping process mirrors the assumptions made in deriving a 0D model, namely, that the system can be considered as a CSTR which is homogeneous in the spatial domain. Consideration of further overlaps between the ANN model and the conventional 0D model is outside the scope of this paper.
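The node counts quoted in this section follow directly from the expressions 8 + X and 6 + X. The short check below uses hypothetical helper names. The reduced state dimension X = 5 is not stated explicitly but is implied by the reported reduced sizes of 13 and 11.

```python
def state_dimension(n_axial, n_wall):
    """Full state dimension N + 1 + 3Nw."""
    return n_axial + 1 + 3 * n_axial * n_wall

def node_sizes(n_inputs, n_outputs, n_states):
    """Input node size (inputs + states) and output node size (outputs + states)."""
    return n_inputs + n_states, n_outputs + n_states

full = node_sizes(8, 6, state_dimension(10, 5))
reduced = node_sizes(8, 6, 5)   # X = 5 inferred from the reported sizes
print(full)     # (169, 167)
print(reduced)  # (13, 11)
```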
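As a purely illustrative sketch of such a lumping step, the snippet below collapses each spatially distributed state to a single value by arithmetic averaging. Both the averaging rule and the interpretation of the N + 1 + 3Nw states (one axial field, one lumped wall temperature, three wall-layer fields) are assumptions made here. The technique actually applied to each variable is the one listed in Table 9 and may differ.

```python
import numpy as np

def lump_states(channel_state, wall_temperature, wall_states):
    """Collapse each spatial field to one scalar by averaging (assumed rule)."""
    lumped_wall = wall_states.mean(axis=(1, 2))   # one value per wall field
    return np.concatenate(([channel_state.mean()], [wall_temperature], lumped_wall))

N, w = 10, 5
# Hypothetical fields, for shape illustration only.
channel_state = np.linspace(0.0, 1.0, N)                  # N values
wall_temperature = 500.0                                  # already lumped
wall_states = np.ones((3, N, w)) * np.array([0.1, 0.2, 0.3])[:, None, None]

x_reduced = lump_states(channel_state, wall_temperature, wall_states)
print(x_reduced.shape)  # (5,)
```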

ANN Results and Discussion
This section presents the performance of the static and dynamic neural networks developed.

Overall
The MSE performance of the network for the training, validation and test data is presented in Figure 18. Training was relatively fast for this network with convergence achieved in 4.5 h.
The performance metric in Fig. 18 is for the normalised dataset. The best normalised MSE is 0.0148. The actual network performance when the data is de-normalised is presented in Table 10 for the training and test datasets.
The R 2 parameter is about 98% for both the training and test datasets. This means that the ANN model can explain 98% of the variation within the datasets. This is good performance as an R 2 of 1.0 indicates perfect correlation.
The R 2 parameter is sensitive to outliers; therefore, a modified index of agreement d 1 parameter is also reported. The index of agreement parameter is based on absolute errors and reflects the proportional contribution of each error estimate to the overall model performance without inflation by their square values [31]. The index of agreement at 0.97 indicates good agreement between the model prediction and the true output observations. The R 2 and d 1 results show that the ANN approximation of the full-order SCR filter model performs adequately. Figure 19 presents a further look at the performance of the outlet NO x concentration. Figure 19a shows the ANN model output of outlet NO x concentrations against the target HFM data. The R 2 of the ANN for outlet NO x in the test dataset is 0.95.
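The performance metrics used in this section can be sketched as follows. R 2 is written here as the coefficient of determination, which is an assumption, since a squared correlation coefficient is also commonly reported under the same name. The modified index of agreement d 1 follows the absolute-error form, d 1 = 1 − Σ|y o − y t | / Σ(|y o − ȳ t | + |y t − ȳ t |). The sample values are illustrative only.

```python
import numpy as np

def r_squared(y_t, y_o):
    """Coefficient of determination (assumed definition of R^2)."""
    ss_res = np.sum((y_t - y_o) ** 2)
    ss_tot = np.sum((y_t - y_t.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def index_of_agreement_d1(y_t, y_o):
    """Modified index of agreement based on absolute errors."""
    mean_t = y_t.mean()
    num = np.sum(np.abs(y_o - y_t))
    den = np.sum(np.abs(y_o - mean_t) + np.abs(y_t - mean_t))
    return 1.0 - num / den

def rmse(y_t, y_o):
    return float(np.sqrt(np.mean((y_t - y_o) ** 2)))

def mae(y_t, y_o):
    return float(np.mean(np.abs(y_t - y_o)))

y_t = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # target (illustrative)
y_o = np.array([1.1, 1.9, 3.2, 3.8, 5.0])   # model output (illustrative)
print(r_squared(y_t, y_o), index_of_agreement_d1(y_t, y_o))
print(rmse(y_t, y_o), mae(y_t, y_o))
```

Because d 1 sums absolute rather than squared deviations, a single large outlier lowers it less than it lowers R 2.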

Specific Output Variables
The results in Fig. 19a show that, for the training and test datasets, the majority of the model output outlet NO x concentrations congregate around the diagonal line of fit. Figure 19b shows the distribution of the deviation in outlet NO x concentration between the ANN model output and the target HFM data for the test dataset. The results show that most of the points fall within the 0-25 ppm error band and 80% of the observations are within ± 80 ppm deviation. A closer look at the ANN performance on outlet NH 3 concentration is presented in Fig. 20. Figure 20a shows the model outlet NH 3 concentration against the target data for the training and test datasets. The results show that most of the observations congregate around the line of fit. The test data R 2 value shown on the plot indicates that the ANN model can explain 90% of the variation in the target HFM data. The relatively low R 2 value for the outlet NH 3 concentration is due to a few outlier values with large deviation.
The distribution of the deviation between the ANN model output and the target data for the test dataset is presented in Fig. 20b. The histogram corroborates the observation in Fig. 20a, with most of the observations around 0-25 ppm. Ninety percent of the model output NH 3 concentrations are within 50 ppm of the target HFM NH 3 concentrations. Figure 20c provides further insight into the performance of the ANN model by showing the distribution of the NH 3 slip predicted by the ANN and by the HFM for the test data. The results show that the ANN model tracks the performance of the HFM.
In general, the results of the model on outlet NO x and NH 3 concentrations show that the ANN can approximate the SCR-filter model.
The performance of the other output variables against the target data in our model is presented in Fig. 21. The plot shows the performance for the training dataset (blue) and the test dataset (red) for outlet CO and CO 2 concentrations, filter pressure drop and the outlet temperature. The model R 2 for the test data is overlaid on each plot.
The summary of the model performance for the output variables is presented for the training and test dataset in Table 11.
The low RMSE and MAE and the high R 2 values for the training dataset show that the trained neural network can approximate the full-order SCR-filter model to a reasonable level of accuracy. A similar level of performance on the test data shows that the trained neural network generalises as a function approximation.
Note that the R 2 performance for outlet CO and CO 2 reported in Table 11 shows apparent perfect correlation between the ANN and HFM (R 2 = 1). This is due primarily to the almost inert nature of those components in this system. A diesel oxidation catalyst is not modelled in the HFM and the inlet soot concentration is relatively small. Therefore, the change in CO and CO 2 concentration within the system is insignificant. These results are reported for completeness. Figure 22 shows how the ANN model tracks the increase in CO and CO 2 due to soot burn-off in our model.

Simulation Time
The training time was relatively short at 4.5 h. The system real runtime was 100,000 s in 1 s time steps. The model simulation time was 156 s. This is equivalent to a real time factor of circa 1.6e-3 for this case. This means that the neural network model can achieve three orders of magnitude reduction in simulation time compared to real time.
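The real time factor quoted above is the ratio of wall-clock simulation time to simulated duration. The arithmetic can be checked as follows, where the function name is hypothetical.

```python
def real_time_factor(simulation_time_s, simulated_duration_s):
    """Wall-clock time needed per second of simulated system time."""
    return simulation_time_s / simulated_duration_s

rtf = real_time_factor(156.0, 100_000.0)
print(f"RTF = {rtf:.2e}, speed-up = {1 / rtf:.0f}x")
# RTF = 1.56e-03, speed-up = 641x
```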

Trained Network on New Dataset
To minimise the random effects of the training data and assess whether the trained model can be generalised, the trained model was tested against a new, previously unseen dataset. (For ease of reference, the new dataset is called "Test Data #2".) The new dataset was different from the original dataset. The network performance for outlet NO x and NH 3 concentration is presented in Fig. 23, with similar performance to that already discussed for the original dataset. The apparently better performance in the error distribution of Test Data #2 outlet NH 3 (Fig. 23d) compared with Test Data outlet NH 3 (Fig. 20b), even though the NH 3 RMSE and MAE are worse (Table 12), is due to the larger sample size of Test Data #2. Test Data #2 has 100,000 samples while the original test data has 15,000 samples, and the error deviation histogram is plotted as relative frequency.
Fig. 17 The full-order SCR-filter model solution domain

Practical Considerations for ANN Modelling in EATS
This section presents a few thoughts on further practical aspects to consider for applying ANN to EATS modelling.

Quality of Training Data
The performance of an ANN depends on the quality of data available to train the model. System-specific data can be generated from high-fidelity models, as we did in this study. This enables a degree of customisation of the range, domain and frequency of the data. The pitfall of this approach is that specific shortcomings of the HFM carry forward into the trained ANN. Alternatively, data can be obtained from telemetry feedback from real-life deployment of the specific units of interest. More effort is required to process this data for use in the ANN framework. The practical restrictions in the measurement of some system variables (e.g. ammonia surface coverage, NO x speciation) further limit the ability of the output models to generalise acceptably. It is not practical to generate data for the ANN from SGB-type experiments because of the effort required to generate the volume of data needed to train a neural network. Once good-quality data is available, the process of developing ANN models is relatively straightforward.
Table note: y t is the target observation, y o is the model prediction/output, and ȳ t is the mean of the target observations [31].

Online Training
By online training, we mean setting up a neural network which continues to accept data input and continually updates the model parameters in light of the new data. This is an important aspect of ANNs for current and future applications in EATS reduced order modelling. It is envisaged that most of the data for ANN development will come via telemetry feedback from real-life deployment of the specific unit of the EATS. The initial ANN will perhaps be trained on a first tranche of HFM data, but with the functionality to update "on the fly". The fundamental steps for setting up the base ANN model are described in the previous sections. The online training algorithm will incorporate some dynamic backpropagation. The interested reader is referred to [12] for additional information on customising the ANN model for online training.
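A minimal sketch of such an "on the fly" update is given below. It uses plain stochastic gradient descent on one sample at a time, which is a simplification chosen for illustration. It is neither the Levenberg-Marquardt algorithm used for the base model nor the dynamic backpropagation mentioned above. The class name, learning rate, network size and toy target function are all assumptions.

```python
import numpy as np

class OnlineMLP:
    """One-hidden-layer tanh network updated one sample at a time (SGD)."""

    def __init__(self, n_in, n_hidden, n_out, lr=0.05, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
        self.b1 = np.zeros(n_hidden)
        self.W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
        self.b2 = np.zeros(n_out)
        self.lr = lr

    def predict(self, x):
        return self.W2 @ np.tanh(self.W1 @ x + self.b1) + self.b2

    def update(self, x, y_target):
        """One gradient step on the loss 0.5 * ||y - y_target||^2."""
        h = np.tanh(self.W1 @ x + self.b1)
        err = self.W2 @ h + self.b2 - y_target
        grad_h = (self.W2.T @ err) * (1.0 - h ** 2)  # uses W2 before its update
        self.W2 -= self.lr * np.outer(err, h)
        self.b2 -= self.lr * err
        self.W1 -= self.lr * np.outer(grad_h, x)
        self.b1 -= self.lr * grad_h
        return float(0.5 * err @ err)

# Simulated data stream with a hypothetical target relationship.
rng = np.random.default_rng(1)
net = OnlineMLP(n_in=2, n_hidden=8, n_out=1)
losses = []
for _ in range(5000):
    x = rng.uniform(-1.0, 1.0, size=2)
    y = np.array([0.5 * x[0] - 0.3 * x[1]])
    losses.append(net.update(x, y))

print(np.mean(losses[:100]), np.mean(losses[-100:]))  # loss falls as data arrives
```

In a deployment, the base network would first be trained offline on HFM data, and `update` would then be called as telemetry samples become available.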

Conclusions
An SCR-coated filter model was developed in [2]. A new implementation approach which delivers faster than real-time simulation times was demonstrated. The full-order SCR-filter model (HFM) is, however, complicated and does not lend itself to easy implementation for control application on a vehicle ECU.
To achieve the objective of model-based control development, it is necessary to reduce the complexity of the SCR-filter model. The SCR-filter model order reduction (MOR) has been attempted in this work. Two MOR techniques were implemented: (a) proper orthogonal decomposition, a grey-box approach, and (b) artificial neural networks for function approximation, a black-box approach.
The performance of both MOR approaches is compared in Table 13.
The POD delivers a reduced order model with a high degree of fidelity to the HFM. This is evident in the low RMSE values. The reduction in simulation time delivered by the POD is marginal when compared with the simulation times of the HFM. The POD method delivers a high-fidelity approximation to the HFM because the POD model is developed from the HFM (the POD bases are derived from system data), thus retaining some physical aspects of the full-order model. The reduction in simulation time is modest because most of the simulation time is spent in the solution of the coupled channel-wall layer species boundary value differential equation.
Fig. 22 Dynamic ANN lumped parameter. Network performance on output CO and CO 2 concentration. Percentage increase in CO and CO 2 concentration due to soot oxidation in the system. Soot content is small; therefore, apparent perfect correlation is reported between ANN and HFM. a Distribution of increase in outlet CO concentration. b Distribution of increase in outlet CO 2 concentration
The ANNs deliver a reduced order model which achieves a significant reduction in simulation time. The speed-up achieved is of the order of a thousand times faster than the HFM. The fidelity of the ANN model is satisfactory, although significantly poorer than that obtained with the POD approach.
Abbreviations ANN, artificial neural network; ANR, ammonia NOx ratio; ASC, ammonia surface coverage; BVP, boundary value problem; CSTR, continuously stirred tank reactor; DPF, diesel particulate filter; EATS, exhaust after-treatment system; ECU, engine control unit; HFM, high fidelity model; MOR, model order reduction; PCA, principal component analysis; POD, proper orthogonal decomposition; RMSE, root mean square error; RTF, real time factor; SCR, selective catalytic reduction; SVD, singular value decomposition; WHTC, world harmonised transient cycle
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/).