Flow-based dissimilarity measures for reservoir models: a spatial-temporal tensor approach

In reservoir engineering, it is attractive to characterize the difference between reservoir models in metrics that relate to the economic performance of the reservoir as well as to the underlying geological structure. In this paper, we develop a dissimilarity measure that is based on reservoir flow patterns under a particular operational strategy. To this end, a spatial-temporal tensor representation of the reservoir flow patterns is used, while retaining the spatial structure of the flow variables. This allows reduced-order tensor representations of the dominating patterns and simple computation of a flow-induced dissimilarity measure between models. The developed tensor techniques are applied to cluster model realizations in an ensemble, based on similarity of flow characteristics.


Introduction
The increasing demand for energy has encouraged the use of improved production strategies for conventional oil and gas resources. In this context, several studies have indicated a significant scope for reservoir model-based life-cycle optimization of ultimate recovery or net present value (NPV), especially when combined with computer-assisted history matching leading to a closed-loop reservoir management (CLRM) approach (see, e.g., [18,29] or [10]). In this CLRM approach, the assessment of the value of information of different resources becomes an important feature (see, e.g., [6]). To properly account for the effect of geological uncertainty, it is important to perform both the history matching, life-cycle optimization and value of information assessment on the basis of several realizations of the reservoir model. The combination of iterative (large-scale) optimization, and history matching, with the need to use multiple model realizations makes CLRM into a computationally very demanding process for realistically-sized reservoir models. Particularly, for the CLRM framework, one would like to discriminate between realizations which are representatives of the different types of flow responses. For this reason, oil companies have used very few realizations which are often selected manually, to achieve robustness in their operational strategies (see, e.g., [28]). The selection of representative models has become a relevant issue for the practice of reservoir engineering. In other words, there is a need for a dissimilarity measure between reservoir realizations that is relevant for model-based operation of oil reservoirs.
There are several options for discriminating between model realizations, on the basis of either static or dynamic properties of the reservoir models. [35] and [8] have used the permeability fields as a measure of dissimilarity. This has been done by defining a metric space to compare and cluster geological models that share common geological features. These dissimilarity measures based on static permeability or porosity properties are known to be quite different from measures applied to the dynamic behavior of the reservoir models, reflected in the corresponding flow patterns, as, e.g., the evolution of oil saturation over the life cycle of the reservoir. To exemplify this, upscaling of highresolution geological models, as is presented in [13], shows that dissimilar reservoir models may have similar dynamical performance in terms of flow dynamics. At the same time, reservoir models that are close in geological properties can have essentially different flow patterns, and therefore different behavior from a dynamic operation point of view.
From a production optimization perspective, the use of NPV, generated at the end of a certain production period, could be a natural basis for a control-relevant dissimilarity measure between model realizations. If we would restrict attention to life cycle production optimization under predefined production strategy and well configurations, this might be true. However, when reservoir models are to be used also for testing new production strategies, as well as well-placement, infill-drilling and re-completion plans, the NPV measure is considered to be too coarse to distinguish between essential dynamic properties of the models. It is well known that the NPV is not able to capture the relevant aspects of the reservoir flow patterns associated with a particular production strategy, in other words: two essentially different geological models could lead to the same NPV under similar production strategies, but on the basis of essentially different flow patterns.
In [30] and in [31], the total oil production and water rates are used as dissimilarity measures to assess the dynamical responses of different reservoir realizations. Although these measures are less coarse than the NPV, oil and water production rates (combined with pressure measurements) only provide local information of the flow behavior around the wellbore, which cannot be extrapolated to characterize the flow performance of spatial locations far from the production sites. Therefore, it suffers from similar limitations as measures based on NPV.
While streamline simulators have been used to generate a fast characterization of the cumulative production rates (see, e.g., [27,32], and [30]), they have also led to a technique called dynamic fingerprinting ( [41]), where streamline information (time-of-flight (TOF) or drainage time) is used to generate flow patterns that are, like a fingerprint, unique to each realization. Then, fingerprints are used to screen and cluster reservoir realizations with similar dynamical performance. The resulting dissimilarity measures are attractive for flow characterization, though they are merely simplified descriptors of the much more complex spatial-temporal reservoir flow patterns in terms of evolution of the flow variables (phase saturations, pressures, etc.) for a particular production strategy.
In this paper, we address the question whether we can use the full reservoir flow patterns as the dissimilarity measure between reservoir realizations. Reservoir flow patterns are numerical solutions of the pressure and transport partial differential equations (PDEs) ( [4]), and they represent the temporal evolution of the dependent variables in the spatial domain (typically 10 5 ∼ 10 6 grid blocks) of the reservoir. Therefore, the discrete-time trajectories for pressure, saturation, temperature, etc., are usually large-scale data structures, with dimensions induced by the number of grid cells of the reservoir model. The large dimensionality of these structures would make them unsuitable to serve as a dissimilarity measure for performing model discrimination and clustering, and therefore reduced-order representations are necessary. Previously, [9,26] and [23] have used the singular value decomposition (SVD) and POD model order reduction techniques to arrive at low dimensional representations of flow variables, while [41] have used SVD to represent the fingerprints through a reduced set of basis functions. However, the SVD approach has some limitations. As the reservoir flow patterns are stacked in vectors, the natural spatial-temporal structure of the reservoir is lost. This may have serious implications when characterizing flow profiles in low-dimensional spaces, as some information related to the spatial correlations is lost during the vectorization scheme, see [16].
In this paper, we develop a tensor approach for efficient storage of reservoir flow patterns in a multidimensional array. This creates a clear separation of the spatial, temporal, and flow variables coordinates, and allows for reduced-order representations using basis functions in each of the separate coordinates, thereby appropriately maintaining spatial correlation structures. With an additional extension of the tensor coordinates, it will even allow for describing the flow characteristics of an ensemble of models. Tensor decompositions and tensor analysis constitute a largely unexplored subject in reservoir engineering. For the characterization of geological parameters, [1] and [2] have performed low rank approximations of permeability fields using a tensor decomposition, and [14] have used these representations for efficient history matching. For the reduction of dynamical complexity of reservoir models, [16] have utilized this framework for constructing reduced-order dynamic models. In our current paper, we apply state-of-the-art tensor decomposition techniques to characterize flow profiles in low-dimensional spaces. The corresponding reduced-order representations will be analyzed for their suitability to calculate distance measures between models, and for subsequent distance visualization and model clustering.
The paper is organized as follows: In Section 2, we introduce the notions of flow-based dissimilarity measures and the state-of-the-art technology for flow characterization. In Section 3, we present the benefits of exploiting the spatial structure and correlations of the reservoirs by using a tensor formulation and we introduce our spatial-temporal approach for the flow characterization through dissimilarity measures. In Section 4, we present a tensor-based workflow for the flow characterization of an ensemble of reservoir models. In Section 5, we evaluate the performance of the workflow for flow characterization using flow-based dissimilarity measures.

Introduction
In water-flooding, the temporal evolution of the oil saturation (and in particular the oil-water front) provides sensible information for well placement and for the design of schedules for the well controls in order to optimize production. Hence, the reservoir flow patterns are the variables with physical interpretation that best describe the dynamic properties of the hydrocarbon reservoir, and we can conceptually state that two reservoir realizations are similar with respect to their dynamical performance if for a particular operational strategy, the generated reservoir flow profiles are similar.
The variable s(x, t, u) will be used in this paper to represent the flow-related variable, with a spatial coordinate x ∈ R 2 , time t ∈ R, and operational strategy u. In most cases, s will correspond to the oil saturation in each (spatial) grid block, although other variables (e.g., pressure, time-offlight, drainage time) could be included too. They are the solutions of the underlying model's multiphase flow equations through their corresponding PDE's, and the result of a particularly chosen operational strategy of water injection and control valve settings, reflected by the variable u, [17]. In this section, we elaborate on the concept of model distances based on reservoir flow patterns.

Dissimilarity measures and distance functions
When quantifying flow-based dissimilarities between reservoir models, one should consider the use of distance functions. A distance function defines the separation between two elements in a set (the set of reservoir flow responses) and it induces a metric space, where the distance between two different reservoir models is an indicator of their dissimilarity in the dynamical response. There are many functions to compute the distance between two objects: the Euclidean distance, standardized Euclidean, Chebyshev distances, and many more. [34] have used the Hausdorff distance to measure the dissimilarity of geometry for reservoir realizations, and [27] have used connectivity distances based on streamlines. If s 1 and s 2 are the flow-related variables corresponding to two different models, a natural dissimilarity measure to consider is a quadratic distance measure: where K is the total number of time steps; I, J are the number of grid cells in each spatial dimension, and the two models are operated with the same operational strategy u(t k ). For brevity of notation, we will often discard the dependency of s(x, t k , u) on u and simplify the notation to s(x, t k ) whenever there is no risk of confusion. The underlying spatial domain is assumed to be rectangular with Cartesian grid. The temporal evolution of the flow variables s(x, t k ) over all grid cells is a collection of high-dimensional state variables and generally requires the use of huge computational resources for storage, function evaluations, the evaluation of distances as in (1) and its subsequent use for visualization and model clustering. In the next section, a method for the low-dimensional representation of s(x, t k ) is described.

Low dimensional representations and flow-based distances through SVD
Compact representations of s(x, t k ) are important for an efficient and fast numerical calculation of distance measures, see [41]. A typical way to construct lower dimensional representations of s(x, t k ) is obtained by utilizing a basis function expansion for the set of flow variables over all grid cells: where the basis functions ϕ i (x), for i = 1, · · · ,R can be selected to be the most informative spatial patterns in the flow response. IfR N, where N = I · J is the number of grid cells, we say that the reservoir flow pattern s(x, t k ) is characterized in a low-dimensional space by the coefficients σ i (t k ) and by the basis functions ϕ i (x), for i = 1, · · · ,R. The classical technique for obtaining this representation is through principle component analysis (PCA) and the use of singular value decompositions (SVD), [15]. To this end, the dynamic variables in the grid are represented as (I · J ) × 1 vectors, denoted as x k , with elements s(x ij , t k ) at a particular time moment t k . A number K of these snapshot vectors x k is collected at time instants t 1 , · · · , t K , where K may be less than or equal to the total number of simulation time steps. With N = I · J , this results in a N × K matrix of data points which is decomposed using SVD through: where and are N × N and K × K orthogonal matrices containing the left and right singular (column) vectors ϕ r and ψ r , is an N × K rectangular diagonal matrix that has the ordered singular values σ 1 ≥ σ 2 ≥ · · · ≥ σ R ≥ 0 on its main diagonal, R is the rank of X, and ⊗ denotes the tensor or outer product over a vector space. Usually, R ≤ K N in a typical reservoir simulation application. The last equality in (4) indicates that X can be decomposed as the sum of R rank-one matrices r = ϕ r ⊗ψ r . In particular, every individual snapshot vector x k can be written as: where e k is the kth standard unit vector in R K and α k r := σ r ψ r e k is a real-valued coefficient. The R left singular vectors ϕ r , r = 1, . . . ,R then characterize the spatial correlations of the original snapshot matrix X, ordered in decreasing relevance, and allows a low-dimensional approximationx k = R r=1 α k r ϕ r , withR < R, of the reservoir flow patterns and fingerprints in terms of the coefficients α k r . Using (5), the Euclidean distance between the ith and j th snapshots x i , x j is: Now, let us consider the low-rank approximation X = x 1x2 · · ·x K of X, which can be obtained by decomposing (4) as whereR < R is the approximation order and where X = X − X is the approximation error. For anyR < R, the Frobenius norm of the error is minimal over all rankR approximations of X. The approximate representations {x k } k=1,···K of the reservoir flow pattern can now be used as a basis for measuring dissimilarities between models, where the appropriate calculations can be performed on the basis of the coefficients α k r for r = 1, · · ·R and k = 1, · · · K. Let us consider the low-dimensional characterization of the flow patterns in terms of the coefficients α i r , α j r for r = 1, . . . ,R, then the approximated dissimilarity is: where d ij is the (i, j )th element of a matrix D ∈ R K×K of all approximate distances.

Discussion
The SVD-based approach presented in Section 2 has been adopted in industrial practice, [41], but it has some limitations. Through the vectorized form in which flow variables are stored, the spatial-temporal structure of the reservoir is lost. This may have serious implications when characterizing flow profiles in low-dimensional spaces. When SVD is applied to a snapshot matrix X, the sets of orthonormal basis vectors {ϕ r }R r=1 , {ψ r }R r=1 , average the energy of solutions in time, and by definition, do not discriminate among spatial coordinates. This temporal averaging of the energy causes a loss of information for some of the relevant features of spatial coordination in the data, see [16]. For linear systems like single-phase flow problems, the spatial correlations are invariant in time and can be characterized analytically using concepts from system theory such as controllability and observability, see [36]. However, the nonlinearities induced by the multi-phase character of the problems may define time-variant correlations of the states, and the correlation between time and specific spatial direction is ignored by vectorizing the flow variables, in which case it can be attractive to clearly separate all spatial and temporal coordinates to maintain their own independent role when constructing approximations. In the next section, a methodology that overcomes the limitations of SVD methods for flow characterization is presented.
3 Spatial-temporal tensor methods for flow-based dissimilarity measures

Introduction
In this section, we develop a multidimensional approach to understand the dynamical similarities between reservoir models, which is based on tensor representations and decompositions of flow related variables. In addition, we incorporate this approach into a workflow for clustering of models with similar dynamical performance.

Tensor representations of reservoir flow patterns
In the previous section, we have constructed vectorized representations of the flow variables (I ·J ×1 snapshot vectors). Alternatively, if the spatial grid has a Cartesian structure, one can collect K snapshot matrices X k of size I × J , and represent this data object in a three-dimensional array S of size I ×J ×K. That is, the reservoir flow data is represented as a multi-array S ∈ R I ×J ×K . Such a multidimensional array is called a tensor and can be viewed as the natural generalization of vectors and matrices to higher dimensional objects. For a 2D saturation field that evolves over time, a three-dimensional array is schematically depicted in Fig. 1.
A key advantage of multidimensional data objects is that they keep the spatial structure of the Cartesian grid intact. A disadvantage of the use of tensors is that their algebraic properties are more complicated and that numerical tools for tensor operations are less developed. In general, tensors are Fig. 1 Schematic of the tensor representation of a reservoir flow pattern. Axes represent spatial-temporal coordinates. Color-scale corresponds to oil saturation multilinear generalizations of algebraic objects such as vectors and matrices, and there exist suitable extensions of concepts such as decompositions, basis functions and spectral expansions to the multilinear case. In the next subsection, we describe the basic concept of tensor decompositions as an extension to the concept of matrix decomposition.

Tensor decompositions and approximation
In analogy to the matrix decomposition in Eq. 4, the tensor S can be decomposed as a Tucker type decomposition ( [22]): where the scalars σ ij k ∈ R are the elements of the so called core tensor of size I × J × K and where is the outer product of vectors ϕ i ∈ R I , ψ j ∈ R J and χ k ∈ R K . This makes ij k a rank-one three-way tensor. In any such representation, the sets are usually taken as a basis of R I , R J , and R K , respectively, and the Tucker decomposition (9) is viewed as a representation of the tensor with respect to these bases. Similar to the matrix case, tensors can be interpreted as multilinear mappings. More precisely, the rank-one three-way tensor ij k is a mapping ij k : which is a product of inner products in R I , R J and R K . At this stage, it is important to observe that the mapping ij k , defined in this way, is linear in each of its argument. Moreover, if the bases for any triple of indices In words, this says that the entries of the core tensor represent the tensor S when evaluated at its (orthonormal) basis vectors. If the bases are orthonormal, then this observation naturally identifies the entries σ i 0 j 0 k 0 of the core tensor with the evaluation of the tensor S at its (i 0 , j 0 , k 0 )th basis element. If the bases are non-orthonormal bases, then the multilinear functional S : R I ×R J ×R K → R defined in (10) changes its representation. A graphical illustration of the tensor decomposition (9) is depicted in Fig. 2. A low-rank approximation of S can be obtained by decomposing (9) according to S = S + S where, for is viewed as the approximation of S to its modal-rank ( I , J , K) truncation and where S := S − S is viewed as the corresponding approximation error. The size of the approximation error is measured in Frobenius norm and satisfies Suppose that the above tensor decomposition is applied to the data corresponding to a two-dimensional rectangular saturation field that evolves over time. The mth sample X m is then represented as a matrix of dimension I ×J . A number of K samples X m is stored in an order-3 tensor S of size I ×J ×K. This tensor is approximated as in (11), and results in the approximate sample X m of the saturation field defined by the order-2 tensor where e m is the mth standard unit vector in R K and where α m ij := K k=1 σ ij k χ k , e m are real-valued coefficients in the expansion (13) of (rank 1) two-dimensional fingerprints of the saturation field. The coefficient α m ij is a linear combination of the mth element of the basis functions in the set This generalizes (5) to the two-dimensional case and avoids to vectorize the data structures X m .

Algorithms for tensor decompositions
Clearly, the approximation accuracy of (11) and (13) depend on the choice of basis vectors ϕ i , ψ j , and χ k , their ordering and the elements in the core tensor. There exist many algorithms to select these bases in such a way that the approximation error (12) is small or minimized. The problem of finding these sets can be formulated as the optimization problem which is to be solved subject to the constraint that the basis This problem has an analytic solution only for the case where (Î ,Ĵ ,K) = (1, 1, 1). For all other cases one has to resort to numerical approximations. Several algorithms have been proposed to compute tensor decompositions using orthonormal basis functions. The High Order SVD (HOSVD) proposed by [11] was the first extension of the classical SVD to the spatial-temporal case and the methodology is based on an unfolding procedure of tensors. The high order orthogonal iteration (HOOI) by [12], the Tensor SVD proposed by [40], maximum singular value modal rank (MSVM), and the single directional modal-rank decomposition (SDM) by [33] compute singular values (elements of the core tensor) and basis vectors in a sequential way, where the singular values and vectors depend on a search direction at every decomposition level ( I , J , K). The tensor SVD, MSVM, and SDM algorithms keep the tensor structure intact in such a decomposition procedure. In this paper, we consider the Tucker modal-rank type of decomposition, Fig. 2 Schematic description for the truncation of a Tucker decomposition of a 3D tensor see [22], which achieves orthonormal sets and {χ k } K k=1 . There are several tensor toolboxes available for the Matlab platform, like the Matlab Tensor toolbox, see [5] and the Tensorlab, see [38]. In this work, we have used an HOSVD implementation using tensor operations from [5].

Tensor approximation of reservoir flow patterns
Let us exemplify the concept of signal approximation and compression of reservoir flow patterns through tensor decompositions. In the framework of multidimensional (tensor) approximations, the sets of orthonormal basis func- represent the most relevant spatial correlations independently for each spatial coordinate. The coordinate independence can be exploited to approximate flow patterns which have a richer variability in a certain coordinate, as it would be the case for flow patterns in channelized reservoirs.
We consider a 2 facies, 2D oil reservoir with a square geometry of length L = 3000 m, one layer of 10 m thick. The numerical model of one realization has 3600 grid blocks of size 50 m × 50 m. A description of the physical parameters, wells configuration, and a link to the data files can be found in [39]. The sequential solvers of MRST, see [24], have been used to solve the pressure and saturation equations and the production has been simulated for a period of 15 years, time step of 5 days. There are four water injectors, and each of them injects at a rate of 600 m 3 /day, and the producers operate at 150 bar. We collect K = 1095 time steps for the temporal evolution of the oil saturation. Then, we construct a 3D tensor S of size 60 × 60 × 1095, where the x, y, and temporal dimensions correspond to the first, second, and third tensor coordinate accordingly. Hence, the tensor S can be decomposed as in Eq. (9): The reservoir flow pattern in tensor S is described by I = 60 basis functions of size 60 × 1 in the x coordinate, J = 600 basis functions of size 60 × 1 for the y coordinate, and K = 1095 basis functions of size 1095 × 1 for the temporal dimension. We compute low-rank approximations S of the original flow pattern by truncating the sums in Eq. (15). The purpose of this example is to study the effect of decreasing the number of spatial basis functions for the approximation, and therefore the number of temporal basis functions are fixed to {χ k } K=5 k=1 . For the first approximation, we select for the x coordinate and basis {ψ j } J =20 j =1 for the y coordinate, leading to a modal rank approximation of (20,20,5). For the second approximation, we select basis for x and {ψ j } J =10 j =1 basis for y, leading to a modal rank approximation of (10, 10,5). Time snapshots of the reservoir simulation and the approximations are depicted in Fig. 3. From Fig. 3, it is clear that decreasing the number of spatial basis functions would affect the quality of the approximations. This can be quantified by using (12) to derive the relative proximity ν of the approximations S with respect to the original tensor S : Here, we consider modal rank approximations of the type (r, r, 5), for r = 1, · · · , 60, and the relative proximity as a function of r is presented in Fig. 4. For the flow patterns depicted in Fig. 3, it is observed that the approximation (10, 10, 5) preserves almost 85% of the features of S , while the approximation (20, 20, 5) preserves more than 90%.
When fixing K = 5, only 0.46% of the total number of basis function for the temporal domain is used, while achieving a maximum proximity of 94%. For this example, that fact indicates that the temporal dynamics can be explained with a very small amount of the information contained in S . The amount of information required to construct an approximation can be quantified by summing the size in memory of the constitutive elements in (11): , and the corresponding core tensor . Similarly, we consider modal rank approximations Fig. 3 Oil-water front with tensor approximations. Approximation 1 has modal rank (20,20,5). Approximation 2 has modal rank (10,10,5). Colors represent oil saturation Fig. 4 Blue-Left axis: Relative proximity ν(r) . Green-Right axis: Size in memory of the modal rank approximation (r, r, 5) as function of r of the type (r, r, 5), for r = 1, · · · , 60, and the information is presented in Fig. 4. The size in memory of the tensor S is 30.08 MB, while the approximations (20,20,5) and (10, 10, 5) of Fig. 3 have a size of 79.38 and 57.78 KB respectively. These findings indicate that the reservoir flow pattern in S can be approximated using only 0.25% of its original information, while achieving relative proximities higher than 90%. This experiment suggests that more than 99% of the information contained in S is redundant for the purpose of flow characterization.

4D Tensors: an approach for handling multiple realizations
In the multilinear framework, it is possible to define an additional coordinate, where an index that links the dynamical behavior (the reservoir flow pattern) to its corresponding model is assigned to every realization. In this subsection, we restrict our attention to 2D cartesian grids, without losing generality for 3D geometries. For the case where we have an ensemble of R realizations and their corresponding flow patterns, the full data set is described by two spatial coordinates, the temporal coordinate and a coordinate for the realizations, i.e., a 4D tensor S of size I × J × K × R, with I, J the dimension of the spatial coordinates x and y, K the dimension of the temporal coordinate, i.e., the number of time steps, and R the number of reservoir models, which constitutes the size of the ensemble to be characterized. A schematic representation of such a data structure is depicted in Fig. 5. In analogy to the Eq. (9), the 4D tensor S has a Tucker decomposition of the form: where the orthonormal basis vectors {ϕ i | 1 ≤ i ≤ I }, {ψ j | 1 ≤ j ≤ J } span the spatial coordinates, {ω k | 1 ≤ k ≤ K} spans the temporal space and {χ k | 1 ≤ k ≤ R} spans the model space. The reservoir flow patterns of the mth realization X m is then represented as a tensor of dimension I × J × K which is approximated similar as in (13), and where e m is the mth standard unit vector in R R and where α m ij k := R r=1 σ ij kr χ r , e m is a real-valued coefficient in the expansion (17).
This expansion shows explicitly the way how the information is distributed in the decomposition. Clearly, the tensor ϕ i ⊗ ψ j ⊗ ω k contains the spatial-temporal correlations that are shared by all the set of reservoir flow patterns in the ensemble. What makes a reservoir flow pattern X m distinct from others is the selection of the mth element of the basis functions for the model coordinate χ r , e m and subsequently the coefficients α m ij k . As it was indicated previously for the 3D case, the coefficient α m ij k is a linear combination of the mth element of all the basis functions in the set {χ r } R r=1 , i.e., α m ij k = R r=1 σ ij kr χ (m) r , and the information that characterizes the dynamical properties of the realizations are embedded into the elements of core tensor σ ij kr and the set of basis functions for the model coordinate {χ r } R r=1 . This analysis creates the foundations for the definition of low-dimensional representations of the flow profiles in the next subsection.

Flow-based dissimilarity measures in low-dimensional tensor representations
In order to be able to calculate a dissimilarity measure between two models on the basis of low-dimensional representations, we require the tensor representation of the flow patterns for both models to be expanded with the same basis functions, as in Eq. (17). Therefore, we construct the 4D tensor representation described in Section 3.6, and we introduce a metric space by defining a distance function between two reservoir flow patterns X p and X q of the realizations p, q as: σ ij kr χ r , e p − χ r , e q ϕ i ⊗ ψ j ⊗ ω k σ ij kr χ r , e p −e q ϕ i ⊗ ψ j ⊗ ω k F , (18) where e p , e q are the pth and qth standard unit vectors in R R . Let us define the real-valued coefficient δ ij k = R r=1 σ ij kr χ r , e p − e q , which is an element of a tensor D of dimensions I × J × K. Hence, (18) can be written as where , and are column matrices composed by the basis functions Due to the orthonormality of the columns of , , and , we obtain: The expression in (20) suggests that the dissimilarity between two reservoir realizations can be approximated by truncation, which corresponds to the expression In addition, the scalar δ ij k can be expressed in terms of the coefficients α m ij k in (17), δ ij k = α p ij k − α q ij k , and therefore the tensor-based approximation of the flow-based distance between the realizations p and q is defined as If one stores the set of coefficients α m ij k as elements of a tensor A m of dimension I × J × K, then (22) where d pq is the pqth element of a distance matrix D. The approximation error of computing the dissimilarity measure using the approximations in (17) is bounded (see [11]).
The approximation of the distance in (22) can be computed based on the tensors A p and A q , and they can be seen as compact representations of the pth and qth reservoir flow patterns. The set of tensors A = {A 1 , A 2 , · · · , A R } is composed by low-dimensional representations of the reservoir flow patterns for an ensemble of R realizations, and they are used for further steps in flow characterization such as distance visualization and model clustering.

Discussion
With the spatial-temporal methodology, it is possible to identify the tensor coordinates with richer information content, and it introduces flexibility and extra accuracy when representing flow patterns in low-dimensional spaces and calculating dissimilarity measures. Flow-based dissimilarity measures allow the classification of the different types of flow behavior in an ensemble, and the application of the methods described in this section are useful for model clustering, where the computational complexity limits the classification of full reservoir flow patterns. In the next sections, we apply the concept of flow-based dissimilarity measures in low-dimensional spaces to the flow classification of reservoir models using the tensor approach.

Introduction
In reservoir engineering, multiple realizations are used to account for the uncertainty of the rock properties of subsurface, and the industrial practice indicates that despite the fact that realizations look different from the geological perspective, some of them may have similar dynamical performance. In flow classification, we aim to find sets of realizations that share a similar dynamical performance with respect to the spatial-temporal evolution of their corresponding reservoir flow patterns. For that, it is required to compute dissimilarity measures between related flow patterns, a method for visualizing these dissimilarities and a clustering technique to group the models with similar dynamical properties. When using a flow-based dissimilarity measure for model clustering, these steps are constrained by the dimensionality of the data set to be analyzed, and the representation of the reservoir flow patterns in lowdimensional spaces are used for the efficient classification of multiple reservoir realizations. The flow-based approach for dissimilarity measures was introduced in Section 3. Here, we provide the theoretical foundation for the workflow developed in this paper. In Section 4.2, we describe a tensor-based clustering algorithm, and in Section 4.3, we describe the method for visualizing distances based on the distance matrix D.

k-means tensor clustering
When analyzing data sets, analysts aim to extract patterns, object classification, and data ordering. Thereby, k-means clustering finds groups of data which are similar to one another, partitioning a set of objects into clusters. Let us consider the data objects described in Section 3.7, where the tensor data set A = {A 1 , A 2 , · · · , A R } is composed by the low-dimensional representations of the flow patterns. In this section, we aim for a partition of the data set A into a set of K c clusters C = {c 1 , c 2 , · · · , c K c } with corresponding centroid μ k of dimension I × J × K, the same size of the elements in the set A, where c k ⊂ A for k = 1, · · · K c , such that the variance within each cluster is minimized, see [20]. This operation can be formulated as: where N k is the number of elements (size) of the cluster c k and j ∈c k A j indicates the element-wise sum of the tensors A j which have been assigned to the cluster c k . The k-means algorithm has NP-hard complexity ( [3]), which can be relaxed using heuristic algorithms like the Lloyd's algorithm ( [25]). The algorithm has two basic steps: (1) The assignment of every tensor object in A to the closest cluster centroid, and (2) the re-computation of the centroids using the current cluster membership: -Initialize cluster centroids μ 1 , μ 2 , · · · , μ K c randomly.
3. Centroids update: Compute the average of the cluster elements.
The selection of the K c is a user choice; however, there are more systematic methods to determine the initial guess for the number of clusters, see, e.g., [21]. When working with large-scale data sets, it is required to account for the scalability and the computational complexity of the algorithms for data analysis. Lloyd's algorithm has linear computational complexity O(t · K c · R · n), where t is the number of iterations needed to converge and n is the size of the objects to be clustered respectively. From the complexity point of view, the application of the k-means tensor clustering algorithm is constrained by n, i.e., the dimensionality of the objects to be clustered. Particularly, the fact that the reservoir flow patterns are large-scale data structures (n ∼ 10 6 ) poses a challenge for the state-of-the-art clustering algorithms, and limits the use of k-means for flow-based classification.
For the tensor-based clustering approach, the computational effort will be dominated by the tensor decomposition with computational complexity O(max{I 3 , J 3 , K 3 , R 3 }), see, e.g., [11,15]. A direct clustering on the basis of the full reservoir flow patterns would require a computational effort of order O(t ·K c ·R ·I ·J ·K) which will generally be two to three orders of magnitude higher than the former approach, and it is exemplified in the next section.

Visualization of dissimilarities
For the visualization of dissimilarities between a set of reservoir flow patterns, we determine their coordinates in a metric space using multidimensional scaling (MDS). For a detailed description, we refer to [7]. For the application of MDS in uncertainty quantification, we refer to [32,35] and [8]. MDS uses the SVD to determine a low-order set of dimensionless directions in which the relative distances between the objects can be efficiently represented. In particular when considering just two or three of the most relevant directions, it is possible to represent the distances between the objects graphically.

A workflow for model clustering using flow measures
In this subsection, we use the multilinear algebra methods described in this paper to find clusters of models with similar dynamical properties. The developed methodology uses the concept of flow-based dissimilarity measures, computed in low-dimensional spaces to determine the dynamical similarities between reservoir models, by exploiting the tensor structure of the reservoir flow patterns. The purpose of this workflow is to estimate the closeness between two or multiple realizations with respect to a performance indicator relevant to the CLRM framework. The inputs for the workflow are: -R: The number of realizations. -X i : Time snapshots of the reservoir flow patterns (i = 1, · · · , R). -K c : The number of clusters. -A predefined production strategy u(t).
The procedure is described as follows: 1. Reservoir simulation: Simulate the flow patterns for the set of R realizations using u(t). 2. Tensor formulation: Store the reservoir flow patterns of all the realizations in a tensor S as described in Section 3.2.
3. Decomposition: Compute the tensor decomposition of S as in Eq. (16), using the algorithms described in Section 3.4. 4. Low-dimensional characterization: Construct the lowdimensional representation of the flow profiles A = {A 1 , A 2 , · · · , A R }, as described in Section 3.7. 5. Dissimilarity: Compute the distances described in Eq. (22). 6. Clustering: Group the data set A into clusters as described in Section 4.2. 7. Visualization: Construct an MDS map to visualize clusters as described in Section 4.3.
The output of the workflow is the set of K c clusters C = {c 1 , c 2 , · · · , c K c }, which groups the types of dynamical responses of R reservoir realizations. This classification can be further used to create flow-relevant ensembles, where few reservoir models are selected to capture the most relevant dynamical responses of the original set of realizations.
It has to be noted that, if we consider the saturation-based dissimilarity measure, then the presented model clustering method requires a full simulation of all model realizations in the ensemble. In a general CLRM workflow, including robust optimization and/or value of information assessment, this burden will generally be outweighted by the advantages of performing optimization over a considerably reduced ensemble. In the current procedure, the clustering can be done once and "off-line." Nevertheless, if a full simulation of all reservoir realizations is considered unfeasible, the proposed method and techniques in this paper can still be applied e.g. on the basis of time-of-flight maps rather than saturation maps (see, e.g., [40]), thereby considerably reducing the simulation efforts.

Application case
In this section, the workflow for model clustering presented in Section 4 is applied to a set of channelized reservoirs, and we analyze the performance of the spatial-temporal approach using flow-based dissimilarity measures. Channelized reservoirs present a challenge for field development plans, because moderate changes in well configurations may lead to very high variations in the resulting reservoir flow patterns. Let us consider an ensemble of R = 100, 3D reservoir models with a geological structure consisting on a network of fossilized meandering channels of high permeability. The data set has been uploaded to the 4TU.Datacentrum repository and can be accessed by external users, see [19] for the physical parameters of the models. The reservoir size is 480 m × 480 m × 28 m with 7 geological layers, and it is composed by 25,200 grid blocks 8 m × 8 m × 4 m in size. We have used a rectangular-shaped geometry instead of the egg-shaped reservoir described originally in [19]. The well configuration is composed of eight injectors and four producers. A view of some geological realizations is depicted in Fig. 6.

Generation of the reservoir flow patterns
We have simulated the reservoir flow patterns of the R = 100 realizations, which correspond to the spatial-temporal evolution of the oil saturation. The sequential solvers of MRST, see [24], have been used to solve the pressure and saturation equations, and the production has been simulated for a period of 10 years with a time step of 30 days, i.e., K = 122 time steps. The water injection rates are fixed at 79.5 m 3 /day for all the injectors and the bottom-hole pressures are fixed at 395 bar for all the producers.

Low-dimensional tensor representation of the reservoir flow patterns
The data structure that contains the reservoir flow patterns for the ensemble can be stored in a 5D tensor of size I ×J ×Z ×K ×R with I = 60 the dimension of the x coordinate, J = 60 the dimension of the y coordinate, Z = 7 layers, K = 122 time steps and R = 100 realizations. The tensor S is decomposed similarly to the decomposition in (16), while augmenting a coordinate for geological layers. Hence, the reservoir flow patterns corresponding to the mth realization can be described as: where {ν z } Z z=1 are the set of basis functions for the layers coordinate and α m ij zk := R r=1 σ ij zkr χ r , e m . For the low-dimensional approximation of S , we truncate the number of the basis functions in every coordinate such that the approximation error described in (12) is relatively small. We have set the number of basis functions for the layers coordinate to beẐ = 2, and for the temporal coordinate to bê K = 2. From the expression in (17), it is inferred that the number of basis functions for the model coordinate does not affect the number of parameters α ij zk required to describe a flow pattern, and thus we selectR = 100 basis functions for the model coordinate. In order to choose an adequate number of spatial basis functions for the x and y coordinates, we perform an error analysis using the approximation error defined in (12).
In Fig. 8, the approximation error as a function of the number of basis functions used for the approximation S is presented. Using Fig. 8, we can perform a trade-off between accuracy and the amount of information required for the approximation. The truncation criterion is defined by the user, and we accept tensor approximations with relative errors lower than 20% with respect to the original tensor S . By choosing a truncationÎ = 10 andĴ = 10 we achieve a relative error of 17.6% with respect to the flow patterns from the full simulation. Figure 8 displays an almost symmetric shape, and shows that the approximation error tends to zero as we increase the number of basis functions in both coordinates.
Snapshots of the approximation for one of the realizations are depicted in Fig. 7. Despite that the oil/water front is not as sharp as in the reservoir simulation, the tensor approximation is able to capture the relevant flow patterns as time evolves. These approximations allow the characterization of the reservoir flow patterns in low-dimensional spaces. As was described in the previous section, the reservoir flow pattern of the mth realization X m (of size I × J × Z × K, i.e. 3.074 × 10 6 grid-block oil saturations) is characterized in a low-dimensional space by the tensor A m of size I × J × Z × K composed by the set of coefficients α ij zk , i.e., 400 coefficients. This low-dimensional characterization represents a reduction of 99.9% of the amount of information necessary for further classification analysis.

Model clustering and visualization
The low-dimensional tensor representation in (26) and the methods described in Section 4.4 are used for model clustering.
In the previous section, we have derived a set of tensors A = {A 1 , A 2 , · · · , A R } which are low-dimensional representations of the reservoir flow patterns for an ensemble of R realizations. Here, we use them for constructing model clusters with similar reservoir flow patterns.
The k-means clustering algorithm described in Section 4.2 has been used to classify the set A. The algorithm is provided with the number of clusters K c = 7,   Fig. 9. A predefined initial condition for all the centroids μ 0 = 0.1 · E , where E is a tensor of size 10 × 10 × 2 × 2 with all the elements equal to 1.
To visualize the clusters, the MDS map is constructed using a distance matrix D based on the approximated dissimilarity function defined in (22). In the MDS plot, every dot corresponds to a low-dimensional representation of the reservoir flow patterns for one realization, and the colors indicate clusters. The MDS map in Fig. 9 is a 3D projection of a higher-dimensional space, and the first three axes represent 72% of the total variability. The clusters depicted in the MDS plot are visually separated and the plot has a tetrahedron shape. Big clusters are found in the corners of the tetrahedron, indicating four predominant and different types of flow patterns. The distribution of NPV and total oil for the clusters can be visualized in the MDS plots of Fig. 10. From visual inspection, patterns of similar NPV, and total oil production can be detected for models which belong to the same cluster and are close in the flow patterns, and there is a positive correlation of the model distances with the target variables.
Typical flow patterns of these clusters are depicted in Figs. 11, 12, 13, and 14.
To describe the observations, let us analyze the type of reservoir flow patterns classified in some of the clusters. In Fig. 11, the flow patterns corresponding to the realizations 9 and 64 indicate a large connectivity between the injectors 2, 4, and 7 with the producers 2 and 3 resulting in flow patterns with an elongated shape in the y coordinate. In Fig. 12, the flow patterns of the realizations 3 and 38 present a large connectivity between injectors 1, 2, 3, and 4 with  Fig. 13, the flow patterns indicate a large connectivity between injectors 1, 2, and 3 with producer 1, while the flow patterns corresponding to the cluster in Fig. 14 do not exhibit such connectivity. The results depicted in Figs. 11, 12, 13, and 14 confirm the effectivity of the spatial-temporal workflow for model clustering using flow-based dissimilarity measures.
In this application case, the computational complexity of the tensor decomposition comes down to O(K 3 ) = O(122 3 ) ∼ O(2 · 10 6 ). Since the flow patterns have a reduced-order representation composed of n = 400 elements, the application of Lloyd's k-means clustering algorithm has a computational complexity of order ∼ O(0.0004 · 10 6 ). Alternatively, the brute force calculation of all pairwise distances between the original feature vectors, requires a number of computations of the order O( 1 2 R 2 · I · J · Z · K), which for the example case comes down to O( 1 2 100 2 · 60 · 60 · 7 · 122) ∼ O(1.5 · 10 10 ), without considering the clustering step. Applying Lloyd's clustering algorithm to the original features has a computational complexity of order O(it · K c · 1 2 R 2 · I · J · Z · K) = O(it · 7 · 5000 · 60 · 60 · 7 · 122) ∼ O(it · 10 11 ). From these figures, it is clear that the computational advantages of the tensor reduction step are substantial. This computational gain will greatly facilitate the implementation of clustering algorithms for problems of practical size.

NPV and oil production in the clusters
Previously, we have discussed the fact that models with similar outputs such as NPV or production rates might have very different reservoir flow patterns. In the context of model clustering with flow-based dissimilarity measures, it is expected that the models within a cluster share similar flow patterns. We anticipate that the NPVs and total oil productions are similar as well, based on the fact that models with similar flow patterns might have similar  r odd = r − 1 , r even = r + 1 1 = 4 m 3 /day 3 r odd = r + 1 , r even = r − 1 1 = 4 m 3 /day 4 r odd = r − 2 , r even = r + 2 2 = 10 m 3 /day 5 r odd = r + 2 , r even = r − 2 2 = 10 m 3 /day evolution of the oil saturation and pressure patterns, generating close water breakthrough times and rates at the production wells. For the application case considered in this section, we compute the undiscounted NPV as in [18], with r o = 55 USD/stb the oil price, r wi = r wp = 2 USD/stb the cost associated to water injection and production. The range of NPV for the ensemble is ω NP V := [104.82 × 10 6 , 119.27 × 10 6 ] USD, with a mean value of μ NP V = 112.62 × 10 6 USD and standard deviation σ NP V = 2.80 × 10 6 USD. The statistical properties (mean and standard deviation) of NPV and total oil production of the flow-based clusters are presented in Table 1. From the table, we conclude that for all seven clusters, the intra-cluster standard deviations of NPV (σ npv ) and total oil (σ oil ) are smaller than the standard deviations of these variables over the full ensemble. This is an evidence for the statement that the clustering has been done in a way that is relevant with respect to these two performance measures.

Input dependency of the reservoir flow patterns and dissimilarity measures
One of the possible limitations of the workflow is the nonlinear dependency of the reservoir flow patterns on the type of production strategy. In this section, we compare the results of clustering using flow-based dissimilarity measures for different production strategies. The strategies that will be considered in this section consist on a fixed bottom-hole pressure of 395bar at the producers, while the injection rates are perturbed by ±5 and ±12.5%, as specified in Table 2.
In general, it would be beneficial to have a small sensitivity of the workflow to variations in the control input. In order to assess this sensitivity, we have applied the workflow for flow characterization with the control strategies described in Table 2, and have generated K c = 7 clusters for each strategy. Here, we define the closeness of the clusters with deviated control to the clusters of the base case, as the number of shared models among the clusters. This cluster similarity can be quantified as the percentage of the elements shared with the cluster in the base case. Let c base r and c n r be the rth clusters for the base and the nth production strategy. We define the percentage of similarity of the cluster c n r with respect to the base case c base r as: where card(·) denotes the set cardinality, i.e., the number of elements of a set. In Fig. 15, this percentage of similarity is presented.
The results in Fig. 15 indicate that for small deviations (strategies 2,3), there is a good agreement of the clusters for deviated controls with the classification obtained by the base case, as most of the clusters match the base case with at least 68%, with the exception of cluster 7 for strategy 2. This is expected, as there are no large variations in the reservoir flow patterns for small variations in the controls. For large deviations in the control inputs (strategies 4, 5),

Fig. 15
Model clusters similarity with respect to clusters of the base case the similarity of the clusters with respect to the base cases decreases, however, there it is still a good match, and the generated clusters have a significant similarity with respect to the classification found in the base case.
The results indicate that the nonlinear dependency of the flow patterns on the control input is an inherent limitation of the workflow. However, this methodology is valid for small deviations around the production strategy.

Conclusions
Some relevant advantages have been identified for the proposed methodology of model classification using flowbased dissimilarity measures and tensor representations: Firstly, the spatial structure of the reservoir is preserved, which allows the extraction of spatial correlations from the reservoir flow patterns. Secondly, the spatial correlations are not averaged in time, which is particularly useful for the flow characterization of nonlinear reservoir systems, where the spatial correlations of the reservoir variables are timevariant. Thirdly, a tensor-based representation provides the user with enough flexibility for handling multidimensional reservoir flow patterns and for performing a directional approximation of the data, i.e., keeping the directions where the dynamics have a higher variability. As a consequence, the tensor approximations can represent patterns in the full simulation using only 0.1% of the original information.
Finally, a low-dimensional representation of the reservoir flow patterns allows the implementation of dissimilarity and clustering techniques for reservoir models. The presented clustering technique can be used to construct reduced-sized ensembles for instance for applying robust life cycle optimization [37], value of information assessment [6], or well placement optimization while the original uncertainty in the ensemble of reservoir models is captured by a reduced set of ensemble members, chosen after flow-relevant clustering of the realizations, and thereby leading to substantial computational benefits.