Local Misfit Approximation in Memetic Solving of Ill-Posed Inverse Problems
Abstract
The approximation of the objective function is a well-known method of speeding up the optimization process, especially if the objective evaluation is costly. This is the case for inverse parametric problems formulated as global optimization ones, in which we recover partial differential equation parameters by minimizing the misfit between its measured and simulated solutions. Typically, the approximation used to build the surrogate objective is rough but globally applicable in the whole admissible domain. The authors address a different task: detailed misfit approximation in the regions of low sensitivity (plateaus). The proposed complex method consists of independent \(C^0\) Lagrange approximation of the misfit and its gradient, based on the nodes obtained during the dedicated memetic process, and the subsequent projection of the obtained components (one or both) onto the space of B-splines. The resulting approximation is globally \(C^1\), which allows us to use fast gradient-based local optimization methods. Another goal attained in this way is the estimation of the shape of the plateau as an appropriate level set of the approximated objective. The proposed strategy can be applied to solving ill-conditioned real-world inverse problems, e.g., those appearing in oil deposit investigation. We show the results of preliminary tests of the method on two benchmarks featuring convex and non-convex U-shaped plateaus.
Keywords
Ill-posed global optimization problems, Objective approximation, Fitness insensitivity
1 Motivation and State-of-the-Art
The parametric inverse problems under consideration consist in recovering coefficient functions (inverse solutions) describing physical features of phenomena modeled by partial differential equations (PDE), from the measurements of the state, which correspond to the solutions of the related PDE (forward solutions). Such tasks are frequently formulated as global optimization problems, in which one minimizes the misfit between the measurements and simulated forward solution over the set of admissible coefficient representations (see, e.g., [1]).
The main difficulty in solving inverse problems is the usual ill-conditioning, which typically manifests as misfit multimodality and insensitivity with respect to some parameters, even over subsets of positive Lebesgue measure (plateaus). Such a type of ill-conditioning can be observed in the electric field intensity measurement inversion used in the search for hydrocarbon deposits [2, 3].

There are typically two ways of managing such ill-conditioning:

the elimination of excess solutions by misfit regularization [4], which may lead to replacing real minimizers by artificial ones imposed by the regularization term, or

finding all solutions, letting experts in the area select reasonable ones and reject artifacts.
If we select the second, more general way, the following tasks have to be carried out: separating the attraction basins of different plateaus in the misfit landscape and individually approximating the area of each plateau. The first task might be performed by means of multimodal genetic optimization methods [5] or by using the simple Clustered Genetic Search (CGS) [6, 7]. There are, however, no well-established methodologies for constructing a reasonable (i.e., time- and memory-efficient) plateau approximation. Some attempts in this direction are the methods of approximating central parts of misfit minimizer attraction basins (see [6, 7]).
This paper puts forward a new method of recognizing a plateau as a level set of the local misfit approximation covering the narrow subdomain in which the plateau is located. Input data for this approximation are delivered by a deme distinguished from a memetic strategy solving the inverse problem and specially tuned towards uniformly filling plateau regions [8, 9].
The approximation of the fitness function in evolutionary searches has been applied since the 1980s (see Grefenstette and Fitzpatrick 1985 [10]), whereas the general idea of objective approximation was known much earlier in optimization. The development of the misfit approximation methods was summarized and characterized in several survey papers (see, e.g., [11, 12, 13]). Typically, the approximation is used as a fitness surrogate (also called a metamodel or proxy) if the original one is costly to evaluate and/or contains a stochastic noise component, so that its evaluation requires multiple executions. Another goal is to obtain a sufficiently smooth surrogate function allowing the use of gradient-based methods and/or to avoid local minima and reduce the insensitivity. The most popular approximation methods applied are second-degree polynomials fitted by least squares, Kriging (typically with a constant trend), and neural perceptrons.
However, the plateau recognition task needs a much more accurate method, which can be applied locally, in a roughly restricted region of the admissible search domain. These circumstances led the authors to apply two approximation methods widely accepted in the Finite Element Method of solving PDEs: an \(H^1\)-regular one utilizing linear splines on Delaunay simplices and a \(C^\infty \) isoparametric one, defined on cuboid subdomains.
2 Definition of the Problem and Solving Strategy
Let us denote by \(\mathcal {S} \subset \mathcal {D}\) the set of solutions to (1). We will call the set \(\mathcal {P}_{\hat{\omega }} \subset \mathcal {S}\) the plateau associated with the minimizer \(\hat{\omega } \in \mathcal {S}\) if it is the largest nonempty set such that for each \(x \in \mathcal {P}_{\hat{\omega }}\) there exists an open, connected set \(A \subset S\) such that \(x, \hat{\omega } \in A \subset \mathcal {P}_{\hat{\omega }}\) (see [9, 14]). The above definition imposes that \(\hat{\omega } \in \mathcal {P}_{\hat{\omega }}\), moreover \(\mathrm {meas}(\mathcal {P}_{\hat{\omega }}) > 0\).
By a basin of attraction of plateau \(\mathcal {B}_{\mathcal {P}_{\hat{\omega }}} \subset \mathcal {D}\), we mean a single connected part of the largest level set of the objective f, such that it contains the plateau \(\mathcal {P}_{\hat{\omega }} \subset \mathcal {B}_{\mathcal {P}_{\hat{\omega }}}\) and it is contained in the plateau’s attractor, i.e., any strictly decreasing local optimization method starting from an arbitrary point in \(\mathcal {B}_{\mathcal {P}_{\hat{\omega }}}\) converges to some point of \(\mathcal {P}_{\hat{\omega }}\) (see [14] for details).
The problem of our interest is the following: given a subpopulation \(P_{\hat{\omega }} \subset \mathcal {B}_{\mathcal {P}_{\hat{\omega }}}\) covering the plateau \(\mathcal {P}_{\hat{\omega }}\) together with computed values of misfit \((f)^x\) (and possibly also values of its gradient \((Df)^x\)) for \(x \in P_{\hat{\omega }}\), find an approximation of \(\mathcal {P}_{\hat{\omega }}\).
The strategy we propose consists of constructing an approximation of the misfit in the vicinity of \(\mathcal {P}_{\hat{\omega }}\) and obtaining a representation of the plateau as its level set.
It is important to note that there is a class of inverse parametric problems of great engineering significance, in which the misfit function is continuously differentiable in the strong (i.e., Fréchet) sense and the misfit gradient can be numerically evaluated together with the misfit value, using, e.g., the goal-oriented version of the Finite Element Method. The additional computational cost is linear with respect to the number of degrees of freedom (see, e.g., [15, 16] and references therein).
3 Memetic Multi-deme Global Search
The approximation technique which is the main subject of this paper is intended as a component of a complex hybrid inverse solver called the Hierarchic Memetic Strategy (HMS) [17]. Currently, the latter is built upon a multi-deme evolutionary global search engine. Each deme executes its own evolutionary engine: in the current implementation it is the Simple Evolutionary Algorithm, i.e., a common type of evolutionary algorithm with floating-point encoding, Gaussian mutation, arithmetic crossover and fitness-proportional selection. The demes form a parent-child tree-like hierarchy where the accuracy of the performed search is determined by the tree level. The single population at the root level is the most explorative, so its search is the least accurate. The search accuracy increases while going towards the leaves, where the search is the most focused. The tree itself has a self-organization ability thanks to an operation performed by its demes, called sprouting. It works as follows: after a fixed number of evolutionary epochs a deme tries to start a child deme around the individual with the currently best misfit value. However, the sprouting is performed only if no other deme at the child level is already exploring the area around this individual.
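The sprouting rule described above can be sketched as follows. This is only an illustration, not the HMS implementation from [17]: the function names, the list-of-lists population layout, and in particular the proximity criterion (a child deme "explores the area" if its centroid lies within a hypothetical `radius` of the best individual) are assumptions made here.

```python
import math


def centroid(population):
    """Arithmetic mean of a non-empty list of equal-length points."""
    dim = len(population[0])
    return [sum(p[i] for p in population) / len(population) for i in range(dim)]


def should_sprout(best, child_demes, radius):
    """Return True if no child-level deme already explores the area around `best`.

    ASSUMPTION: "explores the area" is modelled as the deme centroid lying
    within `radius` of `best`; the paper does not fix this criterion here.
    """
    return all(math.dist(best, centroid(deme)) > radius for deme in child_demes)


def try_sprout(parent, misfit, child_demes, radius):
    """Append a one-point seed for a new child deme if sprouting is allowed."""
    best = min(parent, key=misfit)  # individual with the currently best misfit
    if should_sprout(best, child_demes, radius):
        # Seed only; a real deme would sample a population around `best`.
        child_demes.append([list(best)])
        return True
    return False
```

Calling `try_sprout` twice with the same parent population shows the intended behaviour: the first call creates a child deme, the second is blocked because that child already covers the neighbourhood of the best individual.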
The HMS stopping condition consists of a local component controlling the evolution in non-root demes and a global component estimating the maturity of the global search. The local stopping condition stops demes that show no noticeable progress. The global stopping condition stops the whole strategy if no new demes have been sprouted for a given number of epochs and all leaves have been stopped.
But HMS goes beyond the evolutionary paradigm. Namely, it contains a number of memetic extensions. One of them is the optional accuracy-boosting machinery of running local optimization methods in leaf demes (for details we refer the reader to [17]). Another utilized technique is the clustering of the population gathered from leaves. The aim of this mechanism is a preliminary identification of local minima attraction basins as well as plateaus in the misfit landscape. The clustering is supported by a postprocessor which merges clusters apparently occupying the same plateaus. The populations of integrated clusters are then subject to an additional evolution phase using a multiwinner selection operator [9]. Its aim is to provide better coverage of the clusters. The final populations then form the input for the plateau recovery stage utilizing the approximation method described in the sequel.
4 Misfit Approximation Strategies
Since the purpose of the misfit approximation is the reduction of the misfit computation cost, the approximation needs to be efficient to evaluate and, to a lesser degree, to construct. While our main intent is to utilize it to determine the plateau regions, it might also be beneficial to use the approximation as a surrogate objective for local gradient-based convex optimization methods. A desirable property of the approximation is then the global \(C^1\) class. Moreover, a continuously differentiable approximation can better preserve the geometrical properties of the graph of an actual continuously differentiable objective [15, 16].
We propose and compare two variants of the method: one using only values of the misfit function and the other also utilizing the gradient. In both cases the process of constructing the approximation consists of two stages: first we construct a non-smooth auxiliary approximation and then we approximate it in a B-spline basis.
4.1 Approximation Using Misfit Values
The simpler approach using only misfit values at the population points is to create a continuous misfit approximation \(\widetilde{f}_{\hat{\omega }} \in C^0\left( V_{\hat{\omega }}\right) \), which can be regularly extended to \(C^0\left( Q_{\hat{\omega }}\right) \), and project it onto the B-spline space \(\mathcal {V}_{\hat{\omega }} \subset C^1 \left( Q_{\hat{\omega }}\right) \) using the scalar product inherited from \(L^2\left( Q_{\hat{\omega }}\right) \), which results in the local, smooth misfit approximation \({\mathop {f}\limits ^{}}_{\hat{\omega }} \in C^1 \left( Q_{\hat{\omega }}\right) \).
To create the non-smooth approximation \(\widetilde{f}_{\hat{\omega }}\), the Delaunay triangulation of the point set \(P_{\hat{\omega }}\) is computed and piecewise linear Lagrange interpolation is used on each of the simplices thus obtained. The resulting approximation is \(C^\infty \) inside each simplex and \(C^0\) globally (see, e.g., [18]).
Computing the \(L^2\)-projection requires numerical integration of expressions involving the projected function, which renders projecting the misfit directly infeasible due to its prohibitive evaluation cost and thus necessitates using the auxiliary Lagrange interpolation.
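To make the interpolation step concrete, the following minimal sketch evaluates the linear Lagrange interpolant on a single two-dimensional simplex (a triangle) through barycentric coordinates. It assumes the Delaunay triangulation is already available from some library; the function names and the restriction to two dimensions are choices made here for illustration.

```python
def barycentric(tri, x):
    """Barycentric coordinates of the 2D point x with respect to triangle tri."""
    (x1, y1), (x2, y2), (x3, y3) = tri
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    l1 = ((y2 - y3) * (x[0] - x3) + (x3 - x2) * (x[1] - y3)) / det
    l2 = ((y3 - y1) * (x[0] - x3) + (x1 - x3) * (x[1] - y3)) / det
    return l1, l2, 1.0 - l1 - l2


def contains(tri, x, tol=1e-12):
    """True if x lies in the closed triangle (all barycentric coordinates >= 0)."""
    return all(l >= -tol for l in barycentric(tri, x))


def interpolate(tri, values, x):
    """Linear Lagrange interpolant of the vertex `values`, evaluated at x."""
    return sum(l * v for l, v in zip(barycentric(tri, x), values))


# Usage: misfit values known at the three vertices of one simplex.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
vertex_values = [1.0, 3.0, 4.0]
value_inside = interpolate(triangle, vertex_values, (0.25, 0.25))
```

Because neighbouring simplices share the vertex values on their common face, gluing such local interpolants together yields a function that is linear inside each simplex and continuous across faces.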
In the context of general Hilbert spaces, the projection of an element \(x \in H\) onto a subspace \(V \subset H\), i.e., the element \(x_0 \in V\) with minimal distance to x, is well known to be the unique \(x_0\) such that \(x - x_0\) is orthogonal to V [19, Theorem 5.24]. For a finite-dimensional V, finding such \(x_0\) requires solving a system of linear equations with the Gram matrix of the basis of V. Projecting \(\widetilde{f}_{\hat{\omega }}\) onto \(\mathcal {V}_{\hat{\omega }}\) thus involves solving a system of linear equations with the Gram matrix of the basis of \(\mathcal {V}_{\hat{\omega }}\), which can be done efficiently using the ADS algorithm thanks to the tensor product structure of the chosen basis [20].
This work uses a sequential version of the ADS solver. Parallel versions of the alternating direction solver are currently under development, targeting shared-memory Linux cluster nodes in the GALOIS environment [21] and distributed-memory Linux clusters [22]. The alternating direction solver has also been applied to the solution of a sequence of isogeometric \(L^2\)-projections resulting from explicit dynamics simulations [23, 24].
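Written out for a basis of \(\mathcal {V}_{\hat{\omega }}\), the orthogonality condition gives the linear system below. The symbols \(u\), \(B_i\), \(c_j\), \(M\) and the two-dimensional tensor-product factorization on a rectangle are notation assumed here for illustration; they do not appear in the original text.

```latex
% L^2-projection u of the auxiliary interpolant onto span{B_1, ..., B_n}:
\[
  u = \sum_{j=1}^{n} c_j B_j, \qquad
  \sum_{j=1}^{n} M_{ij}\, c_j
    = \bigl(\widetilde{f}_{\hat{\omega}}, B_i\bigr)_{L^2(Q_{\hat{\omega}})}, \qquad
  M_{ij} = \bigl(B_i, B_j\bigr)_{L^2(Q_{\hat{\omega}})}, \quad i = 1, \dots, n.
\]
% Tensor-product basis on a rectangle Q = I_x \times I_y (2D case, assumed):
\[
  B_{(k,l)}(x, y) = B^x_k(x)\, B^y_l(y)
  \;\Longrightarrow\;
  M = M^x \otimes M^y, \qquad
  M^{-1} = (M^x)^{-1} \otimes (M^y)^{-1}.
\]
```

The Kronecker factorization is what allows the multidimensional Gram system to be replaced by a sequence of one-dimensional solves.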
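The idea behind an alternating direction solve can be illustrated on a small dense example. The sketch below is not the ADS solver used in this work: it assumes a two-dimensional system whose matrix is the Kronecker product of two small matrices `a` and `b` (hypothetical names) and uses a plain Gaussian elimination as the one-dimensional solver, whereas a practical implementation would exploit the banded structure of B-spline Gram matrices.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small, dense)."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix copy
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def kron_solve(a, b, f):
    """Solve sum_{k,l} a[i][k] * b[j][l] * c[k][l] = f[i][j] for the matrix c."""
    nx, ny = len(a), len(b)
    # Direction 1: solve a y[:, j] = f[:, j] for every column j.
    cols = [solve(a, [f[i][j] for i in range(nx)]) for j in range(ny)]
    y = [[cols[j][i] for j in range(ny)] for i in range(nx)]
    # Direction 2: solve b c[i, :] = y[i, :] for every row i.
    return [solve(b, y[i]) for i in range(nx)]


# Usage with two small symmetric positive definite factors.
a = [[2.0, 1.0], [1.0, 3.0]]
b = [[4.0, 1.0], [1.0, 2.0]]
f = [[1.0, 2.0], [3.0, 4.0]]
coefficients = kron_solve(a, b, f)
```

Only systems of the size of the one-dimensional factors are ever solved, which is the source of the efficiency gain over assembling and factorizing the full Kronecker product.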
4.2 Approximation Using Misfit Values and Gradients
The second strategy is similar to the first, but it does not discard information about gradient values. In addition to \(\widetilde{f}_{\hat{\omega }}\), we construct piecewise linear approximations of the components of the gradient \(\widetilde{Df}_{\hat{\omega }} \in (C^0(V_{\hat{\omega }}))^N\) in the same manner. These approximations are not necessarily coherent in the sense that the distributional derivative \(D \widetilde{f}_{\hat{\omega }}\) coincides with \(\widetilde{Df}_{\hat{\omega }}\) almost everywhere in \(V_{\hat{\omega }}\), but we can nevertheless use \(\widetilde{f}_{\hat{\omega }}\) and \(\widetilde{Df}_{\hat{\omega }}\) as approximations of the misfit and its gradient to compute its \(H^1\)-projection onto \(\mathcal {V}_{\hat{\omega }}\).
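One natural way to read this \(H^1\)-projection, stated here as an assumption since the text does not spell out the formula, is as the minimizer of the combined value and gradient mismatch. The symbol \(u\) for the projection is introduced only for this illustration.

```latex
\[
  u = \operatorname*{arg\,min}_{v \in \mathcal{V}_{\hat{\omega}}}
  \Bigl(
    \bigl\| v - \widetilde{f}_{\hat{\omega}} \bigr\|^2_{L^2(Q_{\hat{\omega}})}
    + \bigl\| Dv - \widetilde{Df}_{\hat{\omega}} \bigr\|^2_{L^2(Q_{\hat{\omega}})^N}
  \Bigr)
\]
% Equivalent variational (normal-equation) form:
\[
  (u, v)_{L^2} + (Du, Dv)_{L^2}
  = \bigl(\widetilde{f}_{\hat{\omega}}, v\bigr)_{L^2}
  + \bigl(\widetilde{Df}_{\hat{\omega}}, Dv\bigr)_{L^2}
  \quad \text{for all } v \in \mathcal{V}_{\hat{\omega}}.
\]
```

Under this reading the system matrix is the Gram matrix of the B-spline basis in the \(H^1\) scalar product, and the right-hand side uses the two independently interpolated components even though they need not be coherent.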
5 Numerical Results
We have applied both aforementioned misfit approximation strategies to two benchmark problems: one relatively simple with a convex plateau, and one with a non-convex, U-shaped plateau. In both cases the approximations were built using the populations generated by HMS. We studied the quality of the obtained approximations with particular attention to the plateau regions they yield. Both benchmark functions assume values between 0 and 1. Approximations of the plateau region are constructed as level sets of the misfit approximations: points with misfit below 0.1 are considered to be elements of the plateau. As a quality metric when comparing plateau approximations we used the Hausdorff distance to the level-set-based plateau constructed using the exact misfit.
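The plateau extraction and its quality metric can be sketched as follows. The sketch assumes that both the exact and the approximated misfit have been sampled on a common finite set of points, so the Hausdorff distance is computed between two finite point sets; the sampling and the function names are assumptions of this illustration, while the threshold 0.1 is the one quoted above.

```python
import math


def level_set(points, values, threshold=0.1):
    """Points whose misfit value is strictly below the threshold."""
    return [p for p, v in zip(points, values) if v < threshold]


def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty finite point sets."""
    def directed(s, t):
        # Largest distance from a point of s to its nearest neighbour in t.
        return max(min(math.dist(p, q) for q in t) for p in s)
    return max(directed(a, b), directed(b, a))


def plateau_error(points, exact_values, approx_values, threshold=0.1):
    """Hausdorff distance between the exact and the approximated plateau."""
    exact = level_set(points, exact_values, threshold)
    approx = level_set(points, approx_values, threshold)
    return hausdorff(exact, approx)
```

The brute-force distance computation is quadratic in the number of sample points, which is acceptable for the low-dimensional grids considered here.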
5.1 Convex Plateau
Table 1. Errors for convex plateau.
Method               \(L^2\) error   \(H^1\) error   Plateau error
\(L^2\)-projection   0.0512          0.3943          0.1723
\(H^1\)-projection   0.0491          0.2884          0.1501
Furthermore, the \(L^2\)-projection error depends heavily on the distribution of evaluation points: it is nearly zero at evaluation points (since it is purely a projection of the interpolation of values) and grows significantly in regions between these points. The error of \(H^1\)-projection, on the other hand, does not seem to display such a clear dependence on evaluation points.
Overall, \(H^1\)-projection is slightly more accurate considering the mean \(L^2\) error (Table 1). The difference is more significant when we measure the error using the \(H^1\) norm: the error of \(L^2\)-projection is about 37% higher than for \(H^1\)-projection (0.3943 versus 0.2884).
Plateau region approximations produced by both methods are displayed and compared to the one obtained using the exact misfit in Fig. 2. \(L^2\)-projection produces a superior plateau border approximation in certain regions, but in others it is heavily distorted. The plateau approximation obtained using \(H^1\)-projection has the correct shape (no distortions), but does not cover the whole exact plateau region. Comparison of the Hausdorff distances of both approximations to the exact plateau slightly favors \(H^1\)-projection (Table 1).
5.2 U-Shaped Plateau
Table 2. Errors for U-shaped plateau.
Method               \(L^2\) error   \(H^1\) error   Plateau error
\(L^2\)-projection   0.1360          0.9408          0.2803
\(H^1\)-projection   0.0692          0.4408          0.6892
Once again, \(H^1\)-projection is more accurate considering the mean \(L^2\) and \(H^1\) errors (Table 2), but this time the difference is more significant: the errors of \(H^1\)-projection are roughly half as large.
Plateau region approximations produced by both methods are displayed and compared to the one obtained using the exact misfit in Fig. 2. Both \(L^2\)- and \(H^1\)-projections produce a plateau that closely matches the exact plateau except for the lower left part. In this example \(L^2\)-projection gives a better overall plateau approximation (Table 2), although its border exhibits more irregularities than the one obtained from \(H^1\)-projection.
As in the first example, the approximations were constructed using 300 evaluations at points produced by the HMS algorithm. Points of evaluation are displayed in the plots in Fig. 3.
6 Conclusions
There are typically two ways of managing the multimodality and insensitivity in solving parametric inverse problems: the first one is related to misfit regularization, which reduces the number of solutions using an appropriate misfit supplement, and the second, more general one consists in finding all solutions and letting the domain experts select reasonable ones and reject artifacts. The proposed method follows the second approach, by using an accurate smooth approximation of the misfit function in the regions of potential solutions, which are plateaus in the misfit landscape. It consists of two steps: independent \(C^0\) Lagrange approximation of the misfit and its gradient, based on the nodes obtained during the dedicated memetic process, and the projection of one or both components onto the space of B-splines. The resulting approximation is globally \(C^1\), which allows us to use fast gradient methods of local optimization. Another goal attained in this way is the estimation of the shape of the plateau by an appropriate level set of the final approximation.
We currently work on releasing the source code of the algorithm and the benchmarks. If interested, please contact any of the authors, so you will be informed once the code is released.
The method has been preliminarily tested on two benchmarks with convex and non-convex U-shaped plateaus. The test results show slightly better plateau shape estimation obtained by using the Lagrange misfit approximation only and the \(L^2\)-projection onto the space of splines, whereas the joint \(H^1\)-projection of the misfit and its gradient Lagrange interpolations delivers smaller \(L^2\) and \(H^1\) approximation errors.
It is worth mentioning that the second option is especially advantageous when the misfit gradient can be inexpensively evaluated while solving the forward problem (e.g., using the goal-oriented finite element method). However, the cost of \(H^1\)-projection is significant, especially in the case of multidimensional problems. \(L^2\)-projection is less expensive; however, it also suffers from high dimensionality. In particular, the computational cost of both methods grows exponentially with the problem dimension assuming a fixed mesh resolution in all directions. This fact limits the applicability of such an approach to problems of up to about ten dimensions.
The proposed strategy will be applied to solve ill-conditioned real-world inverse problems appearing in oil deposit investigation.
We plan to perform an exhaustive experimental analysis of the strategy's scalability (misfit approximation error, plateau approximation accuracy, computational cost) with respect to the dimension of the admissible set of parameters. Although the applied evolutionary sampling method HMS works well, we will also check the performance of the proposed local misfit approximation method coupled with other, state-of-the-art stochastic population-based optimizers.
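The exponential growth can be made tangible by counting the unknowns of the projection. The snippet below is a back-of-the-envelope illustration only: it assumes open uniform knot vectors with maximal continuity, for which each direction carries (number of elements + degree) basis functions, and the resolution of 20 elements with quadratic splines is a hypothetical choice, not a setting reported in this paper.

```python
def bspline_dofs(elements_per_dir, degree, dim):
    """Number of tensor-product B-spline basis functions.

    ASSUMPTION: open uniform knot vectors with maximal continuity, which give
    elements_per_dir + degree basis functions in each direction.
    """
    return (elements_per_dir + degree) ** dim


# Hypothetical resolution: 20 elements per direction, quadratic B-splines.
for dim in (2, 4, 6, 8, 10):
    print(dim, bspline_dofs(20, 2, dim))
```

Each added dimension multiplies the number of coefficients by the per-direction count, so storage alone becomes the limiting factor well before the cost of the solver does.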
In the authors’ opinion, it is difficult to compare the proposed strategy with other objective approximation methods, because of radically different goals which they intend to achieve. To the best of the authors’ knowledge, the problems of the misfit approximation in the plateau area and the estimation of plateau shape have not been considered before.
References
1. Tarantola, A.: Inverse Problem Theory. Mathematics and Its Applications. Society for Industrial and Applied Mathematics, Philadelphia (2005)
2. Gajda-Zagórska, E., Schaefer, R., Smołka, M., Paszyński, M., Pardo, D.: A hybrid method for inversion of 3D DC logging measurements. Nat. Comput. 3, 355–374 (2014)
3. Smołka, M., Gajda-Zagórska, E., Schaefer, R., Paszyński, M., Pardo, D.: A hybrid method for inversion of 3D AC logging measurements. Appl. Soft Comput. 36, 422–456 (2015)
4. Tikhonov, A., Goncharsky, A., Stepanov, V., Yagola, A.: Numerical Methods for the Solution of Ill-Posed Problems. Kluwer, Dordrecht (1995)
5. Preuss, M.: Multimodal Optimization by Means of Evolutionary Algorithms. Natural Computing. Springer, Heidelberg (2015)
6. Schaefer, R., Adamska, K., Telega, H.: Genetic clustering in continuous landscape exploration. Eng. Appl. Artif. Intell. (EAAI) 17, 407–416 (2004)
7. Wolny, A., Schaefer, R.: Improving population-based algorithms with fitness deterioration. J. Telecommun. Inf. Technol. 4, 31–44 (2011)
8. Faliszewski, P., Sawicki, J., Schaefer, R., Smołka, M.: Multiwinner voting in genetic algorithms for solving Ill-posed global optimization problems. In: Squillero, G., Burelli, P. (eds.) EvoApplications 2016. LNCS, vol. 9597, pp. 409–424. Springer, Heidelberg (2016). doi: 10.1007/978-3-319-31204-0_27
9. Faliszewski, P., Sawicki, J., Schaefer, R., Smołka, M.: Multiwinner voting in genetic algorithms. IEEE Intell. Syst. (2016, accepted)
10. Grefenstette, J., Fitzpatrick, J.: Genetic search with approximate fitness evaluations. In: Proceedings of the International Conference on Genetic Algorithms and Their Applications, pp. 112–120 (1985)
11. Jin, Y.: A comprehensive survey of fitness approximation in evolutionary computation. Soft. Comput. 9(1), 53–59 (2005)
12. Bhattacharya, M.: Evolutionary approaches to expensive optimization. Int. J. Adv. Res. Artif. Intell. 2(3), 3–12 (2013)
13. Brownlee, A., Woodward, J., Swan, J.: Metaheuristic design pattern: surrogate fitness functions. In: GECCO 2015 Proceedings, pp. 1261–1264. ACM Press, July 2015
14. Sawicki, J.: Identification of low sensitivity regions for inverse problems solutions. Master's thesis, AGH University of Science and Technology, Faculty of Informatics, Electronics and Telecommunication, Kraków, Poland (2016)
15. Dierkes, T., Dorn, O., Natterer, F., Palamodov, V., Sielschott, H.: Fréchet derivatives for some bilinear inverse problems. SIAM J. Appl. Math. 62(6), 2092–2113 (2002)
16. Smołka, M.: Differentiability of the objective in a class of coefficient inverse problems. Comput. Math. Appl. (submitted)
17. Smołka, M., Schaefer, R., Paszyński, M., Pardo, D., Álvarez-Aramberri, J.: An agent-oriented hierarchic strategy for solving inverse problems. Int. J. Appl. Math. Comput. Sci. 25(3), 483–498 (2015)
18. Ciarlet, P.G.: The Finite Element Method for Elliptic Problems. North-Holland, New York (1978)
19. Folland, G.B.: Real Analysis. Pure and Applied Mathematics, 2nd edn. Wiley, New York (1999). Modern Techniques and Their Applications, A Wiley-Interscience Publication
20. Gao, L., Calo, V.M.: Fast isogeometric solvers for explicit dynamics. Comput. Methods Appl. Mech. Eng. 274, 19–41 (2014)
21. Łoś, M., Woźniak, M., Paszyński, M., Hassaan, M.A., Lenharth, A., Pingali, K.: IGA-ADS: parallel explicit dynamics GALOIS solver using isogeometric \(L^2\) projections. Comput. Phys. Commun. (submitted)
22. Woźniak, M., Łoś, M., Paszyński, M., Dalcin, L., Calo, V.M.: Parallel three dimensional isogeometric \(L^2\)-projection solver. Comput. Inform. (accepted)
23. Łoś, M., Woźniak, M., Paszyński, M., Dalcin, L., Calo, V.M.: Dynamics with matrices possessing Kronecker product structure. Procedia Comput. Sci. 51, 286–295 (2015)
24. Łoś, M., Paszyński, M., Kłusek, A., Dzwinel, W.: Application of fast isogeometric \(L^2\) projection solver for tumor growth simulations. Comput. Methods Appl. Mech. Eng. (submitted)