Integrating Electromagnetic Data with Other Geophysical Observations for Enhanced Imaging of the Earth: A Tutorial and Review

In this review, I discuss the basic principles of joint inversion and constrained inversion approaches and show a few instructive examples of applications of these approaches in the literature. Starting with some basic definitions of the terms joint inversion and constrained inversion, I use a simple three-layered model as a tutorial example that demonstrates the general properties of joint inversion with different coupling methods. In particular, I investigate to which extent combining different geophysical methods can restrict the set of acceptable models and under which circumstances the results can be biased. Some ideas on how to identify such biased results and how negative results can be interpreted conclude the tutorial part. The case studies in the second part have been selected to highlight specific issues such as choosing an appropriate parameter relationship to couple seismic and electromagnetic data and demonstrate the most commonly used approaches, e.g., the cross-gradient constraint and direct parameter coupling. Throughout the discussion, I try to identify topics for future work. Overall, it appears that integrating electromagnetic data with other observations has reached a level of maturity and is starting to move away from fundamental proof-of-concept studies to answering questions about the structure of the subsurface. With a wide selection of coupling methods suited to different geological scenarios, integrated approaches can be applied on all scales and have the potential to deliver new answers to important geological questions.


Introduction
The inversion of geophysical observations is one of the major tools to investigate the subsurface of the Earth and find out about the structure and composition of the planet we live on. Boreholes routinely give us access to material at depths of 5 km and in exceptional cases up to 10 km (Huenges et al. 1997), but only sample a small volume in a specific location. In some locations, e.g., Kimberlite pipes, geological processes transport material from depth to the surface and provide us with samples from the lower crust and mantle (e.g., Griffin et al. 2009). However, the bulk of our knowledge comes from comparing the output of numerical simulations calculated from hypothetical models with geophysical observations at or near the surface of the Earth. In some cases, these models are generated by trial and error through forward modelling (e.g., Leibecker et al. 2002;Gatzemeier and Moorkamp 2005;Heise et al. 2008). Performed well this process allows us to test different hypotheses about the physical structures in the area under investigation and it gives insight into the sensitivity of the data to different structures. However, forward modelling typically requires a large number of trials and is therefore a tedious task. In addition, the models are often strong generalizations of the geological structures and preconceived ideas can bias the type of models that are considered in this process.
For these reasons and due to considerable increase in computing power over the last years, most geophysical models nowadays are constructed through formal inversion procedures. Here, we mathematically define the criteria for an acceptable model, typically that the data predicted by the final model fit the observations in a least-squares sense (e.g., Wheelock et al. 2015), and use automated algorithms to find one or a set of models that fulfils these criteria. Given the importance of inversion not only in geophysics, a large number of different algorithms to achieve this task exist (Nocedal 2006). Good introductions to inverse methods in a geophysical context are given by Snieder and Trampert (1999), Tarantola (2004), Menke (2012), Mosegaard and Hansen (2016) and for the special cases of magnetotellurics (MT) and controlled-source electromagnetics (CSEM) in Avdeev (2005), Abubakar et al. (2009), Siripunvaraporn (2012) and Rodi and Mackie (2012). Despite highly refined algorithms and intense research, the inversion of geophysical data is virtually always ill-posed, i.e. similar data can lead to substantially different models (e.g., Backus and Gilbert 1967) and nonunique, i.e. infinite number of models can explain the data to the same level of uncertainty (e.g., Muñoz and Rath 2006). This is because we can only measure at or near the surface, and with large distances between sites compared to the scale length of geological variations. Furthermore, our measurements are band-limited and contaminated by noise (Parker 1980(Parker , 1983. This results in ambiguities in the inverse models that need to be considered when interpreting the data. The nature of these ambiguities is method specific though. For example, magnetotellurics resolves the thickness of a resistive layer well, but not its resistivity, while DC resistivity is sensitive to its resistance, i.e. the resistivity-thickness product (Vozoff and Jupp 1975).
If we can exploit these complementary sensitivities, we can hope to recover the shape and properties of structures within the Earth better compared to using just a single method. This is the core idea of joint and cooperative inversion: We combine information from two or more different types of geophysical data sets in a single inversion algorithm with the goal of improving the resulting models. The aim of this review is to explain and illustrate different approaches to joint inversion with a special emphasis on electromagnetic methods. I will give a working definition for joint inversion and briefly compare and contrast it with other methods to combine different data sets. I will then present a few selected examples that demonstrate strengths and potential pitfalls when using joint inversion methods. Finally, I will try to summarize where we are currently at and what future avenues could be. Compared to the recent review on the subject by Haber and Holtzman Gazit (2013) I will focus on the concepts and applications instead of the mathematical details.

A Short Tutorial on Joint Inversion
For the purpose of this review, I will use the term joint inversion for all approaches where different types of data are inverted within a single computational algorithm, with a single objective function and where all model parameters are adjusted concurrently throughout the inversion (Moorkamp et al. 2016b). Of particular interest are joint inversion approaches that combine different physical properties, e.g., electrical conductivity and seismic velocity as these provide large potential benefits, but also pose particular problems. In contrast, cooperative inversion comprises approaches where only a single data set is inverted and the results from another inversion are used as a reference (e.g., Paasche and Tronicke 2007). In this review, I will not discuss the theory behind post-inversion integration, where the relationship of models retrieved from independent inversions is examined in a quantitative manner (e.g., Bedrosian et al. 2007) and good overviews can be found in Paasche (2016), Bedrosian (2007) and Hansen et al. (2016).
As the definition of joint inversion implies, one of the first and crucial steps is to define an objective function that the optimization algorithm will minimize. In its most general form, this objective function can be written as Here U data m ð Þ and U reg m ð Þ are the data misfit and regularization terms as used by virtually all geophysical inversion approaches (e.g., Pedersen 1977;Constable et al. 1987;Oldenburg 1990;Commer and Newman 2009;Fichtner and Trampert 2011;Menke 2012) and k is the Lagrange parameter for the regularization. The additional term U coupling m ð Þ mathematically defines the relationship between different subsets of model parameters. Note that in this general notation the model vector m can potentially contain elements that are associated with very different quantities. For example, when jointly inverting magnetotelluric and seismic traveltime data for a layered Earth structure, the first N model parameters can be the resistivities of the N layers used in the inversion and the second N model parameters specify the seismic velocities for the same layers. The coupling term then establishes a relationship between these two sets of model parameters which specifies what kind of joint inversion we want to perform.
For cooperative inversion, the mathematical formulation looks identical. Again we have a data misfit term, a regularization term and a coupling term, only now data misfit and regularization are only calculated for resistivity, for example, and the seismic velocity model is considered fixed and is only used in the coupling terms to maximize the resemblance between the models. Figure 1 shows the difference between the two approaches as a flow diagram for the inversion.
I will now discuss a series of very simple experiments in order to illustrate some basic properties of joint and cooperative inversions as well as common coupling approaches and their properties. The basis for all experiments is data calculated from a simple one-dimensional three-layered model (Fig. 2). To simplify the discussion even further and aid visualization, I will also assume to know the physical properties (resistivity and seismic velocity) and thickness of the topmost and lowermost layer and only seek the thickness and physical properties of the middle layer. The inverse problem is therefore reduced to finding two model parameters for each geophysical data set, e.g., resistivity and thickness of the second layer for MT, and it is easy to plot the range of possible solutions as a simple scatter plot. Figure 2 shows the true model (left) with the layer that we are inverting for marked in red. The magnetotelluric data predicted from the model (middle) show the typical three-  Fig. 1 Simplified flow diagrams for joint inversion algorithms (left) and constrained or cooperative inversion algorithms (right) (Moorkamp et al. 2016b). The main difference between the two approaches is that for cooperative inversion one quantity that enters the coupling constraint (here seismic velocity, v) does not change throughout the inversion, while all quantities are adjusted within the joint inversion Fig. 2 The simple magnetotelluric test model (left), the forward response of this model (middle) and the space of acceptable inversion models (right) when only inverting MT data. Throughout this tutorial, I assume that the uppermost and lowermost layers are known and only the thickness and resistivity of the middle layer (marked in red) are sought layer response. In order to illustrate the feasible solutions, I simply generate a large number of random combinations of trial resistivities and thicknesses, calculate the misfit with the true response and retain the parameter combinations that fit the data within 2% of both real and imaginary parts. The resulting feasible solutions are shown as blue dots in Fig. 2 with the true value shown in red. This simple experiment demonstrates the well-known inability to retrieve both resistivity and thickness for a thin conductive layer, instead the layer conductance, i.e. the conductivity-thickness product is well constrained (Parker 1980). This behaviour is clearly visible in the set of solutions. Looking at the shape of the permissible solutions, the layer can have any thickness between 20 m and 60 m in combination with the appropriate resistivity.
If we perform a seismic refraction experiment in the same location and determine the set of permissible solutions from those measurements alone, we get the results shown in Fig. 3. In this case, the seismic refraction data also show ambiguity and a range of velocity and thickness values that explain the observations within the assumed error. However, the range of possible thickness values is narrower, between 30 and 45 m, compared to the MT. The core idea of joint inversion is that under the assumption that both methods sense the same structures, finding models that explain both data sets will result in better recovery of these structures. I will now discuss the most commonly used assumptions and demonstrate with these simple examples how they can be used, what kind of improvement can be expected and where the pitfalls are.

Structural Coupling
One of the most general assumptions we can make about the relationship between different geophysical methods is that they sense the same geological structures within the Earth and in particular their boundaries. Thus, if there are any velocity or resistivity anomalies, they should occur in the same location. Structural joint inversion methods thus do not prescribe any direct correlation between the different physical parameters, but use spatial relationships between changes in these parameters to couple the different methods (e.g., Haber and Oldenburg 1997;Gallardo and Meju 2003). I will discuss some examples of different structural relationships commonly used in joint inversion in the case studies below. For a layered Earth, a simple structural relationship is to assume that layer boundaries in the inversion are at the same depth for all methods (e.g., Manglik and Verma 1998;Moorkamp Fig. 3 The seismic data (left), true model (middle) and set of acceptable velocity-thickness combinations (right) when inverting travel times individually. As for the MT case in Fig. 2 only the properties of the middle layer (marked in red) are considered unknown Surv Geophys ( ) 38:935-962 939 et al. 2007Zevallos et al. 2009;Moorkamp et al. 2010;Roux et al. 2011;Juhojuntti and Kamm 2015). Figure 4 shows the ranges of acceptable models for both data sets for our simple example when assuming coincident boundaries. As no relationship between resistivity and velocity is employed, such an approach primarily limits the range of permissible layer thicknesses. Given that this range was larger for the magnetotelluric data, the range of permissible resistivity models is reduced, while the range of acceptable seismic models is identical to the individual analysis. Overall, the structural coupling achieves the main goal of joint inversion to reduce the range of acceptable models even though in this case only for one subset of parameters. Obviously, for a more realistic inversion, even when assuming a layered Earth, the situation would be more complex. When inverting for the properties of all three layers simultaneously, there is significant interaction between the thicknesses and physical parameters of the different layers. For example, the magnetotelluric data will have reasonable sensitivity to the thickness of the top layer, and this information will help to constrain the seismic model, which in turn helps to constrain the thickness of the second layer. Thus in a real-world joint inversion, it is not always as clear which method dominates the resolution properties, and in many cases, different methods drive the inversion in different part of the model domain (Jegen et al. 2009;Heincke et al. 2014;Demirci et al. 2017). I will discuss some of these issues in the application section; for now I will continue with this simplistic example, as it distils many important aspects of joint inversion.
Considering further how the resolution to layer thickness is transferred from the velocity model to the resistivity model and how it affects the range of acceptable models in Fig. 4, it is also clear that we will find models that fit both data sets even when the thickness of the layer in the true velocity model differs from the true resistivity model. A thickness of 60 m is within the acceptable range for the magnetotelluric data, and thus, combining a conductivity model with a layer thickness of 40 m with a seismic thickness of 60 m will produce a joint model that fits both data sets within the assumed noise level. In the most extreme case, the layer thickness for the true seismic model might be outside the permissible range for the magnetotelluric data, but including the uncertainty of the seismic results, the two regions overlap and we find a joint model that fits both data sets to an acceptable level but lies at the extreme uncertainty end for both methods.
In this very simple example, the most straightforward way to perform cooperative inversion is to select one of the models that fits the seismic data and use it as a reference Fig. 4 The set of acceptable velocity-thickness (left) and resistivity-thickness (right) combinations for the individual inversions and joint inversion with a structural constraint based on coincident layer thickness. The acceptable models for the individual inversions are marked in green, while the joint inversion models are marked in blue model for the MT. In the most strongly constrained case, we can take the layer thickness of that model as an estimate of the true layer thickness and only search for the resisitivity. The result is a very confined set of acceptable models. However, if the estimated thickness differs from the true thickness, the true model will not be included in the acceptable model set and the estimated resistivity will be biased. An approach that more closely resembles current practice (e.g., Kalscheuer et al. 2015) is to include a term in the objective function that requires the thickness of the layer to be as close as possible to the seismic estimate. Figure 5 shows the acceptable models that agree with an estimated layer thickness of 50 m within 10%. Compared to specifying a fixed layer thickness, the range of acceptable models has increased, but is still smaller than for the structurally coupled joint inversion. Depending on the thickness estimate and the uncertainty that we specify on the estimate, the true model will be included in the acceptable models or not. In the case where we have full information on the range of acceptable seismic models, e.g., from a probabilistic inversion approach, the results will be identical to the joint inversion results. However, if we only have one model and an ad hoc estimate of model uncertainty, the results depend very much on how these two relate to the true model.
At first, it might be surprising to see that a cooperative or constrained inversion restricts the set of acceptable models more than a full joint inversion. The explanation is relatively simple though: As no adjustments to the reference model can be made, the uncertainty of the seismic data has been taken out of the equation unless we consider the full posterior model covariance. Thus, models that are the result of such constrained inversions have to be examined with particular care and can be considered a particular case of hypothesis testing as discussed below, i.e. we have demonstrated that a model exists that resembles the seismic reference model and fits the magnetotelluric data.

Coupling Through Parameter Relationships
Structural coupling methods do not make strong assumptions about the relationship between the different geophysical methods and thus are widely applicable. As the example demonstrates though, the result of this can be a moderate improvement in resolution compared to individual inversions. If we utilize more information about the Earth in the inversion, we can potentially improve our results significantly. This is generally true for all inverse methods, additional prior information limits the space of acceptable models, but bears the risk of biasing the solution (Tarantola 2004;Mosegaard and Hansen 2016). The set of acceptable resistivity-thickness combinations for an unconstrained inversion (green) and an inversion where the layer thickness is constrained to match an inferred seismic boundary within AE5 m. In this example, the true model is part of the acceptable solutions; however, if we picked a reference model at the extreme end of the range permitted by the seismic data, it might not be Surv Geophys (2017) 38:935-962 941 Specifying a relationship between velocity and resistivity is one way to introduce such information and strongly links the seismic refraction and magnetotelluric parts of the inversion. Let us assume that from some borehole data or other source of information we have estimated a relationship of the form for the second layer. For the other layers this relationship is not appropriate, but in our simple example the parameters of these layers are assumed to be known. Furthermore, there are examples in the literature where a parameter relationship is only assumed to be valid in a certain depth range (e.g., Hoversten et al. 2006;Chen et al. 2007). If we assume that this relationship is exact and without any error, we can use it to formulate the inverse problem for both magnetotellurics and seismics in terms of seismic velocity and use Eq. 2 to convert between velocity and resistivity. This provides a strong coupling between the two methods, as a change in velocity in the layer has direct impact on the misfit for the magnetotelluric impedances. Figure 6 shows the feasible solutions in terms of resistivity and thickness for this case. We can see that with the correct parameter relationship, the space of feasible solutions is highly restricted as the intersection between the solutions that Fig. 6 Acceptable resistivity-thickness combinations for joint inversion with a parameter relationship. I show the solution when the relationship is considered exact and the assumed parameter relationship matches the true relationship (top left). As before, the acceptable models for the individual inversion are shown in green, while the joint inversion models are blue. In addition, the seismic models projected into resistivitythickness by the relationship are shown in yellow. I also show the results when the assumed relationship is biased (top right) and when the relationship is not considered exact, but included as a constraint (lower left).
The true and biased parameter relationships are shown in the lower right panel are feasible for each individual method is small. This demonstrates the power of such highly coupled joint inversion approaches and the potentially significant gain in information.
A special case of joint inversion with a direct parameter relationship is so-called petrological/petrophysical joint inversion approaches. There the relationship is not expressed in terms of one geophysical parameter as a function of another, but all geophysical quantities (e.g., resistivity, velocity, density) are functions of quantities such as porosity and permeability (e.g., Chen et al. 2007;Gao et al. 2012;Commer et al. 2014) or temperature and mantle composition (e.g., Afonso et al. 2016;Zunino et al. 2016). Conceptually, these approaches are identical to direct coupling between the geophysical quantities with the same caveats, an additional advantage is, however, that the inversion results are directly obtained in terms of quantities of geological interest.
In practice, we can rarely determine empirical parameter relationships with high precision. For example, borehole logging data show significant scatter and it is not clear whether measurements on the scale of centimetres can be used for inversion problems on a scale of tens to hundreds of metres or even kilometres (Moorkamp et al. 2013;Panzner et al. 2014). Figure 6 demonstrates what happens when an incorrect parameter relationship is assumed. Here I used in the inversion instead of the correct equation above. For such a mildly distorted relationship, it is still possible to find feasible solutions, but these are biased in comparison with the true values. Furthermore, the small region of feasible models suggests a highly precise answer, but the true model is not contained within that solution space. Thus, the results of such strongly coupled joint inversion approaches have to be investigated with particular care. One way to incorporate parameter relationships, but reduce the risk of bias is to include the relationship as a constraint and not use it to directly convert between parameters. In this case, we keep the parametrization of the inversion in terms of resistivity and velocity, but in addition to fitting the data require that the model parameters are close to the assumed parameter relationship in a least-squares sense. This is a first example of the coupling term in Eq. 1 which so far was only implicit in the common layer thicknesses or the reduction to a single physical property. The constraint can have the form of a function (e.g., Heincke et al. 2014;Kamm et al. 2015) or as cluster centres that the relationship values should group into (Carter-McAuslan et al. 2015;Sun and Li 2016). Figure 6 shows the result of running the inversion with a constraint based on Eq. 2 instead of a fixed parameter relationship assuming the correct values shown in Eq. 2. As expected, the range of permissible models is increased as the resistivity-velocity combinations of the models do no longer have to lie exactly on the assumed relationship, but instead scatter around it. Thus in cases where there is confidence in the overall shape of the relationship, but individual values might scatter around the general trend, a constraintbased joint inversion can allow for some deviation from that trend.
If the constraint is based on an incorrect relationship, the range of acceptable models for the simple example (not shown) is very similar to the ones shown for the direct relationship. Again the range of permissible models is biased and does not include the true model. In a practical inversion that is based on minimizing the weighted sum shown in Eq. 1, fitting the data and satisfying the relationship will trade-off against each other similar to the way that regularization and data misfit trade-off. It is possible to use this trade-off to identify problems with the parameter relationship and I will discuss general strategies to identify such problems next.

Identifying Problems and Hypothesis Testing
The above examples have highlighted potential traps when using joint inversion with geophysical data. When performing joint inversion, we introduce additional prior information by assuming coincident structures or parameter relationships. If this prior information is incorrect, it can lead to biased results. In the previous examples, I always plotted the space of feasible solutions that fit the data below the assumed error level in order to illustrate the spread of solutions. However, not all of these solutions fit the data equally well. In this case, where the input data are noise free, some solutions fit the data exactly, while others just about fit within the assumed noise level. For real data with reliable error information, the target misfit should be a RMS of 1, i.e. on average the observations are fitted within the estimated error. In practice however, real data can rarely be fit to that level (e.g., Rao et al. 2014;Peacock et al. 2015;Yang et al. 2015). There can be a variety of reasons for this including overly optimistic error estimates or the inability to include sufficiently small surface structures in the inversion. Before any joint inversion takes place, it is therefore essential to perform individual inversions of each data set to establish a reference misfit for each data set (e.g., Linde et al. 2006;Moorkamp et al. 2013). Performing these individual inversions also helps to find appropriate regularization parameters (e.g., Hansen 1992) that can be scaled for the joint inversion (Moorkamp et al. 2016a).
If the joint inversion with comparable model discretization fails to fit the data to the same level of misfit as the individual inversion, it can be either because of nonlinearities introduced through the coupling between the methods Heincke et al. 2017) or because the chosen coupling constraints are inappropriate (Moorkamp et al. 2007). Nonlinearity can be identified by changing the starting model for the joint inversion, for example starting with the best-fitting individual inversion results instead of a half-space or similar model. When starting with well-fitting models, the joint inversion should increase the similarity between the different models under the coupling constraint and the final model should have a comparable misfit to the individual models. If this cannot be achieved, it indicates that the coupling constraint is inappropriate, i.e. the assumptions about the Earth made in the inversion are incorrect.
The result that a carefully constructed coupling constraint, e.g., a parameter relationship based on an assumed rock-physics model, is inappropriate for joint inversion of a particular data set might initially be disappointing. However, it is a quite powerful result if we can show that we can rule out other factors such as the nonlinearity discussed above, issues with the parametrization or with the data errors. Traditionally, the coupling between the different methods is considered known prior information and the focus of the joint inversion is to use this additional information to improve the results compared to individual inversions. Another way to look at it, however, is that by performing joint inversion we test the hypothesis that models exist that comply with the chosen coupling method and explain the observations at the same time. When the joint inversion fails, because we cannot fit the data to a satisfactory level, we have falsified this hypothesis and demonstrated that the coupling constraint does not reflect the geological situation. Of course, a large number of possible coupling constraints exists that can be falsified this way, most of which are nonsensical. In contrast, if it is possible to show through careful analysis that variations of resistivity and velocity in the mantle cannot simply be explained by simple variations in composition, for example, we have learned something valuable about the Earth.
For typical geophysical inverse problems, usually no clear criteria exist for rejecting the result of an inversion. Even though least-squares fitting, the core of the majority of inverse algorithms, is based on a sound and well-developed mathematical foundation, the underlying assumptions, such as a Gaussian distribution for the errors, are very often violated (Matsuno et al. 2014). In addition to using the misfit for individual inversions as a benchmark as discussed above, another useful way to analyse the joint inversion results is to look at how fitting the different data sets and satisfying the constraint interacts. Figure 7 shows the misfit for the parameter relationship against the misfit for the seismic data for all models that fit both data sets at an acceptable level. When the relationship assumed in the inversion is correct (left panel), models with a low seismic misfit also match the constraint better than models with a larger seismic misfit. In contrast, when the assumed parameter relationship is incorrect (right panel), there are no models with both a low seismic misfit and a good agreement with the constraint. Thus, plotting combinations of misfits for the different data sets and the constraints reveals if they are compatible with each other or minimizing one comes at the cost of deteriorating the match for the other. In practical inversion, applications such plots can be constructed by varying the weight for the different terms of the objective function (Thompson et al. 2016) or using specially designed multi-objective minimization algorithms (Kozlovskaya et al. 2007;Moorkamp et al. 2007Moorkamp et al. , 2010Roux et al. 2011;Schnaidt and Heinson 2015;Niri and Lumley 2016).
Throughout the preceding discussion, I have concentrated on electromagnetic and seismic data, as this combination of methods promises significant improvements compared to individual inversions and different physical quantities need to be coupled in the inversion. Naturally, the discussion applies also to combining electromagnetic and gravity data (e.g., Maier et al. 2009), but to a certain degree even when inverting different types of electromagnetic data together (Commer and Newman 2009;Haroon et al. 2015) or electromagnetic with DC resistivity data (e.g., Candansayar and Tezkan 2008;Yogeshwar et al. 2012;Hoversten et al. 2016). Even though in the latter cases the physical parameter under consideration is electrical conductivity, magnetotellurics is sensitive to horizontal conductivity, whereas controlled-source methods are sensitive to horizontal and vertical conductivity which leads to effective anisotropy in finely layered sedimentary environments ). Thus, the conductivities retrieved by the different methods will differ and appropriate coupling methods have to be introduced in the joint inversion, e.g., by introducing electrical anisotropy in the inversion (Commer and Fig. 7 Trade-off between fitting the seismic data and matching the parameter relationship constraint for a correctly assumed parameter relationship (left) and a biased relationship (right). It can be clearly seen that when the relationship is incorrect, there are no models that satisfy the data and the constraint simultaneously Surv Geophys (2017)  Newman 2009), and data misfit will trade-off with the amount of anisotropy permitted by the inversion.

Some Recent Examples of Integrated Analysis in Practice
The previous simple examples have illustrated the concepts of joint inversion and I will now discuss concrete applications of these concepts to real data. Integrated methods are now applied in a wide range of fields, and recent reviews have been published on joint inversion in hydrogeophysics ( Here, I will focus on some of the most recent work and aspects that have not been fully covered by these reviews.

Joint Inversion with Direct Parameter Coupling
As a first case study, I will compare the results of Panzner et al. (2016)  where inversions are performed individually, but information is exchanged between the separate inversions. These kinds of integrated workflows are often used in hydrocarbon exploration instead of full joint inversions, either because a full joint inversion is considered intractable (Um et al. 2014), or because the main goal of the inversion is to generate improved models for migrating seismic reflection data (Colombo and Stefano 2007;Colombo et al. 2008;De Stefano et al. 2011;Colombo et al. 2014;Cui et al. 2015;Takam Takougang et al. 2015) or full waveform inversion (Zerilli et al. 2016). Heincke et al. (2017), in contrast, perform a full joint inversion of all three data sets.
In terms of the coupling between the different methods, both studies employ a very similar approach and utilize cross-property relationships from a borehole in the area to relate seismic velocity and electrical resistivity. Figure 8 shows the cross-property plots and the fitted relationships. From these plots, it is clear that both of these relationships are only approximate representations of the true variation of the parameters in the subsurface. Both velocity and resistivity scatter significantly around the values predicted by the relationship. This is partially due to noise in the borehole measurements, but also due to fine scale variations in lithology. Thus, when averaged to a scale of 100 m, comparable with the size of individual cells in the inversion, the scatter is much less dramatic and the relationship appears to be appropriate (Fig. 9).
Still, based on the same observed data, both studies choose different parametrizations to represent the parameter relationship. For P-wave velocities of less than 6000 m/s, the differences are relatively subtle, while for higher velocities the relationship by Panzner et al. (2016) predicts significantly higher resistivities than the relationship of Heincke et al. (2017). Even though at first sight the difference might appear dramatic, an order of magnitude or more difference in predicted resistivity, for velocities greater than 6200 m/s, in practice the impact on the inversion results is not that strong. First, such high velocities are only reached in very thin layers and the average velocity in the bulk of the borehole varies between 4000 and 6000 m/s where both relationships are very similar. Second, these high velocities and high resistivities are encountered within the basalt which is covered by conductive sediments. Magnetotelluric data only resolves the minimum resistivity of such a structure (Chave and Jones 2012), and the maximum resistivity is unconstrained. Thus, the MT data allow any resistivity value above a certain threshold, but the regularization will limit the resistivity. This is confirmed by the inversion results of both studies. In neither case do the retrieved resistivities exceed 100 Xm.
Even though in this case it appears that the differences in parametrization do not have a significant impact on the inversion results, this comparison highlights the difficulty in finding appropriate representations of cross-property relationships and the requirement for users of joint inversion methods to carefully consider their choices. It is clear that in many cases simple approximations through continuous mathematical functions are not appropriate. Clustering methods (Paasche and Tronicke 2007;Paasche et al. 2010;Li 2016, 2017) provide an alternative for certain scenarios, but also involve a number of choices by the user and assume that lithology can be represented by a finite number of cluster centres. Finding reliable representations of cross-property relationships and assessing their reliability under a range of conditions is therefore one of the most pressing tasks in joint inversion and integrated analysis and in my opinion a highly fruitful field for further research. Figure 10 shows the inversion results from both studies with the trace of the Brugdan borehole (Schuler et al. 2012) in both plots for comparison. The overall geometry and thickness of the basalt layer are similar in both models, but there are differences in some of the details. In particular, the model of Panzner et al. (2016) predicts a significantly stronger change in basalt thickness in the vicinity of the borehole compared to the model of Heincke et al. (2017) and shows deeper high-velocity structures potentially associated with prerifted basement that do not appear in the model of Heincke et al. (2017). Considering the sensitivities of the different data sets (Fig. 10 in Panzner et al. 2016), the addition of CSEM data helps to resolve these structures. In comparison, the gravity data included by (Heincke et al. 2017) do not seem to contribute much to resolving structures at depth.

Joint Inversion with Structural Constraints
The second example taken from Gallardo et al. (2012) is also related to hydrocarbon exploration. As prospecting for new resources has moved into geologically complex areas, the value of integrating different types of data has been recognized by the industry and significant investments have been made to develop integration platforms (e.g., DellAversana et al. 2016) and sponsor case studies in order to gain the necessary experience (e.g., Colombo et al. 2010;Chen and Hoversten 2012;Moorkamp et al. 2011;Roberts et al. 2016). In addition, the importance of so-called nonseismic techniques in the exploration workflow has increased over the last fifteen years (Constable 2010;MacGregor and Tomlinson 2014;Strack 2014;Streich 2016). In this case, the focus of the study is on the Santos Basin, offshore Brazil, and to which extent integrating seismic, magnetotelluric, gravity and magnetic data can help to constrain the geometry of salt and carbonate units. Salt structures in particular are difficult to image with traditional seismic reflection methods due to the steep flanks of the structures and internal scattering of the seismic energy (e.g., Key et al. 2006;Hokstad et al. 2011).
Given the range of different lithologies with highly variable physical properties in the Santos Basin, it is difficult to construct rock property relationships and it is likely that any constructed relationship will not be valid in significant parts of the study area. Thus, Gallardo et al. (2012) use the cross-gradient (Gallardo and Meju 2004, 2007Meju and Gallardo 2016), a versatile structural constraint, that has established itself as one of the most popular coupling approaches in joint inversion (e.g., Linde et al. 2008;Doetsch et al. 2010;Lochbühler et al. 2013;Sánchez and Delgado 2015;Tarits et al. 2015;Zhou et al. 2015). Other structural constraints include curvature based measures (Haber and Oldenburg 1997), directed constraints (Molodtsov et al. 2013), using the roughness of another model to modify the regularization (Günther and Rücker 2006), Gramian constraints (Zhdanov et al. 2012) and joint total variation (Haber and Holtzman Gazit 2013). Depending on the scenario, these coupling methods can have superior properties under certain circumstances. However, to date none of these has achieved the popularity of the cross-gradient.
The cross-gradient function sðm 1 ; m 2 Þ between two model vectors m 1 and m 2 is defined as the cross-product of the spatial gradients of each model, viz.
Haber and Holtzman Gazit (2013) and Meju and Gallardo (2016) discuss some of the issues implementing this constraint for practical inversions. Figure 11 shows an example of two conceptual models, their spatial gradients, i.e. lateral changes in the physical properties represented by each model, and the resulting cross-gradient. As this function will be included in the inversion algorithm either as a constraint (Gallardo et al. 2012) or one of the terms of the objective function , the model cells with large cross-gradient values are considered to violate the constraint and will be modified by the inversion. Figure 11 shows that despite very different visual appearance of the two models, the cross-gradient vanishes in large parts of the model domain, either because one of the spatial gradients is zero, or because the gradients are parallel or anti-parallel to each other. This means that a variety of models will be considered compatible under the crossgradient constraint and is one of the main reasons for its popularity. Even in situations where the different physical parameters in the joint inversion react differently to changes in underlying geology, the cross-gradient constraint will very likely be able to accommodate this. The flip side of this flexibility is that the coupling between the methods is less strong than with more restrictive approaches and, in some instances, can be insufficient for effective joint inversion (Lelièvre et al. 2012;Heincke et al. 2014). It is important to note that in practical smoothness regularized inversion, the case of zero spatial gradient for one or both of the physical properties does not occur very often, as the regularization tends to produce parameter changes over several models cells instead of sharp boundaries (cf. Moorkamp et al. 2011). In fact, it is likely that this phenomenon helps to achieve the desired similarity between the models, as the influence of the cross-gradient is extended to a larger region compared to the sharp boundaries of Model 1 in Fig. 11. In their case history, Gallardo et al. (2012) invert for seismic velocity, electrical resistivity, density and magnetic susceptibility and thus include six cross-gradient constraints in their joint inversion, one for each pair of model parameters. Figure 12 shows the magnitude of the cross-gradient for all combinations of model parameters for individual inversions and joint inversion of all data sets. It is immediately visible that calculating the cross-gradient for the individual inversion results produces significantly larger values for all parameter combinations in the entire modelling domain compared to the joint inversion. This means that for the individual inversions in large parts of the modelling domain, changes in different physical parameters occur in different directions. In contrast, for the joint inversion these changes occur in the same direction, there is only a small region at a depth of about 2 km that shows significant nonzero cross-gradient values. Gallardo et al. (2012) do not further discuss possible causes of this discrepancy, but given the discussion on hypothesis testing in the tutorial above, it would be very interesting to investigate the cause of the high cross-gradient values, as these clearly indicate a violation of the assumption of similar structure. Examining Fig. 12, the main violation of the cross-gradient constraint occurs between the magnetization contrast and all other methods in an area where the observed magnetic data change rapidly along the profile. Thus apart from some unknown geological reason, it is also possible that these high values are caused by model discrepancy, i.e. an inversion grid that is too coarse to accommodate the necessary changes, or too optimistic data errors. Without additional information, it is impossible though to assess the contributions of these three effects.
In any case, the joint inversion has produced models that are significantly more similar to each other than the individual inversion results and as such can be considered better. One problem with multi-parameter inversions is to distil the information in a form that can be easily analysed. In addition to the traditional form of plotting the models for different physical properties next to each other, Gallardo et al. (2012) also use what they call geospectral images in an attempt to present all relevant information in a more compact form. Figure 13 shows the geospectral representation of the individual inversion results and the joint inversion. Three of the physical properties are represented by different colours, and variations in the values of each property are shown as variations in lightness of that colour. The fourth property, here magnetization, is represented by contour lines. The result is a single image that represents all four properties simultaneously and where certain colours indicate specific combinations of physical parameters. While this representation is very compact and certain lithological units can easily be identified, some of the more subtle variations are lost and can be more easily seen in the individual plots. Still, it is a first step towards solving the problem of visualizing all the information gained from integrated inversions efficiently.
One likely reason that the violation of the cross-gradient is located near the surface in this case study is that generally inversion models of surface-based observations tend to Fig. 12 Magnitude of the cross-gradient between different physical parameters for individual inversions (below diagonal) and joint inversion (above diagonal) in the study of Gallardo et al. (2012) become smoother with depth due to the decrease in the sensitivity kernels (Li and Oldenburg 1998;Streich and Becken 2011). As a result of this loss in sensitivity, the influence of regularization increases and thus changes in model parameters are distributed more broadly and the amplitude of anomalies is reduced. Consequently, the cross-gradient value at depth is similarly reduced compared to the near surface. A similar phenomenon can be observed in other studies as well (cf. Moorkamp et al. 2016a). Bennington et al. (2015) propose a normalization of the cross-gradient constraint to avoid this natural decrease in cross-gradient and ensure that structures at depth are equally constrained. They demonstrate the feasibility of such an approach on a synthetic example and data measured across the San Andreas fault. However, so far a careful comparison with the regular crossgradient is missing, and thus, it is difficult to quantify the contribution of the normalization in improving the joint inversion results. The lack of systematic investigations of the properties of different approaches and their comparison is a general problem in joint inversion at the moment. Being a relatively young field, most authors have focused on Fig. 13 Geospectral representation of the individual inversion results (top) and the joint inversion results (Gallardo et al. 2012). In both cases the contours show the magnetization model developing new approaches and demonstrating that these work in principle and only few studies have provided comparisons between different types of joint inversion Le et al. 2016) or looked how particular settings affect the inversion results (Moorkamp et al. 2016a). As the field matures, such studies become more and more important in order to help selecting the appropriate approach from the variety of possibilities.

Cooperative and Constrained Inversion
I will now discuss some case studies of inversions that incorporate information from other geophysical studies as constraints, but do not perform a joint inversion. This means that the information is considered as an estimation of ground truth, potentially with some uncertainty, but there is no feedback to modify it as a result of the inversion. As discussed in the tutorial above, this potentially restricts the range of acceptable models more strongly than a joint inversion, but great care has to be taken to assess the validity of the constraints and assess its uncertainty.
The first case study (Kalscheuer et al. 2015) for constrained inversion combines a range of electric and electromagnetic data with constraints from a coincident seismic reflection profile. The aim of the study is to identify potential sources of groundwater in the Okavango Delta, Botswana. Kalscheuer et al. (2015) jointly invert audio-magnetotelluric (AMT), controlled-source AMT and transient electromagnetic (TEM) data in terms of layered models at a number of different locations. The seismic reflection profile is used to identify the depth to two lithological boundaries: (1) a boundary within the sediments, the so-called POM, and (2) the depth to basement. Without any seismic constraints, the data can be sufficiently explained by three-layer models at all locations. However, introducing depth to basement as a constraint in a three-layer inversion significantly increases the data misfit. Adding a fourth layer to the inversion allows to satisfy the constraint and fit the data to the same misfit level as the unconstrained inversion. This kind of analysis is similar to the procedure discussed in the tutorial above and in Moorkamp et al. (2010). It again highlights the importance of performing individual inversions before attempting any kind of joint inversion or constrained inversion, as it is necessary to have a benchmark misfit. Figure 14 shows a comparison between unconstrained four-layer inversion models (top) and inversions where the lower two interface boundaries have been constrained to match the boundaries interpreted from the seismic reflection profile. In this case, Kalscheuer et al. (2015) estimate the uncertainty to the depth of these boundaries as AE10 m, but note that larger deviations from these depths are permitted by the electromagnetic data. The result of the constrained inversion is that the models at all locations are virtually identical, there are only some minor fluctuations in the depth to first layer boundary. As Kalscheuer et al. (2015) note this means that the electromagnetic observations are compatible with the inferred seismic boundaries. If we accept this as a reflection of the true geological situation, it suggests that the differences in inversion models for the unconstrained inversion are merely a mapping of data errors into the results.
In the preceding example, the seismic reflection data mainly show two strong, horizontal reflections that clearly identify lithological boundaries and the remaining weak amplitude variations can be attributed to noise. More generally, seismic reflection profiles, or similarly ground penetrating radar (GPR) or migrated receiver function images, show a multitude of reflections with complicated geometries and varying amplitudes. Although it can be possible to extract significant discontinuities and incorporate the information in an inversion (e.g., McGary et al. 2014), in many cases it is not clear which reflections are the Surv Geophys (2017) 38:935-962 953 most significant. Here semi-automatic procedures can help to transform a guiding image into a form that can be used to steer a constrained inversion. Zhou et al. (2014) have developed an image-guided inversion approach that uses such a transformation. Based on a guiding image, e.g., a GPR section, two quantities are calculated: (1) a metric tensor field that identifies directions of major changes and directions of homogeneity, the long axis of this tensor is aligned parallel to the direction where the reflection amplitudes are continuous, and (2) semblance, to identify the location of strong discontinuities, semblance values are low at major discontinuities. This process is fully automated and can be applied to depth migrated GPR and seismic sections or even to geological cross sections. In order Fig. 14 Comparison of the unconstrained electomagnetic inversion results (top) and the inversion results constrained by seismic data (bottom) by Kalscheuer et al. (2015). Both sets of models match the data to comparable levels of misfit. Introducing the seismic constraint has made the models virtually independent of location c Fig. 15 Guiding image, calculated metric tensor field and semblance from that image as well as individual and constrained inversion results from the study of Zhou et al. (2014). Introducing the structural constraints has increased the sharpness of several boundaries and changed the geometry of several structures to incorporate the tensor and semblance information in the inversion though, it is necessary to define a threshold below which the semblance is considered to indicate a boundary. Figure 15 shows an example of a guiding image based on a GPR section, the calculated metric tensor field at a few locations and the semblance. Semblance values below the userdefined threshold are shown in red, and at these locations, the regularization will be modified to smooth along the direction of the long axis of the metric tensor and permit changes along the direction of the short axis. In this proof of concept, the authors use fixed values for the smoothing weights in these two directions, but note that these can be adjusted to reflect the length of the axes of the tensor.
Introducing these constraints in the inversion has a significant effect on the resulting model as can be seen in comparison with the unconstrained inversion (Fig. 15). As expected, the constrained model shows stronger resistivity changes across most of the inferred boundaries (e.g., between profile km 0-15), although not all structures are directly related to changes in resistivity (lower structural boundary at profile km 25-45). The data misfit is only marginally higher compared to the unconstrained inversion, but it would be instructive to look at the distribution of misfit in detail and assess which structures have an impact on the data misfit and whether this is an expression of noise in the data or changes caused by the constraints that are not compatible with the observations. Furthermore, the impact of the different user-defined parameters, semblance threshold, smoothing weights, should be tested and their impact on the inversion results assessed.
As a general methodology, imaged guided inversion is attractive though as it can be used to introduce complex structural information with only modest user-interaction. Compared to manually drawn tear zones (e.g., McGary et al. 2014), it can be used to permit varying degrees of structural heterogeneity by using smoothing weights based on semblance magnitude, but lacks the insight of an expert user into the geological significance of different structures. However, this is a general trade-off between manual and automated procedures to produce geophysical models. In the hand of an expert user, manual methods such as forward modelling and constraints based on the interpretation of other information can be highly powerful, but the result strongly depends on the user and can be biased by her expectations. Most automated procedures such as inversion algorithms also require some user-defined quantities, but in most cases the major structures will be comparable (e.g., Miensopust et al. 2013).
These two case studies demonstrate just some of the possibilities of introducing geological and geophysical information in the inversion. Others include mutual information (Mandolesi and Jones 2014), or geostatistical methods in a probabilistic inversion (e.g., Zahner et al. 2016). Bosch (2016) and Hansen et al. (2016) discuss the theoretical basis of using geological prior information in more detail.

Conclusions
The tutorial section of this paper has demonstrated the strengths and weaknesses of joint inversion and constrained inversion approaches. Used sensibly, they can significantly improve the inversion results and obtain models that better reflect the subsurface. However, using multiple data in a sophisticated automated inversion algorithm does not alleviate the user of the need to critically evaluate the inversion models. At a minimum, the data misfit of the joint inversion should match the misfit of the individual inversion, but ideally the trade-offs between fitting the different data sets and the impact of all parameter choices on the results should be explored. I have described some possible recipes that can be used as indications where the integration can be considered successful and where it has to be used with caution. Particularly when the models strongly depend on the choice of one or more parameter, e.g., the weighting between the different data sets or the weight of the coupling constraint, it is necessary to clearly understand why this is the case before any inferences can be made from the models.
Notwithstanding these difficulties, integrated approaches have reached a level of maturity that makes them suitable for application in a large number of areas from the near surface to the deep mantle. A plethora of methods exists that can be tailored to the available data and prior information and the case studies above have exhibited some examples. For some, such as joint inversion with cross-gradient coupling, sufficient experience and systematic studies exist that provide a clear picture where it can be expected to work and where not and how it should be used. Other approaches still lack this kind of information, but with increased use similar strategies can be developed. This is particularly important for methods that require a number of user-defined parameters and a fruitful field for future work. As discussed above, despite the multitude of coupling approaches, there is still room for new developments, particularly on how to include parameter relationship into joint inversion schemes.
In terms of applications, integrated schemes including electromagnetic methods have been used on all scales and with different targets in mind, but to date there seems to be a shortage of studies targeting geothermal exploration problems or investigating the structure of the crust. Here significant developments can be made with potentially large impact.