Introduction

The development of cities implies a growing demand of space and natural resources spreading in depth in both the horizontal and vertical in-depth directions. With the aim of achieving orderly urban growth and ensuring both sustainable and coexisting use of the various natural resource deposits and underground infrastructures, there then exists a clear need to manage the subsurface space (Schokker et al. 2017; Campbell et al. 2010). This need is especially relevant in detrital depositional environments consisting of sediment mixtures of differing grain sizes, since many urban areas around the world are built on such deposits. Underground management will be even more accentuated in the next years given the straightforward integration of geological subsurface into the building information modeling (BIM) method, which will be mandatory within the timeframe of 2021–2025 at a national level in some countries (Wehrens and Wolken 2019).

The highest possible level of knowledge of the geological subsurface properties is desirable for this purpose, but it is unfortunately not available in most cases. These properties condition the suitability of a wide variety of geological potentials in the underground space, e.g., heat supply, storage, water resources, and as space for urban infrastructure. Direct soil observations from drilled materials constitute the most common types of geological data available. Borehole data, which are often very scarce and qualitatively heterogeneous, are most abundant in urban areas. These data have become available for further analysis due to the increasing level of digitalization and represent a potential big data source for geological analysis in favor of spatial planning (Marti 2019). Alongside the common lack of knowledge, geology also possesses an inherent natural complexity of holding highly non-constant properties in space (de Marsily et al. 2005). These facts often result in conservative procedures for managing the use of geological potentials with respect to legislative requirements. Understanding geological natural media implies acknowledging and accounting for their spatial heterogeneity (de Marsily et al. 2005). Added to this, the wide range of potential uses that the subsurface space might hold requires a flexible methodological approach that is able to address multi-functional specifications (Volchko et al. 2020). Both the 3-D spatial delimitation and the appropriate small-scale parameter assignments significantly constrain the suitability of both geological subsurface potentials and their sustainable and coexistent use.

In order to address these challenges, the use of massive borehole data and the digitalization of geological mapping in a more general sense (Jones et al. 2004) and, more concretely, geological 3-D models, in their different forms and approaches, are gaining importance over the last decades — not only for visualization purposes but also as a planning and forecasting tool. This is evidenced by the fact that geological mapping is currently undergoing a transformation from traditional 2-D to 3-D, with geological 3-D modeling becoming a key priority for national geological surveys (NGS) (Berg et al. 2011).

Within the broad field of 3-D geomodeling approaches performed in order to face the aforementioned challenges, two key aspects will be addressed in the present paper: the predictive ability of geology with respect to heterogeneity and multi-purpose usability. This is all intended to be accompanied by a flexible and user-defined implementation.

Background of geological 3-D modeling and model usability

The general term of “3-D structural modeling” constitutes the preliminary step of any geological 3-D model and addresses the definition of the accurate geometry of the geological domain as well as the integration of uncertainties associated with the geometrical input data (Turner 2006; Pyrcz and Deutsch 2014). In this respect, Wellmann and Caumon (2018) have provided a very complete description of the current state and challenges of 3-D structural modeling. Therein, the sole consideration of structural geometric features by means of, e.g., layer-cake models or geometric 3-D models could be used as framework models (Wellmann and Caumon 2018), but they are considered to be an approximation for understanding the small-scale effects and heterogeneities involved. The resulting geometrical model is commonly discretized in volume elements of the tetrahedron type or of a variety of voxel configurations with the ultimate goal of providing subdivisions in which property distributions can be predicted (Turner 2006).

The prediction of both the geological composition and continuous properties (petrophysical, geotechnical, or hydrogeological) in 3-D space is addressed by means of 3-D property modeling in voxelized domains. Property predictions based on voxel models enable modeling of the internal heterogeneity of the subsurface and are becoming increasingly important (Hademenos et al. 2019; Stafleu et al. 2011). In this regard, Kearsey et al. (2015) compared a 3-D model consisting of lithostratigraphic units, each of which possesses a unique dominant lithology, with a lithological 3-D model, thus providing a lithological prediction in each voxel. He concluded that the minor lithologies occurring within the units commonly represent a significant proportion of the lithological composition, which underlines the relevance of the small-scale geological variability. Focusing on the geological composition, the predictive ability of the model commonly depends upon the introduction of a categorical variable whose stationary classes represent subdivisions which are more geologically and statistically homogeneous (Deutsch 2006). Commonly used categorical variables may thus be based on lithofacies classes, such as in the 3-D model for the glacial and postglacial deposits underlying the city of Glasgow in the UK (Bianchi et al. 2015) and in the 3-D lithofacies distributions of Zeeland in the Netherlands (Stafleu et al. 2011; Stafleu and Dubelaar 2016). Alternatively, the grain sizes (or rather the lithological classes) can be considered, i.e., the 3-D lithological distributions also implemented in the models mentioned earlier (Stafleu et al. 2011; Kearsey et al. 2015), as well as for the Belgian Continental Shelf (Hademenos et al. 2019). Lithological and stratigraphical information can be also combined to subdivide the geological space into lithostratigraphic units (Royse 2010; Stafleu et al. 2011; Hademoos et al. 2019). Another option is offered by hydrofacies classes showing lithological facies with distinctive hydraulic properties (Comunian et al. 2011; Theel et al. 2020). The categorical variable chosen largely controls the degree of geological variability which the model may achieve, and it has strong implications for the quantification of model uncertainties and model usability (Potter et al. 2013). The introduction of this categorical variable is case-specific (Diogo 2018), and the variable should be fixed in such a way that it does not oversimplify the geological reality (Lindsay et al. 2012) in addition to counting for the subsequent usability of the model — which should have a multi-purpose nature, if possible.

In regard to usability, geological 3-D models may be used based on pure visualization, the definition of geometric architecture of geo-bodies, to multi-parameter estimation for various specific applications (Ross et al. 2005) like groundwater flow numerical modeling. Regarding the geometric architecture of geo-bodies, this commonly implies the definition of their lateral extension, thickness, and three-dimensional connectivity. The latter feature is of utmost importance to correct flow and transport predictions, especially when dealing with contamination assessments (de Marsily et al. 2005). This interconnectedness is provided by analyzing the geometrical dependences between the geological bodies. Multi-parameter estimation largely depends on the level of detail introduced by the categorical variable defined. A lithological description as precise as possible is needed, on the basis of which properties will be assigned (de Marsily et al. 2005). The various stationary classes into which the geological reality is subdivided, whether they be lithologies, lithofacies, hydrofacies, or lithostratigraphical units, will be parametrized with constant geotechnical, hydrogeological, or petrophysical properties from the published literature, or with field data. In this regard, if the categorical variable considerably oversimplifies the geological reality; the constant parameters derived from it would also be misrepresented. This was demonstrated by Kearsey et al. (2015) by using a lithologically-derived model compared with a lithostratigraphic model while modeling the glaciofluvial deposits in the city of Glasgow. In this sense, 3-D geomodels, most of them based on a structural approach, have been applied in the last years in a wide range of applications in fields as varied as hydrogeology (Bianchi et al. 2015; Campbell et al. 2010), hydrochemistry (Raiber et al. 2012; Royse 2010), resource assessment (Maljers et al. 2015; Van der Meulen et al. 2005, 2007), seismic microzonation (Kruiver et al. 2017), geotechnics (Merritt et al. 2007), and environmental risk assessments (Wycisk et al. 2009).

The extended use of the subsurface in its wide variety of competing and coexisting functions — together with the inherent spatial variability of geology — motivates the need for underground management. This is being increasingly addressed by the digitalization of the geological space, which has resulted in significant progress in advanced geological 3-D models in recent years. The current modeling methodologies implemented in detrital systems introduce a categorical variable which tends to assume a stationary prevailing class (lithologies, lithofacies, hydrofacies, etc.) in each voxel for both predicting the final image of geology and further parameterizing the model with property distributions. In this sense, the presence of highly variable sediment mixtures ultimately controls the non-constant spatial distribution of properties in detrital systems. Our understanding of this small-scale spatial lithological variability can be improved by deepening the knowledge regarding the prediction of 3-D distributions of relative amounts of the grain fractions composing the sediment mixture.

This paper proposes a 3-D modeling approach which provides substantial contributions for improved modeling detrital depositional environments. On the one hand, this novel method addresses the prediction of the geological composition of detrital systems, considering these as clasts mixtures with quantifiable variable amounts of grain fractions. This approach is implemented by introducing a user-definition of the grain fractions being considered into the sediment mixture and the precision of the relative amounts of each grain fraction. On the other hand, this approach aims to address multi-functional specifications for different purposes. Special efforts have been made to combine model scenarios for applications requiring different relative frequencies of grain fractions.

After an explanation of the modeling method, its implementation in the city of Munich will be described using a large data set to display the geological prediction and the multi-user opportunities.

Methodology

In order to better understand the small-scale geological variability of sediment mixtures, a novel 3D-geological modeling approach is presented to, on the one hand, improve the spatial prediction of the lithology distribution and, on the other hand, to provide a flexible interpretation of the modeled lithological distribution for user-specified applications. The cornerstone of this method is the lithological heterogeneities in detrital sedimentary systems being able to be described as grain size distributions. This distribution of cumulative frequencies of grain fractions can be geomodeled in 3-D space in a geostatistical simulation framework by means of a set of partial percentile lithological models, referred to herein as Di models. This set of models in turn defines the cumulative frequencies of each lithological class in each voxel after defining a constant frequency. The relative amounts of grain fractions are thus also inferred. They refer to the lithological stages of the subsurface, being equivalent to the reference passing diameters Di of a grain-size distribution with the same percentile distances, whereby the subscript i defines the cumulative frequency of the total sediment mixture. This can be read as the grain size distribution curve at every voxel.

Based on this principle, the modeling approach uses a stochastic simulation framework by way of the sequential indicator simulation (SIS) method to reproduce (1) the hard data, which are truth measurements from the lithological descriptions from existing boreholes, (2) inferred spatial statistics in the distribution of grain sizes proportions, and (3) the spatial continuity of each lithological class for each Di. This process results in a multi-voxel solution in which many partial percentile lithological models as reference diameters Di are defined. On this basis, the concept of “Model Lithological Uniformity” is proposed, as a measure of the homogeneity of the grain sizes in the detrital mixture. The “Most Uniform Lithological Model” can be directly derived, and it represents the distribution of lithologies possessing the highest relative frequencies in the detrital mixture.

For a clear explanation of each methodological step, we have added some short examples of the case study in Munich.

Step 1 — Conceptual model

The first step of the proposed methodology includes:

  1. i)

    Construction and discretization of the geometrical framework of the 3-D space being modeled.

  2. ii)

    Definition of the grain fractions to be modeled in the detrital mixture. The grain fractions to be considered are defined by a categorical variable according to the grain sizes.

  3. iii)

    Definition of the percentiles of interest of the grain size distribution, denoted as Di. In this context, D represents the grain diameter expressed by the lithological class, and the subscript i is an integer indicating the percentile of interest. For example, the percentile of interest D10 denotes the lithological class below which the 10% finest grain fraction falls. The set of i values is defined so that the grain size distribution is divided into equal frequency groups. This is done by defining a constant step p between percentiles, e.g., the pth percentile, the (p + p)th percentile, the (p + 2p)th percentile, and the (p + 3p)th percentile. This implies a total number of N percentiles of interest considered (where N = 100/p). For instance, if we consider a step of p = 10, then i = 10, 20, 30, 40, 50, 60, 70, 80, 90, and 100 and we obtain N = 10 percentiles of interest. The number of percentiles can be adjusted by selecting a smaller or bigger step p, which increases respectively decreases the number N (e.g., N = 20 for p = 5). As a consequence, the percentile step p sets also the precision in the frequency quantification that the model is able to account for.

On this basis, the construction of N percentile lithological models is addressed using a geostatistical simulation approach. These models, which are linked to each percentile of interest of the sediment mixture, can be expressed as Di models.

To illustrate this, the conceptual model for the case study in Munich considers three grain fractions to be modeled (clay/silt, sand, and gravel), although a finer graduation is conceivable which also includes minor lithological components. In our case, the construction of ten partial percentile models (N = 10) was required in order to obtain a lithological prediction with a precision of 10% (p = 10). These are linked to the following ten reference diameters: D10, D20, D30, D40, D50, D60, D70, D80, D90, and D100 of the whole sediment mixture.

Step 2 — Input data: lithological descriptions from borehole data

The proposed method is intended for and already being applied to process large amounts of lithological descriptions from boreholes. This information is the basic input for the modeling approach. Given the conceptual model and, more specifically, once the grain fractions and the precision of the model are set, the lithological descriptions are prepared and interpreted for each described borehole section in the available dataset accordingly to constrain the partial percentile models as hard data. Borehole data might be stored in databases, which are normally managed at the national or regional level, and they include information about the intersected geology. One common aspect of all databases is that the borehole descriptions consist of coded information representing the geological composition for each drilled interval in terms of (1) the geological nature and (2) the degree of intensity or presence of the lithological components. In this respect, the use of common codes information for soil description is promoted at the European level by European standards EN ISO 14688–1 and EN ISO 14688–2. Beyond this, additional information based on regional or national guidelines, e.g., DIN 4023 (2006), might be also used. The workflows adopted in this approach for data acquisition and preparation were adapted to the borehole database used for the present case study in the city of Munich (Germany) using the integration of the aforementioned European references into the German standards (DIN EN ISO 14688–1 (2020); DIN EN ISO 14688–2 (2020); DIN EN ISO 14689 (2018); and the DIN standards (DIN 4023 (2006)). The “Conceptual model and borehole database” section summarizes the features of this database.

Steps 1 and 2 of the methodology, the data acquisition to the interpretation ready-to-use in geostatistical simulations is illustrated below (Fig. 1) by the following lithological description as an example: (gG,s,o). This can be translated into moderately sandy (s) coarse gravels (gG) with a presence of organic material (o).

In order to achieve a grain-size based, consistent database, the borehole descriptions were firstly subject to quality control (QC), a plausibility analysis, and homogenization, because the borehole descriptions in the databases generally include unnecessary or implausible information. In our example, and as a consequence of the QC, the latter component from the description (gG,s,o) was discarded, resulting in (gG,s), since the presence of organic material does not provide grain-size-based relevant information (see Fig. 1b). Every linguistic code indicating a lithology was converted into an interval in the form of [mmin,mmax], representing its possible imprecise region of relative frequency values in which the true value (mtrue) lies. These ranges arise from the inherent generalizations linked with the soil description schemes used, which provide reference values for the minimal and maximal relative frequencies for each code. Hence, in order to obtain a homogeneous data set for the further use, a correction process of not plausible borehole descriptions ignoring the inherent frequency classification scheme (e.g., describing a lithological frequency over 100%) was also accomplished. In the classification schemes, the true relative quantification of the soil components was bounded between some intervals. Following this interval arithmetic, the borehole description in our example (gG,s) indicates that the relative frequency (%) of the clasts in the range of grain sizes between 20 and 63 mm lies in the interval [70, 85], and in the case of the clasts between 0.063 and 2 mm lies in the interval [15, 30] (Fig. 1b).

Based on these intervals, cumulative distribution functions (CDFs) of the different grain fractions are built. Therefore, the grain-size distribution curve is equal to the discrete cumulative distribution of the lithological classes. The CDFs are considered as synthetic distribution functions, since they are not based on results from a real particle size sieve analysis.

Since the intervals describe a range and not a precise percentage of the lithological components, the intervals of relative frequencies are used to build upper and a lower CDF curves representing the most coarse-grained and most fine-grained interpretations of the lithological descriptions. Similar to the probability bounds analysis (PBA) (Aughenbaugh and Paredis 2007), these curves represent the interpreted bounds between which all possible distributions might lie. Returning to our example, the point (63,100) in Fig. 1c indicates that 100% of all grain fractions are finer than a cobble. At the 20 mm abscissa position, two points are represented, one for each minimal and maximal relative frequencies for the sands mentioned earlier. Whereas point (20,15) shows that 15% of all grain fractions are finer than a coarse gravel, point (20,30) indicates a relative amount of 30% being finer than the aforementioned lithology. The first point then belongs to the most coarse-grained interpretation from the borehole descriptions, and the latter to the most fine-grained interpretation. Lastly, at the 0.063 mm abscissa position, two points are represented in the same position, indicating that there are no particles finer than a sand. The grain fractions which were not included in the borehole descriptions are represented by horizontal steps in the shape of both CDF curves, since they do not provide any cumulative relative amounts. We observe in Fig. 1c that this occurs for the following grain fractions: clay, silt, fine to medium gravel, and cobble. The exact distribution within a grain size composed of several minor grain fractions (in this case, e.g., between 0.063 and 2 mm), is unknown when using a lithological scheme as described above. As a result, the proportions of fine-grained to coarse-grained sands are interpolated by applying a linear equation between two pairs of values for a semi-log plot with an abscissa axis logarithmically scaled.

In order to constrain the partial percentile models, both the lower and upper CDF bounds are read at each of the N reference diameters of the sediment mixture representing the cumulative frequencies. The reading process is implemented according to the parameters defined in the conceptual model: three grain fractions (gravel, sand, and clay/silt) and a precision p of 10% (see Fig. 1d). A set of two recorded lithological classes for each reference diameter Di is obtained, one from the most fine-grained interpretation from the borehole descriptions, i.e., the lower CDF, and another from the most coarse-grained borehole descriptions, i.e., the higher CDF. This set is then coded into categories, in our example into: 1 (clay/silt), 2 (sand), and 3 (gravel). In this way, the information initially contained in the borehole descriptions for each drilled interval, given as linguistic codes, was converted into a set of data in the format of three rows (Di, lithological class for lower CDF, lithological class for upper CDF) and by as many columns as the Di values defined. These data sets describe the interpreted existing lithological class at the specific Di values for the upper and lower CDF interpretation and can be easily imported into the discretized 3-D space as input data in the form of interval-logs (see Fig. 1d).

Step 3 — Spatial statistical inference

This step includes the spatial analysis of geological trends in the distributions of the grain size proportions detected in the available geological data. As a result, “soft” information on areal trends in the distribution of the proportions of the lithological classes are provided. The integration of the trends or “soft” data is a necessary step prior to geostatistical simulation (Pyrcz and Deutsch 2014). The application of all geostatistical algorithms implicitly considers the assumption of stationarity underlying the variable in question — in this case the prevailing lithology from all possible grain sizes defined in the conceptual model. The introduction of deterministic trends overcomes these assumptions (Ringrose and Bentley 2015). Although a wide range of techniques are used in order to account for trends, no purely objective method for trend modeling exists (Pyrcz and Deutsch 2014). Areal trends and vertical proportion curves (VPC) provide a 2-D insight into the spatial distribution of the variables under consideration. In contrast, the 3-D trends combine both in-depth and areal information and account for a fully three-dimensional trend model. Currently, the most common practice consists of subdividing the modeling domain into different large-scale units, e.g., lithostratigraphic units, and inferring the proportions of the categorical variable from the available borehole data in each separated volume (Stafleu et al. 2011; Kearsey et al. 2015).

This modeling approach enables the superimposition of trends, in all their variants, onto the simulation. The type of kriging chosen by the modeler conditions the treatment of stationarity (Ringrose and Bentley 2015). The trends were addressed separately for each of the reference diameters Di. In order to account for the interpretation from both the lower and upper CDF (see Step 1), each Di value required two trends, one for each CDF bound. Trends were then applied as inputs to constrain the geostatistical simulation in the following form p(u; k), where k = 1, 2, and 3 and p(u; k) is the continuous probability between 0 and 1 of lithological class k at location u.

Step 4 — Spatial continuity model

Step 4 of the methodology concerned the quantification of spatial correlation by means of 3-D variogram analysis, providing the understanding of the spatial continuity of the lithological classes from the hard data, which is the continuity of the indicator variables i(u; k). For instance, Goovaerts (1997) and Gringarten and Deutsch (1999) provide a comprehensive description of variogram interpretation and modeling.

The spatial geological variance in the borehole data was investigated both for each reference diameter Di and for the lithological classes considered in the conceptual model. The hard data in interval-logs (see Fig. 1d) needed to be adapted in the form of indicator data in order to use it for the 3-D variogram analysis, and to feed the geostatistical simulations (Step 5). To do this, if we consider the categorical variable under study (which is in this case the lithological class), as well as their mutually exclusive and discrete k values, we can express the indicator categorical variable as (Deutsch 2006):

Fig. 1
figure 1

Steps 1 and 2 of the Di models method. a) Lithological description for a drilled interval in a borehole database. b) Quality control (QC), data homogenization, and translation of lithological descriptions into intervals of relative frequencies. c) Capturing the lower and upper cumulative distribution functions (CDF) bounds for all sediments. d) Reading and coding of lithological classes at each Di according to conceptual model (precision of quantification and grain sizes to be modeled)

Fig. 2
figure 2

Step 5 of the Di models method: estimation of the required number of geostatistical simulations for each Di. a) Histogram and probability plot of the information entropy of the membership to a lithological class. Number of bins of histogram: 30. b) Histogram and probability plot of the information entropy of the membership to a lithological class. Number of bins of histogram: 30. c) Plot of summary statistics of the information entropy of the membership to a lithological class obtained for different number of simulations. d) Plot of sum of variation of the summary statistics obtained by increasing the number of simulations. Figure 13 in Appendix 1 provides an additional illustration of this process

Fig. 3
figure 3

Step 6 of the Di models method: lithological predictability. a) Partial percentile lithological models (Di models). b) Location u1 showing a low MLU value. c) Location u2 showing a high MLU value. d) Most Uniform Lithological Model representing the distribution of lithological classes owning the highest relative frequencies in each voxel. The lithological class k is represented in different colors: k = 1 (clay/silt) in purple, k = 2 (sand) in orange, and k = 3 (gravel) in yellow. Lithological descriptions according to DIN standards DIN 4023 (2006). (S,g/,tu): strongly gravelly sands with a moderate presence of silt/clay. (G,s’): slightly sandy gravels

Fig. 4
figure 4

Step 7 of the Di models method: multipurpose usability

Fig. 5
figure 5

Geological setting in the case study area. a) Location map. b) Simplified geological map in the Munich area showing the main lithostratigraphic units and the boundaries of the city of Munich (modified after Bayerisches Geologisches Landesamt 1996). c) Schematic geological view of the city of Munich (modified after Münichsdorfer 1922). m.a.s.l.: meters above sea level

Fig. 6
figure 6

Borehole database in the case study area. a) Histogram showing the number of boreholes reaching a certain final depth in the city of Munich. The total number of boreholes is 20114. b) Location map of the area of the case study showing the location of the boreholes used and two cross sections AA’ and BB’, which are used to present and discuss the results

Fig. 7
figure 7

Spatial continuity results of the lithological classes in the subsurface 3-D space in the city of Munich. a) Relation between the correlation ranges in the horizontal and vertical directions of each Di model for each lithological class and comparison with common width/thickness (W/T) spaces of fluvial-channel bodies from Gibling (2006). The dotted lines represent different W/T ratios. b) Correlation ellipsoids for the lithological class “gravel” for each Di value. c) Correlation ellipsoids for the lithological class “sand” for each Di value. d) Correlation ellipsoids for the lithological class “silt/clay” for each Di value

Fig. 8
figure 8

3-D views of the partial percentile lithological models (Di models) in the city of Munich. a) D10 model. b) D50 model. c) D100 model

Fig. 9
figure 9

Case study in the city of Munich. a) 3-D view of the Model Lithological Uniformity. b) Cross section BB’ showing the Model Lithological Uniformity. c) 3-D view of the Most Uniform Lithological Model. d) Cross section BB’ showing the Most Uniform Lithological Model. e) 3-D view of the architectural model of the Quaternary and UFM deposits. f) Cross section BB’ showing the architectural model, the aquifer assignation to the filter pipes and the groundwater and potenciometric surfaces (T3 almost not prominent, because it occurs mainly in the western part of Munich; m.a.s.l.: meters above sea level)

Fig. 10
figure 10

Cross section AA’ showing the results of the scenario analysis for the design process of dewatering systems in the Quaternary aquifer in the city of Munich. a) Conservative scenario representing the most demanding dewatering in view of the D60 model. b) Optimistic scenario representing the less demanding dewatering in view of the D40 model (m.a.s.l.: meters above sea level)

Fig. 11
figure 11

Summary of results for cross validation: Bubble plot representing the fuzzy closeness of the estimated lithological classes to the true lithological classes for the D10 to the D100 models and the Most Uniform Lithological Model (MULM). Whereas the bottom right half of the bubbles represents the estimated lithological classes, the upper left half shows the collocated true lithologies. The number of locations for each true-estimate case is represented through the size of the dots. If the closeness to the true lithology was 0, the estimation was correct. These cases are represented along the dotted blue line (true = estimate). If the closeness to the true values was 1, the contiguous lithology was predicted instead of the true one. In the case that gravel was estimated instead of silt/clay, and vice versa, the closeness to the true lithology was 2 and the impact of misclassification was high. The light gray area represents the cases when the prediction underestimated the grain size of the true lithology (e.g., true = sand, estimate = clay/silt). An overestimation in the prediction occurred when the true lithology was finer than the estimated lithology (e.g., true = sand, estimate = gravel). This case is represented by a dark gray area

Fig. 12
figure 12

Analysis of worst-case predictions (closeness to true = 2) and checking the degree of sediment mixing for correct predictions (closeness to true = 0): Boxplots showing the distribution of Model Lithological Uniformity (MLU) for correct predictions and for worst cases of prediction with a high impact of misclassification. The number of locations (each location represents an interval of 10 cm) is 168,209 in the case of correct predictions and 6178 for the worst-cases

$$i\left(u;k\right)=\left\{\begin{array}{l}1,if\;lithological\;class\;prevails\;at\;location\;u\\0,otherwise\end{array}\right.,k=1,2,3$$

where:

$$\begin{array}{l}u\quad:\mathrm{location}\;\mathrm{being}\;\mathrm{estimated}.\\k\quad:\mathrm{lithological}\;\mathrm{class}.\\i\left(u;k\right)\quad:\mathrm{indicator}\;\mathrm{variable}\;\mathrm{for}\;\mathrm{the}\;\mathrm{lithological}\;\mathrm{class}\;\mathrm k\;\mathrm{in}\;\mathrm{the}\;\mathrm{location}\;u.\end{array}$$

Consequently, the hard data are coded into 1 s if the lithological class prevails, and into 0 s if it does not.

Step 5 — Geostatistical simulation

The modeling approach considers the geostatistical simulation of the lithological class for each Di in 3-D space. Geostatistical simulation is preferred over estimation in order to generate multiple equiprobable models (Pyrcz and Deutsch 2014). This is intended to include the various equiprobable interpreted lithologies of the lower and upper CDF bounds for each Di (see Fig. 1d). The integration of the lithological classes for both bounds ultimately implies the introduction and propagation of uncertainties from the borehole descriptions. In order to ensure an equal representation of both valid interpretations for each Di model, it was necessary to fix the same number of simulations for the lower CDF bound as for the upper bound.

Among other possibilities for the geostatistical simulation, the SIS algorithm was selected. This choice was influenced by a variety of aspects. The presence of dense conditioning well data (Hassanpour and Deutsch 2010) and the lack of clear genetic shapes of the sedimentary bodies (Deutsch 2006) prioritize the use of SIS and may limit the use of object-based methods. Its voxel-based character also enables the incorporation of high-resolution trends as secondary data (Step 3), as well as the reproduction of small-scale features (Pyrcz and Deutsch 2014). SIS methods apply indicator kriging (IK) for the simulation of categorical variables (Deutsch 2006; Goovaerts 1997), and they introduce heterogeneity using a sequential random path for visiting unsimulated locations in order to draw Gaussian realizations using an indicator transform (Ringrose and Bentley 2015). SIS visits (iteratively and one-by-one) all unsampled locations until all locations are informed by a simulated value (Journel 1989). Further, SIS reproduces the hard data from the boreholes (Step 2), the inferred spatial statistics (Step 3), and the spatial continuity of the indicator categorical variables of each lithological class (Step 4). Most SIS variations relate to how the trends are superimposed on the simulation. The general remarks of the different options are summarized in detail in Deutsch (2006).

The application of SIS honors the conceptual model taking place separately for each of the reference diameters defined (Di). The required number of realizations concerning each Di is estimated through the analysis of the frequency distribution of the assigned lithological class in the whole model domain. Based on a certain number of simulations, the frequency distribution does not vary significantly. Various simulation sets are run for this purpose, each of them having a different number of realizations. The effort then focused on observing from which simulation set an increase in the number of realizations did not lead to a significant change in the lithological attribution in the whole model. In the example shown in Fig. 2, a total of ten sets are considered. The starting set consisted of 10 realizations and was increased by steps of 10 simulations, reaching a maximal number of 100 for the last set. For each simulation set, the information entropy of the membership to a lithological class was calculated for all voxels of the whole model domain. Figure 2 illustrates the process of estimating the number of simulations required for each Di. Figure 2a, b show the histogram and the probability plot for the information entropy for 10 and 30 simulations, respectively, from which the summary statistics can be computed. The minimal number of simulations was obtained when the entropy summary statistics converged to a constant value. Figure 2c shows how the aforementioned summary statistics vary when increasing the number of simulations by steps of 10 realizations. We observe that the mean has an initial value of 0.4 and becomes stabilized at a value of 0.5 when at least 30 simulations are run. The simulation process also shows that the 10th, 50th, and 90th percentiles of the values obtained vary widely when running simulations sets of 10 and 20 realizations and reach a stable value when the simulation reaches a number of 30 realizations. We can also appreciate this stabilization but less intensively in the case of the standard deviation and the standard error of the mean. In this example a minimum number of 30 simulations is satisfactory. Figure 2d represents the sum of the variation of all aforementioned summary statistics between contiguous simulations sets and shows clearly the simulation set from which the information entropy associated to the membership to a lithological class did not vary significantly. Figure 13 in Appendix 1 provides an additional illustration of this process.

Step 6 — Lithological predictability

Partial percentile lithological models (D i models)

The Di models represent the lithological class associated with each cumulative frequency of the total detrital mixture in each voxel. Figure 3a illustrates a schematic example of a set of Di models. The relative frequency value remained constant and corresponded to the precision p defined in the conception model (10% in this example; see Step 1). There are as many partial lithological models as the number N of percentiles of interest considered. The probability distribution of lithological classes for each Di in each voxel was calculated as a result of the stochastic simulation. For each Di, the multiple probability field was encoded on a representation of the most probable lithological classes. The sum of the relative frequencies of each grain fraction can be directly inferred and is shown for two locations, in Fig. 3b, c.

Model Lithological Uniformity (MLU)

The proposed methodology introduces the concept of “Model Lithological Uniformity” as a measure of the degree of the lithological composition homogeneity of a clasts mixture. Figure 3b, c show the “Model Lithological Uniformity” values at two locations based on the relative frequencies obtained from the Di models. The concept of information entropy (Shannon 1948) has been used in geological 3-D models as an objective measure of uncertainty (Goovaerts 1997; Wellmann and Regenauer-Lieb 2011). This concept was adapted to the particular context of clasts mixtures and understood as a measure of the disorder state of a detrital system at a grain-size level in each voxel. The greater the lithological disorder, the higher the lithological entropy, so the entropy equals 0 if the heterogeneity has a maximum value, and the lithological uniformity thus reaches a minimum value. In this context, the standardized lithological heterogeneity (normalized to 1) can be expressed as the ratio between the estimated lithological entropy in the presence of three possible lithological classes and the upper bound of the lithological entropy (log k). This can be expressed as follows:

$$MLU\left(u;k\right)=1-\frac{h\left(u;\;k\right)}{\log\;k}=1-\left[\frac{-\sum_{k=1}^jp_k\left(u\right)\times log\left[p_k\left(u\right)\right]}{\log\;k}\right]\quad\quad\left[0,1\right]$$

where:

$$\begin{array}{l}MLU\left(u;k\right)\quad:\mathrm{Model}\;\mathrm{Lithological}\;\mathrm{Uniformity}\;\mathrm{to}\;\mathrm a\;\mathrm{lithological}\;\mathrm{class}\;k\;\mathrm{at}\;\mathrm{location}\;u.\\u\quad:\mathrm{location}\;\mathrm{being}\;\mathrm{estimated}.\\\begin{array}{l}k\quad:\mathrm{lithological}\;\mathrm{class}.\\j\quad:\mathrm{largest}\;\mathrm{lithological}\;\mathrm{class}\;(j\hspace{0.17em}=\hspace{0.17em}3\;\mathrm{for}\;\mathrm{gravel}\;\mathrm{in}\;\mathrm{the}\;\mathrm{default}\;\mathrm{case}).\\\begin{array}{l}h\left(u;k\right)\quad:\mathrm{lithological}\;\mathrm{entropy}\;\mathrm{of}\;\mathrm{the}\;\mathrm{membership}\;\mathrm{to}\;\mathrm{a}\;\mathrm{lithological}\;\mathrm{class}\;\mathrm k\;\mathrm{at}\;\mathrm{location}\;u.\\p_k\left(u\right)\quad:\mathrm{relative}\;\mathrm{frequency}\;\mathrm{of}\;\mathrm{lithological}\;\mathrm{class}\;k\;\mathrm{at}\;\mathrm{location}\;u\;\mathrm{based}\;\mathrm{on}\;\mathrm{the}\;D_i\;\mathrm{models}.\end{array}\end{array}\end{array}$$

Most Uniform Lithological Model (MULM)

By applying the concept of “Model Lithological Uniformity” (MLU), the relative frequencies of each lithological class were computed based on the Di models. To illustrate this, Fig. 3b, c show the resulting relative frequencies at two random locations, u1 and u2 with differing degrees of lithological homogeneity. In each location of the 3-D space, the lithological classes possessing the highest relative frequency, e.g., k = 2 (sand) for u1 and k = 3 (gravel) for location u2, conform with the Most Uniform Lithological Model of the system (see Fig. 3d). This model itself does not show the complexity of the geological reality in each voxel (the clast mixture), but rather represents the distribution of the most dominant lithology.

$$MULM\;(u)=k\;with\;max\left(\left\{p_k\left(u\right):\;k=1,\dots,\;j\right\}\right)$$

where:

$$\begin{array}{l}u\quad:\mathrm{location}\;\mathrm{being}\;\mathrm{estimated}.\\k\quad:\mathrm{lithological}\;\mathrm{class}.\\\begin{array}{l}j\quad:\mathrm{largest}\;\mathrm{lithological}\;\mathrm{class}\;(j\hspace{0.17em}=\hspace{0.17em}3\;\mathrm{for}\;\mathrm{gravel}\;\mathrm{in}\;\mathrm{the}\;\mathrm{default}\;\mathrm{case}).\\MULM\left(u\right)\quad:\mathrm{Most}\;\mathrm{Uniform}\;\mathrm{Lithological}\;\mathrm{Model}\;\mathrm{in}\;\mathrm{the}\;\mathrm{location}\;u.\\p_k{(u)}\quad:\mathrm{relative}\;\mathrm{frequency}\;\mathrm{of}\;\mathrm{lithological}\;\mathrm{class}\;k\;\mathrm{at}\;\mathrm{location}\;u.\end{array}\end{array}$$

Step 7 — Multipurpose usability

As shown in Step 6, the relative frequencies of each grain fraction given by the conceptual model were directly inferred from the Di models. This is shown at the center of Fig. 3b and c. A synthetic grain size distribution curve can thus also be derived at each location (see at right of the preceding figures) or extracted along virtual wells. The availability of synthetic grain size curves in individual voxels strongly encourages the model usability from various points of view, which are summarized in Fig. 4. The lithological classes can be reclassified on a cell-by-cell basis, with categories for sediment mixtures, lithofacies types, soil classes for civil engineering purposes, or from cutoffs from numerical property distributions (Fig. 4a3). If needed, the expected lithological composition at a voxel scale can be expressed as coded information according to regional or national guidelines (see the examples in Fig. 3). Moreover, the numerical attribution and modeling of property distributions is not limited to the mere parameterization of assignments based on the literature or field/laboratory data (Fig. 4a1). The Di models approach enables derivation of empirical parameters that are dependent on the virtual grain size variations at each model location, e.g., hydraulic conductivity (Hazen 1892; Seelheim 1880; Beyer 1964), effective porosity (Marotz 1968), and geotechnical parameters. Alternatively, the aforementioned grain size-dependent parameters can be calculated from the CDF bounds of the lithological descriptions along the drilled boreholes. The 3-D property distributions can then be modeled, if required (Fig. 4a2). Whatever the choice made, it then leads to a prediction of the property distributions of such parameters more accurately in 3-D space than those obtained with the straightforward attribution.

In addition, building linkages or groundwater related issues between model cells based on geometrical or parametrical constraints is the basis for identifying groups of cells in compliance with some requirements. This is, for example, the case when analyzing the connectivity of geological bodies in order to determine a reservoir architecture (see Fig. 4b1), or when capturing the saturated aquifer parts overlying a confining layer owing to a certain thickness (Fig. 4b2).

The identification of pessimistic, realistic, and optimistic scenarios for the distribution of specific lithologies in space on the basis of modeled grain-size fractions can be of benefit to applications demanding the geometries of, e.g., confining layers or aquifers (see Fig. 4b3), or for civil engineering issues like tunnel construction. Given that the modeling of lithology distribution always includes uncertainties (caused by non-optimal spatial distribution of input data, imprecise lithological descriptions, etc.), a conservative representation of the lithology geometry of interest (e.g., the maximal lithology distribution with a higher permeability) can support practical assessments like the calculation of dewatering regarding underground structures (to name one example). In this example, a conservative representation of the lithology of interest means that lithology having a higher degree of permeability (gravel, sand), thus causing a higher water encroachment, is modeled with a larger spatial occurrence in order to be on the safe side for the further dewatering calculations. For example, we would in this use case propose a specific Di model (e.g., 60th percentile) as a representation of the lithology distribution.

The aforementioned unicellular and macrocellular workarounds are susceptible to further analysis by means of classical 2-D computations combining features from both approaches.

Application of the method in the case study in the city of Munich

Geological setting

The proposed methodology was tested on a city-wide scale in Munich (Germany), which is located in the North Alpine Foreland Basin in the southern part of Germany (Fig. 5a). This typical asymmetric foreland basin, also known as Molasse basin, dips southwards underneath the Alps and is filled with up to 5000 m of sediments belonging to several marine to continental transgressive/regressive sequences (Lemcke 1988). The most recent deposits in this basin are formed mainly by Quaternary coarse-grained gravels from glacio-fluvial origin, and they cover a total surface area of 2250 km2 along the so-called Munich Gravel Plain (MGP) (see Fig. 5b).

In the city of Munich, the MGP comprises various sandur terraces as well as the floodplain of the Isar River, formed by gravel deposits having a thickness from few meters up to several decameters. Most of the deposits in the MGP were formed during the Pleistocene glacial cycles of Riss and Würm. The melt waters of the Isar-Loisach glacier resulted in the formation of braided rivers which piled up gravelly deposits to huge outwash plains (Bauer et al. 2005). The Quaternary aquifer constitutes one of the most productive groundwater occurrences in Europe (Freudenberger and Schwerd 1996) and conforms with the principal shallow aquifer in the region, which hosts a wide variety of coexisting groundwater uses, such as industrial water, drinking water, or groundwater heat pumps (GWHPs) for open loop geothermal systems. The city of Munich, covering an area of approximately 310 km2 of the total extent of the MGP, is situated in the central part of this outwash plain (see Fig. 5b) and concentrates the majority of uses (e.g., over 2800 GWHPs were recorded in 2020). This relevance has led to an increasing number of scientific activities in recent years addressing various aspects of this aquifer in the Munich area, e.g., its hydrogeological classification like hydrofacies characterization of the gravely deposits (Theel et al. 2020) and the geostatistical relief modeling of the geometry of the aquifer basis (Albarrán-Ordás and Zosseder 2020) or characterization for geothermal purposes (Böttcher et al. 2019). These deposits are underlain by flavio-lacustrine successions in erosive discordance, representing the last Tertiary sedimentation stage of the basin (see Fig. 5c). This deposition occurred during the Middle and Late Miocene and is represented by the Upper Freshwater Molasse (UFM) (Bachmann and Müller 1992). The filling deposits of UFM are several hundred meters thick and comprise grey or brownish and partly also reddish clayed-marly slackwater sediments and sandy to coarse-grained river channel deposits. The mica-bearing clayed and sandy sediments are fairly characteristic of the UFM deposits. The latter are yellowish grey to brownish and are commonly known as Flinz. Since these formations consist almost exclusively of unhardened fluvioterrestrial sediments, the lithological differentiation represents the distinctive criterion for subdividing the UFM into major lithostratigraphical units, which are in turn classified into the Western and the Eastern Molasse facies (Schwerd et al. 1996; Doppler et al. 2005). The UFM includes relevant but less-explored groundwater occurrences and are characterized by their high level of geological complexity, which poses a major challenge to characterizing their geometric architecture and geological composition. The groundwater from this aquifer system is exploited at different depths for a wide variety of uses. Among others, it is mainly used for public and drinking water but also as an industrial and brewing water supply. In addition, this aquifer is conceived for facilitating emergency water supply, if required. A great deal of infrastructure has been built in both the Quaternary and the Tertiary units, e.g., pipework systems, tunnels, and the deeper subway structures. The coexistence of this diverse range of uses and influences as well as the high natural variability of these deposits highlights the need for their long-term groundwater management. Due to its relevance, the latter formation has also been subject of research in recent years (Prösl and Anders 2011; Zosseder et al. 2019).

Conceptual model and borehole database

A borehole database from the Soil Information System of Bavaria (Germany), abbreviated as BIS, was considered in the present case study. This database is a comprehensive soil inventory managed by the Bavarian Environmental Agency and is typical of the datasets provided by governmental institutions, e.g., geological surveys. This database includes, among other issues, information on an exhaustive borehole database from the drilling operations during recent decades (Kresse and Danko 2012). Consequently, the workflows adopted for the data acquisition and preparation were adapted to the peculiarities of this borehole database. However, as mentioned in the “Methodology” section (Step 2), the workflow can be adapted to other borehole databases as needed.

The borehole descriptions contained in the BIS database consist of coded information representing the geological composition for each drilled interval and according to the following standards: DIN 4023 (2006); DIN EN ISO 14688–1 (2020); DIN EN ISO 14688–2 (2020); DIN EN ISO 14689 (2018). These standards define the symbolic codes used for soil description. The dataset used is comprised of the lithological descriptions from 210322 drilling intervals. This information corresponds to a total number of 20,114 boreholes amounting to a density of approximately 64 boreholes per km2 on average (see Fig. 6). The lithological descriptions from the boreholes sum up to a total of 347 km of drilled materials with lithological information to process.

The framework model consisted of a grid covering an area of 27 by 21 km. This was comprised of approximately 23 million voxels, each of which had a resolution of 100 by 100 m in the horizontal axis, and 1 m in the vertical direction. The modeling domain had a maximal depth of 170 m above sea level (m.a.s.l.), and the highest voxel was at a height of 590 (m.a.s.l.). The model holds a very variable thickness, since the ground surface has a marked downward slope towards the north. The user-defined categorization of the lithological classes resulted in three types, depending on the particle grain sizes: 1 for clay and silt (≤ 0.063 mm), 2 for sand (> 0.063–2 mm), and 3 for gravel (> 2 mm). This decision was primarily based upon selecting the lithological classes with more practice-oriented implications. In addition, a precision p of 10% in the relative amounts of grain fractions was defined as previously mentioned by way of example in the “Methodology” section.

Spatial continuity of the lithological classes

To describe the spatial correlation of each lithological class (1, 2, 3), the variograms were estimated from the available interval-based lithological information of the 210,322 borehole descriptions. The points in Fig. 7a show the relationship between the maximal horizontal correlation ranges and the vertical correlation ranges for all Di values and all lithological classes. Although the variogram ranges are not the same as the element body sizes (Ringrose and Bentley 2015), a set of dotted lines was also plotted in order to visualize different reference width-thickness ratios (W/T) and to compare the correlation ranges obtained with the dimensions of fluvial channel bodies. The W/T ratio defines the broadness or narrowness of architectural elements of fluvial systems (Miall 2014). Gibling (2006) emphasized in the study of the 3D geometry of fluvial channel bodies proposing twelve different types of channel bodies and valley fills based on more than 1500 bedrock and fluvial bodies for which width and thickness were recorded. A comparison between the correlation ranges obtained and the results from Gibling (2006) indicated that the correlation ranges of the gravels can be attributed to the common W/T spectrum of values for meandering rivers (see Fig. 7a), which are the result of the juxtaposition of deposits from different courses. These fluvial channel bodies typically have ranges from 4 to 20 m in thickness, from 0.3 to 3 km in width, and W/T values from 7 to 940. Similarly, we can also observe that the narrower correlation ranges of the sands very closely fit the dimensions of fixed river systems (see Fig. 7a). These deposits may represent single-channel or braided-sand bed networks (Gibling 2006) and typically have a W/T from 2.5 to 150, a thickness of between 3 and 15 m, and widths of between 15 and 300 m.

On the other hand, Fig. 7b, c, d illustrate the spatial continuity in the horizontal direction by means of the ellipsoidal representations regarding the direction of maximal correlation and correlation ranges in the horizontal direction for the three lithological classes as well as each Di reference diameter.

The variogram analysis showed that the gravels have high horizontal-vertical ratios for almost every Di (Fig. 7a) and, at the same time, they have high horizontal correlation ranges, especially from the D40 value (Fig. 7b). Although most variograms are less anisotropic, the prevailing angles of maximal correlation are E-W. In the case of sands, lower ranges in both figures were evident, indicating a priori narrower geometries associated with this lithology (Fig. 7c). Except for D10, D20, and D100, all of the maximal horizontal correlation ranges were smaller than approximately 200 m, with correlation angles approximately along the 100° line.

Partial percentile lithological models (D i models)

The distribution of lithological classes 1, 2, and 3 was modeled for each fixed Di value of the cumulative frequencies. The D10, D50, and D100 lithological models are shown in Fig. 8 as an example of the set of Di models. Behind each Di model, the probability of occurrence of each lithological class in each voxel based on 50 stochastic simulations for each Di model will remain for further interpretation.

As might be expected, an increment in particle size of the sediments became clearly evident as the cumulative frequency of the whole sediment mixture increased from D10 to D100. The distribution of lithological classes obtained for the D10 model indicated that the 10% of the finest particles in both the Quaternary and UFM units are mostly composed of clays and silts and, to a lesser extent, of sands. The gravelly areas in this model are extremely rare and reflect deposits made of at least 90% gravel, which are only present very locally in the Quaternary and a deep UFM formation. The D50 model shows that the 50% of the finest sediments in the UFM are still mostly fine-grained. The Quaternary, in contrast, is characterized by a picture dominated by the presence of gravels, indicating that all areas marked as yellow are expected to own relative percentages of gravels of at least 50%. The last partial lithological model shown here, this is the D100 model, shows clearly the presence of clayed/silty successions in the UFM, which means expected relative percentages of 100% of the same lithologies. These impermeable beds are likely to behave as aquifuges between the different confined aquifers in the UFM, mostly composed of sands. In this case, the Quaternary sediments are almost completely formed by at least a small relative percentage of gravels.

Model Lithological Uniformity (MLU) and Most Uniform Lithological Model (MULM)

Based on the Di models shown above, the Model Lithological Uniformity and the Most Uniform Model are derived as described in Step 6 in the “Methodology” section. Figure 9a, b show the Model Lithological Uniformity of the detrital mixtures in Munich´s subsurface, representing the degree of homogeneity of the clastic mixtures present in the Quaternary and UFM deposits (see the “Application of the method in the case study in the city of Munich” section). The lithological classes owing the highest relative frequencies in each voxel in the city of Munich constitute the Most Uniform Lithological Model presented in Fig. 9c, d.

The results clearly highlight a main gravel body at the top, holding a median Model Lithological Uniformity of 0.38. Only 10% of the Quaternary gravels show extremely high MLU values above 0.54. This volume covers, with a median thickness of 13 m, the first meters from the topographical relief and corresponds with the Quaternary glacio-fluvial deposits. Moreover, the bottom surface of this geo-body exhibits a variable relief composed of meander-shaped channels and plateaus (Albarrán-Ordás and Zosseder 2020). The thickness distribution of the Quaternary deposits is highly variable and only locally interrupted at Tertiary hills, such as the “Aubinger Lohe,” a spot at the western end of the city or along the slope edges of the Isar River crossing the city of Munich from south to north (see Figs. 5b, 6b, and 9c). As we can see, the gravelly sediments in the Quaternary are interrupted in depth by a prominent erosive discordance below which the sandy, silty, and clayed lithologies of the UFM become predominant. As a result, wide areas of the Quaternary gravels along this discordance are directly underlying of sandy UFM-sediments and hence, showing a potential hydraulic interaction area between Quaternary and Tertiary aquifers (so called hydraulic windows). Below, we can distinguish the UFM as a system consisting of a series of more coarse-grained (sandy) geological bodies alternating with fine-grained successions. Fine-grained deposits can reach significant thicknesses and they separate the numerous coarse-grained geo-bodies one another. The coarse deposits are mainly composed of sands and in greater depths of approximately 300–350 m.a.s.l. the content of gravel bodies increases importantly. As shown in Fig. 9a, b, the Model Lithological Uniformity values are highest in many clayed and silty formations in the UFM, which correspond with the areas mentioned in the “Partial percentile lithological models (Di models)” section, when describing the D100 model. The MLU decreases to 0.2–0.4 in the Tertiary sandy-dominated aquifers. Almost 25% of the deposits in the UFM, most of them fine-grained, reach the highest MLU values and about the same percentage are below 0.54. Compared to the Quaternary sediments, a larger dispersion of the MLU values is observed in the UFM (IQR of 0.46 versus only 0.26 in the Quaternary). The increased thickness of these fine-grained homogeneous deposits increased the median of the MLU to 0.7 in the whole UFM.

Architectural model of the Quaternary and UFM deposits

As previously mentioned in the “Geological setting” section, both the wide variety of existing and potential future subsurface uses and the inherent non-constant geological properties of the fluvio-glacial and lacustrine deposits underlying the city of Munich will require a long-term management of the underground space. The challenge for a successful underground management is to identify geo-bodies having specific potentials and their connection or separation to others, like the connection between different aquifers and potential hydraulic interactions. For this purpose, it is critical to analyze the geometrical continuity of the prevailing lithologies shown in Fig. 9c. This leads to a subdivision of the modeling space into different geological bodies according to their geometrical associations to ultimately determine the reservoir architecture (Fig. 9e, f). Immediately prior to analyze the geometrical continuity of the lithological distribution of the model shown in Fig. 9c, d, a Moving Window Filtering (windows size of 5 cells in the three directions) was applied. The aim of this filtering is to exclude isolated cells, such as small isolated sand lenses, in the architectural model. The excluded cells were not considered to be relevant for the purposes of the architectural model due to their isolated nature and minimum extension.

The application of the Di models method and the subsequent interconnectivity analysis resulted in four extensive coarse-grained geological bodies separated from one another by fine-grained silty and clayed successions in the UFM. This subset of bodies was termed as T1 to T4, from shallower to greater depths, respectively. At the top, the T1 geo-body consists of a multistory and laterally extended sand body with a maximum depth of 138 m below ground level. Within the T1 body additional separations were made which exist only in some spatial parts. Hence, T1 was identified in tiers, named herein T1A to T1D, which have a variable extension and thickness but remain, as a whole, geometrically connected in the city area presenting different areas of preferential connectivity at various depths. As we can see in Fig. 9e, f, the gravelly sediments in the Quaternary are interrupted by a prominent erosive discordance. As a result of the erosion processes, some of the old deposits of UFM are only present locally and not throughout the whole modeling area, which can be seen, for instance, in Fig. 9f, where the T1A geo-body (highlighted in dark blue) in the northern half of the city is missing. Based on the reservoir architecture, differentiated erosion at the Quaternary aquifer base ultimately leads to a connection of this aquifer with different underlying aquifer parts of the UFM in multiple areas throughout the city (see Fig. 9f). The coarse-grained sandy body T1 (A-D) hosts the main aquifer systems exploited at different depths in the UFM in a complex structure. In contrast, the deeper aquifers T2-T4 of the UFM can be clearly separated and have no further hydraulic interlinkage to other aquifers. Therefore, reservoir architecture derived by the Di models method and its consequences in terms of aquifer interactions constitute a significant step forward in understanding and ultimately providing an effective groundwater management system. Both a schematic graph and a detailed description of the aquifer systems and tiers from the architectural model are shown in Table 1 and Fig. 14 of Appendix 2.

Table 1 Overview of the aquifer systems and tiers from the architectural model of the Quaternary and UFM deposits in the case study in the city of Munich. m.a.s.l.: meters above sea level, UFM: Upper Freshwater Molasse, P10: 10th percentile, P50: 50th percentile, P90: 90th percentile, M: arithmetic mean, SD: Standard deviation, min: minimum value, max: maximum value

The wide range of groundwater uses present in UFM aquifers was mentioned earlier in the “Geological setting” section. In order to properly manage these and future applications and characterize the aquifer systems in the UFM, it will be necessary to itemize the current exploitation status of the different aquifers. This can be done by means of analyzing the filter pipes length and depth for a number of nearly 4500 existing wells and groundwater stations installed in the Munich subsurface. A short extract of this process is illustrated in Fig. 9f. This itemization enables the selective grouping of groundwater measurements together for each separated aquifer system and support ultimately the generation of reasonable hydraulic potentiometric surface maps for each aquifer. Hence, individual hydraulic potential maps for the aquifer sections T1A-D and T2 were produced in the city of Munich based on the modeled underground architecture, as Fig. 9f shows. The potentiometric surface maps were generated using geostatistical interpolation methods (Universal Kriging) and relied on an elaborate groundwater measurement campaign performed in April 2018 within the framework of the GeoPot research project (Geopotentials of the Tertiary subsurface in the wider area of Munich, Germany) (Zosseder et al. 2019). The individual analysis for each aquifer provided by this approach also enables individually addressing a water chemistry characterization of each aquifer in subsequent steps. This was also accomplished in the aforementioned research project and showed significant variations for the different aquifer sections.

Example of application for scenario analysis: design process of a dewatering system

The set of Di models offers a flexible system for a scenario analysis in determining both coarse-grained and fine-grained geological deposits for various applications. In this example, the ultimate goal consists of determining the parts of the gravelly Quaternary aquifer that influence the design process of a dewatering system for an excavation in an urban area based on conservative assumptions. To this end, we considered the groundwater contour map of the Quaternary aquifer obtained from a detailed measurement campaign containing over 6000 groundwater wells performed in April 2014 (Zosseder et al. 2015; Böttcher et al. 2019; Albarrán-Ordás and Zosseder 2020).

In this case, the conservative scenario (in terms of the most demanding dewatering), was estimated by capturing the aquifer geometries holding a maximum expected amount of water-saturated coarse-grained sediments (gravels). According to the standards for soil description used in this case study (see the “Conceptual model and borehole database” section), gravel-dominated deposits were characterized by a minimum relative amount of 40% of gravels. Thus, water-saturated aquifer volumes holding 40% of gravels can be therefore considered as the most demanding dewatering scenario. These volumes can be assessed by estimating the groundwater which fills the gravelly sediments below the water table in the D60 model, which can be identified in Fig. 10a. By contrast, the aquifer parts holding at least 60% of gravels may represent an optimistic forecast regarding a better situation for dewatering shown in Fig. 10b. The latter scenario can be inferred by the water-saturated gravelly volumes in the D40 model. The results show that the conservative scenario for the most demanding dewatering design resulted in a substantial increment representing an 83% increase in the water-saturated areas along the cross section (represented in blue).

Discussion

The modeling approach presented herein aims at better capturing the lithological variability of detrital systems at a voxel scale. This is achieved by estimating relative proportions of mass fractions of the lithological classes, and thus reducing significantly the stationary assumption in each voxel. The current modeling methodologies tend to assume a stationary prevailing class (lithologies, lithofacies, hydrofacies, etc.) in each voxel in predicting the final image of geology (Stafleu et al. 2011; Kearsey et al. 2015; Hademenos et al. 2019). They are commonly applied in combination with stochastic simulations, thus providing probabilities of occurrences from multiple realizations which actually do not represent the relative quantification of the classes in the sediment mixture (Stafleu and Dubelaar 2016). The aforementioned assumption is even more notable in lithostratigraphic models, in which constant properties are assigned to each unit (Kearsey et al. 2015). In contrast, the proposed methodology predicts the lithological classes associated to cumulative frequencies of the whole clasts mixture in each voxel after defining a constant frequency or precision p (e.g., 10% in the case study). Herein, the stationarity space lies only within the lithology assigned to each cumulative frequency, so no quantification of lithological classes is possible below the defined precision. The introduction of the concept of Model Lithological Uniformity (MLU) expresses the degree of homogeneity presented in the clasts mixture in each voxel based on all cumulative frequencies. This concept arises from the adaptation of the concept of information entropy as a measure of the disorder state of a detrital system. The concept of information entropy, which has been applied successfully as an objective measure of uncertainty in geological 3-D models (Wellmann and Regenauer-Lieb 2012; Bianchi et al. 2015), is then used in a different context.

The quantitative aspect of this method benefits from high borehole densities and high levels of detail in the lithological descriptions. If no grain-size-based descriptions are available, then the method loses its added predictive value in terms of relative amounts of lithologies. The user-defined character of this method may result in an overly arduous process if the number of grain fractions being modeled and the precision are unrealistically high.

One common aspect of 3-D geomodels in detrital systems is obtaining a 3-D distribution of the most dominant or prevailing lithological classes. This representation, sometimes referred as a mean model (Pyrcz and Deutsch 2014), holds the categories most likely present in each voxel. Current methods address this by assigning the most probable distribution of classes from the stochastic simulation, and this constitutes the main modeling objective. The proposed method can directly derive a Most Uniform Lithological Model (MULM), with the distribution of classes owning the highest relative frequency in the clasts mixture. The MULM has been successfully implemented in the city of Munich. This is the basis for the architectural model representing the reservoir configuration in the Quaternary and UFM deposits and the subsequent itemization of the current exploitation status of the various aquifers (“Architectural model of the Quaternary and UFM deposits” section).

In order to discuss the predictive ability of the presented methodology, a cross validation procedure was implemented for the estimation of lithological classes in the case study. This technique was based on the elimination of some hard data — a number of boreholes and their known lithological descriptions — and the ulterior re-estimation of the lithological composition at those locations where the true lithology is known from the remaining borehole data (e.g., Isaaks and Srivastava 1989). The subsequent analysis of the discrepancies indicates the quality of prediction of the modeling process (Pyrcz and Deutsch 2014). Special attention was given to the impacts of misclassification and to the ability of the modeling approach to predict different degrees of lithological homogeneity given by the Model Lithological Uniformity.

In a first step, 5% of all hard data was randomly excluded. This amounted to a total number of 1000 boreholes covering 19,022 m of lithological descriptions. To this end, the excluded data comprised boreholes at different depths distributed throughout the whole model. The 3-D modeling process was run with the same model setup and the spatial continuity model of the indicator categorical variables of each lithological class as those described in the “Step 2 — Input data: lithological descriptions from borehole data” and “Step 3 — Spatial statistical inference” sections. The superimposition of trends was adapted so as to ignore the lithological descriptions of the excluded 1000 boreholes.

The quality of prediction was checked along the excluded boreholes to their final depths at intervals of 10 cm along the 1000 boreholes. This checking process was developed qualitatively by measuring the fuzzy closeness of the estimation to the truth lithological classes. The fuzzy closeness accounted for the fact that the impact of misclassifications was different in the sense that assigning “sand” instead of “gravel” is less consequential that assigning “silt/clay” instead of “gravel”. The closeness of the prediction to the true lithologies was calculated for each 10 cm interval. As a result, if the closeness to the true lithology was 0, the estimation was correct because the predicted lithology corresponded to the true value. However, if the closeness to the true values was 1, the contiguous lithology was predicted instead of the true one. Although the prediction was incorrect, the impact of misclassification was less consequential (e.g., true = sand, estimate = silt/clay). In the case that gravel was estimated instead of silt/clay, the closeness to the true lithology would be 2. In this case, the prediction is incorrect and far from the true class, so the impact of misclassification is high. This made it possible to account for the worst cases of prediction.

Figure 11 illustrates a bubble plot representing a closeness matrix of the estimated and true lithological classes in the collocated intervals for the partial percentile models and the Most Uniform Lithological Model presented in the “Partial percentile lithological models (Di models)” and “Model Lithological Uniformity (MLU) and Most Uniform Lithological Model (MULM)” sections. This plot enables calculation of the success rate of prediction. We observed that more than 83% of the estimated lithologies coincided with the true ones for all the partial percentile models, except for the D10 and D20 models, where the percentage of success reaches 67% and 69%, respectively (see closeness to true = 0 in Fig. 11). In the case of the Most Uniform Lithological Model (MULM), the lithological prediction was correct in 89% of the intervals. Of the remaining predictions, only 3% of the estimated classes (corresponding to 6178 intervals of 10 cm) showed higher impacts of misclassifications.

The measure of the fuzzy closeness also makes it possible to identify the misclassifications in the form of underestimations and overestimations. In this context, if the true lithological class was coarser than the estimated one (e.g., true = sand, estimate = clay/silt), the prediction underestimated the grain size of the true lithology. These cases are represented by a light gray area in Fig. 11. In contrast, an overestimation in the prediction occurred when the true lithology was finer than the estimated lithology (e.g., true = sand, estimate = gravel). This situation is represented by a dark gray area in Fig. 11. We can also observe that the incorrected predictions were equally represented in overestimations and underestimations for all models, except for the D10 and D20 models. In the latter cases, an overestimation was noticed mostly in terms of a prediction of sand instead of clay/silt, which will be discussed below.

Emphasis was placed on those locations of the cross validation showing drastically different lithological classes in the Most Uniform Lithological Model. Specifically, locations possessing a misclassification of two orders of magnitude (closeness to true = 2), specifically predictions of clay/silt instead of gravel, and vice versa, were further investigated due to their major impact on the model interpretation. These locations were contrasted with the Model Lithological Uniformity presented in the “Partial percentile lithological models (Di models)” section. As shown in Fig. 12, the distribution of the MLU for the worst-cases was more or less positively skewed having a median of 0.18. Almost 25% of the locations show Model Lithological Uniformities below 0.06 and only 25% of the locations is above 0.54. Not surprisingly, this highlights that the worst-cases in the prediction were predominantly associated with deposits having lower MLU values and the greater lithological variability thereby.

In view of the above issue, one might wonder whether the model is able to predict only deposits having high Model Lithological Uniformity and less complexity. For this reason, the distribution of the MLU values at locations where the estimation is correct was also analyzed in order to explore the ability of the model to predict the true lithologies in different degrees of sediment mixing. The results are represented in Fig. 12 and show a median of 0.44, indicating an almost centered distribution of the locations. This implies that the mass of the distribution of corrected predictions was not concentrated in areas neither of greater lithological variability nor with more uniform lithological composition. The 25% quantile, with 0.18, and the 75% quantile, with 0.70, also showed a wider range of uniformity values.

The predictive ability of the method was proven in the case study, showing a successful prediction of the MULM at 89% of the tested locations for a wide range of Model Lithological Uniformity values. Only 3% of the locations tested showed worst-case predictions and were associated with highly heterogeneous deposits having very low MLU values (mean: 0.18). In the case of the partial percentile models, a success rate of between 83 and 90% was ascertained, except for D10 and D20 (67–69%). Most of the latter discrepancies show up as misclassifications of sand instead of clay/silt (Fig. 11). This effect can be explained by a significant increase in the horizontal correlation ranges of sand identified in the spatial continuity models (Fig. 7c). The spatial variation of lithologies inferred from 3-D variogram analysis was assumed for the whole modeling domain. More accurate 3-D variogram models from large-scale subdivisions of the model volume showing similar depositional conditions may resolve these issues.

Regarding model usability, the classical, straightforward attribution of numerical values from the literature or field data on lithology or sediment mixtures (see Fig. 4a1) commonly oversimplifies the property distributions in 3-D space. This method provides synthetic grain size curves at a voxel scale, which can then lead to (i) direct derivation of empiric parameters dependent on the grain size variations at each voxel, or (ii) 3-D modeling of the property distributions of such parameters based on the information available on existing boreholes. Whichever alternative is chosen, this method enables a more accurate 3-D modeling of property distributions in space than those obtained using the straightforward attribution (see example in Fig. 4a2).

This methodology makes use of stochastic simulations for generating the most probable classes for each partial percentile model related to a different cumulative frequency, where the stationarity is assumed. Additionally, this allows the integration of imprecisions associated with the inherent generalizations for soil description. The advantage thereby is that the uncertainty contributions present in the lithological descriptions are propagated along the modeling process, which is commonly more or less ignored in all of the current modeling strategies. The development of a comprehensive measure of uncertainty based on the perspective of this method in detrital mixtures will require further study.

Conclusions

The need to manage the subsurface space especially in detrital depositional environments requires approaches for improving the capabilities of geological 3-D modeling and, among other aspects, in preserving geological realism, integrating uncertainties in the modeling process to avoid underestimations in the reliability of the model and aspiring to provide a multi-purpose, user-oriented framework.

The Di models method proposes substantial contributions to these aspects based on the cornerstone that the lithological heterogeneities in detrital systems, understood as being mixtures of clasts of a wide range of particle sizes, can be geomodeled in 3-D space. These contributions have been proven in a case study in the city of Munich (Germany). Following this premise, a multi-voxel modeling solution is presented which uses massive lithological descriptions of borehole data and is able to predict, at a grain-size scale and with a user-define implementation, the relative amounts of each grain fraction and, therefore, the full lithological composition on a cell-by-cell basis. In conducting this approach, uncertainties due to imprecise soil description associated to the inherent generalizations are integrated by interpreting different equally valid combinations of the soil components and are ultimately propagated along the modeling process through the stochastic simulations. The further development of the Di models method includes the introduction of the concepts of Model Lithological Uniformity (MLU) and the Most Uniform Lithological Model (MULM). The predictive quality is evaluated in a cross validation process showing a success rate of prediction in the Most Uniform Model of 89% of cases considering 1000 boreholes which amount 19 km of lithological descriptions. The combined analysis of quality of prediction and Model Lithological Uniformity highlights the ability of the method to capture the complexity of geological reality for a wide range of MLU values and, therefore, a wide level of sediment mixing.

The multi-voxel character of this approach provides the relative amounts of each grain fraction in all voxels and encourages its flexibility of use for further applications. On the one hand, it enables an understanding of the small-scale effects present on a cell-by-cell basis derived from the sediment mixing by, e.g., attributing numerical values from the literature or field data to sediment mixtures, thus deriving empiric parameters dependent on the grain-size variations (effective porosity, hydraulic conductivity, share-wave velocity) and modeling property distributions more accurately. On the other hand, macrocellular approaches based on geometrical or cell-by-cell parametrical linkages between cells enable an understanding of complex reservoir architectures, e.g., multistory, laterally extended sand bodies like those presented in the UFM. In the case study in the city of Munich, this process ultimately led to an itemization of approximately 4500 filter pipes classified according to eight aquifers in which they are installed, and the subsequent individual hydrogeological characterization of the groundwater systems by delineating its hydraulic characteristics and their water chemistry.

Equally, this approach has been shown to be very useful when the identification of different relative amounts of grain fractions is needed. The quantification of cumulative frequencies of lithologies can be used in the scenario analysis associated with conservative or optimistic assumptions supporting assessments for specific applications, e.g., groundwater management or civil engineering issues (“Example of application for scenario analysis: design process of a dewatering system” section). This was shown in an example of application for determining the water-saturated parts of the Quaternary aquifer that might influence the design process of a dewatering system of an excavation on conservative assumptions in the city of Munich. This flexibility has great potential for many other applications requiring a high degree of lithological variability in each voxel. Moreover, the variability captured by this method may lead to a more closely definition or modeling of continuous properties in 3-D, as well as for deriving grain-size-based dependent properties or for finer model parameterization.