Introduction

The discontinuous, inhomogeneous and anisotropic nature of a rock mass is mainly attributed to its internal discontinuities. In a heterogeneous rock mass subjected to multi-stage, multi-directional stress fields, discontinuities usually form with random and irregular distributions. As a result, geological discontinuities are generally studied within the framework of statistics and probability theory (e.g., Watson and Irving 1957; Kiraly 1969; Oda 1980; Einstein and Baecher 1983; Baecher 1983), with the aim of characterizing their orientation, size, spacing, density, shape and spatial distribution (e.g., Snow 1970; Watkins 1971; Shanley and Mahtab 1976; Cruden 1977; Pahl 1981; Priest and Hudson 1981; Oda 1982; Mahtab and Yegulap 1984). Based on the parameters measured on the outcrop, a detailed statistical analysis is conducted, and a geometric model is then generated using Monte-Carlo simulation. Mechanical and hydraulic research is subsequently carried out based on this model (Müller et al. 2010). Systematic investigations of three-dimensional discontinuity networks have been conducted (e.g., Robinson 1983; Kulatilake et al. 1984, 1985, 1990; Dershowitz et al. 1988; Xu and Peter 2010), and the techniques developed have been successfully extended to the field of rock mechanics (e.g., Einstein et al. 1983; Sen and Kazi 1984; Kulatilake et al. 1993).

Considering the structure of the fractured rock mass, Lei and Yuan (1989) proposed a two-dimensional stochastic model using the joint midpoint equal probability method, which forms the basis for subsequent rock mechanics analysis. Xu (1992) established a comprehensive numerical model of a jointed rock mass and calculated rock blocks, the fracture network and fluid flow. Using a three-dimensional network model, Zhu (1992) investigated the constitutive relationship of a network system with non-transecting joints and further analysed the factors influencing the elastic deformation of the rock mass. Chen et al. (1995, 2001) carried out detailed studies of the stochastic discontinuity network model after bias correction of parameters such as orientation, trace length and spacing. Other promising applications of the network model were investigated by Jia et al. (2002), including the evaluation of rock mass quality, isolated body search, evaluation of joint network connectivity, estimation of rock mass strength and permeability, and joint failure judgment. Wang et al. (2004) compared cross-sectional views of the stochastic simulation model with the measured outcrop in an effort to verify the validity of the simulation. However, in practical applications the quantitative descriptions of discontinuity features are mainly obtained by traditional manual measurement, so the sample size and accuracy available for the parameter study are quite limited. Significant bias cannot be avoided if too much subjectivity is involved in the statistical process (Li et al. 2011), and the reliability of the model cannot always meet practical needs.

Statistical samples obtained with the global positioning system (GPS) and geographical information system (GIS) are significantly better, in both quantity and accuracy, than those obtained with traditional methods (e.g., scan line or window sampling). In theory, the whole population of discontinuities in a region can be sampled by these digital techniques (Huang et al. 2014; Kissinger et al. 2013). The advantage is that the bias of subjectively selecting outcrops can be minimized by a complete sampling strategy, which makes it possible to conduct a detailed discontinuity investigation over a large region (Koike et al. 2011; Tenzer et al. 2010). In view of this, our group has embarked on an integrated research program whose main goal is to build a suitable stochastic discontinuity network model for Beishan (Gansu province), one of the main candidate sites for the Chinese HLW (high-level radioactive waste) repository.

Acquisition and pretreatment of data

Description of the study site

The study site is located in the Jijicao block in Beishan (Gansu province), one of the main candidate sites for the Chinese HLW repository. Rock is well exposed on outcrops in this 12 km² region. The main lithology of the area is biotite monzonitic granite. Tectonic deformation in the area is characterized by brittle fractures and joints, together with less developed ductile shear deformation. Since the Indosinian–Yanshanian period, the tectonic stress field has been dominated by left-lateral strike slip, so that low-angle fractures have developed continuously within the area. The EW-trending boundary faults were thus cut and offset by the NE-trending fault. Long fractures and fracture zones are mainly characterized by steep dip angles. The locations of the discontinuities and their relationship with the faults are shown in Fig. 1. The attribute properties of the measurement areas (A ~ N, P ~ S in Fig. 1) are shown in Table 1, and the statistical results of the geometrical properties of the discontinuities are shown in Table 2.

Fig. 1

Discontinuity measurement locations and their relationship with faults in the Jijicao block (A ~ N, P ~ S are measurement areas; F1 ~ F7 are major faults; the surface-discernible lengths of F1 ~ F6 exceed 1 km, while F7 is shorter than 1 km)

Table 1 Attribute properties of measurement areas
Table 2 Statistics and distribution functions of discontinuities geometrical properties

Methods

To digitize the geometrical properties of the discontinuities (trace length, persistence, spacing, orientation, aperture and roughness) quickly and efficiently, high-precision GPS-RTK (Real Time Kinematic) is adopted for accurate positioning and data collection (Fig. 2). This system positions objects using the real-time kinematic carrier-phase differential method, which improves the precision of the satellite positioning data to the centimeter level. It is used together with a compass (detailed measurement), digital close-range photogrammetry (auxiliary measurement and orientation checking) and a digital camera (micro-focus photography). More than ten thousand discontinuities distributed in 18 areas (Fig. 1) were first numbered, measured and photographed with a steel ruler placed close to each discontinuity; the discontinuity parameters were then obtained by post-processing the measured data and photos. More specifically, orientation data are obtained from compass measurements and digital close-range photogrammetry, while trace length, persistence and spacing are obtained from the measured discontinuity coordinates. Aperture and roughness data are extracted from the micro-focus photos by computer programs. The processing procedure is illustrated in Fig. 3.
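As a minimal sketch of how a trace length might be derived from the surveyed coordinates, the following Python snippet sums the straight-line segments between consecutive GPS-RTK points along one trace. The function name `trace_length`, the (easting, northing, elevation) tuple layout and the sample coordinates are illustrative assumptions, not the authors' processing program.

```python
import math

def trace_length(points):
    """Sum the straight-line segment lengths along one surveyed trace.

    points: list of (easting, northing, elevation) tuples in metres,
    ordered along the trace (a hypothetical input layout).
    """
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

# Illustrative coordinates only, not measured data.
trace = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (3.0, 4.0, 12.0)]
length_m = trace_length(trace)
print(length_m)
```

A trace surveyed with only two end points reduces to a single chord; more intermediate points follow a curved trace more closely.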

Fig. 2

Measurement of discontinuities at the study site by GPS-RTK. a Base station. b Mobile station

Fig. 3

Digitalization processing for roughness and aperture of discontinuities. a Micro-focus photos are taken with a steel ruler placed next to the discontinuity. b Digitalization processing procedure: (1) threshold segmentation, (2) denoising, (3) morphological algorithm, (4) extraction of the skeleton line of the discontinuity

Sampling bias correction

The frequency at which discontinuities intersect the outcrop changes with their relative spatial position, so the measured frequency differs from the actual frequency and a sampling bias arises. In addition, transect (intersection) and length sampling biases are usually introduced into trace length measurements. Specifically, the length sampling bias arises because longer traces are more likely to be sampled than shorter ones. Sampling bias corrections should therefore be applied to the measured data before the statistical analysis.

The outcrop frequency of discontinuities that are nearly parallel to the outcrop clearly differs from their actual frequency within the rock mass. The difference is controlled by the angle between the discontinuities and the outcrop: the larger the angle, the closer the exposure frequency is to the true frequency. The orientation bias is corrected by determining a correction factor, or weight coefficient, formulated from the angle between the discontinuities and the outcrop. The number of discontinuities whose outcrop frequency is smaller than the actual frequency is then adjusted by this correction factor (Zhou 2013).

The classical Terzaghi correction method (Terzaghi 1965) is applied in this paper, with the correction factor (Priest 1993) given by the following expressions:

$${\text{TCF}} = \frac{1}{\cos \delta }$$
(1)
$$\cos \delta = \left| {\cos (\alpha_{n} - \alpha_{s} )\cos \beta_{n} \cos \beta_{s} + \sin \beta_{n} \sin \beta_{s} } \right|$$
(2)

where TCF is the correction coefficient used in scan-line sampling, δ is the acute angle between the discontinuity normal and the sampling line, \(\alpha_{n}\) and \(\beta_{n}\) are the trend and plunge of the discontinuity normal vector, and \(\alpha_{s}\) and \(\beta_{s}\) are the trend and plunge of the sampling line. The spatial relationship between the discontinuity, the horizontal plane and the rock face is shown in Fig. 4.
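Equations (1) and (2) can be sketched in a few lines of Python. The function name `terzaghi_factor` and the optional `max_factor` cap are assumptions added for illustration; the cap is not part of the method described here and only guards against the factor growing without bound as cos δ approaches zero.

```python
import math

def terzaghi_factor(trend_n, plunge_n, trend_s, plunge_s, max_factor=None):
    """Terzaghi correction factor 1/cos(delta) from Eqs. (1) and (2).

    Angles are in degrees: trend/plunge of the discontinuity normal (n)
    and of the sampling line (s). max_factor is a hypothetical optional cap.
    """
    a_n, b_n, a_s, b_s = map(math.radians, (trend_n, plunge_n, trend_s, plunge_s))
    cos_delta = abs(math.cos(a_n - a_s) * math.cos(b_n) * math.cos(b_s)
                    + math.sin(b_n) * math.sin(b_s))
    # A normal perpendicular to the sampling line is never intersected.
    tcf = math.inf if cos_delta < 1e-12 else 1.0 / cos_delta
    return tcf if max_factor is None else min(tcf, max_factor)

# Illustrative orientations only: normal at 60 degrees to a horizontal scan line.
print(terzaghi_factor(0.0, 0.0, 60.0, 0.0))
```

Each measured discontinuity would then be counted with its own weight when orientation frequencies are compiled.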

Fig. 4

Diagram illustrating the spatial relationship between discontinuity, horizontal plane and rock face

The effect of the correction is illustrated in Fig. 5 using data obtained in measurement area E. The corrected results show an obvious frequency difference between joints inside the rock mass and those exposed on outcrops when the joints are sub-parallel to the outcrop. A considerable bias may be introduced if the correction is neglected. The proposed criterion is that the orientation frequency correction should be conducted when the dip angle is relatively small, usually less than 20°.

Fig. 5

Comparison of discontinuity orientations before and after sampling bias correction (red: before correction; blue: after correction)

Trace length bias correction requires the orientation distribution function of each set. These functions are not yet available at this stage, so trace length bias corrections are not conducted here.

Digitalization processing of data

Partition of homogeneity

The spatial distribution of discontinuities in a rock mass is always inhomogeneous; that is, the hydrologic, geologic and mechanical properties of the rock mass vary from one unit to another. Before simulation, it is therefore necessary to delineate statistically homogeneous zones (referred to here as homogeneities) with similar discontinuity structure and parameters, so that the subsequent statistics and modeling, as well as the mechanical and hydraulic research, rest on a reasonable partition scheme. Miller (1983) successfully divided the rock mass structure into different homogeneities based on orientation parameters. The modified Miller method adopted by Kulatilake et al. (1996) also achieved good results. After comparing different methods, the authors (Guo et al. 2013; Li et al. 2014) attempted to obtain the homogeneities of the Jijicao block using the more appropriate Miller method (34-patch large area method). Two homogeneities were obtained after excluding individual interfering data: areas C, D, E, R, S, P, Q and N (CDERSPQN) belong to partition I, and areas G and H (GH) belong to partition II. Areas G and H lie in the north around F1, whereas CDERSPQN lie across F3, between F2 and F4 (Fig. 1). The Jijicao block mainly consists of biotite monzonitic granite. The rock mass has undergone two tectonic movements since the last episode of magma intrusion. Seven faults with lengths on the order of kilometers were identified in the block, which is mainly controlled by the NE-trending faults. These NE-trending faults show tensile and tensile-shear properties and are accompanied by EW-trending compressive and compressive-shear fractures. All of the fractures mentioned above formed at roughly the same time (late Hercynian).
Based on these structural characteristics and the regional tectonic history, a preliminary conclusion can be drawn that the rock mass as a whole shows a relatively high degree of homogeneity, although specific local areas may be influenced by faults and therefore show a lower degree of homogeneity. For example, the longest left-lateral strike-slip fault (F3) has the greatest impact on the overall homogeneity of the block, and it is difficult to clearly identify the homogeneity of the rock mass near this fault. The theoretical results confirm this overall empirical judgment quite well: CDERSPQN belong to partition I, with the strongest homogeneity; GH belong to partition II, with less obvious homogeneity; and fault F3 strongly affects areas F and I, leading to a significant difference between them.

Identification of dominant discontinuity sets

The fuzzy clustering method (Hammah and Curran 1998) is used to identify the dominant discontinuity sets in homogeneities I and II. This method achieves the optimal division of discontinuity sets by minimizing an objective function. First, the objective function is defined according to the characteristics of the data; it usually consists of the sum of a certain distance between orientation data, where the distance is represented by the sine of the angle between the normal vectors of two discontinuities. Next, the fuzzy variables representing the degree of membership are determined. Finally, the set to which each datum belongs is determined from the values of these fuzzy variables. Based on this algorithm, the authors wrote a fuzzy clustering program to process the measured orientation data. Furthermore, the original algorithm was improved to better handle orientation data with similar degrees of membership in different sets.
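The membership step can be sketched as follows, assuming the standard fuzzy c-means membership update with the squared sine of the angle between normals as the distance. The set centers are taken as given here (the center update is omitted), and the function names, the fuzzifier `m = 2` and the sample orientations are illustrative assumptions rather than the authors' program.

```python
import math

def pole(dip_direction, dip):
    """Upward unit normal of a plane (x east, y north, z up); angles in degrees."""
    a, b = math.radians(dip_direction), math.radians(dip)
    return (math.sin(b) * math.sin(a), math.sin(b) * math.cos(a), math.cos(b))

def sine_distance_sq(n1, n2):
    """Squared sine of the angle between two unit normals."""
    dot = sum(p * q for p, q in zip(n1, n2))
    return max(0.0, 1.0 - dot * dot)

def memberships(normal, centers, m=2.0):
    """Fuzzy c-means membership of one normal in each set (values sum to 1)."""
    d2 = [sine_distance_sq(normal, c) for c in centers]
    zeros = [d == 0.0 for d in d2]
    if any(zeros):
        # The datum coincides with one or more centers: share membership equally.
        return [1.0 / sum(zeros) if z else 0.0 for z in zeros]
    inv = [d ** (-1.0 / (m - 1.0)) for d in d2]
    total = sum(inv)
    return [v / total for v in inv]

# Hypothetical set centers and one datum (dip direction, dip in degrees).
centers = [pole(90.0, 80.0), pole(0.0, 30.0)]
u = memberships(pole(95.0, 78.0), centers)
print(u)
```

In a full clustering run this step would alternate with a re-estimation of the set centers until the objective function stops decreasing.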

A few “random points” often remain after the discontinuity sets have been identified. Their membership is difficult to determine uniquely because the probabilities that such a point belongs to different sets are nearly identical. If the sets are divided merely on the basis of the fuzzy variable values, it is difficult to distinguish the boundary of each set, and the accuracy of the calculated center vector of each set is easily degraded. Therefore, based on the measured orientation data of Beishan, the “random points” were screened out by the improved fuzzy clustering algorithm and analysed statistically on their own, and the dominant discontinuity sets in homogeneities I and II were then identified using the fuzzy clustering method (Zhou et al. 2012). The results are shown in Fig. 6, and the center orientations of the dominant sets are listed in Table 3.
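One possible way to screen such points, shown only as a sketch, is to flag any datum whose two largest memberships differ by less than a chosen margin. The function name `split_ambiguous` and the margin of 0.1 are hypothetical; the improved algorithm used in this study is not specified in enough detail to reproduce here.

```python
def split_ambiguous(membership_rows, margin=0.1):
    """Separate clearly assigned data from ambiguous "random points".

    membership_rows: one list of set memberships per datum.
    Returns (assigned, ambiguous): a dict {datum index: set index}
    and a list of ambiguous datum indices.
    """
    assigned, ambiguous = {}, []
    for idx, row in enumerate(membership_rows):
        ranked = sorted(range(len(row)), key=lambda k: row[k], reverse=True)
        if len(ranked) > 1 and row[ranked[0]] - row[ranked[1]] < margin:
            ambiguous.append(idx)        # nearly equal memberships
        else:
            assigned[idx] = ranked[0]    # clear winner
    return assigned, ambiguous

# Illustrative membership values only.
rows = [[0.90, 0.10], [0.52, 0.48], [0.20, 0.80]]
assigned, ambiguous = split_ambiguous(rows)
print(assigned, ambiguous)
```

Set centers would then be recomputed from the clearly assigned data only, so that the ambiguous points do not pull the center vectors toward the set boundaries.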

Fig. 6

Results of dominant discontinuity sets in Beishan. a Homogeneity I containing areas of CDERSPQN. b Homogeneity II containing areas of GH

Table 3 Statistics parameters of dominant discontinuity sets

In Fig. 6, discontinuities with approximately equal dip angles and opposite dip directions are assigned to the same joint set. The reason is that orientation is the major index when delineating joint sets, and the sine of the angle between two joint normal vectors is adopted as the measure of distance between two orientation data. Joint orientations with these features (approximately equal dip angles and opposite dip directions) have small sine values, which means that the distance between the two joint orientations is relatively small, so they are placed in the same set.
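A short worked example illustrates this, assuming the convention that a plane with dip direction α and dip β has the upward unit normal \(\mathbf{n} = (\sin \beta \sin \alpha ,\,\sin \beta \cos \alpha ,\,\cos \beta )\). For two joints with the same dip β and dip directions of 90° and 270° (illustrative values, not measured data), the sine distance d is:

$$\mathbf{n}_{1} \cdot \mathbf{n}_{2} = \cos^{2} \beta - \sin^{2} \beta = \cos 2\beta ,\quad d = \sqrt {1 - \left( {\mathbf{n}_{1} \cdot \mathbf{n}_{2} } \right)^{2} } = \left| {\sin 2\beta } \right|$$

For β = 85°, d = sin 170° ≈ 0.17, so the two joints are close in this metric, whereas for β = 30°, d = sin 60° ≈ 0.87. Under these assumptions, the grouping of opposite dip directions into one set therefore applies to steeply dipping joints.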

Characteristic parameters statistics

The measured coordinate data of the discontinuities were imported into ArcGIS to reconstruct the actual distributions of the discontinuities (e.g., areas P and I in Fig. 7). Discontinuities located on the same slope side were selected to fit the best outcrop plane. After the discontinuities were projected onto the fitted plane, the probability distributions of parameters (e.g., trace length) in the dominant sets were obtained by statistical analysis (the processing procedure is shown in Fig. 8).
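The plane fitting and projection steps can be sketched as below, assuming an ordinary least-squares fit of z = ax + by + c. This form is only suitable when the outcrop face is not close to vertical; a total least-squares fit would be the more robust choice for steep faces. The function names and sample points are hypothetical, and the fitting method actually used in ArcGIS is not stated here.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def fit_plane(points):
    """Least-squares fit of z = a*x + b*y + c; returns (a, b, c)."""
    sx = sum(p[0] for p in points)
    sy = sum(p[1] for p in points)
    sz = sum(p[2] for p in points)
    sxx = sum(p[0] * p[0] for p in points)
    syy = sum(p[1] * p[1] for p in points)
    sxy = sum(p[0] * p[1] for p in points)
    sxz = sum(p[0] * p[2] for p in points)
    syz = sum(p[1] * p[2] for p in points)
    lhs = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]]
    rhs = [sxz, syz, sz]
    d = det3(lhs)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (e.g. collinear)")
    coeffs = []
    for col in range(3):                      # Cramer's rule, column by column
        m = [row[:] for row in lhs]
        for r in range(3):
            m[r][col] = rhs[r]
        coeffs.append(det3(m) / d)
    return tuple(coeffs)

def project_to_plane(point, plane):
    """Orthogonal projection of a point onto the plane z = a*x + b*y + c."""
    a, b, c = plane
    x, y, z = point
    t = (a * x + b * y - z + c) / (a * a + b * b + 1.0)   # offset along normal (a, b, -1)
    return (x - t * a, y - t * b, z + t)

# Illustrative points lying on z = 0.5x - 0.2y + 3.
pts = [(0, 0, 3.0), (1, 0, 3.5), (0, 1, 2.8), (1, 1, 3.3), (2, 1, 3.8)]
plane = fit_plane(pts)
projected = project_to_plane((1.0, 1.0, 10.0), plane)
print(plane, projected)
```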

Fig. 7

Trace maps of discontinuities in Jijicao block, Beishan. a Area P. b Area I

Fig. 8

Discontinuities digitalization processing procedure. a Select discontinuities on the same slope side of outcrop. b Best outcrop plane fitting. c Discontinuities projection. d Parameters statistics

Based on the Baecher circular disk hypothesis, the sampling bias correction was carried out, and the statistical distributions of spacing, density and trace length were then obtained. The corresponding unknown parameters, such as the disk diameter distribution, were derived from the trace length probability distribution of the discontinuities using the following classical equation (Warburton 1980):

$$h_{A} \left( L \right) = \frac{L}{m}\int\limits_{L}^{\infty } {\frac{g\left( D \right)dD}{{\sqrt {D^{2} - L^{2} } }}}$$
(3)

The mathematical expectation m can be obtained using the following equation:

$$m = \int\limits_{0}^{\infty } {Dg\left( D \right)dD}$$
(4)

where D is the diameter of the discontinuities, L is the trace length of the discontinuities, m is the mathematical expectation of the diameter, g(D) is the probability density function of the diameters, and \(h_{A}(L)\) is the probability density function of the trace length.
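The forward relation in Eq. (3) can be checked numerically. The sketch below assumes that the integral runs over all diameters D ≥ L and that g(D) is lognormal with hypothetical parameters (not values from Table 4). The substitution D = L cosh t removes the square-root singularity at D = L, since dD/√(D² − L²) = dt.

```python
import math

def lognormal_pdf(d, mu, sigma):
    """Lognormal probability density g(D)."""
    if d <= 0.0:
        return 0.0
    z = (math.log(d) - mu) / sigma
    return math.exp(-0.5 * z * z) / (d * sigma * math.sqrt(2.0 * math.pi))

def trace_length_pdf(length, g, mean_d, t_max=12.0, steps=600):
    """h_A(L) = (L/m) * integral_0^inf g(L cosh t) dt, by the trapezoidal rule."""
    if length <= 0.0:
        return 0.0
    dt = t_max / steps
    total = 0.5 * (g(length) + g(length * math.cosh(t_max)))
    for k in range(1, steps):
        total += g(length * math.cosh(k * dt))
    return length / mean_d * total * dt

# Illustrative lognormal parameters (hypothetical).
MU, SIGMA = 0.0, 0.5
MEAN_D = math.exp(MU + 0.5 * SIGMA ** 2)   # Eq. (4) in closed form for a lognormal g(D)

def g(d):
    return lognormal_pdf(d, MU, SIGMA)

# Sanity check: a valid PDF should integrate to about 1 (midpoint rule over L).
D_L = 0.025
area = sum(trace_length_pdf((k + 0.5) * D_L, g, MEAN_D) * D_L for k in range(600))
print(round(area, 3))
```

Recovering g(D) from a measured trace length distribution is the inverse problem; one common approach is to assume a parametric form for g(D) and adjust its parameters until the computed trace length distribution matches the measured one.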

The mean volumetric frequency (mean 3D density) was derived from the mean areal frequency (mean 2D density) of the discontinuities using the following equation (Kulatilake and Wu 1984):

$$E\left( {\lambda_{v} } \right) = \frac{{E\left( {\lambda_{a} } \right)}}{{E\left( D \right)E\left| {\sin \upsilon } \right|}}$$
(5)

where D is the diameter of the discontinuities, E(D) is the mean diameter of the discontinuities, \(E(\lambda_{v})\) is the mean volumetric frequency of the discontinuities, \(E(\lambda_{a})\) is the mean areal frequency of the discontinuities, and \(E\left| {\sin \upsilon } \right|\) is the sine term for the angle υ between the mean orientation of the discontinuities of this set and the orientation of the sampling surface.
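Equation (5) is a single division, sketched below. The function name and the input values are illustrative assumptions, not results from this study.

```python
def mean_volumetric_frequency(mean_areal_frequency, mean_diameter, mean_abs_sin):
    """E(lambda_v) = E(lambda_a) / (E(D) * E|sin v|), following Eq. (5)."""
    if mean_diameter <= 0.0 or mean_abs_sin <= 0.0:
        raise ValueError("mean diameter and E|sin v| must be positive")
    return mean_areal_frequency / (mean_diameter * mean_abs_sin)

# Hypothetical inputs: 0.6 traces per m^2, 2.0 m mean diameter, E|sin v| = 0.75.
lambda_v = mean_volumetric_frequency(0.6, 2.0, 0.75)
print(lambda_v)   # discontinuities per m^3
```

The result grows as the sine term shrinks, so sets that are nearly parallel to the sampling surface are the most sensitive to errors in the measured orientation.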

The calculated disk diameters and volumetric frequencies are shown in Table 4 (taking homogeneity I as an example). The probability density function (PDF) of the measured trace length is inconsistent with that of the theoretical diameter, which implies that trace length and diameter may not follow the same probability distribution. It therefore remains unclear whether the trace length PDF could be used in the discrete fracture network (DFN) simulation in place of the true diameter PDF.

Table 4 Statistics parameters of discontinuities in homogeneity I

Establishment of model

Statistical values of the spatial frequency are first obtained as input parameters. A pseudo-random number is then generated from a Poisson distribution and taken as the number of discontinuities within a given volume of the model. The next step is to simulate the center position, diameter and orientation of each disk. The center coordinates of each disk in the model volume are generated using the same Poisson random number generation technique, and the disk diameter and orientation are generated from their respective probability distribution functions by the Monte-Carlo method. In this way the size and orientation of every disk are determined. In theory, however, countless models can be generated by this method. To give the results more statistical meaning, each independent parameter was generated five times, and a total of 125 (5 × 5 × 5) random models were obtained by permutation and combination. Homogeneity I is taken as an example. The discontinuity network is generated in a three-dimensional space of 36.5 m × 25 m × 30 m. Sampling windows are used to calculate the average dip direction, average dip angle, corrected trace length, spherical deviation, etc. After a preliminary comparison among the different realizations, a relatively satisfactory model was chosen and is shown in Fig. 9 (the data comparison process is described in detail below). The basic parameters of the model are listed in Table 5. This model is built from in situ measurement data; to show the exact position of the measured discontinuities clearly, a plane with a specific slope representing the true outcrop, together with the intersecting traces, is also plotted in the corresponding position in the model.
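A single realization of this procedure can be sketched as follows. The sketch assumes uniformly distributed disk centers (as implied by a Poisson process), lognormal diameters and a Fisher distribution for the normals. The Fisher model, the function names and all parameter values except the 36.5 m × 25 m × 30 m model size are assumptions made for illustration and are not taken from Table 5.

```python
import math
import random

def poisson_count(lam, rng):
    """Draw N ~ Poisson(lam) by counting unit-rate exponential arrivals before lam."""
    count, t = 0, rng.expovariate(1.0)
    while t < lam:
        count += 1
        t += rng.expovariate(1.0)
    return count

def fisher_normal(mean_normal, kappa, rng):
    """Sample a unit normal scattered about the unit vector mean_normal (Fisher model)."""
    u = rng.random()
    cos_t = 1.0 + math.log(1.0 - u * (1.0 - math.exp(-2.0 * kappa))) / kappa
    sin_t = math.sqrt(max(0.0, 1.0 - cos_t * cos_t))
    phi = 2.0 * math.pi * rng.random()
    wx, wy, wz = mean_normal
    # Helper axis not parallel to the mean normal, used to build e1 and e2.
    ax, ay, az = (1.0, 0.0, 0.0) if abs(wx) < 0.9 else (0.0, 1.0, 0.0)
    e1 = (ay * wz - az * wy, az * wx - ax * wz, ax * wy - ay * wx)
    norm = math.sqrt(sum(c * c for c in e1))
    e1 = tuple(c / norm for c in e1)
    e2 = (wy * e1[2] - wz * e1[1], wz * e1[0] - wx * e1[2], wx * e1[1] - wy * e1[0])
    return tuple(sin_t * math.cos(phi) * e1[i] + sin_t * math.sin(phi) * e2[i]
                 + cos_t * mean_normal[i] for i in range(3))

def generate_disks(size, density, mean_normal, kappa, mu_ln, sigma_ln, seed=0):
    """Return a list of (center, diameter, normal) tuples for one set."""
    rng = random.Random(seed)
    n = poisson_count(density * size[0] * size[1] * size[2], rng)
    disks = []
    for _ in range(n):
        center = tuple(rng.uniform(0.0, s) for s in size)
        diameter = rng.lognormvariate(mu_ln, sigma_ln)
        disks.append((center, diameter, fisher_normal(mean_normal, kappa, rng)))
    return disks

# Hypothetical inputs: 0.01 disks per m^3, mean normal pointing up, kappa = 20.
disks = generate_disks((36.5, 25.0, 30.0), 0.01, (0.0, 0.0, 1.0), 20.0, 0.5, 0.4, seed=1)
print(len(disks))
```

Changing `seed` gives a different realization, which is one way the repeated generation of each parameter described above could be organised.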

Fig. 9

Results of stochastic simulations for homogeneity I. a Stochastic model including two dominant sets: sets 1 and 2 are marked in red and green, respectively. b A plane simulating the real outcrop, on which the intersecting traces are plotted in the corresponding positions

Table 5 Parameters of stochastic model in homogeneity I

Validity

Figure test

Validity tests of the stochastic model mainly consist of a figure test and a numerical test. The figure test visually compares the traces formed where the stochastic model intersects a sampling window (or virtual outcrop surface) with those measured in the in situ sampling window, so that the reliability of the model can be estimated by visual observation. Figure 10 shows a test example for the aforementioned model: (a) shows the traces of real discontinuities within the sampling window in the simulated domain; (b) shows the traces of the stochastic model cut by the sampling window. To allow a direct comparison of the results, the sampling window used in the simulation is the same as the observation window on the real outcrop slope used for the in situ discontinuity measurements (shown in Fig. 9; specific parameters are given in Table 5).
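The simulated traces come from intersecting each disk with the sampling plane. A minimal geometric sketch is given below: it returns the chord length that a circular disk cuts on an unbounded plane and does not clip the chord to a finite sampling window. The function name and the numerical example are hypothetical.

```python
import math

def disk_trace_length(center, normal, diameter, plane_point, plane_normal):
    """Chord length cut by a disk on an unbounded plane (0.0 if no intersection).

    normal and plane_normal must be unit vectors.
    """
    dot_nq = sum(a * b for a, b in zip(normal, plane_normal))
    sin_angle = math.sqrt(max(0.0, 1.0 - dot_nq * dot_nq))
    if sin_angle < 1e-12:
        return 0.0                      # disk parallel to the plane: no trace line
    # Perpendicular distance from the disk center to the sampling plane.
    offset = sum((c - p) * q for c, p, q in zip(center, plane_point, plane_normal))
    # The same distance measured within the plane of the disk.
    in_disk_dist = abs(offset) / sin_angle
    radius = 0.5 * diameter
    if in_disk_dist >= radius:
        return 0.0
    return 2.0 * math.sqrt(radius * radius - in_disk_dist * in_disk_dist)

# Vertical disk (normal along x), 2 m diameter, centered 0.3 m above the plane z = 0.
chord = disk_trace_length((0.0, 0.0, 0.3), (1.0, 0.0, 0.0), 2.0,
                          (0.0, 0.0, 0.0), (0.0, 0.0, 1.0))
print(chord)
```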

Fig. 10

Comparison between measured traces and numerical model. a In-situ measured discontinuities traces. b Simulated traces formed by intersection between sampling window and stochastic model

Direct visual comparison of Fig. 10a and b shows that the stochastic model is closely similar to the actual situation in terms of the number, orientation and trace length of the discontinuities (quantitative indices derived from numerical tests are given in the next section). This figure test suggests the preliminary validity of the stochastic model.

Numerical test

The graphical inspection above yields only a qualitative judgment that the model is relatively effective; further numerical tests are needed to derive quantitative indices that characterize the degree of validity. In the numerical tests, data from the stochastic model are collected, analysed statistically and compared with the measured data. The comparison should follow certain principles. In this paper, the normalized relative error and the coefficient of variation are adopted for quantitative comparison. The normalized relative error is the relative error of the model data with respect to the original data, expressed as a percentage (Wang et al. 2004):

$$\eta = \left| {\frac{M - T}{T}} \right| \times 100\,\%$$
(6)

where η is the normalized relative error, M is the mean of model data, T is the mean of the original data.
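As a small sketch of Eq. (6), with a hypothetical function name and illustrative means that are not values from Tables 6 or 7:

```python
def normalized_relative_error(model_mean, measured_mean):
    """eta = |(M - T) / T| * 100 %, following Eq. (6)."""
    if measured_mean == 0:
        raise ValueError("the measured mean T must be non-zero")
    return abs((model_mean - measured_mean) / measured_mean) * 100.0

# Illustrative mean trace lengths in metres: model 2.3, measured 2.0.
eta = normalized_relative_error(2.3, 2.0)
print(round(eta, 2))
```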

This definition shows that the smaller the value of η, the more similar the two groups of data are. When η is less than 30 %, the model data are usually considered to be in good agreement with the measured data. The coefficient of variation is defined as the ratio of the standard deviation to the mean value (Wang et al. 2004):

$$\text{cov} = \frac{\sigma }{\mu } \times 100\,\%$$
(7)

where cov is the coefficient of variation, which indicates the level of uncertainty of the parameters or model, σ is the standard deviation, and μ is the mean value.
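Equation (7) can be sketched with the Python standard library. Whether the sample or the population standard deviation is intended is not stated here; the sketch assumes the sample standard deviation, and the function name and data are illustrative.

```python
import statistics

def coefficient_of_variation(values):
    """cov = sigma / mu * 100 %, using the sample standard deviation (an assumption)."""
    mean = statistics.fmean(values)
    if mean == 0:
        raise ValueError("the mean must be non-zero")
    return statistics.stdev(values) / mean * 100.0

# Illustrative values only.
cov = coefficient_of_variation([1.0, 2.0, 3.0])
print(cov)
```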

If cov is more than 30 %, the variation of the data in the group is considered large and the confidence level low. In addition, the orientation elements of discontinuities (dip direction and dip) are related to each other in stochastic simulation and should not be treated separately. The following equations (Wang et al. 2004) can be used to evaluate the spherical deviation of each set of orientation data:

$$\zeta = \sqrt {1 - \frac{\left| R \right|}{N}}$$
(8)
$$\left| R \right| = \left[ {\left( {\sum\limits_{i = 1}^{N} {l_{i} } } \right)^{2} + \left( {\sum\limits_{i = 1}^{N} {m_{i} } } \right)^{2} + \left( {\sum\limits_{i = 1}^{N} {n_{i} } } \right)^{2} } \right]^{1/2}$$
(9)

where ζ is the spherical deviation of the orientation data, \(l_{i} ,\,m_{i} ,\,n_{i}\) are the Cartesian components of the normal vector of the ith discontinuity, and N is the number of discontinuities in each set.

It is generally accepted that a value of ζ larger than 0.25 indicates a less concentrated and statistically insignificant orientation data set.
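A sketch of the spherical deviation is given below. It assumes that |R| is the length of the vector sum of the unit normals, the usual definition of the resultant, and that all normals of a set are expressed in the same hemisphere so that opposite-sense vectors do not cancel. The function name and the sample normals are hypothetical.

```python
import math

def spherical_deviation(normals):
    """zeta = sqrt(1 - |R| / N) for a list of unit normal vectors."""
    n = len(normals)
    rx = sum(v[0] for v in normals)
    ry = sum(v[1] for v in normals)
    rz = sum(v[2] for v in normals)
    resultant = math.sqrt(rx * rx + ry * ry + rz * rz)
    return math.sqrt(max(0.0, 1.0 - resultant / n))

# Illustrative cases: a perfectly concentrated set and a strongly scattered pair.
tight = spherical_deviation([(0.0, 0.0, 1.0)] * 4)
scattered = spherical_deviation([(1.0, 0.0, 0.0), (0.0, 1.0, 0.0)])
print(tight, scattered > 0.25)
```

A perfectly concentrated set gives ζ = 0, and ζ increases as the normals scatter.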

The quantitative indices derived from the numerical tests of Fig. 10 show a normalized relative error of less than 7 % for the number of discontinuities, a spherical deviation of less than 0.25 for orientation, and a normalized relative error of less than 11.48 % for trace length.

Numerical tests are carried out for the model on the sampling window; the results are listed in Table 6.

Table 6 Results of comparison between numerical model data and in situ measured data on sampling window

All the data of the model to be validated were gathered and analysed; the results of the overall effectiveness test are listed in Table 7.

Table 7 Results of comparison between numerical model data and in situ statistical data in three-dimensional space

According to the results listed in Tables 6 and 7, the stochastic model fulfils the requirements of statistical testing, both for the sampling window inspection and for the statistics of the whole data set. The maximum difference ratio is 15.20 %, the maximum coefficient of variation is 28.01 % and the maximum spherical deviation is 0.22. Most of these values are within their respective thresholds; the simulation result of set 2 is out of range, which may result from the larger accumulated error induced by the largest number of stochastic discontinuities of this set in the whole model. These results indicate that, under the specific conditions considered, the stochastic model is relatively reliable. It can therefore better meet the needs of practical engineering applications, and mechanical calculations and seepage simulations based on this model can be expected to have relatively high accuracy and theoretical reliability.

Conclusions

Our group has embarked on an integrated research program whose main goal is to build a suitable stochastic discontinuity network model for the Jijicao block (Beishan, Gansu province), one of the main candidate sites for the Chinese HLW repository. In the present paper, the methods of data measurement and digitalization and the model establishment procedure are described in detail. The reliability of the model was verified graphically and numerically. The validation results suggest that the model is feasible and effective. The proposed methods provide a new approach to stochastic discontinuity network simulation. The main conclusions are as follows:

(i) Significant bias may be introduced if bias correction is neglected. Orientation frequency bias correction should be conducted when the dip angle of the discontinuity is relatively small, usually less than 20°.

(ii) Based on in situ measurement data, two homogeneities are identified in the Jijicao block, the dominant sets in each homogeneity are delineated (two sets in homogeneity I, three sets in homogeneity II), and three-dimensional DFN models are finally established for both homogeneities.

(iii) Statistical results of the measured data show that two dominant sets can be identified in homogeneity I of the Jijicao block. The optimal diameter probability distributions of both sets are lognormal.

(iv) Numerical validations, both for the sampling window inspection and for the statistics of the whole data set, demonstrate that the generated model is capable of representing in situ fracture features from a practical point of view. Taking homogeneity I as an example, compared with the measured data, the maximum difference ratio of the model is 15.20 %, the maximum coefficient of variation is 28.01 % and the maximum spherical deviation is 0.22. Mechanical calculations and seepage simulations based on this model can be expected to have relatively high accuracy and theoretical reliability.