A Non-parametric Discrete Fracture Network Model

A discrete fracture network (DFN) model based on non-parametric kernel density estimators (KDE) and directional-linear statistics is developed. The model provides a characterization of the fracture network with distributions of fracture orientation and size jointly. A solution to the Bertrand paradox is used for the calculation of disk sizes from trace lengths, the latter calculated from the intersection of disks and highwall faces by triangulation. A Poisson point process is applied for the generation of the model, with fractures assumed to be flat and circular in shape, the number of fractures per unit volume (P30) adjusted to match the experimental length of fractures per unit area (P21). Length censoring of traces due to the surface dimension is considered in the calculations by including semi-bounded traces, i.e., traces censored in one of their ends. Orientation and size biases are corrected with a weighting function in the random sampling. The truncation effect whereby no traces shorter than some cut-off length are recorded, is addressed by a randomized optimization algorithm. The joint fracture orientation-size distribution model developed is tested with trace maps of discontinuities measured from photogrammetric models of twelve highwall faces of quarry benches, with outstanding results. Computational advantages over traditional parametric fracture models are addressed. Nonparametric directional-linear statistics has been used for the construction of DFN models. The mathematical analysis of this new approach is described in detail. A new link between distributions of disc size and pseudo-trace lengths is established. Relationships between means and variances for the distributions of trace length and disc size are discussed. Censoring bias is corrected by a mixed random variable and truncation bias is corrected by including semibounded traces and using a probabilistic filter. The nonparametric DFN model developed is applied to photogrammetric discontinuity data maps. Consistency of the results is assessed and thoroughly discussed. Nonparametric directional-linear statistics has been used for the construction of DFN models. The mathematical analysis of this new approach is described in detail. A new link between distributions of disc size and pseudo-trace lengths is established. Relationships between means and variances for the distributions of trace length and disc size are discussed. Censoring bias is corrected by a mixed random variable and truncation bias is corrected by including semibounded traces and using a probabilistic filter. The nonparametric DFN model developed is applied to photogrammetric discontinuity data maps. Consistency of the results is assessed and thoroughly discussed.

Intensity parameter, number of disk centers inside a volume (m −3 ) P 30 Estimator of P 30 (m −3 ) P 21 Intensity parameter, length of fractures per unit area (m −1 ) P 32 Intensity parameter, total disk area divided by the volume of the domain (m −1 ) q Dimension of the sphere R Disk radii random variable (m) r Radius (

Introduction
Characterization of fractured rock mass is a critical task in the development of open-pit and underground mining for the design of slopes, support means or blast optimization. Discrete Fracture Networks (DFNs) have proved to be an effective tool for the three-dimensional representation of a natural fracture network (Baecher et al. 1977;Long et al. 1982;Baecher 1983;Andersson et al. 1984; Dershowitz and Einstein 1988;Zhang et al. 2002;Miyoshi et al. 2018) and are increasingly being used in many geotechnical and mining engineering problems (Elmo et al. 2015;Elmouttie and Poropat 2012;Umili et al. 2020) e.g., in analysis of rock slope stability (Bonilla-Sierra et al. 2015), geo-mechanical and hydrological behavior of fractured rocks (Robinson 1983;Andersson et al. 1984;Lei et al. 2017), seismic attenuation and stiffness modulus dispersion Hunziker et al. (2018), and multi-physics processes, see e.g., Keilegavlen et al. (2021). The basis of the DFN method is the characterization of fracture parameters using statistical analysis to build a computational model that explicitly represents the geometrical properties of each individual discontinuity and the topological relationships between individual discontinuities and discontinuity sets (Lei et al. 2017;Miyoshi et al. 2018). In general, the term 'discontinuity' groups all weak planes of geological origin, such as fractures, joints, bedding planes, cleavages and faults (Lu 1997). In what follows we are treating all these terms equivalently. Among the geometrical properties of each fracture, one may emphasize orientation, size, location, shape and aperture and their corresponding probability distributions (Baecher et al. 1977;ISRM 1978;Baecher 1983;Dershowitz and Einstein 1988;Dershowitz and Herda 1992). Other relevant characteristics may also be considered, such as genetic type, filling material ) and roughness (Tang et al. 2022 andZou et al. 2022).
Traditionally, DFN models group visible fractures by orientation in families of poles (ISRM 1978) on the stereographic projection. This is done by dividing the stereogram into sectors and fitting directional probability distributions. The most frequent are the bivariate von Mises-Fisher and Bingham (Bingham 1964;Baecher 1983;Fisher et al. 1993). The goodness of the fit is assessed using the chi-squared and the Kolmogorov-Smirnov tests (Baecher 1983). Similarly, analytical probability density functions (pdfs), such as lognormal, uniform, power law, gamma, normal, etc., are used to characterize the distribution of trace lengths measured on an outcrop (Tonon and Chen 2007). Even though analytical forms seldom provide satisfactory approximations to orientation data (Baecher 1983), works in which the statistical analysis is performed in a non-parametric way are rare, see e.g., Xu and Dowd (2010), and those that treat jointly the disk orientation and size pdfs in a non-parametric way are non-existent, to the authors' knowledge. In this work, we use directional-linear statistics as a way to jointly estimate the density functions of fracture sizes and orientations, based on the theory of directional-linear kernels by García-Portugués et al. (2013). In directional statistics, the sample space is the unit sphere, while in directional-linear statistics, the sample space is the Cartesian product of a sphere and the real line.
Because the rock mass structure is impossible to be directly observed without dismantling it completely (Priest 1993), a variety of methods must be used for sampling discontinuity data. From that observation, the 'true' joint density function of orientation-sizes inherent to the rock mass is inferred through stereology. Data from exposed rock faces, such as natural outcrops, excavation faces, and boreholes have proved to be useful (Tonon and Chen 2007). Among sampling techniques for deriving fracture parameters, the most utilized are in-hole images or wireline geophysical logging (Ozkaya aand Mattner 2003;Hekmatnejad et al. 2019), scanline survey, circle sampling or window survey (Priest and Hudson 1981a, b;Kulatilake et al. 2003;Kulatilake and Wu 1984;Einstein 1998, 2000;Song and Lee 2001;Jimenez-Rodriguez and Sitar 2006), as well as digital photogrammetry and laser scanning techniques (Elmo et al. 2015;Vollgger and Cruden 2016).
The history of the problem of determining density functions of disk sizes from trace length probability density functions is well summarized by Tonon and Chen (2007). The work of Warburton (1980b) provided a starting point for estimating trace length probability density functions from disk size density functions, using stereological principles, for disks without aperture, circular in shape, parallel to each other and with centers arranged within a volume following a Poisson process, see e.g., Baecher et al. (1977). Other contributions assume fractures to be elliptical in shape (Zhang et al. 2002) Zheng et al. 2022). The inverse process, i.e., obtaining disk sizes pdf from trace lengths, was already formulated years before by Santaló (1955). However, works by Chen (2007, 2010) showed that this relationship leads in some cases to non-positive pdf values; it also fails for bimodal or multimodal distributions. As this type of distributions is common in the field, a solution to this problem is explored in this work using one of the Bertrand's paradoxes (Bertrand 1888;Aldous et al. 1988;Chiu and Larson 2009) and its relation with Warburton's work (Warburton 1980a).
When characterizing rock exposures, one should consider bias effects inherent to the sampling procedure. According to Kulatilake and Wu (1984), Priest and Hudson (1981a, b), Laslett (1982), Baecher and Lanney (1978), Baecher (1983), Villaescusa and Brown (1992) and Chilès and Marsily (1993) there are four main causes of bias in the distribution of trace lengths: censoring, truncation, size and relative orientation. Censoring occurs when one or both terminations of a trace are not observable. This effect occurs when the surveying surface is finite; the disk orientation-size density function and the size, shape and orientation of the surface to be intersected have also some influence. The resulting traces may be grouped into three categories according to the number of terminations or censoring condition, namely complete, semi-bounded and bounded (White and Willis 2000). Truncation is introduced because traces shorter than a cut-off length (Baecher 1983) cannot be observed. When an observer maps traces on an outcrop, unintentional truncation occurs. Operator's skills, and sampling scale, makes that small traces are less likely to be observed. That is what Riley (2005) called 'the protocol adopted for recording short traces (lower truncation limit)'. Size bias occurs because longer traces are more likely to be sampled than shorter ones (Baecher 1983;Jimenez-Rodriguez and Sitar 2006) thus, large fractures are over-represented and this must be considered when estimating the true size distribution. A fourth bias effect is caused by the relative orientation between fractures and the outcrop surface: a fracture is more likely to intersect a surface the more perpendicular their relative orientations are. All these phenomena need to be suitably accounted by the model to interpret the true nature of the rock mass in the best possible way.
The spatial structure of the fractures' support points is another fundamental aspect in DFN models. The homogeneous Poisson-Boolean point process has been widely used (Baecher et al. 1977;Kulatilake and Wu 1984;Zhang and Einstein 1998;Song and Lee 2001;Jimenez-Rodriguez and Sitar 2006;Hekmatnejad et al. 2019). Other types of pointsupporting processes are the doubly stochastic Poisson, or Cox, the Levy-Lee Fractal, the non-stationary Poisson and the Gibbs processes, among others. See e.g., Aldous et al. (1988), Lee et al. (2011) and Hekmatnejad et al. (2019). The decision on the process to be applied depends on the overall breakage pattern of the rock.
In this work, a novel distribution-free DFN generation technique is developed. It uses a directional-linear KDE to characterize the joint distribution of orientations and trace lengths in a non-parametrical way. The DFN models are calibrated using Monte Carlo realizations of the non-parametric distributions (Zheng et al. 2014 andZheng et al. 2015). Poisson point process is used for fractures location and these are assumed to be flat and circular in shape. Triangulated surfaces are used to conduct the intersection of a disk with a three-dimensional surface; a MATLAB (MATLAB 2019) function called SurfaceIntersection.m (Tuszynski 2014) based on a triangle/triangle intersection algorithm proposed by (Moller 1998) is used for this purpose. A solution to the Bertrand paradox (Bertrand 1888;Chiu and Larson 2009) is considered to estimate disk sizes from trace lengths. Correction of sampling biases is also addressed. The experimental value of P 21 , i.e., length of fractures per unit area for each trace map, is fitted through the number of fractures per unit volume (P 30 ), see (Zheng et al. 2017). Field data used to test the model were collected from photogrammetric models of non-planar highwall faces. Disk orientations are obtained by fitting a plane to the 3D coordinates of the polyline fitted to the exposed trace.
In the following sections, data collection and subsequent data processing techniques are described; the mathematical foundations of the model are developed and the calibration process presented. The results obtained from the model are shown, followed by a discussion on the approximation to calculate the trace length distributions and its validity. The influence of the surface on the distribution of trace lengths and the relation between fracture intensities is also investigated.

Data Overview
The experimental work was carried out at El Aljibe quarry (Almonacid de Toledo, Spain), where mylonite is mined by drilling and blasting. Discontinuities were mapped from photogrammetric models of the highwall faces of 12 blasts in which the sampling scale was about 10 mm. The blasts were arranged in two test campaigns of six blasts each, located in the same area of the pit. A major extensional fault defines two structural domains DS1 (trace maps TM1-TM6) and DS2 (trace maps TM7-TM12) with different fracture density. Bernardini et al. (2022) give an insight to data collection, discontinuity sets and a classical DFN construction using the FracMan ® suite (Golder Associates 2018).
For the analysis, point clouds of the outcrops and discontinuity characteristics (dip, dip direction, and coordinates of polylines' points) are treated in.csv format. The orientations of the 12 faces are similar with a dip direction of 228.54 ± 4° and a dip of 22.98 ± 3.59° (circular mean ± standard deviation). Because the inspected area is smaller than the area of the complete photogrammetric model, it is cropped with a MATLAB function in a way that the parameter P 21 is not affected. The limits of the outcrop surface considered for the calculations are defined by the minimum and maximum coordinates of the traces. The resulting surface areas range between 200 m 2 (TM6) and 390 m 2 (TM10) approximately.
The number and length of the mapped fractures are shown in Table 1. It also shows the sampling window dimensions, areal fracture density P 21 and the percentage of complete, semi-bounded and bounded traces. These are plotted in red, green and blue, respectively in the trace maps of the twelve faces shown in Fig. 1; complete discontinuities correspond to more than 85% of the total, while bounded fractures are observed only in five blasts. Figure 16 shows the stereograms of the experimental poles obtained for each trace map (black dots). There is an increase in the number of fractures marked in DS2 domain (trace maps TM7-TM12), due likely to the effect of the fault mentioned above. This results in a mean P 21 of 0.72 m −1 for DS1 and 1.23 m −1 for DS2. The histograms of the lengths of the fractures mapped are shown in Sect. 3.1.2 (Fig. 4).

Model Description
The DFN model proposed treats fractures individually instead of considering their belonging to a certain set or family. KDEs are suitable to this end, since they are versatile enough to be used in both directional and linear domains. A Poisson-Boolean point process has been considered to model the location of the support points which is characterized by a constant parameter, P 30 . Disks are described as  flat and circular, although they are modeled as triacontagons. 1 Fracture sizes and orientations have been considered to be independent. Corrections are implemented to account for orientation, size, truncation and censoring biases and a method has been developed to random sampling the directional-linear KDE; this sampling method is also applicable to non-planar surfaces such as those observed in the field.

Directional-Linear Kernel to Represent Trace Maps
The DFN is based on a non-parametric distribution using the kernel density estimation for linear-directional data by García-Portugués et al. (2013). Let Z denote a linear random variable with density f g whose support is all the real numbers. For a random sample of Z, with size n, the traditional linear KDE is given by García-Portugués et al. (2013): where K Z is a suitable kernel smoother, and g is the bandwidth of the linear part. In this work we assume a Gaussian smoother. This linear variable may be applied for any length or size of discontinuities equivalently. And let X denote a directional random variable in Cartesian coordinates with density f h that accounts for orientation. The support of such a variable is the q-dimensional hemisphere, denoted by: where ω q is the Lebesgue measure in Ω q , which is the surface area of Ω q and Γ(·) is the gamma function. For spherical data q = 2 and the Lebesgue measure in Ω 2 is 2π. Likewise, for a random sample of X, the directional KDE is given by García-Portugués et al. (2013): where K L is a suitable kernel smoother, h is the bandwidth of the directional part and c h,q (K L ) is a normalizing constant for any x ∈ Ω q , which is given by García-Portugués et al. (2013): In this case, the von Mises-Fisher kernel smoother has been chosen for simplicity: For a directional-linear random variable (X, Z) with support Ω q × ℝ and joint density f h,g , the KDE of orientationsizes may be expressed as (García-Portugués et al. 2013): where the notation K L K Z should be understood as:

Log-transformed Kernel for Linear Data
To have a kernel support positive, to prevent negative sizes in the distribution, and to meet the condition f g (0) = 0 , a variable Z* = ln(Z) is introduced (Geenens and Wang 2018). From standard arguments, one has: which readily suggests using the estimator f Z * of f Z * .

Optimal Bandwidths
Bandwidth selection of the KDE is an important aspect in non-parametric statistics; the rule-of-thumb (ROT) bandwidth selector h ROT for directional data (García-Portugués et al. 2013) is considered in this work: The parameter ̂ is estimated by maximum-likelihood. The ROT selector is simple but fails when estimating directional densities with multimodality (García-Portugués 2013). This is an issue in geological discontinuity representation, since orientation measurements often present multimodality (in fact the usual procedure is to group 1 Thirty-sided regular polygon. traces by families, i.e., different modes). To solve this, we consider only data around the likeliest point in the stereogram, so that the information of the main mode is well fitted. h ROT is then calculated from points that do not differ more than 45° from the likeliest point. Figure 2 shows as example the likeliest orientation and the area considered to calculate h ROT with Eq. (10) in trace map TM8; the marginal KDE of orientation plotted was obtained with h ROT = 0.126. The location of points more than 45° away from the likeliest pole is obtained through a parametrization of the hemisphere-cone intersection (see Sect. 3.3.2). The marginal KDEs of the discontinuity orientation are shown in Fig. 3. They are constructed onto a dense equi-spaced Fibonacci grid through a MATLAB algorithm called SpiralSampleSphere.m (Semechko 2012). The likeliest orientations, which are not evident in all the graphs, are marked with white crosses. Bandwidths from Eq. (10) are given in Table 4.
For the linear part of the KDE, the maximum-likelihood cross-validation MLCV (García-Portugués 2013) is used for bandwidth selection; g is chosen to maximize the following quantity: where MLCV(g) is given by (García-Portugués 2013): Figure 4 shows the histograms (blue bars) and the marginal KDEs (red lines) of pseudo-trace lengths (see next section) calculated with the bandwidths obtained with Eqs. (11) and (12). They are given in Table 3 (11)

Trace Length-to-Fracture Size Distribution Estimation
When a fracture, in this case modeled as a circular disk of a given size, intersects a plane surface of infinite extension, the result is a line whose initial and final points are located on the perimeter of the disk. The exposure of this intersection on the surface is called trace and its length is traditionally called trace length. However, when the surface is non-planar, the intersection between the disk and the surface becomes a polyline. Its length, if measured in 3D, is that we call the 'true' trace length. The higher the specific area of the surface, the longer the true length is. This is an inconvenience when calculating fracturing intensity parameters that must be matched with experimental measures since these are closer to the projected length on the mean plane of the surface. This is solved by considering the pseudo-trace length of the chord that a disk forms when intersects the irregular three-dimensional surface. The probability of a trace obtained from the intersection of a disk with a surface of infinite extension to have a certain length can be approached with the Bertrand's Paradox (Bertrand 1888). This, however, offers several answers, all plausible, to the interpretation of the concept of randomness Chiu and Larson (2009). Intuitively, one might choose the paradox No. 3, in which the chord center is assumed to be uniformly distributed over a reference radius. This is equivalent to selecting a random point on a reference radius to become the center of a random chord (see Fig. 5).
Following the scheme in Fig. 5, if one starts from the basis of a disk size distribution D , or a disk radii distribution R = D /2 (a triangular distribution has been chosen arbitrarily for the distribution of radii in the right graphs of Fig. 5), and takes any point on a reference radius of any of those disks, a distribution of lengths Λ from the center of the disk to the mentioned point can be formed. For an arbitrary disk radius r, Λ can take values between 0 and r following a uniform distribution, then Λ may be written as the product of two distributions: where U(0,1) is the uniform distribution between 0 and 1. Equation (14) is represented as operation a on the right part of Fig. 5. From Fig. 5, the distributions of L , R and Λ are related by: Inserting Eq. (13) in Eq. (14), L may be directly obtained from D: Dividing the pseudo-trace lengths by their respective diameters one may obtain: where Y is a random variable whose density function may be expressed as: In practice, the outcrop has finite extension so that some traces are censored. Thus, a modification that corrects the censoring bias has to be included. This is accomplished with a mixed random variable Ξ: where is the incomplete pseudo-trace length random variable, see Fig. 5 operation d, and Ψ a new random variable resulting from multiplying Y and Ξ, see Fig. 5 operation c, using the properties of product distributions. The following cumulative distribution function has been selected for Ξ: where p is a parameter that indicates the proportion of semibounded traces over the total number of traces; it is one of the fitting parameters of the model. This function has been built to be a mixed random variable so that it only applies to a proportion of fractures to be treated as semi-bounded; the square has been chosen to enhance the occurrence of long traces. The resulting probability density function of Ψ is: (20) A proof of the validity of Eq. (20) is offered in Appendix A.4.
Equation (18) is of key importance, as it directly relates trace lengths with their respective disk sizes including the censoring bias; moreover, it allows using non-analytic distributions and finite surfaces to obtain the disk size distribution for each orientation, based on the distribution of measurable trace lengths using the definition of ratio distribution: where the KDE f h,g is used. A list of quadruplets composed by the Cartesian coordinates of each pole on the unit hemisphere, and the incomplete pseudo-trace lengths, is associated to each trace map. For a given orientation X i (the first three) there is a one-dimensional function of the type f h,g X i , l (see the integrand in Eq. (21)). Such function is normalized to obtain the cumulative distribution function F . This may be done for each X i so that a disk size distribution is obtained for each orientation. To obtain disks larger than the incomplete pseudo-trace length with which it is associated, the distribution function of experimental incomplete pseudo-trace lengths is used to calculate the probability level of a given pseudo-trace l i . This level is considered in the distribution function obtained from Eq. (21) to get the respective disk size D i through: where F D is the distribution function obtained from Eq. (21).
To simplify the calculation, these functions are taken as piecewise linear. Equation (21) is solved numerically for each pseudo-trace. Figure 6 shows the result for trace map TM12 and P = 0.18. As it should, the cumulative distribution of disk sizes (in red) is shifted to the right, so that chord length percentiles are always smaller than the percentiles of fracture size.

KDE Random Sampling
To construct a DFN model, the directional-linear kernel must be sampled randomly. This is done by (i) random sampling the kernel smoother, (ii) randomly selecting one of the experimental data points with replacement and (iii) adding both values. Random sampling must consider that each point has a different weight, as will be explained in Sect. 3.3.3. The properties of the kernel are associated to all points that

Random Sampling of Disk Sizes
Once KDE is sampled, the disk size is taken from the quadruplet associated to each individual discontinuity. A random value depending on the kernel smoother used and the directional bandwidth must be added or subtracted to the disk size. For convenience, it is possible to transform the linear kernel into a quantile function; let be the linear part of the KDE: Since a Gaussian smoother is used, the cumulative distribution function of is given by: where erf is the error function. Solving for z in Eq. (24), one obtains:

Random Sampling of Orientations
For the purpose of randomly sampling orientations on the unit hemisphere, points within a certain angular distance around a pole must be selected. This may be accomplished by intersecting a cone with a random opening angle with the hemisphere and then randomly selecting one of the points on the intersection circle. This way, a parametrization of the circle (sin cost, sin sint) is carried out by mapping it onto the cone-hemisphere intersection; δ is the cone's half angle and t a random angle between 0 and 2π. The transformation consists of a rotation and a translation, which reduces to finding a suitable orthonormal basis for ℝ 3 . Figure 7 illustrates the orientation sampling process. For the translation part, the center of the circle is shifted from the origin to: where c is the intersection of the cone's axis with the circle's plane and n is a unit vector which corresponds to the Cartesian coordinates of any disc's pole. For the rotation part, the z-axis must be mapped to the normal of the intersection circle's plane. If: Then let be a horizontal unit vector and With these two vectors, one may construct the parametrization: δ is derived from the von Mises kernel as a quantile function: If the z coordinate of P is positive, then it is transformed in its antipode, so that it is always on the lower hemisphere.

Weighting Function for Random Sampling
When sampling a joint distribution of disk orientations-sizes within a domain and intersecting them with the highwall face, a disk is more likely to intersect the surface the larger and the closer to perpendicular to the surface it is. Three joint orientation-size distributions are considered: (i) the 'true' distribution, (ii) the distribution of disks that intersect the surface, and (iii) the experimental distribution of pseudo-traces. Weighting must be applied to both size and orientation to obtain (i) from (ii), and vice versa. Obtaining  where w(x) is an appropriate weighting function and E[⋅] denotes mathematical expectation. Here f(z) and f W (z) correspond to distributions (i) and (ii), respectively. Equation (32) may be applied for sizes and orientations equally. w(z) may be numerically obtained by computing the ratio between the bin values of the histogram of sizes of disks that intersect the surface (distribution ii) and the histogram of sizes of all the disks generated within the domain (distribution i). These ratios fit well a power law: a being a constant and γ an additional fitting parameter of the model. The value of a is not relevant because it vanishes when normalizing. Baecher (1983) already considered a quadratic weighting function.
(33) w(z) = az Figure 8 shows the process of obtaining γ. By way of example, five hundred thousand disks that follow a uniform distribution of diameters between 0 and 3 m and with random orientation are modeled. Green and red histograms in the left graph represent the 'true' distribution and the distribution of disks that intersect the surface, respectively. Note the size bias effect caused by the surface. The ratio between bin values of both histograms is shown in the right graph (blue circles). The determination coefficient of fitting Eq. (33) to these points (red line) is 0.996, a = 0.6675 ± 4.7·10 -3 and γ = 0.9959 ± 8.3·10 -3 at confidence level of 95%. The fact that an infinite planar surface is assumed makes γ ≈ 1. For a non-planar case, γ takes values higher than 1. For the faces studied in this work, γ ranges from 1.13 to 1.48 (see Table 4 in Appendix B); γ is statistically significant in all cases.
The bias correction due to the relative orientation between disks and the intersection surface (the classic Terzaghi correction (Terzaghi 1965) is made by the function: Fig. 8 Example of calculation of γ. Green and red histograms in the left graph represent the 'true' distribution of disks and the distribution of disks that intersect the surface, respectively (colour figure online) Fig. 9 Weighting function for orientations. Green and red histograms in the left graph represent the 'true' distribution of the dot product T X i and the distribution of disks that intersect the surface, respectively (colour figure online) where b is a constant and μ T is the average normal vector to the surface in case this is non-planar. The value of b is not relevant because it vanishes when normalizing. The vertexNormal.m function in MATLAB has been used to calculate μ T which is the normal unit vector to the surface at each of the points of the face. This is a modified version from Mauldon and Xiaohai (2006). Figure 9 shows the validity of Eq. (34) for a similar simulation, including an infinite planar surface, to that in Fig. 8. The true density function of the dot product T X i is the green histogram in the left graph, and the red one is the distribution of disks that intersect the surface. The ratio between bin values of both histograms is shown in the right graph (blue circles). Fitting Eq. (34) (red line) to these points leads to a determination coefficient of 0.9984 with b = 1.186 ± 7·10 -3 at a confidence level 95%.
Equations (33) and (34) are combined into a single weighting function for calculating the true distribution from the distribution of disks that intersect the surface:

Probabilistic Truncation
Truncation can be described as a 'perfect filter' that discards samples when a certain bound is exceeded. This work considers a 'probabilistic filter' that groups the sources of error cited in Sect. 1 above and truncates fractures according to a probability level associated with a threshold pseudo-trace length. A simple function that that accepts or rejects traces based on a probability level is the log-logistic distribution function: where α is a scale parameter with units of length that is the trace length with a 50% chance of being observed according to the experimental protocol (e.g., ground sampling distance, operator's skills, etc.); β is a dimensionless parameter that controls the filter steepness. Low values of β indicate both a large operation range and a low slope, while high β values indicate a sharp transition.
The filter in Eq. (36) is calibrated through probabilistic optimization. The starting point is the histogram of simulated pseudo-trace lengths for a certain trace map. Each trace has its respective length and, therefore, a probability level of being or not being observed by the operator according to Eq. (36). α and β are estimated such that experimental and simulated pseudo-trace length histograms match. A randomized algorithm based on the binornd.m routine has been developed in MATLAB. It consists in sampling a Bernoulli distribution with a vector composed of n probabilities, n being the number of disks that intersect the surface. The result is a logical vector with 0 when the trace is not observed and 1 when the trace is observed. This process is repeated one thousand times for the whole set of pseudotrace lengths in each highwall face; for each of those realizations, a Kolmogorov-Smirnov test between the experimental and simulated histograms is performed. The average of the test P values is the parameter to be maximized. The number of disks deemed observed after this filtering is N. Figure 10 shows as an example the effect of the probabilistic filter in the simulated pseudo-trace lengths of TM3. Note that there are many small traces that have not been observed as their sizes were probably below the resolution of the experimental discontinuity detection system.

Model Calibration and Results
The objective of the model is to estimate the value of fracture intensity P 30 that replicates the experimental P 21 value for each trace map. The process followed to calibrate the DFN model is summarized in Appendix C; (i) Bandwidths h and g are calculated according to Eqs. (10), (11) and (12). (ii) The pseudo-trace length KDE for each orientation is calculated using Eq. Steps (iii) to (iv) are repeated until convergence. Then (v) the probabilistic truncation process in Sect. 3.4 is applied to fit the truncation function parameters α and β in Eq. (36). The resulting mean number of traces that intersect the face after applying the probabilistic filter, N, is obtained in this process.
An estimator of P 30 , i.e., number of disks to be simulated within the domain to obtain the P 21 observed on the surface in the field is: where l i is the pseudo-trace length of a trace observed experimentally, l is the mean of the pseudo-trace length KDE, TF is the total number of fractures generated within the domain and V the volume of the domain.
The parameters of the model (h, g, p, γ, α and β) are summarized in Table 4 for the twelve highwall faces. Fracture intensities P 30 , P 21 and P 32 are given and also the percentage of fractures that intersect the outcrop over the total number of disks generated, N/TF. An example of DFN model is shown in Fig. 11, see others in Fig. 20, Appendix B. The scale parameter α of the log-logistic function, Eq. (36), decreases from TM1 to TM12 (Table 4). This implies that shorter traces are marked for the last faces. This might be likely caused either by local changes in rock mass due to the existence of the extensional fault that crosses the highwall faces TM11 and TM12, see Bernardini et al. (2022) for additional information. Figure 12 shows the fracturing intensity parameters for the twelve trace maps. They are in  general higher for faces in DS2 domain (TM7-TM12), and they all follow a similar trend except in TM11 and TM12, where P 30 increases and P 21 and P 32 do not. This may be explained as Table 1 shows due to an increase in the number of fractures, and a decrease in their mean sizes (raised to the second power), especially in TM12.
To illustrate the model performance, Fig. 13 shows, for TM1: traces (chords) obtained from a simulation (green lines in the left graph); the marginal KDE of pseudo-trace lengths obtained from experimental data (red curve in the central graph), together with the histogram of pseudo-trace lengths obtained from the simulation (blue bars); and the map of experimental and simulated poles (black and red dots, respectively, right graph). The model is stochastically similar to the observations. The number of fractures observed to intersect the surface is 66 that coincides with the average number of simulated disks that intersect the surface. All other cases are shown in Fig. 16.

Discussion
To assess the validity of the proposed DFN model, the relations proposed by Zhang et al. (2002) for trace lengths and disk sizes must be verified and typical relations among fracture intensities must be met.
The common relations between true disk sizes and pseudo-trace lengths distributions Zhang et al. (2002) have been used to assess Eq. (17) in Sect. 3.2. Such relations were derived for some common distributions of trace length, such as lognormal, negative exponential and gamma, assuming the same distribution for both disk size and pseudo-trace length. Three numerical simulations have been performed with disk sizes following: (i) a lognormal distribution with μ = 0.5 m and σ = 0.25 m, (ii) a negative exponential distribution with μ = 0.25 m, and (iii) a gamma distribution with α = 2 and β = 0.5. The resultant distributions obtained with Eqs. (17) and (35) are shown in Fig. 14. Green histograms correspond to the true disk size distribution and the blue ones the pseudo-trace lengths. The distributions of disks that intersect the surface have also been plotted (red), as these distributions are used for calculating the pseudo-trace lengths. Table 2 summarizes the means and variances (m and v) of the different distributions obtained in the simulations. They are asymptotically equal to those obtained from the relationships provided by Zhang et al. (2002) (for a two million disk simulation they are equal to the third significant digit). Note in Fig. 14 that the distribution type of disk sizes Fig. 14 Lognormal (left), negative exponential (center) and gamma (right) distributions of true disk sizes (green), size of disks that intersect the surface (red) and pseudo-trace lengths (blue) (colour figure online) Table 2 Relation between parameters (mean, m and variance, v) of the true disk size (subscript D) and the pseudo-trace length (subscript L) distributions; same colors as in Fig. 14  and traces is not necessarily preserved; this is quite obvious for the lognormal and the negative exponential cases. Even if the means and variances are well estimated, the actual distributions can be grossly different. A linear relationship would be expected between the fracture intensities P 21 and P 32 for domains with similar properties (Dershowitz and Herda 1992;Mauldon 1994;Wang 2005), with a ratio P 21 /P 32 between 0 and 1. This applies irrespective of the disk size distribution. To verify this, values of P 32 and P 21 from Table 4 have been plotted, grouping them by the domain they belong to, in Fig. 15. Values of P 21 /P 32 are also given in Table 4. Correlation coefficients are shown in the upper left part of the graph. Both P values are lower than 0.05 which indicates the relations are significant. The same relations were tried between P 30 and P 21 , and P 30 and P 32 , in this case without significance to the 0.05 level. P 30 provides an incomplete fracture description as it only indicates the density of support points while the size of the fractures (unlike P 21 and P 32 ) is missing in it. In fact, there is probably no reason for P 30 to be correlated, in general, with P 32 or P 21 though, for a given size distribution of fractures, more fractures will likely encompass more traces (and longer total trace length) on an intersecting surface, and more fracture surface per unit volume. To assess differences between the model developed in this work and traditional DFN models, fracture densities of the domains DS1 and DS2 built with FracMan ® suite Bernardini et al. (2022) are plotted for comparison, as cyan and orange points, in Fig. 15. They fall acceptably well in the linear trends of each pair of fracture densities, especially for the DS2 case (red and orange points).
The use of traditional DFN codes implies inevitably: (i) initial manual sectorization of fractures, which may introduce some bias because of the operator's criterion; to this, the time spent in interpreting the data must be added; (ii) parametrical characterization of the families, which involves selecting some probability distribution functions based on Kolmogorov-Smirnov tests; and (iii) calibration of an intensity parameter for each family of fractures, which entails increasing computational time with increasing number of families. In our own experience with traditional codes, the third step consumes the most computational time. The calibration process of the model proposed here saves considerable amounts of time in this step, since the KDEs considers all fractures independently of the family they might belong to. As an example, the computational time the code consumes in completing an iteration for TM10 is about 75 min on a 2.2 GHz MSI GF62 8RD laptop. Most of the time, 65.2%, is spent in solving the triangle-triangle intersections that is dependent of the quantity of disks, the number of sides of each disk and the number of triangles that make up the intersection surface, a 21% in self-time (third step in Sect. 4), a 5.5% in calibrating the probabilistic filter, a 5.3% in random sampling and the rest in other subroutines.

Conclusions
A novel distribution-free DFN generation technique is developed, making use of directional-linear KDE to non-parametrically characterize jointly the distribution of fracture orientations and trace lengths. The non-parametric approach does not require a classification of fractures into sets as parametric models do and does not demand any statement on a particular functional distribution of trace lengths or disk diameters derived thereof.
The mathematical modeling and the application steps are explained thoroughly, with the inclusion of numerical tests of consistency where required. Bias corrections are provided for censoring, size, orientation and truncation. Censoring is addressed using a new calculation method that includes semi-bounded traces. Size and orientation biases are jointly addressed using a new weighting function. The truncation bias is corrected by a probabilistic filter capable of incorporating fractures to the DFN shorter than those observed in field measurements. This endows the model with a certain scale-independent character, not the case of traditional parametric DFN models.
The model has been applied to twelve trace maps obtained from photogrammetry of highwall faces. Disk orientations  Table 3 were obtained by fitting a plane to the 3D-polyline defined by the traces. The DFN models are calibrated using Monte Carlo realizations of the non-parametric distributions. A Poisson point process is used for fractures location, assumed to be flat and circular in shape. Solution No. 3 of the Bertrand paradox has proved to be effective to estimate disk sizes from trace lengths.
The experimental value of P 21 , is fitted through the number of fractures per unit volume (P 30 ). The resulting fracture intensities P 21 and P 32 are linearly related, which supports the model validity. These intensities have been discussed with classical DFN generation systems.
While the non-parametric model does not require an assumption of a distribution type for traces and fracture diameters, its results meet the classical relations between mean and variance of trace lengths and disk diameters. It is shown however, that the distribution types of trace lengths and diameters can be markedly different, a feature that does not affect kernel-based non-parametric DFNs.

Appendix A. Proofs
In this appendix different proofs mentioned in the manuscript are exposed.

A.1 Warburton's Relationship Derivation
Consider two independent random variables X and Y. The resultant product distribution of a new random variable Z is: Taking the density function in Eq. (22) as the distribution of the random variable Y, the density function of X as the weighed disk size density function and Z as the pseudo-trace length density function, one may write: Since for infinite outcrops the weighing function w(D) = D and E[w(D)] is a constant, then the previous equation may be simplified to: Regarding integration limits, none of the functions are defined for negative values. Furthermore, as any pseudotrace length is obtained from disks with size larger than the pseudo-trace itself, then Eq. (B.3) may be written: Note the similarity of this equation with the relation obtained by Warburton (1980a).

A.2 Relationship Between Eq. (15) and Bertrand Paradox Number 3
The distribution of Eq. (15) is consistent with the distribution function obtained from Aldous et al. (1988) and Chiu and Larson (2009) for Bertrand's paradox number 3. Taking for instance disks of diameter D = 1 m, Fig. 16 compares the results from randomly sampling Eq. (15) and Bertrand's distribution functions together with some sources  Table 3 Statistics of random chord lengths of cases 1-7 described by Chiu and Larson (2009 Bertrand probability P(L > √3) of randomness (cases 4 to 7) discussed by Chiu and Larson (2009) (see Table 3). One may note that Eq. (15) and Bertrand 3 (green line) are clearly equivalent. The same works for any other diameter.

A.3 Numerical Simulation to Prove the Validity of Eq. (17)
Five hundred thousand disks with random orientations were distributed inside a cubic domain with 10 m side, following a Poisson process. Their sizes were assumed to be distributed following a lognormal distribution with parameters μ = 0.5 m and σ = 0.25 m so that they never exceeded the extension of a finite rectangular surface of 30 × 30 m. The resulting traces were divided by their respective disk sizes. Figure 17 shows the histogram of L/D (blue bars) obtained with Eq. (18) and the analytical probability distribution in Eq. (17) (red line). Red points in the right graph show the complete trace lengths corresponding to their disk sizes. The result is quite satisfactory.

A.4 Numerical Simulation to Prove the Validity of Eq. (20)
Disks with random orientations and following a Poisson process were generated within a 10 m side cubic domain so that they intersect a 5 m-radius circumferential outcrop. disk sizes may be set to follow any distribution used in the literature for trace lengths; take for example a gamma distribution with parameters α = 2 and β = 0.5. To the extent possible, these parameters were chosen in a way that the size of the surface is large compared to disk sizes, leading to a limited number of bounded traces. The result of the test is shown in Fig. 18. The graph on the upper left shows the histogram of /D (blue bars) and the density function f Ψ (red line) with p taking a value of 0.2. P has been obtained from the upper right graph, calculating the proportion of semi-bounded traces over the total number of traces. In this graph, l values are represented vs. disk size D ; red, green and blue dots in the upper plot indicate complete, semi-bounded and bounded traces, respectively. The result is, again, satisfactory. Orientation and dip have also been plotted to check independence between the random variables, see bottom graphs in Fig. 18. Orientations are randomly distributed, without a preferential pattern. Table 4 and Figs. 19, 20 sumarize the main results of the model developed in this paper.  Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature.

Data availability
The data that support the findings of this study are available in a Zenodo repository https:// doi. org/ 10. 5281/ zenodo. 75335 65.

Conflict of Interest
The authors declare that they have no conflict of interest.
Ethical Approval Not applicable.

Consent for Publication Not applicable.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.