1 Introduction

In many cases two-dimensional scalar fields are largely supported on small areas, ‘hotspots’. Examples include the distribution of human populations, which are concentrated in urban settlements; the distribution of debris on the ocean, which can be concentrated in regions where cool or saline water is subducted; and deposits of mineral ores, which can be concentrated at the points where dissolved material is deposited from evaporating water. Another example is images of star fields, where the stars appear as points. Subjectively, images of the distribution of hotspots can appear to bear a family resemblance. Although there are numerous examples of natural processes which concentrate upon hotspots, the phenomenon has not received much attention. This paper addresses the question of how these distributions can be characterised, and whether they have scale-invariant features.

The fields that we consider can be modelled by random processes. We consider random, non-negative scalar fields in a two-dimensional space, denoted by \(\phi ({\textbf{r}})\), with statistics which are homogeneous (translationally invariant) and isotropic (rotationally invariant). Extensions to higher dimensions will be obvious.

In the cases where the field is highly concentrated in the vicinity of isolated points, each point has a weight \(w_i\), which represents the integral of the field over a small region surrounding that point. These weights have a probability density function p(w). We are interested in examples where w has a very broad distribution, such that its mean value \(\langle w\rangle \) might be undefined (\(\langle X\rangle \) denotes the expectation value of X throughout). Another possibility is that \(\langle w\rangle \) does exist, but that it is dominated by large values which are unlikely to be observed in a finite sample (examples of both of these cases will be considered below).

Nevertheless, we might wish to know how the total weight of the hotspots increases with the size of the sample region, R. We assume that the hotspots are scattered homogeneously with density \(\rho \). Consider the set \({\mathcal {W}}_R\) of approximately \(\rho R^2\) values of \(w_i\) for which \({\textbf{r}}_i\) lies inside a square of side R. This set has the total weight

$$\begin{aligned} W=\sum _{w_i\in {\mathcal {W}}_R}w_i. \end{aligned}$$
(1)

If the weights had a compact distribution, we would estimate the mean value of W as \(\langle W\rangle =\rho R^2\langle w\rangle \), but our interest lies in distributions for which \(\langle w\rangle \) is infinite, or else larger than any weights which are encountered in a typical realisation. A more promising approach is to estimate the median value \({\overline{W}}\). We anticipate that \({\overline{W}}\) will increase very rapidly as a function of the scale length R. Accordingly, we use a logarithmic scale. We can characterise a given hotspot distribution by means of a function F:

$$\begin{aligned} \ln ({\overline{W}})=F(\ln R). \end{aligned}$$
(2)

In Sect. 2 we discuss a model for which \({\overline{W}}\) has a power law dependence upon R, so F(x) is a linear function. By analogy with the definition of fractal sets, we shall refer to the exponent as a ‘dimension’. In the usual fractal system, if the mass \(\mu \) inside a ball of radius \(\epsilon \) scales as \(\mu \sim \epsilon ^D\), then D is described as the fractal dimension of the set [1, 2]. Similarly, if \({\overline{W}}\sim R^D\), the exponent D can be thought of as a type of dimension of the set of hotspots. Even if F is not a linear function we can define an effective dimension \(D_\text {eff}\) at the length scale R as the derivative

$$\begin{aligned} D_\text {eff}=\frac{\textrm{d}F(\ln R)}{\textrm{d}(\ln R)}. \end{aligned}$$
(3)

We shall argue that this effective dimension may be higher than the dimension of the embedding space, unlike the fractal dimension. We describe this scale dependence as ultradimensional.

We shall discuss two different models for hotspot distributions. In Sect. 2 we introduce and analyse a model in which the PDF of w is a power-law with divergent mean, which is shown to have an exact scale-invariance. In Sect. 3 we discuss a physical example of a hotspot distribution, namely the probability density for a particle diffusing in a two-dimensional Gaussian random potential, V(x, y). The equilibrium probability density is proportional to \(\exp [-V(x,y)/{\mathcal {D}}]\) where \({\mathcal {D}}\) is the diffusion coefficient. In the limit as \({\mathcal {D}}\rightarrow 0\), this density is concentrated at ‘hotspots’ which are minima of the potential function. We are able to determine the function \(F(\cdot )\) for both models. Section 4 is a brief conclusion.

2 Power-Law Model

2.1 Definition of the Model

We consider the following simple model. We take a uniform, independent random scatter of points on the plane, \({\textbf{r}}_i\), with density \(\rho \). Each point is assigned a random weight \(w_i\), drawn independently from a distribution with probability density function (PDF) p(w). The weights \(w_i\) represent the integral of \(\phi ({\textbf{r}})\) in the neighbourhood surrounding one of the points upon which it is concentrated.

We introduce a power-law model, such that for large w, the PDF is \(p(w)\sim w^{-\gamma }\). In the calculations below we shall use the following specific distribution as an example:

$$\begin{aligned} p(w)= {\left\{ \begin{array}{ll} (\gamma -1)w^{-\gamma },&{}\quad w\ge 1\\ 0,&{}\quad w<1 \end{array}\right. } \end{aligned}$$
(4)

with \(1<\gamma <2\), so that the distribution is normalisable, but its mean is undefined. We shall also need to consider the cumulative distribution: if \(P(w_0)\) is the probability that \(w>w_0\), then Eq. (4) implies that \(P(w)= w^{-(\gamma -1)}\) for \(w>1\).

We argue that this is a foundational model for the distribution of hotspots. Because power-laws arise naturally in many physical processes, we expect that our model will find many physical realisations. In particular, if the process which generates the weights \(w_i\) is scale invariant, the PDF of w will be a power-law. Two examples of processes for which the weight has a power-law distribution are the Scheidegger model for the distribution of flow in rivers [3, 4], and a recently developed model for fluxes in directed percolation [5, 6].

Figure 1 is an illustration of 12 different realisations of this model for hotspot distributions, with \(\gamma =5/3\), plotted on four different lengthscales. We used the inverse transform sampling method to generate the weights, \(w_i\). In order to generate these plots, we transformed to a filtered and normalised set, \(\widetilde{{\mathcal {W}}}_R\), as follows. We scale the hotspot positions by dividing by R, and plot \({\textbf{r}}_i/R\) inside a unit square. We eliminate the values of \(w_i\) below a chosen threshold, for example, those \(w_i\) that are less than \(\epsilon W\), where \(\epsilon \) is a given small positive number. We can also ‘normalise’ these sets by dividing every remaining \(w_i\) by W. These normalised and filtered sets are a natural representation of many types of point-set data. An example is a geographical map showing settlements using symbols with sizes relative to that of the largest settlement in the mapped region, where settlements below a certain size are not shown in order to eliminate clutter. Another example is a photograph of the night sky with the exposure adjusted so that the image saturation is normalised, and stars below a certain intensity are not registered at all.
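The construction of the filtered and normalised set described above can be sketched in a few lines. This is a minimal illustration rather than the code used for Fig. 1: the function names, the default density `rho=1`, the threshold `eps=1e-3`, and the choice of a Poisson-distributed number of points are all assumptions made here for concreteness.

```python
import numpy as np

def sample_weights(n, gamma, rng):
    """Inverse transform sampling for p(w) = (gamma - 1) w^(-gamma), w >= 1.

    The survival function is P(w) = w^(-(gamma - 1)), so setting P(w) = u
    with u uniform on (0, 1] gives w = u^(-1 / (gamma - 1)).
    """
    u = 1.0 - rng.random(n)              # uniform on (0, 1], avoids u = 0
    return u ** (-1.0 / (gamma - 1.0))

def filtered_normalised_set(R, gamma, rho=1.0, eps=1e-3, rng=None):
    """Return scaled positions r_i / R and weights w_i / W with w_i >= eps * W."""
    rng = np.random.default_rng() if rng is None else rng
    n = rng.poisson(rho * R**2)          # assumed: Poisson number of points in the square
    pos = rng.random((n, 2))             # uniform positions, already divided by R
    w = sample_weights(n, gamma, rng)
    W = w.sum()                          # total weight, Eq. (1)
    keep = w >= eps * W                  # filter out weights below eps * W
    return pos[keep], w[keep] / W        # normalise the survivors by W
```

For example, `filtered_normalised_set(80, 5/3)` returns the handful of points that would be drawn in one panel at the smallest scale factor; the largest scale factor in Fig. 1 would require of order \(10^8\) points at unit density, so memory use should be considered before calling it with large R.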

It will be argued that the statistics of these images has a scale-invariance property, in that it is impossible to identify the scale factors of the panels. Non-trivial scale invariance is usually associated with fractal [1, 2] (or more generally, multifractal [7, 8]) properties, which can usually be characterised by saying that the set is, in some sense, self-similar under a change of scale. The images in Fig. 1 are so diverse that it would require a large number of realisations to demonstrate that they are drawn from the same ensemble. We shall argue below that there is a simple quantitative distinction between the scale invariance of Fig. 1 and that of fractal sets.

Fig. 1

Illustration of scale-independence of the ‘hotspots’ model. At the position of each hotspot there is a filled circle with area proportional to its weight. The total area of the circles in each image is normalised to be \(1\%\) of the total area of the image. The images use the probability distribution Eq. (4) with \(\gamma =5/3\), with three realisations for each of four scale factors: upper row, \(R=10000\) (panels a, b, c); second row, \(R=2000\) (panels d, e, f); third row, \(R=400\) (panels g, h, i); lower row, \(R=80\) (panels j, k, l). The different images cannot be associated with the different values of R by any statistical test, reflecting the scale-invariance property

If the derivative of the function F(x) defined by Eq. (2) approaches a constant as \(x\rightarrow \infty \), this is indicative of the sets \(\widetilde{{\mathcal {W}}}_R\) having scale-invariant properties, such that the statistics of \(\widetilde{{\mathcal {W}}}_R\) and \(\widetilde{{\mathcal {W}}}_{\lambda R}\) are indistinguishable, for a wide range of values of the positive number \(\lambda \). This idea can be expressed by saying that the realisations of \(\widetilde{{\mathcal {W}}}_R\) are drawn from an ensemble which is independent of R, depending only upon \(\gamma \). This self-similarity could be trivial, or it could indicate that the hotspot distribution has fractal properties, or it could have a different explanation. It will be argued that it is the last of these possibilities which is realised. For our simplified model it will be shown (in Sect. 2.2 below) that, for points distributed randomly in d dimensions with the weight distribution Eq. (4),

$$\begin{aligned} D_\text {eff}=\frac{d}{\gamma -1}. \end{aligned}$$
(5)

Note that, because \(2>\gamma >1\), this effective dimension is higher than the dimension of the embedding space. This indicates that the effective dimension \(D_\text {eff}\) is fundamentally different from a fractal dimension. We describe this scale invariance with \(D_\text {eff}>d\) as ultradimensional.

A different approach to describing the hotspot distribution, and in particular the filtered and normalised sets, is to consider the relative sizes of the largest values of \(w_i\) in the set \({\mathcal {W}}_R\). To this end, we can sort the weights, \(w_i\), into a decreasing sequence \(\{ \overrightarrow{w}_i\}\), and consider the proportion of the total mass which is contained in the first k elements of this set

$$\begin{aligned} f_k=\frac{\sum _{j=1}^k \overrightarrow{w}_j}{\sum _{j=1}^{{\widetilde{N}}} \overrightarrow{w}_j}, \end{aligned}$$
(6)

where \({\widetilde{N}}\) is the number of elements in the filtered set. We can consider the average of \(f_k\) over different regions of the data, and in some cases we can also average over multiple realisations of the distribution. For the model defined by Eq. (4), this leads to a family of functions of \(\gamma \):

$$\begin{aligned} {\widetilde{f}}_k(\gamma )=\langle f_k \rangle . \end{aligned}$$
(7)
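The quantities in Eqs. (6) and (7) are straightforward to evaluate numerically. The following is a minimal sketch, assuming the unfiltered set is used (so that \({\widetilde{N}}=N\)) and that weights are drawn from Eq. (4) by inverse transform sampling; the function names and arguments are illustrative.

```python
import numpy as np

def top_k_fractions(weights, k_max):
    """f_k of Eq. (6) for k = 1..k_max: share of the total in the k largest weights."""
    w_sorted = np.sort(np.asarray(weights, dtype=float))[::-1]   # decreasing order
    return np.cumsum(w_sorted[:k_max]) / w_sorted.sum()

def mean_top_k_fractions(gamma, n, m, k_max, rng):
    """Estimate of Eq. (7): average of f_k over m realisations of n weights each."""
    u = 1.0 - rng.random((m, n))                                 # uniform on (0, 1]
    w = np.sort(u ** (-1.0 / (gamma - 1.0)), axis=1)[:, ::-1]    # each row decreasing
    f = np.cumsum(w[:, :k_max], axis=1) / w.sum(axis=1, keepdims=True)
    return f.mean(axis=0)
```

Averaging the fractions themselves, rather than dividing averaged sums, keeps the estimate finite even though \(\langle w\rangle \) is undefined, because each \(f_k\) lies between 0 and 1.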

We shall make a hypothesis that, for a general model, the set of values of \(f_k\) at length scale R is representative of the model Eq. (4), with an effective value of \(\gamma \) given by rearrangement of Eq. (5):

$$\begin{aligned} \gamma _\text {eff}=1+\frac{d}{F'(\ln \,R)}. \end{aligned}$$
(8)
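In practice F is known only at a discrete set of length scales, so Eqs. (3) and (8) have to be evaluated by finite differences. A minimal sketch, assuming that estimates of \(\ln {\overline{W}}\) are available on a grid of \(\ln R\) values (the function names are illustrative):

```python
import numpy as np

def effective_dimension(ln_R, ln_W_median):
    """Finite-difference estimate of D_eff = dF/d(ln R), Eq. (3)."""
    return np.gradient(np.asarray(ln_W_median, dtype=float),
                       np.asarray(ln_R, dtype=float))

def effective_gamma(d_eff, d=2):
    """Effective exponent of Eq. (8): gamma_eff = 1 + d / F'(ln R)."""
    return 1.0 + d / np.asarray(d_eff, dtype=float)
```

For a pure power law \({\overline{W}}\sim R^3\) in two dimensions this returns \(D_\text {eff}=3\) and \(\gamma _\text {eff}=5/3\) at every scale, consistent with Eq. (5).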

2.2 Statistics of the Power-Law Model

Consider how the statistics of the total weight W depends upon R for the power-law model, with weight distribution Eq. (4). The mean value of w is undefined, so calculating the expectation value \(\langle W\rangle \) is not a good approach. Estimating the median of W, which will be denoted by \({\overline{W}}\), appears to be more promising.

For each realisation, let \({\hat{w}}\) be the largest of the \(N\sim \rho R^2\) samples of w in the square. Because we expect that the sum W is dominated by the largest values of the \(w_i\), we might hypothesise that \({\overline{W}}\) is proportional to \(\overline{{\hat{w}}}\), that is, the median of W is a fixed multiple of the median of the largest \(w_i\) value in each sample. Here it will be argued that the multiplier \({\overline{W}}/\overline{{\hat{w}}}\) is independent of both R and \(\epsilon \).

It is easy to calculate \(w^*\equiv \overline{{\hat{w}}}\). The probability that none of the N independent values of w exceeds \({\hat{w}}\) is \([1-P({\hat{w}})]^N\), so that \(w^*=\overline{{\hat{w}}}\) satisfies \([1-P(w^*)]^N=1/2\). For \(N\gg 1\) this gives

$$\begin{aligned} w^*=\left( \frac{N}{\ln 2}\right) ^{1/(\gamma -1)}. \end{aligned}$$
(9)
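The intermediate step leading to Eq. (9) can be written out as follows, using \(P(w)=w^{-(\gamma -1)}\) and expanding for \(N\gg 1\):

$$\begin{aligned} P(w^*)=1-2^{-1/N}\approx \frac{\ln 2}{N},\qquad (w^*)^{-(\gamma -1)}\approx \frac{\ln 2}{N}. \end{aligned}$$

The relative correction to \(\ln 2/N\) is of order 1/N, so it is negligible for the sample sizes considered here.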

Next we estimate the number of points \({\widetilde{N}}\) in the filtered set, and the value of \({\overline{W}}\). The number of values of \(w_i\) in the range from \(w^*\equiv \overline{{\hat{w}}}\) (upper limit) to \(\epsilon w^*\) (lower limit) is

$$\begin{aligned} \begin{aligned} {\widetilde{N}}&\sim N\int _{\epsilon w^*}^{w^*}\textrm{d}w\, p(w)\\&= N\left[ -w^{1-\gamma }\right] _{\epsilon w^*}^{w^*}\\&\sim N (w^*)^{-(\gamma -1)}\epsilon ^{-(\gamma -1)}\\&\sim \ln 2\, \epsilon ^{-(\gamma -1)} \end{aligned} \end{aligned}$$
(10)

so that the number of points \({\widetilde{N}}\) in the filtered set is independent of R, although it does depend upon \(\epsilon \) (we have assumed that N is sufficiently large that \(\epsilon w^*\gg 1\)).

The median of the sum W of a large number of values of \(w_i\) is estimated by noting that \(W={\hat{w}}+{\widetilde{W}}\), where \({\hat{w}}\) is the largest of the \(w_i\), and \({\widetilde{W}}\) is the sum excluding the largest of the \(w_i\). The value of \({\widetilde{W}}\) will be approximated by its mean value, which depends upon \({\hat{w}}\). Writing \({\hat{w}}=aw^*\), and taking the leading order as \(N\rightarrow \infty \), \(\epsilon \rightarrow 0\)

$$\begin{aligned} \begin{aligned} \langle {\widetilde{W}}\rangle&\sim (N-1)\int _{\epsilon w^*}^{{\hat{w}}}\textrm{d}w\, w p(w)\\&\sim (N-1) \left[ \frac{\gamma -1}{2-\gamma }w^{2-\gamma }\right] _{\epsilon w^*}^{{\hat{w}}} \\&\sim \frac{\gamma -1}{2-\gamma }(N-1){\hat{w}}^{2-\gamma } \\&\sim \frac{(\gamma -1)\ln 2}{2-\gamma }a^{2-\gamma }w^*. \end{aligned} \end{aligned}$$
(11)

This gives the following estimate for W, in terms of \(a={\hat{w}}/w^*\):

$$\begin{aligned} W\approx w^*\left[ a+a^{2-\gamma }\frac{(\gamma -1)\ln 2}{2-\gamma }\right] . \end{aligned}$$
(12)

The value of W depends upon a random quantity, a. Because W(a) in Eq. (12) increases monotonically with a, the median value of W(a) is \({\overline{W}}=W({\overline{a}})\). Because we define \(w^*=\overline{{\hat{w}}}=w^*{\overline{a}}\), we have \({\overline{a}}=1\). This gives the following estimate for \({\overline{W}}\):

$$\begin{aligned} {\overline{W}}\approx \left( 1+\frac{(\gamma -1)\ln 2}{2-\gamma }\right) \left( \frac{N}{\ln \, 2}\right) ^{1/(\gamma -1)}. \end{aligned}$$
(13)

This indicates that \({\overline{W}}\) exceeds the median of the largest term by a factor which is independent of both \(\epsilon \) and N (and which is therefore independent of R). The independence of \({\overline{W}}/w^*\) upon R indicates that the filtered images are scale-invariant. The fact that this ratio does not depend upon \(\epsilon \) reflects the fact that the images are dominated by the largest values of \(w_i\). Equation (13) implies that the number of \(w_i\), including the largest one, that make a significant contribution to \({\overline{W}}\) is \(1+\frac{\gamma -1}{2-\gamma }\ln 2\). When \(\gamma \rightarrow 1\), there is likely to be only one \(w_i\) that dominates the filtered image. This is in accord with the large jump principle, discussed in [9].

The prediction for \({\overline{W}}\), Eq. (13), was tested numerically. Figure 2 shows the ratio of the empirically determined values of \(w^*\) and \({\overline{W}}\) to the theoretical estimates, Eqs. (9) and (13), for \(N=1000\) with \(M=10^4\) realisations. This verifies Eq. (9), and shows that the N-dependence of \({\overline{W}}\) is the same as that of \(w^*\). The values of \({\overline{W}}\) used to create Fig. 2 span many decades: theoretical values of \({\overline{W}}\) (with \(N=1000\)) range from \(1.58\ldots \times 10^{63}\) for \(\gamma =1.05\) to \(2.99\ldots \times 10^4\) at \(\gamma =1.95\). Given this very wide range of values, Fig. 2 demonstrates that Eq. (13) is a useful approximation.
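A numerical test of this kind can be sketched as follows. This is not the code behind Fig. 2: the function names are illustrative, and the number of realisations in any given run is a free choice.

```python
import numpy as np

def theory_w_star(n, gamma):
    """Median of the largest of n weights, Eq. (9)."""
    return (n / np.log(2.0)) ** (1.0 / (gamma - 1.0))

def theory_W_median(n, gamma):
    """Estimate of the median of the sum of n weights, Eq. (13)."""
    prefactor = 1.0 + (gamma - 1.0) * np.log(2.0) / (2.0 - gamma)
    return prefactor * theory_w_star(n, gamma)

def simulated_ratios(n, gamma, m, rng):
    """Empirical medians over m realisations, divided by Eqs. (9) and (13)."""
    u = 1.0 - rng.random((m, n))                  # uniform on (0, 1]
    w = u ** (-1.0 / (gamma - 1.0))               # inverse transform sampling of Eq. (4)
    ratio_max = np.median(w.max(axis=1)) / theory_w_star(n, gamma)
    ratio_sum = np.median(w.sum(axis=1)) / theory_W_median(n, gamma)
    return ratio_max, ratio_sum
```

Calling `simulated_ratios(1000, 1.5, 4000, np.random.default_rng(1))` returns two ratios that should both be close to one if Eqs. (9) and (13) are good approximations at \(\gamma =1.5\). For values of \(\gamma \) close to 1 the weights become extremely large, so it may be preferable to work with \(\ln w\) to avoid floating-point overflow.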

Fig. 2

Plot of ratios of values of \(w^*\) (median of the largest element) and \({\overline{W}}\) (median of sum of N samples) obtained from simulation, divided by their theoretical estimates, Eqs. (9) and (13) respectively, as a function of \(\gamma \). The figure shows data for \(N=1000\), averaged over \(M=10^4\) iterations. The values of \({\overline{W}}\) used to generate this figure span more than 58 decades

Figure 3 shows the expectation value of the fraction \(f_k\) of the contribution to W from the largest k samples, as defined by Eq. (6) (again using \(N=1000\) elements in the sum, and \(M=10^4\) realisations) as a function of \(\gamma \). This verifies that, in a typical realisation, most of the contribution to W comes from a small number of the largest \(w_i\). The fractional contribution approaches unity, in accord with the large jump principle [9], as \(\gamma \rightarrow 1\).

Fig. 3

Mean values of fraction of the sum W contained in its largest k elements (defined in Eq. (6)), as a function of \(\gamma \). The number of elements of the sum was \(N=1000\), and there were \(M=10^4\) realisations

We remark that there is a further level of self-similarity in our power law model, which is concerned with varying the exponent \(\gamma \). Because Eq. (4) implies that \(y=(\gamma -1) \ln w\) satisfies \(\textrm{Prob}(y>y_0)=\exp (-y_0)\), so that y has the PDF \(\exp (-y)\) for \(y\ge 0\) whatever the value of \(\gamma \), the ensembles for different values of \(\gamma \) are equivalent, if we replace w by \((\gamma -1)\ln w\).

Our most general conclusion from this calculation follows from Eq. (13). When extended to d dimensions, we infer that

$$\begin{aligned} {\overline{W}}\sim R^{d/(\gamma -1)} \end{aligned}$$
(14)

so that the apparent dimension \(D_\text {eff}\) which characterises the scale-invariance is given by Eq. (5). Note that \(D_\text {eff}>d\). We say that this scale-invariance is ultradimensional. It is clearly distinguished from the self-similarity of fractal sets, where the dimension D satisfies \(D<d\).

3 Diffusion Model

3.1 Defining the Model

We now consider a physically motivated example of a distribution of hotspots: the equilibrium probability density for diffusion in a random potential, \(V({\textbf{x}})\). Examples of such a process include diffusion of excitons in a disordered semiconductor heterostructure, or diffusion of atoms on a surface during annealing after epitaxial deposition: both of these processes are described in [10]. Motion of a particle is determined by a stochastic differential equation:

$$\begin{aligned} \delta x_i=-\mu \frac{\partial V}{\partial x_i}\delta t+ \sqrt{2{\mathcal {D}}}\delta \eta _i(t), \end{aligned}$$
(15)

where \(\mu \) is the mobility and \(\delta \eta _i(t)\) are white noise signals, independent at each timestep, satisfying \(\langle \delta \eta _i\rangle =0\) and \(\langle \delta \eta _i \delta \eta _j\rangle =\delta _{ij}\delta t\). In the following we set \(\mu =1\) throughout. When \(V=\textrm{const}\), the motion is simple diffusion with the diffusion coefficient \({\mathcal {D}}\). The equilibrium probability density function for the stochastic process Eq. (15) is

$$\begin{aligned} P({\textbf{x}})=\frac{1}{{\mathcal {Z}}}\exp \left[ -V({\textbf{x}})/{\mathcal {D}}\right] , \end{aligned}$$
(16)

where \({\mathcal {Z}}\) is the partition function. We shall assume that motion is confined to a finite but large region (which we take to be a square with the side \({\mathcal {R}}\)).
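Equation (15) can be integrated directly with the Euler-Maruyama scheme. The sketch below is a minimal illustration in which the function name, the time step, and the use of a harmonic test potential \(V=x^2/2\) in the usage note are assumptions made here; for that potential Eq. (16) predicts a Gaussian equilibrium density with variance \({\mathcal {D}}\), which gives a simple check.

```python
import numpy as np

def euler_maruyama(grad_V, x0, D, dt, n_steps, rng, mu=1.0):
    """Integrate Eq. (15) for an ensemble of independent particles.

    grad_V : callable returning dV/dx_i evaluated at the array x
    x0     : initial positions (any array shape; one entry per coordinate)
    """
    x = np.array(x0, dtype=float)                 # copy, so x0 is not modified
    for _ in range(n_steps):
        # white-noise increment with <d_eta> = 0 and <d_eta^2> = dt
        d_eta = rng.standard_normal(x.shape) * np.sqrt(dt)
        x += -mu * grad_V(x) * dt + np.sqrt(2.0 * D) * d_eta
    return x
```

For instance, `euler_maruyama(lambda x: x, np.zeros(5000), D=0.25, dt=0.01, n_steps=2000, rng=np.random.default_rng(0))` evolves 5000 particles in the harmonic well for 20 relaxation times, after which the sample variance should be close to 0.25.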

When \({\mathcal {D}}\) is small, this density is very strongly concentrated in minima of the potential \(V({\textbf{x}})\), and each local minimum of V is associated with a weight w which is the integral of \(P({\textbf{x}})\) over a small region surrounding the minimum. Our aim will be to characterise the function F, defined by Eq. (2), for this model. The mean value of w does exist, but in the limit as \({{{\mathcal {D}}}}\rightarrow 0\) it is dominated by very large, but very rare values, which are extremely unlikely to be observed (see Eq. (19) below). Increasing R increases the number of w values that are sampled, approximately \(\rho R^2\), and this increases the probability of the sample including one of those very large, rarely encountered values. We show that \(\ln {\overline{W}}\) can increase very rapidly as a function of \(\ln R\), in accord with our definition of ultradimensional behaviour. This example will exhibit an approximate, rather than exact, scale-invariance.

Fig. 4

Equilibrium probability density for diffusion in a two-dimensional random potential, as specified by Eqs. (17) and (18) with a Gaussian correlation function, \(\langle V({\textbf{x}})V({\textbf{x}}')\rangle = \exp [-({\textbf{x}}-{\textbf{x}}')^2/2]\), so that \(c=1\) in Eq. (18). The presentation is the same as in Fig. 1: hotspots are represented by circles with areas proportional to their weights, with \(1\%\) of the image covered. Panel a: \({\mathcal {D}}=0.25\), \(R=50\). Panel b: \({\mathcal {D}}=0.25\), \(R=25\). Panel c: \({\mathcal {D}}=0.4\), \(R=50\). Panel d: \({\mathcal {D}}=0.4\), \(R=25\)

Consider the equilibrium measure when the potential \(V({\textbf{x}})\) is itself a smoothly varying random function, with a Gaussian PDF, and statistics which are homogeneous and isotropic. We shall assume that V(x, y) has the following statistical properties:

$$\begin{aligned} \langle V\rangle =0,\quad \langle V^2\rangle =1,\quad \langle V_x^2\rangle =\langle V_y^2\rangle =1, \end{aligned}$$
(17)

where \(V_x=\partial V/\partial x\), etc. These requirements can be satisfied by re-scaling the coordinates and the potential. Also define c by writing

$$\begin{aligned} c=\langle V_{xy}^2\rangle . \end{aligned}$$
(18)

This parameter satisfies \(c\ge 1/2\), with the lower limit realised if the spectral function S(k) of \(V({\textbf{x}})\) (the Fourier transform of its autocorrelation function) has a ring spectrum, \(S(k)\propto \delta (k-k_0)\). If the correlation function of V is a Gaussian, then \(c=1\).

In general, the value of \({\mathcal {Z}}\) depends upon the realisation of the potential \(V({\textbf{x}})\). The expectation value of \({\mathcal {Z}}\) is finite, but grows extremely rapidly as \({\mathcal {D}}\rightarrow 0\):

$$\begin{aligned} \begin{aligned} \langle {\mathcal {Z}}\rangle&={\mathcal {R}}^2\left\langle \exp \left[ -\frac{V}{{\mathcal {D}}}\right] \right\rangle \\&=\frac{{\mathcal {R}}^2}{\sqrt{2\pi }}\int _{-\infty }^\infty \textrm{d}V\,\exp \left[ -\frac{V^2}{2}-\frac{V}{{\mathcal {D}}}\right] =\exp \left[ \frac{1}{2{\mathcal {D}}^2}\right] {\mathcal {R}}^2. \end{aligned} \end{aligned}$$
(19)
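The Gaussian integral in Eq. (19) follows from completing the square in the exponent:

$$\begin{aligned} -\frac{V^2}{2}-\frac{V}{{\mathcal {D}}}=-\frac{1}{2}\left( V+\frac{1}{{\mathcal {D}}}\right) ^2+\frac{1}{2{\mathcal {D}}^2}, \end{aligned}$$

so that the integral over V equals \(\sqrt{2\pi }\exp [1/(2{\mathcal {D}}^2)]\). The integrand is centred at \(V=-1/{\mathcal {D}}\), which shows that the expectation value is dominated by regions where the potential lies about \(1/{\mathcal {D}}\) standard deviations below its mean.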

For most realisations of the potential, the value of \({\mathcal {Z}}\) is much smaller than \(\langle {\mathcal {Z}}\rangle \).

When \({\mathcal {D}}\) is sufficiently small that the measure Eq. (16) is concentrated at the minima of \(V({\textbf{x}})\), the weight of a hotspot is approximated by

$$\begin{aligned} w=\frac{1}{{\mathcal {Z}}}\int \textrm{d}x\int \textrm{d}y\ \exp \left[ -V(x,y)/{{\mathcal {D}}}\right] \sim \frac{2\pi {\mathcal {D}}}{{\mathcal {Z}}} \Delta ^{-1/2} \exp [-V^*/{{\mathcal {D}}}], \end{aligned}$$
(20)

where \(V^*\) is the height of the minimum, and \(\Delta =V_{xx}V_{yy}-V_{xy}^2\) is the determinant of the Hessian matrix at the minimum.
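The Laplace approximation in Eq. (20) can be checked against direct quadrature. The sketch below is a minimal illustration with hypothetical function names and \({\mathcal {Z}}=1\) by default; for a purely quadratic minimum the two expressions agree to quadrature accuracy, whereas for a general potential the anharmonic corrections become small only as \({\mathcal {D}}\rightarrow 0\).

```python
import numpy as np

def hotspot_weight_laplace(V_min, hessian, D, Z=1.0):
    """Right-hand side of Eq. (20): 2 pi D exp(-V*/D) / (Z sqrt(Delta))."""
    delta = np.linalg.det(np.asarray(hessian, dtype=float))   # Delta = Vxx Vyy - Vxy^2
    return 2.0 * np.pi * D / (Z * np.sqrt(delta)) * np.exp(-V_min / D)

def hotspot_weight_numeric(V, D, half_width, n=801, Z=1.0):
    """Riemann-sum estimate of the integral of exp(-V/D)/Z over a square patch."""
    x = np.linspace(-half_width, half_width, n)
    X, Y = np.meshgrid(x, x, indexing="ij")
    h = x[1] - x[0]                                           # grid spacing
    return np.exp(-V(X, Y) / D).sum() * h * h / Z
```

As a usage example, the potential \(V=-2+\tfrac{1}{2}(1.5x^2+0.6xy+0.8y^2)\) has Hessian `[[1.5, 0.3], [0.3, 0.8]]` and minimum height \(V^*=-2\), and the two functions can be compared at, say, \({\mathcal {D}}=0.1\) with `half_width=3`.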

Figure 4 illustrates the distribution of the weights of the hotspots of this diffusion model, using the same presentation as Fig. 1 (each hotspot is represented by a filled circle with area proportional to its weight, Eq. (20), and the total area of the circles is normalised to \(1\%\) of the image). The simulations of Gaussian random fields were generated by smoothing a numerical representation of white noise by discrete convolution with a smooth kernel, as described in [10]. Despite the fact that this randomly generated landscape is statistically homogeneous, the hotspot distributions are inhomogeneous on a lengthscale which is much greater than the correlation length of the potential. This is a consequence of the fact that, when \({{{\mathcal {D}}}}\) is sufficiently small, the measure is exquisitely sensitive to the depth of the deepest minima of the potential which are encountered in the sample region. We used two different diffusion coefficients \({\mathcal {D}}\) and lengthscales R. The distributions are qualitatively similar to those of the simplified model, shown in Fig. 1.
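One way to realise the smoothing of white noise is sketched below. The details are assumptions made for illustration and may differ from the procedure of [10]: the convolution is performed with FFTs on a periodic grid, the kernel is \(\exp (-r^2)\) (whose self-convolution is proportional to the correlation function \(\exp (-r^2/2)\)), and the field is rescaled to zero sample mean and unit sample variance.

```python
import numpy as np

def gaussian_random_field(n, h, rng):
    """Periodic n x n Gaussian random field with grid spacing h.

    White noise is convolved with the kernel exp(-r^2), so that the
    correlation function is approximately exp(-r^2 / 2) after rescaling.
    """
    idx = np.arange(n)
    d = np.minimum(idx, n - idx) * h                  # periodic distance from the origin
    kernel = np.exp(-(d[:, None] ** 2 + d[None, :] ** 2))
    noise = rng.standard_normal((n, n))               # discrete white noise
    field = np.fft.ifft2(np.fft.fft2(noise) * np.fft.fft2(kernel)).real
    field -= field.mean()                             # enforce <V> = 0 on the sample
    return field / field.std()                        # enforce <V^2> = 1 on the sample
```

The grid spacing h should be small compared with the correlation length (for example `h = 0.25`), and the box size `n * h` large compared with it, so that both the discretisation and the periodic boundary have little effect on the statistics in Eq. (17).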

When \({\mathcal {D}}\) is small, the weights of the hotspots have a very broad distribution. The expectation value \(\langle w\rangle \) is dominated by extremely rare events, which are unlikely to be realised, and it is more useful to estimate the median \({\overline{W}}\) of the total weight inside a region of area \(R^2\). The growth of \({\overline{W}}\) as a function of R is characterised by calculating the function F defined by Eq. (2): \(\ln {\overline{W}}=F(\ln R)\). It will be argued that, for this model, the large-jump principle [9] is applicable, so that \({\overline{W}}\) is well approximated by the median of its largest contributor, denoted by \(w^*\).

We shall consider the following scenario. The potential \(V({\textbf{x}})\) is evaluated, and the weights Eq. (20) calculated, in a region of size \({\mathcal {R}}\). While \({\mathcal {R}}\) is assumed to be large, we assume that \({\mathcal {D}}\) is sufficiently small that \({\mathcal {Z}}\ll \langle {\mathcal {Z}}\rangle \), so that the largest weight is \({\hat{w}}\approx 1\). This implies that when we estimate \({\overline{W}}(R)\), our estimate should satisfy \({\overline{W}}({\mathcal {R}})\approx 1\). We assume that the density of minima of \(V({\textbf{x}})\) is \(\rho \). According to Eq. (20), a large value of w is associated with a minimum of the potential V, which has an approximate depth \(V\approx -{\mathcal {D}}[\ln w+\ln {\mathcal {Z}}]\), and we find it convenient to use a variable

$$\begin{aligned} {\widetilde{V}}\equiv -{\mathcal {D}}\left[ \ln w+\ln {\mathcal {Z}} \right] \end{aligned}$$
(21)

instead of w, because the distribution of weights has a narrow support when expressed in terms of \({\widetilde{V}}\). The largest values of w are observed very rarely, so we shall characterise the density of hotspots with very large values of w as follows: the probability \(P({\widetilde{V}}_0)\) that \({\widetilde{V}}\) is less than \({\widetilde{V}}_0\) is written in the form

$$\begin{aligned} P({\widetilde{V}})=\exp \left[ -J({\widetilde{V}})\right] . \end{aligned}$$
(22)

In order to unambiguously normalise this distribution we regard any minimum of \(V({\textbf{x}})\) as being a hotspot. The function J(V) corresponds to a ‘rate function’ or ‘entropy function’ of large deviation theory [11]. We can then estimate the median of the smallest value of \({\widetilde{V}}\), denoted by \({\widetilde{V}}^*\), by writing \(1/2=[1-P({\widetilde{V}}^*)]^{\rho R^2}\), where \(\rho \) is the density of minima. This yields:

$$\begin{aligned} J({\widetilde{V}}^*)=2\ln R+\ln \rho - \ln \ln 2. \end{aligned}$$
(23)

If the inverse function of J is K (that is \(K(J(V))=V\)), then the required relation between \(w^*={\overline{W}}\) and \(\ln R\) is

$$\begin{aligned} \ln {\overline{W}}=-\ln {\mathcal {Z}}-\frac{1}{{\mathcal {D}}}K(2\ln R+\ln \rho -\ln \ln 2)\equiv F(\ln R). \end{aligned}$$
(24)

In order to use this expression to determine the function \(F(\ln R)\) which appears in Eq. (2), we must determine the large-deviation rate function J(V) which was introduced in Eq. (22).

Note that, according to Eqs. (20) and (21),

$$\begin{aligned} {\widetilde{V}}=V^*+{\mathcal {D}}\ln \left( \frac{\sqrt{\Delta }}{2\pi {\mathcal {D}}}\right) , \end{aligned}$$
(25)

so that, in the limit as \({\mathcal {D}}\rightarrow 0\), \({\widetilde{V}}\rightarrow V^*\), and it is sufficient for our purposes to determine the PDF of the heights of minima of the function V(x, y). We can, therefore, use the cumulative probability of the heights of local minima as the function P in Eq. (22).

Equations (8) and (24) imply that

$$\begin{aligned} \gamma _\text {eff}\sim 1-{\mathcal {D}}J'({\widetilde{V}}^*), \end{aligned}$$
(26)

so that \(\gamma _\text {eff}\sim 1\) when \({\mathcal {D}}\rightarrow 0\). This observation justifies the claim that \({\overline{W}}\sim w^*\).
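The step from Eqs. (8) and (24) to Eq. (26) can be sketched as follows, taking \(d=2\) and using the relation \(K'(J)=1/J'(K(J))\) for the derivative of an inverse function:

$$\begin{aligned} F'(\ln R)=-\frac{2}{{\mathcal {D}}}K'\big (J({\widetilde{V}}^*)\big )=-\frac{2}{{\mathcal {D}}J'({\widetilde{V}}^*)},\qquad \gamma _\text {eff}=1+\frac{2}{F'(\ln R)}=1-{\mathcal {D}}J'({\widetilde{V}}^*). \end{aligned}$$

Because P increases with its argument, \(J=-\ln P\) is a decreasing function, so \(J'({\widetilde{V}}^*)<0\) and \(\gamma _\text {eff}>1\).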

3.2 Distribution of Weights

We now turn to evaluating the distribution of heights of minima. The two-dimensional case is quite technical, so we shall start by discussing the estimate of \({\overline{W}}(R)\) in one dimension.

Here we require the density of local minima, \(\rho \), and the probability P(V) that the height of a local minimum is less than V. These are readily obtained using the approach developed by Rice [12], following pioneering work by Kac [13]. The density of minima is

$$\begin{aligned} \rho =\int _{-\infty }^\infty \textrm{d}V\int _0^\infty \textrm{d}V''\, V''P(V,0,V''), \end{aligned}$$
(27)

where \(P(V,V',V'')\) is the joint PDF of V and its first two derivatives, evaluated at the same point. We consider the case where V(x) is Gaussian, with correlation function

$$\begin{aligned} \langle V(x)V(x')\rangle =\exp [-(x-x')^2/2]. \end{aligned}$$
(28)

We find the following non-zero statistics of the potential and its derivatives at a given point: \(\langle V^2\rangle =\langle V'^2\rangle =1\), \(\langle V''^2\rangle =3\), \(\langle VV''\rangle =-1\). Using the standard formula for multivariate Gaussian distribution, we find

$$\begin{aligned} P(V,V',V'')=\frac{1}{4\pi ^{3/2}}\exp \left[ -\frac{1}{4}\left( 3V^2+2VV''+V''^2\right) \right] \exp (-V'^2/2), \end{aligned}$$
(29)

and hence the density of minima is

$$\begin{aligned} \rho =\frac{\sqrt{3}}{2\pi }. \end{aligned}$$
(30)

The PDF of the heights of minima is

$$\begin{aligned} \begin{aligned} p(V)&=\frac{1}{\rho }\int _0^\infty \textrm{d}V''\, V''P(V,0,V'')\\&=\frac{1}{\sqrt{3\pi }}\exp (-3V^2/4)-\frac{1}{2\sqrt{3}}V\exp (-V^2/2){{\,\textrm{erfc}\,}}\left( \frac{V}{2}\right) \end{aligned} \end{aligned}$$
(31)

and the cumulative probability for the minimum being at a level less than V is

$$\begin{aligned} P(V)=\int _{-\infty }^V \textrm{d}v\, p(v) =\frac{1}{2}\left[ 1+{{\,\textrm{erf}\,}}\left( \frac{\sqrt{3}V}{2}\right) \right] +\frac{1}{2\sqrt{3}}\exp (-V^2/2){{\,\textrm{erfc}\,}}\left( \frac{V}{2}\right) \end{aligned}$$
(32)

which also allows us to obtain \(J(V)=-\ln [P(V)]\) explicitly. The asymptote for J(V) as \(V\rightarrow -\infty \), and the corresponding asymptote for its inverse function K(J) are:

$$\begin{aligned} J(V)\sim \frac{V^2}{2}+\frac{\ln 3}{2},\quad K(J)\sim -\sqrt{2J-\ln 3}. \end{aligned}$$
(33)

The functions J(V) and K(J) for the one-dimensional model with Gaussian correlation function (\(c=1\)) are plotted in Fig. 5.
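The functions J and K for the one-dimensional model can be evaluated with standard error functions and a simple root finder. The sketch below is illustrative: the function names, the bisection bracket, and the default \(\ln {\mathcal {Z}}=0\) are assumptions made here, and the last function is the one-dimensional analogue of Eq. (24), in which the number of minima in the sample is \(\rho R\) rather than \(\rho R^2\).

```python
import math

RHO_1D = math.sqrt(3.0) / (2.0 * math.pi)          # density of minima, Eq. (30)

def P_min_1d(V):
    """Cumulative distribution of minimum heights, Eq. (32).

    Uses 1 + erf(x) = erfc(-x) so that the result stays accurate for V << 0.
    """
    return (0.5 * math.erfc(-math.sqrt(3.0) * V / 2.0)
            + math.exp(-V * V / 2.0) * math.erfc(V / 2.0) / (2.0 * math.sqrt(3.0)))

def J_1d(V):
    """Rate function J(V) = -ln P(V), Eq. (22)."""
    return -math.log(P_min_1d(V))

def K_1d(J, lo=-30.0, hi=10.0, tol=1e-12):
    """Inverse of J by bisection; valid for 0 < J < J_1d(lo), as J decreases with V."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if J_1d(mid) > J:
            lo = mid                                # root lies at larger V
        else:
            hi = mid
    return 0.5 * (lo + hi)

def ln_W_median_1d(ln_R, D, ln_Z=0.0):
    """One-dimensional analogue of Eq. (24), with rho * R minima in the sample."""
    J_star = ln_R + math.log(RHO_1D) - math.log(math.log(2.0))
    return -ln_Z - K_1d(J_star) / D
```

Evaluating `ln_W_median_1d` on a grid of \(\ln R\) values and differentiating numerically gives an estimate of the effective dimension of Eq. (3) for a chosen \({\mathcal {D}}\).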

Fig. 5

a The large deviation function for the distribution of minima, J(V) (defined by Eq. (22)), for a one-dimensional random potential with Gaussian correlation function, \(\langle V(x)V(x')\rangle =\exp [-(x-x')^2/2]\). The large deviation rate function J(V) derived from the cumulative distribution of minima P(V), as defined by Eq. (32), is shown as a solid line. Its asymptotic approximation, Eq. (33), is shown as a dashed line. b The inverse function, K(J): exact shown as solid line, asymptote Eq. (33) is dashed line

In the two-dimensional case the calculation of the distribution is more difficult, but the result is already known: for the case where the correlation function is a Gaussian, the PDF of the heights of minima is [10] (see also erratum, [14]):

$$\begin{aligned} \begin{aligned} p(V)&=\frac{\sqrt{3}}{2\pi }\bigg [ \sqrt{\pi }\exp \left( -\frac{3}{4}V^2\right) {{\,\textrm{erfc}\,}}\left( \frac{V}{2}\right) -V\exp \left( -V^2\right) \\&\quad +\sqrt{\frac{\pi }{2}}(V^2-1)\exp \left( -\frac{V^2}{2}\right) {{\,\textrm{erfc}\,}}\left( \frac{V}{\sqrt{2}}\right) \bigg ] \end{aligned} \end{aligned}$$
(34)

and the density of minima in two dimensions is [10]

$$\begin{aligned} \rho =\frac{1}{2\pi \sqrt{3}}. \end{aligned}$$
(35)

The corresponding cumulative distribution cannot be expressed in terms of familiar special functions, so we obtained P(V) by numerical integration. The asymptote can, however, be determined analytically:

$$\begin{aligned} \begin{aligned} J(V)&\sim \frac{V^2}{2}-\ln \left( \frac{V^2+1}{|V|}\right) +\frac{1}{2}\ln \left( \frac{2\pi }{3}\right) \\ K(J)&\sim \sqrt{2J+2\ln \left( {\frac{2J+1}{\sqrt{2J}}}\right) +\ln \left( \frac{3}{2\pi }\right) }. \end{aligned} \end{aligned}$$
(36)
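The numerical integration of Eq. (34) can be sketched as follows. This is a minimal illustration, not the procedure actually used for Fig. 6: the composite Simpson rule, the lower cut-off `lower`, the resolution `steps_per_unit` and all function names are assumptions introduced here.

```python
import math

def p_min_2d(V):
    """PDF of the heights of minima in two dimensions, Eq. (34)."""
    prefactor = math.sqrt(3.0) / (2.0 * math.pi)
    t1 = math.sqrt(math.pi) * math.exp(-0.75 * V * V) * math.erfc(0.5 * V)
    t2 = -V * math.exp(-V * V)
    t3 = (math.sqrt(0.5 * math.pi) * (V * V - 1.0)
          * math.exp(-0.5 * V * V) * math.erfc(V / math.sqrt(2.0)))
    return prefactor * (t1 + t2 + t3)

def P_min_2d(V, lower=-12.0, steps_per_unit=500):
    """Cumulative distribution P(V), by Simpson integration of Eq. (34).

    The lower limit -infinity is replaced by `lower` (an assumed cut-off),
    below which the integrand is negligible.
    """
    if V <= lower:
        return 0.0
    # Simpson's rule needs an even number of intervals.
    n = max(2, 2 * int(math.ceil(0.5 * (V - lower) * steps_per_unit)))
    h = (V - lower) / n
    total = p_min_2d(lower) + p_min_2d(V)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * p_min_2d(lower + i * h)
    return total * h / 3.0

def J_2d(V):
    """Large deviation function J(V) = -ln P(V)."""
    return -math.log(P_min_2d(V))

def J_asymptote_2d(V):
    """Asymptote of J(V) as V -> -infinity, first line of Eq. (36)."""
    return (0.5 * V * V - math.log((V * V + 1.0) / abs(V))
            + 0.5 * math.log(2.0 * math.pi / 3.0))
```

Evaluating `P_min_2d` at a large positive argument checks the normalisation of Eq. (34), and comparing `J_2d(V)` with `J_asymptote_2d(V)` at increasingly negative V indicates how quickly the asymptote is approached.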

Figure 6 shows the corresponding function J(V) and its inverse, compared with Eq. (36).

Fig. 6

a The large deviation function for the distribution of minima, J(V) (defined by Eq. (22)), for a two-dimensional random potential with Gaussian correlation function, \(\langle V({\textbf{x}})V({\textbf{x}}')\rangle =\exp [-({\textbf{x}}-{\textbf{x}}')^2/2]\). The function J(V) derived from the cumulative distribution of minima P(V), obtained by numerical integration of Eq. (34), is shown as a solid line. Its asymptotic approximation, Eq. (36), is shown as a dashed line. b The inverse function, K(J): the exact function is shown as a solid line and the asymptote, Eq. (36), as a dashed line

We investigated the statistics of hotspots for the model of diffusion in a Gaussian random potential (with Gaussian correlation function), by evaluating the function F(x) defined by Eq. (2), for different values of the diffusion coefficient \({\mathcal {D}}\). Because we were able to perform numerical simulations over a wider range of R-values in the one-dimensional case, we present results for both the one- and two-dimensional models.

The results for the one-dimensional case are summarised in Figs. 7 and 8. For each value of \({\mathcal {D}}\), we generated \(M=20\) realisations of the random potential V(x) on an interval of length \(L=100\times 2^{18}\), by smoothing white noise using a Gaussian kernel. The partition function \({\mathcal {Z}}\) was calculated for each realisation, and the local minima \(x_i\) of the potential were identified, together with the values of \(V(x_i)\) and \(V''(x_i)\). For each realisation, we divided the interval into sub-intervals, halving the length each time, for 18 generations. At generation \(k=1,\ldots ,18\), for each of the \(M\times 2^{k-1}\) sub-intervals of length \(R=L/2^{k-1}\), we summed the weights w to determine the total weight W of the sub-interval. We then determined the median, \({\overline{W}}\), of these \(M\times 2^{k-1}\) total weights. In Fig. 7 we plot the resulting 18 values of \(\ln {\overline{W}}\) as a function of \(\ln R\), for several different values of the diffusion coefficient \({\mathcal {D}}\).
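The dyadic subdivision and the pooled median described above can be sketched in Python as follows. The weights are taken as given inputs, so the sketch makes no assumption about how they are computed from \(V(x_i)\), \(V''(x_i)\) and \({\mathcal {Z}}\). The function name and the data layout (a list of position and weight lists, one pair per realisation) are hypothetical choices made for illustration.

```python
import statistics

def median_weight_by_generation(realisations, length, generations):
    """Median total weight of dyadic sub-intervals, pooled over realisations.

    realisations: list of (positions, weights) pairs, positions in [0, length).
    Returns a list of (R, median_W) for generations k = 1..generations,
    where R = length / 2**(k - 1).
    """
    results = []
    for k in range(1, generations + 1):
        n_bins = 2 ** (k - 1)
        R = length / n_bins
        totals = []
        for positions, weights in realisations:
            bins = [0.0] * n_bins
            for x, w in zip(positions, weights):
                # Guard against x == length landing one past the last bin.
                bins[min(int(x / R), n_bins - 1)] += w
            totals.extend(bins)
        # Median over all M * 2**(k - 1) sub-interval totals.
        results.append((R, statistics.median(totals)))
    return results
```

For example, a single realisation with four unit-spaced points of weights 1, 2, 3 and 4 on an interval of length 4 yields medians 10, 5 and 2.5 for the first three generations. Empty sub-intervals contribute a total weight of zero to the median.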

Fig. 7

Numerical investigation of the function F(x) defined in Eq. (2) for the one-dimensional model with Gaussian correlation function: \(\ln {\overline{W}}\) is evaluated as a function of \(\ln R\), for a range of different values of \({\mathcal {D}}\)

Because the values of \({\mathcal {R}}\) and \({\mathcal {D}}\) were chosen so that the largest weights \(w_i\) were of order unity, Eq. (24) simplifies to

$$\begin{aligned} \ln {\overline{W}}(R)= \frac{1}{{\mathcal {D}}}K\left( \ln {\mathcal {R}}+\ln \rho -\ln \ln 2\right) -\frac{1}{{\mathcal {D}}}K\left( \ln R+\ln \rho -\ln \ln 2\right) , \end{aligned}$$
(37)

where \(\rho \) is the density of minima (Eq. (30) in one dimension, Eq. (35) in two dimensions). Figure 8 verifies this expression by showing a collapse of the data in Fig. 7 onto the inverse function of the large-deviation entropy, K(x), in the one-dimensional case.
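The rescaling implied by Eq. (37) can be sketched as follows. For fixed \({\mathcal {R}}\), Eq. (37) states that \({\mathcal {D}}\ln {\overline{W}}\) depends on R only through the argument \(x=\ln R+\ln \rho -\ln \ln 2\), so data for different \({\mathcal {D}}\) should fall on a single curve when plotted against x. The function names below are hypothetical, and the sketch does not reproduce the plotting conventions of Fig. 8.

```python
import math

def scaling_argument(R, rho):
    """Argument of K in Eq. (37): ln R + ln rho - ln ln 2."""
    return math.log(R) + math.log(rho) - math.log(math.log(2.0))

def rescale_for_collapse(R_values, lnW_values, D, rho):
    """Return (x, D * ln W) pairs for one value of the diffusion coefficient D.

    According to Eq. (37), for fixed system size the second entry is a
    function of x alone, so the pairs for different D should coincide.
    """
    return [(scaling_argument(R, rho), D * lnW)
            for R, lnW in zip(R_values, lnW_values)]
```

Applying `rescale_for_collapse` to each data set of Fig. 7 and overlaying the results is one way to inspect the quality of the collapse.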

Fig. 8

The data in Fig. 7 collapse onto the function K(x) plotted in Fig. 5b, in accord with Eq. (37)

For the two-dimensional case, we generated \(M=4\) realisations of V(x, y) on a square of size \({{{\mathcal {R}}}}=256\), with toroidal boundary conditions, by convolving a discrete representation of white noise with a Gaussian kernel. Figure 9 displays plots of \(\ln {\overline{W}}\) as a function of \(\ln R\) for the two-dimensional Gaussian potential, with different values of the diffusion coefficient \({\mathcal {D}}\), for \(R=2,4,\ldots ,256\). Figure 10 illustrates the collapse of these data onto a plot of K(x), the inverse of the large-deviation rate function J(x). The range of values of R is much smaller than that shown in Figs. 7 and 8, because the two-dimensional simulations are more numerically demanding.
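A minimal sketch of generating such a potential with toroidal boundary conditions is given below. It is not the code used for Figs. 9 and 10: the grid spacing `dx`, the truncation `half_width` and all function names are assumptions introduced here. The kernel is taken proportional to \(\exp (-x^2)\) in each direction, because convolving white noise with this kernel gives a correlation function proportional to \(\exp (-\Delta x^2/2)\). Normalising the sum of squared kernel values to unity gives \(\langle V^2\rangle =1\).

```python
import math
import random

def gaussian_kernel(dx, half_width):
    """1D kernel proportional to exp(-x^2), normalised so that sum(k^2) = 1."""
    raw = [math.exp(-((j * dx) ** 2)) for j in range(-half_width, half_width + 1)]
    norm = math.sqrt(sum(v * v for v in raw))
    return [v / norm for v in raw]

def periodic_convolve_rows(field, kernel):
    """Convolve each row of a 2D list with `kernel`, wrapping indices."""
    n = len(field[0])
    h = len(kernel) // 2
    return [[sum(kernel[j + h] * row[(i - j) % n] for j in range(-h, h + 1))
             for i in range(n)]
            for row in field]

def transpose(field):
    """Swap the two indices of a 2D list."""
    return [list(column) for column in zip(*field)]

def smooth_periodic(noise, kernel):
    """Separable 2D convolution with toroidal boundary conditions."""
    rows_done = periodic_convolve_rows(noise, kernel)
    return transpose(periodic_convolve_rows(transpose(rows_done), kernel))

def random_potential(n, dx, half_width, rng):
    """n x n sample of a unit-variance Gaussian potential on a periodic grid."""
    noise = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n)]
    return smooth_periodic(noise, gaussian_kernel(dx, half_width))
```

A call such as `random_potential(64, 0.5, 8, random.Random(1))` returns one realisation on a 64 by 64 grid. The kernel length `2 * half_width + 1` should not exceed the grid size, and for large grids an FFT-based convolution would be far more efficient than this direct sum.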

Fig. 9

Numerical investigation of the function F(x) defined in Eq. (2) for the two-dimensional model with Gaussian correlation function: \(\ln {\overline{W}}\) is evaluated as a function of \(\ln R\), for a range of different values of \({\mathcal {D}}\)

Fig. 10

The data in Fig. 9 collapse onto the function K(x) plotted in Fig. 6b, in accord with Eq. (37). Note that the assumptions behind the asymptotic expression, Eq. (37), break down for \(\ln R<2.5\), but the data still collapse at smaller R

Our analysis, illustrated by the simulations plotted in Figs. 7 and 9, indicates that \(\ln {\overline{W}}\) can increase very rapidly as a function of \(\ln R\), in accord with our definition of ultradimensional behaviour.

4 Concluding Remarks

Images which show the distribution of ‘hotspots’, where a field has an unusually high intensity, appear to have a family resemblance, which may not be strongly dependent upon the size of the sample region.

The distribution of weights of hotspots was characterised by considering the median \({\overline{W}}\) of the total weight in a region of size R, and defining a function F(x) by writing \(\ln {\overline{W}}=F(\ln R)\) (Eq. (2)). The derivative of F is an effective dimension, \(D_\text {eff}=F'(\ln R)\).
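Given tabulated values of \(\ln {\overline{W}}\) against \(\ln R\), the effective dimension can be estimated by finite differences. The following Python sketch is one simple way to do this. The function name and the choice of evaluating the slope at interval midpoints are assumptions made for illustration.

```python
def effective_dimension(lnR, lnW):
    """Finite-difference estimate of D_eff = F'(ln R).

    lnR, lnW: equal-length sequences of ln R and ln(median W), with lnR
    strictly increasing. Returns (midpoint of ln R, slope) pairs.
    """
    estimates = []
    for i in range(len(lnR) - 1):
        step = lnR[i + 1] - lnR[i]
        slope = (lnW[i + 1] - lnW[i]) / step
        estimates.append((0.5 * (lnR[i] + lnR[i + 1]), slope))
    return estimates
```

For a strictly scale-invariant model the estimated slopes are constant, whereas for a nonlinear F(x) they vary with \(\ln R\).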

We investigated two models for hotspot distributions. Firstly, we considered a one-parameter family of weight distributions, defined by Eq. (4), which were contrived to be scale-invariant. The scale-invariance of these models is characterised by an effective dimension \(D_\text {eff}=d/(\gamma -1)\), where \(\gamma \in (1,2)\) is the parameter in the definition of the model. Because this dimension is greater than that of the embedding space, the scale invariance is distinct from the self-similarity which characterises fractal sets. Examples of realisations of this model are shown in Fig. 1. While it is a mathematical fact that the individual images are drawn from the same ensemble, the realisations do look very different from each other.

We also considered a physically motivated example, namely the equilibrium distribution for diffusion in a random potential. Here the realisations of the hotspot distribution, illustrated in Fig. 4, are qualitatively similar to those of the simplified model. We were able to determine the function F(x) for this model. Because it is not a linear function, this system does not exhibit strict scale invariance.

We have discussed the distribution of the hotspot intensities in terms of an effective dimension, \(D_{\textrm{eff}}\). The distribution of passive scalars in compressible flows, and of inertial particles in turbulent flows, is known to be highly inhomogeneous, and it has been characterised by multifractal dimensions (see, for example, [15,16,17,18] respectively). In discussions of multifractality of the distribution of passive scalars, the multifractal dimensions describe how the moments of the mass \(\mu \) inside a ball depend upon its radius \(\epsilon \): \(\langle \mu ^q\rangle \sim \epsilon ^{(q-1)D_q}\), where \(D_q\) is the Rényi dimension of index q. This approach represents a microscopic analysis, because the limiting process is \(\epsilon \rightarrow 0\). Our analysis considers the median mass \({\overline{W}}\) of a region of size R, in the limit as \(R\rightarrow \infty \), and it represents a macroscopic perspective on the structure of the set, concentrating on the large-scale structure of high-density regions. We remark that the large-scale distribution of low-density regions has also been shown to be described by power laws [19].