1 Introduction

Atmospheric boundary-layer flows are turbulent, i.e. random, so the concentrations of any pollutant gases released into such flows are also random. These concentrations need to be analyzed in terms of their probability distributions. For releases of toxic or malodorous gases, large concentrations are of particular importance for the assessment of hazards or nuisance. In this paper attention will be focussed on large concentrations, so it is the high-concentration tails of the probability distributions which are of interest.

The most appropriate framework for high concentrations is provided by statistical extreme-value theory, for which we outline the relevant aspects below. Let \(P(\theta ;\theta _T)\) be the distribution function of the concentration \(\varGamma \), conditional on \(\varGamma \) being above a threshold \(\theta _T\), i.e.

$$\begin{aligned} P(\theta ;\theta _T) = \text {Prob}(\varGamma \leqslant \theta | \varGamma > \theta _T). \end{aligned}$$
(1)

The corresponding probability density function (p.d.f.) is

$$\begin{aligned} p(\theta ;\theta _T) = \frac{d}{d\theta }P(\theta ;\theta _T). \end{aligned}$$
(2)

Subject to some regularity conditions, Pickands (1975) showed that for large threshold \(\theta _T\)

$$\begin{aligned} p(\theta ;\theta _T) \approx g(\theta -\theta _T;k,a) ~~~~~~~~~~~~\text {for } \theta > \theta _T, \end{aligned}$$
(3)

where \(g(s;k,a)\) is the p.d.f. of the generalized Pareto distribution, given by

$$\begin{aligned} g(s;k,a) = \frac{1}{a} \left( 1-\frac{ks}{a}\right) ^{\frac{1}{k}-1}. \end{aligned}$$
(4)

Sections 4.1 and 4.2 of Coles (2001) deal with this. Here k is the shape parameter and \(a ~(>0)\) is the scale parameter.
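As a minimal illustration of (4), the density and the corresponding distribution function can be coded directly. This is a sketch only and not part of the original analysis: the function names `gpd_pdf` and `gpd_cdf` and the parameter values are hypothetical, and the exponential form used when k = 0 is the standard limiting case. With the sign convention of (4), k corresponds to −c in the parametrization used by `scipy.stats.genpareto`.

```python
import math

def gpd_pdf(s, k, a):
    """Generalized Pareto p.d.f. of Eq. (4), with shape k and scale a > 0."""
    if a <= 0.0:
        raise ValueError("scale parameter a must be positive")
    if s < 0.0:
        return 0.0
    if k == 0.0:
        return math.exp(-s / a) / a  # limiting exponential case
    base = 1.0 - k * s / a
    if base <= 0.0:
        return 0.0  # at or beyond the upper endpoint a/k (only reached for k > 0)
    return base ** (1.0 / k - 1.0) / a

def gpd_cdf(s, k, a):
    """Distribution function obtained by integrating gpd_pdf from 0 to s."""
    if s <= 0.0:
        return 0.0
    if k == 0.0:
        return 1.0 - math.exp(-s / a)
    base = 1.0 - k * s / a
    if base <= 0.0:
        return 1.0
    return 1.0 - base ** (1.0 / k)

# Illustrative (hypothetical) parameter values: k = 0.2, a = 1 gives endpoint a/k = 5.
k, a = 0.2, 1.0
endpoint = a / k
```

For k > 0 the density vanishes beyond s = a/k, which is the finite upper endpoint relevant to bounded concentrations.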

This result has previously been applied to the statistical modelling of extreme values by fitting, usually by maximum likelihood, to exceedances of a high threshold. Examples of this approach are given by Davison and Smith (1990), Leadbetter (1991) and, for turbulent dispersion, Mole et al. (1995), Anderson et al. (1997), Schopflocher (2001) and Munro et al. (2001). For pollutant concentration the maximum possible value, \(\theta _{{MAX}}\), is finite (bounded above by the largest source concentration). Thus we expect that \(k > 0\), so that \(g(s;k,a)\) has a finite upper endpoint \(a/k\). We then have

$$\begin{aligned} \theta _{{MAX}} = \theta _T +\frac{a}{k}. \end{aligned}$$
(5)

Figure 1 of Mole et al. (2008) shows possible shapes for \(g(s;k,a)\). If we let \(\theta _T\) vary, while keeping it large so that (3) still applies, then for two large values \(\theta _{T1}\) and \(\theta _{T2}\) of \(\theta _T\), with \(\theta _{T2} > \theta _{T1}\), we have

$$\begin{aligned} p(\theta ;\theta _{T2}) = \frac{p(\theta ;\theta _{T1})}{\int _{\theta _{T2}}^{\theta _{{MAX}}} p(\theta ;\theta _{T1}) \, d\theta }. \end{aligned}$$
(6)

If the parameters corresponding to \(\theta _{T1}\) and \(\theta _{T2}\) are \((k_1,a_1)\) and \((k_2,a_2)\) respectively, then (see Appendix 1 for details) this can only be satisfied if \(k_2=k_1 ~(=k \text { say})\) and

$$\begin{aligned} a_2-a_1=-k(\theta _{T2}-\theta _{T1}). \end{aligned}$$
(7)

Thus, as \(\theta _T\) varies, the form of the asymptotic distribution \(g(s;k,a)\) and the value of the shape parameter k cannot change, but the scale parameter a must decrease as \(\theta _T\) increases. The result for a can also be obtained from (5), since \(\theta _{{MAX}}\) does not depend on the choice of \(\theta _T\).
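The consistency of (6) and (7) can be checked numerically. The sketch below uses hypothetical parameter values (k = 0.2, \(a_1 = 1\), thresholds 3.0 and 3.5) and hypothetical helper names. It verifies at one test point that renormalizing the density for the lower threshold reproduces the generalized Pareto density for the higher threshold when the scale is adjusted according to (7), and that both parameter sets imply the same \(\theta _{{MAX}}\) through (5).

```python
def gpd_pdf(s, k, a):
    """Eq. (4) for k > 0; zero at or beyond the upper endpoint a/k."""
    base = 1.0 - k * s / a
    return base ** (1.0 / k - 1.0) / a if base > 0.0 else 0.0

def gpd_sf(s, k, a):
    """Probability of exceeding s under Eq. (4), for k > 0."""
    base = 1.0 - k * s / a
    return base ** (1.0 / k) if base > 0.0 else 0.0

# Hypothetical values chosen only for illustration.
k, a1 = 0.2, 1.0
theta_T1, theta_T2 = 3.0, 3.5
a2 = a1 - k * (theta_T2 - theta_T1)  # Eq. (7)

theta = 4.2  # any point between theta_T2 and theta_MAX
lhs = gpd_pdf(theta - theta_T2, k, a2)                 # left side of Eq. (6)
rhs = (gpd_pdf(theta - theta_T1, k, a1)
       / gpd_sf(theta_T2 - theta_T1, k, a1))           # right side of Eq. (6)

theta_max_1 = theta_T1 + a1 / k  # Eq. (5) with the first parameter set
theta_max_2 = theta_T2 + a2 / k  # Eq. (5) with the second parameter set
```

Here the normalizing integral in (6) is evaluated in closed form as the exceedance probability `gpd_sf`.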

Ideally a physical model would give the concentration p.d.f., with a finite upper endpoint for the concentration. Thus this would include probabilities of the exceedances of high thresholds. In practice, however, this is difficult, and models are much more likely to give the first few concentration moments. Motivated by this, Mole et al. (2008) introduced a different method for estimating properties of the distribution for large concentrations. The central argument there was that high-order moments \(m_n\) are dominated by the largest concentration values, so that

$$\begin{aligned} m_n = E\{\varGamma ^n\} \approx \eta \int _0^{\theta _{{MAX}}} \theta ^n \, g(\theta ;k,a) \, d\theta \end{aligned}$$
(8)

for some constant (at a fixed point in space) \(\eta \). (Without loss of generality \(\theta _T\) was taken to be zero—otherwise a can simply be rescaled.) Rather more detail, including the spatial variation of \(\eta \), is given in Sects. 2.2 and 2.4 of Mole et al. (2008), who showed that this implies that, for sufficiently large n,

$$\begin{aligned} \frac{m_{n-1}}{m_n} \approx \frac{1}{a} \left( \frac{1}{n}\right) +\frac{k}{a} = \frac{1}{a} \left( \frac{1}{n}\right) +\frac{1}{\theta _{{MAX}}}. \end{aligned}$$
(9)

The linear relationship between the ratio of successive moments, \(m_{n-1}/m_n\), and \(1/n\) then allows the estimation of the parameters k, a and \(\theta _{{MAX}}\). Mole et al. (2008) used this method to estimate the parameters from a theoretical model for the concentration moments, and from the experimental data of Sawford and Tivendale (1992), for a steady line-source release in a wind tunnel.
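To see where the linear form in (9) comes from, it may help to treat (8) as an equality. This is a sketch under that assumption and not necessarily the route taken by Mole et al. (2008). With \(\theta _T = 0\), so that \(\theta _{{MAX}} = a/k\), substituting \(u = k\theta /a\) and writing B for the beta function gives

$$\begin{aligned} m_n = \frac{\eta }{a} \int _0^{a/k} \theta ^n \left( 1-\frac{k\theta }{a}\right) ^{\frac{1}{k}-1} d\theta = \frac{\eta }{k} \left( \frac{a}{k}\right) ^n B\left( n+1,\frac{1}{k}\right) . \end{aligned}$$

Since \(B(n,b)/B(n+1,b) = (n+b)/n\), the ratio of successive moments is then

$$\begin{aligned} \frac{m_{n-1}}{m_n} = \frac{k}{a} \, \frac{n+1/k}{n} = \frac{1}{a} \left( \frac{1}{n}\right) +\frac{k}{a}. \end{aligned}$$

If a fitted straight line has gradient \(\beta _1\) and intercept \(\beta _0\) (symbols introduced here only for illustration), the parameter estimates follow as \(a = 1/\beta _1\), \(\theta _{{MAX}} = 1/\beta _0\) and \(k = \beta _0/\beta _1\).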

The concentration model in Mole et al. (2008) gave all moments, in terms of some parameters which would have to be modelled. Other modelling approaches to calculating moments include meandering plume (e.g., Gifford 1959; Luhar et al. 2000; and Cassiani and Giostra 2002), large-eddy simulation (e.g., Xie et al. 2004) and p.d.f. micromixing (e.g., Pope 1985; Sawford 2004). Most models would only give the lowest few moments, whereas the moment-based method described here would be expected to require higher moments. This is a problem that requires further investigation.

In the present paper we extend the work of Mole et al. (2008) by applying this moment-based method to the Sawford and Tivendale (1992) wind-tunnel data in a more thorough manner, including the estimation of confidence intervals for the parameter estimates. We also compare these results with those obtained by the more conventional method of maximum-likelihood fitting to exceedances of a high threshold. We have two main aims. The first is to explore the variation in space of properties of the distribution of high concentrations, in particular \(\theta _{{MAX}}\) and k. The second is to establish how well the moment-based method performs, with applications to models in mind.

2 The Estimation Methods

2.1 The Moment-Based Method

Mole et al. (2008) fitted (9) by least squares, using the values \(n=4\) to \(n=8\). The straight-line fit was quite good, but there was a suggestion of curvature which would have altered the estimates if larger values of n had been used.

Fig. 1

The ratio of successive moments, \(m_{n-1}/m_n\), against \(1/n\). The squares are the values calculated from the data, and the solid line is the fit of (9) using the maximum gradient. The dashed line is \(1/\varGamma _{{MAX}}\), where \(\varGamma _{{MAX}}\) is the largest measured concentration. This case is for the wind-tunnel data of Sawford and Tivendale (1992), on the mean-plume centreline, at a distance of \(X=100\) mm downwind of the source (i.e. \(X/X_0 =0.323\), where \(X_0\) is the distance from the grid to the source)

For a finite-size dataset, for large n the moments \(m_n\) become dominated by the largest measured value \(\varGamma _{{MAX}}\). Thus, using very large n to fit (9) will result in \(\theta _{{MAX}}\) being underestimated (the estimate tends to \(\varGamma _{{MAX}}\)). Conversely, at small n (9) would not be expected to hold. So in fitting (9) to the data a balance must be struck between using small and large values of n. We have carried out some preliminary analysis, on simulated datasets, which suggests that a reasonable choice is to fit the straight line through the point of maximum gradient when \(m_{n-1}/m_n\) is plotted against \(1/n\). For typical cases this will give the largest estimate for \(\theta _{{MAX}}\) that is possible from (9), and for hazard-assessment purposes will therefore tend to be on the conservative side. Figure 1 shows a typical plot of \(m_{n-1}/m_n\) against \(1/n\), with the straight line fitted using the maximum gradient.
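One possible way to carry out such a fit is sketched below. The numerical details are assumptions made for illustration: the local gradient is taken as the finite difference between adjacent values of n, the line is drawn through the pair of points giving the largest such gradient, the data are assumed positive with \(\theta _T = 0\), and the function names are hypothetical. The gradient and intercept of the line are then converted to a, \(\theta _{{MAX}}\) and k using (9).

```python
def moment_ratios(data, n_max):
    """Return (ns, ratios) with ratios[i] = m_{n-1}/m_n for n = ns[i] = 1..n_max,
    where m_n is the raw sample moment of order n. Data must be positive."""
    scale = max(data)  # rescale by the largest value to avoid overflow at high n
    u = [x / scale for x in data]
    ns, ratios = [], []
    prev = 1.0  # sample mean of u**0
    for n in range(1, n_max + 1):
        cur = sum(v ** n for v in u) / len(u)
        ns.append(n)
        ratios.append(prev / cur / scale)  # undo the rescaling in the ratio
        prev = cur
    return ns, ratios

def max_gradient_fit(ns, ratios):
    """Fit Eq. (9) by the line through the adjacent pair of points with the
    largest local gradient in the (1/n, m_{n-1}/m_n) plane."""
    best = None
    for i in range(len(ns) - 1):
        x0, x1 = 1.0 / ns[i], 1.0 / ns[i + 1]
        grad = (ratios[i] - ratios[i + 1]) / (x0 - x1)
        if best is None or grad > best[0]:
            best = (grad, ratios[i] - grad * x0)  # (gradient, intercept)
    grad, intercept = best
    a = 1.0 / grad               # gradient of Eq. (9) is 1/a
    theta_max = 1.0 / intercept  # intercept of Eq. (9) is 1/theta_MAX
    return {"k": a / theta_max, "a": a, "theta_max": theta_max}

# Check on ratios that satisfy Eq. (9) exactly (hypothetical k = 0.2, a = 1).
ns_exact = list(range(4, 13))
ratios_exact = [0.2 / 1.0 + 1.0 / (1.0 * n) for n in ns_exact]
fit_exact = max_gradient_fit(ns_exact, ratios_exact)
```

For measured data one would call `moment_ratios` first and pass its output to `max_gradient_fit`. With noisy ratios a smoothed gradient estimate may be preferable to the adjacent-point difference used here.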

Confidence intervals for \(\theta _{{MAX}}\) and k are calculated by bootstrapping. We use the studentized bootstrap method with a log transformation of the data (details of this method can be found in e.g. Davison and Hinkley 1997).
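The general idea of bootstrapping can be sketched as follows. Note that this is a plain percentile bootstrap, a simpler technique than the studentized bootstrap with log transformation used here, and it resamples the data as if they were independent. The function name, its arguments and the example data are hypothetical.

```python
import random

def percentile_bootstrap_ci(data, estimator, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for estimator(data).

    Simplified stand-in: no studentization, no log transformation, and
    resampling that ignores any serial dependence in the data.
    """
    rng = random.Random(seed)
    n = len(data)
    estimates = sorted(
        estimator([data[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    alpha = 1.0 - level
    lo = estimates[int(n_boot * alpha / 2.0)]
    hi = estimates[min(n_boot - 1, int(n_boot * (1.0 - alpha / 2.0)))]
    return lo, hi

def sample_mean(values):
    return sum(values) / len(values)

# Hypothetical example: interval for the mean of the integers 1..100.
example_data = [float(i) for i in range(1, 101)]
ci_lo, ci_hi = percentile_bootstrap_ci(example_data, sample_mean)
```

In the present context `estimator` would be a function returning the moment-based estimate of \(\theta _{{MAX}}\) or k from a resampled dataset.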

2.2 The Maximum-Likelihood Method Using Exceedances

With this method we fit the generalized Pareto distribution (4) to excesses above a high threshold, using maximum likelihood. The thresholds are chosen using mean-excess plots (i.e. mean-residual-life plots)—see e.g. Coles (2001, p. 79). Confidence intervals are estimated using profile likelihood (Davison and Smith 1990). To take account of possible dependence between concentration exceedances when estimating the confidence intervals, declustering (see, e.g., Coles 2001, p. 99) is carried out. This involves choosing a cluster separation time \(\tau \). If the time between successive exceedances is greater than \(\tau \) then they are deemed to belong to separate clusters, and only the maximum concentration value from each cluster is used in the maximum-likelihood fitting. We choose \(\tau \) using the method of Ferro and Segers (2003), which is based on estimating the extremal index. For details of the extremal index see Leadbetter et al. (1983, p. 67); it can be interpreted as the reciprocal of the mean cluster size (Leadbetter 1983).
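The ingredients of this procedure can be sketched as below. The function names are hypothetical, the cluster separation time \(\tau \) is taken as given rather than chosen by the method of Ferro and Segers (2003), and no optimizer or profile-likelihood calculation is included. For the generalized Pareto distribution (4) the mean excess over a threshold u is \((a-ku)/(1+k)\), which is linear in u, and this is why mean-excess plots are useful for threshold selection.

```python
import math

def mean_excess(values, threshold):
    """Sample mean of (value - threshold) over values exceeding the threshold."""
    excesses = [v - threshold for v in values if v > threshold]
    if not excesses:
        raise ValueError("no exceedances of the threshold")
    return sum(excesses) / len(excesses)

def decluster_maxima(times, values, threshold, tau):
    """Return the maximum of each cluster of exceedances. Successive
    exceedances more than tau apart in time start a new cluster."""
    maxima = []
    last_time = None
    for t, v in zip(times, values):
        if v <= threshold:
            continue
        if last_time is None or t - last_time > tau:
            maxima.append(v)                   # first exceedance of a new cluster
        else:
            maxima[-1] = max(maxima[-1], v)    # same cluster: keep the largest value
        last_time = t
    return maxima

def gpd_neg_log_likelihood(excesses, k, a):
    """Negative log-likelihood of Eq. (4) for excesses s = value - threshold (k != 0)."""
    if a <= 0.0 or k == 0.0:
        raise ValueError("requires a > 0 and k != 0")
    total = len(excesses) * math.log(a)
    for s in excesses:
        base = 1.0 - k * s / a
        if base <= 0.0:
            return math.inf  # an excess lies beyond the endpoint a/k
        total -= (1.0 / k - 1.0) * math.log(base)
    return total

# Hypothetical toy series, for illustration only.
times = list(range(10))
values = [0.0, 5.0, 6.0, 0.0, 0.0, 0.0, 7.0, 0.0, 8.0, 0.0]
cluster_maxima = decluster_maxima(times, values, threshold=4.0, tau=2)
```

The maximum-likelihood estimates are the values of k and a that minimize `gpd_neg_log_likelihood` evaluated on the cluster maxima minus the threshold.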

In a few cases the profile-likelihood method does not converge, and instead we estimate confidence intervals by assuming that the likelihood estimator is approximately multivariate normal (see, e.g., Sect. 4.2.1 of Davison 2003, or Sect. 2.6.4 of Coles 2001). This happens far downwind or in the plume fringes, where there are fewer data on large concentration values. In these cases the estimated confidence intervals will automatically be symmetric, whereas the real ones will probably not be. In particular, we would expect that the upper end of the confidence interval should be larger than that indicated by the multivariate-normal approximation.

3 Results

The two estimation methods are applied to the experimental data of Sawford and Tivendale (1992), obtained from a steady line source in wind-tunnel grid turbulence, with the turbulence approximately uniform in the direction normal to the line source and the mean-flow direction. Further details of the experiments are given in Appendix 2 here, and also in Sawford and Sullivan (1995) and Mole et al. (2008). Concentration (which in this case is temperature) measurements were made at many downwind and crosswind positions.

The wind-tunnel boundaries were sufficiently distant that the temperature was effectively a conserved scalar. Close to the source the mean temperatures were up to 50\(\hbox { K}\) above the background temperature, similar to the experiments of Stapountzis et al. (1986), but much higher than in Warhaft (1984). Therefore, near the source the scalar was not passive, because of the buoyancy. Sawford (2004) showed that the intensity of fluctuations was reduced by comparison with the Warhaft data, but that in the far field these source effects became unimportant. This is less clear-cut here since we are concerned with the largest concentrations. Turbulent advection means that, except in the far field, the largest concentrations decay less quickly downwind than the mean and standard deviation. This means that non-passive effects on the largest concentrations may well persist to larger distances from the source. However, further from the source large temperatures will be found in relatively small volumes, so we expect that non-passive effects will be small. The subsequent figures for cross-plume variations do not show obvious asymmetry resulting from buoyancy (note that the cross-plume direction is in the vertical). This is also the case in Fig. 1 of Sawford and Sullivan (1995) for the first four moments of concentration. So in subsequent attempts to explain the results physically we assume that a good first approximation is provided by taking the scalar to be passive.

Figure 2 shows the estimates of \(\theta _{{MAX}}\) on the mean-plume centreline, as a function of non-dimensional downwind distance from the source, \(X/X_0\), where \(X_0\) is the distance from the grid to the source. (Note that turbulent length scales will be of the order of the hole spacing in the grid, i.e. of order 0.08 when normalized by \(X_0\).) The value of \(\theta _{{MAX}}\) is normalized by the centreline mean concentration \(\mu _0\). Up to about \(X/X_0=0.3\) the two methods agree very well. Beyond \(X/X_0=0.3\) the two methods differ more, but the confidence intervals are larger, and the differences are probably not very significant. The quantity \(\theta _{{MAX}}/\mu _0\) increases steadily from a value just over 1 at \(X/X_0 \approx 0.01\), to a maximum of about 5–6 at \(X/X_0 \approx 0.8\), and then decreases to about 3.5 at \(X/X_0 \approx 8\).

The downwind variation of \(\theta _{{MAX}}/\mu _0\) agrees qualitatively with the results of Mole et al. (2008) (in their Fig. 6), obtained using the moment ratios from \(n=4\) to \(n=8\), but the values here are roughly 20% larger. For the moment-based method the larger values here are to be expected, since we are fitting to the maximum gradient, but the agreement with the maximum-likelihood estimates suggests that the maximum-gradient method is superior to the method used by Mole et al. (2008).

Mole et al. (2008) argued that the variation of \(\theta _{{MAX}}/\mu _0\) with downwind distance X shown in Fig. 2 is what would be expected on physical grounds. For a steady source, close to the source we expect \(\theta _{{MAX}}\) to be close to the largest source concentration \(\theta _2\), and \(\mu _0\) to be close to the mean source concentration \(\theta _1\). Thus, as \(X \rightarrow 0\), we expect \(\theta _{{MAX}}/\mu _0 \rightarrow \theta _2/\theta _1 \geqslant 1\). For most sources \(\theta _2/\theta _1\) is likely to be close to 1.

For a passive conserved scalar the physical processes involved are advection by the turbulent velocity, and molecular diffusion (which is the only process which can change the concentration in a fluid element). The variation of \(\theta _{{MAX}}/\mu _0\) as one goes away from the source is determined by the balance between the effects of these processes. Except very close to the source and very far downwind the mean concentration \(\mu \) is hardly affected by molecular diffusion, and so is controlled by turbulent advection. The maximum concentration \(\theta _{{MAX}}\) can only be altered through the action of molecular diffusion.

Fig. 2

All values here are on the mean-plume centreline. Left panel: the moment-based estimate of the maximum possible concentration \(\theta _{{MAX}}\), as a function of non-dimensional downwind distance from the source. \(\theta _{{MAX}}\) is normalized by the centreline mean concentration \(\mu _0\). The squares and solid line are the estimates of \(\theta _{{MAX}} /\mu _0\), and the dashed lines give the 95% confidence intervals, estimated using bootstrapping. Right panel: as left panel, but these are the maximum-likelihood estimates from exceedances of high thresholds. The confidence intervals (again 95%) are estimated using profile likelihood where possible

Most atmospheric releases have large Péclet number \(Pe=ul/\kappa \) near the source, where u is a velocity scale for the turbulent fluctuations, l is a length scale for the turbulent concentration fluctuations, and \(\kappa \) is the molecular diffusivity. The ratio of the size of the advection and diffusion terms in the dispersion equation is Pe, so near the source advection acts much more quickly than diffusion. Further downwind, as the plume becomes stretched out into thin sheets and strands by turbulent advection, the concentration length scale l becomes smaller than near the source, and diffusion acts more quickly, as described by Batchelor (1959). The conduction cut-off length is defined as \((\nu \kappa ^2 /\epsilon )^{1/4}\), where \(\nu \) is the kinematic viscosity and \(\epsilon \) is the turbulent-energy dissipation rate per unit mass. The Schmidt number is defined as \(\nu /\kappa \). For pollutants with Schmidt number of order 1 or greater (which is usually the case, and is true for the experimental data used here) once the concentration length scale reaches the conduction cut-off length then a balance is reached between advection and diffusion, where their time scales are the same and the concentration length scale does not decrease any more.

Thus, provided the source size is large compared with the conduction cut-off length (which would usually be the case for atmospheric releases), near the source advection acts much faster than diffusion. This means that \(\mu _0\) is reduced much more quickly than \(\theta _{{MAX}}\), so \(\theta _{{MAX}}/\mu _0\) increases away from the source. Further downwind the concentration length scale decreases, and since the largest concentrations are found in the thin sheets and strands, diffusion acts to reduce \(\theta _{{MAX}}\) more quickly. Conversely, as the plume width increases downwind, advection reduces \(\mu _0\) more slowly. Eventually the rate of reduction of \(\mu _0\) becomes less than that of \(\theta _{{MAX}}\), so \(\theta _{{MAX}}/\mu _0\) reaches a peak and then decreases with downwind distance.

In the grid turbulence found in the experiments discussed here, the turbulent energy (and hence the velocity scale u) and the dissipation rate \(\epsilon \) decay with downwind distance from the source (see, e.g., Sect. 5.4.6 of Pope 2000). This implies that the Péclet number Pe will decrease faster downwind than in a non-decaying flow, and the conduction cut-off length will increase downwind. Using the values in Table I of Sawford (2004), in these experiments Pe at the source is of order 3 (in the atmosphere the source is likely to be larger, so Pe would be several orders of magnitude larger). It also gives a conduction cut-off length comparable with the source size (for the atmospheric applications the conduction cut-off length would probably be similar, but the source size considerably greater).

However, at the source the turbulence length scale in these experiments is of order 70 times the source size. So up to moderate distances downwind from the source the instantaneous plume is swept from side to side by the turbulence on length scales greater than the width of the instantaneous plume, thus increasing the mean plume width L and decreasing \(\mu _0\) downwind. Very close to the source (up to \(X/X_0 \approx 0.01\) in these experiments) L is increased more by molecular diffusion than by turbulent advection, but at larger downwind distances its increase is dominated by turbulent advection. Reasonably near the source the instantaneous plume is broadened mainly by molecular diffusion, and not by turbulence (which mostly exists at length scales greater than that of the instantaneous plume width). Thus \(\theta _{{MAX}}\) (which is found in the instantaneous plume) decreases more slowly than \(\mu _0\) with increasing distance from the source. Further downwind L increases, and turbulent advection reduces \(\mu _0\) more slowly. Far enough downwind we expect \(\theta _{{MAX}}/\mu _0\) to reach a peak and then decrease. This behaviour of \(\theta _{{MAX}}/\mu _0\) is what Fig. 2 shows. For most atmospheric releases we expect the source size to be rather greater than the conduction cut-off length, and as described above we expect similar behaviour of \(\theta _{{MAX}}/\mu _0\).

If there is an upper bound on the turbulent length scales for the velocity, as is the case for wind-tunnel grid turbulence, then far downwind the mean-plume width is much greater than the largest turbulent length scale for velocity and, as argued by Mole et al. (2008), we expect all concentrations to tend to the local mean. Thus \(\theta _{{MAX}} \rightarrow \mu \) as \(X \rightarrow \infty \), so on the centreline we expect \(\theta _{{MAX}}/\mu _0 \rightarrow 1\). In practice, wind tunnels are not long and wide enough to approach this limit, and in field experiments one would expect non-stationarity and inhomogeneity to make this limit difficult to observe.

Figure 3 shows the moment-based and maximum-likelihood estimates of the shape parameter k. The moment-based estimates show a rapid decrease in values within \(X/X_0 \approx 0.02\) of the source. Beyond this there is no obvious pattern, with values generally between 0.1 and 0.2. The maximum-likelihood estimates have no obvious pattern throughout, with values between 0.1 and 0.3. The moment-based estimates give narrower confidence intervals, and often the estimates of k with the two methods differ by more than the estimated confidence intervals. It appears that there is a systematic difference between the two types of estimator.

The suggestion is that k does not vary much with downwind distance, except perhaps close to the source, with values of order 0.2 or a little less. This is different from the results of Mole et al. (2008), where k was roughly equal to the reciprocal of \(\theta _{{MAX}}/\mu _0\), decreasing to about 0.2 at \(X/X_0 \approx 0.8\), before increasing again with downwind distance. At most distances, the suggestion here is that the shape of the generalized Pareto distribution (4) is as in Fig. 1a of Mole et al. (2008), with zero gradient at the upper endpoint. The one case giving an estimate of \(k>0.5\), corresponding to infinite gradient at the upper endpoint, is for the moment-based estimate at the position closest to the source.

Fig. 3

All values here are on the mean-plume centreline. Left panel: the moment-based estimate of k, as a function of non-dimensional downwind distance from the source. The squares and solid line are the estimates of k, and the dashed lines give the 95% confidence intervals, estimated using bootstrapping. Right panel: as left panel, but these are the maximum-likelihood estimates. The confidence intervals (again 95%) are estimated using profile likelihood where possible

Fig. 4

Left panel: the moment-based estimate of \(\theta _{{MAX}} /\mu _0\), at non-dimensional downwind distance \(X/X_0=0.0161\), as a function of crosswind distance Z from the mean-plume centreline, normalized by the mean-plume width L. The squares and solid line are the estimates of \(\theta _{{MAX}} /\mu _0\), and the dashed lines give the 95% confidence intervals, estimated using bootstrapping. The asterisks show the mean concentration \(\mu \), normalized by the centreline value \(\mu _0\). Right panel: as left panel, but these are the maximum-likelihood estimates. The confidence intervals (again 95%) are estimated using profile likelihood where possible

Theoretically, close to the source we expect most of the weight of the p.d.f. of concentration to be close to the source concentration values, and away from zero. For a uniform source with concentration \(\theta _0\), this p.d.f. would be \(\delta (\theta -\theta _0)\) at the source, and for a non-uniform source we expect the p.d.f. to be broadened slightly from this. For (4) to give a peak away from \(\theta =0\) requires \(k>1\), so very close to the source we would expect to have large values of k. Far downwind we expect the concentration p.d.f. to tend to \(\delta (\theta -\mu )\), so the same argument for k holds. From these arguments we would anticipate that k would be large close to the source, decrease to a minimum, and then increase again very far downwind. Figure 3 does not show any evidence for the latter, which may just be because we do not have measurements sufficiently far downwind. The moment-based estimates in Fig. 3 do show evidence for larger values of k near the source, but since this is not supported by the maximum-likelihood estimates, it is not clear how reliable this is. The moment-based estimates in Mole et al. (2008) do show larger values of k at small and large downwind distances, but the suggestion from Fig. 2 is that the present results are more reliable.

Figures 4, 5 and 6 show the variation of \(\theta _{{MAX}}/\mu _0\), as a function of crosswind distance Z (with \(Z=0\) on the centreline), at three downwind distances; Z is normalized by the mean-plume width L, defined as the standard deviation of the crosswind profile of mean concentration (which is close to Gaussian: the normalized mean concentration \(\mu /\mu _0\) is also shown in the figures). Again, both moment-based and maximum-likelihood estimates are shown. The downwind distances are \(X/X_0=0.0161\), close to the source, \(X/X_0=0.323\), which is close to the maximum in \(\theta _{{MAX}}/\mu _0\), and \(X/X_0=5.16\), which is far downwind. In some cases in the plume fringes convergence was not obtained, so points are not shown. In some other cases for the maximum-likelihood method, the profile-likelihood method for the confidence intervals did not converge, and confidence intervals estimated from normal approximations to the likelihood estimator are used instead. This is especially the case for \(X/X_0=5.16\), for which the number of exceedances of the threshold \(\theta _T\) tends to be smaller. For both estimators the width of the confidence intervals is quite variable across the plume, especially further from the source. Measurements at different positions were made in different runs, so variability between releases may perhaps explain this.
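The mean-plume width defined above can be computed from a measured crosswind profile of mean concentration as in the following sketch. It assumes the profile is sampled densely enough for trapezoidal integration, and the function name and the synthetic Gaussian profile are hypothetical.

```python
import math

def plume_width(z, mu):
    """Return (L, centre): the standard deviation and centroid of the crosswind
    profile mu(z), using trapezoidal integration over the sampled positions z."""
    def trapz(f):
        return sum(0.5 * (f[i] + f[i + 1]) * (z[i + 1] - z[i])
                   for i in range(len(z) - 1))
    total = trapz(mu)
    centre = trapz([zi * mi for zi, mi in zip(z, mu)]) / total
    variance = trapz([(zi - centre) ** 2 * mi for zi, mi in zip(z, mu)]) / total
    return math.sqrt(variance), centre

# Synthetic Gaussian profile with standard deviation 2 (illustrative only).
z_grid = [-20.0 + 0.1 * i for i in range(401)]
mu_profile = [math.exp(-zi ** 2 / (2.0 * 2.0 ** 2)) for zi in z_grid]
L_est, centre_est = plume_width(z_grid, mu_profile)
```

With Z measured from the mean-plume centreline the centroid should be close to zero, and Z/L is then the normalized crosswind coordinate used in the figures.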

For \(|Z/L|\) less than about 2–3, the moment-based and maximum-likelihood methods give similar results for \(\theta _{{MAX}}/\mu _0\). At \(X/X_0=0.0161\) the maximum-likelihood values tend to be slightly smaller, and at \(X/X_0=5.16\) the uncertainty is greater than the difference in the estimated values.

Fig. 5

As Fig. 4, but for non-dimensional downwind distance \(X/X_0=0.323\)

Fig. 6

As Fig. 4, but for non-dimensional downwind distance \(X/X_0=5.16\)

At all downwind distances, for \(|Z/L| \lesssim 3\), \(\theta _{{MAX}}/\mu _0\) appears to be fairly constant, allowing for the larger variability and larger confidence intervals at the far-downwind distance. Far from the centreline larger concentrations occur only rarely, so to obtain accurate estimates of their probability distribution will require longer time series than for positions close to the centreline. This is reflected in the lack of convergence in some cases, and in the generally larger confidence intervals far from the centreline, especially for the maximum-likelihood method.

The crosswind variation of \(\mu /\mu _0\), and the value of L, is determined by turbulent advection. Very far from the centreline the positions are much further from the source than are those positions near the centreline, so diffusion has had longer to act. Thus we expect \(\theta _{{MAX}} \rightarrow 0\) as \(|Z/L| \rightarrow \infty \). At small values of X, where advection reduces \(\mu _0\) more quickly than diffusion reduces \(\theta _{{MAX}}\), we expect that the decrease of \(\theta _{{MAX}}\) away from the mean-plume centreline will be on a larger length scale than L, so we expect \(\theta _{{MAX}}/\mu _0\) to be fairly constant for \(|Z/L|< 2\) or 3, as seen in Fig. 4.

At values of X close to the downwind peak in \(\theta _{{MAX}}/\mu _0\), we still expect this to be true because it depends on the cumulative effect while the fluid elements travel from the source to these downwind distances. So this argument is in agreement with the observations in Fig. 5.

Fig. 7

Left panel: the moment-based estimate of k, at non-dimensional downwind distance \(X/X_0=0.0161\), as a function of \(Z/L\). The squares and solid line are the estimates of k, and the dashed lines give the 95% confidence intervals, estimated using bootstrapping. Right panel: as left panel, but these are the maximum-likelihood estimates. The confidence intervals (again 95%) are estimated using profile likelihood where possible

Very far downwind we expect that \(\theta _{{MAX}} \rightarrow \mu \), so \(\theta _{{MAX}}/\mu _0 \rightarrow \mu /\mu _0\), and \(\theta _{{MAX}}/\mu _0\) will decrease on the same length scale L as \(\mu /\mu _0\). Figure 6 suggests that the wind-tunnel measurements do not extend far enough downwind to identify this regime.

Figures 7, 8 and 9 show the crosswind variation of the moment-based and maximum-likelihood estimates of k. There is no clear pattern to the results, but the maximum-likelihood values tend to be a little larger than the moment-based ones. Far from the centreline we believe the results are less reliable, for the reasons discussed above for \(\theta _{MAX}/\mu _0\). The only cases where the estimates of k are greater than 0.5 are for the moment-based method, in the plume fringes at \(X/X_0=0.323\). Here we would expect \(k<0.5\), so we believe these results are a reflection of the lack of reliability. The evidence seems to be that k only varies slowly across the plume. This is what we would expect from the physical arguments given above, except very far downwind, since k is determined by the distribution of large concentrations, which are mainly affected by molecular diffusion.

In about 80% of the cases shown, the moment-based estimate of \(\theta _{MAX}\) is larger than that obtained using maximum likelihood. There also appears to be a tendency for the moment-based confidence intervals to be narrower than those found with maximum likelihood. The suggestion is that there is some systematic difference between the two types of estimator. For applications to toxic or malodorous gases this difference means that the moment-based estimates of \(\theta _{MAX}\) are likely to be conservative.

4 Discussion

We have used two methods for estimating the maximum possible concentration \(\theta _{MAX}\) and the shape parameter k of the distribution of large concentrations. One method is based on the expected behaviour of high-order concentration moments, and the other fits the generalized Pareto distribution (4) directly to concentrations above a high threshold, using maximum likelihood. The two methods agree reasonably well for all the cases shown, and are in very close agreement for \(\theta _{MAX}\) on the mean-plume centreline up to \(X/X_0 \approx 0.3\). The moment-based method used here appears to give better results than that used in Mole et al. (2008).

Fig. 8

As Fig. 7, but for non-dimensional downwind distance \(X/X_0=0.323\)

Fig. 9

As Fig. 7, but for non-dimensional downwind distance \(X/X_0=5.16\)

The results show that on the centreline \(\theta _{MAX}/\mu _0\) increases from a value slightly larger than 1 very near the source, to a value of 5–6 at \(X/X_0 \approx 0.8\), before decreasing again. In the crosswind direction both \(\theta _{MAX}/\mu _0\) and k vary much more slowly than \(\mu /\mu _0\), as we would expect if they are affected much more by molecular diffusion than by turbulent advection. Very far downwind we would expect diffusion to dominate if there is an upper bound on the velocity length scales. In this case \(\theta _{MAX}/\mu _0\) and k would vary on the same scale as \(\mu /\mu _0\). The results suggest that the measurements do not extend far enough downwind to reach this regime.

These results provide encouragement to proceed with attempts to develop quantitative models, in particular those based on concentration moments, for \(\theta _{MAX}/\mu _0\) and for other properties of the distribution of large concentrations. Such models would then enable hazard assessment to be carried out for practical applications involving releases of toxic and malodorous gases in the atmosphere. Some discussion of possible modelling approaches was given in Mole et al. (2008). In general, the quality of results given by this method will be limited by the ability of models or experiments to give accurate results for concentration moments. The more moments that are required, the worse this problem is likely to be.

Grid turbulence provides an approximation to homogeneous isotropic turbulence, for which it is relatively easy to interpret results and develop models. However, in practical applications releases occur in inhomogeneous boundary-layer turbulence which, unlike grid turbulence, does not decay with downwind distance. An alternative would be to use wind-tunnel boundary-layer releases such as those analyzed by Xie et al. (2007).

To avoid the limitation on downwind distances imposed by the dimensions of a wind tunnel, and to consider real boundary-layer conditions, it would be desirable to repeat the analysis for field experiments. There are, however, some difficulties with attempting this. In addition to the usual problems with field experiments of non-stationary conditions and terrain effects, there is a specific difficulty relating to measuring large concentrations. Large concentrations are found at the smallest spatial scales present in the concentration field, so sensors with very good spatial resolution are needed to measure them. Such sensors can only make measurements at a single point, so obtaining wide spatial coverage, especially at positions where long time series are needed, is expensive. On the other hand, a measurement system such as lidar, which gives wide spatial coverage, cannot resolve sufficiently small scales to give reliable measurements of the largest concentrations. Perhaps this problem may be overcome by resolving at the scales of an average human breath.