1 Introduction

Arctic sea ice has experienced a drastic decline over recent decades (Stroeve et al. 2012), perceived as an emblematic sign of climate change and thought to have impacted mid-latitude climate (Gao et al. 2015). State-of-the-art climate models fail to accurately capture the rate of Arctic climate change (Massonnet et al. 2012) as well as its impact on the North American and European climates (Jung et al. 2015). A significant source of uncertainty is related to the surface energy budget over sea ice (Notz 2012), in which turbulent heat fluxes are key components. Heat exchanges between sea ice and the atmosphere play a crucial role in the rate of Arctic sea ice melting (Rothrock et al. 1999; Screen and Simmonds 2010; Zhang 2007; Goosse and Zunz 2014), as well as in the teleconnections between polar and non-polar regions (Bader et al. 2011; Vihma 2014; Overland et al. 2015).

In sea-ice and climate models, turbulent fluxes at the surface-atmosphere interface (namely the surface momentum, sensible heat and latent heat fluxes) are usually represented using bulk flux parameterizations based on the Monin-Obukhov Similarity Theory (MOST, Monin and Obukhov 1954). For an area with horizontally homogeneous conditions of the surface and the near-surface atmosphere, the main idea of the bulk approach is to estimate fluxes from the near-surface gradient of the model-resolved (or averaged) variables (wind speed, temperature and humidity) and from transfer coefficients for momentum (\(C_D\), the drag coefficient), sensible heat (\(C_H\)) and latent heat (\(C_E\)). The accuracy of these parameterizations depends essentially on the accuracy of the transfer coefficients. The transfer coefficients represent the effects of both the surface roughness characteristics and the atmospheric surface layer stability. The neutral versions of the coefficients (\(C_{DN}\), \(C_{HN}\) and \(C_{EN}\)) have had the stability effect removed and therefore depend only on the surface characteristics. \(C_{HN}\) and \(C_{EN}\) depend on \(C_{DN}\) (e.g. Andreas 1996), which is why this study focuses, as a first step, on the accuracy (and uncertainties) of \(C_{DN}\) estimates.

Transfer coefficients estimated from observational campaigns exhibit a large spread in which the contributions from observational uncertainties and from unknown processes are still challenging to disentangle. Even the most up-to-date parameterizations for \(C_D\) above sea ice (e.g. Lüpkes et al. 2012; Lüpkes and Gryanik 2015) still suffer from large uncertainties linked to their dependence on tunable coefficients. The tuning of these coefficients relies on estimates of drag coefficients from field data above sea ice (e.g. Elvidge et al. 2016, 2021; Srivastava et al. 2022). However, these estimates are not direct measurements and they may suffer not only from instrumental errors, but also from procedural errors in the conversion from measurements to drag coefficients, which involves, for instance, a stability correction using theoretical-empirical stability functions (e.g. Grachev et al. 2000, 2007). The present study assesses to what extent both the instrumental errors and the stability function errors propagate in the computation of the drag coefficients. The uncertainty of the stability function itself is evaluated through bootstrapping methods developed for assessing the uncertainty of experimental integrals (Cordero et al. 2008).

Only a few studies have assessed the uncertainty of \(C_{DN}\) estimates. For instance, based on the SHEBA dataset, Andreas et al. (2010) mention that evaluations of \(z_0\) (resp. \(z_T\) or \(z_Q\)) are uncertain by a factor ranging from 1/3 to 3 (resp. from 1/200 to 200), which results in a \(C_{DN}\) uncertainty of \(\pm 11\%\). Analysing the Ice Station Weddell data and following the arguments of Foken and Wichura (1996) and Larsen et al. (2001), Andreas et al. (2005) estimate the individual eddy-covariance measurements of the drag coefficient to be uncertain by \(\pm 20 \%\) and conclude that this uncertainty dominates the \(z_0\) uncertainty. These studies only provide an averaged uncertainty for an entire experiment. In order to minimise the impact of such uncertainties, as the errors are expected to be random, current practice is to assemble large collections of measurements. The present study provides a complementary approach as it estimates the total uncertainty for each \(C_{DN}\) estimate, along with an evaluation of the different sources which contribute to the total uncertainty. While objective and standardized methods for field data screening are still lacking across the community, this study proposes a methodology which relies on those individual \(C_{DN}\) uncertainty estimates in order to provide an additional objective data screening criterion. Estimates of those contributions allow us to make recommendations about their minimisation when estimating field transfer coefficients and about optimal criteria to select data for calibration of existing parameterizations. The SHEBA dataset is used to illustrate the benefit of such uncertainty estimation.

This article is structured as follows. Section 2 summarizes the main equations describing turbulence in the atmospheric surface layer. Section 3 details the data, methods and tools used to carry out this analysis. Results are presented in Sect. 4. An example of handling uncertainties through the definition of an objective data screening criterion is proposed in Sect. 5. Section 6 provides a discussion and Sect. 7 our conclusions and recommendations.

2 Definitions and Governing Equations

Turbulent fluxes of momentum \(||\overrightarrow{\tau }||\) (the modulus of the momentum flux vector), sensible heat H and latent heat LE can be obtained by direct measurement as:

$$\begin{aligned} ||\overrightarrow{\tau }||&= \rho _a u_*^2, \end{aligned}$$
(1a)
$$\begin{aligned} H&= - \rho _a C_{p_a} u_*\theta _*, \end{aligned}$$
(1b)
$$\begin{aligned} LE&= - \rho _a L_v u_* q_*, \end{aligned}$$
(1c)

with \(u_*=(-\overline{u^{\prime }w^{\prime }})^{0.5}\), \(\theta _*=-\overline{w^{\prime }\theta ^{\prime }}/u_*\) and \(q_*=-\overline{w^{\prime }q^{\prime }}/u_*\). u and w are the along-stream and vertical velocities, \(\theta \) is the potential temperature, q is the specific humidity, \(\rho _a\) is the near-surface air density, and \(C_{p_a}\) and \(L_v\) are the specific heat of moist air and the latent heat of vaporisation or sublimation, respectively. The Reynolds decomposition of a given variable x is used: \(x={\overline{x}}+x^{\prime }\) (where \({\overline{x}}\) is the Reynolds-averaged value and \(x^{\prime }\) the perturbation around the average). High-frequency measurements are required to obtain \(x^{\prime }\) values.
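As an illustration of Eqs. (1a) and (1b), the following minimal sketch computes \(u_*\), \(\theta _*\) and the corresponding fluxes from short high-frequency series. The five samples, the air density (1.3 kg m\(^{-3}\)) and the specific heat (1005 J kg\(^{-1}\) K\(^{-1}\)) are made-up illustrative values, and the function names are hypothetical; no detrending, rotation or quality control is applied.

```python
from math import sqrt


def mean(xs):
    return sum(xs) / len(xs)


def covariance(xs, ys):
    """Reynolds covariance: the average of x'y', with x' = x - mean(x)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


def turbulent_scales(u, w, theta):
    """Return (u_star, theta_star) from high-frequency series of u, w, theta."""
    uw = covariance(u, w)
    if uw >= 0.0:
        raise ValueError("u'w' must be negative for u_* to be defined")
    u_star = sqrt(-uw)
    theta_star = -covariance(w, theta) / u_star
    return u_star, theta_star


def direct_fluxes(u_star, theta_star, rho_a=1.3, cp_a=1005.0):
    """Momentum flux (Eq. 1a) and sensible heat flux (Eq. 1b).

    rho_a and cp_a are assumed illustrative constants.
    """
    tau = rho_a * u_star ** 2
    heat = -rho_a * cp_a * u_star * theta_star
    return tau, heat


# Five made-up samples, far fewer than a real averaging period would contain.
u = [5.0, 5.2, 4.8, 5.1, 4.9]
w = [0.0, -0.1, 0.1, -0.05, 0.05]
theta = [270.0, 270.1, 269.9, 270.05, 269.95]
u_star, theta_star = turbulent_scales(u, w, theta)
tau, heat = direct_fluxes(u_star, theta_star)
```

With these samples the heat flux is negative, i.e. directed towards the surface, as expected for a stably stratified case.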

When direct measurements are not available, turbulent fluxes can be calculated via the bulk formulas from time-averaged wind speed U, potential temperature \(\theta \) and specific humidity q (strictly speaking \({\overline{U}}\), \({\overline{\theta }}\) and \({\overline{q}}\) but the symbol is not written in the rest of the paper for the sake of clarity). In the same fashion, in atmospheric models, surface flux parameterizations are based on the same formulas and use the resolved (or equivalently: averaged) parameters. The bulk formulas for the momentum, sensible heat and latent heat fluxes are:

$$\begin{aligned} ||\overrightarrow{\tau }||&= \rho _a C_D U^2, \end{aligned}$$
(2a)
$$\begin{aligned} H&= - \rho _a C_{p_a} C_H U \left( \theta _a - \theta _s \right) , \end{aligned}$$
(2b)
$$\begin{aligned} LE&= - \rho _a L_v C_E U \left( q_a - q_s \right) , \end{aligned}$$
(2c)

where subscript a and s stand for the atmospheric and surface values, respectively. \(C_D\), \(C_H\) and \(C_E\) are the transfer coefficients for momentum (drag coefficient), sensible and latent heat, respectively, and hold a key role in turbulent flux parameterizations.

The drag coefficient is obtained by combining Eqs. (1a) and (2a):

$$\begin{aligned} C_D = \frac{u_*^2}{U^2}. \end{aligned}$$
(3)

According to the Monin-Obukhov Similarity Theory (MOST, Monin and Obukhov 1954), the wind speed profile within the surface layer is given by:

$$\begin{aligned} U = \frac{u_*}{\kappa } \left[ \ln \left( \frac{z}{z_0} \right) - \psi _m(\zeta ) \right] , \end{aligned}$$
(4)

where \(\kappa \) (also written k in the text) is the von Kármán constant, \(z_0\) is the aerodynamic roughness length and \(\psi _m\) is a stability function (see below) of the stability parameter \(\zeta =z/L\), with L the Obukhov length. L is defined as:

$$\begin{aligned} L = \frac{u_*^2}{\kappa \beta \theta _{v*}} = - \frac{u_*^3}{\kappa \beta Q_{0v}}, \end{aligned}$$
(5)

where \(Q_{0v}=\overline{w'\theta _v'}\) is the surface virtual temperature flux, \(\theta _{v*}\) is a virtual temperature scaling parameter defined by \(Q_{0v}=-u_*\theta _{v*}\) and \(\beta =g / \widetilde{\theta _v}\) the buoyancy coefficient (with g the acceleration due to gravity and \(\widetilde{\theta _v}\) the layer-average virtual potential temperature).
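The two forms of Eq. (5) can be checked against each other numerically. The sketch below is only illustrative: the input values are invented, the function names are hypothetical, and the von Kármán constant is set to the value 0.387 adopted later in the text.

```python
def obukhov_length(u_star, theta_v_star, theta_v_layer, kappa=0.387, g=9.81):
    """Obukhov length L (m) from the first form of Eq. (5)."""
    beta = g / theta_v_layer  # buoyancy coefficient
    return u_star ** 2 / (kappa * beta * theta_v_star)


def obukhov_length_from_flux(u_star, q0v, theta_v_layer, kappa=0.387, g=9.81):
    """Obukhov length L (m) from the second form of Eq. (5)."""
    beta = g / theta_v_layer
    return -u_star ** 3 / (kappa * beta * q0v)


# Invented stable case: positive theta_v*, hence downward virtual heat flux.
u_star, theta_v_star, theta_v_layer = 0.2, 0.02, 260.0
q0v = -u_star * theta_v_star  # Q_0v = -u_* theta_v*
L = obukhov_length(u_star, theta_v_star, theta_v_layer)
zeta = 10.0 / L  # stability parameter at z = 10 m
```

A positive \(\theta _{v*}\) gives a positive L and therefore a positive \(\zeta \), which corresponds to stable stratification.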

Since \(\psi _m(\zeta )=0\) under neutral conditions (\(\zeta =0\)), combining Eqs. (3) and (4) provides the neutral drag coefficient:

$$\begin{aligned} C_{DN} = \left( \frac{\kappa }{\ln \left( \frac{z}{z_0} \right) } \right) ^2, \end{aligned}$$
(6)

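and subsequently, since Eq. (6) gives \(\ln (z/z_0)=\kappa /\sqrt{C_{DN}}\), substituting Eq. (4) into Eq. (3) yields the drag coefficient \(C_{D}\):

$$\begin{aligned} C_D = C_{DN} \left[ 1 - \frac{\sqrt{C_{DN}}\, \psi _m(\zeta )}{\kappa } \right] ^{-2}. \end{aligned}$$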
(7)

The stability function \(\psi _m(\zeta )\) results from the integration of the flux-gradient relationship \(\varphi _m(\zeta )\) as:

$$\begin{aligned} \psi _m(\zeta ) = \int _0^{\zeta } \frac{1-\varphi _m(\zeta )}{\zeta } d \zeta . \end{aligned}$$
(8)

Several empirical-analytical forms of the flux-gradient relationship \(\varphi _m\) have been proposed in the literature (e.g. Paulson 1970; Dyer 1974; Businger 1988; Beljaars and Holtslag 1988-2005; Fairall et al. 1996; Grachev et al. 2000, 2007) by regressing \(\varphi _m\) observations:

$$\begin{aligned} \varphi _m(\zeta ) = \frac{\kappa z}{u_*} \frac{\partial u}{\partial z}, \end{aligned}$$
(9)

onto \(\zeta \) observations.
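To make the link between \(\varphi _m\) and \(\psi _m\) in Eq. (8) concrete, the following sketch integrates a flux-gradient relationship numerically. The log-linear form \(\varphi _m = 1 + a\zeta \) with \(a=5\) is a textbook stable-side form assumed here purely for illustration; it is not the function fitted to the SHEBA data, and the function names are hypothetical.

```python
def psi_from_phi(phi, zeta, n=1000):
    """Midpoint-rule approximation of Eq. (8) between 0 and zeta."""
    if zeta == 0.0:
        return 0.0
    h = zeta / n
    total = 0.0
    for j in range(n):
        x = (j + 0.5) * h  # midpoints never hit zeta = 0 exactly
        total += (1.0 - phi(x)) / x * h
    return total


def phi_log_linear(zeta, a=5.0):
    """Illustrative stable-side form phi_m = 1 + a*zeta (assumed, not fitted)."""
    return 1.0 + a * zeta


# For this form the integral is analytical: psi_m = -a*zeta.
psi = psi_from_phi(phi_log_linear, 0.5)
```

For the assumed form, the numerical result can be compared with the analytical integral \(\psi _m=-a\zeta \).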

The drag coefficient \(C_{D}\) contains the effect on the turbulence of both the stratification (through the stability function \(\psi _m\)) and the surface morphology (through the roughness length \(z_0\)). The stratification impact is expected to be universally dependent on \(\zeta \), following MOST. The roughness length \(z_0\) is related to the typical scale of the surface roughness elements. As a result, \(z_0\) is the only surface property which characterizes the efficiency of transferring momentum between the atmosphere and the surface for a given surface state. Once a stability function is chosen from the literature, only \(C_{DN}\) (or equivalently \(z_0\)) has to be parameterized. The neutral drag coefficient is usually calculated for a height of 10 m above the surface (\(C_{DN10}\)). All analyses in this study address \(C_{DN10}\) and its uncertainty (henceforth referred to as \(C_{DN}\)).

For parameterization development, \(C_{DN}\) is derived from direct flux measurements using Eq. (6), after inverting Eq. (4) to obtain a \(z_0\) estimate. For solid and static surfaces (i.e. rocks or vegetation over a short period), \(z_0\) (or \(C_{DN}\)) is constant and may be determined once and for all through direct flux measurement. For water surfaces, the roughness length is a composite of a smooth and a rough component (e.g. Smith 1988; Fairall et al. 2003). The roughness elements are the waves, which themselves depend on U and \(u_*\). As a result, \(z_0\) and \(u_*\) are interactive and \(C_{DN}\) parameterizations become iterative.
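The inversion of Eq. (4) for \(z_0\), followed by the evaluation of Eq. (6) at 10 m, can be sketched as follows. The function names and input values are hypothetical, and \(\psi _m\) is passed as a plain number rather than computed from a published stability function.

```python
from math import exp, log

KAPPA = 0.387  # von Karman constant value adopted later in the text


def roughness_length(U, u_star, z, psi_m, kappa=KAPPA):
    """z0 from Eq. (4): ln(z/z0) = kappa*U/u_* + psi_m."""
    return z * exp(-(kappa * U / u_star + psi_m))


def neutral_drag(z0, z_ref=10.0, kappa=KAPPA):
    """Neutral drag coefficient at height z_ref from Eq. (6)."""
    return (kappa / log(z_ref / z0)) ** 2


def drag(U, u_star):
    """Drag coefficient at the measurement height from Eq. (3)."""
    return (u_star / U) ** 2


# Invented measurement at z = 10 m.
U, u_star, z = 5.0, 0.2, 10.0
cd = drag(U, u_star)
cdn_neutral = neutral_drag(roughness_length(U, u_star, z, psi_m=0.0))
cdn_stable = neutral_drag(roughness_length(U, u_star, z, psi_m=-0.5))
```

Under neutral conditions at 10 m the two coefficients coincide, whereas a negative \(\psi _m\) (stable case) yields a \(C_{DN}\) larger than the measured \(C_D\).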

For estimating surface fluxes above heterogeneous ice-water surfaces, where the bulk approach hypothesis of horizontal homogeneity is not fulfilled, an alternative approach can be used. It consists of a contribution to \(C_{DN}\) from both water and ice surface properties through a skin drag, in addition to a contribution from a form drag component which represents the pressure force on the floe edges and the ice ridges (e.g. Schlichting 1936; Andreas et al. 1984, 2010; Lüpkes et al. 2012; Lüpkes and Gryanik 2015; Elvidge et al. 2016). Above sea ice, \(z_0\) may depend not only on the ice morphology but also on the potential presence of snow. Note that if water and sea ice are both present with different surface temperatures (which is easily the case), there is no longer a single near-surface stability, and averaging the ice and water neutral drag coefficients, scaled by their respective areal fractions, may lead to errors. In such situations, flux averaging should be preferred (e.g. Wood and Mason 1991; Raupach and Finnigan 1995; Mahrt 2000). However, the observational constraints do not allow accurate separation of the flux contributions from water, sea ice and form drag over heterogeneous ice-water surfaces. As a result, any attempt to set up a parameterization is confronted with the fact that separate processes cannot be independently validated with reference observational data. In order to isolate the different flux contributions, a more feasible approach would be to run high-resolution numerical simulations and to consider a coarse-graining approach, as done for instance in Blein et al. (2020a) for assessing the impact of the surface flux heterogeneity in a different context (surface flux heterogeneity above oceans due to convective cells). In this study we use the common assumption that a horizontally-averaged \(C_{DN}\) can be approximated by a \(C_{DN}\) calculated from the horizontally-averaged parameters. Quantifying the impact of this hypothesis is beyond the scope of this paper (see e.g. Blein et al. 2020b, for a method example). Furthermore, it is assumed that the locally-observed parameters are actually the horizontally-averaged parameters, representing the heterogeneous sea-ice field. The time averaging limits the consequences of this assumption.

3 Uncertainties Assessment and Propagation Methods

Estimating neutral drag coefficients from measurements and by using empirical stability functions comes with uncertainties from instrumental and procedural origins. This section describes the methods used to assess their magnitudes and contributions to the total uncertainty in the drag coefficient.

The proposed procedure is applied, as an illustrative example, to the data from the Surface Heat Budget of the Arctic Ocean field campaign (SHEBA, Uttal et al. 2002; Persson 2002). The SHEBA ice camp drifted approximately 2700 km in the Beaufort Gyre between 2 October 1997 and 11 October 1998. It started in the Beaufort Sea, drifted westward into the Chukchi Sea, then turned north into the Arctic Ocean near the date line. Only the tower data are used as they allow for flux-gradient relationship estimation.

These data are pre-screened according to a standard flux quality control (Foken and Wichura 1996) and a non-negative-ustar screening criterion (e.g. Eq. 4.1a in Andreas et al. 2010), as these are prerequisites for handling turbulent fluxes and using MOST. The samples from the entire SHEBA time period are used for statistics. However, only the samples from the aerodynamic summer time period (Andreas et al. 2010) are used whenever the drag coefficients are analysed as a function of the ice concentration in this paper (see Sect. 5). The ice concentration estimates are provided by the study of Perovich (2002), which assesses the ice concentration from aerial photographs taken during periodic, 200 km long helicopter surveys around the SHEBA ice camp.

3.1 Propagation of Initial Uncertainties

The neutral drag coefficient \(C_{DN}\) is estimated from Eqs. (4) to (6). Each of the variables \(v_i\) used for the \(C_{DN}\) calculation is considered along with its associated random error, expressed as a standard deviation \(\sigma _{v_i}\) around the nominal value (or the measured value for measured variables). These initial random errors propagate up to the \(C_{DN}\) value and lead to the \(C_{DN}\) random error, following:

$$\begin{aligned} \sigma ^2_{C_{DN}}= \sum ^{N}_{i=1}\left( \frac{\partial C_{DN}}{\partial v_i} \right) ^2 \sigma ^2_{v_i}. \end{aligned}$$
(10)

The \(v_i\) variable is either a physical variable (e.g. U, \(\widetilde{\theta _v}\), z, \(u_*\) and \(\theta _{v*}\)), a parameter (e.g. k or g), or the value of the stability function (see Sect. 3.3). The index i is an integer such that \(1\le i\le N\), where N is the number of variables entering the equation for \(C_{DN}\). The correlations between the uncertainties of different variables are negligible (not shown) and are therefore not accounted for in this study for the sake of clarity.
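Equation (10) can be evaluated without deriving each partial derivative by hand. The sketch below approximates the derivatives with central finite differences, which is an implementation choice assumed here rather than the procedure of this study; the function names, input values and standard deviations are illustrative only.

```python
from math import log, sqrt


def cdn10(U, u_star, z, psi_m, kappa):
    """C_DN at 10 m from Eqs. (4) and (6).

    Uses ln(10/z0) = ln(10/z) + kappa*U/u_* + psi_m.
    """
    return (kappa / (log(10.0 / z) + kappa * U / u_star + psi_m)) ** 2


def propagate(f, values, sigmas, rel_step=1e-6):
    """First-order propagation of independent random errors, Eq. (10).

    Partial derivatives are approximated by central finite differences.
    """
    variance = 0.0
    for name, sigma in sigmas.items():
        x = values[name]
        h = rel_step * (abs(x) if x != 0.0 else 1.0)
        upper = dict(values)
        upper[name] = x + h
        lower = dict(values)
        lower[name] = x - h
        derivative = (f(**upper) - f(**lower)) / (2.0 * h)
        variance += (derivative * sigma) ** 2
    return sqrt(variance)


# Invented neutral case at z = 10 m with 1% error on U and 10% on u_*.
values = {"U": 5.0, "u_star": 0.2, "z": 10.0, "psi_m": 0.0, "kappa": 0.387}
sigmas = {"U": 0.05, "u_star": 0.02, "kappa": 0.003}
sigma_cdn = propagate(cdn10, values, sigmas)
relative_error = sigma_cdn / cdn10(**values)
```

In this neutral example at 10 m, \(C_{DN}\) reduces to \(u_*^2/U^2\), so the relative error is dominated by twice the relative error of \(u_*\).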

3.2 Physical Variable Uncertainties and Parameter Uncertainties

Each of the physical variables entering the \(C_{DN}\) equation (U, \(\widetilde{\theta _v}\), z, \(u_*\) and \(\theta _{v*}\)) originates either from a direct sensor measurement (e.g. z or U if the wind speed is measured through a wind propeller-type sensor), a physical equation (e.g. \(\widetilde{\theta _v}\) or U if the wind components are measured separately) or a statistical product (e.g. \(u_*\) and \(\theta _{v*}\)). The errors in physical constants are addressed as follows: we use in this study a von Kármán constant estimate in the atmospheric surface layer of \(k=0.387\pm 0.003\), as proposed by Andreas et al. (2006), and the uncertainty of the gravitational acceleration g is neglected.

For the direct sensor measurements, the sensor accuracy (from the manufacturers) is directly used as the measurement uncertainty. However, in the present context, time averaging is necessary, which reduces the measurement uncertainty considered: the sensor accuracy is divided by the square root of the sample size. The random error is then propagated according to Eq. (10) with \(v_i\) being direct sensor measurements.
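As a small worked example of this reduction, the sketch below divides a sensor accuracy by the square root of the number of samples in the averaging period. The accuracy (0.1 K) and the sample count (3600) are invented, and the scaling assumes independent random errors between samples.

```python
from math import sqrt


def averaged_uncertainty(sensor_accuracy, n_samples):
    """Random error of a time average of n_samples independent readings."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    return sensor_accuracy / sqrt(n_samples)


# Invented example: 0.1 K accuracy, 3600 samples in the averaging period.
sigma_mean = averaged_uncertainty(0.1, 3600)
```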

For considering the non-direct measurement uncertainties (such as \(\sigma _{\widetilde{\theta _v}}\)), the random error from the sensor accuracy (potentially hourly averaged) of the corresponding dependent variables (e.g. the air temperature \(T_a\), the surface temperature \(T_s\), the pressure P and the relative humidity Hu in the case of \(\widetilde{\theta _v}\)) is propagated up to \(\sigma _{C_{DN}}\) by considering directly the dependence of \(C_{DN}\) on these dependent variables. In other words, it is done by calculating the \(\frac{\partial C_{DN}}{\partial v_i}\) term in Eq. (10), with \(v_i\) the dependent variables which are directly measured (e.g. \(T_a\), \(T_s\), P and Hu instead of \(\widetilde{\theta _v}\)).

The turbulent fluxes do not originate from physical equations but from statistical products which represent averaged characteristics of the turbulence. Observations of turbulent fluxes result from the time averaging of a covariance over the Reynolds averaging period and therefore contain random errors due to (i) the random instrumental noise around measurements (e.g. Lenschow and Kristensen 1985; Billesbach 2011; Rannik et al. 2016), (ii) poor statistical sampling of larger eddies (e.g. Mann and Lenschow 1994; Vickers and Mahrt 1997; Finkelstein and Sims 2001; Litt et al. 2015) and (iii) the stochastic nature of the turbulence (e.g. Lenschow et al. 1993; Rannik et al. 2006). Rannik et al. (2016) show that the random instrumental noise is small enough to be neglected relative to the random error due to the stochastic nature of turbulence. For dealing with the second source of errors, quality control procedures for estimating the flux quality have been proposed (e.g. Foken and Wichura 1996; Vickers and Mahrt 1997) and are usually applied in order to screen data samples (e.g. Andreas et al. 2010). As a result, the latter random error is filtered out in post-processed experiment datasets. Therefore, only the stochastic nature of turbulence should be considered as a source of errors for turbulent fluxes. This uncertainty propagates up to \(\sigma _{C_{DN}}\) according to Eq. (10), with \(v_i\) representing directly \(u_*\) or \(\theta _{v*}\). As a result, random flux errors have to be computed prior to the use of the flux value in the \(C_{DN}\) calculation. Various approaches have been developed for their evaluation (see Rannik et al. 2016, for a review) and lead to estimates which reach 10 to 20 \(\%\) under typical observation conditions. Under stable conditions and over heterogeneous surfaces such as isolated floes, this random error may be larger (Rannik et al. 2016). In any case, the random flux error has to be calculated based on the high-frequency data. Generally, reference campaign articles provide a single average value of the random errors for the fluxes over the whole campaign (e.g. Foken and Wichura 1996; Larsen et al. 2001; Andreas et al. 2005). This is the case for the SHEBA experiment (Andreas et al. 2010). Systematic errors can also arise in flux estimates (e.g. Vickers and Mahrt 1997; Massman 2000; Frank et al. 2013), but such potential bias estimations are beyond the scope of this paper and no estimate is available for the SHEBA dataset.

All these physical variable uncertainties and parameter uncertainties are used to compute the \(\sigma _{C_{DN},MRE}\) (for Measurement Random Error) according to Eq. (10) with \(v_i\) representing all variables but the stability function. The impact of the stability function uncertainty is addressed in the next Section. Table 1 gathers the SHEBA initial measurement random errors (Persson 2002).

Table 1 Initial measurement random error (MRE) for the SHEBA data

3.3 Assessing the Stability Function Uncertainty

The stability function \(\psi _m(\zeta )\) is an empirical function which results from the analytical integration of a flux-gradient relationship \(\varphi _m(\zeta )\) (see Sect. 2, Eq. 8) that best fits observations (e.g. Grachev et al. 2007). Observational estimates of \(\varphi _m\) and \(\zeta \) used to propose a flux-gradient relationship usually exhibit a large spread (see the yellow crosses around the red and blue lines in Fig. 1a), which is the main source of uncertainty of the stability function \(\psi _m(\zeta )\). \(\varphi _m\) and \(\zeta \) also contain their own errors, which originate from measurement errors that propagate up to the \(\varphi _m\) and \(\zeta \) estimates and from errors in the estimate of the vertical gradient of wind speed. This additional error tends to increase the spread on the \(\varphi _m\) versus \(\zeta \) plot (in Fig. 1a, each yellow cross could be replaced by a yellow rectangle to represent those additional uncertainties on both the horizontal and the vertical axes), but these errors arising from measurement random errors can be neglected (not shown) with respect to the spread of the nominal values of the (\(\varphi _m\), \(\zeta \)) measurement couples.

The uncertainty in the \(\psi _m(\zeta )\) function therefore relies on the uncertainty propagation through the calculation of the integral of the \(\varphi _m(\zeta )\) function and is based on a bootstrapping method (inspired by the work of Cordero et al. 2008) detailed hereafter. K resamples (folds) of N couples of (\(\varphi _m\),\(\zeta \)) are drawn with replacement from the available measurements (e.g. yellow crosses in Fig. 1a). From each of those K folds, a discrete \(\varphi _m^i(\zeta )\) function (\(i=1,...,K\)) is defined as the averaged \(\varphi _m\) per \(\zeta \) bin (grey solid lines in Fig. 1a), with a regular \(\zeta \) bin spacing in logarithmic space. Each of these K discrete \(\varphi _m^i(\zeta )\) functions is then used to calculate one discrete estimate of the \(\psi _m^i\) function by geometrical integration following Eq. (8) (grey solid lines in Fig. 1b). The \(\psi _m\) variability per \(\zeta \) bin provides the \(\psi _m\) uncertainty. This uncertainty can be viewed through two different lenses: (i) as a \(\psi _m\) mean squared error and (ii) as the combination of a \(\psi _m\) standard deviation and bias.
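The bootstrapping steps described above can be sketched as follows for the stable range (\(\zeta >0\)). This is a minimal illustration under stated assumptions, not the processing chain of this study: the function names are hypothetical, the integral below the first bin edge is neglected, empty bins add nothing to the integral, and a rectangle rule stands in for the geometrical integration.

```python
import math
import random


def log_bin_edges(zeta_min, zeta_max, n_bins):
    """Bin edges regularly spaced in log10(zeta)."""
    lo, hi = math.log10(zeta_min), math.log10(zeta_max)
    return [10.0 ** (lo + j * (hi - lo) / n_bins) for j in range(n_bins + 1)]


def binned_phi(sample, edges):
    """Mean phi_m per zeta bin (None where a bin received no data)."""
    n_bins = len(edges) - 1
    sums, counts = [0.0] * n_bins, [0] * n_bins
    for zeta, phi in sample:
        for j in range(n_bins):
            if edges[j] <= zeta < edges[j + 1]:
                sums[j] += phi
                counts[j] += 1
                break
    return [s / c if c else None for s, c in zip(sums, counts)]


def psi_from_binned_phi(phi_bins, edges):
    """Cumulative rectangle-rule integral of (1 - phi_m)/zeta, as in Eq. (8).

    Assumptions of this sketch: the integral below edges[0] is neglected and
    an empty bin adds nothing to the running sum.
    """
    psi, running = [], 0.0
    for j, phi in enumerate(phi_bins):
        if phi is not None:
            centre = math.sqrt(edges[j] * edges[j + 1])  # centre in log space
            running += (1.0 - phi) / centre * (edges[j + 1] - edges[j])
        psi.append(running)
    return psi


def bootstrap_psi(couples, edges, K=100, N=500, seed=0):
    """K discrete psi_m curves, each from N couples drawn with replacement."""
    rng = random.Random(seed)
    curves = []
    for _ in range(K):
        sample = [rng.choice(couples) for _ in range(N)]
        curves.append(psi_from_binned_phi(binned_phi(sample, edges), edges))
    return curves


def spread_per_bin(curves):
    """Standard deviation of psi_m across the K curves, bin by bin."""
    K = len(curves)
    out = []
    for j in range(len(curves[0])):
        values = [c[j] for c in curves]
        m = sum(values) / K
        out.append(math.sqrt(sum((v - m) ** 2 for v in values) / K))
    return out
```

Here `couples` would be a list of (\(\zeta \), \(\varphi _m\)) pairs; the per-bin spread returned by `spread_per_bin` is the quantity interpreted as the \(\psi _m\) uncertainty. The unstable range could be handled in the same way on \(|\zeta |\) with the appropriate sign.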

For the \(\psi _m\) mean squared error approach, the \(\psi _m\) variability per \(\zeta \) bin can be used to define the \(\psi _m\) mean squared error \(\textrm{MSE}_{\psi _m}\). This arises from the choice of a given \(\psi _m(\zeta )\) function whereas the actual \(\psi _m^i(\zeta )\) function can take any of the potential values within the spread of all the lines illustrated in Fig. 1b. The \(\psi _m\) mean squared error for a given \(\zeta \) bin is:

$$\begin{aligned} \textrm{MSE}_{\psi _m}(\zeta )=\frac{1}{K}\sum _{i=1}^{K}\left( \psi _m(\zeta ) - \psi _m^i(\zeta ) \right) ^2. \end{aligned}$$
(11)

Since all published \(\psi _m(\zeta )\) functions are close one to another compared with the spread of all the grey lines, we consider the specific choice of \(\psi _m(\zeta )\) function to have a negligible impact on the overall \(\textrm{MSE}_{\psi _m}\) estimated for each \(\zeta \) bin. This approach still considers the published \(\psi _m(\zeta )\) function to be a reasonable truth and the particular SHEBA sampling to be biased for not sampling a large enough range of conditions. The \(\textrm{MSE}_{\psi _m}\) computed as given above considers that the \(\psi _m\) values could be as far above the theoretical-empirical values (red line) as we can see them below in Fig. 1b. The \(\textrm{MSE}_{\psi _m}\) then generates a \(C_{DN}\) random error as:

$$\begin{aligned} \sigma ^2_{C_{DN},\textrm{MSE}_{\psi _m}} = \left( \frac{\partial C_{DN}}{\partial \psi _m} \right) ^2 \textrm{MSE}_{\psi _m}(\zeta ). \end{aligned}$$
(12)

Note that \(\textrm{MSE}_{\psi _m}(\zeta )\) presented here depends only on the \(\zeta \) value. The \(\textrm{MSE}_{\psi _m}(\zeta )\) values for the SHEBA dataset are provided in Table 2, Appendix I. This \(C_{DN}\) uncertainty contributes to the total \(C_{DN}\) random error as:

$$\begin{aligned} \sigma ^2_{C_{DN},\textrm{tot}} = \sigma ^2_{C_{DN},\textrm{MRE}} + \sigma ^2_{C_{DN},\textrm{MSE}_{\psi _m}}. \end{aligned}$$
(13)
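Equations (11) to (13) can be chained as in the sketch below. The sensitivity \(\partial C_{DN}/\partial \psi _m=-2C_{DN}^{3/2}/\kappa \) used here follows from Eqs. (4) and (6) when \(z_0\) is obtained by inverting Eq. (4); the function names and numerical values are hypothetical.

```python
from math import sqrt


def mse_psi(psi_published, psi_boot):
    """Eq. (11): mean squared distance between the published psi_m and the
    K bootstrap estimates, for one zeta bin."""
    return sum((psi_published - p) ** 2 for p in psi_boot) / len(psi_boot)


def dcdn_dpsi(cdn, kappa=0.387):
    """Sensitivity of C_DN to psi_m, derived from Eqs. (4) and (6)."""
    return -2.0 * cdn ** 1.5 / kappa


def sigma_cdn_from_psi(cdn, mse, kappa=0.387):
    """Eq. (12), returned as a standard deviation."""
    return abs(dcdn_dpsi(cdn, kappa)) * sqrt(mse)


def sigma_cdn_total(sigma_mre, sigma_psi):
    """Eq. (13): quadratic sum of the two contributions."""
    return sqrt(sigma_mre ** 2 + sigma_psi ** 2)


# Invented values for one zeta bin.
mse = mse_psi(-1.0, [-0.5, -1.5])
sigma_psi = sigma_cdn_from_psi(1.8e-3, mse)
sigma_tot = sigma_cdn_total(4.0e-4, sigma_psi)
```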

For the combination of a \(\psi _m\) standard deviation and bias approach, the K discrete \(\psi _m^i(\zeta )\) (for \(i=1,...,K\)) functions provide an actual mean discrete \(\overline{\psi _m}(\zeta )\) function (black solid line in Fig. 1b) which differs from the published \(\psi _m(\zeta )\) functions. Assuming this \(\overline{\psi _m}(\zeta )\) function as the truth for the targeted dataset, a systematic bias \(\varDelta \psi _m(\zeta )\) therefore arises when using the published \(\psi _m(\zeta )\) functions. This \(\psi _m\) bias generates in turn a bias on \(C_{DN}\) defined as:

$$\begin{aligned} \varDelta C_{DN} = C_{DN}-C_{DN}^{\overline{\psi _m}}, \end{aligned}$$
(14)

where \(C_{DN}^{\overline{\psi _m}}(\zeta )\) is the drag coefficient calculated with the actual mean \(\overline{\psi _m}(\zeta )\) function. In this case, the variability around the mean \(\overline{\psi _m}\) is interpreted as a \(\psi _m\) random error, calculated as:

$$\begin{aligned} \sigma ^2_{\psi _m}(\zeta )=\frac{1}{K}\sum _{i=1}^{K}\left( \psi _m^i(\zeta ) - \overline{\psi _m}(\zeta ) \right) ^2, \end{aligned}$$
(15)

which generates a \(C_{DN}\) random error as:

$$\begin{aligned} \sigma ^2_{C_{DN},\sigma ^2_{\psi _m}} = \left( \frac{\partial C_{DN}}{\partial \psi _m} \right) ^2 \sigma ^2_{\psi _m}(\zeta ), \end{aligned}$$
(16)

and the total \(C_{DN}\) uncertainty when using the actual mean \(\overline{\psi _m}(\zeta )\) function therefore reads:

$$\begin{aligned} \sigma ^2_{C_{DN}^{\overline{\psi _m}}\textrm{,tot}} = \sigma ^2_{C_{DN},\textrm{MRE}} + \sigma ^2_{C_{DN},\sigma ^2_{\psi _m}}, \end{aligned}$$
(17)

where \(\sigma ^2_{C_{DN},\textrm{MRE}}\) is calculated here using the observed \(\overline{\psi _m}(\zeta )\) function. This approach suggests a systematic bias in the published functions, especially for the unstable range. This would indicate, for example, that stability functions developed in the tropics for unstable conditions might not be suitable for polar conditions. However, this analysis should be conducted on a wider dataset in order to confirm this statement (see the discussion in Sect. 6.1).
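The two lenses are linked by the standard decomposition of a mean squared error into a variance and a squared bias. As a sketch, assuming that \(\overline{\psi _m}(\zeta )\) is the arithmetic mean of the K bootstrap estimates \(\psi _m^i(\zeta )\) and writing the bias as the difference between the published function and this mean, Eqs. (11) and (15) give:

$$\begin{aligned} \textrm{MSE}_{\psi _m}(\zeta ) = \sigma ^2_{\psi _m}(\zeta ) + \left( \psi _m(\zeta ) - \overline{\psi _m}(\zeta ) \right) ^2, \end{aligned}$$

because the cross term \(2\left( \psi _m - \overline{\psi _m}\right) \frac{1}{K}\sum _{i=1}^{K}\left( \overline{\psi _m} - \psi _m^i\right) \) vanishes by definition of the mean.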

The first approach (\(\psi _m\) mean squared error approach) is used in the core of this paper, whereas the second approach (\(\psi _m\) standard deviation and bias approach) is discussed in Sect. 6. By using the first approach, we hypothesise that the stability functions developed for unstable conditions are valid even in polar regions, but that the SHEBA dataset does not sample a large enough range of conditions to show a representative spread around the empirical functions proposed in the literature. This method is applied to the SHEBA dataset (Fig. 1a and 1b) with \(K=100\) and \(N=500\) for each of the stable and unstable ranges. The data processing follows Grachev et al. (2007). Note that the measurement random errors propagating up to the \(\zeta \) value do generate a \(\psi _m\) error, as \(\psi _m\) is a function of \(\zeta \). However, this random error is negligible compared to the \(\psi _m\) function uncertainty itself (not shown).

Fig. 1

Bootstrapping procedure for assessing the uncertainty of the (a) \(\varphi _m(\zeta )\) and (b) \(\psi _m\) functions. (a) Flux-gradient relationship \(\varphi _m\) as a function of the stability parameter \(\zeta \) for the SHEBA data: individual hourly observations (yellow crosses); discrete \(\varphi _m^i(\zeta )\) for each of the \(K=100\) folds (\(\varphi _m\) averaged per \(\zeta \) bin, grey solid lines) and empirical-theoretical interpolations for the unstable conditions (red solid line, Grachev et al. 2000) and the stable conditions (blue solid line, Grachev et al. 2007). (b) Stability function \(\psi _m\) as a function of the stability parameter \(\zeta \) for the SHEBA data: discrete \(\psi _m^i(\zeta )\) (geometrical integral of the discrete \(\varphi _m^i(\zeta )\) following Eq. (8)) for each of the \(K=100\) folds (grey solid lines), average of all discrete \(\psi _m^i(\zeta )\) per \(\zeta \) bin (black solid line) and empirical-theoretical functions from previous studies (red and blue lines for the unstable and stable regimes, respectively)

3.4 Other Sources of \(C_{DN}\) Uncertainty

As detailed in Sect. 2, surface temperature can vary significantly between water and ice, leading to different stabilities. This effect certainly increases the stability function error (through a \(\zeta \) error) as the surface temperature used for calculating \(\zeta \) is the ice temperature (at the mast base). Furthermore, this horizontal heterogeneity of stability may prevent the flow from ever reaching equilibrium with the local surface, a situation in which MOST would not apply and additional uncertainty may arise. While it is clear that the surface heterogeneity increases the uncertainty in the real \(C_{DN}\) estimates, its precise quantification remains beyond the scope of this paper. Current research, such as ongoing work on the MOSAiC expedition (Shupe et al. 2022), addresses this scientific question. Accurate \(C_{DN}\) estimates also require the measurements to be made within the surface layer. For the SHEBA dataset, this hypothesis has been verified (e.g. Grachev et al. 2007). For measurements collected near or above the upper limit of the surface layer, where fluxes start to vary significantly with height, MOST would break down. However, an additional \(C_{DN}\) random error could be estimated in order to quantify the relevance of the associated \(C_{DN}\) estimates. These points are beyond the scope of this paper.

4 Application to the SHEBA Dataset

This section presents the results of the proposed procedure for quantifying the uncertainty of the \(C_{DN}\) estimates from the SHEBA dataset.

4.1 Total Estimates of \(C_{DN}\) Random Error

The relative total \(C_{DN}\) random error (\(\frac{\sigma _{C_{DN}}}{C_{DN}}\)) as a function of \(C_{DN}\) (Fig. 2a, boxplots) resembles a parabola with an average minimum value of approximately 25\(\%\) around \(C_{DN} \approx 1.8 \times 10^{-3}\), a value for which a large number of \(C_{DN}\) measurements are available. For \(C_{DN}\) between about \(1.0\times 10^{-3}\) and \(2.5\times 10^{-3}\) (the most common range), the average relative random error is between 25 and 50\(\%\), which is larger than the uncertainty values usually estimated (between 11 and 20\(\%\), e.g. Andreas et al. 2005, 2010). When \(C_{DN}\) decreases, the relative random error increases and the \(C_{DN}\) uncertainty begins to exceed the \(C_{DN}\) value for \(C_{DN}\le 1.0\times 10^{-3}\). Even though the relative total \(C_{DN}\) random error seems to increase with \(C_{DN}\) beyond the minimum, too few data with \(C_{DN}\ge 3.0\times 10^{-3}\) are available to allow a robust conclusion on the behaviour of the relative total uncertainty for large \(C_{DN}\). The relative total \(C_{DN}\) uncertainty as a function of \(\log (\zeta )\) also resembles a parabola, with an average minimum value of approximately 25\(\%\) for close-to-neutral conditions (Fig. 2b, boxplots). The total error begins to exceed the \(C_{DN}\) value for approximately \(|\zeta |> 1\). This analysis suggests that \(C_{DN}\) estimates from the \(C_{DN}\) and \(\zeta \) ranges for which, on average, the relative \(C_{DN}\) random error is greater than 1 may not be reliable references for tuning \(C_{DN}\) parameterizations.
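One simple way to act on this observation is to discard individual estimates whose relative random error reaches a chosen threshold. The sketch below applies the threshold sample by sample, which is an assumption made here for illustration (the analysis above is stated per bin and on average); the function name, the threshold default and the sample values are hypothetical.

```python
def screen_by_relative_error(samples, max_relative_error=1.0):
    """Keep (cdn, sigma_cdn) pairs whose relative random error stays below
    the threshold; non-positive cdn values are discarded."""
    return [
        (cdn, sigma)
        for cdn, sigma in samples
        if cdn > 0.0 and sigma / cdn < max_relative_error
    ]


# Invented (C_DN, sigma_C_DN) pairs.
samples = [(1.8e-3, 4.5e-4), (0.8e-3, 1.0e-3), (2.5e-3, 1.2e-3)]
kept = screen_by_relative_error(samples)
```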

Fig. 2

Histograms (grey shading, left axis) of (a) the neutral drag coefficient at 10 m \(C_{DN}\) and (b) the stability parameter \(\zeta \). Superimposed: boxplots (black, right axis; whiskers: 5\(^{th}\) and 95\(^{th}\) percentiles, box: interquartile range, line: median, dot: mean) of the distribution of the relative total uncertainty \(\sigma _{C_{DN}}/C_{DN}\) per (a) \(C_{DN}\) bins and (b) \(\zeta \) bins
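The per-bin statistics summarised by the boxplots in Fig. 2 can be reproduced along the following lines. This is only a minimal sketch: the function name `binned_relative_error_stats`, the bin edges and the sample values are hypothetical, and the inputs are assumed to be arrays of \(C_{DN}\) values and of their total random errors \(\sigma _{C_{DN}}\).

```python
import numpy as np

def binned_relative_error_stats(cdn, sigma_cdn, bin_edges):
    """Per-bin summary of sigma_CDN / CDN (hypothetical helper, not the authors' code)."""
    cdn = np.asarray(cdn, dtype=float)
    rel = np.asarray(sigma_cdn, dtype=float) / cdn
    # np.digitize returns i such that bin_edges[i-1] <= x < bin_edges[i]
    idx = np.digitize(cdn, bin_edges)
    stats = []
    for i in range(1, len(bin_edges)):
        in_bin = rel[idx == i]
        if in_bin.size == 0:
            stats.append(None)  # empty bin: nothing to summarise
            continue
        p05, p25, p50, p75, p95 = np.percentile(in_bin, [5, 25, 50, 75, 95])
        stats.append({
            "n": int(in_bin.size),
            "mean": float(in_bin.mean()),        # dot in the boxplot
            "median": float(p50),                # line in the boxplot
            "iqr": (float(p25), float(p75)),     # box
            "whiskers": (float(p05), float(p95)),
        })
    return stats

# Illustrative values only (not SHEBA data)
example = binned_relative_error_stats(
    cdn=[1.0e-3, 1.2e-3, 2.0e-3],
    sigma_cdn=[0.5e-3, 0.6e-3, 0.5e-3],
    bin_edges=[0.5e-3, 1.5e-3, 2.5e-3],
)
```

The same helper could be applied to \(\zeta \) bins by passing \(\zeta \) as the binning variable instead of \(C_{DN}\), which would require a small change to the signature.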

4.2 Measurements Random Error and \(\psi _m\) Error Contributions

\(C_{DN}\) random errors are partitioned into two potential main sources: the measurement random errors and the uncertainty of the stability function used to obtain neutral transfer coefficients from non-neutral conditions.

Fig. 3

\(C_{DN}\) value and random error (as standard deviation) as a function of the stability parameter \(\zeta \) (per \(\zeta \) bin). Grey dotted line with circles: \(C_{DN}\) value; light grey solid line with squares: total \(C_{DN}\) measurement random error \(\sigma _{C_{DN}\textrm{,MRE}}(\zeta )\) (Eq. 10); dark grey solid line with squares: total \(C_{DN}\) random error \(\sigma _{C_{DN}\textrm{,tot}}\) (including both the total \(C_{DN}\) MRE and the \(C_{DN}\) random error arising from \(\textrm{MSE}_{\psi _m}\), i.e. \(\sigma _{C_{DN}\mathrm {,MSE_{\psi _m}}}\), Eq. 13)

The total \(C_{DN}\) random error (\(\sigma _{C_{DN},\textrm{tot}}\), dark grey solid line with squares in Fig. 3) is dominated by the total \(C_{DN}\) MRE (\(\sigma _{C_{DN},\textrm{MRE}}\), light grey solid line with squares). The difference between the two curves provides the random error on \(C_{DN}\) resulting only from the mean square error in the \(\psi _m\) estimate (\(\sigma _{C_{DN},\textrm{MSE}_{\psi _m}}\)), which is negligible under unstable conditions but increases with \(\zeta \) under stable conditions, up to about \(30\%\) of the total \(C_{DN}\) random error around \(\zeta =1\).

The fact that the total random error on \(C_{DN}\) (\(\sigma _{C_{DN},\textrm{tot}}\)) exceeds the \(C_{DN}\) value itself for moderately stable or unstable conditions, i.e. for \(\left| \zeta \right| \) larger than 1, is mainly due to the MREs.

4.3 Parameters Driving the \(C_{DN}\) Measurement Random Error

Like the relative \(C_{DN}\) error (Fig. 2), the absolute total \(C_{DN}\) MRE depends on the stability parameter \(\zeta \) (Fig. 4), with a minimum at neutral stability and an increase with increasing stability and instability. The \(u_*\) uncertainty is the main contributor over the whole range of negative \(\zeta \) as well as up to \(\zeta \approx 0.2\). For more stable conditions, the contributions from the MRE of the sensible heat and momentum fluxes both increase and the MRE of the sensible heat flux becomes the driving contribution. Too few samples have \(\zeta >10\) to draw conclusions for this extreme stability range. On average over the entire stability range available here, the total \(C_{DN}\) measurement random error \(\sigma _{C_{DN},\textrm{MRE}}\) is largely driven by the \(u_*\) MRE, which accounts for 98.7\(\%\) of the total \(C_{DN}\) MRE variance. The measurement height z is included in the residual contribution, and the previous results remain valid for other z values (not shown; for instance, when calculating the drag coefficient at heights other than 10 m).
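To illustrate how such a variance budget can be built, the sketch below propagates independent random errors to first order and returns the fraction of the output variance attributable to each input. It is not the formulation of Eq. 10: independence of the input errors is an assumption, the function names are hypothetical, and the non-neutral bulk drag coefficient \((u_*/U)^2\) is used only as a simple stand-in for the full \(C_{DN}\) computation. The numerical values are illustrative and do not reproduce the SHEBA percentages.

```python
def variance_budget(func, values, sigmas, rel_step=1e-6):
    """First-order propagation of independent random errors (hypothetical helper).

    Returns the total output variance and the fraction contributed by each input.
    """
    contributions = {}
    for name, x0 in values.items():
        # Step size for the central finite difference
        h = abs(x0) * rel_step
        if h == 0.0:
            h = rel_step
        up = dict(values, **{name: x0 + h})
        down = dict(values, **{name: x0 - h})
        dfdx = (func(**up) - func(**down)) / (2.0 * h)
        # Variance contribution of this input: (df/dx * sigma_x)^2
        contributions[name] = (dfdx * sigmas[name]) ** 2
    total = sum(contributions.values())
    fractions = {name: c / total for name, c in contributions.items()}
    return total, fractions

def drag_coefficient(u_star, U):
    """Stand-in function: non-neutral bulk drag coefficient (u_*/U)^2."""
    return (u_star / U) ** 2

# Illustrative values: 10% random error on u_*, 2% on U
total_var, frac = variance_budget(
    drag_coefficient,
    values={"u_star": 0.2, "U": 5.0},
    sigmas={"u_star": 0.02, "U": 0.1},
)
```

With these illustrative inputs, the \(u_*\) term dominates because the relative error on \(u_*\) is five times that on U and both enter the stand-in function with the same power.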

Fig. 4

Budget of the \(C_{DN}\) measurement random error (\(\sigma ^2_{C_{DN},\textrm{MRE}}\)) as a function of the stability parameter \(\zeta \) from all random errors of initial measured variables. The grey solid line with squares shows the total \(C_{DN}\) MRE. Other lines show the contributions from the parameters dominating the total \(C_{DN}\) uncertainty: \(u_*\) (black solid line), \(\overline{w^{\prime }\theta ^{\prime }}\) (black dashed line). Residual contributions (U, P, RH, T, \(\overline{w^{\prime }q^{\prime }}\) and z) are shown as a dotted black line

The total \(C_{DN}\) measurement uncertainty exhibits a non-negligible dependency on only three of the parameters used to estimate \(C_{DN}\): the larger the U (Fig. 5a), the \(u_*\) (Fig. 5b) or the \(| \overline{w^{\prime }\theta ^{\prime }} |\) (Fig. 5c), the weaker the total \(C_{DN}\) MRE. On average over the U, \(u_*\) or \(\overline{w^{\prime }\theta ^{\prime }}\) range, the \(u_*\) MRE largely dominates the total \(C_{DN}\) MRE except for \(U\le 3\ \textrm{ms}^{-1}\), \(u_* \le 0.07\ \textrm{ms}^{-1}\) and \(-0.005 \ \textrm{Kms}^{-1} \le \overline{w^{\prime }\theta ^{\prime }} \le 0\ \textrm{Kms}^{-1} \), approximately, where the contribution from the MRE of the sensible heat flux dominates. Under weak wind conditions, for which the \(C_{DN}\) value decreases, the relative \(C_{DN}\) MRE becomes even larger (not shown) and the use of the corresponding \(C_{DN}\) estimates as a reference for parameterization development could be problematic. The large \(C_{DN}\) MRE values for near-zero \(\overline{w^{\prime }\theta ^{\prime }}\) support the data screening that is usually done by discarding samples for which the sensible heat flux is too weak (e.g. Andreas et al. 2010).

Fig. 5

Same as Fig. 4 but as a function of: (a) the wind speed, (b) the friction velocity \(u_*\) and (c) the sensible heat flux \(\overline{w^{\prime }\theta ^{\prime }}\)

5 Using Uncertainties for Data Screening

5.1 Objective Data Screening Procedure

Data screening is a key step in obtaining data that are relevant for physical analysis. For turbulence studies, data are first selected following a flux quality control procedure (e.g. Foken and Wichura 1996) in order to remove data which do not fulfil the conditions assumed by Monin-Obukhov similarity theory, such as criteria on the homogeneity and steadiness of turbulence. In addition to the flux quality control procedure, which is widely used and accepted, other criteria, more subjective or even heuristic, are also often applied with the objective of getting a cleaner dataset. For instance, based on the SHEBA dataset, Andreas et al. (2010) discard samples for which the latent or sensible heat fluxes (and the temperature or humidity gradient) are considered too weak. They also discard samples for which the roughness lengths are seen as extreme values. While these criteria can be partially justified, there is no objective procedure for setting any of the imposed thresholds precisely. Data screening procedures should aim at the best trade-off between discarding as many as possible of the samples with large errors, which would otherwise propagate into the data analysis, and retaining as many samples as possible, so that the data analysis is as robust as possible. In this section, various data screening criteria are analysed by comparing the corresponding relative error on \(C_{DN}\), which is used as a quality metric for each data point. Applying the flux quality control data screening removes samples which are not relevant for turbulence analysis. However, amongst the remaining data, a large proportion still has a large \(C_{DN}\) relative random error (orange distribution in Fig. 6).
As an objective way of carrying out this selection and based on the relative \(C_{DN}\) random error estimates, we propose a criterion which consists of retaining data for which the relative \(C_{DN}\) error does not exceed a threshold of 1 (black distribution) after estimating the \(C_{DN}\) error following the method proposed in the present study. A threshold of 1 ensures that the random error does not exceed the \(C_{DN}\) value. This threshold can be adapted if necessary. More data are retained with this approach (5304 samples) than with that of Andreas et al. (2010) (blue distribution, 1761 samples), thus improving both the data quality (some of the Andreas et al. (2010) data samples show large relative random errors) and the sample number. This indicates that weak sensible or latent heat fluxes do not necessarily lead to excessively large \(C_{DN}\) errors, at least when using the threshold values used in Andreas et al. (2010). This simple procedure allows for an objective selection of data based on the final criterion that directly impacts the analysis, namely the \(C_{DN}\) error. However, we still need to make an informed choice on the acceptable level of relative \(C_{DN}\) error. This choice should ideally be based on an independent estimate of the total \(C_{DN}\) error left in the sample after applying the proposed data-screening criteria. Obtaining such an independent estimate of the total error is discussed in the next section.
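The proposed criterion amounts to a simple mask on the relative error. The following minimal sketch assumes that the data have already passed the flux quality control and that the total random error \(\sigma _{C_{DN}}\) has been estimated for each sample; the function name `screen_by_relative_error` and the sample values are hypothetical.

```python
import numpy as np

def screen_by_relative_error(cdn, sigma_cdn, threshold=1.0):
    """Boolean mask of samples whose relative C_DN error does not exceed `threshold`."""
    cdn = np.asarray(cdn, dtype=float)
    sigma_cdn = np.asarray(sigma_cdn, dtype=float)
    # Samples with C_DN == 0 give an infinite or undefined ratio and are rejected below
    with np.errstate(divide="ignore", invalid="ignore"):
        rel = sigma_cdn / np.abs(cdn)
    return np.isfinite(rel) & (rel <= threshold)

# Illustrative values only (not SHEBA data)
cdn = np.array([2.0e-3, 0.5e-3, 1.0e-3, 0.0])
sigma = np.array([0.5e-3, 0.8e-3, 1.0e-3, 1.0e-4])
keep = screen_by_relative_error(cdn, sigma)
cdn_screened = cdn[keep]
```

Keeping the threshold as an argument reflects the fact that the value of 1 can be adapted if necessary.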

Fig. 6

Histogram of relative \(C_{DN}\) error for different data screening. Orange: data screening based on flux quality control (flags provided by observers so that the MOST is verified). Cyan contour: data screening used in Andreas et al. (2010). Black: data screening based on the flux quality control and the threshold of relative \(C_{DN}\) error equal to 1

The data screening also impacts the mean \(C_{DN}\) value, for instance when collecting samples in time bins (for example in 10-day bins, Fig. 7). The data screening is therefore crucial as, for a given time period, the \(C_{DN}\) value can be biased if the dataset contains samples with a relative random error greater than 1. As a result, any attempt to explain the underlying processes would also be affected.

Fig. 7

Time series (from October 1997 to August 1998) of the \(C_{DN}\) mean (markers) and standard error of the mean (SEM, error bars) calculated for each 10-day time bin for different data screening. Red: data screened through the flux quality control procedure; blue: data screened as in Andreas et al. (2010) and black: data screened through the flux quality control procedure and the threshold on the relative random error (fixed to 1)

Several studies consider the sea ice concentration as a largely dominant factor explaining the neutral drag coefficient changes in the marginal ice zone (e.g. Lüpkes et al. 2012; Andreas et al. 2010; Elvidge et al. 2016). When collecting \(C_{DN}\) samples in ice concentration bins (Fig. 8a), the data screening can have a strong impact on the \(C_{DN}\) mean values and there is a risk of providing a biased reference for parameterization development if the relative random error is not taken into account.

5.2 Assessing Independently the Actual Total Drag Coefficient Uncertainty

As mentioned in Sect. 3.4, uncertainty sources other than the main ones detailed in this study can affect the \(C_{DN}\) estimates and, to the best of our knowledge, cannot yet be quantified. As a result, the actual total \(C_{DN}\) uncertainty value for each estimate is not available and we cannot quantify the proportion of the actual total uncertainty represented by the uncertainty estimated in the present study.

The experimental uncertainty of a variable (more specifically of a sensor) can usually be assessed by quantifying the variability of this variable when repeating exactly the same experiment (type A uncertainty in metrology). This corresponds to what is done for sensor calibration and uncertainty assessment. Strictly speaking, repeating the exact same experiment for assessing a \(C_{DN}\) above a given solid surface would only require repeating the experiment above the same surface. In this specific case of a solid surface, the total actual \(C_{DN}\) random error is therefore directly the total variability of the ensemble of estimates.

Fig. 8

Impact of the data screening on the mean and standard error of the mean when collecting \(C_{DN}\) data in ice concentration bins. Red: data screening based on the flux quality control procedure; blue: data screening based on Andreas et al. (2010) and black: data screening based on both the flux quality control procedure and the threshold on the relative random error (fixed to 1). (a) \(C_{DN}\) ice concentration bin average (markers) and standard error of the mean (SEM or \({\widehat{\sigma }}_{\overline{C_{DN}}}\), error bars) as a function of the ice concentration and (b) \({\widehat{\sigma }}_{\overline{C_{DN}}}\) as a function of the ice concentration

In the case of a sea-ice surface, where the sea and ice fractions can change and for which the skin drag above both water and potential snow cover, as well as the form drag, interact with the surface fluxes, the experiment cannot be exactly repeated, as the fluxes are intrinsically turbulent and are never exactly the same from one time period to another. As a result, the variability of the \(C_{DN}\) estimates at the same place will integrate both the random error of the estimates and the physical variability. Thus the total actual \(C_{DN}\) uncertainty cannot be assessed as long as an exact formulation for describing the air-surface fluxes above sea ice is unavailable. An alternative solution could be to duplicate local observations at the same time.

Alternatively, following the approach of, for instance, Lüpkes et al. (2012), Andreas et al. (2010) and Elvidge et al. (2016), who consider that the sea ice concentration is a largely dominant factor explaining the neutral drag coefficient changes in the marginal ice zone, the following criterion can be considered in order to assess the total random error. Considering the ice concentration as the governing parameter, whatever the function explaining the dependence of \(C_{DN}\) on the ice concentration, the \(C_{DN}\) variability for a given ice concentration value is considered as \(C_{DN}\) random error. Here the standard deviation per ice concentration bin is therefore used as an estimate for the total drag coefficient uncertainty.

Of course, other factors may play a role in the variability of \(C_{DN}\), if these factors indeed explain part of the \(C_{DN}\) physical variability. Such factors can be, for instance, the ice morphology, the spatial organisation of the floes or the presence of blowing snow. As a limitation of experiments above sea ice, such factors are very difficult to quantify precisely, and consequently their impact on the \(C_{DN}\) variability is also difficult to assess. Our estimate of the total \(C_{DN}\) uncertainty as the standard error per ice concentration bin can only be applied to the summer season of the SHEBA campaign, when the sea ice concentration varies and the associated data are available.

The \(C_{DN}\) standard error per ice concentration bin is quantified through the \(C_{DN}\) standard error of the mean (SEM) \(\sigma _{\overline{C_{DN}}}\) which is approximated as:

$$\begin{aligned} \sigma _{\overline{C_{DN}}} \approx {\widehat{\sigma }}_{\overline{C_{DN}}}=\frac{\sigma _{C_{DN}}}{\sqrt{N}}, \end{aligned}$$
(18)

where \({\widehat{\sigma }}_{\overline{C_{DN}}}\) is the estimated standard error of the mean and \(\sigma _{C_{DN}}\) is the standard deviation of the \(C_{DN}\) estimates over the N samples available per ice concentration bin. This formulation compensates for the sample number heterogeneity between ice concentration bins.
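Equation 18 can be evaluated per ice concentration bin as in the following minimal sketch. The function name `sem_per_bin`, the bin edges and the sample values are hypothetical, and the use of the sample standard deviation (`ddof=1`) is an assumption, since the estimator is not specified here.

```python
import numpy as np

def sem_per_bin(cdn, ice_conc, bin_edges):
    """Mean, standard error of the mean (Eq. 18) and sample count per bin."""
    cdn = np.asarray(cdn, dtype=float)
    # Bin i holds bin_edges[i-1] <= ice_conc < bin_edges[i]; values equal to the
    # last edge fall outside, so extend that edge slightly if they must be kept.
    idx = np.digitize(np.asarray(ice_conc, dtype=float), bin_edges)
    means, sems, counts = [], [], []
    for i in range(1, len(bin_edges)):
        sample = cdn[idx == i]
        n = sample.size
        counts.append(n)
        means.append(sample.mean() if n > 0 else np.nan)
        # At least two samples are needed for a sample standard deviation
        sems.append(sample.std(ddof=1) / np.sqrt(n) if n > 1 else np.nan)
    return np.array(means), np.array(sems), np.array(counts)

# Illustrative values only (not SHEBA data)
means, sems, counts = sem_per_bin(
    cdn=[1.0e-3, 2.0e-3, 3.0e-3, 4.0e-3, 6.0e-3],
    ice_conc=[0.1, 0.2, 0.3, 0.8, 0.9],
    bin_edges=[0.0, 0.5, 1.0],
)
```

Returning the counts alongside the SEM makes it easy to see which bins are sparsely populated after screening.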

The standard error of the mean \(\sigma _{\overline{C_{DN}}}\) as a function of the ice concentration for the QC-screened data (based only on the screening using the flux quality control, red curve in Fig. 8b) shows some variations which are amplified when considering the dataset screened with the Andreas et al. (2010) criteria (blue curve), for which maximum values reach twice the initial maximum values. Even if the metric chosen here tends to compensate for the sample number heterogeneity, the Andreas et al. (2010) criteria discard a large number of samples, which affects the dispersion estimates. The objective method proposed in the present study, which selects \(C_{DN}\) values depending on their relative error, reduces \(\sigma _{\overline{C_{DN}}}\) over the entire ice concentration range available here (black curve). We consider the larger reduction in \(C_{DN}\) dispersion for our proposed screening procedure than for that of Andreas et al. (2010) to be a sign of better overall data quality. Furthermore, the amplitude of the reduction in \(C_{DN}\) dispersion can be used to refine the choice of the acceptable level of relative \(C_{DN}\) error (the threshold fixed to 1 in Sect. 5.1).

6 Discussion

6.1 Considering the \(\psi _m\) Error as the Combination of a Standard Deviation and a Bias

This section proposes an alternative estimate of the \(C_{DN}\) uncertainty arising from the stability function uncertainty, as done in Sect. 4.2, but considering the total error on the stability function to be the combination of a standard deviation and a bias (see Sect. 3.3). As a reminder, this approach considers the mean observed discrete \(\overline{\psi _m}\) function estimated in Sect. 4.2 as the truth for the dataset targeted here, namely SHEBA. In this case, the total \(C_{DN}\) random error \(\sigma _{C_{DN}^{\overline{\psi _m}}\mathrm {,tot.}}(\zeta )\) (Eq. 17) does not differ significantly from the total \(C_{DN}\) random error found when using the published \(\psi _m(\zeta )\) functions, \(\sigma _{C_{DN}\mathrm {,tot.}}(\zeta )\) (Eq. 13), as can be seen by comparing the grey and black solid lines with squares in Fig. 9. Since \(\sigma ^2_{\psi _m}<\mathrm {MSE_{\psi _m}}\), \(\sigma _{C_{DN}^{\overline{\psi _m}}\mathrm {,tot.}}(\zeta )\) tends to be marginally lower than \(\sigma _{C_{DN}\mathrm {,tot.}}(\zeta )\), but the total random \(C_{DN}\) error is dominated in both cases by the MREs. However, using the observed discrete \(\overline{\psi _m}\) function instead of the published \(\psi _m(\zeta )\) function significantly modifies the \(C_{DN}\) values. The resulting bias \(\varDelta C_{DN}(\zeta )\) (Eq. 14, blue and red lines with triangles in Fig. 9) tends to zero toward neutral conditions, as expected, since both published and observed \(\psi _m(\zeta )\) functions tend to zero toward neutrality (Fig. 1b). The \(C_{DN}\) bias becomes significantly negative when \(|\zeta |\) increases, roughly symmetrically around \(\zeta =0\), except for extremely stable or unstable conditions.
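The inequality \(\sigma ^2_{\psi _m}<\mathrm {MSE_{\psi _m}}\) follows from the standard decomposition of a mean square error into a variance and a squared bias. The sketch below illustrates this for a single \(\zeta \) bin, under the assumption that the MSE is taken about the published function value and the variance about the observed bin mean; the exact definitions of Sect. 3.3 are not reproduced here, and the function name, sign convention for the bias and sample values are hypothetical.

```python
import numpy as np

def error_decomposition(psi_obs, psi_ref):
    """MSE about a reference value, variance about the sample mean, and bias."""
    psi_obs = np.asarray(psi_obs, dtype=float)
    # Assumed sign convention: observed bin mean minus reference (published) value
    bias = psi_obs.mean() - psi_ref
    # Population variance (ddof=0) so that MSE = variance + bias**2 holds exactly
    variance = psi_obs.var()
    mse = np.mean((psi_obs - psi_ref) ** 2)
    return mse, variance, bias

# Illustrative values only: observed psi_m samples in one zeta bin
mse, variance, bias = error_decomposition([-1.0, -2.0, -3.0], psi_ref=-1.5)
```

Whenever the bias is non-zero, the variance is strictly smaller than the MSE, which is consistent with the marginally lower total random error obtained with the observed \(\overline{\psi _m}\) function.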

Fig. 9

\(C_{DN}\) random errors, bias and values as a function of the stability parameter \(\zeta \). Grey solid line with squares: total \(C_{DN}\) random error using the published \(\psi _m(\zeta )\) function (standard deviation, \(\sigma _{C_{DN}\mathrm {,tot.}}(\zeta )\), Eq. 13); black solid line with squares: total \(C_{DN}\) random error using the effective \(\overline{\psi _m}(\zeta )\) function (standard deviation, \(\sigma _{C_{DN}^{\overline{\psi _m}}\mathrm {,tot.}}(\zeta )\), Eq. 17); grey dotted line with bullets: \(C_{DN}\) values using the published \(\psi _m(\zeta )\) function; black dotted line with bullets: \(C_{DN}^{\overline{\psi _m}}\) values using the effective \(\overline{\psi _m}(\zeta )\); blue and red solid lines with triangles: \(C_{DN}\) bias (negative and positive, respectively) when using the published \(\psi _m(\zeta )\) instead of the effective \(\overline{\psi _m}(\zeta )\) functions

Under stable conditions, the pragmatic estimate of the \(\psi _m\) function proposed here differs substantially from the Grachev et al. (2007) analytical function (see Fig. 1b), despite both being based on the same dataset. This results in a small negative \(C_{DN}\) bias when estimates are made using the analytical stability functions (Fig. 9). However, the proposed method of setting the stability function should be tested on a larger dataset in order to conduct a fair comparison with the analytical functions.

Under polar unstable conditions, a substantial negative bias appears in our mean estimate of the stability correction \(\psi _m\), even reversing the sign of the \(\psi _m\) correction (see Fig. 1b). Published stability correction functions (e.g. Fairall et al. 1996; Grachev et al. 2000) have been developed based on a wealth of observations, mainly from the tropics, where unstable conditions are much more frequent. Even though they are based on large datasets, those stability correction functions could be unsuitable for polar-specific conditions. What remains unclear is whether the polar conditions that have been sampled during SHEBA are too narrow to cover all conditions sampled in the tropics, which would falsely indicate that those functions for unstable conditions need to be reviewed for polar conditions, or whether those functions should really behave differently under polar or tropical conditions. Settling this point would require a better sampling of unstable polar conditions.

6.2 Refining Turbulent Momentum Flux Error Estimates

We have shown in this study that uncertainties in turbulent momentum flux estimates are the main contributor to the uncertainty on neutral drag coefficients. However, as the uncertainty estimate for the momentum flux, we have used a percentage of the momentum flux value, as given by the observers. This rough estimate could be refined by using statistics over the temporal window chosen to compute those momentum fluxes (e.g. Rannik et al. 2016). This would provide a more accurate estimate of the momentum flux uncertainty for each individual data point, and in turn allow for more accurate estimates of the individual neutral drag coefficient uncertainty. Those refinements would benefit the data-screening procedure proposed in this article.

7 Conclusions and Recommendations

The Arctic sea ice decline has captured public attention in the recent past as an emblematic sign of climate change. Refining our projections of sea ice cover evolution for the coming decades requires improved estimates of the sea ice heat budget. A substantial part of the uncertainty in this budget originates from surface-atmosphere turbulent exchanges. Reducing uncertainties in surface turbulent exchanges implies reducing uncertainties on the transfer coefficients on which their parameterization relies. As a starting point, we focus here on the neutral drag coefficient, from which other transfer coefficients can be derived. Developing or validating momentum flux parameterizations requires deriving transfer coefficients from field data.

This article proposes a methodology to evaluate as thoroughly as possible the different contributions to the field-derived neutral drag coefficient errors. We list as contributors the instrumental/measurement errors that propagate through the computation of transfer coefficients as well as the uncertainties in the empirical stability functions used to correct for stability effects. This methodology is applied to data from the SHEBA (Surface Heat Budget of the Arctic Ocean) campaign carried out in the Arctic Ocean from October 1997 to October 1998. We conclude that for \(C_{DN}\) between about \(1.0\times 10^{-3}\) and \(2.5\times 10^{-3}\) (the most common range), the average relative random error is between 25 and 50\(\%\). For highly stable or highly unstable conditions (\(|\zeta |>1\)), the total uncertainty in the neutral drag coefficient exceeds on average the neutral drag coefficient value itself. For closer-to-neutral conditions, the total uncertainty is around 25\(\%\) of the drag coefficient and is dominated by measurement uncertainties in surface turbulent momentum fluxes which should therefore be the target of our efforts in uncertainty reduction. This article also proposes an objective data-screening procedure for field data, which consists of retaining data for which the relative error on neutral drag coefficient does not exceed a threshold of 1. This method allows for a reduction of the drag coefficient dispersion compared to other data-screening methods, which we take as an indication of better performance.

The methodology proposed here was applied exclusively to data collected above sea ice. However, this approach could be applied to assess the neutral drag coefficient uncertainty and for data selection above any surface. The impact of the horizontal heterogeneity is discussed but not assessed in this study and remains an open scientific question. Dedicated field or numerical experiments are necessary to answer this question. As a follow-up study, the uncertainties estimated in this article will be used to define weights for the field-derived neutral drag coefficients when used to calibrate turbulent flux parameterizations such as the one from Lüpkes et al. (2012).