Introduction

Soil moisture plays a very important role in hydrological processes (Kerr et al. 2001; Zhuo et al. 2015); in particular, surface soil moisture (SSM) has been widely recognised as a vital element in a number of environmental studies. For example, Wanders et al. (2014) found that calibrating large-scale hydrological models with satellite-retrieved surface soil moisture resulted in “an accurate identification of parameters related to land-surface processes” (Wanders et al. 2014: pp 6874). Dumedah and Coulibaly (2013) demonstrated that assimilating soil moisture observations and flow data into the Soil and Water Assessment Tool could raise the accuracy of both flow modelling and soil moisture calculation when compared with the open-loop simulation. Yoon and Leung (2015) showed that “antecedent soil moisture information was as important as concurrent ENSO condition in controlling rainfall anomalies over the United States” (Yoon and Leung 2015: pp 5005). Particularly in hydrology, the presence of sufficient moisture in the uppermost few centimetres of soil plays an important role in controlling and distributing water input from rainfall and irrigation into runoff, interflow and groundwater (Idso et al. 1975). As a result, accurate soil moisture information is essential in real-time flood forecasting, as well as in decision making for water resource management (Brocca et al. 2010; de Michele and Salvadori 2002; Komma et al. 2008; Zhuo and Han 2016; Zhuo et al. 2016; Srivastava et al. 2013b; Srivastava et al. 2016).

In theory, soil with low thermal inertia changes its temperature significantly during the daily heating-cooling cycle, whereas soil with high thermal inertia responds more slowly. This means that areas with drier soil are warmer during the day and cooler at night (de Griend-Van and Engman 1985). Some researchers believe that the diurnal surface soil temperature difference (ΔSST) can reflect the variation of SSM well (Deardorff 1978; Price 1980; Wetzel et al. 1984). This is because variations of the surface soil temperature (SST) are mainly affected by albedo and diurnal heat capacity, and the diurnal heat capacity is mainly influenced by soil moisture (Price 1980). However, the result of Srivastava et al. (2013a) contradicts this common belief; it concludes instead that there is a much stronger correlation between the SST and the SSM. Srivastava et al. (2013a) explain that the poor result of ΔSST in their paper may be related to satellite night-time data, which suffer “more from night dew, relative humidity changes during night or uncertainty in day/night registration of satellite data” (Srivastava et al. 2013a: pp 3137; Wan 1999). This conclusion motivates us to further explore the relationship between the SST and the SSM using in-situ observations, which are more reliable than satellite data, so that a more convincing conclusion can be drawn. Moreover, to our knowledge, no prior studies have compared the performance of the two products (i.e. the ΔSST and the SST) for SSM estimation, especially at catchments with contrasting climate and land cover conditions. Hence, in this paper, we carry out for the first time a detailed assessment of the SST and the ΔSST for SSM simulation at two representative catchments, one in England (Brue) and the other in Australia (Stanley). Since Srivastava et al. (2013a) use satellite datasets only, the accuracy of the data themselves can be highly dependent on weather conditions (e.g. rainfall, humidity and cloud) and vegetation coverage when the satellites retrieve those measurements (Schmugge et al. 1986). Furthermore, datasets retrieved from various satellites, and at different geographical locations, can have diverse spatial resolutions (Wagner et al. 2012). Therefore, to avoid these uncertainties in our conclusion, in-situ ground measurements are used, so that the datasets are more reliable.

Materials and methodology

Data and catchments

The Brue catchment is located in Southwest England (51.06°N and 2.93°W), as shown in Fig. 1a, b. This study area is chosen because it is representative of UK catchments and because of the availability of data. The land use of this mid-sized (135 km²) catchment is predominantly grassland on clay soils, with 10% arable farming, 7% forestry and no major urban or industrialised areas (Roberts et al. 2000). The in-situ data were collected during the HYREX project and are provided by the British Atmospheric Data Centre. The catchment is well equipped with an automatic weather station and an automatic soil water and thermometer station (the two stations are located next to each other, as shown by the red star in Fig. 1b). Both the SST and the SSM (at 5 cm) are measured at an hourly interval. The soil moisture is recorded using both capacitance probes and tensiometers (Moore 1995). In this study, the SSM measured by the capacitance probe is chosen because of its better data integrity. The raw output from the capacitance probe is given as a frequency in Hertz expressed as F = A(B + 1/θ), where F is the frequency and θ is the soil volumetric water content (m³/m³). The A and B parameters depend on the soil characteristics and instrument properties and can be calibrated from a built-in instrument model (Dean 1994). However, such calibration was not available during the HYREX project (BADC 2003; Moore 1995). Therefore, the reciprocal of the frequency (i.e. 1/Hz) is used as a representative indicator of the volumetric soil moisture for the Brue catchment to avoid the conversion error. The SST (in °C) is measured by a soil thermometer. The areal daily rainfall is obtained by averaging the records of the 49 tipping bucket rain gauges using the Thiessen polygon technique. The observations in the Brue catchment cover a 12-month period in 1995.
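
As an illustration only, the two data-preparation steps described above (taking the reciprocal of the probe frequency and averaging rain gauges with Thiessen weights) could be sketched in Python as follows. The function names, gauge values and polygon areas are hypothetical and are not taken from the HYREX dataset; the sketch assumes that each gauge is weighted by the area of its Thiessen polygon and that gauges with missing records have already been removed.

```python
def moisture_indicator(frequency_hz):
    """Return 1/F, used here as a proxy for volumetric soil moisture.

    With F = A(B + 1/theta) and positive A and B, 1/F increases
    monotonically with theta, so no calibration of A and B is needed
    to preserve the ordering of wet and dry states.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1.0 / frequency_hz


def thiessen_areal_rainfall(gauge_rain_mm, polygon_area_km2):
    """Area-weighted mean rainfall; weights are Thiessen polygon areas."""
    if len(gauge_rain_mm) != len(polygon_area_km2):
        raise ValueError("one polygon area is required per gauge")
    total_area = sum(polygon_area_km2)
    weighted = sum(r * a for r, a in zip(gauge_rain_mm, polygon_area_km2))
    return weighted / total_area


# Hypothetical example: three gauges instead of the 49 used at Brue.
areal_rain = thiessen_areal_rainfall([2.0, 4.0, 6.0], [1.0, 1.0, 2.0])
```

In this made-up example the third gauge represents half of the area, so the areal value is pulled towards its reading.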

Fig. 1 Geographical locations. a, b Brue with soil moisture observation station in the red star, and river network (Floodsite; Remesan et al. 2009). c, d Stanley microcatchment (S) with soil moisture monitoring station S2 in the red star, and river network (Rüdiger et al. 2007). G, M and K stand for the soil moisture stations in Goulburn River catchment, Merriwa River catchment and Krui River catchment, respectively

The Stanley microcatchment (Wells and Christoph 2003; Rüdiger et al. 2007) is located in New South Wales, Australia (32.1°S and 150.1°E). The study site is within the Goulburn River catchment (6540 km²), which has two subcatchments in its northern half, the Merriwa River (651 km²) and the Krui River (562 km²). Stanley (1.75 km²) is part of the Krui River catchment. It has a low to moderate vegetation cover and is used mostly for farming. The soil is predominantly basalt-derived soil and cracking clay. Figure 1c, d illustrates the geographical locations of the Goulburn River catchment (G), its subcatchments (M, K) and the microcatchment (S), with the locations of soil moisture observations also indicated. In this study, the surface soil information is collected from site S2 in the Stanley microcatchment (shown as a red star in Fig. 1d). Site S2 is chosen because it is fully equipped with an automatic weather station, soil sensors and rain gauges. The volumetric SSM (in m³/m³) at a depth of 5 cm is measured by a water content reflectometer, and the SST (in °C) at the same depth is measured by a soil temperature sensor. Both are recorded hourly. The daily rainfall amount is recorded by a tipping bucket rain gauge. The data used from Stanley cover a 12-month period in 2005.

It should be noted that the datasets from both catchments contain gaps, which are mainly due to instrument failures.

Methodology

The methodology used in this study is mainly correlational statistics. In addition to statistical indicators, summary tables and graphs are used to analyse the data. The observed in-situ datasets in their original form are too large for regularities to be identified and conclusions to be drawn directly; tables and figures are therefore used to explore the overall pattern of the datasets (Mann 1995). Time series analysis (NIST 2012) is adopted in the soil moisture and rainfall plots to extract meaningful seasonal fluctuations. Moreover, bivariate analysis via scatter plots is used to demonstrate visually the correlations between variables, such as SST against SSM and ΔSST against SSM. In addition, a combination plot is used to show multiple variables in a single plot.

Statistical indicators

The statistical indicators used in this study include the Pearson product moment correlation coefficient (r), the Spearman rank correlation coefficient (r_s), the Nash-Sutcliffe efficiency (NSE) and the root mean square error (RMSE).

Pearson product moment correlation coefficient (r) is used to assess the linear correlation between two variables, which can be calculated by

$$ r=\frac{n\left(\sum {x}_i{y}_i\right)-\left(\sum {x}_i\right)\left(\sum {y}_i\right)}{\sqrt{\left[n\sum {x}_i^2-{\left(\sum {x}_i\right)}^2\right]\left[n\sum {y}_i^2-{\left(\sum {y}_i\right)}^2\right]}} $$
(1)

where x_i is the observed value, y_i is the paired estimated value and n is the number of data pairs.

Spearman rank correlation coefficient (r_s) is similar to r except that it operates on the ranks of the data rather than the raw data (Gautheir 2001; Myers and Well 1991). It is used as an indicator to select the time point for the highest correlation between the SST and the SSM. The following equation is used for its calculation:

$$ {r}_s=1-\frac{6\sum_{i=1}^n{d}_i^2}{n^3-n} $$
(2)

where d_i is the difference between the ranks of x_i and y_i for each data pair.

Nash-Sutcliffe efficiency (NSE) is used widely in hydrology and can be obtained by

$$ \mathrm{NSE}=1-\frac{\sum_{i=1}^n{\left({y}_i-{x}_i\right)}^2}{\sum_{i=1}^n{\left({x}_i-\overline{x}\right)}^2} $$
(3)

where x̄ is the mean of the observed data.

Root mean square error (RMSE) is a frequently used measure of residuals between the modelled and the actual observations, which can be calculated as

$$ \mathrm{RMSE}=\sqrt{\frac{1}{n}{\sum}_{i=1}^n{\left({y}_i-{x}_i\right)}^2} $$
(4)

Results

SSM estimated from SST and ΔSST

The SSM estimated from the SST and from the ΔSST are compared in this section. For each catchment, the calibration and validation groups are chosen first. For each month, the first two thirds of the data are selected for calibration, and the remaining data are used for validation. This approach ensures the seasonal representativeness of both the calibration and the validation datasets (Srivastava et al. 2015).
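
The monthly split described above could be sketched as follows. This is an illustration under stated assumptions, not the exact procedure used in the study: the function name is hypothetical, records are assumed to be (date, value) pairs with missing days already removed, and the two-thirds cut is taken over the available records of each month and rounded down.

```python
from collections import defaultdict
from datetime import date


def split_by_month(records):
    """Split (date, value) records into calibration and validation sets.

    Within each calendar month, the first two thirds of the available
    records (in date order) go to calibration and the rest to validation.
    """
    by_month = defaultdict(list)
    for record in sorted(records, key=lambda r: r[0]):
        by_month[(record[0].year, record[0].month)].append(record)

    calibration, validation = [], []
    for key in sorted(by_month):
        month_records = by_month[key]
        cut = (2 * len(month_records)) // 3  # assumption: round down
        calibration.extend(month_records[:cut])
        validation.extend(month_records[cut:])
    return calibration, validation


# Hypothetical example: a complete June and a July with only 9 records.
june = [(date(1995, 6, d), 0.0) for d in range(1, 31)]
july = [(date(1995, 7, d), 0.0) for d in range(1, 10)]
calibration, validation = split_by_month(june + july)
```

Because the split is made month by month, every month contributes to both groups, which is what preserves the seasonal coverage.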

The performance of ΔSST is examined first. The accuracy is measured by NSE and RMSE. The daily maximum and minimum SST are used for the ΔSST calculation (Idso et al. 1975). As shown in Fig. 2, the daily maximum SST at Brue occurs at 16:00 on most days rather than at the expected 13:00, whereas at Stanley it occurs even later (18:00). The minimum SST occurs in the early morning at Brue (07:00) and at 09:00 at Stanley. The possible reason for the delayed maximum values of SST is that the incoming solar radiation exceeds the outgoing heat energy for many hours after noon, and the equilibrium is usually reached between 15:00 and 17:00 or even later. The exact time points may also be affected by many other factors such as water bodies, soil types, vegetation cover, wind, cloud cover, water vapour and soil moisture. Those factors can also influence the time points of the minimum values of SST. The performance of the ΔSST-estimated SSM is shown in Fig. 3. The overall calibration result in Brue is acceptable; for Stanley, however, the outcome is rather poor. To further examine the choice between linear and non-linear modelling, both r and r_s between the SSM and the ΔSST are calculated (Srivastava et al. 2013b). The similar values obtained for r and r_s indicate that a linear model is suitable for the SSM prediction from the ΔSST at both catchments. As for the validation, the ΔSST-derived SSM shows some degree of match with the observed SSM in the Brue catchment, while the match is poor in Stanley.
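
A minimal sketch of the ΔSST calculation and of a linear fit between ΔSST and SSM is given below. The text does not state how the linear model is fitted, so ordinary least squares is assumed here; the function names and the sample values are hypothetical.

```python
def daily_delta_sst(hourly_sst):
    """Return the daily SST range (maximum minus minimum) per day.

    hourly_sst maps a day label to its hourly temperatures in °C,
    with None marking missing readings. Days without any valid
    reading are skipped.
    """
    delta = {}
    for day, temps in hourly_sst.items():
        valid = [t for t in temps if t is not None]
        if valid:
            delta[day] = max(valid) - min(valid)
    return delta


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical example with one incomplete day and one empty day.
delta = daily_delta_sst({"day1": [5.0, None, 9.0, 7.0], "day2": [None, None]})
slope, intercept = fit_line([1.0, 2.0, 3.0], [5.0, 3.0, 1.0])
```

A day with many missing hours may underestimate the true range, so in practice a minimum number of valid readings per day would probably be required before the day is used.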

Fig. 2 Histograms of the times of daily maximum and minimum SST for Brue (a) and Stanley (b), during 1995 and 2005, respectively. Note that missing data exist in both catchments

Fig. 3 Performance of SSM estimation using ΔSST in Brue with a calibration and b validation, respectively, and Stanley, with c calibration and d validation, respectively

Although absent from the literature, the SSM derived from the SST is evaluated in this study. We hypothesise that when the SST rises, there will be more evaporation, which causes a decline in the SSM. To test this hypothesis, the following explorations are carried out. r_s is again used as a statistical indicator to find the time points with the strongest correlation between the SST and the SSM. As illustrated in Fig. 4, the selection of the time points for the SSM and the SST can have a large impact on the correlation result (especially the chosen time point of the SST). In the Brue catchment, the SSM at 12:00 and the SST at 11:00 have the strongest correlation (r_s = −0.84). Generally, the SST around sunrise (04:00–08:00) shows the lowest correlation, while from the late morning to midnight (10:00–24:00) a better correlation is achieved. For Stanley, the best correlation is found at 13:00 for the SSM and 24:00 for the SST (r_s = −0.37). On the other hand, the weakest correlations are found during mid-morning (09:00–11:00), while relatively steady correlations are found from the afternoon to the early morning (15:00–03:00). These identified best time points are chosen to build a model between the SSM and the SST, as presented in Fig. 5. Compared with the performance of the ΔSST, the results estimated from the SST are evidently improved in both catchments. In Brue, the calibration correlation strengthens to r = −0.87, which is much better than that from the ΔSST test (r = −0.66). In Stanley, although the correlation with the SST is still not satisfactory, it improves from r = −0.32 to r = −0.49. Moreover, the similar values of r_s and r reveal that the linear fitting method is suitable in both catchments. Furthermore, the validation result in Brue shows a significant enhancement (NSE = 0.85 and RMSE = 0.036) compared with the result from the ΔSST method. However, the validation outcome in Stanley remains poor; therefore, further exploration is essential. For both catchments, the ΔSST-derived SSM gives a weaker correlation with the observed SSM than the SST-derived SSM, indicating that ΔSST is less suitable for the estimation of SSM. For this reason, the ΔSST datasets are not carried forward to the rest of the study.
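
The search for the best pair of time points can be sketched as an exhaustive loop over all SSM hours and SST hours. The sketch below is illustrative only: the function names and sample values are hypothetical, "strongest" is assumed to mean the largest absolute value of r_s, and the simple ranking used here does not handle tied values.

```python
def _ranks(values):
    """Ranks from 1 to n (no special treatment of ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def spearman_rs(x, y):
    """Spearman rank correlation following Eq. (2)."""
    n = len(x)
    d2 = sum((a - b) ** 2 for a, b in zip(_ranks(x), _ranks(y)))
    return 1 - 6 * d2 / (n ** 3 - n)


def best_time_points(ssm_by_hour, sst_by_hour):
    """Return (ssm_hour, sst_hour, r_s) with the largest |r_s|.

    Each argument maps an hour of day to a list of daily values in
    the same day order; None marks a missing value, and such days
    are dropped pairwise before the correlation is computed.
    """
    best = None
    for ssm_hour, ssm in ssm_by_hour.items():
        for sst_hour, sst in sst_by_hour.items():
            pairs = [(m, t) for m, t in zip(ssm, sst)
                     if m is not None and t is not None]
            if len(pairs) < 3:
                continue  # too few pairs for a meaningful correlation
            rs = spearman_rs([p[0] for p in pairs], [p[1] for p in pairs])
            if best is None or abs(rs) > abs(best[2]):
                best = (ssm_hour, sst_hour, rs)
    return best


# Hypothetical example: one SSM hour against two candidate SST hours.
ssm_by_hour = {12: [0.30, 0.25, 0.20, 0.15]}
sst_by_hour = {4: [10, 12, 11, 13], 11: [10, 12, 14, 16]}
best = best_time_points(ssm_by_hour, sst_by_hour)
```

With 24 candidate hours for each variable, the full search evaluates 576 correlations per catchment, which is cheap enough to repeat for each season.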

Fig. 4 Spearman correlations (r_s) between the 24-h SSM and the 24-h SST at a Brue and b Stanley

Fig. 5 Performance of SSM estimation using SST in Brue with a calibration and b validation, respectively, and Stanley, with c calibration and d validation, respectively

Seasonal based SSM estimation

To further improve the performance of the SST-modelled SSM, an approach based on four seasons is introduced. This is because the relationship between the SST and the SSM can behave differently in different seasons. The seasonal datasets are split into spring (March–May), summer (June–August), autumn (September–November) and winter (December–February). The data selection procedure is similar to that of the whole-year study (i.e. for each month the first two thirds of the data are used for calibration and the remaining third for validation). To find the time points with the best correlation between the SST and the SSM, the Spearman correlations are again calculated between the 24-h SSM and the 24-h SST for each season, and the results are shown in Table 1. For both catchments, the best time points for the SSM show no clear pattern. However, in Brue the best time point for the SST becomes progressively later from spring to autumn, with winter being the exception, whereas for Stanley all the best time points for the SST are found during the night. These time points are then used to build the relationship between the SSM and the SST. The Pearson and Spearman correlations are calculated as shown in Fig. 6. Since both indicators show similar results, a linear model is again used to estimate SSM. As shown in Fig. 6, for Brue, the winter SSM surprisingly shows only a small fluctuation (~ 0.96–1.05 1/Hz) and a positive correlation with the SST during the calibration period, which is distinct from the other seasons. Moreover, the SSM estimation in winter is the weakest among all seasons. This is because evapotranspiration is very low in winter: it can be as low as zero, and for most of the time it stays around 0.1 mm/day. Therefore, when it rains, the soil tends to stay wet. In summer, by contrast, the evapotranspiration can be almost 6 mm/day, so any rainfall absorbed by the soil can evaporate quickly. This is why there are two groups of data in winter (one during the winter rainy period and one during the winter dry period) with few data in between. As a result, the temperature relationship in winter is not ideal. Fortunately, soil moisture data users are more interested in the summer, when soil information is most needed (e.g. for irrigation and other purposes). For the validation, the plot shows a rather strong seasonal demarcation. It can be seen from Table 2 that the best validation performance is obtained in spring (based on the NSE indicator), while the worst result is from the winter datasets. The seasonal equations are then combined to examine the results for the whole period. The comparison between the SST-estimated SSM and the observed SSM shows very good agreement (NSE = 0.90 and RMSE = 0.03). For Stanley, the overall correlations are good during the calibration period, especially in autumn. However, autumn has a positive correlation, which is not in line with the other three seasons. During the validation period, the winter season shows a similarly poor performance to that in the Brue catchment, which could also be caused by frozen soil. The combined seasonal algorithms give NSE = 0.40 and RMSE = 1.55, a far better performance than the whole-year algorithm.
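
The combination of seasonal equations described above can be sketched as one linear model per season, with the model selected by the month of each observation. This is an illustration under assumptions: the names are hypothetical, ordinary least squares is assumed for the fit, the sample values are invented, and the month ranges are those stated in the text and are applied unchanged to both catchments.

```python
SEASON_OF_MONTH = {
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
    12: "winter", 1: "winter", 2: "winter",
}


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_seasonal_models(samples):
    """Fit one linear SST-to-SSM model per season.

    samples is a list of (month, sst, ssm) calibration tuples, where
    sst and ssm are taken at the best time points of that season.
    Returns a dict mapping season name to (slope, intercept).
    """
    grouped = {}
    for month, sst, ssm in samples:
        grouped.setdefault(SEASON_OF_MONTH[month], []).append((sst, ssm))
    return {
        season: fit_line([p[0] for p in points], [p[1] for p in points])
        for season, points in grouped.items()
    }


def predict_ssm(models, month, sst):
    """Estimate SSM with the model of the season that month falls in."""
    slope, intercept = models[SEASON_OF_MONTH[month]]
    return slope * sst + intercept


# Hypothetical calibration samples for two seasons.
samples = [
    (3, 10.0, 0.40), (4, 12.0, 0.36), (5, 14.0, 0.32),
    (6, 16.0, 0.30), (7, 18.0, 0.25), (8, 20.0, 0.20),
]
models = fit_seasonal_models(samples)
estimate = predict_ssm(models, 4, 13.0)
```

Applying `predict_ssm` to every validation record and scoring the concatenated estimates with NSE and RMSE gives the whole-period performance of the combined seasonal equations.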

Table 1 Time points (24 h) for the best Spearman correlations obtained between the SSM and the SST in four seasons

Fig. 6 Seasonally based performance of SSM estimation using SST in Brue with a calibration and b validation, respectively, and Stanley, with c calibration and d validation, respectively

Table 2 Statistical performances of SSM estimation using SST, during calibration and validation

Discussion and conclusions

Through the exploration, a high accuracy of SSM estimation is achieved in the Brue catchment with the seasonally based method. Although the performance in Stanley is also improved remarkably, the overall result is still not as good as the one in the Brue catchment. Here, an investigation is made to further discuss the factors that may contribute to the difference between those two catchments:

  • Precipitation: Rainfall could be a significant factor influencing SSM in Stanley. As seen from Fig. 7, the rainfall amount in Brue is much larger than that in Stanley, with a total annual rainfall of 854.7 mm in Brue and 386.6 mm in Stanley. Moreover, the rainfall events in Brue are rather evenly distributed, whereas in Stanley they are unevenly distributed. Therefore, in Stanley, after a long period of drought, the SSM has already reached its wilting point, such that the subsequent SSM varies only slightly with respect to SST changes.

  • Climate: The Stanley microcatchment has a temperate climate with large fluctuations in its precipitation and evaporation during the year (Stern et al. 2000), whereas the Brue catchment has a typical temperate maritime climate. In Brue, Atlantic depressions are the major source of heavy and intense rainfall in spring, autumn and winter, and convection can sometimes cause heavy rainfall in summer (Metoffice 2013). The annual potential evapotranspiration is calculated as 487 mm in Brue and 1360 mm in Stanley. It is clear that in Stanley, with much higher evapotranspiration and less rainfall, the surface soil can easily become dry, which disrupts its relationship with the SST.

  • Soil type: In Stanley, soils are predominantly basalt derived, consisting mostly of sand- and silt-size material (Sijing and Marinos 1997) with little cracking clay (Wells and Christoph 2003), while the major soil type in Brue is clay. As classified by the US Department of Agriculture, the particle size of sand is between 0.05 and 2.0 mm, while that of clay is smaller than 0.002 mm. Generally, the smaller the soil particle size, the poorer the drainage. Fine soil such as the clay in Brue tends to hold water within the soil system for a longer time. In contrast, in Stanley, the moisture absorbed by the sandy soil (from rainfall and air humidity) can quickly percolate into deeper soil layers. Therefore, with less rainfall, the SSM in Stanley can easily reach its wilting point.

  • Vegetation cover: The values of the annual mean Normalised Difference Vegetation Index (NDVI) are retrieved from the AVHRR-FASIR dataset (Los et al. 2000; Sellers et al. 1996). The annual mean NDVI for Brue is 0.67, which indicates dense vegetation. In contrast, the annual mean NDVI for the Stanley catchment is 0.29, which indicates only low to moderate vegetation cover. A canopy is able to block a certain amount of direct solar radiation and so retain the moisture level in the soil. Moreover, the air temperature between the canopy and the earth surface is lower than that over bare soil, which further reduces evaporation. Hence, in Brue, the surface soil is able to retain more water before it dries out.

Fig. 7 Four seasons’ daily rainfall density plot for Brue (a–d) and Stanley (e–h), with the total annual rainfall of 854.7 mm in Brue and 386.6 mm in Stanley

From these four points, it can be noted that the correlation between the SSM and the SST is stronger in areas where the surface soil is usually wet. Other factors such as wind speed, solar radiation and the model used could also affect the results. Further research over a variety of catchments will be necessary to test these speculations.

This study has found that SST is much better than ΔSST in estimating SSM. For Brue, the validation result from the seasonally based analysis is very convincing. Such a high correlation could be useful for practical applications, such as remote-sensing SSM estimation. This is because globally covered SST information can be retrieved from the thermal infrared bands of many satellite instruments, such as the Moderate Resolution Imaging Spectroradiometer (MODIS) Level 3 land surface temperature product, and from the observations of the solar energy reflected and absorbed by the Earth made by the Clouds and the Earth’s Radiant Energy System (CERES). Furthermore, it is found that the selection of the SST time point is rather important for an accurate SSM estimation. Therefore, the multi-temporal correlation results between the SST and the SSM can provide useful information for selecting the most suitable satellites that pass over a specific study area. In addition, explorations at more catchments that take precipitation, climate, soil and vegetation parameters into account would help to strengthen the conclusion of this paper.

It is noted that both SST and SSM in this paper refer to the soil at 5 cm below the surface. This is different from the satellite land surface temperature. However, it is not correct to assume that soil moisture at 5 cm below the surface has a better correlation with the surface temperature than with the soil temperature at 5 cm below the surface. This is because soil moisture and temperature are related to each other through the variation of heat capacity with water content (water has a much higher heat capacity than dry soil particles). Therefore, it is logically valid in this paper to link the soil moisture and the temperature at the same depth. Nevertheless, studies exploring the relationship between the satellite land surface temperature and the soil moisture at 5 cm below the surface will be carried out in the future.