1 Introduction

Disasters caused by floods occur everywhere in the world, in small to large river basins, and their intensity has been increasing in recent years, probably due to global climate change [33]. In order to prepare for such disasters by establishing a proper risk management scheme and river improvement plan, it is crucial to systematically acquire accurate hydrological information such as rainfall intensity distributions, water level and discharge. Among these hydrological parameters, flow measurement points and flow data are much scarcer than those of the other parameters, because accurate discharge measurement is difficult to conduct, especially in flood conditions. On the other hand, from the viewpoint of water resources management, it is indispensable to measure the discharge accurately even at low flow rates. In typical measurements at high flow rates, discharge is estimated from in situ measurements of flow velocity by impeller-type current meters, acoustic Doppler velocimeters, or acoustic Doppler current profilers (ADCP) [22], coupled with measurements of bathymetry. In Japan, however, a float method has been officially used in major rivers for the last several decades [21], because intrusive measurement instruments are difficult to use in high-velocity flows carrying flotsam such as driftwood. The floats are made of cardboard pipe containing a certain amount of sand, which brings their specific gravity close to unity when afloat. According to the Japanese regulations, floats of different lengths are used depending on the water depth; e.g. a two-meter float is used when the water depth is greater than 2.6 m. The longest float is 4 m long and is used for depths greater than 5.2 m. In the float method, the time a float takes to travel a specified distance, usually about one hundred meters, is measured with a stopwatch by a pair of field workers. The transverse spacing of the float measurements is about ten to twenty meters, suggesting that the discharge of a river one hundred meters wide has to be measured using only four floats in case of emergency. Such rapid measurement is necessary because hydrographs in Japan tend to show a sharp peak with a duration of a few hours. The time available for obtaining the flow rate at nearly the same water level is therefore very short, which can lead to significant errors in the data. In addition, the method becomes difficult to carry out during a huge flood that could overtop the levee, because the measurement work itself is dangerous.

The most promising solution to this problem is to use image-based techniques such as large-scale particle image velocimetry (LSPIV) [2, 3, 6, 8, 9, 10, 12, 23, 24, 26, 27, 28, 29] or space–time image velocimetry (STIV) [1, 13, 14, 19, 30]. These techniques measure surface velocity distributions by analyzing surface images captured from a river bank. The fundamental assumption behind them is that the visible texture on the water surface acts as a passive tracer of the surface flow. The surface texture is basically a superposition of turbulence-generated surface ripples moving in all directions. This assumption has been verified in various field measurements, at least when the wind effect is negligible [5, 32, 34, 35]. Comparing the two, LSPIV measures an instantaneous velocity vector from a pair of images with a specified time separation, whereas STIV measures a velocity averaged in space and time along search lines set at a specified spacing, usually in the streamwise direction, by using all images at once. With respect to discharge measurement, STIV is superior to LSPIV, especially when the depression angle of the camera is small and the pixel resolution far from the camera is low [35].

However, even in STIV, measurement errors can occur when the quality of the surface texture is not appropriate for image analysis owing to random noise, whitecaps caused by wave breaking, standing surface waves, shadows of the background projected on the water surface close to the shoreline, etc. In STIV, a streamwise array of pixels is sampled over time to create a temporal image sequence, or time stack, which we call a space–time image (STI) in this paper. Since sloped features in the STI represent the advection velocity of the surface texture, measurement becomes difficult when the STI includes textures that are not directly related to the streamwise flow because of the above-mentioned factors. Therefore, to establish a reliable measurement system, deteriorated STIs have to be detected by setting a threshold on the quality of the STI. In this paper, we propose a new algorithm that uses the autocorrelation function of space–time images, and its performance is evaluated through comparison with other techniques. The new technique is termed the quality evaluation of STI by using two-dimensional autocorrelation function (QESTA).

2 Outline of the conventional STIV

2.1 Generation of space–time image

When STIV is applied to river flow measurements, video images are usually taken from a river bank and search lines are set parallel to the flow direction, as shown in Fig. 1a. Usually, search lines with a constant physical length are set at a constant spacing in the ortho-rectified image, as shown in Fig. 1b. In the case of Fig. 1, thirty search lines with a length of 23.0 m and a spacing of 5.46 m are set to cover a width of about 160 m of the Uono River’s water zone. The square panels shown in the figure are used as ground control points (GCPs) for the image rectification [24]. Figure 2 shows some examples of original STIs for the search lines indicated in Fig. 1, generated from 358 frames that correspond to 11.9 s. Figure 2 also provides images enhanced by the histogram equalization filter commonly used for image enhancement, as well as images processed by the standardization filter explained later. Although the physical length is the same at 23 m, the horizontal scale decreases with the distance from the camera, while the vertical scale is constant at 358 pixels. Looking at each space–time image, the texture in STI02 or STI02h, nearest to the camera, includes a pattern different from the linear texture, probably due to ripples reflected from the river bank. The STIs away from the bank tend to display an almost linear texture up to STI20. STIs farther from the camera and close to the opposite bank display a non-uniform texture caused by a vertical dark pattern projected by the shade of weeds. Moreover, the image itself becomes vague due to the lower spatial resolution. In general, the tilted texture clearly loses its clarity with the distance from the camera. In this example, the spatial resolution of the search line varies from 1.67 cm/pixel at the position nearest to the camera to 6.53 cm/pixel at the farthest position, as shown in Fig. 2. The spatial resolution of the STI is therefore reduced by a factor of about 4 in this case. The depression angle from the camera to each search line is also provided in Fig. 2; it varies from 3.51 degrees for the farthest line to 14.01 degrees for the nearest one. The spatial resolution and the depression angle together determine the critical condition of STIV measurement.

Fig. 1

Original and orthorectified images with search lines. a Original view, b Orthorectified image

Fig. 2

Original and filtered STIs

2.2 The standardization (STD) filter

In this research, a new image-enhancement filter for STIs is developed to equalize the unevenly distributed image intensity in the original STI, such as that seen in STI30 in Fig. 2. In the filter, the vertical distribution of pixel intensity is normalized by the mean and standard deviation of each vertical pixel array, so that every array has the same variance. The filter, called the standardization (STD) filter, can be expressed by the following equation,

$$I^{S} \left( {x,t} \right) = \left( {I(x,t) - \mu_{t} (x)} \right)/\sigma_{t} (x)$$
(1)

where \(I(x,t)\) is the original pixel intensity distribution of a space–time image, \(I^{S}(x,t)\) is the filtered image, and \(\mu_{t}(x)\) is the mean pixel intensity of a vertical array, taken over the time span T of the STI, given by

$$\mu_{t} (x) = \frac{1}{T}\int\limits_{0}^{T} {I(x,\tau )} d\tau$$
(2)

and \(\sigma_{t}(x)\) is the standard deviation for the pixel array calculated by

$$\sigma_{t} (x) = \left[ {\frac{1}{T}\int\limits_{0}^{T} {\left\{ {I(x,\tau ) - \mu_{t} (x)} \right\}^{2} } d\tau } \right]^{1/2} .$$
(3)

The effect of the STD filter is shown in Fig. 2. The texture in each STI clearly becomes much sharper after the image is normalized by the standard deviation of each vertical pixel array. For example, the shade that appears in STI30 and STI30h disappears completely in STI30s, and the embedded linear texture emerges after the STD filter is applied.
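A discrete counterpart of Eqs. (1)–(3) can be sketched as follows. This is a minimal illustration in plain Python rather than the implementation used in this study: the function name `std_filter`, the list-of-rows image layout and the treatment of a constant pixel array (mapped to zero) are assumptions made for the example.

```python
import math


def std_filter(sti):
    """Standardize each vertical (time) pixel array of a space-time image.

    sti[t][x] is the intensity at time index t and position index x.
    Returns a new image of the same shape.
    """
    n_t = len(sti)
    n_x = len(sti[0])
    out = [[0.0] * n_x for _ in range(n_t)]
    for x in range(n_x):
        column = [sti[t][x] for t in range(n_t)]
        mean = sum(column) / n_t                          # discrete Eq. (2)
        var = sum((v - mean) ** 2 for v in column) / n_t
        sigma = math.sqrt(var)                            # discrete Eq. (3)
        for t in range(n_t):
            # Assumption: a constant column (sigma == 0) is mapped to zero.
            out[t][x] = (sti[t][x] - mean) / sigma if sigma > 0 else 0.0
    return out


# Toy STI: the right-hand column is darker and has lower contrast.
sti = [[100, 10], [140, 12], [100, 10], [140, 12]]
filtered = std_filter(sti)
print(filtered)
```

In this toy image both columns carry the same alternating pattern at very different brightness and contrast, and after filtering both become the sequence -1, 1, -1, 1, which is the equalizing effect described above.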

To visually compare the velocity variation in the transverse direction, Fig. 3 provides STIs with their lateral length adjusted to a constant length. The texture gradient ϕ is clearly larger in the center of the section and becomes smaller closer to each bank, indicating that the flow velocity increases gradually towards the middle of the section and decreases near the flow boundaries. The basic idea behind STIV is that once such a linear texture is obtained in an STI, a space-and-time averaged streamwise velocity can be calculated merely by measuring the slope of the texture.

Fig. 3

STIs with the lateral length adjusted

2.3 Conventional method of STIV

In the conventional STIV developed by Fujita et al. [13], the STI is partitioned into rectangular subregions in which the local texture gradient is calculated by the optical gradient tensor method [13, 25]. The mean gradient is obtained by first generating a histogram of the local gradients and then taking their weighted average, using the coherency of the local image as the weighting factor [1, 13]. Alternatively, a two-dimensional fast Fourier transform (FFT) with a band-pass filter can be used to improve the quality of the STI by picking up only the pattern related to the flow [11, 15]. The conventional STIV has been successfully applied to several flood flow measurements [1, 18], but it is difficult to evaluate the quality of an STI with the optical gradient tensor method. Apart from the above technique, a similar approach (the optical current meter, OCM) was developed for alongshore currents in the nearshore environment [4, 7], but no information about the quality of the STI was provided in that research [5, 32].

3 A new algorithm of STIV

3.1 Detection of pattern gradient using 2D auto-correlation function

In the new technique, a two-dimensional autocorrelation function of the image intensity distribution of the STI is used to detect the most probable texture gradient in the STI. The autocorrelation function is defined by the following equation,

$$R(\tau_{x} ,\tau_{y} ) = \int\limits_{ - \infty }^{\infty } {\int\limits_{ - \infty }^{\infty } {f(x,y)f(x - } \tau_{x} ,y - \tau_{y} )dxdy} ,$$
(4)

where \(f(x,y)\) is the image intensity distribution in the STI and \((\tau_{x} ,\tau_{y} )\) are shift parameters. To calculate the autocorrelation function efficiently, the Wiener–Khinchin theorem is used, which states that the inverse Fourier transform of the power spectral density function gives the autocorrelation function, i.e.

$$R(\tau_{x} ,\tau_{y} ) = F^{ - 1} \left[ {\left| {F\left( {\xi ,\eta } \right)} \right|^{2} } \right]$$
(5)

where

$$F(\xi ,\eta ) = \int\limits_{ - \infty }^{\infty } {\,\int\limits_{ - \infty }^{\infty } {f(x,y)e^{ - i(\xi x + \eta y)} dx} dy} ,$$
(6)

and \(F^{-1}\) stands for the inverse Fourier transform. With this theorem, the autocorrelation function of an STI can be calculated efficiently by using the fast Fourier transform algorithm. As an example, the distribution of the autocorrelation function \(R(\tau_{x} ,\tau_{y} )\) of the STI shown in Fig. 4a is provided in Fig. 4b. The distribution is normalized by R(0,0) so that it takes a value of one at the centre of the autocorrelation image, i.e. \(\hat{R}(\tau_{x} ,\tau_{y} ) = R(\tau_{x} ,\tau_{y} )/R\left( {0,0} \right)\). It is apparent that the region of higher correlation shows an inclined pattern corresponding to the actual texture gradient.
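As a minimal sketch of Eqs. (4)–(6), assuming a discrete image with periodic boundaries, the following Python example checks the Wiener–Khinchin relation on a tiny 4 by 4 image. The helper names `dft2`, `autocorr_spectral` and `autocorr_direct` are illustrative, and a naive discrete Fourier transform stands in for the FFT only to keep the example free of external libraries; it is not the implementation used in this study.

```python
import cmath


def dft2(f, inverse=False):
    """Naive 2D discrete Fourier transform (O(N^4)); only for tiny images."""
    n_y, n_x = len(f), len(f[0])
    sign = 1 if inverse else -1
    out = [[0j] * n_x for _ in range(n_y)]
    for v in range(n_y):
        for u in range(n_x):
            s = 0j
            for y in range(n_y):
                for x in range(n_x):
                    phase = 2 * cmath.pi * (u * x / n_x + v * y / n_y)
                    s += f[y][x] * cmath.exp(sign * 1j * phase)
            out[v][u] = s / (n_x * n_y) if inverse else s
    return out


def autocorr_spectral(f):
    """Discrete Eq. (5): inverse transform of the power spectrum |F|^2."""
    power = [[abs(c) ** 2 for c in row] for row in dft2(f)]
    return [[c.real for c in row] for row in dft2(power, inverse=True)]


def autocorr_direct(f):
    """Discrete, periodic counterpart of Eq. (4); result indexed [tau_y][tau_x]."""
    n_y, n_x = len(f), len(f[0])
    return [[sum(f[y][x] * f[(y - ty) % n_y][(x - tx) % n_x]
                 for y in range(n_y) for x in range(n_x))
             for tx in range(n_x)] for ty in range(n_y)]


# Tiny "STI" whose pattern shifts one pixel to the right in every row.
f = [[1, 0, 0, 2], [2, 1, 0, 0], [0, 2, 1, 0], [0, 0, 2, 1]]
r_spec = autocorr_spectral(f)
r_dir = autocorr_direct(f)
max_diff = max(abs(a - b)
               for row_a, row_b in zip(r_spec, r_dir)
               for a, b in zip(row_a, row_b))
r_hat = [[v / r_spec[0][0] for v in row] for row in r_spec]  # R / R(0,0)
print(max_diff, r_hat[1][1])
```

Both routes give the same autocorrelation up to rounding error, and the normalized value stays at one along the diagonal shift (1, 1), which is the inclined high-correlation ridge that the method exploits.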

Fig. 4

The new technique of STIV; a original STI, b autocorrelation of STI, c log-polar transformation of (b), d directional average distribution (DAD) of (c)

In order to efficiently obtain the numerical value of the gradient, the distribution \(R(\tau_{x} ,\tau_{y} )\) within the circle indicated in Fig. 4b is transformed to the logarithmic polar coordinates (ρ,θ) shown in Fig. 4c, taking the origin at the centre of Fig. 4b. The transformation equations are

$$\theta = \arctan (\,\tau_{y} /\tau_{x} )$$
(7)

and

$$\rho = M\log \sqrt {\tau_{x}^{2} + \tau_{y}^{2} } .$$
(8)

M is the coefficient of intensification and M = 15 is used in this example. The purpose of using the log-polar coordinates is to emphasize the information near the origin which is important for accurately obtaining the texture gradient. The darker region in Fig. 4c corresponds to a larger correlation region near the origin. Finally, in order to find the peak value of the gradient, the distribution in Fig. 4c is averaged in the ρ direction to obtain the directional average distribution (DAD) μ(θ) by,

$$\mu (\theta ) = \frac{1}{\hbox{max} (\rho )}\int\limits_{0}^{\hbox{max} (\rho )} {\hat{R}(\rho ,\theta )d\rho }$$
(9)

where

$$\hbox{max} (\rho ) = M\log [\hbox{min} (\hbox{max} (\tau_{x} ),\hbox{max} (\tau_{y} ))].$$
(10)

An example applied to the distribution in Fig. 4c is provided in Fig. 4d. In this expression, the texture gradient is obtained by subtracting the angle at which μ(θ) takes its maximum value from π/2, i.e.

$$\varphi = \frac{\pi }{2} - \arg \hbox{max} [\mu (\theta )].$$
(11)
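The chain from Eq. (7) to Eq. (11) can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the normalized autocorrelation is replaced by a hypothetical elongated Gaussian ridge (`r_hat`) oriented at 60 degrees, the logarithm in Eqs. (8) and (10) is taken as the natural logarithm, the half size of the autocorrelation image is set to 64 pixels, and the peak is located at one-degree resolution without any sub-degree refinement.

```python
import math

M = 15.0             # coefficient of intensification used in the text
HALF_SIZE = 64       # assumed max(tau_x) = max(tau_y), in pixels
THETA_RIDGE = 60.0   # assumed orientation of the correlation ridge, degrees


def r_hat(tau_x, tau_y):
    """Hypothetical normalized autocorrelation: an elongated Gaussian ridge."""
    t0 = math.radians(THETA_RIDGE)
    along = tau_x * math.cos(t0) + tau_y * math.sin(t0)
    across = -tau_x * math.sin(t0) + tau_y * math.cos(t0)
    return math.exp(-(along / 20.0) ** 2 - (across / 2.0) ** 2)


def dad(theta_deg, n_rho=200):
    """Eq. (9): average of R_hat over rho along the direction theta."""
    rho_max = M * math.log(HALF_SIZE)        # Eq. (10), natural log assumed
    theta = math.radians(theta_deg)
    total = 0.0
    for i in range(n_rho):
        rho = (i + 0.5) * rho_max / n_rho    # midpoint rule in rho
        r = math.exp(rho / M)                # inverse of Eq. (8)
        total += r_hat(r * math.cos(theta), r * math.sin(theta))
    return total / n_rho


thetas = list(range(-89, 91))                # range of the arctan in Eq. (7)
mu = [dad(t) for t in thetas]
theta_peak = thetas[mu.index(max(mu))]
phi_deg = 90 - theta_peak                    # Eq. (11)
print(theta_peak, phi_deg)
```

Because the samples are uniform in ρ, most of them fall at small radii, which reflects the emphasis on the region near the origin that motivates the log-polar mapping.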

The actual angle of the maximum was obtained by fitting a curve to five discrete values around the peak and locating a reasonable peak value. Once the texture gradient is obtained, the velocity in the direction of the search line, usually set in the streamwise direction, can be calculated from the following relation,

$$u = \frac{{S_{x} }}{{S_{t} }}\tan \varphi .$$
(12)

Here, \(S_{x}\) is the unit physical length scale per pixel in the lateral direction and \(S_{t}\) is the unit time scale per pixel in the downward vertical direction of the STI.
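As a small worked example of Eq. (12), the following Python sketch uses the scales quoted earlier for the search line nearest to the camera (1.67 cm/pixel, and 358 frames covering 11.9 s). The function name `surface_velocity` and the two texture gradients of 45 and 75 degrees are hypothetical values chosen only for illustration.

```python
import math


def surface_velocity(phi_deg, s_x, s_t):
    """Eq. (12): u = (S_x / S_t) * tan(phi), with phi given in degrees."""
    return s_x / s_t * math.tan(math.radians(phi_deg))


S_X = 0.0167        # m per pixel (nearest search line in the text)
S_T = 11.9 / 358    # s per pixel (358 frames over 11.9 s)

u_45 = surface_velocity(45.0, S_X, S_T)   # hypothetical gradient of 45 degrees
u_75 = surface_velocity(75.0, S_X, S_T)   # hypothetical gradient of 75 degrees
print(f"{u_45:.3f} m/s, {u_75:.3f} m/s")
```

At 45 degrees the velocity equals the ratio of the two scales, about 0.50 m/s here, so faster flows appear as steeper gradients for a given pixel resolution and frame rate.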

3.2 Quality evaluation parameters of STI

As mentioned in the introduction, the accuracy of STIV depends on the quality of the STI, and it is necessary to select only reliable STIs when establishing a real-time measurement system. In this research, two parameters for evaluating the STI quality are proposed: the Poisson ratio (ν)-type index NTI and the shear deformation (γ)-type index GTI.

3.2.1 Poisson type index NTI

As can be seen from Fig. 4b, the region of higher autocorrelation tends to have the shape of an ellipse elongated along its major axis when the pattern has a unidirectional texture. On the other hand, when the pattern includes textures not related to flow features, such as random noise or shadowed parts, the shape becomes closer to a circle. Considering these characteristics of the autocorrelation distribution with respect to the pattern appearing in the STI, the quality of an STI can be evaluated by the ratio of the maximum to the minimum value of μ(θ), which is written as

$$NTI = \frac{\hbox{max} [\mu (\theta )]}{\hbox{min} [\mu (\theta )]}.$$
(13)

Qualitatively, NTI takes a larger value for an STI better suited to flow measurements. The numerical evaluation of this parameter is presented in a later section.
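Eq. (13) reduces to a one-line computation once the DAD has been sampled. In the Python sketch below, the two lists of μ(θ) samples are hypothetical values invented for illustration, one mimicking a strongly directional pattern and the other a nearly isotropic one; they are not data from this study.

```python
def nti(mu):
    """Eq. (13): ratio of the maximum to the minimum of the DAD samples."""
    return max(mu) / min(mu)


# Hypothetical DAD samples (not measured data).
mu_directional = [0.20, 0.22, 0.35, 0.60, 0.34, 0.21]
mu_isotropic = [0.30, 0.31, 0.32, 0.31, 0.30, 0.29]

nti_directional = nti(mu_directional)
nti_isotropic = nti(mu_isotropic)
print(nti_directional, nti_isotropic)
```

The directional sample gives a ratio of about 3, while the nearly isotropic one stays close to the lower bound of 1.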

3.2.2 The relation between the texture gradient and the error in measurement

Once the gradient of the texture ϕ indicating the flow is obtained by Eq. (11), the relationship between ϕ and the measurement error can be derived as follows. The relation between velocity and the texture gradient shown in Eq. (12) can be simply expressed as the following relation with k being a constant coefficient;

$$u = k\tan \phi .$$
(14)

Using this expression, the error of the velocity measurement Δu due to the gradient measurement error Δϕ can be expressed by the following relation,

$$\Delta u = k\frac{d\tan \phi }{d\phi }\Delta \phi = \frac{k}{{\cos^{2} \phi }}\Delta \phi .$$
(15)

Therefore, the error ratio of the velocity measurement p can be given as

$$p = \frac{\Delta u}{u} = \frac{1}{{\tan \phi \cos^{2} \phi }}\Delta \phi .$$
(16)

It is clear from the functional form of Eq. (16) that, for the same error Δϕ, p takes its minimum value at ϕ = 45 degrees, because tan ϕ cos²ϕ = (sin 2ϕ)/2 is largest there. On the other hand, the error increases significantly for textures with smaller or larger angles.
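The dependence of Eq. (16) on the texture gradient can be checked numerically. The sketch below assumes a hypothetical gradient error of one degree and evaluates p every five degrees; the function name `error_ratio` is illustrative.

```python
import math


def error_ratio(phi_deg, dphi_deg):
    """Eq. (16): p = dphi / (tan(phi) * cos(phi)**2), angles in degrees."""
    phi = math.radians(phi_deg)
    return math.radians(dphi_deg) / (math.tan(phi) * math.cos(phi) ** 2)


angles = list(range(5, 90, 5))
ratios = {a: error_ratio(a, 1.0) for a in angles}   # assumed 1-degree error
best = min(ratios, key=ratios.get)
for a in (5, 45, 85):
    print(f"phi = {a:2d} deg: p = {100 * ratios[a]:.1f} %")
```

For this assumed error the ratio is about 3.5 % at 45 degrees and about 20 % at both 5 and 85 degrees, which suggests choosing the frame rate and pixel scale so that the texture gradient stays near the middle of the range.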

3.2.3 The shear effect index GTI

In addition to NTI, the index of the uniformity of the pattern, another index based on the distortion of the ellipse of the autocorrelation distribution is proposed as the gamma index GTI. The idea behind this index is to take the level of symmetry of the autocorrelation ellipse into account, because the ellipse tends to become non-symmetric about its axis when the pattern in the STI includes textures deviating from the main gradient, for example due to dispersive waves propagating in directions different from the main flow, local wave breaking or other noisy patterns. The level of symmetry can be calculated from the phase difference between the angles at which the DAD takes its maximum and minimum, which can be expressed by

$$\Delta \hat{\phi } = \left| {\arg \hbox{max} [\mu (\theta )] - \arg \hbox{min} [\mu (\theta )] - \pi /2} \right| .$$
(17)

The phase difference increases from zero for an ideal parallel pattern to larger values as the distortion of the ellipse increases. The index GTI is obtained by replacing the gradient measurement error Δϕ in Eq. (16) with the phase difference \(\Delta \hat{\phi }\) given by Eq. (17), i.e.

$$GTI = \frac{1}{{\tan \phi \cos^{2} \phi }}\Delta \hat{\phi } = \frac{{\left| {\arg \hbox{max} [\mu (\theta )] - \arg \hbox{min} [\mu (\theta )] - \pi /2} \right|}}{{\tan \phi \cos^{2} \phi }}.$$
(18)

Since GTI is basically related to the error ratio of velocity measurements given by Eq. (16), the reliable range of the velocity measurement can be expressed as

$$u \pm \Delta u = u\left( {1 \pm GTI} \right) .$$
(19)

The method presented above including the measurement of texture gradient as well as the use of indices for evaluating the STI quality is hereafter termed the Quality evaluation of STI by using two-dimensional autocorrelation function (QESTA).
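Eqs. (17)–(19) can be illustrated with a short Python sketch. It assumes that the angles of the DAD maximum and minimum lie in the range of the arctangent of Eq. (7), i.e. between -90 and 90 degrees, and it uses hypothetical values (a maximum at 60 degrees, a minimum at -28 degrees and a velocity of 2.0 m/s) that are not taken from the measurements.

```python
import math


def gti(theta_max_deg, theta_min_deg):
    """Eqs. (17)-(18), with phi = 90 deg - theta_max from Eq. (11)."""
    dphi_hat = abs(math.radians(theta_max_deg - theta_min_deg) - math.pi / 2)
    phi = math.pi / 2 - math.radians(theta_max_deg)
    return dphi_hat / (math.tan(phi) * math.cos(phi) ** 2)


def velocity_range(u, g):
    """Eq. (19): lower and upper bounds u * (1 - GTI) and u * (1 + GTI)."""
    return u * (1 - g), u * (1 + g)


# Hypothetical DAD extrema: 2 degrees away from perfect orthogonality.
g = gti(60.0, -28.0)
lo, hi = velocity_range(2.0, g)      # assumed velocity of 2.0 m/s
g_ideal = gti(60.0, -30.0)           # ideal parallel pattern
print(f"GTI = {g:.3f}, range = {lo:.2f} to {hi:.2f} m/s")
```

A deviation of two degrees from orthogonality at a texture gradient of 30 degrees gives a GTI of about 0.08, i.e. a band of roughly 8 % around the measured velocity, whereas the ideal pattern gives a GTI of zero.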

4 Evaluation of QESTA

For evaluating the accuracy and performance of QESTA, synthetic space time images with various texture gradients are examined.

4.1 Synthetic data test

In order to test the performance of QESTA, we prepared two types of space–time images with a given texture gradient: one with a clear linear feature and the other with a wavy feature. The synthetic images were generated by utilizing Perlin noise [31]. Perlin noise is a type of gradient noise that has been used in the field of computer graphics to generate arbitrary noise that looks natural. In this test, an image of 512 by 512 pixels having Perlin noise in the lateral direction is first generated. The test images were then obtained by rotating it by angles from 5 to 85 degrees. For the clear linear feature, the lateral and vertical wave numbers of the noise were set at 1 and 48, respectively, while for the wavy feature they were set at 10 and 48, respectively. Several examples are provided in Fig. 5, in which the linear and wavy patterns are clearly generated.
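The idea of a test image with a prescribed texture gradient can be sketched without a Perlin noise library. The Python example below is a simplified stand-in and not the procedure used in this study: a periodic, linearly interpolated random profile replaces the Perlin noise, the image is built directly as I(x, t) = g(x - t tan ϕ) instead of by rotation, and the function name `synthetic_sti`, the image size of 64 pixels and the number of knots are assumptions.

```python
import math
import random


def synthetic_sti(phi_deg, size=64, n_knots=16, seed=0):
    """Image whose iso-intensity lines satisfy x = t * tan(phi) + const.

    Assumption: a periodic, linearly interpolated random profile replaces
    the Perlin noise used in the text. The result is indexed as [t][x].
    """
    rng = random.Random(seed)
    knots = [rng.random() for _ in range(n_knots)]
    slope = math.tan(math.radians(phi_deg))

    def profile(s):
        pos = (s * n_knots / size) % n_knots     # periodic coordinate
        i = int(pos)
        frac = pos - i
        return ((1 - frac) * knots[i % n_knots]
                + frac * knots[(i + 1) % n_knots])

    return [[profile(x - t * slope) for x in range(size)]
            for t in range(size)]


sti = synthetic_sti(45.0)
# For a gradient of 45 degrees the pattern moves one pixel per time step.
max_mismatch = max(abs(sti[t + 1][x + 1] - sti[t][x])
                   for t in range(63) for x in range(63))
print(max_mismatch)
```

Such an image has a single, exactly known gradient, which makes it convenient for checking a gradient detector before moving to more realistic textures.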

Fig. 5

Synthetic images with clear linear pattern (upper) and wavy pattern (lower). a 5°, b 30°, c 45°, d 60°, e 85°, f 5°, g 30°, h 45°, i 60°, j 85°

4.2 Measurement accuracy of QESTA

Figure 6 compares the texture gradient measured by QESTA with the given gradient. QESTA clearly yields favorable results for all angles, irrespective of the texture features. For the clear texture, the average and maximum errors were 0.252 degrees and 0.458 degrees, respectively, and for the wavy texture they were 0.398 degrees and 0.919 degrees, respectively. The error shows no correlation with the angle. Regarding the image quality indices, GTI varies from 0.0054 to 0.166 with an average of 0.053 for the clear texture, and from 0.0063 to 0.299 with an average of 0.070 for the wavy texture. This indicates that GTI takes smaller values for the clearer texture, which is reasonable considering the difference in texture. On the other hand, NTI varies from 1.67 to 1.75 with an average of 1.71 for the clear texture and from 1.79 to 1.82 with an average of 1.80 for the wavy texture. The variation of NTI is very small and there is no significant difference between the types of texture. It can be concluded that GTI is more sensitive to the image quality but has no correlation with the texture gradient. As a comparison, Fig. 7 provides the autocorrelation distribution and DAD for the case of 45 degrees. The clear texture obviously has a much more elongated ellipse than the wavy texture, but the DAD does not show a big difference.

Fig. 6

Measurement accuracy of QESTA for synthetic images. a Clear linear feature STI, b wavy feature STI

Fig. 7

Comparison of the autocorrelation distribution (upper) and DAD (lower) for a texture gradient of 45 degrees. a Clear linear feature, b wavy feature, c clear linear feature, d wavy feature

5 Application of QESTA to flood flow measurement

5.1 Outline of the measurement site

The performance of the new STIV algorithm, QESTA, was examined by applying it to measurements of a snowmelt flood in the Uono River, a major tributary of the Shinano River, which flows into the Japan Sea. The measurements were carried out downstream and upstream of the Negoya Bridge, located at Horinouchi in Uonuma City, Niigata Prefecture, Japan, on April 24, 2014. Video images were shot with a high-definition camera from the left bank. The water surface width at the upstream section was about 160 m at the time of measurement. During the image shooting, measurements by a remote-controlled, boat-mounted ADCP were conducted concurrently at the upstream section, following a zigzag route to cover a larger river reach. As the upstream STIV measurement section, already indicated in Fig. 1, coincides with one of the ADCP trajectories, comparisons were made in terms of the transverse velocity distribution. STIV analysis was conducted by setting search lines with a length of 23.0 m and a spacing of 5.46 m, as already mentioned. In addition, LSPIV analysis was conducted at the same section to compare the relative accuracy.

5.2 Comparison of surface velocity distributions

Figure 8 provides surface velocity distributions obtained by various techniques: the conventional gradient tensor method (GTM), LSPIV, ADCP, QESTA with the original STIs and QESTA with STIs processed by the standardization filter. For the ADCP results, the data closest to the water surface, i.e. 30 cm below the water surface, were compared. Within 100 m of the camera, the image-based techniques yielded results similar to each other, except that LSPIV deviates from STIV at around 40 m from the left bank. Beyond 100 m, QESTA with the original STIs failed to obtain reliable data due to the vertical texture in the STIs, but QESTA with STD-filtered STIs yielded a consistent distribution up to the opposite bank. The GTI is clearly suppressed over the entire region by the STD filter, whereas without the filter it becomes very large beyond 100 m. The conventional GTM generated data similar to QESTA with the STD filter but shows some scatter close to the opposite bank. For LSPIV, the 358 oblique images were first rectified such that the size of one pixel after the geometric correction becomes 15.3 cm, and pairs of images with a time separation of 0.1 s were analyzed by PIV using a template size of 101 by 101 pixels, which corresponds to 7.65 m by 7.65 m. The average data were obtained after processing 350 image pairs. Therefore, LSPIV requires much larger data storage space than STIV. The mean velocity vectors thus obtained are shown in Fig. 9, indicating that a favorable distribution is measured by LSPIV up to 100 m, but the vectors show some scatter beyond that distance, where the rectified image becomes blurred due to the lower spatial resolution. Compared with the other methods, QESTA with the STD filter fits the scattered ADCP data best over the entire section. The ADCP measurements are scattered because almost instantaneous velocities are obtained while the boat is moving in a zigzag manner. Regarding the discharge measurement, ADCP yielded 224.14 m³/s, GTM 232.25 m³/s, QESTA with STD filter 234.65 m³/s, and LSPIV 246.79 m³/s. The error relative to ADCP is 3.6% for GTM, 4.7% for QESTA, and 10.1% for LSPIV. In LSPIV, the velocity components in the direction of the search lines are used for the discharge calculation. Regarding the NTI value, Fig. 10 shows that applying the STD filter raises it by about 0.5, indicating the improvement of the texture in the STIs.
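The relative errors quoted above follow directly from the discharge values, taking the ADCP result as the reference. A short Python check, with variable names chosen only for this illustration, is:

```python
ADCP_DISCHARGE = 224.14   # m^3/s, reference value quoted in the text

discharges = {"GTM": 232.25, "QESTA (STD filter)": 234.65, "LSPIV": 246.79}
relative_error = {
    name: 100.0 * (q - ADCP_DISCHARGE) / ADCP_DISCHARGE
    for name, q in discharges.items()
}
for name, err in relative_error.items():
    print(f"{name}: {err:.1f} %")
```

All three image-based estimates are larger than the ADCP value, so the errors are positive in every case.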

Fig. 8

Comparison of surface velocity distributions

Fig. 9

Surface velocity vectors obtained by LSPIV

Fig. 10

Distribution of NTI before and after STD filter

5.3 Efficiency of analysis

The advantage of STIV over LSPIV is its efficiency of analysis. In LSPIV, sequential ortho-rectified images with a very large image size have to be prepared, which requires a large storage volume and a long analysis time. STIV, on the other hand, basically generates one small STI per search line and does not require as large a storage volume as LSPIV. In the present case, the net CPU time to calculate one velocity value was 0.94 s in STIV and 2.06 s in LSPIV on a standard Windows PC. In addition, the storage required by LSPIV was 3.59 GB for saving 358 rectified bitmap images with a size of 2296 by 1806 pixels, while STIV used only 21.0 MB in the present analysis.

6 Conclusions

A new technique for measuring river surface velocity distributions from video images shot obliquely from a riverbank was proposed, together with indices for evaluating the quality of space–time images. For several decades, the float method was the only officially permitted method for measuring river discharge in Japan. However, owing to the difficulty of float measurements under the extreme flow conditions of recent flood disasters, methods other than the float method have been officially permitted in Japan since 2017. Image-based techniques such as LSPIV and STIV have therefore attracted attention, and among them STIV is recognized as a promising technique for obtaining reasonable results, because measurements can be executed safely and economically by utilizing existing river monitoring cameras. The advantage of STIV over LSPIV is that STIV can yield reliable results even when the images are shot under deteriorated light conditions. The measurement system using STIV was commercialized as the software KU-STIV (be-system.co.jp/navi_soft/soft_kustiv/kustiv.htm) in 2015, and since then STIV has been used successfully in many first-class rivers in Japan. Although a tentative real-time measurement system with STIV has already been established [17, 20], space–time images occasionally become difficult to handle in dark–light conditions. Therefore, it is necessary to select only reliable STIs and discard unreliable ones for discharge estimation. From the viewpoint of practical measurement, GTI and NTI can be used as quality indices, although their threshold levels have not been determined in the present research. Hence, further examinations under much more deteriorated conditions have to be carried out. Finally, thanks to the development of unmanned aerial vehicles (UAVs) in recent years, river flow images can now be used to measure much longer river reaches by an aerial STIV technique with efficient image stabilization [16], which is a promising way to investigate river flows that are difficult to access from the ground.