The growing demand for water, mainly due to population growth, increased productive activities, environmental degradation of water bodies, and decreased rainfall have contributed to a panorama of water scarcity in several regions of the planet (Espinosa and Rivera 2016; Li et al. 2017; Karthe et al. 2017; Kisi et al. 2018; Santos et al. 2018). Chen and Li (2016) report that there are two types of water scarcity: (a) economic scarcity that occurs due to lack of investment, which is characterized by poor infrastructure, and (b) uneven distribution of water and physical scarcity, which occurs when water resources fail to meet the population demand. This study focuses on water scarcity from the point of view of physical scarcity because it proposes the application of mathematical methods to model the water balance and sediment yield in the Epitácio Pessoa Dam river basin, one of the most important basins in the State of Paraíba. The river basin provides the water supply for 20 municipalities, including Campina Grande, which is one of the largest population concentrations in the north-eastern semi-arid region.

Given the growing awareness of the effects of climate variability on nature and human beings, streamflow and its impact on human survival have received much attention from researchers around the world, mainly in semi-arid zones. Thus, several studies have applied various methodologies to analyse climate and land use changes and their impacts on water resources and sediment yield. Wu et al. (2014) applied a system-theory-based model for streamflow forecasting, and the frequency analysis method was used to optimize the model to reduce prediction errors. The calibration and verification show that the forecasted results of the monthly streamflow are generally in good agreement with the observed data. Silva et al. (2018) forecasted the streamflow using hydrological simulation in a tropical humid basin in the Cerrado biome of Brazil using the Soil and Water Assessment Tool model (SWAT), whereas Silva et al. (2015) evaluated the trend of rainfall–run-off regimes in selected headwater areas using statistical testing and a Mann–Kendall seasonal test in Portugal, and Tian et al. (2012) assessed the integration of monthly water balance modelling and nutrient load estimation in an agricultural catchment. Recently, Wang et al. (2018) applied a genetic algorithm to land use optimization for non-point source pollution control based on CLUE-S and SWAT models for the Xiangxi River watershed in China.

The region served by the Epitácio Pessoa Dam river basin faces major problems during periods of drought. Azevêdo et al. (2017) highlight that the water conditions of this basin are not enough to supply the flow of the rivers during the long periods of no rainfall, which can lead to the adoption of containment measures, such as rationing or the prohibition of irrigation.

In Brazil, water scarcity and sediment yield problems occur in several parts of the country, and in recent years there has been a decrease in water availability in regions that were not historically affected by water crises, such as the Metropolitan Region of São Paulo (Rao et al. 2016). Several Brazilian municipalities have faced a decrease in rainfall since 2012, creating a complex scenario of water scarcity that has caused serious impacts on the public water supply and other water uses, such as irrigation and electric power generation (Santos et al. 2017a). However, Silva et al. (2013) note that scarcity in Brazil occurs mainly in the north-east, more precisely in the semi-arid region, which periodically faces annual or multi-annual droughts. In view of the recurrent droughts in the Brazilian semi-arid region, measures have been taken to ensure stability in the water supply to humans, such as the transposition of river waters and the construction of reservoirs (Marengo et al. 2017).

Given the problem of water scarcity, the integration between run-off–erosion models and geographic information systems has stood out as an important tool that allows the simulation of hydrological processes at river basin scales and helps in the management of water resources. Among these models, SWAT (Soil and Water Assessment Tool) has stood out as one of the most used models in the world, showing good results in several applications (Silva et al. 2013; Liu et al. 2017; Malagó et al. 2017; Hallouz et al. 2018). The SWAT model can account for several management actions, including best management practices, land use change, and reservoirs. Applying SWAT to large river watersheds remains challenging because the model complexity requires large input and calibration datasets (Vigiak et al. 2017). To improve the robustness of SWAT sediment flux assessments in large basins, Zhang et al. (2015) and Tuo et al. (2018) proposed methodologies that combine multi-site calibration, a multi-objective approach, and the use of ancillary data to constrain sediment parameters, but these applications are still scarce for the State of Paraíba, north-eastern Brazil. In this sense, this study aimed to estimate the water balance and sediment yield in the Epitácio Pessoa Dam river basin to assist in the management of water resources to prevent water scarcity in this important basin of the State of Paraíba, using data from 1970 to 2014.

Materials and methods

Study area

The Epitácio Pessoa Dam river basin is located between coordinates 6°80′00′′S–8°4′00′′S and 35°9′00′′W–37°5′00′′W (Fig. 1), with an area of approximately 12,406 km2. This basin is in the semi-arid region of north-eastern Brazil, more precisely in the central portion of the State of Paraíba on the Borborema Plateau.

Fig. 1 Location map of Epitácio Pessoa Dam river basin in Brazil

Soil types

Several maps and tabular databases were used to conduct this study to model the water balance and sediment yield. The mapping of the soil types used was at a 1:250,000 scale and was developed by the Executive Agency of Water Management of the State of Paraíba (Agência Executiva de Gestão das Águas do Estado da Paraíba) (AESA 2017). The existing soil types in the basin are Cambisol, Leptosol, Luvisol, Fluvisol, Lithosol, Regosol, Planosol, Vertisol and Red Yellow Acrisol (Fig. 2a). Table 1 shows the distribution of soil types in the study area.

Fig. 2 Spatial location of (a) soil types and (b) land use and land cover classification in the Epitácio Pessoa Dam river basin

Table 1 Distribution of soil and land use and cover types in the Epitácio Pessoa Dam river basin

Land use and cover

The land use and land cover map used in this study was created from the classification of orbital images. It was generated from a mosaic of three Landsat 2/MSS images, with 4–6–5 (RGB) composition and 80-m spatial resolution, dated 2 August 1980, 21 August 1980, and 10 October 1981, orbit 230 and point 65, which were obtained from the National Institute of Space Research (INPE). The images were classified with a supervised method, the maximum likelihood classification, which was chosen because of the extent of the river basin and the spatial resolution of the images (Santana et al. 2014). The land use and land cover were classified based on vegetation size, which consisted of (a) rangebrush vegetation, (b) grassland vegetation, and (c) dense canopy vegetation as well as (d) water and (e) barren land classes. The description and spatial distribution of land use and occupation are shown in Table 1 and Fig. 2b, respectively.

The rangebrush vegetation and grassland vegetation classes predominate in the study area. The vegetation of the Cariri Paraibano region, as well as most of the Brazilian semi-arid region, is characterized by the dominance of deciduous species that are xerophilous with a strong presence of thorny plants and with good adaptation to drought, i.e. a typical vegetation of the Caatinga biome. Santos et al. (2017b) state that the topographic variability and the distribution of soil types, as well as the climate of the region, are the main factors that influence the differentiation of the sizes of vegetation species present in the region.

Figure 3a shows the typical representation of the Caatinga biome, characterizing rangebrush vegetation with smaller size, and the larger grassland vegetation, which occurs in greater abundance in the higher areas of the basin. The canopy vegetation (Fig. 3b) is the low vegetation that typically occupies the banks of the rivers. There is also a considerable presence of barren land (exposed soil) in the basin, which is a concerning fact because the lack of vegetation leaves the soil unprotected and susceptible to erosion (Fig. 3a). Finally, there is the water class, which corresponds to the water bodies in the river basin.

Fig. 3 Physiognomy of the Caatinga biome vegetation: (a) rangebrush vegetation, showing spaced trees, and barren and degraded soil; (b) typical canopy strata of the study area

According to Albuquerque et al. (2005), the species that can be found in the area are (a) woody: angico (Anadenanthera colubrina) and catingueira (Caesalpinia pyramidalis Tul.), (b) cacti: prickly pear (Opuntia sp.), xique-xique (Pilosocereus gounellei), mandacaru (Cereus jamacaru), and facheiro (Pilosocereus piauhinensis), (c) pasture: mimosa grass (Axonopus purpusii Nees), and (d) trees: quince (Croton sincorensis), mufumbo (Combretum leprosum), pinhão manso (Jatropha pohliana), and pereiro (Aspidosperma pyrifolium).

Slope and climatological data

Figure 4 shows the slope classes for the study area based on the processing of the digital elevation model of the basin from the ASTER-GDEM image (Advanced Spaceborne Thermal Emission and Reflection Radiometer—Global Digital Elevation Model), with a resolution of 30 m (Jing et al. 2013). The ASTER-GDEM image is available at

Fig. 4 Map of spatial distribution of slopes and rivers in the Epitácio Pessoa Dam river basin

In this study, the daily data from 21 rain gauge stations and two streamflow gauge stations were used (Table 2). It is also noteworthy that 3 years of “model warm-up” were used: the modelling based on the Poço de Pedras station from 1970 to 1972 and the modelling based on the Caraúbas station from 1973 to 1975 were replicated at the beginning of the series. This model warm-up was performed so that SWAT could be adjusted to the hydrological conditions of the basin before generating the results. Monthly data of air humidity, air temperature, solar radiation, and wind direction were collected at the Monteiro and São João do Cariri climatological stations for the same modelling periods.
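One way to picture the warm-up replication is sketched below. This is only an illustration under stated assumptions: the helper `add_warmup`, the `(year, value)` record layout, and the relabelling of the copied years to the years immediately before the series are all hypothetical, since the text does not describe how the replicated years were dated in the SWAT input files.

```python
def add_warmup(records, warmup_years=3):
    """Prepend a copy of the first `warmup_years` years of a series.

    `records` is a list of (year, value) pairs sorted by year. The copied
    years are relabelled so that they end just before the original series
    starts (an assumption made for this sketch).
    """
    years = sorted({year for year, _ in records})
    if len(years) < warmup_years:
        raise ValueError("series is shorter than the warm-up period")
    first_years = years[:warmup_years]
    # Shift far enough back that the copy never overlaps the original years.
    shift = first_years[-1] - first_years[0] + 1
    warmup = [(year - shift, value) for year, value in records
              if year in first_years]
    return warmup + list(records)


# Hypothetical annual values, invented for illustration only.
series = [(1970, 1.0), (1971, 2.0), (1972, 3.0), (1973, 4.0)]
extended = add_warmup(series)
print(extended[:3])
```

The first three entries of `extended` repeat the 1970 to 1972 values, so the model state can settle before the period that is actually evaluated.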

Table 2 Characteristics of rain gauges and streamflow gauges* used in this study

The water balance modelling of the Epitácio Pessoa Dam river basin was divided into two parts: (a) the Taperoá river sub-basin between 1970 and 1990 using available streamflow gauge data from the Poço de Pedras station and (b) the upper Paraíba river sub-basin for 1973–1990 based on the streamflow gauge data from the Caraúbas station.

The SWAT model

The Soil and Water Assessment Tool—SWAT (Arnold et al. 1998) is a semi-distributed, semi-physically based model developed to simulate water flow and erosion and the impacts resulting from changes in land use in river basins. SWAT simulates the water balance in the basin based on the equation:

$${\text{SW}}_{t} = {\text{SW}}_{0} + \sum\limits_{i = 1}^{t} {\left( {R_{i} - Q_{i} - {\text{ET}}_{i} - P_{i} - {\text{QR}}_{i} } \right)}$$

where SWt is the final amount of water in the soil (mm); SW0 is the initial amount of water in the soil at day i (mm); t is the time (days); Ri is the rainfall at day i (mm); Qi is the surface run-off at day i (mm); ETi is the evapotranspiration at day i (mm); Pi is the percolation at day i (mm); and QRi is the return flow (capillary rise) at day i (mm).
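As a minimal sketch of this daily bookkeeping, the snippet below accumulates the balance day by day. The function name `soil_water_balance` and all sample values are illustrative assumptions; this is not SWAT source code and the numbers are not data from this study.

```python
def soil_water_balance(sw0, rain, runoff, et, perc, ret_flow):
    """Return the final soil water content SW_t (mm).

    sw0 is the initial soil water (mm); the other arguments are daily
    series (mm) of rainfall, surface run-off, evapotranspiration,
    percolation, and return flow, all of the same length.
    """
    days = len(rain)
    if not all(len(s) == days for s in (runoff, et, perc, ret_flow)):
        raise ValueError("all daily series must have the same length")
    sw = sw0
    for r, q, e, p, qr in zip(rain, runoff, et, perc, ret_flow):
        sw += r - q - e - p - qr  # daily term of the summation
    return sw


# Hypothetical three-day example (values invented for illustration).
sw_t = soil_water_balance(
    sw0=50.0,
    rain=[10.0, 0.0, 25.0],
    runoff=[1.0, 0.0, 6.0],
    et=[3.0, 4.0, 3.0],
    perc=[2.0, 1.0, 5.0],
    ret_flow=[0.0, 0.0, 1.0],
)
print(sw_t)  # 59.0
```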

The model also calculates the surface run-off generated in the basin by the total daily rainfall using the Soil Conservation Service (SCS) curve number (CN) method as follows:

$$Q_{\text{surf}} = \frac{{\left( {R_{i} - 0.2s} \right)^{2} }}{{\left( {R_{i} + 0.8s} \right)}}$$

where Qsurf is the daily surface run-off (mm) and s is the parameter of soil water retention (mm), which varies according to the land use, soil type, and slope and is obtained by the following equation:

$$s \, = \, 25.4\left( {\frac{1000}{\text{CN}} - 10} \right)$$

where CN is the curve number for the day, which corresponds to the run-off potential for each soil type and can range from 1 (high permeability) to 100 (impermeable soil).
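The curve number calculation can be sketched in a few lines. The snippet below uses the standard SCS form of the retention parameter, s = 25.4(1000/CN − 10), and assumes, as in the standard method, that no run-off is generated until rainfall exceeds the initial abstraction 0.2s. The function names and the sample inputs are illustrative, not taken from the study.

```python
def retention_parameter(cn):
    """Retention parameter s (mm) for a curve number in (0, 100]."""
    if not 0 < cn <= 100:
        raise ValueError("CN must be in the interval (0, 100]")
    return 25.4 * (1000.0 / cn - 10.0)


def surface_runoff(rain_mm, cn):
    """Daily surface run-off (mm) from daily rainfall (mm) and CN."""
    s = retention_parameter(cn)
    initial_abstraction = 0.2 * s
    if rain_mm <= initial_abstraction:
        return 0.0  # all rainfall is retained before run-off starts
    return (rain_mm - initial_abstraction) ** 2 / (rain_mm + 0.8 * s)


# Hypothetical inputs: CN = 80 gives s = 63.5 mm.
print(retention_parameter(80))
print(round(surface_runoff(50.0, 80), 2))
print(surface_runoff(10.0, 80))
```

With these assumed inputs, a 50 mm event yields roughly 13.8 mm of run-off, whereas a 10 mm event stays below the initial abstraction of 12.7 mm and produces none.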

The SWAT model uses the Modified Universal Soil Loss Equation (MUSLE) for the determination of sediment yield, allowing the simulation of the amount of sediment generated for an event. The sediment yield is calculated by:

$$Y \, = \, 11.8\left( {Q_{\text{sup}} \times q_{p} \times {\text{area}}_{\text{hru}} } \right)^{0.56} \times K_{\text{USLE}} \times {\text{LS}}_{\text{USLE}} \times C_{\text{USLE}} \times P_{\text{USLE}} \times {\text{CFRG}}$$

where Y is the sediment yield (t); Qsup is the run-off volume (m3); qp is the peak run-off rate (m3/s); areahru is the area of the hydrological response unit (ha); KUSLE is the soil erodibility factor; CUSLE is the land use and management factor; PUSLE is the conservation practices factor; LSUSLE is the terrain topography factor; and CFRG is the coarse fragment factor.
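A direct transcription of the MUSLE expression is sketched below to show how the run-off energy term and the USLE factors combine. The function name, argument names, and the sample factor values are assumptions made for illustration; the units follow the symbol definitions given in the text.

```python
def musle_sediment_yield(q_sup, q_peak, area_hru, k, ls, c, p, cfrg):
    """Event sediment yield Y from the MUSLE expression.

    q_sup: run-off volume, q_peak: peak run-off rate, area_hru: HRU area,
    k, ls, c, p: USLE erodibility, topography, cover, and practice
    factors, cfrg: coarse fragment factor.
    """
    values = (q_sup, q_peak, area_hru, k, ls, c, p, cfrg)
    if min(values) < 0:
        raise ValueError("all inputs must be non-negative")
    runoff_energy = (q_sup * q_peak * area_hru) ** 0.56
    return 11.8 * runoff_energy * k * ls * c * p * cfrg


# Hypothetical factor values, invented for illustration only.
bare = musle_sediment_yield(20.0, 1.5, 100.0, k=0.3, ls=1.2, c=0.5, p=1.0, cfrg=0.9)
covered = musle_sediment_yield(20.0, 1.5, 100.0, k=0.3, ls=1.2, c=0.05, p=1.0, cfrg=0.9)
print(bare > covered)
```

Because the cover factor enters as a simple multiplier, reducing it by a factor of ten (denser vegetation in this hypothetical case) reduces the estimated yield by the same factor for an identical run-off event.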

Automatic discharge calibration

The automatic calibration in the SWAT model was performed using the SWAT-CUP public domain software, developed by Abbaspour et al. (2007), for automated calibration adjustment. The SWAT-CUP has five algorithms: (a) sequential uncertainty fitting (SUFI-2) (Abbaspour et al. 2007), (b) generalized likelihood uncertainty estimation (GLUE), (c) parameter solution (ParaSol), (d) Markov chain Monte Carlo (MCMC), and (e) particle swarm optimization (PSO).

In this work, the analysis of sensitivity, calibration, and verification of parameters was conducted using the SUFI-2 algorithm because this algorithm is one of the most used in the automatic calibration of this model for several basins in the semi-arid region (Bressiani et al. 2015; Da Silva et al. 2018). Table 3 shows the optimized values of the 19 input parameters of the SWAT model that were used in the calibration step. There were 1000 iterations for the fitting between the observed and simulated discharges for each streamflow gauge station in the calibration phase. After this step, the model was validated using the discharge data from January 1994 to December 2014 for both streamflow gauges.

Table 3 Fitted values of parameters used in the SWAT model calibration

Statistical analysis of the SWAT model performance

Two statistical methods were used for the evaluation of the comparison between observed and simulated discharges: (a) Nash–Sutcliffe efficiency coefficient—NSE (Nash and Sutcliffe 1970) and (b) coefficient of determination (R2). The NSE analyses the behaviour of the simulated data compared with the observed data and can range from − ∞ (negative infinity) to 1, in which NSE = 1 indicates a perfect fit. This coefficient is calculated by:

$${\text{NSE}} = 1 - \frac{\sum\limits_{i = 1}^{n} \left( x_{i} - y_{i} \right)^{2}}{\sum\limits_{i = 1}^{n} \left( x_{i} - x_{m} \right)^{2}}$$

where xi is the observed event; yi is the event simulated by the model; xm is the mean of the event observed in the simulation period; and n is the number of events.
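The NSE can be computed with a short function such as the one sketched below. The function name and the sample series are illustrative assumptions, not values from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency of simulated against observed values."""
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / n
    squared_error = sum((x - y) ** 2 for x, y in zip(observed, simulated))
    obs_variance = sum((x - mean_obs) ** 2 for x in observed)
    if obs_variance == 0:
        raise ValueError("observed series has zero variance")
    return 1.0 - squared_error / obs_variance


# Hypothetical discharges, invented for illustration.
obs = [1.0, 2.0, 3.0]
print(nash_sutcliffe(obs, obs))              # 1.0, perfect fit
print(nash_sutcliffe(obs, [2.0, 2.0, 2.0]))  # 0.0, no better than the mean
```

The second call illustrates the usual reading of NSE = 0: a simulation that always returns the observed mean has no skill beyond that mean, and negative values indicate a simulation that performs worse than it.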

The coefficient of determination measures the proportion of the variance in the observed data that is explained by the simulated data, ranging from 0 (no association) to 1 (perfect association), and is obtained by the equation:

$$R^{2} = \left( \frac{\sum\nolimits_{i = 1}^{n} \left( y_{i} - y_{m} \right)\left( x_{i} - x_{m} \right)}{\sqrt{\sum\nolimits_{i = 1}^{n} \left( y_{i} - y_{m} \right)^{2} \sum\nolimits_{i = 1}^{n} \left( x_{i} - x_{m} \right)^{2}}} \right)^{2}$$

where ym is the mean of the values calculated by the model.
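A minimal sketch of this statistic follows, computed as the squared Pearson correlation: the sum of cross-deviations divided by the square root of the product of the two sums of squared deviations. The function name and the sample series are illustrative assumptions.

```python
import math


def coefficient_of_determination(observed, simulated):
    """Squared Pearson correlation between observed and simulated values."""
    n = len(observed)
    if n == 0 or n != len(simulated):
        raise ValueError("series must be non-empty and of equal length")
    x_mean = sum(observed) / n
    y_mean = sum(simulated) / n
    cross = sum((y - y_mean) * (x - x_mean)
                for x, y in zip(observed, simulated))
    ss_x = sum((x - x_mean) ** 2 for x in observed)
    ss_y = sum((y - y_mean) ** 2 for y in simulated)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("both series must have non-zero variance")
    return (cross / math.sqrt(ss_y * ss_x)) ** 2


# Hypothetical series: the simulation is exactly twice the observation.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
print(round(coefficient_of_determination(obs, sim), 6))
```

In this hypothetical case R2 equals 1 even though every simulated value is double the observed one. R2 responds only to linear association and not to bias, which is one reason it is usually reported together with the NSE.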

Results and discussion

Hydrological simulation

The Epitácio Pessoa Dam river basin was divided into 60 sub-basins and 853 hydrological response units (HRUs) in an area of 12,406 km2 (Fig. 5). Figure 6 shows the hyetograph and hydrographs of the simulated and observed discharges in the calibration for the Poço de Pedras and Caraúbas stations, respectively. In general, good agreement between the simulated and the observed discharges was observed, which was also statistically proven based on the objective functions used.

Fig. 5 Epitácio Pessoa Dam river basin and sub-basins used in the model

Fig. 6 Comparison of observed and simulated monthly streamflow: (a) Poço de Pedras and (b) Caraúbas

The results show that the base discharge was overestimated during the first years of the series, while the discharge peaks were underestimated. This deviation between the observed and estimated discharges is due to the difficulty of adjusting SWAT parameters to represent the hydrological processes in the semi-arid region of Brazil, which has very shallow soils, a crystalline basement that often outcrops at the surface, and a diverse geomorphology comprising different landforms. The difficulty in fitting the simulated to the observed hydrographs is also due to conflicts between water use and land occupation and the intensive use of natural resources as a whole. In addition, groundwater recharge in the semi-arid region of north-eastern Brazil is limited, mainly because of the geological characteristics of the region, which influence soil formation and produce shallow soils with rock outcrops from the crystalline basement (Da Silva et al. 2016).

According to Fig. 6, the simulation presents errors for the extreme values because of the extremely low discharges and heavy rainfall events that usually occur in the semi-arid region of Brazil. The results show the poor performance of the SWAT model during the wet seasons, which hinders its application to basins largely characterized by high flows, in contrast to the results obtained by Gebremariam et al. (2014) and Zhang et al. (2015). The largest extreme events were poorly predicted because the model parameters related to groundwater had to be set to zero so that the model could represent the minimum discharges in this river basin well. Thus, in this study, the entire series was fitted in order to represent all the extreme events (minimum and maximum) as well as possible, so that a more realistic picture of the annual water balance for a Brazilian semi-arid river basin could be obtained.

Table 4 shows the comparison between the observed and simulated series for the two analysed streamflow gauge stations. It was observed that the two series for the Poço de Pedras station were lower than those for the Caraúbas station. The variation of the mean discharges for the Poço de Pedras station after the calibration was 7%, while it was 105% in the initial modelling. The variation for the Caraúbas station was 28% after the calibration, and in the initial modelling, it was 497%, showing a good model fit. It is evident that the differences between the maximum simulated and observed discharges were also notably reduced, although the model underestimated the discharges in both cases.

Table 4 Comparison between observed and simulated data in the model calibration

The underestimation of the simulated data compared with the observed discharge data may be related to the existence of reservoirs along the boundaries of the study area, which were not included in the modelling due to the lack of information on their sizes. Thus, for future research, it is recommended that the small reservoirs existing along the boundaries of the studied basin, as well as the Epitácio Pessoa Dam, should be included in the modelling in order to obtain more precise and more consistent results on the hydrological simulation. In this sense, the fact that the simulated discharges, for the most part, underestimated the observed discharges can be justified by faults in the parameterization of the basin, especially regarding the lack of inclusion of these reservoirs in the modelling.

The model performance results obtained through the objective functions are indicated in Table 5, where it is observed that the model performed well, presenting values within the preset limits that were adequate (R2 > 0.6 and NSE > 0.5) according to Moriasi et al. (2007). The results obtained for both periods show that the SWAT model overestimated the observed mean discharge, i.e. 5.28 m3/s for the Poço de Pedras station and 7.28 m3/s for the Caraúbas station during the calibration, and 3.65 m3/s for the Poço de Pedras station and 4.11 m3/s for the Caraúbas station during the validation.

Table 5 Objective functions in the calibration and validation steps

The model validation provided good results for the Poço de Pedras station (R2 = 0.87 and NSE = 0.80); however, for the Caraúbas station, the values found were 0.56 and 0.41, respectively, which were slightly less than desirable. The results for the Poço de Pedras station were within the range of values established as acceptable, but the values indicated for the Caraúbas station, although very close to acceptable, were below the allowable minimum.
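The acceptance check against the limits of R2 > 0.6 and NSE > 0.5 can be sketched as follows. The validation values are those reported in the text for the two stations; the function and variable names are illustrative.

```python
R2_MIN = 0.6   # minimum acceptable coefficient of determination
NSE_MIN = 0.5  # minimum acceptable Nash-Sutcliffe efficiency


def is_acceptable(r2, nse):
    """Return True when both statistics exceed the acceptance limits."""
    return r2 > R2_MIN and nse > NSE_MIN


# Validation statistics (R2, NSE) reported for each streamflow gauge.
validation = {
    "Poço de Pedras": (0.87, 0.80),
    "Caraúbas": (0.56, 0.41),
}

for station, (r2, nse) in validation.items():
    status = "acceptable" if is_acceptable(r2, nse) else "below limits"
    print(f"{station}: R2={r2:.2f}, NSE={nse:.2f} -> {status}")
```

Requiring both statistics to pass makes the check conservative: the Caraúbas validation falls short on each criterion, by 0.04 for R2 and by 0.09 for NSE.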

Water balance of the Epitácio Pessoa Dam river basin

The water balance of the Epitácio Pessoa Dam river basin was simulated using the SWAT model. Table 6 shows the mean water balance values for the Epitácio Pessoa Dam river basin from the simulation based on the Poço de Pedras and Caraúbas streamflow gauge stations. Figure 7 shows the water balance results from the Epitácio Pessoa Dam river basin from 1970 to 1990. Based on Table 6 and Fig. 7, it is observed that the estimated actual evapotranspiration was 68% of the total rainfall in the basin. Percolation represented 25% of this total; however, only 0.01% arrives as recharge to the deep aquifer due to the crystalline basement present in the region. The estimated discharge that reached the channels represented 7% of the total rainfall, with only 2% of base flow and 5% of streamflow.
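The reported shares can be checked for closure and converted into water depths with a few lines of code. The percentages below are those stated in the text; the annual rainfall of 500 mm is a purely hypothetical value chosen for illustration and is not a figure from this study.

```python
# Shares of total rainfall reported for the basin (percent).
shares_pct = {
    "actual evapotranspiration": 68,
    "percolation": 25,
    "discharge to channels": 7,
}
# Components of the discharge that reaches the channels (percent of rainfall).
discharge_pct = {"base flow": 2, "streamflow": 5}


def depths_mm(rain_mm, shares):
    """Convert percentage shares of rainfall into depths (mm)."""
    return {name: rain_mm * pct / 100.0 for name, pct in shares.items()}


total_pct = sum(shares_pct.values())
print(total_pct)  # 100, so the reported partition closes

ASSUMED_RAIN_MM = 500.0  # hypothetical annual rainfall, for illustration
depths = depths_mm(ASSUMED_RAIN_MM, shares_pct)
for name, depth in depths.items():
    print(f"{name}: {depth:.0f} mm")
```

The three shares sum to 100%, and the two discharge components sum to the 7% that reaches the channels, so the reported partition is internally consistent.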

Table 6 Variables obtained for the water balance in the Epitácio Pessoa Dam river basin

Fig. 7 Water balance in the Epitácio Pessoa Dam river basin between 1970 and 1990

Estimation of sediment yield

The estimated sediment yield was calculated for the entire data series with the exception of the first 3 years of the model warm-up. Figure 8 shows the relationship between observed rainfall data and calculated sediment yield. It is noteworthy that the rainfall events that yielded sediments above 2 ton/ha/month were above 200 mm for the Poço de Pedras station; however, many rainfall events yielded erosion of less than 2 ton/ha/month. Some relatively high values were observed at the Caraúbas station, such as a rainfall event below 300 mm that yielded 12 ton/ha/month of sediments, and once again, it was observed that most rainfall events yielded less than 2 ton/ha/month. It was observed that the sediment yield was approximately zero at both stations for rainfall events of up to 200 mm. The highest sediment yield values for the Caraúbas station may be related to failures in the modelling process because this station had a slightly poorer performance than the Poço de Pedras station during calibration.

Fig. 8 Relationship between rainfall and the estimated sediment yield for (a) Poço de Pedras and (b) Caraúbas, and between observed streamflow and the estimated sediment yield for (c) Poço de Pedras and (d) Caraúbas (1970–1990)

Figure 8 shows the behaviour of the relationship between the observed discharge data and the calculated sediment yield. Based on the results, it was observed that the run-off–erosion ratio was not high (R2 = 0.34 and 0.20 for the Poço de Pedras and Caraúbas stations, respectively). The relationship between rainfall and the calculated sediment yield was also not high (R2 = 0.21 and 0.12 for the Poço de Pedras and Caraúbas stations, respectively) (Fig. 8). These results may be associated with the construction of small dams along the drainage network because the interception caused by dams influences the volume of water that is dammed along the river course, changing the natural flow of the water drained in the basin.

Regarding the spatial distribution of sediment yield, the central-eastern and south-western portions of the basin are the areas with the lowest sediment yield values, with mean values ranging between 0.02 and 1.17 ton/ha/month (Fig. 9). These results can be explained by the influence of shrub and tree cover and the low slopes that exist in these regions, which contribute to reducing soil erosion. Silva and Medeiros (2014) studied the relationship between sediment yield and soil surface cover at a 100-m2 erosion plot scale in experimental basins in the Paraíba semi-arid region, both of which were included in the study area. They reported that a plot with revolved soil is able to generate streamflow approximately five times higher than a plot with the presence of rangebrush vegetation (native caatinga) and the same slopes. This finding shows the importance of the preservation of the caatinga vegetation for soil protection from the erosive processes in the semi-arid portion.

Fig. 9 Spatial distribution of mean sediment yield in the Epitácio Pessoa Dam river basin between 1970 and 1990

The southern region of the basin (sub-basins 56 and 59) is the portion with the highest rates of sediment yield. The rangebrush vegetation predominates, and exposed soil is present in these sub-basins. The southern region also has the steepest slopes of the basin. The strong presence of exposed soil and the presence of Lithosols, which have low organic matter content and generally occur in semi-arid areas with steeper slopes (Aragão et al. 2013), intensify the erosion in this area because the terrain has little vegetation cover and the soil has low infiltration capacity. In these sub-basins, the calculated sediment yield was above 12.06 ton/ha/month.


Conclusions

The semi-distributed hydrological model, SWAT, was applied to assess water balance and sediment yield in the Epitácio Pessoa Dam river basin. The model was able to represent the hydrological processes within a semi-arid region of Brazil. The simulated, spatially distributed water balance and erosion results helped to identify the areas that are critical for water deficit and the erosion-prone areas in different parts of the basin.

The calibration and validation procedures estimated water balance components and monthly streamflow and generally showed satisfactory model performance, although the model overestimated the low streamflows and underestimated some peak events, including the high flows of some periods. At the same time, the model simulated some peak streamflows during intense rainfall while the corresponding observed streamflows were low. The water balance of the Epitácio Pessoa Dam river basin was analysed, which enabled the visualization of the hydrological behaviour of the basin as a whole. The SWAT Check tool was used for the analysis; it reads the output data of the SWAT model and provides the component values of the water balance as a figure. The results of the water balance analysis indicated a high rate of evapotranspiration in the basin, where 68% of the precipitation was lost to evapotranspiration, which is expected given the climate characteristics of the study area, and 7% of the total precipitation was converted into a discharge composed of 5% from the streamflow and 2% from the base flow.

Finally, the SWAT model calibrated for the Epitácio Pessoa Dam river basin can be used as a management tool, helping to understand the hydrological processes that occur in the basin. It can also be used to guide decision-making, considering the importance of the Epitácio Pessoa Dam for the State of Paraíba, both for the municipalities that depend on its waters for human supply and for its economic importance as a water source for industries and agribusiness.