Actual Precipitation Index (API) for Drought Classification

The Standard Precipitation Index (SPI) is a widely used statistical technique for the characterization of droughts. It is based on a probabilistic standardization procedure, which converts a Gamma-type probability distribution function (PDF) into a standard normal (Gaussian) series with zero mean and unit standard deviation. Drought classification based on SPI indicates dry and wet spell characteristics only if the hydro-meteorological records follow a normal (Gaussian) PDF; otherwise, the results are biased. Therefore, in this paper, the actual precipitation index (API) method is presented, which provides drought classification and information regardless of the underlying PDF. The main purpose of this paper is to explain the main differences between SPI and API and to prove that the use of API is the more reliable solution for classification of droughts into five categories described as “Normal dry”, “Slightly dry”, “Medium dry”, “Very dry” and “Extremely dry”. The application of the methodology is presented for two sets of precipitation data: monthly precipitation records from Istanbul City, Turkey, which follow an exponential PDF, and annual precipitation records from New Jersey, USA, which follow an almost normal (Gaussian) PDF. The comparisons indicate that API is applicable regardless of the underlying PDF of the hydro-meteorological data. It produces real drought classification from the original data without recourse to standard normal PDF conversion.


Introduction
Drought phenomena are forms of natural hazards that start to manifest when meteorological droughts set in, leading to hydrological and agricultural droughts. Droughts cause water stress and scarcity, which makes careful management of water resources essential, and they are often the cause of the most dangerous famines in several regions of the world (Lana et al. 2001; Mishra and Singh 2010). In extreme cases, dry spells and persistent meteorological, hydrological, and agricultural droughts have pernicious effects on socioeconomic and agricultural activities and seriously damage hydro-electric power generation capacity. Droughts occur when, for successive months (1, 3, 6 months) or years (12 months), there is less than average precipitation in an area; they are influenced not only by a precipitation deficit but also by temperature, wind speed and soil moisture (Rhee and Cho 2016). Eum et al. (2017) present a statistical downscaling methodology to bridge the gap between climate model output and the required climate information, based on climate scenario projections that implement trend preservation.
The major driving force causing prolonged drought periods is a reduction in the amount of precipitation, which, in recent times, appears to be due to the impacts of global warming and climate change, following increases in greenhouse gas (GHG) emissions (IPCC 2007, 2014). Several adverse consequences are unavoidable in cases of extended drought periods, such as water scarcity, reduction in agricultural production, and surface flow recessions (Seckler et al. 1999; Mancosu et al. 2015). Consequently, the ability to forecast and to predict the characteristics of droughts is important to identify their initiation, frequency, and severity. In addition to the systematic evolution of climate phenomena, certain random components affect many human activities. These components include unpredictable precipitation, runoff and soil moisture, which change the hydrological cycle. It is not certain that such changes can be foreseen through probabilistic, statistical and stochastic methodologies. Along this line, Pal and Al-Tabbaa (2011) stated that hydrological cycle perturbations in response to climate change may involve frequency and intensity distortions in precipitation records, which may affect the availability and quantity of freshwater resources.
Although several drought indices exist, the most commonly used one is the Standard Precipitation Index (SPI) suggested by McKee et al. (1993) for treatment of precipitation records. It is based on the conversion of the original PDF into the normal (Gaussian) probability distribution function (PDF). For hydro-meteorological drought assessments, various drought indices have been suggested by different researchers, including the Palmer Drought Severity Index (PDSI), Rainfall Anomaly Index (RAI), Standard Precipitation Index (SPI), Standardized Runoff Index (SRI), and Standardized Precipitation Evapotranspiration Index (SPEI) (Palmer 1965; Van-Rooy 1965; McKee et al. 1993; Guttman 1999; Shukla and Wood 2008; Zhao et al. 2014; Sobral et al. 2018; Al Adaileh et al. 2019; Bhunia et al. 2020). There are numerous applications of SPI in the literature; for instance, Sobral et al. (2018) evaluated the spatial distribution of the SPI against the standardized form of the reconnaissance drought index (RDI) for intense episodes of drought in the state of Rio de Janeiro, Brazil. Methodological uncertainties of SPI concerning limited record length, trends, and outliers have been investigated with a homogeneity test on 14 Italian stations on the basis of underlying Gamma PDFs (Carbone et al. 2018). Wu et al. (2020) investigated spatiotemporal drought variations in mainland China based on the SPEI at 1- and 12-month timescales.
The main purpose of this paper is to highlight the deficiency in the classical SPI methodology, which can yield reliable results only when the original record series has a symmetric or normal (Gaussian) PDF. Otherwise, SPI classifications are deficient. This work presents a procedure to avoid such deficiencies using the actual precipitation index (API) method. Rather than transforming the PDF of the original time series data into the standard normal PDF, this methodology converts the quantiles of the standard normal PDF to the original PDF. The application of the API method is presented for two precipitation data sets, monthly precipitation records for Istanbul City, Turkey, and annual precipitation records for New Jersey, and the results are compared with classical SPI results.

Features of the Standard Precipitation Index (SPI)
Specifically, SPI is defined on the basis of standard normal probability distribution function (PDF) variables at the 0.00, −0.50, −1.00, −1.50 and −2.00 levels, corresponding, respectively, to "normal dry", "slightly dry", "medium dry", "very dry" and "extremely dry" classifications (McKee et al. 1993). It is applied at many time scales, including the most frequently used ones (1, 3, 6, 12, 24 and 48 months), as well as at the annual level, coupled with moving averages. As stated by Hayes et al. (1999), the advantage of SPI over other drought indices stems from its simple calculation procedure applied to precipitation records only, without any need for additional hydro-meteorological variables. For instance, the Palmer drought severity index (PDSI) (Palmer 1965) requires several such variables. Mishra and Desai (2005) showed that SPI suitably describes most types of drought events.
In almost all SPI studies, the original records are represented by the two-parameter Gamma PDF, which is one of the most suitable forms for monthly precipitation records, especially in humid regions. In semi-arid and arid regions, the Gamma PDF may not be suitable because of zero precipitation records. It is also shown in this paper that SPI can be applied to any PDF type, because, after all, it involves the conversion of a certain PDF to a standard normal (Gaussian) PDF. Along this line, Guenang and Kamga (2014) proposed a full process of distribution selection, fitting many distribution functions to the data and using an appropriate statistical test to select the best fit for calculating the SPI at time scales of 1, 3, 6, 12, 18, and 24 months.
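As an illustration of this conversion, the following Python sketch applies the probabilistic standardization z = Φ⁻¹(F(x)) to a short series and labels each value with a dry class. It rests on several assumptions that are not taken from the paper: an exponential CDF whose scale is estimated by the sample mean stands in for the fitted PDF (any other fitted CDF could be substituted), each class is taken to extend from its SPI level down to the next lower level, and the function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean

# Upper limits of the dry classes on the SPI scale (levels from McKee et al. 1993).
# Assumption: a class covers the range from its level down to the next level.
DRY_CLASSES = [
    (-2.0, "Extremely dry"),
    (-1.5, "Very dry"),
    (-1.0, "Medium dry"),
    (-0.5, "Slightly dry"),
    (0.0, "Normal dry"),
]


def spi_exponential(values):
    """Probabilistic standardization with an exponential CDF (illustrative choice)."""
    scale = fmean(values)  # exponential scale estimated by the sample mean
    std_normal = NormalDist()
    z_values = []
    for x in values:
        p = 1.0 - math.exp(-x / scale)  # cumulative probability of x
        p = min(max(p, 1e-6), 1.0 - 1e-6)  # keep inv_cdf finite for zero records
        z_values.append(std_normal.inv_cdf(p))
    return z_values


def classify(z):
    """Return the dry class of an SPI value, or 'Not dry' for positive values."""
    for upper_limit, label in DRY_CLASSES:
        if z <= upper_limit:
            return label
    return "Not dry"


# Hypothetical monthly precipitation totals (mm), for demonstration only.
sample = [0.0, 4.2, 12.5, 30.1, 48.0, 75.3, 110.9]
for x, z in zip(sample, spi_exponential(sample)):
    print(f"x = {x:6.1f}  SPI = {z:6.2f}  {classify(z)}")
```

The transform is monotonic, so the ranking of the records is preserved, but the spacing between values on the SPI scale no longer reflects the spacing of the original precipitation amounts.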
Some of the critical interpretations in the calculation steps of SPI applications are presented briefly as follows.
1. The basis of the method depends mostly on the two-parameter Gamma PDF (Thom 1958), but other PDFs can be taken into consideration.
2. After an appropriate PDF is fitted to the original data, the probabilities are calculated for each data value.
3. These probabilities are converted into a series of standard normal (Gaussian) PDF variates. This step is referred to as the probabilistic standardization procedure, which is different from the statistical standardization procedure, where the mean value of the original data set is subtracted from each data value, leading to a sequence of deviations. Subsequently, these deviations are divided by the standard deviation; hence, the final sequence has zero mean and unit variance. To compare the two standardization procedures, Fig. 1 presents a synthetic time series with a theoretical two-parameter Gamma PDF in Fig. 2.

Published in partnership with CECCR at King Abdulaziz University
The SPI methodology converts the skewed PDF to standard normal (Gaussian) PDF variates, which are symmetrical. In this transformation, the statistical structure of the original series changes significantly, especially the serial correlation coefficient (see Fig. 3a). The statistical standardization procedure, in contrast, does not change the serial correlation structure of the original time series. Furthermore, the SPI probabilistic standardization does not yield zero mean and unit standard deviation for small sample sizes.
The SPI does not provide perfect standardization with zero mean and unit variance; rather, it leads to the normalization of the original data. The question is: on which type of standardization should the drought categorization be based? If the SPI probabilistic one is preferred, then, in addition to the probabilistic standardization, normalization is one of the major characteristics of the SPI method.
4. In the SPI series, it is not possible to identify quantitative actual drought features at different levels, such as dry period duration, magnitude or intensity, because the features of the SPI data are not similar to those of the original data series. For such quantification to work, it is preferable to base the dry and wet spell classifications on the statistical standardization procedure.
5. In the SPI method, drought index classifications can be achieved linguistically, but in the actual precipitation index (API) method (see Sect. 3) actual data-based classifications are obtained without standardization or normalization procedures.
6. As will be explained in the following section, the main difference between the SPI and API methodologies is that the former depends on the standard normal PDF values, whereas the API considers the actual probabilities (column 3 of Table 1) that correspond to the SPI values in column 2 of Table 1.
7. In small sample sizes, the SPI values do not have zero mean and unit variance, a property that holds only for large sample sizes. This implies a bias in SPI when comparing

Actual Precipitation Index (API) Method
The API method can reflect the drought characteristics of the original time series records after execution of the following steps.
1. Fit the most suitable cumulative distribution function (CDF) to the given data. The CDF can be of any type, not only that of the two-parameter Gamma PDF.
2. Consider the classification boundaries of the SPI method (column 2 in Table 1) and calculate their corresponding cumulative probabilities, p (column 3 in Table 1). These probabilities can be calculated with available software such as Matlab in two steps:
   a. Convert the SPI class limits to probabilities under the standardized normal PDF.
   b. Take the resulting probabilities as the API classification limits. These values are shown in Fig. 4 for the standard normal PDF with zero mean and unit variance.
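The conversion of the SPI class limits to probabilities can be sketched as follows. This is a minimal Python illustration using only the standard library, not the Matlab statements of the original study; the dictionary and function names are hypothetical, and the class limits are the five dry levels quoted from McKee et al. (1993).

```python
from statistics import NormalDist

# SPI class limits on the standard normal axis (dry side only).
SPI_LIMITS = {
    "Normal dry": 0.0,
    "Slightly dry": -0.5,
    "Medium dry": -1.0,
    "Very dry": -1.5,
    "Extremely dry": -2.0,
}


def api_probabilities(limits=SPI_LIMITS):
    """Cumulative probability of each SPI class limit under the standard normal CDF."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return {label: std_normal.cdf(z) for label, z in limits.items()}


for label, p in api_probabilities().items():
    print(f"{label:14s} SPI limit = {SPI_LIMITS[label]:5.2f}  p = {p:.4f}")
```

Rounded to four decimals, the probabilities are 0.5000, 0.3085, 0.1587, 0.0668 and 0.0228. They depend only on the standard normal CDF, so they are the same for every data set and time scale.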
It is obvious from this figure that entering the SPI classification boundaries on the horizontal axis yields the corresponding API probabilities on the vertical axis. These API probabilities correspond to actual data values in the case of a two-parameter Gamma or any other type of PDF. For instance, if, hypothetically, the two parameters of the Gamma PDF are taken as 2 and 5, the original data values can be obtained as follows:
1. Enter the probability values from Table 1 in the inverse of the two-parameter Gamma CDF and calculate the corresponding data values using any computer programming language. In this case, Matlab program statements are used.
2. Figure 5 indicates the procedural structure, with the API probabilities on the vertical axis and the corresponding API classification boundaries on the horizontal axis.
3. These boundaries are treated as truncation levels on the original data series to calculate, if necessary, the dry and wet period durations, total dry amounts (deficits) and maximum deficit (Mishra and Singh 2010). This cannot be achieved with classical SPI studies, because the standardization-normalization procedure causes a loss of the probabilistic and statistical features of the original data.
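Step 1 above can be sketched in Python as follows. The sketch assumes that the two hypothetical Gamma parameters are a shape of 2 and a scale of 5 (the order used by Matlab's Gamma functions); this reading is an assumption, as are the function names. With a shape of 2 the Gamma CDF has the closed form F(x) = 1 − exp(−x/θ)(1 + x/θ), so the inverse can be found by bisection without any external library.

```python
import math
from statistics import NormalDist

SPI_LIMITS = {
    "Normal dry": 0.0,
    "Slightly dry": -0.5,
    "Medium dry": -1.0,
    "Very dry": -1.5,
    "Extremely dry": -2.0,
}


def gamma2_cdf(x, scale):
    """CDF of a Gamma distribution with shape 2 and the given scale."""
    if x <= 0.0:
        return 0.0
    t = x / scale
    return 1.0 - math.exp(-t) * (1.0 + t)


def gamma2_quantile(p, scale, tol=1e-10):
    """Invert gamma2_cdf by bisection for a probability 0 < p < 1."""
    low, high = 0.0, scale
    while gamma2_cdf(high, scale) < p:  # widen the bracket until it contains p
        high *= 2.0
    while high - low > tol:
        mid = 0.5 * (low + high)
        if gamma2_cdf(mid, scale) < p:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)


def api_boundaries(scale):
    """Data values whose cumulative probability equals that of each SPI limit."""
    std_normal = NormalDist()
    return {
        label: gamma2_quantile(std_normal.cdf(z), scale)
        for label, z in SPI_LIMITS.items()
    }


for label, x in api_boundaries(scale=5.0).items():
    print(f"{label:14s} API boundary = {x:.2f}")
```

The resulting boundaries are expressed in the units of the original data and are unevenly spaced, which reflects the skewness of the Gamma PDF.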

Data Used
The application is presented first for the monthly precipitation records

API Value Determination
Although the application of SPI is straightforward, the corresponding probability values for each of the 1, 3, 6, 12 and 24-monthly or annual durations need first to be determined for API application. For this purpose, the following steps need to be taken.
1. Determination of PDF: For each period, the corresponding empirical data point scatter diagram is constructed by sorting each sequence into ascending order and then attaching the empirical probability, p, to each order using the following well-known simple formulation, where n is the number of data and m is the rank in the ordered sequence:

   $$p = \frac{m}{n + 1}$$

2. Empirical probability diagram: The plotting of ordered data versus corresponding probability value yields a non-descending scatter diagram.
3. Theoretical PDF fit: The most suitable theoretical PDF is fitted to the empirical scatter diagram, thus producing the best PDF for the basic data. The goodness-of-fit is checked with Chi-Square and Kolmogorov-Smirnov tests (Stephens 1970).
4. API value determination: The theoretical PDF is used to determine the actual data values corresponding to the API probability values given in the third column of Table 1.
5. All the previous steps are repeated for data sets of 1, 3, 6, 12, 24, 48-month and annual duration. Each one results in a PDF with different parameters, so the API level values differ from each other.

Fig. 5 The procedural structure of the Gamma CDF for API calculation, with the API probabilities on the vertical axis and the corresponding API classification boundaries on the horizontal axis

SPI probability values on the standard normal CDF with dry and wet characteristics.
The completion of these steps results in the wet and dry spell API values as shown in Table 2 for the Istanbul and New Jersey precipitation records.
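The steps above can be sketched in Python for a single duration. The sketch makes several assumptions that are not from the paper: an exponential PDF is fitted by setting its scale to the sample mean, the fit is summarized by the largest gap between the empirical and fitted probabilities (a simplified Kolmogorov-Smirnov-type distance, not a full test with critical values), and all function names and sample values are hypothetical.

```python
import math
from statistics import NormalDist, fmean

SPI_LIMITS = {
    "Normal dry": 0.0,
    "Slightly dry": -0.5,
    "Medium dry": -1.0,
    "Very dry": -1.5,
    "Extremely dry": -2.0,
}


def plotting_positions(values):
    """Sort the data and attach the empirical probability p = m / (n + 1)."""
    ordered = sorted(values)
    n = len(ordered)
    return ordered, [m / (n + 1) for m in range(1, n + 1)]


def exponential_cdf(x, scale):
    """CDF of an exponential distribution with the given scale (mean)."""
    return 1.0 - math.exp(-x / scale) if x > 0.0 else 0.0


def max_probability_gap(values, scale):
    """Largest absolute gap between empirical and fitted probabilities."""
    ordered, probs = plotting_positions(values)
    return max(abs(p - exponential_cdf(x, scale)) for x, p in zip(ordered, probs))


def api_levels(scale):
    """Data values at the API probabilities, from the inverse exponential CDF."""
    std_normal = NormalDist()
    return {
        label: -scale * math.log(1.0 - std_normal.cdf(z))
        for label, z in SPI_LIMITS.items()
    }


# Hypothetical monthly precipitation totals (mm), for demonstration only.
records = [2.0, 75.3, 12.5, 48.0, 0.5, 30.1, 110.9, 21.7, 5.4, 60.2]
scale = fmean(records)  # exponential fit: scale equals the sample mean
print(f"fitted scale = {scale:.2f}, max gap = {max_probability_gap(records, scale):.3f}")
for label, level in api_levels(scale).items():
    print(f"{label:14s} API level = {level:.2f} mm")
```

Repeating the same calls on each aggregated series (3, 6, 12 months and so on) gives a different fitted scale and therefore a different set of API levels for each duration.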

SPI and API Comparison
For the Istanbul precipitation records, an exponential PDF appears to be the most suitable one (see Fig. 6). This shows that PDFs other than the two-parameter Gamma PDF can also be considered for application of the SPI methodology.
To compare the SPI and API graphs, they are presented for 1-month, 3-month, 6-month, 12-month, 24-month and 48-month durations in Fig. 7. The SPI graphs are in the left-hand column and the corresponding API graphs are in the right-hand column. The respective values are written in the boxes at the upper right-hand side of each graph.
Comparisons of SPI and API graphs help to detect and identify the following significant points:
1. There are significant differences between the SPI and API graphs for the same monthly durations.
2. SPI graphs have almost the same normal (Gaussian) PDF reflections with their systematic time series appearances around the zero level, whereas in the API graphs, the original non-symmetric forms are preserved.
3. Although SPI provides drought indices, they are biased due to their transformed values, which cannot represent the actual cases.
4. On the SPI graphs, at any classification level, although the dry and wet spell durations can be calculated, they do not reflect the actual dry and wet spell characteristics of the data. This point becomes very clear especially in Fig. 7c-f.
5. For the low values of the time series in the API graphs, some drought index classifications are absent, for instance in Fig. 7f, whereas in the SPI graphs all the classifications are present.
6. It is observed from the comparisons that the standard normal (Gaussian) PDF balances the positive and negative values symmetrically but introduces bias.
7. As mentioned earlier in this paper, the closer the original data PDF is to the normal (Gaussian) PDF, the better the SPI graphical representation will reflect the features of the original data set. However, hydro-meteorological records, especially monthly data, have skewed PDFs.
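Because the API levels are expressed in the units of the original records, dry-spell features can be read directly from the series by treating a level as a truncation level. The following Python sketch illustrates this under stated assumptions: a dry spell is taken as an uninterrupted run of values strictly below the level, the deficit of a time step is the level minus the recorded value, and the function name, field names and sample numbers are hypothetical.

```python
def dry_spells(series, level):
    """Return duration, total deficit and maximum deficit of each run below `level`."""
    spells = []
    duration, total_deficit, max_deficit = 0, 0.0, 0.0
    for value in series:
        if value < level:
            # Still inside (or entering) a dry spell: accumulate its features.
            deficit = level - value
            duration += 1
            total_deficit += deficit
            max_deficit = max(max_deficit, deficit)
        elif duration:
            # The spell has just ended: store it and reset the counters.
            spells.append(
                {"duration": duration, "total_deficit": total_deficit, "max_deficit": max_deficit}
            )
            duration, total_deficit, max_deficit = 0, 0.0, 0.0
    if duration:  # the series ends while a spell is still running
        spells.append(
            {"duration": duration, "total_deficit": total_deficit, "max_deficit": max_deficit}
        )
    return spells


# Hypothetical precipitation series and truncation level in the same units.
series = [10.0, 3.0, 2.0, 8.0, 1.0, 9.0]
for spell in dry_spells(series, level=5.0):
    print(spell)
```

For this series and a level of 5.0, the sketch finds two spells: one lasting two steps with a total deficit of 5.0 and a maximum deficit of 3.0, and one lasting a single step with a deficit of 4.0.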
As for the New Jersey annual precipitation data, they accord with the normal PDF which is obvious in Fig. 8, with a mean of 45 cm and a standard deviation of 6.06 cm. For this reason, the time series patterns in SPI and API are expected to be very similar to each other. Figure 9 shows the SPI and API graphs for New Jersey annual precipitation values. The reader can observe their similarity in pattern even though the values at the upper right hand side boxes are different. For the sake of brevity, only 1-year, 3-year, 6-year and 12-year graphs are presented.
Comparison of the 3-year and 12-year API graphs highlights the following points. Similar comparisons can be made between other graphs.
1. In the 3-year API graph, drought classifications of all types appear, but the 12-year API graph does not yield any 'extremely dry' drought classification. Although there are very severe 'extremely dry' classes in the 3-year graph, there is none in the 12-year case.
2. All dry classes in the 3-year API graph have comparatively shorter drought periods than in the 12-year API graph.
3. Dryness class intervals as well as the precipitation amounts on the vertical axes are smaller in the 3-year graph than in the 12-year graph.

Conclusions
The main concern of this paper is dry spells and droughts. The most frequently used index, the Standard Precipitation Index (SPI), is evaluated against the Actual Precipitation Index (API). The SPI transforms the original precipitation time series records into a standardized normal (Gaussian) probability distribution function (PDF) and provides classifications, whereas the API works on actual data without any transformation of the underlying PDF and gives classifications on real data values. The API and SPI comparison shows that API can be applied directly, regardless of the PDF of the hydro-meteorological data and without any conversion to the normal PDF, and that it yields, quantitatively, not only the actual drought classification in the original data but also the drought durations. The API is applied to monthly precipitation records for Istanbul, Turkey, and annual precipitation records for New Jersey, USA. Although there is much variation from year to year, overall, both the cool (November through March) and warm (May through September) seasons have been warmer recently than in the past. The API procedure establishes the necessary classification boundaries of "normal dry", "slightly dry", "medium dry", "very dry" and "extremely dry" on the actual records. It is observed that API drought classification reflects the real situation on the basis of the original hydro-meteorological records, whereas the use of SPI is restricted to the standardized normal PDF, which cannot represent the actual situation.