An analysis of temperature anomalies in Chile using fractional integration

This paper deals with the study of stationarity and mean reversion in temperature anomaly series in the southwestern part of the South American cone. In particular, monthly temperatures at 12 Chilean meteorological stations were studied (from the 1960s to the present), examining whether temperature shocks are expected to remain in the long term or whether they are reversible. The results clearly show a significant relationship between the latitude, the climate, and the order of integration of the temperatures. The orders of integration tend to be smaller in the colder southern parts, and therefore the impacts of climate change are expected to be more reversible there. However, in the northern desert areas the orders of integration are larger than 0.5, and thus impacts are expected to persist for a longer time.


Introduction
Chile is a very large country comprising three main areas (Continental Chile, the Chilean Islands and Antarctic Chile). In general terms, Continental Chile is a narrow strip of land located between latitudes 17°S and 56°S (more than 4,200 km long), lying mainly along the 70°W meridian, with a maximum width of only 420 km. It is bordered by the Pacific Ocean to the west and by the very high mountains of the Andes cordillera to the east. This particular structure has led to a wide variety of climates, where precipitation regimes are subject to large variations even along the same parallel, as altitude strongly modulates annual precipitation (Sarricolea et al. 2017). That is why the Chilean climate is of particular interest for the understanding of climate change and its behavior throughout the rest of the world.
According to the IPCC (Intergovernmental Panel on Climate Change 2014), there seems to be wide agreement in the international community about global warming, as, from a global perspective, temperatures on Earth are tending to increase. However, in Chile there is evidence of a particular effect: temperatures in the southeast Pacific and along the west coast of subtropical South America are tending to cool, while temperatures in continental Chile and the Andes cordillera keep rising (Vuille et al. 2015). In particular, the historical temperature trend in central Chile between 1979 and 2005 showed a decrease of 0.15°C per decade in the coastal areas, in contrast with an increase of 0.25°C per decade in the mountain range (Garreaud 2011). This contrast, together with the different types of climate existing in Chile due to its orography, leads to a question of particular interest regarding the impact and duration of temperature shocks in the long term and their relationship with global warming.
We first claim in the paper that the series under analysis display long memory behavior, which is a feature widely observed in climatological time series (Vyushin and Kushner 2009; Zhu et al. 2010; Franzke 2010, 2012; Rea et al. 2011; Yuan et al. 2013; Gil-Alana 2012, 2018, 2022; etc.). In addition, fractional integration is applied to check whether the structure of these anomalies is stationary and mean reverting, or nonstationary, in which case shocks are expected to remain in the long term. Moreover, the study of the degree of integration in the series might help to understand how fast the series converges back to its original long-term projection, helping climatologists in the construction of models and in the assumptions underlying future measures.
The paper is structured as follows: Sect. 2 includes a short analysis about the temperatures in Chile. Section 3 is devoted to the methodology. Section 4 describes the dataset, while Sect. 5 displays the main empirical results. Section 6 concludes the manuscript with the main implications of this study.

Temperatures in Chile
Existing literature regarding regional temperature analysis is very homogeneous. Salinger (1995) studied seasonal and annual changes in temperature in the South Pacific using a linear trend for the five decades 1941-1990. He found evidence of a general increase in both daily maximum and minimum temperatures, especially evident in the minimums. He concluded by suggesting that increased concentrations of greenhouse gases and changes in cloudiness could be a plausible mechanism for the overall increase in night-time temperatures. Vuille and Bradley (2000) and Vuille et al. (2003) documented the tendencies of air temperature anomalies from 1939 to 1998 for the tropical Andes, with evidence of a positive linear trend of +0.11°C per decade relative to the 1961-1990 mean. More recently, Marengo et al. (2011) analyzed mean annual temperatures and trends in the countries of the northern Andes (Venezuela, Colombia, Ecuador, Peru), with evidence of an increase of about 0.8°C during the twentieth century. This tendency is greater for climate stations at higher elevations and tripled over the last 25 years of the twentieth century (+0.34°C per decade), even though some variability appears to be associated with the occurrence of the El Niño Southern Oscillation (ENSO).
Other regional studies have also documented significant warming trends in the Peruvian Andes over the same period. Lavado Casimiro et al. (2013) noticed a positive trend in mean temperature of 0.09°C per decade over the region, with similar values in the Andes and the rainforest when considering average data, and with a significant number of stations showing more positive deviations over the Andes region. Salzmann et al. (2013) extended the analysis of the Andes with a study of the Cordillera Vilcanota glacier (the second largest in Peru), with the overall conclusion of a moderate increase in air temperature and marginal glacier changes between 1962 and 1985, but a massive ice loss since 1985 (about 45% of volume). Seiler et al. (2013) analyzed the Bolivian region, with homogenized daily observations of temperature and the impacts of the different climate modes in the region (Pacific Decadal Oscillation (PDO), El Niño-Southern Oscillation (ENSO) and Antarctic Oscillation (AAO)). Temperatures were found to be higher during PDO(+), El Niño and AAO(+) in the Andes, while the number of extreme events was higher during PDO(+), El Niño and La Niña in the lowlands. In general, temperatures increased at a rate of 0.1°C per decade, with stronger increases in the Andes, mainly in the dry season. Thibeault et al. (2010) analyzed maximum and minimum temperatures, extreme temperature ranges, frost days, heat wave duration index and warm nights in the Northern Altiplano area (between 16°S and 19°S). Their main results show positive trends in all four indices, while trends in warm nights and warm spells are significant, which is consistent with the work of Salinger (1995).
Regarding the specific case of Chile, Rosenbluth et al. (1997) analyzed linear temperature trends computed for the period 1933-1992, with evidence of warming rates from 1.3 to 2.0°C per 100 years. Moreover, during the last three decades warming rates appear to be twice as high as in Salinger (1995) or Thibeault et al. (2010). However, for the specific case of Chile, the generalized warming was not present around 41°S, where a cooling period from the 1950s to the 1970s still prevails. Schulz et al. (2012) analyzed climate change along the coast of northern Chile over the last century, studying the evolution of annual mean temperature anomalies at different pressure levels. Evidence was found of positive trends determined by a strong influence of ENSO, and of an abrupt upward shift in the air and ocean temperature regimes in the mid-1970s, coinciding with the change from the cold to the warm phase of the IPO. However, the analysis of a temperature record starting in 1900 revealed no similar changes during other phase changes of the IPO, although the shift appears to be similar to that reported for other regions where the temperature regime is also modulated by the IPO (Hartmann and Wendler 2005). Santana et al. (2009) analyzed the specific area of Punta Arenas, with average, maximum and minimum temperatures for the last 120 years. Evidence was found of a slight decrease in the average temperatures, but also of larger temperature amplitudes and warming/cooling periods of different durations.
More recent studies such as Burger et al. (2018), which focused on the 1979-2010 period, suggest the same evidence of significant warming trends that are widespread at inland stations, while trends are non-significant or negative at coastal sites. Hanna et al. (2017) noticed for the 1979-2006 period a strong contrast between the coastal region (surface cooling: -0.2°C per decade) and the Andes region (surface warming: 0.25°C per decade). Falvey and Garreaud (2009) suggested that this coastal cooling appears to form part of a larger-scale La Niña-like pattern and may extend below the ocean mixed layer to depths of 500 m, generating a temporary cooling in coastal areas in contrast with the general warming pattern across the globe. Meseguer-Ruiz et al. (2019) analysed temperature records in mainland Chile (18 stations from 1966 to 2015) applying Mann-Kendall's non-parametric test. Evidence of positive trends was found for both the minimum and maximum temperature series, especially during the warm months, as in previous studies. However, in the context of global warming, the diverse climatic regions of continental Chile behave differently depending on latitude and on whether stations are lowland/coastal or inland/high-elevation. Aranda et al. (2021) studied the surface temperatures of large lakes in south central Chile, with monthly, seasonal, and annual surface temperature trends during the 2000-2016 period, concluding that there was a significant increase in surface temperature, at a rate of 0.10°C/decade over the period. Orrego-Verdugo et al. (2021) analyzed temperature amplitudes in the 1985-2015 period, with evidence of an increase in maximum temperature from the north (19°S) to the south of Chile (44°S) and in the highlands, in a maximum range of 3.9%. However, the trend is negative in the southern zones (44°S-56°S), with an average value of 12%.
Nevertheless, all the above literature overlooks an important feature observed in climatological data, including temperatures, namely long memory. This is a property of time series data that implies that observations that are distant in time are highly dependent. Not taking this feature into consideration may alter the estimation results about linear (or even nonlinear) time trends in the temperatures. Thus, in the present paper, we examine trends in Chilean stations taking this important property of the data into account, this being the main contribution of the present work.

Methodology
In this paper we assume that the series under examination display long memory properties. Long memory, also called long range dependence, is a feature of time series data characterized by observations that are highly dependent across time, and it is a property very often observed in climatological data (see, e.g., Bloomfield 1992; Percival et al. 2001; Caballero et al. 2002; Koutsoyiannis 2003; Langousis and Koutsoyiannis 2006; Gil-Alana 2006; Franzke 2010; Rea et al. 2011; Bunde and Ludescher 2017; Li et al. 2021; etc.).
Within the long memory class of models, the first applied studies in climatology employed the Hurst exponent (Hurst 1951) or variants of this approach (Koutsoyiannis 2003; Langousis and Koutsoyiannis 2006; López-Lambraño et al. 2018; Chandrasekaran et al. 2019; Benavides-Bravo et al. 2021; etc.). Alternatively, a very popular model is the one based on fractional integration, which means that the number of differences to be taken in the series might be any real value, thereby potentially including fractional numbers. Thus, a stationary process {x_t, t = 0, ±1, …} is said to be integrated of order d, and denoted as I(d), if it can be represented as

$$(1 - B)^{d} x_t = u_t, \quad t = 0, \pm 1, \ldots, \tag{1}$$

where B refers to the backshift operator, i.e., B^k x_t = x_{t-k}, and where u_t is integrated of order 0, i.e., I(0), sometimes called a short memory process. Within the short memory class of models, we can include the white noise and the stationary and invertible AutoRegressive Moving Average (ARMA) class of models. In this latter context, if u_t follows an ARMA(p, q) process, x_t is said to be a fractionally integrated ARMA or ARFIMA(p, d, q) model (Sowell 1992). Recent applications of ARFIMA models in temperature and climatological data include, among others, the papers by Bhansali and Kokoszka (2003), Polotzek and Kantz (2020), Asha et al. (2021), etc. Clearly, the main advantage of this methodology is that it is more flexible and general than the classical methods employed in time series analysis, which consider only integer degrees of differencing, i.e., d = 0 in case of stationarity, e.g., ARMA models, and d = 1 in case of nonstationarity, e.g., ARIMA(p, 1, q) models. By allowing d to be any real value, these two cases are incorporated as particular cases of interest, and it permits us to consider more general situations such as stationary long memory processes (if 0 < d < 0.5) but also nonstationary mean reverting processes (if 0.5 ≤ d < 1).
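To make the fractional differencing operator more concrete, the following minimal Python sketch expands (1 - B)^d into its binomial weights and applies them to a series. The function names `frac_diff_weights` and `frac_diff` are illustrative and not taken from the paper, and treating pre-sample values as zero is an assumption of this sketch.

```python
import numpy as np

def frac_diff_weights(d, n):
    """First n coefficients pi_k of (1 - B)^d = sum_k pi_k B^k."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        # Recursion implied by the binomial expansion:
        # pi_k = pi_{k-1} * (k - 1 - d) / k
        w[k] = w[k - 1] * (k - 1 - d) / k
    return w

def frac_diff(x, d):
    """Apply (1 - B)^d to x, treating pre-sample values as zero (assumption)."""
    x = np.asarray(x, dtype=float)
    w = frac_diff_weights(d, x.size)
    # Entry t of the full convolution equals sum_{k <= t} pi_k * x[t - k]
    return np.convolve(x, w)[: x.size]

if __name__ == "__main__":
    print(frac_diff_weights(0.5, 3))       # weights of (1 - B)^0.5
    print(frac_diff([1, 3, 6, 10], 1.0))   # d = 1 reduces to first differences
```

With d = 1 the weights collapse to (1, -1, 0, ...), which is the usual first difference, while for fractional d the weights decay slowly and every past observation contributes.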
On the other hand, in order to determine if temperatures have increased over time, a classical approach consists of estimating a linear time trend of the form:

$$y_t = a + b\,t + x_t, \quad t = 1, 2, \ldots, \tag{2}$$

where y_t indicates the observed data (temperature anomalies in our case), and a and b are the coefficients referring respectively to a constant and a linear time trend. In this context, a significant positive value of b supports the hypothesis of warming temperatures across time.
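As an illustration of the classical trend regression described above, the sketch below fits the intercept and slope by ordinary least squares. It is a minimal example assuming short memory errors (which is precisely the assumption the paper later relaxes); the function name `fit_linear_trend` and the synthetic series are hypothetical.

```python
import numpy as np

def fit_linear_trend(y):
    """OLS estimates (a, b) in y_t = a + b*t + x_t, with t = 1, ..., n."""
    y = np.asarray(y, dtype=float)
    t = np.arange(1, y.size + 1, dtype=float)
    X = np.column_stack([np.ones(y.size), t])   # columns: constant, time index
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[0], coef[1]

if __name__ == "__main__":
    # Synthetic anomalies: hypothetical trend of 0.002 per period plus noise
    rng = np.random.default_rng(0)
    t = np.arange(1, 648)
    y = 0.002 * t + rng.standard_normal(t.size)
    a_hat, b_hat = fit_linear_trend(y)
    print(f"a = {a_hat:.3f}, b = {b_hat:.5f}")
```

The slope b is expressed in units of the anomaly per time step, so for monthly data it is a change per month.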
Using the above two equations and taking also into account the potential seasonal (monthly) nature of the data, our model specification is based on the following equation:

$$y_t = a + b\,t + x_t, \qquad (1 - B)^{d} x_t = u_t, \qquad u_t = \phi\, u_{t-12} + \varepsilon_t, \tag{3}$$

where y_t is the time series we observe, a and b are unknown coefficients for the intercept and the linear trend; the regression errors x_t are I(d), and u_t is a seasonal (monthly) AR(1) process where φ is the AR coefficient and ε_t is a white noise process. The differencing parameter d measures the degree of persistence in the data. Thus, the higher its value is, the higher the degree of dependence. Moreover, it permits us to distinguish between mean reversion and lack of it in a more flexible way than the standard methods that only use the values 0 (for stationary series and mean reversion) and 1 (for nonstationarity and lack of mean reversion). Mean reversion, with no permanent effects of shocks, occurs as long as d is smaller than 1, and the lower the value of d is, the faster the series converges to its original long-term projection. Thus, four different situations can be distinguished:
• if d = 0, the series is short memory, with shocks disappearing relatively fast.
• if 0 < d < 0.5, the series is long memory though still covariance stationary, with shocks lasting longer than in the previous case.
• if 0.5 ≤ d < 1, the series is no longer covariance stationary though it is still mean reverting, with shocks disappearing in the long run.
• finally, if d ≥ 1, the series is not mean reverting and shocks persist forever.
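The four cases listed above can be visualized through the impulse responses of an I(d) process, that is, the coefficients of (1 - B)^(-d), which measure the effect of a unit shock k periods later. The following sketch is only illustrative: the names `impulse_responses` and `regime` are hypothetical, and the seasonal AR component of the model is ignored for simplicity.

```python
import numpy as np

def impulse_responses(d, n):
    """First n coefficients psi_k of (1 - B)^(-d) = sum_k psi_k B^k."""
    psi = np.empty(n)
    psi[0] = 1.0
    for k in range(1, n):
        # psi_k = psi_{k-1} * (k - 1 + d) / k
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

def regime(d):
    """Classify d (assumed non-negative) into the four cases in the text."""
    if d < 0:
        raise ValueError("this sketch only covers d >= 0")
    if d == 0:
        return "short memory"
    if d < 0.5:
        return "stationary long memory"
    if d < 1:
        return "nonstationary, mean reverting"
    return "not mean reverting"

if __name__ == "__main__":
    for d in (0.0, 0.13, 0.6, 1.0):
        psi = impulse_responses(d, 121)
        print(f"d = {d:4.2f} ({regime(d)}): response after 120 periods = {psi[120]:.4f}")
```

For d < 1 the responses decay towards zero (the smaller d, the faster), whereas for d = 1 they stay at one, so the shock never dies out.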
The estimation of d is conducted via the Whittle function in the frequency domain using a parametric approach developed in Robinson (1994). Further details can be seen in Gil-Alana and Robinson (1997) for the functional form of the version of the tests of Robinson (1994) used in this application. Employing alternative methods such as Sowell (1992), Beran (1995) and others produced essentially the same type of results.
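To give a flavor of frequency-domain Whittle estimation, the sketch below estimates d for the simplest case of a pure ARFIMA(0, d, 0) series by minimizing a concentrated Whittle objective over a grid. This is not the testing procedure of Robinson (1994) used in the paper: it omits the deterministic terms and the seasonal AR component, restricts d to the stationary range, and simulates data with a truncated moving average. All function names are hypothetical.

```python
import numpy as np

def ma_weights(d, m):
    """First m coefficients of (1 - B)^(-d), used to simulate an I(d) series."""
    psi = np.empty(m)
    psi[0] = 1.0
    for k in range(1, m):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

def simulate_fi(d, n, m=1000, seed=0):
    """Simulate n values of a truncated MA(m - 1) approximation to I(d) noise."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n + m - 1)
    return np.convolve(eps, ma_weights(d, m), mode="valid")

def whittle_d(x, grid=None):
    """Grid-search Whittle estimate of d for an ARFIMA(0, d, 0) series."""
    x = np.asarray(x, dtype=float)
    n = x.size
    x = x - x.mean()
    j = np.arange(1, (n - 1) // 2 + 1)            # Fourier frequencies in (0, pi)
    lam = 2 * np.pi * j / n
    periodogram = np.abs(np.fft.fft(x)[j]) ** 2 / (2 * np.pi * n)
    log_s = np.log(2 * np.sin(lam / 2))           # spectral shape: g = exp(-2 d log_s)
    if grid is None:
        grid = np.arange(-0.49, 0.50, 0.01)       # stationary, invertible range only
    q = [np.log(np.mean(periodogram * np.exp(2 * d * log_s))) - 2 * d * np.mean(log_s)
         for d in grid]                           # scale parameter concentrated out
    return float(grid[int(np.argmin(q))])

if __name__ == "__main__":
    x = simulate_fi(0.3, 2000, seed=1)
    print("estimated d:", round(whittle_d(x), 2))
```

For series suspected of having d ≥ 0.5, a common workaround under this simplified setup is to estimate d on first differences and add one back, although the paper's own procedure handles nonstationary values directly.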

Data and empirical results
Climatological data were taken from the public site https://climatologia.meteochile.gob.cl/application/historico/temperaturaHistoricaAnual/330020, where 47 stations are available through the SACLIM system. However, not all these stations have temperature records long enough to study persistence and long-term behavior. Thus, only the 10 stations with full temperature records between January 1968 and November 2021 were selected and studied with monthly data, totaling 647 samples. The remaining stations were discarded as they have gaps or their data start after 2005. As exceptions, Diego Aracena station was also included, as its data start in January 1982 and its 479 samples were considered sufficient, along with Balmaceda, whose data end in August 2019 and include 620 samples. The location details of the selected stations are summarized in Table 1.

Sarricolea et al. (2017) analyzed the Continental Chilean climate characteristics following the Köppen-Geiger climate classification, concluding that these climates were essentially arid, temperate, and polar due to the elevation of the Andes, being predominantly high tundra (ET) and Mediterranean (Cs). Additionally, with respect to latitude, the climates of northern Chile are mostly arid due to the Atacama Desert, and those of southern Chile tend to be temperate, ranging from Mediterranean to marine west coast. In particular, the locations of the selected stations can be classified in five groups depending on their climate behavior:

Regarding the empirical results, Table 2 displays the estimates of d under the three standard assumptions (in relation to the deterministic terms) of (i) no deterministic terms, i.e., imposing a = b = 0 in the first equality in (3); (ii) an intercept only (b = 0); and (iii) an intercept and a linear time trend. We mark in bold in the table the most adequate specification for each case. In other words, we estimate the model in Eq. (3) and, if a and b both appear statistically significant, we choose that as the selected model. Otherwise, we move to a model with an intercept only, i.e., imposing b = 0, due to the insignificance of the time trend coefficient. Table 3 reports the estimated coefficients of the selected model for each series, while Fig. 2 summarizes the average value of d per region, showing evidence of a climatic or latitudinal relationship with d. Significant time trend coefficients are found in the cases of Quinta Normal, Pudahuel, Concepción, Coyhaique, Balmaceda and Punta Arenas. For the remaining cases, the two coefficients of the deterministic terms (constant and time trend) are found to be insignificant. The seasonal effect is not significant in almost any of the series, although a relatively high coefficient is observed at Quinta Normal (0.273).
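As a quick arithmetic check on the sample sizes quoted in the data description, the number of monthly observations between two dates (both ends included) can be computed as follows. The helper name `months_inclusive` is hypothetical; the date ranges are those stated in the text.

```python
def months_inclusive(start_year, start_month, end_year, end_month):
    """Number of monthly observations from start to end, both included."""
    return (end_year - start_year) * 12 + (end_month - start_month) + 1

if __name__ == "__main__":
    print(months_inclusive(1968, 1, 2021, 11))  # full-record stations
    print(months_inclusive(1982, 1, 2021, 11))  # Diego Aracena
    print(months_inclusive(1968, 1, 2019, 8))   # Balmaceda
```

These give 647, 479 and 620 observations respectively, matching the figures reported for the selected stations.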
Looking at Table 3, we see that the highest time trend coefficient corresponds to Pudahuel (0.000285), followed by Quinta Normal (0.00155), Punta Arenas (0.00151), Balmaceda (0.00122), Concepción (0.00114) and, finally, Coyhaique (0.00109). Interestingly, all the estimated values of d are found to be higher than 0, implying long memory or long-range dependence.
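To compare these coefficients with the per-decade warming rates quoted in the literature review, they can be rescaled as in the sketch below. This assumes that the time index in Eq. (3) advances by one unit per month and that the anomalies are measured in °C, neither of which is stated explicitly in the text; the coefficients are copied as printed above and the helper name is hypothetical.

```python
MONTHS_PER_DECADE = 120

# Time trend coefficients as printed in the text (assumed °C per month)
trend_per_month = {
    "Pudahuel": 0.000285,
    "Quinta Normal": 0.00155,
    "Punta Arenas": 0.00151,
    "Balmaceda": 0.00122,
    "Concepción": 0.00114,
    "Coyhaique": 0.00109,
}

def per_decade(b_monthly):
    """Rescale a per-month trend coefficient to a per-decade rate."""
    return b_monthly * MONTHS_PER_DECADE

if __name__ == "__main__":
    for station, b in trend_per_month.items():
        print(f"{station}: {per_decade(b):.3f} per decade")
```

Under these assumptions, a coefficient of 0.00155 corresponds to roughly 0.19°C per decade.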
Moreover, there appears to be a relationship between latitude and the order of integration, as can be seen in Fig. 3, with a very high R² coefficient across the 12 stations (0.94). The lowest degrees of integration, and therefore the weakest effects of shocks, correspond to the group with tundra climate (44°S-56°S), in particular Coyhaique (with an estimated value of d equal to 0.13), Balmaceda (0.14) and El Tepual (0.15). On the other hand, the highest values are those with a desert climate (17°S-

These results appear to be in line with previous studies that report small positive trends, such as Vuille and Bradley (2000) or Vuille et al. (2003, 2015) for the tropical Andes, or the recent work of Orrego-Verdugo et al. (2021) that confirms an increase in the temperatures for northern Chile. In fact, for arid climates our integration factor shows larger values (d > 0.5), implying that these climate change patterns are likely to revert only slowly if no external policies are applied. However, as noticed by Schulz et al. (2012), global warming is not present around 41°S, where our results might imply stronger mean reverting properties (d < 0.2), suggesting that this cooling pattern might continue in the near future. The physical causes of this pattern are still unclear. Meseguer-Ruiz et al. (2019) point out that this issue might be related to the negative Pacific Decadal Oscillation (PDO) phase, the eastern tropical Pacific cooling (Liao et al. 2015), and the modulation induced by the El Niño-Southern Oscillation (ENSO) on the temperatures of the South American Pacific coast (Rosenbluth et al. 1997). In any case, these empirical results suggest that desert climates appear to be more vulnerable to climate shocks than the colder tundra, where these shocks appear to have smaller impacts in the long term. However, the absence of previous studies limits the understanding of the physical mechanisms behind these spatial differences, which need further investigation.
Finally, in Table 4 we compare the time trend coefficients obtained with the model given by (3) with those obtained imposing the (wrong) assumption that d = 0 in (3) [1]. We observe that if long memory is not considered (i.e., if d = 0), all except one of the series (Cerro Moreno) display significant coefficients, producing erroneous conclusions about the climate in Chile. The temperature increase seems to be overestimated in the desert climate and semi-arid areas, while it is underestimated in the tundra climate areas.
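The risk of wrongly imposing d = 0 can be illustrated with a small Monte Carlo experiment: series with long memory but no trend at all are simulated, and a standard OLS t-test for the slope is applied as if the errors were uncorrelated. The sketch below rests on several assumptions (a truncated moving average simulation, Gaussian shocks, a 5% two-sided test, no seasonal component) and uses hypothetical function names; it is not the procedure behind Table 4.

```python
import numpy as np

def ma_weights(d, m):
    """First m coefficients of (1 - B)^(-d)."""
    psi = np.empty(m)
    psi[0] = 1.0
    for k in range(1, m):
        psi[k] = psi[k - 1] * (k - 1 + d) / k
    return psi

def naive_trend_tstat(y):
    """t-statistic of the OLS slope, computed as if the errors were i.i.d."""
    n = y.size
    X = np.column_stack([np.ones(n), np.arange(1, n + 1, dtype=float)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    s2 = resid @ resid / (n - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return coef[1] / np.sqrt(cov[1, 1])

def rejection_rate(d, n=500, reps=200, m=500, seed=0):
    """Share of trendless I(d) series in which the naive test finds a trend."""
    rng = np.random.default_rng(seed)
    psi = ma_weights(d, m)
    rejections = 0
    for _ in range(reps):
        eps = rng.standard_normal(n + m - 1)
        y = np.convolve(eps, psi, mode="valid")   # no trend in the data
        rejections += abs(naive_trend_tstat(y)) > 1.96
    return rejections / reps

if __name__ == "__main__":
    print("d = 0.0:", rejection_rate(0.0))
    print("d = 0.4:", rejection_rate(0.4))
```

With d = 0 the rejection rate stays close to the nominal 5%, whereas with long memory the naive test reports a "significant" trend far more often, which is consistent with the spurious significance described above.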

Conclusions
The orders of integration of the average monthly temperature anomaly series at eleven Chilean meteorological stations have been calculated using monthly data from January 1968 to November 2021, plus an additional station whose data start in January 1982, totaling 647 samples in most cases.
The results obtained, based on fractional integration methods, indicate that there is a significant relationship between the latitude, the climate, and the order of integration of the temperature series. This happens for the specific case of Chile, where generalized global warming was not present around 41°S and further south, and where a cooling period from the 1950s to the 1970s still prevails (Vuille et al. 2015; Garreaud 2011), specifically in coastal areas. A major limitation of this study has been the absence of long-term data in the Andean stations, where recent studies show a different temperature behavior in comparison with other stations at the same latitude but nearer to the Pacific coast (Meseguer-Ruiz et al. 2019; Aranda et al. 2021; Orrego-Verdugo et al. 2021, among others). The absence of previous works with this methodology in this part of the globe appears to be a second limitation of this study. However, the results showing low orders of integration, specifically in southern latitudes, give a glimmer of hope for global warming policies and for the reversion of these effects, in line with future temperature projections (Araya-Osses et al. 2020; Mutz et al. 2021) that clearly depend on the RCP scenario. Future works with the same data may analyze the potential presence of non-linearities or structural breaks. These two issues are very much connected with long memory and fractional integration, and the linear trend used in this application can be substituted by alternative models that might include Chebyshev polynomials in time as in Cuestas and Gil-Alana (2016), Fourier functions in time as in Gil-Alana and  or even neural networks .
Work in this direction is now in progress.

[1] Note that the hypothesis of d = 0 has been decisively rejected in the results reported across Tables 3 and 4.

The values in parenthesis refer to the selected specifications for the deterministic terms.

Fig. 2 Average d values obtained per region. Map taken from Sarricolea et al. (2017) and modified by authors.
Acknowledgements Luis A. Gil-Alana and Miguel Martín-Valmayor gratefully acknowledge financial support from the MCIN-AEI-FEDER (Government of Spain) under Grant Agreement No PID2020-113691RB-I00 project from 'Ministerio de Ciencia e Innovación' (MCIN), 'Agencia Estatal de Investigación' (AEI) Spain and 'Fondo Europeo de Desarrollo Regional' (FEDER). All authors declare that there are no competing financial and/or non-financial interests in relation to the work described. Comments from the Editor and two anonymous reviewers are gratefully acknowledged.
Author contributions The original idea came from Prof. Hube, who is developing investigations around climate change in Chile in cooperation with Valparaiso University. Prof. Martin-Valmayor researched the data sheet (point 4 of the paper), and completed the introduction (point 1) and literature review (point 2). Then, Prof. Hube was responsible for the investigation process and the quality of the data sheet. Prof. Gil-Alana introduced the methodology (point 3), and then the calculation and initial discussion of results (tables and point 5). Finally, all authors jointly wrote the conclusions (part 6) and revised the manuscript.
Funding Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. Professors Gil-Alana and Martín-Valmayor state that funds for the development of this research were received from the MCIN-AEI-FEDER (Government of Spain) under Grant Agreement No PID2020-113691RB-I00 and gratefully acknowledge this financial support from the 'Ministerio de Ciencia e Innovación' (MCIN), 'Agencia Estatal de Investigación' (AEI) Spain and 'Fondo Europeo de Desarrollo Regional' (FEDER). All authors declare that there are no competing financial and/or non-financial interests in relation to the work described.
Availability of data and materials The authors declare that all data supporting the findings of this study are available within the article and its supplementary information files. In particular, the calculation datasets generated and/or analyzed during the current study are available from the corresponding author on request. The results/data/figures in this manuscript have not been published elsewhere, nor are they under consideration by another publisher, and there are no hyperlinks to publicly archived datasets analysed or generated during the study except for the public information collected.

Declarations
Ethics approval and consent to participate I declare on behalf of all authors that the manuscript has not been submitted to more than one publication for simultaneous consideration, and that this work is original and has not been published elsewhere in any form or language (partially or in full). Our results are clear, honest, and without fabrication, falsification or inappropriate data manipulation, and adhere to discipline-specific rules for acquiring, selecting and processing data. No data, text, or theories by others are presented as if they were the authors' own ('plagiarism'). Proper acknowledgements to other works are given (including material that is closely copied, summarized and/or paraphrased), quotation marks (to indicate words taken from another source) are used for verbatim copying of material, and permissions have been secured for material that is copyrighted. I also declare that research articles, relevant literature and non-research articles are cited appropriately. We have avoided untrue statements about entities (which can be an individual person or a company) or descriptions of their behavior or actions that could potentially be seen as personal attacks or allegations about that entity. This research poses no threats to public health or national security (e.g. dual use of research). If necessary, we are prepared to send relevant documentation or data in order to verify the validity of the results presented, and sensitive information in the form of confidential or proprietary data is excluded. The authors declare all of the above in order to respect third parties' rights such as copyright and/or moral rights.
Consent for publication I declare that all authors consent that they are responsible for correctness of the statements provided in the manuscript, as defined by Springer.
Competing interests I declare that the authors have no competing interests as defined by Springer, or other interests that might be perceived to influence the results and/or discussion reported in this paper. In particular, I certify authors have no commercial association, employment or financial involvement that might pose a conflict of interest with regards to the submitted manuscript. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.