1 Introduction

Gold and crude oil are key indicators of global financial and economic health and play a pivotal role in shaping economic dynamics. Notably, the relationship between these commodities is intricate, with oil price fluctuations influencing the inflation rate and subsequently impacting the price of gold [2]. Gold, characterized by significant volatility throughout the trading day, has been the subject of meticulous investigation. Batten and Lucey [4] delved into this volatility and employed the GARCH methodology to analyse gold futures intraday data spanning from January 1999 to December 2005. The variations in gold prices are influenced by multifaceted factors such as market bubbles and financial crises, posing inherent investment risks. However, despite these challenges, gold has demonstrated remarkable resilience as an asset class. Historically, gold has not only served as a safe haven during times of upheaval [6, 9, 13, 89] but has also delivered competitive returns, often aligning with major stock indices such as the S&P 500 [83]. Cai et al. [15] contributed to the understanding of gold futures by highlighting their time-varying volatility and emphasizing the substantial impact of inflation on gold return volatility. The significance of gold as an investment extends beyond its potential for competitive returns. Gold can serve as a robust hedge against inflation and mitigate losses in the stock market, making it an invaluable tool for asset allocation [11]. The role of gold as a form of “risk insurance” within investment portfolios is evident [2]. Furthermore, in countries like India, where gold holds particular importance, its import-heavy nature directly impacts currency exchange rates [54]. Additionally, gold has demonstrated its utility as a hedge against fluctuations in the foreign exchange value of the dollar [16].

Recent research by Buccioli and Kokholm [14] has shed light on gold’s role as a haven asset for stock indices in the aftermath of a market crash, persisting for approximately 20 days. This finding contradicts the conventional belief held by numerous prior studies suggesting a negative relationship between gold prices and stock prices [6, 25, 40, 78]. However, the discourse on the correlation between gold and stock prices remains multifaceted, with some studies asserting a positive correlation [21], while others contend that there is no discernible relationship [5]. For instance, the Taiwan stock market exhibits no apparent positive or negative relationship with gold [42]. Contrary to the prevailing skepticism, Mercer [56] presented compelling evidence supporting the inclusion of gold in U.S. equity portfolios. Their study reveals that allocating 25% to gold significantly enhances portfolio performance and that gold serves as an effective hedge against the adverse impact of inflationary pressures. This nuanced perspective challenges the notion that gold is merely a defensive asset, suggesting its potential as a proactive and value-enhancing element within diversified investment portfolios. Moreover, gold remains a reliable predictor for Bitcoin, even amid turbulence. Investor sentiment, gauged through Google Trends, the S&P 500, and gold, significantly influences Bitcoin decisions, especially during the COVID-19 crisis [10]. In summary, gold can act as a stabilizing force for the financial system by reducing losses in the face of extreme negative market shocks and is a strong haven for most developed markets during the peak of the recent financial crisis [32].

Crude oil, renowned as one of the most volatile commodities globally, exerts a profound influence on the intricate fabric of the global economy. The oscillations in oil futures prices are often linked to supply shortages and can serve as a key barometer for broader economic conditions [36]. The oil market, characterized by its expansive infrastructure and extensive trading activities, stands as a pivotal sector within the global economy. Kilian and Park [49] revealed that approximately 22% of total stock return variations can be ascribed to the dynamics of the crude oil market. Presently, oil consumption commands the forefront of global fossil energy utilization, constituting nearly one-third of all energy consumption [27]. Apart from mere economic dimensions, the repercussions of adverse changes in oil prices permeate a country’s economic development, social stability, and state security [37]. Beyond its status as a critical energy product, crude oil exhibits diverse properties encompassing merchandise, financial, and political attributes. The consequential fluctuations in oil prices wield a significant impact on the world economy, financial markets, and regional stability [1].

The intricate interplay between crude oil prices and the economic fortunes of nations is manifest in the financial consequences triggered by rapid price fluctuations. Countries heavily reliant on oil exports witness far-reaching financial ramifications, with plummeting crude oil prices inducing sluggish economic activity and surges in prices precipitating substantial inflation [80]. The inherent volatility of the oil market stems from multifaceted factors [87]. While supply and demand dynamics constitute primary drivers of crude oil prices, the interplay of variables such as stock market trends, economic activity, political situations, and other external factors contributes significantly to market volatility [85]. Crucially, predicting the future trajectory of oil prices assumes paramount importance but remains a formidable challenge. The inherent volatility and uncertainty intrinsic to the oil market, coupled with the influence of external factors, create a complex landscape for price forecasting. The latest research confirms that the COVID-19 pandemic has affected the gold, crude oil, and cryptocurrency markets [75]. The recent COVID-19-induced downturn, which led to a decline in worldwide oil demand, exemplifies the profound impact of external factors on market dynamics [53].

The intricate relationship between crude oil prices and various facets of the financial landscape has been a subject of extensive scholarly inquiry. Friedman [35] posited that the inflationary surge triggered by escalating oil prices distorts price signals, diminishing economic efficiency and exerting a deleterious impact on the stock market. Building upon this, Driesprong et al. [26] discerned a negative effect on current stock market returns attributable to a one-period lag in crude oil price changes. Chen [19] contributed to the discourse by establishing a link between rising oil prices and a shift toward bearish market conditions. Wang et al. [81] advanced the understanding of this relationship, asserting that crude oil prices exhibit predictive power for stock market returns, both within and beyond sample boundaries. Their findings underscored a negative correlation between oil price increases and stock market returns. The nuanced impact of crude oil price fluctuations on specific industries is elucidated by Narayan and Sharma [63], who revealed varying effects depending on the industry. Specifically, the energy and transportation sectors experienced rising returns in tandem with increasing crude oil prices, while other industries witnessed a general decline.

The pervasive influence of crude oil price volatility on stock returns is a recurring theme in the literature. Chen et al. [17] established the volatility of crude oil prices as a driving force behind bearish conditions in the Chinese stock market. Another study [68] highlighted the negative impact of increased oil price volatility on real stock returns in the United States. Xiao and Wang [86] linked heightened oil price volatility to diminished economic activity and heightened financial market uncertainty, and found that it negatively predicted stock returns. This negative co-movement between oil prices and the stock market, as evidenced by past studies [17, 26, 81], underscores the complex interplay between these variables. The industry-specific responses to oil price changes accentuate the need for a nuanced understanding when analyzing market dynamics [63]. In light of market uncertainties, the accurate estimation of future crude oil prices has emerged as a pivotal research question [65]. This holds particular significance for nations straddling the dual role of oil producers and consumers, where the ability to forecast crude oil prices is integral to economic stabilization programs and financial planning [33]. Despite the evident importance of such predictions, challenges arise due to the non-linear, uncertain, and dynamic nature of crude oil prices, complicating forecasting efforts [44].

The empirical evidence gleaned from prior research underscored the pivotal roles played by both gold and crude oil in shaping stock markets, influencing investment portfolios, and impacting the global economy. Notably, these commodities stand out as the most liquid markets in comparison to their counterparts. The daily trading volumes for gold and crude oil on the futures market, amounting to $28.83 billion and $108.17 billion, respectively, highlight their significant market liquidity [83]. In terms of predictive ability, Huang et al. [42] asserted that among single group variable models, the crude oil group variable emerged as the most potent predictor, followed by the gold group variable, while the traditional financial group variable exhibits the least predictive power. This underscored the unique influence wielded by crude oil and gold on market dynamics.

Understanding the intricate relationship between crude oil and gold becomes imperative due to their substantial influence on the markets. The exploration of this gold-oil interplay promises novel insights into the functioning of global financial markets. Tully and Lucey [79] asserted that two key macroeconomic variables, oil price and inflation rate, exerted influence on the gold market. Notably, a positive correlation of approximately 85% existed between gold and crude oil prices [74]. The linkage between gold and oil markets is further substantiated by studies indicating that oil price changes and political unrest are significant determinants of volatility in gold prices [57]. This extends to the risk premium in gold prices, as influenced by South African political unrest and changes in oil prices [57]. The transmission of volatility between gold and oil markets is supported by the work of Ross [69] and Fleming et al. [34], indicating that information and volatility can be shared across markets. Recent studies also delved into the intricate dynamics between gold and oil. Hammoudeh and Yuan [39] found that monetary policy and oil shocks significantly impact gold prices, while Narayan et al. [62] established a long-run relationship between gold and oil futures, suggesting gold’s use as an inflation hedge. Another study [71] uncovered short-run feedbacks between gold and oil spot prices, with a weaker relationship in the long run.

Despite individual examinations of gold and oil, only a limited number of studies considered their interdependence. However, these studies used long-term data as a whole and did not consider the dynamic correlation. Moreover, the external influences and seasonal variations were not considered. This paper aims to address this gap by presenting a unique approach that integrates three distinct and complementary analytical elements. By doing so, it seeks to contribute to the comprehensive understanding of the intricate relationship between gold and oil prices and their collective impact on financial markets.

In this groundbreaking study, an innovative framework was proposed to seamlessly integrate Rolling Correlation, the SARIMAX Model, and the Rolling Model and revolutionize the predictive analysis of the gold-oil interrelationship. This study focuses on Gold Spot Price, WTI and Brent Futures. This pioneering approach can capitalize on the strengths of Rolling Correlation, the SARIMAX Model, and the Rolling Model to dramatically enhance prediction accuracy and robustness. The Rolling Correlation component can dynamically evaluate the evolving relationship between gold and oil prices and provide deep insights into their changing interplay. The SARIMAX Model enriches this analysis by factoring in external influences and seasonal variations, offering a more holistic view of the gold-oil nexus. The Rolling Model elevates the prediction performance by continuously updating and adapting to the latest data, ensuring that prediction results remain up-to-date and relevant.

The research outcome is nothing short of remarkable. The proposed rolling SARIMAX model, applied to Gold–WTI and Gold–Brent, shows a substantial improvement over traditional SARIMAX models, with at least a 40% increase in R-square value and a 95% reduction in mean absolute percentage error (MAPE). Specifically, the R-square values reach 89.8% for Gold Spot Price-WTI Futures and 88.4% for Gold Spot Price-Brent Futures, which are exceptionally high for financial time series models. These metrics show the exceptional predictive power of the proposed prediction model, demonstrating its strong capability in accurately capturing and explaining significant portions of the price movements in complex financial markets. This study makes a significant contribution to the field of financial modelling. The rolling SARIMAX Model distinguishes itself by effectively assimilating real-time data, which is a crucial capability in the volatile realms of gold and oil markets. This adaptability, combined with the model’s ability to analyse both linear and non-linear market interactions and to integrate external variables, positions it as a superior analytical tool. Its robustness in evolving market conditions, due to its continuous adjustment to new data, offers timely insights into market trends and relationships.
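For reference, the two accuracy measures quoted above are usually defined as follows. This is the standard textbook form, not necessarily the exact variant computed in this study; the symbols \(y_t\) (observed value), \(\hat{y}_t\) (predicted value), \(\bar{y}\) (sample mean of the observations), and \(n\) (number of evaluated observations) are notation introduced here for illustration.

```latex
\[
R^{2} = 1 - \frac{\sum_{t=1}^{n} \left(y_t - \hat{y}_t\right)^{2}}{\sum_{t=1}^{n} \left(y_t - \bar{y}\right)^{2}},
\qquad
\mathrm{MAPE} = \frac{100\%}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right| .
\]
```

Under these definitions, a higher R-square and a lower MAPE both indicate a closer fit between predicted and observed values.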

The practical implications of our model are extensive and multifaceted. It not only enhances market analysis and risk management by accurately predicting the correlation between gold and crude oil prices but also aids in informed policy and economic planning. For investors, it supports strategic investment decisions, helping in portfolio diversification to minimize risk and maximize returns. Additionally, the newly developed model is instrumental in predicting economic cycles and offers a global economic outlook, assisting stakeholders in decision-making by understanding the impact of commodities on the global economy. In summary, the proposed rolling SARIMAX model represents a paradigm shift in financial market analysis. It provides a nuanced understanding of the intricate interplay between key commodities, which is essential for various stakeholders in navigating the complex dynamics of global financial markets. This study invites readers to embark on a journey of discovery and insight, promising to enrich their understanding of these crucial financial markets.

2 Literature review

2.1 Gold and crude oil analysis

Gold and crude oil have long been considered historically distinguished commodities in the world. The Gold market is influenced by two major macroeconomic variables: the price of oil and the rate of inflation [79]. The prices of Gold and crude oil had a favourable association and continued to defy the trend established in 2008. Observing fluctuations in oil prices helps policymakers anticipate Gold prices and implement appropriate economic strategies. Recently, Gold and crude oil have attracted considerable attention, mainly due to price fluctuations during the COVID-19 pandemic and the Russia-Ukraine conflict. Oil prices are inextricably linked to the petrodollar, but Gold prices are indirectly related to the dollar. The impact of a stronger dollar on the supply of oil is favourable. Oil prices tend to decline because of a stronger dollar’s tendency to diminish demand while increasing supply. With a causality connecting the U.S. dollar to the price of oil, these demand and supply effects, therefore, necessitate a negative relationship [23]. Since Gold is a dollar-denominated commodity, its price is typically negatively correlated with the value of the U.S. dollar. Broadly speaking, a relatively strong U.S. dollar tends to keep Gold prices lower and more under control, whereas a weakening U.S. dollar is likely to push Gold prices higher due to rising demand [48]. As a result, to understand the future economic outlook, investors, traders, and policymakers should investigate the interrelationship between oil and Gold price movements.

Prior work revealed a statistically significant association between oil price fluctuations and stock market returns [66]. Oil futures yields and stock index gains for the 1980s were tested for linear and nonlinear causality [22]. The results of the empirical research revealed that oil price shocks did affect stock index returns in a nonlinear manner, underscoring the significance of testing for nonlinear linkages as well as linear connections. Spikes in oil prices in the post-war period likewise had a substantial impact on aggregate stock gains (Jones96). It was also observed that rising oil prices had a significant influence on US stocks, with the size of the effects possibly increasing since the mid-1980s [70]. Moreover, the research identified that while the correlation between oil and stock–bond markets remains stable during periods without economic shocks, it undergoes significant changes during times of economic turbulence, albeit at a lower frequency [59]. This finding is critical as it positions the oil market and overarching financial stress as key vectors for transmitting risk to other financial markets. In particular, this pattern holds substantial implications for the dynamics of financial markets in the Gulf Cooperation Council (GCC) region. It suggests that variations in the oil market, along with general economic stress, play a pivotal role in influencing the performance of other financial markets.

Building on this perspective, further research demonstrated the efficacy of integrating financial stress indices into the prediction model [60]. This integration improved the model’s forecasting capabilities, highlighting the significant role of the oil market in financial market analysis. From an investment standpoint, this enhancement in predictive accuracy implies that oil assets, when aligned with optimal portfolio weights and hedge ratios, can be effectively utilized to reduce portfolio risk in the concerned markets. Additionally, the research indicates that the interplay between oil prices and stock–bond markets is dynamic, adjusting in response to the prevailing financial stress conditions. This evolving relationship necessitates a nuanced understanding for investors and market analysts, particularly in regions heavily influenced by the oil market dynamics. Recognizing and adapting to these shifts in correlation can be critical for effective portfolio management and risk mitigation, especially in an environment where oil prices and financial stress are influential factors. This adaptability is essential for navigating the complexities of financial markets, because static models may fail to capture the full spectrum of risk and opportunity presented by fluctuating oil prices and varying economic conditions.

However, since stock investing is no longer as profitable as it once was due to increased oil costs, investors are increasingly turning to Gold as an alternative investment, as Gold has outperformed key asset classes in terms of long-term returns [83]. Consequently, Gold is primarily used as a stock hedge, as it is a security that is, on average, uncorrelated with equities during a market downturn [6]. Gold is found to be, on average, a hedge against equities and a safe haven in times of excessive stock market volatility. Baur and McDermott [6] compared the stocks of key emerging markets and demonstrated that Gold is a reliable safe haven during financial crises for the majority of advanced economies, including the USA and major European economies, but not for Australia, Canada, Japan, or any of the Brazil, Russia, India, China, and South Africa (BRICS) countries.

Gold is also utilised as a hedging asset in terms of inflation since inflation appears to be the most effective conduit for assessing the link between Gold and oil prices. A spike in crude oil prices would lead to a rise in the market price level [41]. As a result, rising inflation, aided by high oil costs, increases demand for Gold and results in a surge in Gold price [67]. Further, a study utilizing a Vector Autoregressive (VAR) model examined the dynamic interactions and interdependencies between gold and oil prices from 2006 to 2021 [52]. The findings indicate that gold serves as an effective hedge against oil price fluctuations, a trend particularly evident during the COVID-19 pandemic. This is attributed to a notable inverse correlation between gold and oil prices, highlighting gold’s role as a key diversification tool and a safe-haven asset, particularly during periods of significant oil price volatility.

Delving deeper into the Gold-Oil relationship, a study employed an asymmetric VAR-BEKK(DCC)-GARCH (1,1) model to analyze the interactions between crude oil prices, BRICS exchange rates, and gold markets [20]. The VAR-BEKK(DCC)-GARCH model combines Vector Autoregression, the Baba, Engle, Kraft, and Kroner formulation, and the Dynamic Conditional Correlation approach within a Generalized Autoregressive Conditional Heteroskedasticity framework. The study found distinct market-specific behaviors. In Brazil and China, gold predominantly absorbs volatility spillovers, whereas in India and South Africa, gold acts as a link between the crude oil market and exchange rate markets. Additionally, a spillover index study using forecast error variance decompositions (FEVDs) from VAR models quantifies this relationship, showing that the oil market accounts for 5.22% of the variance in the gold market, with a 4.62% return transmission from the gold market to oil [3].

Furthermore, an asymmetric causality test revealed causal relationships between crude oil and the Indian Rupee (INR), between crude oil and India’s stock index (i.e., Sensex), and between gold prices and the Sensex [46]. The findings suggested that reductions in crude oil prices were associated with a depreciation of the INR and a decline in the Sensex. There was a bidirectional relationship in which decreases in gold prices led to a Sensex downturn, while a falling Sensex tended to elevate gold prices. This intricate web of relationships underscores the interconnectedness of gold, oil, and broader financial markets, highlighting the necessity for a nuanced and dynamic approach to financial market analysis.

The body of research reviewed above underscores the significance of the interplay between oil, gold, and stock markets in the financial sector. However, a direct analysis of the gold-oil relationship remains notably sparse in scholarly discourse. While certain studies, such as Li and Umair [52], have begun to delve into the dynamics of gold and oil price interactions, and others like Xu et al. [88], Balcilar et al. [3], and [46] have explored the indirect spillover effects between gold and oil, there remains a gap in the development of a robust predictive model for this dynamic gold and oil relationship.

In the subsequent section of the literature review, attention will shift to examining prediction models that concentrate exclusively on either gold, oil, or the interrelationship between the two. This analysis aims to identify and assess models that provide predictive insights into the movements of these crucial commodities, offering a deeper understanding of their intertwined dynamics.

2.2 Gold and oil prediction

Machine learning and deep learning approaches for predicting Gold and oil prices and movements have acquired favour in the scientific and industrial sectors over the last decade.

Xiuzhen et al. [87] acknowledged the inherent volatility of the oil market, where, despite supply and demand being key price determinants, factors like stock market trends, economic activity, political scenarios, and external influences also play crucial roles, as noted by Wu et al. [85]. This complexity renders the prediction of future oil prices both critical and exceptionally challenging, due to the market’s unpredictable nature. Pan et al. [65] highlighted the accurate forecast of crude oil prices as a significant and pressing issue in the forecasting domain. The importance of predicting crude oil prices extends beyond market speculation. As Fang et al. [33] pointed out, it is crucial for countries that are both consumers and producers of oil to formulate economic stabilization and financial strategies. Addressing this forecasting challenge, Wu et al. [88] developed the VILIG (named based on the decomposition-prediction-ensemble-error correction framework) model to reduce the root mean square error (RMSE) in predicting crude oil futures prices, showing improvement over the other models. Furthermore, Xu et al. [88] found that regression neural networks surpass time series networks in forecasting WTI crude oil prices. This superiority is attributed to the networks’ ability to better handle the non-linear nature of time series data. This finding is particularly relevant in a market characterized by high volatility and unpredictability, where traditional linear models may fall short. The advancement in forecasting models like VILIG and regression neural networks represents a significant leap in tackling the complexities of oil price prediction, offering more reliable tools for economic planning and risk management in both oil-producing and consuming nations.

Shafiee and Topal [74] used rational expectations theory to predict future Gold prices. They employed a revised econometric model to forecast Gold price breakthroughs and changes. The study highlighted a robust correlation of around 85% between Gold and crude oil prices from January 1968 to December 2009. In 1968 and 2008, the ratio of the price of one ounce of Gold to one barrel of crude oil was estimated to be around 11 [74]. Further, the correlation between Gold and crude oil prices was also evaluated by examining the volatility fluctuations of Gold and oil futures. Univariate and bivariate Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) models were adopted to take the significance of unobserved heterogeneity in volatility into consideration. Ewing and Malik [32] uncovered strong evidence of substantial direct volatility spillovers between Gold and oil prices. A small indirect relationship between Gold and crude oil markets could be identified if structural discontinuities in variance are disregarded. The above-mentioned studies laid a strong foundation for the relationship between Gold and crude oil prices but did not predict the fluctuations in the interrelationship between them. Another study integrates the GARCH-BEKK model with a machine learning approach to predict shock transmission between financial markets and the oil market. The results affirm significant volatility spillovers from oil to both stock and bond markets. However, the study might not fully account for unforeseen external factors or events, such as economic, political, or global events, potentially impacting the observed volatility spillovers [62].
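As background for the GARCH-type models referred to above, the univariate GARCH(1,1) building block is conventionally written as below. This is the textbook specification, given only as a reference point; the cited studies use bivariate and BEKK extensions whose exact parameterisations are not reproduced here. The symbols \(r_t\) (return), \(\mu\) (mean), \(\varepsilon_t\) (shock), \(\sigma_t^2\) (conditional variance), and \(\omega, \alpha, \beta\) (parameters) are notation introduced for illustration.

```latex
\[
r_t = \mu + \varepsilon_t, \qquad
\varepsilon_t = \sigma_t z_t, \qquad
z_t \sim \text{i.i.d.}(0, 1),
\]
\[
\sigma_t^{2} = \omega + \alpha\, \varepsilon_{t-1}^{2} + \beta\, \sigma_{t-1}^{2},
\qquad \omega > 0,\ \alpha \ge 0,\ \beta \ge 0,\ \alpha + \beta < 1 .
\]
```

The condition \(\alpha + \beta < 1\) ensures a finite unconditional variance; values of \(\alpha + \beta\) close to one correspond to the persistent volatility clustering typically reported for gold and oil returns.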

One study [45] examined the relationship between worldwide oil prices, precious metal prices, and the Rupee/U.S. dollar exchange rate. Because precious metals are utilized to diversify investment portfolios, long-term price reliance is expected. In a petroleum country like India, rising oil prices create inflation. Long-term dependency of Gold prices on oil prices is projected as investors buy Gold to protect against inflation. Moreover, using traditional time series and neural network approaches, the interrelationships among Gold, crude oil, and the Euro can be investigated [55]. The findings revealed that oil has a higher impact on Gold than Gold has upon oil, and that oil has a more significant impact on the Euro than the Euro has on oil. This helps explain why oil gains accounted for 1.7% of Gold gains, whereas Gold returns accounted for a negligible share of oil returns [51].

Zhang and Wei [90] found a substantial cointegrating relationship across the price returns of oil and Gold commodities when they looked at the long-term causal and lead-and-lag relation between the two prices. The outcomes imply that there is no substantial nonlinear Granger causality between the two markets at a 10% level, indicating that their interacting mechanism is quite straightforward. Further, a convolutional neural network-long short-term memory (CNN-LSTM) model was applied to Gold prices to examine and predict market variations during the pandemic [61]. The mean squared error of the vector sequence output on the validation set was \(6.0\times 10^{4}\), \(8.0\times 10^{4}\), and \(2.0\times 10^{3}\) for 1-day, 2-day, and 30-day-ahead forecasts, respectively. Research shows that gold and oil prices have been positively correlated for over 80% of the time in the past 50 years [73]. However, this relationship is not consistent; for instance, in Turkey, crude oil prices have a negligible impact on gold prices [77]. Additionally, the influence of oil prices on gold has varied over time, being more significant during the 1970s due to events like the 1973 oil crisis and the 1979 energy crisis [72]. This highlights the complexity and variability of the gold-oil price relationship. The exploration of the Gold-Oil relationship has been a focal point in numerous prediction research studies. However, these investigations have yet to yield a definitive and straightforward predictive tool for understanding the interrelationship between gold and oil.

2.3 Research gap

In the realm of financial market analysis, the intricate interplay between gold and oil prices remains a subject of considerable intrigue and study. Although significant strides have been made in understanding individual market dynamics and correlations, the development of a robust predictive model that exclusively focuses on the direct relationship between gold prices and crude oil futures is still an area ripe for exploration. This paper aims to bridge this gap by presenting a novel approach that amalgamates three distinct and complementary analytical elements.

The studies by Li and Umair [52], Mezghani et al. [60], Xu et al. [88], Balcilar et al. [3], and [46] have significantly contributed to the understanding of the interactions and spillover effects between gold and oil markets. Yet, these investigations primarily analysed these commodities in isolation or through indirect interrelations. Similarly, Ewing and Malik [32] and Shafiee and Topal [74] provided deep insights into the volatility behaviours and price forecasting of gold and oil but did not specifically target the dynamic correlation between these two commodities. While Le and Chang [51] and Zhang and Wei [90] made notable advances in understanding the impact of oil price shocks and the overall relationship between oil and gold, their methodologies lacked the capacity to forecast the evolving interrelationship. Shahbaz et al. [73] further delved into the predictive capabilities of oil prices on gold prices under varying market conditions, indicating the potential for a more integrated model.

Chen and Xu [18] demonstrated the superiority of the multivariate GAS(1,1) model over the DCC-GARCH(1,1) in forecasting the volatility and correlation between crude oil and gold prices. Although their approach captured the dynamic relationships, it did not adequately address volatility clustering and external seasonality factors. Addressing these limitations, our study proposes a unique framework integrating three analytical components: Rolling Correlation, SARIMAX Model, and Rolling Model. This tripartite approach is designed to leverage the strengths of each component, thereby enhancing the accuracy and robustness of the proposed predictive model:

  1. Rolling Correlation: This element dynamically assesses the correlation between gold and oil prices, offering insights into how their relationship evolves over time.

  2. Static SARIMAX Model: By incorporating this model, external factors and seasonal variations that may influence the gold-oil relationship can be accounted for, enriching the understanding beyond mere price dynamics.

  3. Rolling Model: This component allows for continuous updating and adaptation of the static SARIMAX model to recent data, ensuring that the predictions remain relevant and timely.

By synergizing these elements, this research endeavours to provide a comprehensive and accurate predictive model for the gold-oil relationship, a task that has not been squarely addressed in prior studies. This novel contribution is particularly significant in the context of the lack of stability in the correlation between these commodities over time. This approach represents a significant step forward in the predictive analysis of these crucial financial markets.

The research design for this study is systematically structured into three distinct sections: Research Methodology, Analysis and Testing, and Results and Discussions. Each section plays a pivotal role in ensuring the rigor and clarity of the research process, contributing to the comprehensive understanding of the study’s findings.

3 Research methodology

3.1 Data collection and pre-processing

This study is focused on key benchmarks in the commodities market, specifically targeting WTI crude oil futures, Brent Crude futures, and the XAU/USD spot real FX price, commonly known as Gold. The selection of these variables is informed by their significant roles within the global commodities market. WTI (West Texas Intermediate) crude oil, recognized as the benchmark for the U.S. oil market, and Brent Crude, serving as a global benchmark, provide a comprehensive understanding of the oil industry’s dynamics. Their inclusion in the study is crucial to collectively represent the broad spectrum of the oil market, encompassing both U.S. and international trends [64].

Gold, which is viewed as a safe-haven asset and a hedge against inflation, is included as a contrasting asset to the more volatile oil market. Gold’s XAU/USD spot real FX price offers insights into market sentiments and economic trends, enriching the study’s scope [91]. The nature of the selected oil variables, being futures prices, is especially pertinent. Futures prices are inherently forward-looking, embedding market expectations about future supply and demand scenarios. This characteristic makes futures prices more indicative of future market movements than spot prices, enhancing their predictive capability in market analysis [30].

The sample period for the study is meticulously chosen. Data from May 1983 to December 2022 is selected for WTI crude oil futures and Gold. This period, covering nearly four decades, includes various economic cycles and major events that have significantly influenced the oil market, thus it provides a robust dataset for trend analysis [30]. For Brent Crude futures, the data collection starts from July 1988, coinciding with its establishment as a global oil benchmark, and extends to December 2022. This timeframe ensures the dataset encompasses Brent Crude’s complete influence in the global oil market [64].

The sample period offers several benefits. It ensures a comprehensive view of the commodities’ price movements over an extended period, capturing a range of market dynamics and cycles. This long-term perspective is vital for a thorough analysis of trends and patterns in the commodities market. Additionally, the recency of the sample period, ending in December 2022, ensures the study remains relevant to current market conditions, covering recent global events that have profoundly impacted the commodities market.

Furthermore, data pre-processing involved several critical steps. First, data cleaning was conducted to remove any duplicated values in the monthly datasets, which ensured the integrity and accuracy of the data. No missing values were detected in the historical data, which simplified the pre-processing phase. Subsequently, the datasets were standardized in dollar terms for consistency and comparability. This standardization involved expressing the Gold spot price in dollars per troy ounce (approximately 1.097 avoirdupois ounces) and the WTI and Brent futures prices in dollars per barrel (approximately 159 litres). Standardizing the data in this manner allows for meaningful comparisons and analyses across the different commodities.

3.2 Monthly rolling correlation calculation

Correlation coefficients of economic factors are considered to be unstable over time [7, 8, 76]. During periods of crisis, correlation breaks have been reported, in which historically unrelated variables started to correlate. Rolling correlation analysis can therefore be conducted to examine the stability of correlations over time [58]. In this research, the rolling correlation was employed to examine whether the correlation between the Gold price and crude oil prices changes over time. When two correlated time series diverge from one another, the change in correlation can indicate that an event has occurred. A positive correlation between two time series may also weaken or strengthen over time, and understanding these shifts aids investors in identifying market trends. Therefore, rolling correlation coefficients of the pre-processed historical datasets were utilised to analyse both the Gold–WTI and Gold–Brent relationships.

The Monthly Rolling Correlation of Gold–WTI \(R_1\) and Gold–Brent \(R_2\) was estimated by computing the correlation between the past 12 months’ Gold Spot prices and WTI/Brent futures prices using Eq. 1.

$$\begin{aligned} R_k = \frac{N_k\sum X_i Y_{i,k}-\sum X_i \sum Y_{i,k}}{\sqrt{N_k\sum X_{i}^{2}-{(\sum X_{i})}^2}\sqrt{N_k\sum Y_{i,k}^{2}-{(\sum Y_{i,k})}^2}} \end{aligned}$$
(1)

where all sums run over the months i in the rolling window, and

  • \(N_k\): number of months in the rolling window, which equals 12 in this study

  • \(X_i\): monthly Gold Spot price

  • \(Y_{i,k}\): monthly Crude WTI (k = 1) or Brent (k = 2) futures price

  • k: 1 indicates WTI futures, 2 indicates Brent futures
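As an illustration of Eq. 1, the following minimal sketch computes a rolling correlation over a trailing 12-month window using only the Python standard library. The function names and the toy price series are hypothetical and are not taken from the study's data or code; the sketch simply evaluates Eq. 1 on each window.

```python
from math import sqrt


def pearson_eq1(x, y):
    """Correlation of two equal-length sequences, written term by term as in Eq. 1."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt(n * sum_x2 - sum_x ** 2) * sqrt(n * sum_y2 - sum_y ** 2)
    # A window with constant prices has zero variance and raises ZeroDivisionError.
    return numerator / denominator


def rolling_correlation(gold, oil, window=12):
    """R_k for every month that has a full trailing window of observations."""
    if len(gold) != len(oil):
        raise ValueError("series must have the same length")
    return [
        pearson_eq1(gold[end - window:end], oil[end - window:end])
        for end in range(window, len(gold) + 1)
    ]


# Hypothetical toy prices (not the study's data): oil is an exact linear
# function of gold, so every 12-month window is perfectly correlated.
toy_gold = [1800.0 + 7.0 * m + (m % 3) * 2.0 for m in range(15)]
toy_oil = [0.04 * g - 10.0 for g in toy_gold]
toy_r = rolling_correlation(toy_gold, toy_oil)
```

With 15 monthly observations and a 12-month window, the sketch returns four correlation values, one for each month that has a complete trailing window.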

3.3 Development of SARIMAX model for monthly rolling correlation

Owing to the recurrent cyclical and seasonal patterns identified in the rolling correlation series \(R_1\) and \(R_2\), the Seasonal Autoregressive Integrated Moving Average with Exogenous regressors (SARIMAX) model is implemented to forecast future correlation. The conventional Autoregressive Integrated Moving Average (ARIMA) model has no seasonal component and therefore does not support seasonal data directly. It expects data that are either nonseasonal or have had the seasonal component removed, i.e., seasonally adjusted using techniques such as seasonal differencing. In contrast, the SARIMAX model includes a parameter for the length of the seasonal period, together with three additional hyperparameters that describe the autoregression (AR), differencing (I), and moving average (MA) of the seasonal component of the series. The general SARIMAX model is shown in Eq. 2. Moreover, a rolling forecast can extrapolate from features in the past data to predict future correlation coefficients across an ongoing period. In other words, a rolling forecast periodically updates future values, taking past changes into account. This is in contrast to static forecast methods (e.g. conventional ARIMA models), which anticipate the future for a fixed time frame.

Before developing the SARIMAX model, the Augmented Dickey-Fuller (ADF) test is adopted to test the stationarity [82] of the monthly rolling correlation coefficients of Gold–WTI \(R_1\) and Gold–Brent \(R_2\). The data set was then decomposed, and the trend was removed from the rolling correlation data to achieve stationarity. Thereafter, the autocorrelation function (ACF) and partial autocorrelation function (PACF) are used to find an appropriate seasonal variable for the SARIMAX model. Using the optimal seasonal variable, Akaike information criterion (AIC) and Bayesian information criterion (BIC) values are generated for different SARIMAX models, and the one with the lowest values is tested for the highest R-square.
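The selection step described above can be sketched with the standard definitions AIC = 2k − 2 ln L and BIC = k ln n − 2 ln L, where k is the number of estimated parameters, n the number of observations, and ln L the maximised log-likelihood. The candidate labels, log-likelihoods, parameter counts, and sample size below are placeholders chosen only to make the example run; they are not the values reported in Table 1.

```python
from math import log


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k ln n - 2 ln L."""
    return n_params * log(n_obs) - 2 * log_likelihood


# Hypothetical fitted candidates: label -> (maximised log-likelihood, number of parameters).
# These numbers are placeholders, not results from the study.
candidates = {
    "candidate A": (120.0, 3),
    "candidate B": (121.0, 5),
    "candidate C": (95.0, 1),
}
n_obs = 400  # hypothetical sample size

scores = {
    name: (aic(ll, k), bic(ll, k, n_obs))
    for name, (ll, k) in candidates.items()
}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

Because BIC penalises each additional parameter by ln n rather than 2, it tends to favour more parsimonious specifications than AIC on samples of this size; the two criteria need not always agree.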

SARIMAX(p,d,q)(P,D,Q)s model:

$$\begin{aligned}{} & {} {(1-\Theta _1 \beta )}^p {(1-\theta _1 {\beta }^s)}^P {(1-\beta )}^d {(1- {\beta }^s)}^D y_t \nonumber \\{} & {} \quad = X_t + {(1+\Phi _1 \beta )}^q {(1+\phi _1 {\beta }^s)}^Q \epsilon _t \end{aligned}$$
(2)

Autoregressive component \({(1-\Theta _1\beta )}^p\): The AR component accounts for the autocorrelation in the data. The parameter p represents the order of the AR term, with a higher value indicating that the model is more influenced by past values of the time series.

Differencing component \({(1-\beta )}^d\): The differencing component is used to remove trend and seasonality from the data. The parameter d represents the order of the differencing term, with a higher value indicating that more differencing is applied to the data.

Moving average component \({(1+\Phi _1\beta )}^q\): The MA component accounts for the error term in the model. The parameter q represents the order of the MA term, with a higher value indicating that the model is more influenced by the error term.

Seasonal autoregressive component \({(1-\theta _1 \beta ^s)}^P\): The seasonal AR component accounts for the autocorrelation in the data over the seasonal period. The parameter P represents the order of the seasonal AR term, with a higher value indicating that the model is more influenced by past values of the time series over the seasonal period. The parameter s denotes the seasonal period, i.e. the number of time steps separating the current period from its seasonal lag.

Seasonal differencing component \({(1-\beta ^s)}^D\): The seasonal differencing component is used to remove seasonality from the data over the seasonal period. The parameter D represents the order of the seasonal differencing term, with a higher value indicating that more differencing is applied to the data over the seasonal period.

Seasonal moving average (MA) component \({(1+\phi _1 \beta ^s)}^Q\): The seasonal MA component accounts for the error term in the model over the seasonal period. The parameter Q represents the order of the seasonal MA term, with a higher value indicating that the model is more influenced by the error term over the seasonal period.

Backward shift operator \((\beta )\): The backward shift operator shifts the time series back by one period, i.e. \(\beta y_t = y_{t-1}\) and \(\beta ^s y_t = y_{t-s}\), allowing the model to account for autocorrelation in the data. It represents the dependence of the current value of the time series on past values.

Error term \((\epsilon _t)\): The error term represents the difference between the observed value of the time series and the value predicted by the model. It is used to account for the uncertainty in the forecasts.

Exogenous variables \((X_t)\): These variables are included in the model to improve the accuracy of the forecasts. They can be used to incorporate additional information that may be relevant to the time series data, such as economic indicators, weather data, or other factors that may influence the time series data.

Dependent variable \((y_t)\): The variable that represents the time series being analysed and forecasted. It is explained or predicted by the exogenous variables \((X_t)\) and the error term \((\epsilon _t)\).

To validate the SARIMAX model, the entire dataset is split into training and testing data. For Gold–WTI \(R_1\), the training dataset is from May 1983 to June 2021, while the testing dataset is from July 2021 to December 2022. For Gold–Brent \(R_2\), the range of training dataset is between July 1988 and June 2021, while the range of testing dataset is between July 2021 and December 2022.
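The chronological split described above can be written down compactly. The sketch below is illustrative only: the helper names are hypothetical, and months are represented as (year, month) tuples rather than as the study's actual data structures.

```python
def month_range(start, end):
    """Inclusive list of (year, month) tuples from start to end."""
    (year, month), months = start, []
    while (year, month) <= end:
        months.append((year, month))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


def split_by_month(months, last_train_month):
    """Chronological split: everything up to last_train_month is training data."""
    train = [ym for ym in months if ym <= last_train_month]
    test = [ym for ym in months if ym > last_train_month]
    return train, test


# Sample periods as stated in the text; the split point is June 2021.
wti_train, wti_test = split_by_month(month_range((1983, 5), (2022, 12)), (2021, 6))
brent_train, brent_test = split_by_month(month_range((1988, 7), (2022, 12)), (2021, 6))
```

Under these dates, both test sets span the same 18 months (July 2021 to December 2022), while the training sets differ in length because the Brent series starts later.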

4 Analysis and testing

4.1 Features of crude oil WTI futures and Brent futures

To identify the similar features of Crude Oil Brent futures and WTI futures, monthly rolling correlations of Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are illustrated in Fig. 1. Based on the R-square value, the price movements of WTI and Brent futures have approximately 85% similarity, meaning these two futures tend to move in a similar direction and to a similar degree over time. This may be because both WTI and Brent are crude oil and are therefore subject to similar market forces and demand drivers, such as changes in global oil production, consumption, and trade patterns. Another factor is that these two commodities are often used as benchmarks for other types of oil and oil-related products. As a result, price changes of WTI and Brent futures may have a ripple effect on the prices of other oil-related assets, which could further contribute to the similarity in their price movements.

Fig. 1 Monthly Rolling Correlations of Gold–WTI and Gold–Brent. The line graph displays the evolving monthly rolling correlations between Gold–WTI \(R_1\) and Gold–Brent \(R_2\), utilizing the past 12 months’ Gold Spot prices and WTI/Brent futures prices. The figure highlights the similarity in price movements between WTI and Brent futures, emphasizing the role of Gold as a common variable. See Sects. 3.2 and 4.1 for detailed discussion.

4.2 Monthly rolling correlation of Gold–WTI and Gold–Brent

The raw data of the rolling correlation of Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are shown in Fig. 2.

Fig. 2 Monthly Rolling Correlations of Gold–WTI, Gold–Brent. Graphs (a) and (b) depict the individual monthly rolling correlations of Gold–WTI \(R_1\) and Gold–Brent \(R_2\) based on the past 12 months’ Gold Spot prices and WTI/Brent futures prices. These figures underscore the distinctive correlation patterns between Gold and each crude oil benchmark. See Sects. 3.2 and 4.1 for detailed discussion.

In order to use the SARIMAX model effectively for time series analysis and forecasting, it is important to ensure that the correlation data is stationary so that statistical properties of the data do not change over time. Therefore, first-order differencing transformation was adopted to calculate the difference between successive values in the time series and remove the trend from the raw correlation data (Fig. 3).
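The first-order differencing transformation mentioned above, and its inverse, can be sketched as follows. The function names and the toy correlation values are hypothetical and serve only to make the transformation concrete.

```python
def difference(series, lag=1):
    """Differences y_t - y_{t-lag}; the result is shorter than the input by `lag`."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def undifference(first_value, diffs):
    """Rebuild the level series from its first value and its first differences."""
    levels = [first_value]
    for d in diffs:
        levels.append(levels[-1] + d)
    return levels


# Hypothetical rolling-correlation values, for illustration only.
toy_corr = [0.10, 0.25, 0.45, 0.40, 0.60]
toy_diff = difference(toy_corr)
toy_rebuilt = undifference(toy_corr[0], toy_diff)
```

A series with a purely linear trend becomes constant after one round of differencing, which is why the transformation removes the trend; the inverse step is what maps a forecast of the differenced series back to the original scale.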

Fig. 3 Decomposition of Monthly Rolling Correlation Data. Graphs (a) and (b) present the decomposed monthly rolling correlations data for Gold–WTI \(R_1\) and Gold–Brent \(R_2\), emphasizing the maintenance of stationarity in the correlation data. See Sect. 4.2 for detailed discussion.

As demonstrated in Fig. 4, the original series is the complete time series data, including all the fluctuations and variations present in the data. The trend is the underlying long-term movement or direction of the data, representing the overall pattern of the data over time. The seasonal component is the periodic variation in the data that occurs at a fixed interval, such as daily, weekly, or monthly, representing the repeating patterns in the data. The residuals are the remaining elements of the data that are not explained by the trend or the seasonal component, representing the random fluctuations and variations in the data.
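To make the four components concrete, the sketch below implements a classical additive decomposition (series = trend + seasonal + residual) with a centred moving average. This is an assumption made for illustration: the text does not state which decomposition procedure produced Fig. 4, and the function names and toy series are hypothetical.

```python
def centered_moving_average(series, period):
    """Trend estimate; None where the centred window does not fit."""
    n, half = len(series), period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2:  # odd period: plain centred mean
            trend[t] = sum(window) / period
        else:  # even period: half weights on the two end points
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
    return trend


def additive_decompose(series, period):
    """Split a series into trend, seasonal, and residual components."""
    trend = centered_moving_average(series, period)
    detrended = [None if tr is None else x - tr for x, tr in zip(series, trend)]
    # Average the detrended values at each position within the seasonal cycle.
    seasonal_means = []
    for pos in range(period):
        vals = [d for i, d in enumerate(detrended) if d is not None and i % period == pos]
        seasonal_means.append(sum(vals) / len(vals))
    centre = sum(seasonal_means) / period  # force the seasonal effects to sum to zero
    seasonal = [seasonal_means[i % period] - centre for i in range(len(series))]
    residual = [
        None if tr is None else x - tr - s
        for x, tr, s in zip(series, trend, seasonal)
    ]
    return trend, seasonal, residual


# Hypothetical series: linear trend plus a zero-mean pattern that repeats every 4 steps.
toy_pattern = [0.2, -0.1, -0.3, 0.2]
toy_series = [0.1 * t + toy_pattern[t % 4] for t in range(24)]
toy_trend, toy_seasonal, toy_residual = additive_decompose(toy_series, 4)
```

On this noise-free toy series the procedure recovers the linear trend and the repeating pattern, leaving residuals of essentially zero; on real data the residual component carries whatever the trend and seasonal terms do not explain.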

Fig. 4 Time Series Components of Monthly Rolling Correlation. Graphs (a) and (b) showcase the time series components of monthly rolling correlations data for Gold–WTI \(R_1\) and Gold–Brent \(R_2\). This decomposition extracts trend, seasonality, and residual elements, enriching our understanding of the correlation dynamics. See Sect. 4.2 for detailed discussion.

4.3 Prediction results of static SARIMAX model

According to ACF and PACF tests, the optimal seasonal variable is selected as 8 months. The residuals of SARIMAX model for Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are summarized in Fig. 5. It is seen that the residuals of SARIMAX models are between 0 and 1.

Fig. 5 Static SARIMAX residuals for Monthly Rolling Correlation. Graphs (a) and (b) display the residuals obtained from the static SARIMAX models applied to monthly rolling correlations data for Gold–WTI \(R_1\) and Gold–Brent \(R_2\). The selection of an optimal seasonal variable (8 months) is based on ACF and PACF tests. Although the residuals from the static SARIMAX models fall within the range of 0–1, they deviate from zero, indicating a certain degree of systematic variation.

4.4 Development of rolling SARIMAX model

Because the residuals of the static SARIMAX model were not close to zero (Fig. 5), the rolling SARIMAX model (0,1,0)(1,0,1,4) with the highest R-square was adopted. The rolling correlations of Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are used to determine the coefficient parameters (i.e. \(\theta _1\) and \(\phi _1\)). The input data are the historical correlation coefficients, while the output data of the model are the predicted correlation coefficients for a given future period. In this study, the training dataset serves as the input, while the testing dataset corresponds to the output of the SARIMAX model. Overall, the rolling SARIMAX model uses the historical correlation values to make predictions about future correlation values in the specified time period. The AIC and BIC values of different SARIMAX models are summarised in Table 1. For both Gold–WTI \(R_1\) and Gold–Brent \(R_2\), the combination (0,1,0)(1,0,1,4) has the smallest AIC and BIC values. Therefore, this combination was chosen to develop the rolling SARIMAX model.

The SARIMAX(p,d,q)(P,D,Q,s) model with orders (0,1,0)(1,0,1,4), assuming that there are no exogenous variables (i.e. \(X_t = 0\)), is derived as follows:

$$\begin{aligned}{} & {} {(1-\Theta _1 \beta )}^p {(1-\theta _1 {\beta }^s)}^P {(1-\beta )}^d {(1- {\beta }^s)}^D y_t \nonumber \\{} & {} \quad = {(1+\Phi _1 \beta )}^q {(1+\phi _1 {\beta }^s)}^Q \epsilon _t\end{aligned}$$
(3)
$$\begin{aligned}{} & {} {(1-\Theta _1 \beta )}^0 {(1-\theta _1 {\beta }^4)}^1 {(1-\beta )}^1 {(1- {\beta }^4)}^0 y_t \nonumber \\{} & {} \quad = {(1+\Phi _1 \beta )}^0 {(1+\phi _1 {\beta }^4)}^1 \epsilon _t\end{aligned}$$
(4)
$$\begin{aligned}{} & {} (1-\theta _1 {\beta }^4) (1-\beta ) y_t = (1+\phi _1 {\beta }^4) \epsilon _t\end{aligned}$$
(5)
$$\begin{aligned}{} & {} (1-\theta _1 {\beta }^4 -\beta +\theta _1 {\beta }^5) y_t = (1+\phi _1 {\beta }^4) \epsilon _t\end{aligned}$$
(6)
$$\begin{aligned}{} & {} y_t -y_{t-1}=\theta _1 y_{t-4}-\theta _1 y_{t-5}+\phi _1 \epsilon _{t-4}+\epsilon _t \end{aligned}$$
(7)

let

$$\begin{aligned} z_t = y_t - y_{t-1} \end{aligned}$$
(8)

therefore

$$\begin{aligned} z_t = \theta _1 z_{t-4}+\phi _1\epsilon _{t-4}+\epsilon _t \end{aligned}$$
(9)
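The expansion from Eq. (5) to Eq. (6) can be checked mechanically by multiplying the two lag polynomials. In the sketch below a polynomial in the backward shift operator is stored as a list whose index is the power of \(\beta\); the helper name and the numerical value of \(\theta _1\) are hypothetical and chosen only to make the check concrete.

```python
def multiply_lag_polynomials(a, b):
    """Multiply two polynomials in the backshift operator (list index = power of beta)."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out


theta_1 = 0.5  # hypothetical value, used only for the numerical check

seasonal_ar = [1.0, 0.0, 0.0, 0.0, -theta_1]  # 1 - theta_1 * beta^4
first_difference = [1.0, -1.0]                # 1 - beta

# Coefficients of beta^0 .. beta^5 on the left-hand side of Eq. (6).
left_side = multiply_lag_polynomials(seasonal_ar, first_difference)
```

The resulting coefficients correspond to \(1 - \beta - \theta _1 \beta ^4 + \theta _1 \beta ^5\), which is the operator applied to \(y_t\) in Eq. (6); moving the lagged terms to the right-hand side gives Eq. (7).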

4.5 Prediction results of rolling SARIMAX model

Seasonal AR and seasonal MA are the coefficients of the seasonal autoregression and seasonal moving average components of the SARIMAX model, respectively, and are determined during the model fitting process. The seasonal AR coefficients capture the past seasonal pattern of the series, using past values of the seasonal component to forecast future values. The seasonal AR coefficients for Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are 0.66 and 0.64, respectively. The positive value of these coefficients indicates a positive correlation between past seasonal values and future values, thus contributing positively to the prediction. The seasonal MA component of the SARIMAX model uses past forecast errors to forecast future values. The seasonal MA coefficients for Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are \(-\)0.91 and \(-\)0.90, respectively. This coefficient multiplies the previous seasonal forecast error in the forecast equation. The negative value of this coefficient indicates a negative correlation between past forecast errors and future values, thus contributing negatively to the prediction.

According to the determined seasonal AR and MA coefficients, Eq. (9) can be rewritten as Eq. (10) and Eq. (11) for Gold–WTI \(R_1\) and Gold–Brent \(R_2\), respectively.

$$\begin{aligned} z_t= & {} 0.66z_{t-4} - 0.91\epsilon _{t-4}+\epsilon _t\end{aligned}$$
(10)
$$\begin{aligned} z_t= & {} 0.64z_{t-4} - 0.90\epsilon _{t-4}+\epsilon _t \end{aligned}$$
(11)
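A minimal sketch of how Eq. (10) or Eq. (11) yields one-step-ahead forecasts is given below. It assumes that pre-sample values of \(z_t\) and \(\epsilon _t\) are zero, which is a simplification and not necessarily how the fitted model was initialised; the function name and the toy correlation series are hypothetical.

```python
def one_step_forecasts(y, theta_1, phi_1, s=4):
    """One-step-ahead predictions of y_t from z_t = theta_1*z_{t-s} + phi_1*eps_{t-s} + eps_t.

    Pre-sample z and epsilon values are set to zero (a simplifying assumption).
    Returns (predictions, residuals); index 0 has no prediction.
    """
    # z[t] = y[t] - y[t-1]; z[0] is undefined and stored as 0.0.
    z = [0.0] + [y[t] - y[t - 1] for t in range(1, len(y))]
    eps = [0.0] * len(y)
    preds = [None] * len(y)
    for t in range(1, len(y)):
        z_lag = z[t - s] if t - s >= 1 else 0.0
        eps_lag = eps[t - s] if t - s >= 1 else 0.0
        z_hat = theta_1 * z_lag + phi_1 * eps_lag  # expected change, since E[eps_t] = 0
        preds[t] = y[t - 1] + z_hat                # undo the first difference
        eps[t] = z[t] - z_hat                      # realised one-step error
    return preds, eps


# Hypothetical correlation values, with the Gold-WTI coefficients from Eq. (10).
toy_y = [0.42, 0.47, 0.40, 0.36, 0.44, 0.51, 0.43, 0.38]
toy_preds, toy_residuals = one_step_forecasts(toy_y, theta_1=0.66, phi_1=-0.91)
```

For the first four steps no seasonal lag is available, so the forecast reduces to the previous observation; from the fifth step onward the seasonal AR and MA terms adjust that baseline.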

The residuals of the rolling SARIMAX model for Gold–WTI \(R_1\) and Gold–Brent \(R_2\) are summarized in Fig. 6. The residuals lie within the range of \(-\)0.1 to 0.3, which is quite close to 0. This indicates that the rolling SARIMAX model (Fig. 6) captures the dynamics and trends in the dataset more accurately than the static SARIMAX model (Fig. 5).

Fig. 6 Rolling Model Residuals for Monthly Rolling Correlation. Graphs (a) and (b) illustrate the residuals obtained from rolling SARIMAX models applied to monthly rolling correlations data for Gold–WTI \(R_1\) and Gold–Brent \(R_2\), respectively. The residuals consistently remain between \(-\)0.1 and 0.3, indicating a more stable and predictable pattern in the model’s performance than in Fig. 5. Refer to Sects. 4.4 and 4.5 to understand the model development process.

The rolling SARIMAX model employs first-order differencing to remove the trend from the time series, taking into account how the series is changing over time as well. To make a prediction, the model first applies the differencing to the current observation and then uses the estimated coefficients to calculate the predicted value of the seasonal component, which is then added back to the differenced value to get the final forecasted value. To evaluate the performance of the developed rolling SARIMAX model, the prediction values are compared with actual observations during the period between July 2021 and December 2022, as demonstrated in Fig. 7.
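The walk-forward idea behind a rolling forecast can be illustrated with a deliberately simplified model. The sketch below refits only a seasonal AR coefficient on the differenced series by least squares before each one-step forecast and omits the seasonal MA term, so it is a stand-in for, not a reproduction of, the rolling SARIMAX model; all names and the toy series are hypothetical.

```python
def fit_seasonal_ar(z, s=4):
    """Least-squares estimate of theta in z_t = theta * z_{t-s} + e_t (no MA term)."""
    num = sum(z[t] * z[t - s] for t in range(s, len(z)))
    den = sum(z[t - s] ** 2 for t in range(s, len(z)))
    return num / den if den else 0.0


def rolling_forecast(y, n_test, s=4):
    """Walk forward over the last n_test points, refitting before each one-step forecast."""
    preds = []
    for t in range(len(y) - n_test, len(y)):
        history = y[:t]  # only observations available before time t
        z = [history[i] - history[i - 1] for i in range(1, len(history))]
        theta = fit_seasonal_ar(z, s)      # coefficient re-estimated on the latest data
        z_hat = theta * z[-s]              # z[-s] is the differenced value s steps before t
        preds.append(history[-1] + z_hat)  # add the predicted change to the last level
    return preds


# Hypothetical series whose changes repeat exactly every 4 steps.
toy_changes = [0.10, -0.05, 0.02, -0.07]
toy_levels = [0.3]
for i in range(24):
    toy_levels.append(toy_levels[-1] + toy_changes[i % 4])
toy_rolling_preds = rolling_forecast(toy_levels, n_test=6)
```

The design point is that each forecast uses a model estimated on all data observed up to that moment, in contrast to a static model whose coefficients are fixed once at the end of the training period.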

Fig. 7 Rolling SARIMAX Forecast for Monthly Rolling Correlation. Graphs (a) and (b) showcase the monthly rolling correlation observations and predictions for Gold–WTI \(R_1\) and Gold–Brent \(R_2\), respectively. The predictive values, generated through the Rolling SARIMAX model, are juxtaposed with actual observations spanning July 2021–December 2022. For Gold–WTI \(R_1\), the training dataset is from May 1983 to June 2021, while the testing dataset is from July 2021 to December 2022. For Gold–Brent \(R_2\), the training dataset ranges from July 1988 to June 2021, while the testing dataset ranges from July 2021 to December 2022. Refer to Sect. 4.5 to understand the model development process.

The rolling SARIMAX results are summarized in Fig. 8, while the performance metrics are summarized in Table 2. The Gold–WTI \(R_1\) and Gold–Brent \(R_2\) models have R-square values of 0.89 and 0.88, respectively.

Fig. 8 Rolling SARIMAX results for Monthly Rolling Correlation. Images (a) and (b) present the prediction results obtained from the rolling SARIMAX model in Fig. 7 for monthly rolling correlations of Gold–WTI \(R_1\) and Gold–Brent \(R_2\). For a comprehensive understanding of the model development and detailed results, consult Sects. 4.4 and 4.5 in the main text.

5 Results and discussions

The research outcome offers insight into the correlation and predictability of commodity prices, specifically focusing on Gold, Brent, and WTI futures. The results are multifaceted and provide substantial contributions to the field of financial forecasting.

In order to evaluate the performance of the proposed novel Rolling SARIMAX model, the same dataset was used to train conventional prediction models, including Autoregressive, Moving average, Autoregressive moving average and Seasonal ARIMA models. Their R-square values and mean absolute percentage errors (MAPE) are summarised in Table 3.
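For reference, the two evaluation metrics can be computed as in the sketch below, which uses the standard definitions of the coefficient of determination and of MAPE. Whether the study reports MAPE as a fraction or as a percentage is not stated, so the fraction form is an assumption here; the function names and toy values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error as a fraction (0.10 means 10%).

    Undefined when any actual value is zero, which matters for correlation
    series that can pass through zero.
    """
    return sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical observed and predicted correlation values.
toy_actual = [0.5, 0.6, 0.4, 0.8]
toy_predicted = [0.45, 0.66, 0.4, 0.72]
toy_r2 = r_squared(toy_actual, toy_predicted)
toy_mape = mape(toy_actual, toy_predicted)
```

Both metrics are computed on the test period only, so that they reflect out-of-sample accuracy rather than goodness of fit on the training data.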

Overall Key Results of the Research:

  1. Similarity in Brent and WTI Price Movements:

    The study uncovered a striking 85 percent similarity in the price movements of Brent and WTI futures, as indicated by the monthly rolling correlations of Gold–WTI \(R_1\) and Gold–Brent \(R_2\) (Fig. 2). This significant correlation, though not the primary objective of the study, is a noteworthy finding that confirms the strong linkage between these two major types of crude oil.

  2. Volatility Analysis:

    The decomposition of raw correlation data (Fig. 3) reveals notable volatility in the correlations of Gold with both WTI futures \(R_1\) and Brent futures \(R_2\). This volatility is not merely a fluctuation in prices but suggests substantial direct volatility fluctuations, supporting the findings of Ewing and Malik [32]. Understanding this volatility is critical, especially for market participants and policy makers.

  3. Rolling SARIMAX Model’s Efficacy:

    The rolling SARIMAX models for Gold–WTI \(R_1\) and Gold–Brent \(R_2\) show a marked improvement over conventional static SARIMAX models (Fig. 5). The models demonstrate at least a 40 percent increase in R-square value and a 95 percent reduction in MAPE (Table 3). These metrics indicate a significant enhancement in predictive accuracy and reliability.

  4. Rolling SARIMAX Model’s Performance Metrics:

    The Rolling SARIMAX model achieved R-square values of 0.898 for Gold Spot Price-WTI Futures \(R_1\) and 0.884 for Gold Spot Price-Brent Futures \(R_2\) (Table 3). These figures are exceptionally high for financial time series models, suggesting that the models have strong predictive power. High R-square values mean that the model can explain a significant portion of the price movements of the Gold spot price in relation to WTI and Brent futures. This is particularly useful for investors and analysts in understanding market trends and making informed decisions.

The Rolling SARIMAX model for Gold Spot Price-WTI Futures \(R_1\) reported a MAPE of 0.1033, and for Gold Spot Price-Brent Futures \(R_2\), the MAPE was 0.1084 (Table 3). These low values signify high accuracy in forecasts, implying that the model’s predictions are closely aligned with actual market movements. In comparison to conventional SARIMAX models, the 95% reduction in MAPE is substantial (Table 3). This demonstrates the enhanced precision of the rolling models in capturing market dynamics and reduces the likelihood of significant forecasting errors.

The superiority of our novel Rolling SARIMAX model over traditional models is a key contribution of this paper. The improved R-square and reduced MAPE values demonstrate that the rolling approach to SARIMAX modelling is more adept at capturing the dynamics of the market and predicting future price movements. This provides practical implications for financial analysts, investors, and policymakers, by offering a more reliable tool for forecasting commodity prices.

6 Research novelty and contribution

The novel rolling SARIMAX model’s pre-eminence in analyzing the dynamics between gold and crude oil markets can be attributed to its unique characteristics and methodological advancements compared to other models. To fully appreciate its superiority, it is essential to examine its features and compare them with other notable models in recent studies.

  1. Adaptability and Time-Series Framework:

    Unlike static models such as the VAR by Li and Umair [52] and the asymmetric VAR-BEKK(DCC)-GARCH by Chen et al. [20], the newly proposed rolling SARIMAX Model excels in its adaptability. It effectively assimilates real-time data, making it responsive to immediate market changes. This characteristic is crucial in the volatile gold and oil markets, where external shocks and geopolitical events can rapidly alter market dynamics. For instance, Shafiee and Topal’s [74] exploration of gold price forecasting, while foundational, is somewhat constrained by its reliance on historical data, potentially limiting its applicability to current market dynamics. Similarly, Ewing and Malik [32] applied GARCH models to assess volatility between gold and oil futures; however, this approach might overlook broader market interactions and trends. In contrast, the newly proposed Rolling SARIMAX Model’s adaptability and real-time data integration offer a more holistic understanding of market trends. Balcilar et al. [3] explored return and volatility spillovers using the Diebold and Yilmaz spillover index. Their comprehensive analysis could be further enriched by the Rolling SARIMAX Model’s advanced time-series capabilities, because it offers a more adaptive and holistic perspective.

  2. Handling of Linear and Non-Linear Relationships:

    The rolling SARIMAX Model is designed to analyze both linear and non-linear interactions within the markets. This dual capability offers a more comprehensive analysis compared to models like the nonparametric causality-in-quantiles method by Shahbaz et al. [73]. While their approach reveals nuanced relationships, it might not capture the full complexity of interactions as well as the proposed Rolling SARIMAX Model.

  3. Integration of External Variables:

    The rolling SARIMAX model’s ability to integrate external variables like geopolitical events and economic indicators sets it apart from traditional approaches. Studies conducted by Malliaris and Malliaris [55] and Zhang and Wei [90] provided valuable insights into using time series and neural network methods, and cointegration and causality analysis, respectively. However, they may not capture the dynamic nature of asset relationships, particularly in response to external shocks, as effectively as the proposed Rolling SARIMAX Model.

  4. Flexibility and Forecasting Accuracy:

    In comparison with complex models like the VILIG model of Wu et al. (2023) [88] and the combination of econometric and machine learning models [88], the rolling SARIMAX Model offers a more user-friendly and practical framework without sacrificing accuracy. Its simplicity in structure does not detract from its forecasting capabilities. Instead, it enhances its applicability in various economic scenarios, making it a valuable tool for practitioners and researchers.

  5. Robustness in Evolving Markets:

    The robustness of the proposed rolling SARIMAX model in evolving market conditions is a significant advantage. It goes beyond static analysis by continuously updating and adjusting to new data. This ongoing adjustment process allows for a more accurate and timely understanding of market trends and relationships, a feature less emphasized in traditional models.

In conclusion, the proposed rolling SARIMAX Model’s approach to handling linear and non-linear relationships, its adaptability to external factors, and its robustness in evolving market scenarios, all contribute to its standing as a superior tool for analyzing the interplay between gold and crude oil markets. Its balance of complexity and practicality makes it not only a sophisticated analytical tool but also accessible for real-world application.

Table 1 SARIMAX model combinations with AIC and BIC values. A summary of AIC and BIC values for various SARIMAX model combinations is presented. For both \(R_1\) and \(R_2\), the combination of (0,1,0)(1,0,1,4) emerges with the smallest AIC and BIC values. These findings underscore the suitability of this specific SARIMAX model configuration for capturing and predicting the dynamics in the respective commodity futures.
Table 2 Performance metrics of Rolling SARIMAX model. The metrics offer a comprehensive evaluation of the model’s predictive accuracy and effectiveness. Refer to Sects. 4.4 and 4.5 to understand the model development process.
Table 3 Performance comparison with conventional models. This table presents a performance comparison between the Rolling SARIMAX model and static conventional prediction models. Notably, the developed Rolling SARIMAX model demonstrates the highest R-square and the lowest Mean Absolute Percentage Error (MAPE), underscoring its superior predictive accuracy in contrast to conventional models.

7 Theoretical implication

The proposed predictive model for the gold-oil relationship, incorporating rolling correlation, the SARIMAX model, and the rolling model, presents both theoretical and practical advancements in the field of financial market analysis. Each component contributes uniquely to the model, enhancing its robustness and accuracy. Below is a discussion of the theoretical and practical contributions of each approach, followed by an evaluation of the overall model.

The rolling correlation approach contributes to the theoretical understanding of dynamic inter-market relationships. It supports a deeper analysis of how correlations between gold and oil prices evolve over time and emphasizes the non-static nature of correlations in financial time series, challenging traditional models that often assume constant relationships. Practically, rolling correlation provides up-to-date insight into market dynamics, which investors and analysts need in order to make informed decisions based on current market conditions. This dynamic correlation analysis can aid risk management and asset allocation strategies.
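The contrast between a static and a rolling correlation can be sketched as follows. This is an illustration under stated assumptions: the prices are synthetic, the correlation is computed on log returns (the study may instead correlate price levels), and the 60-observation window and the function name `rolling_correlation` are hypothetical choices.

```python
import numpy as np
import pandas as pd


def rolling_correlation(a, b, window):
    """Pearson correlation of the log returns of two price series over a trailing window."""
    ra = np.log(a).diff()
    rb = np.log(b).diff()
    return ra.rolling(window).corr(rb)


# Synthetic prices (assumption): the coupling between the two series flips sign halfway
rng = np.random.default_rng(1)
n = 300
idx = pd.date_range("2020-01-01", periods=n, freq="B")
common = rng.normal(size=n)
sign = np.where(np.arange(n) < n // 2, 1.0, -1.0)
gold_ret = 0.01 * (common + 0.5 * rng.normal(size=n))
oil_ret = 0.02 * (sign * common + 0.5 * rng.normal(size=n))
gold = pd.Series(1800 * np.exp(np.cumsum(gold_ret)), index=idx)
oil = pd.Series(70 * np.exp(np.cumsum(oil_ret)), index=idx)

rho = rolling_correlation(gold, oil, window=60)
static = np.log(gold).diff().corr(np.log(oil).diff())

print(f"static correlation over the full sample: {static:.2f}")
print(f"rolling correlation before the flip:     {rho.iloc[149]:.2f}")
print(f"rolling correlation at the end:          {rho.iloc[-1]:.2f}")
```

In this constructed example the full-sample coefficient averages two opposite regimes, while the rolling series recovers each regime separately.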

The inclusion of the SARIMAX model extends the theoretical framework by incorporating seasonality and the impact of exogenous variables on the gold-oil relationship. It broadens the scope of analysis beyond price movements to external economic and geopolitical factors, whereas conventional time-series models often focus solely on internal dynamics. This enables a more comprehensive understanding of how external factors and cyclical patterns influence time-series data. In practical terms, the SARIMAX model enhances predictive accuracy by accounting for external influences, providing a more holistic view of the factors affecting gold and oil prices. This is particularly beneficial for long-term strategic planning and forecasting.
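As a notational sketch, one common way to write a SARIMAX\((p,d,q)(P,D,Q)_s\) model is as a regression on the exogenous variables with seasonal ARIMA errors. The symbols below (\(y_t\) for the rolling correlation, \(x_t\) for the exogenous regressors, \(\beta\), \(u_t\), \(\varepsilon_t\), and the backshift operator \(B\)) are assumptions made for illustration and may differ from the notation used in the model development sections:

\[
y_t = \beta^{\top} x_t + u_t, \qquad
\phi_p(B)\,\Phi_P(B^{s})\,(1-B)^{d}\,(1-B^{s})^{D}\,u_t = \theta_q(B)\,\Theta_Q(B^{s})\,\varepsilon_t .
\]

For the configuration (0,1,0)(1,0,1,4) reported in Table 1, and with the moving-average polynomial written with a plus sign, the error equation reduces to

\[
(1-\Phi_1 B^{4})\,(1-B)\,u_t = (1+\Theta_1 B^{4})\,\varepsilon_t ,
\]

that is, a first difference combined with one seasonal autoregressive term and one seasonal moving-average term at lag 4.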

The rolling model’s theoretical contribution lies in its adaptability and continuous learning approach. It reflects the dynamic nature of financial markets, where past data may not always be a reliable indicator of future trends. Continuously updating the parameters as new data arrive addresses the problem of a model becoming stale, so that it keeps pace with the changing nature of financial time series. On a practical level, the rolling model ensures that the predictive tool remains relevant and up to date by continuously integrating the latest market data. This feature is crucial for maintaining the accuracy and reliability of predictions in a rapidly changing market environment.
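The effect of continuous re-estimation can be sketched with a walk-forward loop. To keep the example dependency-free, a least-squares AR(1) model is swapped in for SARIMAX; this is a deliberate simplification, not the model of this study, and the series, window length, and function names are hypothetical. The same loop structure would apply if each step refitted a SARIMAX model instead.

```python
import numpy as np


def fit_ar1(x):
    """Least-squares AR(1) coefficient without intercept (stand-in for a SARIMAX fit)."""
    return float(np.dot(x[:-1], x[1:]) / np.dot(x[:-1], x[:-1]))


def static_forecasts(x, train_size):
    """Estimate once on the initial sample, then reuse the same parameter."""
    phi = fit_ar1(x[:train_size])
    return phi * x[train_size - 1:-1]


def rolling_forecasts(x, train_size, window):
    """Re-estimate on the most recent `window` points before every one-step forecast."""
    preds = []
    for t in range(train_size, len(x)):
        phi = fit_ar1(x[t - window:t])
        preds.append(phi * x[t - 1])
    return np.array(preds)


# Synthetic series (assumption): the AR coefficient flips from 0.8 to -0.5 halfway through
rng = np.random.default_rng(42)
n = 400
x = np.zeros(n)
for t in range(1, n):
    phi_true = 0.8 if t < n // 2 else -0.5
    x[t] = phi_true * x[t - 1] + rng.normal()

train_size, window = 150, 50
actual = x[train_size:]
mae_static = np.mean(np.abs(actual - static_forecasts(x, train_size)))
mae_rolling = np.mean(np.abs(actual - rolling_forecasts(x, train_size, window)))
print(f"static MAE:  {mae_static:.3f}")
print(f"rolling MAE: {mae_rolling:.3f}")
```

The static model keeps using the coefficient learned before the regime change, whereas the rolling model adapts once the window contains enough post-change observations.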

Overall, the combined predictive model makes a substantial contribution to the field of time-series analysis in financial markets. The integration of rolling correlation, SARIMAX, and rolling model represents a holistic approach that addresses several critical aspects of time-series data:

  1.

    Dynamic Correlation Analysis:

    By employing rolling correlation, the model emphasizes the importance of understanding evolving relationships in time-series data, moving beyond static correlation analysis. This is crucial for accurately capturing the interconnectedness of financial variables over time.

  2.

    External Factors and Seasonality:

    The inclusion of the SARIMAX model highlights the significance of considering both external influences and seasonal patterns in time-series analysis, aspects often overlooked in simpler models.

  3.

    Adaptability and Continuous Learning:

    The rolling model’s approach to continuously update itself with new data addresses a key challenge in time-series analysis: ensuring that models remain relevant as market conditions change. This adaptability is essential for maintaining the accuracy of predictions and insights derived from time-series data.

The model stands to offer significant benefits to investors, traders, and market analysts. Its ability to provide timely, accurate, and comprehensive predictions makes it a valuable tool for strategic decision-making. The model’s adaptability and inclusion of external factors make it particularly useful in developing risk management strategies, optimizing portfolio allocation, and identifying investment opportunities.

8 Practical implication

Time-series correlations are useful for many types of data modelling and forecasting because they indicate whether an interrelationship exists between two time-series variables. In finance, they are often used to predict stock prices and other financial variables. For example, the correlation between stock returns and financial indicators of selected companies was examined using the correlation and covariation method [28]. Further, investors generally analyse the correlation between the returns of different assets in a portfolio to predict returns and make informed decisions. This correlation analysis is a vital aspect of modern portfolio theory, as it helps investors build a diversified portfolio that maximizes returns and minimises risk by identifying assets with low or negative correlation [29].
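The role of correlation in diversification can be made explicit with the standard two-asset portfolio variance. The symbols are introduced here for illustration only: \(w_1, w_2\) are portfolio weights, \(\sigma_1, \sigma_2\) are the return standard deviations, and \(\rho_{12}\) is the correlation between the two returns:

\[
\sigma_p^{2} = w_1^{2}\sigma_1^{2} + w_2^{2}\sigma_2^{2} + 2\,w_1 w_2\,\rho_{12}\,\sigma_1\sigma_2 .
\]

As a hypothetical worked example, take \(w_1 = w_2 = 0.5\) and \(\sigma_1 = \sigma_2 = 0.20\). With \(\rho_{12} = 1\), \(\sigma_p^{2} = 0.01 + 0.01 + 0.02 = 0.04\), so \(\sigma_p = 0.20\) and nothing is gained from diversification. With \(\rho_{12} = 0\), \(\sigma_p^{2} = 0.02\) and \(\sigma_p \approx 0.141\). With \(\rho_{12} = -0.5\), \(\sigma_p^{2} = 0.02 - 0.01 = 0.01\) and \(\sigma_p = 0.10\). The lower the correlation, the lower the portfolio risk for the same weights.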

In economics, time-series correlation can be used to examine the interrelationship between economic indicators and the factors that affect them. For example, correlation and analysis of variance (ANOVA) were used to model the relationship between gross domestic product (GDP) and other variables such as exchange rates and the Sensex [24]. Furthermore, time-series and semi-correlation analysis were employed to forecast international equity cross-correlations between different nations to better comprehend their economic activities [31]. Against this background, the practical implications of the rolling SARIMAX model are as follows:

  1.

    Enhanced Market Analysis and Risk Management:

    The rolling SARIMAX Model allows for a more accurate prediction of the correlation between Gold and Crude oil prices, crucial for risk management in financial markets. Hamilton [38] also emphasizes the importance of understanding oil price fluctuations for economic forecasting and risk management in financial markets.

  2.

    Informed Policy and Economic Planning:

    Governments can leverage the model’s predictions for economic planning, particularly in resource allocation and fiscal policy formulation. According to the World Bank’s Commodity Markets Outlook [84], understanding commodity price movements, like those of oil and gold, is essential for effective policymaking, especially in resource-dependent economies.

  3.

    Strategic Investment Decisions:

    Investors can use the model to make informed decisions on asset allocation, diversifying portfolios to minimize risk and maximize returns. Bodie et al. [12] also highlight the significance of diversification and understanding asset correlations for effective portfolio management.

  4.

    Predicting Economic Cycles:

    The model’s ability to forecast the relationship between Gold and Crude oil prices can signal different phases of the economic cycle, aiding in anticipatory measures. Kilian and Park [50] also indicate that oil price shocks can be predictive of economic cycles, underscoring the value of models that can forecast these relationships.

  5.

    Global Economic Outlook Interpretation:

    Predictions from the model provide insights into global economic health, helping international investors and policymakers in decision-making. The International Monetary Fund’s (IMF) [43] World Economic Outlook reports regularly discuss the impact of commodities like oil and gold on the global economy, emphasizing the need for predictive models in economic analysis.

These practical implications of the Rolling SARIMAX Model demonstrate its importance in providing a deeper and more nuanced understanding of the interplay between key commodities. This understanding is crucial for various stakeholders, including investors, policymakers, and economists, in navigating the complex dynamics of global financial markets.

9 Conclusion

An important presumption when a statistical model is used to examine commercial time-series data is that the model’s parameters remain static over time. Such an assumption may not be reasonable given how frequently the financial landscape evolves. The long-term stability of a time-series model can be evaluated using a rolling forecast method. In practice, however, the rolling correlation is not constant but changes over time, and prediction performance degrades considerably when the correlation coefficient is low. Therefore, this study proposes a novel prediction model that integrates the rolling correlation method, the SARIMAX model, and the rolling model. In practice, the proposed prediction model can help corporations, governments, and investors analyse market dynamics.

In this study, the rolling correlation between the Gold spot price and the Brent and WTI petroleum futures prices is evaluated. Thereafter, because the time-series correlation data exhibit considerable seasonality, the rolling SARIMAX model is deployed to forecast the rolling correlation between the Gold spot price and the petroleum (i.e. Brent and WTI) futures prices. It is demonstrated that the proposed rolling SARIMAX prediction model improves the R-square by 42.47% and 40.54% and reduces the mean absolute error by 95.65% and 95.04% compared with the conventional SARIMAX models for Gold–WTI \(R_1\) and Gold–Brent \(R_2\), respectively. Accordingly, this study can support the prediction of the world market outlook and economic growth. For example, the proposed prediction model can help governments identify which of the four phases of the economic cycle (expansion, peak, recession, and trough) a nation is in. Understanding the attributes of each phase is essential since they form the basis of a nation’s top-down financial analysis.
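For readers who wish to reproduce comparisons of this kind, the sketch below shows how R-square, mean absolute error, and a relative percentage change between two models could be computed. The arrays are toy values, not results from this study, and the assumption that the reported percentages are relative changes against the conventional model is ours; the function names are hypothetical.

```python
import numpy as np


def r_squared(y, yhat):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - yhat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot


def mae(y, yhat):
    """Mean absolute error."""
    return np.mean(np.abs(y - yhat))


def relative_change(baseline, new):
    """Percentage change of `new` relative to `baseline`."""
    return 100.0 * (new - baseline) / abs(baseline)


# Toy data (assumption): observed correlations and two sets of forecasts
y = np.array([0.2, 0.4, 0.1, -0.3, 0.5])
conventional = np.array([0.0, 0.2, 0.3, -0.1, 0.2])
rolling = np.array([0.18, 0.41, 0.12, -0.28, 0.47])

r2_gain = relative_change(r_squared(y, conventional), r_squared(y, rolling))
mae_cut = -relative_change(mae(y, conventional), mae(y, rolling))
print(f"R-square improvement: {r2_gain:.1f}%")
print(f"MAE reduction:        {mae_cut:.1f}%")
```

Note that a relative improvement in R-square depends strongly on the baseline value, so it is best reported alongside the absolute R-square of both models.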

Other economic metrics that correspond to a country’s economic cycle are GDP and the unemployment rate. Along with these two metrics, the proposed prediction model can be used to determine the phase of a country’s economic cycle. For example, a negative predicted rolling correlation between Gold and Crude Oil futures prices indicates that the economy is slowing down or that certain factors have elevated uncertainty in the markets. As previously mentioned, in an unstable economy, investors and governments mostly hedge their positions by buying Gold derivatives. Similarly, in a shrinking economy, the demand for and price of crude oil diminish. Together, high demand for Gold derivatives and low demand for crude oil in the international market indicate a global slowdown. Conversely, a positive predicted rolling correlation between Gold and Crude Oil futures prices indicates a strong world economy with rising inflation. As the world economy grows, the demand for and price of crude oil rise, which results in inflation. With rising inflation, the value of the dollar diminishes, and investors buy precious metals, stocks, or real estate to hedge against inflation.
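The sign-based reading described above can be written as a simple rule. This is only an illustrative sketch: the neutral band of ±0.1 around zero and the function name are assumptions introduced here, and in practice such a signal would be read together with GDP and the unemployment rate rather than on its own.

```python
def interpret_predicted_correlation(rho, neutral_band=0.1):
    """Map a predicted Gold-crude oil rolling correlation to a qualitative reading."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    if rho < -neutral_band:
        # Gold demand up while oil demand falls: hedging against instability
        return "slowdown or elevated uncertainty"
    if rho > neutral_band:
        # Oil and Gold rising together: growth accompanied by inflation hedging
        return "expansion with rising inflation"
    return "no clear signal"


for value in (-0.6, 0.05, 0.7):
    print(value, "->", interpret_predicted_correlation(value))
```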

The interrelationship between the Gold spot price and crude oil (WTI and Brent) futures has been critically evaluated using the proposed approach by considering the correlation of the data, comparing against the conventional SARIMAX model, and demonstrating improved prediction accuracy. It is challenging to forecast how the correlation between Gold and crude oil prices will change over the next two to three decades, as it is influenced by a variety of economic and geopolitical factors. However, based on their underlying drivers, it is reasonable to claim that the interrelationship between Gold and crude oil is unlikely to shift remarkably in the long run. Factors such as ongoing energy demand and the possibility of market instability, which can drive demand for Gold as a safe-haven asset, are expected to remain stable in the future. As a result, the price correlation between these two commodities is likely to persist. Despite the adoption of virtual money in certain nations, Gold will continue to be utilized and stored as a fallback currency.

On the other hand, the lack of infrastructure and rising energy demand make complete reliance on renewable resources rather challenging. There are also substantial costs associated with utilising renewable resources, such as the expense of constructing and maintaining the necessary infrastructure and of storing excess energy when it is not fully utilized. By contrast, fossil fuels may continue to be attractive due to their ease of transportation and widespread availability. Therefore, crude oil and its substitutes will still have considerable demand in the upcoming decades. As the value of fossil fuels depends on a variety of factors, such as their accessibility, the cost of alternative energy sources, and demand, crude oil will continue to be valuable and can influence the global economy due to its scarcity.

10 Limitation and future study

Rolling prediction models may become unstable over time if the parameters are not constant throughout the sample. A common way to evaluate a model’s parameter consistency is to compute parameter estimates over a rolling window of fixed size across the sample. Another drawback of rolling prediction is that it only provides forecasts up to the same month of the following year for 12-month timeframes, and possibly one or two additional months for longer timeframes. This restricts the analysis because different years cannot be compared. Addressing seasonality is another issue with rolling projections of 12–18 months.
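The rolling-window check of parameter consistency mentioned above can be sketched as follows. The example uses a simple linear regression slope in place of SARIMAX parameters, synthetic data, and a hypothetical window of 60 observations; all of these are assumptions chosen to keep the illustration short.

```python
import numpy as np


def rolling_slopes(x, y, window):
    """OLS slope of y on x, re-estimated over each trailing window."""
    slopes = []
    for end in range(window, len(x) + 1):
        xs, ys = x[end - window:end], y[end - window:end]
        slope, _intercept = np.polyfit(xs, ys, 1)
        slopes.append(slope)
    return np.array(slopes)


# Synthetic data (assumption): one stable relationship and one that drifts
rng = np.random.default_rng(7)
n = 300
x = rng.normal(size=n)
y_stable = 1.0 * x + 0.1 * rng.normal(size=n)
y_drift = np.linspace(1.0, 3.0, n) * x + 0.1 * rng.normal(size=n)

window = 60
spread_stable = np.std(rolling_slopes(x, y_stable, window))
spread_drift = np.std(rolling_slopes(x, y_drift, window))
print(f"spread of rolling slopes, stable case:   {spread_stable:.3f}")
print(f"spread of rolling slopes, drifting case: {spread_drift:.3f}")
```

A wide spread of the rolling estimates relative to their sampling noise suggests that the parameter is not constant over the sample, which is the situation in which a fixed-parameter model is most likely to degrade.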

Future research could gather data on global events that occurred during the period in question and analyse whether the model’s predictions are significantly influenced by these events. Comparing the model’s predictions with actual outcomes and examining the impact of global events would give a better understanding of the model’s accuracy and identify potential areas for improvement. Future research could also further investigate the relationship between the Gold spot price and crude oil futures by considering a range of variables that may influence their correlation, such as the volume of each commodity traded, changes in their prices over time, and the highest and lowest price points reached. Incorporating these factors into the analysis could provide a more comprehensive understanding of the Gold spot-crude oil futures interrelationship and its potential fluctuations, and yield valuable insights into derivative valuation and hedging decisions.