Abstract
Cross-correlation analysis is a powerful tool for understanding the mutual dynamics of time series. This study introduces a new method for predicting the future state of synchronization of the dynamics of two financial time series. To this end, we use the cross recurrence plot analysis as a nonlinear method for quantifying the multidimensional coupling in the time domain of two time series and for determining their state of synchronization. We adopt a deep learning framework for methodologically addressing the prediction of the synchronization state based on features extracted from dynamically sub-sampled cross recurrence plots. We provide extensive experiments on several stocks, major constituents of the S&P100 index, to empirically validate our approach. We find that the task of predicting the state of synchronization of two time series is in general rather difficult, but for certain pairs of stocks attainable with very satisfactory performance (84% F1-score, on average).
1 Introduction
Time series prediction and classification in finance is significantly challenging due to the complexity, multivariate nature, and non-stationarity of time series in this domain [1]. Security trading and price dynamics in financial markets are particularly complex due to the interacting nature and inter-connectedness of their underlying driving forces and determinants, leading to significant co-movements in stocks’ prices. The characterization and modeling of multivariate time series dynamics have long been discussed in the financial literature, where the prevailing approach is that based on classical econometric theory. Among the multivariate linear models, the most widespread ones are vector autoregressive (VAR) models [2, 3], vector moving averages, ARMA (autoregressive moving average) models [4], and cointegrated VAR models [5]. Multivariate conditional heteroskedasticity models of the GARCH type are also widespread (see, e.g., [6] for a review), as are multivariate stochastic volatility models [7] and methods based on the realized volatility [8].
Among the nonlinear models, the threshold autoregressive model [9], smooth transition autoregressive models [10], and Markov switching models [11] are nowadays standard approaches. Alternatives include nonparametric methods, functional coefficient [12] and nonlinear additive AR models [13], recurrence analysis, and neural networks. The complexity of modern financial markets running over the so-called limit-order book mechanism is, however, characterized by typical nonlinear, noisy, and often non-stationary dynamics. In addition, the high-dimensional nature of the limit-order book flow and the complexity of the interactions within it constitute severe limits in the applicability of classic econometric methods for its modeling and forecasting. Besides a very limited number of analytical and tractable models for the order flow and price dynamics in limit-order books [14,15,16], machine learning methods have received much attention [17, 18], as they are naturally appealing in this context.
By considering the stock market as a complex system, it is natural to apply such methods for addressing those prediction problems where the application setting and assumptions beneath standard econometric techniques are stringent or inadequate. Indeed, it has extensively been shown that, in financial applications, deep learning (DL) models are often capable of outperforming traditional approaches due to their ability to learn complex data representations based on end-to-end data-driven training, see, e.g., [19,20,21,22,23,24]. DL models have been adopted for a variety of problems ranging from price prediction [25,26,27,28,29], limit-order book-based mid-price prediction [20, 21, 23, 30, 31], and volatility prediction [32,33,34].
Whereas the target of the above literature is generally the analysis and prediction of single time series, this paper focuses on an analysis of stock pair co-movements. Several trading strategies can be put into play to take advantage of co-movements and exploit statistical arbitrages, including pair-trading, portfolio management, or relative and convergence trading strategies applied at an intraday level, e.g., see [35] for an overview. While DL provides a basis for prediction given a set of descriptive features, the issue of how to detect and quantify co-movements remains to be addressed. This paper suggests the use of recurrence analysis based on cross recurrence plots (CRP) for detecting and extracting features indicative of stocks’ shared dynamics or co-movements, along with a deep learning framework for predicting whether certain pairs of stocks will exhibit shared dynamics in the future (in the sense specified in Sect. 2). The possibility of forecasting epochs of time series synchronization is relevant not only for extending the ML and applied econometrics literature in this direction, but also for practitioners.
For detecting and quantifying co-movements or, more generally, shared dynamical features in time series, the standard econometric approach is that of cross-correlation analysis [11, Ch. 8]. This intuitive linear approach, based on the estimation and perhaps forecasting of cross-correlation matrices, appears to be an element of a much wider theory and methodological approach that has been explored and developed in the last years within a broader generic non-financial setting. Simple cross-correlation analysis has been remarkably extended and generalized toward methods that help explore co-movements between time series within nonlinear, noisy, and non-stationary systems of very complex dynamics, either financial [36,37,38] or not [39,40,41].
Recurrence analysis [42] explores the reconstruction of a phase-space using time-delay embedding for quantifying characteristics of nonlinear patterns in a time series over time [43]. This is done by calculating the so-called Recurrence Plot [44], the core concept of which is to identify all points in time at which the phase-space trajectory of a single time series visits roughly the same area in the phase-space. Recurrence plot analysis has no assumptions or limitations on dimensionality, distribution, stationarity, and size of data [42]. These characteristics make it suitable for multidimensional and non-stationary financial time series data analysis. The CRP [45] is an extension of the recurrence plot, introduced to analyze the co-movement and synchronization of two different time series. The CRP indicates points in time at which a time series visits the state of another time series, with possibly different lengths, in the same phase-space. These concepts are discussed in further detail in Sect. 2.
In this paper, we propose a method for predicting the state of synchronization over time of two multidimensional financial time series based on their CRP. In particular, we use the CRP to quantify the co-movements and extract the binary representation of its diagonal elements as the prediction targets for a DL model. For predicting the state of synchronization at the next epoch, we employ a Convolutional Neural Network (CNN) that uses as inputs CRPs independently calculated from data-crops obtained by applying fixed-size sliding windows on the time series. Our extensive experiments on 12 stocks of the S&P100 index selected from different sectors show that the proposed method can predict the synchronization of stock pairs with satisfactory performance.
The remainder of the paper is organized as follows. Section 2 introduces in detail the concepts and theory behind the CRP, with an outlook on its applications in financial and economic problems. Our proposed approach for predicting time instances of time series’ synchronization is presented in Sect. 3. Empirical results on real market data are provided in Sect. 4, while Sect. 5 provides conclusions.
2 Financial time series recurrence analysis
Recurrence in the analysis of time series, seen as a nonlinear dynamic system, is the repetition of a pattern over time. The visualization of recurrences in the dynamics of a time series can be expressed via a recurrence plot (RP) or recurrence matrix [42]. In other words, the RP represents the recurrence of the phase-space trajectory to a state. The phase-space of a d-dimensional time series \({\mathcal {N}}\) with T observations \({\mathcal {N}} =\{{\textbf{n}}_1^\top , {\textbf{n}}_2^\top ,\dots , {\textbf{n}}_T^\top \}^\top\), with \({\textbf{n}}_i\) being the row vector representing a generic observation at time i, \(i = 1,\dots ,T\), is calculated using the time-delay embedding method. State \(N_i\) in the phase-space is obtained by
\[ N_i = \left[ {\textbf{n}}_i,\, {\textbf{n}}_{i+\tau },\, \dots ,\, {\textbf{n}}_{i+\tau (k-1)} \right] , \qquad i = 1,\dots ,T', \tag{1} \]
where \(\tau\) denotes the delay, k is the embedding dimension, and \(T' = T-\tau (k-1)\). The parameters \(\tau\) and k can, respectively, be determined with the Average Mutual Information Function method [46] and the False Nearest Neighbors method of [47]. For a uni-dimensional time series, \(N_i\) is a row vector of size \((1 \times k)\), and for a d-dimensional time series \(N_i\) is a row vector of size \((1 \times kd)\). The recurrence state matrix of the reconstructed phase-space, known as Recurrence Plot (RP), at times i and j, is defined as
\[ R_{i,j} = H\left( \varepsilon - \Vert N_i - N_j \Vert \right) , \qquad i,j = 1,\dots ,T', \tag{2} \]
where \(\varepsilon\) is a threshold distance value, \(H(\cdot )\) is the Heaviside function, and \(\Vert \cdot \Vert\) is the Euclidean distance. Due to the underlying embedding (1), \(R_{i,j}\) is defined for \(i,j = 1,\dots ,T'\), with \(T' = T-\tau (k-1)\). If two states \(N_i\) and \(N_j\) are in an \(\varepsilon\)-neighborhood, the value of \(R_{i,j}\) is equal to 1; otherwise it is 0. The value of \(\varepsilon\) highly affects the output of the RP: when \(\varepsilon\) is too small or too large, the RP cannot identify the true recurrence of states. There are different approaches for finding the best value for \(\varepsilon\) in the literature [42]. We follow the guidelines provided in [48] for selecting \(\varepsilon\). The values on the diagonal line of the RP are equal to one (i.e., \(R_{i,i} = 1\)) because in that case the two states introduced to \(H(\cdot )\) are identical. The diagonal line of the RP is called the Line Of Identity (LOI).
Recurrence quantification measures derived from RPs, such as recurrence rate (RR), percent determinism, and maximum line length in the diagonal direction [42], give insights into the dynamic behavior of time series. These measures have been used in financial research to analyze the behavior of financial data, e.g., [49,50,51]. The RP of a time series can be used as a data transformation method for time series prediction. A method employing the RP of seven financial time series to train a deep neural network for predicting the market movement is proposed by [52]. Several authors have suggested an RP forecasting approach via DL. A feature extraction method exploiting the RP for feeding a DL algorithm is proposed by [53]. RPs can be treated as images, enabling the use of computer vision techniques for the forecasting task, such as autoencoders [54] or CNNs [55].
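As an illustration of the time-delay embedding and of the recurrence matrix described above, the following is a minimal sketch in plain Python. The function names `embed` and `recurrence_plot`, the toy data, and the convention that a distance exactly equal to \(\varepsilon\) counts as a recurrence are our assumptions for illustration, not part of the original method description.

```python
import math

def embed(series, k, tau):
    """Time-delay embedding: state i concatenates observations
    i, i+tau, ..., i+(k-1)*tau (each observation is a d-vector)."""
    n_states = len(series) - tau * (k - 1)  # T' = T - tau*(k-1)
    return [[x for lag in range(k) for x in series[i + lag * tau]]
            for i in range(n_states)]

def recurrence_plot(series, k, tau, eps):
    """R[i][j] = 1 when states i and j lie within Euclidean distance eps.
    Treating distance == eps as a recurrence is an assumed convention."""
    states = embed(series, k, tau)
    return [[1 if math.dist(a, b) <= eps else 0 for b in states]
            for a in states]

# Toy one-dimensional series (d = 1); the values are illustrative only.
toy = [[0.0], [1.0], [0.1], [1.1], [0.0], [1.0]]
rp = recurrence_plot(toy, k=2, tau=1, eps=0.3)
```

With \(k=2\) and \(\tau =1\) the six observations yield \(T'=5\) states, so `rp` is a \(5\times 5\) symmetric matrix with ones on its diagonal.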
The CRP of two multidimensional time series [56] corresponds to an extension of the RP which explores the co-movement of two time series and allows the study of the nonlinear dependencies between them. Let us denote by \(N_i, \,i=1,\dots ,T\) and \(M_j, \,j=1,\dots ,S\) the phase-space states of the time series \(\mathcal {N}\) and \(\mathcal {M}\) of length T and S, respectively. The cross recurrence (CR) states of the reconstructed state-space at times i and j are
\[ \textrm{CR}_{i,j} = H\left( \varepsilon - \Vert N_i - M_j \Vert \right) , \tag{3} \]
with \(i = 1,\dots ,T' = T-\tau (k-1)\) and \(j=1,\dots ,S' = S-\tau (k-1)\). Here \(N_i\) and \(M_j\) are defined as in (1). \(\textrm{CR}_{i,j}\) defines the concept of synchronization and the way synchronization between two financial time series is measured: an \(\varepsilon\)-neighborhood of the embeddings \(N_i\), \(M_j\) at epochs i, j. We denote the full cross recurrence matrix, known as cross recurrence plot (CRP), extracted for \(\mathcal {N}\) and \(\mathcal {M}\) as \(\textrm{CRP}_{(\mathcal {N},\mathcal {M})}\), obtained through
\[ \textrm{CRP}_{(\mathcal {N},\mathcal {M})} = \left[ \textrm{CR}_{i,j} \right] _{i=1,\dots ,T';\; j=1,\dots ,S'}. \]
The CRP corresponds to a matrix of dimension \(T' \times S'\), which may not be square, as the time series \({\mathcal {N}}\) and \({\mathcal {M}}\) may have different lengths, i.e., \(T\ne S\).
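The cross recurrence construction can be sketched along the same lines. This is a minimal illustration assuming the same embedding parameters for both series; the helper names and toy values are hypothetical, and the two series deliberately have different lengths to show that the resulting matrix need not be square.

```python
import math

def embed(series, k, tau):
    """Time-delay embedding of a list of d-dimensional observations."""
    n_states = len(series) - tau * (k - 1)
    return [[x for lag in range(k) for x in series[i + lag * tau]]
            for i in range(n_states)]

def cross_recurrence_plot(series_n, series_m, k, tau, eps):
    """CR[i][j] = 1 when state N_i is within eps of state M_j."""
    states_n = embed(series_n, k, tau)
    states_m = embed(series_m, k, tau)
    return [[1 if math.dist(a, b) <= eps else 0 for b in states_m]
            for a in states_n]

# Toy series of different lengths (T = 6, S = 4); values are illustrative.
series_n = [[0.0], [0.5], [1.0], [0.5], [0.0], [0.5]]
series_m = [[0.1], [0.4], [1.0], [0.6]]
crp = cross_recurrence_plot(series_n, series_m, k=2, tau=1, eps=0.25)
```

Here `crp` has \(T'=5\) rows and \(S'=3\) columns, and each entry is a scalar equal to 0 or 1.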
For \(\mathcal {N}\), \(\mathcal {M}\) of equal length T, the \(\textrm{CRP}_{(\mathcal {N},\mathcal {M})}\) is a square \(T'\times T'\) matrix. As opposed to the (univariate) recurrence analysis of one time series (with itself, RP in equation (2)), the diagonal entries of the CRP are either 1 or 0, as the two time series may or may not be synchronized at (i, j), \(i=j, i = 1,\dots ,T'\), see, e.g., Fig. 2. In the CRP of two time series, the LOI is replaced by a distorted diagonal, called the Line Of Synchronization (LOS). The LOS reveals the relationship between the two time series in the time domain. In particular, it provides a nonparametric function containing information about the time-rescaling of the two time series, which further allows their re-synchronization [57].
As the time series we consider in our application are multidimensional (\(d>1\)), we point out that the CRP is indeed a multidimensional cross recurrence plot (MdCRP) [56] where \({\textbf{n}}_i\), \({\textbf{m}}_j\) are row vectors rather than scalars, and \(N_i\), \(M_j\) are dk-dimensional row vectors rather than k-dimensional row vectors, as opposed to the conventional CRP based on one-dimensional time series. Yet the above discussion is general and applies to both cases, and \(CR_{i,j}\) is in any case a scalar equal to either 0 or 1. For multidimensional time series, the entries of each of the two time series require normalization in each dimension before estimating the MdCRP [56], e.g., with the z-score.
Financial time series co-movement analysis using the CRP and LOS is studied in [58, 59]. The work in [60] analyses the inter-dependencies of the stock market index and its associated volatility index, further proposing a method for the LOS estimation based on a corrupted CRP. Furthermore, it is important to notice that financial time series are often represented as multivariate instances. Indeed, even the most basic source of financial data generally provides information about volumes along with prices. Furthermore, a number of closely related variables (e.g., volatility measures) are simple to extract. Despite the use of multidimensional inputs being effective and commonly found across several applications [42], existing CRP applications on market data are broadly limited to the use of one-dimensional series only (e.g., prices or volatilities) [42, 49, 61].
3 Proposed method
We exploit the CRP to quantify the co-movement of two multidimensional time series \(\mathcal {N}\) and \(\mathcal {M}\) observed continuously over a common period of length V. Our goal is to predict whether at a certain epoch (e.g., a certain calendar day), \(\mathcal {N}\) and \(\mathcal {M}\) are synchronized in the state-space embedding \(\varepsilon\)-neighborhood given by (3).
Assume two generic time series \(\mathcal {N}\) and \(\mathcal {M}\) are observed over the non-overlapping time-domains \(D_\mathcal {N}= \{ t_1,\dots ,t_T\}\) and \(D_\mathcal {M}= \{ s_1,\dots ,s_S\}\) of respective lengths T and S. Their CRP corresponds to a \(T' \times S'\) matrix, with \(T' = T-\tau (k-1)\) and \(S' = S-\tau (k-1)\), where the (i, j) element expresses the state of synchronization at the i-th time instance of the first time series (\(t_i\)) and at the j-th time instance of the second (\(s_j\)), in terms of the \(\varepsilon\)-neighborhood of the states \(N_i\) and \(M_j\), as expressed by (3). With the domains being non-overlapping, there are no epochs \(t_i\) and \(s_j\) such that \(t_i = s_j\) in (the same) calendar time, and the state of synchronization at a same calendar time cannot be determined. Indeed, for a fixed \(t_i\), the row vector \(\textrm{CRP}_{i,\cdot }\) reports for the epochs \(s_j,\, j=1,\dots ,S'\) (past or future with respect to \(t_i\)) whether the state-space embedding \(M_j\) is in the \(\varepsilon\)-neighborhood of \(N_i\).
In this light, if the domains of \(\mathcal {N}\) and \(\mathcal {M}\) only partially overlap over a region \(D:=D_{\mathcal {N}} \cap D_{\mathcal {M}}\) of length V, for our forecasting purpose, their non-overlapping regions \(D_\mathcal {N}{\setminus } D\) and \(D_\mathcal {M}{\setminus } D\) are irrelevant and can be discarded. Their V overlapping time instances \(t_{i+1},\dots ,t_{i+V}\) and \(s_{j+1},\dots , s_{j+V}\) correspond to the same calendar epochs, i.e., \(t_{i+h} = s_{j+h},\, \forall h = 1,\dots ,V\), and are of actual relevance. Over the common domain D, the CRP corresponds to a square \(V' \times V'\) matrix (\(V' = V-\tau (k-1)\)) with a well-defined diagonal expressing the state of synchronization at \(t_i = s_j\), e.g., answering whether at the (same) calendar day \(t_i = s_j\), \(\mathcal {N}\) and \(\mathcal {M}\) are synchronized or not.
This justifies the required form for the input data, corresponding to two (multidimensional) time series \(\mathcal {N}\) and \(\mathcal {M}\) observed over a common period D of equal length V, with \(D =\left\{ v_1 = \text {max}(t_1,s_1),\dots , v_V = \text {min}(t_T,s_S)\right\}\). Since the essence of time series forecasting is that of predicting the future from the past, the data from the past needs to be representative of the h-step ahead forecast. This implicitly requires D to be a continuous set of times for the given sampling frequency. That is, there should be no gaps between days or epochs, namely, \(v_V \equiv v_1 + (V-1)\). Furthermore, in order to calculate (3), we require the time series to be non-corrupted over D in all its multivariate entries, i.e., without missing values. The above requirements are generally met for the financial time series of our interest. The only constraints are that of using data for stocks traded at the same exchange (same trading days and observed festivities) and that of selecting stocks not subject to delisting in the period of interest.
Aligned with the general rationale of time series forecasting, we aim at predicting the one-step-ahead synchronization status between \(\mathcal {N}\) and \(\mathcal {M}\) at epoch \(i+1\), based on some lagged historical records available up to time i, that is, based on some suitable set of features observed or extracted over w past epochs. For \(i = w,\dots ,V'-1\), let us denote by \(\mathcal {N}^w_i\) and \(\mathcal {M}^w_i\) the sub-samples of \(\mathcal {N}\) and \(\mathcal {M}\) of the w most recent observations up to and including epoch i, that is
\[ \mathcal {N}^w_i = \{ {\textbf{n}}_{i-w+1}^\top , \dots , {\textbf{n}}_{i}^\top \}^\top , \qquad \mathcal {M}^w_i = \{ {\textbf{m}}_{i-w+1}^\top , \dots , {\textbf{m}}_{i}^\top \}^\top . \]
Let us denote by \(\textrm{CRP}_{(\mathcal {N}^w_i,\mathcal {M}^w_i)}\) the \(w'\times w'\) CRP computed from \(\mathcal {N}^w_i\) and \(\mathcal {M}^w_i\) (with embedding dimension k, lag \(\tau\) and \(w' = w-\tau (k-1)\)). At epoch i, \(\textrm{CRP}_{(\mathcal {N}^w_i,\mathcal {M}^w_i)}\) is used as the input of the neural network for predicting the state of synchronization at \(i+1\). Within this framework, there are \(V'-w\) input-target pairs. The first pair corresponds to the input \(\textrm{CRP}_{(\mathcal {N}^w_{w},\mathcal {M}^w_{w})}\) and target \((\textrm{CRP}_{(\mathcal {N},\mathcal {M})})_{w+1,w+1}\), the last to the input \(\textrm{CRP}_{(\mathcal {N}^w_{V'-1},\mathcal {M}^w_{V'-1})}\) and target \((\textrm{CRP}_{(\mathcal {N},\mathcal {M})})_{V',V'}\). The prediction target at epoch i corresponds to the state of synchronization at \(i+1\), provided by the (diagonal) entry \(\left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _{i+1,i+1}\) of the CRP computed for the entire time series \(\mathcal {N}\), \(\mathcal {M}\). In particular, the state of synchronization at any epoch \(i=1,\dots ,V'\) is provided by the diagonal of \(\textrm{CRP}_{(\mathcal {N},\mathcal {M})}\), i.e.,
\[ \text {diag}\left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _i = \left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _{i,i}, \qquad i = 1,\dots ,V', \]
so that \(\left\{ \text {diag}\left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _i \right\} _{i=w+1,\dots ,V'}\) corresponds to the targets for the inputs \(\left\{ \textrm{CRP}_{(\mathcal {N}^w_{i},\mathcal {M}^w_{i})} \right\} _{i = w,\dots ,V'-1}\). The above corresponds to a framework where inputs are created dynamically by using CRPs computed over sub-sampled time series obtained by applying sliding windows of a fixed size. The above construction is illustrated in Fig. 1. Note that \(\textrm{CRP}_{(\mathcal {N}^w_i,\mathcal {M}^w_i)}\) is not analogous to the sub-matrix P obtained from \(\textrm{CRP}_{(\mathcal {N},\mathcal {M})}\) by considering its rows and columns from \(i-w+1\) to i. In \(\textrm{CRP}_{(\mathcal {N},\mathcal {M})}\) the entire data in \(\mathcal {N}\) and \(\mathcal {M}\) enters the time series normalization and the tuning of the parameter \(\varepsilon\). \(\textrm{CRP}_{(\mathcal {N}^w_i,\mathcal {M}^w_i)}\) thus depends only on the cropped time series data \(\mathcal {N}^w_i\), \(\mathcal {M}^w_i\), while P does not. In a forecasting context, our approach is feasible and unbiased as the inputs do not use any information beyond that available at i. Note that, in general, nothing prevents one from choosing the embedding size and lag parameter differently for the CRP computation of the targets and for the CRP computations of the inputs.
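The sliding-window construction can be sketched as follows. This is a minimal sketch under stated assumptions: the names `zscore`, `crp`, and `make_dataset` are hypothetical, the z-score uses the population standard deviation, the same \(k\), \(\tau\), and \(\varepsilon\) are used for inputs and targets, and the toy series are invented for illustration. Each window is normalized on its own, whereas the targets come from the CRP of the full series.

```python
import math
from statistics import mean, pstdev

def zscore(series):
    """Normalize each dimension (column) of a list of d-vectors."""
    cols = list(zip(*series))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # guard for constant columns
    return [[(x - m) / s for x, m, s in zip(row, mus, sds)] for row in series]

def crp(series_n, series_m, k, tau, eps):
    """Cross recurrence plot of two z-scored, time-delay-embedded series."""
    def embed(s):
        return [[x for lag in range(k) for x in s[i + lag * tau]]
                for i in range(len(s) - tau * (k - 1))]
    a, b = embed(zscore(series_n)), embed(zscore(series_m))
    return [[1 if math.dist(p, q) <= eps else 0 for q in b] for p in a]

def make_dataset(series_n, series_m, w, k, tau, eps):
    """Inputs: CRPs of the w most recent observations up to epoch i.
    Targets: diagonal entry (i+1, i+1) of the full-series CRP."""
    full = crp(series_n, series_m, k, tau, eps)
    v_prime = len(full)
    inputs, targets = [], []
    for i in range(w, v_prime):            # 1-based epochs i = w, ..., V'-1
        win_n = series_n[i - w:i]          # 1-based observations i-w+1 .. i
        win_m = series_m[i - w:i]
        inputs.append(crp(win_n, win_m, k, tau, eps))
        targets.append(full[i][i])         # 0-based i is 1-based epoch i+1
    return inputs, targets

# Two invented one-dimensional series of 10 observations.
series_a = [[float(x)] for x in [1, 2, 3, 4, 5, 4, 3, 2, 1, 2]]
series_b = [[float(x)] for x in [1, 2, 2, 5, 3, 4, 3, 2, 1, 2]]
inputs, targets = make_dataset(series_a, series_b, w=3, k=1, tau=1, eps=0.5)
```

With 10 observations, \(w=3\), and \(k=\tau =1\), the sketch produces \(V'-w = 7\) input-target pairs, each input being a \(3\times 3\) binary matrix.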
A simple example with two one-dimensional time series clarifies how we extract the input features and prediction targets. Consider the two time series \(\mathcal {A}\) and \(\mathcal {B}\) of 10 observations:
Fig. 2 depicts their CRP, i.e., \(\textrm{CRP}_{(\mathcal {A},\mathcal {B})}\) (for simplicity computed with \(k=\tau =1\), and \(V=V'=10\)). The diagonal line of the CRP is highlighted and includes the values of the recurrence states. The diagonal line shows that the behavior of \(\mathcal {A}\) and \(\mathcal {B}\) at timestamps between 1 and 7 to 10 is synchronized, therefore at these timestamps the prediction targets are set to 1 (the actual value of the Heaviside function in (3)). By, for instance, setting \(w = 3\), we aim at predicting \(V-w = 7\) states of synchronization. The first prediction concerns the synchronization at epoch \(w+1=4\), based on \(\textrm{CRP}_{(\mathcal {A}^3_3,\mathcal {B}^3_3)}\), that is, the CRP calculated from the first three observations of \(\mathcal {A}\) and \(\mathcal {B}\). The prediction of the synchronization at epoch 5 is based on the CRP calculated on observations 2 to 4, i.e., on \(\textrm{CRP}_{(\mathcal {A}^3_4,\mathcal {B}^3_4)}\). The procedure is repeated up to epoch \(V-1=9\), where \(\textrm{CRP}_{(\mathcal {A}^3_{9},\mathcal {B}^3_{9})}\), calculated from the observations 7 to 9, is used for predicting \(\text {diag}(\textrm{CRP}_{(\mathcal {A},\mathcal {B})})_{10}\).
Appendix B includes a visualization that displays the results of recurrence analysis on real-world stock data. This visualization provides empirical evidence of the complexity of the patterns underneath the RP and CRP.
To practically implement the underlying DL model that maps each \(\textrm{CRP}_{(\mathcal {N}^w_{i},\mathcal {M}^w_{i})}\) input to its corresponding \(\text {diag}\left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _{i+1}\) output, consider that each input consists of a matrix of zeros and ones that can be considered analogous to an image. Therefore we can rely on well-established classification models. In particular, we employ a Convolutional Neural Network (CNN). Such a neural network is well-suited for capturing the spatial relationships between the features in their input, which in our case correspond to the 0-1 features encoded in the entries of \(\textrm{CRP}_{(\mathcal {N}^w_{i},\mathcal {M}^w_{i})}\). Note that in the CRP calculation \(\mathcal {N}^w_{i}\) and \(\mathcal {M}^w_{i}\) are z-score normalized before computing (3).
We use a CNN architecture formed by two convolutional layers and one fully connected layer, as illustrated in Fig. 3. The neural network involves the typical blocks of the CNN architecture. The convolutional layers adaptively learn the spatial relationships of inputs, the Rectified Linear Unit (ReLU) activation introduces nonlinearity to the model, and the max-pooling layer provides down-sampling operations reducing the size of the feature map by extracting the maximum value in each patch from the input feature map. The current CNN is chosen based on a grid search over different network architectures, layers’ types, and sizes, aimed at maximizing the F1-score and showing the feasibility of our CRP-based DL approach. Importantly, the temporal connections in the input data are handled within the CNN. In fact, the construction of the input data as outlined above relates all the entries of \(\textrm{CRP}_{(\mathcal {N}^w_{i},\mathcal {M}^w_{i})}\) to the corresponding target, and these entries furthermore constitute aggregate values capturing the similarity of the input processes at different time lags within the window of size w. The output of the stacked convolutional layers is introduced to a fully connected layer, leading to the network’s output by applying a softmax function.
4 Experiments
4.1 Data
Our analyses rely on daily adjusted closing prices and daily number of traded shares (volumes) for 12 representative constituents of the S&P100 index in the period from December 31st, 2014 to November 29th, 2021 (\(V= 1,741\) trading days). The data is retrieved from Yahoo Finance. These 12 stocks are selected based on their market capitalization and their market sector. For each sector, we select the first two stocks of highest-but-comparable capitalization, a practice well-supported by financial theory [62]. Market sectors provide a natural grouping for securities: analyses conducted at a sector level are a common practice for granting comparability and robustness of the results, as across market sectors the dynamics of economic variables are well-known to be asymmetric. Table 1 lists our stock selection. Each stock is expressed as a trivariate time series consisting of daily prices, volumes, and returns. CRPs express temporal similarities in joint terms of the price level, traded volume, and daily return, providing a generalized definition of similarity in time series dynamics at a multivariate level.
For our bivariate analysis on two time series, we have \((12^2-12)/2 = 66\) pairs of stocks. For each stock pair, we use the first \(70\%\) of the data for training (\(V_\text {train} = 1,218\) days) and the last \(30\%\) for testing (\(V_\text {test} = 523\) days). As the future input instances should not affect the training process, the order of the input data during the training is fixed. The input and targets of the train data and the test data are, respectively,
\[ \left\{ \textrm{CRP}_{(\mathcal {N}^w_{i},\mathcal {M}^w_{i})} \right\} _{i \in I}, \qquad \left\{ \text {diag}\left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _i \right\} _{i \in T}, \]
where \(I = w,\dots ,V_\text {train}\) and \(T =w+1,\dots , V_\text {train}+1\) for the training set and \(I = V_\text {train}+1,\dots ,V-1\) and \(T =V_\text {train}+2,\dots , V\) for the test set. We train the neural network once over the pooled data of all the stock pairs. This pooled approach is a common practice in closely related Machine Learning literature, e.g., [21], and is supported by the empirical findings of [63], suggesting the existence of a universal price formation mechanism (model), and thus price dynamics, not specific to individual assets. In practice, the input and output data are the concatenation of the individual pairs’ inputs-targets. For example, for a set window size w, for the train set the input-target data consists of \((V'_\text {train}-w) \times 66\) examples, that is \((V'_\text {train}-w) \times 66\) pairs of cross recurrence matrices and (scalar) targets, where \(V'_\text {train} = (V_\text {train}-\tau (k-1))\). In the training phase, the training data is used to estimate the optimal weights of the CNN. The test data is then passed to the estimated CNN and the quality of the network outputs is evaluated against the actual targets. Details are provided in the following two subsections.
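The pair count and the chronological split stated above can be checked with a few lines of Python. This is only an arithmetic sketch; the ticker labels are hypothetical placeholders, and `split_epochs` is an illustrative helper that reproduces the index sets \(I\) and \(T\) given in the text.

```python
from itertools import combinations

# Hypothetical placeholder labels for the 12 selected stocks.
tickers = [f"STOCK_{i:02d}" for i in range(1, 13)]
pairs = list(combinations(tickers, 2))   # unordered pairs: (12^2 - 12) / 2

V = 1741                                 # trading days in the sample
V_train = int(0.70 * V)                  # first 70% of the days
V_test = V - V_train                     # remaining days

def split_epochs(w, v, v_train):
    """1-based epoch index sets for inputs (I) and targets (T)."""
    train_inputs = range(w, v_train + 1)         # I = w, ..., V_train
    train_targets = range(w + 1, v_train + 2)    # T = w+1, ..., V_train+1
    test_inputs = range(v_train + 1, v)          # I = V_train+1, ..., V-1
    test_targets = range(v_train + 2, v + 1)     # T = V_train+2, ..., V
    return train_inputs, train_targets, test_inputs, test_targets

train_I, train_T, test_I, test_T = split_epochs(w=10, v=V, v_train=V_train)
```

Every target epoch is exactly one step ahead of its input epoch, and all training inputs precede all test inputs, so the order of the data is preserved.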
For the training of the CNN we adopt the ADAM optimizer with the following hyperparameters: learning rate 0.01 (reduced by a factor of 5 every 40 epochs), momentum parameters 0.9 and 0.999, batch size 128, and 300 training epochs. Across the epochs, we keep track of the F1-score on the validation set, which is set to the last 15% portion of the training set. For our classification task, we adopt the binary cross-entropy loss. As the target classes are unbalanced, the loss is weighted for the targets’ class proportion. Details on the filter sizes, kernel sizes and the max pooling size are provided in Fig. 3.
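To make the learning-rate schedule and the class weighting concrete, a plain-Python sketch is given below. Both functions are illustrative assumptions: we read "reduced by a factor of 5 every 40 epochs" as a step decay applied at epochs 40, 80, and so on (counting from zero), and we use inverse-frequency class weights as one possible way of weighting the loss by class proportion; the exact weighting used in the experiments is not specified in the text.

```python
import math

def learning_rate(epoch, base_lr=0.01, factor=5.0, step=40):
    """Step decay: divide the rate by `factor` every `step` epochs
    (assumed reading of the schedule; epochs counted from 0)."""
    return base_lr / (factor ** (epoch // step))

def weighted_bce(probs, targets, eps=1e-12):
    """Binary cross-entropy with inverse-frequency class weights
    (an assumed weighting scheme, chosen for illustration)."""
    n = len(targets)
    n_pos = sum(targets)
    n_neg = n - n_pos
    w_pos = n / (2 * n_pos) if n_pos else 0.0
    w_neg = n / (2 * n_neg) if n_neg else 0.0
    total = 0.0
    for p, y in zip(probs, targets):
        p = min(max(p, eps), 1 - eps)      # avoid log(0)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1 - p)
    return total / n

# With three synchronized days and one non-synchronized day, an error on
# the minority class is penalized more than an error on the majority class.
minority_wrong = weighted_bce([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
majority_wrong = weighted_bce([0.1, 0.9, 0.9, 0.1], [1, 1, 1, 0])
```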
With respect to the CRP computations, throughout our analyses the embedding dimension k is set to 2 or 3 (estimated via FNN method) based on input type, and the delay parameter \(\tau\) is set to 1. Values 0.45, 0.55, 0.65, and 0.75 are used for the threshold \(\varepsilon\). These hyperparameters are selected according to the guidelines and discussion in [48] and [56]. The same values are applied for both the computation for the CRP related to the targets and the CRPs related to the inputs.
In our experiments, we consider five different choices for the window-size hyperparameter, namely \(w \in \{10,30, 50, 60, 80\}\) days. With the above settings, \(V=V'=1,741\) days, \(V_\text {train} = V'_\text {train} =1,218\), and \(V_\text {test} = V'_\text {test} = 523\) days. For \(i = w,\dots ,V-1\), \(\textrm{CRP}_{(\mathcal {N}^w_i,\mathcal {M}^w_i)}\) are square matrices of size \(w'=w\) and \(\textrm{CRP}_{(\mathcal {N},\mathcal {M})}\) is a square matrix of size V whose diagonal contains the relevant targets, i.e., \(\text {diag}\left( \textrm{CRP}_{(\mathcal {N},\mathcal {M})}\right) _i\), \(i = w+1,\dots ,V\).
4.2 Experimental results
Stock pairs from the same sector or from two different sectors, with different co-movement behaviors, provide comprehensive experimental data to show the ability of the proposed method to predict the state of synchronization. To evaluate the performance of our proposed method, all pairs of stocks are used: for each pair we follow the proposed steps (see Fig. 1) to create the inputs and targets, and we stack the pair-specific input-target data to create a single train set and a single test set for all pairs.
Tables 2 and 3 show the performance of our proposed approach for all pairs of stocks using two types of input: (price, volume) and (price, return, volume), respectively. Given that the target classes are generally imbalanced, the preferred reference performance metric is the F1-score. Yet, we also include accuracy, precision, and recall to have a clearer overview of the classification performance. For robustness, we run our experiments over a range of values for the window-size w and threshold \(\varepsilon\) hyperparameters, a setup that further clarifies the effect of these hyperparameters on prediction performance. Additional results for all the 66 pairs of stocks are provided in Appendix A.
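For reference, the four metrics can be computed from the confusion counts as in the sketch below. It assumes that the synchronized state (label 1) is treated as the positive class and that no averaging across classes is applied; whether the reported figures use a class-averaged variant is not stated here. The labels in the example are invented.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, and F1 with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Invented labels: four synchronized and two non-synchronized days.
metrics = classification_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 1, 0])
```

Because the F1-score is the harmonic mean of precision and recall, a classifier that always predicts the majority class cannot score well on the minority class, which is why it is preferred to accuracy under class imbalance.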
Results for the (price, volume) time series input are provided in Table 2, results for the (price, return, volume) input in Table 3. In general, our results show that the task of predicting the state of synchronization is not only feasible but, under our setup, quite satisfactory. Indeed, our preferred performance metric, the F1-score, is as high as 84%. As expected, the results appear to be sensitive to the choice of the window size and threshold parameter. In particular, the performance metrics decrease as the threshold parameter and the window size increase. This means that stricter \(\varepsilon\)-neighborhoods are easier to predict and that the relevant information for the prediction of the synchronization state is found in the most recent instances of the CRP. This suggests the existence of patterns in the data that are strongly indicative of close \(\varepsilon\)-neighborhoods: the CNN detects clear patterns indicating that the day-ahead synchronization is likely to be very strong (the \(\varepsilon\)-neighborhood is tight), whereas the performance deteriorates as \(\varepsilon\) increases and the notion of synchronization becomes looser. Regarding the window size, long-lagged CRP information appears to introduce noise in the system without providing any predictive gains, aligned with the intuition that further-in-time information is less and less related to the current state of the system and of little use for prediction.
Suspecting that the joint use of prices and returns might be redundant since they are closely related to each other, we also run a second experiment involving volumes and returns only. It is interesting to note that the inclusion of the returns does not seem to provide any advantage with respect to the (price, volume) input time series, but rather the opposite effect. It is to be expected that including further input variables complicates the patterns in the CRP chessboard so that, under the same network architecture, the performance metrics decrease. Furthermore, and in line with the above, in additional experiments not reported here, we included squared returns (as a rough measure of daily volatility), finding that they also appear detrimental to the performance metrics and the prediction task. This perhaps suggests that the network architecture needs to scale up with the dimensionality of the input data, which plausibly induces more complex patterns in the CRP.
An alternative way of assessing the prediction method is to examine its performance on periods of high and low volatility for stock pairs. To do this, for each stock, we construct estimates of daily volatility by applying the Exponentially Weighted Moving Average (EWMA) filter [64]:
\(\sigma_t^2 = \lambda \sigma_{t-1}^2 + (1-\lambda)\, r_{t-1}^2,\)
where \(\sigma\) is the volatility, r is the return, t is an index denoting the day, and \(\lambda\) is a decay factor that determines the weight given to older returns when calculating the volatility. We set \(\lambda = 0.94\), a value commonly used for daily return series [64].
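As an illustration, the EWMA filter can be sketched in a few lines of plain Python. The sketch assumes the usual RiskMetrics-style recursion, in which the variance at day t is \(\lambda\) times the variance at day t-1 plus \((1-\lambda)\) times the squared return at day t-1. The function name `ewma_volatility` and the initialization of the recursion with the first squared return are assumptions made here, since the seed value is not specified in the text.

```python
def ewma_volatility(returns, lam=0.94):
    """EWMA volatility estimates, one per entry of `returns`.

    Assumed recursion: var_t = lam * var_{t-1} + (1 - lam) * r_{t-1}**2.
    Assumption: the recursion is seeded with the first squared return.
    """
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    if not returns:
        return []

    var = returns[0] ** 2          # seed value (illustrative choice)
    vols = [var ** 0.5]
    # Each new estimate uses only the previous day's variance and return.
    for r_prev in returns[:-1]:
        var = lam * var + (1.0 - lam) * r_prev ** 2
        vols.append(var ** 0.5)
    return vols


# Toy daily returns (made up for illustration).
sigma = ewma_volatility([0.01, -0.02, 0.015])
```

With \(\lambda = 0.94\), a squared return observed k days ago enters the estimate with weight proportional to \(0.94^{k}\), so the filter reacts to volatility bursts while smoothing single-day noise.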
For every pair, we take, for each of the two stocks, the days of highest 30% and lowest 30% volatility, and intersect them across the two stocks. From this intersection, we identify the dates corresponding to two volatility regimes (high volatility and low volatility). For every pair, on average, 14% of the dates in the test set correspond to high-volatility days and 12% to low-volatility days. In this way, we derive high-volatility and low-volatility subsets from the test set.
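The construction of the two regimes can be sketched as follows. This is a minimal illustration, not our exact procedure: the function name `volatility_regimes`, the representation of each stock's volatility as a dictionary keyed by date, and the use of a simple rank-based cut (rather than an interpolated quantile) are all assumptions.

```python
def volatility_regimes(vol_a, vol_b, frac=0.30):
    """Return (high_days, low_days) shared by two stocks.

    vol_a, vol_b: dicts mapping a date label to that day's volatility.
    Assumption: the top/bottom `frac` of days is selected by rank.
    """
    def tails(vol):
        ordered = sorted(vol, key=vol.get)   # dates, lowest volatility first
        k = int(len(ordered) * frac)         # number of days in each tail
        low = set(ordered[:k])
        high = set(ordered[-k:]) if k else set()
        return high, low

    high_a, low_a = tails(vol_a)
    high_b, low_b = tails(vol_b)
    # A date belongs to a regime only if both stocks agree on it.
    return sorted(high_a & high_b), sorted(low_a & low_b)


# Toy volatilities for ten days labelled 0..9 (made up for illustration).
vol_a = {d: float(d) for d in range(10)}
vol_b = dict(enumerate([0.2, 0.1, 0.9, 0.5, 0.4, 0.6, 0.3, 1.0, 0.8, 0.7]))
high_days, low_days = volatility_regimes(vol_a, vol_b)
```

Because a date must fall in the same tail for both stocks, each regime necessarily contains fewer than 30% of the dates, which is consistent with the 14% and 12% shares reported above.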
Table 4 displays the out-of-sample performance of the best model (\(w=10,\, \varepsilon =0.45\), price-volume data) for the above two subsets. The results show that the prediction of the synchronization state is very satisfactory on both the high-volatility and low-volatility test sets, with higher accuracy observed for the low-volatility set. Notably, the high-volatility test set has a higher percentage of non-synchronized time instances (12% in Class 0) than the low-volatility test set (5% in Class 0), which can be interpreted as evidence of more frequent non-synchronization, or disentangled dynamics, on high-volatility days than on low-volatility ones. Interestingly, these percentages indicate that even on high-volatility days, stocks are synchronized most of the time (88%), though not as often as on low-volatility days (where the fraction of non-synchronized days is 7 percentage points smaller). The relatively low number of non-synchronized days in the high-volatility regime also suggests that stocks are similarly exposed to the market risk factors causing volatility and that their response to volatility outbursts is similar. Indeed, on 88% of high-volatility test days, the two series are detected in the same price-volume embedding or, in simpler terms, share similar dynamics.
5 Conclusion
Predicting the co-movement of two multidimensional time series is a relevant task for the financial industry that supports potential trading strategies based on their interrelationships. This paper contributes to the literature by providing (i) a method relying upon the CRP to quantify the time series coupling over time, (ii) a DL model for predicting the time series synchronization state, (iii) the use of a multidimensional time series representation of the inputs involving prices, volumes, and returns. We conducted extensive analyses on real stock market data from different sectors: our results show that the proposed setup can effectively predict the one-day-ahead synchronization of two time series. An interesting future research direction would be to investigate the applicability of such an approach to a high-frequency domain where the high-dimensional nature of the raw data may provide valuable information for analyzing and predicting the coupling in settings that are known to be characterized by high levels of noise.
Data Availability Statement
The data supporting the findings of this study are freely available for download from Yahoo Finance at finance.yahoo.com/quote/XXX/history, where XXX corresponds to the ticker of each of the stocks listed in Table 1, by querying the period from December 31st, 2014 to November 29th, 2021. The data consist of the daily closing prices and volumes on which the analyses are based.
Notes
Throughout the paper, with uni- or multi-variate we refer to the nature of the analyses (RP as opposed to CRP), and with one- or multi-dimensional we refer to the nature of the time series. That is, the RP (as presented in equation (2)) provides a univariate analysis of a single one-dimensional time series, while the CRP (as presented in equation (3)) provides a multivariate analysis of two one-dimensional or multidimensional time series.
For a binary classification task involving N samples, let TP, TN, FP, and FN denote the number of true positives, true negatives, false positives, and false negatives, respectively. The performance measures are defined as follows: \(\text{Accuracy} = (\text{TP}+\text{TN})/N\), \(\text{Precision} = \text{TP}/(\text{TP}+\text{FP})\), \(\text{Recall} = \text{TP}/(\text{TP}+\text{FN})\), \(\text{F1} = 2 (\text{Precision} \times \text{Recall})/(\text{Precision} + \text{Recall})\).
References
Murphy JJ (1999) Technical analysis of the financial markets: a comprehensive guide to trading methods and applications. Penguin, London
Lütkepohl H (1999) Vector autoregressions. Unpublished manuscript, Institut für Statistik und Ökonometrie, Humboldt-Universität zu Berlin
Lütkepohl H (2013) Introduction to multiple time series analysis. Springer, New York
Reinsel GC (1993) 2. Vector ARMA time series models and forecasting. Springer, New York, NY, pp 21–51
Juselius K (2006) The cointegrated VAR model: methodology and applications. Oxford University Press, Oxford
Bauwens L, Laurent S, Rombouts JV (2006) Multivariate garch models: a survey. J Appl Economet 21(1):79–109
Harvey A, Ruiz E, Shephard N (1994) Multivariate stochastic variance models. Rev Econ Stud 61(2):247–264
Chiriac R, Voev V (2011) Modelling and forecasting multivariate realized volatility. J Appl Economet 26(6):922–947
Tong H (1978) On a threshold model in pattern recognition and signal processing. Sijthoff & Noordhoff, Amsterdam
van Dijk D, Teräsvirta T, Franses PH (2002) Smooth transition autoregressive models-a survey of recent developments. Economet Rev 21(1):1–47
Tsay RS (2005) Analysis of financial time series. John Wiley & Sons, Hoboken, New Jersey, United States
Chen R, Tsay RS (1993) Functional-coefficient autoregressive models. J Am Stat Assoc 88(421):298–308
Chen R, Tsay RS (1993) Nonlinear additive arx models. J Am Stat Assoc 88(423):955–967
Cont R, Stoikov S, Talreja R (2010) A stochastic model for order book dynamics. Oper Res 58(3):549–563
Huang H, Kercheval AN (2012) A generalized birth-death stochastic model for high-frequency order book dynamics. Quant Financ 12(4):547–557
Hawkes AG (2018) Hawkes processes and their applications to finance: a review. Quant Financ 18(2):193–198
Heaton JB, Polson NG, Witte JH (2017) Deep learning for finance: deep portfolios. Appl Stoch Models Bus Ind 33(1):3–12
Dixon MF, Halperin I, Bilokon P (2020) Machine learning in finance, vol 1406. Springer, Switzerland
Sezer OB, Ozbayoglu AM, Dogdu E (2017) An artificial neural network-based stock trading system using technical analysis and big data framework. In: Southeast Conference, pp. 223–226
Zhang Z, Zohren S, Roberts S (2019) Deeplob: deep convolutional neural networks for limit order books. IEEE Trans Signal Process 67:3001–3012
Tran DT, Iosifidis A, Kanniainen J, Gabbouj M (2019) Temporal attention-augmented bilinear network for financial time-series data analysis. IEEE Trans Neural Netw Learn Syst 30(5):1407–1418
Passalis N, Tefas A, Kanniainen J, Gabbouj M, Iosifidis A (2020) Temporal bag-of-features learning for predicting mid price movements using high frequency limit order book data. IEEE Trans Emerg Topics Comput Intell 4(6):774–785
Shabani M, Tran DT, Kanniainen J, Iosifidis A (2023) Augmented bilinear network for incremental multi-stock time-series classification. Pattern Recogn 141:109604. https://doi.org/10.1016/j.patcog.2023.109604
Haselbeck F, Killinger J, Menrad K, Hannus T, Grimm DG (2022) Machine learning outperforms classical forecasting on horticultural sales predictions. Machine Learn Appl 7:100239
Khare K, Darekar O, Gupta P, Attar VZ (2017) Short term stock price prediction using deep learning. In: IEEE International Conference on Recent Trends in Electronics, Information & Communication Technology, pp. 482–486
Fons E, Dawson P, Zeng X-j, Keane J, Iosifidis A (2021) Augmenting transferred representations for stock classification. In: IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3915–3919
Bhandari HN, Rimal B, Pokhrel NR, Rimal R, Dahal KR, Khatri RKC (2022) Predicting stock market index using lstm. Machine Learn Appl 9:100320
Basher SA, Sadorsky P (2022) Forecasting bitcoin price direction with random forests: How important are interest rates, inflation, and market volatility? Machine Learn Appl 9:100355
Xu X, Zhang Y (2021) Network analysis of corn cash price comovements. Machine Learn Appl 6:100140
Shabani M, Iosifidis A (2020) Low-rank temporal attention-augmented bilinear network for financial time-series forecasting. In: 2020 IEEE Symposium Series on Computational Intelligence (SSCI), pp. 2156–2161. IEEE
Shabani M, Tran DT, Magris M, Kanniainen J, Iosifidis A (2022) Multi-head temporal attention-augmented bilinear network for financial time series prediction. In: 2022 30th European Signal Processing Conference (EUSIPCO), IEEE, Belgrade, Serbia, pp 1487–1491. https://doi.org/10.23919/EUSIPCO55093.2022.9909957
Kyoung-Sook M, Hongjoong K (2019) Performance of deep learning in prediction of stock market volatility. Econ Comput Econ Cybernet Stud Res 53(2):77–92
Liu Y (2019) Novel volatility forecasting using deep learning-long short term memory recurrent neural networks. Expert Syst Appl 132:99–109
Christensen K, Siggaard M, Veliyev B (2021) A machine learning approach to volatility forecasting. J Financ Econ. https://doi.org/10.1093/jjfinec/nbac020
Guo X, Lai TL, Shek H, Wong SP-S (2017) Quantitative Trading: Algorithms, Analytics, Data, Models, Optimization. Chapman and Hall/CRC, Boca Raton, Florida, United States
Ma F, Wei Y, Huang D (2013) Multifractal detrended cross-correlation analysis between the chinese stock market and surrounding stock markets. Phys A Statist Mech Appl 392(7):1659–1670
Bonanno G, Lillo F, Mantegna RN (2001) High-frequency cross-correlation in a set of stocks. Quant Financ 1(1):96–104
Ramchand L, Susmel R (1998) Volatility and cross correlation across major stock markets. J Empir Financ 5(4):397–416
Webber CL Jr, Zbilut JP (1994) Dynamical assessment of physiological systems and states using recurrence plot strategies. J Appl Physiol 76(2):965–973
Marwan N, Kurths J (2002) Nonlinear analysis of bivariate data with cross recurrence plots. Phys Lett A 302(5):299–307
Lancia L, Fuchs S, Tiede M (2014) Application of concepts from cross-recurrence analysis in speech production: an overview and comparison with other nonlinear methods. J Speech Lang Hear Res 57(3):718–733
Webber CL, Marwan N (eds) (2015) Recurrence quantification analysis. Springer, Switzerland
Takens F (1981) Detecting strange attractors in turbulence. In: Rand D, Young L-S (eds) Dynamical Systems and Turbulence, Warwick 1980. Springer, Berlin, Heidelberg, pp 366–381
Eckmann J-P, Kamphorst SO, Ruelle D et al (1995) Recurrence plots of dynamical systems. World Sci Ser Nonlinear Sci Ser A 16:441–446
Marwan N (1999) Investigation of climate variability in NW Argentina using quantitative analysis of recurrence plots. Norbert Marwan
Fraser AM, Swinney HL (1986) Independent coordinates for strange attractors from mutual information. Phys Rev A 33(2):1134
Kennel MB, Brown R, Abarbanel HD (1992) Determining embedding dimension for phase-space reconstruction using a geometrical construction. Phys Rev A 45(6):3403
Schinkel S, Dimigen O, Marwan N (2008) Selection of recurrence threshold for signal detection. Eur Phys J Special Topics 164(1):45–53
Bastos JA, Caiado J (2011) Recurrence quantification analysis of global stock markets. Phys A Stat Mech Appl 390(7):1315–1325
Yin Y, Shang P (2016) Multiscale recurrence plot and recurrence quantification analysis for financial time series. Nonlinear Dyn 85(4):2309–2352
Zbilut JP (2005) Use of recurrence quantification analysis in economic time series. In: Salzano M, Kirman A (eds) Economics: Complex Windows. Springer, Milano, pp 91–104
Hailesilassie T (2019) Financial Market Prediction Using Recurrence Plot and Convolutional Neural Network. Unpublished manuscript
Li X, Kang Y, Li F (2020) Forecasting with time series imaging. Expert Syst Appl 160:113680
Sood S, Zeng Z, Cohen N, Balch T, Veloso M (2022) Visual time series forecasting: An image-driven approach. In: Proceedings of the Second ACM International Conference on AI in Finance, pp. 1–9
Han D, Orlando G, Fedotov S (2021) Identification of the nature of dynamical systems with recurrence plots and convolution neural networks: A preliminary test. arXiv preprint arXiv:2111.00866
Wallot S (2019) Multidimensional cross-recurrence quantification analysis (mdcrqa)-a method for quantifying correlation between multivariate time-series. Multivar Behav Res 54(2):173–191
Marwan N, Thiel M, Nowaczyk NR (2002) Cross recurrence plot based synchronization of time series. Nonlinear Process Geophys 9(3/4):325–331
Guhathakurta K, Marwan N, Bhattacharya B, Chowdhury AR (2014) Understanding the interrelationship between commodity and stock indices daily movement using ace and recurrence analysis. In: Marwan N, Riley M, Giuliani A, Webber CL Jr (eds) Translational Recurrences. Springer, Cham, pp 211–230
He Q, Huang J (2020) A method for analyzing correlation between multiscale and multivariate systems-multiscale multidimensional cross recurrence quantification (mmdcrqa). Chaos, Solitons & Fractals 139:110066
Tzagkarakis G, Dionysopoulos T (2016) Restoring corrupted cross-recurrence plots using matrix completion: Application on the time-synchronization between market and volatility indexes. In: Webber CL Jr, Ioana C, Marwan N (eds) Recurrence Plots and Their Quantifications: Expanding Horizons. Springer, Cham, pp 241–263
Addo PM, Billio M, Guegan D (2013) Nonlinear dynamics and recurrence plots for detecting financial crisis. North Am J Econ Financ 26:416–435
Fama EF, French KR (1993) Common risk factors in the returns on stocks and bonds. J Financ Econ 33(1):3–56
Sirignano J, Cont R (2019) Universal features of price formation in financial markets: perspectives from deep learning. Quant Financ 19(9):1449–1459
Riskmetrics T (1996) JP Morgan Technical Document. JP Morgan, New York
Marwan N, Romano MC, Thiel M, Kurths J (2007) Recurrence plots for the analysis of complex systems. Phys Rep 438(5–6):237–329
Funding
Open access funding provided by Royal Danish Library, Aarhus University Library. The research received funding from the Independent Research Fund Denmark project DISPA (project No. 9041-00004), and the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie project BNNmetrics (grant agreement No. 890690).
Ethics declarations
Conflict of Interests
The authors have no relevant financial or non-financial interests to disclose that are relevant to the content of this article.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Shabani, M., Magris, M., Tzagkarakis, G. et al. Predicting the state of synchronization of financial time series using cross recurrence plots. Neural Comput & Applic 35, 18519–18531 (2023). https://doi.org/10.1007/s00521-023-08674-y