1 Introduction

Fixed-income markets are an important financing source for governments, domestic and international organizations, banks, and both private and public companies. The development of fixed-income trading can contribute to financial stability in general and enhance financial intermediation by increasing competition and developing the associated financial infrastructure, products, and services (Nunes et al., 2019; International Monetary Fund, 2021). Government bonds are the main instrument of most fixed-income asset markets, for developed and developing economies alike. The main data sources for public securities trading activity are GovPX in the United States and MTS in Europe. MTS is a fully electronic, quote-driven interbank market comprising multiple trading platforms; all MTS platforms use identical trading technology, but each platform maintains its own rule set and market participants (Biais & Green, 2019; Friewald & Nagler, 2019). Fixed-income securities are commonly traded over-the-counter (OTC), on inter-dealer wholesale platforms and, less frequently, on retail platforms where liquidity is provided by dealers. Transactions are bilateral and not anonymous, so the conditions of negotiation are determined by search and trading frictions: in the absence of a focal point, dealers have to proactively seek out and negotiate with possible counterparties to obtain the "best" offer (Darbha & Dufour, 2013; Glode & Opp, 2020; Neklyudov, 2019).

Fixed-income trading has generally received less attention from researchers than equity market trading, even though fixed-income markets involve significantly more capital raising than equity markets. Electronic bond trading, however, is on the rise, and it will be interesting to evaluate the impact of electronically supported trading on the performance of the fixed-income market as its contribution expands. In addition, some liquidity providers for corporate bonds already respond to requests for trades under specific trade size limits using algorithms instead of human participants (Bessembinder et al., 2020).

Technological advances have transformed how investors can operate in financial markets. High-Frequency Trading (HFT), algorithmic trading (AT) distinguished by high-speed trade execution, exemplifies these changes in technology (Frino et al., 2020). HFT is an approach to financial market intervention that uses complex software tools, guided by mathematical algorithms, to execute high-frequency trades in the markets for stocks, options, bonds, derivative instruments, commodities, and other assets (Rundo, 2019). Hendershott et al. (2011) find that AT reduces trading costs and increases quote informativeness; liquidity providers' revenues also increase with AT, although this effect seems to be transitory. In short, financial trading demands that an AT system scan the environment for suitable and prompt decisions in the absence of monitored data (Aloud & Alkhamees, 2021).

In academic research, the term "high-frequency trading" is often used loosely, referring for example to asset price windows of 10 or 60 min (Christiansen & Ranaldo, 2007). For instance, a study published in the Journal of Financial Markets defines high-frequency trading as "trading that takes place in intervals of a few seconds to a few minutes" (Aldrich, 2013). Similarly, a paper published in the Journal of Financial Economics found that HFT can improve liquidity provision in corporate bond markets, particularly for less liquid bonds (Mahanti et al., 2008).

In practice, however, the term "high-frequency trading" is often used more narrowly to refer to even shorter time windows. For example, in the US equities market, the Securities and Exchange Commission defines a high-frequency trader as someone who trades at least 2 million shares or $20 million in securities in a single day, with an average holding time of less than 0.5 s (SEC, 2014).

Despite the varying definitions of high-frequency trading used in academic research and in practice, there is general agreement that this type of trading has significant implications for bond markets. Some studies have suggested that HFT can increase market efficiency and liquidity, while others have argued that it can exacerbate market volatility and lead to market instability (e.g., Frino et al., 2013; Schestag et al., 2016). As HFT continues to evolve and shape financial markets, it is likely that academic researchers and practitioners will continue to debate its effects and implications.

AT, whether driven by fundamental and technical indicator analysis or supported by machine learning techniques, has been examined by several researchers. According to Goldblum et al. (2021), Machine Learning (ML) plays an important and growing role in financial business applications. Deep learning (DL), a subclass of ML methods based on deep neural networks, provides algorithms that can be trained on complex data to predict outputs. Today numerous financial firms, ranging from hedge funds, investment banks, and retail brokers to modern FinTech providers, are investing in developing expertise in data science and ML (Goodell et al., 2021).

As market turmoil and uncertainty in financial markets have increased considerably, ML algorithms are well suited to the analysis of financial markets and, in particular, the fixed-income market. The marketplace is highly complex, and its unpredictability stems from the uncertainty of the many events that occur in it (Goldblum et al., 2021). Deep neural networks draw knowledge from data that can then be utilised to forecast and produce further data; this feedback decreases unreliability by pointing to specific solutions. ML is especially useful for handling problems where an explicit analytical solution is not available, such as complicated categorisation tasks or recognition of trends (Ghoddusi et al., 2019). The benefit of deep neural network methods over those offered by classical statistics and econometrics is that ML algorithms can handle huge quantities of structured and unstructured information and provide quick predictions or conclusions (Milana & Ashta, 2021).

Publications applying ML techniques specifically to fixed-income markets are scarce, whereas other financial areas, particularly the equity and foreign exchange markets, have attracted much more interest in the research literature (Nunes et al., 2018). Most of these studies involve the stock market, mainly forecasting with artificial neural network (ANN), support vector machine (SVM), and random forest (RF) models, which have been shown to produce excellent results for financial time series forecasting (Deng et al., 2021, 2022). For example, Kara et al. (2011) proposed an ANN-based model for predicting daily price movements in the stock market that yielded high forecast accuracy. Akyildirim et al. (2022) compare the trading behaviour of several advanced forecasting techniques, including ANN, the autoregressive integrated moving average (ARIMA), nearest neighbours, the naïve Bayes method, and logistic regression, to forecast stock price movements from past prices. They apply these methods to high-frequency data on 27 blue-chip stocks traded on the Istanbul Stock Exchange. Their results highlight that, among the chosen methodologies, naïve Bayes, nearest neighbours, and ANN can detect future price directions as well as percentage changes at a satisfactory level, while ARIMA and logistic regression perform worse than the random walk model. These authors also propose, as a future line of research, testing the chosen methods in other markets to obtain more accurate and more general results.

Some authors have made predictions about the performance of fixed-income assets through neural networks. Vukovic et al. (2020) analyze a neural network model that forecasts the Sharpe ratio. Their results demonstrate that neural networks are accurate in predicting nonlinear series, with 82% precision in the test cases when forecasting future Sharpe ratio dynamics and the position of the investor's portfolio. For future research they propose analyzing more data with stronger artificial intelligence technologies, such as Long Short-Term Memory (LSTM) neural networks, concluding that these adaptive methodologies should provide more accurate analysis and forecasting and that this area of study requires additional attention and effort. Li et al. (2021) analyze sovereign CDS to prevent investment risks and propose a hybrid ensemble forecasting model: an autoregressive integrated moving average (ARIMA) model predicts the trend component, while the relevance vector machine (RVM) technique forecasts the market volatility and noise components, giving the model excellent robustness. They establish that, although the suggested model exhibits satisfactory prediction efficiency, there is scope for further improvement, and that beyond sovereign CDS time series the prediction model may be applied to other financial time series to test its generalizability. Nunes et al. (2019) concentrate on yield curve forecasting, the centerpiece of the bond markets. They apply ML, specifically multilayer perceptrons (MLPs), to analyze the yield curve as a whole, and show that MLPs can be utilized effectively to forecast the yield curve. As future work, they identify multitask learning as an important area of interest, believing that further research is required to identify the terms and conditions under which their methodology could be applied with enhanced performance.

To fill the gap in this research area, our study aims to predict bond price movements based on past prices through high-frequency trading. We compare machine learning methods applied to the fixed-income markets (sovereign, corporate and high-yield debt) in advanced and emerging countries in the one-year bond market for the period from 15 February 2000 to 12 April 2023.

Despite the limited number of observations at the 10-min frequency, the methodologies applied in this work are still able to produce estimates, something that would be impossible for conventional statistical methodologies and even for some simple computational ones. Some previous works have found greater availability of such data from OTC markets. Despite these limitations, several studies have investigated corporate bond data at 10-min intervals, such as Nowak et al. (2009), Aldana (2017), Gomber and Haferkorn (2015), Holden et al. (2018), and Gündüz et al. (2023).

We make at least two further contributions to the literature. First, we analyze the fixed-income market through HFT by comparing a wide range of innovative computational machine learning methodologies, whereas most previous studies employ statistical and econometric methods. In addition, the prior literature on portfolio optimization with fixed-income assets alone is limited, and even fewer studies deal with the use of HFT. With the ongoing advancement of financial markets, the share of HFT, generally characterized by fast update frequency and high trading speed, has increased steadily in recent years. HFT can also produce beneficial market effects, such as increasing market liquidity and improving risk-handling ability (Deng et al., 2021). Second, our study predicts bond price movements globally, not restricted to developed countries, making it relevant for those responsible for economic policy in any country. Whereas the relevance of public debt markets has generated innumerable papers on these markets in the United States and other advanced countries, comparatively limited research exists on emerging bond markets (Bai et al., 2013). In addition, our study considers not only sovereign bonds but also corporate and high-yield debt.

The rest of the paper is organized as follows. Sect. 2 describes the methodologies. Sect. 3 details the sample and data involved in the research. Sect. 4 presents the results and findings. Finally, Sect. 5 sets out the conclusions reached.

2 Methodologies

We have used different methods to predict bond price movements through HFT. Applying various techniques aims to obtain a robust model, tested not just via one categorisation technique but using those that have proven successful in the prior literature and in other areas. Specifically, this study applies the Quantum-Fuzzy Approach, Adaptive Boosting-Genetic Algorithm, Support Vector Machine-Genetic Algorithm, Deep Learning Neural Network-Genetic Algorithm, Quantum Genetic Algorithm, Adaptive Neuro-Fuzzy Inference System-Quantum Genetic Algorithm, Deep Recurrent Convolutional Neural Networks, Convolutional Neural Networks-Long Short-Term Memory, Gated Recurrent Unit-Convolutional Neural Networks, and Quantum Recurrent Neural Networks. The Deep Recurrent Convolutional Neural Networks and Quantum Genetic Algorithm techniques obtained the best results, as shown in Sect. 4, so these two methodologies are explained below; the rest are described in “Appendix 1” of this study.

2.1 Quantum Genetic Algorithm (QGA)

The quantum evolutionary algorithm (QEA) is an evolutionary algorithm built on the concepts of quantum computing. It introduces notions such as superposition states and incorporates the single encoding form to obtain improved experimental results in combinatorial optimisation problems. Nevertheless, when QEA is used to optimise multimodal functions, particularly high-dimensional multimodal function optimisation problems, it is likely to fall into a local optimum and its computational efficiency is poor.

This study improves the global optimisation capacity of the genetic algorithm and its local search ability according to the quantum probabilistic model, introducing a new type of quantum evolutionary algorithm, the "quantum genetic algorithm", to deal with the above deficiencies of QEA. This algorithm utilises the quantum probabilistic vector encoding mechanism and combines the crossover operator of the genetic algorithm with the updating strategy of quantum computation to effectively optimise the global search capacity of the quantum algorithm.

The quantum genetic algorithm steps are:

2.1.1 Step 1: Population Initialisation

The lowest unit of information in QGA is a quantum bit. The state of a quantum bit can be 0 or 1, expressed as:

$$ \left| \Psi \right\rangle = \alpha \left| 0 \right\rangle + \beta \left| 1 \right\rangle $$
(1)

where \(\alpha\) and \(\beta\) are two complex numbers satisfying the normalisation condition \(\left| \alpha \right|^{2} + \left| \beta \right|^{2} = 1\); \(\left| \alpha \right|^{2}\) and \(\left| \beta \right|^{2}\) denote the probabilities of the quantum bit being in state 0 and state 1, respectively.

The most commonly adopted coding techniques in EA are binary coding, decimal coding, and symbolic coding. QGA introduces a new method of coding based on the quantum bit, namely the use of a pair of complex numbers to describe each quantum bit. A system with m quantum bits is expressed as

$$ \left[ \begin{array}{c|c|c|c} \alpha_{1} & \alpha_{2} & \cdots & \alpha_{m} \\ \beta_{1} & \beta_{2} & \cdots & \beta_{m} \end{array} \right] $$
(2)

In the equation, \(\left| {\alpha_{i} } \right|^{2} + \left| {\beta_{i} } \right|^{2} = 1\) (i = 1, 2, …, m). This representation can describe any linear superposition of states. For instance, consider a system of three quantum bits with the following probability amplitudes:

$$ \left[ \begin{array}{c|c|c} \frac{1}{\sqrt{2}} & \frac{\sqrt{3}}{2} & \frac{1}{2} \\ \frac{1}{\sqrt{2}} & \frac{1}{2} & \frac{\sqrt{3}}{2} \end{array} \right] $$
(3)

The system state can be defined as

$$ \frac{\sqrt{3}}{4\sqrt{2}}\left| 000 \right\rangle + \frac{3}{4\sqrt{2}}\left| 001 \right\rangle + \frac{1}{4\sqrt{2}}\left| 010 \right\rangle + \frac{\sqrt{3}}{4\sqrt{2}}\left| 011 \right\rangle + \frac{\sqrt{3}}{4\sqrt{2}}\left| 100 \right\rangle + \frac{3}{4\sqrt{2}}\left| 101 \right\rangle + \frac{1}{4\sqrt{2}}\left| 110 \right\rangle + \frac{\sqrt{3}}{4\sqrt{2}}\left| 111 \right\rangle $$
(4)
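As a quick numerical check (ours, not part of the original exposition), the joint amplitudes of Eq. (4) can be reproduced in Python as the Kronecker product of the three qubits' amplitude vectors:

import numpy as np

# Amplitude vectors (alpha, beta) of the three qubits in Eq. (3)
q1 = np.array([1 / np.sqrt(2), 1 / np.sqrt(2)])
q2 = np.array([np.sqrt(3) / 2, 1 / 2])
q3 = np.array([1 / 2, np.sqrt(3) / 2])

# Joint state: amplitudes of |000>, |001>, ..., |111>
state = np.kron(np.kron(q1, q2), q3)
for idx, amp in enumerate(state):
    print(f"|{idx:03b}>: {amp:.4f}")
print("total probability:", np.sum(state ** 2))  # sums to 1.0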

2.1.2 Step 2: Conduct Individual Coding and Measuring of the Population Generating Units

QGA is a probabilistic algorithm analogous to EA. The population is \(H\left( t \right) = \left\{ {Q_{1}^{t} ,Q_{2}^{t} , \ldots ,Q_{h}^{t} , \ldots ,Q_{l}^{t} } \right\}\) (h = 1, 2, …, l), where l is the population size, and \(Q_{h}^{t} = \left\{ {q_{1}^{t} ,q_{2}^{t} , \ldots ,q_{j}^{t} , \ldots ,q_{n}^{t} } \right\}\), where n represents the number of generator units, t denotes the evolution generation, and \(q_{j}^{t}\) denotes the binary coding of the generation volume of the jth generator unit. Its chromosome is shown below:

$$ q_{j}^{t} = \left[ \begin{array}{c|c|c|c} \alpha_{1}^{t} & \alpha_{2}^{t} & \cdots & \alpha_{m}^{t} \\ \beta_{1}^{t} & \beta_{2}^{t} & \cdots & \beta_{m}^{t} \end{array} \right] $$
(5)

(j = 1, 2, …, n) (m is the length of the quantum chromosome).

During the “initialization of H(t)”, if all \(\alpha_{i}^{t}\) and \(\beta_{i}^{t}\) (i = 1, 2, …, m) in \(q_{j}^{t}\) are initialized equally, all the possible linear superposition states occur with equal likelihood. In the step of “generating S(t) from H(t)”, a common solution set S(t) is created through observation of the state of H(t), where, in the tth generation, \(S\left( t \right) = \left\{ {P_{1}^{t} ,P_{2}^{t} , \ldots ,P_{h}^{t} , \ldots ,P_{l}^{t} } \right\}\) and \(P_{h}^{t} = \left\{ {x_{1}^{t} ,x_{2}^{t} , \ldots ,x_{j}^{t} , \ldots ,x_{n}^{t} } \right\}\). Every \(x_{j}^{t}\) (j = 1, 2, …, n) is a string \(\left( {x_{1} ,x_{2} , \ldots ,x_{i} , \ldots ,x_{m} } \right)\) of length m, obtained from the quantum bit amplitudes \(\left| {\alpha_{i}^{t} } \right|^{2}\) or \(\left| {\beta_{i}^{t} } \right|^{2}\) (i = 1, 2, …, m). The procedure in the binary case is to draw a random number in [0, 1]: take “1” if it is larger than \(\left| {\alpha_{i}^{t} } \right|^{2}\), and “0” otherwise.
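This observation step can be sketched in Python as follows; the population size, chromosome length, and equal-amplitude initialisation are illustrative assumptions rather than the paper's settings:

import numpy as np

rng = np.random.default_rng(0)
l, m = 20, 16                                # population size, chromosome length

# Equal-superposition initialisation: every |alpha|^2 = 1/2
alpha = np.full((l, m), 1 / np.sqrt(2))

def observe(alpha, rng):
    # Collapse each qubit: take 1 if a uniform draw exceeds |alpha|^2, else 0
    r = rng.random(alpha.shape)
    return (r > alpha ** 2).astype(int)

S = observe(alpha, rng)                      # binary solution set S(t), shape (l, m)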

2.1.3 Step 3: Make An Individual Measure for Every Item in S(t)

Employ a fitness assessment function to test each object in S(t) and keep the best object of the generation. If a satisfactory solution is obtained, the algorithm stops; otherwise, it proceeds to the fourth step. When dealing with non-binary optimization problems, the chromosome is usually represented by a set of real-valued parameters rather than a binary string. In such cases, the fitness function is often a continuous function that maps the parameter values to a scalar value representing the fitness of the solution.

Consider a non-binary optimization problem with a chromosome composed of three real-valued parameters x1, x2, and x3; the fitness function for this problem could be defined as:

$$ f\left( {x_{1} ,x_{2} ,x_{3} } \right) = \left( {x_{1} - 3} \right)^{2} + \left( {x_{2} + 1} \right)^{2} + \left( {x_{3} - 2} \right)^{2} $$
(6)

The objective in this case is to minimize the fitness function: the QGA searches for a set of parameter values that produces the minimum fitness value. The process is similar to that for a binary problem, with the genetic operators applied to the real-valued parameters rather than to binary strings.
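A minimal sketch of this step follows, assuming a common binary-to-real decoding scheme that the paper does not specify:

import numpy as np

def fitness(x1, x2, x3):
    # Fitness function of Eq. (6); lower is better
    return (x1 - 3) ** 2 + (x2 + 1) ** 2 + (x3 - 2) ** 2

def decode(bits, low=-5.0, high=5.0):
    # Map a binary substring to a real value in [low, high] (one possible choice)
    value = int("".join(map(str, bits)), 2)
    return low + (high - low) * value / (2 ** len(bits) - 1)

chromosome = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 0])   # three 4-bit parameters
x1, x2, x3 = (decode(chromosome[i:i + 4]) for i in (0, 4, 8))
print("fitness:", fitness(x1, x2, x3))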

2.1.4 Step 4: Apply Genetic Operators to Create New Individuals

The crossover operator is applied by swapping some of the qubits between two chromosomes. One of the most commonly used crossover operators in QGA is the uniform crossover, which selects each qubit from one of the two parent chromosomes with a certain probability. The crossover operator can be represented mathematically as:

$$ \left| {\psi_{child} } \right\rangle = \alpha \left| {\psi_{parent1} } \right\rangle + \beta \left| {\psi_{parent2} } \right\rangle $$
(7)

where ∣ψparent1⟩ and ∣ψparent2⟩ are the two parent chromosomes, ∣ψchild⟩ is the resulting child chromosome, and α and β are complex coefficients determined by the crossover probability.

The mutation operator randomly flips some of the qubits in a chromosome. Mathematically, the mutation operator can be represented as:

$$ \left| {\psi_{mutated} } \right\rangle = U_{m} \left| {\psi_{original} } \right\rangle $$
(8)

where \(U_m\) is a single-qubit unitary gate that applies a random rotation around a Bloch sphere axis to the qubit being mutated. The mutation rate determines the probability of applying the mutation operator to each qubit in a chromosome.

It’s important to note that the application of genetic operators in QGA can be done in different ways, and the specific equations used can vary depending on the implementation and problem being solved.
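By way of illustration, one possible implementation of these operators on (α, β) amplitude pairs stored as 2 × m arrays is sketched below; the swap-based uniform crossover and the small random rotation standing in for U_m are our assumptions, not the paper's code:

import numpy as np

rng = np.random.default_rng(1)

def uniform_crossover(parent1, parent2, p=0.5):
    # Swap each qubit's (alpha, beta) column between the parents with probability p
    mask = rng.random(parent1.shape[1]) < p
    child1, child2 = parent1.copy(), parent2.copy()
    child1[:, mask], child2[:, mask] = parent2[:, mask], parent1[:, mask]
    return child1, child2

def mutate(chrom, rate=0.05):
    # Apply a small random rotation (the gate U_m) to each qubit with probability `rate`
    for j in range(chrom.shape[1]):
        if rng.random() < rate:
            theta = rng.uniform(-np.pi / 8, np.pi / 8)
            c, s = np.cos(theta), np.sin(theta)
            chrom[:, j] = [c * chrom[0, j] - s * chrom[1, j],
                           s * chrom[0, j] + c * chrom[1, j]]
    return chrom

p1 = np.vstack([np.full(8, 1 / np.sqrt(2)), np.full(8, 1 / np.sqrt(2))])
c1, c2 = uniform_crossover(p1, mutate(p1.copy()))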

2.1.5 Step 5: Apply An Appropriate Quantum Rotation Gate U(t) to Update S(t)

The conventional genetic algorithm uses mating, mutation, and similar operations to keep the population diverse, whereas the quantum genetic algorithm applies a logic gate to the probability amplitudes of the quantum states to preserve the diversity of the population. Hence, updating by a quantum gate is the essence of the quantum genetic algorithm. In the classical setting, the binary system, adaptation values, and the probability amplitude comparison technique are utilised for updating via a quantum gate. This approach is adequate for combinatorial optimisation problems whose optimum is known in principle. For real optimisation problems, however, especially the optimisation of multivariable continuous functions, the best solutions are in principle not available beforehand. Hence, a quantum rotation gate is adopted here for the new quantum genetic algorithm.

$$ U = \left[ {\begin{array}{*{20}c} {\cos \theta } & { - \sin \theta } \\ {\sin \theta } & {\cos \theta } \\ \end{array} } \right] $$
(9)

where \(\theta\) is the quantum gate rotation angle, whose value is given by

$$ \theta = k \cdot f\left( {\alpha_{i} ,\beta_{i} } \right) $$
(10)
$$ k = \pi \cdot \exp \left( { - \frac{t}{{iter_{\max } }}} \right) $$
(11)

We consider k a variable linked to the evolution generation, adjusting the mesh size in a self-adaptive way. Here t is the evolution generation and \(iter_{\max }\) is a constant that depends on the complexity of the optimization problem. The function \(f\left( {\alpha_{i} ,\beta_{i} } \right)\) causes the algorithm to search in the best direction: it is based on the idea of gradually bringing the current search solution closer to the optimal solution, thereby setting the direction of the quantum rotation gate.

Thus, the process of applying the quantum rotation gate to the entire probability amplitude of each individual in the population, namely applying the quantum rotation gate U(t) to update S(t), may be written as:

$$ S\left( {t + 1} \right) = U\left( t \right) \times S\left( t \right) $$
(12)

where t is the evolution generation, U(t) is the tth-generation quantum rotation gate, S(t) is the tth-generation probability amplitude of a given individual, and S(t + 1) is the \((t+1)\)th-generation probability amplitude of that individual.
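A hedged sketch of this update is given below; the direction argument stands in for the problem-specific look-up function f(α_i, β_i), which the paper leaves unspecified:

import numpy as np

def rotation_update(alpha, beta, direction, t, iter_max=200):
    # Eq. (11): step size k shrinks as the generation t grows
    k = np.pi * np.exp(-t / iter_max)
    # Eq. (10): rotation angle, with direction in {-1, 0, +1} per qubit
    theta = k * direction
    # Eq. (9): rotate each (alpha, beta) pair, i.e. S(t+1) = U(t) S(t) as in Eq. (12)
    c, s = np.cos(theta), np.sin(theta)
    return c * alpha - s * beta, s * alpha + c * beta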

2.1.6 Step 6: Perturbation

Since QGA is inclined to get caught at a local extreme value, we perturb the population. QGA analysis has shown that if the best individual of the present generation is a local extremum, the algorithm finds it very difficult to escape; the algorithm is considered stuck at a local extremum if the best individual remains unchanged over subsequent generations.

Finally, we show the pseudocode for the implementation of this method for the problem studied and the flowchart (Fig. 1) with the steps to follow, as described above.


Pseudocode of Quantum Genetic Algorithm

Fig. 1 Flowchart of quantum genetic algorithm

2.2 Deep Recurrent Convolutional Neural Network (DRCNN)

The RCNN model consists of a stack of recurrent convolutional layers (RCLs) and may include max pooling layers. To save computational resources, the first layer is a standard feed-forward convolutional layer with no recurrent connections, followed by a max pooling layer. Four RCLs are used with a max pooling layer in the middle, and there are only feed-forward links between adjacent RCLs. Both pooling operations have a stride of 2 and a size of 3. The output of the fourth RCL feeds a global max pooling layer that takes the maximum of each feature map, yielding a feature vector that represents the input. This approach differs from the model of Krizhevsky et al. (2017), which uses fully connected layers, and from the models of Lin et al. (2013) and Szegedy et al. (2017), which use global average pooling. Finally, a softmax layer classifies the feature vectors into C categories, with output:

$$ y_{k} = \frac{{\exp \left( {W_{k}^{T} x} \right)}}{{\sum\nolimits_{k^{\prime}} {\exp \left( {W_{k^{\prime}}^{T} x} \right)} }}\quad \left( {k = 1,2, \ldots ,C} \right) $$
(13)

where \(y_{k}\) is the predicted probability of belonging to the kth category and x is the feature vector generated by the global max pooling.

RNNs have been deployed successfully in many time series forecasting fields owing to their enormous predictive power. The standard RNN framework is structured so that the output depends on past estimations (Wan et al., 2017): a hidden state stores information about past inputs and is combined with the current input to make a prediction for the output at the current time step. The RCNN model incorporates this framework by using recurrent convolutional layers (RCLs) to capture the temporal dependencies in sequential data; the output of each RCL is a sequence of hidden states that can be used to make predictions about future inputs. The DRCNN model extends the RCNN by stacking RCLs into a deep architecture, with each layer applying a convolutional operation to the hidden states generated by the previous layer. The output of the last layer is then fed into a supervised learning layer to produce a prediction for the output at the current time step. The output of this RNN can be written as:

$$ y_{t} = f\left( {W_{y} s_{t} + b_{y} } \right) $$
(14)

where yt is the output at time step t, st is the hidden state at time step t, Wy is the weight matrix connecting the hidden state to the output, by is the bias term, and f is the activation function.

Given an input sequence vector x, the hidden states of a recurrent layer s and the output of a single hidden layer y can be obtained from formulas (15) and (16):

$$ s_{t} = \sigma \left( {W_{xs} x_{t} + W_{ss} s_{t - 1} + b_{s} } \right) $$
(15)
$$ y_{t} = o\left( {W_{so} s_{t} + b_{y} } \right) $$
(16)

where \(W_{xs}\), \(W_{ss}\), and \(W_{so}\) are the weights from the input layer x to the hidden layer s, from the hidden layer to itself, and from the hidden layer to its output layer, respectively; \(b_{s}\) and \(b_{y}\) are the biases of the hidden layer and output layer; and \(\sigma\) and \(o\) denote the activation functions in formulas (15) and (16).
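A minimal numpy sketch of this recurrence follows; the dimensions, tanh hidden activation, and identity output activation are illustrative choices rather than the paper's configuration:

import numpy as np

rng = np.random.default_rng(2)
n_in, n_hid, n_out, T = 1, 16, 1, 60          # e.g. 60 intraday returns

W_xs = rng.normal(0, 0.1, (n_hid, n_in))      # input -> hidden
W_ss = rng.normal(0, 0.1, (n_hid, n_hid))     # hidden -> hidden
W_so = rng.normal(0, 0.1, (n_out, n_hid))     # hidden -> output
b_s, b_y = np.zeros(n_hid), np.zeros(n_out)

x = rng.normal(size=(T, n_in))                # input sequence
s = np.zeros(n_hid)                           # initial hidden state
for t in range(T):
    s = np.tanh(W_xs @ x[t] + W_ss @ s + b_s) # Eq. (15)
    y = W_so @ s + b_y                        # Eq. (16)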

$$ STFT\left\{ {z\left( t \right)} \right\}\left( {\tau ,\omega } \right) = \int_{ - \infty }^{ + \infty } {z\left( t \right)\,\omega \left( {t - \tau } \right)e^{ - j\omega t} \,dt} $$
(17)

where z(t) denotes the input signals and ω(t) is a Gaussian window function centred around 0; the result T(τ, ω) is a complex function describing the signals over time and frequency. The hidden layers are then computed with the convolutional operations in formulas (18) and (19):

$$ S_{t} = \sigma \left( {W_{TS} * T_{t} + W_{SS} * S_{t - 1} + B_{s} } \right) $$
(18)
$$ Y_{t} = o\left( {W_{YS} * S_{t} + B_{y} } \right) $$
(19)

where W denotes the convolution kernels. Below we show pseudocode for the activation function of these RCNNs:

Pseudocode for Activation function

# Sigmoid activation applied layer by layer to the features
# produced by a convolution layer
# Input: feature vector from the convolution layer
# Output: activations squashed into (0, 1)

import numpy as np

activation_function = lambda y: 1.0 / (1.0 + np.exp(-y))

input_func = np.random.random(2)                         # feature vector (length 2)
K1, a1 = np.random.random((4, 2)), np.random.random(4)   # layer-1 weights and biases
K2, a2 = np.random.random((1, 4)), np.random.random(1)   # layer-2 weights and biases
K3, a3 = np.random.random((1, 1)), np.random.random(1)   # output weights and bias

layer1 = activation_function(np.dot(K1, input_func) + a1)
layer2 = activation_function(np.dot(K2, layer1) + a2)
output = np.dot(K3, layer2) + a3                         # linear output layer

To establish a deep architecture, recurrent convolutional neural networks (RCNNs) can be stacked to form the DRCNN (Huang & Narayanan, 2017). In this case, the last part of the model is a supervised learning layer, given by formula (20).

$$ \hat{r} = \sigma \left( {W_{h} *h + b_{h} } \right) $$
(20)

where Wh is the weight and bh the bias. The error between predicted and actual observations in the training data can be estimated and fed back into model training (Ma & Mao, 2019). Stochastic gradient descent is implemented to optimise parameter learning. Assuming that the real observation at time t is r, the loss function is given in formula (21):

$$ L\left( {r,\hat{r}} \right) = \frac{1}{2}\left\| {r - \hat{r}} \right\|_{2}^{2} $$
(21)

The number of filters equals the number of neurons, since each neuron performs a different convolution on the layer input. Filters are typically set in multiples of 32, within a range of 32–512. The filter size defines how many neighbouring data points a convolutional layer sees; the sizes most used in this work were 3 × 3 and 5 × 5. Stride and padding are filter parameters that modify the amount of movement over the observations; in this work, as is usual, a stride no greater than 2 × 2 and padding no greater than 1 × 1 were used. Finally, Fig. 2 provides a flowchart of the steps required to run this DRCNN.
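Complementing that flowchart, the sketch below, written with TensorFlow/Keras as an assumption (the paper does not name its framework), illustrates a stacked convolutional-recurrent forecaster using the hyperparameter ranges above; an LSTM layer stands in for the recurrent part of the stack:

import tensorflow as tf

def build_drcnn(seq_len=60, n_features=1):
    model = tf.keras.Sequential([
        # First feed-forward convolution: 32 filters, kernel 5, stride 2
        tf.keras.layers.Conv1D(32, kernel_size=5, strides=2, padding="same",
                               activation="relu",
                               input_shape=(seq_len, n_features)),
        tf.keras.layers.MaxPooling1D(pool_size=3, strides=2, padding="same"),
        tf.keras.layers.Conv1D(64, kernel_size=3, padding="same",
                               activation="relu"),
        tf.keras.layers.LSTM(64),              # recurrent part of the stack
        tf.keras.layers.Dense(1),              # next-interval price change
    ])
    # Squared-error loss as in Eq. (21), optimised by stochastic gradient descent
    model.compile(optimizer=tf.keras.optimizers.SGD(0.01), loss="mse")
    return model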

Fig. 2 Deep recurrent convolutional neural network flowchart

3 Sample and Data

We employ bond prices for a one-year bond market in the period from February 15th, 2000 to April 12th, 2023. The sample consists of ten sovereign bonds in five advanced economies (Germany, United States, Italy, Spain, and Japan) and five emerging countries (Turkey, Mexico, Indonesia, Nigeria, and Poland); eleven corporate bonds in five advanced economies (Walmart, Johnson & Johnson, Verizon, Unilever PLC, Rio Tinto PLC) and in six emerging economies (Air Liquide, Ambev, Cemex, Turkish Airlines, KCE Electronics, Telekomunikacja Polska); and, finally, ten high-yield bonds in five developed countries (Caesars Resort Collection LLC, Asurion LLC, Intelsat Jackson Holdings, Athenahealth Group, Great Outdoors Group LLC) and in five developing markets (Petroleos Mexicanos, Petrobras, Sands China Ltd, Indonesia Asahan Aluminium, Longfor Properties). “Appendix 3” displays detailed information about the features of every bond in the sample. We obtained the bond price data from Refinitiv's Eikon database. The trade data in Refinitiv comprise information on executed trades, such as price and volume, timestamped up to the microsecond with tools like Refinitiv Tick History. The order book information includes the limit price and order volume for both the bid and ask sides, covering levels one to ten. This information has been used by recent studies such as Clapham et al. (2022), Hansen and Borch (2022), and Dodd et al. (2023). Table 1 summarizes the sample according to every category of the fixed-income market used.

Table 1 Sample of bonds used

We categorize all trades that occur in the continuous session across the day as "continuous trades" and build "all trades" by adding the trades executed in the opening and closing sessions to the "continuous trades". To avoid dealing with asynchronous observations, we sample our data at 10, 30, and 60 min.

In addition, we measure the cost-effectiveness of our selected forecasting techniques with the following ratios.

3.1 Sign Prediction Ratio (SPR)

A correctly predicted price direction change is assigned 1, and − 1 otherwise. This ratio is defined as:

$$ SPR = \frac{{\mathop \sum \nolimits_{j = 1 + M/2}^{M} matches \left( {Y_{j} ,Y_{j}^{\prime} } \right)}}{M/2} $$
(22)

where “matches” is defined as

$$ matches \left( {Y_{j} ,Y_{j}^{\prime} } \right) = \left\{ {\begin{array}{*{20}l} {1 \;\;\;\;if\; sign \left( {Y_{j} } \right) = sign \left( {Y_{j}^{\prime} } \right)} \hfill \\ {0\;\;\;\; otherwise} \hfill \\ \end{array} } \right. $$
(23)

where the “sign function” assigns + 1 to positive arguments and − 1 to negative arguments.
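A compact Python sketch of this ratio on the out-of-sample half of the series (variable names are ours) is:

import numpy as np

def sign_prediction_ratio(y_true, y_pred):
    # Eqs. (22)-(23): share of out-of-sample intervals with matching sign
    M = len(y_true)
    start = M // 2                            # second half is out-of-sample
    matches = np.sign(y_true[start:]) == np.sign(y_pred[start:])
    return matches.mean()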

To correct possible deficiencies of the model in the precision of the predicted direction of price movements, we modify the previous equation by adding the Moving Average Convergence Divergence (MACD) indicator. The MACD is commonly calculated using the following equation:

$$ \text{MACD Line} = \text{12-day Exponential Moving Average (EMA)} - \text{26-day EMA} $$
(24)

The MACD line represents the difference between the 12-day EMA and the 26-day EMA. The EMA is a type of moving average that gives more weight to recent data points. By subtracting the longer-term EMA from the shorter-term EMA, the MACD line aims to capture the momentum and trend direction of the underlying asset (Chong & Ng, 2008; Ramlall, 2016; Sezer & Ozbayoglu, 2018). The approach of using MACD as a correction factor in the SPR can be modeled using the following equation:

$$ SPR_{Adjusted} = SPR + \left( {1 - SPR} \right) \times \left( {1 - MACD} \right) $$
(25)

where SPR is the original sign prediction ratio, MACD is the signal generated by the MACD model, and 1-MACD is used as a correction factor. The term (1-SPR) represents the complement of the original SPR, reflecting the portion of the original SPR that is not considered accurate. The term (1-MACD) represents the complement of the MACD signal, reflecting the extent to which the MACD signal indicates a potential reversal or correction in the market. The intuition behind this equation is that when the MACD signal is positive, it is likely that the market is trending upwards and the original SPR is more accurate. However, when the MACD signal is negative, it suggests a market reversal or a correction, and the original SPR may not be as accurate. In this case, the correction factor is used to adjust the SPR downward to reflect the possibility of a trend reversal (de Almeida & Neves, 2022; Ramlall, 2016; Slade, 2017).

By adding the correction factor to the original SPR, the adjusted SPRAdjusted takes into account the possibility of trend reversals or corrections indicated by the MACD signal. When the MACD signal is positive, the original SPR is considered more accurate and is only slightly adjusted. However, when the MACD signal is negative, indicating a potential trend reversal, the original SPR is adjusted more significantly downward to reflect the increased likelihood of a reversal.

This approach aims to combine the predictive power of the original SPR with the insights provided by the MACD signal, adjusting the SPR to account for potential trend changes. It recognizes that the MACD signal can act as a corrective factor when the market conditions indicate a higher likelihood of a trend reversal or correction.
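A hedged sketch of Eqs. (24)-(25) follows; clipping the raw MACD value into [0, 1] before use as a correction factor is one plausible reading of the text, not the authors' stated normalisation:

import pandas as pd

def adjusted_spr(prices: pd.Series, spr: float) -> float:
    ema12 = prices.ewm(span=12, adjust=False).mean()
    ema26 = prices.ewm(span=26, adjust=False).mean()
    macd = (ema12 - ema26).iloc[-1]           # Eq. (24): MACD line
    macd_unit = min(max(macd, 0.0), 1.0)      # clip to [0, 1] (assumption)
    return spr + (1 - spr) * (1 - macd_unit)  # Eq. (25)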

3.2 Ideal Profit Ratio (IPR)

This is the ratio between the total return and the maximum return.

$$ IPR = \frac{Total\;Return}{Maximum\;Return} $$
(26)

The Total Return is computed with the following formula, where “sign” denotes the sign function; the better the forecasting approach, the higher the total return.

$$ Total \;Return = \mathop \sum \limits_{j = 1 + M/2}^{M} sign \left( {Y_{j}^{\prime} } \right)*Y_{j} $$
(27)

The Maximum Return is determined by summing the absolute values of all observations and reflects the maximum achievable return under a perfectly foreseeable forecast. It is defined as:

$$ Maximum \;Return = \mathop \sum \limits_{j = 1 + M/2}^{M} abs \left( {Y_{j} } \right) $$
(28)
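In Python, Eqs. (26)-(28) can be sketched as follows (again on the out-of-sample half, with illustrative variable names):

import numpy as np

def ideal_profit_ratio(y_true, y_pred):
    M = len(y_true)
    start = M // 2                                                   # out-of-sample half
    total_return = np.sum(np.sign(y_pred[start:]) * y_true[start:])  # Eq. (27)
    maximum_return = np.sum(np.abs(y_true[start:]))                  # Eq. (28)
    return total_return / maximum_return                             # Eq. (26)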

The expected return is estimated using the Nelson-Siegel model, introduced by Nelson and Siegel in 1987. The model is based on the idea that the yield curve can be decomposed into three factors: the level factor, the slope factor, and the curvature factor. These factors capture the average level of interest rates, the steepness of the yield curve, and its degree of curvature, respectively. The model can be expressed mathematically as follows:

$$ r\left( t \right) = \beta_{1} + \beta_{2} \frac{{1 - e^{ - t/\tau } }}{t/\tau } + \beta_{3} \left( {\frac{{1 - e^{ - t/\tau } }}{t/\tau } - e^{ - t/\tau } } \right) $$
(29)

where r(t) represents the yield on a bond with time to maturity t, and β1, β2, β3, and τ are parameters to be estimated. The parameter β1 represents the long-term mean level of interest rates, β2 represents the slope of the yield curve at short maturities, β3 represents the curvature of the yield curve, and τ represents the time scale over which the yield curve adjusts to its long-term mean.
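The curve of Eq. (29) is straightforward to evaluate; the parameter values below are illustrative only, since the paper estimates β1, β2, β3, and τ from the data:

import numpy as np

def nelson_siegel(t, beta1, beta2, beta3, tau):
    # Eq. (29): level + slope + curvature components
    x = t / tau
    return (beta1
            + beta2 * (1 - np.exp(-x)) / x
            + beta3 * ((1 - np.exp(-x)) / x - np.exp(-x)))

maturities = np.array([0.25, 0.5, 1, 2, 5, 10])   # years
yields = nelson_siegel(maturities, beta1=0.04, beta2=-0.02, beta3=0.01, tau=1.5)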

The rolling regression method involves estimating the relationship between the excess returns of the bond portfolio and changes in the yield curve over a specified rolling time period, such as one month or one quarter. The slope of the regression line represents the expected excess return of the portfolio for a given change in the yield curve (Grinold & Ronald, 1999; Ibbotson & Kaplan, 2000).

The equation for the rolling regression model can be written as follows:

$$ \text{Excess Return} = \alpha + \beta \times \text{Yield Curve Change} + \varepsilon $$
(30)

where Excess Return is the excess return of the bond over the risk-free rate, typically estimated using a 3-month U.S. Treasury bond as the benchmark (Campbell et al., 2001), and Yield Curve Change is the change in the yield curve over the rolling period, computed from the Nelson-Siegel yields at each maturity point for the two curves. The term α is the intercept of the regression line, representing the expected excess return of the bond when the yield curve change is zero; β is the slope, representing the expected excess return of the bond for a one-unit change in the yield curve; and ε is the residual error term, the deviation of the actual excess return from the predicted excess return.

For its part, the Ideal Profit Ratio equation has been modified following works such as Elton et al. (1995) and Grinold and Ronald (1999). The ideal profit ratio measures the performance of an investment strategy relative to a benchmark; here it is calculated as the difference between the total return of the strategy and the expected return, divided by the maximum return.

To incorporate the excess return based on the yield curve into the calculation of the ideal profit ratio, you could modify the equation as follows:

$$ \text{Ideal Profit Ratio} = \frac{\text{Total Return} - \text{Expected Return}}{\text{Maximum Return}} $$
(31)

This modified equation measures the performance of the investment strategy relative to the benchmark, while taking into account the impact of the yield curve on the expected return of the portfolio. A higher ideal profit ratio indicates better performance relative to the benchmark.

Finally, after calculating the Ideal Profit Ratio above, its final value is the net value after applying transaction costs. In our case we used the difference between the average customer buy price and the average customer sell price on each day to quantify transaction costs, following the specification of Hong and Warga (2000) and Chakravarty and Sarkar (2003):

$$ TC_{AvgBidAsk} = \frac{{\overline{{P_{t}^{buy} }} - \overline{{P_{t}^{sell} }} }}{{0.5\left( {\overline{{P_{t}^{buy} }} + \overline{{P_{t}^{sell} }} } \right)}} $$
(32)

where \(\overline{{P_{t}^{buy/sell} }}\) is the average price of all customer buy/sell trades on day t. We calculate TCAvgBidAsk for each day on which there is at least one buy and one sell trade and use the monthly mean as a monthly transaction cost measure, following the specifications of previous works (Schestag et al., 2016).
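A sketch of this daily measure and its monthly aggregation is given below; the trade-record column names are our assumptions for illustration:

import pandas as pd

def monthly_transaction_cost(trades: pd.DataFrame) -> pd.Series:
    # trades: columns ['date', 'price', 'side'], side in {'buy', 'sell'} (assumed layout)
    daily = trades.pivot_table(index="date", columns="side",
                               values="price", aggfunc="mean").dropna()
    # Eq. (32): average buy minus average sell, scaled by the mid-price
    tc = (daily["buy"] - daily["sell"]) / (0.5 * (daily["buy"] + daily["sell"]))
    # Monthly mean, as in Schestag et al. (2016)
    return tc.groupby(pd.to_datetime(tc.index).to_period("M")).mean()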

4 Results

From the data described in the previous section, we build samples at 10-, 30-, and 60-min intervals and then implement the ten methods defined in Sect. 2. The training sample for the whole daily forecasting horizon is 50% of the total sample size, rounded to the nearest integer; the remaining 50% is used as the out-of-sample data set.

We use the two key performance measures defined in Sect. 3. The first is the sign prediction ratio, representing the proportion of times that the corresponding methodology accurately estimates the direction of the future price (up or down). Since correctly guessing the direction alone does not ensure better results, we also contrast the methodologies in profit terms: the ideal profit ratio is the relationship between the profitability generated by a particular method and that of a perfect sign forecast.

We implement the process above for "continuous trades" over the sample period and also apply it to "all trades" as a robustness check. Tables 2, 3, 4, 5, 6, and 7 display the results for each bond at different time scales for "continuous trades". The results for the "all trades" scenario are presented in “Appendix 2” via Tables 10, 11, 12, 13, 14, and 15.

Table 2 Sign Prediction Ratio (10 min) for continuous trades
Table 3 Ideal profit ratio (10 min) for continuous trades
Table 4 Sign Prediction Ratio (30 min) for continuous trades
Table 5 Ideal profit ratio (30 min) for continuous trades
Table 6 Sign Prediction Ratio (60 min) for continuous trades
Table 7 Ideal profit ratio (60 min) for continuous trades

Table 2 reports the sign prediction ratios of continuous trading for the ten techniques on the considered bonds at the 10-min frequency. QGA performs best across the 31 bonds, with an accuracy rate of over 0.772 and a mean of 0.881. DRCNN and DLNN-GA are the second- and third-best methods at predicting the change in bond price direction, with averages of 0.850 and 0.847, respectively. SVM-GA may also be regarded as a reference model for the other machine learning algorithms, being fourth best in the comparison. The fuzzy approach is the worst-performing technique, with an overall mean of 0.770 for the Qfuzzy method.

Table 3 shows the ideal profit ratios for the selected bonds at the 10-min time scale for every methodology. In line with the success rates in Table 2, QGA is again the best-performing method, as all bonds have a positive ideal profit ratio, with a mean value of 0.0175. Nevertheless, in contrast to the accuracy results, QRNN becomes the second-best performing method, as all bonds also have a positive ideal profit ratio, with a mean of 0.0140; QRNN is followed by ANFIS-QGA, with an average of 0.0134. In this case, SVM-GA and CNN-LSTM are the worst-performing models regarding profit generation for continuous 10-min trades. The maximum ideal profit ratio among sovereign bonds is 0.0212, reached by QGA for Turkey. Among corporate bonds, Telekomunikacja Polska reaches the maximum value of 0.0192, again under QGA. Finally, among high-yield bonds, Caesars Resort Collection LLC stands out with a value of 0.0192, also under QGA.

Table 4 presents the success ratios of continuous operations at the 30-min frequency. As at the 10-min frequency, the QGA method is once again the best performer in terms of mean sign prediction ratio, with an average of 0.860. Under QGA, Germany has the highest ratio among sovereign bonds at 0.906; among corporate bonds, Ambev has the highest value at 0.918; and among high-yield bonds, Caesars Resort Collection LLC ranks highest at 0.849. The next methods that correctly forecast the future direction of bond prices are DLNN-GA and DRCNN, with means of 0.839 and 0.829, respectively. SVM-GA could also be accepted as a good model for the sign prediction ratio. On the other hand, as at the 10-min frequency, the Qfuzzy method shows low sign prediction capacity, with a mean value of 0.750.

Examining the ideal profit ratios for continuous 30-min trades in Table 5, QGA again emerges as the best model, yielding the greatest profit ratio, with an average of 0.0193 and all bonds having a positive ratio. This result is in line with the success rates of the QGA method in Table 4. However, the techniques that predict the future direction of bond prices worst are in this case not the fuzzy ones but GRU-CNN and AdaBoost-GA, with mean values of 0.0063 and 0.0080, respectively.

On considering a sampling frequency of 60 min, Table 6 reveals that, in line with the previous results, QGA outperforms the other methods in mean sign prediction ratio, with an average of 0.838. As at the 10-min and 30-min time scales, the maximum success ratio across all bonds for QGA is achieved for the corporate bond "Ambev". Moreover, DRCNN and DLNN-GA correctly predict all bonds at a rate above 0.701. Furthermore, as at the 10-min and 30-min frequencies, the Qfuzzy method displays weak sign prediction ability, with a mean value of 0.732.

If we analyze Table 7, it is evident that the genetic algorithms obtain the best results for the 60-min frequency in the ideal profit ratio. QGA, ANFIS-QGA, and SVM-GA reflect, in this order, the highest ideal profit ratios, with all bonds having a positive ratio. DRCNN and DLNN-GA come next with average ratios of 0.0131 and 0.0128, respectively. The lowest values, in contrast to the previous table, are those of GRU-CNN and CNN-LSTM, with average values of 0.0032 and 0.0048, respectively.

When we examine continuous trading time series at 10, 30, and 60 min, we can reveal the impact of the sampling frequency on the prediction. Figure 3 shows that all methods achieve better Sign Prediction Ratio results at shorter sampling intervals. Nevertheless, as illustrated in Fig. 4, the case is otherwise for the Ideal Profit Ratio: AdaBoost-GA, SVM-GA, ANFIS-QGA, and DRCNN perform best for trading strategy setting at 60-min sampling, and QGA performs best at 30-min sampling. Following our results, not only the bonds and methodology but also the prediction intervals matter. Consequently, no single method is suitable for everything: a method appropriate for coarse data may not be appropriate for fine-grained data.

Fig. 3 Sign Prediction Ratio for continuous trades at different sampling frequencies of each method

Fig. 4 Ideal profit ratio for continuous trades at different sampling frequencies of each method

Tables 8 and 9 show the results for sovereign bonds at 1- and 5-min frequency intervals. In Table 8, the methodology with the best results is QGA for both ratios, sign prediction and ideal profit. For the sign prediction ratio at the 1-min frequency, German sovereign bonds reach the highest value (0.924); at the 5-min frequency, Italian sovereign bonds lead with a ratio of 0.946. For the ideal profit ratio, the best result is obtained by Turkey at the 1-min frequency (0.0229) and by Spain at the 5-min frequency (0.0247). For all trades, Table 9 shows that the best sovereign bond performance in sign prediction ratio at the 1-min frequency is Japan (0.930) under the Qfuzzy method; at the 5-min frequency, the DLNN-GA method performs best, with Germany having the highest ratio. Regarding the ideal profit ratio, QRNN is the best methodology, with Turkey obtaining the highest values at both frequencies: 0.0244 at 1 min and 0.0195 at 5 min.

Table 8 Sign prediction and ideal profit ratios in small frequency for continuous trades of sovereign bonds
Table 9 Sign prediction and ideal profit ratios in small frequency for all trades of sovereign bonds

In comparison with other works, Vukovic et al. (2020) obtain 82% accuracy on test cases for predicting future Sharpe ratio dynamics with neural networks. Nunes et al. (2019) achieve RMSE reductions, compared to the model without synthetic data, in the range of 11% to 70% (mean values, for forecast horizons of 15 and 20 days) when predicting the bond market yield curve with multilayer perceptrons. In summary, our study achieves high precision and exceeds the accuracy of previous work, with the genetic algorithms, especially QGA, obtaining the best results. Moreover, the previous literature dealing with fixed-income assets is not concerned with the use of HFT. Our results show that bond market transactions executed through HFT are faster and trading volume increases considerably, enhancing the liquidity of the bond market.

Finally, we analyse the cumulative net profits for each bond market (sovereign, corporate, and high-yield) and for each price window (10-min, 30-min, and 60-min, as well as 1-min and 5-min). These results are presented in “Appendix 4” via Figs. 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, and 15. Over the two decades under examination, all models encountered drawdowns of varying magnitudes, ranging from 5 to 15% at different points in time; on average, these drawdowns persisted for approximately 2.5 months. Model underperformance became evident during periods of extreme market volatility, exemplified by the 2008 financial crisis, which witnessed model losses surpassing the 20% mark; similarly, unexpected geopolitical events posed challenges, with losses reaching up to 18%. Our models demonstrated a propensity to falter when confronted with 'black swan' events of exceptional magnitude that surpassed historical data, as exemplified by the impact of the COVID-19 pandemic (Papadamou et al., 2021). These models also lose some performance when forecasting abrupt market shifts induced by unprecedented occurrences, such as major regulatory changes. Despite these limitations in predicting and achieving profits, our models achieve a higher and more consistent level of cumulative profits over time than previous work on algorithmic trading models, especially high-frequency trading models (Dixon et al., 2018; Rundo, 2019; Lahmiri & Bekiros, 2021; Goudarzi & Bazzana, 2023).

Fig. 5 Cumulative net profits for sovereign bonds (10-min)
Fig. 6 Cumulative net profits for corporate bonds (10-min)
Fig. 7 Cumulative net profits for high-yield bonds (10-min)
Fig. 8 Cumulative net profits for sovereign bonds (30-min)
Fig. 9 Cumulative net profits for corporate bonds (30-min)
Fig. 10 Cumulative net profits for high-yield bonds (30-min)
Fig. 11 Cumulative net profits for sovereign bonds (60-min)
Fig. 12 Cumulative net profits for corporate bonds (60-min)
Fig. 13 Cumulative net profits for high-yield bonds (60-min)
Fig. 14 Cumulative net profits for sovereign bonds (1-min)
Fig. 15 Cumulative net profits for sovereign bonds (5-min)

Sovereign Bonds exhibit a fluctuating pattern over the years, with a negative start in 2001 but a significant shift towards positive gains in 2002. This positive trend continued until 2006, followed by intermittent fluctuations. By 2023, net gains had stabilized at a relatively positive level, demonstrating the resilience of these bonds. Corporate Bonds also had a negative start in 2001 but saw notable improvements in 2002, with consistent gains until around 2006. There was volatility in the subsequent years, with lucrative moments such as in 2010 but also difficulties. By 2023, corporate net gains appear to have regained a positive trajectory. High-Yield Bonds started in the negative in 2001 and remained mostly so until 2003. They then experienced a period of consistent gains until around 2011, followed by volatility. In 2023, they maintain positive cumulative gains, albeit more moderate.

At the beginning of the 2000s, central banks, particularly the U.S. Federal Reserve, had a more neutral monetary policy stance. Interest rates were relatively higher compared to the 2010s (Jarrow, 2019). However, following the burst of the dot-com bubble and the September 11 attacks in 2001, central banks, including the Federal Reserve, lowered interest rates to stimulate economic growth. These rate cuts resulted in lower yields on government bonds (Fabozzi & Fabozzi, 2021).

Bond yields, especially in the U.S., remained relatively low during the first half of the decade but started to rise as the economy improved. The latter part of the 2000s was marked by the U.S. housing bubble and the subsequent global financial crisis of 2008. These events led to a flight to safety, with investors seeking refuge in government bonds, particularly U.S. Treasuries (Gilchrist et al., 2019). This increased demand for government bonds drove prices up and yields down.

Corporate bonds in the early 2000s offered higher yields compared to government bonds, reflecting the risk premium associated with corporate debt. However, during the financial crisis, corporate bond yields rose significantly as investors became concerned about the creditworthiness of corporations (Jarrow, 2019). Bond spreads, which measure the difference in yields between corporate bonds and government bonds, widened substantially during this period. Emerging market bonds experienced mixed performance during the 2000s. Some emerging market economies attracted foreign investment, leading to lower yields on their bonds. However, there were instances of bond market turmoil in emerging markets, driven by factors such as currency devaluations and political instability (Beirne and Sugandi, 2023).

Regarding the 2010s decade, central banks, particularly in developed economies like the United States, Europe, and Japan, implemented accommodative monetary policies in response to the global financial crisis of 2008 (Albagli et al., 2018). These policies included near-zero or negative interest rates and large-scale bond-buying programs (quantitative easing) aimed at stimulating economic growth. As a result, yields on government bonds, which serve as benchmarks for other fixed-income securities, remained historically low (Blanchard, 2023). The low yield environment prompted investors to seek higher-yielding assets, which sometimes led to increased demand for riskier bonds, such as high yield or corporate bonds. This increased demand pushed up bond prices and drove yields lower. The global economy experienced a prolonged period of low inflation and, at times, deflationary pressures during the 2010s. Low inflation expectations are often associated with lower yields on fixed-income securities (Fabozzi & Fabozzi, 2021).

Regulatory changes in the financial industry, such as Basel III banking regulations, encouraged financial institutions to hold more high-quality liquid assets, including government bonds (Ranaldo et al., 2019). This increased demand for government bonds also contributed to lower yields. While low yields were a prominent feature of the 2010s bond market, it's essential to note that not all bonds experienced the same level of yield compression. The extent of yield compression varied among different types of bonds, and some segments of the bond market, like high yield or emerging market bonds, offered higher yields to compensate for increased risk (Fabozzi & Fabozzi, 2021).

5 Conclusions

This study has developed a comparison of methodologies to predict bond price movements based on past prices through high-frequency trading. We compare ten machine learning methods applied to the fixed-income markets in sovereign, corporate, and high-yield debt, in both developed and emerging countries, in the one-year bond market for the period from 15 February 2000 to 12 April 2023. Our results indicate that QGA, DRCNN, and DLNN-GA can correctly interpret the expected future direction of bond prices and rate changes satisfactorily. Qfuzzy, in contrast, is not adequate for forecasting high-frequency returns, and dealers ought to avoid this model in their trading decisions in the sampled bond market.

Our study shows that all methods deliver better Sign Prediction Ratio results at shorter sampling intervals. At the 10-min frequency, the QGA method is the best performer across all bonds, with an accuracy rate higher than 0.772 and a mean of 0.881; DRCNN and DLNN-GA are the second- and third-best methods at predicting the change in bond price direction, with averages of 0.850 and 0.847, respectively. For the Ideal Profit Ratio, however, not all methods perform best at the highest frequency: SVM-GA, ANFIS-QGA, and DRCNN perform best for the trading strategy configuration at 60-min sampling, and QGA performs best at 30-min sampling. It is therefore important to consider that, depending on the sampling frequency and the objective of the approach, one method does not fit all, and a mixture of different alternative techniques must be examined.

In contrast to previous research, this study achieves better accuracy and compares innovative ML methods using HFT, which had not previously been applied in the bond market. ML algorithms have become widely available for fixed-income market analysis, especially as uncertainty in financial markets has risen sharply. In addition, our study predicts bond price movements globally, so it is not exclusively focused on industrialized countries. Finally, it includes not only sovereign bonds but also corporate and high-yield debt, making it of interest to policymakers in any country.

Our study provides important benefits in the field of finance. From a trading perspective, it strengthens the implementation of reliable and fast bond price forecasting systems, including the pursuit of returns and volatility targeting, and can exploit the information in indirect market-based monetary policy instruments and the macro environment. Adequate bond price predictability can reduce medium- and long-term debt servicing costs through the development of a deep and liquid market for government securities. At the microeconomic level, a robust bond price prediction model can increase overall financial stability and enhance financial intermediation via increased competition and the development of related financial infrastructure, products, and services. More generally, financial crises tend to arise in credit markets; our model has the potential to give financial institutions information on the effects of policy measures on credit market fragility and a better understanding of how market trends influence liquidity provision, implementation costs, and transaction prices.

In summary, our paper has broad prospective impact. It can facilitate the work of trading professionals at financial institutions as well as private investors and other stakeholders. This research makes an important contribution to high-frequency trading, as its conclusions have important implications for investors and market participants seeking economic and financial profits from the bond market.

Our work is limited by data availability at the 10- and 30-min price frequencies for corporate debt securities. For this type of research to generalize better for fixed-income market practitioners, greater data availability would be necessary. We leave this issue for future research, in which more complex trading strategies can be organized to test and demonstrate the effectiveness of the techniques presented in this work for trading in debt securities.

In addition, further research should broaden the scope of the comparative analysis of methodologies to cover crypto-assets, such as cryptocurrencies and fan tokens, since financial institutions have increasingly incorporated crypto-assets into their portfolios in recent years.