1 Introduction

Quantitative trading is supported by mathematical statistics and can be used to quickly process large amounts of financial information. Supervised machine learning (ML) approaches have been developed to solve financial prediction problems by using sets of historical stock prices as features to predict future stock prices (Lee et al. 2020). Recent deep learning (DL) models have outperformed traditional statistical and ML approaches on many stock market prediction tasks. Most ML approaches rely on optimization algorithms to obtain the best model parameters or rules, and different optimization algorithms lead to different model performances (Liu et al. 2021; Ozcalici and Bumin, 2022). However, although ML and DL algorithms have been used to learn and predict stock prices or trading actions, overfitting still occurs in the training stage, and the resulting models cannot adapt to future data. Advanced models based on generative adversarial networks (GANs), in which a generator and a discriminator compete in a game, differ from traditional supervised learning models. The GAN model undergoes a two-stage learning process involving real data and fake data (Gulrajani et al. 2017; Abraham, 2021). The generator in a GAN is required to produce output that is close to the real data, and the discriminator must be able to discriminate whether an input is real or comes from the generator. In many studies, GAN models have generated additional training samples for model training and enhanced model performance in terms of price prediction (Xu et al. 2022; Kumar et al. 2022). According to the aforementioned studies, GAN models outperform supervised learning approaches; therefore, GAN models can be built into powerful forecasting models for the financial field. GAN models adapted for stock trade forecasting can converge from multiple directions, which differs from the traditional supervised learning approach. However, a GAN requires a large amount of high-quality real data because the distribution of the real data is the learning target that the discriminator must recognize. Therefore, a high-quality trading action sequence should be derived from historical stock prices to serve as the real data for the GAN model. In addition, a piecewise linear representation (PLR) approach can be used to identify a series of bottom and peak points in time series data (Wu and Chang 2012). PLR can identify many of the turning points required to guide trading actions such as buying and selling.

A GAN model can be used to generate similar examples, such as trading actions, and these similar trading actions may mitigate the overfitting problem because the GAN learns through an adversarial mechanism instead of directly fitting the target trading actions. To obtain high-quality trading actions in the training stage, the PLR approach can provide seed trading actions for the GAN model because it performs an optimization search over historical stock prices. Therefore, the trading actions finally predicted by the PLR-based GAN form a more diverse trading action sequence. This paper proposes a GAN-based framework that can learn and generate trading actions based on historical trading information and technical indices. The objectives of this paper are as follows: (1) the PLR approach is used to generate suitable trading points as a reference trading strategy; (2) a GAN-based framework is proposed in which the generator generates a predicted stock trading action sequence and the discriminator detects whether an input sequence is the real sequence from the PLR or a fake sequence from the GAN's generator; and (3) the GAN-based framework improves prediction performance compared with other DL models. In addition, we apply other GAN-based frameworks, namely the least squares GAN (LSGAN) (Mao et al. 2017) and the Wasserstein GAN (WGAN) with gradient penalty (Gulrajani et al. 2017), to achieve superior convergence effects relative to the DL model. We aim to use the capabilities of the proposed GAN to generate stock trading action sequences that can yield a high return on investment (ROI).

The rest of the paper is organized as follows. The literature is reviewed in Sect. 2. The methodology is introduced in Sect. 3, followed by the data and experimental results in Sect. 4 and a performance summary in Sect. 5. Finally, we discuss the theoretical and practical implications of this work in Sect. 6 and conclude the paper in Sect. 7.

2 Related work

In this section, we review the literature on the development and application trends in stock prediction, machine learning, deep learning, and generative adversarial networks.

2.1 Technical analysis and indicator

According to the efficient market hypothesis, only in an inefficient market is it possible to use technical analysis and historical prices to predict future prices and achieve returns. Although the Taiwan stock market is extremely liquid and dominated by high-tech industries, it is an inefficient market when tested using Wald's weak-form market efficiency detection model, the cross-sectional standard deviation, and the cross-sectional absolute deviation. Therefore, on the Taiwan Stock Exchange, using historical prices to generate stock trade predictions may yield returns higher than those obtainable from the broader market (Nguyen et al. 2012). Technical analysis uses historical trading information, such as historical prices and trading volume, as the basis for trading strategies, and many studies have focused on developing or applying technical analysis tools or indicators to obtain profits from the stock market. The features used for model training are therefore no longer limited to stock price and volume. For example, candlesticks are calculated from the opening, highest, lowest, and closing prices, and they facilitate easy observation of price changes in each time interval. A moving average captures the price trend over a certain period; it is the average of a stock's price at several points in the past. Many technical indicators have been used to solve stock prediction problems, such as candlestick charts (Thammakesorn and Sornil, 2019), different types of moving averages (Detzel et al. 2021; Dai et al. 2021), Bollinger bands, and the relative strength index (RSI) (Pramudya and Ichsani, 2020; Pramudya, 2020). In sum, technical indicators facilitate the observation of current market trends, and they have certain effects on stock prediction tasks.

2.2 Applications of deep learning in finance

The DL framework is based on the neural network model, which is a nonlinear model composed of multiple neurons. Vijh et al. (2020) showed that multi-layer neural networks can be used to predict stock closing prices. In terms of price prediction, models optimized with the backpropagation algorithm exhibit better validity and superiority on price prediction problems. Recurrent neural networks (RNNs) are designed to process sequential or time series data, and they have been used to predict stock prices on the stock market (Saud and Shakya 2020). Long short-term memory (LSTM) networks are advanced RNN models composed of input, output, and forget gates that determine whether to retain past features or refresh the feature list. In general, LSTM networks outperform standard RNN models (Lee et al. 2020). The convolutional neural network (CNN), which was first developed for image classification, can also be applied to the financial field. A convolutional LSTM model that combines a CNN and an LSTM has been used to predict the prices of multiple cryptocurrencies (Alonso-Monsalve et al. 2020). In summary, DL models can capture useful features by themselves, and their flexible structures can be used to predict stock prices or generate trading rules.

2.3 Generative adversarial networks

GANs are based on deep learning, and they can be applied to unsupervised generation problems involving data, images, audio, and text. The original GAN model is composed of two neural network models, namely a generator and a discriminator (Goodfellow et al. 2014). GANs were first used for image generation with a CNN as the basic model structure, and they have since been applied to various fields, including problems related to time series. A VAE-GAN model that uses LSTM networks as the encoder, generator, and discriminator has been used to detect abnormal conditions in IoT systems (Niu et al. 2020). A random sample drawn from the latent space can serve as the input of the generator when training the discriminator and generator, and the difference between the generator's output and the actual data is then calculated (Bashar and Nayak, 2020). However, the original GAN model, whose objective corresponds to the Jensen–Shannon divergence between the distributions of the generated data and the actual data, is limited by the vanishing gradient problem, which makes it difficult for the model to converge. The WGAN model (Arjovsky et al. 2017) was proposed to measure this deviation with the Wasserstein distance instead and to implement model training with weight clipping: if a network weight exceeds the allowed range, it is reset to the boundary value. Moreover, replacing weight clipping with a gradient penalty to enforce the Lipschitz constraint (WGAN-GP) converged more easily and yielded better image generation results than the weight clipping method (Gulrajani et al. 2017). In the least squares GAN (LSGAN) model, the loss function is changed from cross entropy to mean square error (MSE) (Mao et al. 2017). When applied to image generation, this model converged faster and in a more stable manner than the WGAN.

2.4 Applications of GANs in finance

Goh and Lai (2019) used a GAN to establish an association network between the US dollar and 15 other currencies. The network was used to generate data close to actual currency data and to observe annual changes in the correlations among various currencies. Moreover, Marti (2020) used a GAN to sample realistic financial correlation matrices, taking 1 year of data from randomly selected S&P 500 stocks as a sample and attempting to generate new covariance matrices with the same distribution. Zhou et al. (2018) proposed a GAN to generate minute-by-minute historical information. Wiese et al. (2020) combined a temporal convolutional network with a GAN to develop the Quant GAN model for generating new samples of S&P 500 index prices; their results showed that the Quant GAN model was superior to other methods. In addition, GANs have been used to generate financial data for training other models. For example, Abraham (2021) used a GAN to increase the number of stock samples and thereby improve the accuracy of a classification model for predicting future stock prices. Yoon et al. (2019) developed the time-series generative adversarial network model, in which the generator and the discriminator are composed of LSTM networks, to generate extra samples that approximate real data. Moreover, the stock-GAN model proposed by Li et al. (2020), which combines the WGAN-GP and conditional GAN models, was used to generate order-by-order data in the stock market to simulate actual stock order scenarios. Compared with the variational autoencoder and deep convolutional GAN models, the difference between the stock-GAN-generated data and the original data was smaller, meaning that the order data generated by the stock-GAN were closer to the real data. Most financial applications of GANs use existing price and volume information to generate more price and volume information, which is then used to train predictive models. The generated price and volume information can be input into deep learning models, which require large volumes of data for training. In addition, the WGAN-GP model has been used to simulate the generation of future stock price series. To avoid model overfitting, Lezmi et al. (2020) used the WGAN-GP model to generate daily changes in financial indicators such as the S&P 500 and VIX. The WGAN-GP model can generate sample data that exhibit the statistical characteristics of real data.

3 Methodology

In this study, we develop a GAN-based framework that combines the PLR approach, LSTM networks, and three GAN models to generate trading actions with high performance. The overall process is illustrated in Fig. 1, and it can be divided into four stages as follows:

  • Stage I: Data preprocessing:

    • Step 1: Create input features for model training including daily opening price, closing price, highest price, lowest price, trading volume, and technical indices.

    • Step 2: Employ the PLR approach to generate real output targets such as trading actions by identifying trading points (trading timing) based on the adjusted closing price. Subsequently, transform all trading points to trading actions, such as buying, selling or holding.

    • Step 3: Merge the features and trading actions to obtain sample data by date.

  • Stage II: GAN-based framework building:

    • Step 1: Build a generator in the GAN-based framework to predict trading sequences based on features.

    • Step 2: Build a discriminator in the GAN-based framework to determine whether an input trading sequence is the real trading sequence obtained using PLR or the trading sequence predicted by the GAN’s generator.

  • Stage III: Model training and optimization

    • Ensure the convergence of all model parameters of the GAN-based frameworks according to the different loss functions, namely cross entropy and MSE.

  • Stage IV: Performance evaluation

    • Use the cumulative return on investment (CR), the Sharpe ratio (SR), and the winning percentage (WPCT) to evaluate the performance of each of the GAN-based frameworks.

Fig. 1
figure 1

Process flow of the proposed GAN-based framework

The detailed process flow of the proposed GAN-based framework is as follows:

3.1 Data preprocessing

The process of generating training samples involves three steps, namely feature generation, trading sequence generation, and training sample generation.

3.1.1 Feature generation as input

The daily opening price, closing price, highest price, lowest price, adjusted closing price (\(\overline{P }\)), and trading volume are used to calculate technical indices as input features; these consist of 152 technical indicators generated using 97 technical analysis methods. A total of 157 features are used, comprising the 152 technical indicators and 5 items of raw trading data (open, close, high, low, and volume). The final input feature vector for each day is denoted as \(F\).
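
The full indicator set is not listed in this section; the sketch below, which assumes pandas and a handful of common indicators (moving averages, RSI, and Bollinger bands) chosen purely for illustration, shows the general shape of the feature generation step.

```python
import pandas as pd

def build_features(df: pd.DataFrame) -> pd.DataFrame:
    """Illustrative feature generation: raw OHLCV columns plus a few common indicators.

    `df` is assumed to hold Open, High, Low, Close, and Volume columns indexed by date.
    The paper uses 152 indicators from 97 technical analysis methods; only a few
    representative ones are sketched here.
    """
    feat = df[["Open", "High", "Low", "Close", "Volume"]].copy()

    # Simple and exponential moving averages of the closing price.
    feat["SMA_10"] = df["Close"].rolling(10).mean()
    feat["EMA_10"] = df["Close"].ewm(span=10, adjust=False).mean()

    # 14-day relative strength index (RSI).
    delta = df["Close"].diff()
    gain = delta.clip(lower=0).rolling(14).mean()
    loss = (-delta.clip(upper=0)).rolling(14).mean()
    feat["RSI_14"] = 100 - 100 / (1 + gain / loss)

    # Bollinger bands (20-day window, 2 standard deviations).
    ma20 = df["Close"].rolling(20).mean()
    sd20 = df["Close"].rolling(20).std()
    feat["BB_upper"] = ma20 + 2 * sd20
    feat["BB_lower"] = ma20 - 2 * sd20

    # Null values are set to 0, matching the preprocessing described in Sect. 4.1.
    return feat.fillna(0.0)
```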

3.1.2 Trading sequence generation as target

The PLR approach is used to generate a known turning point sequence \(D=[{d}_{1},{d}_{2},\dots ,{d}_{{T}^{\mathrm{train}}}]\) from historical data \(\overline{P }=[{\overline{p} }_{1},{\overline{p} }_{2},\dots ,{\overline{p} }_{{T}^{\mathrm{train}}}]\), where \({T}^{\mathrm{train}}\) is the length of the trading period in the training data. Then, the turning point sequence is transformed into a trading action sequence \(A=[{a}_{1},{a}_{2},\dots ,{a}_{{T}^{\mathrm{train}}}]\). The steps involved in trading sequence generation are as follows:

  • Step 1: Set the start and end points on the sequence for training data segmentation.

  • Step 2: Compute a simple linear regression function by using the least squares method.

  • Step 3: Find the point t with the maximum deviation between the actual stock price and the price predicted by the linear regression function over the entire segment. This point becomes the new segmentation point t; set \({d}_{t}=1\).

  • Step 4: Perform recursive segmentation at the newly segmented point, which splits the segment into left and right sub-segments. Repeat Steps 1–4 for each new segment until the stopping threshold \(\upgamma \) is satisfied; the threshold is \(\upgamma \) times the standard deviation of the segment. Finally, the turning point sequence \(D\) is obtained upon completion of Steps 1–4, and the set of turning point indices is \(DI=\{t \mid {d}_{t}=1\}\).

  • Step 5: Transform the final turning point sequence \(D\) into a trading action sequence \(A\), whose elements are 0 by default. The trading actions at the turning points are defined as follows (a code sketch of Steps 1–5 is given after Eq. (2)):

    $${a}_{{DI}_{q}}=\left\{\begin{array}{ll}1, & \mathrm{if}\;{\overline{p}}_{{DI}_{q}}<{\overline{p}}_{{DI}_{q-1}}\\ 2, & \mathrm{if}\;{\overline{p}}_{{DI}_{q}}\ge {\overline{p}}_{{DI}_{q-1}}\end{array}\right.$$
    (1)

    where \({a}_{{DI}_{q}}\) refers to the action type at the qth turning point index in \(DI\), and \({\overline{p}}_{{DI}_{q}}\) is the adjusted closing price at that turning point; action type 1 denotes buying and action type 2 denotes selling.

    For model training, the trading action sequence is converted into a matrix of the training target with one-hot encoding as follows:

    $$ A^{\prime } = {\text{one}}\_{\text{hot}}(A) $$
    (2)

    where \(A^{\prime}\in {\mathbb{R}}^{{T}^{\mathrm{train}}\times 3}\) is the training target, and \(\mathrm{one}\_\mathrm{hot}\) is the one-hot encoding function.
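
The following sketch illustrates Steps 1–5 under one reading of the stopping rule (stop when the largest deviation from the fitted line is no more than \(\upgamma \) times the standard deviation of the segment); the function names and the handling of the first turning point are illustrative and may differ from the authors' implementation.

```python
import numpy as np

def plr_turning_points(prices: np.ndarray, gamma: float = 0.1) -> np.ndarray:
    """Recursive PLR segmentation (Steps 1-4). Returns d with d[t] = 1 at turning points."""
    d = np.zeros(len(prices), dtype=int)

    def segment(lo: int, hi: int) -> None:           # segment over [lo, hi] inclusive
        if hi - lo < 2:
            return
        t = np.arange(lo, hi + 1)
        y = prices[lo:hi + 1]
        slope, intercept = np.polyfit(t, y, 1)       # least squares regression line
        deviation = np.abs(y - (slope * t + intercept))
        interior = deviation[1:-1]                   # candidate split points exclude endpoints
        if interior.max() <= gamma * y.std():        # stopping threshold
            return
        split = lo + 1 + int(interior.argmax())      # point of maximum deviation
        d[split] = 1
        segment(lo, split)                           # recurse on left and right segments
        segment(split, hi)

    segment(0, len(prices) - 1)
    return d

def actions_from_turning_points(prices: np.ndarray, d: np.ndarray) -> np.ndarray:
    """Step 5: label turning points as buy (1) or sell (2); all other days are 0 (hold)."""
    a = np.zeros(len(prices), dtype=int)
    idx = np.flatnonzero(d)                          # DI: indices of turning points
    for q in range(1, len(idx)):                     # the first turning point is left as hold
        a[idx[q]] = 1 if prices[idx[q]] < prices[idx[q - 1]] else 2
    return a
```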

3.1.3 Training sample generation

In this section, a special training sampling process is considered. A trading period length \(TP\) and a number of historical days \(S\) are considered for the prediction on each trading day. A complete training sample \({\mathbb{X}}\) is composed of \(X\) vectors with length \(TP\) drawn from \(F\):

$${\mathbb{X}}=[{X}_{1},\dots {X}_{k}\dots ,{X}_{TP}]$$
(3)

where \({X}_{k}\) denotes the features on the kth trading day. Each X contains \(S\) features:

$${X}_{k}=\left[{F}_{k-S},{F}_{k-S+1},\dots ,{F}_{k-1}\right]$$
(4)

where \({F}_{k-S}\) denotes the features of the day that precedes day \(k\) by \(S\) days. The same sampling process is applied to the training target:

$$Y=\left[{a}_{1}^{^{\prime}},\dots {a}_{k}^{^{\prime}}\dots ,{a}_{TP}^{^{\prime}}\right]$$
(5)

For the validation and test data, the adjusted closing price is not used to calculate the target trading actions, and sampling of the \(TP\) length is not performed.
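
A minimal sketch of this sampling step is shown below; the array shapes and the assumption that at least \(S\) days of history are available before the first trading day are illustrative.

```python
import numpy as np

def build_training_sample(F: np.ndarray, A_onehot: np.ndarray, TP: int, S: int):
    """Build one training sample (X, Y) as in Sect. 3.1.3.

    F        : (T_train, 157) daily feature matrix
    A_onehot : (T_train, 3)   one-hot trading actions from the PLR step
    Each X_k stacks the S previous days' features; F must contain at least S + TP rows.
    """
    X, Y = [], []
    for k in range(S, S + TP):            # TP consecutive trading days
        X.append(F[k - S:k])              # (S, 157) history window for day k
        Y.append(A_onehot[k])             # (3,) target action of day k
    return np.stack(X), np.stack(Y)       # shapes (TP, S, 157) and (TP, 3)
```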

3.2 GAN-based framework building

The proposed GAN-based model is composed of a generator and a discriminator. The generator predicts the trading sequence \(\widetilde{Y}\) based on the features \({\mathbb{X}}\) of the training samples; the discriminator distinguishes whether the trading sequence is the predicted trading sequence \(\widetilde{Y}\) or the real trading sequence \(Y\).

3.2.1 Generator

The generator comprises two LSTM layers and two linear layers, where the first layer \({G}^{1}\) estimates the hidden feature of a day based on the features of the previous day, the second layer \({G}^{2}\) estimates the sequential hidden features by using the output of the first layer, and the third layer \({G}^{3}\) predicts the probabilities of trading actions by using the two linear layers. First, we input all \(X\) into the first LSTM layer \({G}^{1}\) to obtain the hidden features of each day \({h}_{s,k}^{1}\) through the s-th time point feature \({x}_{k}^{s}\):

$${h}_{s,k}^{1}={G}^{1}\left({x}_{k}^{s}\right), s\in [1,\dots ,S]$$
(6)

Subsequently, we use the last hidden feature at the Sth point to represent the final state of each day. Therefore, the final state vectors \({[h}_{S,1}^{1},\dots ,{h}_{S,k}^{1}\dots ,{h}_{S,TP}^{1}]\) from \({G}^{1}\) are input into the second LSTM layer \({G}^{2}\) to obtain the sequential hidden features \({h}_{k}^{2}\) as follows:

$${h}_{k}^{2}={G}^{2}\left({h}_{S,k}^{1}\right),k\in [1,\dots ,\mathrm{TP}]$$
(7)

The output \({h}_{k}^{2}\) of the second layer is passed through \({G}^{3}\) to predict trading actions. In \({G}^{3}\), pipeline estimation is performed using a first linear layer with a leaky ReLU function and a second linear layer with a softmax function:

$${\widetilde{y}}_{k}={G}^{3}({h}_{k}^{2})$$
(8)

where \({\widetilde{y}}_{k}\) contains \({\widetilde{y}}_{k}^{0}\), \({\widetilde{y}}_{k}^{1}\), and \({\widetilde{y}}_{k}^{2}\) for each trading day; these represent the probabilities of holding, buying, and selling, respectively:

$$\widetilde{Y}=[{\widetilde{y}}_{1},\dots {\widetilde{y}}_{k}\dots ,{\widetilde{y}}_{TP}]$$
(9)

where \(\widetilde{Y}\) represents the predicted trading action sequences on all trading days over the entire trading period. The proposed generator is also called the hierarchical-LSTM (H-LSTM) model.
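
A minimal PyTorch sketch of the H-LSTM generator is given below; the hidden size, batch handling, and layer widths are illustrative assumptions rather than the authors' exact configuration.

```python
import torch
import torch.nn as nn

class HLSTMGenerator(nn.Module):
    """Sketch of the H-LSTM generator (Eqs. 6-9)."""

    def __init__(self, n_features: int = 157, hidden: int = 64):
        super().__init__()
        self.g1 = nn.LSTM(n_features, hidden, batch_first=True)  # G^1: intra-day, S steps
        self.g2 = nn.LSTM(hidden, hidden, batch_first=True)      # G^2: across TP days
        self.g3 = nn.Sequential(                                  # G^3: action probabilities
            nn.Linear(hidden, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, 3), nn.Softmax(dim=-1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, TP, S, n_features)
        b, tp, s, f = x.shape
        h1, _ = self.g1(x.reshape(b * tp, s, f))        # hidden features within each day
        day_state = h1[:, -1, :].reshape(b, tp, -1)     # last (S-th) state of each day
        h2, _ = self.g2(day_state)                      # sequential hidden features
        return self.g3(h2)                              # (batch, TP, 3), i.e., Y_tilde
```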

3.2.2 Discriminator

The discriminator comprises three LSTM layers. The first LSTM layer \({D}^{p}\) estimates the hidden feature of each day \({h}_{k}^{p}\) at the k-th time point based on the adjusted closing price \(\overline{P }\) in a training sample, where \(k\in [1,\dots ,TP]\):

$${h}_{k}^{p}={D}^{p}\left({\overline{p} }_{k}\right)$$
(10)

The trading action sequence, which is either the real sequence \(Y\) or the predicted sequence \(\widetilde{Y}\), is input into another LSTM layer \({D}^{a}\) to obtain the hidden feature of the daily action as follows:

$${h}_{k}^{a}={D}^{a}\left({\widetilde{Y}}_{k}\right)$$
(11)

The \({h}_{k}^{p}\) and \({h}_{k}^{a}\) of each trading day are concatenated from the \({D}^{p}\) layer and the \({D}^{a}\) layer to the third LSTM layer to estimate the sequential hidden feature of each trading day:

$${h}_{k}^{^{\prime}}={D}^{1}\left(\left[{h}_{k}^{p},{h}_{k}^{a}\right]\right)$$
(12)

The last output \({h}_{TP}^{^{\prime}}\) at time point TP of the third layer is input into the layer \({D}^{2}\) to determine whether it is the real target trading sequence or the predicted trading sequence. In layer \({D}^{2}\), pipeline estimation is performed using a linear layer with a leaky ReLU function and another linear layer with a softmax function:

$$Z={D}^{2}({h}_{\mathrm{TP}}^{^{\prime}})$$
(13)
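
A corresponding PyTorch sketch of the discriminator follows. The layer sizes are illustrative; in addition, the sketch emits a single real/fake probability \(Z\) through a sigmoid, whereas the paper describes a softmax output in \({D}^{2}\) (the two are equivalent for a two-class decision).

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """Sketch of the discriminator (Eqs. 10-13)."""

    def __init__(self, hidden: int = 64):
        super().__init__()
        self.dp = nn.LSTM(1, hidden, batch_first=True)            # D^p: adjusted closing price
        self.da = nn.LSTM(3, hidden, batch_first=True)            # D^a: trading action sequence
        self.d1 = nn.LSTM(2 * hidden, hidden, batch_first=True)   # D^1: concatenated features
        self.d2 = nn.Sequential(                                  # D^2: real/fake score Z
            nn.Linear(hidden, hidden), nn.LeakyReLU(),
            nn.Linear(hidden, 1), nn.Sigmoid(),
        )

    def forward(self, price: torch.Tensor, actions: torch.Tensor) -> torch.Tensor:
        # price: (batch, TP, 1); actions: (batch, TP, 3), either real Y or predicted Y_tilde
        hp, _ = self.dp(price)
        ha, _ = self.da(actions)
        h, _ = self.d1(torch.cat([hp, ha], dim=-1))               # concatenate per trading day
        return self.d2(h[:, -1, :])                               # last day's state -> Z
```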

3.3 Model training and optimization

\({Z}_{\mathrm{fake}}\) is the discriminator output obtained by inputting the real stock price \(\overline{P }\) and the predicted trading action sequence \(\widetilde{Y}\) generated by the generator. \({Z}_{\mathrm{real}}\) is the discriminator output obtained by inputting the real stock price \(\overline{P }\) and the target trading action sequence \(Y\) generated using the PLR approach. The model parameters of the generator are updated using only the gradients obtained from \({Z}_{\mathrm{fake}}\). The model parameters of the discriminator are updated using the gradients obtained from both \({Z}_{\mathrm{fake}}\) and \({Z}_{\mathrm{real}}\). In addition, the model parameters of all of the GAN-based frameworks are optimized using the Adam optimizer.

3.3.1 Loss functions in the simple GAN framework

In the simple GAN framework, the generator loss \({L}_{G}^{\mathrm{GAN}}\) is calculated using the binary cross entropy function:

$${L}_{G}^{\mathrm{GAN}}=\mathrm{log}(1-{Z}_{\mathrm{fake}})$$
(14)

where log denotes the logarithm function. Similarly, the discriminator loss \({L}_{D}^{\mathrm{GAN}}\) is calculated using the categorical cross entropy function:

$${L}_{D}^{\mathrm{GAN}}=\mathrm{log}\left({Z}_{\mathrm{fake}}\right)+\mathrm{log}(1-{Z}_{\mathrm{real}})$$
(15)

3.3.2 Loss function in the LSGAN framework

According to the training concept of the LSGAN algorithm, the generator loss \({L}_{G}^{\mathrm{LSGAN}}\) in the LSGAN framework is calculated using the MSE function:

$${L}_{G}^{\mathrm{LSGAN}}={({Z}_{\mathrm{fake}}-1)}^{2}$$
(16)

The discriminator loss \({L}_{D}^{\mathrm{LSGAN}}\) in the LSGAN framework is calculated using the MSE function:

$${L}_{D}^{\mathrm{LSGAN}}=\frac{1}{2}{({Z}_{\mathrm{fake}}-0)}^{2}+\frac{1}{2}{({Z}_{\mathrm{real}}-1)}^{2}$$
(17)

3.3.3 Loss function in the WGAN framework

In the WGAN framework, the generator loss \({L}_{G}^{\mathrm{WGAN}}\) is calculated using the binary cross entropy function, which is the same as that in the simple GAN framework. The discriminator loss \({L}_{D}^{\mathrm{WGAN}}\) in the WGAN framework is computed using the WGAN-GP algorithm developed by Gulrajani et al. (2017). The L2 norm of the gradient is used in the loss function of the discriminator in the WGAN framework; therefore, \({L}_{D}^{\mathrm{WGAN}}\) is presented as follows:

$${L}_{D}^{\mathrm{WGAN}}=-{Z}_{\mathrm{real}}+{Z}_{\mathrm{fake}}+\lambda {({\Vert {\nabla }_{{Y}_{\mathrm{mix}}}{Z}_{{Y}_{\mathrm{mix}}}\Vert }_{2}-1)}^{2}$$
(18)

where \(\lambda \) denotes the weight of the penalty on the output \({Z}_{{Y}_{\mathrm{mix}}}\), and \(\lambda \) is set to 0.1. \({Z}_{{Y}_{\mathrm{mix}}}\) is the output of the discriminator in the WGAN framework, and it is obtained by inputting the real stock price \(\overline{P }\) and the mixed trading action sequence \({Y}_{\mathrm{mix}}\). The weighting approach is used to obtain the mixed trading action sequence \({Y}_{\mathrm{mix}}\), which includes \(\widetilde{Y}\) and \(Y\):

$${Y}_{\mathrm{mix}}=\epsilon \left(Y\right)+\left(1-\epsilon \right)\left(\widetilde{Y}\right),0\le \epsilon \le 1$$
(19)

where \(\epsilon \) denotes a random value from 0 to 1.
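
A PyTorch sketch of this discriminator loss is shown below; it assumes a discriminator that returns one unbounded score per sample (for the WGAN variant the final sigmoid/softmax is typically dropped), and the tensor shapes follow the sketches above.

```python
import torch

def wgan_gp_discriminator_loss(D, price, y_real, y_fake, lam: float = 0.1):
    """Sketch of Eqs. 18-19: WGAN discriminator loss with gradient penalty (lambda = 0.1)."""
    z_real = D(price, y_real)
    z_fake = D(price, y_fake)

    # Mix the real and predicted action sequences with a random epsilon (Eq. 19).
    eps = torch.rand(y_real.size(0), 1, 1, device=y_real.device)
    y_mix = (eps * y_real + (1 - eps) * y_fake).requires_grad_(True)
    z_mix = D(price, y_mix)

    # Gradient penalty: push the L2 norm of the gradient toward 1 (Eq. 18).
    grads = torch.autograd.grad(outputs=z_mix, inputs=y_mix,
                                grad_outputs=torch.ones_like(z_mix),
                                create_graph=True)[0]
    penalty = ((grads.reshape(grads.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()

    return (-z_real + z_fake).mean() + lam * penalty
```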

3.3.4 Loss functions of H-LSTM, GAN-S, LSGAN-S, and WGAN-S frameworks

In addition, we use the H-LSTM model alone as the generator to develop a prediction model by applying supervised learning to the target trading actions. The loss of the H-LSTM model, \({L}_{H-\mathrm{LSTM}}\), enhances the convergence of the generator in each GAN-based framework. The loss function of the H-LSTM model uses the categorical cross entropy function as follows:

$${L}_{H-\mathrm{LSTM}}=\sum_{k=1}^{TP}\sum_{a=0}^{2}{-y}_{k}^{a}\mathrm{log}({\widetilde{y}}_{k}^{a})$$
(20)

where \({y}_{k}^{a}\) denotes the ath type of target trading action, and \({\widetilde{y}}_{k}^{a}\) denotes the ath type of predicted trading action. The loss function of the generators of the GAN-S and WGAN-S models is as follows:

$${L}_{G}^{\mathrm{GAN}-S},{L}_{G}^{\mathrm{WGAN}-S}=\frac{\mathrm{log}\left(1-{Z}_{\mathrm{fake}}\right)+{L}_{H-\mathrm{LSTM}}}{2}$$
(21)

In this case, the H-LSTM loss and the adversarial loss are averaged with equal weights. The loss function of the generator of the LSGAN-S model is as follows:

$${L}_{G}^{\mathrm{LSGAN}-S}=\frac{{({Z}_{\mathrm{fake}}-1)}^{2}+{L}_{H-\mathrm{LSTM}}}{2}$$
(22)
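
The sketch below ties these pieces together in one illustrative GAN-S training step; it assumes the generator and discriminator sketches above and Adam optimizers `opt_G` and `opt_D`, follows the sign conventions of Eqs. (14), (15), (20), and (21), and adds a small constant inside the logarithms for numerical stability.

```python
import torch

def gan_s_training_step(G, D, opt_G, opt_D, x, price, y_real, eps: float = 1e-8):
    """One illustrative GAN-S update (Sects. 3.3.1 and 3.3.4)."""
    # --- Discriminator update: minimize log(Z_fake) + log(1 - Z_real) (Eq. 15) ---
    y_fake = G(x).detach()                     # stop gradients into the generator
    z_fake, z_real = D(price, y_fake), D(price, y_real)
    loss_D = torch.log(z_fake + eps).mean() + torch.log(1 - z_real + eps).mean()
    opt_D.zero_grad()
    loss_D.backward()
    opt_D.step()

    # --- Generator update: average of the adversarial loss (Eq. 14) and the
    #     supervised H-LSTM cross entropy (Eq. 20), as in Eq. 21 ---
    y_fake = G(x)
    z_fake = D(price, y_fake)
    adversarial = torch.log(1 - z_fake + eps).mean()
    supervised = -(y_real * torch.log(y_fake + eps)).sum(dim=-1).mean()  # averaged over days
    loss_G = (adversarial + supervised) / 2
    opt_G.zero_grad()
    loss_G.backward()
    opt_G.step()
    return loss_D.item(), loss_G.item()
```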

3.4 Performance evaluation

Three indicators, namely the CR, SR, and WPCT, are used to evaluate the prediction performance of each framework. The daily predicted trading action probabilities \(\widetilde{Y}\)=\([{\widetilde{y}}_{1},{\widetilde{y}}_{2},\dots ,{\widetilde{y}}_{N}]\) are converted into the daily predicted action types \(\widetilde{A}=[{\widetilde{a}}_{1},{\widetilde{a}}_{2},\dots ,{\widetilde{a}}_{N}]\), where \({\widetilde{a}}_{n}\) is obtained as follows:

$${\widetilde{a}}_{n}=\underset{u}{\mathrm{argmax}}\,{\widetilde{y}}_{n}^{u}$$
(23)

We suppose that all cash is used to buy stocks and that all holdings are sold at once. In this case, if consecutive predicted buying or selling actions appear in \(\widetilde{A}\), only the first buying or selling action is considered; a selling action that occurs before any stock has been bought is also ignored. If the stock is still held on the last trading day, the action on that day is converted into a selling action. For example, for the predicted trading action sequence \(\widetilde{A}=[\mathrm{0,2},\mathrm{1,1},\mathrm{0,0},\mathrm{2,2},\mathrm{1,0}]\) covering 10 trading days, the converted trading action sequence is \({A}^{^{\prime}}=[\mathrm{0,0},\mathrm{1,0},\mathrm{0,0},\mathrm{2,0},\mathrm{1,2}]\). Therefore, the final buying actions occur on the third and ninth days, and the final selling actions occur on the seventh and tenth days. To compute the ROI, \({A}^{^{\prime}}\) is converted into two lists of actual trading positions, \({I}^{\mathrm{BUY}}\) and \({I}^{\mathrm{SELL}}\). The trading position of \({I}^{\mathrm{BUY}}\) is

$${I}^{\mathrm{BUY}}=\left\{n \mid {a}_{n}^{^{\prime}}=1,\; {a}_{n}^{^{\prime}}\in {A}^{^{\prime}}\right\},$$
(24)

and the trading position of \({I}^{\mathrm{SELL}}\) is

$${I}^{\mathrm{SELL}}=\left\{n \mid {a}_{n}^{^{\prime}}=2,\; {a}_{n}^{^{\prime}}\in {A}^{^{\prime}}\right\},$$
(25)

where \({I}^{\mathrm{BUY}}\) contains the positions in \({A}^{^{\prime}}\) at which a buying action occurs, and \({I}^{\mathrm{SELL}}\) contains the positions at which a selling action occurs. In this case, the lengths of \({I}^{\mathrm{BUY}}\) and \({I}^{\mathrm{SELL}}\) are equal, and each pair of corresponding positions in \({I}^{\mathrm{BUY}}\) and \({I}^{\mathrm{SELL}}\) forms a trading pair. Therefore, the returns are R = [\({r}_{1},{r}_{2},\dots ,{r}_{M}\)], and the rate of return of the \(m\)th trading pair can be expressed as follows:

$${r}_{m}=\frac{{\overline{p} }_{{I}_{m}^{\mathrm{SELL}}}}{{\overline{p} }_{{I}_{m}^{\mathrm{BUY}}}}-1$$
(26)

where \({\overline{p} }_{{I}_{m}^{\mathrm{SELL}}}\) denotes the sell price corresponding to the mth sell action, and \({\overline{p} }_{{I}_{m}^{\mathrm{BUY}}}\) denotes the buy price corresponding to the mth buy action. The final cumulative rate of return on investment \(\mathrm{CR}\) is as follows:

$$\mathrm{CR}=\left(\prod_{m=1}^{M}(1+{r}_{m})\right)-1$$
(27)
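
A compact sketch of this evaluation step is given below; the all-in/all-out assumption and the forced sell on the last trading day follow the description above, and trading fees are ignored as in the paper.

```python
import numpy as np

def cumulative_return(a_pred: np.ndarray, prices: np.ndarray) -> float:
    """Convert predicted actions (0 hold, 1 buy, 2 sell) into trading pairs and
    compute the cumulative return CR (Eqs. 24-27)."""
    holding = False
    buys, sells = [], []
    for n, act in enumerate(a_pred):
        if act == 1 and not holding:        # keep only the first buy of a run
            buys.append(n)
            holding = True
        elif act == 2 and holding:          # keep only the first sell; ignore sells with no holding
            sells.append(n)
            holding = False
    if holding:                             # force a sell on the last trading day
        sells.append(len(a_pred) - 1)

    returns = [prices[s] / prices[b] - 1 for b, s in zip(buys, sells)]       # Eq. 26
    return float(np.prod([1 + r for r in returns]) - 1) if returns else 0.0  # Eq. 27
```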

In addition, the SR during the holding period of the model is computed as follows:

$$\mathrm{SR}=\frac{\mathrm{avg}\_\mathrm{dr}-\mathrm{fixed}\_\mathrm{dr}}{\mathrm{std}\_\mathrm{dr}}$$
(28)

where \(\mathrm{fixed}\_\mathrm{dr}\) is the daily interest rate of the fixed deposit, and \(\mathrm{avg}\_\mathrm{dr}\) and \(\mathrm{std}\_\mathrm{dr}\) denote the average daily return and the standard deviation of the daily excess return during the holding periods, respectively. \(\mathrm{avg}\_\mathrm{dr}\) can be computed as follows:

$$\mathrm{avg}\_\mathrm{dr}=\frac{\sum_{m=1}^{M}\sum_{l={I}_{m}^{\mathrm{BUY}}}^{{I}_{m}^{\mathrm{SELL}}-1} \frac{{\overline{p} }_{l+1}-{\overline{p} }_{l}}{{\overline{p} }_{l}}}{\sum_{m=1}^{M}\left({I}_{m}^{\mathrm{SELL}}-{I}_{m}^{\mathrm{BUY}}\right)}$$
(29)

and \(\mathrm{std}\_\mathrm{dr}\) can be computed as follows:

$$\mathrm{std}\_\mathrm{dr}=\sqrt{\sum_{m=1}^{M}\sum_{l={I}_{m}^{\mathrm{BUY}}}^{{I}_{m}^{\mathrm{SELL}}-1}{\left( \frac{{\overline{p} }_{l+1}-{\overline{p} }_{l}}{{\overline{p} }_{l}}-\mathrm{fixed}\_\mathrm{dr}\right)}^{2}}$$
(30)

where \(\mathrm{fixed}\_\mathrm{dr}\) can be computed as follows:

$$\mathrm{fixed}\_\mathrm{dr}={(1+\mathrm{semi}\_\mathrm{fixed}\_\mathrm{r})}^{1/125}-1$$
(31)

where \(\mathrm{semi}\_\mathrm{fixed}\_\mathrm{r}\) denotes the semi-annual fixed deposit interest rate and 125 is the approximate number of trading days in half a year. In addition, we compute the WPCT to evaluate each model as follows:

$$\mathrm{WPCT}=\frac{\Vert R>0\Vert }{M}$$
(32)

where \(\Vert R>0\Vert \) denotes the number of positive return rates on trading pairs.

4 Experimental results

This section describes the datasets, the experimental settings, and the results, including the collected data, the comparative predictors, the evaluation metrics, and the trading performance comparisons.

4.1 Datasets

The dataset consists of the top 20 stocks with the highest weights in the Taiwan Stock Exchange Capitalization Weighted Stock Index (TAIEX). A stock's weight in the TAIEX is proportional to its market capitalization relative to the whole Taiwan market; therefore, this dataset contains the 20 most representative companies in Taiwan, and these companies cover a wide variety of industries. All stock trading information, including the date, opening price, closing price, highest price, lowest price, trading volume, and adjusted closing price, is collected using the Yahoo Finance API. Null feature values are set to 0. To balance the scale differences among features, all of the features are subjected to min–max normalization, and all of the training data are normalized to between 0 and 1. The validation and test data are adjusted according to the scale of the training data. Because the COVID-19 pandemic caused extreme price movements, this study uses experimental data collected before 2020. The training data cover the period from October 20, 2016 to December 17, 2018; the validation and test data cover December 17, 2018 to July 1, 2019 and July 1, 2019 to December 31, 2019, respectively.
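
The normalization step can be sketched as follows; fitting the scale on the training data only and reusing it for the validation and test data is the point being illustrated, and the column-wise guard against constant features is an added assumption.

```python
import pandas as pd

def minmax_by_training_scale(train: pd.DataFrame, valid: pd.DataFrame, test: pd.DataFrame):
    """Min-max normalization fitted on the training data only (Sect. 4.1)."""
    lo, hi = train.min(), train.max()
    rng = (hi - lo).replace(0, 1)          # avoid division by zero for constant columns

    def scale(df: pd.DataFrame) -> pd.DataFrame:
        return (df - lo) / rng             # training data maps to [0, 1]

    return scale(train), scale(valid), scale(test)
```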

4.2 Descriptive statistics

According to Table 1, data from the semiconductor industry, finance and insurance industry, plastic industry, communication networks industry, and others are used.

Table 1 Stocks and industry

Table 2 displays the statistical results for the adjusted closing prices of each stock in the training, validation, and test data. All adjusted closing prices were calculated on August 12, 2020. The stocks 2330.TW, 2454.TW, 6505.TW, and 3045.TW exhibited rising trends across the training, validation, and test data. The average prices of the semiconductor stocks in the test data are higher than those in the training data, whereas their average prices in the validation data are lower than those in the training data. The average prices of the plastics stocks in the validation data are higher than those in the training data, but their average prices in the test data are lower than those in the validation data. The financial industry stocks exhibit different trends across the three data periods.

Table 2 Descriptive statistics of the adjusted closing prices of 20 stocks

Figure 2 depicts the numbers of trades corresponding to the three actions generated by the PLR approach under γ = 0.01, 0.05, 0.1, and 0.5. For smaller γ, a large number of trading points is obtained; for example, when γ = 0.01, 19 stocks have more than 100 trading pairs. For γ = 0.5, none of the stocks have more than 20 trading pairs.

Fig. 2
figure 2

Number of trading actions for 20 stocks as determined by the PLR approach under different \(\upgamma \)

4.3 Predictor

In the experiments, we used one supervised learning model and six GAN-based frameworks:

  • H-LSTM: This is the traditional supervised learning approach for generator training without the GAN-based framework.

  • GAN: This model trains the generator and the discriminator by using the original GAN model developed by Goodfellow et al. (2014).

  • GAN-S: The convergence of this model is the same as that of the GAN model, but its loss function is combined with the loss \({L}_{H-\mathrm{LSTM}}\) of the H-LSTM model.

  • LSGAN: The convergence of this model is the same as that of the LSGAN model developed by Mao et al. (2017).

  • LSGAN-S: The convergence of this model is the same as that of the LSGAN model, but its loss function is combined with the loss \({L}_{H-\mathrm{LSTM}}\) of the H-LSTM model.

  • WGAN: The convergence of the discriminator is the same as that in the case of the WGAN-GP model developed by Gulrajani et al. (2017), but the loss function of the generator is computed using the binary cross entropy method.

  • WGAN-S: The convergence of this model is the same as that of the WGAN model, but its loss function is combined with the loss \({L}_{H-\mathrm{LSTM}}\) of the H-LSTM model.

The best predictor in each experiment is selected based on positive CRs in the training and validation data and the maximum CRs in the validation data.

4.4 Hyperparameter setting

The detailed hyperparameter settings are summarized in Table 3; the same hidden size is applied to both the generator and the discriminator.

Table 3 Explanation of hyperparameters

4.5 Comparison of cumulative returns of seven models

The best models are selected based on their return rates when applied to the validation data. As summarized in Table 4, the GAN model yields the best return rates for stocks 6505.TW, 1301.TW, and 2207.TW. The GAN-S yields the best return rates for stocks 2454.TW, 6505.TW, 2882.TW, 3008.TW, and 3711.TW. The LSGAN yields the best return rates for stocks 2317.TW, 2886.TW, and 2881.TW. The LSGAN-S yields the best return rates for stocks 1216.TW and 2382.TW. The WGAN yields the best return rates for stocks 2330.TW and 2308.TW. The WGAN-S yields the best return rates for stocks 1303.TW, 1326.TW, 2891.TW, and 2884.TW. The H-LSTM yields the best return rate only for stock 3045.TW. In addition, none of the predictors yield positive return rates for stocks 2308.TW, 1216.TW, and 3045.TW.

Table 4 Comparison of CRs of the seven predictors when applied to testing data

As summarized in Table 5, the GAN-S yields the highest rewards on five stocks, and its average CR is 0.0500. However, the GAN has the highest average CR of 0.0542 among the seven models. The WGAN-S is ranked second with an average CR of 0.0523, and the H-LSTM is ranked last with an average CR of 0.0102. In addition, the GAN-S and WGAN-S are superior to the GAN and WGAN in terms of providing the best returns. However, in terms of the average CR, only WGAN-S is superior to WGAN.

Table 5 Counting of stock CRs of seven predictors and average CR

Table 6 displays the count of positive CRs (pCR), the count of negative CRs (nCR), the count of zero CRs, and the positive CR ratio (prCR) for each model when applied to the test data, where prCR is calculated as pCR/(pCR + nCR). In the case of the WGAN-S, 15 stocks fall under pCR, making it the best predictor with a prCR of 0.75. In the case of the H-LSTM, only six stocks fall under pCR, making it the worst model with a prCR of 0.38. The prCR performance of the other predictors lies between those of the H-LSTM and the WGAN-S. A comparison of the GAN-S, LSGAN-S, and WGAN-S, which add the supervised loss, shows that only the LSGAN-S exhibits no improvement. However, all of the models outperform the fixed deposit return rate of 0.004.

Table 6 Numbers of positive returns, negative returns, and no trades for 20 stocks

4.6 Investment risk comparison of seven predictors

The annual interest rate for 6-month fixed deposits was 0.00795 according to the Bank of Taiwan's New Taiwan Dollar deposit (lending) interest rates on July 1, 2019; the corresponding half-year interest rate of 0.004 is used as the benchmark for SR comparisons. To evaluate investment risk, Table 7 displays the SRs of each predictor on each stock, calculated using the semi-annual fixed deposit interest rate as the benchmark. The H-LSTM has the worst SRs compared with the GAN-based predictors. The WGAN-S yields the best SRs on 5 out of 20 stocks, the GAN on 4 out of 20 stocks, the WGAN, LSGAN-S, and LSGAN each on 3 out of 20 stocks, and the H-LSTM on 2 out of 20 stocks.

Table 7 Comparison of SRs of the seven predictors when applied to testing data

Table 8 displays the count of the best SRs of each predictor on the test data as well as the average SRs. The WGAN-S is the best in terms of the count of best SR scores, with five. The WGAN is the best in terms of average SR, with an average SR of 0.0955. In addition, the H-LSTM and LSGAN-S have negative average SRs, and the H-LSTM is the worst. Moreover, most of the GAN-based predictors have better SRs than the simple GAN and the H-LSTM.

Table 8 Best stock SR counts and average SRs of different predictors

4.7 WPCT comparison of seven predictors

Table 9 displays the WPCT performance of each predictor. All of the predictors achieve a positive WPCT on many of the stocks. The WGAN-S has at least one trading pair on every stock, and only stocks 2308.TW and 3045.TW have a WPCT of 0.

Table 9 WPCT performance on test data

Table 10 summarizes the predictors in terms of WPCT counts. Each predictor satisfies \(WPCT\ge 0.5\) on 7–14 stocks and \(WPCT=1\) on 1–6 stocks. In summary, the average WPCT of the WGAN is 0.7485, the highest among the seven models. However, all of the predictors exhibit acceptable performance, with average WPCT values between 47.01 and 64.73%.

Table 10 Average WPCT of each model on all stocks

5 Performance summary

Table 11 displays the overall performance of the seven predictors. Overall, the WGAN outperforms the other five GAN-based predictors and the deep learning predictor. The performance of the GAN-S and WGAN-S is similarly high, whereas the performance of the H-LSTM and LSGAN-S is similarly poor; their overall ranks are 18 and 19, respectively.

Table 11 Performance comparison of the seven models on three evaluation metrics

6 Discussion

The experimental results show that the proposed GAN-based frameworks generate more acceptable trading actions according to the three evaluation metrics. The MSE loss function used in the LSGAN and LSGAN-S is not a reliable convergence approach because the difference between the actual value and the predicted value is not a real distance. Accordingly, the GAN, GAN-S, WGAN, and WGAN-S models, which use the binary cross entropy function, achieve high performance because trading action prediction is a multi-class classification problem with buying, selling, and holding classes. On the other hand, in the stock prediction task, the WPCT is the most important indicator: if trading pairs (buying in and selling out) almost always yield a positive return, investors will have high confidence in the prediction model, such as the WGAN.

7 Conclusion

This paper has proposed GAN-based frameworks to improve the prediction performance of trading strategies. The experimental results indicate that the GAN-based frameworks yield excellent performance; in particular, the WGAN and WGAN-S predictors yield high average CRs and high counts of positive CRs. Moreover, the GAN-based predictors outperform traditional supervised learning models such as the H-LSTM predictor. The generator predicts effective trading actions, and the discriminator distinguishes whether these trading actions originate from the PLR or from the generator. In addition, the PLR approach provides a reliable training target for the GAN-based frameworks and improves stock market trading performance. Therefore, the GAN-based framework generates more diverse trading strategies and mitigates the overfitting problem.

This paper has a few limitations. Buying and selling actions are all-in or all-out on the stock market, whereas in practice stocks are traded in fixed units. Moreover, trading fees and transaction levies are not considered when the ROI is computed, and traders may incur larger costs when executing trades frequently. In future work, seed trading action sequences for GAN models can be generated using multiple methods, such as the buy-and-hold strategy and the stock ticker strategy. The discriminator of the GAN-based framework would then be able to obtain more real trading action sequences from historical trading information to detect whether a sequence is real or fake.