Study on the prediction of stock price based on the associated network model of LSTM

The stock market has received widespread attention from investors. Grasping the regularities of the stock market and predicting its trend has always been a focus for investors and investment companies. Many methods for stock price prediction exist, and they can be roughly divided into two categories: statistical methods and artificial intelligence methods. Statistical methods include the logistic regression model, the ARCH model, etc. Artificial intelligence methods include the multi-layer perceptron, convolutional neural network, naive Bayes network, back propagation network, single-layer LSTM, support vector machine, recurrent neural network, etc. However, these studies predict only a single value. To predict multiple values in one model, a model is needed that can handle multiple inputs and produce multiple associated output values at the same time. For this purpose, an associated deep recurrent neural network model with multiple inputs and multiple outputs, based on the long short-term memory network, is proposed. The associated network model can predict the opening price, the lowest price and the highest price of a stock simultaneously. The associated network model was compared with an LSTM network model and a deep recurrent neural network model. The experiments show that the accuracy of the associated model is superior to that of the other two models in predicting multiple values at the same time, and its prediction accuracy is over 95%.


Introduction
The stock market has received widespread attention from investors. How to grasp the regularities of the stock market and predict the trend of stock prices has always been a focus for investors and researchers. The rise and fall of stock prices are influenced by many factors such as politics, economy, society and the market. For stock investors, the trend forecast of the stock market is directly related to profits: the more accurate the forecast, the more effectively risks can be avoided. For listed companies, the stock price not only reflects the company's operating conditions and expectations for its future development, but is also an important technical index for the analysis and study of the company. Stock forecasting research also plays an important role in the study of a country's economic development. Therefore, research on the intrinsic value and prediction of the stock market has great theoretical significance and wide application prospects.
The main purpose of this paper is to design a deep network model that simultaneously predicts the opening price, the lowest price and the highest price of a stock on the next day from the historical prices of the stock and other technical parameter data. To this end, an LSTM-based deep recurrent neural network model is proposed to predict the three associated values (hence it is called the associated neural network model, abbreviated as associated net model). The associated net model is compared with LSTM and an LSTM-based deep recurrent neural network, and its feasibility is verified by comparing the accuracy of the three models.
The rest of this paper is organized as follows. Section 2 introduces the research status of stock price forecasting. Section 3 introduces the model design of the associated neural network model. Section 4 describes the design of the algorithm and the experimental parameters. Section 5 introduces the experimental data set, the experimental results and the analysis of the results. Section 6 concludes the paper.

Related works
There is a large body of research on stock price prediction. Support vector machines were applied to build a regression model of historical stock data and to predict the trend of stocks [1]. A particle swarm optimization algorithm was used to optimize the parameters of a support vector machine, which can predict the stock value robustly [2]. This study improves the support vector machine method, but the particle swarm optimization algorithm requires a long computation time. LSTM was combined with the naive Bayes method to extract market sentiment factors and improve prediction performance [3]. This method can be used to predict financial markets on completely different time scales with other variables. A sentiment analysis model was integrated with an LSTM time series learning model to obtain a robust time series model for predicting the opening price of stocks, and the results showed that this model could improve the accuracy of prediction [11]. Jia [12] discussed the effectiveness of LSTM for predicting stock prices, and the study showed that LSTM is an effective method for predicting stock profits. Real-time wavelet denoising was combined with an LSTM network to predict the East Asian stock index, which corrected some logic defects in previous studies [13]. Compared with the original LSTM, this combined model is greatly improved, with high prediction accuracy and small regression error. The bagging method was used to combine multiple neural networks to predict Chinese stock indices (including the Shanghai composite index and the Shenzhen component index) [4]. Each neural network was trained by back propagation with the Adam optimization algorithm. The results show that the method achieves different accuracy for different stock indices, but the prediction on close is unsatisfactory. An evolutionary method was applied to predict the trend of stock prices [5]. A deep belief network with inherent plasticity was used to predict stock price time series [6].
A convolutional neural network was applied to predict the trend of stock prices [7]. A feedforward multi-layer neural network model was created for future stock price prediction using a hybrid method that combines technical analysis variables and fundamental analysis variables of stock market indicators with the BP algorithm [8]. The results show that this method predicts daily stock prices more accurately than the technical analysis method. An effective soft computing technique was designed for the Dhaka Stock Exchange (DSE) to predict the closing price of DSE [9]. A comparison with an artificial neural network and an adaptive neuro-fuzzy inference system shows that this method is more effective.
Artificial bee colony algorithm was combined with wavelet transforms and recurrent neural network for stock price forecasting. Many international stock indices were simulated for evaluation, including the Dow Jones industrial average (DJIA), London FTSE 100 index (FTSE), Tokyo Nikkei-225 index (Nikkei) and the Taiwan stock exchange Capitalization Weighted Stock Index (TAIEX). The simulation results show that the system has good prediction performance and can be applied to real-time trading system of stock prediction.
A multi-output speaker model based on RNN-LSTM was used in the field of speech recognition [14]. The experimental results show that the model is better than a single-speaker model, and that new output branches can be added by fine-tuning on top of the base architecture. Obtaining a new output branch in this way not only reduces memory usage but also performs better than training a new speaker model. A multi-input multi-output convolutional neural network model (MIMO-Net) was designed for cell segmentation of fluorescence microscope images [15]. The experimental results show that this method is superior to state-of-the-art deep learning based segmentation methods.
Inspired by the above research, and considering that some parameters and indicators of a stock are associated with one another, it is necessary to design a multi-value associated neural network model that can handle multiple associated prices of the same stock and output these parameters and indicators at the same time. For this purpose, an associated neural network model based on the LSTM deep recurrent network is proposed, which is built from historical data and predicts the opening price, lowest price and highest price of the stock on the next day.

Long short-term memory network
The long short-term memory network (LSTM) is a particular form of recurrent neural network (RNN), the general term for a family of neural networks capable of processing sequential data. LSTM is a special network structure with three "gate" structures (shown in Fig. 1): the input gate, the forget gate and the output gate. As information enters the LSTM network, it is selected by rules: only information that conforms to the algorithm is retained, and information that does not conform is discarded through the forget gate.
The gates allow information to pass selectively. Eq. 1 shows the default gate activation function of the LSTM network, the sigmoid function. The LSTM can add and delete information for neurons through the gating unit, which consists of a sigmoid neural network layer and a pointwise multiplication operation and thereby determines selectively whether information passes. Each element output by the sigmoid layer is a real number in [0, 1], representing the weight with which the corresponding information passes. The LSTM neural network also contains a layer with the tanh activation function, shown in Eq. 2, which is used for updating the state of neurons. The forget gate determines what information needs to be discarded: it reads h_{t-1} and x_t and assigns each element of the neuron state C_{t-1} a value between 0 and 1. Equation 3 shows how the forgetting probability is calculated, where h_{t-1} represents the output of the previous neuron, x_t is the input of the current neuron, and σ is the sigmoid function.
The input gate determines how much new information is added to the neuron state. First, the input gate layer containing the sigmoid activation function determines which information needs to be updated, and then a tanh layer generates the candidate vector Ĉ_t. The state of the neuron is then updated as shown in Eq. 4, where i_t and Ĉ_t are calculated as shown in Eqs. 5 and 6. The output gate controls how much of the current neuron state is passed on to the output, as shown in Eqs. 7 and 8.
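For reference, the gate computations described in this section can be written in the standard LSTM form. This is a sketch in the usual notation rather than a reproduction of the original typesetting: the weight matrices W_f, W_i, W_C, W_o and the biases b_f, b_i, b_C, b_o are assumed symbols that do not appear in the text, ⊙ denotes element-wise multiplication, and the numbering follows the order in which the equations are cited.

```latex
\begin{align}
\sigma(x) &= \frac{1}{1 + e^{-x}} \tag{1} \\
\tanh(x) &= \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{2} \\
f_t &= \sigma\left(W_f \cdot [h_{t-1}, x_t] + b_f\right) \tag{3} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \hat{C}_t \tag{4} \\
i_t &= \sigma\left(W_i \cdot [h_{t-1}, x_t] + b_i\right) \tag{5} \\
\hat{C}_t &= \tanh\left(W_C \cdot [h_{t-1}, x_t] + b_C\right) \tag{6} \\
o_t &= \sigma\left(W_o \cdot [h_{t-1}, x_t] + b_o\right) \tag{7} \\
h_t &= o_t \odot \tanh(C_t) \tag{8}
\end{align}
```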

Deep recurrent neural network
An LSTM-based deep recurrent neural network (DRNN) is a variant of the recurrent neural network. To enhance the expressive power of the model, the loop body at each moment can be repeated many times. The structure of the deep recurrent neural network is shown in Fig. 2. Because the deep recurrent neural network is composed of LSTM units, its operating mechanism is the same as that of LSTM. The dropout method was used when constructing the task model. Dropout temporarily discards neural network units from the network with a certain probability during training, which is a means of preventing over-fitting. In each training iteration, the neurons in each layer are randomly deleted with probability p, and the data in this iteration are trained on the network composed of the remaining (1 − p)*N neurons, where N is the number of neurons in the layer, thus alleviating the over-fitting problem. The neural network model without dropout is shown in Fig. 2a, and the model with dropout is shown in Fig. 2b.
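The dropout operation described here can be illustrated with a minimal pure-Python sketch. The function name `dropout` and its arguments are hypothetical. The rescaling of the surviving units by 1/(1 − p) ("inverted dropout") is a common convention assumed here and is not stated in the text.

```python
import random

def dropout(activations, p, rng, training=True):
    """Zero each unit with probability p during training (inverted dropout)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        return list(activations)      # at test time every unit is kept
    keep = 1.0 - p
    out = []
    for a in activations:
        if rng.random() < p:
            out.append(0.0)           # unit is discarded for this iteration
        else:
            out.append(a / keep)      # assumed rescaling keeps the expected value
    return out

rng = random.Random(0)                # fixed seed so the sketch is repeatable
layer_output = [1.0] * 10
dropped = dropout(layer_output, p=0.5, rng=rng)
```

On average (1 − p)*N of the N units survive each call; the exact number varies from iteration to iteration.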
The LSTM-based deep recurrent neural network model with dropout layer was used as the contrast model to verify the feasibility and applicability of the proposed associated neural network model. The structure of LSTM-based deep recurrent neural network is shown in Fig. 3.

Associated neural network model
Since the daily opening price, lowest price and highest price of a stock are associated with one another, predicting them with separate networks, as is generally done, severs these associations. Therefore, based on the deep recurrent neural network, a multi-value associated neural network (associated net) based on LSTM is designed, whose structure is shown in Fig. 4.
The specific data processing flow of the multi-value associated neural network model is shown in Fig. 5. Data pass through the input layer into all three branches simultaneously. The three branches predict the opening price, the lowest price and the highest price, respectively. In the Chinese stock market, the maximum daily fluctuation of a stock price is only 10%. Therefore, the model combines the output of the left branch (opening price) with the output of the LSTM network of the second branch as the input data for predicting the lowest price. The highest price is influenced by the opening price and the lowest price of the same day, so the output of the left branch (opening price), the output of the middle branch (lowest price) and the output of the LSTM network of the third branch together form the input data for predicting the highest price.
In the model training phase, the total loss L_total is used as the evaluation function, and the goal is to minimize it. The calculation of the total loss is shown in Eq. 9.
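The wiring between the three branches can be sketched as follows. This is only an illustration of the data flow, not the actual network: `branches` and `heads` are hypothetical stand-ins (plain Python callables) for the LSTM stacks and output layers, and all names are assumptions.

```python
def concat(*vectors):
    """Join several feature vectors into one flat list."""
    out = []
    for v in vectors:
        out.extend(v)
    return out

def associated_forward(x, branches, heads):
    """Run one input through the three associated branches.

    branches: three feature extractors (stand-ins for the LSTM stacks).
    heads: dict of output layers, each mapping a feature vector to one price.
    """
    f_open, f_low, f_high = (branch(x) for branch in branches)
    open_pred = heads["open"](f_open)
    # The lowest-price head also sees the predicted opening price.
    low_pred = heads["low"](concat(f_low, [open_pred]))
    # The highest-price head sees both earlier predictions.
    high_pred = heads["high"](concat(f_high, [open_pred, low_pred]))
    return open_pred, low_pred, high_pred

# Toy stand-ins: each branch keeps the last three inputs, each head averages.
mean = lambda v: sum(v) / len(v)
branches = (lambda x: x[-3:], lambda x: x[-3:], lambda x: x[-3:])
heads = {"open": mean, "low": mean, "high": mean}
open_p, low_p, high_p = associated_forward([0.2, 0.4, 0.6, 0.8], branches, heads)
```

The point of the sketch is the order of evaluation: the opening-price prediction feeds the lowest-price head, and both feed the highest-price head.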
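One plausible form of the total loss, assuming the "simple arithmetic mean" mentioned in the Conclusion, is sketched below. The symbols L_op, L_lp and L_hp for the opening-, lowest- and highest-price sub-losses are assumptions. The algorithm description speaks of the sum of the three losses instead; the two forms differ only by the constant factor 1/3 and therefore have the same minimizer.

```latex
\begin{equation}
L_{\mathrm{total}} = \frac{1}{3}\left(L_{\mathrm{op}} + L_{\mathrm{lp}} + L_{\mathrm{hp}}\right)
\end{equation}
```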

Design of algorithm and experiments
Regression is used to predict a specific value that is not a pre-defined category but an arbitrary real number. A regression problem generally has only one output, which is the predicted value. The loss function commonly used in regression problems is the mean square error (MSE) (Eq. 10), the expectation of the square of the difference between the estimated value and the actual value. MSE can evaluate the degree of change of the function. The smaller the value of MSE, the better the model fits the data.

Algorithm
Deep learning often requires a lot of time and computational resources for training, so an optimization algorithm that requires fewer resources and converges faster is needed. The Adam optimization algorithm is an extension of the stochastic gradient descent algorithm and has great advantages in solving non-convex optimization problems. During the training phase, the Adam optimization algorithm is used in the model, and L_total is used as the evaluation function. The algorithm framework of the multi-value associated neural network model is shown in Fig. 6. First, the sequence data are input to the Associated Net model, which contains three DRNN networks. Each DRNN network produces a loss, and the sum of the losses of these three DRNN networks is the total loss. Then the Adam algorithm is used to optimize the total loss. As long as the number of iterations has not reached the preset number n, training continues to reduce the total loss; otherwise training stops.
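The training procedure can be illustrated with a toy sketch: three sub-losses are summed into a total loss, Adam updates the parameters, and training stops after a preset number of iterations. Everything concrete here is an assumption for illustration only. Each "branch" is reduced to a single parameter with a squared-error loss against a fixed target, and the hyperparameters are the commonly used Adam defaults with a learning rate of 0.01, not values taken from the experiments.

```python
import math

# Hypothetical toy problem: one parameter per branch, fixed normalized targets.
TARGETS = [0.3, 0.1, 0.5]  # assumed (open, low, high) targets

def sub_losses(params):
    """Squared error of each toy branch."""
    return [(p - t) ** 2 for p, t in zip(params, TARGETS)]

def total_loss(params):
    """Total loss as the sum of the three sub-losses."""
    return sum(sub_losses(params))

def total_loss_grad(params):
    """Gradient of the total loss with respect to each parameter."""
    return [2.0 * (p - t) for p, t in zip(params, TARGETS)]

def adam_train(params, n_iterations, lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    params = list(params)
    m = [0.0] * len(params)  # first-moment estimates
    v = [0.0] * len(params)  # second-moment estimates
    for t in range(1, n_iterations + 1):  # stop after the preset n iterations
        grads = total_loss_grad(params)
        for i, g in enumerate(grads):
            m[i] = beta1 * m[i] + (1.0 - beta1) * g
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g
            m_hat = m[i] / (1.0 - beta1 ** t)  # bias correction
            v_hat = v[i] / (1.0 - beta2 ** t)
            params[i] -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return params

start = [0.0, 0.0, 0.0]
trained = adam_train(start, n_iterations=2000)
```

In a real model the gradients would come from back propagation through the three DRNN branches rather than from a closed-form expression.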

Parameter setting
The input of the LSTM neural network has a step size parameter, which specifies how many historical data points are remembered as a reference for predicting the current price.
To choose a suitable step size for the multi-value associated model, a comparison experiment was performed with 6112 samples at step sizes of 5, 10, 20 and 30, with 50 iterations. The loss curves are shown in Figs. 7, 8, 9 and 10, respectively. According to these loss curves, the loss at step sizes of 10 and 20 decreases the fastest and finally reaches a steady state. Comparing the average losses in Table 1, the average loss at a step size of 5 is the lowest, and the average loss at a step size of 20 differs from it by only 0.0014901. Considering the loss curves and the average loss together, 20 is chosen as the step size in the model.
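How the step size turns a daily series into training samples can be sketched as a sliding window. The helper `make_windows` and its data layout are hypothetical. The sketch assumes that `rows[i]` holds the input attributes of day i and `targets[i]` the prices of day i to be predicted.

```python
def make_windows(rows, targets, step):
    """Pair each window of `step` consecutive days with the next day's targets."""
    samples = []
    for end in range(step, len(rows)):
        window = rows[end - step:end]            # the `step` most recent days
        samples.append((window, targets[end]))   # prices of the following day
    return samples

# Toy data: 6 days with one attribute each, and (OP, LP, HP) targets per day.
rows = [[float(i)] for i in range(6)]
targets = [(i + 0.1, i - 0.1, i + 0.2) for i in range(6)]
samples = make_windows(rows, targets, step=2)
```

A series of length T yields T − step samples, so a larger step size gives the network more history per sample but slightly fewer samples.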

Dataset
The experimental data in this paper are actual historical data downloaded from the Internet. Three data sets were used in the experiments: the Shanghai composite index (000001) and two stocks, PetroChina (stock code 601857) on the Shanghai stock exchange and ZTE (stock code 000063) on the Shenzhen stock exchange. The Shanghai composite index has 6112 historical records, PetroChina has 2688 and ZTE has 4930. Each data set is divided into a training set and a test set in chronological order at a ratio of 4:1. Each data set has seven technical parameters. These technical parameters are used as the basic input attributes, and the opening price (OP), lowest price (LP) and highest price (HP) of the next day are the output values of the model. The identifiers of the technical parameters are shown in Table 2.
Because different stock index data are measured in different units, all attribute data are normalized to fall within the same range to avoid the impact of the different units. In this paper, the min-max normalization method is used. The normalization function is shown in Eq. 11:

$$x' = \frac{x - \min}{\max - \min} \tag{11}$$

Through the normalization operation, the data are scaled to [0, 1], which not only speeds up the gradient descent to find the optimal solution, but also improves the accuracy.
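A chronological 4:1 split can be sketched as below. The function name is hypothetical, and rounding the cut point down is an assumption, since the exact training and test set sizes are not given in the text.

```python
def chronological_split(records, train_parts=4, test_parts=1):
    """Split time-ordered records into train/test without shuffling."""
    cut = len(records) * train_parts // (train_parts + test_parts)
    return records[:cut], records[cut:]

# Record counts of the three data sets, used only to show the resulting sizes.
sizes = {}
for name, n in (("Shanghai composite index", 6112), ("PetroChina", 2688), ("ZTE", 4930)):
    train, test = chronological_split(list(range(n)))
    sizes[name] = (len(train), len(test))
```

Keeping the order intact means that every test sample lies later in time than every training sample, so the model is never evaluated on days that precede its training data.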
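A minimal sketch of min-max normalization for one attribute column follows. The function name is hypothetical, and returning zeros for a constant column is an assumed convention to avoid division by zero.

```python
def min_max_normalize(values):
    """Scale a list of numbers to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]   # assumed convention for a constant column
    return [(v - lo) / (hi - lo) for v in values]

prices = [10.0, 15.0, 20.0]
scaled = min_max_normalize(prices)     # [0.0, 0.5, 1.0]
```

Each attribute would be normalized separately, so that attributes with large values do not dominate those with small values.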

Experimental analysis of training phase
Using the training data set of the Shanghai composite index, Associated Net is compared with the LSTM network and the LSTM-based deep recurrent neural network (DRNN). The highest price of the next day was trained and predicted by LSTM, DRNN and Associated Net, respectively. As shown in Table 3, the mean square error of the three models gradually decreased as the number of training iterations increased, while the LSTM network and DRNN experienced slight fluctuations. Comparing the models at the same number of training iterations, the average mean square error of LSTM is lower than that of the other two models; in the test phase, however, LSTM has the worst prediction performance and the lowest average accuracy, because LSTM overfits as the number of training iterations increases. The average mean square error of Associated Net is larger than that of LSTM and DRNN, because our model is more complex and requires a larger number of iterations.
In order to verify this conjecture, several experiments were conducted on Associated Net. The opening price, the highest price and the lowest price of the next day were trained and predicted by the Associated Net model. As shown in Table 4, the experimental results support our conjecture. The root of this problem is that the associated network model is composed of multiple deep recurrent neural networks: the model is complex, the number of neurons is large, and multiple output losses are combined, so the loss of the model decreases slowly. Based on these experiments, the loss curves of each model over 200 iterations were drawn, as shown in Figs. 11, 12 and 13. The output of Associated Net is the total loss and its three sub-losses (opening price loss, lowest price loss, highest price loss). The loss charts show that the loss of each model gradually decreases. The LSTM model fluctuates several times during training, whereas DRNN and Associated Net are very stable. Moreover, each individual sub-loss of the associated network model also gradually decreases. As shown in Table 4, although the total loss of Associated Net is higher than that of the other two models, its sub-losses are very low. From Table 5, it is concluded that the model fits the PetroChina data better. The fit on the ZTE data is relatively poor at the beginning but gradually improves; in the end, their average mean square error losses became similar. The experiments show that the more training data there are, the better the model fits. Furthermore, when the number of training iterations was increased appropriately, the loss of the model decreased gradually. These results are due to the following reasons.
• The model is complex and needs a large amount of data to train the parameters of each neuron.
• PetroChina has a large circulation, and its stock price fluctuation is relatively small, so that a good fitting effect can be obtained quickly. ZTE's stock price fluctuations are relatively large, so that more training data are needed to obtain a good fitting effect.

Experimental analysis in the test phase
To verify what each model learned in the training phase, the three models were tested separately using the test sets of multiple stocks. The mean square error (MSE) is the expected value of the square of the difference between the estimated value and the true value of the parameter, whereas the mean absolute error (MAE) is the average of the absolute errors and better reflects the actual size of the prediction error. Therefore, in the test, MAE (Eq. 12) was used as the evaluation index to calculate the degree of deviation, and the result of 1 − MAE was used as the average accuracy of the model. The average accuracy of the three models is shown in Table 6, and the average accuracy of the Associated Net model on the different data sets is shown in Table 7. There are large errors between the LSTM prediction values and the real data (shown in Fig. 16). Fig. 17 shows the comparison of the DRNN prediction values and the real data: the two are almost coincident, and the deviation between them is very small, indicating that DRNN performs better than LSTM on the test data. The deviation between the three prediction results of Associated Net and the real data is also small, as shown in Fig. 18. From Figs. 17 and 18, it is found that for the highest price prediction, Associated Net fits the curve of the real data better than DRNN, and the data deviation of Associated Net is smaller than that of DRNN. The comparison of the average accuracy of the three models is shown in Table 6. For predicting the highest price, the Associated Net model has higher average accuracy than the other two models. This confirms that the highest price of the next day is related not only to historical data, but also to the opening price and the lowest price of the same day. Therefore, the Associated ... Fig. 18, it is found that it fits well in the three data sets.
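The evaluation measures used in this section can be sketched as follows. The formulas are the standard definitions of MSE and MAE, which are assumed to match Eqs. 10 and 12, and the function names are hypothetical. The inputs are assumed to be normalized to [0, 1], so that 1 − MAE can be read as an accuracy.

```python
def mse(y_true, y_pred):
    """Mean square error: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def average_accuracy(y_true, y_pred):
    """Average accuracy as defined in the text: 1 - MAE."""
    return 1.0 - mae(y_true, y_pred)

# Toy normalized values, not data from the experiments.
y_true = [0.0, 0.5, 1.0, 0.5]
y_pred = [0.25, 0.5, 0.75, 0.5]
accuracy = average_accuracy(y_true, y_pred)
```

Because MAE is on the same scale as the normalized prices, an average accuracy above 95% corresponds to a mean absolute deviation below 0.05 on that scale.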
From the data in Table 7, it is found that ZTE's average prediction accuracy is not as good as the other two stocks. However, the accuracy of the three models is above 95%, and the test results are in line with the prior conjectures. Therefore, Associated Net

Conclusion
In this paper, a multi-value associated network model based on the LSTM deep recurrent neural network (Associated Net) is proposed to predict multiple prices of a stock simultaneously. The model structure, the algorithm framework and the experiment design are presented. The feasibility and accuracy of Associated Net are verified by comparing it with the LSTM network model and the LSTM-based deep recurrent neural network model. Multiple data sets were used to verify the applicability of the Associated Net model. Experiments show that the average accuracy of the Associated Net model is better than that of the other two models. Moreover, it can predict multiple values simultaneously, and the average accuracy of each predicted value is over 95%. Although the model achieves good results, some aspects can still be improved. For example, a simple arithmetic mean is used to calculate the total loss in the training phase, and the goal is to optimize the model by reducing the total loss. This loss calculation does not take into account the relationship between the sub-losses, nor some details when the total loss is at its minimum, such as extreme values of individual sub-losses and the oscillation during loss reduction. In the next step, we will study the dimension reduction of the input parameters and the optimization of the loss calculation method to improve the average accuracy of the model.