Introduction

In the last decade, machine and deep learning models have made significant advances in many applications, prompting investors and financial institutions to design profitable trading strategies [1, 2]; in fact, stock exchanges trade assets worth billions of dollars on a daily basis, while investors seek to achieve profitable returns over their investment horizons by participating in the market. Hence, investing in the capital market may present opportunities to increase expected returns from the investment compared to traditional financial approaches such as savings and time deposits [3].

Therefore, practitioners and researchers have focused their attention on the optimization of a portfolio, which is composed of a collection of financial assets (i.e., stocks, bonds, currencies, cryptocurrencies, and others) [4, 5]. Redistributing these assets of a portfolio to increase the rate of return on investment is commonly defined as portfolio optimization [6, 7], in which the aim is to distribute investment among several assets to jointly maximize returns and minimize the risks [2, 8].

There are two main types of investment strategies (active and passive), as pointed out by [9]. The former aims to beat the average return on the stock market through a careful stock selection process [10] although its long-term performance is unstable because it can change dramatically over time. In turn, the latter [11] aims to achieve the average market returns through a buy-and-hold strategy, which is less complex than active investment.

One of the most used models for portfolio optimization is the Mean-Variance (MV) model proposed by [12], which is a multi-objective problem to jointly maximize the return on investment and minimize the risks. Following Markowitz’s pioneering work, the standard MV model has been extended in different directions [13], although these extensions mainly focus on improving and expanding the MV model without adequately addressing the selection of high-quality assets before creating an optimal portfolio [14].

Although the financial market attracts the attention of several researchers and practitioners seeking to maximize the profit from their investments [15], it is known to be highly non-linear and volatile [16]. Hence, predicting the temporal behavior of stocks modeled as a time series is considered a challenging task [17]. Furthermore, it is important to identify the intrinsic value of a given stock for mitigating market deterioration and the loss of asset value [18]. Specifically, one of the main factors in portfolio optimization concerns stock market forecasting, due to the increasing attention on selecting high-quality stocks before optimizing the portfolio. In fact, the effectiveness of the portfolio optimization task is strongly affected by the stock market performance, as shown by [19], requiring increasingly accurate forecasting to maximize the return on investment and mitigate risk.

Nevertheless, these methods encounter significant limitations due to the complex nature of the task. First, modeling the non-stationarity, non-linearity, and uncertainty of price series is a critical concern due to the prevalence of noise, jumps, and fluctuations [20, 21]. Second, another challenge concerns how the investment budget is distributed among different stocks in terms of number and business sectors, which is known as portfolio diversification [6]. Finally, transaction and risk costs can significantly affect the effectiveness of the designed approaches [22].

In this paper, we propose a cognitively inspired framework for optimizing asset allocation to maximize the revenue from the investments and minimize the related risks, mainly relying on Artificial Intelligence (AI) models, which are becoming more and more pervasive in the financial domain, as shown in [1]; in fact, [23] underlined the relevance of integrating machine learning-based stock forecasting approaches into portfolio optimization. In particular, we integrate a Long Short-Term Memory (LSTM)-based model, one of the most used for predicting time series [24], to forecast stock trends by combining historical financial data and technical indicators. Hence, the main novelty of the proposed approach lies in the design of a cognitively inspired portfolio optimization framework that integrates deep learning-based forecasting of stock trends for maximizing revenue with portfolio diversification and the Sharpe Ratio for minimizing risk. In particular, it addresses the portfolio optimization task by designing cognitive agents that perform autonomous actions for supporting decision-making, as also discussed by [25]. This approach falls under the umbrella term of Cognitive Computing (CC) since it is an AI-based solution representing an important driver for knowledge-rich automation work, satisfying the definition of CC provided by [26]. We further integrate stock forecasting into the portfolio optimization model, a tight integration whose need has been underlined by [23]. Furthermore, we investigate the main factors affecting the stock forecasting task, which is based on the analysis of various data to make predictions, the foundation of cognitive systems as discussed by [27].

Summarizing, the main novelties of the proposed framework concern:

  • Design of a cognitively inspired prediction model for stock forecasting by combining historical financial data and technical indicators;

  • Design of a portfolio optimization framework that integrates deep learning-based forecasting of stock trends for maximizing the revenue with portfolio diversification and the Sharpe Ratio for minimizing the risk;

  • Evaluation of the proposed framework in a real-world scenario by considering stocks in different industrial domains which in pairs strongly influence each other to satisfy the portfolio differentiation. In particular, we compare the effectiveness of the proposed approach w.r.t. several baselines, also providing an ablation study to verify the integration of a deep learning-based approach for stock forecasting into the portfolio optimization model.

The remainder of the paper is organized as follows. The “Related Work” section reviews the state-of-the-art approaches to portfolio optimization, also underlining the main novelties of the proposal. The “Methodology” section describes the proposed methodology, detailing its main components (stock forecasting and portfolio optimizer). The “Experimental Evaluation” section presents the experimental protocol and the related assessment metrics, while the “Results” section discusses the efficiency and effectiveness results on a real-world dataset w.r.t. different baselines. Finally, the “Conclusion” section summarizes the main findings of the analysis, also pointing out possible future directions. The notation used is shown in Table 1.

Table 1 Notations

Related Work

In recent years, researchers have focused their attention on the portfolio optimization task [6, 13], which is defined as a multi-objective problem for optimizing asset allocation and minimizing the associated risk [13, 17]. Specifically, portfolio optimization aims to find the optimal allocation over different assets taking into account three different aspects of risk, as shown in [28]: (i) the quantification of risk through statistical measures; (ii) asset diversification; and (iii) the trade-off between risk and return.

Although the Markowitz Mean-Variance (MV) model [29] is the foundation of modern portfolio theory, it suffers from multiple limitations in practical applications (e.g., limited assumptions and high computational complexity). Hence, numerous studies have sought to enhance the MV model [28], mainly based on optimization approaches [30, 31]. The former defined a mixed-integer semidefinite optimization problem with cardinality constraints for limiting the number of invested assets through the analysis of the probability distribution of asset returns, and further designed a cutting-plane algorithm to solve it. In turn, [31] relied on multi-objective strategies (see [32] for more details) to deal with the computational complexity of portfolio optimization. Furthermore, another challenge concerns the covariance matrices computed from empirical financial data, which appear to contain a high amount of noise and require specific filtering methods, as shown in [33]. However, the non-linear and highly non-stationary nature of stock trends poses several challenges in the forecasting task [34].

Over the past decade, machine learning and deep learning techniques have driven significant advances across many application areas, which inspired investors and financial institutions to develop machine learning-aided trading strategies [35, 36]. Integrating AI-based models in portfolio optimization might prevent behavioral economic biases [37]. Chen et al. [7] designed an MV model in which stocks are selected among the ones with higher potential returns through eXtreme Gradient Boosting (XGBoost), whose hyper-parameters are computed by using an improved firefly algorithm. In [31], the authors combined the outputs of multiple neural networks through particle swarm optimization (PSO)-based weight optimization. In turn, the relevance of Technical Analysis indicators has been investigated by [38, 39] as case studies for stock prediction and portfolio optimization, respectively. However, some of these indicators are lagging, which affects the effectiveness of stock trading and portfolio management.

Since AI-based models have shown overwhelming superiority over well-known statistical methods, incorporating the return prediction of traditional time series models can improve the performance of the original portfolio optimization model, as shown in [40]. Specifically, [23] underlined the need for a tight integration of AI-based stock forecasting approaches into the portfolio optimization process. Recent developments in deep learning techniques have motivated intensive research in AI-aided stock trading strategies. Hence, several researchers have designed portfolio optimization approaches using deep learning, since they have the potential to significantly enhance the efficiency of portfolio selection processes and improve the return on investment [9, 41].

Other approaches [7, 40] aimed to make stock pre-selection before portfolio optimization by integrating Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) into the MV model, respectively. Two-stage approaches have been further proposed by [23, 42] for portfolio optimization. Specifically, [42] first computes backward the value that the portfolio would have had in past periods based on the time series of its stocks and, subsequently, predicts the future value of the portfolio through a damped trend model. In turn, [23] designed a mean-value-at-risk portfolio optimization model in which the stock trends are predicted through well-known machine learning models. Bisht and Kumar [19] proposed a further approach for portfolio optimization by selecting the highest performing stocks in previous years, extracted from strong sectors that are identified through Dempster-Shafer evidence theory based on historical closing prices. In turn, [22] designed a Reinforcement Learning-based approach for portfolio optimization that considers dynamic risks according to market conditions for dynamic portfolio rebalancing.

Nevertheless, the highly non-stationary nature of the financial market hinders the application of typical data-hungry machine learning methods [6] and often requires the analysis of contextual information [43,44,45], which makes stock forecasting an increasingly challenging task. Hence, some approaches have been developed to incorporate these features in the stock prediction process, as proposed for instance by [46], who designed a multi-source aggregated approach for classifying stock price movement. Furthermore, the predictive value of news and social media has been investigated by [47], whose analysis unveiled that news and social media content are most effective within a one-day and a two-to-five-day range, respectively. Nevertheless, the large amount of financial news, the heterogeneity of data sources, and the variety of events to extract from the textual content pose several challenges in integrating news content into the stock forecasting process. Therefore, the use of cognitive agents is required to analyze and predict the financial market.

The main novelties of this approach w.r.t. the state-of-the-art ones concern:

  • The design of a cognitively inspired forecasting module that combines historical financial data and technical indicators, in contrast to other approaches [43,44,45] relying on social networks and news, whose analysis is very sensitive to the noise introduced by the spread of fake news [48] and by malicious users’ actions [48], which are difficult to identify;

  • A cognitively inspired portfolio optimization approach that relies on the forecasting score of the stock trend and on the risk factor, integrating stock diversification into the model, in contrast to [7, 40], which only incorporate AI-based models for pre-selection purposes;

  • The design of a portfolio optimization model based on cognitively inspired stock forecasting that combines historical and technical data, in contrast to [23], which does not consider the multi-source analysis and the short-time dependency of stock features over time.

Fig. 1 Overview of the proposed framework, mainly composed of Data Pre-Processing, Stock Forecasting, and Portfolio Optimization

Methodology

Figure 1 shows the overall flow of the proposed cognitively inspired framework, from data crawling to portfolio optimization. In particular, we crawl data from financial data sources (see the “Source Identification” section), which are subsequently pre-processed and fed as input to a machine learning model for stock forecasting. Finally, we integrate the risk factor and the stock forecasts into the optimizer, which then distributes the budget across the appropriate investments.

Task Definition

In this section, we provide a formal definition of the addressed task, which aims to identify the optimal allocation of financial investments, also called assets. We deal with this task by first estimating the stock price trends, which are subsequently used as variables in the optimization model to select a collection of assets. In particular, the first task is known as stock forecasting (see Definition 1), which is challenging due to the highly noisy, dynamic, non-linear, non-parametric, and chaotic nature of stock data.

Definition 1

Stock Forecasting. Let \(s_{i,t_0},\ldots ,s_{i,t_N}\) be N observations about historical information of the stock \(s_i\), the stock forecasting task aims to provide an approximation of the function \(Y=f(s_{i,t_0},\ldots ,s_{i,t_N})\), which maps N stock historical values into the target variable score.
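As a minimal sketch of how Definition 1 can be cast as a supervised learning problem, the following Python snippet slices a price series into fixed-length input windows and next-step targets. The function name `make_windows` and the toy prices are illustrative assumptions and are not part of the proposed framework.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Split a series into (input window, next value) pairs."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be in [1, len(series) - 1]")
    inputs, targets = [], []
    for start in range(len(series) - window):
        # The model sees `window` consecutive observations ...
        inputs.append(list(series[start:start + window]))
        # ... and is asked to approximate the observation that follows them.
        targets.append(series[start + window])
    return inputs, targets


prices = [10.0, 11.0, 12.0, 13.0, 14.0]  # toy closing prices (assumed)
X, y = make_windows(prices, window=3)
```

Each row of `X` plays the role of the historical observations in Definition 1, while the matching entry of `y` is the target value that the function f should approximate.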

In turn, the second task (the portfolio optimization task) aims to build a time-dependent sequence of stocks for maximizing the return on investment and minimizing the risk (see Definition 2).

Definition 2

Portfolio Optimization. Let \(S=\{s_1,\cdots ,s_N\}\) be a set of assets; we define the portfolio optimization task as the weighted selection of stocks on which to invest a specific amount of money. In particular, a percentage of the investment (\(w_i\)) is allocated to the i-th stock to maximize the total return on investment \(R=\sum _{i=1}^N {r_i x_i}\), where \(r_i\) is the return on investment for the i-th stock and \(x_i\in \{0,1\}\) indicates whether the stock is included in the portfolio.

Hence, we define portfolio optimization as a multi-objective problem to identify the best investment proportion of stocks while minimizing investment risk. Nevertheless, the task of predicting stock market returns has become more and more challenging due to non-linearity and non-stationary characteristics within financial time series data.

Table 2 Correlations of three stocks (i.e., Apple, Microsoft, and Disney)
Fig. 2 Internal architecture of the designed LSTM-based forecasting module

Data Pre-processing

In this section, we describe the main information contained in the collected dataset (see the “Source Identification” section) and the pre-processing operations (see the “Data Manipulation” section) performed on it.

Source Identification

In the stock market, different information (e.g., trading volume or opening/closing price) is taken into account for investigating the stock trend. In particular, a set of corporate actions affects the share capital of a listed company which results in a change in the value of the shares involved in the transaction.

We consider the following features for each stock, extracted from Yahoo Finance for each date in the interval of analysis:

  1. Opening Price

  2. Closing Price

  3. Adjusted Closing Price

  4. Highest Price Reached in the Day

  5. Lowest Price Reached in the Day

  6. Volume Weighted Average Price

Once these data have been extracted for any stock, different pre-processing operations (i.e., correlation analysis and data scaling) have been performed.

Data Manipulation

In this section, we discuss the pre-processing operations performed to unveil the most relevant features for the designed task. Since we aim to predict the next-day closing price, which is one of the input variables of the proposed portfolio optimization model, we perform a correlation analysis of the features w.r.t. the target. We carried out this analysis on three different stocks (Apple (AAPL), Microsoft (MSFT), and Disney (DIS)), whose outcomes are shown in Table 2. Most of the features are highly correlated; in fact, the High and Low attributes represent the upper and lower bounds of the “Open” and “Close” prices, which are themselves closely related. According to the results shown in Table 2, the Close and Volume features are the most discriminating ones.
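The correlation analysis described above can be sketched as follows, assuming that the Pearson coefficient is the correlation measure (the text does not name one). The feature values and the next-day closing prices are toy numbers chosen for illustration only.

```python
import math
from typing import Dict, List, Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("series must have the same length (at least 2)")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Toy feature columns and next-day closing prices (assumed values).
features: Dict[str, List[float]] = {
    "Open": [10.0, 10.5, 11.0, 10.8, 11.5],
    "High": [10.6, 11.1, 11.4, 11.6, 12.0],
    "Volume": [900.0, 1200.0, 800.0, 1500.0, 1100.0],
}
next_close = [10.5, 11.0, 10.8, 11.5, 11.9]

# Correlation of each candidate feature with the prediction target.
correlations = {name: pearson(values, next_close) for name, values in features.items()}
```

Ranking the entries of `correlations` by absolute value gives a simple way to compare candidate features against the target.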

Stock Forecasting

In this section, we describe the proposed cognitively inspired model for predicting the financial trend of a stock. This workflow is motivated by the need for a tight integration of AI-based stock prediction into the portfolio optimization process, as shown by [23]. Hence, the main novelty of this module is to forecast stock behavior by combining historical financial data and technical indicators, which are often used to generate buy and sell signals for stocks. This module relies on an LSTM model, whose internal architecture is shown in Fig. 2, chosen because it achieves better results [49] in predicting time series trends. This architecture is able to capture the non-linear dependency between the independent variables and the target, making it more suitable for financial applications than conventional linear models, as shown in [50, 51].

The LSTM architecture aims to unveil temporal dependencies inside the time series through a set of gates that manage the knowledge flowing inside the cell. The advantage of using LSTM cells is twofold: (i) they are designed to solve the vanishing gradient problem and (ii) they learn patterns from long sequences.

Portfolio Optimization

This module solves the optimization problem in Model (1), a slightly modified version of Markowitz’s model [12] that integrates the predictions made by the previous module and the risk factor in a unified optimization model. Its fundamental principle is to jointly minimize the risk and maximize the overall return by compensating for the asynchronous trends of the individual securities.

Although classic portfolio optimization models rely on the mean historical return as the expected return, they obtain an inaccurate estimation of future short-term returns [40]. Specifically, we consider as the target variable a percentage of the invested budget that must be exceeded.

$$\begin{aligned} \begin{aligned} \min \quad&\sum _{i=1}^{N}{\sum _{j=1}^{N}{\sigma _{ij}x_ix_j}}\\ \text {s.t.} \quad&\sum _{i=1}^{N}{r_ix_iB}\ge pB\\ \quad&\sum _{i=1}^{N}{x_i} = 1\\ \quad&0\le x_i \le 1&\textrm{i} =1,...,\textrm{N}\\ \end{aligned} \end{aligned}$$
(1)

In this case, the factor p is the percentage of the budget to exceed, while \(x_i\) is the fraction of the budget B that the user invests in the i-th stock and \(r_i\) is the corresponding return forecast by the stock forecasting module. Furthermore, \(\sigma _{ij}\) is the covariance between stocks i and j, inferred from their log-returns and the corresponding averages, which is computationally efficient. The second constraint indicates that all the available budget must be used, while the last constraint excludes the possibility of short selling.
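A minimal sketch of how Model (1) could be solved with SciPy is reported below. The choice of the SLSQP solver, the function name `min_variance_weights`, and the toy covariance matrix and forecast returns are assumptions made for illustration; the paper only states that SciPy is used for the optimization step.

```python
import numpy as np
from scipy.optimize import minimize


def min_variance_weights(cov, forecast_returns, p):
    """Minimize x' Sigma x subject to r.x >= p, sum(x) = 1, and 0 <= x_i <= 1."""
    cov = np.asarray(cov, dtype=float)
    r = np.asarray(forecast_returns, dtype=float)
    n = len(r)
    constraints = [
        # Return constraint: the budget B appears on both sides and cancels out.
        {"type": "ineq", "fun": lambda x: r @ x - p},
        # The whole budget must be invested.
        {"type": "eq", "fun": lambda x: np.sum(x) - 1.0},
    ]
    result = minimize(
        lambda x: x @ cov @ x,      # portfolio variance
        x0=np.full(n, 1.0 / n),     # start from equal weights
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,    # no short selling
        constraints=constraints,
    )
    if not result.success:
        raise RuntimeError(result.message)
    return result.x


# Toy inputs (assumed): covariance of log-returns and forecast returns.
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
forecast_returns = [0.010, 0.020, 0.030]
weights = min_variance_weights(cov, forecast_returns, p=0.015)
```

The returned vector contains the budget fractions \(x_i\); multiplying it by the budget B gives the amount allocated to each stock.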

In turn, a further attempt was made by modifying the objective function to maximize the Sharpe Ratio of the financial portfolio.

$$\begin{aligned} \begin{aligned} \max \quad&\frac{\sum _{i=1}^{N}{lr_ix_i}}{V}\\ \text {s.t.} \quad&\sum _{i=1}^{N}{r_ix_iB}\ge pB\\ \quad&\sum _{i=1}^{N}{x_i} = 1\\ \quad&0\le x_i \le 1&\textrm{i}=1,...,\textrm{N}\\ \end{aligned} \end{aligned}$$
(2)

In particular, \(lr_i\) is the average of the log-returns of a single stock, while V is the volatility of the financial portfolio obtained through the log-return covariance matrix.
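To make the objective of Model (2) concrete, the following sketch evaluates the ratio between the weighted mean log-return and the portfolio volatility for a given allocation. It assumes that V is the square root of the quadratic form built from the log-return covariance matrix; the function name and the toy numbers are illustrative.

```python
import math
from typing import Sequence


def portfolio_sharpe(weights: Sequence[float],
                     mean_log_returns: Sequence[float],
                     cov: Sequence[Sequence[float]]) -> float:
    """Objective of Model (2): sum_i lr_i * x_i divided by the volatility V."""
    n = len(weights)
    expected = sum(mean_log_returns[i] * weights[i] for i in range(n))
    variance = sum(cov[i][j] * weights[i] * weights[j]
                   for i in range(n) for j in range(n))
    return expected / math.sqrt(variance)


# Toy inputs (assumed): two uncorrelated stocks.
mean_log_returns = [0.001, 0.002]
cov = [[0.0004, 0.0],
       [0.0, 0.0016]]

equal = portfolio_sharpe([0.5, 0.5], mean_log_returns, cov)
only_first = portfolio_sharpe([1.0, 0.0], mean_log_returns, cov)
```

In this toy case the equally weighted allocation scores higher than holding only the first stock, which illustrates how the objective rewards diversification across uncorrelated assets.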

Experimental Evaluation

The proposed framework has been designed for next-day portfolio optimization by allocating the investment budget over a subset of stocks. In this section, we discuss the effectiveness and efficiency results of the proposed framework according to the experimental protocol described in the “Experimental Protocol” section. Furthermore, we detail the hyper-parameter tuning process carried out to identify the best configuration of each module (see the “Hyper-parameter Tuning” section) and the tuning of the optimizer parameters (see the “Portfolio Optimizer Tuning” section).

Experimental Protocol

In this section, we describe the experimental protocol used to evaluate the efficiency and effectiveness of the proposed approach after performing a parameter tuning.

For this reason, the aim of the experimental protocol is fourfold:

  • Effectiveness of stock prediction in terms of Root Mean Squared Error (RMSE) - GAIN varying the number of epochs and time window;

  • Efficiency and effectiveness of portfolio optimization in terms of Running Time and Gain, respectively;

  • Effectiveness comparison of the proposed approach w.r.t. several baselines in terms of Total Revenue;

  • An ablation analysis to investigate how stock prediction might support investors in dealing with the portfolio optimization task.

The dataset has been collected from Yahoo Finance, covering the period from 01-01-2016 to 01-01-2020 and considering two stocks for each of six different areas, because one of the critical issues in portfolio optimization concerns “portfolio differentiation” [6]. Specifically, the available budget is invested in stocks of different areas:

  1. Tech: Apple (AAPL), Microsoft (MSFT)

  2. Chemical: DuPont (DD), Abbott Laboratories (ABT)

  3. Automotive: Ford (F), Honda (HMC)

  4. Pharmaceutical: Johnson & Johnson (JNJ), Albemarle Corporation (ALB)

  5. Food: McDonald’s (MCD), Coca-Cola (KO)

  6. Entertainment: Electronic Arts (EA), Disney (DIS)

Table 3 Effectiveness results of the proposed framework varying the number of days and epochs while considering the dataset split in 70-20-10

We evaluate the effectiveness of the Stock Forecasting task through the Root Mean Squared Error (RMSE), which is defined according to Eq. 3:

$$\begin{aligned} RMSE=\sqrt{\frac{1}{N}\sum _{i=1}^{N} (y_i - \hat{y}_i)^2} \end{aligned}$$
(3)

where N is the number of observations and \(y_i\) and \(\hat{y}_i\) are the real and predicted values, respectively.
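Eq. 3 translates directly into a few lines of Python. The sketch below uses only the standard library; the function name `rmse` and the toy prices are illustrative assumptions.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root Mean Squared Error between real and predicted values (Eq. 3)."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy closing prices and forecasts (assumed values).
score = rmse([100.0, 102.0, 104.0], [101.0, 102.0, 102.0])
```

Since the errors are squared before averaging, a single large miss (here the last prediction) dominates the score.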

In turn, we use the total revenue for evaluating the Portfolio Optimization, which is the amount of money earned or lost during the process.

The framework has been deployed on the Google Colab platform, while the software stack relies on Python 3.6 with yfinance for crawling financial data, on which we compute technical indicators through TA-Lib, and with TensorFlow and SciPy to design deep learning models and to run the optimization algorithm, respectively.

Hyper-parameter Tuning

In this section, we investigate the hyper-parameter optimization process for each stage of the proposed methodology (i.e., Stock Forecasting and Portfolio Optimizer). In particular, we perform the hyper-parameter tuning by splitting the entire dataset into three sets: train (\(70\%\)), validation (\(20\%\)), and test (\(10\%\)). Once hyper-parameter tuning has been performed by identifying the best model’s parameters, we used them for the evaluation process described in the “Results” section.
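The 70-20-10 split can be sketched as follows, assuming that the observations are kept in chronological order so that validation and test data always follow the training data (the text does not state how the split is drawn). The function name is illustrative.

```python
from typing import List, Sequence, Tuple


def chronological_split(rows: Sequence, train_pct: int = 70,
                        val_pct: int = 20) -> Tuple[List, List, List]:
    """Split time-ordered rows into train, validation, and test sets."""
    n = len(rows)
    train_end = n * train_pct // 100
    val_end = train_end + n * val_pct // 100
    # Whatever remains after train and validation becomes the test set.
    return list(rows[:train_end]), list(rows[train_end:val_end]), list(rows[val_end:])


days = list(range(100))  # stand-in for 100 trading days (assumed)
train_set, val_set, test_set = chronological_split(days)
```

Keeping the order intact avoids training on observations that are more recent than those used for evaluation.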

Stock Forecasting Tuning

In this section, we describe the hyper-parameter tuning of the stock forecasting module. First, we evaluate the window size for stock forecasting on the basis of the RMSE, varying five different time windows (3, 5, 10, 30, and 45 days) and four numbers of epochs (3, 10, 30, and 100). The results reported in Table 3 were obtained by running five tests and averaging the RMSEs.

In this analysis, we used an LSTM-based model composed of two LSTM layers, with 100 neurons each, and two Dense layers, with 25 neurons and one neuron, respectively. The model has been trained by choosing Adam as the optimizer and the Mean Squared Error as the loss function.
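The architecture described above can be sketched with the Keras API of TensorFlow as follows. Layer sizes, optimizer, and loss follow the text; the default activations, the input shape, and the function name `build_forecaster` are assumptions, since the paper does not report them.

```python
import tensorflow as tf


def build_forecaster(window: int, n_features: int) -> tf.keras.Model:
    """Two stacked LSTM layers (100 units each) followed by Dense(25) and Dense(1)."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(window, n_features)),
        # The first LSTM returns the full sequence so that the second can consume it.
        tf.keras.layers.LSTM(100, return_sequences=True),
        tf.keras.layers.LSTM(100),
        tf.keras.layers.Dense(25),
        tf.keras.layers.Dense(1),  # forecast of the next-day closing price
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model


# Example: 3-day window with a single input feature (assumed configuration).
model = build_forecaster(window=3, n_features=1)
```

Training would then call `model.fit` on windowed inputs of shape `(samples, window, n_features)` for the chosen number of epochs.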

It is worth noticing in Table 3 that the best RMSE score has been achieved using the 3-day time window. This observation is aligned with the main findings of [52], which state that using a window size close to the forecast horizon may enhance the effectiveness of the prediction module in dealing with non-linear and non-stationary market behavior.

Afterward, we integrated technical indicators into our dataset to improve the effectiveness of the forecasting module, although we found that only the Percentage Price Oscillator (PPO) achieves good performance in terms of RMSE. This indicator measures momentum, which is the speed at which the price is changing.

PPO is computed as the difference between the nine-day Exponential Moving Average (EMA) and the 26-day EMA, divided by the 26-day EMA. We provide a formal definition of PPO in Eq. 4, where EMA is the exponential moving average of the stock’s closing price.

$$\begin{aligned} \textrm{PPO}=\frac{\textrm{EMA}_{9\text{-day}}-\textrm{EMA}_{26\text{-day}}}{\textrm{EMA}_{26\text{-day}}} \end{aligned}$$
(4)
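A minimal, library-free sketch of Eq. 4 is given below. It assumes the common recursive EMA with smoothing factor 2/(span + 1), seeded with the first closing price; an indicator library may initialize or scale the value differently, so the numbers are not guaranteed to match the ones used in the experiments.

```python
from typing import List, Sequence


def ema(values: Sequence[float], span: int) -> List[float]:
    """Recursive exponential moving average, seeded with the first value."""
    if not values:
        raise ValueError("values must not be empty")
    alpha = 2.0 / (span + 1)
    out = [values[0]]
    for value in values[1:]:
        out.append(alpha * value + (1.0 - alpha) * out[-1])
    return out


def ppo(close: Sequence[float], fast: int = 9, slow: int = 26) -> List[float]:
    """Eq. 4: (EMA_fast - EMA_slow) / EMA_slow for every day of the series."""
    fast_ema = ema(close, fast)
    slow_ema = ema(close, slow)
    return [(f - s) / s for f, s in zip(fast_ema, slow_ema)]


flat = ppo([50.0] * 40)                          # constant price
rising = ppo([50.0 + day for day in range(40)])  # steadily rising price
```

A constant price yields a PPO of zero, while a steadily rising price yields a positive value because the faster EMA tracks the price more closely than the slower one.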

It is worth noticing in Table 4 that PPO achieves the best results in terms of RMSE w.r.t. other technical indicators (see Table 5). Furthermore, Table 4 shows that the best number of epochs to train the designed network turns out to be around 100.

In Table 6, we show the RMSE outcomes (mean and standard deviation) of the forecasting module when combining the close price and PPO. For each stock, 10 tests have been carried out, keeping the number of training epochs at 100 and varying the time window among three possible values (3, 10, and 30 days).

Table 4 Evaluation of forecasting module, whose input is composed by Close price and PPO technical indicator varying number of epochs

Portfolio Optimizer Tuning

In this section, we focus on the tuning of the optimizer module, evaluated in terms of revenue, by varying two main parameters (earning percentage and risk factor). The former represents the share of the budget to be earned per day, chosen among \(0.5\%\), \(1\%\), \(2\%\), and \(3\%\) with an initial budget equal to 15,000, while the latter is represented through the covariance of stocks and the Sharpe Ratio.

Table 7 shows the outcomes of the tuning of the optimizer module by computing the stock covariance matrix over different time periods (one month and one year). For each forecast week, we carried out 5 trials, computing the gains and losses observed at the end of the experiments.

In turn, Table 8 reports the results obtained by using the Sharpe Ratio over different time horizons. Specifically, we consider the yearly and monthly annualized versions [53], obtained by multiplying the value by the square root of 252 and 12, respectively. A final test has been carried out by considering a non-annualized Sharpe Ratio, which means taking the previous month’s ratio data and applying it as the optimizer’s risk matrix.
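The scaling step can be illustrated with a short sketch. It assumes the usual convention in which a ratio computed on daily data is multiplied by the square root of 252 trading days, and one computed on monthly data by the square root of 12 months; the input ratios are toy values.

```python
import math

TRADING_DAYS_PER_YEAR = 252
MONTHS_PER_YEAR = 12


def annualize(ratio: float, periods_per_year: int) -> float:
    """Scale a per-period Sharpe Ratio by the square root of the periods per year."""
    return ratio * math.sqrt(periods_per_year)


from_daily = annualize(0.05, TRADING_DAYS_PER_YEAR)   # toy daily ratio (assumed)
from_monthly = annualize(0.25, MONTHS_PER_YEAR)       # toy monthly ratio (assumed)
```

The non-annualized variant simply skips this multiplication and uses the per-period value as is.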

Results

In this section, we discuss the efficiency (“Efficiency Analysis” section) and effectiveness (“Effectiveness Analysis” section) results of the proposed methodology, also comparing it with respect to several baselines.

Table 5 Evaluation of forecasting module, whose input is composed of Close price and different technical indicators varying number of epochs

Efficiency Analysis

We evaluate the efficiency of the proposed methodology by varying both the window size (5, 10, and 15) and the number of stocks (6, 12, 20, 30, and 50), using a number of training epochs equal to 100 (the best value identified in the optimization phase, as shown in Table 4). Specifically, we compute the mean and standard deviation over 10 repetitions for each parameter combination.

As shown in Table 9, the rise in the number of stocks and the window size increases the training time (in seconds) almost linearly.

Table 6 Mean and standard deviation of RMSE for Close+PPO varying the number of days, fixing the number of epochs equal to 100
Table 7 Evaluation of the optimizer module through covariance matrix based on different time analyses
Table 8 Evaluation of the optimizer module through Sharpe Ratio based on different time analyses
Table 9 Training times varying number of stocks and days

In turn, Fig. 3 shows the training time in seconds to compare the efficiency of the LSTM- and Gated Recurrent Unit (GRU)-based forecasting modules, varying both the window size and the number of stocks. In particular, Fig. 4 shows the internal architecture of the designed GRU-based forecasting module.

At the end of this analysis, we can observe in Table 9 and Fig. 3 that the worst scenario (i.e., forecast analysis considering a 15-day window and 50 stocks) requires a training time of around 2 h and 15 min, which is suitable for the scenario we had set. However, the training time in the worst case decreases to 1 h and 5 min with the best window size of 3 days identified during hyper-parameter tuning (see Table 3). Finally, it is worth noting in Fig. 3 that GRU is faster than LSTM because the former has fewer parameters.

Fig. 3 Efficiency comparison of the proposed methodology by integrating LSTM and GRU varying the number of stocks and window size

Fig. 4 Internal architecture of the designed GRU-based forecasting module

Effectiveness Analysis

We evaluate the effectiveness of the proposed approach by first comparing the versions of the proposed methodology integrating LSTM and GRU and, subsequently, comparing it w.r.t. two optimization-based approaches [54, 55]. We carried out this experimental analysis over 2 weeks, which correspond to 10 trading days since Saturday and Sunday are not trading days for the stock market.

Table 10 shows the experimental outcomes in terms of total revenue, obtained by comparing two versions of the proposed methodology integrating LSTM and GRU, respectively. Although the GRU model shows a shorter training time than LSTM, the latter achieves higher effectiveness results, as shown in Table 10.

In Table 11, we compare the proposed LSTM-based methodology w.r.t. two baselines in terms of total revenue over two weeks.

It is worth noting that, during the first week, the proposed approach based on the Sharpe Ratio outperforms the two baselines, which lose $57 and $48, respectively. Although the proposed approach loses money during the second week, it still achieves a gain of $88.5, leading to a net difference of up to $168 over the baselines.

Comparison w.r.t. State-of-the-Art Approaches

In this section, we compare the effectiveness of the proposed approach with several state-of-the-art baselines on the entire test set in terms of total revenue. The proposed approach is composed of two phases: the first predicts each stock's trend, and the second performs portfolio optimization by integrating the outcome of the first. Hence, we choose as baselines the approaches proposed by [7, 19, 40], which perform AI-based stock pre-selection before optimizing the portfolio. Table 12 shows the outcome of the comparison on the entire test set. It is worth noticing that the proposed approach outperforms the baselines, showing the highest total revenue (up to \(77.29\%\)). In particular, the proposed approach and the one designed by [40] yield the highest total revenue, with a difference compared to the worst result of \(77.29\%\) and \(59.82\%\), respectively. This shows how integrating machine learning models for stock forecasting into the optimization task leads to better results.

In summary, the proposed framework improves the portfolio optimization task by combining an analysis of the stock features over time through an LSTM-based model, portfolio diversification, and the Sharpe Ratio for minimizing risk.
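For reference, the Sharpe Ratio of a portfolio with weights \(w\), expected returns \(\mu\), covariance matrix \(\Sigma\), and risk-free rate \(r_f\) is \((w^{\top}\mu - r_f)/\sqrt{w^{\top}\Sigma w}\). The following is a minimal pure-Python sketch of this standard definition, not the optimizer module itself; the returns, covariance values, and zero risk-free rate are toy assumptions chosen only to illustrate the effect of diversification.

```python
import math


def portfolio_return(weights, mean_returns):
    """Expected portfolio return: weighted sum of the asset returns."""
    return sum(w * r for w, r in zip(weights, mean_returns))


def portfolio_volatility(weights, covariance):
    """Portfolio standard deviation: sqrt(w^T * Sigma * w)."""
    n = len(weights)
    variance = sum(
        weights[i] * covariance[i][j] * weights[j]
        for i in range(n)
        for j in range(n)
    )
    return math.sqrt(variance)


def sharpe_ratio(weights, mean_returns, covariance, risk_free_rate=0.0):
    """Excess return per unit of risk."""
    volatility = portfolio_volatility(weights, covariance)
    if volatility == 0:
        raise ValueError("Sharpe Ratio is undefined for zero volatility")
    excess = portfolio_return(weights, mean_returns) - risk_free_rate
    return excess / volatility


# Toy inputs (assumptions): two uncorrelated assets with equal variance.
mu = [0.10, 0.20]
sigma = [[0.04, 0.00],
         [0.00, 0.04]]

diversified = sharpe_ratio([0.5, 0.5], mu, sigma)
only_first = sharpe_ratio([1.0, 0.0], mu, sigma)
only_second = sharpe_ratio([0.0, 1.0], mu, sigma)
print(diversified, only_first, only_second)
```

With these toy inputs, the equally weighted portfolio has a higher Sharpe Ratio than either single-asset portfolio, because spreading the budget over uncorrelated assets lowers the volatility more than it lowers the expected return.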

Ablation Study

In this section, we describe the ablation study carried out to evaluate how the effectiveness of stock forecasting affects the performance of portfolio optimization. Specifically, we consider two different strategies for optimizing the portfolio: the first performs stock pre-selection followed by portfolio optimization, while the second performs portfolio optimization only.

The results in terms of total revenue are shown in Table 13, where the proposed approach achieves the highest score. This outcome supports the idea that a better prediction of the real market can improve the portfolio optimization approach by supporting investors in choosing robust optimization strategies.
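The structure of the two ablation variants can be sketched as follows. This is an illustrative skeleton under strong assumptions: the function names are hypothetical, pre-selection is reduced to keeping stocks with a positive predicted return, and an equal-weight split stands in for the Sharpe-Ratio-based optimizer. The tickers and numbers are toy inputs that only exercise the code; they are not experimental results.

```python
def equal_weight_allocation(tickers, budget):
    """Stand-in optimizer (assumption): split the budget evenly."""
    if not tickers:
        return {}
    share = budget / len(tickers)
    return {ticker: share for ticker in tickers}


def total_revenue(allocation, realized_returns):
    """Sum of the money placed on each stock times its realized return."""
    return sum(
        amount * realized_returns[ticker]
        for ticker, amount in allocation.items()
    )


def run_variant(predicted_returns, realized_returns, budget, preselect):
    """Run one ablation variant, with or without stock pre-selection."""
    tickers = list(predicted_returns)
    if preselect:
        # Hypothetical rule: keep only stocks predicted to rise.
        tickers = [t for t in tickers if predicted_returns[t] > 0]
    allocation = equal_weight_allocation(tickers, budget)
    return total_revenue(allocation, realized_returns)


# Toy inputs with made-up tickers.
predicted = {"AAA": 0.02, "BBB": -0.01, "CCC": 0.03, "DDD": -0.02}
realized = {"AAA": 0.01, "BBB": -0.02, "CCC": 0.02, "DDD": -0.01}

with_preselection = run_variant(predicted, realized, 1000.0, preselect=True)
optimization_only = run_variant(predicted, realized, 1000.0, preselect=False)
print(with_preselection, optimization_only)
```

The only difference between the two variants is the filtering step, so any gap in total revenue can be attributed to the quality of the forecasts used for pre-selection.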

Table 10 Effectiveness comparison of the proposed methodology integrating LSTM and GRU on the test set
Table 11 Effectiveness comparison of the proposed LSTM-based methodology w.r.t. several baselines on the test set
Table 12 Effectiveness comparison of the proposed LSTM-based methodology w.r.t. several state-of-the-art approaches on the test set
Table 13 Outcome of the ablation study to compare the effectiveness of portfolio optimization by integrating stock prediction or not

Conclusion

Over the past few years, the widespread diffusion of Artificial Intelligence has radically transformed the financial domain [56], with a particular focus on stock market analysis due to its growing relevance in the real world. Hence, several practitioners and researchers have focused on designing methods for predicting market behavior with the aim of maximizing profits from investment activities.

In this paper, we designed a cognitively inspired framework for portfolio optimization that distributes the budget over different stocks according to the predictions made and the risk associated with the investment. In particular, we integrate stock diversification into the optimization to reduce the risk. This framework relies on an LSTM-based forecasting module that combines historical data and technical indicators to predict the behavior of securities.

The proposed framework has been evaluated on a real-world dataset, composed of information about several stocks from 01-01-2016 to 01-01-2020. Firstly, we evaluated the efficiency and effectiveness of the forecasting module by comparing two different versions based on LSTM and GRU, respectively. Although the GRU model has a shorter training time than the LSTM, the latter achieves better effectiveness. Furthermore, we compared the proposed framework with different baselines [54, 55], obtaining a net difference of up to $168. We further compared the proposed approach with several state-of-the-art methodologies [7, 19, 40]; it is worth noticing that the proposal outperforms these baselines, showing the highest total revenue (up to \(77.29\%\)).

Future work will be devoted to extending the examined dataset by increasing the number of stocks and industrial domains. Furthermore, we will investigate how transaction costs, as well as content information inferred from news and/or social media, can be integrated into the proposed framework to improve its effectiveness in portfolio optimization.