Abstract
The enhanced index tracking (EIT) problem is concerned with selecting a tracking portfolio that beats the benchmark on return while having the minimum tracking error. This paper addresses the EIT problem based on uncertainty theory, where stock returns are treated as uncertain variables instead of random variables. Under the framework of uncertainty theory, the paper proposes a new uncertain EIT model in which the higher-order moment of the downside is used as the tracking error measure, since the higher-order moment makes the model more widely applicable and the downside risk is in line with investors’ perception of risk. In addition, some realistic constraints are considered in the new uncertain EIT model. Then, the properties of the proposed model are discussed. To solve the proposed model, which is a nonlinear integer programming problem, a meta-heuristic algorithm is presented. The efficiency of the algorithm and the applications of the proposed model are illustrated through numerical experiments.
1 Introduction
The enhanced index tracking (EIT) strategy is to select a tracking portfolio that beats the benchmark on return while having the minimum tracking error. In recent years, enhanced index funds have performed well and their scale has grown rapidly (Jorion 2002; Filippi et al. 2016), so the EIT strategy is highly praised in the investment industry. In financial academia, the EIT problem has also attracted scholars’ attention. One central research question involves the definition of the tracking error and the corresponding construction of the EIT model. In an important work, Roll (1992) first uses the variance of the difference between tracking portfolio returns and benchmark returns (TEV) as the tracking error measure and proposes the TEV criterion to build the EIT model. Since then, many scholars have developed this approach further by, e.g., adding a constraint on the tracking portfolio variance that investors can take to reduce risk (Jorion 2003).
However, variance as the tracking error measure has been challenged, as it implies that investors weigh the underperformance and overperformance of their tracking portfolios relative to the benchmark equally. In the EIT strategy, investors actually welcome overperformance and do not want underperformance, so symmetric tracking error measures are no longer suitable. For this reason, some scholars have begun to use downside risk measures. The downside risk measure is consistent with the investor’s perception of risk because most investors understand risk as the potential underachievement of a target return by an asset (Mao 1970). Among many downside risk measures, the absolute downside deviation has become a popular one in the EIT problem. For example, Rudolf et al. (1999) use the absolute downside deviation as the tracking error measure to propose a linear EIT model and compare it with EIT models using other tracking error measures. Lejeune (2012) provides a game theoretical formulation for the EIT problem in which the minimum excess return over all allowable probability distributions is maximized subject to the absolute downside deviation tracking error constraint. Guastaroba et al. (2016) put forward an EIT model in which they maximize the ratio of the upside and downside deviations between the tracking portfolio and the benchmark returns. In addition to the absolute downside deviation tracking error measure, the higher-order moment of the downside is also used as a tracking error measure in the literature. The higher-order moment contains more information, and it can make the model more widely applicable (Maringer and Parpas 2009). At present, few studies focus on EIT problems involving the higher-order moment of the downside as the tracking error measure. To the best of our knowledge, only Yu et al. (2006) have employed the higher-order moment of the downside as a tracking error measure in their model.
Realistic constraints are often considered when constructing EIT models. For example, Beasley et al. (2003) take into account the transaction costs and the number of stocks that can be purchased in their tracking model. Strub and Baumann (2018) present a mixed-integer linear programming formulation with constraints on the capital budget, transaction costs, and the number of stocks that can be purchased. Besides, Andriosopoulos and Nomikos (2014) consider the realistic constraints of cardinality and min./max. weight when reproducing the performance of a geometric average Spot Energy Index. Sant’Anna et al. (2017) consider the realistic constraint of cardinality when investigating the out-of-sample performance of cointegration and correlation methods for the EIT strategies. When EIT models contain realistic constraints, it is difficult to give an analytical solution of the proposed model, so scholars use algorithms to solve their EIT models. Guastaroba and Speranza (2012) present a heuristic framework called Kernel Search to solve their mixed-integer linear programming formulations for the EIT problem. Filippi et al. (2016) modify the Kernel Search heuristic to solve the bi-objective programming model they proposed for the EIT problem.
In the above studies, probability theory is used as a main mathematical tool. Though probability theory is a powerful tool for helping people deal with indeterminate parameters, its use is conditional upon there being enough valid data so that obtained distributions from the data are close enough to the real frequencies. Yet there exist situations in reality where people have few data or the data are invalid due to unexpected events in society. For example, the outbreak of COVID-19 has led to a series of unexpected events so that historical data cannot reflect the future effectively. Besides, some empirical tests show that in many cases observed data in the financial market are not random variables. By using the two-sample Kolmogorov–Smirnov test, Ye and Liu (2022a) show US Dollar to Chinese Yuan (USD-CNY) exchange rates cannot be treated as random variables because residuals from different parts are neither from the same population nor white noise in the sense of probability theory. Similar tests also show that Alibaba stock prices (Liu and Liu 2022) and GDP (Ye and Liu 2022b) are not suitable to be treated as random variables. Therefore, there does exist a situation where probability theory is not applicable in reality.
To handle this problem, Liu (2007) proposes a new theory, i.e., uncertainty theory. Uncertainty theory is a new branch of mathematics and is developed based on four axioms. Nowadays, uncertainty theory has been applied in solving various optimization problems (Liu 2010; Gao 2011). Particularly, Huang (2010) is the first to use uncertainty theory to study portfolio selection systematically. Subsequently, scholars have studied many portfolio problems based on uncertainty theory. Wang and Huang (2019) discuss the effect of options on the portfolio performance and find that portfolios with options gain higher expected returns than those without options. Huang and Yang (2020) study how the background risk affects investment decisions. They give properties of the portfolio efficient frontier when stock and background asset returns all take normal uncertainty distributions and tell how background risk affects portfolio selection decisions. To reflect different attitudes toward risk in one portfolio investment, Huang and Di (2020) apply mental accounts to an uncertain optimization model. Besides, Ning et al. (2015) consider triangular entropy in the uncertain portfolio selection problem. Qin et al. (2016) study the uncertain portfolio adjusting problem. Zhang (2019) and Chen et al. (2019) focus on the multi-period uncertain portfolio selection problem. Besides, scholars further discuss the performances of uncertain portfolios. Studies show that the returns or the expected returns of uncertain portfolios are higher than those of traditional portfolios in some situations. For example, Huang and Yang (2020) present evidence that the return of the portfolio obtained from the uncertain mean-variance model is greater than that of traditional Markowitz’s mean-variance model in volatile markets. Xue et al. (2019) find that the expected return of the aggregate portfolio when security return rates are uncertain variables is greater than the expected return of the aggregate portfolio when security return rates are random variables in the case of facing invalid data. It can be seen from the above examples that the uncertain portfolio selection models have some advantages. Yet so far, no work deals with the uncertain EIT problem. How to build the uncertain EIT model? How does the uncertain EIT model perform? The lack of research on these issues motivates us to do the study.
The rest of the paper is organized as follows. In Sect. 2, we provide the necessary knowledge of uncertainty theory for easy understanding of our paper. In Sect. 3, we propose an uncertain EIT model with higher-order moment of the downside and discuss its properties. In Sect. 4, we present a meta-heuristic algorithm for the proposed model. In Sect. 5, we report the experimental results. Section 6 concludes the paper.
2 Preliminaries
In this section, we review some necessary knowledge about uncertainty theory.
Definition 1
(Liu 2007) Let \(\Gamma \) be a nonempty set, and \(\text{ L }\) a \(\sigma \)-algebra over \(\Gamma \). Each element \(\Lambda \in \text{ L }\) is called an event. A set function \(\text{ M } \{\Lambda \}\) is called an uncertain measure if it satisfies the following four axioms: (i) Normality, (ii) Duality, (iii) Subadditivity, (iv) Product Axiom.
Definition 2
(Liu 2007) An uncertain variable is a function \(\xi \) from an uncertainty space \((\Gamma , \text{ L }, \text{ M})\) to the set of real numbers such that \(\{\xi \in B \}\) is an event for any Borel set B of real numbers.
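An uncertain variable \(\xi \) is called normal if it has a normal uncertainty distribution
\[\Phi (t)=\left( 1+\exp \left( \frac{\pi (e-t)}{\sqrt{3}\sigma }\right) \right) ^{-1}, \quad t\in \mathbb {R}, \tag{1}\]
denoted by \(\xi \sim \mathcal {N}(e, \sigma )\) where e and \(\sigma \) are real numbers and \(\sigma >0.\)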
Let \(\xi \) and \(\eta \) be uncertain variables. We say \(\xi >\eta \) if \(\xi (\gamma )>\eta (\gamma )\) for almost all \(\gamma \in \Gamma \).
Definition 3
(Liu 2007) For any real number t, the uncertainty distribution \(\Phi \) of an uncertain variable \(\xi \) is defined by
\[\Phi (t)=\text{ M }\{\xi \le t\}. \tag{2}\]
An uncertainty distribution \(\Phi (t)\) is called regular if it is a continuous and strictly increasing function with respect to t at which \(0<\Phi (t)<1,\) and \(\lim \limits _{t\rightarrow -\infty } \Phi (t)=0, \quad \lim \limits _{t\rightarrow +\infty } \Phi (t)=1.\) It is seen that normal uncertainty distribution is regular.
A regular uncertainty distribution \(\Phi \) has an inverse function \(\Phi ^{-1}(\alpha )\) which is called the inverse uncertainty distribution.
The operational law is given as follows:
Theorem 1
(Liu 2010) Let \(\xi _1, \xi _2, \ldots , \xi _n\) be independent uncertain variables with regular uncertainty distributions \(\Phi _1, \Phi _2,\ldots , \Phi _n,\) respectively. If \(f(\xi _1,\xi _2, \ldots ,\xi _n)\) is strictly decreasing with respect to \(\xi _1,\xi _2,\ldots ,\xi _m\) and strictly increasing with respect to \(\xi _{m+1},\xi _{m+2},\ldots ,\xi _n\), then \(\xi =f(\xi _1,\xi _2,\ldots ,\xi _n)\) is an uncertain variable with inverse uncertainty distribution function
\[\Psi ^{-1}(\alpha )=f\left( \Phi _1^{-1}(1-\alpha ),\ldots ,\Phi _m^{-1}(1-\alpha ),\Phi _{m+1}^{-1}(\alpha ),\ldots ,\Phi _n^{-1}(\alpha )\right) . \tag{3}\]
According to the operational law, the following theorem can be obtained:
Theorem 2
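(Liu 2010) Let \(\xi _{1}\) and \(\xi _{2}\) be independent normal uncertain variables \(\mathcal {N}(e_{1}, \sigma _{1})\) and \(\mathcal {N}(e_{2}, \sigma _{2})\), respectively. Then, the sum \(\xi _{1}+\xi _{2}\) is also a normal uncertain variable, i.e.,
\[\mathcal {N}(e_{1}, \sigma _{1})+\mathcal {N}(e_{2}, \sigma _{2})=\mathcal {N}(e_{1}+e_{2}, \sigma _{1}+\sigma _{2}). \tag{4}\]
The product of a normal uncertain variable \(\mathcal {N}(e, \sigma )\) and a scalar number \(k>0\) is also a normal uncertain variable, i.e.,
\[k\cdot \mathcal {N}(e, \sigma )=\mathcal {N}(ke, k\sigma ). \tag{5}\]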
The expected value and variance of an uncertain variable are defined as follows:
Definition 4
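(Liu 2007) Let \(\xi \) be an uncertain variable. Then, the expected value of \(\xi \) is defined by
\[E[\xi ]=\int _{0}^{+\infty }\text{ M }\{\xi \ge r\}\,\textrm{d}r-\int _{-\infty }^{0}\text{ M }\{\xi \le r\}\,\textrm{d}r \tag{6}\]
provided that at least one of the two integrals is finite.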
Definition 5
(Liu 2007) Let \(\xi \) be an uncertain variable with finite expected value e. Then, the variance of \(\xi \) is defined by
\[V[\xi ]=E[(\xi -e)^{2}]. \tag{7}\]
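Since operations on uncertain variables are mainly carried out in the form of inverse uncertainty distributions, Liu (2010) and Yao (2015) presented the following formulas to calculate the expected value and variance of an uncertain variable \(\xi \) with regular uncertainty distribution \(\Phi \) and finite expected value e via its inverse uncertainty distribution:
\[E[\xi ]=\int _{0}^{1}\Phi ^{-1}(\alpha )\,\textrm{d}\alpha , \tag{8}\]
\[V[\xi ]=\int _{0}^{1}\left( \Phi ^{-1}(\alpha )-e\right) ^{2}\textrm{d}\alpha . \tag{9}\]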
For a normal uncertain variable \(\xi \sim \mathcal {N} (e, \sigma )\), we can calculate that its expected value \(E[\xi ]=e\) and variance \(V[\xi ]=\sigma ^{2}\).
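The statements above can be checked numerically. The following is a minimal sketch, not part of the original paper: it assumes the standard closed form of the inverse normal uncertainty distribution, \(\Phi ^{-1}(\alpha )=e+\frac{\sqrt{3}\sigma }{\pi }\ln \frac{\alpha }{1-\alpha }\), and evaluates the integrals over \(\alpha \in (0,1)\) with a simple midpoint rule. The function names and the parameter values are illustrative choices only.

```python
import math

SQRT3_OVER_PI = math.sqrt(3.0) / math.pi

def normal_dist(t, e, sigma):
    """Normal uncertainty distribution Phi(t) of N(e, sigma)."""
    return 1.0 / (1.0 + math.exp(math.pi * (e - t) / (math.sqrt(3.0) * sigma)))

def normal_inv(alpha, e, sigma):
    """Inverse uncertainty distribution of N(e, sigma), for 0 < alpha < 1."""
    return e + sigma * SQRT3_OVER_PI * math.log(alpha / (1.0 - alpha))

def integrate01(g, steps=100_000):
    """Midpoint rule on (0, 1); the endpoints are never evaluated."""
    h = 1.0 / steps
    return h * sum(g((k + 0.5) * h) for k in range(steps))

def expected_value(inv):
    """Expected value as the integral of the inverse distribution over (0, 1)."""
    return integrate01(inv)

def variance(inv):
    """Variance as the integral of (inverse distribution - mean)^2 over (0, 1)."""
    e = expected_value(inv)
    return integrate01(lambda a: (inv(a) - e) ** 2)

# Two independent normal uncertain variables (illustrative parameters).
xi1 = lambda a: normal_inv(a, 0.05, 0.20)
xi2 = lambda a: normal_inv(a, 0.03, 0.10)
# Operational law for an increasing function: inverse distributions add up.
total = lambda a: xi1(a) + xi2(a)

mean_xi1 = expected_value(xi1)
var_xi1 = variance(xi1)
mean_total = expected_value(total)
print(mean_xi1, var_xi1, mean_total)
```

Under these assumptions the printed values should be close to \(e=0.05\), \(\sigma ^{2}=0.04\) and \(e_{1}+e_{2}=0.08\), and `total` coincides with the inverse distribution of \(\mathcal {N}(0.08, 0.30)\), in line with Theorem 2.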
The moment of an uncertain variable is defined as follows:
Definition 6
(Liu 2010) Let \(\xi \) be an uncertain variable and let k be a positive integer. Then, \(E[\xi ^{k}]\) is called the k-th moment of \(\xi \).
3 The uncertain EIT model with higher-order moment of the downside
In this section, we will provide an uncertain EIT model in which the tracking error is measured by the higher-order moment of the downside.
3.1 The assumptions and notations
Suppose there are n different stocks in the asset universe. Short-selling is not allowed. Let \({\textbf {X}}_{\textbf {I}}\) denote the benchmark with the uncertain return \(r_{\textrm{I}}\). Let \(n\times 1\) vector \(\textbf{X}_{\textbf{P}}\) denote the tracking portfolio with the uncertain return \(r_{\textrm{P}}\) which is independent of \(r_{\textrm{I}}\). For example, \(r_{\textrm{P}}\) represents the return of a portfolio from Shanghai Stock Exchange and \(r_{\textrm{I}}\) represents NASDAQ-100 Index return, and they can be converted into independent variables by using the factor method in Huang and Zhao (2014). Next we introduce the rest of the notations used in this paper. For convenience, we show them in Table 1.
3.2 Higher-order moment of the downside
The goal of the EIT strategy is to track a benchmark index closely while providing some excess returns. Generally speaking, the tracking portfolio’s return is not exactly the same as the benchmark’s. There are positive or negative deviations between them. Investors welcome the positive deviation and do not want the negative deviation. So in our model, we will control the negative deviation.
Recall that \(r_{\textrm{P}}\) and \(r_{\textrm{I}}\) represent the uncertain returns of the tracking portfolio and the benchmark, respectively. Then \((r_{\textrm{P}}-r_{\textrm{I}})^{-}\) represents the negative deviation of \(r_{\textrm{P}}\) with respect to \(r_{\textrm{I}}\). For simplicity, we write \((r_{\textrm{P}}-r_{\textrm{I}})^{-}=\min (r_{\textrm{P}}-r_{\textrm{I}},0)\). According to Definition 6, the m-th moment of \(|(r_{\textrm{P}}-r_{\textrm{I}})^{-}|\) is \(E[|(r_{\textrm{P}}-r_{\textrm{I}})^{-}|^{m}]\), \(m=1,2,\ldots \). Let \(\Psi \) denote the uncertainty distribution of \(r_{\textrm{P}}-r_{\textrm{I}}\). Then according to Definition 4, we can have
Equation (10) provides a method for calculating the higher order moment of an uncertain variable.
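As a numerical illustration, the downside moment can be approximated from an inverse uncertainty distribution. The sketch below is an assumption-laden example, not the paper's Eq. (10). It uses the standard result that, for a monotone function f and a regular distribution \(\Psi \), \(E[f(\eta )]=\int _{0}^{1}f(\Psi ^{-1}(\alpha ))\,\textrm{d}\alpha \), applied to \(f(t)=|\min (t,0)|^{m}\); the paper's own formula may be written in a different but equivalent form. The helper name `downside_moment` and the linear uncertain variable used for the check are hypothetical choices.

```python
def downside_moment(inv_dist, m, steps=100_000):
    """Approximate E[|min(eta, 0)|^m] with a midpoint rule over alpha in (0, 1).

    inv_dist is the inverse uncertainty distribution of eta (assumed regular).
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        q = inv_dist((k + 0.5) * h)
        if q < 0.0:  # only the negative deviation contributes
            total += (-q) ** m
    return total * h

# Illustrative check with a linear uncertain variable on [-1, 1],
# whose inverse distribution is -1 + 2 * alpha.
linear_inv = lambda a: -1.0 + 2.0 * a
first = downside_moment(linear_inv, 1)   # closed form: 1/4
second = downside_moment(linear_inv, 2)  # closed form: 1/6
print(first, second)
```

The closed-form values 1/4 and 1/6 follow from integrating \((1-2\alpha )^{m}\) over \([0, 1/2]\), which gives a quick sanity check on the quadrature.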
3.3 The uncertain model
The uncertain EIT model with higher-order moment of the downside is designed to maximize the expected excess return over the benchmark subject to a fixed level of tracking error. The expected excess return is
\[E[r_{\textrm{P}}-r_{\textrm{I}}]. \tag{11}\]
The tracking error is measured by the higher-order moment of the negative deviation of \(r_{\textrm{P}}\) with respect to \(r_{\textrm{I}}\), i.e.,
\[E\left[ |(r_{\textrm{P}}-r_{\textrm{I}})^{-}|^{m}\right] . \tag{12}\]
So the uncertain EIT model with higher-order moment of the downside is as follows:
where \(D_{m}\) is the investors’ tolerance level of the tracking error. By varying the value of m, model (13) can be changed into different uncertain EIT models that respond to different risk preferences of the investors.
Next we extend model (13) to a general form by including some realistic constraints. (i) The minimum transaction lots constraint. Because some securities markets have a minimum transaction lots limit, investors can only buy an integral multiple of the minimum transaction lots instead of any number of shares. To incorporate the minimum transaction lots constraint, let \(\omega _{i}\) denote the number of transaction lots invested in stock i and \(N_{i}\) represent the number of units of stock i required as the minimum quantity. So \(\omega _{i} \cdot N_{i}\) represents the total quantity to be invested in stock i.
(ii) The budget constraint. In real investment, investors usually set budgets. Let \(P_{i}\) denote the unit market price of stock i. Thus, \(\sum ^{n}_{i=1}\omega _{i} \cdot N_{i} \cdot P_{i}\) represents the total amount of money invested in n stocks.
(iii) The limits on holdings and cardinality constraint. The first one is to make sure that each stock is allowed to be invested in specific ranges, i.e., \(0 \le l_{i} \le x_{i}\le u_{i} \le 1\). The second one ensures that only in a specified number of stocks the investment can be made, i.e., \(\sum ^{n}_{i=1} z_{i}=Q\) where \(z_{i}\in \{0,1\}\). If \(z_{i}=1\), then the stock i is chosen for investment; otherwise not.
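The three realistic constraints can be illustrated with a small feasibility check. This is a hedged sketch under assumptions that are not stated explicitly in the text: the proportion \(x_{i}\) is taken to be the share of stock i in the invested amount, \(x_{i}=\omega _{i}N_{i}P_{i}/\sum _{j}\omega _{j}N_{j}P_{j}\); the budget constraint is read as "invested amount not larger than B"; and \(z_{i}=1\) exactly when \(\omega _{i}>0\). All function names and numbers are hypothetical.

```python
def portfolio_weights(w, N, P):
    """Invested proportions x_i and the total amount, from lot counts w_i."""
    amounts = [wi * Ni * Pi for wi, Ni, Pi in zip(w, N, P)]
    total = sum(amounts)
    if total == 0:
        raise ValueError("at least one stock must be bought")
    return [a / total for a in amounts], total

def satisfies_realistic_constraints(w, N, P, B, l, u, Q):
    """Check budget, holding limits and cardinality for integer lots w."""
    x, total = portfolio_weights(w, N, P)
    z = [1 if wi > 0 else 0 for wi in w]          # assumed link between z and w
    budget_ok = total <= B                        # budget constraint
    cardinality_ok = sum(z) == Q                  # exactly Q stocks are held
    bounds_ok = all(li * zi <= xi <= ui * zi      # limits on holdings
                    for li, ui, zi, xi in zip(l, u, z, x))
    return budget_ok and cardinality_ok and bounds_ok

# Illustrative data: 4 stocks, lots of 100 shares each.
N = [100, 100, 100, 100]
P = [10.0, 20.0, 5.0, 8.0]
w = [3, 2, 0, 5]                                  # invests 3000 + 4000 + 0 + 4000
l = [0.1] * 4
u = [0.5] * 4
feasible = satisfies_realistic_constraints(w, N, P, B=12000, l=l, u=u, Q=3)
print(feasible)
```

With these numbers the invested amount is 11000, within the budget of 12000, three stocks are held, and each held proportion lies between 0.1 and 0.5, so the check passes; changing Q to 2 or B to 10000 makes it fail.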
Based on the aforementioned discussions, the uncertain EIT model with higher-order moment of the downside considering realistic constraints can be formulated as follows:
3.4 Discussion on the proposed model
In this section, we compare our model (14) with other uncertain EIT models. First, the comparison is conducted between models with higher-order and lower-order moments. In model (14), when \(m\ge 3\), tracking error is measured by the higher-order moment of the downside. Below we give an uncertain EIT model in which the tracking error is measured by the lower order moment of the downside, i.e., model (15):
We compare model (14) with model (15), and get the following result.
Theorem 3
With the optimal portfolio produced by model (15), either the expected return is no bigger or the tracking error is higher than that with the optimal portfolio produced by model (14).
Proof
Let \(Obj_{14}\) and \(Obj_{15}\) denote the objective functions of models (14) and (15), respectively, and let \({{\textbf{X}}}_{14}\) and \({{\textbf{X}}}_{15}\) be the corresponding optimal solutions. Two cases are possible. (i) If \({{\textbf{X}}}_{15}\) does not satisfy the constraints in model (14), then \({{\textbf{X}}}_{15}\) is infeasible for model (14). This means that the tracking error is higher than that of model (14). (ii) Otherwise, \({{\textbf{X}}}_{15}\) is feasible for model (14). As the two models differ only in the tracking error measure and \({{\textbf{X}}}_{14}\) is optimal for model (14), we get \(Obj_{15}({{\textbf{X}}}_{15})\le Obj_{14}({{\textbf{X}}}_{14})\). This means that the expected return is no bigger than that of model (14). \(\square \)
Theorem 3 implies that the expected return of the model with higher order moment may be greater than that of the model with lower order moment, which gives the advantage of higher-order moment as the tracking error measure in the EIT problems.
Then the comparison is conducted between models with and without realistic constraints. We compare models (13) and (14), and obtain the following result.
Theorem 4
The optimal objective value of the uncertain EIT model with realistic constraints is no larger than that of the model without realistic constraints.
Proof
Suppose \({{\textbf{X}}}_{13}\) and \({{\textbf{X}}}_{14}\) are the optimal solutions of models (13) and (14), respectively, and their objective functions are \(Obj_{13}\) and \(Obj_{14}\). Obviously, \({{\textbf{X}}}_{14}\) is a feasible solution of model (13) but \({{\textbf{X}}}_{13}\) is not necessarily a feasible solution of model (14). Since \({{\textbf{X}}}_{13}\) is an optimal solution of model (13) and \({{\textbf{X}}}_{14}\) is a feasible one, we get \(Obj_{13}({{\textbf{X}}}_{14})\le Obj_{13}({{\textbf{X}}}_{13})\), with strict inequality whenever \({{\textbf{X}}}_{14}\) is not itself optimal for model (13). Since the objective functions of model (13) and model (14) are the same, \(Obj_{13}({{\textbf{X}}}_{14})=Obj_{14}({{\textbf{X}}}_{14})\). Therefore, \(Obj_{14}({{\textbf{X}}}_{14})\le Obj_{13}({{\textbf{X}}}_{13})\). Thus, Theorem 4 is proved. \(\square \)
Though the optimal objective value of model (13) is relatively big, there do exist some realistic constraints in the real investment. Therefore, it is very necessary to consider realistic constraints in modeling.
3.5 The deterministic form
To solve model (14), we give the deterministic form of the model below.
Theorem 5
Suppose stock returns \(\xi _{i}\) and the benchmark return \(r_{\textrm{I}}\) have regular uncertainty distributions \(\Phi _{i}, i=1,2,\ldots ,n,\) and \(\Phi _{I}\), respectively. Then model (14) is equivalent to the following model:
Proof
(i) Since \(x_{i}>0\) and \(\xi _{i}\) have regular uncertainty distributions \(\Phi _{i}\), according to Theorem 1, the inverse uncertainty distribution of \(\sum ^{n}_{i=1}x_{i}\xi _{i}\) is \(\sum ^{n}_{i=1}x_{i}\Phi ^{-1}_{i}(\alpha ).\) Then according to Eq. (8), we have
(ii) Let \(\eta =\sum ^{n}_{i=1}x_{i}\xi _{i}-r_{\textrm{I}}\) and let \(\Psi \) denote the uncertainty distribution of \(\eta \). Since we suppose that \(\xi _{i}\) is independent of \(r_{\textrm{I}}\), according to Theorem 1, the inverse uncertainty distribution of \(\eta \) is \(\Psi ^{-1}(\alpha )=\sum ^{n}_{i=1}x_{i}\Phi _{i}^{-1}(\alpha )-\Phi _{I}^{-1}(1-\alpha )\). Then according to Eq. (10), we have
where \(\Psi ^{-1}(\alpha )=\sum _{i=1}^{n}x_{i} \Phi ^{-1}_{i}(\alpha )-\Phi _{I}^{-1}(1-\alpha )\). \(\square \)
When stock returns all take normal uncertainty distributions, we further give the deterministic form of model (16).
Theorem 6
Suppose stock returns \(\xi _{i}\) and the benchmark return \(r_{\textrm{I}}\) take normal uncertainty distributions, i.e., \(\xi _{i}\sim \mathcal {N} (e_{i}, \sigma _{i})\) and \(r_{\textrm{I}} \sim \mathcal {N} (e_{I}, \sigma _{I})\), respectively. Then model (16) can be transformed into the following form:
Proof
(i) As \(x_{i}>0\) and \(\xi _{i}\) take normal uncertainty distributions, according to Theorem 2, we can prove that \(\sum ^{n}_{i=1}x_{i}\xi _{i}\) takes a normal uncertainty distribution, i.e.,
\[\sum ^{n}_{i=1}x_{i}\xi _{i}\sim \mathcal {N}\left( \sum ^{n}_{i=1}x_{i}e_{i},\ \sum ^{n}_{i=1}x_{i}\sigma _{i}\right) .\]
For a normal uncertain variable \(\xi \sim \mathcal {N} (e, \sigma )\), we can calculate that its expected value is \(E[\xi ]=e\). So we have \(E\left[ \sum ^{n}_{i=1}x_{i}\xi _{i} \right] =\sum _{i=1}^{n} x_{i}e_{i}\) and \(E[r_{\textrm{I}}]=e_{I}\).
(ii) Similarly, according to Theorem 2, we can prove that \(\sum ^{n}_{i=1}x_{i}\xi _{i}-r_{\textrm{I}}\) also takes a normal uncertainty distribution, i.e.,
\[\sum ^{n}_{i=1}x_{i}\xi _{i}-r_{\textrm{I}}\sim \mathcal {N}\left( \sum ^{n}_{i=1}x_{i}e_{i}-e_{I},\ \sum ^{n}_{i=1}x_{i}\sigma _{i}+\sigma _{I}\right) .\]
So we can get the uncertainty distribution of \(\sum ^{n}_{i=1}x_{i}\xi _{i}-r_{\textrm{I}}\). Then according to Eq. (10), we have
\(\square \)
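The normal case lends itself to a short numerical sketch. The code below is an illustration under stated assumptions rather than the paper's deterministic model (17): it assumes independent normal uncertain returns, so that the excess return \(\sum x_{i}\xi _{i}-r_{\textrm{I}}\) is \(\mathcal {N}(\sum x_{i}e_{i}-e_{I}, \sum x_{i}\sigma _{i}+\sigma _{I})\), and it approximates the downside moment by integrating \(|\min (\Psi ^{-1}(\alpha ),0)|^{m}\) over \(\alpha \in (0,1)\) with a midpoint rule. Function names and all numbers are hypothetical.

```python
import math

def excess_return_params(x, e, sigma, e_I, sigma_I):
    """Parameters (mean, spread) of the normal excess return sum(x_i xi_i) - r_I."""
    mean = sum(xi * ei for xi, ei in zip(x, e)) - e_I
    spread = sum(xi * si for xi, si in zip(x, sigma)) + sigma_I
    return mean, spread

def downside_moment_normal(mean, spread, m, steps=100_000):
    """Approximate E[|min(eta, 0)|^m] for eta ~ N(mean, spread)."""
    c = spread * math.sqrt(3.0) / math.pi
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        a = (k + 0.5) * h
        q = mean + c * math.log(a / (1.0 - a))  # inverse distribution at alpha = a
        if q < 0.0:
            total += (-q) ** m
    return total * h

# Illustrative three-stock portfolio and benchmark.
x = [0.5, 0.3, 0.2]
e = [0.08, 0.06, 0.10]
sigma = [0.20, 0.15, 0.30]
mean, spread = excess_return_params(x, e, sigma, e_I=0.07, sigma_I=0.18)
tracking_error_m3 = downside_moment_normal(mean, spread, m=3)
print(mean, spread, tracking_error_m3)
```

Here `mean` plays the role of the objective value and `tracking_error_m3` the role of the left-hand side of the tracking error constraint for \(m=3\). For a zero-mean excess return with unit spread and \(m=1\), the integral has the closed form \(\sqrt{3}\ln 2/\pi \), which can be used to validate the quadrature.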
4 Solving method based on discrete artificial bee colony algorithm
Model (17) is an NP-hard optimization problem for which it is difficult to obtain an optimal solution with exact algorithms. Due to this difficulty, meta-heuristic algorithms have become popular in this context because they can obtain an acceptable solution within an allowable time for NP-hard optimization problems. As one of the meta-heuristic algorithms, the artificial bee colony (ABC) algorithm (Karaboga 2005; Karaboga and Basturk 2007) is simple to implement and has strong robustness. The literature shows that the ABC algorithm performs well not only in continuous optimization problems but also in many discrete optimization problems such as the economic maintenance planning problem (Chen et al. 2020), the flow shop scheduling problem (Xuan et al. 2019) and the clustering problem (Masdari et al. 2019). Therefore, in this paper, we use the discrete ABC algorithm for solving the proposed optimization model (17).
4.1 Solution feasibility analysis of proposed model
In this section, we conduct the solution feasibility analysis of model (17). The constraints mentioned below are all from model (17).
First, we give the “parameter conditions.” From constraint (III) we get \(\sum ^{n}_{i=1}x_{i} = 1\). Then by summing constraint (V) with respect to i, we have \(\sum ^{n}_{i=1}l_i z_i\le 1 \le \sum ^{n}_{i=1}u_i z_i\). Obviously, if \(\sum ^{n}_{i=1}l_i z_i>1\) or \(\sum ^{n}_{i=1}u_i z_i<1\), then no feasible solution exists. Thus, the “parameter conditions” are as follows:
- (i) \(\sum ^{n}_{i=1}l_i z_i\le 1\).
- (ii) \(\sum ^{n}_{i=1}u_i z_i\ge 1\).
Then, we give the “solution generating conditions.” Here \(w_i\) denotes the number of transaction lots of stock i, written \(\omega _{i}\) in Sect. 3.3. From constraint (IV) we know that the number of \(z_i\) equal to 1 is Q and the number of \(z_i\) equal to 0 is \(n-Q\). From constraints (III) and (V), we get that if \(z_{i}=0\), then \(w_{i}=0\). The upper limit of \(w_i\) is obtained from constraints (II), (III), and (V). So the “solution generating conditions” are as follows:
- (i) \(\sum _{i=1}^{n}z_i = Q\), \(z_i \in \{0,1\}\).
- (ii) If \(z_i = 0\), then \(w_i = 0\).
- (iii) If \(z_i = 1\), then \(w_i \le \frac{B u_i}{N_i P_i}\) and \(w_i\) is a positive integer.
If \(z_i\) and \(w_i\) are generated so that the solution generating conditions are satisfied, the optimization problem to be solved is to obtain a solution that maximizes the objective function under constraints (I), (II), and (V). We improve the exploitation ability of the algorithm by ensuring that the above three constraints are satisfied when generating and updating solutions in our discrete ABC algorithm.
Algorithm 1 | |
---|---|
Random generation of solution | |
Input: problem parameters (n, Q, B, \(N_i\), \(P_i\), \(l_i\), \(u_i\)) | |
Output: a generated solution (\({{\textbf{X}}}\)) | |
01: select randomly Q stocks i, \(i \in I\) | \(/^\star \) I is a set of Q distinct stocks \(^\star /\) |
02: for i = 1 : n | \(/^\star \) Determination of \(z_i\) and \(w_i\) according to the solution generating conditions \(^\star /\) |
03: if \(i \in I\) then | |
04: \(z_i = 1\) | |
05: \(w_i\) is equal to a random positive integer not larger than \(\frac{B u_i}{N_i P_i}\) | |
06: else | |
07: \(z_i = 0\) | |
08: \(w_i = 0\) | |
09: end if | |
10: end for | |
11: generate \({{\textbf{X}}}\) | |
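The random generation step of Algorithm 1 can be sketched in a few lines of Python. This is an illustrative reading of the pseudo-code, not the authors' implementation: a solution is represented as a pair of lists `(z, w)`, the upper bound on \(w_i\) is taken as the integer part of \(\frac{B u_i}{N_i P_i}\), and an error is raised when that bound is below 1, a case the pseudo-code does not address. The function name, the `rng` argument and the data are assumptions.

```python
import math
import random

def random_solution(n, Q, B, N, P, u, rng):
    """Generate (z, w) satisfying the solution generating conditions."""
    chosen = set(rng.sample(range(n), Q))      # Q distinct stocks
    z = [0] * n
    w = [0] * n
    for i in range(n):
        if i in chosen:
            w_max = math.floor(B * u[i] / (N[i] * P[i]))
            if w_max < 1:
                raise ValueError(f"stock {i} cannot hold even one lot")
            z[i] = 1
            w[i] = rng.randint(1, w_max)       # random positive integer <= w_max
        # otherwise z[i] and w[i] stay 0
    return z, w

# Illustrative instance with 9 stocks, 5 of which must be selected.
rng = random.Random(0)
n, Q, B = 9, 5, 20000
N = [100] * n
P = [10.0] * n
u = [0.5] * n
z, w = random_solution(n, Q, B, N, P, u, rng)
print(z, w)
```

The lower limits \(l_i\) appear among the inputs of Algorithm 1 but are not used in the solution generating conditions, so the sketch leaves them to the constraint handling in the fitness evaluation.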
4.2 Discrete ABC algorithm for proposed model
The ABC algorithm is a meta-heuristic algorithm inspired by the behavior of bees. In the ABC algorithm, bees are divided into three classes: employed bees, onlooker bees, and scouts. The employed bees use the previous food source information to find new food sources and share the information with onlooker bees who wait in the hive and look for new food sources according to the information shared by employed bees. The task of scouts is to find a new valuable food source. They randomly look for a food source near the hive. The goal of the whole colony is to find the nectar source with the largest amount of nectar.
Combining the basic principles of the ABC algorithm with the uncertain EIT model, the main steps of the algorithm for solving the proposed model (17) are as follows.
4.2.1 Solution coding method
The optimization problem to be solved is described by the binary variables \(z_i\) and the integer variables \(w_i\), \(i=1,2,\ldots ,n\). In the algorithm, a solution \({{\textbf{X}}}\) is coded as a \(2 \times n\) matrix in which the first row consists of \(z_i\) and the second row consists of \(w_i\) as shown in Fig. 1.
4.2.2 Population initialization
The SN initial solutions to compose the population are randomly generated while satisfying the solution generating conditions in Sect. 4.1. The pseudo-code for generating each solution \(X_j\), \(j=1,2,\ldots ,SN\), in the initialization phase is shown in Algorithm 1.
4.2.3 Crossover and Mutation
In the employed bee phase and the onlooker bee phase, the solution is updated through crossover and mutation. In the algorithm, we implement two-point crossover and two-point mutation and make sure that the solution generating conditions are satisfied. In the two-point crossover process, first, we find the positions i such that the first i entries of the \(z\)-rows of \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_k\) contain the same number of 1’s. Let H be the set of such positions. Now we give an example in which the parameter settings are \(n=9\) and \(Q=5\). As shown in Fig. 2, there are two solutions \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_k\). At the position \(i=3\), both \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_k\) contain two 1’s in the (1:3) slices, so \(i=3\) becomes an element of H. After constructing H in the same way, we select two random positions \(i_1\) and \(i_2\) in H, so that the slices \((i_1+1:i_2)\) of \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_k\) contain the same number of 1’s. Then we replace the slices \((i_1+1:i_2)\) of \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_k\) with each other to generate new solutions \({{\textbf{X}}}_1^{\textrm{new}}\) and \({{\textbf{X}}}_2^{\textrm{new}}\) as shown in Fig. 2.
The two-point mutation method is as follows. Let the set of positions where \(z_{i}\) takes 1 value be \(H_1\) and the set of positions where \(z_{i}\) takes 0 value be \(H_2\) in the solution \({{\textbf{X}}}_j\). First, we select randomly the positions \(i_1\) and \(i_2\) from sets \(H_1\) and \(H_2\), respectively. Then we invert the values of \(z_{i_1}\) and \(z_{i_2}\) at the \(i_1\) and \(i_2\) positions. We give the \(w_{i_1}\) value as 0 and \(w_{i_2}\) as a random positive integer not larger than \(\frac{B u_{i_2}}{N_{i_2} P_{i_2}}\). Repeat this mutation process \(p_\textrm{m}\times n\) times as shown in Fig. 3. From the new solutions obtained by crossover and mutation, the best solution is selected according to the tournament selection rule. Algorithm 2 shows the process of generating a new solution by crossover and mutation.
Algorithm 2 | |
---|---|
Generation of new solution by Crossover and Mutation | |
Input: solution (\(X_j\)) and its neighborhood (\(X_k\)) | |
Output: new solution (\(X_j^{new}\)) | |
01: find the positions i where the numbers of 1s are the same in \(X_j\) and \(X_k\), \(i \in H\) | \(/^\star \) Two-point crossover; H is a set of those positions \(^\star /\) |
02: choose randomly two positions, \(i_1\) and \(i_2\), in H | |
03: generate \(X_1^{new}\) by replacing \((i_1+1:i_2)\) elements of \(X_j\) with ones of \(X_k\) | \(/^\star \) See Fig. 2. \(^\star /\) |
04: generate \(X_2^{new}\) by replacing \((i_1+1:i_2)\) elements of \(X_k\) with ones of \(X_j\) | |
05: \(X' = X_j\) | \(/^\star \) Mutation \(^\star /\) |
06: for \(t = 1 : p_\textrm{m} \times n\) | |
07: select randomly a position \(i_1\) among 1 elements in \(X'\) | |
08: \(z_{i_1} = 0\) | |
09: \(w_{i_1} = 0\) | |
10: select randomly a position \(i_2\) among 0 elements in \(X'\) | |
11: \(z_{i_2} = 1\) | |
12: \(w_{i_2}\) is equal to a random positive integer not larger than \(\frac{B u_{i_2}}{N_{i_2} P_{i_2}}\) | |
13: generate \(X_3^{new}\) by replacing (\(i_1\),\(i_2\)) elements of \(X'\) with the above | \(/^\star \) See Fig. 3. \(^\star /\) |
14: \(X' = X_3^{new}\) | |
15: end for | |
16: take out the best solution \(X_j^{new}\) by tournament selection with \(X_1^{new}\), \(X_2^{new}\) and \(X_3^{new}\) | |
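The crossover and mutation operators described above can be sketched as follows. This is a hypothetical Python reading of the description, not the authors' code: a solution is a pair of lists `(z, w)`, positions in H are 1-based as in the text, \(p_\textrm{m}\times n\) is rounded down to an integer, and the final tournament selection is omitted because it depends on the fitness function. The example vectors are made up for illustration and are not those of Fig. 2.

```python
import math
import random

def crossover_positions(z_j, z_k):
    """1-based positions i where the first i entries of both z rows hold equally many 1's."""
    H, count_j, count_k = [], 0, 0
    for i, (a, b) in enumerate(zip(z_j, z_k), start=1):
        count_j += a
        count_k += b
        if count_j == count_k:
            H.append(i)
    return H

def two_point_crossover(sol_j, sol_k, rng):
    """Swap the slices (i1+1 : i2) of two solutions; cardinality is preserved."""
    (z_j, w_j), (z_k, w_k) = sol_j, sol_k
    H = crossover_positions(z_j, z_k)
    if len(H) < 2:                       # nothing to swap
        return (list(z_j), list(w_j)), (list(z_k), list(w_k))
    i1, i2 = sorted(rng.sample(H, 2))    # 1-based slice (i1+1 : i2) == Python [i1:i2]
    new1 = (z_j[:i1] + z_k[i1:i2] + z_j[i2:], w_j[:i1] + w_k[i1:i2] + w_j[i2:])
    new2 = (z_k[:i1] + z_j[i1:i2] + z_k[i2:], w_k[:i1] + w_j[i1:i2] + w_k[i2:])
    return new1, new2

def two_point_mutation(sol, B, N, P, u, p_m, rng):
    """Repeatedly drop one selected stock and add one unselected stock."""
    z, w = list(sol[0]), list(sol[1])
    for _ in range(int(p_m * len(z))):
        ones = [i for i, v in enumerate(z) if v == 1]
        zeros = [i for i, v in enumerate(z) if v == 0]
        if not ones or not zeros:
            break
        i1, i2 = rng.choice(ones), rng.choice(zeros)
        z[i1], w[i1] = 0, 0
        z[i2] = 1
        w[i2] = rng.randint(1, math.floor(B * u[i2] / (N[i2] * P[i2])))
    return z, w

rng = random.Random(1)
sol_j = ([1, 1, 0, 1, 0, 1, 0, 1, 0], [2, 1, 0, 3, 0, 1, 0, 4, 0])
sol_k = ([0, 1, 1, 0, 1, 1, 1, 0, 0], [0, 2, 2, 0, 1, 3, 1, 0, 0])
H = crossover_positions(sol_j[0], sol_k[0])
new1, new2 = two_point_crossover(sol_j, sol_k, rng)
new3 = two_point_mutation(sol_j, 20000, [100] * 9, [10.0] * 9, [0.5] * 9, 0.3, rng)
print(H, new1, new2, new3)
```

Because the cumulative counts of 1's agree at both cut points, the swapped slices contain the same number of selected stocks, so all three candidates keep exactly Q selected stocks and keep \(w_i=0\) wherever \(z_i=0\).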
4.2.4 Fitness function
The fitness values are used to evaluate the best solution in the tournament selection and to calculate the probability that the onlooker bees select a specific food source (solution) in the onlooker bee phase. The higher the fitness value, the better the solution, and the greater the probability that onlooker bees will choose the solution. The fitness \(\textrm{fit}_j\) of solution \({{\textbf{X}}}_j\) is determined by the following equations:
where \(f({{\textbf{X}}}_j)\) is the objective value at \({{\textbf{X}}}_j\) and \(g_l({{\textbf{X}}}_j)\) is the lth constraint violation value at \({{\textbf{X}}}_j\) which is normalized so that \(\max \limits _j{g_l({{\textbf{X}}}_j)} = 1\). L is the number of constraints (I), (II) and (V) of model (17), that is, \(L = 2n+2\). In our algorithm, all solutions in the population are generated and updated to satisfy the solution generating conditions, so only constraints (I), (II), and (V) need to be considered when calculating the fitness value.
The probability that the onlooker bee goes to the jth food source is determined by the following equation:
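As an illustration only, the sketch below assumes the proportional rule of the standard ABC algorithm (Karaboga and Basturk 2007), under which \(P_j = \textrm{fit}_j / \sum _{s=1}^{SN} \textrm{fit}_s\), and draws a food source by roulette selection. The function names are hypothetical, the fitness values are assumed to be non-negative, and the uniform fallback for an all-zero fitness vector is an added safeguard rather than part of the model.

```python
import random


def selection_probabilities(fitness):
    """Return P_j = fit_j / sum_s fit_s for non-negative fitness values."""
    total = sum(fitness)
    if total <= 0:
        # Degenerate case: fall back to a uniform distribution.
        return [1.0 / len(fitness)] * len(fitness)
    return [f / total for f in fitness]


def roulette_select(probabilities, rng):
    """Draw one index with probability equal to its entry."""
    r = rng.random()
    cumulative = 0.0
    for j, p in enumerate(probabilities):
        cumulative += p
        if r < cumulative:
            return j
    return len(probabilities) - 1  # guard against rounding error
```

A solution with a higher fitness value occupies a wider slice of the unit interval and is therefore visited by more onlooker bees on average.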
4.2.5 Scout bee strategy
In the algorithm, the employed bee of a food source (solution) that has not been updated for more than Limits cycles becomes a scout bee. The scout bee discards the original food source and randomly searches for a new one. The new solution is generated by Algorithm 1 so that it satisfies the solution generating conditions in Sect. 4.1. The pseudo-code of the complete algorithm is listed in Algorithm 3, where MCN denotes the maximum number of cycles.
Algorithm 3 | |
---|---|
Discrete ABC algorithm for proposed model | |
Input: common parameters (MCN, SN, Limits, \(p_\textrm{m}\)), | |
problem parameters (n, Q, etc), | |
objective function (\(f({{\textbf{X}}})\)), | |
constraint violation function (\(g_l({{\textbf{X}}}),l=1,\ldots ,2n+2\)) \(/^\star \) Constraints (I), (II) | |
and (V) of model (17) \(^\star /\) | |
Output: best solution (\({{\textbf{X}}}_{best}\)) | |
01: if parameter conditions are not satisfied, then return \(/^\star \) Parameter conditions | |
are in Section 4.1\(^\star /\) | |
02: for \(j = 1:SN\) \(/^\star \) Initialization of | |
population \(^\star /\) | |
03: generate solution \({{\textbf{X}}}_j\) by using Algorithm 1 | |
04: end for | |
05: \(cycle = 1\) | |
06: while \(cycle \le MCN\) | |
07: for j = 1:SN \(/^\star \) Employed bee phase \(^\star /\) | |
08: select randomly \({{\textbf{X}}}_k\) as neighborhood solution | |
09: generate new solution \({{\textbf{X}}}_j^{new}\) by using Algorithm 2 | |
with \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_k\) | |
10: update/maintain \({{\textbf{X}}}_j\) by tournament selection | |
with \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_j^{new}\) | |
11: end for | |
12: calculate probability \(\{P_s\}\) for onlookers \((s=1:SN)\) \(/^\star \) Onlooker bee phase \(^\star /\) | |
13: for j = 1:SN | |
14: select \({{\textbf{X}}}_i\) for jth onlooker bee by roulette selection | |
with probability \(\{P_s\}\) | |
15: select randomly \({{\textbf{X}}}_k\) as neighborhood | |
16: generate new solution \({{\textbf{X}}}_i^{new}\) by using Algorithm 2 | |
with \({{\textbf{X}}}_i\) and \({{\textbf{X}}}_k\) | |
17: update/maintain \({{\textbf{X}}}_i\) by tournament selection | |
with \({{\textbf{X}}}_i\) and \({{\textbf{X}}}_i^{new}\) | |
18: end for | |
19: if solution \({{\textbf{X}}}^\star \) is not updated more than Limits, \(/^\star \) Scout bee phase \(^\star /\) | |
20: generate new solution \({{\textbf{X}}}^{new}\) by using Algorithm 1 | |
and replace \({{\textbf{X}}}^\star \) with \({{\textbf{X}}}^{new}\) | |
21: end if | |
22: \(cycle = cycle + 1\) | |
23: end while | |
24: take out the best solution \({{\textbf{X}}}_{best}\) |
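The control flow of Algorithm 3 can be sketched in Python as follows. This is a generic skeleton under stated assumptions, not the MATLAB implementation used in the experiments: the callables `new_solution`, `neighbor`, and `fitness` are hypothetical stand-ins for Algorithm 1, Algorithm 2, and the fitness function; the tournament between \({{\textbf{X}}}_j\) and \({{\textbf{X}}}_j^{new}\) is reduced to keeping the one with the higher fitness; fitness values are assumed non-negative; and the toy problem at the end (lot counts whose sum should equal 10) is made up for demonstration.

```python
import random


def discrete_abc(new_solution, neighbor, fitness, sn, mcn, limit, seed=0):
    """Generic ABC loop with employed, onlooker, and scout bee phases."""
    rng = random.Random(seed)
    pop = [new_solution(rng) for _ in range(sn)]
    fit = [fitness(x) for x in pop]
    trials = [0] * sn                       # cycles without improvement
    j0 = max(range(sn), key=lambda j: fit[j])
    best, best_fit = pop[j0], fit[j0]

    def try_improve(j):
        k = rng.choice([i for i in range(sn) if i != j])
        cand = neighbor(pop[j], pop[k], rng)
        cand_fit = fitness(cand)
        if cand_fit > fit[j]:               # keep the better of the two
            pop[j], fit[j], trials[j] = cand, cand_fit, 0
        else:
            trials[j] += 1

    for _ in range(mcn):
        for j in range(sn):                 # employed bee phase
            try_improve(j)
        # Onlooker bee phase: probabilities are fixed before the loop.
        weights = list(fit) if sum(fit) > 0 else [1.0] * sn
        for _ in range(sn):
            i = rng.choices(range(sn), weights=weights)[0]
            try_improve(i)
        for j in range(sn):                 # scout bee phase
            if trials[j] > limit:
                pop[j] = new_solution(rng)
                fit[j] = fitness(pop[j])
                trials[j] = 0
        j_best = max(range(sn), key=lambda j: fit[j])
        if fit[j_best] > best_fit:          # remember the best so far
            best, best_fit = pop[j_best], fit[j_best]
    return best, best_fit


# Toy problem (hypothetical): five lot counts whose sum should equal 10.
def new_solution(rng):
    return [rng.randint(0, 5) for _ in range(5)]


def neighbor(x, x_k, rng):
    child = list(x)                         # never modify the parent
    i = rng.randrange(len(x))
    child[i] = x_k[i]
    return child


def fitness(x):
    return 1.0 / (1.0 + abs(sum(x) - 10))


best, best_fit = discrete_abc(new_solution, neighbor, fitness,
                              sn=10, mcn=50, limit=5)
```

The best solution is stored separately from the population so that it survives even if its food source is later abandoned by a scout bee.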
5 Numerical examples
To clearly illustrate the modeling idea and the designed algorithm, we present some numerical examples. In our examples, stock returns are treated as uncertain variables. Moreover, the algorithms are run in MATLAB and all experiments are conducted on a PC with a 3.6 GHz processor and 8 GB RAM.
5.1 Computational results of the tracking portfolio
5.1.1 Data
An investor wants to track the SSE50 index, whose return distribution is \(\mathcal {N}(0.130, 0.220)\). The parameter m in model (17) is set at 3. According to the prospectuses of three enhanced index funds in China’s securities market (fund codes 005850, 009059, and 161037), the annual tracking error should not exceed 8%, so we set the tolerance level of the tracking error at 8% in our proposed model. The investor selects 10 stocks from the Shanghai Stock Exchange and the Shenzhen Stock Exchange as his/her asset universe. The investment period runs from January 2, 2020 to December 31, 2020. Stock codes and stock prices on January 2, 2020 are shown in Table 2. The annual return distributions of the stocks over the investment period are shown in Table 3. The amount of investment after paying transaction costs is \(B=1,000,000\). The tracking portfolio is supposed to contain 6 stocks, i.e., \(Q=6\), and the minimum proportion of each stock is \(l_{i}=5\%\), \(i = 1, 2,\ldots , 10\). The minimum transaction lot for each stock is \(N_{i}=100\), \(i = 1, 2,\ldots , 10\).
5.1.2 Stability of algorithm
The stability of the computational results depends on the parameter settings, namely the population size (SN), the maximum number of cycles (MCN), the Limits cycles (LC), and the mutation probability (\(p_\textrm{m}\)). A series of experiments is conducted to determine these four parameters, with 20 test runs for each parameter setting. Table 4 gives the minimum, maximum, mean, and standard deviation of the 20 objective values under different parameters, where the best results are highlighted in boldface. According to the results, the best parameter setting is \(\textrm{SN}=50\), \(\textrm{MCN}=10,000\), \(\textrm{LC}=30\), and \(p_\textrm{m}=0.2\). Moreover, Table 4 shows that the standard deviations of the 20 optimal objective values are very small for every parameter setting, which indicates the robustness of the proposed algorithm.
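The four summary statistics reported for each parameter setting can be computed as in the following minimal sketch. The function name `summarize_runs` is hypothetical, and the sample standard deviation (divisor k - 1 for k runs) is an assumption, since the text does not state which estimator is used.

```python
import statistics


def summarize_runs(objective_values):
    """Return min, max, mean, and sample standard deviation of repeated runs."""
    return {
        "min": min(objective_values),
        "max": max(objective_values),
        "mean": statistics.mean(objective_values),
        "std": statistics.stdev(objective_values),  # needs at least 2 runs
    }
```

A small standard deviation relative to the mean indicates that independent runs of the stochastic search end at nearly the same objective value.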
5.1.3 The optimal tracking portfolio
In this section, we provide the computational result of model (17). The model is solved 20 times, and the best result is reported. Table 5 shows that the optimal tracking portfolio contains stocks 1, 2, 3, 8, 9, and 10, and the corresponding transaction lots are 96, 32, 66, 5, 582, and 61, respectively. The expected return of the optimal tracking portfolio is \(E[r_{\textrm{P}}^{*}]=0.1870\), and the variance is \(V[r_{\textrm{P}}^{*}]=0.0654\). The total investment is 918,289 RMB.
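The link between transaction lots and the invested amount can be illustrated with a short sketch. It assumes that the amount held in stock i is \(w_i N_i P_i\) and that proportions are measured against the budget B, which is consistent with the lot bound \(\frac{B u_{i}}{N_{i} P_{i}}\) used in Algorithm 2. The function name `portfolio_summary` and the two-stock figures in the usage example are hypothetical and are not taken from Table 2 or Table 5.

```python
def portfolio_summary(lots, lot_sizes, prices, budget):
    """Return the invested amount and each stock's proportion of the budget."""
    amounts = [w * n * p for w, n, p in zip(lots, lot_sizes, prices)]
    total = sum(amounts)
    proportions = [a / budget for a in amounts]
    return total, proportions


# Made-up example: 10 and 20 lots of 100 shares at prices 50.0 and 12.5.
total, proportions = portfolio_summary(
    lots=[10, 20], lot_sizes=[100, 100], prices=[50.0, 12.5], budget=100_000
)
```

Because lots are integers, the invested amount generally falls somewhat below the budget, as it does in the reported portfolio.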
5.2 Sensitivity analysis
In this section, we change some parameters to test their effects on the investment decision. Only one parameter is changed at a time. As before, each experiment is run 20 times and the best result is selected.
(i) Sensitivity analysis w.r.t. the parameter m. According to Eq. (10), changing m means changing the order of the downside moment. When m varies from 1 to 3, model (17) yields three different models. Figure 4 shows the relationship between the objective value and the tracking error tolerance level \(D_m\) in these three models. The abscissa represents \(D_m\) and the ordinate represents the objective value. When model (17) has no solution, the objective value is displayed as zero. As seen in Fig. 4, whenever the model is solvable, the value of m does not affect the objective value. One point is worth noting, however. When \(m=1\), model (17) has a solution in the interval of \(D_{1}\in [0.16,0.2]\); when \(m=2\), model (17) has a solution in the interval of \(D_{2}\in [0.08,0.2]\); when \(m=3\), model (17) has a solution in the interval of \(D_{3}\in [0.04,0.2]\). Hence, when \(m=3\), model (17) is solvable over a larger range of the tracking error tolerance. This is why we use the higher-order moment, which makes the proposed model more widely applicable. However, increasing m also increases the complexity of solving the model, so m is set to 3 in our experiments.
(ii) Sensitivity analysis w.r.t. the parameter \(D_{3}\). When \(m=3\), the investor’s tracking error tolerance level is represented by \(D_{3}\) in model (17). We change the value of \(D_{3}\) from 0.05 to 0.15. Figure 5 shows the relationship between the tracking error tolerance level and the expected return of the optimal tracking portfolio. When \(D_{3}\) increases from 0.05 to 0.11, the expected return increases; when \(D_{3}\) increases from 0.11 to 0.15, the expected return remains unchanged. Figure 5 implies that, within a certain range, the tracking portfolio’s expected return increases with the tracking error tolerance level. Once the tolerance level exceeds that range, the expected return no longer changes.
(iii) Sensitivity analysis w.r.t. the parameter Q. The parameter Q represents the number of stocks that the investor wants to hold in the tracking portfolio. We change the value of Q from 4 to 8. The corresponding results are shown in Table 6. When \(Q=4\), the optimal tracking portfolio’s expected return and variance are \(E[r_{\textrm{P}}^{*}]=0.1904\) and \(V[r_{\textrm{P}}^{*}]=0.0664\). When Q increases to 8, they are \(E[r_{\textrm{P}}^{*}]=0.1839\) and \(V[r_{\textrm{P}}^{*}]=0.0646\). Both the return and the risk therefore decrease as Q increases. This is consistent with the result that diversification reduces risk and also reduces return.
(iv) Sensitivity analysis w.r.t. the parameter B. This parameter denotes the upper limit of the budget after paying transaction costs. We calculate the expected return and variance of the tracking portfolio at different levels of B and show them in Table 7. As seen in Table 7, the tracking portfolio’s expected return and variance change little when B increases. These experiments indicate that the parameter B hardly affects the optimal tracking portfolio’s expected return and variance.
(v) Sensitivity analysis w.r.t. the parameters \(l_i\) and \(u_i\). These two parameters denote the minimum and maximum proportions allocated to stock i if it is held. Table 8 shows how \(l_i\) affects the tracking portfolio. In Table 8, when \(l_i=0.01\), the return and risk of the tracking portfolio are \(E[r_{\textrm{P}}^{*}]=0.1898\) and \(V[r_{\textrm{P}}^{*}]=0.0662\); when \(l_i\) increases to 0.15, the return and risk of the tracking portfolio are \(E[r_{\textrm{P}}^{*}]=0.1795\) and \(V[r_{\textrm{P}}^{*}]=0.0634\). As the minimum proportion allocated to a held stock increases, the return and risk of the tracking portfolio both decrease. Table 9 shows how \(u_i\) affects the tracking portfolio. As seen in Table 9, when \(u_i\) increases from 0.2 to 0.4, the return of the tracking portfolio increases from \(E[r_{\textrm{P}}^{*}]=0.1813\) to \(E[r_{\textrm{P}}^{*}]=0.1871\), and the risk increases from \(V[r_{\textrm{P}}^{*}]=0.0639\) to \(V[r_{\textrm{P}}^{*}]=0.0655\). However, when \(u_i\) increases from 0.4 to 0.9, the return and risk of the tracking portfolio are unchanged. This is because constraint (V) is active when \(u_i\in (0.2, 0.4)\) and inactive when \(u_i\in (0.4, 0.9)\).
5.3 Applicability and effectiveness in large-scale problems
To test the applicability and effectiveness of our algorithm in large-scale problems, we conduct further experiments. In Sections 5.1 and 5.2, we solved the model for 10 stocks. Here we solve the model for 50 stocks. Table 10 shows the uncertain returns of the 50 stocks, and Table 11 shows the results of the experiments. The problem is solved with the restriction that at least 25 stocks must be held, with an allocation of at least 2% in each. Table 11 shows the optimal tracking portfolios for different values of the tracking error tolerance. When \(D_3\) increases to 0.15 or greater, the objective value converges to nearly the same value. These results indicate that our algorithm remains effective in large-scale problems.
6 Conclusions
In reality, financial markets are often affected by unexpected events, which can lead to historical data failing to predict the future effectively. In this case, we employ uncertain variables to describe the stock returns.
Under the framework of uncertainty theory, this paper has studied the EIT problem. We have proposed an uncertain EIT model with a higher-order moment of the downside, in which the realistic constraints of minimum transaction lots, investment budget, and the limits on holdings and cardinality are considered simultaneously. To solve the proposed model, we have given its deterministic equivalents. We have then developed a meta-heuristic algorithm to solve the resulting nonlinear integer programming problem. Experimental results show that (i) the higher-order moment makes the proposed model more widely applicable; (ii) the proposed model and the corresponding algorithm are effective.
Data availability
Enquiries about data availability should be directed to the authors.
References
Andriosopoulos K, Nomikos N (2014) Performance replication of the spot energy index with optimal equity portfolio selection: evidence from the UK, US and Brazilian markets. Eur J Oper Res 234(2):571–582
Beasley JE, Meade N, Chang TJ (2003) An evolutionary heuristic for the index tracking problem. Eur J Oper Res 148(3):621–643
Chen W, Li D, Lu S, Liu W (2019) Multi-period mean-semivariance portfolio optimization based on uncertain measure. Soft Comput 23(15):6231–6247
Chen Z, Zhang L, Tian G, Nasr EA (2020) Economic maintenance planning of complex systems based on discrete artificial bee colony algorithm. IEEE Access 8:108062–108071
Filippi C, Guastaroba G, Speranza M (2016) A heuristic framework for the bi-objective enhanced index tracking problem. Omega 65:122–137
Gao Y (2011) Shortest path problem with uncertain arc lengths. Comput Math Appl 62(6):2591–2600
Guastaroba G, Mansini R, Ogryczak W, Speranza MG (2016) Linear programming models based on omega ratio for the enhanced index tracking problem. Eur J Oper Res 251(3):938–956
Guastaroba G, Speranza MG (2012) Kernel search: an application to the index tracking problem. Eur J Oper Res 217(1):54–68
Huang X (2010) Portfolio analysis: from probabilistic to credibilistic and uncertain approaches. Springer-Verlag, Berlin
Huang X, Di H (2020) Uncertain portfolio selection with mental accounts. Int J Syst Sci 51(12):2079–2090
Huang X, Yang T (2020) How does background risk affect portfolio choice: An analysis based on uncertain mean-variance model with background risk. J Bank Financ 111:105726
Huang X, Zhao T (2014) Mean-chance model for portfolio selection based on uncertain measure. Insur: Math Econ 59:243–250
Jorion P (2002) Enhanced index funds and tracking error optimization. Unpublished Paper, Graduate School of Management, University of California at Irvine
Jorion P (2003) Portfolio optimization with tracking-error constraints. Financ Anal J 59(5):70–82
Karaboga D (2005) An idea based on honey bee swarm for numerical optimization. Erciyes University, Engineering Faculty, Computer Engineering Department, Tech. rep
Karaboga D, Basturk B (2007) A powerful and efficient algorithm for numerical function optimization: artificial bee colony (ABC) algorithm. J Glob Optim 39:459–471
Lejeune MA (2012) Game theoretical approach for reliable enhanced indexation. Decis Anal 9(2):146–155
Liu B (2007) Uncertainty theory. Springer-Verlag, Berlin
Liu B (2010) Uncertainty theory: a branch of mathematics for modeling human uncertainty. Springer-Verlag, Berlin
Liu Y, Liu B (2022) Residual analysis and parameter estimation of uncertain differential equations. Fuzzy Optim Decis Mak 23:1–18
Mao JC (1970) Survey of capital budgeting: theory and practice. J Financ 25(2):349–360
Maringer D, Parpas P (2009) Global optimization of higher order moments in portfolio selection. J Glob Optim 43(2):219–230
Masdari M, Barshande S, Ozdemir S (2019) CDABC: chaotic discrete artificial bee colony algorithm for multi-level clustering in large-scale WSNs. J Supercomput 75:7174–7208
Ning Y, Ke H, Fu Z (2015) Triangular entropy of uncertain variables with application to portfolio selection. Soft Comput 19(8):2203–2209
Qin Z, Kar S, Zheng H (2016) Uncertain portfolio adjusting model using semiabsolute deviation. Soft Comput 20(2):717–725
Roll R (1992) A mean/variance analysis of tracking error. J Portf Manag 18(4):13–22
Rudolf M, Wolter HJ, Zimmermann H (1999) A linear model for tracking error minimization. J Bank Financ 23(1):85–103
Sant’Anna LR, Filomena TP, Caldeira JF (2017) Index tracking and enhanced indexing using cointegration and correlation with endogenous portfolio selection. Q Rev Econ Financ 65:146–157
Strub O, Baumann P (2018) Optimal construction and rebalancing of index-tracking portfolios. Eur J Oper Res 264(1):370–387
Wang X, Huang X (2019) A risk index to model uncertain portfolio investment with options. Econ Model 80:284–293
Xuan H, Zhang H, Li B (2019) An improved discrete artificial bee colony algorithm for flexible flowshop scheduling with step deteriorating jobs and sequence-dependent setup times. Math Probl Eng 2019:1–13
Xue L, Di H, Zhao X, Zhang Z (2019) Uncertain portfolio selection with mental accounts and realistic constraints. J Comput Appl Math 39:42–52
Yao K (2015) A formula to calculate the variance of uncertain variable. Soft Comput 19:2947–2953
Ye T, Liu B (2022) Uncertain hypothesis test for uncertain differential equations. Fuzzy Optim Decis Mak 22:1–17
Ye T, Liu B (2022) Uncertain significance test for regression coefficients with application to regional economic analysis. Commun Stat-Theory Methods 52:1–18
Yu L, Zhang S, Zhou XY (2006) A downside risk analysis based on financial index tracking models. In: Stochastic finance, Springer, pp 213–236
Zhang P (2019) Multiperiod mean absolute deviation uncertain portfolio selection with real constraints. Soft Comput 23(13):5081–5098
Funding
This work was supported by Fundamental Research Funds for the Central Universities No. FRF-MP-20-12.
Author information
Contributions
TY involved in conceptualization, methodology, writing—original draft, visualization. XH took part in methodology, writing—review and editing, validation, supervision, funding acquisition. KRH involved in software, visualization, validation, writing—review and editing.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Ethical approval and Informed Consent
This article does not contain any studies with human participants or animals performed by any of the authors.
About this article
Cite this article
Yang, T., Huang, X. & Hong, K.R. A new uncertain enhanced index tracking model with higher-order moment of the downside. Soft Comput 27, 11379–11394 (2023). https://doi.org/10.1007/s00500-023-08265-y
DOI: https://doi.org/10.1007/s00500-023-08265-y