Introduction

Basic fund investment strategies are classified into active and passive strategies. In an active strategy, investors believe in the inefficient market hypothesis, i.e., that market prices do not accurately reflect real values. Thus, investors aim to beat the market through their experience, in-depth research, financial forecasting, and stock analysis [19, 34, 35]. In contrast, investors who adopt a passive strategy believe in the efficient market hypothesis (EMH), which states that market values accurately include and reflect all information at all times [40, 41, 51]. Under the EMH assumption, investors believe that it is difficult to outperform the market, and they assume that the market posts positive returns over time. As a result, such investors adopt a buy-and-hold portfolio strategy over the long term with minimal trading activity, and they seek to replicate the performance of the chosen benchmark market index as closely as possible.

Recently, the significance of passive management strategies has increased tremendously. Three major motivations underlie this phenomenon [5, 11, 54, 61]. First, benchmark indices have risen continually in the past, so passive investors have a good chance of earning a reasonable return. Second, fund managers have difficulty beating the market in the long term; the longer the selected time frame, the more likely it is that investors underperform the market. Third, active management incurs high fixed costs, while passive management incurs much lower ones.

The aforementioned reasons have prompted investors to shift from active investment strategies to passive investment strategies [1]. Passive investment can be achieved through different instruments, such as index funds, passive mutual funds, and passive exchange-traded funds. The principal approach behind these instruments is index tracking. This method is designed to replicate the performance of a particular market index; it can be viewed as a matching between the tracking portfolio and the actual market index.

The most straightforward approach is full replication, which holds all stocks in the index with their corresponding weights. A tracking error of zero is achieved, as this method utilizes all stocks in the same proportions as the chosen benchmark index. However, full replication does not work well in practice, as it brings several drawbacks. Imagine an investor who purchases every stock in the Dow Jones Wilshire 5000 Total Stock Market Index or the Standard and Poor's 500. First, the cost becomes relatively expensive for small assets under management (AUM), which significantly diminishes the investment return. Second, the portfolio contains various small, illiquid stocks that are difficult to sell for cash without a considerable loss; the approach thus damages the return and incurs relatively high costs. Third, when rebalancing the tracking portfolio, the proportions of the whole portfolio must be reassessed, so more fund management effort is needed.

The second index-tracking method is partial replication. This approach utilizes a small number of stocks to approximately reproduce the performance of a chosen market index. Although the tracking is no longer a perfect match, the costs are lowered, and the process of rebalancing portfolio weights is simplified. Unlike full replication, partial replication involves lower transaction costs and can avoid purchasing illiquid stocks, as only a small number of stocks are employed. In addition, this method reassesses only part of the tracking portfolio's proportions. Thus, partial replication incurs lower rebalancing costs and is less complicated than full replication.

Given these considerations, the greatest challenge in index tracking is the tradeoff between tracking accuracy and cost. When the portfolio includes a large number of assets, the cost becomes high. A common way to handle this problem is partial replication of market performance without using all assets. However, sparsity and other practical constraints bring complexity, as they form a discontinuous global function optimization problem. A metaheuristic is preferable for dealing with index tracking, as traditional local methods may become trapped in local solutions. The contributions of this paper are summarized as follows:

  • A framework is proposed for the comprehensive index-tracking problem (ITP) based on metaheuristics.

  • The comprehensive ITP is addressed through a fully global method instead of other reviewed suboptimal global-local methods.

  • Competitive simulation performance results are obtained on a benchmark tracking index.

  • The proposed framework can be extended with other practical constraints and via the application of other metaheuristics.

In this paper, we present a metaheuristic-based framework to address the enhanced ITP (EITP) with various practical constraints. We propose a solution strategy that incorporates a quantitative tracking model, metaheuristic procedure, lookback approach, and constraint validator. This method aims to reduce the complexity of the considered problem, presents an efficient model, and systematizes the process. This paper focuses on the US market.

The structure of this paper is organized as follows. The next section discusses a literature review and related works. The third section presents the formulation of the EITP with various practical constraints. The fourth section presents the proposed framework. The fifth section presents the simulation results and discussion. The last section provides the conclusion of the paper.

Related work

We first define some notations that we use throughout the paper. These notations are presented as follows:

  • \(E_{t}\) is the measured tracking error.

  • \(r_{p}\) denotes the return of the tracking portfolio, computed as \( \sum _{n=1}^{N} r_{p(t, n)} \odot w_{(t, n)} \) at time t over the current stocks n; it is expressed as \([r_{p(1)}, \ldots , r_{p(T)}]\).

  • \(w_{(t, n)}\) denotes the weight to be optimized for the n stocks, repeated over time t.

  • \( r_{b}\) denotes the return of the tracking benchmark at time t, expressed as \([r_{b(1)}, \ldots , r_{b(T)}]\).

  • T is the maximum number of trading days in the given period, and \(t=1 \ldots T\).

  • N is the total number of available assets in the tracking benchmark, and \(n=1 \ldots N\).

  • \(\odot \) denotes the Hadamard product [25, 44].

The well-known modern portfolio theory (MPT) was an important breakthrough in personal investing, and it provides insight into index tracking. The mean-variance model was the first approach in MPT to discover the efficient frontier for a tradeoff between the expected return and risk [42]. Some reviewed papers are based on portfolio optimization in multiobjective (MO) test problems, where the overall return and financial risk are optimized [22, 31, 32, 36, 60]. The ITP has been widely studied by different researchers and financial analysts. The objective is to minimize the difference between the chosen benchmark index and the tracking portfolio. The artificial index should be as similar to the benchmark index value as possible. This problem is handled with a lookback approach in which historical price information provides hints about the future. The empirical index-tracking error equation is shown as follows:

$$\begin{aligned} \begin{aligned} \min E_{t}&~= \frac{1}{T}|| r_{p} - r_{b} ||_2^2 \\ \text {s.t.}&~\sum _{n=1}^{N} w_n \le 1 . \end{aligned} \end{aligned}$$
(1)

The portfolio weights must be optimized to approximately replicate the market performance. The out-of-sample metric is used to estimate the performance of the proposed framework, as the optimization of index tracking is based on historical information.

In addition, an investor is also concerned about the return of their portfolio. Enhanced index tracking no longer involves a single-factor model, as it tries to achieve greater returns than those of the benchmark index by sacrificing a degree of tracking error. Evaluating a solution thus involves measuring both the tracking error and the excess return. The square root of the tracking error, \(E_{s}\), is used so that its magnitude is comparable with that of the excess return, as shown as follows:

$$\begin{aligned} \min E_{s} = \sqrt{ \frac{1}{T}|| r_{p} - r_{b} ||_2^2 }. \end{aligned}$$
(2)

The excess return \(E_{e}\) is shown as follows:

$$\begin{aligned} \max E_{e} = \frac{1}{T} \sum _{t=1}^{T} \left( r_{p(t)} - r_{b(t)} \right) . \end{aligned}$$
(3)

This function generalizes the tracking error and excess return into a single minimization function, and it is shown as follows:

$$\begin{aligned} \begin{aligned} \min&~ \lambda (E_{s} ) - (1 - \lambda )E_{e} \\ \text {s.t.}&~ \sum _{n=1}^{N} w_n \le 1 . \end{aligned} \end{aligned}$$
(4)

The tradeoff between the tracking error and excess return is determined by \(\lambda \). In addition, the logarithmic return is applied in this paper instead of the arithmetic return, as it is more suitable for estimating the tracking portfolio.
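To make the objective concrete, the following minimal Python sketch evaluates Eqs. (2)-(4) for given return vectors. The function and variable names are illustrative, and the portfolio returns are assumed to have already been computed from the weights:

```python
import numpy as np

def enhanced_objective(r_p, r_b, lam=0.5):
    """Combined enhanced index-tracking objective (Eqs. 2-4).

    r_p, r_b : arrays of portfolio and benchmark log returns over T days.
    lam      : tradeoff between tracking error and excess return.
    """
    T = len(r_b)
    e_s = np.sqrt(np.sum((r_p - r_b) ** 2) / T)  # root tracking error, Eq. (2)
    e_e = np.sum(r_p - r_b) / T                  # mean excess return, Eq. (3)
    return lam * e_s - (1.0 - lam) * e_e         # minimization objective, Eq. (4)
```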

A simple example explains why the logarithmic return is better than the arithmetic return. When a stock price rises from 50 to 100, the arithmetic return is 1.0, and the logarithmic return is approximately 0.69. When the price decreases from 100 to 50, the arithmetic return is − 0.5, and the logarithmic return is approximately − 0.69. Based on this observation, arithmetic returns do not assign equal magnitudes to equal and opposite price changes. As a result, arithmetic returns probably overestimate excess returns, whereas logarithmic returns give the same price change magnitude for both positive and negative movements [26, 27, 50]. The formula of an arithmetic return is shown as follows:

$$\begin{aligned} r _{a} = \frac{P_{t} - P_{t-1} }{P_{t-1}}. \end{aligned}$$
(5)

The formula of a logarithmic return is shown as follows:

$$\begin{aligned} r_{l} = \ln \left( \frac{P_{t} }{P_{t-1}} \right) . \end{aligned}$$
(6)

Note that the number of days used for the return is one less than the number of days used for the closing price. The returns of the tracking portfolio \(r_{p(t,n)}\) and the benchmark \(r_{b(t)}\) are computed with \(r_{l}\).
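As a quick sanity check, assuming numpy, the following snippet reproduces the 50-to-100-to-50 example above:

```python
import numpy as np

prices = np.array([50.0, 100.0, 50.0])
arithmetic = np.diff(prices) / prices[:-1]   # [ 1.0, -0.5 ]          Eq. (5)
logarithmic = np.diff(np.log(prices))        # [ 0.693..., -0.693...] Eq. (6)
print(arithmetic, logarithmic)
```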

The first approach to dealing with sparse index tracking is a two-step procedure that decomposes the problem into stock selection and weight allocation. The manual process was an early stock selection method based on financial analysis tools, such as composition features [30]. Later, researchers focused on automating the stock selection process, as automated approaches are superior to manual or random methods, and various evolutionary heuristic and clustering algorithms have been applied [9, 15, 20, 29, 38, 52]. Once the stocks are determined, the weight allocation step is addressed by an exact method such as quadratic programming. However, this procedure is a suboptimal global-local method that accomplishes the two steps separately. A second approach, the joint approach, was introduced to address this problem. It integrates the two-step procedure into a single process that is optimized through evolutionary heuristics or stochastic neural networks [3, 21, 37, 46, 63, 65]. The last approach is to reformulate the sparse ITP into an alternative approximate function that is optimized through mixed-integer programming (MIP) [4, 10, 43, 45]. This approach is complex, and the approximate function is not entirely equivalent to the original function: the original problem is discontinuous and nonconvex, whereas it is approximated by a function that is convex and differentiable. More details about the MIP approach can be found in [7, 8].

Several types of approaches have been reviewed, and the proposed framework belongs to the second category. A metaheuristic is the principal method for optimizing the joint problem in this framework, and the joint formulation possesses several advantages. First, the joint method is a fully global optimization technique, whereas the two-step procedure is a suboptimal global-local approach in which it is unclear whether the obtained solution is a nearby local solution or a global one. Second, the joint problem is equivalent to the original problem, which implies that further constraints can be applied without reformulating the whole equation. Unlike the third approach, where each added objective or constraint increases the complexity of the approximate function, adding more considerations is not complex for the joint approach. Third, the joint method is a direct approach, as joint optimization depends only on the selected metaheuristic. A desirable solution is expected at an acceptable computational cost.

Metaheuristics

A metaheuristic is a high-level, problem-independent procedure that provides a collection of guidelines for developing heuristic optimization algorithms [55]. Sparsity and other constraints bring complexity to the problem, making it a discontinuous and nondifferentiable function that is difficult to address with an exact method. Although metaheuristics do not guarantee globally optimal solutions, they converge to good approximate solutions and retain a possibility of acquiring the globally optimal solution. In addition, the performance of metaheuristics is often superior to that of traditional methods [6, 48, 56]. For instance, metaheuristics have been successfully applied to a wide range of fields, such as recommender systems, job scheduling, fake news stance detection, feature selection, fuzzy shortest paths, and electric vehicle routing [39, 49, 53, 62, 64, 66]. Thus, a metaheuristic is adopted in the proposed framework to address the comprehensive ITP.

A genetic algorithm (GA) was developed by Holland and his students [24]. It was inspired by Darwin's theory of natural selection based on the "survival of the fittest" rule, and it was seemingly the first approach to put this strategy into practice. The GA is a stochastic search method that simulates the mechanics of biological evolution. The search operators include reproduction, crossover, and mutation. First, reproduction maintains better solutions through selection pressure on the set of candidate solutions. Then, crossover swaps the information of two parents to generate offspring. Finally, the mutation operator is an uncommon random modification that maintains genetic diversity.

Particle swarm optimization (PSO) was developed by Kennedy and Eberhart [28]. It simulates social behaviors such as bird flocking and fish schooling. The swarm searches for food in multidimensional space through its velocity, personal-best position, and global-best position. However, PSO suffers from premature convergence. Some researchers have combined Gaussian mutation with PSO to maintain the population diversity and escape local minima [23, 57].

A competitive swarm optimizer (CSO) was developed by Cheng and Jin for large-scale optimization [12]. Although this algorithm was inspired by PSO, their concepts and theories are dissimilar. The CSO operates based on a random pairwise competition mechanism with the current swarm to generate a set of losers and winners among the particles. The swarm updates its position through the competition mechanism instead of the global-best position or personal-best position. The losers learn from the winners, and the winners are retained in the next generation.

Differential evolution (DE) was developed by Storn and Price [58]. This heuristic algorithm is simple and efficient, demands only a few problem parameters, and combines the mutation, crossover, and selection operators. First, mutation generates candidate solutions by combining existing solutions. Next, crossover determines the new vector based on a predefined crossover rate. Then, selection retains the solution with the best fitness value for the next iteration. The algorithm has two update schemes: scheme 1 includes a crossover rate and an amplification factor, and scheme 2 introduces an additional control term to incorporate the current best position.
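For illustration, a minimal sketch of one DE generation under scheme 1 (rand/1 mutation, binomial crossover, greedy selection for minimization) is given below. The interface is illustrative rather than the exact implementation used in the simulations:

```python
import numpy as np

def de_step(pop, fit, f, F=1.0, CR=0.3, rng=None):
    """One DE generation, scheme 1: rand/1 mutation + binomial crossover."""
    rng = rng or np.random.default_rng()
    P, D = pop.shape
    for i in range(P):
        r1, r2, r3 = rng.choice([j for j in range(P) if j != i], 3, replace=False)
        mutant = pop[r1] + F * (pop[r2] - pop[r3])   # differential mutation
        mask = rng.random(D) < CR
        mask[rng.integers(D)] = True                 # keep at least one mutant gene
        trial = np.where(mask, mutant, pop[i])       # binomial crossover
        f_trial = f(trial)
        if f_trial <= fit[i]:                        # greedy selection
            pop[i], fit[i] = trial, f_trial
    return pop, fit
```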

These metaheuristics are applied in the proposed framework, and their performance is compared via objective measurement. Metaheuristics can be viewed as alternative ways to address this nondeterministic polynomial-time (NP)-hard problem through their search abilities and gradient-free optimization process. A good solution is expected when addressing the comprehensive ITP with practical constraints.

Problem formulation

Enhanced index tracking and practical constraints

To better deal with the EITP, the temporal significance of the data is considered in the optimization process. The weights of the time series increase steadily across the training dataset, as the days closest to the trading days in the test dataset are the most informative in financial data:

$$\begin{aligned} \begin{aligned} \min&~ \lambda (E_{s}) - (1 - \lambda )E_{e} \\ \text {s.t.}&~ r_p = \tau _{b} \odot \sum _{n=1}^{N} r_{p(t, n)} \odot w_{(t, n)} \\&~ r_b = \tau _{b} \odot r_{b(t)} \\&~ \tau _{b} = \frac{\zeta _{b(t)} T}{\sum _{t=1}^{T} \zeta _{b(t)}} \\&~ \zeta _{b} = 1 + \varsigma \left[ \frac{\sum _{t=1}^{1} \vartheta _{t}}{\sum _{t=1}^{T}\vartheta _{t}}, \frac{\sum _{t=1}^{2} \vartheta _{t}}{\sum _{t=1}^{T}\vartheta _{t}}, \ldots , \frac{\sum _{t=1}^{T} \vartheta _{t}}{\sum _{t=1}^{T}\vartheta _{t}} \right] \\&~ \vartheta = [\ln {(1)}, \ln {(2)}, \ldots , \ln {(T)}] \\&~ \varsigma \in {\mathbb {R}}_{\ge 0} \end{aligned} \end{aligned}$$
(7)

where \(\tau _{b}\) denotes the biased time factor, and \(\varsigma \) denotes the biased coefficient, a non-negative real number. If the biased coefficient is set to zero, the formulation reduces to the standard ITP; as the biased coefficient grows, the recent data receive greater weight. The numerator grows with the natural logarithm, and the denominator normalizes by the sum of the numerator terms; this form is used instead of an iteratively increasing weight so that the values do not grow too quickly. Note that t starts from one for both the training and test datasets.
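A short sketch of the biased time factor computation, assuming numpy arrays and T > 1; setting the biased coefficient to zero recovers uniform weights:

```python
import numpy as np

def biased_time_factor(T, varsigma):
    """Biased time weights tau_b from Eq. (7); varsigma = 0 gives the standard ITP."""
    theta = np.log(np.arange(1, T + 1))                     # [ln(1), ..., ln(T)]
    zeta = 1.0 + varsigma * np.cumsum(theta) / np.sum(theta)
    return zeta * T / np.sum(zeta)                          # weights averaging to one
```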

In particular, fund managers not only consider the EITP but also consider other constraints such as sparsity, weights, AUM, transaction fees, the full share restriction, and risk diversification. Various real-life constraints are considered in this comprehensive model.

Commission fee structures can be divided into tiered pricing and fixed pricing frameworks. For tiered pricing, the prices are split into a few levels. When the investor purchases more stocks, the transaction fee decreases. Fixed pricing includes all exchange and regulatory fees, and it is applied to demonstrate the cost constraint for simplicity. The regulatory fees come from the Financial Industry Regulatory Authority (FINRA) trading activity fee.

The prices of fixed costs in this paper are based on Interactive Brokers (IB), as it is one of the largest trading platforms in the US market. The IB brokerage firm charges USD 0.005 per share, the minimum cost is USD 1.00, and the maximum cost is 1.0% of the trade value. Note that if the calculated maximum per order is smaller than the minimum per order, the maximum per order applies. The FINRA trading activity fee charges 0.000119% of the total trade volume, the minimum trading activity cost is USD 0.01, and the maximum trading activity cost is USD 5.95. The equations of the commission fee and the FINRA trading activity fee are shown as follows:

$$\begin{aligned} \begin{aligned} o_{n}&~ = \min \{ \max \{ 0.005 \odot I_{n}, \; 1.00 \}, \; \xi \odot w_{n} \odot 0.01 \} \\ u_{n}&~ = \min \{ \max \{ \xi \odot w_{n} \odot 0.000119, \; 0.01 \}, \; 5.95 \} \\ \eta&~ = \sum _{n=1}^{N} (o_{n} + u_{n}), \end{aligned} \end{aligned}$$
(8)

where \(o_n\) denotes the commission fee for the n current assets, \( u_{n}\) denotes the FINRA trading activity fee for the n current assets, \(\xi \) denotes the value of the AUM, \(w_{n}\) denotes the weight of the n current stocks, N denotes the total number of stocks, and \(\eta \) denotes the total transaction fee.

In addition, a suitable transaction fee is considered. According to standard practice, the FINRA 5% rule stipulates that a broker should not charge a commission of more than 5% of the trade value in the US stock market. The equation of the proper transaction fee is shown as follows:

$$\begin{aligned} \begin{aligned} \upsilon _{n} = {\left\{ \begin{array}{ll} o_{n} - \xi \odot w_{n} \odot \varrho , \; &{} {\text {if}} \; o_{n} > \xi \odot w_{n} \odot \varrho \\ 0, \; &{} {\text {if}} \, o_{n} \le \, \xi \odot w_{n} \odot \varrho \; \end{array}\right. } \end{aligned} \end{aligned}$$
(9)

where \(\varrho \) denotes the acceptable commission percentage of the trade value, set to a rate of 0.05.
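The fee logic of Eqs. (8) and (9) can be sketched as follows; the clamped forms and the 5% threshold follow the IB and FINRA figures quoted above, while the function name and interface are illustrative:

```python
def transaction_fees(shares, trade_value, rho=0.05):
    """Fixed-pricing fees for one asset, following Eqs. (8)-(9).

    shares      : number of full shares I_n traded.
    trade_value : xi * w_n, the dollar value allocated to the asset.
    """
    # IB commission: 0.005 USD/share, clamped to [1.00 USD, 1% of trade value].
    o = min(max(0.005 * shares, 1.00), 0.01 * trade_value)
    # FINRA trading activity fee, clamped to [0.01 USD, 5.95 USD].
    u = min(max(0.000119 * trade_value, 0.01), 5.95)
    # Violation of the FINRA 5% rule (zero when the commission is acceptable).
    v = max(o - rho * trade_value, 0.0)
    return o, u, v
```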

Regarding risk, systematic risk and unsystematic risk exist in the finance market. Systematic risk represents the aggregation of risk from all investors in the market, such as the risks related to natural disasters and epidemics. Unsystematic risk denotes the risk that is unique to a particular company, and it is lowered by diversifying the portfolio weights among different stocks. Therefore, systematic risk is unpredictable in the finance market, and unsystematic risk is considered in risk diversification.

Finance analysts recommend that investors practice risk management strategies that incorporate a broad range of investments within a portfolio. A combination of distinct assets can lower the financial exposure to any particular asset risk. The portfolio standard deviation (SD) and an upper bound are used to limit the total risk. Before discussing the portfolio SD, the stock correlation coefficient (CC) and the portfolio variance (VAR) are discussed first, as they are highly related terms.

The stock CC measures the co-movement of two or more assets by calculating the Pearson CC, and the value of the CC is between − 1 and 1 [2, 59]. A positive CC means that when one stock price increases, the other stock price also increases. Conversely, a negative CC denotes an inverse correlation, where the stock prices move in opposite directions. For instance, a highly positive CC implies that the compared stock prices move simultaneously in the same direction and by similar percentages most of the time. Note that a negative stock CC is unusual in the real world. The equation for calculating the stock CC is shown as follows:

$$\begin{aligned} \rho {(x_{1} , x_{2} )} = \frac{\text {cov}(x_{1} , x_{2} )}{\sigma _ {x_{1} } \sigma _ {x_ {2} } }, \end{aligned}$$
(10)

where \(\rho \) denotes the CC operand, cov denotes the covariance operator, and \(\sigma \) denotes the SD operator. The CC equation for assets \( x_1 \) and \( x_2\) can be expanded as follows:

$$\begin{aligned} \rho {(x_{1} , x_{2} )} = \frac{{}\sum _{t=1}^{T} (x_{(1, t)} - \overline{x_{1} })(x_{(2, t)} - \overline{x_{2} })}{\sqrt{\sum _{t=1}^{T} (x_{(1, t)} - \overline{x_{1} })^2 } \sqrt{ \sum _{t=1}^{T}(x_{(2, t)} - \overline{x_{2} })^2} } \end{aligned}$$
(11)

where \(x_{(i, t)}\) is the return of asset \(x_i\) at time t, and \({\overline{x_i}} \) is its mean value. Then, the matrix of correlation coefficients is shown as follows:

$$\begin{aligned} R (X) = \begin{bmatrix} 1 &{}\quad \rho {(x_{1},x_{2} )} &{}\quad \cdots &{}\quad \rho {(x_{1},x_{n} )} \\ \rho {(x_{2} , x_{1} )} &{}\quad 1 &{}\quad \cdots &{}\quad \rho {(x_{2},x_{n} )} \\ \vdots &{}\quad \vdots &{}\quad \ddots &{}\quad \vdots \\ \rho {(x_{n}, x_{1} )} &{}\quad \rho {(x_{n},x_{2} )} &{}\quad \cdots &{}\quad 1 \end{bmatrix}, \end{aligned}$$
(12)

where R(X) denotes the CC matrix. After introducing the stock correlation concept, we turn back to the portfolio SD, which measures the overall portfolio risk. A low portfolio SD implies that the portfolio exhibits less volatility and higher stability. In contrast, a high portfolio SD highlights that the investment risk is high. The equation for calculating the portfolio SD is shown as follows:

$$\begin{aligned} \sigma _{p} = \sqrt{ w_{n} \otimes \text {cov}(r_{p}) \otimes w_{n}^T }, \end{aligned}$$
(13)

where \(\sigma _{p}\) denotes the portfolio SD, \( \otimes \) denotes matrix multiplication, and \(w_{n}^T\) denotes the matrix transpose. Note that the order of the operands is not exchangeable in matrix multiplication. However, it is hard to determine whether a portfolio SD value is high or low from the value itself. Therefore, an equally weighted portfolio SD is used as the baseline. A multiplier is introduced to relax the risk constraint, as cardinality restricts risk diversification; the more stocks held in the portfolio, the lower the risk exposure is [16].
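A sketch of the risk computation, assuming a (T, N) matrix of asset log returns; the equally weighted SD serves as the baseline that the multiplier \(\phi \) relaxes, and the names are illustrative:

```python
import numpy as np

def portfolio_sd(w, returns):
    """Portfolio standard deviation, Eq. (13); returns has shape (T, N)."""
    cov = np.cov(returns, rowvar=False)   # N x N covariance matrix
    return float(np.sqrt(w @ cov @ w))

def risk_ok(w, returns, phi=1.2):
    """Risk constraint: portfolio SD bounded by phi times the equal-weight SD."""
    N = returns.shape[1]
    e = np.full(N, 1.0 / N)               # equally weighted baseline portfolio
    return portfolio_sd(w, returns) <= phi * portfolio_sd(e, returns)
```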

Various real-life constraints are considered in this model. First, the cardinality constraint restricts the maximum number of stocks and provides sparsity [33]. Thus, the management cost is decreased, and the fund administration workload is reduced. Second, the investor cannot exceed the budget value, and the budget should be utilized as much as possible; thus, the minimum percentage of the budget value is determined. Note that this also implies that the summed portfolio weights should be smaller than one. Third, the transaction cost should not be too expensive, and this cost is limited. Fourth, risk diversification is considered, as a portfolio with less risk benefits the return, and risk diversification is measured by the portfolio SD. Fifth, the full share restriction is examined, as the number of buyable stocks should be an integer; although the idea of fractional shares has been raised, this option is not available at every brokerage. Sixth, the lower bound for the weights is defined as greater than or equal to zero, and short selling is not permitted, as it is a high-risk activity that may cause very large losses. Seventh, the upper bound for the weights is defined to provide risk diversification: the investor should not put all of their eggs in one basket, as concentrating all resources on a particular asset could lose everything. Under these considerations, the equation of the EITP with various practical constraints is formulated.

Fig. 1 Diagram of the ITP

The notations and equation are presented as follows:

  • \(\kappa \) denotes the cardinality constraint that restricts the number of assets in the portfolio.

  • \(\xi \) denotes the amount of AUM being invested.

  • \(\eta \) denotes the transaction fee. There are two major types of transaction fees, with tiered pricing and fixed pricing structures. For simplicity, fixed pricing is considered in this framework.

  • \(\varphi \) denotes the minimum percentage of the budget value that must be contributed to the portfolio.

  • \(e_{n}\) denotes the equal weights, which are stated as follows: \( e_{n} = [\frac{1}{N}, \frac{1}{N}, \ldots , \frac{1}{N}] \) with N entries.

  • \( ( w \odot \sigma ) ^T \) denotes the matrix transpose operation.

  • \( w_{n} \otimes \text {cov}(r_{p(t, n)}) \otimes w_{n}^T \) denotes the portfolio variance (VAR), and the portfolio SD is the square root of the portfolio VAR.

  • \( e_{n} \otimes \text {cov}(r_{p(t, n)}) \otimes e_{n}^T \) denotes the equally weighted portfolio VAR, and the square root of the equally weighted portfolio VAR is the equally weighted portfolio SD.

  • \(\phi \) denotes the multiplier coefficient for the equally weighted portfolio SD.

  • \( \iota \) denotes the modulo operator.

  • \(P_{n}\) denotes the closing price on the starting day of the specific period of interest for the n current assets.

  • \(\mu \) denotes the upper bound for each stock.

Note that some notations have been previously mentioned.

$$\begin{aligned} \begin{aligned} \min&~ \lambda (E_{s}) - (1 - \lambda )E_{e} \\ \text {s.t.}&~ \sum _{n=1}^{N} \bigtriangleup (w_{n}) \le \kappa \\&~ \varphi \xi \le \sum _{n=1}^{N} ( \xi \odot w_{n} ) + \eta \le \xi \\&~ \sum _{n=1}^{N} \upsilon \le 0 \\&~ \sqrt{ w_{n} \otimes \text {cov}(r_{p(t, n)}) \otimes w_{n}^T } \le \phi \sqrt{ e_{n} \otimes \text {cov}(r_{p(t, n)}) \otimes e_{n}^T } \\&~ \iota \, ( \xi \odot w_{n} , P_{n} ) = 0 \\&~ 0 \le w_{n} \le \mu . \end{aligned} \end{aligned}$$
(14)
Fig. 2 Process flow of the proposed framework

The \( \bigtriangleup \) denotes the operator for measuring the cardinality constraint. When the weight of a stock is larger than zero, the cardinality is one:

$$\begin{aligned} \bigtriangleup (w_n) = {\left\{ \begin{array}{ll} 0, \; \textrm{if} \; w_n = 0 \\ 1, \; \textrm{if} \, w_n > 0 . \end{array}\right. } \end{aligned}$$
(15)

In addition, fractional rounding is practiced to address the full share restriction, as the equality constraint is hard for metaheuristics to handle [13]. As a result, the rounding mechanism is applied, and the price is estimated on the starting day of the training and test period. Note that the minimum trading unit is one share in the US. Therefore, investors can invest more freely in the US market than in other regions. For instance, Hong Kong Exchanges have a restrictive policy in which the minimum trading unit is one lot. The equations of fractional rounding are shown as follows:

$$\begin{aligned} \begin{aligned} \lfloor x \rfloor&~ = \; \sup {\{ m \in {\mathbb {Z}}, m \le x\} }, \\ \lceil x \rceil&~ = \; \inf { \{ m \in {\mathbb {Z}}, m \ge x \} }, \\ [ x ]&~ = {\left\{ \begin{array}{ll} \lfloor x \rfloor , \; &{} \textrm{if} \; ( x - \lfloor x \rfloor ) \le q \\ \lceil x \rceil , \; &{} \textrm{if} \; ( x - \lfloor x \rfloor ) > q \end{array}\right. }, \end{aligned} \end{aligned}$$
(16)
Fig. 3 Dataset with the standard Pareto principle

where x denotes a real number, [x] denotes the rounding operand, \(\lfloor x\rfloor \) denotes the floor operand, \(\lceil x\rceil \) denotes the ceiling operand, \(\sup \) denotes the supremum, \(\inf \) denotes the infimum, q is set to 0.5 (corresponding to midpoint rounding), m denotes an integer, and \({\mathbb {Z}}\) denotes the set of integers. The equation for integer rounding is shown as follows:

$$\begin{aligned} I_{n} = \left[ \frac{\xi \odot w_{n} }{P_{n} }\right] , \end{aligned}$$
(17)

where \(I_{n}\) denotes the number of full shares and \(P_{n}\) denotes the closing price of the current n assets on the starting day. Once the numbers of full shares are obtained, the portfolio weights are reassigned. The equation of the weights reassignment process is shown as follows:

$$\begin{aligned} w_n = \frac{ P_{n} \odot I_{n} }{\xi }. \end{aligned}$$
(18)
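The rounding chain of Eqs. (16)-(18) can be sketched in Python as follows, with q = 0.5 giving the midpoint rounding defined above; the names are illustrative:

```python
import numpy as np

def round_to_full_shares(w, prices, aum, q=0.5):
    """Full-share rounding and weight reassignment, Eqs. (16)-(18).

    w : optimized weights, prices : closing prices P_n, aum : budget xi.
    """
    x = aum * w / prices                                  # fractional shares, Eq. (17)
    frac = x - np.floor(x)
    shares = np.where(frac <= q, np.floor(x), np.ceil(x)) # midpoint rule, Eq. (16)
    w_new = prices * shares / aum                         # reassigned weights, Eq. (18)
    return shares.astype(int), w_new
```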

After the weight solutions for the test data are rounded, an equation is applied to prevent cardinality constraint violations. The equation is shown as follows:

$$\begin{aligned} w_{e} = \bigtriangleup (w_{b}) \odot w_{e}, \end{aligned}$$
(19)

where \(w_{e}\) is the weight of the test data, and \(w_{b}\) is the best discovered weight of the training data.

Penalty technique

The enhanced penalty technique is practiced to handle the various practical constraints, as it is widely applied [14]. The concept of this technique comes from Lagrangian relaxation (LR), an early method for approximating a challenging constrained problem with a more straightforward one [17, 18]. The constrained inequality optimization problem is shown as follows:

$$\begin{aligned} \begin{aligned} \min&~ c^Tx \\ \text {s.t.}&~ Ax \le b \\&~ x \in \chi , \end{aligned} \end{aligned}$$
(20)

where x denotes the optimization variables of the primal problem, b and c are given vectors, \(c^T\) denotes the transpose of c, \(\chi \) denotes a set of elements, and \(Ax \le b\) denotes the inequality constraint. Then, the equation of the inequality constraint with LR is shown as follows:

$$\begin{aligned} \begin{aligned} \min&~ c^Tx + d(Ax - b) \\ \text {s.t.}&~ x \in \chi , \\ \end{aligned} \end{aligned}$$
(21)

where d denotes the positive Lagrangian multiplier coefficients. After discussing the LR method, the enhanced penalty technique is presented. This method handles the inequality constraints through the summation of a penalty term and the original objective function; the fitness function is shown as follows:

$$\begin{aligned} \begin{aligned} F(x) = f(x) + \sum _{s=1} ^{S} p_s \langle g_s (x) \rangle ^ 2, \end{aligned} \end{aligned}$$
(22)

where f(x) denotes the original objective function, S denotes the number of constraints, s is an index from \( 1 \cdots S\), \(\langle \cdot \rangle \) denotes the bracket operand, which returns its argument for a positive value and zero for a negative value, \(p_s\) denotes the penalty parameter that adjusts the magnitude of the sth constraint, and \(g_s (x)\) denotes the constraint for the current s.
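A minimal sketch of the penalized fitness of Eq. (22); the constraint functions \(g_s\) (with \(g_s(x) \le 0\) when satisfied) and the penalty parameters \(p_s\) are supplied by the caller:

```python
def penalized_fitness(x, f, constraints, penalties):
    """Fitness of Eq. (22): objective plus squared, weighted violations.

    constraints : list of functions g_s, each <= 0 when satisfied.
    penalties   : list of penalty parameters p_s matching the constraints.
    """
    bracket = lambda g: max(g, 0.0)  # <.> operand: violation amount or zero
    return f(x) + sum(p * bracket(g(x)) ** 2
                      for p, g in zip(penalties, constraints))
```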

Table 1 Major objectives under the cardinality constraint with \(\kappa = 5 \)
Table 2 Major objectives under the cardinality constraint with \(\kappa = 10 \)
Table 3 Major objectives under the cardinality constraint with \(\kappa = 15 \)
Table 4 Major objectives under the cardinality constraint with \(\kappa = 20 \)

Constrained index tracking

Eventually, the EITP with various practical constraints is considered in this metaheuristic-based framework. First, the cardinality constraint is applied to capture the sparsity of the problem. Second, the maximum and minimum budgets and the transaction fee are limited; this also implies that the summed weights should not exceed 100%. Third, the acceptable commission fee is examined. Fourth, the portfolio SD and the benchmark portfolio SD capture the portfolio risk. These constraints are based on Eq. 14. However, the magnitudes of the different constraints are not of the same order, so before determining reasonable penalty terms, a fraction is applied to regulate these magnitudes. In addition, some constraints are not yet restricted, and they are handled by the proposed framework. The equations of this problem are shown as follows:

$$\begin{aligned} \begin{aligned} \min&~ \lambda (E_{s}) - (1 - \lambda ) E_{e} + \sum _{s=1} ^{S} p_s \langle g_s (x) \rangle ^ 2 \\ \text {s.t.}&~ g_{1}(x) = \frac{ \sum _{n=1}^{N} \bigtriangleup (w_{n}) }{N} - \frac{\kappa }{N} \\&~ g_{2}(x) = \frac{ \sum _{n=1}^{N} ( \xi \odot w_{n} ) + \eta }{\xi } - 1 \\&~ g_{3}(x) = \varphi - \frac{ \sum _{n=1}^{N} ( \xi \odot w_{n} ) + \eta }{\xi } \\&~ g_{4}(x) = \frac{ \sum _{n=1}^{N} \upsilon }{N} \\&~ g_{5}(x) = \sqrt{ w_{n} \otimes \text {cov}(r_{p(t, n)}) \otimes w_{n}^T } - \phi \sqrt{ e_{n} \otimes \text {cov}(r_{p(t, n)}) \otimes e_{n}^T } \\&~ r_p = \tau _{b} \odot \sum _{n=1}^{N} r_{p(t, n)} \odot w_{(t, n)} \\&~ r_b = \tau _{b} \odot r_{b(t)} \\&~ \tau _{b} = \frac{\zeta _{b(t)} T}{\sum _{t=1}^{T} \zeta _{b(t)}} \\&~ \zeta _{b} = 1 + \varsigma \left[ \frac{\sum _{t=1}^{1} \vartheta _{t}}{\sum _{t=1}^{T}\vartheta _{t}}, \frac{\sum _{t=1}^{2} \vartheta _{t}}{\sum _{t=1}^{T}\vartheta _{t}}, \ldots , \frac{\sum _{t=1}^{T} \vartheta _{t}}{\sum _{t=1}^{T}\vartheta _{t}} \right] \\&~ \vartheta = [\ln {(1)}, \ln {(2)}, \ldots , \ln {(T)}] \\&~ \varsigma \in {\mathbb {R}}_{\ge 0}, \end{aligned} \end{aligned}$$
(23)

where S denotes the number of penalty constraints, and s ranges over \(1 \cdots 5\). The biased coefficient \( \varsigma \) is only applied to the major objective, not to the constraints.

Fig. 4 Simulation results obtained on the test data when \(\kappa = 5\)

Fig. 5 Simulation results obtained on the test data when \(\kappa = 10\)

Fig. 6 Simulation results obtained on the test data when \(\kappa = 15\)

Fig. 7 Simulation results obtained on the test data when \(\kappa = 20\)

Proposed framework

This proposed metaheuristic-based framework addresses the comprehensive ITP. An investor considers the major objective, general constraints, risk diversification, and the total budget. These considerations form a mathematical formulation via the penalty technique, and they are addressed through the framework. The investor determines the quality of the solutions and the number of iterations, and he or she repeats this process until reaching the maximum number of iterations. The overall considerations and operations are reviewed in Fig. 1. Furthermore, other considerations can be incorporated into this framework.

The framework starts by collecting asset data for the tracking index and computing the returns. Then, the investor considers the problem with various realistic constraints. Once these are settled, the considerations are formulated with the penalty technique. After formulation, the dataset is split into training and test subsets. Once the dataset is available, a metaheuristic is applied to optimize the portfolio weights. The following step is to check whether the optimized portfolio violates any constraints. Eventually, the performance of the optimized portfolio is evaluated. The process flow of the framework is summarized in Fig. 2.

Table 5 Major objectives under the cardinality constraint \(\kappa = 5 \) with biased coefficient \(\varsigma \) on DE1
Table 6 Major objectives under the cardinality constraint \(\kappa = 10 \) with biased coefficient \(\varsigma \) on DE1
Table 7 Major objectives under the cardinality constraint \(\kappa = 15 \) with biased coefficient \(\varsigma \) on DE1
Table 8 Major objectives under the cardinality constraint \(\kappa = 20 \) with biased coefficient \(\varsigma \) on DE1

The movement of the dataset is discussed here. When the solutions satisfy the problem constraints on the training set, these solutions are passed to the candidate set. After checking for constraint violations, the best solution is applied to the test dataset. The training dataset is used to discover the optimal evaluation model in the development stage. In addition, the standard Pareto principle is applied to split the dataset properly, as the 80/20 rule states that 80% of the results come from 20% of the causes [47]. Following this rule, the dataset is split into segments from 0 to 64%, from 64 to 80%, and from 80 to 100% of the whole dataset. The dataset is summarized in Fig. 3. The experimental dataset is derived from Standard and Poor's 100 Index from 01/01/2017 to 31/12/2017.

Framework 1 Pseudocode of the proposed framework

The procedure of the proposed framework is shown in Framework 1. The pseudocode begins by retrieving the closing prices of the tracking index. Once the asset dataset and benchmark data are ready, the returns are computed. Next, the remaining functions and parameters are confirmed. After these steps, the portfolio weights are optimized through metaheuristics. The following step is to check for constraint violations. Once a solution lies in the feasible region, its weights are stored among the candidate solutions. When all candidate solutions are available, the best candidate on the training dataset is applied to the test dataset. Eventually, the simulation results are reported, and the graph is plotted.
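A condensed Python sketch of Framework 1 is given below. Here `optimizer` stands for any of the compared metaheuristics, `fitness` for the penalized objective, and `feasible` for the constraint validator; all three are assumed interfaces for illustration rather than the paper's exact routines:

```python
import numpy as np

def run_framework(asset_prices, index_prices, optimizer, fitness, feasible, runs=30):
    """Condensed Framework 1: returns, split, optimize, validate, evaluate."""
    r_assets = np.diff(np.log(asset_prices), axis=0)  # asset log returns, Eq. (6)
    r_bench = np.diff(np.log(index_prices))           # benchmark log returns
    split = int(0.8 * len(r_bench))                   # Pareto 80/20 split
    train, test = slice(0, split), slice(split, None)

    candidates = []
    for _ in range(runs):                             # repeated metaheuristic runs
        w = optimizer(lambda w: fitness(w, r_assets[train], r_bench[train]))
        if feasible(w):                               # constraint validator
            candidates.append(w)
    # assumes at least one feasible run; pick the best training candidate
    best = min(candidates, key=lambda w: fitness(w, r_assets[train], r_bench[train]))
    return fitness(best, r_assets[test], r_bench[test])  # out-of-sample evaluation
```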

Simulation

Simulation settings

In this simulation, the investors are assumed to not have any bias with respect to the tracking error or the excess return. Thus, \(\lambda \) is set to 0.5. The AUM is set to \(10^5\), the minimum percentage of the budget to be spent (\(\varphi \)) is set to 0.98, and the multiplier coefficient for an equally weighted portfolio SD (\(\phi \)) is set to 1.2. The cardinality constraint is separately set to 10 and 20. When the cardinality constraint is set to 10, the upper bound is set as 0.2. When the cardinality constraint is set to 20, the upper bound is set as 0.1.

Fig. 8 Simulation results obtained on the test data when \(\kappa = 5\) with various biased coefficients \(\varsigma \) on DE1

Fig. 9 Simulation results obtained on the test data when \(\kappa = 10\) with various biased coefficients \(\varsigma \) on DE1

Fig. 10 Simulation results obtained on the test data when \(\kappa = 15\) with various biased coefficients \(\varsigma \) on DE1

Fig. 11 Simulation results obtained on the test data when \(\kappa = 20\) with various biased coefficients \(\varsigma \)

Fig. 12 Change in fitness value under the cardinality constraint for the compared metaheuristics

Fig. 13 Change in fitness value with various biased coefficients \(\varsigma \) on DE1

For a fair comparison, the population sizes of all metaheuristic algorithms are set to 100. The stopping criterion is a maximum of 20,000 iterations; the CSO is allotted 40,000 iterations because it evaluates only half of the particles in each iteration.

For the GA, binary tournament selection and uniform crossover are used. For PSO, the inertia weight is set to 0.72984, and the personal-best and global-best acceleration parameters are set to 2.05. Mutation is applied to enhance the algorithmic performance, and the mutation rates of the GA and PSO are set to 0.02. For the CSO, the control parameter of the mean position is set to zero because the number of decision variables is less than one thousand; this value is suggested in the original paper. Regarding DE, two schemes are compared in this paper. In scheme 1, the amplification factor is set to 1, and the crossover rate is set to 0.3. In scheme 2, the amplification factor is set to 1, the crossover rate is set to 0.2, and the additional control parameter is set to 0.99.

Simulation results

The penalty terms control the magnitudes of the constraint violations and thus need to be set carefully. The penalty parameters \(p_1\), \(p_2\), \(p_3\), \(p_4\), and \(p_5\) are set to 100, 100, 2000, 10, and 200, respectively. After determining the suitable penalty terms, the remaining simulations are based on the described settings.

The performances of the GA, PSO, the CSO, and DE are compared within the proposed framework. The major objectives under various cardinality constraints are presented in Tables 1, 2, 3 and 4, and the cumulative returns on the test data are presented in Figs. 4, 5, 6 and 7. Note that N/A denotes that a constraint is not satisfied and the solution is dropped. DE1 performs better than the other compared algorithms in most of the test cases, so further investigations are conducted using DE1. When the biased time coefficient is larger, the weights are biased towards the later period; the later period of the training data is expected to be more informative in the ITP, so a biased time coefficient can improve the result. The biased time coefficient is set to 0, 250, 500, 750, and 1000, and the performance of each setting is tested. The results are presented in Tables 5, 6, 7 and 8, and Figs. 8, 9, 10 and 11 present the corresponding cumulative returns. Note that the cumulative return is calculated as the cumulative product of the returns. When the cardinality constraint is relaxed, the fitness value decreases. Figure 12 shows the change in fitness value for the compared metaheuristics, and Fig. 13 presents the change in fitness value for various biased time coefficients on DE1. It can be concluded that metaheuristics within the proposed framework are able to solve the formulated optimization problem.

Conclusion

In this paper, the EITP with various practical constraints is addressed by the proposed metaheuristic-based framework. The proposed framework differs from traditional frameworks in that it makes use of a fully global approach rather than a suboptimal global-local approach, whose performance is unstable. Metaheuristics can obtain globally optimal solutions with nonzero probability. Moreover, sparsity, weights, AUM, transaction fees, the full share restriction, and risk diversification are considered.

In summary, the comprehensive ITP is addressed by the proposed framework. In the simulation, the GA, PSO, the CSO, and DE are applied to the comprehensive ITP, and competitive results are obtained by the proposed method. In addition, the framework can incorporate other practical constraints. In future work, the framework can feasibly be extended through further simulations with other metaheuristics and with datasets derived from other benchmark market tracking indices.