1 Introduction

One important topic in financial engineering and risk management is striking a balance between investment returns and risk by properly allocating portfolio weights. Estimating the target investment ratio (i.e., the optimal proportion of each asset’s value in a portfolio that maximizes the holder’s utility) has been widely studied using mean–variance analysis (Markowitz, 1952) and its extensions (e.g., Black and Litterman, 1990, De Prado, 2016), among other methods. Useful concepts introduced in the literature have been widely adopted. For example, Amenc et al. (2012) compare the performance of portfolio weights located on the efficient frontier by the global minimum variance and/or the maximum Sharpe ratio. They show that the former and the latter approaches perform better in bear and bull markets, respectively. Bodnar et al. (2020) analyze optimal portfolios under power and logarithmic utilities given that the gross returns of portfolios are approximately log-normally distributed. Ramírez-Hassan and Guerra-Urzola (2020) use a Bayesian estimator to reduce the estimation error of the minimum-variance portfolio. The asset allocation concept has also been adopted or extended in machine learning studies such as Yeh et al. (2014), Raffinot (2017), Almahdi and Yang (2017), Jain and Jain (2019), Filos (2019), Guan and Liu (2022), and Cong et al. (2021). Although the initial portfolio can be set to the target ratio, the value weights of the portfolio’s assets change continuously due to asset price evolution. Thus, portfolio weights should be frequently rebalanced to prevent significant divergence from the target ratio. However, continuous portfolio adjustments to coincide with the target ratio incur significant transaction costs (see Bregu, 2020). In contrast, overly infrequent adjustments cause portfolio weights to significantly deviate from the target ratio, which incurs high tracking errors. 
In wealth management, an optimal rebalancing strategy is needed which strikes a balance between transaction costs and tracking errors.

Popular static rebalancing strategies found in financial textbooks include buy-and-hold, periodic, and tolerance band rebalancing (Qian, 2020). In the first strategy, the portfolio is never adjusted regardless of market price fluctuations. This can incur significant losses due to black swan events like the financial tsunami of 2008. Periodic rebalancing, the second strategy, adjusts the portfolio weight to the target ratio at a predetermined frequency. It keeps the portfolio unadjusted at other time points regardless of the market status (stable or volatile). Tolerance band rebalancing, the third strategy, adjusts the portfolio weight when the weight of any of the assets deviates from the target weight by more than a predetermined tolerance level.

The literature has explored sophisticated ways to rebalance the portfolio dynamically without incurring substantial transaction costs. Instead of explicitly modeling the utility function over wealth, Leland (1999) evaluates the cost of tracking errors by measuring the divergence of the current portfolio ratio from the desired target portfolio ratio in terms of the difference in mean–variance utility, as stated in footnote 7 of his paper; he then estimates the total cost as the discounted expectation of tracking errors plus transaction costs. To minimize the total cost, a portfolio weight is adjusted only to the boundary of the so-called “no-trade region” once it falls outside this region. Partial differential equations are derived to solve this free boundary problem, but the number of boundary conditions grows exponentially with the number of invested assets. Instead of solving the intractable free boundary problem, Muthuraman and Kumar (2006) solve the value function that represents the cost of a non-optimal no-trade region and then update the region with the value function until the region converges to the optimal one. Muthuraman and Zha (2008) improve the computational speed by evaluating the value function using simulation. Donohue and Yip (2003) numerically verify the superiority of Leland’s no-trade-region model over the aforementioned static rebalancing methods. They show that the region’s shape and size are strongly affected by correlations among assets, transaction cost magnitudes, and the risk preferences of investors. Dichtl et al. (2014) and Marti et al. (2021) study whether dynamic rebalancing strategies outperform traditional rebalancing strategies. Dynamic rebalancing has also been used to explain how the capital gain tax (Gallmeyer et al., 2006) and asset liquidity (Kinlaw et al., 2013) influence investor behavior.

Much of the literature, such as Petronio et al. (2014) and Yun et al. (2021), models asset return or price processes as a Markov process. The optimal rebalancing problem can thus be modeled as a Markov decision process, as proposed in Sun et al. (2006). All possible portfolio weights are discretely enumerated as a multi-dimensional, uniformly distributed grid. Changes in portfolio weights due to changes in asset values or portfolio rebalancing are represented by movement from one grid node to another. To determine an optimal rebalancing strategy that minimizes the costs of transactions and tracking errors,Footnote 1 they solve the Bellman equation to determine the corresponding optimal rebalancing action for each grid node. Their method is widely referenced in recent studies of dynamic portfolio management, such as Tahar et al. (2007), Brito (2008), Branger et al. (2010), Israelov and Katz (2011), Holden and Holden (2013), and Carroll et al. (2017). In addition, Kritzman and Myrgren (2009) and Brown and Smith (2011) address and alleviate the curse of dimensionality that arises as the number of assets increases. This article suggests that the traditional uniformly distributed grid can be improved by allocating the grid points in a non-uniform fashion according to their importance. Specifically, the discrete enumeration of possible portfolio weights incurs discretization error that depends on the distance between adjacent grid nodes and the probability distribution. Given the non-uniform probability distribution of portfolio weights due to the return distribution of assetsFootnote 2 and the portfolio rebalancing strategies, this article allocates grid nodes in a non-uniform fashion according to the occurrence probabilities of portfolio weights. The upper bound of the discretization errors is estimated and then minimized by choosing the grid node allocation with the method of Lagrange multipliers. 
Thus the proposed method is called a multiresolution grid (MRG) due to the above optimal adjustments to the grid resolution. Similar ideas are found in the derivative pricing literature: the grid resolution is adjusted according to the derivative’s payoff function to reduce pricing errors (see Figlewski and Gao, 1999, Dai et al., 2005, Dai and Lyuu, 2007).

The proposed two-stage MRG algorithm is described as follows. The first stage divides the space of portfolio weights into several areas and estimates the probability of the portfolio weight moving from the target ratio to each area via Monte Carlo simulation. The second stage allocates an appropriate number of grid nodes to each area to minimize the upper bound of the discretization error of the total cost. Note that any arbitrary portfolio weight is mapped to a nearby grid node. Thus the distance between two adjacent grid nodes determines the upper bound of the discretization error, owing to the Lipschitz continuity of the cost function proved in Appendix B. Allocating more grid nodes to an area implies a higher resolution and a lower discretization error in that area. Because the computational time grows with the total number of grid nodes, it is infeasible to keep adding nodes to each area to reduce the error without limit. To balance efficiency and accuracy, the sum of the upper bounds of the discretization errors contributed by all areas, weighted by the probabilities of the aforementioned areas, is minimized under the constraint of a fixed total number of grid nodes (or computational resources) using the method of Lagrange multipliers. The optimal number of nodes for each area is obtained by solving the resulting first-order conditions. Then, the multiresolution grid model is constructed by uniformly distributing the allocated number of nodes within each area. Finally, modified policy iteration (see Puterman and Shin, 1978) is applied to solve for the cost and the optimal rebalancing action at each grid node. Experimental results show that the proposed MRG significantly improves on the traditional uniformly distributed grid approach. In backtests with real transaction data from 2000 to 2020, MRG also outperforms other popular rebalancing strategies found in financial textbooks.

The remainder of this article proceeds as follows. Section 2 introduces the required background knowledge to construct a portfolio rebalancing strategy. Section 3 discusses the construction of MRG. Section 4 compares various rebalancing methods with simulated and real transaction data. Section 5 concludes the article.

2 Preliminaries

This section introduces the required economic and financial background knowledge and the algorithms for solving the Bellman equation. The mean–variance analysis and the construction of target ratios are discussed in Sect. 2.1. Section 2.2 introduces how tracking errors can be converted into risk-adjusted costs via the concepts of utility and certainty equivalence proposed in Bernoulli (1954). The total rebalancing cost comprises the risk-adjusted costs and the transaction costs. Section 2.3 models the relations between future expected discounted rebalancing costs and rebalancing strategies via the Bellman equation, which is solved by the modified policy iteration in Sect. 2.4.

2.1 Target Ratio Construction

Assume an investment portfolio comprises n strategic assets,Footnote 3 including stock and bond indexes, as detailed in Appendix A. The returns of these strategic assets are assumed to follow a multivariate normal distribution. The expected return vector \(\mathbf {\mu }\) and the covariance matrix \(\varvec{\Lambda }\) can be obtained by calibrating the historical daily closing prices of these indexes, as mentioned in financial textbooks such as Luenberger (2013). Here we follow Sun et al. (2006) by determining target ratios as the portfolio weight vector w that maximizes the investor’s mean–variance utility functionFootnote 4 defined as

$$\begin{aligned} \mathbf {w^*}=\mathop {\arg \max }_{{\textbf{w}}}\left[ \mathbf {w^T\mu }-\frac{\alpha }{2}\mathbf {w^T}\varvec{\Lambda } \textbf{w}\right] , \end{aligned}$$
(1)

where the portfolio weight vectors \({\textbf{w}}\) and \(\mathbf {w^*}\) are n-dimensional vectorsFootnote 5 that represent the proportion of each asset’s value in a portfolio (the weights sum to 1) and \(\alpha \) denotes a constant risk aversion coefficient.Footnote 6 Here the utility is calculated as the expected return \(\mathbf {w^T}\mu \) minus \(\alpha /2\) multiplied by the variance of the portfolio return \(\mathbf {w^T}\Lambda {\textbf{w}}\).
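As a rough illustration of Eq. (1), the sketch below solves the mean–variance problem in closed form under one explicit assumption: the only constraint is that the weights sum to 1, so short positions are allowed and the \(5\%\) minimum-weight rule used later in the experiments is ignored. The function name `target_weights` and the three-asset inputs are hypothetical and are not taken from the calibrated data.

```python
import numpy as np

def target_weights(mu, cov, alpha):
    """Maximize w'mu - (alpha/2) w'cov w subject to sum(w) = 1.

    Assumption: no other constraints (short sales allowed).
    """
    mu = np.asarray(mu, dtype=float)
    cov = np.asarray(cov, dtype=float)
    ones = np.ones_like(mu)
    inv_mu = np.linalg.solve(cov, mu)      # cov^{-1} mu
    inv_ones = np.linalg.solve(cov, ones)  # cov^{-1} 1
    # Multiplier of the budget constraint, chosen so that the weights sum to 1.
    eta = (ones @ inv_mu - alpha) / (ones @ inv_ones)
    return (inv_mu - eta * inv_ones) / alpha

# Hypothetical inputs for illustration only.
mu = np.array([0.08, 0.05, 0.03])
cov = np.array([[0.040, 0.006, 0.002],
                [0.006, 0.020, 0.001],
                [0.002, 0.001, 0.010]])
w_star = target_weights(mu, cov, alpha=4.0)
print(w_star, w_star.sum())
```

At the returned weights, the gradient \(\mu -\alpha \Lambda {\textbf{w}}\) has identical entries, which is the first-order condition of the budget-constrained problem.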

2.2 Costs of Tracking Errors and Transactions

The total rebalancing costs can be decomposed into two parts: transaction costs, which reflect the cost to adjust the portfolio weights, and tracking errors, which reflect the loss of an investor due to non-optimal asset allocation. The transaction cost TC(\({\textbf{w}}, \mathbf {w'}\)) is defined as

$$\begin{aligned} \texttt {TC}({\textbf{w}}, \mathbf {w'}) = \left| {\textbf{w}} - \mathbf {w'}\right| \times {\textbf{c}}, \end{aligned}$$
(2)

where the portfolio weight vectors \({\textbf{w}}\) and \(\mathbf {w'}\) denote the weights of assets before and after the portfolio rebalancing, respectively. Each element of vector \({\textbf{c}}\) denotes the transaction fee for the corresponding strategic asset.
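A minimal sketch of Eq. (2) follows, assuming that the product with \({\textbf{c}}\) is an inner product, so each asset's absolute weight change is charged at that asset's own fee rate. The function name and the fee values are hypothetical.

```python
import numpy as np

def transaction_cost(w_before, w_after, fees):
    """Eq. (2): sum over assets of |w - w'| times the per-asset fee rate."""
    diff = np.abs(np.asarray(w_before, dtype=float) - np.asarray(w_after, dtype=float))
    return float(diff @ np.asarray(fees, dtype=float))

# Hypothetical fee rates; rebalancing a drifted portfolio back to (20%, 20%, 60%).
tc = transaction_cost([0.214, 0.204, 0.582], [0.20, 0.20, 0.60], [0.001, 0.001, 0.002])
print(tc)
```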

The tracking error can be converted into risk-adjusted return by the certainty equivalent approach of Bernoulli (1954) illustrated in Fig. 1. The x- and y-axes denote the expected value and standard deviation of the portfolio return, respectively. The utility for each point located on the solid (or dashed) indifference curve is the same. The optimal portfolio with weight \(\mathbf {w^*}\) and a non-optimal one with weight \({\textbf{w}}\) are denoted by the gray and black circles, respectively. The utility of investing in the portfolio with weight \(\mathbf {w^{*}}\) (respectively \({\textbf{w}}\)) equals the utility of investing in a riskless asset with return \(r^{*}\) (respectively r). The utility of investing in the portfolio with the optimal weight allocation \(\mathbf {w^*}\) is \(\mathbf {w^{*T}\mu }-\frac{\alpha }{2}\mathbf {w^{*T}}\varvec{\Lambda } \textbf{w}^*\); a non-optimal allocation with weight \({\textbf{w}}\) reduces the utility to \(\mathbf {w^T\mu }-\frac{\alpha }{2}\mathbf {w^T}\varvec{\Lambda } \textbf{w}\). To express the loss of utility in terms of cost without uncertainty, we calculate the riskless returns \(r^*\) and r that have the same utilities as investing with allocation weights \(\mathbf {w^*}\) and \({\textbf{w}}\), respectively, as

$$\begin{aligned} r^* \equiv \mathbf {w^{*T}\mu }-\frac{\alpha }{2}\mathbf {w^{*T}}\varvec{\Lambda } \textbf{w}^*,\ r \equiv \mathbf {w^{T}\mu }-\frac{\alpha }{2}\mathbf {w^{T}}\varvec{\Lambda } \textbf{w}. \end{aligned}$$
Fig. 1 Estimating tracking errors with the certainty equivalent approach

Following Roll (1992), the tracking error can be expressed as

$$\begin{aligned} r^{*}-r. \end{aligned}$$
(3)
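The certainty equivalents \(r^*\) and r and the tracking error in Eq. (3) can be sketched as below. The two-asset inputs are hypothetical and chosen so that (0.4, 0.6) is the budget-constrained maximizer of the mean–variance utility for \(\alpha =4\); the function names are illustrative.

```python
import numpy as np

def certainty_equivalent(w, mu, cov, alpha):
    """Riskless return with the same mean-variance utility as weights w."""
    w = np.asarray(w, dtype=float)
    return float(w @ mu - 0.5 * alpha * (w @ cov @ w))

def tracking_error(w, w_star, mu, cov, alpha):
    """Eq. (3): r* - r, the utility loss expressed as a riskless return."""
    return (certainty_equivalent(w_star, mu, cov, alpha)
            - certainty_equivalent(w, mu, cov, alpha))

# Hypothetical two-asset example.
mu = np.array([0.06, 0.02])
cov = np.diag([0.04, 0.01])
w_star = np.array([0.4, 0.6])
te = tracking_error([0.5, 0.5], w_star, mu, cov, alpha=4.0)
print(te)
```

Here \(r^*=0.016\) and \(r=0.015\), so holding equal weights instead of the target costs a riskless return of 0.001 per period.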

2.3 Bellman Equation for Minimizing Costs

The Bellman equation decomposes the “value” of a decision problem at a state as the sum of the current payoff determined by the portfolio weight state and the decided rebalancing actions, plus the expected discounted values contributed by the following state transitions and corresponding rebalancing actions. In this rebalancing problem, the portfolio weight space \({\mathbb {W}}\) is composed of discretely enumerated portfolio weights represented by the grid, as in Sun et al. (2006). Each rebalancing strategy \(\pi \) from the strategy space \({\mathbb {S}}\) is defined as a function that maps any portfolio weight \({\textbf{w}} \in {\mathbb {W}}\) to a portfolio weight \({\textbf{w}}'\in {\mathbb {W}}\) indicating that \(\pi \) adjusts the portfolio weight from \({\textbf{w}}\) to \({\textbf{w}}'\). Specifically, \(\pi ({\textbf{w}})={\textbf{w}}'\). Optimization of the rebalancing strategy problem can then be accomplished by minimizing the lump sum of the expected present values of rebalancing costs for each state \({\textbf{w}}\in {\mathbb {W}}\) using dynamic programming, as articulated by the Bellman equation:

$$\begin{aligned} J({\textbf{w}}) \equiv {\min _{\pi \in {\mathbb {S}}}\left( \text {Cost}({\textbf{w}},\pi ({\textbf{w}}))+ \gamma \sum _{\mathbf {w'}\in {\mathbb {W}}} {\mathbb {P}}(\pi ({\textbf{w}}),\mathbf {w'})J(\mathbf {w'})\right) }. \end{aligned}$$
(4)

Here the value function \(J({\textbf{w}})\) represents the minimized expected present value of the rebalancing cost given the current portfolio weight \({\textbf{w}}\). Terms inside the minimum operators are composed of the current rebalancing cost and the lump sum of future expected discounted rebalancing costs. \(\text {Cost}({\textbf{w}},\pi ({\textbf{w}}))\) represents the rebalancing cost incurred when adjusting the portfolio weight vector \({\textbf{w}}\) to the portfolio weight \(\pi ({\textbf{w}})\). This cost comprises the transaction cost and tracking error evaluated by Eqs. (2) and (3), respectively. \(\gamma \) is the discount factor.Footnote 7 After adjusting the portfolio weight to \(\pi ({\textbf{w}})\), the portfolio weight changes to \(\mathbf {w'}\) with transition probability \({\mathbb {P}}(\pi {({\textbf{w}})},\mathbf {w'})\) at the next time step to reflect changes in asset prices due to market fluctuation. Transition probabilities are evaluated using Monte Carlo simulation by assuming that the returns of assets follow the multivariate normal distribution with mean \(\mu \) and covariance matrix \(\Lambda \).
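One way to estimate the transition probabilities \({\mathbb {P}}(\pi ({\textbf{w}}),\mathbf {w'})\) by Monte Carlo simulation is sketched below. It assumes that each simulated post-drift weight is assigned to the nearest state in Euclidean distance, which is only one possible reading of the discretization; the function names, the state list, and the return parameters are hypothetical.

```python
import numpy as np

def drift(weights, returns):
    """Weights after one period of asset returns, renormalized to sum to 1."""
    grown = weights * (1.0 + returns)
    return grown / grown.sum(axis=-1, keepdims=True)

def transition_probabilities(w_from, states, mu, cov, n_paths=20_000, seed=0):
    """Estimate P(w_from -> each state) from simulated multivariate normal returns."""
    states = np.asarray(states, dtype=float)
    rng = np.random.default_rng(seed)
    returns = rng.multivariate_normal(mu, cov, size=n_paths)
    drifted = drift(np.asarray(w_from, dtype=float), returns)  # (n_paths, n_assets)
    # Assumption: each drifted weight is represented by its nearest state.
    dist2 = ((drifted[:, None, :] - states[None, :, :]) ** 2).sum(axis=2)
    nearest = dist2.argmin(axis=1)
    return np.bincount(nearest, minlength=len(states)) / n_paths

# Hypothetical two-asset grid with three states and 1% return volatility.
states = [[0.4, 0.6], [0.5, 0.5], [0.6, 0.4]]
probs = transition_probabilities([0.5, 0.5], states,
                                 mu=[0.0005, 0.0002],
                                 cov=[[1e-4, 0.0], [0.0, 1e-4]])
print(probs)
```

With such a coarse grid and small volatility, nearly all simulated weights stay in the middle state, which illustrates why a uniform grid can waste states far from the target ratio.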

A one-time-step portfolio weight evolution and an example of rebalancing are illustrated in Fig. 2. \(\mathbf {w_0}\), \(\mathbf {w_1}\), \(\mathbf {w_2}\), \(\mathbf {w^*}\), \(\mathbf {w_3}\), \(\mathbf {w_4}\), and \(\mathbf {w_5}\) denote the discretized portfolio weights, where \(\mathbf {w^*}\) denotes the optimal portfolio weight defined in Eq. (1). \(t_i\) and \(t'_i\) denote the time immediately before and after portfolio weight rebalancing, for \(i=0\) and 1. For every \({\textbf{w}}\in {\mathbb {W}}\), the best rebalancing strategy \(\pi ^*\) selects the best adjusted portfolio weight \(\pi ^*({\textbf{w}})\) to minimize the value \(J({\textbf{w}})\) in Eq. (4). For example, assume the initial portfolio weight is \(\mathbf {w_1}\) at time \(t_0\). Instead of directly rebalancing the portfolio back to the optimal \(\mathbf {w^*}\), we adopt the best rebalancing strategy \(\mathbf {\pi ^*}({\textbf{w}}_1)=\mathbf {w_2}\) (denoted in bold blue) to rebalance the portfolio weight to \(\mathbf {w_2}\). The goal of this strategy is to minimize \(J(\mathbf {w_1})\) defined in Eq. (4), ensuring a balance between transaction costs and tracking errors while considering future expenses. Portfolio weight changes stemming from asset price shifts due to market fluctuations are reflected by transition branches between time \(t'_0\) and \(t_1\). For example, the thin blue branches that emanate from the portfolio weight \(\mathbf {w_2}\) at time \(t'_0\) reflect the weight’s move to \(\mathbf {w_0}, \mathbf {w_1}, \ldots \) at time \(t_1\) with transition probabilities \({\mathbb {P}}(\mathbf {w_2},\mathbf {w_0}), {\mathbb {P}}(\mathbf {w_2},\mathbf {w_1}), \ldots \). Unlike Leland (1999), who analytically solves the boundaries of the no-trade region where the best rebalancing strategy is not to adjust, we instead evaluate the best rebalancing strategy for each portfolio weight to identify this region, which avoids solving the intractable free boundary problem. 
For example, we solve the best rebalancing strategy \(\mathbf {\pi ^*}\) for every portfolio weight via Eq. (4) and find that the best rebalancing action for the three following portfolio weights is not to adjust: \(\mathbf {\pi ^*}(\mathbf {w_2})=\mathbf {w_2}\), \(\mathbf {\pi ^*}(\mathbf {w^*})=\mathbf {w^*}\), and \(\mathbf {\pi ^*}(\mathbf {w_3})=\mathbf {w_3}\); the corresponding no-trade region is composed of {\(\mathbf {w_2},\mathbf {w^*}, \mathbf {w_3}\)} and is marked in red in Fig. 2.

Fig. 2 Portfolio weight rebalancing with the Bellman Eq. (4) and evolution due to changed asset prices

2.4 Modified Policy Iteration

The above rebalancing problem finds an optimal rebalanced portfolio weight \(\pi ^*({\textbf{w}})\) for each state \({\textbf{w}}\) and can be interpreted as a Markov decision problem. We solve this problem by evaluating the value function J by the modified policy iteration algorithm proposed by Puterman and Shin (1978), as illustrated in Algorithm 1, since it is more efficient than value iteration or policy iteration. This method repeatedly uses partial policy evaluation and policy improvement to approximate the value function J and the rebalancing strategy with \(J_n\) and \(\pi _n\),Footnote 8 respectively. We first initialize the value function \(J_0\) as 0 and the initial strategy \(\pi _{0}\) to adjust all portfolio weights to the target ratio. The first step calculates the n-th round estimate of the value function, denoted as \(\tilde{J_{n}}\), by repeatedly applying rebalancing strategy \(\pi _n\) (line 6). Next, in the policy improvement step, we use \(\tilde{J_{n}}\) to estimate the \((n+1)\)-th round rebalancing strategy \(\pi _{n+1}\) and value function \(J_{n+1}\), respectively (lines 10 and 11). This iterative procedure stops when the value function or the rebalancing strategy converges (line 14). Here the tolerance level \(\epsilon \) is set to 0.001, and the infinity norm is defined as \(\left\Vert J\right\Vert _{\infty } \equiv \max \left( \left| J(\mathbf {w_1})\right| ,\left| J(\mathbf {w_2})\right| ,\ldots ,\left| J(\mathbf {w}_{\left| {\mathbb {W}}\right| })\right| \right) \), where \(\left| {\mathbb {W}}\right| \) denotes the number of states in state space \({\mathbb {W}}\).

Algorithm 1 Modified policy iteration
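Since Algorithm 1 appears only as a figure, the following is a hedged sketch of modified policy iteration that follows the prose description above rather than the algorithm's exact lines. Several pieces are assumptions: `cost[s, a]` holds \(\text {Cost}\) for moving from state s to state a, `P[a, s2]` holds the transition probability from the post-rebalancing state a to s2, `m` is the number of partial evaluation sweeps, and the stopping test compares \(J_{n+1}\) with \(\tilde{J_{n}}\) in the infinity norm. The three-state example data are hypothetical.

```python
import numpy as np

def modified_policy_iteration(cost, P, gamma, target, m=10, eps=1e-3, max_rounds=10_000):
    """Approximate the value function and strategy of Eq. (4).

    cost[s, a]: rebalancing cost of moving from state s to state a (assumed layout).
    P[a, s2]:   probability of drifting from post-rebalancing state a to s2.
    """
    n_states = cost.shape[0]
    idx = np.arange(n_states)
    J = np.zeros(n_states)               # J_0 = 0
    policy = np.full(n_states, target)   # pi_0: always rebalance to the target state
    for _round in range(max_rounds):
        # Partial policy evaluation: apply the current strategy m times.
        J_tilde = J
        for _sweep in range(m):
            J_tilde = cost[idx, policy] + gamma * (P @ J_tilde)[policy]
        # Policy improvement: one greedy Bellman backup on J_tilde.
        Q = cost + gamma * (P @ J_tilde)[None, :]
        policy = Q.argmin(axis=1)
        J = Q.min(axis=1)
        # Assumed stopping rule: infinity norm of the Bellman residual.
        if np.max(np.abs(J - J_tilde)) < eps:
            break
    return J, policy

# Hypothetical 3-state example; state 1 is the target ratio.
fee, gamma = 0.05, 0.9
tracking = np.array([0.02, 0.0, 0.02])   # tracking error of each post-rebalancing state
steps = np.abs(np.arange(3)[:, None] - np.arange(3)[None, :])
cost = fee * steps + tracking[None, :]
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.2, 0.8]])
J, policy = modified_policy_iteration(cost, P, gamma, target=1)
print(J, policy)
```

States s with `policy[s] == s` form the no-trade region of this toy problem. Setting `m=0` reduces the loop to value iteration, while a very large `m` approaches full policy iteration.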

3 Construction of Two-Stage Multiresolution Grid Algorithm

To find the optimal portfolio rebalancing strategy that minimizes the overall costs of transaction and tracking errors, a Markov decision process is used to model portfolio weights with discrete states; changes in portfolio weights due to market price oscillations or rebalancing are represented by transitions between states. The optimal rebalancing strategy for each state is evaluated by solving the resulting Bellman equation. Although dividing the space of portfolio weights into finer partitions increases the accuracy of rebalancing solutions, it also increases the number of states and hence the running time of the rebalancing algorithm. It is thus critical to decrease the aforementioned discretization errors in a computationally tractable manner.

Instead of using ordinary dynamic programming with the uniform resolution mechanism (ODP), this article improves the rebalancing decision problem by a novel two-stage multiresolution grid algorithm (MRG) that varies state allocation resolution according to area importance. Specifically, portfolio weights move around the target ratio with high probability and are frequently rebalanced back to approximately this ratio, making the probability of staying near the ratio higher than the probability of straying far from the ratio. To decrease the expected discretization error due to categorizing portfolio weights into nearby states, we put more states in high-probability areas (i.e., high resolution) and fewer states in low-probability areas (low resolution). The first stage of MRG divides the space of portfolio weights into areas and determines the probability of each area and the upper bounds of the discretization errors, as described in Sect. 3.1. The second stage determines the area resolutions by the method of Lagrange multipliers so as to minimize the upper bound of the discretization errors, as described in Sect. 3.2.

3.1 Area Probabilities and Discretization Errors

Fig. 3 Division of the portfolio weight space into distinct areas and determination of the corresponding states. The x- and y-axes denote the value weights of the first and second assets in the portfolio, respectively

In the first stage, the portfolio weight space is divided into several areas and the probability of staying in each area is calculated as in Fig. 3. Note that allocations or rebalancing of \(\Theta +1\) assets can be modeled by a \(\Theta \)-dimensional space of portfolio weights. For example, \(\Theta =2\) in Fig. 3. The x- and y-coordinates denote the weights of the first and second asset, respectively. As the sum of the weights of all assets is 1, the weight of the third asset is one minus the weights of the first two assets.

The space of portfolio weights is divided into equal square areas with side length \(\Delta \). The target ratio E, denoted by the red dot, is centered in one of the square areas.

The center of each square area, say A, B, C, D, E, F, G, H, or I, denotes the state, or the representative portfolio weight of that area. Instead of measuring an infinite number of transitions between all possible portfolio weights (due to market price changes and portfolio rebalancing), the proposed algorithm considers the transitions between states to prevent computational intractability. Discretization errors, therefore, occur since all portfolio weights in the square area are represented by the corresponding state.

To estimate the importance of each area, the transition probability from the target ratio E to each area i, \(\texttt {P}(i)\), is estimated via Monte Carlo simulation. Specifically, the mean vector and the covariance matrix of the asset returns are calibrated by historical data as in Luenberger (2013); simulated asset returns are then generated to estimate the changes in portfolio weights and hence transition probabilities between areas (or between corresponding states).

We illustrate this with a numerical example based on the x- and y-axes of Fig. 3. The portfolio weight for state E is (20%,20%,60%), where the weight of the third asset is determined by \(60\%=100\%-20\%-20\%\). Assume the one-time-step returns for these three assets are \(5\%\), \(0\%\), and \(-5\%\). Then the portfolio return is \(20\%\times 5\%+20\%\times 0\%+60\%\times (-5\%)=-2\%\), and the portfolio weight is changed to

$$\begin{aligned}\left( \frac{20\%\times (1+5\%)}{1-2\%},\frac{20\%\times (1+0\%)}{1-2\%},\frac{60\%\times (1-5\%)}{1-2\%}\right) \approx (21.4\%,20.4\%,58.2\%).\end{aligned}$$

This weight is represented by state F with weight (22%, 20%, 58%).
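The arithmetic of this example can be reproduced with the short sketch below. The side length \(\Delta =2\%\) is an assumption inferred from the stated centers of E (20%, 20%) and F (22%, 20%); the snapping rule rounds the first two coordinates to the nearest area center and recovers the third weight from the budget constraint.

```python
import numpy as np

w_E = np.array([0.20, 0.20, 0.60])        # state E
returns = np.array([0.05, 0.00, -0.05])   # one-time-step asset returns

portfolio_return = float(w_E @ returns)                  # -0.02
w_new = w_E * (1.0 + returns) / (1.0 + portfolio_return)

delta = 0.02                                             # assumed side length of each area
steps = np.round((w_new[:2] - w_E[:2]) / delta)          # area offsets relative to E
state_xy = w_E[:2] + steps * delta
state = np.append(state_xy, 1.0 - state_xy.sum())

print(np.round(w_new, 3))   # drifted weights
print(np.round(state, 2))   # representative state
```

The drifted weight lies one area to the right of E along the x-axis and in the same row, which is the area represented by state F.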

Fig. 4 Estimation of the upper bound of discretization error and the effects of varying area resolutions. This figure divides the area into \(n^\Theta \) subareas, with \(\Theta =2\) and \(n=3\). The central blue state represents the original area center, while eight additional green states are added as the centers of the newly divided subareas. Red dotted lines demarcate these subareas

The estimation of the upper bound of the discretization error and the impact of changing the resolutions of an area are represented in Fig. 4. The side length is divided into n pieces (3 in this example) and the area is evenly cut into \(n^\Theta \) subareas (\(3^2\) in this example) denoted by red dashed lines. Note that any portfolio weight belonging to a subarea is adjusted to the center of the subarea, say the blue/green point. Since cost function J satisfies the Lipschitz continuity condition, as proved in Appendix B, the upper-bound discretization error of the cost changes due to portfolio weight adjustment is proportional to the Euclidean distance between the corner and the center of the subarea:

$$\begin{aligned} \frac{\sqrt{\Theta }\times \Delta }{2\times n}. \end{aligned}$$
(5)

Note that the runtime for each iteration in Algorithm 1 depends on the number of states \(\left| {\mathbb {W}}\right| \). Given limited computational resources (i.e., a fixed number of states), the performance of these algorithms can be further improved by allocating states in a non-uniform fashion according to area importance. \(\left| {\mathbb {W}}\right| \) states (such as the red, blue, and green nodes in Figs. 3 and 4) are allocated so as to minimize Expected_Error, the overall expected upper-bound discretization error of the portfolio weight adjustment cost:

$$\begin{aligned} \min _{n_i,\forall i\in {\mathbb {A}}} \texttt {Expected\_Error} \nonumber \\ s.t. \ \ \left| {\mathbb {W}}\right| - \sum _{i\in {\mathbb {A}}}n_i^{\Theta }=0, \end{aligned}$$
(6)

where \({\mathbb {A}}\) denotes the set of all areas. As the side length of area i is cut into \(n_i\) pieces, \(n_i^{\Theta }\) states are allocated to area i. Expected_Error is the lump sum of the upper-bound discretization error due to portfolio weight adjustment for each area i and is proportional to the sum of Eq. (5) multiplied by the probability to reach the area:

$$\begin{aligned} \texttt {Expected\_Error} \propto \sum _{i\in {\mathbb {A}}}\frac{\sqrt{\Theta }*\Delta }{2n_i}\times \texttt {P}(i). \end{aligned}$$
(7)

3.2 Lagrange Multiplier Method

Fig. 5 Multiresolution grid model

The resolution for (or the number of states allocated to) each area is optimally solved as depicted in Fig. 5 to minimize the Expected_Error under the constraint defined in Eq. (6) via Lagrange multipliers. Since the \(\frac{\sqrt{\Theta }*\Delta }{2}\) term in Eq. (7) is a constant, it suffices to solve

$$\begin{aligned} \min _{n_i, \forall i\in {\mathbb {A}}} \sum _{i\in {\mathbb {A}}}\frac{1}{n_i}\times \texttt {P}(i) \nonumber \\ s.t. \ \ \left| {\mathbb {W}}\right| - \sum _{i\in {\mathbb {A}}}n_i^{\Theta }=0. \end{aligned}$$
(8)

Without loss of generality, the space of portfolio weights is assumed to be divided into nine areas, A, B, \(\ldots \), I as in Fig. 3 to simplify the subsequent derivation. The optimization problem defined in Eq. (6) can be simplified by the property in Eq. (8) as

$$\begin{aligned} \min _{\left[ n_A,\ldots ,n_I \right] } \{P(A)\frac{1}{n_A}+P(B)\frac{1}{n_B}+\cdots +P(I)\frac{1}{n_I}\}, \nonumber \\ s.t. \ \ n_A^{\Theta }+n_B^{\Theta }+\cdots +{n_I^{\Theta }}=|{\mathbb {W}} |. \end{aligned}$$
(9)

The Lagrangian function with multiplier \(\lambda \) is

$$\begin{aligned} L(n_A,\ldots ,n_I,\lambda ) = P(A)\frac{1}{n_A}+P(B)\frac{1}{n_B}+\cdots +P(I)\frac{1}{n_I}+ \lambda (n_A^{\Theta }+n_B^{\Theta }+\cdots +{n_I^{\Theta }}-|{\mathbb {W}} |). \end{aligned}$$
(10)

Differentiating Eq. (10) with respect to \(n_i\) for every \(i\in \{A,B, \ldots ,I \}\) and equating the results to zero yields

$$\begin{aligned} \frac{\partial }{\partial n_i}L(n_A,\ldots ,n_I,\lambda ) = -P(i)\frac{1}{n_i^2}+ \lambda \Theta n_i^{\Theta -1}=0. \end{aligned}$$
(11)

\(n_i\) is then solved to be

$$\begin{aligned} n_i = \left[ \frac{P(i)}{\lambda \Theta } \right] ^\frac{1}{\Theta +1}. \end{aligned}$$
(12)

Differentiating Eq. (10) with respect to \(\lambda \) and setting the result to zero yields

$$\begin{aligned} \frac{\partial }{\partial \lambda }L(n_A,\ldots ,n_I,\lambda ) = n_A^{\Theta }+n_B^{\Theta }+\cdots +{n_I^{\Theta }} - |{\mathbb {W}} |=0. \end{aligned}$$
(13)

For every \(i\in \{A,B, \ldots ,I\}\), substituting \(\left[ \frac{P(i)}{\lambda \Theta } \right] ^\frac{1}{\Theta +1}\) in Eq. (12) for \(n_i\) in Eq. (13) yields

$$\begin{aligned} \lambda = \left( \frac{\left[ P(A)^\frac{\Theta }{\Theta +1} + P(B)^\frac{\Theta }{\Theta +1} +\cdots + P(I)^\frac{\Theta }{\Theta +1} \right] }{|{\mathbb {W}} |\times \Theta ^{\frac{\Theta }{\Theta +1}}}\right) ^{\frac{\Theta +1}{\Theta }}. \end{aligned}$$
(14)

Finally, \(n_i\) is solved by substituting \(\lambda \) obtained in Eq. (14) into Eq. (12). Since \(n_i\) is the number of pieces into which a side of area i is cut (and thus determines the number of vertical or horizontal red dotted lines illustrated in Fig. 4), it must be an integer; in subsequent numerical experiments, the side length of area i is therefore divided into \(\lceil n_i \rceil \) pieces. This rounding increases the number of states involved in MRG; in later experiments, \(|{\mathbb {W}} |\) and \(\#(\textrm{states})\) are used to denote the state budget in the Lagrange multiplier constraint and the actual number of states used in MRG, respectively. Varying \(\lceil n_i \rceil \) reflects the importance of area i and the changing resolution as depicted in Fig. 5.
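The closed-form allocation in Eqs. (12) and (14) can be sketched as follows. The nine area probabilities are hypothetical (the paper estimates them by Monte Carlo simulation), the function names are illustrative, and the sketch assumes every area has a strictly positive probability so that no \(n_i\) is zero.

```python
import numpy as np

def allocate_resolution(probs, theta, n_states):
    """Return the continuous n_i of Eq. (12) and the rounded-up pieces per side."""
    probs = np.asarray(probs, dtype=float)
    e = theta / (theta + 1.0)
    lam = ((probs ** e).sum() / (n_states * theta ** e)) ** (1.0 / e)  # Eq. (14)
    n = (probs / (lam * theta)) ** (1.0 / (theta + 1.0))              # Eq. (12)
    return n, np.ceil(n).astype(int)

def expected_error(probs, pieces, theta, delta):
    """Right-hand side of Eq. (7) for a given number of pieces per area."""
    probs = np.asarray(probs, dtype=float)
    pieces = np.asarray(pieces, dtype=float)
    return float(np.sum(np.sqrt(theta) * delta / (2.0 * pieces) * probs))

# Hypothetical probabilities for areas A..I (E is the center area).
probs = [0.02, 0.08, 0.02, 0.08, 0.60, 0.08, 0.02, 0.08, 0.02]
theta, budget, delta = 2, 81, 0.02
n, pieces = allocate_resolution(probs, theta, budget)

uniform = np.full(9, 3.0)   # ODP-style grid: 3 x 3 states in every area, 81 in total
print(np.round(n, 2), pieces, int((pieces ** theta).sum()))
print(expected_error(probs, n, theta, delta), expected_error(probs, uniform, theta, delta))
```

The continuous solution meets the budget \(\sum _i n_i^{\Theta }=|{\mathbb {W}} |\) exactly, whereas the rounded-up allocation uses somewhat more states, which is why the budget and the actual state count are reported separately.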

4 Experimental Results

To evaluate the superiority of MRG, Sect. 4.1 compares the relationships among the running time, allocation of computational resources (\(|{\mathbb {W}} |\) and \(\#(\textrm{states})\)), the expected upper bound of the discretization error Expected_Error, and the lump sum of the future expected discounted costs of transaction and tracking errorsFootnote 9 (denoted as Total Cost in the following experiments) for ODP and MRG. The investment performance for MRG and other related rebalancing methods discussed in investment textbooks such as Qian (2020) are compared in Sect. 4.2.

4.1 Comparison Between ODP and MRG

The investment portfolios in the following experiments are composed of strategic assets including stock and bond indexes described in Appendix A. The expected returns and covariance matrix of strategic assets are calibrated with historical trading records during the period from 1990 to 1999. The investment period is from 2000 to 2019 unless stated otherwise. The target ratios are determined by the mean–variance analysis stated in Eq. (1). To promote “inclusive finance,” a goal widely mentioned among financial technology trends, our industrial cooperation partner (a commercial bank in Taiwan) plans to design a strategic asset allocation for individuals with limited assets. In practice, an initial set-up cost is required for investing in a strategic asset; transaction costs increase with the number of strategic assets in which one’s wealth is invested. Thus, it is inefficient to allocate excessively small amounts of wealth to specific assets. To reduce this cost, strategic assets whose weights fall below a threshold, say \(5\%\), are removed as required by the cooperating bank. The remaining assets are substituted into Eq. (1) to find the target ratio that meets the \(5\%\) constraint. The target portfolio is composed of assets SHCOMP, SENSEX, MXLA, and SPX with ratios of \(28\%\), \(22\%\), \(25\%\), and \(25\%\), respectively.

The computational time complexities for both ODP and MRG with different \(\Delta \) are analyzed first as follows. Although it is difficult to theoretically analyze the time complexity for the modified policy iteration in Algorithm 1 due to the unknown number of iterations, the growth rate of the computation time T can be empirically determined in terms of \(\#(\textrm{states})\), that is, the number of states. Specifically, let \(T\in O\left( \#(\textrm{states})^k\right) \) for a positive constant k, which yields \(T=c\times \left( \#(\textrm{states})\right) ^k\) for a constant c. Taking the logarithm on both sides of the above equation yields \(\log T = \log c+k\log \left( \#(\textrm{states})\right) \).

Slope k can be interpreted as the growth rate of the running time with respect to \(\#(\textrm{states})\). Term \(\log c\) can be considered a measure of all factors other than \(\#(\textrm{states})\) that influence computation time T. Figure 6 illustrates this linear relationship between \(\log T\) and \(\log \left( \#(\textrm{states})\right) \) for MRG with different \(\Delta \) and for ODP. The growth rate k and \(\log c\) are estimated by applying OLS regression, as shown in Table 1. Regardless of the \(\Delta \) setting and the rebalancing method, the growth rates are approximately 2.2, which implies that the running time of both ODP and MRG grows slightly faster than quadratically in the number of states. The \(\log c\) values of \(\texttt {ODP}\) and \(\texttt {MRG}\) are approximately \(-11\). This suggests that factors other than \(\#(\textrm{states})\), such as the number of iterations for executing modified policy iteration in Algorithm 1, have a minor impact on the execution time T.
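The log–log regression above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `fit_power_law` is ours, natural logarithms are used (the slope k does not depend on the base, but \(\log c\) does), and the data are synthetic values generated from an exact power law rather than the measurements behind Table 1.

```python
import numpy as np

def fit_power_law(n_states, runtimes):
    """Estimate k and log c in log T = log c + k * log(#states) by OLS."""
    x = np.log(np.asarray(n_states, dtype=float))
    y = np.log(np.asarray(runtimes, dtype=float))
    k, log_c = np.polyfit(x, y, deg=1)  # slope first, then intercept
    return k, log_c

# Synthetic illustration only: runtimes generated from T = c * n^k.
n_states = np.array([200, 400, 800, 1600, 3200])
runtimes = np.exp(-11.0) * n_states ** 2.2
k, log_c = fit_power_law(n_states, runtimes)
```

With measured runtimes, the fitted points would scatter around the line, and the residuals would indicate how well a single power law describes the growth.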

Fig. 6 Relation between number of states and runtime

Table 1 Linear regressions for logarithmic (#states) and runtime in Fig. 6

The relation among \(\Delta \), \(|{\mathbb {W}} |\) (see Footnote 10), and Expected_Error is analyzed in Table 2. Clearly, increments in \(|{\mathbb {W}} |\) increase the resolution of the portfolio weight space and hence reduce Expected_Error. The advantage of MRG is verified by its lower Expected_Error than that of ODP regardless of the changes in \(\Delta \). The \(O\left( \frac{1}{\root \Theta \of {\#(\textrm{states})}}\right) \) convergence rate of Expected_Error is confirmed by the adjusted \(R^2\approx 1\) for linear regressions between Expected_Error and \(\frac{1}{\root \Theta \of {\#(\textrm{states})}}\), as reported in Table 3.

Since the convergence rate of Expected_Error (i.e., the coefficient of \(\frac{1}{\root \Theta \of {\#(\textrm{states})}}\)) is the highest when \(\Delta =2.5\%\), MRG in our later experiments is based on \(\Delta =2.5\%\).
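A regression of this kind can be sketched as follows. The code is illustrative only: the function name `fit_error_rate` is ours, the value `theta=3` is a placeholder for \(\Theta \) rather than the value used in the paper, including an intercept in the regression is an assumption, and the error values are synthetic rather than those of Tables 2 and 3.

```python
import numpy as np

def fit_error_rate(n_states, errors, theta):
    """OLS of errors on x = n_states**(-1/theta).

    Returns (slope, intercept, adjusted R^2).
    """
    x = np.asarray(n_states, dtype=float) ** (-1.0 / theta)
    y = np.asarray(errors, dtype=float)
    X = np.column_stack([np.ones_like(x), x])  # intercept column, regressor
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    n, p = len(y), 1  # observations, regressors excluding the intercept
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return beta[1], beta[0], adj_r2

# Synthetic illustration only: errors decaying exactly at the stated rate.
theta = 3  # placeholder value for Theta
n_states = np.array([200, 400, 800, 1600, 3200])
errors = 0.5 * n_states ** (-1.0 / theta)
slope, intercept, adj_r2 = fit_error_rate(n_states, errors, theta)
```

The slope plays the role of the coefficient compared across \(\Delta \) settings, and an adjusted \(R^2\) close to one indicates that the error follows the stated rate.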

Table 2 Relation between \(|{\mathbb {W}} |\) and Expected_Error
Table 3 Relation between \((\#\textrm{states})\) and Expected_Error

The rebalancing performance of ODP and MRG is compared in Tables 4 and 5, respectively. \(\#(\textrm{rebalances})\) denotes the number of rebalances during the trading period. TC and TE denote the transaction cost and tracking error defined in Eqs. (2) and (3), respectively. Increments in \(\#(\textrm{rebalances})\) provide more opportunities to adjust portfolio weights closer to the target ratios; thus, tracking errors are reduced at the expense of increasing transaction costs. A good rebalancing algorithm minimizes the total cost (the lump sum of TC and TE) by optimizing its rebalancing action as in Eq. (4) with constrained computational resources (i.e., the number of grid points or resolution).

Table 4 Impact of changing resolution \(\Delta \) on ODP performance
Table 5 Impact of changing resolution on MRG performance with \(\Delta =0.025\)

As ODP allocates one state for each area, its resolution or \(\#(\textrm{states})\) increases with decrements in the side length \(\Delta \). However, as MRG allocates extra states to subareas as in Fig. 5, the resolution or \(\#(\textrm{states})\) of MRG increases with increments in the Lagrange multiplier constraint \(|{\mathbb {W}} |\) defined in Eq. (6). Increments in resolution decrease the discretization errors of the portfolio weights as in Fig. 4 and hence increase the accuracy of rebalancing policies; this reduces transaction costs, tracking errors, and thus total costs at the cost of increasing computational times. However, MRG can achieve better accuracy with fewer computational resources. For example, the scenario \(|{\mathbb {W}} |=500\) in Table 5 uses 780 states and a 42-second runtime to achieve a total cost of 0.012962, which is lower than the 0.01318 obtained in the \(\Delta =0.015\) scenario in Table 4, even though the latter requires more computational resources (1461 states and 206 s). In addition, increasing \(|{\mathbb {W}} |\) to 1500 reduces the total cost to 0.012154, which is lower than the total cost for the finest \(\Delta =0.01\) scenario (i.e., 0.01245); the former scenario, however, requires far fewer resources (1624 states and 221 s) than the latter (4897 states and 2820 s).
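The resource comparison above amounts to simple arithmetic on the figures quoted from Tables 4 and 5. The short sketch below makes the ratios explicit; the dictionary layout and the helper `compare` are ours and serve only as an illustration.

```python
# Figures quoted in the text from Tables 4 (ODP) and 5 (MRG).
scenarios = {
    "ODP, Delta=0.015": {"states": 1461, "runtime_s": 206, "total_cost": 0.01318},
    "MRG, |W|=500": {"states": 780, "runtime_s": 42, "total_cost": 0.012962},
    "ODP, Delta=0.01": {"states": 4897, "runtime_s": 2820, "total_cost": 0.01245},
    "MRG, |W|=1500": {"states": 1624, "runtime_s": 221, "total_cost": 0.012154},
}

def compare(mrg, odp):
    """Ratios of ODP resources to MRG resources, and MRG's relative cost saving."""
    return {
        "state_ratio": odp["states"] / mrg["states"],
        "speedup": odp["runtime_s"] / mrg["runtime_s"],
        "cost_reduction_pct": 100.0 * (odp["total_cost"] - mrg["total_cost"])
        / odp["total_cost"],
    }

coarse = compare(scenarios["MRG, |W|=500"], scenarios["ODP, Delta=0.015"])
fine = compare(scenarios["MRG, |W|=1500"], scenarios["ODP, Delta=0.01"])
```

In the first pair, MRG uses roughly half the states and runs about 4.9 times faster; in the second pair, it uses about a third of the states and runs about 12.8 times faster, while reaching a lower total cost in both cases.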

Since the upper bound of the discretization errors in Table 2 and the rebalancing performance in Tables 4 and 5 all suggest that MRG outperforms ODP, subsequent experiments focus on comparisons between MRG and other rebalancing methods.

4.2 Grand Investment Performance Comparison

This section comprehensively compares the performance of MRG with popular traditional rebalancing methods such as the buy-and-hold strategy, periodic rebalancing, and threshold rebalancing under different risk aversion coefficients \(\alpha \) and investment periods. Table 6 lists the target ratios generated by Eq. (1) and the corresponding mean (denoted by Mean) as well as the standard deviation (Std.) of the portfolio return under different \(\alpha \) settings. Increments in \(\alpha \) decrease the investment risk (proxied by Std.) at the expense of profitability (reflected by decreasing Mean).

Table 6 Target ratio given risk aversion

Tables 7, 8, 9, 10 and 11 illustrate the impact of \(\alpha \) on the performance of investment strategies. Twenty-year strategic asset price evolutions are simulated one million times via Monte Carlo simulation. All investment indicators in these tables are averages obtained by applying a rebalancing strategy to these one million simulations. Increments in \(\alpha \) increase the likelihood of adjusting the portfolio weight and hence \(\#(\textrm{rebalances})\), except for the buy-and-hold strategy, which does not adjust the portfolio. Increasing \(\#(\textrm{rebalances})\) decreases tracking errors and hence the total costs at the expense of profitability, as reflected by decreasing average returns. However, it also increases the Sharpe ratio (see Footnote 11), which measures the excess return for bearing a unit of risk.
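As a reminder of how such a ratio is typically computed, the sketch below uses the common textbook form (mean excess return divided by the sample standard deviation of excess returns). This is an assumption: the exact definition and any annualization convention used in the experiments are not restated here, and the return series is made up for illustration.

```python
import numpy as np

def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return per unit of risk (sample standard deviation).

    Assumption: textbook definition without annualization.
    """
    excess = np.asarray(returns, dtype=float) - risk_free
    return excess.mean() / excess.std(ddof=1)

# Made-up periodic returns, for illustration only.
example_returns = [0.01, 0.03, 0.02, 0.04]
example_sharpe = sharpe_ratio(example_returns)
```

In a Monte Carlo setting, the same function would be applied to the returns of each simulated path (or to the distribution of terminal returns), depending on the convention adopted.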

In addition to MRG, Tables 7, 8, 9, 10 and 11 compare the investment performance of popular rebalancing strategies found in financial textbooks such as Qian (2020). Periodic (i) denotes periodic rebalancing, which adjusts the portfolio weight back to the target ratio every i months. Thus \(\#(\textrm{rebalances})\) increases with increments in the rebalancing frequency (or decrements in i). Tolerance (\(j\%\)) denotes tolerance band rebalancing, which adjusts the portfolio weights back to the target ratio once one of the asset weights diverges from the target ratio by \(j\%\). An increment of \(j\%\) decreases the rebalancing likelihood and hence \(\#(\textrm{rebalances})\). As a buy-and-hold strategy never adjusts the portfolio weight, \(\#(\textrm{rebalances})\) is zero. Regardless of the changes to \(\alpha \), the total costs of MRG are almost always smaller than those for other methods, suggesting that MRG strikes a better balance between transaction costs and tracking errors than other rebalancing methods. MRG also exhibits higher expected portfolio returns and Sharpe ratios than periodic and tolerance band rebalancing. Although the buy-and-hold strategy has the highest average return, the total cost and Sharpe ratio are the poorest, showing that it bears a far higher risk than the other methods.
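The benchmark rules described above can be sketched as follows. This is a simplified illustration, not the simulation code used for Tables 7 to 11: transaction costs and tracking errors are not modeled, the tolerance band is interpreted as an absolute deviation of \(j\) percentage points (an assumption), and all function names are ours.

```python
import numpy as np

def drift(weights, returns):
    """Portfolio weights after one period of asset returns, with no trading."""
    grown = weights * (1.0 + returns)
    return grown / grown.sum()

def count_rebalances(target, monthly_returns, rule):
    """Apply a rebalancing rule along one path and count the rebalances.

    rule(month, weights, target) returns True when the portfolio should be
    reset to the target ratio at the end of that month.
    """
    target = np.asarray(target, dtype=float)
    weights = target.copy()
    n_rebalances = 0
    for month, r in enumerate(monthly_returns, start=1):
        weights = drift(weights, np.asarray(r, dtype=float))
        if rule(month, weights, target):
            weights = target.copy()
            n_rebalances += 1
    return n_rebalances

def periodic(i):
    """Periodic (i): rebalance every i months."""
    return lambda month, w, target: month % i == 0

def tolerance(j_pct):
    """Tolerance (j%): rebalance once any weight deviates by at least j points."""
    band = j_pct / 100.0
    return lambda month, w, target: bool(np.max(np.abs(w - target)) >= band)

def buy_and_hold():
    """Never rebalance."""
    return lambda month, w, target: False

# One simulated 20-year path of monthly returns (illustrative parameters).
rng = np.random.default_rng(0)
path = rng.normal(loc=0.005, scale=0.05, size=(240, 4))
target = [0.28, 0.22, 0.25, 0.25]
counts = {
    "Periodic (3)": count_rebalances(target, path, periodic(3)),
    "Tolerance (5%)": count_rebalances(target, path, tolerance(5)),
    "Buy and hold": count_rebalances(target, path, buy_and_hold()),
}
```

Averaging such counts over many simulated paths would give the \(\#(\textrm{rebalances})\) column for the benchmark strategies under the assumed return model.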

Table 7 Comparison of rebalancing methods with \(\alpha =2\)
Table 8 Comparison of rebalancing methods with \(\alpha =3\)
Table 9 Comparison of rebalancing methods with \(\alpha =4\)
Table 10 Comparison of rebalancing methods with \(\alpha =5\)
Table 11 Comparison of rebalancing methods with \(\alpha =6\)

Table 12 compares the Sharpe ratios of various rebalancing strategies for investing from 2000 to 2020. Sharpe ratios are listed for each two-year period. The Sharpe ratios of MRG are generally higher than those of the other methods. In addition, the Sharpe ratios for all rebalancing strategies are generally similar in the short run. However, in the long term, MRG significantly outperforms the other strategies. This supports the robustness of MRG against financial crises such as those that occurred during 2000–2020: the dot-com bubble, the financial tsunami of 2008, and the European debt crisis.

Table 12 Sharpe ratios of rebalancing methods for investing from 2000 to the year listed in the second row with \(\alpha =4\)

5 Conclusion

This article improves on traditional uniform-resolution-based rebalancing strategies by proposing the multiresolution scheme illustrated in Fig. 5. To minimize the upper bound of the discretization error under constrained computational resources (i.e., \(|{\mathbb {W}} |\) in Eq. (8)), grid nodes are allocated in a non-uniform fashion according to area importance in the portfolio weight space. Each area’s importance is estimated by the probability of reaching the area, and the optimal node allocation is determined by a Lagrange multiplier. Experiments show that MRG outperforms ODP and popular rebalancing strategies such as the periodic, tolerance band, and buy-and-hold strategies in that it efficiently strikes a balance between transaction costs and tracking errors to achieve higher Sharpe ratios.