Introduction

The seminal asset allocation model of Markowitz (1952) has been the cornerstone of portfolio construction ever since its publication. However, extensive research documents three main limitations: it produces unstable, concentrated, and underperforming portfolios. Michaud (1989) provides a detailed exploration of these issues. The finance literature has proposed numerous solutions, including the works of Black and Litterman (1992) and Ledoit and Wolf (2004), among others.

The literature on portfolio construction has recently focused on two fields of application: clustering and graph theory. On the clustering side, the main representative is the hierarchical risk parity (HRP) of de Prado (2016), which applies hierarchical clustering to securities based on correlations. This groups similar investments while distancing dissimilar ones, with weights assigned via inverse-variance allocation. Raffinot (2017) further provided a comprehensive framework for portfolio construction employing hierarchical clustering.

Peralta and Zareei (2016) took the pioneering step of introducing graph theory into the portfolio construction literature. They demonstrated the close relationship between graph centrality measures and optimal portfolio weights. Furthermore, they designed portfolios by equally distributing capital to the most central securities in low-volatility periods and rebalancing to the least central securities during high-volatility periods. Výrost et al. (2019) constructed optimal portfolios based on four graphical representations of securities: a complete graph, a minimum spanning tree, a planar maximally filtered graph, and a threshold significance graph.

This paper introduces network risk parity (NRP), a novel graph theory-based portfolio construction method that produces fully invested, long-only portfolios. We derive network risk parity by drawing a parallel with hierarchical risk parity: while hierarchical risk parity is based on hierarchical clustering, network risk parity builds portfolios with graph theory. The connection between the two methodologies is that both calculate portfolio weights proportionally to the inverse of the eigenvalues of a modified covariance matrix. In hierarchical risk parity, the optimal weights are calculated as the inverse of each security’s variance, obtained from the main diagonal of a quasi-diagonal covariance matrix that is modified to incorporate the hierarchical structure derived through hierarchical clustering. We demonstrate that the weighting scheme employed by HRP is equivalent to taking the inverse of the eigenvalues of that same covariance matrix after an essential quasi-diagonalization step. Network risk parity, on the other hand, operates on a different principle. It utilizes an adjacency matrix based on covariances, one of whose eigenvectors is the eigenvector centrality, a measure of the influence of a node in a network. In NRP, the optimal portfolio weights are calculated as the inverse of the eigenvector centrality, and a softmax normalization is applied to these weights to ensure a fully invested, long-only portfolio.

Three advantages of network risk parity stem from transposing clustering-based methodologies into a graph theory framework. First, NRP, grounded in the principle of eigenvector centrality, captures relationships among securities in a one-to-many fashion, since eigenvector centrality embeds the importance of neighboring nodes as well. Hierarchical clustering, on the other hand, captures only one-to-one relationships, because its agglomerative approach looks at pairwise distances. Second, hierarchical clustering depends on the choice of a distance and a linkage function, whereas NRP is based on the minimum spanning tree (MST), which depends solely on the function used to convert correlations into distances. Hierarchical risk parity in particular has been criticized for its adoption of the single linkage function, which separates securities based on the distance of the nearest points within clusters, causing a chaining effect that expands the tree and impacts portfolio weights (Papenbrock 2011). Third, leveraging graph theory, we can prove that the portfolio weights of NRP and HRP have a lower bound greater than zero, thus assigning a positive weight to each portfolio constituent and improving portfolio diversification, a pitfall of the classic mean-variance approach.

Using a bootstrapping approach, we compare the Sharpe ratio of NRP with that of HRP, risk parity (RP), Markowitz’s minimum variance optimization (MVO), and equally weighted (EW) portfolios. NRP consistently outperforms the other methods; its performance relative to HRP depends on the number of stocks in the portfolio, overtaking HRP as the portfolio size increases.

The rest of the paper is organized as follows: Section 2 introduces NRP and compares it to HRP. Section 3 presents the empirical results in terms of bootstrapped Sharpe ratio and weights. Section 4 concludes and discusses future work.

Methodology

Covariance

The starting point of network risk parity, similar to most portfolio construction methodologies, is to estimate the covariance matrix of asset returns. We use hourly security prices and define log-returns as:

$$\begin{aligned} {r}_{i,\tau } = \ln \left( \frac{{p}_{i,\tau }}{{p}_{i,\tau -1}}\right) \quad \forall \tau = 1, \ldots , T; \quad \forall i = 1, \ldots , n \end{aligned}$$

where \(r_{i,\tau }\) is the return of the i-th asset at time \(\tau\), \(p_{i,\tau }\) is the price of the i-th asset at time \(\tau\), and T is the number of hours in each month. We estimate the conditional covariance by employing the methodology of Barndorff-Nielsen and Shephard (2004), who show that conditional covariances can be estimated through non-parametric realized covariances, \({RC}_t\), which converge in probability to the quadratic variation of the price process under very general assumptions. Thus, we calculate the monthly realized covariance, \({RC}_t\), as the aggregation of the cross-products of the \(N_t\) hourly return vectors in month t, such that:

$$\begin{aligned} {RC}_t =\sum ^{N_t}_{\tau = 1}{r_{\tau }r^{'}_{\tau }}. \end{aligned}$$
(1)

This ensures positive definite realized covariance matrices and makes the covariance fully observable and modelable with any time series model. The use of hourly returns provides a higher number of observations, which in turn increases the robustness of our covariance matrix estimation. It should be noted, however, that network risk parity is a covariance-agnostic portfolio construction method: while accurate covariance estimation is crucial for any portfolio construction approach, the primary focus of this paper is not to provide an improved estimate of covariances.
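
As a concrete illustration, the following minimal sketch computes Eq. (1) from one month of hourly prices. It assumes a pandas DataFrame of prices with hours in the rows and assets in the columns; the function and variable names are ours, for illustration, and not the authors’ code.

```python
import numpy as np
import pandas as pd

def realized_covariance(prices: pd.DataFrame) -> np.ndarray:
    """Monthly realized covariance (Eq. 1) from one month of hourly prices."""
    # Hourly log-returns r_tau = ln(p_tau / p_{tau-1}), computed per asset
    r = np.log(prices / prices.shift(1)).dropna().to_numpy()
    # Sum over tau of the outer products r_tau r_tau', which for the
    # (N_t x n) return matrix is exactly r.T @ r
    return r.T @ r
```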

Review of Hierarchical Risk Parity

Hierarchical risk parity produces portfolios through a three-step process. First, hierarchical clustering is performed on correlation-based distance measures, starting from the correlation matrix of asset returns. Since clustering requires a distance measure to encapsulate the relationship between securities, we transform correlations into Euclidean metrics. For this transformation, following Mantegna (1999), we define the distance as \(d_{i,j}= 1-{{\rho }_{i,j}}^2\). The AGNES hierarchical clustering algorithm is then applied to these distances, separating securities into clusters organized in a linkage matrix. In the second step, the covariance matrix is reordered according to the linkage matrix (quasi-diagonalization) so that the largest values align along the main diagonal. Finally, in the third step, optimal weights are calculated as the inverse of each security’s variance, read off the main diagonal after the quasi-diagonalization step.
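
The sketch below makes these three steps concrete with SciPy. It reproduces only what is described above – the distance transformation, single-linkage clustering, quasi-diagonalization, and inverse-variance weights – whereas de Prado’s full algorithm allocates through recursive bisection within clusters; all names are illustrative.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, leaves_list
from scipy.spatial.distance import squareform

def hrp_inverse_variance(cov: np.ndarray, corr: np.ndarray) -> np.ndarray:
    """Sketch of the three HRP steps described in the text."""
    # Step 1: distances d_ij = 1 - rho_ij^2, then AGNES with single linkage
    dist = 1.0 - corr ** 2
    np.fill_diagonal(dist, 0.0)
    link = linkage(squareform(dist, checks=False), method="single")
    # Step 2: leaf order that quasi-diagonalizes the covariance matrix
    order = leaves_list(link)
    cov_qd = cov[np.ix_(order, order)]
    # Step 3: inverse-variance weights from the reordered diagonal
    ivar = 1.0 / np.diag(cov_qd)
    w = ivar / ivar.sum()
    out = np.empty_like(w)
    out[order] = w  # map weights back to the original asset order
    return out
```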

Graph theory background

For a comprehensive introduction to graph theory, refer to Bollobás (1998, 2001). Consider a directed weighted graph \(G= (V,E,W)\) formed by a finite set of vertices V, a set of directed edges \(E\subset V \times V\), where each \(\left( v_x,v_y\right) \in E\) represents a link from \(v_x\in V\) to \(v_y \in V\), and a set of weights \(W:E\rightarrow {\mathbb {R}}_{++}\) defined on the edges. Two nodes \(v_x,v_y \in V\) are said to be adjacent if there exists an edge \((v_x,v_y) \in E\). The adjacency matrix \(A_G\) of the graph is defined as the square matrix whose entries are \(a_{i,j} = w_{i,j}\) if \((v_i,v_j) \in E\), where \(w_{i,j}\) is the weight of the edge, and \(a_{i,j} =0\) otherwise.

In a financial setup, a graph can be used to represent a financial market: each security is represented by a vertex, and the relationship between each pair of securities is represented by an edge. The simplest way of measuring the relationships among securities is to use linear correlations. As in clustering, the adjacency matrix must be defined on a metric; hence, we apply the same \(d_{i,j}= 1-{{\rho }_{i,j}}^2\) transformation. Moreover, the diagonal of the adjacency matrix is set to zero to avoid self-loops. As such, a non-zero entry in the adjacency matrix indicates the existence of a financial relationship between a pair of securities with strength \(d_{i,j}\).

A graph built on a correlation matrix is a complete digraph, i.e., a graph in which every pair of vertices is connected by a pair of unique edges; this is where HRP connects to graph theory. However, correlation matrices lack the notion of hierarchy. Simon (1991) argues that complex systems, such as financial markets, can be arranged in a natural hierarchy comprising nested substructures. The goal of codependence analysis is choosing which cross-security relations really matter; from a tree representation standpoint, this means choosing which links in the tree are significant and removing the others. de Prado (2016) argues that the lack of hierarchical structure makes portfolio weights vary in unintended ways in an asset allocation problem. For this reason, complete digraphs add no information beyond the correlation matrix itself, while other subgraphs – such as spanning trees – can better serve financial needs by incorporating a hierarchical representation and selecting the links between securities that really matter.

We employ the minimum spanning tree (MST) (see Appendix B for a graphical representation), a subgraph of the complete digraph that includes all vertices while selecting the set of edges with minimum total weight, i.e., solving \(\min _{S \subseteq E} \sum _{e\in S}{W(e)}\) subject to the edge set S spanning all vertices. To find minimum spanning trees, we employ the algorithm of Kruskal (1956). It is worth noting that there are several parallels between Kruskal’s algorithm for minimum spanning trees and the AGNES algorithm for hierarchical clustering. First, both algorithms start with a set of fully disconnected nodes and iteratively build clusters. Second, both algorithms use a greedy approach to form clusters: in Kruskal’s algorithm, the edges of the graph are sorted by weight based on the distance between securities, and at each step the algorithm adds the next (lowest-weight) edge that connects two previously unconnected components; similarly, in AGNES, at each step the algorithm merges the two clusters that are closest to each other according to the chosen distance metric. Finally, both algorithms produce a hierarchy of clusters, which can be visualized as an MST or a dendrogram, respectively.
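
As a minimal sketch, assuming a precomputed correlation matrix, the MST can be built with networkx, whose "kruskal" option implements the algorithm referenced above; the function name is ours.

```python
import numpy as np
import networkx as nx

def correlation_mst(corr: np.ndarray) -> nx.Graph:
    """MST of the market graph with edge weights d_ij = 1 - rho_ij^2."""
    n = corr.shape[0]
    G = nx.Graph()
    for i in range(n):
        for j in range(i + 1, n):
            G.add_edge(i, j, weight=1.0 - corr[i, j] ** 2)
    # Kruskal's algorithm, as in the paper
    return nx.minimum_spanning_tree(G, algorithm="kruskal")
```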

A minimum spanning tree allows quantifying how securities influence each other by means of a centrality measure. A centrality measure \(C:V \rightarrow {\mathbb {R}}_+\) is a function that assigns a non-negative value to each node such that the higher the value, the more the node is connected to others. One such measure is eigenvector centrality, according to which a security displays high centrality either through direct links to other securities or by being connected to securities that are themselves highly connected; the higher the eigenvector centrality, the more central a security is in the tree. Eigenvector centrality is based on the idea that a node’s importance is determined by the importance of the nodes it is connected to. In formulas, the eigenvector centrality \(\zeta (v)\) of a node v is given by:

$$\begin{aligned} \zeta \left( v\right) = \frac{1}{\lambda _{\textrm{max}}} \sum _{v' \in N(v)}{\zeta (v')} \end{aligned}$$
(2)

where N(v) is the set of neighbors of node v and \(\lambda _{\textrm{max}}\) is the largest eigenvalue of the adjacency matrix.
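
Since the adjacency matrix of an undirected graph is symmetric, Eq. (2) can be solved directly with a dense eigendecomposition. The sketch below is illustrative and ours (networkx’s eigenvector_centrality_numpy offers an equivalent weighted computation):

```python
import numpy as np

def eigenvector_centrality(A: np.ndarray) -> np.ndarray:
    """Eq. (2): the eigenvector of the symmetric adjacency matrix A
    associated with its largest eigenvalue lambda_max."""
    eigvals, eigvecs = np.linalg.eigh(A)
    zeta = eigvecs[:, -1]  # eigh sorts eigenvalues in ascending order
    # Perron-Frobenius: for a connected, non-negative A the leading
    # eigenvector can be chosen entrywise positive; fix the sign here
    return np.abs(zeta)
```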

Network Risk Parity

We derive the network risk parity methodology from a parallel with hierarchical risk parity, assuming that its covariance matrix C is an n-dimensional diagonalized matrix of full rank. Applying the spectral decomposition theorem to C yields:

$$\begin{aligned} C = U\Lambda U^T = \sum _{j=1}^{n} \lambda _j \textbf{u}_j \textbf{u}_j^T \end{aligned}$$
(3)

where \(\textbf{u}_j\) is the j-th eigenvector associated with the eigenvalue \(\lambda _j\) of matrix C. This relation leads to the eigenvector equation \(C \textbf{u}_i=\lambda _i \textbf{u}_i\), for all \(i = 1, \ldots , n\). The eigenvalues \(\lambda _i\) and eigenvectors \(\textbf{u}_i\) are obtained by solving the characteristic equation \(\det (C - \lambda I) = 0\). Solving for the eigenvalues yields (see Appendix A for the proof):

$$\begin{aligned} \lambda _i = \sigma _i, \quad \forall i = 1, \ldots , n \end{aligned}$$
(4)

where \(\sigma _i\) are the diagonal elements of the matrix C. In other words, the variances underlying hierarchical risk parity’s optimal inverse-variance allocation are exactly the eigenvalues of the quasi-diagonal covariance matrix C, assuming that the quasi-diagonalization step effectively results in a diagonal matrix.
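
For intuition, consider an illustrative two-asset case with \(C = \textrm{diag}(\sigma _1, \sigma _2)\) (our example, not from the paper):

$$\begin{aligned} \det (C - \lambda I) = (\sigma _1 - \lambda )(\sigma _2 - \lambda ) = 0 \quad \Rightarrow \quad \lambda _1 = \sigma _1, \; \lambda _2 = \sigma _2, \end{aligned}$$

so the inverse variances \(1/\sigma _i\) used by HRP are exactly the inverse eigenvalues \(1/\lambda _i\).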

Rearranging the eigenvector centrality definition of the MST in matrix form, we have:

$$\begin{aligned} A \zeta = \lambda _{\textrm{max}} \zeta \end{aligned}$$
(5)

where \(\zeta\) is the eigenvector of the adjacency matrix A associated with its largest eigenvalue \(\lambda _{\textrm{max}}\). In NRP, we take an approach analogous to HRP’s, using the eigenvector centrality to calculate the portfolio weights as:

$$\begin{aligned} w_i^* = \frac{1}{\zeta _i} \quad \forall i = 1, \ldots , n. \end{aligned}$$
(6)

To obtain fully invested portfolios such that \(\sum _i w_i = 1\), we apply the softmax normalization \(\sigma (w)_i = \frac{e^{w_i}}{\sum _j e^{w_j}}\). Laloux (1999) showed that the largest eigenvalue of a correlation matrix can be seen as representative of systematic risk; as the eigenvector centrality is associated with the largest eigenvalue of the adjacency matrix, it can be understood as a gauge of a security’s contribution to systematic market risk. In the NRP approach, the most central securities are assigned lower weights in the portfolio, thereby minimizing the portfolio’s exposure to systematic risk.
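
Putting the pieces together, the following self-contained sketch computes NRP weights from a correlation matrix – MST, eigenvector centrality, inverse weights (Eq. 6), and softmax normalization. It is a minimal illustration under our naming, not the authors’ implementation.

```python
import numpy as np
import networkx as nx

def nrp_weights(corr: np.ndarray) -> np.ndarray:
    """Network risk parity sketch: MST -> centrality -> 1/zeta -> softmax."""
    n = corr.shape[0]
    # Market graph with weights d_ij = 1 - rho_ij^2, reduced to its MST
    G = nx.Graph()
    for i in range(n):
        for j in range(i + 1, n):
            G.add_edge(i, j, weight=1.0 - corr[i, j] ** 2)
    T = nx.minimum_spanning_tree(G, algorithm="kruskal")
    # Eigenvector centrality on the MST (Eq. 5)
    zeta = nx.eigenvector_centrality_numpy(T, weight="weight")
    raw = np.array([1.0 / zeta[i] for i in range(n)])  # Eq. (6)
    # Softmax normalization: fully invested and long-only by construction;
    # subtracting the max is shift-invariant and only aids numerical stability
    e = np.exp(raw - raw.max())
    return e / e.sum()
```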

Although both HRP and NRP use the same distance metric, \(d_{i,j}= 1-{{\rho }_{i,j}}^2\), where \({\rho }_{i,j}\) is the Pearson correlation coefficient, they result in different portfolios because of the different ways in which the notion of hierarchy is imposed on the correlation matrix. Hierarchical risk parity uses hierarchical clustering with Euclidean distance and a single linkage criterion. Network risk parity, instead, uses Kruskal’s algorithm to build a minimum spanning tree and select the meaningful interconnections between securities. Three benefits are associated with the latter approach.

First, in NRP, relationships among securities are one-to-many rather than one-to-one as in HRP: the eigenvector centrality takes into account the importance of neighboring nodes as well, whereas hierarchical clustering relates securities on a one-to-one, pairwise basis.

Second, HRP depends on the specification of a distance function to transform linear correlations into a metric, and additionally on a distance and a linkage function for clustering purposes; NRP depends only on the former. In particular, HRP has been criticized for using the single linkage function, which separates objects depending on the distance between the two closest points within clusters. This causes a chaining effect that widens the tree and results in an unequal distribution of the portfolio weights (Papenbrock 2011).

Third, the use of graph theory allows us to analytically bound the level of concentration of the optimal portfolios, as shown below.

Portfolio weights lower bound

Leveraging the concept of the degree d(v) of a vertex v in a graph G, defined as the number of vertices in G adjacent to v, it can be established – using the Gershgorin circle theorem – that the maximum eigenvalue \(\lambda _{\textrm{max}}\) of the adjacency matrix A is bounded above by the maximum degree of the graph. As \(w=\zeta ^{-1}\) and \(\zeta\) corresponds to the largest eigenvalue \(\lambda _{\textrm{max}}\), it follows that \(w \ge \max _v d(v)^{-1}\); that is, every portfolio asset has a weight greater than the inverse of the maximum degree of the MST. In particular, as \(d(v)>0\) for all non-empty graphs, the optimal weights of NRP are bounded below by the softmax-normalized inverse of the maximum degree of the MST. As HRP is based on the inverse-variance allocation of a quasi-diagonalized correlation matrix, which we show to be proportional to \(\lambda ^{-1}\), and given that the Gershgorin theorem holds for any square matrix and that the correlation matrix can be represented as a complete digraph, the portfolio weights of HRP also have a lower bound greater than zero.
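
The bound can be checked numerically. The snippet below, on synthetic data (illustrative only, with names of our choosing), verifies that the largest eigenvalue of the weighted MST adjacency matrix does not exceed the maximum weighted degree, as the Gershgorin circle theorem implies for a zero-diagonal matrix.

```python
import numpy as np
import networkx as nx

# By Gershgorin, every eigenvalue of the zero-diagonal adjacency matrix A
# lies within max_i sum_j |a_ij|, i.e., the maximum weighted degree.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 20))        # synthetic returns: 500 obs, 20 assets
corr = np.corrcoef(X, rowvar=False)
G = nx.from_numpy_array(1.0 - corr ** 2)  # edge weights d_ij = 1 - rho_ij^2
T = nx.minimum_spanning_tree(G, algorithm="kruskal")
A = nx.to_numpy_array(T)                  # weighted MST adjacency matrix
lam_max = np.linalg.eigvalsh(A).max()
max_degree = A.sum(axis=1).max()          # weighted degree = row sum
assert lam_max <= max_degree + 1e-12
```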

Conversely, the spectral radius of the graph \(\rho (G)\) satisfies \(\rho (G) \ge \sqrt{\lambda _{\textrm{max}}-1}\), where \(\lambda _{\textrm{max}}\) is the largest eigenvalue of the adjacency matrix. This suggests that the spectral radius can be read as the degree of concentration that can occur: the weights are inversely proportional to the eigenvalues, which grow larger in highly volatile periods, resulting in more spread-out portfolio weights. This lower-bound mechanism contributes to creating more diversified, less concentrated portfolios than those derived from mean-variance optimization methods.

Of note, the largest eigenvalue of a covariance matrix, a measure that varies over time, serves as a proxy for systematic risk (Laloux 1999) and tends to increase during high-volatility regimes. Viewed from the perspective of the minimum spanning tree, the spectral radius has a higher upper bound during highly volatile periods, as the maximum eigenvalue grows. As volatility increases, the spectral radius thus also increases, leading to lower optimal portfolio weights and more diversified portfolios. This ability to adapt to changing market conditions is a key strength of NRP and HRP. Since MVO propagates the errors of an ill-estimated covariance matrix through its inversion, its performance worsens during highly volatile periods, owing to the additional concentration. Whereas MVO fails to perform optimally during high-volatility periods (precisely when it is most needed), NRP and HRP employ a self-defensive mechanism resulting in better-diversified portfolios, thanks to the dynamic adjustment of the spectral radius of the graph.

Empirical results

In this section, we illustrate empirical results from the application of network risk parity. We consider long-only, fully invested portfolios rebalanced at a monthly frequency. We use the hourly prices of all the stocks composing the S&P 500 index from January 2010 to March 2023, aggregated to a monthly frequency when calculating conditional covariances. To avoid look-ahead bias, we implement a rolling window approach: for each monthly rebalancing, we use the past two years’ data. To obtain robust results, we use a bootstrapping approach that randomly selects \(n = (20,50,100,200)\) stocks – without replacement – from the entire sample and iterates through 10,000 simulations. We compare the performance of the competing methods in terms of the Sharpe ratio.
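
A stripped-down version of this bootstrap is sketched below; for brevity it omits the rolling two-year estimation windows and monthly rebalancing, applying a single weighting function (e.g., the nrp_weights sketch above) to each draw. All names are illustrative, not the authors’ code.

```python
import numpy as np

def bootstrap_sharpe(returns: np.ndarray, weight_fn, n_assets: int = 100,
                     n_sims: int = 10_000, seed: int = 0) -> np.ndarray:
    """Bootstrapped Sharpe ratios; `returns` is (periods x all_assets)."""
    rng = np.random.default_rng(seed)
    sharpes = np.empty(n_sims)
    for k in range(n_sims):
        # Draw n_assets columns without replacement, as in the paper
        idx = rng.choice(returns.shape[1], size=n_assets, replace=False)
        sub = returns[:, idx]
        w = weight_fn(np.corrcoef(sub, rowvar=False))  # e.g. nrp_weights
        port = sub @ w                                 # portfolio returns
        sharpes[k] = port.mean() / port.std(ddof=1)
    return sharpes
```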

Table 1 reports the average Sharpe ratios of the bootstrapped portfolios for the different numbers of constituents. Network risk parity and hierarchical risk parity outperform the other methodologies at all sample sizes. In particular, hierarchical risk parity achieves the highest Sharpe ratio for smaller portfolios, \(n = (20,50)\), while network risk parity performs best as the number of portfolio constituents increases. This behavior arises because the larger the number of constituents, the more the minimum spanning tree resembles a financial market in its ability to place an asset at the center of the tree rather than in the periphery. The MST preserves the most significant relationships between assets, filtering out less meaningful correlations; this uncovers the market’s inherent structure by identifying clusters of assets that behave similarly, offering a more comprehensive picture of systematic risk. Moreover, financial markets are complex networks characterized by multidimensional interdependencies between assets, which extend beyond pairwise relationships – something the MST, with its graph representation, can capture more accurately. Therefore, the larger the number of portfolio constituents, the more effectively the real asset connections are represented, as the MST can better link assets based on similarities.

Network risk parity outperforms the remaining competing methods at all portfolio sizes, while MVO is the worst performer across all sample sizes. In fact, MVO often leans toward portfolios that capitalize on idiosyncratic risk, because the inversion of an ill-estimated covariance matrix propagates estimation errors. This often results in concentrated portfolios with large weights assigned to the few assets exhibiting the lowest volatility. NRP and HRP, on the other hand, strive for more balanced portfolios, as they are specifically designed to distribute risk across all assets in the portfolio. This more equal risk distribution inherently aligns the portfolio with broader, systematic market movements rather than the idiosyncratic movements of individual assets. By reducing exposure to idiosyncratic risk, these strategies are less susceptible to the unexpected performance of a small subset of assets and therefore tend to be more stable. For NRP and HRP, this is further supported by the spectral radius acting as a self-defensive mechanism: during periods of high volatility, the spectral radius increases, resulting in more evenly spread portfolio weights and better diversification. MVO, by contrast, produces even more concentrated portfolios during high-volatility periods, as the covariance matrix is then even more poorly estimated.

Table 1 Average monthly Sharpe ratios of bootstrapped portfolios for different numbers of constituents (best performing in boldface)

Figure 1 illustrates the histograms of the bootstrapped Sharpe ratios for each asset allocation strategy against the performance of the equally weighted portfolio for the case \(n = 100\).

Fig. 1

Histograms of bootstrapped Sharpe ratios against the equally weighted portfolio. Note: in red, we report the distribution of the Sharpe ratios achieved by the equally weighted portfolio; in blue, the distributions of the Sharpe ratios achieved by each asset allocation strategy across the bootstrapped samples. As shown, network risk parity and hierarchical risk parity are more skewed toward higher Sharpe ratio values, while the opposite holds for risk parity and Markowitz’s minimum variance

To provide context on the most severe downturns across the bootstrapped samples, we computed the maximum drawdown of the worst bootstrap simulation of each strategy with \(n = 20\). All portfolios reached their maximum drawdowns at the height of the COVID-19 crisis in March 2020. In particular, NRP and HRP experienced maximum drawdowns of 42% and 39%, respectively; RP and EW suffered maximum drawdowns of 48% and 46%, respectively; and MVO had the largest drawdown of 61%. For comparison, the S&P 500 experienced a maximum drawdown of 34% over the same period.

Finally, Figure 2 plots the boxplots of weights across the bootstrap runs with \(n = 20\). We highlight in red the parts corresponding to a portfolio weight of 0, as this is the first red flag for identifying ill-concentrated portfolios. As shown, only the mean-variance allocation produces portfolio weights equal to 0, and it repeatedly reaches portfolio weights well above 60%, hinting at poor diversification and heavy concentration. The weights of NRP and HRP, on the other hand, appear better diversified and less concentrated, and exhibit a similar pattern, in light of the common lower-bound mechanism. Risk parity is better diversified than mean-variance, but it reaches peaks of up to 60%.

Fig. 2

Boxplots of bootstrapped weights showing the diversification benefits associated with NRP and HRP. Note: boxplots of the portfolio weights across the bootstrap runs with \(n=20\) (chosen to preserve the readability of the plot). In red, we highlight the points where a boxplot touches a value of zero, a result that is not achievable under the lower-bound rule applicable to graph-based allocations. The trivial equally weighted case is not included

Appendix 7 provides a commentary on the average bootstrapped performances of the competing methods against the market environments that characterized the periods analyzed.

Conclusions

In this manuscript, we have presented network risk parity (NRP), a novel portfolio construction methodology based on graph theory. NRP serves as the counterpart to hierarchical risk parity (HRP), providing a complementary approach to portfolio allocation. By utilizing the concept of eigenvector centrality and the minimum spanning tree (MST), NRP offers several advantages over traditional methods.

One key advantage of NRP is its ability to capture one-to-many relationships between securities. Unlike the one-to-one relationships captured by hierarchical clustering in HRP, NRP also considers the importance of neighboring nodes. This broader perspective allows for a more comprehensive understanding of the interconnectedness and dependencies within a financial market.

Another advantage is the simplicity of the graph-based approach compared to the distance and linkage functions used in hierarchical clustering. NRP relies solely on the adjacency matrix and the MST, reducing the complexity of the methodology while still achieving effective risk allocation.

Furthermore, the graph theory framework of NRP allows for an analytically provable lower bound on the optimal weights of the portfolio. This lower bound ensures that the portfolio remains well-diversified, mitigating the risk of concentration on a few securities and avoiding the pitfalls associated with Markowitz’s minimum-variance approach.

Empirical results based on bootstrapped samples of S&P 500 stocks demonstrate the superiority of NRP over risk parity, Markowitz’s minimum-variance, and equally weighted portfolios in terms of the Sharpe ratio. Notably, the performance of NRP and HRP varies with the number of stocks in the portfolio, with HRP showing strength in smaller portfolios and NRP excelling as the number of constituents increases.

Future research should test different subgraph representations, with a major focus on how the covariance matrix is estimated or forecast, as well as on the inclusion of asset classes other than equities. Furthermore, future research could explore covariance mis-estimation and the potential improvements brought by distance and adjacency matrices compared to sample covariances. Finally, as the current formulation of network risk parity assumes a long-only portfolio structure, the extension to long-short portfolios is an interesting avenue for future research.