1 Introduction

Modern finance was pioneered by Markowitz, who set out a framework for studying choice in portfolio allocation under uncertainty (Markowitz 1952), for which he was awarded the Nobel Prize in economics in 1990. Within this framework, Markowitz characterized portfolios by their return and their risk; the latter is formally defined as the variance of the portfolios’ returns (Footnote 1). An investor builds a portfolio that maximizes its expected return for a chosen level of risk; it has since become common for asset managers to optimize their portfolios within this framework. This approach has led a large part of empirical finance research to focus on the so-called efficient frontier, which is defined as the set of portfolios presenting the lowest risk for a given expected return. The efficient frontier is associated with a well-known family of convex functions, studied by Markowitz (1956). Moreover, the distributional properties of the optimal portfolio weights have been used for efficient portfolio selection (Bodnar et al. 2017, 2016; Bodnar and Schmid 2009; Kan and Smith 2008; Jobson and Korkie 1980).

It is known from the relevant literature that financial markets exhibit three types of behavior. In normal times, stocks are characterized by slightly positive returns and moderate volatility; in up-market times (typically bubbles), by high returns and low volatility; and during financial crises, by strongly negative returns and high volatility; see Billio et al. (2012) for details. So, following Markowitz’ framework, in normal and up-market times the stocks and portfolios with the lowest volatility should present the lowest returns, whereas during crises those with the lowest volatility should present the highest returns. The detection of normal and crisis periods is also crucial for computing an efficient asset allocation (Ivanyuk 2021; Harzallah and Abbes 2020; Pinho and Melo 2017).

However, these tools, when used to build a portfolio, do not always guarantee good performance in practice (Maillard et al. 2010). Thus, the analysis of investment performance is of special interest in modern finance, especially given the growth of the asset management industry in recent decades. Research in this area centers on Sharpe-like ratios proposed in the 1960s (Jensen 1967; Sharpe 1966; Treynor 2015). In practice, the performance of a portfolio manager over a given period is usually measured as the ratio of her/his “excess” return with respect to a benchmark portfolio over a risk measure (Grinblatt and Titman 1994). Managers are then ranked according to these ratios, and the one achieving the highest and steadiest returns receives the best score. The major drawback of these techniques is that they require the identification of benchmark portfolios, while the formation of such portfolios remains controversial. Moreover, they suffer from non-negligible estimation errors (Lo 2002), which prevent any performance comparison from being significant. Pouchkarev (2005), and independently Guegan et al. (2011) and Banerjee and Hung (2011), use a geometric representation of a stock market to define a cross-sectional score of a portfolio given a vector of assets’ returns. The score of a portfolio is defined as the proportion of all possible asset allocations that the portfolio outperforms in terms of return. The aim is to measure the relative performance with respect to all possible alternative allocations offered to the manager. The term cross-section is used to underline that the score takes into account portfolios that are diversified over all sections of assets, without studying separately the performance on specific sections of stocks. Banerjee and Hung (2011) follow the same approach by defining what they call the naive investor’s strategy. A naive investor’s strategy selects a portfolio uniformly from the set of portfolios, as it is agnostic of the assets’ return-generating process and hence does not use any such information.

1.1 Contributions

First, we briefly survey the computational framework in Calès et al. (2018) which uses the geometric representation of long-only portfolios in Pouchkarev (2005) and the copula representation for the dependency between portfolios’ return and volatility. A copula is a multivariate joint distribution where the marginal distributions are uniform; for more details on copulae, we refer to Nelsen (2006). We enhance this framework significantly by employing clustering methods on copulae, and we use it to detect all the past crash events in the cryptocurrency market and all the past crises from 1990 to 2008 using real data from DJ600.

We extend the geometric framework in Pouchkarev (2005) to model asset allocations beyond long-only portfolios, e.g. the “150/50” or the “130/30” strategies, which have recently gained popularity (Lo and Patel 2007). In particular, we work with the set of fully-invested portfolios, i.e., portfolios whose weights sum up to 1, which is the default choice for the bulk of the asset management industry. However, we let the weights be negative and we use the norm-constraint in Zhao et al. (2020) to set a lower bound on the weights’ values. Then, we introduce a transformation to represent the set of all possible fully-invested portfolios by a convex polytope; i.e., each point in the interior of the polytope corresponds to a single asset allocation.

We use this geometric representation to introduce a new mathematical model of portfolio allocation strategies in a stock market. We consider the setting where portfolio managers compute and propose portfolio allocations, which we call formal allocation proposals. First, an investor decides which asset allocation proposal to select. Second, she decides how much to modify this proposal to build her final portfolio. Thus, we expect the portfolios of the investors who have chosen the proposal of a manager to be “concentrated around” that proposal. To model this procedure we employ multivariate log-concave distributions. The support of the Probability Density Function (i.e. the subset of \({\mathbb {R}}^n\) which is not mapped to zero) of each distribution is the set of all possible portfolios, i.e. a convex polytope. In particular, we say that a portfolio allocation strategy \(F_{\pi }\) is induced from a log-concave distribution \(\pi\) as follows: to build a portfolio with strategy \(F_{\pi }\) sample a point/portfolio from \(\pi\). Then, we call the mode of \(\pi\) a formal allocation proposal of the allocation strategy \(F_{\pi }\).

We use Markowitz’s framework to parameterize the allocation strategies by the level of risk that a certain group of investors selects. Similarly, for a given level of risk, we use the variance to parameterize to what extent a subgroup of investors sticks to the formal allocation proposal. Finally, in any stock market many strategies may coexist, each chosen by a group of investors. Thus, we define the mixed strategy induced by a convex combination of log-concave distributions, i.e. a mixture distribution.

We use this model of portfolio allocation strategies to define a new portfolio score to evaluate the performance of an investment. Our new score considers the set of truly invested portfolios in a stock market in a given time period. We evaluate the performance of a portfolio, for a given time period, by comparing the portfolio against a mixed strategy \(F_{\pi }\). Thus, we define the score of a portfolio as the expected proportion of truly invested portfolios that it outperforms, in terms of return, when the portfolios have been invested according to the mixed strategy \(F_{\pi }\). To estimate the new cross-sectional score within an arbitrarily small error, we provide an efficient algorithm based on Markov Chain Monte Carlo integration. In extreme cases, our new score becomes equal to that of Pouchkarev (2005), Guegan et al. (2011) and Banerjee and Hung (2011). Thus, it can also be seen as a generalization of the latter cross-sectional score. Moreover, as one may have limited knowledge about how the investors behave in a stock market, or her/his knowledge may vary from one time period to another, we extend our framework to handle these issues. We also provide different versions of our score, each of which provides a different piece of information about the portfolio allocation we would like to evaluate.

We also provide an open-source implementation (Footnote 2) to simulate (mixed) allocation strategies and to estimate our new score given a portfolio. Our implementation scales up to a few hundred assets and allocation strategies. We provide a pseudo-real time example in the cryptocurrency market, using the 12 cryptocurrencies with the longest history. We provide extensive numerical results to show that the informativeness of our new score can be higher than that of existing and well-known performance measures (e.g. Sharpe and Sortino ratios, and Jensen’s alpha). Moreover, we use our computations of the distribution of a portfolio’s score, assuming a distribution on the assets’ returns, to discuss how it could lead to useful insights about its performance. We also compute copulae of portfolios’ return and volatility under the assumption that the portfolios have been built according to a mixed strategy. We show that a copula of a certain time period can be very different from that in Calès et al. (2018). We believe that the last two simulations pave the way for future work on the problems of crisis detection and portfolio allocation.

Finally, since the simulation of allocation strategies and the computation of the score and copulae rely only on sampling from high dimensional log-concave distributions supported on the set of portfolios, our framework works also for a singular covariance matrix. That is, we can incorporate in our framework the results in Gulliksson and Mazur (2020), Bodnar et al. (2018), Mazur et al. (2017), Bodnar et al. (2016) and Pappas and Kaimakamis (2010). However, to keep the presentation simple, in Sect. 4, we assume that the covariance matrix of the assets’ returns is positive definite. More details about our computational methods and their efficiency are found in “Appendix A”.

Paper structure. The next section presents the geometric representation of portfolios that we use. Section 3 surveys our work on copulae and the ensuing crisis indicator; our approach is corroborated by two applications on real data. Some elements in this section are presented in Calès et al. (2018), but here we present a broader class of methods (i.e. clustering copulae) and a new result on the cryptocurrency market. Section 4 introduces a new framework for modeling allocation strategies and evaluating portfolio performance by defining a new score of a portfolio. Section 6 presents our pseudo-real time example on real data to illustrate our new framework and the usefulness of our new score. Finally, in Sect. 7, we briefly discuss conclusions and future work.

2 Geometric representation of the set of portfolios

In this section, we formalize the geometric representation of sets of portfolios with an arbitrarily large number of assets n. First, we handle the case of long-only strategies and then we extend this representation to fully-invested portfolios. In both cases, the set of portfolios is a convex polytope in \({\mathbb {R}}^n\).

2.1 Long-only portfolios

In this case, no short sales are allowed. Consider a portfolio x investing in n assets, with weights \(x=(x_1, \dots , x_n)\in {\mathbb {R}}^{n}\). The portfolios in which a long-only asset manager can invest are subject to \(\sum \nolimits _{i=1}^{n} x_i = 1\) and \(x_i\ge 0, \forall i\). Thus, the set of portfolios available to this asset manager is the unit \((n-1)-\)dimensional canonical simplex, denoted by \(\varDelta ^{n-1}\) and defined as

$$\begin{aligned} \varDelta ^{n-1} := \left\{ x \in {\mathbb {R}}^{n}\ \left| \ \sum _{i=1}^{n} x_i=1, \text{ and } x_i \ge 0, i \in [n] \right. \right\} \subset {\mathbb {R}}^n . \end{aligned}$$
(1)

A simplex is the convex polytope with the fewest vertices among those with nonzero volume in a given dimension. For instance, in the plane any triangle is a simplex, while a triangular pyramid, or tetrahedron, is a simplex in 3D space. The vertices of \(\varDelta ^{n-1}\) represent portfolios composed entirely of a single asset.

2.2 Fully invested portfolios

When short sales are allowed, we write the set of all possible portfolios as,

$$\begin{aligned} P := \left\{ x \in {\mathbb {R}}^{n}\ \left| \ \sum _{i=1}^{n} x_i=1, \text{ and } \Vert x\Vert _1 \le \gamma , i \in [n],\ \gamma \ge 1 \right. \right\} \subset {\mathbb {R}}^n , \end{aligned}$$
(2)

where the \(L_1\)-norm is \(\Vert x \Vert _1 = \sum _{i=1}^n |x_i|\). When \(\gamma = 1\) no short sales are allowed and \(P = \varDelta ^{n-1}\). When \(\gamma = 1.6\), P corresponds to fully invested portfolios of the 130/30 type, and when \(\gamma = 2\), to those of the 150/50 type. To show that P is a convex polytope for any \(\gamma \ge 1\), we replace the norm-constraint \(\Vert x \Vert _1 \le \gamma\) with a set of linear inequalities. Since \(|x_i| = \max \{-x_i, x_i\}\), for each \(x_i\), we add an auxiliary variable \(y_i\) such that,

$$\begin{aligned} y_i \ge -x_i,\ y_i \ge x_i,\ y_i \ge 0 . \end{aligned}$$
(3)

Then, the set of all possible portfolios is given by,

$$\begin{aligned} \small {\tilde{P}} := \left\{ (x,y)\in {\mathbb {R}}^{2n}\ \bigg |\ \sum _{i=1}^nx_i=1,\ -y_i \le x_i \le y_i,\ \sum _{i=1}^n y_i \le \gamma ,\ i \in [n],\ \gamma \ge 1 \right\} \subset {\mathbb {R}}^{2n} , \end{aligned}$$
(4)

which is a convex polytope as the feasible space is defined only by a set of linear inequalities (half-spaces).
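The constraints of Eqs. (2) to (4) can be checked numerically. The following is a minimal sketch, not taken from the accompanying implementation: the function names and the three example portfolios are hypothetical, and the auxiliary vector is chosen as \(y_i = |x_i|\), which is the tightest choice allowed by Eq. (3).

```python
def l1_norm(x):
    """L1 norm of a weight vector."""
    return sum(abs(xi) for xi in x)


def in_portfolio_set(x, gamma=1.0, tol=1e-9):
    """Membership test for P in Eq. (2): weights sum to 1 and ||x||_1 <= gamma."""
    return abs(sum(x) - 1.0) <= tol and l1_norm(x) <= gamma + tol


def lift(x):
    """Tightest auxiliary vector satisfying Eq. (3): y_i = |x_i|."""
    return [abs(xi) for xi in x]


def in_lifted_polytope(x, y, gamma=1.0, tol=1e-9):
    """Membership test for the lifted set of Eq. (4), using linear constraints only."""
    if abs(sum(x) - 1.0) > tol:
        return False
    if any(xi > yi + tol or -yi > xi + tol for xi, yi in zip(x, y)):
        return False
    return sum(y) <= gamma + tol


# Hypothetical example portfolios on three assets.
long_only = [0.5, 0.3, 0.2]    # no short positions, ||x||_1 = 1
p130_30 = [0.8, 0.5, -0.3]     # 130% long, 30% short, ||x||_1 = 1.6
p150_50 = [1.0, 0.5, -0.5]     # 150% long, 50% short, ||x||_1 = 2
```

With these definitions, `p130_30` is rejected for \(\gamma = 1\) and accepted for \(\gamma = 1.6\), while `p150_50` requires \(\gamma = 2\).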

3 Crises detection

In this section, we present our computational methods to address the problem of crises detection in stock markets. We focus on long-only portfolios, which means that the set of portfolios in the following computations is the canonical simplex \(\varDelta ^{n-1}\). It is difficult to capture the dependency between portfolios’ return and volatility from the usual mean-variance representation. So we rely on the copula representation. A copula is a joint probability distribution for which all the marginal probability distributions are uniform. Figure 1 illustrates such a copula and shows a positive dependency between portfolios’ return and volatility. Given a vector of assets’ returns \(R\in {\mathbb {R}}^n\) and the covariance matrix \(\Sigma \in {\mathbb {R}}^{n\times n}\) of the assets’ returns distribution, we say that any portfolio \(x \in \varDelta ^{n-1}\) has return \(f_{ret}(x, R) = R^T x\) and variance (or volatility) \(f_{vol}(x, \Sigma ) = x^T\Sigma x\).

Fig. 1 Copula representation of the portfolios distribution, by return and variance. The market considered is made of the 19 sectoral indices of DJSTOXX 600 Europe. The data is from Oct. 16, 2017 to Jan. 10, 2018. Each line and column sum to 1% of the portfolios

To estimate the copula between portfolios’ return and volatility, we consider the following discretization on the values of each quantity. We fix two sequences \(s_0<\dots <s_m\) and \(u_0<\dots <u_m\) such that

$$\begin{aligned} \frac{{\mathrm{vol}}(S_i)}{{\mathrm{vol}}(\varDelta ^{n-1})} \approx p\quad \text { and }\quad \frac{{\mathrm{vol}}(U_i)}{{\mathrm{vol}}(\varDelta ^{n-1})} \approx p,\quad i=0,\dots ,m-1, \end{aligned}$$
(5)

where \(S_i:=\{ x\in {\mathbb {R}}^n\ |\ s_i\le f_{ret}(x, R)\le s_{i+1}\}\), \(U_i:=\{ x\in {\mathbb {R}}^n\ |\ u_i\le f_{vol}(x, \Sigma )\le u_{i+1}\}\), and \(p<1\) is a small constant (e.g. \(p=0.01\)); the volumes in Eq. (5) are understood as those of the intersections of \(S_i\) and \(U_i\) with \(\varDelta ^{n-1}\). Equation (5) implies that a constant proportion p of the portfolios have return between \(s_i\) and \(s_{i+1}\). The same holds for the sets \(U_i\), which contain portfolios with volatility between \(u_i\) and \(u_{i+1}\).

Furthermore, the sets \(S_i,\ U_i\) define a grid of convex bodies, obtained by a family of parallel hyperplanes and a family of concentric ellipsoids—centered at the origin—intersecting \(\varDelta ^{n-1}\). Precisely, for given integers \(i,j\le m-1\) the body

$$\begin{aligned} Q_{ij} :=\{ x\in \varDelta ^{n-1}\ |\ s_i\le f_{\mathrm{ret}}(x, R)\le s_{i+1} \text { and } u_j\le f_{\mathrm{vol}}(x, \Sigma )\le u_{j+1}\} , \end{aligned}$$
(6)

contains the portfolios with return less than \(s_{i+1}\) and higher than \(s_i\) and volatility less than \(u_{j+1}\) and higher than \(u_j\). Now, to obtain the aforementioned copula one has to estimate the ratios \(\frac{{\mathrm{vol}}(Q_{ij})}{{\mathrm{vol}}(\varDelta ^{n-1})}\) for \(i,j=0,\dots ,m-1\).

We use Monte Carlo to estimate each volume ratio. We leverage direct, efficient uniform sampling from \(\varDelta ^{n-1}\) following Rubinstein and Melamed (1998) and then count the number of points per body in the grid. In Sect. 3.2, this leads to an indicator to decide the state of the stock market that the estimated copula corresponds to.

Considering the computational efficiency of this method, it can be applied to stock markets with a few thousand assets, since the cost per uniformly distributed sample in \(\varDelta ^{n-1}\) using the exact sampler in Rubinstein and Melamed (1998) is O(n). For run-times see “Appendix A”.
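As an illustration of the Monte Carlo step, the sketch below draws uniform points from \(\varDelta ^{n-1}\) by normalizing i.i.d. exponential variates, a standard exact sampler with O(n) cost per point; we do not claim it is line-for-line the routine of the cited reference. The function names, the seed, and the sample size are assumptions made for this example.

```python
import random


def sample_simplex(n, rng):
    """One uniform sample from the canonical simplex Delta^{n-1} in O(n) time.

    Normalizing n i.i.d. Exp(1) variates yields a Dirichlet(1, ..., 1) point,
    i.e. the uniform distribution on the simplex.
    """
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [v / total for v in e]


def volume_ratio(n, R, s_lo, s_hi, n_samples=20000, seed=0):
    """Monte Carlo estimate of the fraction of the simplex whose portfolios
    have return R^T x between s_lo and s_hi."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = sample_simplex(n, rng)
        ret = sum(r * w for r, w in zip(R, x))
        if s_lo <= ret <= s_hi:
            hits += 1
    return hits / n_samples
```

For two assets with \(R=(0,1)\), the return equals the second weight, which is uniform on [0, 1], so `volume_ratio(2, [0.0, 1.0], 0.0, 0.5)` should be close to 0.5.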

3.1 Computing copulae

In our computations, to define the family of parallel hyperplanes, we consider compound returns over periods of k observations. Let \(r_i=(r_{i,1},\dots ,r_{i,n}) \in {\mathbb {R}}^{n}\) denote the vector of asset returns at observation i. Then component j of the compound return over the k observations \(i,\dots ,i+k-1\) equals,

$$\begin{aligned} R_j = (1+r_{i,j})(1+r_{i+1,j})\ldots (1+r_{i+k-1,j}) -1, \quad j=1,\dots , n. \end{aligned}$$
(7)

This defines the vector \(R\in {\mathbb {R}}^n\), which is normal to a family of parallel hyperplanes whose equations are fully defined by selecting appropriate constants.
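Equation (7) translates directly into code. The following minimal sketch assumes the window of k observations is given as a list of rows, one row of n per-period asset returns per observation; the function name is hypothetical.

```python
def compound_returns(window):
    """Compound return vector R of Eq. (7).

    window: list of k observations, each a list of n per-period asset returns.
    Returns a list of n compound returns over the whole window.
    """
    n = len(window[0])
    growth = [1.0] * n
    for row in window:
        for j in range(n):
            growth[j] *= 1.0 + row[j]
    return [g - 1.0 for g in growth]
```

For instance, two consecutive returns of +10% compound to 21%, while a 50% loss followed by a 100% gain compounds to 0%.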

The covariance matrix \(\varSigma\) of the assets’ returns is computed using the shrinkage estimator of Ledoit and Wolf (2004) (Footnote 3), as it provides a robust estimate even when the sample size is small relative to the number of assets.

To compute the copulae, we determine constants defining hyperplanes and ellipsoids so that the volume between two consecutive such objects is \(p = 1\%\) of the simplex volume. We follow the method outlined at Eq. (5), using the notation introduced there. The sequence \(s_0<\dots <s_m\) is determined by bisection using Varsi’s algorithm. For ellipsoids, we sample from the simplex and look for \(u_0<\dots <u_m\) such that there is an equal number of uniformly distributed points in each intersection.

We set \(m=100\) to estimate each copula. We thus obtain copulae discretized on a \(100\times 100\) grid, representing the distribution of the portfolios with respect to the portfolio returns and volatilities. Figure 2 illustrates such copulae, and shows the different relationship between returns and volatility in good (left) and bad (right, Covid-19 shock event) times.
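The grid construction can be sketched as follows. Note the simplification, which is ours: both sequences \(s_i\) and \(u_j\) are replaced by empirical quantiles of the sampled portfolios (equal-count bins by rank), whereas the text above determines the \(s_i\) by bisection with Varsi’s algorithm. The function names, the seed, and the default sample size are assumptions.

```python
import random


def sample_simplex(n, rng):
    """Uniform sample from the canonical simplex via normalized exponentials."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [v / total for v in e]


def rank_bins(values, m):
    """Assign each value to one of m equal-count bins according to its rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    bins = [0] * len(values)
    for rank, idx in enumerate(order):
        bins[idx] = rank * m // len(values)
    return bins


def empirical_copula(R, Sigma, m, n_samples=5000, seed=0):
    """m x m matrix C with C[i][j] = estimated share of portfolios whose return
    falls in return bin i and whose variance falls in variance bin j."""
    rng = random.Random(seed)
    n = len(R)
    rets, variances = [], []
    for _ in range(n_samples):
        x = sample_simplex(n, rng)
        rets.append(sum(r * w for r, w in zip(R, x)))
        variances.append(
            sum(x[i] * Sigma[i][j] * x[j] for i in range(n) for j in range(n))
        )
    ret_bins = rank_bins(rets, m)
    var_bins = rank_bins(variances, m)
    C = [[0.0] * m for _ in range(m)]
    for i, j in zip(ret_bins, var_bins):
        C[i][j] += 1.0 / n_samples
    return C
```

When m divides the number of samples, every row and every column of the returned matrix sums to 1/m, which mirrors the uniform marginals of a copula.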

Fig. 2 Copulae that correspond to cryptocurrencies’ states. Left, a normal period (16/12/2017) and right, a shock event due to Covid-19 (15/03/2020). The middle plot shows the mass of interest to characterize the market state

We analyze real data consisting of regular interval (e.g. daily) returns from two different asset sections: stocks from the Dow Jones Stoxx 600 Europe™ (DJ600) and cryptocurrencies. We apply the methodology to a subset of assets drawn from the DJ 600 constituents using daily data covering the period from 01/01/1990 to 31/11/2017 (Footnote 4). Since not all stocks are tracked for the full period of time, we select the 100 assets with the longest history in the index, and juxtapose stock returns and stock returns covariance matrix over the same period to detect crises. For the cryptocurrency assets, we use the daily returns of 12 out of the top 100 cryptocurrencies, ranked by CoinMarketCap’s market cap (cmc_rank) (Footnote 5) on 22/11/2020, having the longest available history (Table 6). We compute the daily return for each coin using the daily close price obtained from CoinMarketCap, for several notable coins such as Bitcoin, Litecoin and Ethereum.

Fig. 3 Representation of the periods over which the indicator is greater than one for 61–100 days (yellow) and over 100 days (red) (color figure online)

3.2 Indicator and crisis detection

When we work with real data in order to build the indicator, we wish to compare the densities of portfolios along the two diagonals. In normal and up-market times, the portfolios with the lowest volatility present the lowest returns and the mass of portfolios should be on the up-diagonal. During crises, the portfolios with the lowest volatility present the highest returns and the mass of portfolios should be on the down-diagonal, see Fig. 2 as illustration. Thus, setting up- and down-diagonal bands, we define the indicator as the ratio of the down-diagonal band over the up-diagonal band, discarding the intersection of the two. The construction of the indicator is illustrated in Fig. 2 (middle) where the indicator is the ratio of the mass of portfolios in the blue area over the mass of portfolios in the red one.

The indicator is estimated on copulae by drawing 500,000 uniformly distributed points. We compute the indicator per copula over a rolling window of \(k=60\) days and with a band of \(\pm 10\%\) with respect to the diagonal. We determined both values experimentally. The window length corresponds to roughly 3 months when observations are daily. When the indicator exceeds 1 for more than 60 days but less than 100 days, we report the time interval as a “warning” (yellow color), while when it exceeds 1 for more than 100 days, we report the interval as a “crisis” (red); see Figs. 3 and 4. The periods are at least 60 days long to avoid detection of isolated events whose persistence is only due to the auto-correlation implied by the rolling window.
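A possible reading of the indicator on a discretized copula is sketched below. The conventions are assumptions made for this example: `C[i][j]` is the mass of portfolios in return bin i and volatility bin j (both in increasing order), the up-diagonal band is the set of cells with \(|i-j|\) at most 10% of the grid size, and the down-diagonal band is the set of cells with \(|i+j-(m-1)|\) within the same width. The function name is hypothetical.

```python
def indicator(C, band=0.10):
    """Ratio of the mass in the down-diagonal band over the mass in the
    up-diagonal band, discarding cells that belong to both bands.

    C: m x m copula matrix, C[i][j] = mass in return bin i, volatility bin j.
    band: half-width of each band as a fraction of the grid size m.
    """
    m = len(C)
    width = band * m
    up = 0.0
    down = 0.0
    for i in range(m):
        for j in range(m):
            on_up = abs(i - j) <= width              # low vol with low return
            on_down = abs(i + j - (m - 1)) <= width  # low vol with high return
            if on_up and not on_down:
                up += C[i][j]
            elif on_down and not on_up:
                down += C[i][j]
    return down / up if up > 0 else float("inf")
```

Under these conventions a copula with all its mass on the up-diagonal yields 0, a uniform copula yields 1, and values above 1 signal that low-volatility portfolios tend to have the highest returns.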

We compare DJ 600 results with the database of financial crises in European countries in Lo Duca et al. (2017). The first crisis (May 1990 to Dec. 1990) corresponds to the early 90's recession, the second one (May 2000 to May 2001) to the dot-com bubble burst, the third one (Oct. 2001 to Apr. 2002) to the stock market downturn of 2002, the fourth one (Nov. 2005 to Apr. 2006) is not listed in the European database and is either a false positive of our method or may be due to a bias in the companies selected in the sample, and the fifth one (Dec. 2007 to Aug. 2008) can be associated with the sub-prime crisis.

Our cryptocurrency indicator successfully detects the 2018 (great) cryptocurrency crash; see Fig. 4. The first shock event detected in 2018 (mid-January to late March) corresponds to the crash of nearly all cryptocurrencies, following Bitcoin’s, whose price fell by about 65% from 6 January to 6 February 2018, after an unprecedented boom in 2017. Intermediate warnings (mid-May to early August) should correspond to cryptocurrency collapses (80% from their peak in January) until September. The detected crash at the end of 2018 (November 2018 until early January 2019) corresponds to the fall of Bitcoin’s market capitalization (below $100 billion) and price by over 80% from its peak, almost one-third of its previous week value. Finally, the detected event in early 2020 corresponds to the shock event due to COVID-19.

Fig. 4 Warning (yellow) and Crises (red) periods detected by the indicator for cryptocurrencies (2014-2020) (color figure online)

3.2.1 Clustering of copulae agrees with indicator

To cluster the copulae, we compute a distance matrix (D) between all pairs of copulae using the earth mover’s distance (EMD) (Rubner et al. 2000). The EMD between two distributions is the minimum amount of work required to turn one distribution into the other. We use a fast and robust EMD algorithm, which appears to improve both accuracy and speed (Pele and Werman 2009). Then, we apply spectral clustering (Ng et al. 2001), a method that clusters points using the eigenvectors of an affinity matrix (A). We derive A from the distance matrix via the radial basis function kernel, replacing the Euclidean distance with the EMD: \(A_{ij}=\mathrm{exp}(-D_{ij}^2/2\sigma ^2)\), where for \(\sigma\) we chose the standard deviation of the distances. Using the k largest eigenvectors of the Laplacian matrix, we construct a new matrix and apply k-medoids clustering by treating each row as a point, so as to obtain k clusters. The results with \(k=6\) and \(k=8\) are shown on the indicators’ values in Figs. 13, 14, and 15. Clusters appear to contain copulae with similar indicator values. Crisis and normal periods are assigned to clusters with high and low indicator values, respectively. Therefore, the clustering of the copulae is comparable to discretising the values of the indicator. We do not use any data-driven techniques to select an optimal cluster size, since we apply clustering only to demonstrate that the resulting clusters validate the indicator and distinguish different market states according to the indicator. Additional results on clustering copulae can be found in “Appendix C” (Fig. 5).
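The kernel step can be sketched in a few lines, given a precomputed symmetric distance matrix (the EMD computation and the spectral decomposition are omitted here). One assumption is ours: \(\sigma\) is taken as the population standard deviation of the distinct pairwise distances \(D_{ij}\), \(i<j\), since the text does not specify which entries are used. The function name is hypothetical.

```python
import math


def affinity_from_distances(D):
    """Affinity matrix A_ij = exp(-D_ij^2 / (2 sigma^2)) from a symmetric
    distance matrix D, with sigma the standard deviation of the pairwise
    distances D_ij for i < j (an assumption of this sketch)."""
    n = len(D)
    pairwise = [D[i][j] for i in range(n) for j in range(i + 1, n)]
    mean = sum(pairwise) / len(pairwise)
    sigma = math.sqrt(sum((d - mean) ** 2 for d in pairwise) / len(pairwise))
    if sigma == 0.0:
        raise ValueError("all pairwise distances are equal; sigma is zero")
    return [
        [math.exp(-D[i][j] ** 2 / (2.0 * sigma ** 2)) for j in range(n)]
        for i in range(n)
    ]
```

The diagonal of the result equals 1 and the matrix inherits the symmetry of D, as required of an affinity matrix for spectral clustering.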

Fig. 5 Spectral clustering of copulae, with \(k=6\) clusters, on the earth mover’s distances (EMD) of the copulae. Results are shown on the values of the indicator for every copula. There are six different plots, one for every cluster. Red points indicate the copulae assigned to the specific cluster, while the blue points are the copulae assigned to other clusters. Yellow and red time intervals are the warning and crisis periods, respectively, identified by the indicator

4 Modeling allocation strategies and a new portfolio score

We provide an original framework for modeling allocation strategies and a new cross-sectional portfolio score. We define the score of a given asset allocation as the expected value of the proportion of truly invested portfolios in a stock market that the allocation outperforms when the portfolios have been built according to what we call a mixed strategy.

Here, we assume that in a stock market the portfolio managers make allocation proposals. Then, the investors choose which proposal to select and how much to modify it before they build their final portfolio. Thus, we model a portfolio allocation strategy by a log-concave distribution supported on the portfolio domain P, with its mode being at a benchmark portfolio. Then, an investor builds a portfolio according to that strategy, by generating a point/portfolio from the corresponding distribution.

Definition 1

Let \(\pi\) be a log-concave distribution supported on the portfolio domain \(P\subset {\mathbb {R}}^n\) with Probability Density Function (PDF) \(\pi (x)\). Then, a portfolio allocation strategy \(F:\pi \rightarrow P\) is said to be induced by the distribution \(\pi\), and we write \(F_{\pi }\). More precisely, \(F_{\pi }\) is induced by the following rule:

“To build a portfolio with strategy \(F_{\pi }\) sample a point/portfolio from \(\pi\)”.

The mode of \(\pi\) can be seen as the allocation proposal that a portfolio manager has made. Then, we expect the portfolios of the investors, who have chosen that proposal, to be concentrated around that proposal/mode.

Definition 2

Let the strategy \(F_{\pi }\) be induced by the log-concave distribution \(\pi\). We call the mode of \(\pi\) the formal allocation proposal, or formal proposal, of the portfolio allocation strategy \(F_{\pi }\).

In the sequel, we assume that in a stock market the set of truly invested portfolios is built by a combination of different strategies used by the investors (mixed strategy). First, we consider a sequence of log-concave distributions \(\pi _1,\dots ,\pi _M\) restricted to P. Each distribution induces a portfolio allocation strategy, i.e. \(F_{\pi _1},\dots ,F_{\pi _M}\). Then, the mixed strategy is induced by a convex combination of \(\pi _i\), i.e. by a mixture distribution, as the following definition states.

Definition 3

Let \(\pi _1, \dots ,\pi _M\) be a sequence of log-concave distributions supported on the set of portfolios \(P\subset {\mathbb {R}}^n\), and let the mixture density be \(\pi (x) = \sum _{i=1}^Mw_i\pi _i(x)\), where \(w_i\ge 0,\ \sum _{i=1}^Mw_i=1\). We call \(F_{\pi }\) the mixed strategy induced by the mixture density \(\pi\).

In Definition 3, each weight \(w_i\) corresponds to the proportion of the investors that build their portfolios according to the allocation strategy \(F_{\pi _i}\). Thus, the vector of weights \(w\in {\mathbb {R}}^M\) describes how the investors in a certain stock market and time period tend to behave. Now, we are ready to define the new cross-sectional score of an asset allocation versus a mixed strategy.

Definition 4

Consider a stock market with n assets and let \(F_{\pi }\) be a mixed strategy induced by the mixture density \(\pi\). For given asset returns \(R\in {\mathbb {R}}^n\) over a single period of time, the score of a portfolio, providing a value of return \(R^*\), is

$$\begin{aligned} s = \int _{P} g(x)\pi (x) \mathrm{d}x,\quad g(x) = \left\{ \begin{array}{ll} 1, &{}\quad \text{ if } R^Tx\le R^* ,\\ 0, &{}\quad \text{ otherwise. }\\ \end{array} \right. \end{aligned}$$
(8)

Clearly, the value of the integral in Eq. (8) corresponds to the expected proportion of portfolios that an allocation outperforms—in terms of return—when the portfolios are invested according to the mixed strategy \(F_{\pi }\).
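The integral in Eq. (8) can be approximated by averaging g over samples drawn from the mixture. The sketch below is illustrative only and swaps in a simpler setting than the one used in this paper: the portfolio domain is the simplex, and each component is a Dirichlet distribution with all parameters at least 1, which is log-concave on the simplex and can be sampled exactly, so no Markov Chain Monte Carlo is needed. All function names, the seed, and the sample size are assumptions.

```python
import random


def sample_dirichlet(alpha, rng):
    """Exact sample from Dirichlet(alpha) via normalized gamma variates."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return [v / total for v in g]


def sample_mixed_strategy(alphas, weights, rng):
    """Pick component i with probability w_i, then sample a portfolio from it."""
    (alpha,) = rng.choices(alphas, weights=weights, k=1)
    return sample_dirichlet(alpha, rng)


def estimate_score(R, R_star, alphas, weights, n_samples=20000, seed=0):
    """Monte Carlo estimate of Eq. (8): the expected proportion of portfolios,
    invested according to the mixture, with return R^T x <= R_star."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x = sample_mixed_strategy(alphas, weights, rng)
        if sum(r * w for r, w in zip(R, x)) <= R_star:
            hits += 1
    return hits / n_samples
```

With a single component whose parameters are all equal to 1, the mixture is the uniform distribution on the simplex, and the estimate reduces to the earlier cross-sectional score against the naive strategy.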

4.1 Log-concave distributions in Markowitz’ framework

In this section, we model allocation strategies in Markowitz’s framework using special multivariate log-concave distributions supported on the set of portfolios P. A proper choice of log-concave distributions allows us to parameterize a strategy by the level of risk and the level of dispersion around the formal allocation proposal of the strategy.

In general, using Markowitz’ framework, one can define, under certain assumptions, the optimal portfolio \({\bar{x}}\) as the maximum of a concave function \(h(x),\ x\in P\). Then, the mode of the log-concave distribution with PDF \(\pi (x)\propto e^{\alpha h(x)}\) is \({\bar{x}}\), while the parameter \(\alpha > 0\) controls the variance of the distribution. Large/small values of \(\alpha\) correspond to small/large variance.

Notice that as the variance grows, \(\pi\) converges to the uniform distribution. Moreover, as the variance diminishes, the mass of \(\pi\) concentrates around the mode of \(\pi (x)\). Consequently, we use the variance to parameterize the sequence \(\pi _i\propto e^{\alpha _ih(x)}\). That is, small variances correspond to allocation strategies used by investors who stick around the formal allocation proposal. Large variances correspond to allocation strategies used by investors who may modify the formal proposal a lot. Thus, in the first case, the invested portfolios would be highly concentrated around the formal allocation proposal of \(F_{\pi }\) (or around the mode of \(\pi\)) as the mass of \(\pi\) implies. In the second case, the invested portfolios would be highly dispersed around the mode of \(\pi\). In the extreme case of a very large variance, \(\pi\) is close to the uniform distribution. Then, the induced allocation strategy becomes the naive strategy as defined in Banerjee and Hung (2011). We employ the \(L_2\) norm of a log-concave distribution \(\pi\) with respect to (w.r.t.) the uniform distribution to characterize how dispersed, around the formal proposal, the portfolios built according to \(F_{\pi }\) are. The \(L_2\) norm of a distribution f w.r.t a distribution g, when both are supported on a set \(P\subset {\mathbb {R}}^n\) is,

$$\begin{aligned} \Vert f/g \Vert = {\mathbb {E}}_f\bigg ( \frac{f(x)}{g(x)} \bigg ) = \int _P \frac{f(x)}{g(x)}f(x)\mathrm{d}x = \int _P\bigg ( \frac{f(x)}{g(x)} \bigg )^2 g(x)\mathrm{d}x . \end{aligned}$$
(9)

We can now define what we call a D-dispersed allocation strategy.

Definition 5

Let \(\pi \propto e^{\alpha h(x)}\) be any log-concave distribution supported on the set of portfolios P and let \(F_{\pi }\) be the induced portfolio allocation strategy. We say that \(F_{\pi }\) is D-dispersed, where D is the \(L_2\) norm of \(\pi\) w.r.t. the uniform distribution.
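As a minimal numerical sketch of Eq. (9) and Definition 5, assume a one-dimensional toy set \(P=[0,1]\) and the concave function \(h(x)=-(x-1/2)^2\); both are hypothetical choices made only for illustration, and the function names below are not taken from any implementation. The integral is approximated with a midpoint rule, which is feasible only in very low dimension.

```python
import math

def l2_norm_vs_uniform(alpha, h, lo=0.0, hi=1.0, cells=20000):
    """Midpoint-rule value of Eq. (9) for pi proportional to exp(alpha*h) on [lo, hi],
    taken w.r.t. the uniform distribution on the same interval."""
    width = (hi - lo) / cells
    xs = [lo + (k + 0.5) * width for k in range(cells)]
    w = [math.exp(alpha * h(x)) for x in xs]   # unnormalized density values
    z = sum(w) * width                         # normalizing constant of pi
    u = 1.0 / (hi - lo)                        # uniform density on P
    # integral over P of pi(x)^2 / u
    return sum((wi / z) ** 2 for wi in w) * width / u

def h(x):
    # hypothetical concave function with its maximum at x = 1/2
    return -(x - 0.5) ** 2

# D for three values of alpha: the norm approaches 1 as alpha -> 0
# (pi close to uniform) and grows as the mass concentrates at the mode.
dispersion = {alpha: l2_norm_vs_uniform(alpha, h) for alpha in (0.01, 1.0, 100.0)}
```

Under these assumptions, a strategy with a small \(\alpha\) has D close to 1, while larger values of \(\alpha\) yield larger D.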

Fig. 6
figure 6

Left: illustration of PDFs \(\pi _q\propto e^{-\alpha \phi _q(x)}\), where \(\alpha =1\) and from left to right \(q_1 = 0.3,\ q_2 = 1,\ q_3 = 1.5\). Right: 3 illustrations of the mixture density of Eq. (12), where \(M_1=3,\ M_2=2\). In both plots, each black small star corresponds to a formal allocation proposal of an allocation strategy. From yellow to blue: high to low density regions

Our main approach is to leverage the expected quadratic utility function,

$$\begin{aligned} \phi _q(x) = x^T\Sigma x - q\mu ^Tx,\ x\in P\subset {\mathbb {R}}^n,\quad q\in [0,+\infty ], \end{aligned}$$
(10)

where \(\mu \in {\mathbb {R}}^n\) is the mean and \(\Sigma \in {\mathbb {R}}^{n\times n}\) is the covariance matrix of the assets’ returns and n is the number of assets. This parametric function delivers solutions similar to those of the original Markowitz problem; see Kroll et al. (1984) and Levy and Markowitz (1979). It is also used by investors to compute the efficient frontier and optimal portfolios. The term \(x^T\Sigma x\) is called the risk term, \(\mu ^Tx\) is called the return term, and the parameter q controls the trade-off between return and risk. Typically, in modern finance, a portfolio manager builds an efficient asset allocation by selecting a value \(q_0\), which determines the level of risk of the allocation. Then, according to Markowitz (1956), she/he solves the following optimization problem:

$$\begin{aligned} \min \; \phi _{q_0}(x) = x^T\Sigma x - q_0\mu ^Tx,\;\text { subject to } x\in P. \end{aligned}$$

We call the portfolio \({\bar{x}}=\arg \min \limits _{x\in P}\phi _{q_0}(x)\) the optimal mean-variance portfolio for the risk implied by \(q_0\). Thus, the efficient frontier can be seen as a parametric curve on q.
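The optimization problem above can be sketched for the long-only case, assuming \(P\) is the canonical simplex. The sketch uses projected gradient descent, which is a generic technique chosen here for brevity and not necessarily the solver used by the authors; the covariance matrix and mean vector are made-up inputs.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {x >= 0, sum(x) = 1} (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, uk in enumerate(u, start=1):
        cumulative += uk
        t = (cumulative - 1.0) / k
        if uk - t > 0:          # holds exactly for the first rho indices
            theta = t
    return [max(vi - theta, 0.0) for vi in v]

def phi(x, sigma, mu, q):
    """Quadratic utility of Eq. (10): risk term minus q times return term."""
    n = len(x)
    risk = sum(x[i] * sigma[i][j] * x[j] for i in range(n) for j in range(n))
    ret = sum(mu[i] * x[i] for i in range(n))
    return risk - q * ret

def mean_variance_portfolio(sigma, mu, q, steps=5000):
    """Projected gradient descent for min phi_q(x) over the simplex."""
    n = len(mu)
    # Step 1/L, where L = 2 * max absolute row sum bounds the gradient's Lipschitz constant.
    lr = 1.0 / (2.0 * max(sum(abs(s) for s in row) for row in sigma))
    x = [1.0 / n] * n
    for _ in range(steps):
        grad = [2.0 * sum(sigma[i][j] * x[j] for j in range(n)) - q * mu[i]
                for i in range(n)]
        x = project_to_simplex([x[i] - lr * grad[i] for i in range(n)])
    return x

# Made-up inputs for three assets (illustration only).
sigma = [[0.04, 0.01, 0.00],
         [0.01, 0.09, 0.02],
         [0.00, 0.02, 0.16]]
mu = [0.02, 0.05, 0.08]

x_low = mean_variance_portfolio(sigma, mu, q=0.0)     # minimum-risk end of the frontier
x_high = mean_variance_portfolio(sigma, mu, q=100.0)  # return-seeking end of the frontier
```

Sweeping q between these two extremes traces the parametric curve mentioned above.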

Consider the log-concave distribution,

$$\begin{aligned} \pi _{\alpha , q}\propto e^{-\alpha \phi _q(x)} , \end{aligned}$$
(11)

supported on P. The left plot in Fig. 6 illustrates some examples of the probability density function \(\pi _{\alpha , q}\) where the mean \(\mu\) and the covariance matrix \(\varSigma\) are randomly sampled once. Notice that for different q, the mode (or the formal allocation proposal of the strategy \(F_{\pi _{\alpha ,q}}\)) is shifted.

We use the parameter q to denote the level of risk of a portfolio allocation strategy \(F_{\pi _{\alpha ,q}}\). Small values of q correspond to low risk strategies, whereas large values of q to high risk strategies. Thus, a sequence of such densities can be parameterized by both q (risk) and \(\alpha\) (dispersion). In particular, a mixed strategy \(F_{\pi }\) can be induced by the following mixture density:

$$\begin{aligned} \pi (x) = \sum _{i=1}^{M_1}\sum _{j=1}^{M_2} w_{ij}e^{-\alpha _{ij}\phi _i(x)},\quad \text {where }\phi _i = x^T\Sigma x - q_i\mu ^Tx ,\ x\in P , \end{aligned}$$
(12)

where each \(q_i\) denotes the level of risk. For each \(q_i\), the parameters \(\alpha _{ij}\) determine the level of dispersion of the strategy \(F_{\pi _{ij}}\). Notice that for each level of risk \(q_i\) there are \(M_2\) different levels of dispersion, corresponding to different groups of investors whose portfolios spread around the same formal allocation proposal. The right plot of Fig. 6 illustrates some examples of this mixture density.

Since the portfolio score in Definition 4 is equal to the expectation of an indicator function with respect to the measure induced by a mixture of log-concave distributions, it cannot be computed exactly (e.g. in closed form). In the sequel, we discuss how we can estimate the value of the new score by approximating the value of the corresponding multivariate integral.

4.2 Computation of the score

This section provides a Markov Chain Monte Carlo (MCMC) integration method that guarantees a fast and robust approximation, within arbitrarily small error, of the score in Definition 4. Let the probability density function \(\pi (x) =\sum _{i=1}^Mw_i\pi _i(x)\) be a mixture of log-concave densities (i.e. \(\pi _i\) are log-concave distributions). Furthermore, consider the vector of assets’ returns \(R\in {\mathbb {R}}^n\), the halfspace \(H(R^*):=\{ x\in {\mathbb {R}}^n\ |\ R^Tx\le R^* \}\) and the indicator function \(g(x) = \left\{ \begin{array}{ll} 1, &{} \text{ if } x\in H(R^*) ,\\ 0, &{} \text{ otherwise. }\\ \end{array} \right.\) Then the score in Eq. (8) can be written,

$$\begin{aligned} \begin{aligned} s&= \int _{P}g(x)\sum _{i=1}^Mw_i\pi _i(x)\mathrm{d}x = \sum _{i=1}^Mw_i\int _{P}g(x)\pi _i(x)\mathrm{d}x \\&= \sum _{i=1}^Mw_i\int _{P\cap H(R^*)}\pi _i(x)\mathrm{d}x = \sum _{i=1}^Mw_i\int _{S}\pi _i(x)\mathrm{d}x , \end{aligned} \end{aligned}$$
(13)

where \(S:=P\cap H(R^*)\).
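The decomposition in Eq. (13) can be illustrated in low dimension with a simple stand-in for MCMC: self-normalized importance sampling from the uniform distribution on the simplex. This is a swapped-in technique that is only reasonable for a handful of assets; it is not the method analysed in this section. All numerical inputs and function names are hypothetical.

```python
import math
import random

def uniform_simplex_point(n, rng):
    """Uniform point on the canonical simplex via normalized exponentials."""
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [v / total for v in e]

def phi(x, sigma, mu, q):
    n = len(x)
    risk = sum(x[i] * sigma[i][j] * x[j] for i in range(n) for j in range(n))
    return risk - q * sum(mu[i] * x[i] for i in range(n))

def mixture_score(r, r_star, sigma, mu, components, samples=20000, seed=0):
    """Estimate s = sum_i w_i * int_S pi_i, with pi_i proportional to exp(-alpha_i * phi_{q_i}).

    components is a list of (w_i, alpha_i, q_i); the weights are assumed to sum to 1.
    """
    rng = random.Random(seed)
    pts = [uniform_simplex_point(len(mu), rng) for _ in range(samples)]
    # indicator g(x): is the point inside the halfspace H(R*)?
    inside = [sum(ri * xi for ri, xi in zip(r, x)) <= r_star for x in pts]
    score = 0.0
    for w, alpha, q in components:
        dens = [math.exp(-alpha * phi(x, sigma, mu, q)) for x in pts]
        mass_in_s = sum(d for d, ok in zip(dens, inside) if ok)
        score += w * mass_in_s / sum(dens)   # normalized mass of pi_i in S
    return score

# Made-up inputs (illustration only).
sigma = [[0.04, 0.01, 0.00], [0.01, 0.09, 0.02], [0.00, 0.02, 0.16]]
mu = [0.02, 0.05, 0.08]
returns = [0.01, -0.02, 0.03]                 # vector R of asset returns
components = [(0.5, 10.0, 0.5), (0.3, 50.0, 1.0), (0.2, 50.0, 2.0)]
score = mixture_score(returns, 0.005, sigma, mu, components)
```

Here `score` estimates the fraction of investors, under the assumed mixed strategy, whose portfolios return at most \(R^*=0.005\).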

It is clear that the computation of the score s reduces to integrating M log-concave functions over a convex set S, i.e. to computing each \(\int _{S}\pi _i(x)\mathrm{d}x,\ i\in [M]\). For each one of these M integrals, we use the algorithm presented in Lovasz and Vempala (2006) to approximate it within an arbitrarily small error after a number of operations that grows polynomially with the dimension (number of assets) n. First, we use an alternative representation of the volume of S, employing a log-concave function \(\pi (x)\),

$$\begin{aligned} \begin{aligned} {\mathrm{vol}}(S)&= \int _{S}\pi (x)\mathrm{d}x\ \frac{\int _{S}\pi (x)^{\beta _1}\mathrm{d}x}{\int _{S}\pi (x)\mathrm{d}x}\ \frac{\int _{S}\pi (x)^{\beta _2}\mathrm{d}x}{\int _{S}\pi (x)^{\beta _1}\mathrm{d}x}\ \cdots \ \frac{\int _{S}1\mathrm{d}x}{\int _{S}\pi (x)^{\beta _k}\mathrm{d}x}\\&\Rightarrow \int _{S}\pi (x)\mathrm{d}x = {\mathrm{vol}}(S)\ \frac{\int _{S}\pi (x)^{\beta _k}\mathrm{d}x}{\int _{S}1\mathrm{d}x}\ \cdots \ \frac{\int _{S}\pi (x)\mathrm{d}x}{\int _{S}\pi (x)^{\beta _1}\mathrm{d}x} , \end{aligned} \end{aligned}$$
(14)

where the sequence \(\beta _j,\ j\in [k]\) are factors applied on the variance of \(\pi (x)\).

Since S is the intersection of a halfspace with the convex polytope P, we use the algorithm in Cousins and Vempala (2015) to approximate \({\mathrm{vol}}(S)\) within error \(\epsilon\) after \(O^*(n^3)\) operations, where \(O^*(\cdot )\) suppresses polylogarithmic factors and dependence on \(\epsilon\). In the special case of \(P=\varDelta ^{n-1}\), we can compute the exact value of \({\mathrm{vol}}(S)\) using Varsi’s algorithm (Varsi 1973) after \(n^2\) operations at most. Consequently, the computation of \(\int _{S}\pi (x)\mathrm{d}x\) reduces to computing k ratios of integrals. For each ratio we have,

$$\begin{aligned} \begin{aligned} r_j&= \frac{\int _{S}\pi (x)^{\beta _{j-1}}\mathrm{d}x}{\int _{S}\pi (x)^{\beta _j}\mathrm{d}x} = \frac{1}{\int _S \pi (x)^{\beta _j}\mathrm{d}x}\int _S\frac{\pi (x)^{\beta _{j-1}}}{\pi (x)^{\beta _j}}\pi (x)^{\beta _j}\mathrm{d}x \\&= \int _S\frac{\pi (x)^{\beta _{j-1}}}{\pi (x)^{\beta _j}}\frac{\pi (x)^{\beta _j}}{\int _S \pi (x)^{\beta _j}\mathrm{d}x}\mathrm{d}x . \end{aligned} \end{aligned}$$
(15)

Thus, to estimate \(r_j\), we just have to sample N points from the distribution proportional to \(\pi (x)^{\beta _j}\) and restricted to S. Then,

$$\begin{aligned} r_j\approx \frac{1}{N}\sum _{i=1}^N\frac{\pi (x_i)^{\beta _{j-1}}}{\pi (x_i)^{\beta _j}} \end{aligned}$$
(16)

as N grows. The key to an efficient approximation of \(r_j\) using Monte Carlo integration is to set \(\beta _{j-1},\ \beta _{j}\) such that the variance of \(r_j\) is as small as possible (ideally a constant) for N as small as possible. Lovasz and Vempala (2006) prove that the sequence of \(\beta _1,\dots ,\beta _k\) can be fixed such that the variance of each \(r_j,\ j\in [k]\) is bounded by a constant. Moreover, \(N=O^*(\sqrt{n})\) points per integral ratio \(r_j\) and \(k=O^*(\sqrt{n})\) ratios in total suffice to approximate each \(\int _{S}\pi _i(x)\mathrm{d}x, i\in [M]\) within error \(\epsilon\). Thus, \(O^*(n)\) points suffice to estimate each \(\int _{S}\pi _i(x)\mathrm{d}x\).
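The estimator in Eq. (16) and the telescoping product of Eq. (14) can be checked on a one-dimensional toy problem. The sketch assumes \(S=[0,1]\), the hypothetical function \(\pi (x)=e^{-4x^2}\), and a short hand-picked schedule with the convention \(\beta _0=1\) and a final exponent 0; samples are drawn by plain rejection sampling instead of a random walk, which only works because the example is tiny.

```python
import math
import random

def pi(x, alpha=4.0):
    # hypothetical log-concave function on S = [0, 1]; note 0 < pi <= 1 on S
    return math.exp(-alpha * x * x)

def sample_power(beta, n, rng):
    """Rejection sampling from the density proportional to pi^beta on [0, 1]."""
    out = []
    while len(out) < n:
        x = rng.random()
        if rng.random() <= pi(x) ** beta:
            out.append(x)
    return out

def ratio_estimate(beta_prev, beta_cur, n, rng):
    """Eq. (16): average of pi^beta_prev / pi^beta_cur over samples from pi^beta_cur."""
    xs = sample_power(beta_cur, n, rng)
    return sum(pi(x) ** beta_prev / pi(x) ** beta_cur for x in xs) / n

def integral(beta, cells=20000):
    """Midpoint-rule reference value of the integral of pi^beta over S."""
    width = 1.0 / cells
    return sum(pi((k + 0.5) * width) ** beta for k in range(cells)) * width

rng = random.Random(1)
betas = [1.0, 0.5, 0.25, 0.0]   # assumed schedule, from pi itself down to the constant 1
estimates = [ratio_estimate(betas[j - 1], betas[j], 20000, rng)
             for j in range(1, len(betas))]
exact = [integral(betas[j - 1]) / integral(betas[j]) for j in range(1, len(betas))]

# vol(S) = 1 here, so the product of the ratios estimates the integral of pi over S.
product = 1.0
for r_j in estimates:
    product *= r_j
```

Because consecutive exponents are close, every summand in Eq. (16) stays in a narrow range, which is the bounded-variance property the schedule is designed to ensure.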

Lemma 1

Let the PDF \(\pi (x)\) in Definition 4 be a mixture of M log-concave densities. The integral ratio in Eq. (16) can be estimated with \(O^*(n)\) samples from \(\pi (x)^{\beta _j}\) within error \(\epsilon\). Thus, the portfolio score in Eq. (8) can be estimated using \(O^*(Mn)\) samples.

To sample from each target distribution proportional to \(\pi (x)^{\beta _j}\) and restricted to S, Lovasz and Vempala (2006) use the Hit-and-Run random walk (Vempala 2005). This implies a total number of \(O^*(n^4)\) arithmetic operations per generated point. Thus the total number of arithmetic operations to estimate the score s is \(O^*(Mn^5)\). In our implementation, to sample from a log-concave distribution supported on P, we use the reflective Hamiltonian Monte Carlo in Afshar and Domke (2015), which is more efficient in practice than Hit-and-Run. For an extended introduction to geometric random walks, we refer to Vempala (2005).

5 Mixed strategies

An important question is how one could set the risk and dispersion parameters \(q_i,\ \alpha _{ij}\) and the weight \(w_{ij}\) of each allocation strategy \(F_{\pi _{q_i,\alpha _{ij}}}\) in a certain stock market. The issue is that our knowledge about the stock market and the behavior of the investors in it might be weak or vary from one time period to another. In this section, we provide practical methods to set the parameters of a sequence of log-concave distributions. We also present versions of the score that differ from the one given in Definition 4. More details about the computational methods we use in this section are given in “Appendix A”.

5.1 Set the levels of dispersion

Let \(h(x):P\rightarrow {\mathbb {R}}\) be a concave function, where \(P\subset {\mathbb {R}}^n\) is the set of portfolios. Also, consider the log-concave probability density function,

$$\begin{aligned} \pi _{\alpha }(x) \propto e^{\alpha h(x)},\ \alpha > 0 , \end{aligned}$$
(17)

supported also on P, with \({\tilde{x}}\in P\) the mode of \(\pi _{\alpha }\). Recall that small/large values of \(\alpha\) correspond to large/small values of variance of \(\pi _{\alpha }\). Thus, first we compute a value \(\alpha _L\) such that \(F_{\pi _{\alpha _L}}\) is an e-dispersed allocation strategy; that is, the distribution \(\pi _{\alpha _L}\) is e-close to the uniform distribution according to the \(L_2\) norm. Second, we compute a value \(\alpha _U\) such that the mass of the distribution \(\pi _{\alpha _U}\) is almost entirely concentrated in a ball \(B({\tilde{x}},\delta )\), that is, a ball centered at the mode \({\tilde{x}}\) with a small radius \(\delta > 0\). Then, our aim is to compute a sequence \(\alpha _L = \alpha _1< \dots < \alpha _{k} = \alpha _U\) such that,

$$\begin{aligned} \Vert \pi _{\alpha _{i+1}} / \pi _{\alpha _i} \Vert = \Vert \pi _{\alpha _i} / \pi _{\alpha _{i-1}} \Vert ,\ i \in \{ 2,\dots , k-1 \} . \end{aligned}$$
(18)

To compute \(\alpha _L\) we start with \(\alpha _0 = 1\) and we use the annealing schedule in Cousins and Vempala (2015). In particular, we generate the sequence,

$$\begin{aligned} \alpha _i = \alpha _0 \bigg ( 1 - \frac{1}{n} \bigg )^i,\quad i\in \mathbb {N_+} . \end{aligned}$$
(19)

This schedule guarantees that a sample from \(\pi _{\alpha _{i}}\) is a warm start to sample from \(\pi _{\alpha _{i+1}}\) for several random walks (Lovasz and Vempala 2006; Cousins and Vempala 2015) and moreover, the variance of the distribution which is proportional to \(e^{(\alpha _{i+1}-\alpha _i)h(x)}\) is O(1); that is, each jump to the next distribution in the sequence is “small”. Next, for each \(\alpha _i\) we estimate the \(L_2\) norm of \(\pi _{\alpha _i}\) w.r.t. the uniform distribution, by sampling from \(\pi _{\alpha _i}\). We stop when the norm is smaller than a given threshold.
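The search for \(\alpha _L\) with the decreasing schedule of Eq. (19) can be sketched as follows. Two simplifications are assumptions of this sketch: the \(L_2\) norm is estimated from uniform samples on the simplex through the identity \(\Vert \pi _\alpha /u\Vert = {\mathbb {E}}_u[w^2]/{\mathbb {E}}_u[w]^2\) with \(w=e^{\alpha h}\), instead of sampling from \(\pi _{\alpha _i}\), and the concave function h and the threshold are hypothetical.

```python
import math
import random

def uniform_simplex_point(n, rng):
    e = [rng.expovariate(1.0) for _ in range(n)]
    total = sum(e)
    return [v / total for v in e]

def h(x):
    # hypothetical concave function, maximal at the equally weighted portfolio
    n = len(x)
    return -50.0 * sum((xi - 1.0 / n) ** 2 for xi in x)

def l2_norm_vs_uniform(alpha, h_values):
    """Estimate of ||pi_alpha / uniform|| as E_u[w^2] / E_u[w]^2 with w = exp(alpha*h)."""
    w = [math.exp(alpha * hv) for hv in h_values]
    m1 = sum(w) / len(w)
    m2 = sum(v * v for v in w) / len(w)
    return m2 / (m1 * m1)

n = 4                      # number of assets (toy value)
threshold = 1.01           # hypothetical stopping threshold on the norm
rng = random.Random(0)
h_values = [h(uniform_simplex_point(n, rng)) for _ in range(5000)]

alpha = 1.0                # alpha_0 = 1 as in the text
path = [alpha]
while l2_norm_vs_uniform(alpha, h_values) >= threshold:
    alpha *= 1.0 - 1.0 / n   # Eq. (19): alpha_i = alpha_0 * (1 - 1/n)^i
    path.append(alpha)
alpha_L = alpha
```

The list `path` records the geometric sequence visited before the norm first drops below the threshold.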

To compute \(\alpha _U\), we use the same annealing schedule, but now we generate an increasing sequence,

$$\begin{aligned} \alpha _i = \alpha _0 \bigg ( 1 + \frac{1}{n} \bigg )^i,\ i\in \mathbb {N_+} . \end{aligned}$$
(20)

We stop when we meet the smallest i such that \(100(1-\epsilon )\%\) of the mass of \(\pi _{\alpha _i}\) is inside the ball \(B({\tilde{x}},\delta )\) with high probability. We probabilistically guarantee this by sampling a sufficiently large number of points from \(\pi _{\alpha _i}\) and by splitting the sample into \(\nu\) sub-samples. For each sub-sample we compute the ratio of points that lie in \(B({\tilde{x}},\delta )\); that is, we obtain \(\nu\) ratios. Then, we perform a t-test using those ratios while the null hypothesis states that the overall ratio is larger than \((1-\epsilon )\). We stop for an \(\alpha _i\) that results in rejecting the null hypothesis.

Finally, to compute a sequence of equidistant distributions as in Eq. (18), we estimate \(d = \max \nolimits _{i\in [k-1]}\{ \Vert \pi _{\alpha _{i+1}} / \pi _{\alpha _i} \Vert \}\). Then, we start from \(\alpha _1 = \alpha _L\). Given \(\alpha _i\), to compute the next parameter value in the sequence, namely \(\alpha _{i+1}\), we perform a bisection method in the interval \([\alpha _{i}, \alpha _U]\) to compute a value such that the \(L_2\) norm of \(\pi _{\alpha _{i+1}}\) w.r.t. \(\pi _{\alpha _i}\) is \(d\pm \epsilon\) with high probability for a small \(\epsilon > 0\). We stop when we compute an \(\alpha _i > \alpha _U\) and we set \(\alpha _k = \alpha _i\). To select M values of \(\alpha\) we pick \(\alpha _1\) and \(\alpha _k\) and then, we equidistantly pick \(M-2\) values in between them.

5.2 Set the levels of risk

Our practical method computes a sequence \(q_1<\dots <q_M\). The values \(q_i\) are equidistant with respect to the portfolio volatility that each \(q_i\) corresponds to. First, we compute the minimum and the maximum value of portfolio volatility. The portfolio attaining the minimum is also called the Global Minimum Variance portfolio (Zhao et al. 2020). In particular, we solve the following optimization problems,

$$\begin{aligned} \min /\max \ x^T{\tilde{\Sigma }} x,\ x\in P,\ {\tilde{\Sigma }}\in {\mathbb {R}}^{n\times n}\text { pos. def.} \end{aligned}$$
(21)

where P is the set of portfolios and \({\tilde{\Sigma }}\) is an estimate of the covariance matrix obtained with the shrinkage estimator of Ledoit and Wolf (2004). Let \(v_{\min }\) and \(v_{\max }\) denote the minimum and the maximum portfolio volatility, respectively. To compute M values of the parameter q, we equidistantly select M values of portfolio volatility \(v_{\min }< v_1< \dots<v_M < v_{\max }\). Then, for each \(v_i\) we perform a bisection method in a proper interval \([q_{\min },q_{\max }]\) to compute a \(q_i\) such that,

$$\begin{aligned} | \min \limits _{x\in P} \phi _{q_i}(x) - v_i | \le \epsilon ,\quad i\in [M] , \end{aligned}$$
(22)

for a sufficiently small value of \(\epsilon > 0\), while \(\phi _q(x)\) is the expected quadratic utility function in Eq. (10). In particular, for each \(q_i\), we search in \([q_{i-1},q_M],\ i \in \{2,\dots ,M-1\}\); for \(q_1\), we search in \([0,q_M]\). To compute \(q_M\), we search for the smallest non-negative integer j such that \(\min \limits _{x\in P} \phi _{2^j}(x) > v_M\). Then, we perform a bisection method in \([0, 2^j]\) to compute \(q_M\).
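The doubling-then-bisection search can be sketched for two assets, where the minimizer of \(\phi _q\) over the simplex has a closed form. This sketch matches the risk term \({\bar{x}}^T\Sigma {\bar{x}}\) of the minimizer to each target \(v_i\); reading the matched quantity this way is an assumption of the sketch, chosen because that risk term is non-decreasing in q. The covariance entries and means are made-up numbers.

```python
def minimizer_t(q, s11, s12, s22, m1, m2):
    """Weight t of asset 1 in argmin of phi_q over x = (t, 1 - t), t in [0, 1]."""
    a = s11 - 2.0 * s12 + s22                  # curvature, positive if Sigma is pos. def.
    b = 2.0 * s12 - 2.0 * s22 - q * (m1 - m2)  # linear coefficient of phi in t
    return min(1.0, max(0.0, -b / (2.0 * a)))

def risk(t, s11, s12, s22):
    """Risk term x^T Sigma x for x = (t, 1 - t)."""
    return s11 * t * t + 2.0 * s12 * t * (1.0 - t) + s22 * (1.0 - t) ** 2

# Made-up inputs (illustration only).
S11, S12, S22 = 0.04, 0.01, 0.16
M1, M2 = 0.02, 0.08

def risk_of_q(q):
    return risk(minimizer_t(q, S11, S12, S22, M1, M2), S11, S12, S22)

def q_for_target(v, iters=80):
    """Find q such that the minimizer of phi_q has risk term v."""
    j = 0
    while risk_of_q(2.0 ** j) < v:     # smallest power of two that brackets the target
        j += 1
        if j > 60:
            raise ValueError("target risk is not attainable")
    lo, hi = 0.0, 2.0 ** j
    for _ in range(iters):             # bisection on a non-decreasing function of q
        mid = 0.5 * (lo + hi)
        if risk_of_q(mid) < v:
            lo = mid
        else:
            hi = mid
    return hi

v_min = risk_of_q(0.0)                 # minimum-risk portfolio
v_max = max(S11, S22)                  # maximum of a convex function sits at a vertex
M = 3
targets = [v_min + k * (v_max - v_min) / (M + 1) for k in range(1, M + 1)]
q_levels = [q_for_target(v) for v in targets]
```

The resulting `q_levels` are increasing and correspond to equidistant risk levels strictly between `v_min` and `v_max`.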

5.3 Set the composition of the investors

The computation of both sequences of q and \(\alpha\) allows us to specify the sequence of log-concave distributions,

$$\begin{aligned} \pi _{ij}(x)\propto e^{-\alpha _{ij}\phi _{q_i}(x)},\quad i\in [M_1],\ j\in [M_2] , \end{aligned}$$
(23)

where we assume that for each level of risk \(q_i\) we have \(M_2\) levels of dispersion. However, to determine a mixed strategy one has to determine the weights \(w_{ij}\) in the corresponding mixture distribution. We recall that each \(w_{ij}\) implies the proportion of investors that build their portfolios according to the allocation strategy induced by \(\pi _{ij}\). Setting \(w_{ij}\) forms the mixed strategy \(F_{\pi }\) while the score in Definition 4 becomes,

$$\begin{aligned} s = \sum _{i=1}^{M_1}\sum _{j=1}^{M_2}w_{ij}\int _{S}\pi _{ij}(x)\mathrm{d}x,\ S:=P\cap H(R^*) . \end{aligned}$$
(24)

First, we allow setting additional bounds on \(w_{ij}\). For example, one could provide an upper/lower bound on the proportion of the investors who choose a specific allocation strategy. In particular, let us assume that we estimate the \(M = M_1 M_2\) integrals of Eq. (24) as described in Sect. 4.2, where M is the number of allocation strategies in a certain stock market. Then, let these M values form a vector \(c\in {\mathbb {R}}^M\). Also let the corresponding weights \(w_{ij}\) in Eq. (24) form a vector \(w\in {\mathbb {R}}^M\), so that the score is,

$$\begin{aligned} s = \langle c, w\rangle , \end{aligned}$$
(25)

where \(\langle \cdot , \cdot \rangle\) denotes the inner product between two vectors. Given a matrix \(A\in {\mathbb {R}}^{N\times M}\) and a vector \(b\in {\mathbb {R}}^N\), consider the following feasible region of weights,

$$\begin{aligned} Q = \left\{ w\in {\mathbb {R}}^M\ \bigg |\ Aw\le b,\ w_i\ge 0,\ \sum _{i=1}^M w_i = 1 \right\} \subset {\mathbb {R}}^M . \end{aligned}$$
(26)

The matrix A and the vector b are used to express N further constraints on the weights (e.g. lower, upper bounds or any linear constraint on \(w_{ij}\)). Notice that if no further constraints are given on the weights, then the feasible region Q is the canonical simplex \(\varDelta ^{M-1}\).

Now, let us define three new versions of the score s in Eq. (24).

figure a

For the scores \(s_{\min }\) and \(s_{\max }\), one has to solve a linear program for each one of them. The score \(\bar s\) requires the computation of an integral, which can be computed with MCMC integration employing uniform sampling from Q; alternatively, it can be reduced to the computation of the volume of a convex polytope since \(\langle c,w \rangle\) is a linear function of w with the domain being the set Q.

Let \(w_1\in Q\) be such that the min score \(s_{\min } = \langle c,w_1\rangle\). The weights denoted by the vector \(w_1\) imply the proportions of the investors that select each allocation strategy such that the portfolio score s takes its minimum possible value. Similarly, the vector of weights \(w_2\in Q\) such that the max score \(s_{\max } = \langle c,w_2\rangle\) implies the proportions of the investors that select each allocation strategy such that the portfolio score s takes its maximum possible value. Moreover, it is easy to notice that the mean score \({\bar{s}} = \langle c,{\bar{w}}\rangle\), where the vector of weights \({\bar{w}}\) is the center of mass of Q. For example, if \(Q=\varDelta ^{M-1}\) (i.e. the case where no further constraints are given on the weights) the vector \({\bar{w}}\) is the equally weighted vector.
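The three scores can be sketched for the special case where the extra constraints on Q are simple lower and upper bounds on each weight. For this box-constrained case the two linear programs are solved exactly by a greedy fill, and the mean score is estimated by rejection sampling from the uniform distribution on the simplex; a general matrix A would need a generic LP solver and a proper sampler instead. The vector c below is a made-up stand-in for the estimated integrals.

```python
import random

def extreme_score(c, lower, upper, maximize=False):
    """min (or max) of <c, w> subject to sum(w) = 1 and lower <= w <= upper."""
    w = list(lower)
    remaining = 1.0 - sum(w)
    if remaining < -1e-12:
        raise ValueError("lower bounds exceed total mass 1")
    # Give the free mass to the cheapest (or most valuable) coordinates first.
    order = sorted(range(len(c)), key=lambda i: c[i], reverse=maximize)
    for i in order:
        add = min(upper[i] - w[i], remaining)
        w[i] += add
        remaining -= add
    if remaining > 1e-12:
        raise ValueError("upper bounds cannot absorb total mass 1")
    return sum(ci * wi for ci, wi in zip(c, w)), w

def mean_score(c, lower, upper, samples=20000, seed=0, max_tries=2000000):
    """Monte Carlo mean of <c, w> for w uniform on Q (rejection from the simplex)."""
    rng = random.Random(seed)
    total, accepted, tries = 0.0, 0, 0
    while accepted < samples and tries < max_tries:
        tries += 1
        e = [rng.expovariate(1.0) for _ in c]
        s = sum(e)
        w = [v / s for v in e]
        if all(lo <= wi <= hi for wi, lo, hi in zip(w, lower, upper)):
            total += sum(ci * wi for ci, wi in zip(c, w))
            accepted += 1
    return total / accepted

c = [0.2, 0.5, 0.9, 0.4]          # hypothetical values of the M integrals
free_lo, free_hi = [0.0] * 4, [1.0] * 4
s_min, w_min = extreme_score(c, free_lo, free_hi)
s_max, w_max = extreme_score(c, free_lo, free_hi, maximize=True)
s_bar = mean_score(c, free_lo, free_hi)

# Same scores when no strategy may attract more than half of the investors.
capped_min, _ = extreme_score(c, free_lo, [0.5] * 4)
capped_max, _ = extreme_score(c, free_lo, [0.5] * 4, maximize=True)
```

With no extra constraints, the extremes sit at vertices of the simplex and `s_bar` is close to the plain average of c, in line with the equally weighted center of mass noted above.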

However, one may have additional knowledge on how the investors tend to behave in a certain stock market, i.e. which allocation strategies they tend to select. We also allow for these degrees of freedom by providing the notion of behavioral functions in our context.

5.3.1 Behavioral functions

In this section, we assume that we are given a set of functions that represents the knowledge one may have about which allocation strategies the investors tend to select in a certain stock market and time period. We assume that we are given \(M_1 + 1\) functions \(f_q, f_{\alpha ,i},\ i\in [M_1]\), each with domain [0, 1]. We call these functions behavioral functions and we use them to create a vector of weights \(w\in {\mathbb {R}}^{M}\) that emphasizes specific strategies, where \(M=M_1M_2\) is the total number of allocation strategies that take place in the stock market. More specifically, \(f_q\) declares the levels of risk that the investors tend to select, while \(f_{\alpha , i}\) declares the level of dispersion that the portfolios of the investors who select risk \(q_i\) tend to have around the formal allocation proposal.

Fig. 7
figure 7

Examples of behavioral functions

The plots in Fig. 7 demonstrate four possible choices of such functions. For example, if plot C is \(f_q\) then the investors tend to select low-risk investments; the value of \(f_q\) is high for small values of q (low risk) and low for high values of q (high risk). In addition, if the plot D is \(f_{\alpha , i}\) then, the portfolios of the investors who select risk \(q_i\) tend to be highly stuck around the formal allocation proposal that corresponds to \(q_i\); the value of \(f_{\alpha , i}\) is large for large values of \(\alpha\) (low dispersion) and small for small values of \(\alpha\) (high dispersion).

To compute a weight vector w, we map the intervals \([\alpha _{i1}, \alpha _{iM_2}]\) and \([v_{\min }, v_{\max }]\) onto [0, 1] by using the following transformation,

$$\begin{aligned} z(t) = \frac{1}{d-c}(t-c),\ t\in [c,d] . \end{aligned}$$
(27)

Throughout this paper, when we write \(z(\cdot )\) we assume that the interval [c, d] is defined properly according to the input.

The following pseudo-code describes how we compute such a weight vector when \(M_1+1\) behavioral functions are given.

figure b

Given the behavioral functions, one could use the vector of weights determined as in the above pseudo-code; then, the portfolio score is \(s = \langle c,w\rangle\), where \(c\in {\mathbb {R}}^M\) is the vector that contains the values of the integrals in Eq. (24).
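A possible reading of this construction is sketched below: each strategy receives the product of its risk and dispersion behavioral values after rescaling with Eq. (27), and the products are normalized to sum to one. The normalization step, the row-major ordering of the strategies, and the linear shapes of the behavioral functions are assumptions of this sketch, since they are not spelled out in the surrounding text.

```python
def z(t, c, d):
    """Eq. (27): affine map of [c, d] onto [0, 1]."""
    return (t - c) / (d - c)

def behavioral_weights(v, alphas, f_q, f_alpha, v_min, v_max):
    """Weights over M1*M2 strategies from M1 + 1 behavioral functions.

    v[i] is the volatility of risk level i, alphas[i] the increasing list of
    dispersion parameters for that level, f_alpha[i] its dispersion function.
    """
    raw = []
    for i, v_i in enumerate(v):
        lo, hi = alphas[i][0], alphas[i][-1]
        for a in alphas[i]:                      # row-major order: index i * M2 + j
            raw.append(f_q(z(v_i, v_min, v_max)) * f_alpha[i](z(a, lo, hi)))
    total = sum(raw)
    return [r / total for r in raw]              # normalization (assumed)

# Hypothetical linear behavioral functions with max/min ratio equal to 10.
def favors_low(t):
    return 10.0 - 9.0 * t

def favors_high(t):
    return 1.0 + 9.0 * t

v_min, v_max = 0.03, 0.16
v = [0.05, 0.09, 0.13]                            # M1 = 3 risk levels
alphas = [[1.0, 5.0, 20.0, 80.0]] * 3             # M2 = 4 dispersion levels each
w = behavioral_weights(v, alphas, favors_low, [favors_high] * 3, v_min, v_max)
c = [0.3] * 12                                    # placeholder integrals
score = sum(ci * wi for ci, wi in zip(c, w))
```

In this example low-risk strategies and tightly concentrated strategies (large \(\alpha\)) receive the largest weights.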

5.3.2 Parametric score

In this section, we assume weaker knowledge of how the investors tend to behave than that in Sect. 5.3.1. Thus, we do not explicitly determine the vector of weights \(w\in {\mathbb {R}}^M\), where M is the number of allocation strategies in a certain stock market. In particular, define the coordinates of the vector \(r\in {\mathbb {R}}^M\) as in Sect. 5.3.1,

$$\begin{aligned} r_{(i-1)M_1 + j} \leftarrow f_q(z(v_i))f_{\alpha , i}(z(\alpha _{ij})),\ i\in [M_1],\ j\in [M_2] \end{aligned}$$
(28)

where \(f_q,\ f_{\alpha , i}\) are the \(M_1+1\) behavioral functions. Then, we use the vector r to denote a bias in the investors' behavior. First, we again allow further bounds and linear constraints on the weights. That is, we let the feasible region of the weights be the set Q of Eq. (26). To denote the bias in the investors' behavior, we employ the exponential distribution

$$\begin{aligned} p_T(w)\propto e^{\langle r, w\rangle /T} ,\ T>0 , \end{aligned}$$
(29)

with the support of \(p_T(w)\) being the set \(Q\subset {\mathbb {R}}^M\).

The distribution \(p_{T}(w)\propto e^{\langle r, w\rangle /T}\) is usually called the Boltzmann distribution and the vector r the bias vector. In general, the Boltzmann distribution gives the probability that a system will be in a certain state as a function of that state’s energy and the temperature T of the system. The bias vector r determines how the mass tends to distribute in Q and the temperature parameter T how strong the bias denoted by r is.

For example, when Q is the canonical simplex \(\varDelta ^{M-1}\), the mass of \(p_T\) tends to concentrate around the vertices which correspond to the coordinates of r with larger values than the other coordinates. Moreover, as the temperature \(T\rightarrow 0\) this tendency becomes stronger until almost all the mass concentrates around the vertex which corresponds to the coordinate of the largest value of r. As \(T\rightarrow \infty\), \(p_{T}\) converges to the uniform distribution and the bias denoted by r disappears.

We use the temperature T to parameterize the strength of the tendency in the investors’ behavior that the bias vector r implies. Then the parametric score is given as,

$$\begin{aligned} \begin{aligned} s(T)&:= \int _{Q} \langle c, w\rangle \ p_{T}(w) \mathrm{d}w,\ \text {where }p_{T}(w) \propto e^{\langle r, w\rangle /T},\ T>0\\&\quad \text { and each coordinate }\quad r_{(i-1)M_1 + j}= f_{q}(z(v_i))f_{\alpha , i}(z(\alpha _{ij})),\\&\quad \text { and } v_i = \min \limits _{x\in P}\phi _{q_i}(x),\ i\in [M_1],\ j\in [M_2] \end{aligned} \end{aligned}$$
(30)

Let \({\bar{w}}_T\) be the center of mass of Q when the mass is distributed according to \(p_T(w)\). Notice that \({\bar{w}}_T\) can be seen as a parametric curve on T. Furthermore, it is easy to notice that, for fixed T, the parametric score \(s(T) = \langle c,{\bar{w}}_T\rangle\). Thus, the score s(T) is evaluated on that parametric curve. Following these observations, we are ready to state the following Lemma.

Lemma 2

Consider a stock market with M allocation strategies. Assume that we are given the parameters \(q_i,\ \alpha _{ij}\) of Sects. 5.1 and 5.2 and any behavioral functions \(f_q,\ f_{\alpha ,i},\ i\in [M_1],\ j\in [M_2]\), where \(M=M_1M_2\) is the number of allocation strategies that take place in the stock market. Consider the feasible set \(Q\subset {\mathbb {R}}^{M}\) of the weights as in Eq. (26), the min score \(s_{\min }\), the max score \(s_{\max }\) and the mean score \({\bar{s}}\) in Sect. 5.3 and the parametric score in Eq. (30). Then, the following hold,

$$\begin{aligned} \begin{aligned}&s_{\min }\le s(T) \le s_{\max },\ \forall T>0,\\&{\bar{s}} = \lim _{T\rightarrow \infty }s(T) \end{aligned} \end{aligned}$$
(31)

Notice that Eq. (31) holds for any set of behavioral functions as the scores \(s_{\min },\ s_{\max }\) always bound the parametric score. Furthermore, when \(T\rightarrow \infty\) the distribution \(p_T(w)\) converges to the uniform distribution over the feasible region of the weights Q and thus the parametric score converges to the mean score \({\bar{s}}\).
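The behavior stated in Lemma 2 can be observed numerically in a small example. The sketch assumes \(Q=\varDelta ^{M-1}\) with \(M=4\), made-up vectors c and r, and a self-normalized importance-sampling estimate of \(s(T)\) built from uniform samples on the simplex; this estimator is a stand-in for the sampling scheme of the paper and loses accuracy at very low temperatures.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def parametric_scores(c, r, temperatures, samples=20000, seed=0):
    """Estimate s(T) = E_{p_T}[<c, w>] on the simplex for each temperature T."""
    rng = random.Random(seed)
    points = []
    for _ in range(samples):
        e = [rng.expovariate(1.0) for _ in c]
        total = sum(e)
        points.append([v / total for v in e])        # uniform on the simplex
    score_vals = [dot(c, w) for w in points]
    bias_vals = [dot(r, w) for w in points]
    out = {}
    for t in temperatures:
        top = max(bias_vals) / t                      # shift to avoid overflow in exp
        weights = [math.exp(b / t - top) for b in bias_vals]
        out[t] = sum(wt * s for wt, s in zip(weights, score_vals)) / sum(weights)
    return out

c = [0.2, 0.5, 0.9, 0.4]      # hypothetical integrals of Eq. (24)
r = [1.0, 2.0, 10.0, 3.0]     # hypothetical bias vector, favoring the third strategy
scores = parametric_scores(c, r, [0.5, 5.0, 1000.0])
s_min, s_max, s_bar = min(c), max(c), sum(c) / len(c)
```

At the highest temperature the estimate is close to the mean score, while lowering T pulls the score toward the value of c at the coordinate where r is largest.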

To obtain the parametric score, we compute a sequence of temperatures \(T_i\) that correspond to a sequence of exponential distributions \(p_{T_i}\). Similar to Sect. 5.1, we compute two temperatures \(T_{\max }\) and \(T_{\min }\). The \(L_2\) norm of \(p_{T_{\max }}\) w.r.t. the uniform distribution over Q is smaller than a given threshold and the \(100(1-\epsilon )\%\) of the mass of \(p_{T_{\min }}\) is inside a ball of a small radius \(\delta > 0\), centered at the mode of \(p_{T_{\min }}\). Then, we use the sequence, \(T_i = T_{\max }(1 - \frac{1}{M})^i\), \(i\in {\mathbb {N}}_+\), \(T_i\ge T_{\min }\) and the method in Sect. 5.1 to compute an equidistant—with respect to \(L_2\) norm—sequence of exponential distributions.

6 Simulations on allocation strategies

In this section, we take the set of portfolios \(P\subset {\mathbb {R}}^n\) to be the canonical simplex, \(\varDelta ^{n-1}\), which means that we consider long-only portfolios. We illustrate the usefulness of the new score in analyzing the performance of a portfolio allocation given the asset returns. We also compare it to several well-known portfolio scores. Moreover, we consider the score of a given portfolio as a random variable in a stock market, where the asset returns follow a multivariate distribution. We also illustrate how our modeling of allocation strategies could be used to study the state of a stock market, computing the copula between portfolios’ return and volatility while the portfolios have been built according to a mixed strategy.

6.1 Portfolio score

In our simulation, we consider the daily returns of the 12 cryptocurrencies with the longest history, reported in Table 6. To illustrate our new score, we consider a pseudo-real time example, where we take 100 consecutive asset returns from 22/10/2016 until 29/01/2017. We compute the optimal mean-variance (MV) portfolio using the shrinkage estimate of the covariance matrix of Ledoit and Wolf (2004), while we fix its volatility equal to the average in-sample volatility of the long-only portfolios. We also compute the equally weighted risk contributions (ERC) portfolio by Maillard et al. (2010) which also uses the shrinkage estimator and the Bitcoin (BTC) portfolio. For the sake of completeness, we report the estimated covariance matrix and the average assets’ return for the period of 100 days, that we used to compute the MV portfolio, in “Appendix B”. We also report the three portfolios in Table 1.

Table 1 The mean-variance (MV) optimal portfolio, the equally-weighted risk contributions (ERC) portfolio and the Bitcoin portfolio (BTC)

To evaluate those portfolios, we take the average of the ten vectors of assets’ returns after the 100 daily asset returns. We report this vector of assets’ returns in “Appendix B”. Together with our score, we report Jensen’s alpha, Sharpe ratio, Sortino ratio, and the cross-sectional score in Guegan et al. (2011). The latter is equal to the proportion of all possible allocations that our portfolio outperforms. To compute the Jensen’s alpha and the Sortino ratio, we consider the return of the Global Minimum Variance portfolio as the risk-free rate and for the market portfolio, we set the equally weighted portfolio. To compute the Sharpe ratio, we also set the equally weighted portfolio as the benchmark portfolio.

Table 2 Four well-known scores of the mean-variance (MV) optimal portfolio, the equally weighted risk contributions (ERC) portfolio and the Bitcoin portfolio (BTC)

Regarding our new score, we study two scenarios that differ based on the strategies that take place in the stock market. First, we take three levels of risk with four levels of dispersion for each risk, that is \(M=12\) strategies in total. Second, we take six levels of risk with ten levels of dispersion for each risk. In addition, we select ten levels of dispersion around the Bitcoin portfolio. More precisely, consider the family of distributions,

$$\begin{aligned} \pi (x) \propto e^{-\alpha (x-{\bar{x}})^T\Sigma (x-{\bar{x}})},\ x\in {\mathbb {R}}^n , \end{aligned}$$
(32)

where \({\bar{x}}\in {\mathbb {R}}^n\) is the Bitcoin portfolio. Then, we compute the sequence of dispersion as in Sect. 5.1. That is \(M=70\) strategies in total. In both cases, we do not impose any additional constraint on the proportion of the investors that select a specific strategy. This means that the set of weights \(Q\subset {\mathbb {R}}^M\), which determines the composition of the investors in the stock market, is the canonical simplex \(\varDelta ^{M-1}\). Considering the behavioral functions, for the risk we consider three cases: (i) plot B with \(x_0=1/2\), (ii) plot C and (iii) plot D in Fig. 7. The function in case (i) favors strategies with a medium level of risk, the function in case (ii) favors strategies with a low level of risk, and the function in case (iii) favors strategies with a high level of risk. Throughout this section, for the dispersion of each risk, we consider only the case of plot D, which favors strategies of low dispersion around the formal allocation proposal. For all the behavioral functions, we set the ratio of the maximum to the minimum value equal to 10.

Table 3 The scores of the mean-variance (MV) optimal portfolio, the equally weighted risk contributions (ERC) portfolio, and the Bitcoin portfolio (BTC) when 12 and 70 strategies take place in the stock market

In Table 2, we report the values of the existing portfolio scores. All scores agree that the performance of the MV portfolio is better than both ERC’s and BTC’s. They all also agree that ERC’s performance is better than BTC’s. Moreover, MV is the only portfolio that outperforms the equally weighted portfolio. The cross-sectional score in Guegan et al. (2011) informs us that MV outperforms \(70.2\%\) of all possible portfolios, ERC \(34.8\%\), and BTC only \(1.8\%\). In Table 3, we report the new score when \(M=12\) or \(M=70\) allocation strategies take place in the stock market. We report the mean score \({\bar{s}}\), which is the score when the investors are equally divided among the allocation strategies, and the scores for the three different choices of the risk’s behavioral function, using the weight vector w of Sect. 5.3.1.

Table 4 The scores of the mean-variance (MV) optimal portfolio, the equally-weighted risk contributions (ERC) portfolio, and the Bitcoin portfolio (BTC) for various weight vectors \(w_i\) when 12 strategies take place in the stock market

For all portfolios, the score increases as the investors tend to select allocation strategies with a lower level of risk. The performance of the MV portfolio is very similar in both cases of \(M=12\) and \(M=70\). The ERC portfolio performs better when \(M=70\), as each score is about \(10\%\) larger than in the case of \(M=12\). BTC also performs better when \(M=70\). However, its score is quite small in both cases. Compared with the unbiased case of the score in Guegan et al. (2011), it is clear that the value of our score is affected by the investors’ composition and can be higher or lower than the score in Guegan et al. (2011).

Fig. 8

The parametric scores of the mean-variance (MV) optimal portfolio, the equally weighted risk contributions (ERC) portfolio, and the Bitcoin portfolio (BTC) for different risk’s behavioral functions (high, medium, low risk) and \(M=12\) strategies in the stock market. We report the score for the i-th temperature in the sequence we compute according to Sect. 5.3.2. Different colors mark different portfolios. Plots B, C, and D refer to Fig. 7

In Table 4, we further illustrate how the performance of each portfolio is related to the investors’ composition in the stock market. We consider the case of \(M=12\) and, for each risk’s behavioral function, we compute the bias vector r of Sect. 5.3.2. For each bias vector, we report three different weight vectors and the corresponding scores. Recall that a weight vector represents the investors’ composition in a stock market. In particular, we set three different temperatures T in the distribution \(p_T(w)\propto e^{\langle r, w\rangle /T}\). Then, for each temperature, we estimate the center of mass of \(p_{T}\). In each column block and from left to right, we decrease the temperature, and thus we strengthen the tendency implied by the bias vector r. For all portfolios, as the percentage of investors who select strategies with a high level of risk increases, the score drastically decreases. When the percentage of investors who select strategies with a medium level of risk increases, the performance of the MV portfolio improves, while both ERC’s and BTC’s scores decrease. When the percentage of investors who select strategies with a low level of risk increases, both MV’s and ERC’s scores increase, while BTC’s score decreases. Moreover, BTC’s score is always smaller than 0.2%, which implies a quite poor performance.
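The center of mass of \(p_T\) described above can be illustrated with a minimal sketch. It assumes a small number of strategies and a hypothetical bias vector `r`, and it replaces MCMC sampling with self-normalized importance sampling from the uniform distribution on the simplex, which is only practical in low dimension. The function name `center_of_mass` and all numerical values are illustrative assumptions, not part of the paper’s implementation.

```python
import numpy as np

def center_of_mass(r, T, n_samples=200_000, seed=0):
    """Estimate the mean of p_T(w) ∝ exp(<r, w> / T) over the canonical simplex.

    Assumption: self-normalized importance sampling with a uniform proposal,
    used here as a simple stand-in for an MCMC sampler.
    """
    rng = np.random.default_rng(seed)
    r = np.asarray(r, dtype=float)
    # Dirichlet(1, ..., 1) is the uniform distribution on the simplex.
    W = rng.dirichlet(np.ones(len(r)), size=n_samples)
    # Unnormalized log-density of the target relative to the uniform proposal.
    log_w = W @ r / T
    log_w -= log_w.max()            # stabilize the exponentials
    weights = np.exp(log_w)
    weights /= weights.sum()
    return weights @ W              # weighted average of the sampled points

r = np.array([1.0, 0.5, 0.0])       # hypothetical bias vector for M = 3 strategies
for T in (10.0, 1.0, 0.1):
    print(T, center_of_mass(r, T).round(3))
```

As the temperature decreases, the estimated weight vector moves from the barycenter of the simplex towards the strategy with the largest bias, which is the strengthening of the tendency discussed above. For larger M, such as \(M=70\), the importance weights of this naive scheme degenerate, so a dedicated sampler would be needed.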

The plots in Figs. 8 and 9 illustrate a comparison between the three portfolios using their parametric scores. For both sets of allocation strategies, the score of MV is always higher than the scores of ERC and BTC. When the percentage of investors who select strategies with a high level of risk increases, the three parametric scores converge, as they all go to zero. When the investors tend to select strategies with a medium or a low level of risk, there is a major change in the performance of ERC when \(M=70\). For example, for medium risk and \(M=12\) the parametric score of ERC converges to 0, while for \(M=70\) it converges to 1. This is an example of how the (parametric) score can change as the number of allocation strategies in the stock market changes.

Fig. 9

The parametric scores of the mean-variance (MV) optimal portfolio, the equally weighted risk contributions (ERC) portfolio, and the Bitcoin portfolio (BTC) for different risk’s behavioral functions (high, medium, low risk) and \(M=70\) strategies in the stock market. We report the score for the i-th temperature in the sequence we compute according to Sect. 5.3.2. Different colors mark different portfolios. Plots B, C, and D refer to Fig. 7

6.2 The score as a random variable

Once a portfolio is chosen, and assuming a distribution for the asset returns, one can estimate the distribution of the scores of this portfolio. This distribution allows us to understand the risk that this portfolio performs worse, or better, than a mixed strategy. The estimation is obtained as follows. First, we randomly draw \(10^4\) vectors of asset returns. Then, we compute the corresponding scores using our implementation. Finally, we estimate the distribution of the score with a normal kernel function bounded in [0, 1]. Moreover, for the parametric score we compute a sequence of distributions of scores, that is, one distribution for each temperature in Eq. (30).
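The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: the mean vector, the covariance matrix, and the portfolio `x` are hypothetical; the score is computed against portfolios sampled uniformly from the simplex (the unbiased score of Guegan et al. 2011) instead of a mixed strategy; and the bounded kernel estimate uses reflection at the end points of [0, 1], which is one common way to bound a normal kernel and may differ from the estimator used for the figures.

```python
import numpy as np

rng = np.random.default_rng(0)

def cross_sectional_score(x, R, W):
    """For each return vector (row of R), the fraction of reference
    portfolios (rows of W) whose return is below that of portfolio x."""
    ref = R @ W.T                       # shape (n_draws, n_ref)
    own = R @ x                         # shape (n_draws,)
    return (ref < own[:, None]).mean(axis=1)

def bounded_kde(samples, grid, bandwidth=None):
    """Gaussian kernel density estimate on [0, 1] with reflection at 0 and 1."""
    s = np.asarray(samples, dtype=float)
    if bandwidth is None:
        bandwidth = 1.06 * s.std() * len(s) ** (-1 / 5)   # rule-of-thumb bandwidth
    reflected = np.concatenate([s, -s, 2.0 - s])
    z = (grid[:, None] - reflected[None, :]) / bandwidth
    dens = np.exp(-0.5 * z**2).sum(axis=1)
    return dens / (len(s) * bandwidth * np.sqrt(2.0 * np.pi))

n = 4
mu = np.array([0.01, 0.02, 0.00, -0.01])            # hypothetical mean returns
A = rng.normal(size=(n, n))
Sigma = 0.01 * A @ A.T / n + 1e-4 * np.eye(n)       # hypothetical covariance matrix
x = np.array([0.4, 0.3, 0.2, 0.1])                  # hypothetical portfolio

R = rng.multivariate_normal(mu, Sigma, size=10_000) # step 1: draw asset returns
W = rng.dirichlet(np.ones(n), size=500)             # uniform reference portfolios
scores = cross_sectional_score(x, R, W)             # step 2: one score per draw
grid = np.linspace(0.0, 1.0, 201)
density = bounded_kde(scores, grid)                 # step 3: bounded kernel estimate
```

To mimic a mixed strategy, the rows of `W` would be drawn from the corresponding mixture distribution instead of the uniform one; repeating the computation for each temperature would give the sequence of distributions.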

We show how the mixed strategy in the stock market affects the distribution of the score of a given portfolio. We work under the assumption that the asset returns follow a multivariate Gaussian distribution \({\mathcal {N}}(\mu ,\Sigma )\). We use the same covariance matrix and mean vector as in Sect. 6.1.

Fig. 10

The distributions of scores of the mean-variance (MV) optimal portfolio for different risk’s behavioral functions (high, medium, low risk) and a sequence of temperatures \(T_i\) such that the distributions in the sequence \(p_{T_i}(w)\propto e^{\langle r, w\rangle /T_i}\) are equidistant w.r.t. the \(L_2\) norm. Plots B, C, and D refer to Fig. 7

In our example, we focus on the MV portfolio in Table 1 considering the case of the \(M=70\) allocation strategies of Sect. 6.1. For each risk’s behavioral function, we use the same temperatures as in Fig. 9, where we compute the parametric scores. Thus, the plots in Fig. 10 illustrate how the distribution of score of the MV portfolio changes as the tendency of the investors’ behavior—induced by the behavioral functions—increases.

When the investors are equally divided among allocation strategies, the distribution of the score is bimodal, with the modes being around 0.2 and 0.8. This implies that, for this investors’ composition, the score of MV has a high probability of being around 0.8 or 0.2. As the percentage of investors who select strategies with a medium level of risk increases, the modes shift to the extreme values, i.e. around 0 and 1. Moreover, the total mass over the modes increases. Also, the mass over the mode of the largest value becomes larger than the mass over the mode of the smallest value. Thus, the probability that MV achieves a good score increases as the temperature T decreases. On the other hand, MV has a high probability of being either among the best or among the worst performers, which implies that it is also quite risky w.r.t. that score. The same occurs as the percentage of investors who select strategies with a low level of risk increases.

However, as the percentage of investors who select strategies with a high level of risk increases, the modes shift in the opposite direction compared with the previous cases. Moreover, for the median temperature, the distribution of the score becomes unimodal and is centered at 0.5. As the temperature further increases the modes are shifting towards the extreme scores, i.e. 0 and 1. In this case, the mass over the mode of the largest value becomes smaller than the mass over the mode of the smallest value. Thus, as the percentage of investors who select high-risk strategies increases, the probability of MV achieving a bad score increases. However, when the investors’ composition is implied by the median temperature, MV has a very small probability of being among the best or worst performers, as the mass is concentrated around the score of 0.5. This makes the MV portfolio a safe (stable) choice w.r.t. that score.

6.3 Alternative copulae

Notice that the analysis in Sect. 3 is agnostic to allocation strategies, as it works directly with the set of portfolios. Thus, it uses uniform sampling from the set of portfolios to estimate the copula between portfolios’ return and volatility. In this section, we compute copulae when the portfolios have been built according to a mixed strategy. To be more precise, we consider the case of \(M=70\) allocation strategies as in Sect. 6.1, which are computed using the covariance matrix and the mean asset returns estimated on the period of 100 days of Sect. 6.1. Then, we consider the 60 days that follow this period of 100 days and, in that time period, we compute one copula per mixed allocation strategy. To compute each copula we use sampling from the corresponding mixture distributions; for more details see “Appendix A”.
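As a rough illustration of the sampling-based copula estimate, the sketch below bins rank-transformed return and volatility pairs of sampled portfolios on a regular grid. It is a minimal sketch under stated assumptions: the asset returns and covariance matrix are hypothetical, the portfolios are sampled uniformly from the simplex of long-only portfolios (the unbiased case), and the function name `empirical_copula` and the grid size are illustrative choices.

```python
import numpy as np

def empirical_copula(returns, vols, bins=10):
    """Bin the rank-transformed (return, volatility) pairs on a bins x bins grid.

    Each cell holds the fraction of sampled portfolios whose return and
    volatility fall in the corresponding pair of quantile intervals.
    """
    n = len(returns)
    # Rank transform to pseudo-observations in (0, 1).
    u = (np.argsort(np.argsort(returns)) + 0.5) / n
    v = (np.argsort(np.argsort(vols)) + 0.5) / n
    H, _, _ = np.histogram2d(u, v, bins=bins, range=[[0, 1], [0, 1]])
    return H / n

rng = np.random.default_rng(1)
n_assets = 5
asset_ret = rng.normal(0.001, 0.02, size=n_assets)    # hypothetical asset returns
A = rng.normal(size=(n_assets, n_assets))
Sigma = A @ A.T / n_assets                             # hypothetical covariance matrix

W = rng.dirichlet(np.ones(n_assets), size=50_000)      # uniform long-only portfolios
port_ret = W @ asset_ret
port_vol = np.sqrt(np.einsum("ij,jk,ik->i", W, Sigma, W))
C = empirical_copula(port_ret, port_vol)
```

For a mixed strategy, the rows of `W` would instead be drawn from the corresponding mixture distribution, leaving the binning step unchanged.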

The upper left plot in Fig. 11 illustrates the copula computed with uniform sampling from the set of long-only portfolios as in Sect. 3. The indicator of this copula is equal to 0.32, which implies a normal relation between portfolios’ return and volatility. We also obtain similar copulae when the portfolios are built according to three different mixed strategies, which differ in the risk’s behavioral function (high, medium, or low level of risk) as in the two previous sections, with the weight vectors w computed as in Sect. 5.3.1. The indicators for high, medium, and low risk are 0.002, 0.004, and 0.01, respectively.

Fig. 11

The copulae of portfolios’ return and volatility. The copula in the upper left plot (unbiased case) is computed according to Sect. 3. We compute each one of the three copulae when the portfolios have been built according to three different mixed allocation strategies, while \(M=70\) strategies take place in the stock market. The mixed strategies differ based on the risk’s behavioral function (high, medium, and low risk). Plot B, C, D refer to Fig. 7

On the other hand, in Fig. 12, we compute the bias vector for the risk’s behavioral function given by plot B in Fig. 7, which favors allocation strategies with medium risk. The temperature \(T_1\) corresponds to the mixed strategy with equally divided investors over the \(M=70\) allocation strategies. We notice that as the temperature decreases, i.e. the percentage of the investors who select allocation strategies with a medium level of risk increases, a large percentage of the mass of the copula is shifting towards the corners on the down diagonal. More precisely, the indicators for \(T_1>\dots >T_4\) are 0.03, 0.03, 0.04, 2.19, respectively. The latter implies that the percentage of portfolios with either low return and high volatility or high return and low volatility increases as the tendency of the investors to select medium-risk allocation strategies increases in the stock market.

Fig. 12

The copulae of portfolios’ return and volatility. We compute each copula for a certain mixed allocation strategy, while \(M=70\) strategies take place in the stock market. We compute the bias vector using the risk’s behavioral function given by plot B in Fig. 7, which favors allocation strategies with a medium level of risk. We compute one copula per temperature, where \(T_1>\dots >T_4\) correspond to exponential distributions \(p_{T_i}\) that are equidistant w.r.t. the \(L_2\) norm, and the temperature \(T_1\) corresponds to a mixed strategy with equally divided investors over the allocation strategies

7 Conclusions and future work

We briefly survey existing work on crisis detection and we strengthen its results by employing clustering algorithms for bivariate distributions. We also use it to detect all the past crash events in the cryptocurrency markets. This problem motivates us to develop a new computational framework to model asset allocation strategies in a stock market and to define a new portfolio score based on that framework. Our simulations show that the informativeness of our new score can be higher than that of existing portfolio performance scores. To provide efficient computations, we develop high-dimensional MCMC samplers for log-concave distributions supported on a convex polytope. Our sampler scales up to a few hundred dimensions (assets). We simulate mixed strategies to estimate the distribution of the portfolio’s score, assuming a distribution for the asset returns, and to compute alternative copulae of portfolios’ return and volatility.

One possible direction for future work is to use nonlinear shrinkage, e.g. that in Ledoit and Wolf (2020). An additional direction would be to further study the state of a stock market using the alternative copula computations in Sect. 6.3. Furthermore, we believe that it would be of special interest to use the distribution of the new score to define new performance measures and thus compute the optimal portfolios with respect to those measures. In particular, the problem reduces to computing a portfolio with a “good” distribution of score. Considering the copula computation in Sect. 3, when the set of portfolios is the set of fully-invested portfolios we cannot use the exact sampler in Rubinstein and Melamed (1998), because the portfolio domain \(P\subset {\mathbb {R}}^n\) is a generic convex polytope. Thus, MCMC sampling methods are the sole option. They are computationally more expensive than the method in Rubinstein and Melamed (1998). An interesting piece of future work is to develop specialized MCMC uniform samplers for the set of fully-invested portfolios. Last but not least, one could use the clustering methods we introduce in Sect. 3 to detect intermediate states of a market.