Clustering of financial instruments using jump tail dependence coefficient

  • Original Paper
  • Published in: Statistical Methods & Applications

Abstract

In this paper, we propose a new clustering procedure for financial instruments. Unlike the prevalent clustering procedures based on time series analysis, our procedure employs the jump tail dependence coefficient as the dissimilarity measure, assuming that the observed logarithms of the prices/indices of the financial instruments are embedded into multidimensional Lévy processes. The efficiency of the proposed clustering procedure is tested in a simulation study. Finally, using real data on country indices, we illustrate that our clustering procedure can help investors avoid potentially huge losses when constructing portfolios.

Notes

  1. The data set consists of the country indices of the following 14 countries: Australia, Belgium, Canada, France, Germany, Japan, China, India, Korea, Malaysia, Mexico, Russia, Brazil, Chile.

Acknowledgements

The authors are indebted to two anonymous reviewers for comments and suggestions that improved the paper. Jiang Wu is grateful for the support from the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Project No. 10YJC790280).

Author information

Corresponding author

Correspondence to Wenjun Jiang.

Appendices

A The simulation procedure of multidimensional Lévy processes

The simulation procedure for multidimensional Lévy processes relies heavily on one key lemma and on the improved algorithm given by Tankov (2006b).

Lemma 1

(Tankov (2006b)) Let \(F(u_1,\dots ,u_d)\) be a d-dimensional Lévy copula and

$$\begin{aligned} \lim _{(u_i)_{i\in I}\rightarrow \infty }F(u_1,\dots ,u_d)=F(u_1,\dots ,u_d)\big |_{(u_i)_{i\in I}=\infty } \end{aligned}$$
(10)

for nonempty \(I\subseteq \{1,2,\dots ,d\}\). Define \(F_{u_1}(u_2,\dots ,u_d)\) as

$$\begin{aligned} F_{u_1}(u_2,\dots ,u_d)=\text {sign}(\zeta )\frac{\partial }{\partial \zeta }F\big ((\zeta \wedge 0, \zeta \vee 0]\times (-\infty ,u_2]\times \cdots \times (-\infty ,u_d] \big )\Big |_{\zeta =u_1}, \end{aligned}$$
(11)

where \(\zeta \) has the same sign as \(u_1\). There exists a set \(N\subseteq \mathbb {R}\) of zero Lebesgue measure such that for \(u_1\in \mathbb {R}\backslash N\), \(F_{u_1}(u_2,\dots ,u_d)\) is a probability distribution function of \((u_2,\dots ,u_d)\), conditional on the first element being equal to \(u_1\).

Theorem 2

(Tankov (2006b)) Let \(v(\cdot )\) be a Lévy measure on \(\mathbb {R}^d\) satisfying

$$\begin{aligned} \int _{\mathbb {R}^d}(|x|\wedge 1)v(\mathrm{d}x)<\infty \end{aligned}$$

with marginal tail integrals \(U_i\), \(i=1,\dots ,d\), and Lévy copula \(F(u_1,\dots ,u_d)\) such that (10) is satisfied, and let \(F_{u_i}(\cdot )\) be the corresponding conditional distributions.

Fix a truncation level \(\tau \). Let \(\{V_k\}_{k\ge 1}\) and \(\{W_k^i\}_{k\ge 1}^{1\le i\le d}\) be independent sequences of i.i.d. standard uniform random variables. Introduce \(d^2\) random sequences \(\{\gamma _{k}^{ij}\}_{k\ge 1}^{1\le i,j\le d}\), independent of \(\{V_k\}_{k\ge 1}\) and \(\{W_k^i\}_{k\ge 1}^{1\le i\le d}\), such that

  • For \(i=1,\dots ,d\), \(\sum _{k=1}^{\infty }\varDelta _{\{\gamma _{k}^{ii}\}}\) are independent Poisson random measures on \(\mathbb {R}\) with Lebesgue intensity measures.

  • Conditional on \(\gamma _k^{ii}\), the random vector \((\gamma _k^{i1},\dots ,\gamma _k^{i,i-1},\gamma _k^{i,i+1},\dots ,\gamma _k^{id})\) is distributed on \(\mathbb {R}^{d-1}\) with law \(F_{\gamma _k^{ii}}(\cdot )\).

For each \(k\ge 1\) and each \(i=1,\dots ,d\), let \(n_k^i=\#\{j=1,\dots ,d: |\gamma _k^{ij}|\le \tau \}\). Then the process \(\{Z_t^{\tau }\}_{0\le t\le 1}\) with components

$$\begin{aligned} Z_t^{\tau ,j}=\sum _{k=1}^{\infty }\sum _{i=1}^{d}U_j^{(-1)}(\gamma _k^{ij})\mathbb {1}_{n_k^iW_k^i\le 1}\mathbb {1}_{|\gamma _k^{ii}|\le \tau }\mathbb {1}_{[0,t]}(V_k), \quad j=1,\dots ,d, \end{aligned}$$
(12)

is a Lévy process on [0, 1] with characteristic function

$$\begin{aligned} \mathbf {E}[e^{i\langle u,Z_t^{\tau }\rangle }]=\exp \Big (t\int _{\mathbb {R}^d\backslash S_{\tau }}(e^{i\langle u,z\rangle }-1)v(\mathrm{d}z) \Big ), \end{aligned}$$
(13)

where \(\langle \cdot ,\cdot \rangle \) represents the inner product and

$$\begin{aligned} S_{\tau }=\big (U_1^{(-1)}(-\tau ),U_1^{(-1)}(\tau ) \big )\times \cdots \times \big (U_d^{(-1)}(-\tau ),U_d^{(-1)}(\tau ) \big ). \end{aligned}$$
(14)
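To make the series representation (12) concrete, the following is a minimal sketch of its simplest special case, \(d=1\), where no Lévy copula is needed and \(n_k^1=1\) whenever \(|\gamma _k^{11}|\le \tau \). The Lévy measure used here, \(v(\mathrm{d}x)=\alpha |x|^{-1-\alpha }\mathrm{d}x\) with \(0<\alpha <1\), is an assumption chosen only because its tail integral inverts in closed form; the function name and parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def simulate_truncated_path_1d(alpha, tau, t_grid, rng):
    """d = 1 illustration of the series representation (12).

    Assumption (not from the source): symmetric stable-like Levy measure
    v(dx) = alpha * |x|**(-1 - alpha) dx with 0 < alpha < 1, so that the
    tail integral is U(x) = sign(x) * |x|**(-alpha) and its inverse is
    U^{(-1)}(g) = sign(g) * |g|**(-1 / alpha).
    """
    # Points of a Poisson random measure on R with Lebesgue intensity,
    # restricted to [-tau, tau]: Poisson(2 * tau) many uniform points.
    n_points = rng.poisson(2.0 * tau)
    gamma = rng.uniform(-tau, tau, size=n_points)
    # Jump times V_k, i.i.d. standard uniform on [0, 1].
    v = rng.uniform(0.0, 1.0, size=n_points)
    # Jump sizes U^{(-1)}(gamma_k); |gamma_k| <= tau keeps only jumps
    # whose magnitude is at least tau**(-1 / alpha).
    jumps = np.sign(gamma) * np.abs(gamma) ** (-1.0 / alpha)
    # With d = 1 the thinning indicator on n_k * W_k is always 1,
    # so the path is just the running sum of jumps with V_k <= t.
    path = np.array([jumps[v <= t].sum() for t in t_grid])
    return path, jumps, v

rng = np.random.default_rng(0)
t_grid = np.linspace(0.0, 1.0, 11)
path, jumps, v = simulate_truncated_path_1d(alpha=0.7, tau=50.0, t_grid=t_grid, rng=rng)
```

In this special case the discarded region \(S_{\tau }\) in (14) is the interval \((-\tau ^{-1/\alpha },\tau ^{-1/\alpha })\), so a larger \(\tau \) retains smaller jumps at the cost of more simulated points.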

The above theorem provides a robust algorithm for simulating multidimensional Lévy processes given the Lévy copula; however, it still requires an efficient algorithm to sample the random vector \((\gamma _k^{i1},\dots ,\gamma _k^{i,i-1},\gamma _k^{i,i+1},\dots ,\gamma _k^{id})\) given the cumulative distribution function \(F_{\gamma _k^{ii}}(\cdot )\). While it remains unknown whether such an algorithm exists in general, for the particular case of the Lévy-Clayton copula considered in our simulation study we found a method that serves this purpose well.

For the two-directed Lévy-Clayton copula F defined in (9), we have

$$\begin{aligned} F(u_1,\ldots ,u_d;\theta ,\eta )=0 \end{aligned}$$

whenever \(u_1=0\). Hence by denoting \(I_{d-1}:=(-\infty ,u_2]\times \cdots \times (-\infty ,u_d]\in \mathcal {B}(\mathbb {R}^{d-1})\), we have

$$\begin{aligned} F\left( (\zeta \wedge 0,\zeta \vee 0]\times I_{d-1};\theta ,\eta \right) = \text {sign}(\zeta )\int _{I_{d-1}}F(\zeta ,\mathrm{d}v_2, \ldots ,\mathrm{d}v_d;\theta ,\eta ). \end{aligned}$$
(15)

Thus, given the signs of \(u_1,u_2,\ldots ,u_d\), interchanging the partial differentiation and the integrals yields

$$\begin{aligned} F_{u_1}(u_2,\ldots ,u_d;\theta ,\eta ) = \int _{I_{d-1}}\tilde{F}_{u_1}(\mathrm{d}v_2, \ldots ,\mathrm{d}v_d;\theta ,\eta ), \end{aligned}$$
(16)

where

$$\begin{aligned} \tilde{F}_{u_1}(u_2,\ldots ,u_d;\theta ,\eta )= & {} \text {sign}(u_1)2^{2-d}\left( 1+|u_1|^{\theta }\sum ^d_{i=2} |u_i|^{-\theta }\right) ^{-\frac{1}{\theta }-1}\\&\left( \eta \varvec{1}_{\{u_1 u_2\dots u_d\ge 0\}}-(1-\eta )\varvec{1}_{\{u_1 u_2\dots u_d<0\}}\right) . \end{aligned}$$
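As a quick sanity check of (16), which is our own specialization rather than part of the original derivation, take \(d=2\) and \(u_1>0\), and write \(g(x)=(1+|u_1|^{\theta }x^{-\theta })^{-\frac{1}{\theta }-1}\) for \(x>0\) (the symbol \(g\) is introduced here only for illustration). Then \(2^{2-d}=1\), \(g(x)\rightarrow 0\) as \(x\rightarrow 0\), \(g(x)\rightarrow 1\) as \(x\rightarrow \infty \), and integrating \(\tilde{F}_{u_1}\) over \((-\infty ,u_2]\) gives

$$\begin{aligned} F_{u_1}(u_2;\theta ,\eta )={\left\{ \begin{array}{ll} (1-\eta )\big (1-g(|u_2|)\big ), &{} u_2<0,\\ (1-\eta )+\eta \,g(u_2), &{} u_2>0. \end{array}\right. } \end{aligned}$$

This function increases from 0 to 1, placing total mass \(1-\eta \) on the negative half-line and \(\eta \) on the positive half-line, which is what one expects of a distribution whose sign agrees with that of \(u_1\) with probability \(\eta \).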

Notice that according to Lemma 1, \(F_{u_1}\) is a probability distribution function. In fact, a random vector having the CDF \(F_{u_1}\) is a mixture of \(2^{d-1}\) random vectors with the following joint CDF:

$$\begin{aligned} G_{u_1}(u_2,\ldots ,u_d;\theta ) = \left( 1+|u_1|^{\theta }\sum ^d_{i=2}u^{-\theta }_i\right) ^{-\frac{1}{\theta }-1},\quad u_i>0 \end{aligned}$$
(17)

for all \(i=2,\ldots ,d\). Hence to construct such a random vector, consider a random sign vector \(\begin{pmatrix} S_2&\cdots&S_d\end{pmatrix}^{\intercal }\in \{-1,1\}^{d-1}\) satisfying

$$\begin{aligned}&\displaystyle \mathbf {P}\left( \text {sign}(u_1)\prod ^d_{i=2}S_i\ge 0\right) = \eta , \end{aligned}$$
(18)
$$\begin{aligned}&\displaystyle \quad \mathbf {P}\left( \text {sign}(u_1)\prod ^d_{i=2}S_i< 0\right) = 1-\eta , \end{aligned}$$
(19)
$$\begin{aligned}&\displaystyle \quad \mathbf {P}\left( S_2=s_2,\ldots ,S_d=s_d\bigg |\text {sign}(u_1)\prod ^d_{i=2}S_i\ge 0\right) =2^{2-d}\mathbb {1}_{\left\{ \text {sign}(u_1)\prod ^d_{i=2}s_i\ge 0\right\} },\qquad \end{aligned}$$
(20)
$$\begin{aligned}&\displaystyle \quad \mathbf {P}\left( S_2=s_2,\ldots ,S_d=s_d\bigg |\text {sign}(u_1) \prod ^d_{i=2}S_i<0\right) =2^{2-d}\mathbb {1}_{\left\{ \text {sign}(u_1)\prod ^d_{i=2}s_i< 0\right\} } \end{aligned}$$
(21)

for arbitrary \(s_2,\ldots ,s_d\in \{-1,1\}\). Thus, by generating \(\begin{pmatrix} Y_2&\cdots&Y_d\end{pmatrix}^{\intercal }\) from \(G_{u_1}\) independently of \(\begin{pmatrix} S_2&\cdots&S_d\end{pmatrix}^{\intercal }\), we obtain a random vector \(\begin{pmatrix} S_2Y_2&\cdots&S_dY_d\end{pmatrix}^{\intercal }\) with joint CDF \(F_{u_1}\).
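One way to draw a sign vector satisfying (18)-(21) is sketched below. It is an illustrative construction, not necessarily the one used by the authors: first decide the sign of the product with probability \(\eta \), then draw the first \(d-2\) signs uniformly and force the last one. Each admissible configuration then has conditional probability \(2^{2-d}\). The function name and arguments are hypothetical.

```python
import numpy as np

def sample_sign_vector(sign_u1, d, eta, rng):
    """Draw (S_2, ..., S_d) in {-1, 1}^(d-1) following (18)-(21).

    sign_u1 is +1 or -1; eta is the probability that
    sign(u_1) * S_2 * ... * S_d is positive.
    """
    if d < 2:
        raise ValueError("d must be at least 2")
    # Decide the sign of the product sign(u_1) * S_2 * ... * S_d.
    target = 1 if rng.random() < eta else -1
    # The first d - 2 signs are free and uniform on {-1, 1}.
    free = rng.choice([-1, 1], size=d - 2)
    # The last sign is forced so that the overall product hits the target.
    last = target * sign_u1 * int(np.prod(free))
    return np.append(free, last).astype(int)

rng = np.random.default_rng(1)
signs = np.array([sample_sign_vector(-1, 4, 0.8, rng) for _ in range(20000)])
# Empirical frequency of a positive product, to compare with eta = 0.8.
frac_positive = np.mean(-1 * np.prod(signs, axis=1) > 0)
```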

The remaining key step in generating random vectors from \(F_{u_1}\) is to generate random vectors from \(G_{u_1}\). Since \(|u_1|\) acts as a scale parameter in \(G_{u_1}\), it suffices to consider the cumulative distribution function \(G_1\). Because the copula uniquely associated with \(G_1\) is the Clayton copula

$$\begin{aligned} C(u_2,\ldots ,u_d;\theta ') = (u^{-\theta '}_2+\cdots +u^{-\theta '}_d-d+2)^{-\frac{1}{\theta '}}, \quad \theta '=\dfrac{\theta }{\theta +1}, \end{aligned}$$

a random vector from \(G_1\) could be written as \(\begin{pmatrix} G^{-1}_{1,2}(U_2)&\cdots&G^{-1}_{1,d}(U_d)\end{pmatrix}^{\intercal }\) where \(\begin{pmatrix} U_2&\cdots&U_d\end{pmatrix}^{\intercal }\) is generated from the Clayton copula C and

$$\begin{aligned} G_{1,i}(u_i;\theta )=(1+u^{-\theta }_i)^{-\frac{1}{\theta }-1} \end{aligned}$$

for \(i=2,\ldots ,d\) are the marginal distributions of \(G_1\) (all of these marginal distributions are Dagum distributions of type I). Therefore, \(\begin{pmatrix} Y_2&\cdots&Y_d\end{pmatrix}^{\intercal } = \begin{pmatrix} |u_1|G^{-1}_{1,2}(U_2)&\cdots&|u_1|G^{-1}_{1,d}(U_d)\end{pmatrix}^{\intercal }\) is a random vector from \(G_{u_1}\). Consequently, the simulation algorithm could be summarized as follows:

figure a
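The construction of \(\begin{pmatrix} Y_2&\cdots&Y_d\end{pmatrix}^{\intercal }\) described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the authors' algorithm: the Clayton copula is sampled with the standard Marshall-Olkin gamma-frailty method (our choice; the text does not specify a sampler), and the marginals are inverted through \(G^{-1}_{1,i}(p)=(p^{-\theta '}-1)^{-1/\theta }\), which follows from the Dagum form of \(G_{1,i}\). The function name and arguments are hypothetical.

```python
import numpy as np

def sample_G(u1, d, theta, size, rng):
    """Draw `size` vectors (Y_2, ..., Y_d) from G_{u_1} in (17).

    Returns an array of shape (size, d - 1) with positive entries.
    """
    theta_p = theta / (theta + 1.0)  # Clayton parameter theta'
    # Marshall-Olkin sampler for the (d-1)-dimensional Clayton copula:
    # U_i = (1 + E_i / V)**(-1 / theta') with V ~ Gamma(1 / theta', 1)
    # and E_i ~ Exp(1), all independent.
    v = rng.gamma(shape=1.0 / theta_p, scale=1.0, size=(size, 1))
    e = rng.exponential(scale=1.0, size=(size, d - 1))
    # Inverse Dagum marginal: G_{1,i}^{-1}(p) = (p**(-theta') - 1)**(-1 / theta).
    # Since U_i**(-theta') - 1 = E_i / V, the two steps collapse to one
    # expression, which also avoids rounding U_i to exactly 1.
    y1 = (e / v) ** (-1.0 / theta)
    # |u_1| acts as a scale parameter.
    return abs(u1) * y1

rng = np.random.default_rng(42)
y = sample_G(u1=-2.0, d=3, theta=1.5, size=200000, rng=rng)
# Compare the empirical joint CDF at one point with the closed form (17).
a, b = 3.0, 5.0
empirical = np.mean((y[:, 0] <= a) & (y[:, 1] <= b))
exact = (1.0 + 2.0 ** 1.5 * (a ** -1.5 + b ** -1.5)) ** (-1.0 / 1.5 - 1.0)
```

Multiplying each row componentwise by an independent sign vector satisfying (18)-(21) then gives a draw from \(F_{u_1}\).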

B Details of the clustering algorithm

figure b

C Descriptive statistics of the log-returns of the dataset

See Table 9.

Table 9 Descriptive statistics of the log-returns

Cite this article

Yang, C., Jiang, W., Wu, J. et al. Clustering of financial instruments using jump tail dependence coefficient. Stat Methods Appl 27, 491–513 (2018). https://doi.org/10.1007/s10260-017-0411-1
