Clustering of financial instruments using jump tail dependence coefficient

Yang, Chen; Jiang, Wenjun; Wu, Jiang; Liu, Xin; Li, Zhichuan

doi:10.1007/s10260-017-0411-1

Original Paper
Published: 21 November 2017

Volume 27, pages 491–513, (2018)

Statistical Methods & Applications

Chen Yang¹,
Wenjun Jiang ORCID: orcid.org/0000-0002-8576-5859²,
Jiang Wu³,
Xin Liu² &
Zhichuan Li⁴

Abstract

In this paper, we propose a new clustering procedure for financial instruments. Unlike the prevalent clustering procedures based on time series analysis, our procedure employs the jump tail dependence coefficient as the dissimilarity measure, assuming that the observed logarithms of the prices/indices of the financial instruments are embedded into multidimensional Lévy processes. The efficiency of the proposed clustering procedure is tested by a simulation study. Finally, using real data on country indices, we illustrate that our clustering procedure can help investors avoid potentially huge losses when constructing portfolios.

Notes

The data set consists of the country indices of the following 14 countries: Australia, Belgium, Canada, France, Germany, Japan, China, India, Korea, Malaysia, Mexico, Russia, Brazil, Chile.

References

Baragona R (2001) A simulation study on clustering time series with metaheuristic methods. Quaderni di Statistica 3:1–26
Billio M, Caporin M (2009) A generalized dynamic conditional correlation model for portfolio risk evaluation. Math Comput Simul 79(8):2566–2578. https://doi.org/10.1016/j.matcom.2008.12.011, http://linkinghub.elsevier.com/retrieve/pii/S0378475408004138
Billio M, Caporin M, Gobbo M (2006) Flexible dynamic conditional correlation multivariate GARCH models for asset allocation. Appl Financ Econ Lett 2(2):123–130. https://doi.org/10.1080/17446540500428843
Brockwell PJ, Davis RA (eds) (2002) Introduction to time series and forecasting. Springer texts in statistics. Springer, New York. https://doi.org/10.1007/b97391
Calinski T, Harabasz J (1974) A dendrite method for cluster analysis. Commun Stat Theory Methods 3(1):1–27. https://doi.org/10.1080/03610927408827101
Cont R, Tankov P (2003) Financial modelling with jump processes, vol 2. Financial mathematics series. Chapman and Hall/CRC, Boca Raton. https://doi.org/10.1201/9780203485217
De Luca G, Zuccolotto P (2011) A tail dependence-based dissimilarity measure for financial time series clustering. Adv Data Anal Classif 5(4):323–340. https://doi.org/10.1007/s11634-011-0098-3
Dobrić J, Schmid F (2005) Nonparametric estimation of the lower tail dependence $\lambda _L$ in bivariate copulas. J Appl Stat 32(4):387–407. https://doi.org/10.1080/02664760500079217
Durante F, Jaworski P (2010) Spatial contagion between financial markets: a copula-based approach. Appl Stoch Models Bus Ind 26(5):551–564. https://doi.org/10.1002/asmb.799
Durante F, Pappadà R, Torelli N (2014) Clustering of financial time series in risky scenarios. Adv Data Anal Classif 8(4):359–376. https://doi.org/10.1007/s11634-013-0160-4
Durante F, Pappadà R, Torelli N (2015) Clustering of time series via non-parametric tail dependence estimation. Stat Pap 56(3):701–721. https://doi.org/10.1007/s00362-014-0605-7
Embrechts M, Arciniegas F, Ozdemir M, Momma M (2001) Scientific data mining with StripMiner™. In: SMCia/01. Proceedings of the 2001 IEEE mountain workshop on soft computing in industrial applications (Cat. No.01EX504), IEEE, pp 13–16, https://doi.org/10.1109/SMCIA.2001.936721, http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=936721
Engle R (2002) Dynamic conditional correlation: a simple class of multivariate generalized autoregressive conditional heteroskedasticity models. J Bus Econ Stat 20(3):339–350
Engle R, Sheppard K (2001) Theoretical and empirical properties of dynamic conditional correlation multivariate GARCH. Tech. Rep., National Bureau of Economic Research, Cambridge, MA, https://doi.org/10.3386/w8554, http://www.nber.org/papers/w8554.pdf
Grothe O (2013) Jump tail dependence in Lévy copula models. Extremes 16(3):303–324
Grothe O, Hofert M (2015) Construction and sampling of Archimedean and nested Archimedean Lévy copulas. J Multivar Anal 138:182–198
Hubert L, Arabie P (1985) Comparing partitions. J Classif 2(1):193–218
Jing BY, Kong XB, Liu Z (2012) Modeling high-frequency financial data by pure jump processes. Ann Stat 40:759–784
Kallsen J, Tankov P (2006) Characterization of dependence of multidimensional Lévy processes using Lévy copulas. J Multivar Anal 97(7):1551–1572. https://doi.org/10.1016/j.jmva.2005.11.001, http://linkinghub.elsevier.com/retrieve/pii/S0047259X05002022
Kaufman L, Rousseeuw PJ (eds) (1990) Finding groups in data: an introduction to cluster analysis. Wiley Series in Probability and Statistics. Wiley, Hoboken. https://doi.org/10.1002/9780470316801
Kyprianou A (2006) Introductory lectures on fluctuations of Lévy processes with applications. Springer, Berlin
Madan DB, Carr PP, Chang EC (1998) The variance gamma process and option pricing. Eur Finance Rev 2(1):79–105
Meilă M, Pentney W (2007) Clustering by weighted cuts in directed graphs. Society for Industrial and Applied Mathematics, Philadelphia, pp 135–144
Nelsen RB (2006) An introduction to copulas. Springer series in statistics. Springer, New York. https://doi.org/10.1007/0-387-28678-0
Nelsen RB (2007) An introduction to copulas. Springer, Berlin
Poirot J, Tankov P (2007) Monte Carlo option pricing for tempered stable (CGMY) processes. Asia-Pacific Financ Mark 13(4):327–344. https://doi.org/10.1007/s10690-007-9048-7
Schmidt R, Stadtmüller U (2006) Non-parametric estimation of tail dependence. Scand J Stat 33(2):307–335. https://doi.org/10.1111/j.1467-9469.2005.00483.x
Shi J, Malik J (2000) Normalized cuts and image segmentation. IEEE Trans Pattern Anal Mach Intell 22(8):888–905. https://doi.org/10.1109/34.868688
Sklar A (1959) Fonctions de répartition à $n$ dimensions et leurs marges. Publications de l’Institut de Statistique de L’Université de Paris 8:229–231
Tankov P (2003a) Dependence structure of spectrally positive multidimensional Lévy processes. Unpublished manuscript
Tankov P (2003b) Financial modelling with jump processes, vol 2. CRC Press, Boca Raton
Tankov P (2006) Simulation and option pricing in Lévy copula models. Mathematical Modelling of Financial Derivatives, IMA Volumes in Mathematics and Applications, Springer
Tankov P (2016) Lévy copulas: review of recent results. In: The fascination of probability, statistics and their applications, Springer, pp 127–151
Wagner S, Wagner D (2007) Comparing clusterings: an overview. Universität Karlsruhe, Fakultät für Informatik Karlsruhe

Acknowledgements

The authors are indebted to two anonymous reviewers for comments and suggestions that improved the paper. Jiang Wu is grateful for the support of the MOE (Ministry of Education in China) Project of Humanities and Social Sciences (Project No. 10YJC790280).

Author information

Authors and Affiliations

Economics and Management School, Wuhan University, Wuhan, Hubei, 430072, People’s Republic of China
Chen Yang
Department of Statistical and Actuarial Sciences, University of Western Ontario, London, ON, N6A 5B7, Canada
Wenjun Jiang & Xin Liu
School of Economics, Central University of Finance and Economics, Beijing, 100081, People’s Republic of China
Jiang Wu
Richard Ivey School of Business, University of Western Ontario, London, ON, N6G 0N1, Canada
Zhichuan Li

Corresponding author

Correspondence to Wenjun Jiang.

Appendices

A The simulation procedure of multidimensional Lévy processes

The simulation procedure of multidimensional Lévy processes heavily relies on one key lemma and the improved algorithm given by Tankov (2006b).

Lemma 1

(Tankov (2006b)) Let $F(u_1,\dots ,u_d)$ be a $d$-dimensional Lévy copula such that

$$\begin{aligned} \lim _{(u_i)_{i\in I}\rightarrow \infty }F(u_1,\dots ,u_d)=F(u_1,\dots ,u_d)\big |_{(u_i)_{i\in I}=\infty } \end{aligned}$$

(10)

for nonempty $I\subseteq \{1,2,\dots ,d\}$. Define $F_{u_1}(u_2,\dots ,u_d)$ as

$$\begin{aligned} F_{u_1}(u_2,\dots ,u_d)=\text {sign}(\zeta )\frac{\partial }{\partial \zeta }F\big ((\zeta \wedge 0, \zeta \vee 0]\times (-\infty ,u_2]\times \cdots \times (-\infty ,u_d] \big )\Big |_{\zeta =u_1}, \end{aligned}$$

(11)

where $\zeta $ has the same sign as $u_1$. There exists a set $N\subseteq \mathbb {R}$ of zero Lebesgue measure such that for $u_1\in \mathbb {R}\backslash N$, $F_{u_1}(u_2,\dots ,u_d)$ is a probability distribution function of $(u_2,\dots ,u_d)$, conditional on the first component being equal to $u_1$.

Theorem 2

(Tankov (2006b)) Let $v(\cdot )$ be a Lévy measure on $\mathbb {R}^d$ satisfying

$$\begin{aligned} \int _{\mathbb {R}^d}(|x|\wedge 1)v(\mathrm{d}x)<\infty \end{aligned}$$

with marginal tail integrals $U_i$, $i=1,\dots ,d$ and Lévy copula $F(u_1,\dots ,u_d)$ such that (10) is satisfied, and let $F_{u_i}(\cdot )$ be the corresponding conditional distributions.

Fix a truncation level $\tau $. Let $\{V_k\}_{k\ge 1}$ and $\{W_k^i\}_{k\ge 1}^{1\le i\le d}$ be independent sequences of i.i.d. standard uniform random variables. Introduce $d^2$ random sequences $\{\gamma _{k}^{ij}\}_{k\ge 1}^{1\le i,j\le d}$, independent of $\{V_k\}_{k\ge 1}$ and $\{W_k^i\}_{k\ge 1}^{1\le i\le d}$, such that

For $i=1,\dots ,d$, $\sum _{k=1}^{\infty }\varDelta _{\{\gamma _{k}^{ii}\}}$ are independent Poisson random measures on $\mathbb {R}$ with Lebesgue intensity measures.
Conditional on $\gamma _k^{ii}$, the random vector $(\gamma _k^{i1},\dots ,\gamma _k^{i,i-1},\gamma _k^{i,i+1},\dots ,\gamma _k^{id})$ is distributed on $\mathbb {R}^{d-1}$ with law $F_{\gamma _k^{ii}}(\cdot )$.

For each $k\ge 1$ and each $i=1,\dots ,d$, let $n_k^i=\#\{j=1,\dots ,d: |\gamma _k^{ij}|\le \tau \}$. Then the process $\{Z_t^{\tau }\}_{0\le t\le 1}$ with components

$$\begin{aligned} Z_t^{\tau ,j}=\sum _{k=1}^{\infty }\sum _{i=1}^{d}U_j^{(-1)}(\gamma _k^{ij})\mathbb {1}_{n_k^iW_k^i\le 1}\mathbb {1}_{|\gamma _k^{ii}|\le \tau }\mathbb {1}_{[0,t]}(V_k), \quad j=1,\dots ,d, \end{aligned}$$

(12)

is a Lévy process on [0, 1] with characteristic function

$$\begin{aligned} \mathbf {E}[e^{i\langle u,Z_t^{\tau }\rangle }]=\exp \Big (t\int _{\mathbb {R}^d\backslash S_{\tau }}(e^{i\langle u,z\rangle }-1)v(\mathrm{d}z) \Big ), \end{aligned}$$

(13)

where $\langle \cdot ,\cdot \rangle $ represents the inner product and

$$\begin{aligned} S_{\tau }=\big (U_1^{(-1)}(-\tau ),U_1^{(-1)}(\tau ) \big )\times \cdots \times \big (U_d^{(-1)}(-\tau ),U_d^{(-1)}(\tau ) \big ). \end{aligned}$$

(14)
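
Because of the indicator $\mathbb {1}_{|\gamma _k^{ii}|\le \tau }$ in (12), only the atoms of each Poisson random measure that fall in $[-\tau ,\tau ]$ contribute to $Z_t^{\tau }$. The following is a minimal Python sketch of drawing those atoms together with independent jump times $V_k$. It relies on the standard fact that a Poisson process with Lebesgue intensity has i.i.d. Exp(1) gaps between consecutive points; the function names and the use of Python's `random` module are illustrative assumptions and do not come from the original algorithm.

```python
import random


def truncated_poisson_atoms(tau, rng):
    """Atoms in [-tau, tau] of a Poisson random measure on R with Lebesgue intensity."""
    atoms = []
    # Walk along an interval of length 2*tau using Exp(1) gaps between points.
    position = rng.expovariate(1.0)
    while position <= 2.0 * tau:
        atoms.append(position - tau)  # shift [0, 2*tau] onto [-tau, tau]
        position += rng.expovariate(1.0)
    return atoms


def atoms_with_jump_times(tau, rng):
    """Pair each retained atom gamma_k^{ii} with a uniform jump time V_k on [0, 1]."""
    return [(gamma, rng.random()) for gamma in truncated_poisson_atoms(tau, rng)]


if __name__ == "__main__":
    rng = random.Random(2018)
    for gamma, v in atoms_with_jump_times(3.0, rng):
        print(f"gamma = {gamma:+.4f}, jump time = {v:.4f}")
```

The number of atoms returned is Poisson distributed with mean $2\tau $, so a larger truncation level gives more (and smaller) jumps at a higher simulation cost.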

The above theorem provides a robust algorithm for simulating multidimensional Lévy processes given the Lévy copula; however, it still requires an efficient algorithm to sample the random vector $(\gamma _k^{i1},\dots ,\gamma _k^{i,i-1},\gamma _k^{i,i+1},\dots ,\gamma _k^{id})$ given the cumulative distribution function $F_{\gamma _k^{ii}}(\cdot )$. While it remains unknown whether such an algorithm exists in general, for the particular case of the Lévy-Clayton copula considered in our simulation study we found a method that serves this purpose well.

For the two-directed Lévy-Clayton copula $F$ defined in (9), we have

$$\begin{aligned} F(u_1,\ldots ,u_d;\theta ,\eta )=0 \end{aligned}$$

whenever $u_1=0$. Hence by denoting $I_{d-1}:=(-\infty ,u_2]\times \cdots \times (-\infty ,u_d]\in \mathcal {B}(\mathbb {R}^{d-1})$, we have

$$\begin{aligned} F\left( (\zeta \wedge 0,\zeta \vee 0]\times I_{d-1};\theta ,\eta \right) = \text {sign}(\zeta )\int _{I_{d-1}}F(\zeta ,\mathrm{d}v_2, \ldots ,\mathrm{d}v_d;\theta ,\eta ). \end{aligned}$$

(15)

Thus, given the signs of $u_1,u_2,\ldots ,u_d$, interchanging the partial differentiation and the integrals yields

$$\begin{aligned} F_{u_1}(u_2,\ldots ,u_d;\theta ,\eta ) = \int _{I_{d-1}}\tilde{F}_{u_1}(\mathrm{d}v_2, \ldots ,\mathrm{d}v_d;\theta ,\eta ), \end{aligned}$$

(16)

where

$$\begin{aligned} \tilde{F}_{u_1}(u_2,\ldots ,u_d;\theta ,\eta )= & {} \text {sign}(u_1)2^{2-d}\left( 1+|u_1|^{\theta }\sum ^d_{i=2} |u_i|^{-\theta }\right) ^{-\frac{1}{\theta }-1}\\&\left( \eta \varvec{1}_{\{u_1 u_2\dots u_d\ge 0\}}-(1-\eta )\varvec{1}_{\{u_1 u_2\dots u_d<0\}}\right) . \end{aligned}$$

Notice that according to Lemma 1, $F_{u_1}$ is a probability distribution function. In fact, a random vector having the CDF $F_{u_1}$ is a mixture of $2^{d-1}$ random vectors with the following joint CDF:

$$\begin{aligned} G_{u_1}(u_2,\ldots ,u_d;\theta ) = \left( 1+|u_1|^{\theta }\sum ^d_{i=2}u^{-\theta }_i\right) ^{-\frac{1}{\theta }-1},\quad u_i>0 \end{aligned}$$

(17)

for all $i=2,\ldots ,d$. Hence to construct such a random vector, consider a random sign vector $\begin{pmatrix} S_2&\cdots&S_d\end{pmatrix}^{\intercal }\in \{-1,1\}^{d-1}$ satisfying

$$\begin{aligned}&\displaystyle \mathbf {P}\left( \text {sign}(u_1)\prod ^d_{i=2}S_i\ge 0\right) = \eta , \end{aligned}$$

(18)

$$\begin{aligned}&\displaystyle \quad \mathbf {P}\left( \text {sign}(u_1)\prod ^d_{i=2}S_i< 0\right) = 1-\eta , \end{aligned}$$

(19)

$$\begin{aligned}&\displaystyle \quad \mathbf {P}\left( S_2=s_2,\ldots ,S_d=s_d\bigg |\text {sign}(u_1)\prod ^d_{i=2}S_i\ge 0\right) =2^{2-d}\mathbb {1}_{\left\{ \text {sign}(u_1)\prod ^d_{i=2}s_i\ge 0\right\} },\qquad \end{aligned}$$

(20)

$$\begin{aligned}&\displaystyle \quad \mathbf {P}\left( S_2=s_2,\ldots ,S_d=s_d\bigg |\text {sign}(u_1) \prod ^d_{i=2}S_i<0\right) =2^{2-d}\mathbb {1}_{\left\{ \text {sign}(u_1)\prod ^d_{i=2}s_i< 0\right\} } \end{aligned}$$

(21)

for arbitrary $s_2,\ldots ,s_d\in \{-1,1\}$. Thus, by generating $\begin{pmatrix} Y_2&\cdots&Y_d\end{pmatrix}^{\intercal }$ from $G_{u_1}$ independently of $\begin{pmatrix} S_2&\cdots&S_d\end{pmatrix}^{\intercal }$, we obtain a random vector $\begin{pmatrix} S_2Y_2&\cdots&S_dY_d\end{pmatrix}^{\intercal }$ with joint CDF $F_{u_1}$.
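
Conditions (18)–(21) can be met with a simple two-stage draw: first decide with probability $\eta $ whether $\text {sign}(u_1)\prod _{i=2}^d S_i$ is positive, then pick a sign vector uniformly among the $2^{d-2}$ vectors with that product. The sketch below is one possible way to do this in Python, choosing the first $d-2$ signs freely and fixing the last one; the function name and argument layout are assumptions made for illustration, and $u_1\ne 0$ is assumed.

```python
import random


def sample_sign_vector(u1, d, eta, rng):
    """Draw (S_2, ..., S_d) in {-1, 1}^(d-1) following (18)-(21); assumes u1 != 0."""
    if d < 2:
        raise ValueError("d must be at least 2")
    sign_u1 = 1 if u1 > 0 else -1
    # With probability eta the product sign(u1) * S_2 * ... * S_d is positive.
    target = 1 if rng.random() < eta else -1
    # The first d-2 signs are free; each is +1 or -1 with probability 1/2.
    signs = [rng.choice((-1, 1)) for _ in range(d - 2)]
    partial = sign_u1
    for s in signs:
        partial *= s
    # The last sign is forced so that the full product equals the target.
    signs.append(target * partial)
    return signs


if __name__ == "__main__":
    rng = random.Random(7)
    print(sample_sign_vector(u1=-2.0, d=4, eta=0.7, rng=rng))
```

Since the free signs are uniform and the last one is determined by them, every admissible sign vector has conditional probability $2^{2-d}$, as required by (20) and (21).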

The remaining key step in generating random vectors from $F_{u_1}$ is to generate random vectors from $G_{u_1}$. Since $|u_1|$ acts as a scale parameter in $G_{u_1}$, it suffices to consider the cumulative distribution function $G_1$. Since the copula uniquely associated with $G_1$ is the Clayton copula

$$\begin{aligned} C(u_2,\ldots ,u_d;\theta ') = (u^{-\theta '}_2+\cdots +u^{-\theta '}_d-d+2)^{-\frac{1}{\theta '}}, \quad \theta '=\dfrac{\theta }{\theta +1}, \end{aligned}$$

a random vector from $G_1$ could be written as $\begin{pmatrix} G^{-1}_{1,2}(U_2)&\cdots&G^{-1}_{1,d}(U_d)\end{pmatrix}^{\intercal }$ where $\begin{pmatrix} U_2&\cdots&U_d\end{pmatrix}^{\intercal }$ is generated from the Clayton copula C and

$$\begin{aligned} G_{1,i}(u_i;\theta )=(1+u^{-\theta }_i)^{-\frac{1}{\theta }-1} \end{aligned}$$

for $i=2,\ldots ,d$ are the marginal distributions of $G_1$ (all of these marginal distributions are Dagum distributions of type I). Therefore, $\begin{pmatrix} Y_2&\cdots&Y_d\end{pmatrix}^{\intercal } = \begin{pmatrix} |u_1|G^{-1}_{1,2}(U_2)&\cdots&|u_1|G^{-1}_{1,d}(U_d)\end{pmatrix}^{\intercal }$ is a random vector from $G_{u_1}$. Consequently, the simulation algorithm could be summarized as follows:
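
A minimal Python sketch of the step that draws $\begin{pmatrix} Y_2&\cdots&Y_d\end{pmatrix}^{\intercal }$ from $G_{u_1}$ is given here. It assumes the Clayton copula is sampled with the standard gamma-frailty (Marshall–Olkin) construction, which the text above does not prescribe: with $V\sim \text {Gamma}(1/\theta ',1)$ and independent $E_i\sim \text {Exp}(1)$, the variables $U_i=(1+E_i/V)^{-1/\theta '}$ follow the Clayton copula with parameter $\theta '$. Inverting the Dagum marginal gives $G^{-1}_{1,i}(p)=(p^{-\theta '}-1)^{-1/\theta }$, so that $G^{-1}_{1,i}(U_i)=(E_i/V)^{-1/\theta }$. The function name is hypothetical.

```python
import random


def sample_from_G(u1, d, theta, rng):
    """Draw (Y_2, ..., Y_d) with joint CDF G_{u_1} from (17); assumes theta > 0."""
    theta_prime = theta / (theta + 1.0)
    # Gamma frailty shared by all components (shape 1/theta', scale 1).
    frailty = rng.gammavariate(1.0 / theta_prime, 1.0)
    ys = []
    for _ in range(d - 1):
        ratio = rng.expovariate(1.0) / frailty
        # U_i = (1 + ratio)^(-1/theta') is Clayton distributed; applying the
        # Dagum quantile (p^(-theta') - 1)^(-1/theta) to it simplifies to:
        x = ratio ** (-1.0 / theta)
        # |u1| acts as a scale parameter of G_{u_1}.
        ys.append(abs(u1) * x)
    return ys


if __name__ == "__main__":
    rng = random.Random(11)
    print(sample_from_G(u1=-1.5, d=3, theta=2.0, rng=rng))
```

Multiplying each $Y_i$ by an independent random sign $S_i$ satisfying (18)–(21) then gives a draw from $F_{u_1}$.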

B Details of the clustering algorithm

C Descriptive statistics of the log-returns of the dataset

See Table 9.

Table 9 Descriptive statistics of the log-returns

About this article

Cite this article

Yang, C., Jiang, W., Wu, J. et al. Clustering of financial instruments using jump tail dependence coefficient. Stat Methods Appl 27, 491–513 (2018). https://doi.org/10.1007/s10260-017-0411-1

Accepted: 11 November 2017
Published: 21 November 2017
Issue Date: 10 August 2018
DOI: https://doi.org/10.1007/s10260-017-0411-1
