On a bivariate copula for modeling negative dependence: application to New York air quality data

Ghosh, Shyamal; Bhuyan, Prajamitra; Finkelstein, Maxim

doi:10.1007/s10260-022-00636-3

Original Paper
Open access
Published: 28 April 2022

Volume 31, pages 1329–1353, (2022)

Statistical Methods & Applications

Abstract

In many practical scenarios, including finance, environmental sciences, system reliability, etc., it is often of interest to study various notions of negative dependence among the observed variables. A new bivariate copula is proposed for modeling negative dependence between two random variables that complies with most of the popular notions of negative dependence reported in the literature. Specifically, the Spearman’s rho and the Kendall’s tau for the proposed copula have a simple one-parameter form with negative values in the full range. Some important ordering properties comparing the strength of negative dependence with respect to the parameter involved are considered. Simple examples of the corresponding bivariate distributions with popular marginals are presented. Application of the proposed copula is illustrated using a real data set on air quality in New York City, USA.

1 Introduction

Copulas provide an effective tool for modeling dependence in various multivariate phenomena in the fields of reliability engineering, life sciences, environmental science, economics and finance, etc (Fontaine et al. 2020; Cooray 2019; Joe 2015, Ch-7). Specifically, in recent decades, bivariate copulas were used to generate bivariate distributions with suitable dependence properties (Lai and Xie 2000; Bairamov and Kotz 2003; Finkelstein 2003; Durante et al. 2012; Mohtashami-Borzadaran et al. 2019). The detailed discussion of historical developments, obtained results and perspectives along with the up to date theory can be found in Durante and Sempi (2015) and Hofert et al. (2018). It should be noted that most copulas available in the literature possess some limitations in modeling negatively dependent data, which is a certain disadvantage, as negative dependence between vital variables is often encountered in real life.

Lehmann (1966) introduced several concepts of negative dependence for bivariate distributions. Later, Esary and Lehmann (1972) and Yanagimoto (1972) extended the corresponding definitions and developed stronger notions of bivariate negative dependence. See Balakrishnan and Lai (2009) for detailed discussion on popular dependence notions and their applications in the context of continuous bivariate distributions. Scarsini and Shaked (1996) provided a detailed overview of the corresponding ordering properties for the multivariate distributions. These results provide useful tools for describing the dependence properties of copulas with respect to a dependence parameter. However, only a few bivariate copulas that allow for a simple and meaningful analysis of this kind have been developed and studied in the literature so far. The Farlie–Gumbel–Morgenstern (FGM) family of distributions exhibits negative dependence, but the Spearman’s rho for this family lies within the interval $[-1/3,1/3]$ (Schucany et al. 1978). Bairamov and Kotz (2000) and Bekrizadeh et al. (2012) have considered the four-parameter and the three-parameter extensions of the FGM family proposed by Sarmanov (1996), with Spearman’s rho lying within the interval $[-0.48, 0.50]$ and $[-0.5, 0.43]$, respectively. To address this issue, Amblard and Girard (2009) proposed another extension, but its application is limited because of a singular component that is concentrated on the corresponding diagonal. Some other extensions of the FGM copula are discussed in Ahn (2015) and Bekrizadeh and Jamshidi (2017). Hürlimann (2015) has proposed a comprehensive extension of the FGM copula with the Spearman’s rho and Kendall’s tau attaining any value in $(-1,1)$. However, the dependence and ordering properties of these copulas are not well studied in the literature. Recently, Cooray (2019) proposed a new extension of the FGM family which exhibits negative dependence among the variables in a very strong sense. However, its Spearman’s rho and Kendall’s tau are restricted to $[-0.70,0]$ and $[-0.52,0]$, respectively.

Thus, it is quite a challenging problem to construct a flexible bivariate copula with the correlation coefficient that takes any value in the interval $(-1,0)$. Moreover, it is not sufficient just to suggest this type of copula, but it is essential to describe its properties (including relevant stochastic comparisons) especially in the case of strong notions of dependence. In many real life scenarios, paired observations of non-negative variables possess strong negative dependence. For example, rainfall intensity and duration are jointly modeled incorporating their negative dependence for the study of derived flood frequency distribution (Kurothe et al. 1997). This paper is motivated by a real case study on air quality for New York Metropolitan area where the joint distribution of the wind speed and ozone level exhibits strong negative dependence (See Sect. 7). We believe that the current study meets to some extent this challenge, as we propose an absolutely continuous negatively dependent copula that satisfies most of the popular notions of negative dependence available in the literature with correlation coefficients in the interval $(-1,0)$.

The paper is organized as follows. In Sect. 2, we describe the baseline (for the proposed copula) distribution and discuss some basic properties including conditional distributions and correlation coefficients. Various notions of negative dependence in the context of the proposed copula and ordering properties are considered in Sects. 3 and 4, respectively. Section 5 provides some examples of negatively dependent standard bivariate distributions. The estimation methodologies are discussed in Sect. 6. As an illustration, we provide a real case study in Sect. 7. Finally, some concluding remarks are given in Sect. 8.

2 The bivariate copula

Bhuyan et al. (2020) proposed a negatively dependent bivariate life distribution that possesses tractable closed-form expressions for the joint distributions and exhibits various strong notions of negative dependence reported in the literature. Most importantly, the correlation coefficient may take any value in the interval $(-1,0)$. One of the marginal distributions is Exponential and the other belongs to the skew log-Laplace family (Dixit and Khandeparkar 2017). We utilize the negative dependence structure inherent in this model and formulate a copula with strong negative dependence. The joint distribution function and the marginal distributions are given by

$$\begin{aligned} H(x,y) = \left\{ \begin{array}{ll} y^{\lambda } - e^{-\lambda x}+\dfrac{\lambda }{(\lambda +\mu )y^{\mu }}\left[ e^{-(\lambda +\mu )x}-y^{\lambda +\mu }\right] , & 0<y \le 1, x>-\log y \\ 1-e^{-\lambda x}-\dfrac{\lambda }{(\lambda +\mu )y^{\mu }}\left[ 1-e^{-(\lambda +\mu )x}\right] , & x>0, y>1, \end{array}\right. \end{aligned}$$

(1)

and $F(x)= 1- e^{-\lambda x}$ for $x>0$, and $G(y) =\dfrac{\mu }{(\lambda +\mu )}y^{\lambda }\mathbbm {1}(0<y \le 1)+ \left[ 1-\dfrac{\lambda }{(\lambda +\mu )y^{\mu }}\right] \mathbbm {1}( y>1)$, respectively, where $\lambda ,\mu >0$. Note that $F(\cdot )$ and $G(\cdot )$ are continuous. We first find the quasi-inverse functions of $F(\cdot )$ and $G(\cdot )$ and ‘put’ those into the arguments of the joint distribution function $H(\cdot ,\cdot )$ given by (1). Then by Corollary 2.3.7 of Nelsen (2006, p-22), we obtain the following copula

$$\begin{aligned} C_{\lambda , \mu }(u,v) = \left\{ \begin{array}{l} v-(1-u)+\dfrac{\lambda \mu ^{\frac{\mu }{\lambda }}}{(\lambda +\mu )^{1+\frac{\mu }{\lambda }}}(1-u)^{{1+\frac{\mu }{\lambda }}}v^{-\frac{\mu }{\lambda }}, 0<v \le \dfrac{\mu }{\mu +\lambda }, 1-\dfrac{(\lambda +\mu )v}{\mu }<u<1, \\ u-(1-v)\left[ 1-(1-u)^{{1+\frac{\mu }{\lambda }}}\right] , 0<u<1, \dfrac{\mu }{\mu +\lambda }<v<1. \end{array}\right. \end{aligned}$$

(2)

Now using the reparameterization $\mu =\theta \lambda $, in (2), we rewrite $C_{\lambda , \mu }$ as

$$\begin{aligned} C_\theta (u,v) = \left\{ \begin{array}{l} v-(1-u)+\dfrac{\theta ^\theta }{(1+\theta )^{1+\theta }}(1-u)^{1+\theta }v^{-\theta }, 0<v \le \dfrac{\theta }{1+\theta }, 1-\dfrac{(1+\theta )v}{\theta }<u<1 \\ u-(1-v)\left[ 1-(1-u)^{1+\theta }\right] , 0<u<1, \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$

(3)

for $\theta >0$. It is easy to verify that $C_\theta (u,v)$, given by (3), satisfies the following conditions: (i) $C_\theta (u,0)=0=C_\theta (0,v)$, (ii) $C_\theta (u,1)=u$, $C_\theta (1,v)=v$, for any u, v in $I=[0,1]$, and (iii) $C_\theta (u_2,v_2)-C_\theta (u_2, v_1)-C_\theta (u_1, v_2)+C_\theta (u_1,v_1)\ge 0$, for any $u_1, u_2, v_1, v_2$ in I with $u_1\le u_2$ and $v_1\le v_2$. In Figs. 1 and 2, we provide graphical presentation of the proposed copula for different values of the dependence parameter $\theta $.
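Conditions (i) to (iii) can also be checked numerically. The following is a minimal Python sketch of (3), not the authors' code. The function names `copula_cdf` and `rect_volume` are illustrative, and the sketch assumes that $C_\theta $ equals 0 for $u\le 1-(1+\theta )v/\theta $, which is where the first branch of (3) vanishes.

```python
def copula_cdf(u: float, v: float, theta: float) -> float:
    """C_theta(u, v) from (3); assumed to be 0 below the support boundary."""
    if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
        raise ValueError("u and v must lie in [0, 1]")
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    if u == 0.0 or v == 0.0:
        return 0.0  # condition (i): the copula is grounded
    split = theta / (1.0 + theta)
    if v <= split:
        # Assumption: C is 0 on and below the curve u = 1 - (1 + theta) v / theta
        if u <= 1.0 - (1.0 + theta) * v / theta:
            return 0.0
        k = theta**theta / (1.0 + theta) ** (1.0 + theta)
        return v - (1.0 - u) + k * (1.0 - u) ** (1.0 + theta) * v ** (-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u) ** (1.0 + theta))


def rect_volume(u1: float, u2: float, v1: float, v2: float, theta: float) -> float:
    """C-volume of [u1, u2] x [v1, v2]; condition (iii) says it is non-negative."""
    return (copula_cdf(u2, v2, theta) - copula_cdf(u2, v1, theta)
            - copula_cdf(u1, v2, theta) + copula_cdf(u1, v1, theta))


if __name__ == "__main__":
    theta = 1.5
    print(copula_cdf(0.3, 1.0, theta))   # uniform margin in u: C(u, 1) = u
    print(copula_cdf(1.0, 0.4, theta))   # uniform margin in v: C(1, v) = v
    print(rect_volume(0.2, 0.6, 0.3, 0.8, theta))
```

A grid check of this kind only illustrates the conditions for chosen values of $\theta $; it does not replace the analytical verification.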

The survival copula is the function ${\bar{C}}$ which couples the joint survival function to its marginal survival functions. It is easy to show that ${\bar{C}}$ is a copula, and is related to the copula C via the equation ${\bar{C}}(u,v)= u+v-1 + C(1-u, 1-v)$. See Nelsen (2006, p-32) for details. The survival copula and the density function of the proposed copula $C_\theta (u,v)$ are given by

$$\begin{aligned} {\bar{C}}_\theta (u,v) = \left\{ \begin{array}{ll} \dfrac{\theta ^\theta }{(1+\theta )^{1+\theta }}u^{1+\theta }(1-v)^{-\theta }, & \dfrac{1}{1+\theta }\le v<1, 0<u<\dfrac{(1+\theta )(1-v)}{\theta } \\ vu^{(1+\theta )}, & 0<u<1, 0<v<\dfrac{1}{1+\theta }, \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} c_\theta (u,v) = \left\{ \begin{array}{ll} \dfrac{\theta ^{1+\theta }}{(1+\theta )^\theta }(1-u)^\theta v^{-(1+\theta )}, & 0<v \le \dfrac{\theta }{1+\theta }, 1-\dfrac{(1+\theta )v}{\theta }<u<1 \\ (1+\theta )(1-u)^\theta , & 0<u<1, \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$

(4)

respectively.
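A quick sanity check on the density (4) is that it integrates to 1 over v for every fixed u, as uniform margins require. The sketch below is illustrative Python, not taken from the paper. The helper names and the composite Simpson rule are assumptions made for this example, and the density is taken to be 0 outside the region listed in (4).

```python
def _lower_branch(u: float, v: float, theta: float) -> float:
    """First branch of (4), valid for v <= theta / (1 + theta) inside the support."""
    return (theta ** (1.0 + theta) / (1.0 + theta) ** theta
            * (1.0 - u) ** theta * v ** (-(1.0 + theta)))


def _upper_branch(u: float, theta: float) -> float:
    """Second branch of (4); it does not depend on v."""
    return (1.0 + theta) * (1.0 - u) ** theta


def copula_density(u: float, v: float, theta: float) -> float:
    """c_theta(u, v) from (4), assumed to be 0 outside the listed region."""
    split = theta / (1.0 + theta)
    if v <= split:
        if u < 1.0 - (1.0 + theta) * v / theta:
            return 0.0
        return _lower_branch(u, v, theta)
    return _upper_branch(u, theta)


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0


def mass_over_v(u: float, theta: float) -> float:
    """Integral of c_theta(u, v) over v for fixed u; should be close to 1."""
    split = theta / (1.0 + theta)
    lower = (1.0 - u) * split            # lower end of the support in v
    part1 = simpson(lambda v: _lower_branch(u, v, theta), lower, split)
    part2 = (1.0 - split) * _upper_branch(u, theta)
    return part1 + part2


if __name__ == "__main__":
    for u in (0.1, 0.5, 0.9):
        print(u, mass_over_v(u, theta=2.0))
```

The two pieces are integrated separately because the density changes form at $v=\theta /(1+\theta )$.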

2.1 Conditional copulas

The conditional copula of U given $V=v$, is as follows. For $0<v \le \frac{\theta }{(1+\theta )}$,

$$\begin{aligned} C_\theta (u\mid v)= 1- \frac{\theta ^{(1+\theta )}}{(1+\theta )^{(1+\theta )}}(1-u)^{(1+\theta )} v^{-(1+\theta )}, \,\,\,\,1-\dfrac{(1+\theta )v}{\theta }<u<1, \end{aligned}$$

(5)

whereas for $\frac{\theta }{(1+\theta )}<v<1$,

$$\begin{aligned} C_\theta (u\mid v)= 1-(1-u)^{(1+\theta )}, \,\,\,\,\,0<u<1. \end{aligned}$$

(6)

The conditional mean and variance of $U\mid V=v$ are given by

$$\begin{aligned} E[U\mid V=v]= \left\{ \begin{array}{ll} 1-\dfrac{(1+\theta )^2v}{\theta (\theta +2)}, & 0<v \le \dfrac{\theta }{1+\theta } \\ \dfrac{1}{\theta +2}, & \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} Var[U\mid V=v]= \left\{ \begin{array}{ll} \dfrac{(1+\theta )^3v^2}{\theta ^2(\theta +2)^2(\theta +3)},& 0<v \le \dfrac{\theta }{1+\theta } \\ \dfrac{\theta +1}{(\theta +2)^2(\theta +3)},& \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$

respectively.

Remark 1

The regression of U on $V=v$ is linearly decreasing in v for $0<v\le \frac{\theta }{\theta +1}$, and independent of v for $\frac{\theta }{\theta +1}<v<1$. Also, it is interesting to note that the conditional variance of $U\mid V=v$ is an increasing function of v and bounded from above by $\frac{\theta +1}{(\theta +2)^2(\theta +3)}.$

The conditional copula of V given $U=u$, is given by

$$\begin{aligned} C_\theta (v\mid u)= \left\{ \begin{array}{ll} 1- \dfrac{\theta ^\theta }{(1+\theta )^\theta }(1-u)^\theta v^{-\theta }, & \dfrac{(1-u)\theta }{(1+\theta )}<v \le \dfrac{\theta }{1+\theta } \\ 1-(1+\theta )(1-v)(1-u)^\theta , & \dfrac{\theta }{1+\theta }<v<1 \end{array}\right. \end{aligned}$$

(7)

The conditional mean and variance of $V\mid U=u$, are given by

$$\begin{aligned} E[V\mid U=u]=\dfrac{(1-u)^\theta }{2(1-\theta )}-\dfrac{\theta ^2(1-u)}{1-\theta ^2}, \end{aligned}$$

for $\theta \ne 1$, and

$$\begin{aligned} Var[V\mid U=u]= & {} -\dfrac{(1+\theta )(1-u)^\theta \left[ 2 - \theta + \theta ^2 (2 - 6 u) + 3 \theta ^3 u\right] }{3(\theta -2)(\theta ^2-1)^2}\\&+\dfrac{\theta ^3(1-u)^2}{(\theta -2)(\theta ^2-1)^2} -\dfrac{(1+\theta )^2(1-u)^{2\theta }}{4(\theta ^2-1)^2} \end{aligned}$$

for $\theta \ne 1,2$, respectively.

Remark 2

The regression of V on $U=u$ is strictly decreasing in u.

One can use the conditional copula of U given $V=v$, provided in (5) and (6), to simulate from the proposed copula $C_{\theta }$, given by (3), using the following steps.

Step I: Simulate $v_i$ and $u_i^{*}$ independently from standard uniform distribution.
Step II: If $v_i\le \frac{\theta }{\theta +1}$, then solving $C_{\theta }(u\mid v_i)=u_i^{*}$ from (5), we get $u_i=1-(\frac{\theta +1}{\theta })v_i(1-u_i^{*})^{\frac{1}{1+\theta }}$; else, solving $C_{\theta }(u\mid v_i)=u_i^{*}$ from (6), we get $u_i=1-(1-u_i^{*})^{\frac{1}{1+\theta }}$.
Step III: Repeat Step I and Step II n times to obtain independently and identically distributed realizations $(u_i,v_i)$, for $i=1,2,\ldots ,n$ from $C_{\theta }$.
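Steps I to III can be sketched in a few lines of Python. This is an illustrative re-implementation using only the standard library, not the R programme from the Supplementary material. The names `sample_copula` and `kendall_tau` are chosen for this example, and the sample Kendall's tau is computed with a simple quadratic-time loop.

```python
import random


def sample_copula(n: int, theta: float, seed=None):
    """Draw n pairs (u_i, v_i) from C_theta by conditional inversion (Steps I-III)."""
    if theta <= 0.0:
        raise ValueError("theta must be positive")
    rng = random.Random(seed)
    split = theta / (1.0 + theta)
    pairs = []
    for _ in range(n):
        v = rng.random()                       # Step I: v_i ~ U(0, 1)
        u_star = rng.random()                  # Step I: u_i^* ~ U(0, 1)
        r = (1.0 - u_star) ** (1.0 / (1.0 + theta))
        if v <= split:                         # Step II: invert (5)
            u = 1.0 - (1.0 + theta) / theta * v * r
        else:                                  # Step II: invert (6)
            u = 1.0 - r
        pairs.append((u, v))
    return pairs                               # Step III: n i.i.d. draws


def kendall_tau(pairs) -> float:
    """Sample Kendall's tau: (concordant - discordant) pairs over all pairs."""
    n = len(pairs)
    score = 0
    for i in range(n - 1):
        ui, vi = pairs[i]
        for j in range(i + 1, n):
            uj, vj = pairs[j]
            prod = (ui - uj) * (vi - vj)
            if prod > 0:
                score += 1
            elif prod < 0:
                score -= 1
    return 2.0 * score / (n * (n - 1))


if __name__ == "__main__":
    theta = 2.0
    draws = sample_copula(1500, theta, seed=1)
    print("sample tau:", kendall_tau(draws))
    print("theoretical tau:", -theta / (1.0 + theta))
```

For a moderately large sample, the sample Kendall's tau should lie close to $-\theta /(1+\theta )$, the value stated in Proposition 5.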

A similar algorithm can be elaborated to simulate from $C_{\theta }$ based on the conditional copula of V given U, provided in (7). The associated R programme for the aforementioned algorithm is provided in the Supplementary material. Scatter plots based on 500 simulated observations using the aforementioned algorithm for four different values of $\theta $ are given in Fig. 3. As expected, the data points get closer to the anti-diagonal $v=1-u$ for higher values of $\theta $.

2.2 Basic properties

In this Subsection, we present three important propositions related to the proposed copula. The detailed proofs are presented in Appendix A.

Proposition 1

The copula $C_{\theta }$, defined in (3), is decreasing with respect to its dependence parameter $\theta $, i.e., if $\theta _1\le \theta _2$ then $C_{\theta _2}(u,v)\le C_{\theta _1}(u,v)$, for all $(u,v) \in I^2 = [0,1]\times [0,1]$.

Proposition 2

The copula $C_{\theta }$, defined in (3), is sub-harmonic, i.e., $\nabla ^2 C_\theta (u,v) \ge 0$.

Proposition 3

The copula $C_{\theta }$, defined in (3), is absolutely continuous.

2.3 Measures of dependence

Measures of dependence are commonly used to summarize the complicated dependence structure of bivariate distributions. See Joe (1997, Ch-2), Nelsen (2006, Ch-5) and Hofert et al. (2018, Ch-2) for a detailed review on measures of dependence and its associated properties. In this section, we derive the expressions of the Kendall’s tau and the Spearman’s rho for the proposed copula $C_\theta $. Essentially, these coefficients measure the correlation between the ranks rather than actual values of X and Y. Therefore, these coefficients are unaffected by any monotonically increasing transformation of X and Y.

Definition 1

Let X and Y be the continuous random variables with the dependence structure described by the copula C. Then the population version of the Spearman’s rho for X and Y is given by

$$\begin{aligned} \rho :=12\int _0^1\int _0^1uv\,dC(u,v)-3= 12\int _0^1\int _0^1C(u,v)\,du\,dv-3 \end{aligned}$$

Proposition 4

Let (X, Y) be a random pair with copula $C_\theta $. The Spearman’s rho is given by

$$\begin{aligned} \rho =\dfrac{2(3+3\theta +\theta ^2)}{2+3\theta +\theta ^2}-3, \end{aligned}$$

which is a decreasing function of $\theta $ and takes any value between $-1$ and 0.

Definition 2

Let X and Y be the continuous random variables with copula C. Then, the population version of the Kendall’s tau for X and Y is given by

$$\begin{aligned} \tau := 4\int _0^1\int _0^1C(u,v)dC(u,v)-1 \end{aligned}$$

Proposition 5

Let (X, Y) be a random pair with copula $C_\theta $. Then the Kendall’s tau is given by

$$\begin{aligned} \tau =\dfrac{-\theta }{(1+\theta )}, \end{aligned}$$

which is a decreasing function of $\theta $ and takes any value between $-1$ and 0.

In Fig. 4, we have plotted the Spearman’s rho and the Kendall’s tau against the dependence parameter $\theta $. It is easy to see that the Spearman’s rho is less than the Kendall’s tau for all $\theta >0$.
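The ordering of the two coefficients can be made explicit by simplifying the expressions in Propositions 4 and 5. The following short calculation is added as a worked check and uses only those two formulas:

$$\begin{aligned} \rho =\dfrac{2(3+3\theta +\theta ^2)-3(2+3\theta +\theta ^2)}{(1+\theta )(2+\theta )}=-\dfrac{\theta (\theta +3)}{(1+\theta )(2+\theta )}, \qquad \rho -\tau =-\dfrac{\theta }{(1+\theta )(2+\theta )}<0. \end{aligned}$$

Consequently $\rho /\tau =(\theta +3)/(\theta +2)$, which decreases from $3/2$ as $\theta \rightarrow 0$ towards 1 as $\theta \rightarrow \infty $.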

3 Connections with notions of negative dependence

As discussed in Sect. 2.3, the Spearman’s rho and the Kendall’s tau measure the correlation between two random variables. However, it is possible that these random variables have strong correlation but possess weak association with respect to different notions of dependence, or vice versa. In this section, we discuss several relevant notions of negative dependence, namely Quadrant Dependence, Regression Dependence and Likelihood Ratio Dependence, etc., and explore whether the corresponding properties are satisfied by the proposed copula or not. First, we provide the definitions of the aforementioned dependence notions as discussed in Nelsen (2006) and Balakrishnan and Lai (2009).

Definition 3

Let X and Y be continuous random variables with copula C. Then

1. X and Y are Negatively Quadrant Dependent (NQD) if $P(X \le x, Y \le y) \le {P(X \le x)P(Y \le y)}$, for all $(x, y) \in R^2$, where $R^2$ is the domain of joint distribution of X and Y, or equivalently a copula C is said to be NQD if for all $(u,v)\in I^2$, $C(u,v)\le uv.$
2. Y is left tail increasing in X (LTI($Y\mid X$)), if $P[Y \le y \mid X \le x]$ is a nondecreasing function of x for all y.
3. X is left tail increasing in Y (LTI($X\mid Y$)), if $P[X \le x \mid Y \le y]$ is a nondecreasing function of y for all x.
4. Y is right tail decreasing in X (RTD($Y\mid X$)), if $P[Y> y \mid X > x]$ is a nonincreasing function of x for all y.
5. X is right tail decreasing in Y (RTD($X\mid Y$)), if $P[X>x \mid Y > y]$ is a nonincreasing function of y for all x.
6. Y is stochastically decreasing in X, denoted as SD($Y\mid X$) (also known as negatively regression dependent (NRD) ($Y\mid X$)), if $P[Y > y \mid X = x]$ is a nonincreasing function of x for all y.
7. X is stochastically decreasing in Y, denoted as SD($X\mid Y$) (also known as negatively regression dependent ($X\mid Y$)), if $P[X > x\mid Y = y]$ is a nonincreasing function of y for all x.
8. Let X and Y be continuous random variables with joint density function h(x, y). Then X and Y are negatively likelihood ratio dependent, denoted by NLR(X,Y), if $h(x_1,y_1)h(x_2,y_2)\le h(x_1,y_2)h(x_2,y_1)$ for all $x_1, x_2, y_1, y_2\in I$ such that $x_1\le x_2$ and $y_1\le y_2$.

Now in the following theorems, we establish that the proposed copula $C_{\theta }$ satisfies all the aforementioned dependence properties. The detailed proofs are provided in Appendix B.

Theorem 6

Let X and Y be two random variables with copula $C_\theta $. Then (i) X and Y are LTI($Y\mid X$), (ii) X and Y are LTI($X\mid Y$), (iii) X and Y are RTD($Y\mid X$), and (iv) X and Y are RTD($X\mid Y$).

Theorem 7

Let X and Y be two random variables with copula $C_\theta $. Then (i) X and Y are SD($Y\mid X$), and (ii) X and Y are SD($X\mid Y$).

Theorem 8

Let X and Y be two random variables with copula $C_\theta $. Then X and Y are NLR.

Remark 3

Two random variables X and Y with copula $C_{\theta }$ are NQD. This directly follows from Theorem 8. See the interrelationships between different concepts of negative dependence summarised in (Balakrishnan and Lai 2009, p-130) for details.
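As an informal illustration of Theorem 7(i), the conditional survival function $P[V>v\mid U=u]=1-C_\theta (v\mid u)$ implied by (7) can be evaluated on a grid and checked for monotonicity in u. The Python sketch below is not part of the paper. The function names are hypothetical, and it assumes $C_\theta (v\mid u)=0$ for $v\le (1-u)\theta /(1+\theta )$, the lower end of the conditional support.

```python
def cond_survival_v_given_u(v: float, u: float, theta: float) -> float:
    """P[V > v | U = u] = 1 - C_theta(v | u), with C_theta(v | u) taken from (7)."""
    split = theta / (1.0 + theta)
    lower = (1.0 - u) * split          # assumed lower end of the support of V | U = u
    if v <= lower:
        return 1.0
    if v <= split:
        return (split * (1.0 - u) / v) ** theta
    return (1.0 + theta) * (1.0 - v) * (1.0 - u) ** theta


def is_nonincreasing_in_u(v: float, theta: float, grid: int = 200) -> bool:
    """Check on a grid of u values that P[V > v | U = u] never increases with u."""
    values = [cond_survival_v_given_u(v, i / grid, theta) for i in range(grid + 1)]
    return all(b <= a + 1e-12 for a, b in zip(values, values[1:]))


if __name__ == "__main__":
    for theta in (0.5, 2.0, 5.0):
        ok = all(is_nonincreasing_in_u(j / 50, theta) for j in range(1, 50))
        print(theta, ok)
```

Such a grid check is only a numerical illustration for selected parameter values; the formal argument is the one given in Appendix B.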

4 Ordering properties

In Sect. 3, several negative dependence properties of the proposed copula $C_\theta $ have been investigated for fixed $\theta >0$. In this section, we discuss the ordering properties of the proposed copula $C_\theta $, which provide a precise (and also intuitively expected) notion for one bivariate distribution being more positively or negatively associated than another. For this purpose, we first recall the definitions of the dependence orderings for bivariate distributions. These definitions describe the strength of dependence of a copula with respect to its dependence parameter $\theta $. Lehmann (1966) was the first to introduce the NQD and NRD notions. Following these notions, Yanagimoto and Okamoto (1969) introduced the ordering properties as defined below.

Definition 4

Let F and G be two bivariate distributions with the same marginals. Then F is said to be smaller than G in the NQD sense, denoted as $F\prec _{NQD}G$, if

$$\begin{aligned} F(x,y)\ge G(x,y)\,\,\,\,\,\,\forall x\,\, \mathrm{and}\,\, y. \end{aligned}$$

Definition 5

Let F and G be two bivariate distributions with the same marginals, and let (U, V) and (X, Y) be two random vectors having the distributions F and G, respectively. Then F is said to be smaller than G in the NRD sense, denoted by $F\prec _{NRD} G $ or $(U, V)\prec _{NRD} (X, Y)$, if for any $x\le x^{'}$,

$$\begin{aligned} F^{-1}_{V\mid U}(u\mid x)\ge F^{-1}_{V\mid U}(v\mid x^{'})\implies G^{-1}_{Y\mid X}(u\mid x)\ge G^{-1}_{Y\mid X}(v\mid x^{'}) \end{aligned}$$

for any $u, v \in I$, where $F_{V \mid U}$ denotes the conditional distribution of V given U and $F^{-1}_{V\mid U}$ denotes its right-continuous inverse. Equivalently, $F\prec _{NRD} G $ if and only if $G^{-1}_{Y\mid X}\left[ F_{V\mid U}(y\mid x)\mid x\right] $ is decreasing in x for all y (Fang and Joe 1992).

Later, Kimeldorf and Sampson (1987) have introduced and studied in detail the notion of the Negatively Likelihood Ratio dependence ordering that is described in the following definition. Let the random variables X and Y have the joint distribution G(x, y). For any two intervals $I_1$ and $I_2$ of the real line, let us denote $I_1\le I_2$ if $x_1\in I_1$ and $x_2\in I_2$ imply that $x_1\le x_2$. For any two intervals I and J of the real line let G(I, J) represent the probability assigned by G to the rectangle $I\times J$.

Definition 6

Let F and G be two bivariate distributions with the same marginals, and let (U, V) and (X, Y) be two random vectors having the distributions F and G, respectively. Then F is said to be smaller than G in the NLR dependence sense, denoted by $F \prec _{NLR} G$ or $(U, V) \prec _{NLR} (X, Y)$, if $ F(I_1, J_1)F(I_2, J_2) G(I_1, J_2)G(I_2, J_1)\ge F(I_1, J_2)F(I_2, J_1) G(I_1, J_1)G(I_2, J_2)$ whenever $I_1\le I_2$ and $J_1\le J_2$. When the densities of F and G exist and are denoted by f and g, respectively, the aforementioned condition can equivalently be written as $f(x_1, y_1)f(x_2, y_2) g(x_1, y_2)g(x_2, y_1)\ge f(x_1, y_2)f(x_2, y_1) g(x_1, y_1)g(x_2, y_2) $ whenever $x_1\le x_2$ and $y_1\le y_2$.

In the following theorems, we derive the sufficient conditions under which one bivariate distribution will be more negatively associated than another. The detailed proofs of the following theorems are presented in Appendix C.

Theorem 9

If $\theta _1\le \theta _2$, then $C_{\theta _1}(u,v)\prec _{NQD}C_{\theta _2}(u,v).$

Theorem 10

If $\theta _1\le \theta _2$, then $C_{\theta _1}(u,v)\prec _{NRD}C_{\theta _2}(u,v).$

Theorem 11

If $\theta _1\le \theta _2$, then $C_{\theta _1}(u,v)\prec _{NLR}C_{\theta _2}(u,v).$

5 Examples

Traditionally, bivariate life distributions available in the literature are positively correlated (Balakrishnan and Lai 2009). However, in many real life scenarios, paired observations of non-negative variables are negatively correlated (Bhuyan et al. 2020). For example, the rainfall intensity and the duration are jointly modeled incorporating their negative dependence for the study of the corresponding flood frequency distribution (Kurothe et al. 1997). Gumbel (1960) and Freund (1961) have proposed the bivariate Exponential distributions with lower bound of the correlation coefficient as $-0.4$. In this section, several specific families of bivariate distributions are generated using the proposed copula (3) with different choices for marginal distribution. For modelling purposes, the Lognormal, Weibull, and Gamma distributions are popular among practitioners in the fields of engineering, medical science, and environmental science (Sharma et al. 2016; Pobočíková et al. 2017; Ramos et al. 2019). We consider these choices as baseline distribution. We first define a bivariate Weibull and bivariate Gamma distribution. Then we consider a case when the marginals are different, one from the Lognormal and another from the Weibull family. It should be noted that the resulting bivariate distributions can be described implementing all notions of negative dependence discussed in Sects. 3 and 4.

Example 1

Bivariate Weibull distribution: A family of bivariate Weibull distributions based on the proposed copula $C_{\theta }$, with marginals $F(x) =\left[ 1 - e^{-(\lambda _1 x)^{\delta _1}}\right] \mathbbm {1}(x>0)$, and $G(y) = \left[ 1 - e^{-(\lambda _2 y)^{\delta _2}}\right] \mathbbm {1}(y>0)$, is given by

$$\begin{aligned}&h(x,y) \\&\quad = \left\{ \begin{array}{ll} \dfrac{\delta _1\delta _2\lambda _1^{\delta _1}\lambda _2^{\delta _2}\theta ^{\theta +1}}{(1+\theta )^{\theta }}x^{\delta _1 -1}y^{\delta _2 - 1}e^{-(\lambda _2 y)^{\delta _2}} \left( \dfrac{e^{-(\lambda _1 x)^{\delta _1}}}{1-e^{-(\lambda _2 y)^{\delta _2}}}\right) ^{1+\theta }, &0<y \le \phi _1, x>\phi _2(y) \\ \delta _1\delta _2\lambda _1^{\delta _1}\lambda _2^{\delta _2}(1+\theta )x^{\delta _1 -1}y^{\delta _2 - 1}e^{-(\lambda _2 y)^{\delta _2}}\left( e^{-(\lambda _1 x)^{\delta _1}}\right) ^{1+\theta }, & x>0, y>\phi _1 \end{array}\right. \end{aligned}$$

where $\phi _1 =\dfrac{1}{\lambda _2}\left[ \log (1+\theta )\right] ^{\frac{1}{\delta _2}}$, $\phi _2(y)= \dfrac{1}{\lambda _1}\left[ \log \left( \dfrac{\theta }{(1+\theta )(1 - e^{-(\lambda _2 y)^{\delta _2}})}\right) \right] ^{\frac{1}{\delta _1}}$, $\lambda _{i}>0$, $\delta _{i}>0$ for $i=1,2$.

Example 2

Bivariate Gamma distribution: A family of bivariate Gamma distributions based on the proposed copula $C_{\theta }$, with marginals $F(x)=\left[ \int _0^x\frac{1}{\Gamma (\alpha _1)}\beta _1^{\alpha _1}x^{\alpha _1 -1}e^{-\beta _1 x} \right] \mathbbm {1}(x>0)$, and $G(y) =\left[ \int _0^y\frac{1}{\Gamma (\alpha _2)}\beta _2^{\alpha _2}y^{\alpha _2 -1}e^{-\beta _2 y} \right] {\mathbbm {1}(y>0)}$, is given by

$$\begin{aligned}&h(x,y) \\&\quad = \left\{ \begin{array}{l} \dfrac{\beta _1^{\alpha _1}\beta _2^{\alpha _2}\theta ^{1+\theta }x^{\alpha _1 -1}y^{\alpha _2 -1}e^{-(\beta _1 x+\beta _2 y)}}{\Gamma (\alpha _1)\Gamma (\alpha _2)(1+\theta )^\theta }\left[ 1-\dfrac{\gamma _1(\alpha _1, \beta _1 x)}{\Gamma (\alpha _1)}\right] ^\theta \left[ \dfrac{\gamma _2(\alpha _2, \beta _2 y)}{\Gamma (\alpha _2)}\right] ^{-(1+\theta )}, \\ 0<y \le \xi _2 , \xi _1(y)<x< \eta \\ \dfrac{\beta _1^{\alpha _1}\beta _2^{\alpha _2}(1+\theta )}{\Gamma (\alpha _1)\Gamma (\alpha _2)}x^{\alpha _1 -1}y^{\alpha _2 -1}e^{-(\beta _1 x+\beta _2 y)}\left[ 1-\dfrac{\gamma _1(\alpha _1, \beta _1 x)}{\Gamma (\alpha _1)}\right] ^\theta ,\\ 0<x<\eta , \zeta _1<y<\zeta _2, \end{array}\right. \end{aligned}$$

where $\zeta _1=\gamma _2^{-1}\left( \dfrac{\theta }{1+\theta }\right) $, $\zeta _2 = \gamma _2^{-1}(\Gamma (\alpha _2))$, $\xi _2= \gamma _2^{-1}\left( \dfrac{\Gamma (\alpha _2)\theta }{1+\theta }\right) $, $\eta = \gamma _1^{-1}(\Gamma (\alpha _1))$, $\xi _1(y)= \gamma _1^{-1}\left[ \Gamma (\alpha _1) \left( 1-\dfrac{(1+\theta )\gamma _2(\alpha _2,\beta _2 y)}{\theta \Gamma (\alpha _2)}\right) \right] $, $\gamma _i(\alpha _i, \beta _i) = \int _0^{\beta _i} t^{\alpha _i - 1} e^{-t}dt$, $\alpha _{i}>0$, $\beta _{i}>0$ for ${i= 1,2}$.

Example 3

Bivariate Lognormal-Weibull distribution: A family of bivariate distribution with one marginal from Lognormal distribution and another from Weibull distribution based on the proposed copula $C_{\theta }$, with marginal distribution functions $F(x) =\frac{1}{2}\left[ 1 + erf\left( \frac{\ln {x}-\mu }{\sqrt{2}\sigma }\right) \right] \mathbbm {1}(x>0)$, and $G(y) = \left[ 1 - e^{-(\lambda y)^{\delta }}\right] {\mathbbm {1}(y>0)}$, is given by

$$\begin{aligned} h(x,y) = \left\{ \begin{array}{l} \dfrac{\delta \lambda ^\delta \theta ^{\theta +1}}{\sigma \sqrt{\pi } (1+\theta )^{\theta } 2^{\frac{2\theta +1}{2}}} \dfrac{y^{\delta -1}e^{-(\lambda y)^\delta }}{x\left( 1-e^{-(\lambda y)^\delta }\right) ^{1+\theta }}\left[ 1 - erf\left( \frac{\ln {x}-\mu }{\sqrt{2}\sigma }\right) \right] ^\theta , \\ \quad \quad 0<y \le \psi _1, \psi _2(y)<x<\psi _3 \\ \dfrac{\delta (1+\theta )\lambda ^{\delta }}{\sigma \sqrt{\pi }2^{\frac{2\theta +1}{2}} } \frac{y^{\delta -1}e^{-(\lambda y)^\delta }}{x}\left[ 1 - erf\left( \frac{\ln {x}-\mu }{\sqrt{2}\sigma }\right) \right] ^\theta , \\ \quad \quad x>0, y>\psi _1 \end{array}\right. \end{aligned}$$

where $\psi _1 =\frac{1}{\lambda }\left[ \log (1+\theta )\right] ^{\frac{1}{\delta }}$, $\psi _2(y)=\exp \left[ \mu +\sigma \sqrt{2}\,erf^{-1}\left\{ 1-\frac{2(1+\theta )}{\theta }\left( 1-e^{-(\lambda y)^\delta } \right) \right\} \right] $, $\psi _3=\mu +\sigma \sqrt{2}\,erf^{-1}\left( \frac{1}{2}\right) $, $\lambda >0$, $\delta >0$, $-\infty<\mu <\infty $, $\sigma >0$, and $erf(x)=\frac{2}{\sqrt{\pi }}\int _0^x e^{-t^2}dt$.

Remark 4

The bivariate Weibull (in Example 1) and the bivariate Gamma (in Example 2) reduce to bivariate Exponential distribution for $\delta _1=\delta _2 = 1$, and $\alpha _1=\alpha _2 = 1$, respectively.

6 Estimation methodology

In a classical parametric setting, a straightforward approach is to estimate the dependence parameter and the parameters associated with the marginals using the maximum likelihood method. This method is theoretically valid but there are some practical limitations. Firstly, the estimation of the dependence parameter $\theta $ depends on the parametric assumptions made on the marginals and the estimate of $\theta $ will be biased if the marginals are misspecified. The second drawback is computational, as the log-likelihood function involves a potentially large number of parameters and high-dimensional optimization is known to be challenging. See Hofert et al. (2018, Ch-4) for details. To avoid the aforementioned computational burden, Joe (1997) proposed a two-stage method known as inference function for margins (IFM). This estimation method is based on two separate maximum likelihood estimations of the univariate marginal distributions, followed by an optimization of the bivariate likelihood as a function of the dependence parameter. Similar to the maximum likelihood estimate, the estimate of $\theta $ based on IFM may be biased if the margins are partially misspecified (Hofert et al. 2018, p-136). Although the IFM has a computational edge, it is less efficient than the maximum likelihood estimate (Joe 2015, Ch-5).

We propose to use a method that is close in spirit to the method of inference function for margins but avoids the issue with misspecified marginals for the estimation of $\theta $. In contrast to IFM, we do not maximize the bivariate likelihood. Instead, we determine the dependence parameter using the method of moments (Hofert et al. 2018, p-141). The method of fitting a bivariate distribution with marginals $F_{\eta _{i}}(\cdot )$, indexed by parameter $\eta _{i}$ for $i=1,2$, involves the following steps:

(i)
Obtain the estimates ${\hat{\eta }}_{i}$ for $i=1,2$ using maximum likelihood method.
(ii)
Estimate of $\theta $ is given by ${\hat{\theta }}=\frac{-\tau _{n}}{1+\tau _{n}}$, or obtained by solving $\rho _{n} =\dfrac{2(3+3{\hat{\theta }}+{\hat{\theta }}^2)}{2+3{\hat{\theta }}+{\hat{\theta }}^2}-3$, where $\tau _{n}$ and $\rho _{n}$ are sample version of Kendall’s $\tau $ and Spearman’s $\rho $, respectively.
(iii)
Obtain the fitted bivariate distribution by substituting $F_{{\hat{\eta }}_{1}}(\cdot )$, $G_{{\hat{\eta }}_{2}}(\cdot )$, and ${\hat{\theta }}$ into (3).

These steps are easy to execute and familiar to practitioners in different fields of science. The method allows the copula to adequately approximate the dependence structure of the bivariate data, which is of prime concern from a practical point of view.
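Step (ii) can be sketched in a few lines of Python. This is only an illustration and not the authors' R programme: the function names are hypothetical, and the closed-form root used for Spearman's rho is obtained by rearranging the equation in step (ii) into $(1+\rho _{n}){\hat{\theta }}^2+3(1+\rho _{n}){\hat{\theta }}+2\rho _{n}=0$ and taking its positive solution.

```python
import math


def theta_from_tau(tau_n: float) -> float:
    """Moment estimate of theta from a sample Kendall's tau, as in step (ii)."""
    if not -1.0 < tau_n < 0.0:
        raise ValueError("expected a sample Kendall's tau in (-1, 0)")
    return -tau_n / (1.0 + tau_n)


def rho_of_theta(theta: float) -> float:
    """Spearman's rho as a function of theta, copied from the equation in step (ii)."""
    return 2.0 * (3.0 + 3.0 * theta + theta**2) / (2.0 + 3.0 * theta + theta**2) - 3.0


def theta_from_rho(rho_n: float) -> float:
    """Positive root of (1 + rho) t^2 + 3 (1 + rho) t + 2 rho = 0 (rearranged step (ii))."""
    if not -1.0 < rho_n < 0.0:
        raise ValueError("expected a sample Spearman's rho in (-1, 0)")
    return (-3.0 + math.sqrt(9.0 - 8.0 * rho_n / (1.0 + rho_n))) / 2.0


# Rounded sample coefficients reported in Section 7.1 of the paper.
print(theta_from_tau(-0.43))
print(theta_from_rho(-0.59))
```

With the rounded coefficients reported in Section 7.1, the two routes give estimates of about 0.754 and 0.765; the small gap is expected because the inputs are rounded to two decimals and the two moment equations need not agree exactly on a finite sample.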

7 Application

7.1 Exploratory data analysis

For an illustrative data analysis based on the proposed copula, we consider a data set of daily air quality measurements for 153 days in the New York Metropolitan Area, from May 1, 1973, to September 30, 1973. Information on average wind speed (in miles per hour) and mean ozone level (in parts per billion) was obtained from the New York State Department of Conservation and the National Weather Service, USA. This data set is named ‘airquality’ and is openly available in the R package ‘datasets’. See Chambers, Cleveland, Kleiner, and Tukey (1983, Ch 2–5) for a detailed description of the data. Ozone in the upper atmosphere protects the earth from the sun’s harmful rays. In the lower atmosphere, by contrast, exposure to ozone can be hazardous to both humans and some plants. Variations in weather conditions play an important role in determining ozone levels (Khiem et al. 2010; Topcu et al. 2003). In general, the ozone concentration is affected by wind speed. High winds tend to disperse pollutants, which in turn dilutes the ozone concentration. Stagnant conditions or light winds, however, allow pollution levels to build up, and the ozone level rises as a result. Environmental scientists and meteorologists are interested in the effect of wind speed on the distribution patterns of ozone levels (Gorai et al. 2015). For our analysis, we consider the 116 observations that remain after discarding missing values and present the scatter plot of average wind speed versus ozone level in Fig. 5a. It indicates strong negative dependence, and we find that the Spearman’s rho and Kendall’s tau coefficients are $-0.59$ and $-0.43$, respectively. Further, we apply the methodology proposed by Lu and Ghosh (2021), based on the Kolmogorov–Smirnov (KS), Anderson–Darling (AD), and Cramér–von Mises (CvM) discrepancy measures, to test the hypothesis that the true underlying copula satisfies the NQD property.
The p-values corresponding to the KS, AD, and CvM tests are 0.893, 0.571, and 0.861, respectively. None of the tests rejects the NQD hypothesis, which supports negative dependence between average wind speed and ozone level in the NQD sense.
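The two rank coefficients quoted above can be computed from scratch as in the following Python sketch. It is not the code used in the paper: the tie handling (average ranks for Spearman's rho, the tau-b variant for Kendall's tau) is an assumption, since the paper does not state how ties were treated, and the toy values are made up for illustration and are not taken from the ‘airquality’ data.

```python
from itertools import combinations


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau-b: pairs tied in only one variable enter the denominator."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # pairs tied in both variables are ignored
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    pairs = concordant + discordant
    return (concordant - discordant) / ((pairs + tied_x_only) * (pairs + tied_y_only)) ** 0.5


# Made-up toy values (NOT from the airquality data), for illustration only.
toy_wind = [5.7, 7.4, 9.7, 11.5, 14.9]
toy_ozone = [80.0, 45.0, 30.0, 35.0, 12.0]
print(spearman_rho(toy_wind, toy_ozone), kendall_tau_b(toy_wind, toy_ozone))
```

Applying the same two functions to the 116 complete wind and ozone pairs should give values close to the coefficients reported in the text, up to the treatment of ties.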

7.2 Modeling wind speed and ozone level

In engineering and environmental science, Lognormal, Weibull, and Gamma distributions are widely used for modeling wind speed recorded at the same location (Shepherd 1978; Monjean and Robyns 2015; Pobočíková et al. 2017; Dhiman et al. 2020; Ramadan et al. 2020). These distributions are also used for modeling the levels of various pollutants, including ozone (Sharma et al. 2016; Souza et al. 2018; Mishra et al. 2021). Therefore, we consider these three models for the marginal distributions of the wind speed and the mean ozone level. Based on the Akaike information criterion, the Gamma distribution fits both marginals better than the other two choices. The maximum likelihood estimates of the shape and scale parameters are 7.171 and 1.375, respectively, for the wind speed, and 1.7 and 24.775, respectively, for the mean ozone level. The estimate of the dependence parameter is ${\hat{\theta }}=0.765$. Therefore, the joint distribution of wind speed and mean ozone level is represented by the bivariate Gamma distribution provided in Example 2, which is presented graphically in Fig. 5b. Following Balakrishnan and Ristic (2016), we then use a bootstrap-based Kolmogorov–Smirnov test to check whether the Gamma distribution is a good fit for the marginals. We also evaluate the goodness of fit of the proposed copula based on the Kolmogorov–Smirnov statistic, using the bootstrap algorithm proposed by Genest et al. (2006). We find that the proposed model fits the data reasonably well. The R programme for the proposed estimation methodology is provided in the Supplementary material. In Fig. 5c we present the contour plot of the joint distribution of wind speed and mean ozone level. It indicates that the mean ozone level varies from 6 to 30 ppb when the wind speed is within 7–16 mph.
The estimated conditional distributions of the mean ozone level keeping the wind speed fixed at the empirical first decile (5.7 mph), median (9.7 mph), and ninth decile (14.9 mph) are presented in Fig. 5d. It is easy to see that the distribution of the mean ozone level decreases stochastically (in the sense of the usual stochastic order) as the wind speed increases. This visual representation of the regression dependence property indicates that the ozone level distributions below the level of 60 ppb differ significantly with wind speed. This can assist in formulating policies and guidelines to choose between locations to avoid health hazards related to high ozone levels.
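The stochastic ordering of the conditional distributions can be illustrated on the copula scale with the short Python sketch below. It uses the piecewise conditional copula $C_{\theta }(v\mid u)$ written out in Appendix C2 and the reported estimate ${\hat{\theta }}=0.765$. Two points are assumptions of this sketch: the conditional distribution function is taken to be zero below the lower boundary $v=(1-u)\theta /(1+\theta )$, and the levels $u=0.1, 0.5, 0.9$ merely stand in for the first decile, median, and ninth decile of the conditioning variable. The Gamma marginal transforms of Example 2 are not reproduced, so the output is in terms of uniform scores rather than mph and ppb.

```python
def conditional_cdf_v_given_u(v: float, u: float, theta: float) -> float:
    """P(V <= v | U = u) under the proposed copula (piecewise form of Appendix C2)."""
    if not 0.0 < u < 1.0 or theta <= 0.0:
        raise ValueError("need 0 < u < 1 and theta > 0")
    if v <= 0.0:
        return 0.0
    if v >= 1.0:
        return 1.0
    knot = theta / (1.0 + theta)
    lower = (1.0 - u) * knot
    if v <= lower:
        # Assumption of this sketch: no conditional mass below the lower boundary.
        return 0.0
    if v <= knot:
        return 1.0 - (lower / v) ** theta
    return 1.0 - (1.0 + theta) * (1.0 - v) * (1.0 - u) ** theta


theta_hat = 0.765  # dependence estimate reported in Section 7.2
v_grid = (0.2, 0.4, 0.6, 0.8)
for u_level in (0.1, 0.5, 0.9):
    row = [round(conditional_cdf_v_given_u(v, u_level, theta_hat), 3) for v in v_grid]
    print(u_level, row)
```

At every value of $v$, the conditional distribution function grows with $u$, which is the copula-scale counterpart of the statement that the conditional distribution decreases stochastically as the conditioning variable increases.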

8 Concluding remarks

We construct a new flexible bivariate copula for modeling negative dependence between two random variables. Its correlation coefficient takes any value in the interval $(-1,0)$, which was not the case for other copulas reported in the literature. It is important to note that the Spearman’s rho and the Kendall’s tau have a simple one-parameter form with negative values in the full range. The properties of the proposed copula are in agreement with most of the popular notions of negative dependence available in the literature, namely quadrant dependence, regression dependence, and likelihood ratio dependence. It is an interesting problem to consider a semi-parametric generalisation of the proposed copula and investigate its associated properties. Another possible direction of future research could be a multivariate extension of the proposed copula using the approaches considered by Fischer and Köck (2012) and Mazo et al. (2015).

For an illustrative data analysis based on the proposed copula, we consider a data set of daily air quality measurements for the New York Metropolitan Area. Based on the observed data, we find that wind speed and ozone level are strongly negatively dependent in the NQD sense. We consider three different models (Lognormal, Weibull, and Gamma distributions) for estimating the parameters of the marginal distributions of the wind speed and the mean ozone level. It is shown that the Gamma distribution fits both marginals better and that the distribution of the mean ozone level decreases stochastically (in the sense of the usual stochastic order) as the wind speed increases. The scope of the proposed copula goes far beyond this particular application. For example, biomedical researchers can use the proposed copula to study the negative association between BMI and glycated proteins (Espasandín-Domínguez et al. 2019). One can also extend the proposed copula to assess and model the nonlinear and asymmetric negative dependence over time in security and commodity markets (Liu et al. 2017).

References

Ahn JY (2015) Negative dependence concept in copulas and the marginal free herd behavior index. J Comput Appl Math 288:304–322
Amblard C, Girard S (2009) A new extension of bivariate FGM copulas. Metrika 70:1–17
Bairamov I, Kotz S (2000) Dependence structure and symmetry of Huang-Kotz FGM distributions and their extensions. Metrika 56:55–72
Bairamov I, Kotz S (2003) On a new family of positive quadrant dependent bivariate distributions. Int Math J 3(11):1247–1254
Balakrishnan N, Lai C (2009) Continuous bivariate distributions. Springer, New York
Balakrishnan N, Ristic MM (2016) Multivariate families of gamma-generated distributions with finite or infinite support above or below the diagonal. J Multivar Anal 143:194–207
Bekrizadeh H, Jamshidi B (2017) A new class of bivariate copulas: dependence measures and properties. Metron 75:31–50
Bekrizadeh H, Parham GA, Zadkarmi MR (2012) The new generalization of Farlie-Gumbel-Morgenstern copulas. Metrika 6:3527–3533
Bhuyan P, Ghosh S, Majumder P, Mitra M (2020) A bivariate life distribution and notions of negative dependence. Stat 9(1):1–11
Chambers JM, Cleveland WS, Kleiner B, Tukey PA (1983) Graphical methods for data analysis. Wadsworth & Brooks
Cooray K (2019) A new extension of the FGM copula for negative association. Commun Stat Theory Methods 48(8):1902–1919
Dhiman HS, Deb D, Balas VE (2020) Supervised machine learning in wind forecasting and ramp event prediction. Elsevier, Amsterdam
Dixit VU, Khandeparkar P (2017) Estimation of parameters of Skew Log Laplace distribution. Am J Math Manag Sci 36:277–291
Durante F, Foscolo E, Rodríguez-Lallena JA, Úbeda-Flores M (2012) A method for constructing higher-dimensional copulas. Statistics 46(3):387–404
Durante F, Sempi C (2015) Principles of copula theory. Chapman and Hall/CRC, Boca Raton
Esary JD, Lehmann EL (1972) Relationship among some concepts of bivariate dependence. Ann Math Stat 43:651–655
Espasandín-Domínguez J, Cadarso-Suárez C, Kneib T, Marra G, Klein N, Radice R, Gude F (2019) Assessing the relationship between markers of glycemic control through flexible copula regression models. Stat Med 38:5161–5181
Fang Z, Joe H (1992) Further developments on some dependence orderings for continuous bivariate distributions. Ann Inst Stat Math 44:501–517
Finkelstein M (2003) On one class of bivariate distributions. Stat Prob Lett 65:1–6
Fischer M, Köck C (2012) Constructing and generalizing given multivariate copulas: a unifying approach. Statistics 46:1–12
Fontaine C, Frostig R, Ombao H (2020) Modeling dependence via copula of functionals of Fourier coefficients. TEST. https://doi.org/10.1007/s11749-020-00703-5
Freund JE (1961) A bivariate extension of the exponential distribution. J Am Stat Assoc 56(296):971–977
Genest C, Quessy JF, Rémillard B (2006) Goodness-of-fit procedures for copula models based on the probability integral transformation. Scand J Stat 33(2):337–366
Gorai AK, Tuluri F, Huang H, Hayami H, Yoshikado H, Kawamoto Y (2015) Influence of local meteorology and $NO_{2}$ conditions on ground-level ozone concentrations in the eastern part of Texas, USA. Air Quality Atmosph Health 8(1):81–96
Gumbel EJ (1960) Bivariate exponential distributions. J Am Stat Assoc 55(292):698–707
Hofert M, Kojadinovic I, Machler M, Yan J (2018) Elements of copula modeling with R. Springer, New York
Hürlimann W (2015) A comprehensive extension of the FGM copula. Stat Pap 58:373–392
Joe H (1997) Multivariate models and dependence concepts. Chapman & Hall, London
Joe H (2015) Dependence modeling with copulas. CRC Press, Taylor & Francis Group, LLC, Boca Raton
Khiem M, Ooka R, Huang H, Hayami H, Yoshikado H, Kawamoto Y (2010) Analysis of the relationship between changes in meteorological conditions and the variation in summer ozone levels over the central Kanto area. Adv Meteorol. https://doi.org/10.1155/2010/349248
Kimeldorf G, Sampson AR (1987) Positive dependence orderings. Ann Inst Stat Math 39:113–128
Kurothe RS, Goel NK, Mathur BS (1997) Derived flood frequency distribution for negatively correlated rainfall intensity and duration. Water Resour Res 33:2103–2107
Lai CD, Xie M (2000) A new family of positive quadrant dependent bivariate distributions. Stat Probab Lett 46:359–364
Lehmann EL (1966) Some concepts of dependence. Ann Math Stat 37(5):1137–1153
Liu B, Ji Q, Fan Y (2017) A new time-varying optimal copula model identifying the dependence across markets. Quantitative Finance 17(3):437–453
Lu L, Ghosh SK (2022) Nonparametric estimation and testing for positive quadrant dependent bivariate copula. J Business Econ Stat 40(2):664–677
Mazo G, Girard S, Forbes F (2015) A class of multivariate copulas based on products of bivariate copulas. J Multivar Anal 140:363–376
Mishra G, Ghosh K, Dwivedi AK, Kumar M, Kumar S, Chintalapati S, Tripathi SN (2021) An application of probability density function for the analysis of PM2.5 concentration during the COVID-19 lockdown period. Sci Total Environ 782:146681
Mohtashami-Borzadaran V, Amini M, Ahmadi J (2019) On the properties of a reliability dependent model. In: Proceeding of the 5th seminar on reliability theory and its applications, Yazd, Iran, pp 256-265
Monjean P, Robyns B (2015) Eco-friendly innovations in electricity transmission and distribution networks. Elsevier, Amsterdam
Nelsen RB (2006) An introduction to copula. Springer, New York
Pobočíková I, Sedliačková Z, Michalková M (2017) Application of four probability distributions for wind speed modeling. Proc Eng 192:713–718
Ramadan A, Ebeed M, Kamel S, Nasrat L (2020) Optimal power flow for distribution systems with uncertainty. Elsevier, Amsterdam
Ramos PL, Nascimento DC, Ferreira PH, Weber KT, Santos TEG, Louzada F (2019) Modeling traumatic brain injury lifetime data: improved estimators for the generalized gamma distribution under small samples. PLoS ONE 14(8):e0221332
Sarmanov OV (1996) Generalized normal correlation and two-dimensional Fréchet classes. Dokl Akad Nauk SSSR 168:596–599
Scarsini M, Shaked M (1996) Positive dependence orders: a survey. In: Athens conference on applied probability and time series analysis, pp 70-91
Schucany WR, Parr WC, Boyer JE (1978) Correlation structure in Farlie–Gumbel–Morgenstern distributions. Biometrika 65(3):650–653
Sharma S, Sharma P, Khare M, Kwatra S (2016) Statistical behavior of ozone in urban environment. Sustain Environ Res 26(3):142–148
Shepherd DG (1978) Supervised machine learning in wind forecasting and ramp event prediction. Elsevier, Amsterdam
Souza A, Oliveira SSD, Aristone F, Olaofe OZ, Kumar SP, Arsić M, Razika I (2018) Modeling of the function of the ozone concentration distribution of surface to urban areas. Eur Chem Bull 7(3):98–105
Topcu S, Anteplioglu U, Incecik S (2003) Surface ozone concentrations and its relation to wind field in Istanbul. Water Air Soil Pollution Focus 3:53–60
Yanagimoto T (1972) Families of positively dependent random variables. Ann Math Stat 24:559–573
Yanagimoto T, Okamoto M (1969) Partial orderings of permutations and monotonicity of a rank correlation statistic. Ann Inst Stat Math 21:489–506

Acknowledgements

The authors are also thankful to Prof. Sujit Ghosh and Ms. Lu Lu for providing the R code for testing quadrant dependence. The work of Dr. Prajamitra Bhuyan was supported in part by the Lloyd’s Register Foundation programme on Data-Centric Engineering at the Alan Turing Institute, UK.

Author information

Authors and Affiliations

Department of Science and Mathematics, Indian Institute of Information Technology Guwahati, Guwahati, India
Shyamal Ghosh
School of Mathematical Sciences, Queen Mary University of London, London, UK
Prajamitra Bhuyan
The Alan Turing Institute, London, UK
Prajamitra Bhuyan
Department of Mathematical Statistics, University of the Free State, Bloemfontein, South Africa
Maxim Finkelstein
Department of Management Science, University of Strathclyde, Glasgow, UK
Maxim Finkelstein

Corresponding author

Correspondence to Prajamitra Bhuyan.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 53 KB)

Appendices

Appendix A

A1 Proof of Proposition 1

Case I For $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have

$$\begin{aligned} \dfrac{\partial C_\theta }{\partial \theta }= & {} \dfrac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1-u)^{(1+\theta )}v^{-\theta }\left[ \log \left( \dfrac{\theta }{1+\theta }\right) +\log (1-u)-\log (v) \right] \\\le & {} \dfrac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1-u)^{(1+\theta )}v^{-\theta }\left[ \log \left( \dfrac{\theta }{1+\theta }\right) +\log \left[ \dfrac{(1+\theta )v}{\theta }\right] -\log (v) \right] ,\\&\quad \text {since }(1-u)\le \dfrac{(1+\theta )v}{\theta }\\= &\, 0 \end{aligned}$$

Case II For $0<u<1$, and $\frac{\theta }{1+\theta }<v<1$, we have

$$\begin{aligned} \dfrac{\partial C_\theta }{\partial \theta }= (1-u)^{(1+\theta )}(1-v)\log (1-u)\le 0. \end{aligned}$$

Now combining Case I and II, we have $\dfrac{\partial C_\theta }{\partial \theta }\le 0$ for all $(u,v)\in I^2$, which implies $C_{\theta }$ is decreasing in $\theta $.
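The monotonicity in $\theta $ can also be checked numerically. The Python sketch below evaluates $C_{\theta }$ from the two closed-form pieces that appear in the proof of Proposition 3 (Appendix A3). It additionally assumes that $C_{\theta }(u,v)=0$ on the remaining region $u\le 1-\frac{(1+\theta )v}{\theta }$, which is consistent with the Case I expression vanishing on that boundary; the function names and the grid are illustrative choices, not part of the paper.

```python
def copula(u: float, v: float, theta: float) -> float:
    """Proposed copula C_theta(u, v) for 0 < u, v < 1 and theta > 0."""
    knot = theta / (1.0 + theta)
    w = 1.0 - u
    if v > knot:
        # Case II: theta / (1 + theta) < v < 1
        return u - (1.0 - v) * (1.0 - w ** (1.0 + theta))
    if w >= (1.0 + theta) * v / theta:
        # Assumed zero region: u <= 1 - (1 + theta) v / theta
        return 0.0
    # Case I: 0 < v <= theta / (1 + theta) and u > 1 - (1 + theta) v / theta
    k = theta**theta / (1.0 + theta) ** (1.0 + theta)
    return v - w + k * w ** (1.0 + theta) * v ** (-theta)


def is_nonincreasing_in_theta(thetas, grid, tol=1e-12):
    """Check C_theta(u, v) never increases along an increasing list of thetas."""
    for u in grid:
        for v in grid:
            values = [copula(u, v, t) for t in thetas]
            if any(later > earlier + tol for earlier, later in zip(values, values[1:])):
                return False
    return True


grid = [i / 20.0 for i in range(1, 20)]
thetas = [0.1, 0.5, 1.0, 2.0, 5.0, 10.0]
print(is_nonincreasing_in_theta(thetas, grid))
```

A grid check of this kind does not replace the proof, but it is a quick guard against transcription errors when the copula is coded for applications.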

A2 Proof of Proposition 2

Case I For $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have

$$\begin{aligned} \nabla ^2 C_\theta (u,v)= & \, \dfrac{\partial ^2 C_\theta (u,v)}{\partial u^2} + \dfrac{\partial ^2 C_\theta (u,v)}{\partial v^2}\\= & \, \dfrac{\theta ^{(1+\theta )}}{(1+\theta )^\theta }\left[ (1 - u)^{(\theta -1)} v^{-\theta } + (1-u)^{(1+\theta )}v^{-(2+\theta )}\right] \ge 0 \end{aligned}$$

Case II For $0<u<1$, and $\frac{\theta }{1+\theta }<v<1$, we have

$$\begin{aligned} \nabla ^2 C_\theta (u,v)= &\, \dfrac{\partial ^2 C_\theta (u,v)}{\partial u^2} + \dfrac{\partial ^2 C_\theta (u,v)}{\partial v^2}\\= &\, \theta (1 + \theta ) (1 - u)^{(\theta -1)} (1 - v)\ge 0 \end{aligned}$$

Now from Case I and II we can write $\nabla ^2 C_\theta (u,v) \ge 0$ for all $(u,v)\in I^2$, and hence the result follows.

A3 Proof of Proposition 3

To establish the absolute continuity of the proposed copula $C_{\theta }$, it is required to show

$$\begin{aligned} \int _0^u\int _0^v\dfrac{\partial ^2}{\partial s\partial t}C_\theta (s,t)dtds =C_\theta (u,v), \end{aligned}$$

for every $(u,v)\in I^2$.

Case I For $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have

$$\begin{aligned} \int _0^u\int _0^v\dfrac{\partial ^2}{\partial s\partial t}C_\theta (s,t)dtds= & {} \int _{1-\frac{(1+\theta )v}{\theta }}^u\int _{\frac{\theta (1-s)}{(1+\theta )}}^v\dfrac{\theta ^{1+\theta }}{(1+\theta )^\theta }(1-s)^\theta t^{-(1+\theta )}dtds\\= & {} \int _{1-\frac{(1+\theta )v}{\theta }}^u\left[ 1-\left( \frac{\theta }{1+\theta }\right) ^\theta (1-s)^\theta v^{-\theta }\right] ds\\= & {} \int _{1-u}^{\frac{(1+\theta )v}{\theta }}\left[ 1-\left( \frac{\theta }{1+\theta }\right) ^\theta z^\theta v^{-\theta }\right] dz \,\,\,(\mathrm{where}\,\, z = 1-s)\\= & {} v-(1-u)+\dfrac{\theta ^\theta }{(1+\theta )^{1+\theta }}(1-u)^{1+\theta }v^{-\theta }=C_\theta (u,v). \end{aligned}$$

Case II For $0<u<1$, and $\frac{\theta }{1+\theta }<v<1$, we have

$$\begin{aligned}& \int _0^u\int _0^v\dfrac{\partial ^2}{\partial s\partial t}C_\theta (s,t)dtds \\ &\quad = \int _0^u\int _{\frac{\theta (1-s)}{(1+\theta )}}^{\frac{\theta }{(1+\theta )}}\dfrac{\theta ^{1+\theta }}{(1+\theta )^\theta }(1-s)^\theta t^{-(1+\theta )}dtds\\ &\qquad + \int _0^u\int _{\frac{\theta }{(1+\theta )}}^v(1+\theta )(1-s)^\theta dtds\\&\quad = \int _0^u\left[ 1-(1-s)^\theta \right] ds + \int _0^u \left[ v-\theta (1 - v)\right] (1 - s)^\theta ds\\ &\quad = u-\dfrac{1}{1+\theta }+\dfrac{(1-u)^{\theta +1}}{\theta + 1}+ \left[ v-\theta (1 - v)\right] \dfrac{\left[ 1-(1-u)^{(\theta +1)}\right] }{1+\theta }\\ &\quad = u-(1-v)\left[ 1-(1-u)^{(\theta +1)}\right] =C_\theta (u,v). \end{aligned}$$

Therefore, the result follows by combining Case I and II.
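The identity proved above can be verified numerically by integrating the density used in the two cases and comparing the result with the closed-form copula. The Python sketch below does this with a simple midpoint rule. The function names, the number of nodes, and the test points are illustrative choices; the sketch also assumes that the density is zero for $t\le \frac{\theta (1-s)}{1+\theta }$, as the integration limits in the proof suggest.

```python
def copula(u: float, v: float, theta: float) -> float:
    """Closed-form C_theta(u, v) from Cases I and II (zero on the remaining region)."""
    knot = theta / (1.0 + theta)
    w = 1.0 - u
    if v > knot:
        return u - (1.0 - v) * (1.0 - w ** (1.0 + theta))
    if w >= (1.0 + theta) * v / theta:
        return 0.0
    k = theta**theta / (1.0 + theta) ** (1.0 + theta)
    return v - w + k * w ** (1.0 + theta) * v ** (-theta)


def inner_mass(s: float, v: float, theta: float, n: int) -> float:
    """Integral of the density over t in (0, v] for a fixed s."""
    knot = theta / (1.0 + theta)
    lower = knot * (1.0 - s)  # the density is taken to be zero below this point
    total = 0.0
    upper = min(v, knot)
    if upper > lower:
        # First piece of the density, integrated in t by the midpoint rule.
        h = (upper - lower) / n
        coef = theta ** (1.0 + theta) / (1.0 + theta) ** theta * (1.0 - s) ** theta
        total += h * sum(coef * (lower + (k + 0.5) * h) ** (-(1.0 + theta)) for k in range(n))
    if v > knot:
        # Second piece: (1 + theta)(1 - s)^theta does not depend on t.
        total += (v - knot) * (1.0 + theta) * (1.0 - s) ** theta
    return total


def integrated_density(u: float, v: float, theta: float, n: int = 300) -> float:
    """Midpoint-rule approximation of the double integral of the density over [0, u] x [0, v]."""
    h = u / n
    return h * sum(inner_mass((k + 0.5) * h, v, theta, n) for k in range(n))


theta = 0.765
for u, v in ((0.6, 0.8), (0.7, 0.3)):  # one point for Case II, one for Case I
    print(u, v, integrated_density(u, v, theta), copula(u, v, theta))
```

The two printed columns should agree up to the discretisation error of the midpoint rule.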

Appendix B

B1 Proof of Theorem 6

(i)
To establish LTI($Y\mid X$), it is sufficient to show that for any v in I, $\frac{C(u,v)}{u}$ is nondecreasing in u (Nelsen 2006, Theorem 5.2.5, p-192). For $0<u<1$, and $\frac{\theta }{1+\theta }<v<1$, we have
$$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{C(u,v)}{u}\right] =\dfrac{(1-v)[1-(1-u)^\theta (1+\theta u)]}{u^2}. \end{aligned}$$
Now we need to prove that $[1-(1-u)^\theta (1+\theta u)]>0$. Define $h(u):=(1-u)^\theta (1+\theta u)$. Observe that $h(0)=1$, $h(1)=0$, and h(u) is a decreasing function in u, since $h^{'}(u)=-\theta (1+\theta )u(1-u)^{(\theta -1)}<0$ for all $u\in (0,1)$. Hence $h(u)<1$ on $(0,1)$, and therefore $\frac{\partial }{\partial u}\left[ \frac{C(u,v)}{u}\right] >0$.

Similarly, for $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, it can be shown that
$$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{C(u,v)}{u}\right] =- \dfrac{\frac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1+\theta u)(1-u)^\theta +v^{(1+\theta )}-v^\theta }{ u^2v^\theta }>0. \end{aligned}$$
Hence, the result follows.
(ii)
In view of Theorem 5.2.5 in (Nelsen 2006, p-192), the necessary and sufficient condition for LTI($X\mid Y$) is that, $\frac{C(u,v)}{v}$ is nondecreasing in v, for any u in I.

For $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have
$$\begin{aligned} \dfrac{\partial }{\partial v}\left[ \dfrac{C(u,v)}{v}\right] = \dfrac{(u-1)\left[ \frac{\theta ^\theta }{(1+\theta )^{\theta }}(1-u)^\theta - v^{\theta }\right] }{v^{\theta +2}}\ge 0, \end{aligned}$$
since $(u-1)<0$ and $\left[ \frac{\theta ^\theta }{(1+\theta )^{\theta }}(1-u)^\theta - v^{\theta }\right] <0$.

Similarly, for $0<u<1$, and $\frac{\theta }{1+\theta }<v<1$, we have
$$\begin{aligned} \dfrac{\partial }{\partial v}\left[ \dfrac{C(u,v)}{v}\right] =\dfrac{(u-1)[(1-u)^\theta -1]}{v^2}\ge 0. \end{aligned}$$
Hence, the result follows.
(iii)
To establish RTD($Y\mid X$), it is sufficient to show that $\frac{v-C(u,v)}{(1-u)}$ is a nondecreasing function in u for any $v\in I$ (Nelsen 2006, Theorem 5.2.5, p-192).

For $0<v \le \frac{\theta }{1+\theta }$ and ${1-\frac{(1+\theta )v}{\theta }<u<1}$, we have
$$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{v-C(u,v)}{(1-u)}\right] = \left( \dfrac{\theta }{1+\theta }\right) ^{1+\theta }(1-u)^{\theta -1}v^{-\theta }>0. \end{aligned}$$

Similarly, for $0<u<1$, and $\dfrac{\theta }{1+\theta }<v<1$, we have
$$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{v-C(u,v)}{(1-u)}\right] = \theta (1-v)(1-u)^{\theta -1}>0. \end{aligned}$$
Hence, the conclusion follows.
(iv)
By Theorem 5.2.5 in (Nelsen 2006, p-192), RTD($X\mid Y$) holds if $\frac{u-C(u,v)}{(1-v)}$ is a nondecreasing function in v for any $u\in I$.

For $0<v \le \frac{\theta }{1+\theta }$ and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have
$$\begin{aligned} \dfrac{\partial }{\partial v}\left[ \dfrac{u-C(u,v)}{(1-v)}\right] =\dfrac{\frac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1-u)^{1+\theta }v^{-(1+\theta )}[\theta (1-v)-v]}{(1-v)^2}, \end{aligned}$$
which is non-negative, since $v \le \frac{\theta }{1+\theta }$.

Similarly, for any fixed $u\in I$, and $\frac{\theta }{1+\theta }<v<1$, $\frac{u-C(u,v)}{(1-v)}=1-(1-u)^{1+\theta }$ is a constant function in v. Hence, the result follows.

B2 Proof of Theorem 7

To establish SD($Y\mid X$) property of the proposed copula $C_{\theta }$, we utilise the geometric interpretation of the stochastic monotonicity given in Corollary 5.2.11 of (Nelsen 2006, p-197). Therefore, it is sufficient to show that $C_\theta (u,v)$ is a convex function of u. Similarly, SD($X\mid Y$) can be established by showing $C_\theta (u,v)$ is a convex function of v.

(i)
For $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have
$$\begin{aligned} \dfrac{\partial ^2}{\partial u^2}C_\theta (u,v) = \dfrac{\theta ^{(1+\theta )}}{(1+\theta )^\theta }(1-u)^{\theta -1}v^{-\theta }>0. \end{aligned}$$

For $0<u<1,$ and $\frac{\theta }{1+\theta }<v<1$, we have
$$\begin{aligned} \dfrac{\partial ^2}{\partial u^2}C_\theta (u,v) =\theta (1+\theta )(1-v)(1-u)^{(\theta -1)}> 0. \end{aligned}$$
Hence $C_\theta (u,v)$ is a convex function of u.
(ii)
For $0<v \le \frac{\theta }{1+\theta }$, and $1-\frac{(1+\theta )v}{\theta }<u<1$, we have
$$\begin{aligned} \dfrac{\partial ^2}{\partial v^2}C_\theta (u,v) ={ \dfrac{\theta ^{(1+\theta )}}{(1+\theta )^\theta }(1-u)^{1+\theta }v^{-(2+\theta )}}>0. \end{aligned}$$

Note that, for any fixed $u\in I$, and $\frac{\theta }{1+\theta }<v<1$, $\frac{\partial }{\partial v}C_\theta (u,v)$ is a constant function of v. Hence, the result follows.

B3 Proof of Theorem 8

To establish the NLR property between X and Y with copula $C_\theta $, we need to show that $c_\theta (u_1,v_1)c_\theta (u_2,v_2)\le c_\theta (u_1,v_2)c_\theta (u_2,v_1)$ holds for all $u_1\le u_2$ and $v_1\le v_2$, where $c_\theta (u,v)$ is the copula density given in (4). Note that for the proposed copula $C_{\theta }$, the aforementioned condition holds with equality for all $u_1\le u_2$ and $v_1\le v_2$ in I.

Appendix C

C1 Proof of Theorem 9

The results directly follow from Proposition 1.

C2 Proof of Theorem 10

Let $\theta _1\le \theta _2$. The conditional copula of V given $U=u$ is given by

$$\begin{aligned} C_{\theta _1}(v\mid u)= \left\{ \begin{array}{ll} 1- \dfrac{\theta _1^{\theta _1}}{(1+\theta _1)^{\theta _1}}(1-u)^{\theta _1} v^{-\theta _1}, &{} \dfrac{(1-u)\theta _1}{(1+\theta _1)}<v<\dfrac{\theta _1}{1+\theta _1} \\ 1-(1+\theta _1)(1-v)(1-u)^{\theta _1}, &{} \dfrac{\theta _1}{1+\theta _1}<v<1. \end{array}\right. \end{aligned}$$

Then $C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u)$ is given by

$$\begin{aligned}&C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u) \\&\quad = \left\{ \begin{array}{ll} \dfrac{\theta _2(1+\theta _1)^{(\theta _1/\theta _2)}}{(1+\theta _2)\theta _1^{(\theta _1/\theta _2)}}(1-u)^{1-{(\theta _1/\theta _2)}}v^{(\theta _1/\theta _2)}, &{} 0<v\le 1-(1-u)^{\theta _2} \\ 1-\dfrac{1+\theta _1}{1+\theta _2}(1-v)(1-u)^{(\theta _1-\theta _2)}, &{} 1-(1-u)^{\theta _2}<v< 1. \end{array}\right. \end{aligned}$$

Note that $ C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u)$ is a decreasing function in u as $\theta _1\le \theta _2$. Now, using Definition 5, the result follows.

C3 Proof of Theorem 11

Let $\theta _1\le \theta _2$. Now, it is easy to verify that the condition provided in Definition 6 holds for any choice of $u_1, u_2, v_1, v_2$, where $u_1\le u_2$, $v_1\le v_2$.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ghosh, S., Bhuyan, P. & Finkelstein, M. On a bivariate copula for modeling negative dependence: application to New York air quality data. Stat Methods Appl 31, 1329–1353 (2022). https://doi.org/10.1007/s10260-022-00636-3

Accepted: 29 March 2022
Published: 28 April 2022
Issue Date: December 2022
DOI: https://doi.org/10.1007/s10260-022-00636-3
