
On a bivariate copula for modeling negative dependence: application to New York air quality data

Abstract

In many practical scenarios, including finance, environmental sciences, and system reliability, it is often of interest to study various notions of negative dependence among the observed variables. A new bivariate copula is proposed for modeling negative dependence between two random variables that complies with most of the popular notions of negative dependence reported in the literature. Specifically, the Spearman’s rho and the Kendall’s tau for the proposed copula have a simple one-parameter form and take negative values over the full range \((-1,0)\). Some important ordering properties comparing the strength of negative dependence with respect to the parameter involved are considered. Simple examples of the corresponding bivariate distributions with popular marginals are presented. Application of the proposed copula is illustrated using a real data set on air quality in New York City, USA.

Introduction

Copulas provide an effective tool for modeling dependence in various multivariate phenomena in the fields of reliability engineering, life sciences, environmental science, economics and finance, etc (Fontaine et al. 2020; Cooray 2019; Joe 2015, Ch-7). Specifically, in recent decades, bivariate copulas were used to generate bivariate distributions with suitable dependence properties (Lai and Xie 2000; Bairamov and Kotz 2003; Finkelstein 2003; Durante et al. 2012; Mohtashami-Borzadaran et al. 2019). The detailed discussion of historical developments, obtained results and perspectives along with the up to date theory can be found in Durante and Sempi (2015) and Hofert et al. (2018). It should be noted that most copulas available in the literature possess some limitations in modeling negatively dependent data, which is a certain disadvantage, as negative dependence between vital variables is often encountered in real life.

Lehmann (1966) introduced several concepts of negative dependence for bivariate distributions. Later, Esary and Lehmann (1972) and Yanagimoto (1972) extended the corresponding definitions and developed stronger notions of bivariate negative dependence. See Balakrishnan and Lai (2009) for a detailed discussion of popular dependence notions and their applications in the context of continuous bivariate distributions. Scarsini and Shaked (1996) provided a detailed overview of the corresponding ordering properties for multivariate distributions. These results provide useful tools for describing the dependence properties of copulas with respect to a dependence parameter. However, only a few bivariate copulas that allow for a simple and meaningful analysis of this kind have been developed and studied in the literature so far. The Farlie–Gumbel–Morgenstern (FGM) family of distributions exhibits negative dependence, but the Spearman’s rho for this family lies within the interval \([-1/3,1/3]\) (Schucany et al. 1978). Bairamov and Kotz (2000) and Bekrizadeh et al. (2012) have considered the four-parameter and the three-parameter extensions of the FGM family proposed by Sarmanov (1996), with Spearman’s rho lying within the intervals \([-0.48, 0.50]\) and \([-0.5, 0.43]\), respectively. To address this issue, Amblard and Girard (2009) proposed another extension, but its application is limited because of a singular component that is concentrated on the corresponding diagonal. Some other extensions of the FGM copula are discussed in Ahn (2015) and Bekrizadeh and Jamshidi (2017). Hürlimann (2015) has proposed a comprehensive extension of the FGM copula with the Spearman’s rho and Kendall’s tau attaining any value in \((-1,1)\). However, the dependence and ordering properties of these copulas are not well studied in the literature. Recently, Cooray (2019) proposed a new extension of the FGM family which exhibits negative dependence among the variables in a very strong sense. However, its Spearman’s rho and Kendall’s tau are restricted to \([-0.70,0]\) and \([-0.52,0]\), respectively.

Thus, it is quite a challenging problem to construct a flexible bivariate copula with the correlation coefficient that takes any value in the interval \((-1,0)\). Moreover, it is not sufficient just to suggest this type of copula, but it is essential to describe its properties (including relevant stochastic comparisons) especially in the case of strong notions of dependence. In many real life scenarios, paired observations of non-negative variables possess strong negative dependence. For example, rainfall intensity and duration are jointly modeled incorporating their negative dependence for the study of derived flood frequency distribution (Kurothe et al. 1997). This paper is motivated by a real case study on air quality for New York Metropolitan area where the joint distribution of the wind speed and ozone level exhibits strong negative dependence (See Sect. 7). We believe that the current study meets to some extent this challenge, as we propose an absolutely continuous negatively dependent copula that satisfies most of the popular notions of negative dependence available in the literature with correlation coefficients in the interval \((-1,0)\).

The paper is organized as follows. In Sect. 2, we describe the baseline (for the proposed copula) distribution and discuss some basic properties including conditional distributions and correlation coefficients. Various notions of negative dependence in the context of the proposed copula and ordering properties are considered in Sects. 3 and 4, respectively. Section 5 provides some examples of negatively dependent standard bivariate distributions. The estimation methodologies are discussed in Sect. 6. As an illustration, we provide a real case study in Sect. 7. Finally, some concluding remarks are given in Sect. 8.

The bivariate copula

Bhuyan et al. (2020) proposed a negatively dependent bivariate life distribution that possesses tractable closed-form expressions for the joint distributions and exhibits various strong notions of negative dependence reported in the literature. Most importantly, the correlation coefficient may take any value in the interval \((-1,0)\). One of the marginal distributions is Exponential and the other belongs to the skew log Laplace family (Dixit and Khandeparkar 2017). We utilize the negative dependence structure inherent in this model and formulate a copula with strong negative dependence. The joint distribution function and the marginal distributions are given by

$$\begin{aligned} H(x,y) = \left\{ \begin{array}{ll} y^{\lambda } - e^{-\lambda x}+\dfrac{\lambda }{(\lambda +\mu )y^{\mu }}\left[ e^{-(\lambda +\mu )x}-y^{\lambda +\mu }\right] , & 0<y \le 1, x>-\log y \\ 1-e^{-\lambda x}-\dfrac{\lambda }{(\lambda +\mu )y^{\mu }}\left[ 1-e^{-(\lambda +\mu )x}\right] , & x>0, y>1, \end{array}\right. \end{aligned}$$
(1)

and \(F(x)= 1- e^{-\lambda x}\) for \(x>0\), and \(G(y) =\dfrac{\mu }{(\lambda +\mu )}y^{\lambda }\mathbbm {1}(0<y \le 1)+ \left[ 1-\dfrac{\lambda }{(\lambda +\mu )y^{\mu }}\right] \mathbbm {1}( y>1)\), respectively, where \(\lambda ,\mu >0\). Note that \(F(\cdot )\) and \(G(\cdot )\) are continuous. We first find the quasi-inverse functions of \(F(\cdot )\) and \(G(\cdot )\) and ‘put’ those into the arguments of the joint distribution function \(H(\cdot ,\cdot )\) given by (1). Then by Corollary 2.3.7 of Nelsen (2006, p-22), we obtain the following copula

$$\begin{aligned} C_{\lambda , \mu }(u,v) = \left\{ \begin{array}{l} v-(1-u)+\dfrac{\lambda \mu ^{\frac{\mu }{\lambda }}}{(\lambda +\mu )^{1+\frac{\mu }{\lambda }}}(1-u)^{{1+\frac{\mu }{\lambda }}}v^{-\frac{\mu }{\lambda }}, 0<v \le \dfrac{\mu }{\mu +\lambda }, 1-\dfrac{(\lambda +\mu )v}{\mu }<u<1, \\ u-(1-v)\left[ 1-(1-u)^{{1+\frac{\mu }{\lambda }}}\right] , 0<u<1, \dfrac{\mu }{\mu +\lambda }<v<1. \end{array}\right. \end{aligned}$$
(2)

Now using the reparameterization \(\mu =\theta \lambda \), in (2), we rewrite \(C_{\lambda , \mu }\) as

$$\begin{aligned} C_\theta (u,v) = \left\{ \begin{array}{l} v-(1-u)+\dfrac{\theta ^\theta }{(1+\theta )^{1+\theta }}(1-u)^{1+\theta }v^{-\theta }, 0<v \le \dfrac{\theta }{1+\theta }, 1-\dfrac{(1+\theta )v}{\theta }<u<1 \\ u-(1-v)\left[ 1-(1-u)^{1+\theta }\right] , 0<u<1, \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$
(3)

for \(\theta >0\). It is easy to verify that \(C_\theta (u,v)\), given by (3), satisfies the following conditions: (i) \(C_\theta (u,0)=0=C_\theta (0,v)\), (ii) \(C_\theta (u,1)=u\), \(C_\theta (1,v)=v\), for any u, v in \(I=[0,1]\), and (iii) \(C_\theta (u_2,v_2)-C_\theta (u_2, v_1)-C_\theta (u_1, v_2)+C_\theta (u_1,v_1)\ge 0\), for any \(u_1, u_2, v_1, v_2\) in I with \(u_1\le u_2\) and \(v_1\le v_2\). In Figs. 1 and 2, we provide graphical presentation of the proposed copula for different values of the dependence parameter \(\theta \).
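As a quick numerical sanity check of conditions (i) to (iii), the following Python sketch evaluates (3) on a grid. The function names `copula`, `rectangle_volume` and `check_copula_conditions` are hypothetical, and setting \(C_\theta \) to zero below the curve \(u=1-(1+\theta )v/\theta \) is an assumption made here (the first branch of (3) vanishes on that curve). A grid check of this kind only illustrates the conditions; it does not prove them.

```python
def copula(u, v, theta):
    """C_theta(u, v) from Eq. (3); taken to be 0 below the support (assumption)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    if u >= 1.0:
        return min(v, 1.0)
    if v >= 1.0:
        return u
    t = theta / (1.0 + theta)          # threshold separating the two branches
    if v <= t:
        if u <= 1.0 - v / t:           # 1 - (1 + theta) v / theta
            return 0.0
        k = theta**theta / (1.0 + theta)**(1.0 + theta)
        return v - (1.0 - u) + k * (1.0 - u)**(1.0 + theta) * v**(-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u)**(1.0 + theta))


def rectangle_volume(u1, u2, v1, v2, theta):
    """C-volume of the rectangle [u1, u2] x [v1, v2], as in condition (iii)."""
    return (copula(u2, v2, theta) - copula(u2, v1, theta)
            - copula(u1, v2, theta) + copula(u1, v1, theta))


def check_copula_conditions(theta, n=40, tol=1e-12):
    """Check (i)-(iii) on an (n + 1) x (n + 1) grid, up to rounding error."""
    grid = [i / n for i in range(n + 1)]
    for x in grid:
        if abs(copula(x, 0.0, theta)) > tol or abs(copula(0.0, x, theta)) > tol:
            return False               # condition (i) fails
        if abs(copula(x, 1.0, theta) - x) > tol or abs(copula(1.0, x, theta) - x) > tol:
            return False               # condition (ii) fails
    for i in range(n):
        for j in range(n):
            if rectangle_volume(grid[i], grid[i + 1], grid[j], grid[j + 1], theta) < -tol:
                return False           # condition (iii) fails on this cell
    return True
```

Calling `check_copula_conditions(theta)` for a few values of \(\theta \) gives a cheap regression test when re-implementing (3) in another language.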

Fig. 1 Graphical plots of \(C_{\theta }\) for different choices of \(\theta \) on the unit square

Fig. 2 Contour plots of \(C_{\theta }\) for different choices of \(\theta \)

The survival copula is the function \({\bar{C}}\) that couples the joint survival function to its marginal survival functions. It is easy to show that \({\bar{C}}\) is a copula and is related to the copula C via the equation \({\bar{C}}(u,v)= u+v-1 + C(1-u, 1-v)\). See Nelsen (2006, p-32) for details. The survival copula and the density function of the proposed copula \(C_\theta (u,v)\) are given by

$$\begin{aligned} {\bar{C}}_\theta (u,v) = \left\{ \begin{array}{ll} \dfrac{\theta ^\theta }{(1+\theta )^{1+\theta }}u^{1+\theta }(1-v)^{-\theta }, & \dfrac{1}{1+\theta } \le v<1, 0<u<\dfrac{(1+\theta )(1-v)}{\theta } \\ vu^{1+\theta }, & 0<u<1, 0<v<\dfrac{1}{1+\theta }, \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} c_\theta (u,v) = \left\{ \begin{array}{ll} \dfrac{\theta ^{1+\theta }}{(1+\theta )^\theta }(1-u)^\theta v^{-(1+\theta )}, & 0<v \le \dfrac{\theta }{1+\theta }, 1-\dfrac{(1+\theta )v}{\theta }<u<1 \\ (1+\theta )(1-u)^\theta , & 0<u<1, \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$
(4)

respectively.
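The density (4) is the mixed partial derivative of (3), which can be checked numerically. The sketch below is only an illustration under stated assumptions: the names `copula`, `density` and `mixed_difference` are hypothetical, both functions are set to zero below the curve \(u=1-(1+\theta )v/\theta \), and the evaluation points must lie at least `h` away from the branch boundaries for the central difference to be meaningful.

```python
def copula(u, v, theta):
    """C_theta(u, v) from Eq. (3) for 0 < u, v < 1; 0 below the support (assumption)."""
    t = theta / (1.0 + theta)
    if v <= t:
        if u <= 1.0 - v / t:
            return 0.0
        k = theta**theta / (1.0 + theta)**(1.0 + theta)
        return v - (1.0 - u) + k * (1.0 - u)**(1.0 + theta) * v**(-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u)**(1.0 + theta))


def density(u, v, theta):
    """c_theta(u, v) from Eq. (4) for 0 < u, v < 1; 0 below the support (assumption)."""
    t = theta / (1.0 + theta)
    if v <= t:
        if u <= 1.0 - v / t:
            return 0.0
        return (theta**(1.0 + theta) / (1.0 + theta)**theta
                * (1.0 - u)**theta * v**(-(1.0 + theta)))
    return (1.0 + theta) * (1.0 - u)**theta


def mixed_difference(u, v, theta, h=1e-3):
    """Central finite-difference approximation of d^2 C / (du dv) at (u, v)."""
    return (copula(u + h, v + h, theta) - copula(u + h, v - h, theta)
            - copula(u - h, v + h, theta) + copula(u - h, v - h, theta)) / (4.0 * h * h)


# Compare the finite difference with the closed form at one point in each branch.
for point in [(0.5, 0.8), (0.8, 0.3)]:
    print(point, mixed_difference(*point, 2.0), density(*point, 2.0))
```

The two printed columns should agree up to the discretisation error of the finite difference.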

Conditional copulas

The conditional copula of U given \(V=v\), is as follows. For \(0<v \le \frac{\theta }{(1+\theta )}\),

$$\begin{aligned} C_\theta (u\mid v)= 1- \frac{\theta ^{(1+\theta )}}{(1+\theta )^{(1+\theta )}}(1-u)^{(1+\theta )} v^{-(1+\theta )}, \,\,\,\,1-\dfrac{(1+\theta )v}{\theta }<u<1, \end{aligned}$$
(5)

whereas for \(\frac{\theta }{(1+\theta )}<v<1\),

$$\begin{aligned} C_\theta (u\mid v)= 1-(1-u)^{(1+\theta )}, \,\,\,\,\,0<u<1. \end{aligned}$$
(6)

The conditional mean and variance of \(U\mid V=v\) are given by

$$\begin{aligned} E[U\mid V=v]= \left\{ \begin{array}{ll} 1-\dfrac{(1+\theta )^2v}{\theta (\theta +2)}, & 0<v \le \dfrac{\theta }{1+\theta } \\ \dfrac{1}{\theta +2}, & \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$

and

$$\begin{aligned} Var[U\mid V=v]= \left\{ \begin{array}{ll} \dfrac{(1+\theta )^3v^2}{\theta ^2(\theta +2)^2(\theta +3)},& 0<v \le \dfrac{\theta }{1+\theta } \\ \dfrac{\theta +1}{(\theta +2)^2(\theta +3)},& \dfrac{\theta }{1+\theta }<v<1, \end{array}\right. \end{aligned}$$

respectively.

Remark 1

The regression of U on \(V=v\) is linearly decreasing in v for \(0<v\le \frac{\theta }{\theta +1}\), and independent of v for \(\frac{\theta }{\theta +1}<v<1\). Also, it is interesting to note that the conditional variance of \(U\mid V=v\) is an increasing function of v and bounded from above by \(\frac{\theta +1}{(\theta +2)^2(\theta +3)}.\)
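The conditional moments above can be recovered from (5) and (6) by a scaling argument. The following is a sketch of that step; the symbols \(a_v\) and W are notation introduced here for illustration and do not appear elsewhere in the paper. For \(0<v\le \frac{\theta }{1+\theta }\), put \(a_v=\frac{(1+\theta )v}{\theta }\). Then (5) gives \(P[1-U\le a_v w\mid V=v]=w^{1+\theta }\) for \(0<w<1\), so that, conditionally on \(V=v\), \(1-U\) is distributed as \(a_vW\), where W has density \((1+\theta )w^{\theta }\) on (0, 1). Consequently,

$$\begin{aligned} E[W]=\dfrac{1+\theta }{2+\theta }, \quad Var[W]=\dfrac{1+\theta }{3+\theta }-\dfrac{(1+\theta )^2}{(2+\theta )^2}=\dfrac{1+\theta }{(2+\theta )^2(3+\theta )}, \end{aligned}$$

and hence \(E[U\mid V=v]=1-a_vE[W]\) and \(Var[U\mid V=v]=a_v^2\,Var[W]\). For \(\frac{\theta }{1+\theta }<v<1\), (6) corresponds to the same representation with \(a_v=1\).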

The conditional copula of V given \(U=u\), is given by

$$\begin{aligned} C_\theta (v\mid u)= \left\{ \begin{array}{ll} 1- \dfrac{\theta ^\theta }{(1+\theta )^\theta }(1-u)^\theta v^{-\theta }, & \dfrac{(1-u)\theta }{(1+\theta )}<v \le \dfrac{\theta }{1+\theta } \\ 1-(1+\theta )(1-v)(1-u)^\theta , & \dfrac{\theta }{1+\theta }<v<1 \end{array}\right. \end{aligned}$$
(7)

The conditional mean and variance of \(V\mid U=u\), are given by

$$\begin{aligned} E[V\mid U=u]=\dfrac{(1-u)^\theta }{2(1-\theta )}-\dfrac{\theta ^2(1-u)}{1-\theta ^2}, \end{aligned}$$

for \(\theta \ne 1\), and

$$\begin{aligned} Var[V\mid U=u]= & {} -\dfrac{(1+\theta )(1-u)^\theta \left[ 2 - \theta + \theta ^2 (2 - 6 u) + 3 \theta ^3 u\right] }{3(\theta -2)(\theta ^2-1)^2}\\&+\dfrac{\theta ^3(1-u)^2}{(\theta -2)(\theta ^2-1)^2} -\dfrac{(1+\theta )^2(1-u)^{2\theta }}{4(\theta ^2-1)^2} \end{aligned}$$

for \(\theta \ne 1,2\), respectively.

Remark 2

The regression of V on \(U=u\) is strictly decreasing in u.
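Remark 2 can be illustrated numerically without relying on a closed form. The sketch below uses the identity \(E[V\mid U=u]=\int _0^1 P[V>v\mid U=u]\,dv\) together with (7), approximated by a midpoint rule. The function names are hypothetical, and taking the conditional distribution function to be zero for \(v\le (1-u)\theta /(1+\theta )\) is an assumption about the region where (7) is not stated.

```python
def cond_cdf_v_given_u(v, u, theta):
    """C_theta(v | u) from Eq. (7) for 0 < u < 1; 0 below the support (assumption)."""
    t = theta / (1.0 + theta)
    lower = (1.0 - u) * t                      # left end of the conditional support
    if v <= lower:
        return 0.0
    if v <= t:
        return 1.0 - (lower / v)**theta        # first branch of (7), rewritten
    return 1.0 - (1.0 + theta) * (1.0 - v) * (1.0 - u)**theta


def cond_mean_v_given_u(u, theta, n=20000):
    """Midpoint-rule approximation of the integral of P[V > v | U = u] over (0, 1)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        total += 1.0 - cond_cdf_v_given_u((i + 0.5) * h, u, theta)
    return total * h


# Regression of V on U = u for theta = 2 on a coarse grid of u values.
means = [cond_mean_v_given_u(u / 10.0, 2.0) for u in range(1, 10)]
print([round(m, 4) for m in means])
```

The printed sequence can be inspected for monotonicity in u; a finer grid or a proper quadrature routine would be needed for anything beyond illustration.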

One can use the conditional copula of U given \(V=v\), provided in (5) and (6), to simulate from the proposed copula \(C_{\theta }\), given by (3), using the following steps.

Step I. Simulate \(v_i\) and \(u_i^{*}\) independently from the standard uniform distribution.

Step II. If \(v_i\le \frac{\theta }{\theta +1}\), then solving \(C_{\theta }(u\mid v_i)=u_i^{*}\) from (5), we get \(u_i=1-(\frac{\theta +1}{\theta })v_i(1-u_i^{*})^{\frac{1}{1+\theta }}\); else, solving \(C_{\theta }(u\mid v_i)=u_i^{*}\) from (6), we get \(u_i=1-(1-u_i^{*})^{\frac{1}{1+\theta }}\).

Step III. Repeat Step I and Step II n times to obtain independently and identically distributed realizations \((u_i,v_i)\), for \(i=1,2,\ldots ,n\) from \(C_{\theta }\).
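Steps I to III can be sketched in a few lines of Python. This is not the R code from the Supplementary material; it is an independent illustration using only the standard library, and the names `simulate_copula` and `sample_kendall_tau` are hypothetical. The sample Kendall’s tau is computed by a naive pairwise count, which is adequate for small n only.

```python
import random


def simulate_copula(n, theta, rng):
    """Draw n pairs (u_i, v_i) following Steps I-III (conditional inversion)."""
    t = theta / (1.0 + theta)
    pairs = []
    for _ in range(n):
        v = rng.random()                          # Step I: v_i ~ U(0, 1)
        u_star = rng.random()                     # Step I: u_i* ~ U(0, 1)
        w = (1.0 - u_star)**(1.0 / (1.0 + theta))
        if v <= t:
            u = 1.0 - (v / t) * w                 # Step II, inverting (5)
        else:
            u = 1.0 - w                           # Step II, inverting (6)
        pairs.append((u, v))
    return pairs                                  # Step III: n i.i.d. pairs


def sample_kendall_tau(pairs):
    """Naive O(n^2) sample Kendall's tau (concordant minus discordant pairs)."""
    n = len(pairs)
    score = 0
    for i in range(n):
        ui, vi = pairs[i]
        for j in range(i + 1, n):
            uj, vj = pairs[j]
            prod = (ui - uj) * (vi - vj)
            score += (prod > 0) - (prod < 0)
    return 2.0 * score / (n * (n - 1))


rng = random.Random(2024)                         # fixed seed for reproducibility
sample = simulate_copula(500, 2.0, rng)
print(sample_kendall_tau(sample))
```

For a reasonably large sample, the printed value should lie near the population Kendall’s tau of \(C_\theta \) for the chosen \(\theta \), which offers a simple check that the sampler is wired correctly.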

A similar algorithm can be developed to simulate from \(C_{\theta }\) based on the conditional copula of V given U, provided in (7). The associated R code for the aforementioned algorithm is provided in the Supplementary material. Scatter plots based on 500 simulated observations using the aforementioned algorithm for four different values of \(\theta \) are given in Fig. 3. As expected, the data points get closer to the anti-diagonal \(v=1-u\) for higher values of \(\theta \).

Fig. 3 Scatter plots based on 500 simulated observations from \(C_{\theta }\) for different choices of \(\theta \)

Basic properties

In this subsection, we present three important propositions related to the proposed copula. The detailed proofs are presented in Appendix A.

Proposition 1

The copula \(C_{\theta }\), defined in (3), is decreasing with respect to its dependence parameter \(\theta \), i.e., if \(\theta _1\le \theta _2\) then \(C_{\theta _2}(u,v)\le C_{\theta _1}(u,v)\), for all \((u,v) \in I^2 = [0,1]\times [0,1]\).

Proposition 2

The copula \(C_{\theta }\), defined in (3), is sub-harmonic, i.e., \(\nabla ^2 C_\theta (u,v) \ge 0\).

Proposition 3

The copula \(C_{\theta }\), defined in (3), is absolutely continuous.

Measures of dependence

Measures of dependence are commonly used to summarize the complicated dependence structure of bivariate distributions. See Joe (1997, Ch-2), Nelsen (2006, Ch-5) and Hofert et al. (2018, Ch-2) for a detailed review of measures of dependence and their associated properties. In this section, we derive the expressions of the Kendall’s tau and the Spearman’s rho for the proposed copula \(C_\theta \). Essentially, these coefficients measure the correlation between the ranks rather than the actual values of X and Y. Therefore, these coefficients are unaffected by any monotonically increasing transformation of X and Y.

Definition 1

Let X and Y be the continuous random variables with the dependence structure described by the copula C. Then the population version of the Spearman’s rho for X and Y is given by

$$\begin{aligned} \rho :=12\int _0^1\int _0^1uv\,dC(u,v)-3= 12\int _0^1\int _0^1C(u,v)\,du\,dv-3. \end{aligned}$$

Proposition 4

Let (X, Y) be a random pair with copula \(C_\theta \). Then the Spearman’s rho is given by

$$\begin{aligned} \rho =\dfrac{2(3+3\theta +\theta ^2)}{2+3\theta +\theta ^2}-3, \end{aligned}$$

which is a decreasing function of \(\theta \) and takes every value between \(-1\) and 0.

Definition 2

Let X and Y be the continuous random variables with copula C. Then, the population version of the Kendall’s tau for X and Y is given by

$$\begin{aligned} \tau := 4\int _0^1\int _0^1C(u,v)dC(u,v)-1 \end{aligned}$$

Proposition 5

Let (X, Y) be a random pair with copula \(C_\theta \). Then the Kendall’s tau is given by

$$\begin{aligned} \tau =\dfrac{-\theta }{(1+\theta )}, \end{aligned}$$

which is a decreasing function of \(\theta \) and takes every value between \(-1\) and 0.

In Fig. 4, we have plotted the Spearman’s rho and the Kendall’s tau against the dependence parameter \(\theta \). It is easy to see that the Spearman’s rho is less than the Kendall’s tau for all \(\theta >0\).
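The closed forms in Propositions 4 and 5 are easy to tabulate. The short Python sketch below does so for a few illustrative values of \(\theta \); the function names `spearman_rho` and `kendall_tau` and the chosen grid of \(\theta \) values are hypothetical choices made for this example.

```python
def spearman_rho(theta):
    """Spearman's rho of C_theta, as stated in Proposition 4."""
    return 2.0 * (3.0 + 3.0 * theta + theta**2) / (2.0 + 3.0 * theta + theta**2) - 3.0


def kendall_tau(theta):
    """Kendall's tau of C_theta, as stated in Proposition 5."""
    return -theta / (1.0 + theta)


# Tabulate both coefficients; rho should sit below tau for every theta > 0.
for theta in (0.1, 0.5, 1.0, 2.0, 10.0, 100.0):
    rho, tau = spearman_rho(theta), kendall_tau(theta)
    print(f"theta={theta:7.1f}  rho={rho:8.4f}  tau={tau:8.4f}  rho<tau: {rho < tau}")
```

Both coefficients approach 0 as \(\theta \) tends to 0 and approach \(-1\) as \(\theta \) grows, in line with the ranges stated in the propositions.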

Fig. 4 Plot of Spearman’s rho and Kendall’s tau against the dependence parameter \(\theta \)

Connections with notions of negative dependence

As discussed in Sect. 2.3, the Spearman’s rho and the Kendall’s tau measure the correlation between two random variables. However, two random variables may be strongly correlated and yet only weakly associated with respect to other notions of dependence, or vice versa. In this section, we discuss several relevant notions of negative dependence, including Quadrant Dependence, Regression Dependence and Likelihood Ratio Dependence, and explore whether the corresponding properties are satisfied by the proposed copula. First, we provide the definitions of the aforementioned dependence notions as discussed in Nelsen (2006) and Balakrishnan and Lai (2009).

Definition 3

Let X and Y be continuous random variables with copula C. Then

  1. X and Y are Negatively Quadrant Dependent (NQD) if \(P(X \le x, Y \le y) \le {P(X \le x)P(Y \le y)}\), for all \((x, y) \in R^2\), where \(R^2\) is the domain of the joint distribution of X and Y, or equivalently a copula C is said to be NQD if for all \((u,v)\in I^2\), \(C(u,v)\le uv.\)

  2. Y is left tail increasing in X (LTI(\(Y\mid X\))), if \(P[Y \le y \mid X \le x]\) is a nondecreasing function of x for all y.

  3. X is left tail increasing in Y (LTI(\(X\mid Y\))), if \(P[X \le x \mid Y \le y]\) is a nondecreasing function of y for all x.

  4. Y is right tail decreasing in X (RTD(\(Y\mid X\))), if \(P[Y> y \mid X > x]\) is a nonincreasing function of x for all y.

  5. X is right tail decreasing in Y (RTD(\(X\mid Y\))), if \(P[X>x \mid Y > y]\) is a nonincreasing function of y for all x.

  6. Y is stochastically decreasing in X, denoted as SD(\(Y\mid X\)) (also known as negatively regression dependent, NRD(\(Y\mid X\))), if \(P[Y > y \mid X = x]\) is a nonincreasing function of x for all y.

  7. X is stochastically decreasing in Y, denoted as SD(\(X\mid Y\)) (also known as negatively regression dependent, NRD(\(X\mid Y\))), if \(P[X > x\mid Y = y]\) is a nonincreasing function of y for all x.

  8. Let X and Y be continuous random variables with joint density function h(x, y). Then X and Y are negatively likelihood ratio dependent, denoted by NLR(X, Y), if \(h(x_1,y_1)h(x_2,y_2)\le h(x_1,y_2)h(x_2,y_1)\) for all \(x_1, x_2, y_1, y_2\in I\) such that \(x_1\le x_2\) and \(y_1\le y_2\).

Now in the following theorems, we establish that the proposed copula \(C_{\theta }\) satisfies all the aforementioned dependence properties. The detailed proofs are provided in Appendix B.

Theorem 6

Let X and Y be two random variables with copula \(C_\theta \). Then (i) X and Y are LTI(\(Y\mid X\)), (ii) X and Y are LTI(\(X\mid Y\)), (iii) X and Y are RTD(\(Y\mid X\)), and (iv) X and Y are RTD(\(X\mid Y\)).

Theorem 7

Let X and Y be two random variables with copula \(C_\theta \). Then (i) X and Y are SD(\(Y\mid X\)), and (ii) X and Y are SD(\(X\mid Y\)).

Theorem 8

Let X and Y be two random variables with copula \(C_\theta \). Then X and Y are NLR.

Remark 3

Two random variables X and Y with copula \(C_{\theta }\) are NQD. This directly follows from Theorem 8. See the interrelationships between different concepts of negative dependence summarised in (Balakrishnan and Lai 2009, p-130) for details.
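The NLR inequality of Definition 3 can be probed directly on the copula density (4). The Python sketch below is a grid-based illustration, not a substitute for the proof in Appendix B. The names `density` and `nlr_holds_on_grid` are hypothetical, the density is taken to be zero below the curve \(u=1-(1+\theta )v/\theta \) (an assumption about the region where (4) is not stated), and a small tolerance absorbs floating-point rounding.

```python
def density(u, v, theta):
    """c_theta(u, v) from Eq. (4) for 0 < u, v < 1; 0 below the support (assumption)."""
    t = theta / (1.0 + theta)
    if v <= t:
        if u <= 1.0 - v / t:
            return 0.0
        return (theta**(1.0 + theta) / (1.0 + theta)**theta
                * (1.0 - u)**theta * v**(-(1.0 + theta)))
    return (1.0 + theta) * (1.0 - u)**theta


def nlr_holds_on_grid(theta, n=25, rel_tol=1e-9, abs_tol=1e-12):
    """Check c(u1,v1) c(u2,v2) <= c(u1,v2) c(u2,v1) for u1 <= u2, v1 <= v2 on a grid."""
    grid = [(i + 0.5) / n for i in range(n)]      # cell midpoints, strictly inside (0, 1)
    for a, u1 in enumerate(grid):
        for u2 in grid[a:]:
            for b, v1 in enumerate(grid):
                for v2 in grid[b:]:
                    lhs = density(u1, v1, theta) * density(u2, v2, theta)
                    rhs = density(u1, v2, theta) * density(u2, v1, theta)
                    if lhs > rhs * (1.0 + rel_tol) + abs_tol:
                        return False
    return True


print(nlr_holds_on_grid(0.5), nlr_holds_on_grid(2.0))
```

A failure on such a grid would indicate an implementation error in the density, whereas success is merely consistent with Theorem 8.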

Ordering properties

In Sect. 3, several negative dependence properties of the proposed copula \(C_\theta \) have been investigated for a fixed \(\theta >0\). In this section, we discuss the ordering properties of the proposed copula \(C_\theta \), which provide a precise (and also intuitively expected) notion of one bivariate distribution being more positively or negatively associated than another. For this purpose, we first recall the definitions of the dependence orderings for bivariate distributions. These definitions describe the strength of dependence of a copula with respect to its dependence parameter \(\theta \). Lehmann (1966) was the first to introduce the NQD and NRD notions. Following these notions, Yanagimoto and Okamoto (1969) introduced the ordering properties as defined below.

Definition 4

Let F and G be two bivariate distributions with the same marginals. Then F is said to be smaller than G in the NQD sense, denoted as \(F\prec _{NQD}G\), if

$$\begin{aligned} F(x,y)\ge G(x,y)\,\,\,\,\,\,\forall x\,\, \mathrm{and}\,\, y. \end{aligned}$$

Definition 5

Let F and G be two bivariate distributions with the same marginals, and let (U, V) and (X, Y) be two random vectors having the distributions F and G, respectively. Then F is said to be smaller than G in the NRD sense, denoted by \(F\prec _{NRD} G \) or \((U, V)\prec _{NRD} (X, Y)\), if for any \(x\le x^{'}\),

$$\begin{aligned} F^{-1}_{V\mid U}(u\mid x)\ge F^{-1}_{V\mid U}(v\mid x^{'})\implies G^{-1}_{Y\mid X}(u\mid x)\ge G^{-1}_{Y\mid X}(v\mid x^{'}) \end{aligned}$$

for any \(u, v \in I\), where \(F_{V \mid U}\) denotes the conditional distribution of V given U and \(F^{-1}_{V\mid U}\) denotes its right-continuous inverse, with \(G_{Y\mid X}\) and \(G^{-1}_{Y\mid X}\) defined analogously. Equivalently, \(F\prec _{NRD} G \) if and only if \(G^{-1}_{Y\mid X}\left[ F_{V\mid U}(y\mid x)\mid x\right] \) is decreasing in x for all y (Fang and Joe 1992).

Later, Kimeldorf and Sampson (1987) introduced and studied in detail the notion of the Negatively Likelihood Ratio dependence ordering that is described in the following definition. Let the random variables X and Y have the joint distribution G(x, y). For any two intervals \(I_1\) and \(I_2\) of the real line, let us write \(I_1\le I_2\) if \(x_1\in I_1\) and \(x_2\in I_2\) imply that \(x_1\le x_2\). For any two intervals I and J of the real line, let G(I, J) represent the probability assigned by G to the rectangle \(I\times J\).

Definition 6

Let F and G be two bivariate distributions with the same marginals, and let (U, V) and (X, Y) be two random vectors having the distributions F and G, respectively. Then F is said to be smaller than G in the NLR dependence sense, denoted by \(F \prec _{NLR} G\) or \((U, V) \prec _{NLR} (X, Y)\), if \( F(I_1, J_1)F(I_2, J_2) G(I_1, J_2)G(I_2, J_1)\ge F(I_1, J_2)F(I_2, J_1) G(I_1, J_1)G(I_2, J_2)\) whenever \(I_1\le I_2\) and \(J_1\le J_2\). When the densities of F and G exist and are denoted by f and g, respectively, the aforementioned condition can equivalently be written as \(f(x_1, y_1)f(x_2, y_2) g(x_1, y_2)g(x_2, y_1)\ge f(x_1, y_2)f(x_2, y_1) g(x_1, y_1)g(x_2, y_2) \) whenever \(x_1\le x_2\) and \(y_1\le y_2\).

In the following theorems, we derive the sufficient conditions under which one bivariate distribution will be more negatively associated than another. The detailed proofs of the following theorems are presented in Appendix C.

Theorem 9

If \(\theta _1\le \theta _2\), then \(C_{\theta _1}(u,v)\prec _{NQD}C_{\theta _2}(u,v).\)

Theorem 10

If \(\theta _1\le \theta _2\), then \(C_{\theta _1}(u,v)\prec _{NRD}C_{\theta _2}(u,v).\)

Theorem 11

If \(\theta _1\le \theta _2\), then \(C_{\theta _1}(u,v)\prec _{NLR}C_{\theta _2}(u,v).\)
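The NQD ordering of Theorem 9 amounts, by Definition 4, to the pointwise inequality \(C_{\theta _1}(u,v)\ge C_{\theta _2}(u,v)\) for \(\theta _1\le \theta _2\), which can be checked on a grid. The following Python sketch does this under stated assumptions: the names `copula` and `nqd_ordered_on_grid` are hypothetical, and \(C_\theta \) is set to zero below the curve \(u=1-(1+\theta )v/\theta \), where the first branch of (3) vanishes. The NRD and NLR orderings of Theorems 10 and 11 are not covered by this check.

```python
def copula(u, v, theta):
    """C_theta(u, v) from Eq. (3) for 0 < u, v < 1; 0 below the support (assumption)."""
    t = theta / (1.0 + theta)
    if v <= t:
        if u <= 1.0 - v / t:
            return 0.0
        k = theta**theta / (1.0 + theta)**(1.0 + theta)
        return v - (1.0 - u) + k * (1.0 - u)**(1.0 + theta) * v**(-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u)**(1.0 + theta))


def nqd_ordered_on_grid(theta1, theta2, n=50, tol=1e-12):
    """Check C_theta1 >= C_theta2 pointwise on the interior grid points."""
    for i in range(1, n):
        for j in range(1, n):
            u, v = i / n, j / n
            if copula(u, v, theta1) < copula(u, v, theta2) - tol:
                return False
    return True


# Increasing theta should push the copula downwards everywhere on the grid.
thetas = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
print(all(nqd_ordered_on_grid(a, b) for a, b in zip(thetas, thetas[1:])))
```

The same loop with the product \(uv\) in place of \(C_{\theta _1}\) gives a grid check of the NQD property itself.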

Examples

Traditionally, bivariate life distributions available in the literature are positively correlated (Balakrishnan and Lai 2009). However, in many real life scenarios, paired observations of non-negative variables are negatively correlated (Bhuyan et al. 2020). For example, the rainfall intensity and the duration are jointly modeled incorporating their negative dependence for the study of the corresponding flood frequency distribution (Kurothe et al. 1997). Gumbel (1960) and Freund (1961) have proposed the bivariate Exponential distributions with lower bound of the correlation coefficient as \(-0.4\). In this section, several specific families of bivariate distributions are generated using the proposed copula (3) with different choices for marginal distribution. For modelling purposes, the Lognormal, Weibull, and Gamma distributions are popular among practitioners in the fields of engineering, medical science, and environmental science (Sharma et al. 2016; Pobočíková et al. 2017; Ramos et al. 2019). We consider these choices as baseline distribution. We first define a bivariate Weibull and bivariate Gamma distribution. Then we consider a case when the marginals are different, one from the Lognormal and another from the Weibull family. It should be noted that the resulting bivariate distributions can be described implementing all notions of negative dependence discussed in Sects. 3 and 4.

Example 1

Bivariate Weibull distribution: A family of bivariate Weibull distributions based on the proposed copula \(C_{\theta }\), with marginals \(F(x) =\left[ 1 - e^{-(\lambda _1 x)^{\delta _1}}\right] \mathbbm {1}(x>0)\), and \(G(y) = \left[ 1 - e^{-(\lambda _2 y)^{\delta _2}}\right] \mathbbm {1}(y>0)\), is given by

$$\begin{aligned}&h(x,y) \\&\quad = \left\{ \begin{array}{ll} \dfrac{\delta _1\delta _2\lambda _1^{\delta _1}\lambda _2^{\delta _2}\theta ^{\theta +1}}{(1+\theta )^{\theta }}x^{\delta _1 -1}y^{\delta _2 - 1}e^{-(\lambda _2 y)^{\delta _2}} \left( \dfrac{e^{-(\lambda _1 x)^{\delta _1}}}{1-e^{-(\lambda _2 y)^{\delta _2}}}\right) ^{1+\theta }, &0<y \le \phi _1, x>\phi _2(y) \\ \delta _1\delta _2\lambda _1^{\delta _1}\lambda _2^{\delta _2}(1+\theta )x^{\delta _1 -1}y^{\delta _2 - 1}e^{-(\lambda _2 y)^{\delta _2}}\left( e^{-(\lambda _1 x)^{\delta _1}}\right) ^{1+\theta }, & x>0, y>\phi _1 \end{array}\right. \end{aligned}$$

where \(\phi _1 =\dfrac{1}{\lambda _2}\left[ \log (1+\theta )\right] ^{\frac{1}{\delta _2}}\), \(\phi _2(y)= \dfrac{1}{\lambda _1}\left[ \log \left( \dfrac{\theta }{(1+\theta )(1 - e^{-(\lambda _2 y)^{\delta _2}})}\right) \right] ^{\frac{1}{\delta _1}}\), \(\lambda _{i}>0\), \(\delta _{i}>0\) for \(i=1,2\).

Example 2

Bivariate Gamma distribution: A family of bivariate Gamma distributions based on the proposed copula \(C_{\theta }\), with marginals \(F(x)=\left[ \int _0^x\frac{1}{\Gamma (\alpha _1)}\beta _1^{\alpha _1}t^{\alpha _1 -1}e^{-\beta _1 t}\,dt \right] \mathbbm {1}(x>0)\), and \(G(y) =\left[ \int _0^y\frac{1}{\Gamma (\alpha _2)}\beta _2^{\alpha _2}t^{\alpha _2 -1}e^{-\beta _2 t}\,dt \right] {\mathbbm {1}(y>0)}\), is given by

$$\begin{aligned}&h(x,y) \\&\quad = \left\{ \begin{array}{l} \dfrac{\beta _1^{\alpha _1}\beta _2^{\alpha _2}\theta ^{1+\theta }x^{\alpha _1 -1}y^{\alpha _2 -1}e^{-(\beta _1 x+\beta _2 y)}}{\Gamma (\alpha _1)\Gamma (\alpha _2)(1+\theta )^\theta }\left[ 1-\dfrac{\gamma _1(\alpha _1, \beta _1 x)}{\Gamma (\alpha _1)}\right] ^\theta \left[ \dfrac{\gamma _2(\alpha _2, \beta _2 y)}{\Gamma (\alpha _2)}\right] ^{-(1+\theta )}, \\ 0<y \le \xi _2 , \xi _1(y)<x< \eta \\ \dfrac{\beta _1^{\alpha _1}\beta _2^{\alpha _2}(1+\theta )}{\Gamma (\alpha _1)\Gamma (\alpha _2)}x^{\alpha _1 -1}y^{\alpha _2 -1}e^{-(\beta _1 x+\beta _2 y)}\left[ 1-\dfrac{\gamma _1(\alpha _1, \beta _1 x)}{\Gamma (\alpha _1)}\right] ^\theta ,\\ 0<x<\eta , \zeta _1<y<\zeta _2, \end{array}\right. \end{aligned}$$

where \(\zeta _1=\gamma _2^{-1}\left( \dfrac{\Gamma (\alpha _2)\theta }{1+\theta }\right) \), \(\zeta _2 = \gamma _2^{-1}(\Gamma (\alpha _2))\), \(\xi _2= \gamma _2^{-1}\left( \dfrac{\Gamma (\alpha _2)\theta }{1+\theta }\right) \), \(\eta = \gamma _1^{-1}(\Gamma (\alpha _1))\), \(\xi _1(y)= \gamma _1^{-1}\left[ \Gamma (\alpha _1) \left( 1-\dfrac{(1+\theta )\gamma _2(\alpha _2,\beta _2 y)}{\theta \Gamma (\alpha _2)}\right) \right] \), \(\gamma _i(\alpha _i, z) = \int _0^{z} t^{\alpha _i - 1} e^{-t}dt\), \(\alpha _{i}>0\), \(\beta _{i}>0\) for \({i= 1,2}\).

Example 3

Bivariate Lognormal-Weibull distribution: A family of bivariate distribution with one marginal from Lognormal distribution and another from Weibull distribution based on the proposed copula \(C_{\theta }\), with marginal distribution functions \(F(x) =\frac{1}{2}\left[ 1 + erf\left( \frac{\ln {x}-\mu }{\sqrt{2}\sigma }\right) \right] \mathbbm {1}(x>0)\), and \(G(y) = \left[ 1 - e^{-(\lambda y)^{\delta }}\right] {\mathbbm {1}(y>0)}\), is given by

$$\begin{aligned} h(x,y) = \left\{ \begin{array}{l} \dfrac{\delta \lambda ^\delta \theta ^{\theta +1}}{\sigma \sqrt{\pi } (1+\theta )^{\theta } 2^{\frac{2\theta +1}{2}}} \dfrac{y^{\delta -1}e^{-(\lambda y)^\delta }}{x\left( 1-e^{-(\lambda y)^\delta }\right) ^{1+\theta }}\exp \left[ -\frac{(\ln {x}-\mu )^2}{2\sigma ^2}\right] \left[ 1 - erf\left( \frac{\ln {x}-\mu }{\sqrt{2}\sigma }\right) \right] ^\theta , \\ \quad \quad 0<y \le \psi _1, \psi _2(y)<x<\psi _3 \\ \dfrac{\delta (1+\theta )\lambda ^{\delta }}{\sigma \sqrt{\pi }2^{\frac{2\theta +1}{2}} } \frac{y^{\delta -1}e^{-(\lambda y)^\delta }}{x}\exp \left[ -\frac{(\ln {x}-\mu )^2}{2\sigma ^2}\right] \left[ 1 - erf\left( \frac{\ln {x}-\mu }{\sqrt{2}\sigma }\right) \right] ^\theta , \\ \quad \quad x>0, y>\psi _1 \end{array}\right. \end{aligned}$$

where \(\psi _1 =\frac{1}{\lambda }\left[ \log (1+\theta )\right] ^{\frac{1}{\delta }}\), \(\psi _2(y)=\exp \left[ \mu +\sigma \sqrt{2}\,erf^{-1}\left\{ 1-\frac{2(1+\theta )}{\theta }\left( 1-e^{-(\lambda y)^\delta } \right) \right\} \right] \), \(\psi _3=\mu +\sigma \sqrt{2}\,erf^{-1}\left( \frac{1}{2}\right) \), \(\lambda >0\), \(\delta >0\), \(-\infty<\mu <\infty \), \(\sigma >0\), and \(erf(x)=\frac{2}{\sqrt{\pi }}\int _0^x e^{-t^2}dt\).

Remark 4

The bivariate Weibull (in Example 1) and the bivariate Gamma (in Example 2) reduce to bivariate Exponential distribution for \(\delta _1=\delta _2 = 1\), and \(\alpha _1=\alpha _2 = 1\), respectively.
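The construction behind these examples is Sklar’s representation \(H(x,y)=C_\theta (F(x),G(y))\). The Python sketch below evaluates the joint distribution function for Weibull marginals as in Example 1; setting both shape parameters to 1 gives the bivariate Exponential case of Remark 4. The function names and the parameter values used in the demonstration are hypothetical, and \(C_\theta \) is set to zero below the curve \(u=1-(1+\theta )v/\theta \) as an assumption about the region where (3) is not stated.

```python
import math


def copula(u, v, theta):
    """C_theta(u, v) from Eq. (3); taken to be 0 below the support (assumption)."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    if u >= 1.0:
        return min(v, 1.0)
    if v >= 1.0:
        return u
    t = theta / (1.0 + theta)
    if v <= t:
        if u <= 1.0 - v / t:
            return 0.0
        k = theta**theta / (1.0 + theta)**(1.0 + theta)
        return v - (1.0 - u) + k * (1.0 - u)**(1.0 + theta) * v**(-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u)**(1.0 + theta))


def weibull_cdf(x, lam, delta):
    """Weibull distribution function 1 - exp(-(lam x)^delta) for x > 0, else 0."""
    return 1.0 - math.exp(-(lam * x)**delta) if x > 0.0 else 0.0


def joint_cdf(x, y, theta, lam1, delta1, lam2, delta2):
    """H(x, y) = C_theta(F(x), G(y)) with Weibull marginals F and G."""
    return copula(weibull_cdf(x, lam1, delta1), weibull_cdf(y, lam2, delta2), theta)


# Illustrative parameter values only (not estimates from any data set).
h_xy = joint_cdf(0.5, 0.7, 2.0, 1.0, 1.0, 1.5, 2.0)
product = weibull_cdf(0.5, 1.0, 1.0) * weibull_cdf(0.7, 1.5, 2.0)
print(h_xy, product)
```

Under negative quadrant dependence the first printed number should not exceed the second, which is the product of the marginals at the same point.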

Estimation methodology

In a classical parametric setting, a straightforward approach is to estimate the dependence parameter and the parameters associated with the marginals using the maximum likelihood method. This method is theoretically valid, but it has some practical limitations. First, the estimation of the dependence parameter \(\theta \) depends on the parametric assumptions made on the marginals, and the estimate of \(\theta \) will be biased if the marginals are misspecified. The second drawback is computational: the log-likelihood function involves a potentially large number of parameters, and high-dimensional optimization is known to be challenging. See Hofert et al. (2018, Ch-4) for details. To avoid the aforementioned computational burden, Joe (1997) proposed a two-stage method known as inference function for margins (IFM). This estimation method is based on two separate maximum likelihood estimations of the univariate marginal distributions, followed by an optimization of the bivariate likelihood as a function of the dependence parameter. Similar to the maximum likelihood estimate, the estimate of \(\theta \) based on IFM may be biased if the margins are partially misspecified (Hofert et al. 2018, p-136). Although the IFM has a computational edge, it is less efficient than the maximum likelihood estimate (Joe 2015, Ch-5).

We propose to use a method that is close in spirit to the method of inference function for margins but avoids the issue with misspecified marginals for the estimation of \(\theta \). In contrast to IFM, we do not maximize the bivariate likelihood. Instead, we determine the dependence parameter using the method of moments (Hofert et al. 2018, p-141). The method of fitting a bivariate distribution with marginals \(F_{\eta _{i}}(\cdot )\), indexed by parameter \(\eta _{i}\) for \(i=1,2\), involves the following steps:

  1. (i)

    Obtain the estimates \({\hat{\eta }}_{i}\) for \(i=1,2\) using maximum likelihood method.

  2. (ii)

    Estimate of \(\theta \) is given by \({\hat{\theta }}=\frac{-\tau _{n}}{1+\tau _{n}}\), or obtained by solving \(\rho _{n} =\dfrac{2(3+3{\hat{\theta }}+{\hat{\theta }}^2)}{2+3{\hat{\theta }}+{\hat{\theta }}^2}-3\), where \(\tau _{n}\) and \(\rho _{n}\) are sample version of Kendall’s \(\tau \) and Spearman’s \(\rho \), respectively.

  3. (iii)

    Obtain the fitted bivariate distribution by substituting \(F_{{\hat{\eta }}_{1}}(\cdot )\), \(G_{{\hat{\eta }}_{2}}(\cdot )\), and \({\hat{\theta }}\) into (3).

These steps are easy to execute and familiar to practitioners in different fields of science. This method allows the copula to adequately approximate the dependence structure of the bivariate data, which is of prime concern from a practical point of view.
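
As a rough illustration of step (ii), the sketch below inverts the two moment relations in closed form and tries the Kendall-based estimator on simulated pseudo-observations. The Spearman relation simplifies to \(\rho =-\theta (3+\theta )/\{(1+\theta )(2+\theta )\}\), so \({\hat{\theta }}\) is the non-negative root of a quadratic. The sampler (inversion of the conditional copula from Appendix C2), the function names, the seed, and the sample size are illustrative assumptions and are not taken from the authors' R programme.

```python
import math
import random

def theta_from_tau(tau):
    """Invert tau = -theta / (1 + theta), valid for -1 < tau <= 0."""
    if not -1.0 < tau <= 0.0:
        raise ValueError("tau must lie in (-1, 0]")
    return -tau / (1.0 + tau)

def theta_from_rho(rho):
    """Non-negative root of (1 + rho) t^2 + 3 (1 + rho) t + 2 rho = 0."""
    if not -1.0 < rho <= 0.0:
        raise ValueError("rho must lie in (-1, 0]")
    a = 1.0 + rho
    return (-3.0 * a + math.sqrt(9.0 * a * a - 8.0 * a * rho)) / (2.0 * a)

def sample_copula(n, theta, rng):
    """Draw (U, V) by inverting C_theta(v | u); an assumed sampler, theta > 0."""
    k = theta / (1.0 + theta)
    pairs = []
    for _ in range(n):
        u, p = rng.random(), rng.random()
        if p <= 1.0 - (1.0 - u) ** theta:   # branch with v <= theta / (1 + theta)
            v = k * (1.0 - u) * (1.0 - p) ** (-1.0 / theta)
        else:                               # branch with v > theta / (1 + theta)
            v = 1.0 - (1.0 - p) / ((1.0 + theta) * (1.0 - u) ** theta)
        pairs.append((u, v))
    return pairs

def kendall_tau(pairs):
    """Sample Kendall's tau by direct pair counting (O(n^2), no tie correction)."""
    n, s = len(pairs), 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (pairs[i][0] - pairs[j][0]) * (pairs[i][1] - pairs[j][1])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))

# Simulate with a known theta and recover it from the sample Kendall's tau.
pseudo = sample_copula(800, 1.0, random.Random(2022))
tau_n = kendall_tau(pseudo)
theta_hat = theta_from_tau(tau_n)
print(f"tau_n = {tau_n:.3f}, theta_hat = {theta_hat:.3f}")
```

Feeding the rounded sample value \(\rho _{n}=-0.59\) reported in the Application section into `theta_from_rho` gives roughly 0.76, in line with the reported \({\hat{\theta }}=0.765\).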

Application

Exploratory data analysis

For an illustrative data analysis based on the proposed copula, we consider a data set on daily air quality measurements for 153 days in the New York Metropolitan Area from May 1, 1973, to September 30, 1973. Information on average wind speed (in miles per hour) and mean ozone level (in parts per billion) was obtained from the New York State Department of Conservation and the National Weather Service, USA. This data set is named ‘airquality’ and is openly available in the R package ‘datasets’. See Chambers, Cleveland, Kleiner, and Tukey (1983, Ch 2–5) for a detailed description of the data. Ozone in the upper atmosphere protects the earth from the sun’s harmful rays. In the lower atmosphere, by contrast, exposure to ozone can be hazardous to both humans and some plants. Variations in weather conditions play an important role in determining ozone levels (Khiem et al. 2010; Topcu et al. 2003). In general, the ozone concentration is affected by wind speed. High winds tend to disperse pollutants, which in turn dilutes the ozone concentration. Stagnant conditions or light winds, on the other hand, allow pollution levels to build up, and the ozone level rises as a result. Environmental scientists and meteorologists are interested in studying the effect of wind speed on the distribution patterns of ozone levels (Gorai et al. 2015). For our analysis, we consider the 116 observations that remain after discarding the missing values, and present the scatter plot of average wind speed versus ozone levels in Fig. 5a. It indicates strong negative dependence, and we find that the Spearman’s rho and Kendall’s tau coefficients are \(-0.59\) and \(-0.43\), respectively. Further, we apply the methodology proposed by Lu and Ghosh (2022), based on the Kolmogorov–Smirnov (KS), Anderson–Darling (AD), and Cramér–von Mises (CvM) discrepancy measures, to test the hypothesis that the true underlying copula satisfies the NQD property. 
The p-values corresponding to the KS, AD, and CvM tests are 0.893, 0.571, and 0.861, respectively, which affirms a strong notion of negative dependence between average wind speed and ozone levels in the NQD sense.

Modeling wind speed and ozone level

In the fields of engineering and environmental science, the Lognormal, Weibull, and Gamma distributions are widely used for modeling wind speed recorded in the same location (Shepherd 1978; Monjean and Robyns 2015; Pobočíková et al. 2017; Dhiman et al. 2020; Ramadan et al. 2020). These distributions are also used for modeling the levels of various pollutants and the ozone level (Sharma et al. 2016; Souza et al. 2018; Mishra et al. 2021). Therefore, we consider these three models for estimation of the parameters associated with the marginal distributions of the wind speed and the mean ozone level. Based on the Akaike information criterion, the Gamma distribution fits both marginals better than the other choices. The maximum likelihood estimates of the shape and the scale parameters are 7.171 and 1.375, respectively, for the wind speed, and 1.7 and 24.775, respectively, for the mean ozone level. The estimate of the dependence parameter is \({\hat{\theta }}=0.765\). Therefore, the joint distribution of wind speed and the mean ozone level is represented by the bivariate Gamma distribution provided in Example 2, and is presented graphically in Fig. 5b. Following Balakrishnan and Ristic (2016), we then use a bootstrap-based Kolmogorov–Smirnov test to check whether the Gamma distribution is a good fit for the marginals. We also evaluate the goodness of fit of the proposed copula based on the Kolmogorov–Smirnov statistic, utilising the bootstrap algorithm proposed by Genest et al. (2006). We find that the proposed model fits the data reasonably well. The R programme related to the proposed estimation methodology is provided in the Supplementary material. In Fig. 5c we present the contour plot of the distribution of wind speed and mean ozone level. It indicates that the mean ozone level varies from 6 to 30 ppb when the wind speed is within 7–16 mph. 
The estimated conditional distributions of the mean ozone level keeping the wind speed fixed at the empirical first decile (5.7 mph), median (9.7 mph), and ninth decile (14.9 mph) are presented in Fig. 5d. It is easy to see that the distribution of the mean ozone level decreases stochastically (in the sense of the usual stochastic order) as the wind speed increases. This visual representation of the regression dependence property indicates that the ozone level distributions below the level of 60 ppb differ significantly with wind speed. This can assist in formulating policies and guidelines to choose between locations to avoid health hazards related to high ozone levels.
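
To make the regression dependence statement concrete, the following sketch evaluates the fitted conditional distribution function of ozone given wind speed, \(P(Y\le y\mid X=x)=\partial C_\theta (u,v)/\partial u\) at \(u=F(x)\), \(v=G(y)\). It assumes that wind speed plays the role of the first argument of the copula, plugs in the shape-scale Gamma estimates and \({\hat{\theta }}\) reported above, and relies on a hand-written power series for the Gamma distribution function. All function names are hypothetical.

```python
import math

def gamma_cdf(x, shape, scale):
    """Gamma distribution function via the power series of the incomplete gamma function."""
    if x <= 0.0:
        return 0.0
    z = x / scale
    term = total = 1.0 / shape
    n = 1
    while term > 1e-16 * total:
        term *= z / (shape + n)
        total += term
        n += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))

def cond_copula(v, u, theta):
    """C_theta(v | u) = dC_theta(u, v)/du, the conditional copula used in Appendix C2."""
    k = theta / (1.0 + theta)
    if v <= k * (1.0 - u):
        return 0.0                      # below the support of the copula density
    if v <= k:
        return 1.0 - (k * (1.0 - u) / v) ** theta
    return 1.0 - (1.0 + theta) * (1.0 - v) * (1.0 - u) ** theta

THETA = 0.765                # reported dependence estimate
WIND = (7.171, 1.375)        # reported Gamma (shape, scale) for wind speed
OZONE = (1.7, 24.775)        # reported Gamma (shape, scale) for mean ozone level

def ozone_cdf_given_wind(y, x):
    """Fitted P(ozone <= y | wind = x); wind is assumed to be the first copula argument."""
    return cond_copula(gamma_cdf(y, *OZONE), gamma_cdf(x, *WIND), THETA)

for wind in (5.7, 9.7, 14.9):    # empirical first decile, median, ninth decile
    row = [round(ozone_cdf_given_wind(y, wind), 3) for y in (20, 40, 60, 80)]
    print(wind, row)
```

For every ozone level the printed probabilities should not decrease as the wind speed increases, which is the stochastic decrease described above.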

Fig. 5 Results based on the analysis of New York air quality data

Concluding remarks

We construct a new flexible bivariate copula for modeling negative dependence between two random variables. Its correlation coefficient takes any value in the interval \((-1,0)\), which was not the case for other copulas reported in the literature. It is important to note that the Spearman’s rho and the Kendall’s tau have a simple one-parameter form with negative values in the full range. The properties of the proposed copula are in agreement with most of the popular notions of negative dependence available in the literature, such as quadrant dependence, regression dependence, and likelihood ratio dependence. It would be interesting to consider a semi-parametric generalisation of the proposed copula and investigate its associated properties. Another possible direction of future research could be a multivariate extension of the proposed copula using the approaches considered by Fischer and Köck (2012) and Mazo et al. (2015).

For an illustrative data analysis based on the proposed copula, we consider a data set on daily air quality measurements for the New York Metropolitan Area. Based on the observed data, we find that wind speed and ozone levels are strongly negatively dependent in the NQD sense. We consider three different models (Lognormal, Weibull, and Gamma distributions) for estimation of the parameters associated with the marginal distributions of the wind speed and the mean ozone level. It is shown that the Gamma distribution fits both marginals better and that the distribution of the mean ozone level decreases stochastically (in the sense of the usual stochastic order) as the wind speed increases. The scope of the proposed copula goes far beyond this particular application. For example, biomedical researchers can utilize the proposed copula in studying the negative association between BMI and glycated proteins (Espasandín-Domínguez et al. 2019). One can also extend the proposed copula to assess and model the nonlinear and asymmetric negative dependence over time in security and commodity markets (Liu et al. 2017).

References

  • Ahn JY (2015) Negative dependence concept in copulas and the marginal free herd behavior index. J Comput Appl Math 288:304–322

  • Amblard C, Girard S (2009) A new extension of bivariate FGM copulas. Metrika 70:1–17

  • Bairamov I, Kotz S (2000) Dependence structure and symmetry of Huang-Kotz FGM distributions and their extensions. Metrika 56:55–72

  • Bairamov I, Kotz S (2003) On a new family of positive quadrant dependent bivariate distributions. Int Math J 3(11):1247–1254

  • Balakrishnan N, Lai C (2009) Continuous bivariate distributions. Springer, New York

  • Balakrishnan N, Ristic MM (2016) Multivariate families of gamma-generated distributions with finite or infinite support above or below the diagonal. J Multivar Anal 143:194–207

  • Bekrizadeh H, Jamshidi B (2017) A new class of bivariate copulas: dependence measures and properties. Metron 75:31–50

  • Bekrizadeh H, Parham GA, Zadkarmi MR (2012) The new generalization of Farlie-Gumbel-Morgenstern copulas. Metrika 6:3527–3533

  • Bhuyan P, Ghosh S, Majumder P, Mitra M (2020) A bivariate life distribution and notions of negative dependence. Stat 9(1):1–11

  • Chambers JM, Cleveland WS, Kleiner B, Tukey PA (1983) Graphical methods for data analysis. Wadsworth & Brooks

  • Cooray K (2019) A new extension of the FGM copula for negative association. Commun Stat Theory Methods 48(8):1902–1919

  • Dhiman HS, Deb D, Balas VE (2020) Supervised machine learning in wind forecasting and ramp event prediction. Elsevier, Amsterdam

  • Dixit VU, Khandeparkar P (2017) Estimation of parameters of Skew Log Laplace distribution. Am J Math Manag Sci 36:277–291

  • Durante F, Foscolo E, Rodríguez-Lallena JA, Úbeda-Flores M (2012) A method for constructing higher-dimensional copulas. Statistics 46(3):387–404

  • Durante F, Sempi C (2015) Principles of copula theory. Chapman and Hall/CRC, Boca Raton

  • Esary JD, Lehmann EL (1972) Relationship among some concepts of bivariate dependence. Ann Math Stat 43:651–655

  • Espasandín-Domínguez J, Cadarso-Suárez C, Kneib T, Marra G, Klein N, Radice R, Gude F (2019) Assessing the relationship between markers of glycemic control through flexible copula regression models. Stat Med 38:5161–5181

  • Fang Z, Joe H (1992) Further developments on some dependence orderings for continuous bivariate distributions. Ann Inst Stat Math 44:501–517

  • Finkelstein M (2003) On one class of bivariate distributions. Stat Prob Lett 65:1–6

  • Fischer M, Köck C (2012) Constructing and generalizing given multivariate copulas: a unifying approach. Statistics 46:1–12

  • Fontaine C, Frostig R, Ombao H (2020) Modeling dependence via copula of functionals of Fourier coefficients. TEST. https://doi.org/10.1007/s11749-020-00703-5

  • Freund JE (1961) A bivariate extension of the exponential distribution. J Am Stat Assoc 56(296):971–977

  • Genest C, Quessy JF, Rémillard B (2006) Goodness-of-fit procedures for copula models based on the probability integral transformation. Scand J Stat 33(2):337–366

  • Gorai AK, Tuluri F, Huang H, Hayami H, Yoshikado H, Kawamoto Y (2015) Influence of local meteorology and \(NO_{2}\) conditions on ground-level ozone concentrations in the eastern part of Texas, USA. Air Quality Atmosph Health 8(1):81–96

  • Gumbel EJ (1960) Bivariate exponential distributions. J Am Stat Assoc 55(292):698–707

  • Hofert M, Kojadinovic I, Machler M, Yan J (2018) Elements of copula modeling with R. Springer, New York

  • Hürlimann W (2015) A comprehensive extension of the FGM copula. Stat Pap 58:373–392

  • Joe H (1997) Multivariate models and dependence concepts. Chapman & Hall, London

  • Joe H (2015) Dependence modeling with copulas. CRC Press, Taylor & Francis Group, LLC, Boca Raton

  • Khiem M, Ooka R, Huang H, Hayami H, Yoshikado H, Kawamoto Y (2010) Analysis of the relationship between changes in meteorological conditions and the variation in summer ozone levels over the central Kanto area. Adv Meteorol. https://doi.org/10.1155/2010/349248

  • Kimeldorf G, Sampson AR (1987) Positive dependence orderings. Ann Inst Stat Math 39:113–128

  • Kurothe RS, Goel NK, Mathur BS (1997) Derived flood frequency distribution for negatively correlated rainfall intensity and duration. Water Resour Res 33:2103–2107

  • Lai CD, Xie M (2000) A new family of positive quadrant dependent bivariate distributions. Stat Probab Lett 46:359–364

  • Lehmann EL (1966) Some concepts of dependence. Ann Math Stat 37(5):1137–1153

  • Liu B, Ji Q, Fan Y (2017) A new time-varying optimal copula model identifying the dependence across markets. Quantitative Finance 17(3):437–453

  • Lu L, Ghosh SK (2022) Nonparametric estimation and testing for positive quadrant dependent bivariate copula. J Business Econ Stat 40(2):664–677

  • Mazo G, Girard S, Forbes F (2015) A class of multivariate copulas based on products of bivariate copulas. J Multivar Anal 140:363–376

  • Mishra G, Ghosh K, Dwivedi AK, Kumar M, Kumar S, Chintalapati S, Tripathi SN (2021) An application of probability density function for the analysis of PM2.5 concentration during the COVID-19 lockdown period. Sci Total Environ 782:146681

  • Mohtashami-Borzadaran V, Amini M, Ahmadi J (2019) On the properties of a reliability dependent model. In: Proceeding of the 5th seminar on reliability theory and its applications, Yazd, Iran, pp 256-265

  • Monjean P, Robyns B (2015) Eco-friendly innovations in electricity transmission and distribution networks. Elsevier, Amsterdam

  • Nelsen RB (2006) An introduction to copulas. Springer, New York

  • Pobočíková I, Sedliačková Z, Michalková M (2017) Application of four probability distributions for wind speed modeling. Proc Eng 192:713–718

  • Ramadan A, Ebeed M, Kamel S, Nasrat L (2020) Optimal power flow for distribution systems with uncertainty. Elsevier, Amsterdam

  • Ramos PL, Nascimento DC, Ferreira PH, Weber KT, Santos TEG, Louzada F (2019) Modeling traumatic brain injury lifetime data: improved estimators for the generalized gamma distribution under small samples. PLoS ONE 14(8):e0221332

  • Sarmanov OV (1996) Generalized normal correlation and two-dimensional Fréchet classes. Dokl Akad Nauk SSSR 168:596–599

  • Scarsini M, Shaked M (1996) Positive dependence orders: a survey. In: Athens conference on applied probability and time series analysis, pp 70-91

  • Schucany WR, Parr WC, Boyer JE (1978) Correlation structure in Farlie–Gumbel–Morgenstern distributions. Biometrika 65(3):650–653

  • Sharma S, Sharma P, Khare M, Kwatra S (2016) Statistical behavior of ozone in urban environment. Sustain Environ Res 26(3):142–148

  • Shepherd DG (1978) Supervised machine learning in wind forecasting and ramp event prediction. Elsevier, Amsterdam

  • Souza A, Oliveira SSD, Aristone F, Olaofe OZ, Kumar SP, Arsić M, Razika I (2018) Modeling of the function of the ozone concentration distribution of surface to urban areas. Eur Chem Bull 7(3):98–105

  • Topcu S, Anteplioglu U, Incecik S (2003) Surface ozone concentrations and its relation to wind field in Istanbul. Water Air Soil Pollution Focus 3:53–60

  • Yanagimoto T (1972) Families of positively dependent random variables. Ann Inst Stat Math 24:559–573

  • Yanagimoto T, Okamoto M (1969) Partial orderings of permutations and monotonicity of a rank correlation statistic. Ann Inst Stat Math 21:489–506

Acknowledgements

The authors are thankful to Prof. Sujit Ghosh and Ms. Lu Lu for providing the R code for testing quadrant dependence. The work of Dr. Prajamitra Bhuyan was supported in part by the Lloyd’s Register Foundation programme on Data-Centric Engineering at the Alan Turing Institute, UK.

Author information

Corresponding author

Correspondence to Prajamitra Bhuyan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (pdf 53 KB)

Appendices

Appendix A

A1 Proof of Proposition 1

Case I For \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

$$\begin{aligned} \dfrac{\partial C_\theta }{\partial \theta }= & {} \dfrac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1-u)^{(1+\theta )}v^{-\theta }\left[ \log \left( \dfrac{\theta }{1+\theta }\right) +\log (1-u)-\log (v) \right] \\\le & {} \dfrac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1-u)^{(1+\theta )}v^{-\theta }\left[ \log \left( \dfrac{\theta }{1+\theta }\right) +\log \left[ \dfrac{(1+\theta )v}{\theta }\right] -\log (v) \right] ,\\&\quad \text {since }(1-u)\le \dfrac{(1+\theta )v}{\theta }\\= &\, 0 \end{aligned}$$

Case II For \(0<u<1\), and \(\frac{\theta }{1+\theta }<v<1\), we have

$$\begin{aligned} \dfrac{\partial C_\theta }{\partial \theta }= (1-u)^{(1+\theta )}(1-v)\log (1-u)\le 0. \end{aligned}$$

Now combining Case I and II, we have \(\dfrac{\partial C_\theta }{\partial \theta }\le 0\) for all \((u,v)\in I^2\), which implies \(C_{\theta }\) is decreasing in \(\theta \).
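
The monotonicity in \(\theta \) can also be checked numerically. The sketch below codes \(C_\theta \) from the two cases above and compares its values over an increasing sequence of \(\theta \) on a grid. The zero branch below the curve \(v=\theta (1-u)/(1+\theta )\) is inferred from the integration limits in A3, and the grid, the \(\theta \) values, and the function name are arbitrary illustrative choices.

```python
def copula(u, v, theta):
    """C_theta(u, v); the zero branch is inferred from the integration limits in A3."""
    k = theta / (1.0 + theta)
    if v <= k * (1.0 - u):
        return 0.0
    if v <= k:
        coef = theta ** theta / (1.0 + theta) ** (1.0 + theta)
        return v - (1.0 - u) + coef * (1.0 - u) ** (1.0 + theta) * v ** (-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u) ** (1.0 + theta))

grid = [i / 20.0 for i in range(1, 20)]
thetas = [0.1, 0.5, 1.0, 2.0, 5.0, 20.0]   # increasing dependence parameters

# Largest increase of C_theta(u, v) when moving to the next larger theta.
worst = 0.0
for u in grid:
    for v in grid:
        values = [copula(u, v, t) for t in thetas]
        for prev, nxt in zip(values, values[1:]):
            worst = max(worst, nxt - prev)
print("largest increase over the grid:", worst)
```

A value of `worst` at the level of floating-point rounding is consistent with \(C_\theta \) being decreasing in \(\theta \).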

A2 Proof of Proposition 2

Case I For \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

$$\begin{aligned} \nabla ^2 C_\theta (u,v)= & \, \dfrac{\partial ^2 C_\theta (u,v)}{\partial u^2} + \dfrac{\partial ^2 C_\theta (u,v)}{\partial v^2}\\= & \, \dfrac{\theta ^{(1+\theta )}}{(1+\theta )^\theta }\left[ (1 - u)^{(\theta -1)} v^{-\theta } + (1-u)^{(1+\theta )}v^{-(2+\theta )}\right] \ge 0 \end{aligned}$$

Case II For \(0<u<1\), and \(\frac{\theta }{1+\theta }<v<1\), we have

$$\begin{aligned} \nabla ^2 C_\theta (u,v)= &\, \dfrac{\partial ^2 C_\theta (u,v)}{\partial u^2} + \dfrac{\partial ^2 C_\theta (u,v)}{\partial v^2}\\= &\, \theta (1 + \theta ) (1 - u)^{(\theta -1)} (1 - v)\ge 0 \end{aligned}$$

Now from Case I and II we can write \(\nabla ^2 C_\theta (u,v) \ge 0\) for all \((u,v)\in I^2\), and hence the result follows.

A3 Proof of Proposition 3

To establish the absolute continuity of the proposed copula \(C_{\theta }\), it is required to show

$$\begin{aligned} \int _0^u\int _0^v\dfrac{\partial ^2}{\partial s\partial t}C_\theta (s,t)dtds =C_\theta (u,v), \end{aligned}$$

for every \((u,v)\in I^2\).

Case I For \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

$$\begin{aligned} \int _0^u\int _0^v\dfrac{\partial ^2}{\partial s\partial t}C_\theta (s,t)dtds= & {} \int _{1-\frac{(1+\theta )v}{\theta }}^u\int _{\frac{\theta (1-s)}{(1+\theta )}}^v\dfrac{\theta ^{1+\theta }}{(1+\theta )^\theta }(1-s)^\theta t^{-(1+\theta )}dtds\\= & {} \int _{1-\frac{(1+\theta )v}{\theta }}^u\left[ 1-\left( \frac{\theta }{1+\theta }\right) ^\theta (1-s)^\theta v^{-\theta }\right] ds\\= & {} \int _{1-u}^{\frac{(1+\theta )v}{\theta }}\left[ 1-\left( \frac{\theta }{1+\theta }\right) ^\theta z^\theta v^{-\theta }\right] dz \,\,\,(\mathrm{where}\,\, z = 1-s)\\= & {} v-(1-u)+\dfrac{\theta ^\theta }{(1+\theta )^{1+\theta }}(1-u)^{1+\theta }v^{-\theta }=C_\theta (u,v). \end{aligned}$$

Case II For \(0<u<1\), and \(\frac{\theta }{1+\theta }<v<1\), we have

$$\begin{aligned}& \int _0^u\int _0^v\dfrac{\partial ^2}{\partial s\partial t}C_\theta (s,t)dtds \\ &\quad = \int _0^u\int _{\frac{\theta (1-s)}{(1+\theta )}}^{\frac{\theta }{(1+\theta )}}\dfrac{\theta ^{1+\theta }}{(1+\theta )^\theta }(1-s)^\theta t^{-(1+\theta )}dtds\\ &\qquad + \int _0^u\int _{\frac{\theta }{(1+\theta )}}^v(1+\theta )(1-s)^\theta dtds\\&\quad = \int _0^u\left[ 1-(1-s)^\theta \right] ds + \int _0^u \left[ v-\theta (1 - v)\right] (1 - s)^\theta ds\\ &\quad = u-\dfrac{1}{1+\theta }+\dfrac{(1-u)^{\theta +1}}{\theta + 1}+ \left[ v-\theta (1 - v)\right] \dfrac{\left[ 1-(1-u)^{(\theta +1)}\right] }{1+\theta }\\ &\quad = u-(1-v)\left[ 1-(1-u)^{(\theta +1)}\right] =C_\theta (u,v). \end{aligned}$$

Therefore, the result follows by combining Case I and II.
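
The identity can be cross-checked by numerical integration. The sketch below integrates the density appearing in the integrands above with a midpoint rule and compares the result with \(C_\theta (u,v)\). Each inner integral starts at the support boundary \(t=\theta (1-s)/(1+\theta )\), as in the proof. The evaluation points, the number of nodes, and the function names are illustrative assumptions.

```python
def copula(u, v, theta):
    """C_theta(u, v), with the zero branch inferred from the integration limits."""
    k = theta / (1.0 + theta)
    if v <= k * (1.0 - u):
        return 0.0
    if v <= k:
        coef = theta ** theta / (1.0 + theta) ** (1.0 + theta)
        return v - (1.0 - u) + coef * (1.0 - u) ** (1.0 + theta) * v ** (-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u) ** (1.0 + theta))

def density(s, t, theta):
    """Mixed partial derivative of C_theta, read off the integrands in Cases I and II."""
    k = theta / (1.0 + theta)
    if t <= k * (1.0 - s):
        return 0.0
    if t <= k:
        return theta ** (1.0 + theta) / (1.0 + theta) ** theta * (1.0 - s) ** theta * t ** (-(1.0 + theta))
    return (1.0 + theta) * (1.0 - s) ** theta

def integrated_density(u, v, theta, n=200):
    """Midpoint rule over [0, u] x [0, v], skipping the region where the density is zero."""
    k = theta / (1.0 + theta)
    hs = u / n
    total = 0.0
    for i in range(n):
        s = (i + 0.5) * hs
        lower = k * (1.0 - s)            # support boundary for this s
        if lower >= v:
            continue
        ht = (v - lower) / n
        total += hs * ht * sum(density(s, lower + (j + 0.5) * ht, theta) for j in range(n))
    return total

for u, v in [(0.6, 0.3), (0.5, 0.8)]:    # one point for each case
    print(u, v, round(integrated_density(u, v, 0.765), 4), round(copula(u, v, 0.765), 4))
```

The two printed columns should agree up to the discretisation error of the midpoint rule.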

Appendix B

B1 Proof of Theorem 6

  1. (i)

    To establish LTI(\(Y\mid X\)), it is sufficient to show that for any v in I, \(\frac{C(u,v)}{u}\) is nondecreasing in u (Nelsen 2006, Theorem 5.2.5, p-192). For \(0<u<1\), and \(\frac{\theta }{1+\theta }<v<1\), we have

    $$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{C(u,v)}{u}\right] =\dfrac{(1-v)[1-(1-u)^\theta (1+\theta u)]}{u^2}. \end{aligned}$$

    Now we need to prove that \([1-(1-u)^\theta (1+\theta u)]>0\). Define \(h(u):=(1-u)^\theta (1+\theta u)\). Observe that \(h(0)=1\), \(h(1)=0\), and h(u) is a decreasing function in u, since \(h^{'}(u)=-\theta (1+\theta )u(1-u)^{(\theta -1)}<0\) for all \(u\in (0,1)\). Hence \(h(u)<1\) on \((0,1)\), and therefore \(\frac{\partial }{\partial u}\left[ \frac{C(u,v)}{u}\right] >0\).

    Similarly, for \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), it can be shown that

    $$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{C(u,v)}{u}\right] =- \dfrac{\frac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1+\theta u)(1-u)^\theta +v^{(1+\theta )}-v^\theta }{ u^2v^\theta }>0. \end{aligned}$$

    Hence, the result follows.

  2. (ii)

    In view of Theorem 5.2.5 in (Nelsen 2006, p-192), the necessary and sufficient condition for LTI(\(X\mid Y\)) is that, \(\frac{C(u,v)}{v}\) is nondecreasing in v, for any u in I.

    For \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

    $$\begin{aligned} \dfrac{\partial }{\partial v}\left[ \dfrac{C(u,v)}{v}\right] = \dfrac{(u-1)\left[ \frac{\theta ^\theta }{(1+\theta )^{\theta }}(1-u)^\theta - v^{\theta }\right] }{v^{\theta +2}}\ge 0, \end{aligned}$$

    since \((u-1)<0\) and \(\left[ \frac{\theta ^\theta }{(1+\theta )^{\theta }}(1-u)^\theta - v^{\theta }\right] <0\).

    Similarly, for \(0<u<1\), and \(\frac{\theta }{1+\theta }<v<1\), we have

    $$\begin{aligned} \dfrac{\partial }{\partial v}\left[ \dfrac{C(u,v)}{v}\right] =\dfrac{(u-1)[(1-u)^\theta -1]}{v^2}\ge 0. \end{aligned}$$

    Hence, the result follows.

  3. (iii)

    To establish RTD(\(Y\mid X\)), it is sufficient to show that \(\frac{v-C(u,v)}{(1-u)}\) is a nondecreasing function in u for any \(v\in I\) (Nelsen 2006, Theorem 5.2.5, p-192).

    For \(0<v \le \frac{\theta }{1+\theta }\) and \({1-\frac{(1+\theta )v}{\theta }<u<1}\), we have

    $$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{v-C(u,v)}{(1-u)}\right] = \left( \dfrac{\theta }{1+\theta }\right) ^{1+\theta }(1-u)^{\theta -1}v^{-\theta }>0. \end{aligned}$$

    Similarly, for \(0<u<1\), and \(\dfrac{\theta }{1+\theta }<v<1\), we have

    $$\begin{aligned} \dfrac{\partial }{\partial u}\left[ \dfrac{v-C(u,v)}{(1-u)}\right] = \theta (1-v)(1-u)^{\theta -1}>0. \end{aligned}$$

    Hence, the conclusion follows.

  4. (iv)

    By Theorem 5.2.5 in (Nelsen 2006, p-192), RTD(\(X\mid Y\)) holds, if \(\frac{u-C(u,v)}{(1-v)}\) is a nondecreasing function in v for any \(u\in I\).


    For \(0<v \le \frac{\theta }{1+\theta }\) and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

    $$\begin{aligned} \dfrac{\partial }{\partial v}\left[ \dfrac{u-C(u,v)}{(1-v)}\right] =\dfrac{\frac{\theta ^\theta }{(1+\theta )^{(1+\theta )}}(1-u)^{1+\theta }v^{-(1+\theta )}[\theta (1-v)-v]}{(1-v)^2}, \end{aligned}$$

    which is non-negative, since \(v \le \frac{\theta }{1+\theta }\).

Similarly, for any fixed \(u\in I\), and \(\frac{\theta }{1+\theta }<v<1\), \(\frac{u-C(u,v)}{(1-v)}=1-(1-u)^{1+\theta }\) is a constant function in v. Hence, the result follows.
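
The four tail monotonicity conditions used in (i)–(iv) can be verified on a grid. The sketch below evaluates each ratio along the relevant coordinate and checks that it never decreases. The value \(\theta =0.765\), the grid, the tolerance, and the helper names are illustrative assumptions, and the zero branch of the copula is inferred from the integration limits in A3.

```python
def copula(u, v, theta):
    """C_theta(u, v), with the zero branch inferred from the integration limits in A3."""
    k = theta / (1.0 + theta)
    if v <= k * (1.0 - u):
        return 0.0
    if v <= k:
        coef = theta ** theta / (1.0 + theta) ** (1.0 + theta)
        return v - (1.0 - u) + coef * (1.0 - u) ** (1.0 + theta) * v ** (-theta)
    return u - (1.0 - v) * (1.0 - (1.0 - u) ** (1.0 + theta))

def nondecreasing(values, tol=1e-12):
    """True if the sequence never drops by more than a rounding tolerance."""
    return all(b - a >= -tol for a, b in zip(values, values[1:]))

THETA = 0.765
grid = [i / 50.0 for i in range(1, 50)]

checks = {
    "C/u nondecreasing in u": all(
        nondecreasing([copula(u, v, THETA) / u for u in grid]) for v in grid),
    "C/v nondecreasing in v": all(
        nondecreasing([copula(u, v, THETA) / v for v in grid]) for u in grid),
    "(v-C)/(1-u) nondecreasing in u": all(
        nondecreasing([(v - copula(u, v, THETA)) / (1.0 - u) for u in grid]) for v in grid),
    "(u-C)/(1-v) nondecreasing in v": all(
        nondecreasing([(u - copula(u, v, THETA)) / (1.0 - v) for v in grid]) for u in grid),
}
for name, ok in checks.items():
    print(name, ok)
```

A grid check of this kind does not replace the proof, but it is a quick way to catch a sign or exponent slip in the derivatives.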

B2 Proof of Theorem 7

To establish SD(\(Y\mid X\)) property of the proposed copula \(C_{\theta }\), we utilise the geometric interpretation of the stochastic monotonicity given in Corollary 5.2.11 of (Nelsen 2006, p-197). Therefore, it is sufficient to show that \(C_\theta (u,v)\) is a convex function of u. Similarly, SD(\(X\mid Y\)) can be established by showing \(C_\theta (u,v)\) is a convex function of v.

  1. (i)

    For \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

    $$\begin{aligned} \dfrac{\partial ^2}{\partial u^2}C_\theta (u,v) = \dfrac{\theta ^{(1+\theta )}}{(1+\theta )^\theta }(1-u)^{\theta -1}v^{-\theta }>0. \end{aligned}$$

    For \(0<u<1,\) and \(\frac{\theta }{1+\theta }<v<1\), we have

    $$\begin{aligned} \dfrac{\partial ^2}{\partial u^2}C_\theta (u,v) =\theta (1+\theta )(1-v)(1-u)^{(\theta -1)}> 0. \end{aligned}$$

    Hence \(C_\theta (u,v)\) is a convex function of u.

  2. (ii)

    For \(0<v \le \frac{\theta }{1+\theta }\), and \(1-\frac{(1+\theta )v}{\theta }<u<1\), we have

    $$\begin{aligned} \dfrac{\partial ^2}{\partial v^2}C_\theta (u,v) ={ \dfrac{\theta ^{(1+\theta )}}{(1+\theta )^\theta }(1-u)^{1+\theta }v^{-(2+\theta )}}>0. \end{aligned}$$

Note that, for any fixed \(u\in I\), and \(\frac{\theta }{1+\theta }<v<1\), \(\frac{\partial }{\partial v}C_\theta (u,v)\) is a constant function of v. Hence, the result follows.

B3 Proof of Theorem 8

To establish the NLR property between X and Y with copula \(C_\theta \), we need to show that \(c_\theta (u_1,v_1)c_\theta (u_2,v_2)\le c_\theta (u_1,v_2)c_\theta (u_2,v_1)\) holds for all \(u_1\le u_2\), and \(v_1\le v_2\), where \(c_\theta (u,v)\) is the copula density given in (4). Note that, on its support, \(c_\theta (u,v)\) is the product of a function of u and a function of v. Hence, for the proposed copula \(C_{\theta }\), the aforementioned condition holds with equality whenever \((u_1,v_1)\) lies in the support, and it holds trivially otherwise because the left-hand side is then zero.
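
A grid check of the NLR inequality is sketched below. It uses the density read off the integrands in A3 and assumes that the density vanishes below the curve \(v=\theta (1-u)/(1+\theta )\). The value of \(\theta \), the grid, the relative tolerance, and the counters are illustrative choices.

```python
import itertools

def density(u, v, theta):
    """Copula density as it appears in the integrands of A3 (zero off the support)."""
    k = theta / (1.0 + theta)
    if v <= k * (1.0 - u):
        return 0.0
    if v <= k:
        return theta ** (1.0 + theta) / (1.0 + theta) ** theta * (1.0 - u) ** theta * v ** (-(1.0 + theta))
    return (1.0 + theta) * (1.0 - u) ** theta

THETA = 0.765
grid = [i / 12.0 for i in range(1, 12)]

violations = 0   # cases where the NLR inequality fails beyond rounding error
strict = 0       # cases where the left-hand side is zero and the right-hand side is positive
for u1, u2 in itertools.combinations(grid, 2):        # u1 < u2
    for v1, v2 in itertools.combinations(grid, 2):    # v1 < v2
        lhs = density(u1, v1, THETA) * density(u2, v2, THETA)
        rhs = density(u1, v2, THETA) * density(u2, v1, THETA)
        if lhs > rhs * (1.0 + 1e-12):
            violations += 1
        if lhs == 0.0 and rhs > 0.0:
            strict += 1
print("violations:", violations, "strict cases:", strict)
```

Strict inequality can only occur when \((u_1,v_1)\) falls outside the support, so that the left-hand side is zero.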

Appendix C

C1 Proof of Theorem 9

The results directly follow from Proposition 1.

C2 Proof of Theorem 10

Let \(\theta _1\le \theta _2\). The conditional copula of V given \(U=u\) is given by

$$\begin{aligned} C_{\theta _1}(v\mid u)= \left\{ \begin{array}{ll} 1- \dfrac{\theta _1^{\theta _1}}{(1+\theta _1)^{\theta _1}}(1-u)^{\theta _1} v^{-\theta _1}, &{} \dfrac{(1-u)\theta _1}{(1+\theta _1)}<v<\dfrac{\theta _1}{1+\theta _1} \\ 1-(1+\theta _1)(1-v)(1-u)^{\theta _1}, &{} \dfrac{\theta _1}{1+\theta _1}<v<1. \end{array}\right. \end{aligned}$$

Then \(C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u)\) is given by

$$\begin{aligned}&C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u) \\&\quad = \left\{ \begin{array}{ll} \dfrac{\theta _2(1+\theta _1)^{(\theta _1/\theta _2)}}{(1+\theta _2)\theta _1^{(\theta _1/\theta _2)}}(1-u)^{1-{(\theta _1/\theta _2)}}v^{(\theta _1/\theta _2)}, &{} 0<v\le 1-(1-u)^{\theta _2} \\ 1-\dfrac{1+\theta _1}{1+\theta _2}(1-v)(1-u)^{(\theta _1-\theta _2)}, &{} 1-(1-u)^{\theta _2}<v< 1. \end{array}\right. \end{aligned}$$

Note that \( C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u)\) is a decreasing function in u as \(\theta _1\le \theta _2\). Now, using Definition 5, the result follows.
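
The composition can be explored numerically as follows. The sketch codes the conditional copula, inverts it branch by branch, and checks on a grid that \(C_{\theta _2}^{-1}(C_{\theta _1}(v\mid u)\mid u)\) does not increase in u. The inversion routine, the values \(\theta _1=0.5\) and \(\theta _2=2\), the grid, and all function names are illustrative assumptions. The first branch of the displayed formula is compared with the numerical composition at a single point.

```python
def cond_copula(v, u, theta):
    """C_theta(v | u), the conditional copula of V given U = u."""
    k = theta / (1.0 + theta)
    if v <= k * (1.0 - u):
        return 0.0
    if v <= k:
        return 1.0 - (k * (1.0 - u) / v) ** theta
    return 1.0 - (1.0 + theta) * (1.0 - v) * (1.0 - u) ** theta

def inv_cond_copula(p, u, theta):
    """Solve C_theta(w | u) = p for w; for p = 0 this returns the support boundary."""
    k = theta / (1.0 + theta)
    if p <= 1.0 - (1.0 - u) ** theta:
        return k * (1.0 - u) * (1.0 - p) ** (-1.0 / theta)
    return 1.0 - (1.0 - p) / ((1.0 + theta) * (1.0 - u) ** theta)

def composition(v, u, theta1, theta2):
    """Numerical value of the inverse conditional copula applied to C_theta1(v | u)."""
    return inv_cond_copula(cond_copula(v, u, theta1), u, theta2)

def closed_form_lower(v, u, theta1, theta2):
    """First branch of the displayed formula (both conditional copulas on their first case)."""
    r = theta1 / theta2
    return theta2 / (1.0 + theta2) * ((1.0 + theta1) / theta1) ** r * (1.0 - u) ** (1.0 - r) * v ** r

T1, T2 = 0.5, 2.0                        # illustrative values with theta1 <= theta2
grid = [i / 40.0 for i in range(1, 40)]

monotone = all(
    composition(v, a, T1, T2) >= composition(v, b, T1, T2) - 1e-12
    for v in grid for a, b in zip(grid, grid[1:])
)
gap = abs(composition(0.2, 0.5, T1, T2) - closed_form_lower(0.2, 0.5, T1, T2))
print("nonincreasing in u:", monotone, "gap to closed form:", gap)
```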

C3 Proof of Theorem 11

Let \(\theta _1\le \theta _2\). Now, it is easy to verify that the condition provided in Definition 6 holds for any choice of \(u_1, u_2, v_1, v_2\), where \(u_1\le u_2\), \(v_1\le v_2\).

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ghosh, S., Bhuyan, P. & Finkelstein, M. On a bivariate copula for modeling negative dependence: application to New York air quality data. Stat Methods Appl (2022). https://doi.org/10.1007/s10260-022-00636-3

  • DOI: https://doi.org/10.1007/s10260-022-00636-3

Keywords

  • Air quality
  • Inference function for margins
  • Kolmogorov–Smirnov test
  • Negatively ordered
  • Negatively quadrant dependent