A new class of bivariate copulas: dependence measures and properties

In this paper, we propose a new class of bivariate Farlie–Gumbel–Morgenstern (FGM) copulas. This class includes several known extensions of the FGM copula. We obtain general formulas for well-known association measures of this class and study various properties of the proposed model. The tail dependence coefficient of the new class ranges from 0 to 1, and its correlation range is wider than that of the classical FGM copula. We apply some sub-families of the proposed class to a medical dataset to show the advantage of our approach over the generalized FGM families presented in the literature. We also present a method for simulating from the generalized FGM copula and validate it by checking that the simulated data recover the dependence structure of the original data.


Introduction
where c(s, t) is the so-called copula density.
First, recall that the random variables $X$ and $Y$ are exchangeable if $(X, Y)$ and $(Y, X)$ are identically distributed. For copulas, exchangeability is equivalent to symmetry: a copula $C$ is symmetric if $C(s, t) = C(t, s)$ for every $(s, t) \in [0, 1]^2$; otherwise $C$ is asymmetric. Second, the random variables $X$ and $Y$ are positively quadrant dependent (PQD) if $P(X \le x, Y \le y) \ge P(X \le x)\,P(Y \le y)$ for every $(x, y) \in \mathbb{R}^2$, or equivalently $C(s, t) \ge st$ for every $(s, t) \in [0, 1]^2$. Let $C_1$ and $C_2$ be two copulas. The copula $C_2$ is said to be more concordant (or more PQD) than the copula $C_1$, written $C_1 \prec C_2$, if $C_1(s, t) \le C_2(s, t)$ for every $(s, t) \in [0, 1]^2$.
The study of copulas and their applications has developed considerably over the past decades, as a tool for describing the dependence between random variables (see, e.g., the surveys [12,17,21]). Copulas also play an important role in mathematical modeling and simulation, so constructing new families of copulas is of real interest. One of the most popular parametric families, studied by [11,13,22], is the Farlie-Gumbel-Morgenstern (FGM) copula, defined by $C_\theta(s, t) = st\,[1 + \theta(1 - s)(1 - t)]$, $(s, t) \in [0, 1]^2$, where $\theta \in [-1, 1]$ is called the association parameter. The FGM copula is PQD for $\theta \in (0, 1]$. However, this copula is known to be somewhat limited: for $\theta \in [-1, 1]$, Spearman's rho and Kendall's tau are $\rho_S = \theta/3 \in [-1/3, 1/3]$ and $\tau_k = 2\theta/9 \in [-2/9, 2/9]$, respectively. Since the correlation range of the FGM copula is limited, more general copulas have been introduced with the aim of widening it. One approach to generalizing the FGM copula is the symmetric semi-parametric extension defined by [26], which was studied extensively in [3] and [2]. [15] developed polynomial-type single-parameter extensions of the FGM copula and showed that $\rho_S$ can be increased up to approximately 0.375, while the lower bound remains $-0.33$. [19] set conditions for positive quadrant dependence and studied a class of bivariate uniform distributions with the PQD property by generalizing the uniform representation of the FGM copula. By a simple transformation, they also obtained families of bivariate distributions with pre-specified marginals. [4] further extended the family introduced by [15], obtaining Spearman's $\rho_S \in [-0.48, 0.5016]$. [25] and [1] proposed a new class of bivariate copulas depending on two univariate functions, which generalizes known families of copulas such as the FGM family.
[5] proposed a new class of generalized FGM copulas and showed that their generalization can widen the correlation range of the FGM copula. Recently, [23,24] further extended the family given by [5]. The FGM copulas they studied have Spearman's $\rho_S \in [-0.48, 0.5308]$, which is wider than that of the other FGM families discussed in the literature.
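The limited range of the classical FGM copula can be checked numerically. The sketch below is illustrative only: it uses the standard FGM form $C_\theta(s, t) = st[1 + \theta(1 - s)(1 - t)]$ and the standard copula expression $\rho_S = 12\int_0^1\int_0^1 C(s, t)\,ds\,dt - 3$; the function names and the grid sizes are assumptions introduced here.

```python
import numpy as np

def fgm(s, t, theta):
    """Classical FGM copula C(s, t) = s t [1 + theta (1 - s)(1 - t)]."""
    return s * t * (1.0 + theta * (1.0 - s) * (1.0 - t))

def spearman_rho(copula, n=400):
    """Midpoint-rule approximation of rho_S = 12 * (double integral of C) - 3."""
    x = (np.arange(n) + 0.5) / n              # midpoints of n equal cells
    s, t = np.meshgrid(x, x, indexing="ij")
    return 12.0 * copula(s, t).mean() - 3.0

theta = 1.0                                   # upper end of the FGM range
rho = spearman_rho(lambda s, t: fgm(s, t, theta))

# PQD check on a grid: C(s, t) >= s t whenever theta > 0
x = np.linspace(0.0, 1.0, 101)
s, t = np.meshgrid(x, x, indexing="ij")
is_pqd = bool(np.all(fgm(s, t, theta) >= s * t))

print(f"rho_S at theta = 1: {rho:.4f} (closed form theta/3 = {theta / 3:.4f})")
print("PQD on grid:", is_pqd)
```

Even at the extreme value $\theta = 1$ the numerical value stays near $1/3$, which is the limitation that motivates the generalizations reviewed above.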
In this regard, this paper proposes another generalized class of FGM copulas, which includes several extended copulas introduced in recent years and widens the correlation range. From another perspective, the proposed family is an extension of the generalized FGM copula discussed in [5]. The main contributions of this paper are as follows. First, an extension of the FGM copula is introduced and some of its properties are presented. Second, general formulas for the association measures of this family are derived. The main feature of this family is its capability to model a wider range of dependence, which extends the range of potential applications of the family in various branches of science.
The rest of the paper is organized as follows. The new extension and its basic characteristics are described in Sect. 2. Section 3 is dedicated to a remark and a property of the new class. General formulas for well-known association measures of this class are obtained in Sect. 4. Section 5 is devoted to the application and simulation results.

A new class of bivariate FGM copula and basic properties
Consider the continuous function ψ : As an example, the function ψ(s, 1] 2 satisfies the above conditions.

Definition 2.1 Based on the function $\psi$ and its properties, suppose that the strictly continuous function ψ : The admissible parameter space depends on the properties of the function $\psi$. We assume that $\psi$ does not change sign on $[0, 1]^2$, so that the dependence structure is uniquely determined. Note that a copula takes values in $[0, 1]$; therefore, $[1 + \theta\psi(s, t)]^p$ should be bounded on $[0, 1]$. The following theorem gives necessary and sufficient conditions on $\psi$ to ensure that $C_\theta^{\psi,p}$ is a bivariate copula.
Note that $\theta$ is the parameter that governs the dependence structure of the family $C_\theta^{\psi,p}$: $\theta = 0$ or $p = 0$ leads to the independence of $S$ and $T$. By Theorem 2.1, the admissible range of the parameter $\theta$ depends on the properties of the function $\psi$, which has been investigated via (2. is symmetric, otherwise is asymmetric. As an example, let ψ(s, t) , $a \ge 1$; then the generated copula is symmetric. Also, let $\psi(s, t) = (1 - \sqrt{s^a})(1 - t)$, $a \ge 1$; then the generated copula is asymmetric. In particular, let ψ( is a continuously differentiable function on $(0, 1)$, and $f(1) = 0$; then, by Theorem 2.1, the generated family is a symmetric bivariate copula. Moreover, let 1], are continuously differentiable functions on $(0, 1)$ and $f_1(1) = f_2(1) = 0$; then, by Theorem 2.1, the generated family is an asymmetric bivariate copula ([7]).
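To make the role of the parameter range concrete, the following sketch checks the copula properties on a grid. It assumes that (2.1) has the form $C_\theta^{\psi,p}(s, t) = st[1 + \theta\psi(s, t)]^p$, which is inferred from the surrounding text, and uses the asymmetric generator $\psi(s, t) = (1 - \sqrt{s^a})(1 - t)$ mentioned above. The values $a = 4$, $p = 2$, $\theta = 0.2$ and $\theta = 0.9$, as well as the helper names, are hypothetical choices for illustration. A grid check is a necessary condition only, not a proof.

```python
import numpy as np

def gen_fgm(s, t, theta, p, psi):
    """Assumed form of (2.1): C(s, t) = s t [1 + theta psi(s, t)]^p."""
    return s * t * (1.0 + theta * psi(s, t)) ** p

def psi_asym(s, t, a=4.0):
    """Asymmetric generator from the text: (1 - sqrt(s^a)) (1 - t)."""
    return (1.0 - np.sqrt(s ** a)) * (1.0 - t)

def looks_like_copula(C, n=200, tol=1e-12):
    """Grid check of the boundary conditions and the 2-increasing property."""
    x = np.linspace(0.0, 1.0, n + 1)
    s, t = np.meshgrid(x, x, indexing="ij")
    grid = C(s, t)
    boundaries = (
        np.allclose(grid[0, :], 0.0) and np.allclose(grid[:, 0], 0.0)
        and np.allclose(grid[-1, :], x) and np.allclose(grid[:, -1], x)
    )
    # C-volume of every grid rectangle must be non-negative
    vol = grid[1:, 1:] - grid[:-1, 1:] - grid[1:, :-1] + grid[:-1, :-1]
    return bool(boundaries and vol.min() >= -tol)

ok_small = looks_like_copula(lambda s, t: gen_fgm(s, t, 0.2, 2.0, psi_asym))
ok_large = looks_like_copula(lambda s, t: gen_fgm(s, t, 0.9, 2.0, psi_asym))
asym_gap = gen_fgm(0.3, 0.7, 0.2, 2.0, psi_asym) - gen_fgm(0.7, 0.3, 0.2, 2.0, psi_asym)

print("theta = 0.2 passes grid check:", ok_small)
print("theta = 0.9 passes grid check:", ok_large)
print("C(0.3, 0.7) - C(0.7, 0.3) =", asym_gap)
```

With these illustrative values, the small $\theta$ passes while the large one produces negative rectangle volumes near the corner $(1, 0)$, which shows why the admissible range of $\theta$ has to be tied to $\psi$ and $p$.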

Some known copulas and one property
In this section, we present a remark and a property of the family $C_\theta^{\psi,p}$ introduced in (2.1).
includes some known families of FGM copulas introduced in recent years, as follows: reduces to the symmetric FGM copula discussed by [11,13,22]. (3.1) The new family $C^{\psi,\delta}$ can be regarded as a new symmetric generalization of the Gumbel-Barnett (GB) copula discussed by [16], when $\psi(s, t) = \ln(s)\ln(t)$, $\forall s, t \in [0, 1]$, and of the Celebioglu-Cuadras copula introduced by Cuadras ([9,10] 1]. Also, the copula $C^{\psi,\delta}$ can be considered a new generalization of the copula of [18]. Under some conditions, [18] introduced families of copulas that are closed under the construction of generalized linear means. One of these families has the form: where $\phi$ is a function defined on $I = [0, 1]$ that satisfies $\phi(1) = 0$.

Measures of dependence
Measures of dependence are common instruments for summarizing a complicated dependence structure in the bivariate case. For a historical review of measures of dependence, see [17] and [21]. In this section, we compute the measures of dependence for the family $C_\theta^{\psi,p}$. Since we cannot give formulas for the dependence measures in terms of elementary functions, the family is replaced by its series expansion on Based on , the family $C_\theta^{\psi,p}$ in (2.1), for every $p \in [1, \infty)$, may also be written as a polynomial expansion with respect to $\psi$ as Note that, in (4.1), the upper summation limit is $p$ when $p$ is an integer and $+\infty$ otherwise. Moreover, the density $c_\theta^{\psi,p}$ of the family in (2.2) can be written as

Proof By using (4.2), $E[\psi(S, T)]$ can be expanded as Using integration by parts (Appendix A), we have (1, k), So,
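As a hedged illustration of the expansion idea, suppose that (2.1) has the form $C_\theta^{\psi,p}(s, t) = st[1 + \theta\psi(s, t)]^p$ (an assumption inferred from the surrounding text) and that $|\theta\psi(s, t)| < 1$ so that the binomial series converges. The generalized binomial theorem then gives the expansion below, and any measure that is linear in $C$, such as Spearman's rho in its standard copula form, can be computed term by term.

```latex
\begin{align*}
C_\theta^{\psi,p}(s,t)
  &= st\,[1+\theta\psi(s,t)]^{p}
   = st \sum_{k=0}^{\infty} \binom{p}{k}\,\theta^{k}\,\psi^{k}(s,t),
  \qquad
  \binom{p}{k} = \frac{p(p-1)\cdots(p-k+1)}{k!}, \\[4pt]
\rho_S
  &= 12\int_0^1\!\!\int_0^1 C_\theta^{\psi,p}(s,t)\,ds\,dt - 3
   = 12 \sum_{k=1}^{\infty} \binom{p}{k}\,\theta^{k}
     \int_0^1\!\!\int_0^1 st\,\psi^{k}(s,t)\,ds\,dt .
\end{align*}
```

For integer $p$ the coefficients $\binom{p}{k}$ vanish for $k > p$, so the sum is finite; the $k = 0$ term contributes $12 \cdot \tfrac14 - 3 = 0$, which is why the second sum starts at $k = 1$.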

Spearman's rho
Let $X$ and $Y$ be continuous random variables whose copula is $C$. Then the population version of Spearman's rho for $X$ and $Y$ is given by $\rho_S = 12\int_0^1\int_0^1 C(s, t)\,ds\,dt - 3$. Note that $\rho_S$ coincides with Pearson's correlation coefficient $\rho$ between the uniformly distributed variables $S$ and $T$.
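The effect of the extra parameter $p$ on Spearman's rho can be illustrated numerically. This is a sketch under stated assumptions: (2.1) is taken to be $st[1 + \theta\psi(s, t)]^p$, the generator is the classical $\psi(s, t) = (1 - s)(1 - t)$, and $\theta = 0.3$ is an arbitrary illustrative value. For $p = 2$, expanding the square and integrating term by term gives the closed form $\rho_S = 2\theta/3 + \theta^2/12$, which the code uses as a cross-check.

```python
import numpy as np

def gen_fgm(s, t, theta, p):
    """Assumed form of (2.1) with psi(s, t) = (1 - s)(1 - t)."""
    return s * t * (1.0 + theta * (1.0 - s) * (1.0 - t)) ** p

def spearman_rho(theta, p, n=400):
    """rho_S = 12 * (double integral of C over the unit square) - 3, midpoint rule."""
    x = (np.arange(n) + 0.5) / n
    s, t = np.meshgrid(x, x, indexing="ij")
    return 12.0 * gen_fgm(s, t, theta, p).mean() - 3.0

theta = 0.3                                   # illustrative value
rho_p1 = spearman_rho(theta, 1)               # classical FGM: theta / 3
rho_p2 = spearman_rho(theta, 2)
closed_p2 = 2.0 * theta / 3.0 + theta ** 2 / 12.0

print(f"p = 1: {rho_p1:.4f}")
print(f"p = 2: {rho_p2:.4f} (closed form {closed_p2:.4f})")
```

For the same $\theta$, moving from $p = 1$ to $p = 2$ roughly doubles $\rho_S$ in this example, which is the mechanism by which the additional parameter widens the correlation range.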

Kendall's tau
In terms of the copula, Kendall's tau $\tau_k$ is defined as (see [21]) $\tau_k = 4\int_0^1\int_0^1 C(s, t)\,dC(s, t) - 1$.

Proof The proof of this proposition is deferred to Appendix B.
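Kendall's tau can also be approximated numerically without the density. The sketch below relies on the standard identity $\tau_k = 1 - 4\int_0^1\int_0^1 \frac{\partial C}{\partial s}\frac{\partial C}{\partial t}\,ds\,dt$, which is equivalent to $\tau_k = 4\int\int C\,dC - 1$ for copulas. Finite differences and a trapezoid rule are implementation choices made here, and the FGM copula with $\theta = 0.5$ is used only because its value $2\theta/9$ is known.

```python
import numpy as np

def kendall_tau(copula, n=400):
    """tau = 1 - 4 * double integral of (dC/ds)(dC/dt), on an (n+1) x (n+1) grid."""
    x = np.linspace(0.0, 1.0, n + 1)
    h = 1.0 / n
    s, t = np.meshgrid(x, x, indexing="ij")
    C = copula(s, t)
    # second-order finite differences along s (axis 0) and t (axis 1)
    dC_ds, dC_dt = np.gradient(C, h, h, edge_order=2)
    w = np.full(n + 1, h)
    w[0] = w[-1] = h / 2.0                    # trapezoid weights
    integral = w @ (dC_ds * dC_dt) @ w
    return 1.0 - 4.0 * integral

def fgm(s, t, theta):
    """Classical FGM copula."""
    return s * t * (1.0 + theta * (1.0 - s) * (1.0 - t))

tau_fgm = kendall_tau(lambda s, t: fgm(s, t, 0.5))
tau_indep = kendall_tau(lambda s, t: s * t)

print(f"FGM, theta = 0.5: {tau_fgm:.4f} (closed form {2 * 0.5 / 9:.4f})")
print(f"independence:     {tau_indep:.4f}")
```

The same function accepts any copula given as a vectorized callable, so it can be reused for members of the generalized family once $\psi$, $\theta$ and $p$ are fixed.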

Gini's gamma and the Spearman's footrule coefficient
Let $X$ and $Y$ be continuous random variables whose copula is $C$; then the population versions of Gini's gamma ($\gamma_C$) and Spearman's footrule coefficient ($\delta_C$) for $X$ and $Y$ are given by θ ; then the direction of equality between Gini's gamma ($\gamma_C$) and Spearman's footrule ($\delta_C$) is given by

Proof The proof is straightforward.

As Remark 4.1 shows, the correlation range of the FGM copula is limited, and therefore it is not suitable for modeling strong dependence. One advantage of the family $C_\theta^{\psi,p}$ is that the additional parameter $p$ widens the correlation range of the FGM copula and of some generalized FGM families presented in recent years.
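Both coefficients depend on the copula only through its diagonal sections, so they reduce to one-dimensional integrals. The sketch below assumes the standard copula expressions $\gamma_C = 4\left[\int_0^1 C(s, 1 - s)\,ds - \int_0^1 (s - C(s, s))\,ds\right]$ and $\delta_C = 6\int_0^1 C(s, s)\,ds - 2$, which may differ in form from the expressions used in this paper. It is checked against the classical FGM copula, for which direct integration gives $\gamma_C = 4\theta/15$ and $\delta_C = \theta/5$.

```python
import numpy as np

def fgm(s, t, theta):
    """Classical FGM copula."""
    return s * t * (1.0 + theta * (1.0 - s) * (1.0 - t))

def gini_gamma(copula, n=2000):
    """Gini's gamma from the two diagonal sections, midpoint rule."""
    s = (np.arange(n) + 0.5) / n
    return 4.0 * (copula(s, 1.0 - s).mean() - (s - copula(s, s)).mean())

def footrule(copula, n=2000):
    """Spearman's footrule from the main diagonal, midpoint rule."""
    s = (np.arange(n) + 0.5) / n
    return 6.0 * copula(s, s).mean() - 2.0

theta = 1.0
gamma_fgm = gini_gamma(lambda s, t: fgm(s, t, theta))
delta_fgm = footrule(lambda s, t: fgm(s, t, theta))

print(f"Gini's gamma: {gamma_fgm:.5f} (closed form {4 * theta / 15:.5f})")
print(f"footrule:     {delta_fgm:.5f} (closed form {theta / 5:.5f})")
```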

Tail dependence
The concept of tail dependence relates to the amount of dependence in the upper-right-quadrant tail and the lower-left-quadrant tail of a bivariate distribution. It is relevant for the study of dependence between extreme values. Tail dependence between two continuous random variables $X$ and $Y$ is a copula property, and hence the amount of tail dependence is invariant under strictly increasing transformations of $X$ and $Y$ ([8,14,17]). For a bivariate copula $C$, if the limit $\lambda_U = \lim_{s \to 1^-} \frac{1 - 2s + C(s, s)}{1 - s}$ exists, then $C$ has upper tail dependence if $\lambda_U \in (0, 1]$, and upper tail independence if $\lambda_U = 0$. This measure is used extensively in extreme value theory. The concept of lower tail dependence can be defined in a similar way. If the limit $\lambda_L = \lim_{s \to 0^+} \frac{C(s, s)}{s}$ exists, then $C$ has lower tail dependence if $\lambda_L \in (0, 1]$, and lower tail independence if $\lambda_L = 0$.

Proof Clearly, the upper tail dependence coefficient ($\lambda_U$) can be simplified as $\lambda_U = -p\theta \lim_{s \to 1^-} [\psi_{s,1}(s, s) + \psi_{s,2}(s, s)] = -p\theta\,[\psi_{s,1}(1, 1) + \psi_{s,2}(1, 1)]$.
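The simplification in the proof can be spelled out as follows. This is a sketch under assumptions: (2.1) is taken to be $C_\theta^{\psi,p}(s, t) = st[1 + \theta\psi(s, t)]^p$, $\psi$ is differentiable near $(1, 1)$ with $\psi(1, 1) = 0$, and $\partial_1\psi$, $\partial_2\psi$ denote the partial derivatives with respect to the first and second argument (presumably the paper's $\psi_{s,1}$ and $\psi_{s,2}$). Since $C(1, 1) = 1$, the defining limit is of type $0/0$ and L'Hôpital's rule applies.

```latex
\begin{align*}
\lambda_U
  &= \lim_{s\to 1^-} \frac{1 - 2s + C(s,s)}{1 - s}
   = 2 - \lim_{s\to 1^-} \frac{d}{ds}\,C(s,s), \\[4pt]
\frac{d}{ds}\,C(s,s)
  &= 2s\,[1+\theta\psi(s,s)]^{p}
   + p\,\theta\,s^{2}\,[1+\theta\psi(s,s)]^{p-1}
     \bigl[\partial_1\psi(s,s) + \partial_2\psi(s,s)\bigr], \\[4pt]
\lim_{s\to 1^-} \frac{d}{ds}\,C(s,s)
  &= 2 + p\,\theta\,\bigl[\partial_1\psi(1,1) + \partial_2\psi(1,1)\bigr], \\[4pt]
\lambda_U
  &= -\,p\,\theta\,\bigl[\partial_1\psi(1,1) + \partial_2\psi(1,1)\bigr].
\end{align*}
```

For the classical generator $\psi(s, t) = (1 - s)(1 - t)$ both partial derivatives vanish at $(1, 1)$, so $\lambda_U = 0$, in line with the known upper tail independence of the FGM copula.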
In (4.8), the copula $C_\theta^{\psi,p}$ is a one-parameter family of copulas whose upper tail dependence coefficient ($\lambda_U$) ranges from 0 to 1 through the function $\psi$. Taking $\psi(s, t)$ as the cumulative distribution function of the uniform distribution on $[0, \theta]$, $\theta \le 1$, introduced in [15], gives the new family of copulas

Example 4.4 Choosing
By using (4.8), in this new family, we have 0 in (2.1), which leads to a new copula as follows:

Application and simulation
In this section, we apply the proposed generalized FGM copula to a real dataset from medical science. According to the manual of the R package MASS, the US National Institute of Diabetes and Digestive and Kidney Diseases collected a dataset from a population of women (at least 21 years old, of Pima Indian heritage, and living near Phoenix, Arizona) who were tested for diabetes according to World Health Organization criteria. This dataset was later reanalyzed by [20]. It consists of 200 complete records after dropping the data on serum insulin. We focus on two features that are important in the analysis and management of body health tests, Body Mass Index (BMI) and Diabetes Pedigree Function (PED), and model the dependence between them. To quantify the relationship between these two features, it is necessary to determine their joint distribution. Given the correlation coefficient of 0.172, we reject the assumption of independence between the two variables (sig. = 0.015). Since the correlation between BMI and PED is low, we determine the dependence structure and the bivariate distribution of BMI and PED by fitting both the FGM copula and the generalized FGM copula proposed in this paper.

Fig. 1 Plots of the empirical joint distribution from the generalized FGM copula model
To this end, and to avoid specifying parametric marginal distributions, we use the kernel method to estimate the marginal distributions of BMI and PED before further analysis. Hereafter, $S$ and $T$ denote the cumulative distribution functions of BMI and PED, respectively, as estimated by the kernel method. To estimate the parameters $\theta$ and $p$ of the generalized FGM copula in (2.1), the log-likelihood function was computed. The results for different functions $\psi$, together with the parameter estimates and the AIC criterion, are presented in Table 1. This table shows that the family $C_\theta^{\psi,p}$ is a flexible generalized FGM copula: by choosing different types of $\psi$ functions and estimating the parameters $\theta$ and $p$, the family can better fit the medical data of interest. As an example, for $\psi = s(1 - s)(1 - t)$, $\theta = 0.1553$ and $p = 3.5237$ ($p > 1$), the family $C_\theta^{\psi,p}$ attains a lower AIC. These results can be investigated further using simulation and scatter plots. To evaluate and compare the functions in Table 1, we use closeness to the empirical joint distribution of the data as the main measure; the results of this comparison are summarized in Fig. 1. Figure 1 shows that joint model III is closer to the empirical points; therefore, joint distribution function III is more suitable for fitting the data (Fig. 2). In Fig. 3, the contour plots for the functions of Table 1 are drawn.

We now discuss the simulation of data from the generalized FGM family and compare the correlations in the simulated data with those in the observed data, based on 1000 simulations. We follow the simulation method proposed by Johnson (1987, Ch. 3) and later by Nelsen (2006, p. 41). A sampling algorithm to simulate from $C_\theta^{\psi,p}(s_1, t_1)$ is as follows: (1) Draw two independent uniform random values $(s_1, t_2)$.
The vector $(s_1, t_1)$ is generated from the family $C_\theta^{\psi,p}$. Table 1. The simulated data and the original data appear to have similar dependence patterns, but the degree of agreement between the observed and simulated data is not clear from the sub-figures of Fig. 1. To address this concern, Table 2 shows the rank correlations between the BMI and PED variables calculated from the original observed data and from simulated data of size 1000 drawn from the fitted FGM copula and from the generalized FGM copula family. By comparing these correla-
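The sampling scheme referred to in this section can be sketched with the standard conditional-distribution method: draw $s_1$ and an auxiliary uniform value, then solve $\partial C(s_1, t)/\partial s = $ auxiliary value for $t$. The code below is a minimal sketch under several assumptions that are not taken from the paper: (2.1) is assumed to be $st[1 + \theta\psi(s, t)]^p$, the generator is the classical $\psi(s, t) = (1 - s)(1 - t)$, the values $\theta = 0.3$ and $p = 2$ are illustrative rather than fitted, and the inversion is done by bisection.

```python
import numpy as np

THETA, P = 0.3, 2.0      # illustrative values, not the fitted estimates

def cond_cdf(s, t, theta=THETA, p=P):
    """dC/ds for the assumed C(s, t) = s t [1 + theta (1 - s)(1 - t)]^p."""
    a = 1.0 + theta * (1.0 - s) * (1.0 - t)
    return t * a ** p - p * theta * s * t * (1.0 - t) * a ** (p - 1.0)

def sample(n, rng, iters=60):
    """Conditional-distribution sampling with vectorized bisection in t."""
    s = rng.uniform(size=n)
    q = rng.uniform(size=n)                   # auxiliary uniform draw
    lo, hi = np.zeros(n), np.ones(n)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        below = cond_cdf(s, mid) < q          # root lies to the right of mid
        lo = np.where(below, mid, lo)
        hi = np.where(below, hi, mid)
    return s, 0.5 * (lo + hi)

rng = np.random.default_rng(0)
s, t = sample(20000, rng)

# With uniform margins, the Pearson correlation of (s, t) estimates rho_S
rho_hat = np.corrcoef(s, t)[0, 1]
rho_closed = 2.0 * THETA / 3.0 + THETA ** 2 / 12.0   # closed form for p = 2
print(f"sample rho_S: {rho_hat:.3f}, closed form: {rho_closed:.3f}")
```

Bisection is used because the conditional distribution function is increasing in $t$ whenever the copula density is non-negative, so no derivative of the conditional distribution is needed; comparing the sample rank correlation with its theoretical value mirrors the kind of check reported in Table 2.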

Appendix A
Using integration by parts, we have Also, Thus, it can conclude that, In a similar way,