Effective estimation algorithm for parameters of multivariate Farlie–Gumbel–Morgenstern copula

This paper focuses on the parameter estimation for the d-variate Farlie–Gumbel–Morgenstern (FGM) copula (d≥2\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$d\ge 2$$\end{document}), which has 2d-d-1\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$2^d-d-1$$\end{document} dependence parameters to be estimated; therefore, maximum likelihood estimation is not practical for a large d from the viewpoint of computational complexity. Besides, the restriction for the FGM copula’s parameters becomes increasingly complex as d becomes large, which makes parameter estimation difficult. We propose an effective estimation algorithm for the d-variate FGM copula by using the method of inference functions for margins under the restriction of the parameters. We then discuss its asymptotic normality as well as its performance determined through simulation studies. The proposed method is also applied to real data analysis of bearing reliability.


Introduction
A copula is a function that joins several one-dimensional distribution functions to form a multivariate distribution function with dependency (Nelsen 2006). The copula works as a multivariate distribution function by itself.

3
In reliability engineering, the copula is used as a useful tool for reliability analysis of systems in which dependent failures sometimes occur. A dependent failure of systems and/or system components may occur due to sharing heat, vibration, and tasks (McCool 2012). We should consider the dependence of components to precisely assess the reliability of the system. For example, Eryilmaz and Tank (2012) showed that the positive dependence between two components improves the mean time to failure of a two-component series system by using an Farlie-Gumbel-Morgenstern (FGM) copula and Ali-Mikhail-Haq copula. Navarro et al. (2007) developed a reliability model for coherent systems with dependent components by using several types of copulas including the FGM copula.
There have been many studies on the FGM copula. Earlier studies (Farlie 1960;Gumbel 1960;Morgenstern 1956) discussed families of the bivariate FGM copula. Johnson and Kotz (1975) formulated the FGM copula as a multivariate distribution. The multivariate FGM copula is useful as an alternative to a multivariate normal distribution because it has a simple form and can express mutual dependencies among two or more variables (Jaworski et al. 2010). Therefore, the FGM copula has been applied to statistical modeling in various research fields such as economics (Patton 2006), educational engineering (Shih et al. 2019), finance (Cossette et al. 2013) and reliability engineering (Navarro et al. 2007;Navarro and Durante 2017).
However, there are no practical estimation methods of the parameters of the multivariate FGM copula. The reason mainly depends on the computational complexity of estimating a large number of the parameters. It is known that the d-variate FGM copula consists of 2 d − d − 1 constrained parameters for d ≥ 2 . In fact, the d-variate FGM copula has relatively many parameters compared with other multivariate copulas (e.g., a d-variate Gaussian copula has d(d − 1)∕2 (≤ 2 d − d − 1) correlation parameters (Jaworski et al. 2010)). This implies that estimation for the multivariate FGM copula's parameters is computationally more complicated than that for the other copulas' parameters. Besides, maximum likelihood estimation (MLE) is infeasible over such a high dimensional space with the parameter constraint.
The method of inference functions for margins (IFM) is one acceptable estimation method for parameters of copulas. It was proposed by Joe and Xu (1996), and extensively reviewed by Xu (1996). This method first estimates the univariate parameters from separate univariate likelihood functions of a copula. It then estimates bivariate, trivariate and multivariate parameters step by step from respective bivariate, trivariate, and multivariate marginal likelihoods with lower order parameters fixed as the estimated values. The advantages of IFM are its computational efficiency and asymptotic normality that holds under several regularity conditions (Xu 1996;Joe 2005;Patton 2006). However, the following two problems exist when we apply IFM to the multivariate FGM copula: (1) it is not known how to consider the complex restriction for the FGM copula's parameters, and (2) it has not been found whether the estimators of IFM for the multivariate FGM copula have asymptotic normality.
In this paper, we propose an effective estimation algorithm for the multivariate FGM copula by using IFM. The proposed algorithm requires linear optimization to consider the restriction of the multivariate FGM copula's parameters. Besides, we analyze the asymptotic normality of the estimators given by the proposed algorithm.
The remainder of this paper is organized as follows. In Sect. 2, we introduce the definition and property of the multivariate FGM copula. In Sect. 3, we propose estimation algorithms for the multivariate FGM copula based on MLE and IFM. Their asymptotic properties are analytically investigated in Sect. 4. Section 5 consists of simulation results to compare MLE and IFM for the multivariate FGM copula from the viewpoint of estimation accuracy and computation time. In Sect. 6, our proposed method is applied to real data analysis of bearing reliability. Finally, we conclude our study with a summary in Sect. 7.

FGM copula
Let U = (U 1 , … , U d ) be a random vector that follows a d-variate FGM copula with d uniform marginal distributions in the interval [0, 1]. Let denote the parameter vector of the d-variate FGM copula. According to Johnson and Kotz (1975) and p. 108 of Nelsen (2006), the joint distribution function of the d-variate FGM copula, denoted by C d , is defined as where (u 1 , … , u d ) ∈ [0, 1] d and j 1 ⋯j k ∈ is a parameter. Since ≠ 0 means that the random variables are obviously dependent, we call an element of a dependence parameter. For d = 2 and 3, we obtain The correlation and regression properties for any pair of U were studied by Johnson and Kotz (1977).
The joint density function of U , denoted by c d , is given by is a parameter vector such that the joint density function c d (u 1 , … , u d ; ) is nonnegative for every u j ∈ [0, 1] . Thus, must satisfy the following restriction (Johnson and Kotz 1975;Jaworski et al. 2010).
is a linear function of each u j , the substitution of u j = 0, 1 in every 2 d possible combinations will yield necessary and sufficient conditions of (Johnson and Kotz 1975). Thus, we can write Eq. (1) in a compact form as follows.
The condition given by Eq.
(2) becomes increasingly complex as d becomes large. When we estimate the dependence parameters, we need to consider this condition. Multivariate FGM distributions can be constructed by the Sklar's theorem (Nelsen 2006). Suppose X = (X 1 , … , X d ) is a random vector that follows the d-variate FGM distribution with arbitrary continuous marginal distributions. For j = 1, … , d , let F j (x j ; j ) and f j (x j ; j ) be the jth marginal distribution function and the density function with a parameter vector j , respectively. Let us define as = ( 1 , … , d ) . Then, the cumulative distribution function of X , denoted by H d , is represented as The joint density function of X , denoted by h d , is given by where f j ≡ f j (x j ; j ) and F j ≡ F j (x j ; j ).

Estimation
In this section, we first introduce MLE and its limitation for the multivariate FGM distribution. To solve the limitation, we then propose an estimation algorithm for the multivariate FGM distribution by using IFM.

Maximum likelihood estimation
MLE is one of the natural choices to estimate parameters of probability distributions because its estimators satisfy asymptotic normality under several regularity conditions (Lehmann and Casella 1998;Joe 2014). With this method, we estimate all the parameters and simultaneously. For a sample size n, with observed independent random vectors x i = (x i1 , … , x id ) for i = 1, … , n , the full-dimensional log-likelihood function of and can be written as Thus, the estimators of MLE, denoted as ̂ and ̂ , are given by To find (̂ ,̂ ) , we need to use numerical optimization techniques because Eq. (4) does not have closed form expression. The problems of numerical optimization in Eq. (4) are high-dimensionality and the parameter constraints given by Eq.
(2). Numerical optimization in a high-dimensional and constrained space requires a large amount of computational resources. This issue relates to computational feasibility. Numerical optimization might take a large amount of computation time and fail as the dimension increases. Also, the estimation accuracy of MLE worsens because the numerical optimization technique may not find the global optimum but a local one. Even if all the marginal distributions are uni-parameter distributions (i.e., j = j ), Eq. (4) consists of the 2 d − 1 unknown parameters and . Moreover, the ignorance of (4) (̂ ,̂ ) = arg max , ( , ;x 1 , … , x n ).

3
parameter constraints is one of the most common reasons why the Newton-Raphson type algorithms diverge (MacDonald 2014). Therefore, other simple estimation methods have been required for the multivariate FGM distribution's parameters.

The method of inference functions for margins
To reduce the computational difficulty of MLE, we introduce IFM for estimation of the multivariate FGM distribution. With this method, we estimate the dependence parameters one by one so that each estimation result satisfies the condition given by Eq. (2).
As the first step, let us define k-dimensional marginal likelihood functions for k = 1, 2, … , d − 1 . It is easy to see that the k-dimensional marginal distribution function is as follows.
Thus, for example, we have The key point of these functions is that their variables are controlled by only the parameters with the same subscripts of the marginal distribution functions. We then obtain the following log-likelihood functions of the k-dimensional marginal distribution for k < d and 1 ≤ j 1 < ⋯ < j k ≤ d.
Now, we propose an estimation algorithm based on IFM for the d-variate FGM distribution. With IFM, we first estimate the parameters of marginal distributions as for j 1 = 1, 2, … , d.
Next, we estimate bivariate, trivariate and d-variate dependence parameters step by step. For 1 ≤ j 1 < j 2 ≤ d , the bivariate dependence parameters j 1 j 2 are estimated as follows.
In the same manner, we estimate all dependence parameters of higher dimensions in order. Finally, 12⋯d is estimated in a bottom-up fashion. Thus, we can estimate all dependence parameters by repeatedly maximizing functions with one unknown parameter (one equation per dependence parameter). Therefore, IFM can be used to estimate the dependence parameters of the d-variate FGM distribution no matter how large d is (even if MLE is infeasible).
If MLE is feasible, an estimator of IFM can be considered as a good starting point for the numerical maximization of the full log-likelihood function (Joe 2014). In Sec. 5.2, the detailed procedure of IFM for the d-variate FGM distribution is demonstrated in the case of d = 3.
The proposed algorithm for the d-variate FGM distribution has the following characteristic remarks that are not seen when IFM is applied to other copulas.
Remark 1 For a given k, IFM for the d-variate FGM distribution requires linear optimization to determine the allowable range of a dependence parameter as a j 1 ⋯j k ≤ j 1 ⋯j k ≤ b j 1 ⋯j k where a j 1 ⋯j k = min j 1 ⋯j k and b j 1 ⋯j k = max j 1 ⋯j k under the condition of Eq. (2), respectively.
For example, let us consider the case that d = 3 and the dependence parameters are estimated in the order of 12 → 23 → 13 → 123 . In this case, we have a 12 = min 12 and b 12 = max 12 s.t. Eq. (3), respectively. Then, a 12 = −1 and b 12 = 1 hold as solutions of these two optimization problems. Eq. (50 can be represented as After ̃ 12 is obtained, a 13 and b 13 are given by min 13 and max 13 s.t. Eq. (3) where 12 =̃ 12 , respectively. The same discussion can be done for a 23 , b 23 , a 123 , and b 123 . In this way, the parameters of Eq. (2) are fixed by their estimated values one by one.
Remark 2 Since j 1 ⋯j k is a single nonlinear equation, some numerical optimization is required to solve arg min One can use an interior point method as a numerical optimization method for such constrained nonlinear optimization problems (e.g., Mathematica supports an interior point method by the functions FindMinimum and FindMaximum).

Remark 3 For a given k, the order of estimating the
For example, when d = 3 and k = 2 , we have 12 , 13 , and 23 . The estimation order of these parameters is exchangeable because 12 , 13 , and 23 do not depend on ( 13 , 23 ) , ( 12 , 23 ) , and ( 12 , 13 ) , respectively. Thus, there are 6 ways to estimate these three parameters in the permutations of ( 12 , 13 , 23 ) , ( 12 , 23 , 13 ) , ( 13 , 12 , 23 ) , ( 13 , 23 , 12 ) , ( 23 , 12 , 13 ) , and ( 23 , 13 , 12 ) . In total, we have The value of ̃ may change depending on the estimation order of j 1 ⋯j k 's for a small sample. This is because a j 1 ⋯j k and b j 1 ⋯j k depend on the values of the dependence parameters already estimated. The estimation using a small sample may yield large estimation errors. As a result, large estimation errors of the dependence parameters obtained at the beginning of the algorithm (e.g., ̃ 12 ,̃ 13 , … ) tend to shorten allowable ranges [a j 1 ⋯j k , b j 1 ⋯j k ] of other dependence parameters to be estimated. Therefore, we should estimate by using various ways as long as the computation time is acceptable. If we obtain M different estimates ̃ 1 , … ,̃ M by M ways, we can select the best estimate so that the estimate yields the largest loglikelihood value as arg max

Asymptotic normality
An estimator of MLE satisfies asymptotic normality under regularity conditions (Cramer 1945). It is also known that an estimator of IFM supports asymptotic normality if the copula satisfies regularity conditions (Joe 2014). However, no studies found that the same property holds in the multivariate FGM distribution.
In this section, therefore, we argue that asymptotic normality of MLE and IFM hold for the FGM distribution. We then compare their asymptotic efficiencies for specific parameter cases.
Without loss of generality, let us consider asymptotic normality of each estimator of MLE and IFM for d = 3 and j = j . Let be ( , ) (i.e., = ( 1 , … , 7 ) = ( 1 , 2 , 3 , 12 , 13 , 23 , 123 ) ), and a parameter with the subscript 0 denotes the true value of the parameter (e.g., 0 and 0 ). Then, for a random vector of the FGM distribution X = (X 1 , X 2 , X 3 ) , the following theorems hold under suitable regularity conditions including interchange of differentiation and integration (see pp. 462-463 of Lehmann and Casella (1998) for the consistency and asymptotic normality of MLE and Joe (2014) for the reference of IFM).

Theorem 1 The estimator of MLE ̂ satisfies the following equation for n → ∞.
where D → denotes the convergence in distribution, and I is the Fisher information matrix, which is given by where T represents transpose of a vector or matrix, and Theorem 2 The estimator of IFM ̃ satisfies the following equation for n → ∞.
Note that we cannot obtain the explicit forms of D and V but can compute them for a specific 0 . Proofs for Theorems 1 and 2 are given in Appendix.
A natural question arises: how much is ̂ relatively efficient compared with ̃ ? To take into account this question, we investigate numerical examples. To simplify the asymptotic covariance matrices, let us consider the case in which is given. That is, the parameters of interest are only . Suppose ( 12 , 13 , 23 , 123 ) = (0.3, 0.3, 0.3, 0.3) . Then, each asymptotic covariance matrix is given by Note that we use the closed Newton-Cotes formula of degree 3 to numerically compute integrations for I −1 and V. Since the ith diagonal element of Eq. (10) is less than that of Eq. (11) for i = 1, … , 4 , we can find that MLE is more effective than IFM. For example, the variance of ̂ 12 is around 97% (= 8.48∕8.70) of the variance of ̃ 12 . As another example, let us consider the case of ( 12 , 13 , 23 , 123 ) = (0, 0, 0, 0) (i.e., the independent case). Then, each asymptotic covariance matrix is analytically obtained as Hence, the variances of ̂ are 100% of those of ̃ if ( 12 , 13 , 23 , 123 ) = (0, 0, 0, 0).
In this section, we conduct simulation studies to evaluate the effectiveness of our proposed method in the estimation of the multivariate FGM distribution. We first explain how to generate simulation samples from the multivariate FGM distribution. Then, we compare MLE and IFM from the viewpoint of estimation accuracy and computation time and provide some discussion.

Generating the simulation samples
The conditional method (Xu 1996;Joe 2014) allows us to generate samples from the multivariate FGM distribution with arbitrary marginal distributions.
Let (X 1 , … , X d ) be a random vector of the d-variate FGM distribution. For j = 2, 3, … , d , we define the conditional distribution functions as follows.
Let v 1 , … , v d be samples from the i.i.d. uniform distribution in the interval [0, 1].
We construct a random sequence through the following processes.
Consequently, the random sequence (x 1 , x 2 , … , x d ) follows the d-variate FGM distribution. We need to know the closed form of C −1 j|1⋯j−1 (⋅|⋅) to use the conditional method for the d-variate FGM distribution. The conditional distribution function is generally given by Since it follows that By integrating both sides of Eq. (12), we obtain where From Du 2 j − (1 + D)u j − C j|1⋯j−1 (u j |u 1 , … , u j−1 ) = 0 , the inverse function C −1 j|1⋯j−1 (u j |u 1 , … , u j−1 ) takes one of the values of for which it is positive and less than 1. For example, for j = 2 , we have As far as we have found, the signs of Eq. (13) for j = 2, 3, 4 and 5 are +, −, − , and −, respectively. Consequently, we can generate samples of the multivariate FGM distribution by using the conditional method with Eq. (13).

Short example
By using simulation data, we demonstrate the procedure of IFM for the d-variate FGM distribution. The settings of this example are as follows.
1. The sample size is 1000. We generated random samples under the settings and estimate the parameters from the samples.
We estimated the dependence parameters through the following steps.
Step 2: ̃ 13 = arg max 13 , 12 =0.298 Step 3 Note that the allowable ranges of the dependence parameters are calculated based on Remark 1 in each step. As mentioned in Remark 3, Step 1, 2, and 3 are exchangeable although we estimated the dependence parameters in the order of 12 , 13 , and 23 in this example. If the order is changed, the estimation result might be different from the above result. To illustrate this phenomenon, we conducted another experiment by changing the sample size from 1000 to 100. We estimated all dependence parameters by MLE and IFM with possible six estimation orders (see Table 1). Note that ( , ) bellow each estimated value in Table 1 represents the allowable range of the corresponding dependence parameter (i.e., = a j 1 ⋯j k and = b j 1 ⋯j k ). For example, 12 was estimated in the interval [−1, 1] in the estimation order of 12 → 13 → 23 → 123 .
As shown in Table 1, there were 3 different types of estimation results for IFM, and all ̃ 123 were 0 under the conditions of Eq.
(2) and ( 12 , 13 , 23 ) = (̃ 12 ,̃ 13 ,̃ 23 ) . The reason why we obtained such results due to the difference of the estimation order mainly depended on the accumulation of estimation errors in the process of IFM. For example, the estimated values ̃ 13 were different between the cases that the dependence parameters were estimated in the orders of 12 → 13 → 23 → 123 and 12 → 23 → 13 → 123 (i.e., 0.707 and 0.518). While the allowable range of 13 thirdly estimated in the latter case was [−0.156, 0.518] , that secondly estimated in the former case was [−1, 1] . ̃ 13 could not be 0.707 in the latter case. Therefore, the value of ̃ changed depending on the estimation order.
The result of MLE was also different from the results of IFM. Based on the loglikelihood values, the result of MLE should be chosen as the best estimate in this example.

Estimation accuracy
In this subsection, we investigated the estimation accuracy of the proposed algorithm through Monte Carlo simulation under the conditions that parameters of marginal distributions are known and unknown.

When ı is known
We again considered that all the marginal distributions were given, and was known. For d = 4 , we conducted the simulation under the following settings.
1. The sample sizes n were 100, 1000, and 10,000, and the number of iteration times m was 100. 2. All the marginal distributions were the i.i.d. uniform distribution in the interval [0,1]. This means that the target parameters are only the dependence parameters. 3. We considered the following four situations of the dependence parameters. Under the settings, we calculated the following Bias and root-mean-squared error (RMSE) as indicators of estimation accuracy for each estimator of IFM.
where ̃ ⋅,i is ̃ ⋅ 's estimate obtained in the ith iteration step and ̃ ⋅,(0) is the true value of ̃ ⋅ . Bias and RMSE represent the average and dispersion of the estimation errors, respectively.
The numerical results are presented in Tables 2, 3, 4, and 5. Note that MLE cannot work in the case of n = 10,000 due to too much computation time for 100 iterations (represented as N/A). Besides, only one estimation order of ̃ was considered      The estimation accuracy improved with a power of 10 samples. To obtain estimates with Bias of less than 0.1, we require approximately 100, 1000, and 10,000 samples for j 1 j 2 , j 1 j 2 j 3 , and 1234 , respectively. As for RMSE, ratio values of the results of IFM to those of MLE for the corresponding dependence parameters are close to 1 in most of the cases.

When ı is unknown
We next investigated the estimation accuracy when the marginal distributions were given, but was unknown. In this case, it is expected that the accuracy of IFM worsen because the estimation accuracy for the dependence parameters is affected by the error for the marginal parameters. To check how much the performance worsens, we conducted the simulation under the following settings for d = 4 .
1. The sample size n was 100, and the number of iteration times m was 100. 2. All the marginal distributions were an i.i.d. exponential distribution. We fixed the marginal parameters as ( 1 , 2 , 3 , 4 ) = (1, 1, 1, 1). The estimation results of the marginal parameters and dependence parameters are summarized in Tables 6 and 7, respectively. In Table 6, since n = 100 was enough to estimate the marginal parameters precisely, all biases are close to 0. Hence, the estimation accuracy for the dependence parameters did not significantly worsen even if the marginal parameters were unknown. Bias and RMSE for the dependence parameters in Table 7 were almost the same as those for the dependence parameters shown in the result of n = 100 of Table 2. Therefore, we infer from these results that IFM and MLE for the parameter estimation of the FGM distribution work well under the condition that the sample size is enough to estimate at least.

Computation time
In this subsection, we discuss the investigation of the computation times of MLE and IFM. We implemented IFM for the d-variate FGM distribution in Mathematica 11.3 and a PC equipped with Intel(R)Core(TM) i7-6700k 4.00-GHz, 32GB-RAM running on Windows 10 professional 64bit. We conducted the simulation under the following settings.
1. The dimensions d were 4 and 5.
2. The sample sizes n were 100, 1000, and 10,000. 3. All the marginal distributions were the i.i.d. uniform distribution in the interval [0, 1]. This means that the target parameters are only the dependence parameters. 4. For d = 4 , the values of the dependence parameters are as follows: j 1 j 2 = 0, j 1 j 2 j 3 = 0.2 , and 1234 = 0.2 for 1 ≤ j 1 < j 2 < j 3 ≤ 4. 5. For d = 5 , the values of the dependence parameters are as follows: j 1 j 2 = 0, j 1 j 2 j 3 = 0, j 1 j 2 j 3 j 4 = 0.2 , and 12345 = 0.2 for 1 ≤ j 1 < j 2 < j 3 < j 4 ≤ 5. Figure 1 illustrates the computation times to estimate the dependence parameters under the above situation. We can find that IFM works faster than MLE in all cases, and both MLE and IFM require more computational time as d and n increase. If d = 5 and n = 10,000 , MLE cannot work due to running out of memory (represented as N/A). Therefore, to estimate the parameters of the multivariate FGM distribution, we should not use MLE but IFM from the viewpoint of computational cost. A theoretical analysis of the computational complexity of IFM is required for future work.

Case study
In the previous section, we showed several aspects of the parameter estimation scheme proposed in this paper by utilizing simulation data. In this section, in turn, we depict another numerical example based on the actually collected data set. This is called Bearing data set, which is available in NASA Ames Prognostics Data Repository (Lee et al. 2007).

Bearing data set
This data set was collected for the purpose of the investigation of how four bearings with a coaxial rotating shaft degraded in time. For this case study, we employ "Set No. 1" data set from the original data set. This "Set No.1" contains 2156 files, and each file has 20480 data points of vibration signal yielded in one second from 4 working bearings with 2 data-collection channels of accelerometers on the test rig (see Fig. 2). Therefore one data file has 20480 lines, and each line has 8 numerical values (4 bearings × 2 channels). Also, one file was saved as a log file every 10 minutes (N.B., the first 43 files are logged every 5 minutes). Hence we obtain about 44.1 million ( ≈ 2156 × 20480 ) lines of 8-dimension vibration signal values.
In order to reduce the size of the data set, we extracted the first line from every 2156 files. And from the structure of the test rig, we decided to use 4 channels of #1, #3, #5, and #7 from bearing #1, #2, #3, and #4, respectively. After such a data preparation, we finally obtain 2156 actual samples of 4-variate data set as follows ( i = 1, 2, … , 2156 ). ⋅ x i1 = the ith vibration signal (unit: arbitrary) of channel #1 in bearing #1. ⋅ x i2 = the ith vibration signal (unit: arbitrary) of channel #3 in bearing #2. ⋅ x i3 = the ith vibration signal (unit: arbitrary) of channel #5 in bearing #3. ⋅ x i4 = the ith vibration signal (unit: arbitrary) of channel #7 in bearing #4. Note that the unit of the vibration signal is not indicated in the original references of Bearing data set (Qiu et al. 2006;Lee et al. 2007). Since the vibration signal was collected from the accelerometers, we can guess that the unit is any of voltage (mV) [or (V)] and acceleration (m/s 2 ) which is proportional to a voltage value. For this reason, we assume that the vibration signal has an arbitrary unit in this study.
In this paper, we use this data set to find some dependence property among four bearings via their vibration signal data. Since these four bearings are operated by a single rotating shaft, it is expected that some common factors may affect the degradation of the bearings. For this, we apply 4-variate FGM copula to the data set and estimate its 11 dependence parameters.

Finding marginal distribution
We separate the data set into three sets from the viewpoint of the size. We call them Data-A, Data-B, and Data-C, respectively. Data-C is the overall data (2156 lines). To make Data-A, we extracted the first 100 lines from Data-C. This means that this set was collected from the early stage of the operation time of four bearings. The second one, Data-B, consists of 1000 lines (i.e., Data-A plus successive 900 lines) to grasp the dependence property until the middle part of the data set. Data-C is used for the overall time domain analysis.
For each data set, we need to find the well-fitted marginal distribution firstly. Since the normal distribution was not fit these data sets, we chose the generalized t-distribution with the following probability density function:  Qiu et al. (2006)]. Each bearing has 2 channels for data collection. Bearing #1 has channels #1 and #2, bearing #2 has channels #3 and #4, bearing #3 has channels #5 and #6, and bearing #4 has channels #7 and #8, respectively where = (a, b, c) and B [ , ] is the beta function defined by This distribution has a location parameter a and a scale parameter b. The parameter c denotes the degree of freedom. Therefore if a = 0 and b = 1 , Eq. (14) describes the Student's t-distribution. Table 8 summarizes the estimation results of the marginal t-distribution and their p values by employing the Kolmogorov-Smirnov goodness-of-fit test. The right-most column also lists the p values if the normal distribution is applied. Therefore we choose t-distribution for the marginal, because all the cases are not rejected at the significance level of 0.05.

Parameter estimation and discussion
We estimated all the parameters included in the 4-variate FGM copula by the proposed method. Note that only one estimation order of ̃ was considered to finish the experiment within acceptable computation time. Table 9 is a list of them for each data set. In the statistical sense, the results of Data-C are the most meaningful because the size of the data set is the largest (2156 data points) among the three. The values in each row vary by the combination of bearings. Actually, it is not easy to infer the actual dependence structure among the bearings on the test rig from such information seen in this table, but we dare to state the degree of dependence by categorizing the estimated values. That is, we set three categories of the dependence as positive, negative, and almost zero between two bearings. By following this categorization, ̃ 12 , ̃ 34 are judged as positive, ̃ 13 , ̃ 23 , and ̃ 24 are negative, and ̃ 14 is almost zero.
Translating the subscript numbers 1, 2, 3, and 4 of into the channel numbers 1, 3, 5, and 7, respectively, we can infer that the bearings #1 and #4 are almost independent because ̃ 14 belongs to the category of almost zero. It is understandable that this result from the illustration of Fig. 2 is possible since the bearing #1 and #4 are placed at both ends on the test rig equipment. It is reasonable that they may yield mutually independent vibration signals.
On the other hand, the degree of the positive and negative dependence between two bearings seems to also suggest some structural dependence depicted in Fig. 2. In the case of positive, the pair of bearings #1 and #2 adjoin, and the direction of the radial load is opposite. The same relationship can be seen at the pair of bearings #3 and #4. Therefore we can presume that these pairs of bearings may fail dependently in the future. Regarding this matter, Qiu et al. (2006) reported that an inner race defect was discovered in the bearing #3 and a roller element defect and outer race defect were found in the bearing #4 at the end of the testing. This consequence results in supporting our inference stated above.
The last category, denoted as negative, contains three pairs of bearings as (#1, #3), (#2, #4), and (#2, #3). Among the three, the first two pairs of bearings suffer the opposite radial load direction, but have more distance than the case of positive. Whereas in the case of (#2, #3), these bearings take the same direction of the radial load and are equipped closely. It is not clear that how these factors affect that the estimated values tend to be negative. It may be needed to do more research from the viewpoint of mechanical engineering and other theoretical consideration to explain these interesting relationships between the estimated values and the physical layout of the equipment and other factors of the bearings. Table 9 List of estimated parameters by IFM. Subscript numbers 1, 2, 3, and 4 of the correspond to the channel numbers 1, 3, 5, and 7, respectively Moreover, it becomes more complicated to explain the dependence structure and estimated values among three or more bearings (i.e., ̃ 123 ,..., ̃ 1234 ). It is desired that the reasons and implications for these phenomena are revealed in future studies. In any case, however, we have confirmed that the proposed algorithm for the parameter estimation of the FGM distribution worked well.

Conclusion
We proposed an algorithm for estimating the parameters of the multivariate FGM distribution in practical computation time. The computational complexity of MLE was reduced by using IFM. Because of this, IFM can estimate all the dependence parameters of the multivariate FGM distribution no matter how high the dimension d is. We then revealed that the estimators of IFM satisfy asymptotic normality for the multivariate FGM distribution. Therefore, The proposed algorithm is useful for estimating the parameters of the FGM distribution.
From the viewpoint of reliability engineering, the proposed algorithm can be applied to a quantitative evaluation of the dependence among system components. If we have the lifetime data of system components, we can find the latent dependence structure among them by estimating the parameters of the multivariate FGM distribution with arbitrary marginal distributions.

Appendix: Proofs
Before we prove Theorems 1 and 2, we introduce a lemma and theorem as follows.
Proof of Lemma 1 (9), we prove that each element of E s(⋅; 0 ) is 0, that is, for 1 ≤ j < k ≤ 3 . First, we have E j log f j (X j ;⋅) = 0, E jk log h 2 (X j , X k ;⋅) = 0, E 123 log h 3 (X 1 , X 2 , X 3 ;⋅) = 0, where we assume that the order of the differentiation and integration can be changed in the third step. In the same manner, we obtain and Therefore, E s(⋅; 0 ) = 0 . Also, E[ log h 3 (X 1 , X 2 , X 3 ;⋅)] = 0 holds in the same manner. Hence, the proof is complete. ◻

Proof of Theorem 1
We provide a sketch of the proof of Theorem 1. Suppose Y, Y 1 , … , Y n are i.i.d. random vectors of the d-variate FGM distribution. By applying the mean value theorem for the maximum log likelihood function at 0 , we obtain where a matrix K( ) is given by are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http:// creat iveco mmons. org/ licen ses/ by/4. 0/.