1 Introduction

This study focuses on an \(R \times R\) square contingency table with ordered categories. For example, square contingency tables may be obtained by cross-classifying repeated measurements of an ordinal categorical outcome.

Let X and Y denote the row and column variables, respectively. The marginal mean equality (ME) model that Tomizawa (1991) proposed is defined by

$$\begin{aligned} E(X) = E(Y), \end{aligned}$$

where

$$\begin{aligned} E(X) = \sum ^{R}_{i=1} i \Pr (X=i) \quad \textrm{and} \quad E(Y) = \sum ^{R}_{i=1} i \Pr (Y=i). \end{aligned}$$

The ME model can also be expressed as

$$\begin{aligned} \sum _{i=1}^{R-1} F^X_i = \sum _{i=1}^{R-1} F^Y_i, \end{aligned}$$
(1)

where

$$\begin{aligned} F^X_i = \Pr (X \le i) \quad \textrm{and} \quad F^Y_i = \Pr (Y \le i). \end{aligned}$$

The \(F^X_i\) and \(F^Y_i\), for \(i=1, 2, \dots , R-1\), are the row and column cumulative marginal probabilities, respectively. By considering the difference between \(F^X_i\) and \(F^Y_i\), for \(i=1,2,\dots ,R-1\), the ME model can be further expressed as

$$\begin{aligned} \sum _{i=1}^{R-1} G_{1(i)} = \sum _{i=1}^{R-1} G_{2(i)}, \end{aligned}$$
(2)

where

$$\begin{aligned} G_{1(i)}= \Pr (X \le i, Y \ge i+1) \quad \textrm{and} \quad G_{2(i)} = \Pr (X \ge i+1, Y \le i). \end{aligned}$$

It must be noted that the two sides of Eq. (1) depend on the probabilities that observations fall in the main diagonal cells of the table (i.e., \(\Pr (X=i, Y=i)\) for \(i=1,\dots ,R\)), whereas those of Eq. (2) do not.
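The equivalence of the three forms of the ME model can be checked numerically. The following is a minimal sketch, assuming a hypothetical \(3 \times 3\) table of cell probabilities (not taken from any dataset in this paper); the function name `me_summaries` is illustrative. It uses the identities \(E(X) = R - \sum _{i=1}^{R-1} F^X_i\) and \(F^X_i - F^Y_i = G_{1(i)} - G_{2(i)}\), which follow from the definitions above.

```python
import numpy as np

def me_summaries(p):
    """Return E(X), E(Y) and the sums in Eqs. (1) and (2) for an R x R table p."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    scores = np.arange(1, R + 1)
    row = p.sum(axis=1)                       # Pr(X = i)
    col = p.sum(axis=0)                       # Pr(Y = i)
    ex, ey = float(scores @ row), float(scores @ col)
    # Sums of cumulative marginal probabilities over i = 1, ..., R-1.
    sum_fx = float(np.cumsum(row)[:-1].sum())
    sum_fy = float(np.cumsum(col)[:-1].sum())
    # G_{1(i)} = Pr(X <= i, Y >= i+1) and G_{2(i)} = Pr(X >= i+1, Y <= i).
    sum_g1 = float(sum(p[:i, i:].sum() for i in range(1, R)))
    sum_g2 = float(sum(p[i:, :i].sum() for i in range(1, R)))
    return ex, ey, sum_fx, sum_fy, sum_g1, sum_g2

# Hypothetical 3 x 3 cell probabilities (illustrative only).
p_example = np.array([[0.20, 0.10, 0.05],
                      [0.05, 0.20, 0.10],
                      [0.05, 0.05, 0.20]])
ex, ey, sum_fx, sum_fy, sum_g1, sum_g2 = me_summaries(p_example)
R = p_example.shape[0]
# E(X) = R - sum_i F^X_i, and likewise for Y.
print(np.isclose(ex, R - sum_fx), np.isclose(ey, R - sum_fy))
# The difference of the two sides is the same in Eq. (1) and Eq. (2).
print(np.isclose(sum_fx - sum_fy, sum_g1 - sum_g2))
```

Because the two differences coincide, \(E(X) = E(Y)\), Eq. (1), and Eq. (2) hold or fail together, even though only the sums in Eq. (1) involve the diagonal cells.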

Generally, when a model of interest does not fit the data, we are interested in (i) applying a model with weaker restrictions, or (ii) measuring the degree of deviancy from the model. When the marginal homogeneity (MH) model (Stuart 1955), which has the restrictions \(F^X_i =F^Y_i \) (or \(G_{1(i)}=G_{2(i)}\)) for \(i=1,2,\dots ,R-1\), does not fit the data, we may be interested in applying models with weaker restrictions than the MH model (e.g., the ME model), or in measuring the degree of deviancy from the MH model. For details of models with weaker restrictions than the MH model, see, for example, Tahata and Tomizawa (2008, 2014), Kurakami et al. (2013), and Shinoda et al. (2021). For details of measures of deviancy from the MH model, see, for example, Tomizawa et al. (2003), Tahata et al. (2006, 2012), and Yamamoto et al. (2011). The measures of Tahata et al. (2006, 2012) were constructed based on the \(\{F^X_i\}\) and \(\{F^Y_i\}\), and the measures of Tomizawa et al. (2003) and Yamamoto et al. (2011) were constructed based on the \(\{G_{1(i)}\}\) and \(\{G_{2(i)}\}\).

Ando (2019) revealed that when comparing the degrees of deviancy from the MH model for two datasets, the magnitude relation of the degrees of deviancy may change between a measure that depends on the probabilities \(\Pr (X=i, Y=i)\) and a measure that does not. Thus, for a dataset in which observations are concentrated in the main diagonal cells, a measure that does not depend on the probabilities \(\Pr (X=i, Y=i)\) may overestimate the degree of deviancy from the MH model. In fact, in square contingency tables, it is known that the row and column variables tend to be strongly associated, so that observations tend to be concentrated in the main diagonal cells.

When the ME model does not fit the data, we are interested only in measuring the degree of deviancy from it, because the only model with weaker restrictions than the ME model is the saturated model. We point out that the number of degrees of freedom of the ME model is one. Yamamoto and Tomizawa (2007) proposed a measure based on Eq. (2) that represents the degree of deviancy from the ME model. This measure, however, cannot discriminate the direction of deviancy from the ME model, although there are two directions (i.e., \(\sum _{i=1}^{R-1} G_{1(i)} > \sum _{i=1}^{R-1} G_{2(i)}\) and \(\sum _{i=1}^{R-1} G_{1(i)} < \sum _{i=1}^{R-1} G_{2(i)}\)). This is because the measure takes its maximum value of one when \(\sum _{i=1}^{R-1} G_{1(i)}>0\) and \(\sum _{i=1}^{R-1} G_{2(i)}=0\) or \(\sum _{i=1}^{R-1} G_{1(i)}=0\) and \(\sum _{i=1}^{R-1} G_{2(i)}>0\), and its minimum value of zero when \(\sum _{i=1}^{R-1} G_{1(i)}=\sum _{i=1}^{R-1} G_{2(i)}\). To address this issue, Ando (2021) proposed a directional measure that can concurrently evaluate the degree and direction of deviancy from the ME model. This directional measure takes its maximum value of one when \(\sum _{i=1}^{R-1} G_{1(i)}=0\) and \(\sum _{i=1}^{R-1} G_{2(i)}>0\), its minimum value of minus one when \(\sum _{i=1}^{R-1} G_{1(i)}>0\) and \(\sum _{i=1}^{R-1} G_{2(i)}=0\), and the value zero when \(\sum _{i=1}^{R-1} G_{1(i)}=\sum _{i=1}^{R-1} G_{2(i)}\).

It must be noted that the above two measures do not depend on the probabilities \(\Pr (X=i, Y=i)\) for \(i=1,\dots ,R\). As with the measure of deviancy from the MH model, for the dataset in which observations are concentrated in the main diagonal cells, the measure that does not depend on the probabilities \(\Pr (X=i, Y=i)\) may overestimate the degree of deviancy from the ME model. To tackle this issue, we are interested in proposing a directional measure that depends on the probabilities \(\Pr (X=i, Y=i)\).

This study proposes a measure based on Eq. (1) that can concurrently evaluate the degree and direction of deviancy from the ME model. We show that when comparing the degrees of deviancy from the ME model for two datasets, the magnitude relation of the degrees of deviancy may change between the proposed and existing measures.

This paper is organized as follows. In Sect. 2, we propose a measure of deviancy from the ME model. In Sect. 3, we derive an approximate confidence interval for the proposed measure. Section 4 evaluates the performance of the estimator and the approximate confidence interval for the proposed measure through numerical experiments. In Sect. 5, we apply the proposed measure to real data from a clinical trial. We close with discussions and concluding remarks in Sects. 6 and 7.

2 Measure of deviancy from marginal mean equality

2.1 Existing measure

Ando (2021) proposed a measure based on Eq. (2) that can concurrently evaluate the degree and direction of deviancy from the ME model. Assuming that \(\sum _{i=1}^{R-1} (G_{1(i)} + G_{2(i)}) \ne 0\), the measure is defined as follows:

$$\begin{aligned} \Phi = \frac{4}{\pi }\left[ \arccos \left( \frac{G^U}{\sqrt{(G^U)^{2} + (G^L)^{2}}} \right) - \frac{\pi }{4}\right] , \end{aligned}$$

where

$$\begin{aligned} G^U = \frac{\sum _{i=1}^{R-1} G_{1(i)}}{\sum _{i=1}^{R-1} (G_{1(i)} + G_{2(i)})} \quad \textrm{and} \quad G^L = \frac{\sum _{i=1}^{R-1} G_{2(i)}}{\sum _{i=1}^{R-1} (G_{1(i)} + G_{2(i)})}. \end{aligned}$$

The \(\Phi \) has the following properties: (i) the range of \(\Phi \) is from minus one to one; (ii) the \(\Phi \) is equal to zero if and only if \(\sum _{i=1}^{R-1} G_{1(i)}\) is equal to \(\sum _{i=1}^{R-1} G_{2(i)}\); (iii) the \(\Phi \) is equal to minus one if and only if there is a structure with \(\sum _{i=1}^{R-1} G_{1(i)} > 0\) and \(\sum _{i=1}^{R-1} G_{2(i)} = 0\), which is the complete-upper-inequality; and (iv) the \(\Phi \) is equal to one if and only if \(\sum _{i=1}^{R-1} G_{1(i)} = 0\) and \(\sum _{i=1}^{R-1} G_{2(i)} > 0\), which is the complete-lower-inequality.
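As an illustration of properties (ii) to (iv), the following minimal sketch evaluates \(\Phi \) from a table of cell probabilities. The function name `phi_measure` and the \(2 \times 2\) tables are hypothetical and chosen only to hit the three reference values; the clamp on the arccos argument is a numerical safeguard added here, not part of the definition.

```python
import math
import numpy as np

def phi_measure(p):
    """Directional measure Phi computed from an R x R table of cell probabilities."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    # Sums over i = 1, ..., R-1 of G_{1(i)} and G_{2(i)}.
    g1 = float(sum(p[:i, i:].sum() for i in range(1, R)))
    g2 = float(sum(p[i:, :i].sum() for i in range(1, R)))
    if g1 + g2 == 0:
        raise ValueError("Phi is undefined when both sums are zero")
    # G^U / sqrt((G^U)^2 + (G^L)^2): the common denominator of G^U and G^L cancels.
    ratio = min(1.0, g1 / math.hypot(g1, g2))
    return (4 / math.pi) * (math.acos(ratio) - math.pi / 4)

# Hypothetical tables (illustrative only).
upper_only = [[0.0, 1.0], [0.0, 0.0]]      # complete-upper-inequality
lower_only = [[0.0, 0.0], [1.0, 0.0]]      # complete-lower-inequality
symmetric = [[0.25, 0.25], [0.25, 0.25]]   # equal sums
for table in (upper_only, lower_only, symmetric):
    print(round(phi_measure(table), 6))
```

Diagonal cells never enter `g1` or `g2`, which is why \(\Phi \) is unaffected by how much probability lies on the main diagonal.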

2.2 Proposed measure

In this section, we propose a measure based on Eq. (1) that can concurrently evaluate the degree and direction of deviancy from the ME model. Assuming that \(\sum _{i=1}^{R-1} (F^X_i + F^Y_i) \ne 0\), the proposed measure is defined as follows:

$$\begin{aligned} \Psi = \frac{4}{\pi }\left[ \arccos \left( \frac{F^X}{\sqrt{(F^X)^{2} + (F^Y)^{2}}} \right) - \frac{\pi }{4}\right] , \end{aligned}$$

where

$$\begin{aligned} F^X = \frac{\sum _{i=1}^{R-1} F^X_i}{\sum _{i=1}^{R-1} (F^X_i + F^Y_i)} \quad \textrm{and} \quad F^Y = \frac{\sum _{i=1}^{R-1} F^Y_i}{\sum _{i=1}^{R-1} (F^X_i + F^Y_i)}. \end{aligned}$$

The \(\Psi \) can also be expressed as follows:

$$\begin{aligned} \Psi = \frac{4}{\pi }\left[ \arccos \left( \frac{\sum _{i=1}^{R-1} F^X_i}{\sqrt{(\sum _{i=1}^{R-1} F^X_i)^{2} + (\sum _{i=1}^{R-1} F^Y_i)^{2}}} \right) - \frac{\pi }{4}\right] . \end{aligned}$$

The \(\Psi \) has the following properties: (i) the range of \(\Psi \) is from minus one to one; (ii) the \(\Psi \) is equal to zero if and only if \(\sum _{i=1}^{R-1} F^X_i\) is equal to \(\sum _{i=1}^{R-1} F^Y_i\); (iii) the \(\Psi \) is equal to minus one if and only if there is a structure with \(\sum _{i=1}^{R-1} F^X_i > 0\) and \(\sum _{i=1}^{R-1} F^Y_i = 0\), which is the complete-row-inequality; and (iv) the \(\Psi \) is equal to one if and only if \(\sum _{i=1}^{R-1} F^X_i = 0\) and \(\sum _{i=1}^{R-1} F^Y_i > 0\), which is the complete-column-inequality. From the above properties, we see that the proposed measure can concurrently evaluate the degree and direction of deviancy from the ME model.
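The properties above, and the way diagonal mass enters \(\Psi \), can be illustrated with a short sketch. The function name `psi_measure` and all tables below are hypothetical. The sketch relies on the identity \(\sum _{i=1}^{R-1} F^X_i = \sum _{k=1}^{R} (R-k)\Pr (X=k)\), which follows by exchanging the order of summation.

```python
import math
import numpy as np

def psi_measure(p):
    """Directional measure Psi computed from an R x R table of cell probabilities."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    w = R - np.arange(1, R + 1)          # weight R - k for category k
    a = float(w @ p.sum(axis=1))         # sum_i F^X_i
    b = float(w @ p.sum(axis=0))         # sum_i F^Y_i
    if a + b == 0:
        raise ValueError("Psi is undefined when both sums are zero")
    ratio = min(1.0, a / math.hypot(a, b))   # clamp guards against rounding only
    return (4 / math.pi) * (math.acos(ratio) - math.pi / 4)

# Hypothetical tables (illustrative only).
complete_row = np.array([[0.0, 0.0, 0.5],
                         [0.0, 0.0, 0.5],
                         [0.0, 0.0, 0.0]])   # Y = R with probability one
complete_col = complete_row.T                # X = R with probability one
p_example = np.array([[0.20, 0.10, 0.05],
                      [0.05, 0.20, 0.10],
                      [0.05, 0.05, 0.20]])
# Move half of the probability onto the main diagonal, spread evenly.
p_diag_heavy = 0.5 * p_example + 0.5 * np.eye(3) / 3

print(psi_measure(complete_row), psi_measure(complete_col))
print(psi_measure(p_example), psi_measure(p_diag_heavy))
```

In this example the diagonal-heavy table yields a value of \(\Psi \) closer to zero than the original table, because diagonal cells add the same amount to both sums.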

It must be noted that the complete-row-inequality is different from the complete-upper-inequality, and the complete-column-inequality is different from the complete-lower-inequality. When both \(\sum _{i=1}^{R-1} (G_{1(i)} + G_{2(i)})\) and \(\sum _{i=1}^{R-1} (F^X_i + F^Y_i)\) are positive, the complete-row-inequality implies the complete-upper-inequality; however, the converse is not necessarily true.
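A brief sketch of why the complete-row-inequality implies the complete-upper-inequality, using only the definitions of \(F^X_i\), \(F^Y_i\), \(G_{1(i)}\) and \(G_{2(i)}\): since each \(F^Y_i\) is nonnegative,

$$\begin{aligned} \sum _{i=1}^{R-1} F^Y_i = 0 \;&\Rightarrow \; F^Y_i = \Pr (Y \le i) = 0 \quad (i=1,\dots ,R-1) \\&\Rightarrow \; G_{2(i)} = \Pr (X \ge i+1, Y \le i) \le F^Y_i = 0 \\&\quad \textrm{and} \quad G_{1(i)} = \Pr (X \le i) - \Pr (X \le i, Y \le i) = F^X_i . \end{aligned}$$

Hence \(\sum _{i=1}^{R-1} G_{1(i)} = \sum _{i=1}^{R-1} F^X_i > 0\) and \(\sum _{i=1}^{R-1} G_{2(i)} = 0\). For the converse, one illustrative counterexample with \(R=3\) is \(p_{12} = p_{23} = 1/2\) (writing \(p_{ij} = \Pr (X=i, Y=j)\)): then \(\sum _{i=1}^{2} G_{2(i)} = 0\) and \(\sum _{i=1}^{2} G_{1(i)} = 1\), so the complete-upper-inequality holds, but \(F^Y_2 = 1/2 > 0\), so the complete-row-inequality does not.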

3 Approximate confidence interval for the proposed index

Let \(f_{ij}\) denote the observed frequency in the \((i,j)\)th cell of the table, and let \(p_{ij}\) denote the probability that an observation will fall in the \((i,j)\)th cell of the table. Assume that the observed frequencies \(\{f_{ij}\}\) have a multinomial distribution with parameters that are the cell probabilities \(\{p_{ij}\}\).

Let \({\varvec{f}}\) and \({\varvec{p}}\) be \(R^2 \times 1\) vectors:

$$\begin{aligned} {\varvec{f}}&= (f_{11},\dots ,f_{1R},f_{21},\dots ,f_{2R},\dots ,f_{R1},\dots ,f_{RR})^\top \\ {\varvec{p}}&= (p_{11},\dots ,p_{1R},p_{21},\dots ,p_{2R},\dots ,p_{R1},\dots ,p_{RR})^\top . \end{aligned}$$

From the central limit theorem, \(\hat{{\varvec{p}}}\) is asymptotically normally distributed with mean vector \({\varvec{p}}\) and covariance matrix \(\frac{1}{N} ({\textbf {diag}}({\varvec{p}}) - {\varvec{p}}{\varvec{p}}^\top )\), where \(\hat{{\varvec{p}}} = \frac{1}{N} {\varvec{f}}\), \(N = \sum \sum f_{ij}\), and \({\textbf {diag}}({\varvec{p}})\) is a diagonal matrix with the elements of \({\varvec{p}}\) on the main diagonal. Then, we obtain

$$\begin{aligned} {\hat{\Psi }} = \Psi + \left( \frac{\partial \Psi }{\partial {\varvec{p}}^\top }\right) (\hat{{\varvec{p}}} - {\varvec{p}}) + o(\Vert \hat{{\varvec{p}}} - {\varvec{p}}\Vert ), \end{aligned}$$

where the estimator of \(\Psi \), i.e., \({\hat{\Psi }}\), is given by \(\Psi \) with \(\{p_{ij}\}\) replaced by \(\{{\hat{p}}_{ij}\}\).

Using the delta method, descriptions of which are given by, e.g., Bishop et al. (2007, Sect. 14.6), we derive the approximate variance for the estimated measure and the large-sample confidence interval for the \(\Psi \). From the delta method, \(\sqrt{N}({\hat{\Psi }} - \Psi )\) asymptotically (as \(N \rightarrow \infty \)) has a normal distribution with mean zero and variance

$$\begin{aligned} \sigma ^{2}[{\hat{\Psi }}]&=\left( \frac{\partial \Psi }{\partial {\varvec{p}}^\top }\right) ({\textbf {diag}}({\varvec{p}}) - {\varvec{p}}{\varvec{p}}^\top )\left( \frac{\partial \Psi }{\partial {\varvec{p}}^\top }\right) ^\top \\&= \sum ^{R}_{k=1}\sum ^{R}_{l=1}p_{kl}\left( \frac{\partial \Psi }{\partial p_{kl}} \right) ^{2} - \left[ \sum ^{R}_{k=1}\sum ^{R}_{l=1}p_{kl}\left( \frac{\partial \Psi }{\partial p_{kl}} \right) \right] ^{2}, \end{aligned}$$

where

$$\begin{aligned} \frac{\partial \Psi }{\partial p_{kl}}= -\frac{4\left[ (R-k)\left( \sum _{i=1}^{R-1} F^Y_i\right) ^2-(R-l)\left( \sum _{i=1}^{R-1} F^X_i\right) \left( \sum _{i=1}^{R-1} F^Y_i\right) \right] }{\pi \left( \sum _{i=1}^{R-1} F^Y_i\right) \left[ \left( \sum _{i=1}^{R-1} F^X_i\right) ^2+\left( \sum _{i=1}^{R-1} F^Y_i\right) ^2\right] }. \end{aligned}$$

Since

$$\begin{aligned} \sum ^{R}_{k=1}\sum ^{R}_{l=1}p_{kl}\left( \frac{\partial \Psi }{\partial p_{kl}} \right) = 0, \end{aligned}$$

we obtain

$$\begin{aligned} \sigma ^{2}[{\hat{\Psi }}] = \sum ^{R}_{k=1}\sum ^{R}_{l=1}p_{kl}\left( \frac{\partial \Psi }{\partial p_{kl}} \right) ^{2}. \end{aligned}$$

The estimator of \(\sigma ^{2}[{\hat{\Psi }}]\), i.e., \(\widehat{\sigma ^{2}[{\hat{\Psi }}]}\), is given by \(\sigma ^{2}[{\hat{\Psi }}]\) with \(\{p_{ij}\}\) replaced by \(\{{\hat{p}}_{ij}\}\). The \(\widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) is an estimated standard error for \({\hat{\Psi }}\), and \({\hat{\Psi }} \pm z_{\alpha / 2} \widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) is an approximate \(100(1 - \alpha ) \%\) confidence interval for \(\Psi \), where \(z_{\alpha / 2}\) is the upper \(\alpha /2\) percentage point of the standard normal distribution (i.e., the point that corresponds to a two-tail probability equal to \(\alpha \)).
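The plug-in computation of \({\hat{\Psi }}\), its estimated standard error, and the approximate confidence interval can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the \(3 \times 3\) frequency table are hypothetical. The gradient uses the partial derivative above after cancelling the common factor \(\sum _{i=1}^{R-1} F^Y_i\), i.e., \(\partial \Psi / \partial p_{kl} = 4[(R-l)A - (R-k)B] / [\pi (A^2 + B^2)]\) with \(A = \sum _{i=1}^{R-1} F^X_i\) and \(B = \sum _{i=1}^{R-1} F^Y_i\).

```python
import math
from statistics import NormalDist
import numpy as np

def _margin_sums(p):
    """Return weights R - k and the sums A = sum_i F^X_i, B = sum_i F^Y_i."""
    R = p.shape[0]
    w = R - np.arange(1, R + 1)
    return w, float(w @ p.sum(axis=1)), float(w @ p.sum(axis=0))

def psi_value(p):
    p = np.asarray(p, dtype=float)
    _, a, b = _margin_sums(p)
    ratio = min(1.0, a / math.hypot(a, b))   # clamp guards against rounding only
    return (4 / math.pi) * (math.acos(ratio) - math.pi / 4)

def psi_gradient(p):
    """Matrix whose (k, l) entry is dPsi/dp_kl in the cancelled form."""
    p = np.asarray(p, dtype=float)
    w, a, b = _margin_sums(p)
    return 4 * (w[None, :] * a - w[:, None] * b) / (math.pi * (a**2 + b**2))

def psi_confidence_interval(f, alpha=0.05):
    """Return (estimate, standard error, (lower, upper)) from observed frequencies f."""
    f = np.asarray(f, dtype=float)
    n = f.sum()
    p_hat = f / n
    psi_hat = psi_value(p_hat)
    # sigma^2 = sum_kl p_kl (dPsi/dp_kl)^2, since sum_kl p_kl dPsi/dp_kl = 0.
    var_hat = float((p_hat * psi_gradient(p_hat) ** 2).sum())
    se = math.sqrt(var_hat / n)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return psi_hat, se, (psi_hat - z * se, psi_hat + z * se)

# Hypothetical observed frequencies (illustrative only).
f_example = np.array([[20, 10, 5],
                      [5, 20, 10],
                      [5, 5, 20]])
print(psi_confidence_interval(f_example))
```

The interval is a plain Wald interval, so in small samples its endpoints are not constrained to lie within the range \([-1, 1]\) of \(\Psi \).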

4 Simulation study

4.1 Setting

Assume that the observed frequencies \(\{f_{ij}\}\) in contingency tables have a multinomial distribution with parameters that are the cell probabilities \(\{p_{ij}\}\) in Tables 1a, 1b, 1c and 1d. In other words, we generate contingency tables by a multinomial random number generator based on probability distributions in Tables 1a, 1b, 1c and 1d.

Table 1 Probability distributions (\(4 \times 4\) square contingency tables) used for the multinomial random number generator

The values of the \(\Psi \) for Tables 1a, 1b, 1c and 1d are \(-0.0607\), \(-0.1410\), 0.1031 and 0.2222, respectively. The sample sizes are set to \(N=30, 50, 100, 150, 200, 300\). The number of iterations is 10,000.

Let \({\hat{\Psi }}_s\) be the estimate of \(\Psi \) in the sth iteration (\(s=1,\dots ,10000\)). We evaluate the performances of the estimator \({\hat{\Psi }}\) using the bias (i.e., \(\sum _{s=1}^{10{,}000}({\hat{\Psi }}_s-\Psi )\)/10,000) and the mean squared error (i.e., \(\sum _{s=1}^{10{,}000}({\hat{\Psi }}_s-\Psi )^2\)/10,000). We also evaluate the performances of the \({\hat{\Psi }} \pm z_{\alpha / 2} \widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) using the coverage probabilities of the approximate 95% confidence interval.
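The simulation design can be sketched as follows. This is an illustrative outline only: the probability table `p_hyp` is hypothetical (it is not one of Tables 1a to 1d), the function name `simulate_psi` is an assumption, and the example uses 2,000 iterations rather than 10,000 to keep the run short. Replicates in which \({\hat{\Psi }}\) is undefined are dropped, which is a choice made for this sketch.

```python
import math
from statistics import NormalDist
import numpy as np

def simulate_psi(p, n, n_iter=10_000, alpha=0.05, seed=0):
    """Monte Carlo bias, MSE and coverage probability of the Wald interval for Psi."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    w = R - np.arange(1, R + 1)
    a0, b0 = float(w @ p.sum(axis=1)), float(w @ p.sum(axis=0))
    psi0 = (4 / math.pi) * (math.acos(a0 / math.hypot(a0, b0)) - math.pi / 4)

    rng = np.random.default_rng(seed)
    counts = rng.multinomial(n, p.ravel(), size=n_iter).reshape(n_iter, R, R)
    p_hat = counts / n
    a = p_hat.sum(axis=2) @ w            # sum_i F^X_i per replicate
    b = p_hat.sum(axis=1) @ w            # sum_i F^Y_i per replicate
    keep = (a + b) > 0                   # drop replicates where Psi-hat is undefined
    a, b, p_hat = a[keep], b[keep], p_hat[keep]

    ratio = np.minimum(1.0, a / np.hypot(a, b))
    psi_hat = (4 / math.pi) * (np.arccos(ratio) - math.pi / 4)
    # Gradient entry (k, l): 4 [(R - l) a - (R - k) b] / (pi (a^2 + b^2)).
    grad = 4 * (w[None, None, :] * a[:, None, None] - w[None, :, None] * b[:, None, None])
    grad = grad / (math.pi * (a**2 + b**2))[:, None, None]
    se = np.sqrt((p_hat * grad**2).sum(axis=(1, 2)) / n)

    z = NormalDist().inv_cdf(1 - alpha / 2)
    err = psi_hat - psi0
    return {"bias": float(err.mean()),
            "mse": float((err**2).mean()),
            "cp": float((np.abs(err) <= z * se).mean())}

# Hypothetical probability table (illustrative only).
p_hyp = np.array([[0.20, 0.10, 0.05],
                  [0.05, 0.20, 0.10],
                  [0.05, 0.05, 0.20]])
for n in (30, 100, 300):
    print(n, simulate_psi(p_hyp, n, n_iter=2000))
```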

4.2 Results

Table 2 shows the bias and mean squared error for the estimator \({\hat{\Psi }}\), and the coverage probability of the approximate 95\(\%\) confidence interval for \(\Psi \).

Table 2 Bias and mean squared error (MSE) of the estimator \({\hat{\Psi }}\), and coverage probability (CP) of the approximate 95% confidence interval for \(\Psi \)

The bias and the mean squared error each approached zero as the sample size increased. When the sample size is 100, in all settings, the coverage probabilities of the approximate 95% confidence interval for \(\Psi \) were over 94%. We believe that the estimator \({\hat{\Psi }}\) and the interval \({\hat{\Psi }} \pm z_{\alpha / 2} \widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) perform well when the sample size is over 100.

These results are similar to those of Ando (2021, 2022), and we believe that the proposed procedure works well even for finite samples.

5 Application to data

5.1 Application to artificial data

In this section, we show that when comparing the degrees of deviancy from the ME model for two datasets, the magnitude relation of the degrees of deviancy may change between the measure that depends on the probabilities \(\Pr (X=i, Y=i)\) (i.e., the proposed measure \(\Psi \)) and the measure that does not (i.e., the existing measure \(\Phi \)).

Table 3 shows the artificial data with \(N = 300\) each. The proportions of the observed frequencies of the main diagonal cells to the total observed frequencies in Tables 3a and 3b are 34.3% and 81.0%, respectively. It must be noted that these two datasets differ greatly in the concentration ratio of the observed frequencies of the main diagonal cells. Table 4 shows the estimates of \(\Phi \) and \(\Psi \), the approximate standard errors of \({\hat{\Phi }}\) and \({\hat{\Psi }}\), and the approximate 95\(\%\) confidence intervals for \(\Phi \) and \(\Psi \) for the data of Tables 3a and 3b.

Table 3 Artificial square contingency tables; \(N = 300\) for each

Table 4 Estimates of \(\Phi \) and \(\Psi \), approximate standard errors of \({\hat{\Phi }}\) and \({\hat{\Psi }}\), and approximate 95\(\%\) confidence intervals for \(\Phi \) and \(\Psi \) for the data of Tables 3a and 3b

From Table 4, when the \(\Phi \) is used, we can infer that the degree of deviancy from the ME model in Table 3b is larger than that in Table 3a. On the other hand, when the \(\Psi \) is used, we can infer that the degree of deviancy from the ME model in Table 3a is larger than that in Table 3b. Thus, the magnitude relation of the degrees of deviancy from the ME model in Tables 3a and 3b changes between the \(\Phi \) and \(\Psi \). We believe that the measure \(\Phi \), which does not depend on the probabilities \(\Pr (X=i, Y=i)\), may overestimate the degree of deviancy from the ME model. Therefore, we recommend using the \(\Psi \) when the proportions of the observed frequencies of the main diagonal cells to the total observed frequencies differ between the two datasets.

5.2 Application to real data

We consider the data in Table 5, taken from Sugano et al. (2012). These data were obtained by cross-classifying repeated measurements of an ordinal categorical outcome.

Table 5 Modified LANZA scores at end of study for the esomeprazole and placebo groups, respectively, stratified by the modified LANZA score at baseline; source: Sugano et al. (2012)

In this clinical trial, Japanese adult patients eligible for inclusion were those with an endoscopically confirmed history of peptic ulcers who required long-term oral NSAID therapy for a chronic inflammatory condition. Patients were randomized into an esomeprazole group and a placebo group. The modified LANZA score (MLS) is categorized as “0” (best score), “+1”, “+2”, “+3”, and “+4” (worst score).

For these data, the row-inequality implies an improvement, while the column-inequality implies a worsening, in the change in the MLS from baseline to end of study. As a matter of clinical interest, we want to evaluate whether patients in the esomeprazole group improve more than patients in the placebo group. Therefore, we are interested in comparing the degrees of deviancy from the ME model in Table 5a and b, while also discriminating the direction of the two kinds of complete-inequality. The proportions of the observed frequencies of the main diagonal cells to the total observed frequencies in Table 5a and b are 56.0% and 37.0%, respectively. Because these proportions differ, we should use the \(\Psi \) rather than the \(\Phi \) for analyzing these data.

Table 6 shows the estimates of \(\Phi \) and \(\Psi \), the approximate standard errors of \({\hat{\Phi }}\) and \({\hat{\Psi }}\), and the approximate 95\(\%\) confidence intervals for \(\Phi \) and \(\Psi \) for the data of Table 5a and b.

Table 6 Estimates of \(\Phi \) and \(\Psi \), approximate standard errors of \({\hat{\Phi }}\) and \({\hat{\Psi }}\), and approximate 95\(\%\) confidence intervals for \(\Phi \) and \(\Psi \) for the data of Table 5a and b

From Table 6, we can see that (i) the data in Table 5a have the row-inequality, because the confidence interval for \(\Psi \) lies entirely below zero, and (ii) the data in Table 5b have the column-inequality. From the above results, we infer that esomeprazole is effective compared to the placebo.

The transitional model (see Agresti (2018, Sect. 9.4.3)) is widely used for data from repeated measurements of an ordinal categorical outcome. However, using the VGAM R package version 1.1-3, we could not obtain the estimated parameters of the transitional model for these data, because no patients whose MLS at baseline is “+4” were observed in the placebo group. For such sparse data, it may not be appropriate to apply the transitional model, although the \(\Psi \) can be applied.

6 Discussions

We consider a measure based on Eq. (1) that can evaluate only the degree of deviancy from the ME model, as does the measure of Yamamoto and Tomizawa (2007). Assuming that \(\sum _{i=1}^{R-1} (F^X_i + F^Y_i) \ne 0\), the measure is defined by

$$\begin{aligned} \psi = \frac{1}{\log 2}\left[ F^X\log \left( \frac{F^X}{1/2}\right) +F^Y\log \left( \frac{F^Y}{1/2}\right) \right] . \end{aligned}$$

The index \(\psi \) has the following properties: (i) the range of \(\psi \) is from zero to one; (ii) the \(\psi \) is equal to zero if and only if \(\sum _{i=1}^{R-1} F^X_i\) is equal to \(\sum _{i=1}^{R-1} F^Y_i\); and (iii) the \(\psi \) is equal to one if and only if there is a structure of complete-row-inequality or complete-column-inequality. From the above property (iii), we point out that the \(\psi \) cannot discriminate between the two directions of complete-inequality, although the \(\Psi \) can. Therefore, we believe that the \(\Psi \) is superior to the \(\psi \).
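The loss of direction in \(\psi \) can be seen in a short sketch. The function name `psi_nondirectional` and the tables are hypothetical, and the convention \(0 \log 0 = 0\) is assumed here so that the measure is defined at the two extremes.

```python
import math
import numpy as np

def psi_nondirectional(p):
    """Non-directional measure (lowercase psi) from an R x R table of cell probabilities."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    w = R - np.arange(1, R + 1)          # weight R - k for category k
    a = float(w @ p.sum(axis=1))         # sum_i F^X_i
    b = float(w @ p.sum(axis=0))         # sum_i F^Y_i
    if a + b == 0:
        raise ValueError("psi is undefined when both sums are zero")
    shares = (a / (a + b), b / (a + b))  # F^X and F^Y
    # Assumed convention: terms with a zero share contribute 0 (0 log 0 = 0).
    return sum(s * math.log(2 * s) for s in shares if s > 0) / math.log(2)

# Hypothetical tables (illustrative only).
complete_row = np.array([[0.0, 0.0, 0.5],
                         [0.0, 0.0, 0.5],
                         [0.0, 0.0, 0.0]])
complete_col = complete_row.T
print(psi_nondirectional(complete_row), psi_nondirectional(complete_col))
```

Both extreme structures give the same value of \(\psi \), whereas \(\Psi \) assigns them opposite signs.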

7 Concluding remarks

This study proposed the measure \(\Psi \) based on Eq. (1) that can concurrently evaluate the degree and direction of deviancy from the ME model. Generally, when a model does not fit the data, we are interested in (i) applying a model with weaker restrictions, or (ii) measuring the degree of deviancy from the model. When the ME model does not fit the data, we are interested only in measuring the degree of deviancy from the ME model, because the only model with weaker restrictions than the ME model is the saturated model.

We pointed out that the magnitude relation of the degrees of deviancy from the ME model in two datasets may change between the \(\Phi \) and \(\Psi \). Moreover, we recommended using the \(\Psi \) when the proportions of the observed frequencies of the main diagonal cells to the total observed frequencies differ between the two datasets.

The estimator of the \(\Psi \) is approximately unbiased when the sample size is large. When the sample size is small (e.g., under 100), however, it may be biased. For some measures representing the degree of deviancy from a model in square contingency tables, Tomizawa et al. (2007), Tahata et al. (2014), and Iki and Tomizawa (2017) showed that an approximately unbiased estimator can be obtained by using the second-order term in the Taylor series expansion even if the sample size is small. Similarly, such an estimator of the \(\Psi \) can be constructed. This will be investigated in future research.