Abstract
This study proposes a measure that can concurrently evaluate the degree and direction of deviancy from the marginal mean equality (ME) model in square contingency tables with ordered categories. The proposed measure is constructed as a function of the row and column cumulative marginal probabilities. When the ME model does not fit the data, we are interested in measuring the degree of deviancy from the ME model, because the only model with weaker restrictions than the ME model is the saturated model. The existing measure, which represents the degree of deviancy from the ME model, does not depend on the probabilities that observations fall in the main diagonal cells of the table. For data in which observations are concentrated in the main diagonal cells, the existing measure may overestimate the degree of deviancy from the ME model. The proposed measure addresses this issue. This study derives an estimator and an approximate confidence interval for the proposed measure using the delta method. The proposed measure is useful for comparing the degrees of deviancy from the ME model in two datasets. Its usefulness is demonstrated through an application to real clinical trial data.
1 Introduction
This study focuses on an \(R \times R\) square contingency table with ordered categories. For example, such tables are obtained by cross-classifying repeated measurements of an ordinal categorical outcome.
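Let X and Y denote the row and column variables, respectively. The marginal mean equality (ME) model proposed by Tomizawa (1991) is defined by
$$\mathrm{E}(X) = \mathrm{E}(Y),$$
where
$$\mathrm{E}(X) = \sum_{i=1}^{R} i \Pr (X=i), \quad \mathrm{E}(Y) = \sum_{i=1}^{R} i \Pr (Y=i).$$
The ME model can also be expressed as
$$\sum_{i=1}^{R-1} F^X_i = \sum_{i=1}^{R-1} F^Y_i, \tag{1}$$
where
$$F^X_i = \Pr (X \le i), \quad F^Y_i = \Pr (Y \le i).$$
The \(F^X_i\) and \(F^Y_i\), for \(i=1, 2, \dots , R-1\), are the row and column cumulative marginal probabilities, respectively. By considering the difference between \(F^X_i\) and \(F^Y_i\), for \(i=1,2,\dots ,R-1\), the ME model can be further expressed as
$$\sum_{i=1}^{R-1} G_{1(i)} = \sum_{i=1}^{R-1} G_{2(i)}, \tag{2}$$
where
$$G_{1(i)} = \Pr (X \le i, Y \ge i+1), \quad G_{2(i)} = \Pr (X \ge i+1, Y \le i).$$
It must be noted that Eq. (1) depends on the probabilities that observations fall in the main diagonal cells of the table (i.e., \(\Pr (X=i, Y=i)\) for \(i=1,\dots ,R\)), whereas Eq. (2) does not depend on them.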
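The relations among the three expressions of the ME model can be checked numerically. The following is a minimal sketch, assuming NumPy, a hypothetical \(3 \times 3\) probability table `p`, and the definitions \(G_{1(i)} = \Pr (X \le i, Y \ge i+1)\) and \(G_{2(i)} = \Pr (X \ge i+1, Y \le i)\); the function names are illustrative and do not come from the paper.

```python
import numpy as np

def cumulative_marginals(p):
    """Return (F_X, F_Y) for i = 1..R-1 from an R x R probability table."""
    p = np.asarray(p, dtype=float)
    F_X = np.cumsum(p.sum(axis=1))[:-1]  # Pr(X <= i), from row marginals
    F_Y = np.cumsum(p.sum(axis=0))[:-1]  # Pr(Y <= i), from column marginals
    return F_X, F_Y

def off_diagonal_cumulatives(p):
    """Return (G1, G2) under the assumed definitions
    G1[i] = Pr(X <= i, Y >= i+1) and G2[i] = Pr(X >= i+1, Y <= i)."""
    p = np.asarray(p, dtype=float)
    R = p.shape[0]
    G1 = np.array([p[:i, i:].sum() for i in range(1, R)])  # upper-right blocks
    G2 = np.array([p[i:, :i].sum() for i in range(1, R)])  # lower-left blocks
    return G1, G2

# Hypothetical cell probabilities (rows: X, columns: Y); not data from the paper.
p = np.array([[0.20, 0.10, 0.05],
              [0.05, 0.25, 0.10],
              [0.02, 0.03, 0.20]])

F_X, F_Y = cumulative_marginals(p)
G1, G2 = off_diagonal_cumulatives(p)

scores = np.arange(1, p.shape[0] + 1)
mean_x = scores @ p.sum(axis=1)  # E(X)
mean_y = scores @ p.sum(axis=0)  # E(Y)

# F^X_i - F^Y_i equals G_1(i) - G_2(i) for every i.
print(np.allclose(F_X - F_Y, G1 - G2))
# E(X) - E(Y) equals sum(F^Y) - sum(F^X).
print(np.isclose(mean_x - mean_y, F_Y.sum() - F_X.sum()))
```

The diagonal cells enter `F_X` and `F_Y` but cancel in their difference, which is why the sums of `G1` and `G2` are free of \(\Pr (X=i, Y=i)\).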
In general, when a model of interest does not fit the data, we are interested in (i) applying a model with weaker restrictions, or (ii) measuring the degree of deviancy from the model. When the marginal homogeneity (MH) model (Stuart 1955), which has the restrictions \(F^X_i =F^Y_i \) (or \(G_{1(i)}=G_{2(i)}\)) for \(i=1,2,\dots ,R-1\), does not fit the data, we may be interested in applying models with weaker restrictions than the MH model (e.g., the ME model), or in measuring the degree of deviancy from the MH model. For details of models with weaker restrictions than the MH model, see, for example, Tahata and Tomizawa (2008, 2014), Kurakami et al. (2013), and Shinoda et al. (2021). For details of measures of deviancy from the MH model, see, for example, Tomizawa et al. (2003), Tahata et al. (2006, 2012), and Yamamoto et al. (2011). The measures of Tahata et al. (2006, 2012) were constructed based on the \(\{F^X_i\}\) and \(\{F^Y_i\}\), and the measures of Tomizawa et al. (2003) and Yamamoto et al. (2011) were constructed based on the \(\{G_{1(i)}\}\) and \(\{G_{2(i)}\}\).
Ando (2019) showed that when comparing the degrees of deviancy from the MH model for two datasets, the magnitude relation of the degrees of deviancy may change between a measure that depends on the probabilities \(\Pr (X=i, Y=i)\) and a measure that does not. Thus, for a dataset in which observations are concentrated in the main diagonal cells, a measure that does not depend on the probabilities \(\Pr (X=i, Y=i)\) may overestimate the degree of deviancy from the MH model. Indeed, in square contingency tables the row and column variables are known to be strongly associated, so observations tend to be concentrated in the main diagonal cells.
When the ME model does not fit the data, we are interested only in measuring the degree of deviancy from it, because the only model with weaker restrictions than the ME model is the saturated model. Note that the ME model has one degree of freedom. Yamamoto and Tomizawa (2007) proposed a measure based on Eq. (2) that represents the degree of deviancy from the ME model. This measure, however, cannot discriminate the direction of deviancy from the ME model, although there are two directions (i.e., \(\sum _{i=1}^{R-1} G_{1(i)} > \sum _{i=1}^{R-1} G_{2(i)}\) and \(\sum _{i=1}^{R-1} G_{1(i)} < \sum _{i=1}^{R-1} G_{2(i)}\)). This is because the measure takes its maximum value of one both when \(\sum _{i=1}^{R-1} G_{1(i)}>0\) and \(\sum _{i=1}^{R-1} G_{2(i)}=0\) and when \(\sum _{i=1}^{R-1} G_{1(i)}=0\) and \(\sum _{i=1}^{R-1} G_{2(i)}>0\), and its minimum value of zero when \(\sum _{i=1}^{R-1} G_{1(i)}=\sum _{i=1}^{R-1} G_{2(i)}\). To tackle this issue, Ando (2021) proposed a directional measure that can concurrently evaluate the degree and direction of deviancy from the ME model. This directional measure takes its maximum value of one when \(\sum _{i=1}^{R-1} G_{1(i)}=0\) and \(\sum _{i=1}^{R-1} G_{2(i)}>0\), its minimum value of minus one when \(\sum _{i=1}^{R-1} G_{1(i)}>0\) and \(\sum _{i=1}^{R-1} G_{2(i)}=0\), and the value zero when \(\sum _{i=1}^{R-1} G_{1(i)}=\sum _{i=1}^{R-1} G_{2(i)}\).
It must be noted that the above two measures do not depend on the probabilities \(\Pr (X=i, Y=i)\) for \(i=1,\dots ,R\). As with the measure of deviancy from the MH model, for the dataset in which observations are concentrated in the main diagonal cells, the measure that does not depend on the probabilities \(\Pr (X=i, Y=i)\) may overestimate the degree of deviancy from the ME model. To tackle this issue, we are interested in proposing a directional measure that depends on the probabilities \(\Pr (X=i, Y=i)\).
This study proposes a measure based on Eq. (1) that can concurrently evaluate the degree and direction of deviancy from the ME model. We show that when comparing the degrees of deviancy from the ME model for two datasets, the magnitude relation of the degrees of deviancy may change between the proposed and existing measures.
This paper is organized as follows. In Sect. 2, we propose a measure of deviancy from the ME model. In Sect. 3, we derive an approximate confidence interval for the proposed measure. Section 4 evaluates the performance of the estimator and the approximate confidence interval for the proposed measure through numerical experiments. In Sect. 5, we apply the proposed measure to real clinical trial data. We close with discussions and concluding remarks in Sects. 6 and 7.
2 Measure of deviancy from marginal mean equality
2.1 Existing measure
Ando (2021) proposed a measure based on Eq. (2) that can concurrently evaluate the degree and direction of deviancy from the ME model. Assuming that \(\sum _{i=1}^{R-1} (G_{1(i)} + G_{2(i)}) \ne 0\), the measure is defined as follows:
where
The \(\Phi \) has the following properties: (i) the range of \(\Phi \) is from minus one to one; (ii) the \(\Phi \) is equal to zero if and only if \(\sum _{i=1}^{R-1} G_{1(i)}\) is equal to \(\sum _{i=1}^{R-1} G_{2(i)}\); (iii) the \(\Phi \) is equal to minus one if and only if there is a structure with \(\sum _{i=1}^{R-1} G_{1(i)} > 0\) and \(\sum _{i=1}^{R-1} G_{2(i)} = 0\), which is the complete-upper-inequality; and (iv) the \(\Phi \) is equal to one if and only if \(\sum _{i=1}^{R-1} G_{1(i)} = 0\) and \(\sum _{i=1}^{R-1} G_{2(i)} > 0\), which is the complete-lower-inequality.
2.2 Proposed measure
In this section, we propose a measure based on Eq. (1) that can concurrently evaluate the degree and direction of deviancy from the ME model. Assuming that \(\sum _{i=1}^{R-1} (F^X_i + F^Y_i) \ne 0\), the proposed measure is defined as follows:
where
The \(\Psi \) can also be expressed as follows:
The \(\Psi \) has the following properties: (i) the range of \(\Psi \) is from minus one to one; (ii) the \(\Psi \) is equal to zero if and only if \(\sum _{i=1}^{R-1} F^X_i\) is equal to \(\sum _{i=1}^{R-1} F^Y_i\); (iii) the \(\Psi \) is equal to minus one if and only if there is a structure with \(\sum _{i=1}^{R-1} F^X_i > 0\) and \(\sum _{i=1}^{R-1} F^Y_i = 0\), which is the complete-row-inequality; and (iv) the \(\Psi \) is equal to one if and only if \(\sum _{i=1}^{R-1} F^X_i = 0\) and \(\sum _{i=1}^{R-1} F^Y_i > 0\), which is the complete-column-inequality. From the above properties, we see that the proposed measure can concurrently evaluate the degree and direction of deviancy from the ME model.
It must be noted that the complete-row-inequality is different from the complete-upper-inequality, and the complete-column-inequality is different from the complete-lower-inequality. When both \(\sum _{i=1}^{R-1} (G_{1(i)} + G_{2(i)})\) and \(\sum _{i=1}^{R-1} (F^X_i + F^Y_i)\) are positive, the complete-upper-inequality is satisfied whenever the complete-row-inequality holds; however, the converse is not necessarily true.
3 Approximate confidence interval for the proposed index
Let \(f_{ij}\) denote the observed frequency in the (i, j)th cell of the table, and let \(p_{ij}\) denote the probability that an observation falls in the (i, j)th cell of the table. Assume that the observed frequencies \(\{f_{ij}\}\) have a multinomial distribution whose parameters are the cell probabilities \(\{p_{ij}\}\).
Let \({\varvec{f}}\) and \({\varvec{p}}\) be \(R^2 \times 1\) vectors:
From the central limit theorem, \(\hat{{\varvec{p}}}\) is asymptotically distributed as a normal distribution with mean vector \({\varvec{p}}\) and covariance matrix \(\frac{1}{N} ({\textbf {diag}}({\varvec{p}}) - {\varvec{p}}{\varvec{p}}^\top )\), where \(\hat{{\varvec{p}}} = \frac{1}{N} {\varvec{f}}\), \(N = \sum \sum f_{ij}\), and \({\textbf {diag}}({\varvec{p}})\) is a diagonal matrix with the elements of \({\varvec{p}}\) on the main diagonal. Then, we obtain
where the estimator of \(\Psi \), i.e., \({\hat{\Psi }}\), is given by \(\Psi \) with \(\{p_{ij}\}\) replaced by \(\{{\hat{p}}_{ij}\}\).
Using the delta method, descriptions of which are given by, e.g., Bishop et al. (2007, Sect. 14.6), we derive the approximate variance for the estimated measure and the large-sample confidence interval for the \(\Psi \). From the delta method, \(\sqrt{N}({\hat{\Psi }} - \Psi )\) asymptotically (as \(N \rightarrow \infty \)) has a normal distribution with mean zero and variance
where
Since
we obtain
The estimator of \(\sigma ^{2}[{\hat{\Psi }}]\), i.e., \(\widehat{\sigma ^{2}[{\hat{\Psi }}]}\), is given by \(\sigma ^{2}[{\hat{\Psi }}]\) with \(\{p_{ij}\}\) replaced by \(\{{\hat{p}}_{ij}\}\). The \(\widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) is an estimated standard error for \({\hat{\Psi }}\), and \({\hat{\Psi }} \pm z_{\alpha / 2} \widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) is an approximate \(100(1 - \alpha ) \%\) confidence interval for \(\Psi \), where \(z_{\alpha / 2}\) is the percentage point of the standard normal distribution that corresponds to a two-tail probability equal to \(\alpha \).
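The delta-method steps of this section can be sketched generically for any smooth function of the cell probabilities. The sketch below assumes NumPy and replaces the analytic partial derivatives with a central-difference gradient. The function `placeholder_measure` is an illustrative stand-in, namely \((\sum F^Y_i - \sum F^X_i)/(\sum F^X_i + \sum F^Y_i)\), and is not the paper's \(\Psi \); the frequency table is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def placeholder_measure(p_vec, R):
    """Illustrative stand-in, NOT the paper's Psi:
    (sum F^Y - sum F^X) / (sum F^X + sum F^Y)."""
    p = p_vec.reshape(R, R)
    sum_FX = np.cumsum(p.sum(axis=1))[:-1].sum()
    sum_FY = np.cumsum(p.sum(axis=0))[:-1].sum()
    return (sum_FY - sum_FX) / (sum_FX + sum_FY)

def delta_method_ci(freq, measure, alpha=0.05, eps=1e-6):
    """Estimate, standard error, and approximate 100(1 - alpha)% interval."""
    freq = np.asarray(freq, dtype=float)
    R = freq.shape[0]
    N = freq.sum()
    p_hat = (freq / N).ravel()

    # Central-difference approximation of the gradient of the measure.
    grad = np.empty_like(p_hat)
    for k in range(p_hat.size):
        step = np.zeros_like(p_hat)
        step[k] = eps
        grad[k] = (measure(p_hat + step, R) - measure(p_hat - step, R)) / (2 * eps)

    # Multinomial covariance of sqrt(N) * p_hat: diag(p) - p p^T.
    cov = np.diag(p_hat) - np.outer(p_hat, p_hat)
    sigma2 = max(grad @ cov @ grad, 0.0)  # guard against rounding below zero

    est = measure(p_hat, R)
    se = np.sqrt(sigma2 / N)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return est, se, (est - z * se, est + z * se)

# Hypothetical 3 x 3 frequency table with N = 100; not data from the paper.
freq = np.array([[20, 10, 5],
                 [6, 30, 9],
                 [2, 4, 14]])

est, se, (lo, hi) = delta_method_ci(freq, placeholder_measure)
print(f"estimate = {est:.4f}, SE = {se:.4f}, 95% CI = ({lo:.4f}, {hi:.4f})")
```

A numerical gradient is used here only to keep the sketch independent of any particular measure; with closed-form derivatives, `grad` would be filled in directly.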
4 Simulation study
4.1 Setting
Assume that the observed frequencies \(\{f_{ij}\}\) in contingency tables have a multinomial distribution with parameters that are the cell probabilities \(\{p_{ij}\}\) in Tables 1a, 1b, 1c and 1d. In other words, we generate contingency tables by a multinomial random number generator based on probability distributions in Tables 1a, 1b, 1c and 1d.
The values of the \(\Psi \) for Tables 1a, 1b, 1c and 1d are \(-0.0607\), \(-0.1410\), 0.1031 and 0.2222, respectively. The sample sizes are set to \(N=30, 50, 100, 150, 200, 300\). The number of iterations is 10,000.
Let \({\hat{\Psi }}_s\) be the estimate of \(\Psi \) in the sth iteration (\(s=1,\dots ,10000\)). We evaluate the performance of the estimator \({\hat{\Psi }}\) using the bias (i.e., \(\sum _{s=1}^{10{,}000}({\hat{\Psi }}_s-\Psi )\)/10,000) and the mean squared error (i.e., \(\sum _{s=1}^{10{,}000}({\hat{\Psi }}_s-\Psi )^2\)/10,000). We also evaluate the performance of the interval \({\hat{\Psi }} \pm z_{\alpha / 2} \widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) using the coverage probability of the approximate 95% confidence interval.
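The simulation design described above (multinomial sampling, then bias, mean squared error, and coverage) can be sketched as follows. This is a minimal illustration under stated assumptions: NumPy is used, the probability table `p_true` is hypothetical rather than one of Tables 1a to 1d, `measure` is an illustrative stand-in and not the paper's \(\Psi \), the standard error comes from a numerical delta method, and 2,000 iterations are used instead of 10,000 to keep the run short.

```python
import numpy as np

Z_975 = 1.959964  # standard normal quantile for a two-sided 95% interval

def measure(p):
    """Illustrative stand-in, NOT the paper's Psi."""
    sum_FX = np.cumsum(p.sum(axis=1))[:-1].sum()
    sum_FY = np.cumsum(p.sum(axis=0))[:-1].sum()
    return (sum_FY - sum_FX) / (sum_FX + sum_FY)

def estimate_and_se(freq, eps=1e-6):
    """Plug-in estimate and delta-method standard error (numerical gradient)."""
    N = freq.sum()
    p_hat = freq / N
    grad = np.zeros(p_hat.size)
    for k in range(p_hat.size):
        step = np.zeros(p_hat.size)
        step[k] = eps
        step = step.reshape(p_hat.shape)
        grad[k] = (measure(p_hat + step) - measure(p_hat - step)) / (2 * eps)
    v = p_hat.ravel()
    sigma2 = grad @ (np.diag(v) - np.outer(v, v)) @ grad
    return measure(p_hat), np.sqrt(max(sigma2, 0.0) / N)

def simulate(p_true, N, n_iter, seed=0):
    """Return (bias, mean squared error, coverage of the 95% interval)."""
    rng = np.random.default_rng(seed)
    target = measure(p_true)
    tables = rng.multinomial(N, p_true.ravel(), size=n_iter)
    errors = np.empty(n_iter)
    covered = np.empty(n_iter, dtype=bool)
    for s, f in enumerate(tables):
        est, se = estimate_and_se(f.reshape(p_true.shape).astype(float))
        errors[s] = est - target
        covered[s] = abs(est - target) <= Z_975 * se
    return errors.mean(), (errors ** 2).mean(), covered.mean()

# Hypothetical true cell probabilities; not one of the paper's Tables 1a-1d.
p_true = np.array([[0.20, 0.10, 0.05],
                   [0.05, 0.25, 0.10],
                   [0.02, 0.03, 0.20]])

bias, mse, coverage = simulate(p_true, N=100, n_iter=2000)
print(f"bias = {bias:.4f}, MSE = {mse:.5f}, coverage = {coverage:.3f}")
```

Repeating the call over several values of `N` gives the kind of sample-size comparison summarized in Table 2.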
4.2 Results
Table 2 shows the bias and mean squared error for the estimator \({\hat{\Psi }}\), and the coverage probability of the approximate 95\(\%\) confidence interval for \(\Psi \).
The bias and the mean squared error both approached zero as the sample size increased. When the sample size is 100, the coverage probabilities of the approximate 95% confidence interval for \(\Psi \) exceeded 94% in all settings. We believe that the estimator \({\hat{\Psi }}\) and the interval \({\hat{\Psi }} \pm z_{\alpha / 2} \widehat{\sigma [{\hat{\Psi }}]} / \sqrt{N}\) perform well when the sample size is over 100.
These results are similar to those of Ando (2021, 2022), and we believe that the proposed procedure works well even for finite samples.
5 Application to data
5.1 Application to artificial data
In this section, we show that when comparing the degrees of deviancy from the ME model for two datasets, the magnitude relation of the degrees of deviancy may change between the measure that depends on the probabilities \(\Pr (X=i, Y=i)\) (i.e., the proposed measure \(\Psi \)) and the measure that does not (i.e., the existing measure \(\Phi \)).
Table 3 shows two artificial datasets, each with \(N = 300\). The proportions of the observed frequencies of the main diagonal cells to the total observed frequencies in Tables 3a and 3b are 34.3% and 81.0%, respectively. It must be noted that these two datasets differ greatly in the concentration of observed frequencies in the main diagonal cells. Table 4 shows the estimates of \(\Phi \) and \(\Psi \), the approximate standard errors of \({\hat{\Phi }}\) and \({\hat{\Psi }}\), and the approximate 95\(\%\) confidence intervals for \(\Phi \) and \(\Psi \) for the data of Tables 3a and 3b.
From Table 4, when the \(\Phi \) is used, we can infer that the degree of deviancy from the ME model in Table 3b is larger than that in Table 3a. On the other hand, when the \(\Psi \) is used, we can infer that the degree of deviancy from the ME model in Table 3a is larger than that in Table 3b. Thus, the magnitude relation of the degrees of deviancy from the ME model in Tables 3a and 3b changes between the \(\Phi \) and \(\Psi \). We believe that the measure \(\Phi \), which does not depend on the probabilities \(\Pr (X=i, Y=i)\), may overestimate the degree of deviancy from the ME model. Therefore, we recommend using the \(\Psi \) when the proportions of the observed frequencies of the main diagonal cells to the total observed frequencies differ between two datasets.
5.2 Application to real data
We consider the data in Table 5, taken from Sugano et al. (2012). These data were obtained by cross-classifying repeated measurements of an ordinal categorical outcome.
In this clinical trial, the eligible patients were Japanese adults with an endoscopically confirmed history of peptic ulcers who required long-term oral NSAID therapy for a chronic inflammatory condition. Patients were randomized into an esomeprazole group and a placebo group. The modified LANZA score (MLS) is categorized as “0” (best score), “+1”, “+2”, “+3”, and “+4” (worst score).
For these data, the row-inequality implies an improvement, while the column-inequality implies a worsening, in the change in the MLS from baseline to the end of the study. As a matter of clinical interest, we want to evaluate whether patients in the esomeprazole group improve more than patients in the placebo group. Therefore, we are interested in comparing the degrees of deviancy from the ME model in Table 5a and b, while also discriminating between the two directions of complete inequality. The proportions of the observed frequencies of the main diagonal cells to the total observed frequencies in Table 5a and b are 56.0% and 37.0%, respectively. Because these proportions differ, we should use the \(\Psi \) rather than the \(\Phi \) for analyzing these data.
Table 6 shows the estimates of \(\Phi \) and \(\Psi \), the approximate standard errors of \({\hat{\Phi }}\) and \({\hat{\Psi }}\), and the approximate 95\(\%\) confidence intervals for \(\Phi \) and \(\Psi \) for the data of Table 5a and b.
From Table 6, we can see that (i) the data in Table 5a have the row-inequality, because the confidence interval for \(\Psi \) lies entirely below zero, and (ii) the data in Table 5b have the column-inequality. From the above results, we infer that esomeprazole is effective compared with the placebo.
The transitional model (see Agresti 2018, Sect. 9.4.3) is widely used for repeated measurements of an ordinal categorical outcome. However, using the VGAM R package version 1.1-3, we could not obtain the parameter estimates of the transitional model for these data, because no patients with an MLS of “+4” at baseline were observed in the placebo group. For such sparse data, it may not be appropriate to apply the transitional model, whereas the \(\Psi \) can still be applied.
6 Discussions
We consider a measure based on Eq. (1) that, like the measure of Yamamoto and Tomizawa (2007), can evaluate only the degree of deviancy from the ME model. Assuming that \(\sum _{i=1}^{R-1} (F^X_i + F^Y_i) \ne 0\), the measure is defined by
The index \(\psi \) has the following properties: (i) the range of \(\psi \) is from zero to one; (ii) the \(\psi \) is equal to zero if and only if \(\sum _{i=1}^{R-1} F^X_i\) is equal to \(\sum _{i=1}^{R-1} F^Y_i\); and (iii) the \(\psi \) is equal to one if and only if the complete-row-inequality or the complete-column-inequality holds. From property (iii), we point out that the \(\psi \) cannot discriminate between the two directions of complete inequality, whereas the \(\Psi \) can. Therefore, we believe that the \(\Psi \) is superior to the \(\psi \).
7 Concluding remarks
This study proposed the measure \(\Psi \) based on Eq. (1) that can concurrently evaluate the degree and direction of deviancy from the ME model. In general, when a model does not fit the data, we are interested in (i) applying a model with weaker restrictions, or (ii) measuring the degree of deviancy from the model. When the ME model does not fit the data, we are interested only in measuring the degree of deviancy from the ME model, because the only model with weaker restrictions than the ME model is the saturated model.
We pointed out that the magnitude relation of the degrees of deviancy from the ME model in two datasets may change between the \(\Phi \) and \(\Psi \). Moreover, we recommended using the \(\Psi \) when the proportions of the observed frequencies of the main diagonal cells to the total observed frequencies differ between two datasets.
The estimator of the \(\Psi \) is asymptotically unbiased, that is, approximately unbiased when the sample size is large. When the sample size is small (e.g., under 100), however, it may be biased. For some measures representing the degree of deviancy from a model in square contingency tables, Tomizawa et al. (2007), Tahata et al. (2014), and Iki and Tomizawa (2017) showed that an approximately unbiased estimator can be obtained by using the second-order term in the Taylor series expansion even if the sample size is small. An improved estimator of the \(\Psi \) could be constructed similarly. This will be investigated in future research.
References
Agresti A (2018) An introduction to categorical data analysis, 3rd edn. Wiley, Hoboken
Ando S (2019) A bivariate index for visually measuring marginal inhomogeneity in square tables. Int J Stat Probab 8(5):58–65
Ando S (2021) An index to simultaneously analyze the degree and directionality of departure from global marginal homogeneity in square contingency tables. J Korean Stat Soc 50(4):997–1008
Ando S (2022) Directional measure for analyzing the degree of deviance from generalized marginal mean equality model in square contingency tables. Sankhya B 84:708–721
Bishop YM, Fienberg SE, Holland PW (2007) Discrete multivariate analysis: theory and practice. Springer, New York
Iki K, Tomizawa S (2017) Improved estimator of measure for marginal homogeneity using marginal odds in square contingency tables. J Adv Stat 2(2):71–108
Kurakami H, Tahata K, Tomizawa S (2013) Generalized marginal cumulative logistic model for multi-way contingency tables. SUT J Math 49(1):19–32
Shinoda S, Tahata K, Yamamoto K, Tomizawa S (2021) Marginal continuation odds ratio model and decomposition of marginal homogeneity model for multi-way contingency tables. Sankhya B 83(2):304–324
Stuart A (1955) A test for homogeneity of the marginal distributions in a two-way classification. Biometrika 42(3/4):412–416
Sugano K, Kinoshita Y, Miwa H, Takeuchi T, Group ENPS (2012) Randomised clinical trial: esomeprazole for the prevention of nonsteroidal anti-inflammatory drug-related peptic ulcers in Japanese patients. Aliment Pharmacol Ther 36(2):115–125
Tahata K, Tomizawa S (2008) Generalized marginal homogeneity model and its relation to marginal equimoments for square contingency tables with ordered categories. Adv Data Anal Classif 2(3):295–311
Tahata K, Tomizawa S (2014) Symmetry and asymmetry models and decompositions of models for contingency tables. SUT J Math 50(2):131–165
Tahata K, Iwashita T, Tomizawa S (2006) Measure of departure from symmetry of cumulative marginal probabilities for square contingency tables with ordered categories. SUT J Math 42(1):7–29
Tahata K, Kawasaki K, Tomizawa S (2012) Asymmetry index on marginal homogeneity for square contingency tables with ordered categories. Open J Stat 2(2):198–203
Tahata K, Tanaka H, Tomizawa S (2014) Refined estimators of measures for marginal homogeneity in square contingency tables. Int J Pure Appl Math 90(4):501–513
Tomizawa S (1991) Decomposing the marginal homogeneity model into two models for square contingency tables with ordered categories. Calcutta Stat Assoc Bull 41(1–4):201–208
Tomizawa S, Miyamoto N, Ashihara N (2003) Measure of departure from marginal homogeneity for square contingency tables having ordered categories. Behaviormetrika 30(2):173–193
Tomizawa S, Miyamoto N, Ohba N (2007) Improved approximate unbiased estimators of measures of asymmetry for square contingency tables. Adv Appl Stat 7:47–63
Yamamoto K, Tomizawa S (2007) Decomposition of measure for marginal homogeneity in square contingency tables with ordered categories. Austrian J Stat 36(2):105–114
Yamamoto K, Ando S, Tomizawa S (2011) A measure of departure from average marginal homogeneity for square contingency tables with ordered categories. REVSTAT Stat J 9(2):115–126
Acknowledgements
The author would like to thank the anonymous reviewers and the editors for their comments and suggestions to improve this paper.
Funding
Open Access funding provided by Tokyo University of Science. The authors have solely funded the research by themselves.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ando, S. Measure of deviancy from marginal mean equality based on cumulative marginal probabilities in square contingency tables. Metrika (2024). https://doi.org/10.1007/s00184-023-00945-x
DOI: https://doi.org/10.1007/s00184-023-00945-x