Marginal Continuation Odds Ratio Model and Decomposition of Marginal Homogeneity Model for Multi-way Contingency Tables

For square contingency tables with ordered categories, the marginal homogeneity model is represented by various expressions, and some extensions of the marginal homogeneity model were proposed. Herein we consider the marginal continuation-ratio to examine a new expression of the marginal homogeneity model. We also propose an extension of the marginal homogeneity model using the ratio of marginal continuation-ratios; namely, the marginal continuation odds ratio. The proposed model can be interpreted in various ways. Additionally, we propose a generalization of it, and decompose the marginal homogeneity model using the generalized model. Furthermore, we extend the models and decompositions into multi-way contingency tables.


Introduction
Consider an $R \times R$ square contingency table with the same row and column ordinal classifications. Let $X$ and $Y$ denote the row and column variables, respectively, and let $\Pr(X = i, Y = j) = p_{ij}$ for $i = 1, \ldots, R$; $j = 1, \ldots, R$. The marginal homogeneity (MH) model is defined by
$$p_{i\cdot} = p_{\cdot i} \quad \text{for } i = 1, \ldots, R, \tag{1.1}$$
where $p_{i\cdot} = \sum_{t=1}^{R} p_{it}$ and $p_{\cdot i} = \sum_{s=1}^{R} p_{si}$. See e.g., Stuart (1955) and Bishop et al. (1975, p.294). This indicates that the row marginal distribution is identical to the column marginal distribution.
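As a minimal sketch of definition (1.1), the following Python code computes the row and column marginals of a square table of cell probabilities and checks whether they coincide. The function names and both 3 × 3 tables are hypothetical illustrations, not data from this paper.

```python
def marginals(table):
    """Return (row, column) marginal probabilities of a square table."""
    size = len(table)
    row = [sum(table[i]) for i in range(size)]
    col = [sum(table[s][i] for s in range(size)) for i in range(size)]
    return row, col


def is_marginally_homogeneous(table, tol=1e-12):
    """Check p_{i.} = p_{.i} for every category i, up to a tolerance."""
    row, col = marginals(table)
    return all(abs(r - c) <= tol for r, c in zip(row, col))


# Hypothetical cell probabilities (assumed for illustration only).
symmetric = [[0.20, 0.05, 0.05],
             [0.05, 0.20, 0.10],
             [0.05, 0.10, 0.20]]
shifted = [[0.20, 0.10, 0.10],
           [0.02, 0.20, 0.15],
           [0.01, 0.02, 0.20]]

print(is_marginally_homogeneous(symmetric))
print(is_marginally_homogeneous(shifted))
```

A symmetric table always satisfies the MH model, whereas the second table concentrates its mass above the diagonal, so its row and column marginals differ.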
Using the marginal cumulative probability, this model can be expressed as
$$F_i^X = F_i^Y \quad \text{for } i = 1, \ldots, R - 1, \tag{1.2}$$
where $F_i^X = \sum_{s=1}^{i} p_{s\cdot} = \Pr(X \le i)$ and $F_i^Y = \sum_{t=1}^{i} p_{\cdot t} = \Pr(Y \le i)$. The MH model can also be expressed in other equivalent forms (see e.g., Tomizawa 1993). Furthermore, Tahata et al. (2006) expressed the MH model using marginal ridits (see e.g., Bross 1958; Fleiss et al. 2003, pp.198-205; Agresti 2010, p.10). Moreover, the MH model can be expressed with other formulas (see e.g., Iki et al. 2010; Altun and Aktaş 2018).
When the MH model does not fit the data, we are interested in applying a model with weaker restrictions. One example is an extension based on expression (1.1) proposed by Miyamoto et al. (2006) for a square contingency table with nominal classifications. For a square contingency table with ordinal classifications, the marginal cumulative logistic (ML) model is defined by
$$\log \frac{F_i^X}{1 - F_i^X} = \log \frac{F_i^Y}{1 - F_i^Y} + \Delta \quad \text{for } i = 1, \ldots, R - 1. \tag{1.4}$$
See e.g., McCullagh (1977), Agresti (2010, p.241), and Kurakami et al. (2013). Saigusa et al. (2018) considered the MCL model, which replaces the logit transformation in (1.4) with the complementary log-log transformation:
$$\log\left(-\log\left(1 - F_i^X\right)\right) = \log\left(-\log\left(1 - F_i^Y\right)\right) + \Delta \quad \text{for } i = 1, \ldots, R - 1. \tag{1.5}$$
Herein we examine a new expression of the MH model using the continuation-ratio (see e.g., Fienberg 1980, pp.110-111; Agresti 2010, p.45). The MH model can be expressed as
$$\frac{p_{i\cdot}}{1 - F_i^X} = \frac{p_{\cdot i}}{1 - F_i^Y} \quad \text{for } i = 1, \ldots, R - 1. \tag{1.6}$$
This states that the row marginal continuation-ratio is identical to the column marginal continuation-ratio. Note that various studies have focused on the continuation-ratio (see e.g., Thompson 1977; McCullagh 1980; Läärä and Matthews 1985; Tutz 1991; Greenland 1994). As an example, Thompson (1977) used the continuation-ratio in modeling discrete survival time data. When the lengths of the time intervals approach zero, his model converges to the Cox proportional hazards model.
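The claim that equal marginal continuation-ratios are equivalent to equal marginal distributions can be checked numerically: the continuation-ratios determine the marginal distribution uniquely. The sketch below assumes the continuation-ratio for category i is the probability of category i divided by the probability of all higher categories; the function names and the example marginal are hypothetical.

```python
def continuation_ratios(marginal):
    """p_i / (p_{i+1} + ... + p_R) for i = 1, ..., R-1."""
    return [marginal[i] / sum(marginal[i + 1:])
            for i in range(len(marginal) - 1)]


def marginal_from_continuation_ratios(ratios):
    """Invert continuation_ratios and recover the full marginal distribution."""
    marginal, remaining = [], 1.0
    for c in ratios:
        # Pr(X = i | X >= i) equals c / (1 + c) when c is the continuation-ratio.
        p = remaining * c / (1.0 + c)
        marginal.append(p)
        remaining -= p
    marginal.append(remaining)  # the last category takes the remaining mass
    return marginal


# Hypothetical marginal distribution (assumed for illustration only).
row_marginal = [0.40, 0.37, 0.23]
ratios = continuation_ratios(row_marginal)
recovered = marginal_from_continuation_ratios(ratios)
print(ratios)
print(recovered)
```

Because the mapping is invertible, two marginal distributions share all R − 1 continuation-ratios exactly when they are identical.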
In the analysis of square contingency tables, marginal homogeneity has been studied extensively. However, the continuation-ratio, which is an important concept in categorical data analysis, has received little attention in this framework. As an example, the ML model cannot be interpreted in terms of the continuation-ratio. The purpose of this study is to provide new insight into the analysis of square contingency tables by studying the continuation-ratio. Considering the properties of the continuation-ratio also helps to deepen the understanding of previous research. The plan of the paper is as follows. Section 2 extends the MH model based on expression (1.6). Section 3 decomposes the MH model. Section 4 extends the models to multi-way tables. Section 5 gives a goodness-of-fit test for the models. Section 6 provides some examples, and Section 7 discusses this paper in the context of related work.

The Marginal Continuation Odds Ratio Model
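The ratio of marginal continuation-ratios is
$$\frac{p_{i\cdot} / (1 - F_i^X)}{p_{\cdot i} / (1 - F_i^Y)}$$
for $i = 1, \ldots, R - 1$. We refer to the ratio of marginal continuation-ratios as the marginal continuation odds ratio. Note that this is different from the continuation odds ratio (Agresti 2010, p.24); the quasi-symmetry model based on the continuation odds ratio was presented by Kateri et al. (2017).
The marginal continuation odds ratio (MCOR) model requires this ratio to be the same for every $i$:
$$\frac{p_{i\cdot} / (1 - F_i^X)}{p_{\cdot i} / (1 - F_i^Y)} = \exp(\Delta) \quad \text{for } i = 1, \ldots, R - 1, \tag{2.1}$$
where the parameter $\Delta$ is unspecified. Let $\omega_i^X = \Pr(X = i \mid X \ge i) = p_{i\cdot} / (1 - F_{i-1}^X)$ and $\omega_i^Y = \Pr(Y = i \mid Y \ge i) = p_{\cdot i} / (1 - F_{i-1}^Y)$ denote the marginal conditional probabilities, with $F_0^X = F_0^Y = 0$. Since $\omega_i^X / (1 - \omega_i^X) = p_{i\cdot} / (1 - F_i^X)$, the MCOR model can also be expressed as
$$\log \frac{\omega_i^X}{1 - \omega_i^X} = \log \frac{\omega_i^Y}{1 - \omega_i^Y} + \Delta \quad \text{for } i = 1, \ldots, R - 1. \tag{2.2}$$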
The interpretation of the proposed model can be described using the following example. Consider the comparison of therapeutic effects when two drugs are administered to the same patients. The treatment effect is an ordinal score with $R$ stages (larger scores indicate more severe symptoms). We thus obtain an $R \times R$ contingency table with the same row and column classifications (the row variable is drug A; the column variable is drug B). We are interested in the odds that an observation falls in score category $i$ instead of score category $i + 1$ or above, for any $i$. From Eq. (2.1), under the MCOR model, the parameter $\Delta$ is the log odds ratio between drugs A and B: if $\Delta = 0$, the MH model holds, i.e., there is no difference between drugs A and B; if $\Delta > 0$, these odds are $\exp(\Delta)$ times higher for drug A than for drug B, i.e., drug A is more therapeutically effective than drug B. The MCOR model can also be interpreted in two further ways. From Eq. (2.2), given that an observation falls in score category $i$ or above, the odds that it falls in score category $i$ rather than above $i$ are $\exp(\Delta)$ times higher for drug A than for drug B. Moreover, the conditional probability for drug A is a location shift of that for drug B on a logistic scale.
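The interpretation can be illustrated with a small numerical sketch. Starting from a hypothetical marginal distribution for drug B, the code multiplies every continuation-ratio by exp(Δ) to obtain a marginal distribution for drug A, and then confirms that the ratio of continuation-ratios is constant across categories. The marginal values, the value of Δ, and the function names are assumptions made for illustration only.

```python
import math


def continuation_ratios(marginal):
    """p_i / (p_{i+1} + ... + p_R) for i = 1, ..., R-1."""
    return [marginal[i] / sum(marginal[i + 1:])
            for i in range(len(marginal) - 1)]


def shift_continuation_odds(marginal, delta):
    """Marginal whose continuation-ratios are exp(delta) times those given."""
    factor = math.exp(delta)
    shifted, remaining = [], 1.0
    for c in continuation_ratios(marginal):
        odds = factor * c
        p = remaining * odds / (1.0 + odds)
        shifted.append(p)
        remaining -= p
    shifted.append(remaining)
    return shifted


drug_b = [0.2, 0.3, 0.3, 0.2]   # hypothetical severity distribution, drug B
delta = math.log(1.5)           # hypothetical value of the parameter
drug_a = shift_continuation_odds(drug_b, delta)

odds_ratios = [a / b for a, b in zip(continuation_ratios(drug_a),
                                     continuation_ratios(drug_b))]
print(drug_a)
print(odds_ratios)
```

With a positive Δ the mass of drug A moves toward the lower (less severe) categories, while every marginal continuation odds ratio stays equal to exp(Δ).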
Note that model (1.4) can be transformed into model (2.2) by replacing the marginal cumulative probability with the corresponding marginal conditional probability. However, the meanings of these models completely differ, and the likelihood ratio chi-squared statistics for testing the goodness-of-fit of these models do not coincide.
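
2.2. The Generalized Marginal Continuation-ratio Model
Model (2.2) is an extension of the MH model using the logit transformation. Hence, model (2.2) may be based on the idea of model (1.4). If we focus on the idea of model (1.5), a distinct extension can be derived using the complementary log-log transformation. Therefore, using a strictly increasing function such as a logit or a complementary log-log function, we propose a generalization of the MCOR model by
$$h^{-1}(\omega_i^X) = h^{-1}(\omega_i^Y) + \Delta \quad \text{for } i = 1, \ldots, R - 1, \tag{2.3}$$
where $\omega_i^X = \Pr(X = i \mid X \ge i)$ and $\omega_i^Y = \Pr(Y = i \mid Y \ge i)$ are the marginal conditional probabilities, the parameter $\Delta$ is unspecified, and $h(\cdot)$ is a twice-differentiable and strictly increasing function with $\lim_{x \to -\infty} h(x) = 0$ and $\lim_{x \to \infty} h(x) = 1$. We refer to model (2.3) as the generalized marginal continuation-ratio (GMC) model. Note that $g(\cdot) = h^{-1}(\cdot)$ is a strictly increasing function that gives $\lim_{x \to 0+} g(x) = -\infty$ and $\lim_{x \to 1-} g(x) = \infty$. The GMC model can also be expressed as
$$\omega_i^X = h\left(g(\omega_i^Y) + \Delta\right) \quad \text{for } i = 1, \ldots, R - 1.$$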
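The role of the strictly increasing transformation can be sketched in code. The example below assumes the generalized model shifts the transformed conditional probabilities Pr(X = i | X ≥ i) by a common constant Δ, and it compares the logit and complementary log-log transformations named in the text. The dictionary `LINKS`, the function names, and the numerical values are hypothetical.

```python
import math

# Each entry pairs a transformation on (0, 1) with its inverse on the real line.
LINKS = {
    "logit": (lambda x: math.log(x / (1.0 - x)),
              lambda z: 1.0 / (1.0 + math.exp(-z))),
    "cloglog": (lambda x: math.log(-math.log(1.0 - x)),
                lambda z: 1.0 - math.exp(-math.exp(z))),
}


def conditional_probs(marginal):
    """omega_i = Pr(X = i | X >= i) for i = 1, ..., R-1."""
    out, remaining = [], 1.0
    for p in marginal[:-1]:
        out.append(p / remaining)
        remaining -= p
    return out


def shift_on_link_scale(omega, delta, link_name):
    """Apply a common location shift delta on the chosen transformed scale."""
    link, inverse = LINKS[link_name]
    return [inverse(link(w) + delta) for w in omega]


omega_y = conditional_probs([0.2, 0.3, 0.3, 0.2])  # hypothetical marginal
delta = 0.4                                        # hypothetical shift
for name in LINKS:
    print(name, shift_on_link_scale(omega_y, delta, name))
```

The same Δ produces different conditional probabilities under the two transformations, which is why the special cases of the generalized model can fit a given table differently.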

Properties
In this section, we focus on the complementary log-log and probit transformation as the major transformations for the GMC model.

The Marginal Continuation-ratio Complementary Log-log Model.
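Using the complementary log-log transformation, the GMC model is expressed as
$$\log\left(-\log\left(1 - \omega_i^X\right)\right) = \log\left(-\log\left(1 - \omega_i^Y\right)\right) + \Delta \quad \text{for } i = 1, \ldots, R - 1, \tag{2.4}$$
where $\omega_i^X = \Pr(X = i \mid X \ge i)$ and $\omega_i^Y = \Pr(Y = i \mid Y \ge i)$. We shall refer to model (2.4) as the marginal continuation-ratio complementary log-log (MCC) model. Läärä and Matthews (1985) noted that applying the complementary log-log transformation to the conditional probabilities is equivalent to applying the same transformation to the cumulative probabilities. This leads to the following property.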

Property 1. The MCC model is equivalent to the MCL model.
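We give the proof of Property 1 below. The conditional probabilities $\omega_i^X$ can be expressed as
$$\omega_i^X = \frac{F_i^X - F_{i-1}^X}{1 - F_{i-1}^X}, \quad \text{so that} \quad 1 - \omega_i^X = \frac{1 - F_i^X}{1 - F_{i-1}^X},$$
for $i = 1, \ldots, R - 1$, where $F_0^X = 0$; the same expressions hold for $\omega_i^Y$. Therefore, the MCC model is expressed as
$$\log\left(-\log \frac{1 - F_i^X}{1 - F_{i-1}^X}\right) = \log\left(-\log \frac{1 - F_i^Y}{1 - F_{i-1}^Y}\right) + \Delta \quad \text{for } i = 1, \ldots, R - 1.$$
When $i = 1$, this gives $\log(1 - F_1^X) = e^{\Delta} \log(1 - F_1^Y)$. When $i = 2$, we see
$$\log(1 - F_2^X) - \log(1 - F_1^X) = e^{\Delta} \left\{\log(1 - F_2^Y) - \log(1 - F_1^Y)\right\},$$
and adding the relation for $i = 1$ yields $\log(1 - F_2^X) = e^{\Delta} \log(1 - F_2^Y)$. Hence, in a similar manner we see that the MCC model is expressed as
$$\log\left(-\log\left(1 - F_i^X\right)\right) = \log\left(-\log\left(1 - F_i^Y\right)\right) + \Delta \quad \text{for } i = 1, \ldots, R - 1.$$
This expression represents the MCL model.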
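Property 1 can also be checked numerically. The sketch below builds a marginal distribution for X by shifting the complementary log-log of each conditional probability of a hypothetical Y marginal by Δ, and then measures the shift on the cumulative scale. The function names and numerical values are assumptions for illustration.

```python
import math


def cloglog(x):
    """Complementary log-log transformation log(-log(1 - x))."""
    return math.log(-math.log(1.0 - x))


def mcc_marginal(marginal_y, delta):
    """Marginal of X when cloglog(omega_i^X) = cloglog(omega_i^Y) + delta."""
    power = math.exp(delta)
    marginal_x, rem_x, rem_y = [], 1.0, 1.0
    for p in marginal_y[:-1]:
        omega_y = p / rem_y
        # Inverting the cloglog shift gives 1 - omega_x = (1 - omega_y) ** exp(delta).
        omega_x = 1.0 - (1.0 - omega_y) ** power
        p_x = rem_x * omega_x
        marginal_x.append(p_x)
        rem_x -= p_x
        rem_y -= p
    marginal_x.append(rem_x)
    return marginal_x


def cumulative(marginal):
    """F_i for i = 1, ..., R-1."""
    total, out = 0.0, []
    for p in marginal[:-1]:
        total += p
        out.append(total)
    return out


marginal_y = [0.2, 0.3, 0.3, 0.2]   # hypothetical
delta = 0.3                         # hypothetical
marginal_x = mcc_marginal(marginal_y, delta)
cumulative_shifts = [cloglog(fx) - cloglog(fy)
                     for fx, fy in zip(cumulative(marginal_x),
                                       cumulative(marginal_y))]
print(cumulative_shifts)
```

Every entry of `cumulative_shifts` agrees with Δ up to rounding, so the shift imposed on the conditional probabilities reappears unchanged on the cumulative probabilities.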
From the above, the parameter $\Delta$ in the MCC model can reflect the degree of inhomogeneity not only between the marginal conditional probabilities but also between the marginal cumulative probabilities. Hence, the MCC model also states that one marginal distribution is a location shift of another marginal distribution on a complementary log-log scale.

The Marginal Continuation-ratio Probit Model.
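Using the probit transformation, the GMC model is expressed as
$$\Phi^{-1}(\omega_i^X) = \Phi^{-1}(\omega_i^Y) + \Delta \quad \text{for } i = 1, \ldots, R - 1, \tag{2.5}$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution. We refer to model (2.5) as the marginal continuation-ratio probit (MCP) model.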
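A probit shift of the conditional probabilities can be sketched with the standard library class `statistics.NormalDist`, whose `cdf` and `inv_cdf` methods play the roles of Φ and its inverse. The conditional probabilities, the value of Δ, and the function name `mcp_shift` are hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal distribution (mean 0, sd 1)


def mcp_shift(omega_y, delta):
    """Conditional probabilities of X under a common shift on the probit scale."""
    return [std_normal.cdf(std_normal.inv_cdf(w) + delta) for w in omega_y]


omega_y = [0.2, 0.375, 0.6]   # hypothetical values of Pr(Y = i | Y >= i)
delta = 0.25                  # hypothetical shift
omega_x = mcp_shift(omega_y, delta)
probit_shifts = [std_normal.inv_cdf(a) - std_normal.inv_cdf(b)
                 for a, b in zip(omega_x, omega_y)]
print(omega_x)
print(probit_shifts)
```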

Decompositions of the Marginal Homogeneity Model
Consider the marginal mean equality (ME) model defined by
$$\sum_{i=1}^{R} i \, p_{i\cdot} = \sum_{i=1}^{R} i \, p_{\cdot i},$$
that is, $E(X) = E(Y)$. Note that the MH model implies the ME model.
We obtain the following lemmas and theorem.
Proof. The GMC model is expressed as . . . Thus .

Lemma 2. Under the GMC model, we have
.
In a similar manner, we obtain .

Theorem 1. The MH model holds if and only if both the GMC and ME models hold.
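Proof. If the MH model holds, then the GMC model (with $\Delta = 0$) and the ME model hold. Assuming that the GMC and ME models hold, we shall show that the MH model holds. Since $h(\cdot)$ is strictly increasing, under the GMC model the differences $\omega_i^X - \omega_i^Y$ ($i = 1, \ldots, R - 1$) all have the same sign, which is determined by $\Delta$, and they vanish only if $\Delta = 0$. Because $1 - F_i^X = \prod_{s=1}^{i} (1 - \omega_s^X)$ and $1 - F_i^Y = \prod_{s=1}^{i} (1 - \omega_s^Y)$, the differences $F_i^X - F_i^Y$ ($i = 1, \ldots, R - 1$) share this sign. Since $E(X) = R - \sum_{i=1}^{R-1} F_i^X$ and $E(Y) = R - \sum_{i=1}^{R-1} F_i^Y$, the ME model can hold only if $\Delta = 0$, in which case $F_i^X = F_i^Y$ for all $i$; namely, the MH model holds.
The proof is complete.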
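The mechanism behind Theorem 1 can be illustrated numerically: under a common shift of the conditional probabilities, the mean score moves monotonically with the shift parameter, so equal means are only possible when the shift is zero. The sketch uses the logit special case; the function names and values are hypothetical.

```python
import math


def mean_score(marginal):
    """Mean of the category scores 1, ..., R under a marginal distribution."""
    return sum((i + 1) * p for i, p in enumerate(marginal))


def mcor_marginal(marginal_y, delta):
    """Marginal of X when logit(omega_i^X) = logit(omega_i^Y) + delta."""
    marginal_x, rem_x, rem_y = [], 1.0, 1.0
    for p in marginal_y[:-1]:
        omega_y = p / rem_y
        logit_x = math.log(omega_y / (1.0 - omega_y)) + delta
        omega_x = 1.0 / (1.0 + math.exp(-logit_x))
        p_x = rem_x * omega_x
        marginal_x.append(p_x)
        rem_x -= p_x
        rem_y -= p
    marginal_x.append(rem_x)
    return marginal_x


marginal_y = [0.2, 0.3, 0.3, 0.2]   # hypothetical
means = {d: mean_score(mcor_marginal(marginal_y, d)) for d in (-0.5, 0.0, 0.5)}
print(mean_score(marginal_y), means)
```

A positive shift raises every conditional probability of stopping at category i, which lowers the mean score; a negative shift raises it.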
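We can also describe decompositions of the MH model for special cases of the GMC model. For example, Corollary 1, which is used in Section 6, states that the MH model holds if and only if both the MCOR and ME models hold.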

Extension into Multi-way Tables
We extend the models and decompositions in Sections 2 and 3 to multi-way contingency tables.
Note that extensions to multi-way tables must be considered from a practical as well as a theoretical viewpoint, because such tables are known to be sparse. Although applications are left for future research, we give the theoretical extensions here, in line with previous research.
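Consider an $R^T$ contingency table with ordinal variables $X_1, \ldots, X_T$. Let $\omega_i^{(k)} = \Pr(X_k = i \mid X_k \ge i)$ for $i = 1, \ldots, R - 1$; $k = 1, \ldots, T$. Then we propose a model defined by
$$h^{-1}\left(\omega_i^{(k)}\right) = h^{-1}\left(\omega_i^{(1)}\right) + \Delta_k \quad \text{for } i = 1, \ldots, R - 1; \; k = 2, \ldots, T, \tag{4.1}$$
where the parameters $\Delta_k$ are unspecified. A special case of this model obtained by setting $\Delta_2 = \cdots = \Delta_T = 0$ is the MH$_T$ model. We refer to model (4.1) as the GMC$_T$ model. In particular, when $h^{-1}(x) = \log(x / (1 - x))$, the GMC$_T$ model is expressed as
$$\log \frac{\omega_i^{(k)}}{1 - \omega_i^{(k)}} = \log \frac{\omega_i^{(1)}}{1 - \omega_i^{(1)}} + \Delta_k \tag{4.2}$$
for $i = 1, \ldots, R - 1$; $k = 2, \ldots, T$. We shall refer to model (4.2) as the MCOR$_T$ model. Note that
$$\frac{\omega_i^{(k)}}{1 - \omega_i^{(k)}} = \frac{\Pr(X_k = i)}{1 - F_i^{(k)}},$$
where $F_i^{(k)} = \Pr(X_k \le i)$. Using the complementary log-log transformation, the GMC$_T$ model is expressed as
$$\log\left(-\log\left(1 - \omega_i^{(k)}\right)\right) = \log\left(-\log\left(1 - \omega_i^{(1)}\right)\right) + \Delta_k \tag{4.3}$$
for $i = 1, \ldots, R - 1$; $k = 2, \ldots, T$. We shall refer to model (4.3) as the MCC$_T$ model. Using the probit transformation, the GMC$_T$ model is expressed as
$$\Phi^{-1}\left(\omega_i^{(k)}\right) = \Phi^{-1}\left(\omega_i^{(1)}\right) + \Delta_k \quad \text{for } i = 1, \ldots, R - 1; \; k = 2, \ldots, T, \tag{4.4}$$
where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution. We refer to model (4.4) as the MCP$_T$ model.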
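For a multi-way table, the quantities entering these models are computed from the one-dimensional marginal distributions. The following sketch builds a hypothetical 3 × 3 × 3 table whose second and third marginals are logit shifts of the first, then recovers the shifts from the joint probabilities. The joint table is formed under independence purely for convenience (the models restrict only the marginals); all names and values are assumptions for illustration.

```python
import math
from itertools import product


def logit(x):
    return math.log(x / (1.0 - x))


def conditional_probs(marginal):
    """omega_i = Pr(X = i | X >= i) for i = 1, ..., R-1."""
    out, remaining = [], 1.0
    for p in marginal[:-1]:
        out.append(p / remaining)
        remaining -= p
    return out


def mcor_marginal(base, delta):
    """Marginal whose conditional logits are those of `base` plus delta."""
    marginal, remaining = [], 1.0
    for w in conditional_probs(base):
        omega = 1.0 / (1.0 + math.exp(-(logit(w) + delta)))
        marginal.append(remaining * omega)
        remaining -= marginal[-1]
    marginal.append(remaining)
    return marginal


def marginal_of(joint, k, size):
    """Marginal distribution of the k-th variable (0-based) of a joint table."""
    out = [0.0] * size
    for cell, prob in joint.items():
        out[cell[k]] += prob
    return out


R, T = 3, 3
base = [0.3, 0.4, 0.3]      # hypothetical marginal of the first variable
deltas = [0.0, -0.1, 0.3]   # hypothetical shifts (first entry is the reference)
margins = [mcor_marginal(base, d) for d in deltas]
joint = {cell: math.prod(margins[k][cell[k]] for k in range(T))
         for cell in product(range(R), repeat=T)}

omega = [conditional_probs(marginal_of(joint, k, R)) for k in range(T)]
recovered = [[logit(omega[k][i]) - logit(omega[0][i]) for i in range(R - 1)]
             for k in range(1, T)]
print(recovered)
```

Each row of `recovered` is constant across categories and equal to the corresponding shift, which is the defining feature of the logit special case.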

Decompositions of the Marginal Homogeneity Model
Consider the ME$_T$ model defined by
$$E(X_1) = E(X_2) = \cdots = E(X_T).$$
Note that the MH$_T$ model implies the ME$_T$ model.
We obtain the following theorem.

Theorem 2. For the $R^T$ table, the MH$_T$ model holds if and only if both the GMC$_T$ and ME$_T$ models hold.
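The proof is omitted because it is obtained in a similar manner to the proof of Theorem 1. We also obtain corollaries for special cases of the GMC$_T$ model. For example, Corollary 4, which is used in Section 6, states that the MH$_T$ model holds if and only if both the MCOR$_T$ and ME$_T$ models hold.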

Goodness-of-fit Test
Let $n_{i_1 \ldots i_T}$ denote the observed frequency in the $(i_1, \ldots, i_T)$ cell of the $R^T$ table with $n = \sum \cdots \sum n_{i_1 \ldots i_T}$, and let $m_{i_1 \ldots i_T}$ denote the corresponding expected frequency. Assume that $\{n_{i_1 \ldots i_T}\}$ has a multinomial distribution. The maximum likelihood estimates (MLEs) of the expected frequencies under each model can be obtained using the Newton-Raphson method to solve the likelihood equations.
The likelihood ratio chi-squared statistic to test the goodness-of-fit of model M is given by
$$G^2(\mathrm{M}) = 2 \sum_{i_1=1}^{R} \cdots \sum_{i_T=1}^{R} n_{i_1 \ldots i_T} \log \frac{n_{i_1 \ldots i_T}}{\hat{m}_{i_1 \ldots i_T}},$$
where $\hat{m}_{i_1 \ldots i_T}$ is the MLE of the expected frequency $m_{i_1 \ldots i_T}$ under model M.

Examples
6.1. Example 1
We focus on a contingency table obtained by grouping a time scale, such as the sleep-onset time, into ordered categories. As an example, we used the research data of Marqueze et al. (2015a, b), which are available in the Dryad Digital Repository. We created a square contingency table by grouping the sleep-onset times on work days and on days off (Table 1). We took the paired sleep-onset times for work days and days off from the original data set (the variables "Bedtimew" and "Bedtimef") and pooled the two variables. Then we calculated the first and third quartiles of the pooled data and used these quartiles as the cut points to create the square contingency table. Namely, we classified the continuous bedtime into three levels: (1) below the first quartile, (2) the first quartile or more but less than the third quartile, and (3) the third quartile or more.
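The construction of the square table can be sketched as follows. The code pools the two paired variables, takes the first and third quartiles of the pooled values as cut points, and cross-classifies each pair. The bedtimes below are made-up numbers, not the Marqueze et al. data, and `statistics.quantiles` with its default method is only one of several quartile conventions; the convention used for Table 1 is not stated in the text.

```python
from statistics import quantiles


def level(value, q1, q3):
    """0: below Q1, 1: Q1 or more but less than Q3, 2: Q3 or more."""
    if value < q1:
        return 0
    if value < q3:
        return 1
    return 2


def square_table(work, free):
    """Cross-classify paired values using quartiles of the pooled data."""
    pooled = list(work) + list(free)
    q1, _, q3 = quantiles(pooled, n=4)
    table = [[0] * 3 for _ in range(3)]
    for w, f in zip(work, free):
        table[level(w, q1, q3)][level(f, q1, q3)] += 1
    return table


# Hypothetical bedtimes in hours after 18:00 (assumed for illustration only).
work_days = [4.0, 4.5, 5.0, 5.5, 6.0, 6.5]
days_off = [5.0, 5.5, 6.5, 7.0, 7.5, 8.0]
table = square_table(work_days, days_off)
for row in table:
    print(row)
```

Rows index the level on work days and columns the level on days off, so counts above the diagonal correspond to later bedtimes on days off.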
Table 1: Marqueze's data expressing the bedtime for work days and days off using three levels: (1) below the first quartile, (2) the first quartile or more but less than the third quartile, and (3) the third quartile or more (Marqueze et al. 2015a, b)
We shall analyze the data in Table 1 using Corollary 1. The MCOR model fits these data well since $G^2(\mathrm{MCOR}) = 0.73$ with 1 df. However, the MH and ME models do not fit these data well since $G^2(\mathrm{MH}) = 68.86$ with 2 df and $G^2(\mathrm{ME}) = 58.48$ with 1 df. We shall consider the hypothesis that the MH model holds under the assumption that the MCOR model holds; namely, the hypothesis that $\Delta = 0$. Since $G^2(\mathrm{MH} \mid \mathrm{MCOR}) = G^2(\mathrm{MH}) - G^2(\mathrm{MCOR}) = 68.13$ with 1 df, we reject this hypothesis at the 0.05 level. This shows $\Delta \neq 0$ in the MCOR model. Therefore, the MCOR model is preferable to the MH model for the data in Table 1. Under the MCOR model, the MLE of $\exp(\Delta)$ is $\exp(\hat{\Delta}) = 1.38$. Noting that $\omega_i^X / (1 - \omega_i^X) = p_{i\cdot} / (1 - F_i^X)$, we see under the MCOR model that (i) the odds that the sleep-onset time is (1) below the first quartile, instead of (2) or (3), i.e., the first quartile or more, are estimated to be $\exp(\hat{\Delta}) = 1.38$ times higher for work days than for days off, and (ii) the odds that it is (2) the first quartile or more but less than the third quartile, instead of (3) the third quartile or more, are estimated to be 1.38 times higher for work days than for days off.
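The conditional test can be reproduced from the reported statistics alone. For 1 df, the upper tail probability of the chi-squared distribution equals `erfc(sqrt(x / 2))`, which is available in the standard library. The sketch below only post-processes the values of G² quoted in the text; the helper name `chi2_sf_1df` is hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper tail probability of the chi-squared distribution with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))


g2_mh = 68.86     # G^2(MH), 2 df, as reported in the text
g2_mcor = 0.73    # G^2(MCOR), 1 df, as reported in the text

g2_conditional = g2_mh - g2_mcor          # G^2(MH | MCOR), 1 df
p_mcor = chi2_sf_1df(g2_mcor)             # goodness of fit of the MCOR model
p_conditional = chi2_sf_1df(g2_conditional)

print(round(g2_conditional, 2))
print(p_mcor > 0.05, p_conditional < 0.05)
```

The first p-value exceeds 0.05 and the second falls far below it, which matches the conclusions that the MCOR model fits and that the hypothesis of a zero shift is rejected.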
Section 7 discusses the interpretation of these results from the viewpoint of time scales.
6.2. Example 2
Consider the data in Table 2, which are obtained from the Meteorological Agency in Japan. These are obtained from the daily atmospheric temperatures at Hiroshima, Tokyo, and Sapporo, classified as (1) low, (2) normal, and (3) high. Variables $X_1$, $X_2$, and $X_3$ denote the temperatures at Hiroshima, Tokyo, and Sapporo, respectively. We shall analyze the data in Table 2 using Corollary 4. The MCOR$_T$ model fits these data well since $G^2(\mathrm{MCOR}_T) = 0.61$ with 2 df, whereas the MH$_T$ and ME$_T$ models do not fit these data well since $G^2(\mathrm{MH}_T) = 16.80$ with 4 df and $G^2(\mathrm{ME}_T) = 16.39$ with 2 df.
We shall consider the hypothesis that the MH$_T$ model holds under the assumption that the MCOR$_T$ model holds; namely, the hypothesis that $\Delta_2 = \Delta_3 = 0$. Since $G^2(\mathrm{MH}_T \mid \mathrm{MCOR}_T) = G^2(\mathrm{MH}_T) - G^2(\mathrm{MCOR}_T) = 16.19$ with 2 df, we reject this hypothesis at the 0.05 level. Therefore the MCOR$_T$ model is preferable to the MH$_T$ model for these data.
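For an even number of degrees of freedom, the upper tail probability of the chi-squared distribution has a closed form (a finite Poisson sum), so the tests in this example can be checked without external libraries. The code only uses the values of G² reported in the text; the helper name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper tail probability of the chi-squared distribution, even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    return math.exp(-half) * sum(half ** j / math.factorial(j)
                                 for j in range(df // 2))


g2_mh_t = 16.80     # G^2(MH_T), 4 df, as reported in the text
g2_mcor_t = 0.61    # G^2(MCOR_T), 2 df, as reported in the text

g2_conditional = g2_mh_t - g2_mcor_t            # 2 df
p_mh_t = chi2_sf_even_df(g2_mh_t, 4)
p_mcor_t = chi2_sf_even_df(g2_mcor_t, 2)
p_conditional = chi2_sf_even_df(g2_conditional, 2)

print(round(g2_conditional, 2))
print(p_mcor_t > 0.05, p_mh_t < 0.05, p_conditional < 0.05)
```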
We see from Corollary 4 that the poor fit of the MH$_T$ model is caused by the poor fit of the ME$_T$ model rather than the MCOR$_T$ model. That is, the mean temperatures at Hiroshima, Tokyo, and Sapporo differ. Under the MCOR$_T$ model, the MLEs of $\{\exp(\Delta_k)\}$ are $\exp(\hat{\Delta}_2) = 0.90$ and $\exp(\hat{\Delta}_3) = 1.33$. Noting that $\omega_i^{(k)} / (1 - \omega_i^{(k)})$ is the marginal continuation-ratio of $X_k$ for $k = 1, 2, 3$, we see under the MCOR$_T$ model that the odds that the temperature is (1) low instead of (2) normal or (3) high are estimated to be $\exp(\hat{\Delta}_2) = 0.90$ times higher in Tokyo than in Hiroshima, and the odds that it is (2) normal instead of (3) high are estimated to be 0.90 times higher in Tokyo than in Hiroshima. We also see that the odds that it is (1) low instead of (2) normal or (3) high are estimated to be $\exp(\hat{\Delta}_3) = 1.33$ times higher in Sapporo than in Hiroshima, and the odds that it is (2) normal instead of (3) high are estimated to be 1.33 times higher in Sapporo than in Hiroshima.

Comparison Between Models
For the data in Tables 1 and 2, the goodness-of-fit of the MCOR$_T$, MCC$_T$, and MCP$_T$ models differs markedly (see Table 3). The MCOR$_T$ and MCP$_T$ models fit the data in both Tables 1 and 2 very well. However, the MCC$_T$ model fits the data in Table 2 well but does not fit the data in Table 1 well. Considering these special cases of the GMC$_T$ model, the transformations of the conditional probabilities used in the MCOR$_T$ and MCP$_T$ models are symmetric. However, that of the MCC$_T$ model is asymmetric: the response curve corresponding to $\log(-\log(1 - x))$ approaches 0 fairly slowly but approaches 1 quite sharply.
Table 3: Likelihood ratio statistic $G^2$ for models applied to the data in Tables 1 and 2
The MCOR$_T$ and MCC$_T$ models may be useful because the parameter $\exp(\Delta_k)$ of the MCOR$_T$ model can be interpreted as the marginal continuation odds ratio and the parameter $\Delta_k$ of the MCC$_T$ model can be considered as a location shift between marginal distributions on a complementary log-log scale. When an analyst treats a contingency table obtained by grouping a time scale, as in survival studies, the proposed models and decompositions may be useful from the viewpoint of discrete-time hazards.