
Selection of statistics for a multinomial goodness-of-fit test and a test of independence for a multi-way contingency table when data are sparse

  • Original Paper
  • Published:
Japanese Journal of Statistics and Data Science

Abstract

For the goodness-of-fit test for a multinomial distribution and tests of several kinds of independence in a multi-way contingency table, we consider test statistics based on the \(\phi \)-divergence family. All members of the \(\phi \)-divergence family of statistics have the same chi-square limiting distribution under the null hypothesis. We consider a second-order correction term as an index for investigating how close the distributions of the statistics are to the chi-square limiting distribution. We derive properties of the second-order correction term for selecting a \(\phi \)-divergence statistic when an asymptotic test is used and the data are sparse. We propose a selection of statistics for the power divergence family of statistics and the family of Rukhin's statistics as special \(\phi \)-divergence statistics.
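For concreteness, the power divergence subfamily referred to above can be sketched as follows. This is a minimal illustration assuming the standard Cressie–Read parameterization of the family; the function name `power_divergence` is ours, not from the paper.

```python
import math

def power_divergence(obs, exp, lam):
    """Cressie-Read power-divergence statistic 2n*I^lam(obs : exp).

    lam = 1 recovers Pearson's X^2; the limit lam -> 0 gives the
    likelihood-ratio statistic G^2; lam = 2/3 is the Cressie-Read
    recommendation.
    """
    if abs(lam) < 1e-12:  # continuous limit at lam = 0
        return 2.0 * sum(o * math.log(o / e) for o, e in zip(obs, exp) if o > 0)
    return (2.0 / (lam * (lam + 1.0))) * sum(
        o * ((o / e) ** lam - 1.0) for o, e in zip(obs, exp))

# example: k = 3 cells, n = 100, uniform null hypothesis
obs = [30.0, 20.0, 50.0]
exp = [100.0 / 3.0] * 3
x2 = power_divergence(obs, exp, 1.0)  # Pearson's X^2 = 14.0 here
g2 = power_divergence(obs, exp, 0.0)  # likelihood-ratio G^2
```

All members of this one-parameter family share the same chi-square limiting distribution, which is what makes a finite-sample selection criterion (such as the second-order correction term studied in this paper) necessary.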


Data Availability

All data generated or analyzed during this study are included in this published article.

References

  • Agresti, A. (2002). Categorical data analysis (2nd ed.). Wiley.

  • Ali, S. M., & Silvey, S. D. (1966). A general class of coefficients of divergence of one distribution from another. Journal of the Royal Statistical Society. Series B, 28, 131–142.

  • Birch, M. W. (1964). A new proof of the Pearson–Fisher theorem. Annals of Mathematical Statistics, 35, 817–824.

  • Bishop, Y. M. M., Fienberg, S. E., & Holland, P. W. (1975). Discrete multivariate analysis: Theory and practice. MIT Press.

  • Cressie, N., & Read, T. R. C. (1984). Multinomial goodness-of-fit tests. Journal of the Royal Statistical Society. Series B, 46, 440–464.

  • Csiszár, I. (1967). Information-type measures of difference of probability distributions and indirect observations. Studia Scientiarum Mathematicarum Hungarica, 2, 299–318.

  • Dale, J. R. (1986). Asymptotic normality of goodness-of-fit statistics for sparse product multinomials. Journal of the Royal Statistical Society. Series B, 48, 48–59.

  • Holst, L. (1972). Asymptotic normality and efficiency for certain goodness-of-fit tests. Biometrika, 59, 137–145.

  • Koehler, K. J. (1986). Goodness-of-fit tests for log-linear models in sparse contingency tables. Journal of the American Statistical Association, 81, 483–493.

  • Koehler, K. J., & Larntz, K. (1980). An empirical investigation of goodness-of-fit statistics for sparse multinomials. Journal of the American Statistical Association, 75, 336–344.

  • Larntz, K. (1978). Small-sample comparisons of exact levels for chi-squared goodness-of-fit statistics. Journal of the American Statistical Association, 73, 253–263.

  • Lawal, H. B. (1984). Comparisons of the \(X^2\), \(Y^2\), Freeman–Tukey and Williams's improved \(G^2\) test statistics in small samples of one-way multinomials. Biometrika, 71, 415–418.

  • Morales, D., Pardo, L., & Vajda, I. (2003). Asymptotic laws for disparity statistics in product multinomial models. Journal of Multivariate Analysis, 85, 335–360.

  • Morris, C. (1975). Central limit theorems for multinomial sums. Annals of Statistics, 3, 165–188.

  • Pardo, L., Morales, D., Salicrú, M., & Menéndez, M. L. (1993). The \(\phi \)-divergence statistics in bivariate multinomial populations including stratification. Metrika, 40, 223–235.

  • Pardo, L., Pardo, M. C., & Zografos, K. (1999). Homogeneity for multinomial populations based on \(\phi \)-divergences. Journal of the Japan Statistical Society, 29, 213–228.

  • Pardo, L. (2006). Statistical inference based on divergence measures. Chapman and Hall/CRC.

  • Read, T. R. C., & Cressie, N. A. C. (1988). Goodness-of-fit statistics for discrete multivariate data. Springer.

  • Rukhin, A. L. (1994). Optimal estimator for the mixture parameter by the method of moments and information affinity. Transactions of the 12th Prague Conference on Information Theory (pp. 214–219).

  • Taneichi, N., Sekiya, Y., & Toyama, J. (2019). Transformed statistics for test of conditional independence in \(J \times K \times L\) contingency tables. Journal of Multivariate Analysis, 171, 193–208.

  • Taneichi, N., Sekiya, Y., & Toyama, J. (2021). Improvement of the test of independence among groups of factors in a multi-way contingency table. Japanese Journal of Statistics and Data Science, 4, 181–213.

  • Upton, G. J. G. (1982). A comparison of alternative tests for the \(2 \times 2\) comparative trial. Journal of the Royal Statistical Society. Series A, 145, 86–105.

  • Zografos, K., Ferentinos, K., & Papaioannou, T. (1990). Sampling properties and multinomial goodness-of-fit and divergence tests. Communications in Statistics—Theory and Methods, 19, 1785–1802.

  • Zografos, K. (1993). Asymptotic properties of \(\Phi \)-divergence statistic and its applications in contingency tables. International Journal of Mathematics and Statistical Science, 2, 5–12.


Acknowledgements

The authors are very grateful to reviewers for their valuable comments and suggestions.

Author information


Corresponding author

Correspondence to Nobuhiro Taneichi.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix: Proof of Theorems 1 and 2

If we assume that the statistic \(G_{\phi }\) has a continuous distribution, the characteristic function of \(G_{\phi }\) is evaluated as

$$\begin{aligned} \psi _{\phi }^G(t)=(1-2it)^{-\lambda /2}+\frac{1}{n} \sum _{j=0}^3(1-2it)^{-(\lambda +2j)/2}r_j^{\phi } + o(n^{-1}), \end{aligned}$$

where

$$\begin{aligned} r_0^{\phi } &= \frac{1}{12}\bigl [ -(S-1)\bigr ], \\ r_1^{\phi } &= \frac{1}{24}\bigl [ 3(3S-k^2-2k) + 6\phi '''(1)(S-k^2) \\&\quad + \{\phi '''(1)\}^2(5S-3k^2-6k+4) - 3\phi ^{(4)}(1)(S-2k+1)\bigr ], \\ r_2^{\phi } &= \frac{1}{24}\bigl [ -6(2S-k^2-2k+1) - 4\phi '''(1)(4S-3k^2-3k+2) \\&\quad - 2\{\phi '''(1)\}^2(5S-3k^2-6k+4) + 3\phi ^{(4)}(1)(S-2k+1)\bigr ], \\ r_3^{\phi } &= \frac{1}{24}\bigl [ \{1+\phi '''(1)\}^2(5S-3k^2-6k+4)\bigr ], \end{aligned}$$

and S is given by (7). Here, \(r_j^\phi \; (j=0, 1, 2, 3)\) satisfy

$$\begin{aligned} \sum _{j=0}^3r_j^\phi =0. \end{aligned}$$
(A.1)
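As a sanity check, identity (A.1) can be verified numerically from the expressions for \(r_j^{\phi }\) above: treating \(S\), \(k\), \(\phi '''(1)\), and \(\phi ^{(4)}(1)\) as free parameters, the four terms cancel identically. A minimal sketch (the variable names `a`, `b` for the two derivatives are ours):

```python
import random

def r_terms(S, k, a, b):
    # a = phi'''(1), b = phi^(4)(1); expressions transcribed from above
    r0 = -(S - 1.0) / 12.0
    r1 = (3*(3*S - k**2 - 2*k) + 6*a*(S - k**2)
          + a**2*(5*S - 3*k**2 - 6*k + 4) - 3*b*(S - 2*k + 1)) / 24.0
    r2 = (-6*(2*S - k**2 - 2*k + 1) - 4*a*(4*S - 3*k**2 - 3*k + 2)
          - 2*a**2*(5*S - 3*k**2 - 6*k + 4) + 3*b*(S - 2*k + 1)) / 24.0
    r3 = (1 + a)**2 * (5*S - 3*k**2 - 6*k + 4) / 24.0
    return r0, r1, r2, r3

# (A.1) holds identically, whatever the values of S, k, phi'''(1), phi^(4)(1):
rng = random.Random(0)
for _ in range(1000):
    vals = r_terms(rng.uniform(1, 100), rng.uniform(2, 30),
                   rng.uniform(-5, 5), rng.uniform(-5, 5))
    assert abs(sum(vals)) < 1e-8
```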

For an arbitrary natural number s, let \(\psi _{\phi }^{G(s)}(t)\) be the sth derivative of \(\psi _{\phi }^G(t)\). Then

$$\begin{aligned} \psi _{\phi }^{G(s)}(t) &= i^s\Bigl [\lambda (\lambda +2)\cdots \{\lambda +2(s-1)\}(1-2it)^{-(\lambda +2s)/2} \\&\quad +\frac{1}{n}\sum _{j=0}^3(\lambda +2j)(\lambda +2j+2)\cdots \{\lambda +2j+2(s-1)\} \\&\quad \times (1-2it)^{-(\lambda +2j+2s)/2}r_j^{\phi } + o(n^{-1})\Bigr ]. \end{aligned}$$

Therefore, the sth moment about the origin of the statistic \(G_{\phi }\) is evaluated as

$$\begin{aligned} E\{(G_{\phi })^s \vert H_0\} = \lambda (\lambda +2) \cdots \{\lambda +2(s-1)\} + \frac{1}{n}m^G_{\phi }(s) +o(n^{-1}) \; (s=1,2,\ldots ), \end{aligned}$$

where

$$\begin{aligned} m_{\phi }^G(s)= \sum _{j=0}^3(\lambda +2j)(\lambda +2j+2) \cdots \{\lambda +2j+2(s-1)\}r_j^{\phi } \; (s=1,2,\ldots ). \end{aligned}$$
(A.2)

If we put

$$\begin{aligned} m_{\phi }^G(s)=\sum _{\ell =0}^sb_{s,\ell }\lambda ^\ell \; (s=1,2,\ldots ), \end{aligned}$$
(A.3)

from (A.1) and (A.2), coefficients \(b_{s,s},\;b_{s,s-1}\), and \(b_{s,s-2}\) are calculated as follows:

$$\begin{aligned} b_{s,s}= & {} \sum _{j=0}^{3}r_j^{\phi }=0 \; (s=1,2,\ldots ), \end{aligned}$$
(A.4)
$$\begin{aligned} b_{s,s-1} &= 2s\sum _{j=1}^3 jr_j^\phi \\ &= \frac{s}{12}\left\{ 4\phi '''(1)(S-3k+2) + 3\phi ^{(4)}(1)(S-2k+1)\right\} \; (s=1,2,\ldots ), \end{aligned}$$
(A.5)

and

$$\begin{aligned} b_{s,s-2} &= \frac{s(s-1)}{12}\Bigl [6(S-k^2-2k+2) \\&\quad +4\phi '''(1)\left\{ (s+7)S-3k^2-(s+4)(3k-2)\right\} \\&\quad +3\phi ^{(4)}(1)(s+2)(S-2k+1) \\&\quad +2\{\phi '''(1)\}^2(5S-3k^2-6k+4)\Bigr ] \; (s=2,3,\ldots ). \end{aligned}$$
(A.6)

From the assumption that \(q_j = O(k^{-1}) \;(j=1, \ldots , k)\) and (7), we find that \(S=O(k^2)\). Therefore

$$\begin{aligned} r_j^\phi = O(k^2) \; (j=0,1,2,3) \end{aligned}$$
(A.7)

holds. Furthermore,

$$\begin{aligned} \lambda = k-1 = O(k). \end{aligned}$$
(A.8)

Then, from (A.7) and (A.8),

$$\begin{aligned} \sum _{\ell =0}^{s-3}b_{s,\ell }\lambda ^\ell = O(k^{s-1})\; (s=3,4,\ldots ). \end{aligned}$$
(A.9)

Therefore, from (A.3)–(A.9), we obtain the following expression:

$$\begin{aligned} m^G_\phi (s) &= \frac{s}{12}\left\{ 4\phi '''(1)(S-3k+2)+3\phi ^{(4)}(1)(S-2k+1)\right\} \lambda ^{s-1} \\&\quad +\frac{s(s-1)}{12}\Bigl [6(S-k^2-2k+2) \\&\quad +4\phi '''(1)\left\{ (s+7)S-3k^2-(s+4)(3k-2)\right\} \\&\quad +3\phi ^{(4)}(1)(s+2)(S-2k+1) \\&\quad +2\{\phi '''(1)\}^2(5S-3k^2-6k+4)\Bigr ]\lambda ^{s-2}+O(k^{s-1}) \\ &= \frac{s}{12}\Bigl \{4\phi '''(1)+3\phi ^{(4)}(1)\Bigr \}\left( \frac{S}{k^2}\right) k^{s+1} \\&\quad +\frac{s}{12}\Bigl [(s-1)\Bigl \{(s+1)\bigl (4\phi '''(1)+3\phi ^{(4)}(1)\bigr )+10\bigl (\phi '''(1)+1\bigr )^2-4\Bigr \}\left( \frac{S}{k^2}\right) \\&\quad -2\Bigl \{\bigl (4\phi '''(1)+3\phi ^{(4)}(1)\bigr )+3(s-1)\bigl (\phi '''(1)+1\bigr )^2+2\phi '''(1)\Bigr \}\Bigr ]k^s \\&\quad +O(k^{s-1})\; (s=1,2,\ldots ). \end{aligned}$$
(A.10)

The second-order correction term \(m^G_{\phi }(s)\; (s=1,2,\ldots )\) is evaluated as

$$\begin{aligned} m_{\phi }^G(s) = \{4\phi ^{'''}(1)+3\phi ^{(4)}(1)\}O(k^{s+1})+O(k^s) \; (s=1,2,\ldots ). \end{aligned}$$
(A.11)

From (A.11), setting \(m^G_{\phi }(s) = 0\) yields \( 4\phi '''(1)+3\phi ^{(4)}(1) = o(1), \) which implies \( 4\phi '''(1)+3\phi ^{(4)}(1) \rightarrow 0 \) as \(k \rightarrow \infty \). This completes the proof of Theorem 1. On the other hand, substituting (8) into (A.10) yields the result of Theorem 2.
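For illustration, the condition \(4\phi '''(1)+3\phi ^{(4)}(1)=0\) can be checked for the power divergence family. Assuming the standard Cressie–Read form \(\phi _\lambda (x)=(x^{\lambda +1}-x)/\{\lambda (\lambda +1)\}\) (up to affine terms, which do not affect derivatives of order two and higher at \(x=1\)), one has \(\phi '''(1)=\lambda -1\) and \(\phi ^{(4)}(1)=(\lambda -1)(\lambda -2)\), so the condition becomes \((\lambda -1)(3\lambda -2)=0\), i.e., \(\lambda =1\) (Pearson's \(X^2\)) or \(\lambda =2/3\) (the Cressie–Read statistic). This worked example is ours, not taken from the excerpt; a quick numeric check:

```python
def cr_phi(x, lam):
    # standard Cressie-Read generator (up to affine terms)
    return (x ** (lam + 1.0) - x) / (lam * (lam + 1.0))

def third_deriv_at_1(f, h=1e-2):
    # central finite difference for f'''(1)
    return (f(1 + 2*h) - 2*f(1 + h) + 2*f(1 - h) - f(1 - 2*h)) / (2 * h**3)

def condition(lam):
    # 4*phi'''(1) + 3*phi^(4)(1) with phi'''(1) = lam - 1,
    # phi^(4)(1) = (lam - 1)*(lam - 2)
    return 4.0 * (lam - 1.0) + 3.0 * (lam - 1.0) * (lam - 2.0)

# the closed-form third derivative matches a finite-difference check
assert abs(third_deriv_at_1(lambda x: cr_phi(x, 0.5)) - (0.5 - 1.0)) < 1e-3

# the leading correction vanishes at lam = 1 and lam = 2/3
assert condition(1.0) == 0.0
assert abs(condition(2.0 / 3.0)) < 1e-12
```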

Proof of Corollary 3

Since the first half of Corollary 3 follows immediately from Theorem 2, we prove the second half. The proof that \(\vert A_1 \vert < \vert B_1 \vert \) is straightforward. The relation \(\vert A_s \vert > \vert B_s \vert \) is equivalent to \((A_s - B_s)(A_s + B_s) > 0\). Since

$$\begin{aligned} A_s-B_s = \frac{25}{54}s(s-1)\left\{ \frac{S}{k^2}-1\right\} +\frac{1}{27}s(5s-8), \end{aligned}$$

and \(S/k^2 \ge 1\), we have \(A_s-B_s > 0\) for \(s \ge 2\). Therefore, for \(s \ge 2\), \(\vert A_s \vert > \vert B_s \vert \) is equivalent to \(A_s+B_s > 0\). Since

$$\begin{aligned} A_s+B_s = \frac{25}{54}s(s-1)\left\{ \frac{S}{k^2} - C(s) \right\} , \end{aligned}$$

\(S/k^2 > C(s)\) implies \(\vert A_s \vert > \vert B_s \vert \) for \(s \ge 2\). Similarly, \(1 \le S/k^2 < C(s)\) implies \(\vert A_s \vert < \vert B_s \vert \) for \(s \ge 2\). This establishes the second half.
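The equivalence used above is the elementary fact that \(\vert a \vert > \vert b \vert \Leftrightarrow a^2-b^2>0 \Leftrightarrow (a-b)(a+b)>0\) for real \(a, b\). A trivial numeric confirmation (ours, purely illustrative):

```python
import random

rng = random.Random(1)
for _ in range(1000):
    a, b = rng.uniform(-10, 10), rng.uniform(-10, 10)
    # |a| > |b|  <=>  a^2 - b^2 > 0  <=>  (a - b)(a + b) > 0
    assert (abs(a) > abs(b)) == ((a - b) * (a + b) > 0)
```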

Proof of Theorems 3 and 4

Let \(\psi _\phi ^M(t)\) be the characteristic function of \(M_\phi ^*\). Under the assumption that the distribution of \(M_\phi ^*\) is continuous, \(\psi _\phi ^M(t)\) is evaluated as follows:

$$\begin{aligned} \psi _\phi ^M(t) = (1-2it)^{-\mu /2} +\frac{1}{n}\sum _{j=0}^3(1-2it)^{-(\mu +2j)/2}d_j^\phi + o(n^{-1}), \end{aligned}$$
(A.12)

where

$$\begin{aligned} d_0^\phi &= \frac{1}{12}\left[ -\prod _{m=1}^MS_m+\sum _{m=1}^MS_m-(M-1)\right] ,\\ d_1^\phi &= \frac{1}{24}\biggl [ 3(M-1)K^2\{\phi '''(1)+1\}^2 \\&\quad +\{-3\phi ^{(4)}(1)+5(\phi '''(1))^2+6\phi '''(1)+9\}\prod _{m=1}^MS_m \\&\quad -3\{\phi '''(1)+1\}^2K^2\sum _{m=1}^M\frac{S_m}{J_m^2}+6\{\phi ^{(4)}(1)-(\phi '''(1))^2-1\}K\sum _{m=1}^M\frac{S_m}{J_m} \\&\quad +6(M-1)K\{-\phi ^{(4)}(1)+(\phi '''(1))^2+1\} \\&\quad +\{-3\phi ^{(4)}(1)+4(\phi '''(1))^2\}\sum _{m=1}^MS_m \\&\quad -6\{\phi ^{(4)}(1)-2(\phi '''(1))^2\}\sum _{m=1}^{M-1}\sum _{\ell =m+1}^MJ_mJ_\ell \\&\quad +6\{\phi ^{(4)}(1)-2(\phi '''(1))^2\}(M-1)\sum _{m=1}^MJ_m \\&\quad +\{-3(M-1)^2\phi ^{(4)}(1)+2(M-1)(3M-2)(\phi '''(1))^2\}\biggr ],\\ d_2^\phi &= \frac{1}{24}\biggl [ -6(M-1)K^2\{\phi '''(1)+1\}^2 \\&\quad +\{3\phi ^{(4)}(1)-10(\phi '''(1))^2-16\phi '''(1)-12\}\prod _{m=1}^MS_m \\&\quad +6\{\phi '''(1)+1\}^2K^2\sum _{m=1}^M\frac{S_m}{J_m^2} \\&\quad +6\{-\phi ^{(4)}(1)+2(\phi '''(1))^2+2\phi '''(1)+2\}K\sum _{m=1}^M\frac{S_m}{J_m} \\&\quad +6\{\phi ^{(4)}(1)-2(\phi '''(1))^2-2\phi '''(1)-2\}(M-1)K \\&\quad +\{3\phi ^{(4)}(1)-8(\phi '''(1))^2-8\phi '''(1)-6\}\sum _{m=1}^MS_m \\&\quad +6\{\phi ^{(4)}(1)-4(\phi '''(1))^2-4\phi '''(1)-2\}\sum _{m=1}^{M-1}\sum _{\ell =m+1}^MJ_mJ_\ell \\&\quad +6\{-\phi ^{(4)}(1)+4(\phi '''(1))^2+4\phi '''(1)+2\}(M-1)\sum _{m=1}^MJ_m \\&\quad +\{3(M-1)^2(\phi ^{(4)}(1)-2)-4(M-1)(3M-2)((\phi '''(1))^2+\phi '''(1))\}\biggr ],\\ d_3^\phi &= \frac{1}{24}\{\phi '''(1)+1\}^2\biggl [3(M-1)K^2+5\prod _{m=1}^MS_m-3K^2\sum _{m=1}^M\frac{S_m}{J_m^2}-6K\sum _{m=1}^M\frac{S_m}{J_m} \\&\quad +6(M-1)K+4\sum _{m=1}^MS_m+12\sum _{m=1}^{M-1}\sum _{\ell =m+1}^MJ_mJ_\ell -12(M-1)\sum _{m=1}^MJ_m \\&\quad +2(M-1)(3M-2)\biggr ], \end{aligned}$$

\(S_m \; (m=1, \ldots , M)\) is given by (18), and \(\mu \) is given by (15). Here, \(d_j^\phi \; (j=0, 1, 2, 3)\) satisfy

$$\begin{aligned} \sum _{j=0}^3d_j^\phi =0. \end{aligned}$$
(A.13)

By (A.12), the sth moment about the origin of \(M_\phi ^*\) under the null hypothesis \(H_0^M\) given by (12) is expressed as follows:

$$\begin{aligned} E\{(M_\phi ^*)^s \vert H_0^M\} = \mu (\mu +2) \cdots \{\mu +2(s-1)\} + \frac{1}{n}m_\phi ^M(s)+o(n^{-1}) \; (s=1, 2, \ldots ), \end{aligned}$$

where \(m_\phi ^M(s)\) is the second-order correction term of the sth moment about the origin of \(M_\phi ^*\) and is given by

$$\begin{aligned} m_\phi ^M(s)=\sum _{j=0}^3(\mu +2j)(\mu +2j+2)\cdots \{\mu +2j+2(s-1)\}d_j^\phi \; (s=1, 2, \ldots ).\nonumber \\ \end{aligned}$$
(A.14)

In (A.14), let \(c_{s,\ell }\) be the coefficient of \(\mu ^\ell \; (\ell =0,1,\ldots ,s)\); then \(m^M_\phi (s)\) can be written as

$$\begin{aligned} m^M_\phi (s)=\sum _{\ell =0}^{s-1}c_{s,\ell }\mu ^{\ell } \; (s=1,2,\ldots ), \end{aligned}$$
(A.15)

since \(c_{s,s}=0\), which follows from (A.13). Note that \(m_\phi ^M(s)\) is a polynomial in \(\mu \). From (A.14), the coefficient \(c_{s,s-1}\) of the highest-degree term in \(\mu \) is given as follows:

$$\begin{aligned} c_{s,s-1} = \displaystyle {2s\sum _{j=1}^3 jd_j^\phi }. \end{aligned}$$
(A.16)

Therefore, substituting the expressions for \(d_j^\phi \; (j=1,2,3)\) into (A.16), we obtain

$$\begin{aligned} c_{s,s-1} &= \frac{s}{12}\biggl [\{3\phi ^{(4)}(1)+4\phi '''(1)\}\prod _{m=1}^MS_m-6\{\phi ^{(4)}(1)+2\phi '''(1)\}K\sum _{m=1}^M\frac{S_m}{J_m} \\&\quad +6\{\phi ^{(4)}(1)+2\phi '''(1)\}(M-1)K+\{3\phi ^{(4)}(1)+8\phi '''(1)\}\sum _{m=1}^MS_m \\&\quad +6\{\phi ^{(4)}(1)+4\phi '''(1)+2\}\sum _{m=1}^{M-1}\sum _{\ell =m+1}^MJ_mJ_\ell \\&\quad -6\{\phi ^{(4)}(1)+4\phi '''(1)+2\}(M-1)\sum _{m=1}^MJ_m \\&\quad +\{3(M-1)^2\phi ^{(4)}(1)+4(M-1)(3M-2)\phi '''(1)+6M(M-1)\}\biggr ] \; (s=1,2,\ldots ). \end{aligned}$$
(A.17)

Under the assumption that \( p_{j_1 \ldots j_M} =O(K^{-1}) \; (j_m=1, \ldots , J_m; m=1, \ldots , M)\), we have \( p_{\cdot (m, j_m)}=O(J_m^{-1}) \; (j_m=1, \ldots , J_m; m=1, \ldots , M) \), and hence \( S_m=O(J_m^2) \; (m =1, \ldots , M) \). Therefore, we evaluate \(m_\phi ^M(s)\) given by (A.15) in terms of the orders of \(J_1, \ldots , J_M\). We obtain the following relations:

$$\begin{aligned} \left. \begin{array}{rcl} \displaystyle \prod _{m=1}^MS_m &=& O\left( \left( \displaystyle \prod _{m=1}^MJ_m\right) ^2\right) \\ K\displaystyle \sum _{m=1}^M\frac{S_m}{J_m} &=& O\left( \left( \displaystyle \prod _{m=1}^MJ_m\right) \left( \displaystyle \sum _{m=1}^MJ_m\right) \right) \\ K &=& O\left( \displaystyle \prod _{m=1}^MJ_m\right) \\ \displaystyle \sum _{m=1}^MS_m &=& O\left( \displaystyle \sum _{m=1}^MJ_m^2\right) \\ \displaystyle \sum _{m=1}^{M-1}\sum _{\ell =m+1}^MJ_mJ_\ell &=& O\left( \displaystyle \sum _{m=1}^{M-1}\sum _{\ell =m+1}^MJ_mJ_\ell \right) \\ \displaystyle \sum _{m=1}^MJ_m &=& O\left( \displaystyle \sum _{m=1}^MJ_m\right) \\ M &=& O(1). \end{array} \right\} \end{aligned}$$
(A.18)

From these relations, under the assumption that \(O(J_1) = \cdots = O(J_M)\), we find that \(\prod _{m=1}^M S_m\) is the term of highest order in (A.18). On the other hand, the following evaluations hold:

$$\begin{aligned} d_j^\phi =O\left( \left( \displaystyle {\prod _{m=1}^M J_m} \right) ^2 \right) \; (j=0, 1, 2, 3) \end{aligned}$$

and

$$\begin{aligned} \mu = O\left( \displaystyle {\prod _{m=1}^M J_m} \right) . \end{aligned}$$

Therefore, in the case of the statistic for testing complete independence in a multi-way contingency table,

$$\begin{aligned} m_\phi ^M(s) &= K^{s-1}c_{s,s-1}-(s-1)K^{s-2}\left( \sum _{m=1}^MJ_m\right) c_{s,s-1} \\&\quad +O(K^s)+O\left( K^{s-1}\left( \sum _{m=1}^M J_m \right) ^2\right) , \end{aligned}$$
(A.19)

since \( c_{s,\ell }=O(K^2), \)

$$\begin{aligned} \mu ^{s-1} &= \left( K-\sum _{m=1}^MJ_m+M-1\right) ^{s-1} \\ &= K^{s-1}-(s-1)K^{s-2}\left( \sum _{m=1}^M J_m\right) +O\left( K^{s-2}\right) +O\left( K^{s-3}\left( \sum _{m=1}^M J_m\right) ^2\right) \end{aligned}$$

and

$$\begin{aligned} \mu ^{s-2} = \left( K-\displaystyle {\sum _{m=1}^MJ_m+M-1}\right) ^{s-2} = O\left( K^{s-2}\right) . \end{aligned}$$

From (A.19), we note that, unlike the case of the multinomial goodness-of-fit statistic, it is not necessary to substitute the expression for the coefficient \(c_{s,s-2}\) into (A.19) in order to evaluate \(m_\phi ^M(s)\). Substituting (A.17) for \(c_{s, s-1}\) in (A.19), we obtain the following evaluation of \(m_\phi ^M(s)\):

$$\begin{aligned} m_\phi ^M(s) &= \frac{s}{12}\left\{ 4\phi '''(1)+3\phi ^{(4)}(1)\right\} K^{s-1}\prod _{m=1}^M S_m \\&\quad -\frac{s}{12}\biggl [(s-1)\left\{ 4\phi '''(1)+3\phi ^{(4)}(1)\right\} K^{-2}\left( \prod _{m=1}^M S_m\right) \left( \sum _{m=1}^M J_m\right) \\&\quad +2\left\{ \bigl (4\phi '''(1)+3\phi ^{(4)}(1)\bigr )+2\phi '''(1)\right\} \sum _{m=1}^M\frac{S_m}{J_m}\biggr ]K^s +O\left( K^s\right) \\&\quad +O\left( K^{s-1}\left( \sum _{m=1}^M J_m\right) ^2\right) \; (s=1,2,\ldots ) \end{aligned}$$
(A.20)
$$\begin{aligned} &= \left\{ 4\phi '''(1)+3\phi ^{(4)}(1)\right\} O\left( K^{s+1}\right) +O\left( K^s\left( \sum _{m=1}^M J_m\right) \right) \; (s=1,2,\ldots ). \end{aligned}$$
(A.21)

From (A.21), setting \(m_\phi ^M(s)=0\) yields \(4\phi '''(1)+3\phi ^{(4)}(1) = o(1)\), which implies \(4\phi '''(1)+3\phi ^{(4)}(1) \rightarrow 0\) as \( K \rightarrow \infty \). This completes the proof of Theorem 3. On the other hand, substituting (8) into (A.20) yields the results of Theorem 4.

Proof of Theorems 5 and 6

Let \(\psi _\phi ^C(t)\) be the characteristic function of \(C_\phi ^{*}\). Under the assumption that the distribution of \(C_\phi ^{*}\) is continuous, \(\psi _\phi ^{C}(t)\) is evaluated as follows:

$$\begin{aligned} \psi _\phi ^{C}(t) = (1-2it)^{-\nu /2} +\frac{1}{n}\sum _{j=0}^3(1-2it)^{-(\nu +2j)/2}v_j^\phi + o(n^{-1}), \end{aligned}$$

where

$$\begin{aligned} v_0^{\phi } &= \gamma _0, \\ v_1^{\phi } &= (\gamma _1+\gamma _2)+2\phi '''(1)\gamma _2+\phi ^{(4)}(1)\gamma _3+\{\phi '''(1)\}^2\gamma _4, \\ v_2^{\phi } &= 2(-\gamma _2+\gamma _3)-2\phi '''(1)(\gamma _2+\gamma _4)-\phi ^{(4)}(1)\gamma _3-2\{\phi '''(1)\}^2\gamma _4, \\ v_3^{\phi } &= \gamma _4\{1+\phi '''(1)\}^2, \\ \gamma _0 &= -\frac{1}{12}\Gamma _1-\frac{1}{12}\Gamma _2+\frac{1}{12}(\Gamma _3+\Gamma _4), \\ \gamma _1 &= \frac{1}{4}J_1J_2\Gamma _1+\frac{1}{4}\Gamma _2-\frac{1}{4}(J_2\Gamma _3+J_1\Gamma _4), \\ \gamma _2 &= \frac{1}{8}J_1^2J_2^2\Gamma _1+\frac{1}{8}\Gamma _2-\frac{1}{8}(J_2^2\Gamma _3+J_1^2\Gamma _4), \\ \gamma _3 &= \left( -\frac{1}{2}J_1J_2-\frac{1}{8}+\frac{1}{4}J_1+\frac{1}{4}J_2\right) \Gamma _1-\frac{1}{8}\Gamma _2+\frac{1}{4}(J_2\Gamma _3+J_1\Gamma _4)-\frac{1}{8}(\Gamma _3+\Gamma _4), \\ \gamma _4 &= \left( \frac{1}{8}J_1^2J_2^2+\frac{3}{4}J_1J_2+\frac{1}{3}-\frac{1}{2}J_1-\frac{1}{2}J_2\right) \Gamma _1+\frac{5}{24}\Gamma _2-\frac{1}{8}(J_2^2\Gamma _3+J_1^2\Gamma _4) \\&\quad -\frac{1}{4}(J_2\Gamma _3+J_1\Gamma _4)+\frac{1}{6}(\Gamma _3+\Gamma _4), \\ \Gamma _1 &= \sum _{\ell =1}^{J_3}\frac{1}{p_{\cdot \cdot \ell }}, \quad \Gamma _2 = \sum _{j=1}^{J_1}\sum _{k=1}^{J_2}\sum _{\ell =1}^{J_3}\frac{1}{q_{jk\ell }}, \quad \Gamma _3 = \sum _{j=1}^{J_1}\sum _{\ell =1}^{J_3}\frac{1}{p_{j\cdot \ell }}, \quad \Gamma _4 = \sum _{k=1}^{J_2}\sum _{\ell =1}^{J_3}\frac{1}{p_{\cdot k\ell }}, \end{aligned}$$

and \(\nu \) is defined in (21). Here, \(v_j^{\phi } \; (j=0, 1, 2, 3)\) satisfy

$$\begin{aligned} \sum _{j=0}^3v_j^{\phi }=0. \end{aligned}$$

As in the proof of Theorems 3 and 4, we calculate \(m_\phi ^C(s)\). Then, since \(\Gamma _1=O(J_3^2)\), \(\Gamma _2=O(K^2)\), \(\Gamma _3=O((J_1J_3)^2)\), and \(\Gamma _4=O((J_2J_3)^2)\) under the assumption that \(p_{jk\ell }=O(K^{-1})\), we can evaluate \(m^C_\phi (s)\) as follows:

$$\begin{aligned} m_\phi ^C(s) &= \frac{s}{12}\left\{ 4\phi '''(1)+3\phi ^{(4)}(1)\right\} K^{s-1}\Gamma _2 \\&\quad -\frac{s}{12}\biggl [(s-1)\left\{ 4\phi '''(1)+3\phi ^{(4)}(1)\right\} \left( J_2^{-1}\Gamma _2+J_1^{-1}\Gamma _2\right) \\&\quad +2\left\{ \bigl (4\phi '''(1)+3\phi ^{(4)}(1)\bigr )+2\phi '''(1)\right\} \left( J_2 \Gamma _3 + J_1 \Gamma _4\right) \biggr ]K^{s-1} \\&\quad +O\left( K^{s-1}J_1^2J_3^2\right) +O\left( K^{s-1}J_2^2J_3^2\right) +O\left( K^sJ_3\right) \; (s=1,2,\ldots ) \end{aligned}$$
(A.22)
$$\begin{aligned} &= \left\{ 4\phi '''(1)+3\phi ^{(4)}(1)\right\} O\left( K^{s+1}\right) +O\left( K^sJ_1J_3\right) +O\left( K^sJ_2J_3\right) \; (s=1,2,\ldots ). \end{aligned}$$
(A.23)

From (A.23), setting \(m_\phi ^C(s) = 0\) yields \(4\phi '''(1)+3\phi ^{(4)}(1)=o(1)\), which completes the proof of Theorem 5. On the other hand, substituting (8) into (A.22) yields the result of Theorem 6.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Taneichi, N., Sekiya, Y. Selection of statistics for a multinomial goodness-of-fit test and a test of independence for a multi-way contingency table when data are sparse. Jpn J Stat Data Sci (2024). https://doi.org/10.1007/s42081-023-00233-y

