Asymptotic Bayesian Generalization Error in Latent Dirichlet Allocation and Stochastic Matrix Factorization

  • Original Research
  • Published in SN Computer Science

Abstract

Latent Dirichlet allocation (LDA) is useful in document analysis, image processing, and many information systems; however, its generalization performance has remained unknown because it is a singular learning machine to which regular statistical theory cannot be applied. Stochastic matrix factorization (SMF) is a restricted matrix factorization in which the matrix factors are stochastic, that is, each column of a factor lies in a simplex. SMF is being applied to image recognition and text mining. SMF can be understood as a statistical model in which a stochastic matrix of given data is represented by a product of two stochastic matrices; its generalization performance has also remained unknown because of non-regularity. In this paper, using an algebraic and geometric method, we show the analytic equivalence of LDA and SMF: both have the same real log canonical threshold (RLCT), so they asymptotically have the same Bayesian generalization error and the same log marginal likelihood. Moreover, we derive an upper bound of the RLCT and prove that it is smaller than half the dimension of the parameter; hence their Bayesian generalization errors are smaller than those of regular statistical models.

References

  1. Adams CP. Finite mixture models with one exclusion restriction. Econ J. 2016a;19(2):150–65.

  2. Adams CP. Stochastic matrix factorization. SSRN Electron J. 2016b;1–24. Available at SSRN: https://ssrn.com/abstract=2840852. Accessed 28 Jan 2020.

  3. Akaike H. Likelihood and Bayes procedure. Bayesian Stat. 1980;31:143–66.

  4. Aoyagi M. Stochastic complexity and generalization error of a restricted Boltzmann machine in Bayesian estimation. J Mach Learn Res. 2010;11(Apr):1243–72.

  5. Aoyagi M, Watanabe S. Stochastic complexities of reduced rank regression in Bayesian estimation. Neural Netw. 2005;18(7):924–33.

  6. Atiyah MF. Resolution of singularities and division of distributions. Commun Pure Appl Math. 1970;23(2):145–50.

  7. Bernstein J. The analytic continuation of generalized functions with respect to a parameter. Funktsional’nyi Analiz i ego Prilozheniya. 1972;6(4):26–40.

  8. Blei DM, Ng AY, Jordan MI. Latent Dirichlet allocation. J Mach Learn Res. 2003;3(Jan):993–1022.

  9. Bobadilla J, Bojorque R, Esteban AH, Hurtado R. Recommender systems clustering using Bayesian non negative matrix factorization. IEEE Access. 2018;6:3549–64.

  10. Caskey JE Jr. A Markov chain model for the probability of precipitation occurrence in intervals of various length. Mon Weather Rev. 1963;91(6):298–301.

  11. Cemgil AT. Bayesian inference in non-negative matrix factorisation models. Comput Intell Neurosci. 2009;2009(4):17 article ID 785152.

  12. Drton M, Plummer M. A Bayesian information criterion for singular models. J R Stat Soc B. 2017;79:323–80 with discussion.

  13. Gildea D, Hofmann T. Topic-based language models using EM. In: Sixth European Conference on Speech Communication and Technology. 1999.

  14. Griffiths TL, Steyvers M. Finding scientific topics. Proc Natl Acad Sci. 2004;101(suppl 1):5228–35.

  15. Hayashi N, Watanabe S. Tighter upper bound of real log canonical threshold of non-negative matrix factorization and its application to Bayesian inference. In: IEEE Symposium series on computational intelligence (IEEE SSCI), 2017a; pp. 718–725.

  16. Hayashi N, Watanabe S. Upper bound of Bayesian generalization error in non-negative matrix factorization. Neurocomputing. 2017;266C(29 November):21–8. https://doi.org/10.1016/j.neucom.2017.04.068.

  17. Hironaka H. Resolution of singularities of an algebraic variety over a field of characteristic zero. Ann Math. 1964;79:109–326.

  18. Kohjima M, Matsubayashi T, Sawada H. Probabilistic non-negative inconsistent-resolution matrices factorization. In: Proceeding of CIKM ’15 Proceedings of the 24th ACM International on Conference on Information and Knowledge Management. 2015;1:1855–8.

  19. Larsen JS, Clemmensen LKH. Non-negative matrix factorization for binary data. In: 2015 7th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K). 2015;01:555–63.

  20. Lee DD, Seung HS. Learning the parts of objects with nonnegative matrix factorization. Nature. 1999;401:788–91.

  21. Li FF, Perona P. A Bayesian hierarchical model for learning natural scene categories. In: Proceedings of the 2005 IEEE computer society conference on computer vision and pattern recognition (CVPR’05), volume 02, IEEE Computer Society, Washington, DC, CVPR ’05, 2005; pp. 524–531. https://doi.org/10.1109/CVPR.2005.16.

  22. Matsuda K, Watanabe S. Weighted blowup and its application to a mixture of multinomial distributions. IEICE Trans. 2003;J86–A(3):278–87 (in Japanese).

  23. Nagata K, Watanabe S. Asymptotic behavior of exchange ratio in exchange Monte Carlo method. Neural Netw. 2008;21(7):980–8.

  24. Paatero P, Tapper U. Positive matrix factorization: a non-negative factor model with optimal utilization of error estimates of data values. Environmetrics. 1994;5(2):111–26. https://doi.org/10.1002/env.3170050203.

  25. Roje T, Marín LG, Sáez D, Orchard M, Jiménez-Estévez G. Consumption modeling based on Markov chains and Bayesian networks for a demand side management design of isolated microgrids. Int J Energy Res. 2017;41(3):365–76.

  26. Rusakov D, Geiger D. Asymptotic model selection for naive Bayesian networks. J Mach Learn Res. 2005;6(Jan):1–35.

  27. Sato M, Shintani T. On zeta functions associated with prehomogeneous vector spaces. Ann Math. 1974;100(1):131–70.

  28. Schmidt MN, Olsson RK. Single-channel speech separation using sparse non-negative matrix factorization. In: 9th International Conference on Spoken Language Processing. 2006.

  29. Schwarz G. Estimating the dimension of a model. Ann Stat. 1978;6(2):461–4.

  30. Sonnadara D, Jayewardene D. A Markov chain probability model to describe wet and dry patterns of weather at Colombo. Theor Appl Climatol. 2015;119(1–2):333–40.

  31. Styan GP, Smith H Jr. Markov chains applied to marketing. J Mark Res. 1964;1(1):50–5.

  32. Tirunillai S, Tellis GJ. Mining marketing meaning from online chatter: strategic brand analysis of big data using latent Dirichlet allocation. J Mark Res. 2014;51(4):463–79.

  33. Virtanen T, Cemgil AT, Godsill S. Bayesian extensions to non-negative matrix factorisation for audio signal modelling. In: Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, IEEE, 2008; pp. 1825–1828.

  34. Watanabe S. Algebraic analysis for non-regular learning machines. Adv Neural Inf Process Syst. 2000;12:356–62 Denver, USA.

  35. Watanabe S. Algebraic geometrical methods for hierarchical learning machines. Neural Netw. 2001;13(4):1049–60.

  36. Watanabe S. Algebraic geometry and statistical learning theory. Cambridge: Cambridge University Press; 2009.

  37. Watanabe S. Asymptotic equivalence of Bayes cross validation and widely applicable information criterion in singular learning theory. J Mach Learn Res. 2010;11(Dec):3571–94.

  38. Watanabe S. Mathematical theory of Bayesian statistics. Boca Raton: CRC Press; 2018.

  39. Yamazaki K, Watanabe S. Singularities in mixture models and upper bounds of stochastic complexity. Neural Netw. 2003;16(7):1029–38.

  40. Yoshida K, Kuwatani T, Hirajima T, Iwamori H, Akaho S. Progressive evolution of whole–rock composition during metamorphism revealed by multivariate statistical analyses. J Metamorph Geol. 2018;36(1):41–54. https://doi.org/10.1111/jmg.12282.

  41. Zwiernik P. An asymptotic behaviour of the marginal likelihood for general Markov models. J Mach Learn Res. 2011;12(Nov):3283–310.

Acknowledgements

This research was partially supported by NTT DATA Mathematical Systems Inc. The authors would like to thank the editor and the reviewers for their comments, which helped to improve this paper.

Author information

Corresponding author

Correspondence to Naoki Hayashi.

Ethics declarations

Conflict of Interest

The conflicts of interest in our research are as follows: the first author is a member of NTT DATA Mathematical Systems Inc., Japan, and the second author is a member of Tokyo Institute of Technology, Japan. The authors declare that they do not have any other conflicts of interest.

Appendices

A. Proof of Main Theorem

In this section, we prove the Main Theorem using the lemmas above.

Proof

(Main Theorem) Summarizing the terms, we have

$$\begin{aligned} \varPhi (A,B)&= \Vert AB-A_0B_0 \Vert ^2 \nonumber \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \!\sum _{k=1}^{H_0-1} (\!a_{ik}b_{kj}{-}a^0_{ik}b^0_{kj})+a_{iH_0}b_{H_0j}\right. \nonumber \\&\quad \left. -a^0_{iH_0}b^0_{H_0j} +\sum _{k=H_0+1}^{H-1}a_{ik}b_{kj} + a_{iH} b_{Hj} \!\right\} ^2 \nonumber \\&\quad + \sum _{j=1}^N \left\{ \!\sum _{k=1}^{H_0-1} (\!a_{Mk}b_{kj} {-}a^0_{Mk}b^0_{kj}){+}a_{MH_0}b_{H_0j}\right. \nonumber \\&\quad \left. {-}a^0_{MH_0}b^0_{H_0j} {+}\sum _{k=H_0+1}^{H-1}a_{Mk}b_{kj} {+} a_{MH} b_{Hj}\! \right\} ^2\!\!. \end{aligned}$$
(15)

Put

$$\begin{aligned}&K_{ij}:=\sum _{k=1}^{H_0-1} (a_{ik}b_{kj} -a^0_{ik}b^0_{kj})+a_{iH_0}b_{H_0j} \\&\qquad -a^0_{iH_0}b^0_{H_0j} +\sum _{k=H_0+1}^{H-1}a_{ik}b_{kj} + a_{iH} b_{Hj} , \\&L_j:= \sum _{k=1}^{H_0-1} (a_{Mk}b_{kj} -a^0_{Mk}b^0_{kj})+a_{MH_0}b_{H_0j}\\&\qquad -a^0_{MH_0}b^0_{H_0j} +\sum _{k=H_0+1}^{H-1}a_{Mk}b_{kj} + a_{MH} b_{Hj}, \end{aligned}$$

then we get

$$\begin{aligned} \Vert AB-A_0 B_0\Vert ^2 = \sum _{j=1}^N \sum _{i=1}^{M-1}K_{ij}^2 + \sum _{j=1}^N L_j^2. \end{aligned}$$

Using \(a_{Mk}=1-\sum _{i=1}^{M-1}a_{ik}\), \(b_{Hj}=1-\sum _{k=1}^{H-1} b_{kj}\), \(a^0_{Mk}=1-\sum _{i=1}^{M-1}a^0_{ik}\), and \(b^0_{H_0j}=1-\sum _{k=1}^{H_0-1} b^0_{kj}\), we have

$$\begin{aligned} \sum _{i=1}^{M-1}K_{ij}=\, & \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj} \\&-\sum _{i=1}^{M-1} \sum _{k=1}^{H_0-1} \left( a^0_{ik}-a^0_{iH_0}\right) b^0_{kj}+ \sum _{i=1}^{M-1} \left( a_{iH}-a^0_{iH_0}\right) ,\\ L_j=\, & -\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj}\\&+\sum _{i=1}^{M-1} \sum _{k=1}^{H_0-1}\left( a^0_{ik}-a^0_{iH_0}\right) b^0_{kj} - \sum _{i=1}^{M-1} \left( a_{iH}-a^0_{iH_0}\right) , \end{aligned}$$

thus

$$\begin{aligned} L_j^2=\left( \sum _{i=1}^{M-1} K_{ij} \right) ^2. \end{aligned}$$

Therefore

$$\begin{aligned} \Vert AB-A_0B_0\Vert ^2&= \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2 +\sum _{j=1}^N L_j^2 \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2 +\sum _{j=1}^N \left( \sum _{i=1}^{M-1} K_{ij} \right) ^2. \end{aligned}$$
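The decomposition above can be checked numerically. The following is a minimal sketch, assuming column-stochastic factors as in the constraints \(a_{Mk}=1-\sum_i a_{ik}\) and \(b_{Hj}=1-\sum_k b_{kj}\); the helper names (`random_stochastic`, `matmul`, `check_identity`) are hypothetical and not taken from the paper. It draws random stochastic matrices and compares the full squared norm with the expression in \(K_{ij}\) alone.

```python
import random

def random_stochastic(rows, cols, rng):
    """Return a rows x cols matrix whose columns are positive and sum to 1."""
    columns = []
    for _ in range(cols):
        w = [rng.random() + 1e-3 for _ in range(rows)]
        s = sum(w)
        columns.append([x / s for x in w])
    # Convert the list of columns into row-major nested lists.
    return [[columns[c][r] for c in range(cols)] for r in range(rows)]

def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def check_identity(M=4, N=5, H=3, H0=2, seed=0):
    """Return (||AB - A0 B0||^2, sum K_ij^2 + sum_j (sum_i K_ij)^2)."""
    rng = random.Random(seed)
    A, B = random_stochastic(M, H, rng), random_stochastic(H, N, rng)
    A0, B0 = random_stochastic(M, H0, rng), random_stochastic(H0, N, rng)
    AB, A0B0 = matmul(A, B), matmul(A0, B0)
    diff = [[AB[i][j] - A0B0[i][j] for j in range(N)] for i in range(M)]
    lhs = sum(diff[i][j] ** 2 for i in range(M) for j in range(N))
    # K_ij is the (i, j) entry of the difference for i = 1, ..., M-1.
    rhs = sum(diff[i][j] ** 2 for i in range(M - 1) for j in range(N))
    rhs += sum(sum(diff[i][j] for i in range(M - 1)) ** 2 for j in range(N))
    return lhs, rhs

lhs, rhs = check_identity()
print(abs(lhs - rhs) < 1e-12)
```

The two values agree up to rounding because every column of \(AB\) and of \(A_0B_0\) sums to one, so the last row of the difference is minus the sum of the other rows.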

On account of Corollary 12, we have

$$\begin{aligned} \Vert AB-A_0B_0\Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2, \end{aligned}$$

i.e.,

$$\begin{aligned}&\quad \Vert AB-A_0B_0\Vert ^2 \\&\sim \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj} \right. \nonumber \\&\qquad \left. - \sum _{k=1}^{H_0-1} (a^0_{ik}-a^0_{iH_0})b^0_{kj}+ (a_{iH}-a^0_{iH_0}) \right\} ^2 \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left[ \sum _{k=1}^{H_0-1} \{(a_{ik}-a_{iH})b_{kj}- (a^0_{ik}-a^0_{iH_0})b^0_{kj}\} \right. \nonumber \\&\qquad \left. + \sum _{k=H_0}^{H-1} (a_{ik}-a_{iH})b_{kj}+ (a_{iH}-a^0_{iH_0}) \right] ^2. \end{aligned}$$
$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik}-a_{iH}, &{} k<H \\ c_i = a_{iH}-a^0_{iH_0}, \\ b_{kj}=b_{kj} \end{array}\right. } \end{aligned}$$

and put \(a^0_{ik}=a^0_{ik}-a^0_{iH_0}\). Then we have

$$\begin{aligned}&\quad \Vert AB-A_0B_0\Vert ^2 \\&\sim \sum _{j=1}^N \sum _{i=1}^{M-1} \left[ \sum _{k=1}^{H_0-1} \{(a_{ik}-a_{iH})b_{kj}- (a^0_{ik}-a^0_{iH_0})b^0_{kj}\} \right. \nonumber \\&\quad \left. + \sum _{k=H_0}^{H-1} (a_{ik}-a_{iH})b_{kj}+ (a_{iH}-a^0_{iH_0}) \right] ^2 \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H_0-1} (a_{ik}b_{kj}- a^0_{ik}b^0_{kj}) \right. \nonumber \\&\quad \left. + \sum _{k=H_0}^{H-1} a_{ik}b_{kj}+ c_i \right\} ^2. \end{aligned}$$

There exists a positive constant \(C>0\) such that

$$\begin{aligned}&\quad C \Vert AB-A_0 B_0 \Vert ^2 \\&\leqq \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H_0-1} (a_{ik}b_{kj}-a^0_{ik}b^0_{kj}) + c_i \right\} ^2 \\&\quad + \sum _{j=1}^N \sum _{i=1}^{M-1} \left( \sum _{k=H_0}^{H-1} a_{ik}b_{kj} \right) ^2. \end{aligned}$$

Put

$$\begin{aligned} K_1=\, & \sum _{j=1}^N \sum _{i=1}^{M-1} \left[ \sum _{k=1}^{H_0-1} (a_{ik}b_{kj}- a^0_{ik}b^0_{kj}) + c_i\right] ^2, \\ K_2=\, & \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=H_0}^{H-1} a_{ik}b_{kj} \right\} ^2. \end{aligned}$$

Let \({\bar{\lambda }}_1\) be the RLCT of \(K_1\), \({\bar{\lambda }}_2\) be the RLCT of \(K_2\), and \(\lambda\) be the RLCT of \(\Vert AB-A_0B_0\Vert ^2\). The following inequality holds since an RLCT is order isomorphic (order-preserving) and \(K_1\) and \(K_2\) are independent, that is, they share no variables:

$$\begin{aligned} \lambda \leqq {\bar{\lambda }}_1 + {\bar{\lambda }}_2. \end{aligned}$$

According to Lemma 17 in the case of \(H \leftarrow H_0\),

$$\begin{aligned} {\bar{\lambda }}_1 \leqq \frac{M-1}{2} + (H_0-1) \frac{M+N-3}{2}. \end{aligned}$$

In contrast, there exists a positive constant \(D>0\) such that

$$\begin{aligned} K_2&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left( \sum _{k=H_0}^{H-1} a_{ik}b_{kj} \right) ^2 \\&\leqq D\sum _{j=1}^N \sum _{i=1}^{M-1} \sum _{k=H_0}^{H-1} a_{ik}^2 b_{kj}^2 \\&\sim \sum _{j=1}^N \sum _{i=1}^{M-1} \sum _{k=H_0}^{H-1} a_{ik}^2 b_{kj}^2 \\&= \sum _{k=H_0}^{H-1} \sum _{j=1}^N \sum _{i=1}^{M-1} a_{ik}^2 b_{kj}^2 \\&= \sum _{k=H_0}^{H-1} \left( \sum _{j=1}^N b_{kj}^2 \right) \left( \sum _{i=1}^{M-1} a_{ik}^2 \right) . \end{aligned}$$

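The RLCT of the last expression is the sum over \(k\) of the RLCTs of its summands, because the summands for different \(k\) involve disjoint sets of variables. Considering blow-ups of the variables \(\{ a_{ik} \}\) and \(\{ b_{kj} \}\) for each \(k\), we obtain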

$$\begin{aligned} {\bar{\lambda }}_2 \leqq \frac{(H-H_0)\min \{ M-1,N \}}{2}. \end{aligned}$$
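The text does not specify the constant \(D\) used in the bound on \(K_2\). One admissible choice, by the Cauchy-Schwarz inequality, is the number of summands \(D=H-H_0\); this is an assumption made for illustration, not a value stated by the authors. The sketch below (the function name `k2_versus_termwise` is hypothetical) compares both sides for random inputs.

```python
import random

def k2_versus_termwise(a, b):
    """a: rows indexed by i, each of length K; b: K rows, each of length N.

    Returns (K2, sum of a_ik^2 b_kj^2, K), where K plays the role of H - H0.
    """
    K, N = len(b), len(b[0])
    k2 = sum(sum(row[k] * b[k][j] for k in range(K)) ** 2
             for row in a for j in range(N))
    termwise = sum((row[k] * b[k][j]) ** 2
                   for row in a for j in range(N) for k in range(K))
    return k2, termwise, K

rng = random.Random(0)
# After the change of variables, the a's may take either sign; the b's stay in [0, 1].
a = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(4)]
b = [[rng.uniform(0, 1) for _ in range(5)] for _ in range(3)]
k2, termwise, D = k2_versus_termwise(a, b)
print(k2 <= D * termwise)
```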

Using the above inequalities about the RLCTs, we have

$$\begin{aligned} \lambda&\leqq {\bar{\lambda }}_1 + {\bar{\lambda }}_2 \leqq \frac{M-1}{2} \\&\quad + (H_0-1) \frac{M+N-3}{2} + \frac{(H-H_0)\min \{ M-1,N \}}{2}.\\&\quad \therefore \quad \lambda \leqq \frac{1}{2}\left[ M-1\right. \\&\quad \left. +(H_0-1)(M+N-3)+(H-H_0)\min \{M-1,N\} \right] . \end{aligned}$$

\(\square\)
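As an illustration of the bound just proved, the sketch below evaluates it with exact rational arithmetic and compares it with half the parameter dimension. The dimension \(d=(M-1)H+(H-1)N\) is an assumption obtained by counting the free entries of a column-stochastic \(M\times H\) matrix and a column-stochastic \(H\times N\) matrix; the function names are hypothetical.

```python
from fractions import Fraction

def rlct_upper_bound(M, N, H, H0):
    """Upper bound on the RLCT from the last display of the proof."""
    numerator = (M - 1) + (H0 - 1) * (M + N - 3) + (H - H0) * min(M - 1, N)
    return Fraction(numerator, 2)

def half_dimension(M, N, H):
    """d/2, assuming d = (M-1)H + (H-1)N free parameters (see lead-in)."""
    return Fraction((M - 1) * H + (H - 1) * N, 2)

bound = rlct_upper_bound(M=4, N=5, H=3, H0=2)
print(bound, half_dimension(M=4, N=5, H=3))
```

For \(H=H_0=1\) the bound reduces to \((M-1)/2\), and for \(H=H_0=2\) it reduces to \((2M+N-4)/2\), which are the values appearing in the proofs of Lemmas 14 and 16.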

B. Proof of Lemmas

In this section, we prove the four lemmas introduced in Section 3: Lemmas 14, 15, 16, and 17.

First, Lemma 14 is proved.

Proof

(Lemma 14) We set \(A=(a_{i})_{i=1}^{M}\), \(B^T=(1)_{j=1}^{N}\), \(A_0=(a^0_{i})_{i=1}^M\), and \(B_0^T=(1)_{j=1}^N\); then

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&=\sum _{i=1}^M N(a_i-a^0_i)^2 \\&= \sum _{i=1}^{M-1} N(a_i-a^0_i)^2 \\ &\quad + N \left( 1-\sum _{i=1}^{M-1} a_i -1 + \sum _{i=1}^{M-1} a^0_i \right) ^2 \\&= \sum _{i=1}^{M-1} N(a_i-a^0_i)^2 \\&\quad + N \left\{ \sum _{i=1}^{M-1} (a_i - a^0_i) \right\} ^2. \end{aligned}$$

Since \(\sum _{i=1}^{M-1} (a_i - a^0_i) \in \langle a_1-a^0_1, \ldots , a_{M-1}-a^0_{M-1} \rangle\), Corollary 12 yields

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2 \sim \sum _{i=1}^{M-1} N(a_i-a^0_i)^2. \end{aligned}$$

As an RLCT is not changed by any positive constant factor, it suffices to calculate the RLCT of

$$\begin{aligned} \sum _{i=1}^{M-1} (a_i-a^0_i)^2 \end{aligned}$$

and this function has no singularity. Thus, the RLCT is equal to half of the parameter dimension:

$$\begin{aligned} \lambda = \frac{M-1}{2}. \end{aligned}$$

\(\square\)
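One elementary way to see why dropping the squared-sum term in the proof of Lemma 14 does not change the RLCT is a two-sided bound: writing \(x_i=a_i-a^0_i\), the Cauchy-Schwarz inequality gives \(\sum x_i^2 \leqq \sum x_i^2 + (\sum x_i)^2 \leqq M\sum x_i^2\) for sums over \(i=1,\ldots ,M-1\). This is offered as a sanity check independent of how Corollary 12 is stated; the helper names are hypothetical.

```python
import random

def random_simplex(M, rng):
    """Return a probability vector of length M."""
    w = [rng.random() + 1e-3 for _ in range(M)]
    s = sum(w)
    return [x / s for x in w]

def lemma14_forms(a, a0, N):
    """Return (N * sum over all M entries, N * sum over the first M-1 entries)."""
    M = len(a)
    full = N * sum((a[i] - a0[i]) ** 2 for i in range(M))
    reduced = N * sum((a[i] - a0[i]) ** 2 for i in range(M - 1))
    return full, reduced

rng = random.Random(1)
a, a0 = random_simplex(5, rng), random_simplex(5, rng)
full, reduced = lemma14_forms(a, a0, N=3)
# The full norm is squeezed between the reduced form and M times the reduced form.
print(reduced <= full <= 5 * reduced)
```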

Second, Lemma 15 is derived.

Proof

(Lemma 15) We set \(A_0=(a^0_{i})_{i=1}^M\) and \(B_0^T=(1)_{j=1}^N\). Then

$$\begin{aligned}&AB-A_0B_0 \\&\quad =\left( \begin{matrix} a_{11} &{} a_{12} \\ \vdots &{} \vdots \\ a_{(M-1)1} &{} a_{(M-1)2} \\ a_{M1} &{} a_{M2} \end{matrix} \right) \left( \begin{matrix} b_1 &{} \ldots &{} b_N \\ 1-b_1 &{} \ldots &{} 1-b_N \end{matrix} \right) \\&\quad -\left( \begin{matrix} a^0_1 \\ \vdots \\ a^0_{M-1} \\ a^0_{M} \end{matrix} \right) \left( \begin{matrix} 1&\ldots&1 \end{matrix} \right) \\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j +a_{12}-a^0_1 \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j+a_{(M-1)2}-a^0_{M-1} \\ (a_{M1}-a_{M2})b_j+a_{M2}-a^0_{M} \\ \end{matrix} \right) _{j=1}^N \\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j+a_{12}-a^0_1 \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j+a_{(M-1)2}-a^0_{M-1} \\ -\sum _{i=1}^{M-1}(a_{i1}-a_{i2})b_j-\sum _{i=1}^{M-1}(a_{i2}-a^0_{i}) \\ \end{matrix} \right) _{j=1}^N . \end{aligned}$$

Thus,

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&= \sum _{j=1}^N \,\,\left(\vphantom{ \left[ \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\} \right] ^2} \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}^2 \right. \\&\quad \left. + \left[ \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\} \right] ^2 \right) . \end{aligned}$$

Put \(I=\langle \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}_{i=1}^{M-1} \rangle\). Because of Corollary 11 and

$$\begin{aligned} \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\} \in I, \end{aligned}$$

we get

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}^2.\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i1}-a_{i2}, \\ a_{i2} = a_{i2}, \\ b_j=b_j \end{array}\right. }. \end{aligned}$$

Then we get

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}^2\\&\quad =\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +a_{i2}-a^0_i\}^2. \end{aligned}$$

Moreover,

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ c_{i} = a_{i2}-a^0_i \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +a_{i2}-a^0_i\}^2=\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +c_{i}\}^2 \end{aligned}$$

holds.

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ x_{i} = a_{i}b_1+c_i, \\ \end{array}\right. }. \end{aligned}$$

If \(j>1\), then we have \(a_i b_j +c_i = x_i -a_i b_1 +a_i b_j\) and obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +c_{i}\}^2\\&\quad =\sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{x_i-(a_i b_1-a_i b_j)\}^2 \right] . \end{aligned}$$
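The rewriting in terms of \(x_i=a_ib_1+c_i\) is a purely algebraic identity, which the following sketch confirms for random inputs. The function name `both_sides` is hypothetical, and the variables are treated as unconstrained real numbers for the purpose of the check.

```python
import random

def both_sides(a, b, c):
    """a, c: lists of length M-1; b: list of length N.

    Returns the double sum over (i, j) and its rewritten form in x_i.
    """
    lhs = sum((ai * bj + ci) ** 2 for ai, ci in zip(a, c) for bj in b)
    rhs = 0.0
    for ai, ci in zip(a, c):
        xi = ai * b[0] + ci  # x_i = a_i b_1 + c_i
        rhs += xi ** 2
        rhs += sum((xi - (ai * b[0] - ai * bj)) ** 2 for bj in b[1:])
    return lhs, rhs

rng = random.Random(3)
a = [rng.uniform(-1, 1) for _ in range(4)]
c = [rng.uniform(-1, 1) for _ in range(4)]
b = [rng.uniform(0, 1) for _ in range(6)]
lhs, rhs = both_sides(a, b, c)
print(abs(lhs - rhs) < 1e-12)
```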

Consider the following generated ideal:

$$\begin{aligned} J:=\left\langle (x_i)_{i=1}^{M-1}, (a_i b_1 -a_i b_j)_{(i,j)=(1,2)}^{(M-1,N)} \right\rangle . \end{aligned}$$

We expand the square terms

$$\begin{aligned} \{x_i-(a_i b_1-a_i b_j)\}^2=x_i^2 + (a_i b_1-a_i b_j)^2 -2 x_i (a_i b_1 -a_i b_j) \end{aligned}$$

and \(x_i (a_i b_1 - a_i b_j) \in J\) holds. Hence, owing to Corollary 12, we have

$$\begin{aligned}&\Vert AB-A_0 B_0 \Vert ^2 \sim \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{x_i-(a_i b_1-a_i b_j)\}^2 \right] \\&\sim \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} (a_i b_1-a_i b_j)^2 \right] \\&= \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} a_i^2(b_j-b_1)^2 \right] .\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_i, \\ b_1=b_1, \\ b_j=b_j-b_1, &{} (j>1) \\ x_i =x_i \end{array}\right. }, \end{aligned}$$

then we have

$$\begin{aligned} \Vert AB-A_0 B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} a_i^2(b_j-b_1)^2 \right\} \\&=\sum _{i=1}^{M-1} \left( x_i^2 + \sum _{j=2}^{N} a_i^2b_j^2 \right) \\&=\sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N a_i^2 b_j^2 \\&=\sum _{i=1}^{M-1} x_i^2 +\left( \sum _{i=1}^{M-1}a_i^2 \right) \left( \sum _{j=2}^N b_j^2 \right) . \end{aligned}$$

Since \(a_i\), \(b_j\), and \(x_i\) are mutually independent variables, we consider blow-ups of them and get

$$\begin{aligned} \lambda&= \frac{M-1}{2} + \min \left\{ \frac{M-1}{2}, \frac{N-1}{2} \right\} \\& = \min \left\{ M-1,\frac{M+N-2}{2} \right\} . \end{aligned}$$

Therefore,

$$\begin{aligned} \lambda ={\left\{ \begin{array}{ll} M-1 &{} (M \leqq N) \\ \frac{M+N-2}{2} &{} (M>N) \end{array}\right. }. \end{aligned}$$

\(\square\)
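The simplification of \(\frac{M-1}{2}+\min \{\frac{M-1}{2},\frac{N-1}{2}\}\) to \(\min \{M-1,\frac{M+N-2}{2}\}\) in the proof of Lemma 15 can be confirmed with exact rational arithmetic. This is a small sketch with hypothetical function names.

```python
from fractions import Fraction

def lemma15_rlct(M, N):
    """(M-1)/2 + min{(M-1)/2, (N-1)/2}, the first form in the proof."""
    return Fraction(M - 1, 2) + min(Fraction(M - 1, 2), Fraction(N - 1, 2))

def lemma15_closed_form(M, N):
    """min{M-1, (M+N-2)/2}, the second form in the proof."""
    return min(Fraction(M - 1), Fraction(M + N - 2, 2))

# Compare the two forms on a small grid of matrix sizes.
agree = all(lemma15_rlct(M, N) == lemma15_closed_form(M, N)
            for M in range(2, 9) for N in range(2, 9))
print(agree)
```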

Third, we prove Lemma 16.

Proof

(Lemma 16)

$$\begin{aligned}&AB-A_0B_0 \\&\quad =\left( \begin{matrix} a_{11} & a_{12} \\ \vdots & \vdots \\ a_{(M-1)1} & a_{(M-1)2} \\ a_{M1} & a_{M2} \end{matrix} \right) \left( \begin{matrix} b_1 & \ldots & b_N \\ 1-b_1 & \ldots & 1-b_N \end{matrix} \right) \\&\qquad -\left( \begin{matrix} a^0_{11} & a^0_{12} \\ \vdots & \vdots \\ a^0_{(M-1)1} & a^0_{(M-1)2} \\ a^0_{M1} & a^0_{M2} \end{matrix} \right) \left( \begin{matrix} b^0_1 & \ldots & b^0_N \\ 1-b^0_1 & \ldots & 1-b^0_N \end{matrix} \right) \\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j-(a^0_{11}-a^0_{12})b^0_j+a_{12}-a^0_{12} \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j -(a^0_{(M-1)1}-a^0_{(M-1)2})b^0_j+a_{(M-1)2}-a^0_{(M-1)2} \\ (a_{M1}-a_{M2})b_j-(a^0_{M1}-a^0_{M2})b^0_j+a_{M2}-a^0_{M2} \\ \end{matrix} \right) _{j=1}^N\\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j-(a^0_{11}-a^0_{12})b^0_j+a_{12}-a^0_{12} \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j -(a^0_{(M-1)1}-a^0_{(M-1)2})b^0_j+a_{(M-1)2}-a^0_{(M-1)2} \\ -\sum _{i=1}^{M-1}\{(a_{i1}-a_{i2})b_j-(a^0_{i1}-a^0_{i2})b^0_j\}-\sum _{i=1}^{M-1}(a_{i2}-a^0_{i2}) \\ \end{matrix} \right) _{j=1}^N. \end{aligned}$$

Then we have

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&= \sum _{j=1}^N \left( \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\}^2 \right. \\&\quad \left. + \left[ \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j +a_{i2}-a^0_{i2}\} \right] ^2 \right) . \end{aligned}$$

Put \(I=\langle \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\}_{i=1}^{M-1} \rangle\). Because of Corollary 11 and

$$\begin{aligned} \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j +a_{i2}-a^0_{i2}\} \in I, \end{aligned}$$

we get

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j \\&\quad -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\}^2.\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i1}-a_{i2}, \\ a_{i2} = a_{i2}, \\ b_j=b_j \end{array}\right. } \end{aligned}$$

and put \(a^0_{i}=a^0_{i1}-a^0_{i2}\). Then we get

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ (a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\right\} ^2 \\&\quad =\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ a_{i}b_j -a^0_{i}b^0_j+a_{i2}-a^0_{i2}\right\} ^2. \end{aligned}$$

Moreover,

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ c_{i} = a_{i2}-a^0_{i2} \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ a_{i}b_j-a^0_{i}b^0_j +a_{i2}-a^0_{i2}\right\} ^2 \\&\quad =\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ a_{i}b_j -a^0_{i}b^0_j+c_{i}\right\} ^2 \end{aligned}$$

holds.

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ x_{i} = a_{i}b_1-a^0_{i}b^0_1+c_i, \\ \end{array}\right. }. \end{aligned}$$

If \(j>1\), then we have \(a_i b_j -a^0_{i}b^0_j+c_i = x_i -a_i b_1 +a^0_i b^0_1+a_i b_j-a^0_{i}b^0_j\) and obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j -a^0_i b^0_j+c_{i}\}^2\\&\quad =\sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N} \sum _{i=1}^{M-1}\left\{ x_i-(a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)\right\} ^2. \end{aligned}$$

Consider the following generated ideal:

$$\begin{aligned}&J:=\left\langle (x_i)_{i=1}^{M-1}, (a_i b_1 -a^0_i b^0_1\right. \\&\qquad \left. -a_i b_j +a^0_i b^0_j)_{(i,j)=(1,2)}^{(M-1,N)} \right\rangle . \end{aligned}$$

We expand the square terms

$$\begin{aligned}&\{x_i-(a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)\}^2\\&=x_i^2 + (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)^2 \\&\quad -2 x_i (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j) \end{aligned}$$

and \(x_i (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j) \in J\) holds. Hence, owing to Corollary 12, we have

$$\begin{aligned}&\Vert AB-A_0 B_0 \Vert ^2\\&\quad \sim \sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N} \sum _{i=1}^{M-1}\{x_i-(a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)\}^2 \\&\quad \sim \sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)^2 \right\} \\&\quad = \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{a_i(b_j-b_1)-a^0_i(b^0_j-b^0_1)\}^2 \right] .\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_i, \\ b_1=b_1, \\ b_j=b_j-b_1, &{} (j>1) \\ x_i =x_i \end{array}\right. } \end{aligned}$$

and put \(b^0_j=b^0_j-b^0_1\), then we have

$$\begin{aligned}&\Vert AB-A_0 B_0 \Vert ^2 \\&\quad \sim \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{a_i(b_j-b_1)-a^0_i(b^0_j-b^0_1)\}^2 \right] \\&\quad =\sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} (a_i b_j - a^0_i b^0_j)^2 \right\} \\&\quad =\sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N (a_i b_j - a^0_i b^0_j)^2 .\end{aligned}$$

Let \(f_{ij}\) be \(a_i b_j - a^0_i b^0_j\). If \(\Vert AB-A_0B_0 \Vert ^2 =0\), then \(f_{ij}=0\); hence, \(a_i \ne 0\) and \(b_j \ne 0\). Owing to Proposition 13,

$$\begin{aligned} \sum _{i=1}^{M-1} \sum _{j=2}^N f_{ij}^2 \sim \sum _{i=2}^{M-1} f_{i2}^2 + \sum _{j=3}^N f_{1j}^2 +f_{12}^2, \end{aligned}$$

we have

$$\begin{aligned} \Vert AB-A_0 B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N (a_i b_j - a^0_i b^0_j)^2 \nonumber \\&= \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N f_{ij}^2 \nonumber \\&\sim \sum _{i=1}^{M-1} x_i^2 +\left( f_{12}^2+\sum _{i=2}^{M-1} f_{i2}^2 + \sum _{j=3}^N f_{1j}^2 \right) . \end{aligned}$$
(16)

Thus, it suffices to calculate the RLCT of the right-hand side. Considering blow-ups, the RLCT \(\lambda _1\) of the first term is \(\lambda _1=(M-1)/2\). To derive the RLCT of the second term, we arbitrarily take and fix \(i,j \in {\mathbb {N}}\) with \(1 \leqq i \leqq M-1\) and \(2 \leqq j \leqq N\).

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_i = a_i, \\ f_{i2} = a_i b_2 - a^0_i b^0_2, \\ f_{1j} = a_1 b_j -a^0_1 b^0_j, \\ x_i = x_i \end{array}\right. } \end{aligned}$$

and we have that the Jacobi matrix of the above transformation is equal to

$$\begin{aligned} \frac{\partial (a_i ,f_{ij}, x_i)}{\partial (a_i,b_j,x_i)}= \left( \begin{array}{ccc} \frac{\partial a_i}{\partial a_i} &{} \frac{\partial f_{ij}}{\partial a_i} &{} \frac{\partial x_i}{\partial a_i}\\ \frac{\partial a_i}{\partial b_j} &{} \frac{\partial f_{ij}}{\partial b_j} &{} \frac{\partial x_i}{\partial b_j}\\ \frac{\partial a_i}{\partial x_i} &{} \frac{\partial f_{ij}}{\partial x_i} &{} \frac{\partial x_i}{\partial x_i}\\ \end{array} \right) = \left( \begin{array}{ccc} 1 &{} b_j &{} 0 \\ 0 &{} a_i &{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array} \right) . \end{aligned}$$

Because of

$$\begin{aligned} \Biggl | \frac{\partial (a_i ,f_{ij}, x_i)}{\partial (a_i,b_j,x_i)} \Biggr |=a_i \ne 0, \end{aligned}$$

the above transformation is an analytic isomorphism. Thus, the RLCT \(\lambda _2\) of the second term in Eq. (16) is equal to

$$\begin{aligned} \lambda _2 = \frac{M+N-3}{2}. \end{aligned}$$

Let \(\lambda\) be the RLCT of \(\Vert AB-A_0B_0 \Vert ^2\). From the above,

$$\begin{aligned} \lambda =\lambda _1+\lambda _2 = \frac{2M+N-4}{2}. \end{aligned}$$

\(\square\)

Lastly, we derive the inequality of Lemma 17.

Proof

(Lemma 17) We expand the objective function and obtain

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \nonumber \\&\quad = \sum _{i=1}^M \sum _{j=1}^N (a_{i1}b_{1j} + \cdots + a_{iH}b_{Hj} - a^0_{i1}b^0_{1j} - \cdots - a^0_{iH}b^0_{Hj})^2 \nonumber \\&\quad = \sum _{j=1}^N \sum _{i=1}^M \Biggl \{ \sum _{k=1}^{H} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) \Biggr \}^2 \nonumber \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) \Biggr \}^2 \nonumber \\&\qquad + \sum _{j=1}^N \Biggl \{ \sum _{k=1}^{H} (a_{Mk}b_{kj} - a^0_{Mk}b^0_{kj}) \Biggr \}^2. \end{aligned}$$
(17)

Expand the second term in Eq. (17) using \(a_{Mk}=1-\sum _{i=1}^{M-1}a_{ik}\), \(b_{Hj}=1-\sum _{k=1}^{H-1} b_{kj}\), \(a^0_{Mk}=1-\sum _{i=1}^{M-1}a^0_{ik}\), and \(b^0_{Hj}=1-\sum _{k=1}^{H-1} b^0_{kj}\), then we have

$$\begin{aligned}&\sum _{j=1}^N \Biggl \{ \sum _{k=1}^{H} (a_{Mk}b_{kj} - a^0_{Mk}b^0_{kj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \Biggl \{ \sum _{k=1}^{H-1} (a_{Mk}b_{kj} - a^0_{Mk}b^0_{kj}) + (a_{MH}b_{Hj} - a^0_{MH}b^0_{Hj}) \Biggr \}^2 \\&\quad =\sum _{j=1}^N \Biggl (- \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} a_{ik}b_{kj} + \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} a^0_{ik}b^0_{kj}\\&\qquad -\sum _{i=1}^{M-1}a_{iH} +\sum _{i=1}^{M-1}\sum _{k=1}^{H-1} a_{iH}b_{kj}+ \sum _{i=1}^{M-1}a^0_{iH} \\&\qquad -\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} a^0_{iH}b^0_{kj} \Biggr )^2 {=}\!:\! \varPhi _2. \end{aligned}$$

Rearranging the terms, we have

$$\begin{aligned} \varPhi _2&=\sum _{j=1}^N \Bigg \{- \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj} \\&\quad + \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a^0_{ik}-a^0_{iH})b^0_{kj} -\sum _{i=1}^{M-1}(a_{iH}-a^0_{iH}) \Bigg \}^2 \\&=\sum _{j=1}^N \Biggl [\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} \{(a_{ik}{-}a_{iH})b_{kj} \\&\qquad {-} (a^0_{ik}{-}a^0_{iH})b^0_{kj}\}{+}\sum _{i=1}^{M-1}(a_{iH}{-}a^0_{iH}) \Biggr ]^2. \end{aligned}$$

On the other hand, the first term of Eq. (17) is equal to

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) + (a_{iH}b_{Hj} - a^0_{iH}b^0_{Hj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) + a_{iH}\\&\qquad -\sum _{k=1}^{H-1} a_{iH}b_{kj} - a^0_{iH} + \sum _{k=1}^{H-1} a^0_{iH} b^0_{kj} \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) + (a_{iH} - a^0_{iH})\\&\qquad -\sum _{k=1}^{H-1} (a_{iH}b_{kj} - a^0_{iH} b^0_{kj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH})+\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} \\&\qquad - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2 . \end{aligned}$$

Consider the following ideal:

$$\begin{aligned} I=\, & \left\langle (a_{iH}-a^0_{iH})_{i=1}^{M-1} , \{(a_{ik}-a_{iH})b_{kj} \right. \\&\left. - (a^0_{ik}-a^0_{iH}) b^0_{kj}\}_{(i,j,k)=(1,1,1)}^{(M-1,N,H-1)} \right\rangle . \end{aligned}$$

Since we have

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&= \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH})\\&\quad +\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2 \\&\quad + \sum _{j=1}^N \Biggl [\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} \{(a_{ik}{-}a_{iH})b_{kj} \\&\quad {-} (a^0_{ik}{-}a^0_{iH})b^0_{kj}\}{+}\sum _{i=1}^{M-1}(a_{iH}{-}a^0_{iH}) \Biggr ]^2 \end{aligned}$$

and

$$\begin{aligned}&\forall j, \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} \{(a_{ik}{-}a_{iH})b_{kj} \\&\quad {-} (a^0_{ik}{-}a^0_{iH})b^0_{kj}\}{+}\sum _{i=1}^{M-1}(a_{iH}{-}a^0_{iH}) \in I, \end{aligned}$$

Corollary 12 gives

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH})\\&\quad +\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2. \end{aligned}$$
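
The decomposition of \(\Vert AB-A_0B_0 \Vert ^2\) into the first term and \(\varPhi _2\) can be checked numerically. The following is a minimal sketch, assuming column-stochastic matrices (every column sums to one) drawn at random; the helper names `random_stochastic`, `squared_distance` and `decomposed_terms` are hypothetical and do not come from the paper.

```python
import random

def random_stochastic(rows, cols, rng):
    """Return a rows x cols matrix (list of rows) whose columns each sum to one."""
    columns = []
    for _ in range(cols):
        weights = [rng.random() + 0.1 for _ in range(rows)]
        total = sum(weights)
        columns.append([w / total for w in weights])
    # Transpose so that result[i][j] is the entry in row i, column j.
    return [[columns[j][i] for j in range(cols)] for i in range(rows)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def squared_distance(A, B, A0, B0):
    """Squared Frobenius norm of AB - A0 B0 over all M rows."""
    P, P0 = matmul(A, B), matmul(A0, B0)
    return sum((P[i][j] - P0[i][j]) ** 2
               for i in range(len(P)) for j in range(len(P[0])))

def decomposed_terms(A, B, A0, B0):
    """Return (first term, Phi_2) written in the free coordinates only."""
    M, H, N = len(A), len(B), len(B[0])

    def f(i, j):
        # (a_iH - a0_iH) + sum_{k<H} {(a_ik - a_iH) b_kj - (a0_ik - a0_iH) b0_kj}
        s = A[i][H - 1] - A0[i][H - 1]
        for k in range(H - 1):
            s += (A[i][k] - A[i][H - 1]) * B[k][j]
            s -= (A0[i][k] - A0[i][H - 1]) * B0[k][j]
        return s

    first = sum(f(i, j) ** 2 for j in range(N) for i in range(M - 1))
    phi2 = sum(sum(f(i, j) for i in range(M - 1)) ** 2 for j in range(N))
    return first, phi2
```

For any such random draw, `squared_distance` and the sum of the two values returned by `decomposed_terms` should agree up to floating-point rounding; this only illustrates the identity and is no substitute for the algebraic argument.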

As in the proof of Lemma 16, we change coordinates to resolve the singularity of the above polynomial.

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik}-a_{iH}, &{} (k<H) \\ a_{iH}=a_{iH}, \\ b_{kj}=b_{kj}, \\ \end{array}\right. } \end{aligned}$$

and put \(a^0_{ik}=a^0_{ik}-a^0_{iH}\); then we have

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH}) +\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2 \\&=\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH}) +\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2.\\&\text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik},\\ b_{kj}=b_{kj}, \\ c_i=a_{iH}-a^0_{iH} \end{array}\right. }. \end{aligned}$$

Then we obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH}) +\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2 \\&=\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [c_i+\sum _{k=1}^{H-1}(a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2 \\&=\sum _{i=1}^{M-1} \sum _{j=1}^{N} \Biggl [c_i+\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2 \\&=\sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}b_{k1} {-} a^0_{ik}b^0_{k1}){+}c_i \right\} ^2 \\&\quad + \sum _{j=2}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}){+}c_i \right\} ^2 .\end{aligned}$$

Next, we introduce \(x_i\) as a new coordinate:

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik},\\ b_{kj}=b_{kj}, \\ x_i=\sum _{k=1}^{H-1} (a_{ik}b_{k1} - a^0_{ik}b^0_{k1}) + c_i \end{array}\right. }. \end{aligned}$$

If \(j>1\), then we have

$$\begin{aligned}&\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}){+}c_i \\&= x_i -\sum _{k=1}^{H-1} (a_{ik}b_{k1} - a^0_{ik}b^0_{k1})\\&\quad +\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj})\\&=x_i +\sum _{k=1}^{H-1}\{ (a_{ik}b_{kj} \\&\quad {-} a^0_{ik}b^0_{kj})- (a_{ik}b_{k1} - a^0_{ik}b^0_{k1})\} \end{aligned}$$

and obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}){+}c_i \right\} ^2 \\&=\sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N}\sum _{i=1}^{M-1} \left[ x_i +\sum _{k=1}^{H-1}\{ (a_{ik}b_{kj} \right. \\&\quad \left. {-} a^0_{ik}b^0_{kj})- (a_{ik}b_{k1} - a^0_{ik}b^0_{k1})\} \vphantom{\sum _{k=1}^{H-1}}\right] ^2. \end{aligned}$$

Put

$$\begin{aligned} g_{ij}:=\sum _{k=1}^{H-1}\left\{ \left( a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}\right) - \left( a_{ik}b_{k1} - a^0_{ik}b^0_{k1}\right) \right\} . \end{aligned}$$
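
The rewriting in terms of \(x_i\) and \(g_{ij}\) is a purely algebraic identity, so it can be sanity-checked on arbitrary real inputs. The sketch below assumes 0-based indexing (column `0` plays the role of \(j=1\)) and uses hypothetical helper names; the arrays `a`, `b`, `a0`, `b0`, `c` stand for the transformed coordinates \(a_{ik}\), \(b_{kj}\), \(a^0_{ik}\), \(b^0_{kj}\), \(c_i\).

```python
import random

def random_matrix(rows, cols, rng):
    """Arbitrary real entries; the identity does not need stochasticity."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

def bracket(a, b, a0, b0, c, i, j):
    """c_i + sum_k (a_ik b_kj - a0_ik b0_kj)."""
    return c[i] + sum(a[i][k] * b[k][j] - a0[i][k] * b0[k][j]
                      for k in range(len(b)))

def g(a, b, a0, b0, i, j):
    """g_ij: difference between column j and the first column (index 0)."""
    return sum((a[i][k] * b[k][j] - a0[i][k] * b0[k][j])
               - (a[i][k] * b[k][0] - a0[i][k] * b0[k][0])
               for k in range(len(b)))

def both_sides(a, b, a0, b0, c):
    """Return (sum of squared brackets, sum x_i^2 + sum_{j>1} (x_i + g_ij)^2)."""
    rows, cols = len(a), len(b[0])
    lhs = sum(bracket(a, b, a0, b0, c, i, j) ** 2
              for i in range(rows) for j in range(cols))
    x = [bracket(a, b, a0, b0, c, i, 0) for i in range(rows)]
    rhs = sum(xi * xi for xi in x)
    rhs += sum((x[i] + g(a, b, a0, b0, i, j)) ** 2
               for i in range(rows) for j in range(1, cols))
    return lhs, rhs
```

The two returned values should coincide up to rounding for any input, which mirrors the equality displayed before the definition of \(g_{ij}\).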

Consider the following ideal:

$$\begin{aligned} J:=\left\langle (x_i)_{i=1}^{M-1}, (g_{ij})_{(i,j)=(1,2)}^{(M-1,N)} \right\rangle . \end{aligned}$$

We expand the square terms

$$\begin{aligned} (x_i+g_{ij})^2=x_i^2 + (g_{ij})^2 +2x_i g_{ij} \end{aligned}$$

and note that \(x_i g_{ij} \in J\). Hence, by Corollary 12, we get

$$\begin{aligned} &\Vert AB-A_0 B_0 \Vert ^2\\&\quad\sim \sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N} \sum _{i=1}^{M-1} \{x_i+g_{ij}\}^2 \sim \sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} (g_{ij})^2 \right\} \\&\quad= \sum _{i=1}^{M-1} \left( x_i^2 + \sum _{j=2}^{N} \left[ \sum _{k=1}^{H-1}\left\{ \left(a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}\right) -\left(a_{ik}b_{k1} - a^0_{ik}b^0_{k1}\right)\right\}\right] ^2 \right) \\&\quad= \sum _{i=1}^{M-1} \left(x_i^2 + \sum _{j=2}^{N} \left[ \sum _{k=1}^{H-1}\{ a_{ik}(b_{kj} {-} b_{k1}) -a^0_{ik}(b^0_{kj}- b^0_{k1})\}\right] ^2 \right) . \end{aligned}$$
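
The replacement of \((x_i+g_{ij})^2\) by \((g_{ij})^2\) above rests on Corollary 12. As an elementary cross-check, which is not the paper's argument, one can observe that for a fixed \(i\) the two expressions bound each other up to the constant \(2N-1\), using \((u+v)^2 \leqq 2u^2+2v^2\). The sketch below assumes the standard fact that two nonnegative functions bounded by constant multiples of each other have the same RLCT; the function names are hypothetical.

```python
import random

def s_before(x, g):
    """x^2 + sum_j (x + g_j)^2, for one fixed i; g lists g_ij for j = 2..N."""
    return x * x + sum((x + gj) ** 2 for gj in g)

def s_after(x, g):
    """x^2 + sum_j g_j^2, for the same fixed i."""
    return x * x + sum(gj * gj for gj in g)

def within_sandwich(x, g, tol=1e-12):
    """Check s_before <= K * s_after and s_after <= K * s_before.

    K = 2 * len(g) + 1; since len(g) = N - 1 this is K = 2N - 1.
    """
    K = 2 * len(g) + 1
    p, q = s_before(x, g), s_after(x, g)
    return p <= K * q + tol and q <= K * p + tol

if __name__ == "__main__":
    rng = random.Random(0)
    ok = all(within_sandwich(rng.uniform(-1, 1),
                             [rng.uniform(-1, 1) for _ in range(4)])
             for _ in range(1000))
    print("two-sided bound held on all samples:", ok)
```

The constant \(2N-1\) is not sharp; it is only meant to show that neither expression can vanish faster than the other.
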
$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik}, \\ b_{k1}=b_{k1}, \\ b_{kj}=b_{kj}-b_{k1}, &{} (j>1) \\ x_i =x_i \end{array}\right. } \end{aligned}$$

and put \(b^0_{kj}=b^0_{kj}- b^0_{k1}\); then we have

$$\begin{aligned} \Vert AB-A_0 B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} \left( x_i^2 + \sum _{j=2}^{N} \left[ \sum _{k=1}^{H-1}\{ a_{ik}(b_{kj} \right. \right. \\&\quad \left. \left. {-} b_{k1}) -a^0_{ik}(b^0_{kj}- b^0_{k1})\}\right] ^2 \right) \\&=\sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \left\{ \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})\right\} ^2 \right] \\&=\sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^{N} \left\{ \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})\right\} ^2. \end{aligned}$$

There exists a positive constant \(C>0\) (for example \(C=H-1\), by the Cauchy-Schwarz inequality) such that

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^{N} \left\{ \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})\right\} ^2 \\&\leqq \sum _{i=1}^{M-1} x_i^2 +C\sum _{i=1}^{M-1} \sum _{j=2}^{N} \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})^2 \\&\sim \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^{N} \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})^2 \\&= \sum _{i=1}^{M-1} x_i^2 + \sum _{k=1}^{H-1} \sum _{i=1}^{M-1} \sum _{j=2}^N (a_{ik} b_{kj} -a^0_{ik} b^0_{kj})^2. \end{aligned}$$

As in the proof of Lemma 16, we blow up the coordinates to resolve the singularity of

$$\begin{aligned} \sum _{i=1}^{M-1} x_i^2 + \sum _{k=1}^{H-1} \sum _{i=1}^{M-1} \sum _{j=2}^N (a_{ik} b_{kj} -a^0_{ik} b^0_{kj})^2. \end{aligned}$$

Let \({\bar{\lambda }}_1\) be the RLCT of the first term and \({\bar{\lambda }}_2\) be the RLCT of the second term. The first term is a sum of squares of the \(M-1\) independent coordinates \(x_i\), so \({\bar{\lambda }}_1\) is equal to \((M-1)/2\). To derive \({\bar{\lambda }}_2\), we use the result of Lemma 16: for each fixed \(k\), the RLCT of \(\sum _{i=1}^{M-1} \sum _{j=2}^N (a_{ik} b_{kj} -a^0_{ik} b^0_{kj})^2\) is equal to \((M+N-3)/2\). The \(H-1\) summands over \(k\) involve disjoint sets of variables, so their RLCTs add. Thus we have

$$\begin{aligned} {\bar{\lambda }}_2 = (H-1) \frac{M+N-3}{2}. \end{aligned}$$

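Let \(\lambda\) be the RLCT of \(\Vert AB-A_0B_0\Vert ^2\). RLCTs are order-preserving: a function bounded above by another has an RLCT that is no larger. Since the two terms of the bound depend on disjoint sets of variables, the RLCT of the bound is \({\bar{\lambda }}_1+{\bar{\lambda }}_2\), and therefore we get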

$$\begin{aligned} \lambda&\leqq {\bar{\lambda }}_1 + {\bar{\lambda }}_2 \\&= \frac{M-1}{2} + (H-1) \frac{M+N-3}{2} \\&= \frac{M-1+(H-1)(M+N-3)}{2}. \end{aligned}$$

\(\square\)
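
To get a feeling for the size of this bound, it can be evaluated exactly and compared with half the parameter dimension. The sketch below is only illustrative: the function names are hypothetical, and the dimension count \((M-1)H+(H-1)N\) is an assumption read off from the free coordinates used in this proof (\(a_{ik}\) for \(i<M\) and \(b_{kj}\) for \(k<H\)), not a formula quoted from the paper.

```python
from fractions import Fraction

def rlct_upper_bound(M, N, H):
    """Bound from the end of the proof: (M-1)/2 + (H-1)(M+N-3)/2."""
    lam1 = Fraction(M - 1, 2)                # RLCT of sum_i x_i^2
    lam2 = (H - 1) * Fraction(M + N - 3, 2)  # (H-1) copies of the Lemma 16 value
    return lam1 + lam2

def half_dimension(M, N, H):
    """Half the number of free parameters.

    ASSUMPTION: A contributes (M-1)*H free entries and B contributes
    (H-1)*N, because each column lies in a simplex.
    """
    return Fraction((M - 1) * H + (H - 1) * N, 2)

if __name__ == "__main__":
    for M, N, H in [(3, 4, 2), (5, 6, 3), (10, 20, 4)]:
        print((M, N, H), rlct_upper_bound(M, N, H), half_dimension(M, N, H))
```

Under this count, expanding the two expressions shows that the bound lies below half the dimension by exactly \(H-1\), which is consistent with the claim that the bound is smaller than the dimension of the parameter divided by two whenever \(H>1\).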

Cite this article

Hayashi, N., Watanabe, S. Asymptotic Bayesian Generalization Error in Latent Dirichlet Allocation and Stochastic Matrix Factorization. SN COMPUT. SCI. 1, 69 (2020). https://doi.org/10.1007/s42979-020-0071-3
