Asymptotic Bayesian Generalization Error in Latent Dirichlet Allocation and Stochastic Matrix Factorization

Hayashi, Naoki; Watanabe, Sumio

doi:10.1007/s42979-020-0071-3

Original Research
Published: 20 February 2020

Volume 1, article number 69, (2020)
SN Computer Science

Abstract

Latent Dirichlet allocation (LDA) is useful in document analysis, image processing, and many information systems; however, its generalization performance has remained unknown because it is a singular learning machine to which regular statistical theory cannot be applied. Stochastic matrix factorization (SMF) is a restricted matrix factorization in which the matrix factors are stochastic, i.e., each column of a factor lies in a simplex. SMF is being applied to image recognition and text mining. SMF can be understood as a statistical model in which a stochastic matrix of given data is represented by a product of two stochastic matrices; its generalization performance has also remained unknown because of non-regularity. In this paper, using an algebraic and geometric method, we show the analytic equivalence of LDA and SMF: both have the same real log canonical threshold (RLCT), so they asymptotically have the same Bayesian generalization error and the same log marginal likelihood. Moreover, we derive an upper bound of the RLCT and prove that it is smaller than the dimension of the parameter divided by two; hence, their Bayesian generalization errors are smaller than those of regular statistical models.

Acknowledgements

This research was partially supported by NTT DATA Mathematical Systems Inc. The authors would like to thank the editor and the reviewers for their comments to improve this paper.

Author information

Authors and Affiliations

Simulation & Mining Division, NTT DATA Mathematical Systems Inc., 1F Shinano-machi-Renga-kan, Shinano-machi 35, Shinjuku-Ku, 160-0016, Tokyo, Japan
Naoki Hayashi
Department of Mathematical and Computing Science, Tokyo Institute of Technology, W8-42, 2-12-1, Oookayama, Meguro-ku, 152-8552, Tokyo, Japan
Sumio Watanabe

Corresponding author

Correspondence to Naoki Hayashi.

Ethics declarations

Conflict of Interest

The conflicts of interest in our research are as follows: the first author is a member of NTT DATA Mathematical Systems Inc., Japan, and the second author is a member of Tokyo Institute of Technology, Japan. The authors declare that they do not have any other conflicts of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

A. Proof of Main Theorem

In this section, we prove the Main Theorem using the above lemmas.

Proof

(Main Theorem) Summarizing the terms, we have

$$\begin{aligned} \varPhi (A,B)&= \Vert AB-A_0B_0 \Vert ^2 \nonumber \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \!\sum _{k=1}^{H_0-1} (\!a_{ik}b_{kj}{-}a^0_{ik}b^0_{kj})+a_{iH_0}b_{H_0j}\right. \nonumber \\&\quad \left. -a^0_{iH_0}b^0_{H_0j} +\sum _{k=H_0+1}^{H-1}a_{ik}b_{kj} + a_{iH} b_{Hj} \!\right\} ^2 \nonumber \\&\quad + \sum _{j=1}^N \left\{ \!\sum _{k=1}^{H_0-1} (\!a_{Mk}b_{kj} {-}a^0_{Mk}b^0_{kj}){+}a_{MH_0}b_{H_0j}\right. \nonumber \\&\quad \left. {-}a^0_{MH_0}b^0_{H_0j} {+}\sum _{k=H_0+1}^{H-1}a_{Mk}b_{kj} {+} a_{MH} b_{Hj}\! \right\} ^2\!\!. \end{aligned}$$

(15)

Put

$$\begin{aligned}&K_{ij}:=\sum _{k=1}^{H_0-1} (a_{ik}b_{kj} -a^0_{ik}b^0_{kj})+a_{iH_0}b_{H_0j} \\&\qquad -a^0_{iH_0}b^0_{H_0j} +\sum _{k=H_0+1}^{H-1}a_{ik}b_{kj} + a_{iH} b_{Hj} , \\&L_j:= \sum _{k=1}^{H_0-1} (a_{Mk}b_{kj} -a^0_{Mk}b^0_{kj})+a_{MH_0}b_{H_0j}\\&\qquad -a^0_{MH_0}b^0_{H_0j} +\sum _{k=H_0+1}^{H-1}a_{Mk}b_{kj} + a_{MH} b_{Hj}, \end{aligned}$$

then we get

$$\begin{aligned} \Vert AB-A_0 B_0\Vert ^2 = \sum _{j=1}^N \sum _{i=1}^{M-1}K_{ij}^2 + \sum _{j=1}^N L_j^2. \end{aligned}$$

Using $a_{Mk}=1-\sum _{i=1}^{M-1}a_{ik}$, $b_{Hj}=1-\sum _{k=1}^{H-1} b_{kj}$, $a^0_{Mk}=1-\sum _{i=1}^{M-1}a^0_{ik}$, and $b^0_{H_0j}=1-\sum _{k=1}^{H_0-1} b^0_{kj}$, we have

$$\begin{aligned} \sum _{i=1}^{M-1}K_{ij}=\, & \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj} \\&-\sum _{i=1}^{M-1} \sum _{k=1}^{H_0-1} \left( a^0_{ik}-a^0_{iH_0}\right) b^0_{kj}+ \sum _{i=1}^{M-1} \left( a_{iH}-a^0_{iH_0}\right) ,\\ L_j=\, & -\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj}\\&+\sum _{i=1}^{M-1} \sum _{k=1}^{H_0-1}\left( a^0_{ik}-a^0_{iH_0}\right) b^0_{kj} - \sum _{i=1}^{M-1} \left( a_{iH}-a^0_{iH_0}\right) , \end{aligned}$$

thus

$$\begin{aligned} L_j^2=\left( \sum _{i=1}^{M-1} K_{ij} \right) ^2. \end{aligned}$$

Therefore

$$\begin{aligned} \Vert AB-A_0B_0\Vert ^2&= \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2 +\sum _{j=1}^N L_j^2 \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2 +\sum _{j=1}^N \left( \sum _{i=1}^{M-1} K_{ij} \right) ^2. \end{aligned}$$
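The reduction above rests on two facts: the last row satisfies $L_j=-\sum _{i=1}^{M-1}K_{ij}$, and therefore the squared norm splits into the two sums shown. As a numerical sanity check (a sketch, not part of the paper's argument), the following Python snippet draws random column-stochastic matrices and verifies both facts. The helper name `random_column_stochastic`, the seed, and the matrix sizes are illustrative assumptions.

```python
import numpy as np

def random_column_stochastic(rows, cols, rng):
    """Random matrix with non-negative entries whose columns sum to one."""
    X = rng.random((rows, cols))
    return X / X.sum(axis=0, keepdims=True)

rng = np.random.default_rng(0)
M, N, H, H0 = 4, 5, 3, 2  # arbitrary illustrative sizes with H >= H0

A = random_column_stochastic(M, H, rng)
B = random_column_stochastic(H, N, rng)
A0 = random_column_stochastic(M, H0, rng)
B0 = random_column_stochastic(H0, N, rng)

D = A @ B - A0 @ B0   # rows 1..M-1 hold K_ij, row M holds L_j
K = D[:-1, :]
L = D[-1, :]

lhs = np.sum(D ** 2)                                # ||AB - A0 B0||^2
rhs = np.sum(K ** 2) + np.sum(K.sum(axis=0) ** 2)   # sum K_ij^2 + sum (sum_i K_ij)^2

print(np.allclose(L, -K.sum(axis=0)), np.isclose(lhs, rhs))
```

Both checks should hold for any column-stochastic inputs, because every column of $AB$ and of $A_0B_0$ sums to one.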

On account of Corollary 12, we have

$$\begin{aligned} \Vert AB-A_0B_0\Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2, \end{aligned}$$

i.e.,

$$\begin{aligned}&\quad \Vert AB-A_0B_0\Vert ^2 \\&\sim \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj} \right. \nonumber \\&\qquad \left. - \sum _{k=1}^{H_0-1} (a^0_{ik}-a^0_{iH_0})b^0_{kj}+ (a_{iH}-a^0_{iH_0}) \right\} ^2 \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left[ \sum _{k=1}^{H_0-1} \{(a_{ik}-a_{iH})b_{kj}- (a^0_{ik}-a^0_{iH_0})b^0_{kj}\} \right. \nonumber \\&\qquad \left. + \sum _{k=H_0}^{H-1} (a_{ik}-a_{iH})b_{kj}+ (a_{iH}-a^0_{iH_0}) \right] ^2. \end{aligned}$$

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik}-a_{iH}, &{} k<H \\ c_i = a_{iH}-a^0_{iH_0}, \\ b_{kj}=b_{kj} \end{array}\right. } \end{aligned}$$

and put $a^0_{ik}=a^0_{ik}-a^0_{iH_0}$. Then we have

$$\begin{aligned}&\quad \Vert AB-A_0B_0\Vert ^2 \\&\sim \sum _{j=1}^N \sum _{i=1}^{M-1} \left[ \sum _{k=1}^{H_0-1} \{(a_{ik}-a_{iH})b_{kj}- (a^0_{ik}-a^0_{iH_0})b^0_{kj}\} \right. \nonumber \\&\quad \left. + \sum _{k=H_0}^{H-1} (a_{ik}-a_{iH})b_{kj}+ (a_{iH}-a^0_{iH_0}) \right] ^2 \\&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H_0-1} (a_{ik}b_{kj}- a^0_{ik}b^0_{kj}) \right. \nonumber \\&\quad \left. + \sum _{k=H_0}^{H-1} a_{ik}b_{kj}+ c_i \right\} ^2. \end{aligned}$$

There exists a positive constant $C>0$ such that

$$\begin{aligned}&\quad C \Vert AB-A_0 B_0 \Vert ^2 \\&\leqq \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H_0-1} (a_{ik}b_{kj}-a^0_{ik}b^0_{kj}) + c_i \right\} ^2 \\&\quad + \sum _{j=1}^N \sum _{i=1}^{M-1} \left( \sum _{k=H_0}^{H-1} a_{ik}b_{kj} \right) ^2. \end{aligned}$$
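One admissible constant can be made explicit with two elementary inequalities. This is a sketch; the shorthand $u_{ij}$ and $v_{ij}$ is introduced here for illustration and is not notation from the paper. In the coordinates above, $K_{ij}=u_{ij}+v_{ij}$ with

$$\begin{aligned} u_{ij}:=\sum _{k=1}^{H_0-1} (a_{ik}b_{kj}-a^0_{ik}b^0_{kj}) + c_i, \qquad v_{ij}:=\sum _{k=H_0}^{H-1} a_{ik}b_{kj}. \end{aligned}$$

By the Cauchy-Schwarz inequality, $\left( \sum _{i=1}^{M-1} K_{ij}\right) ^2 \leqq (M-1)\sum _{i=1}^{M-1} K_{ij}^2$, and $(u_{ij}+v_{ij})^2 \leqq 2u_{ij}^2+2v_{ij}^2$, so

$$\begin{aligned} \Vert AB-A_0B_0\Vert ^2&= \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2 +\sum _{j=1}^N \left( \sum _{i=1}^{M-1} K_{ij} \right) ^2 \\&\leqq M \sum _{j=1}^N \sum _{i=1}^{M-1} K_{ij}^2 \leqq 2M \sum _{j=1}^N \sum _{i=1}^{M-1} \left( u_{ij}^2 + v_{ij}^2 \right) . \end{aligned}$$

Under these assumptions, $C=1/(2M)$ is one valid choice.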

Put

$$\begin{aligned} K_1=\, & \sum _{j=1}^N \sum _{i=1}^{M-1} \left[ \sum _{k=1}^{H_0-1} (a_{ik}b_{kj}- a^0_{ik}b^0_{kj}) + c_i\right] ^2, \\ K_2=\, & \sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=H_0}^{H-1} a_{ik}b_{kj} \right\} ^2. \end{aligned}$$

Let ${\bar{\lambda }}_1$ be the RLCT of $K_1$, ${\bar{\lambda }}_2$ be the RLCT of $K_2$, and $\lambda$ be the RLCT of $\Vert AB-A_0B_0\Vert ^2$. The following inequality holds because the RLCT is order-preserving and $K_1$ and $K_2$ depend on disjoint sets of variables:

$$\begin{aligned} \lambda \leqq {\bar{\lambda }}_1 + {\bar{\lambda }}_2. \end{aligned}$$

According to Lemma 17 in the case of $H \leftarrow H_0$,

$$\begin{aligned} {\bar{\lambda }}_1 \leqq \frac{M-1}{2} + (H_0-1) \frac{M+N-3}{2}. \end{aligned}$$

In contrast, there exists a positive constant $D>0$ such that

$$\begin{aligned} K_2&= \sum _{j=1}^N \sum _{i=1}^{M-1} \left( \sum _{k=H_0}^{H-1} a_{ik}b_{kj} \right) ^2 \\&\leqq D\sum _{j=1}^N \sum _{i=1}^{M-1} \sum _{k=H_0}^{H-1} a_{ik}^2 b_{kj}^2 \\&\sim \sum _{j=1}^N \sum _{i=1}^{M-1} \sum _{k=H_0}^{H-1} a_{ik}^2 b_{kj}^2 \\&= \sum _{k=H_0}^{H-1} \sum _{j=1}^N \sum _{i=1}^{M-1} a_{ik}^2 b_{kj}^2 \\&= \sum _{k=H_0}^{H-1} \left( \sum _{j=1}^N b_{kj}^2 \right) \left( \sum _{i=1}^{M-1} a_{ik}^2 \right) . \end{aligned}$$

The RLCT of the last expression is the sum over $k$ of the RLCTs of its summands. Considering blowing-ups of the variables $\{ a_{ik} \}$ and $\{ b_{kj} \}$ for each $k$, we obtain

$$\begin{aligned} {\bar{\lambda }}_2 \leqq \frac{(H-H_0)\min \{ M-1,N \}}{2}. \end{aligned}$$

Using the above inequalities about the RLCTs, we have

$$\begin{aligned} \lambda&\leqq {\bar{\lambda }}_1 + {\bar{\lambda }}_2 \leqq \frac{M-1}{2} \\&\quad + (H_0-1) \frac{M+N-3}{2} + \frac{(H-H_0)\min \{ M-1,N \}}{2}.\\&\quad \therefore \quad \lambda \leqq \frac{1}{2}\left[ M-1\right. \\&\quad \left. +(H_0-1)(M+N-3)+(H-H_0)\min \{M-1,N\} \right] . \end{aligned}$$
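To get a feel for the size of this bound, it can be evaluated next to half the parameter dimension. The sketch below assumes the parameter dimension is $(M-1)H+(H-1)N$, which is what the constraints $a_{Mk}=1-\sum _{i=1}^{M-1}a_{ik}$ and $b_{Hj}=1-\sum _{k=1}^{H-1}b_{kj}$ leave free; the function names are hypothetical.

```python
def rlct_upper_bound(M, N, H, H0):
    """Upper bound of the RLCT from the final inequality of the proof."""
    return 0.5 * ((M - 1) + (H0 - 1) * (M + N - 3) + (H - H0) * min(M - 1, N))

def half_parameter_dimension(M, N, H):
    """Half the number of free parameters, assuming (M-1)H + (H-1)N of them."""
    return 0.5 * ((M - 1) * H + (H - 1) * N)

# Illustrative sizes only; they are not taken from the paper.
M, N, H, H0 = 4, 5, 3, 2
print(rlct_upper_bound(M, N, H, H0), half_parameter_dimension(M, N, H))
```

For these illustrative sizes the bound evaluates to 6.0 against 9.5 for half the parameter dimension.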

$\square$

B. Proof of Lemmas

In this section, we prove the four lemmas introduced in Section 3: Lemmas 14, 15, 16, and 17.

First, Lemma 14 is proved.

Proof

(Lemma 14) We set $A=(a_{i})_{i=1}^{M}$, $B^T=(1)_{j=1}^{N}$, $A_0=(a^0_{i})_{i=1}^M$, and $B_0^T=(1)_{j=1}^N$. Then

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&=\sum _{i=1}^M N(a_i-a^0_i)^2 \\&= \sum _{i=1}^{M-1} N(a_i-a^0_i)^2 \\ &\quad + N \left( 1-\sum _{i=1}^{M-1} a_i -1 + \sum _{i=1}^{M-1} a^0_i \right) ^2 \\&= \sum _{i=1}^{M-1} N(a_i-a^0_i)^2 \\&\quad + N \left\{ \sum _{i=1}^{M-1} (a_i - a^0_i) \right\} ^2. \end{aligned}$$

By Corollary 12, since $\sum _{i=1}^{M-1} (a_i - a^0_i) \in \langle a_1-a^0_1, \ldots , a_{M-1}-a^0_{M-1} \rangle$, we obtain

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2 \sim \sum _{i=1}^{M-1} N(a_i-a^0_i)^2. \end{aligned}$$

Since an RLCT is not changed by a positive constant factor, it suffices to calculate the RLCT of

$$\begin{aligned} \sum _{i=1}^{M-1} (a_i-a^0_i)^2 \end{aligned}$$

and this function has no singularity. Thus, the RLCT is equal to half of the parameter dimension:

$$\begin{aligned} \lambda = \frac{M-1}{2}. \end{aligned}$$

$\square$

Second, Lemma 15 is derived.

Proof

(Lemma 15) We set $A_0=(a^0_{i})_{i=1}^M$ and $B_0^T=(1)_{j=1}^N$. Then

$$\begin{aligned}&AB-A_0B_0 \\&\quad =\left( \begin{matrix} a_{11} &{} a_{12} \\ \vdots &{} \vdots \\ a_{(M-1)1} &{} a_{(M-1)2} \\ a_{M1} &{} a_{M2} \end{matrix} \right) \left( \begin{matrix} b_1 &{} \ldots &{} b_N \\ 1-b_1 &{} \ldots &{} 1-b_N \end{matrix} \right) \\&\quad -\left( \begin{matrix} a^0_1 \\ \vdots \\ a^0_{M-1} \\ a^0_{M} \end{matrix} \right) \left( \begin{matrix} 1&\ldots&1 \end{matrix} \right) \\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j +a_{12}-a^0_1 \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j+a_{(M-1)2}-a^0_{M-1} \\ (a_{M1}-a_{M2})b_j+a_{M2}-a^0_{M} \\ \end{matrix} \right) _{j=1}^N \\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j+a_{12}-a^0_1 \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j+a_{(M-1)2}-a^0_{M-1} \\ -\sum _{i=1}^{M-1}(a_{i1}-a_{i2})b_j-\sum _{i=1}^{M-1}(a_{i2}-a^0_{i}) \\ \end{matrix} \right) _{j=1}^N . \end{aligned}$$

Thus,

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&= \sum _{j=1}^N \,\,\left(\vphantom{ \left[ \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\} \right] ^2} \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}^2 \right. \\&\quad \left. + \left[ \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\} \right] ^2 \right) . \end{aligned}$$

Put $I=\langle \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}_{i=1}^{M-1} \rangle$. Because of Corollary 11 and

$$\begin{aligned} \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\} \in I, \end{aligned}$$

we get

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}^2.\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i1}-a_{i2}, \\ a_{i2} = a_{i2}, \\ b_j=b_j \end{array}\right. }. \end{aligned}$$

Then we get

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j +a_{i2}-a^0_i\}^2\\&\quad =\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +a_{i2}-a^0_i\}^2. \end{aligned}$$

Moreover,

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ c_{i} = a_{i2}-a^0_i \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned} \sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +a_{i2}-a^0_i\}^2=\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +c_{i}\}^2 \end{aligned}$$

holds.

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ x_{i} = a_{i}b_1+c_i, \\ \end{array}\right. }. \end{aligned}$$

If $j>1$, then we have $a_i b_j +c_i = x_i -a_i b_1 +a_i b_j$ and obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j +c_{i}\}^2\\&\quad =\sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{x_i-(a_i b_1-a_i b_j)\}^2 \right] . \end{aligned}$$
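The rewriting in terms of $x_i=a_i b_1+c_i$ is a purely algebraic identity, so it can be checked numerically on arbitrary real inputs. The snippet below is a minimal sketch; the sizes, the seed, and the variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N = 4, 6                      # illustrative sizes
a = rng.normal(size=M - 1)       # a_i, i = 1..M-1
b = rng.normal(size=N)           # b_j, j = 1..N
c = rng.normal(size=M - 1)       # c_i

x = a * b[0] + c                 # x_i = a_i b_1 + c_i

# Left-hand side: sum over i, j of (a_i b_j + c_i)^2
lhs = np.sum((np.outer(a, b) + c[:, None]) ** 2)

# Right-hand side: sum_i [ x_i^2 + sum_{j>=2} (x_i - (a_i b_1 - a_i b_j))^2 ]
diff = a[:, None] * (b[0] - b[1:])[None, :]
rhs = np.sum(x ** 2) + np.sum((x[:, None] - diff) ** 2)

print(np.isclose(lhs, rhs))
```

The identity holds because $x_i-(a_i b_1-a_i b_j)=a_i b_j+c_i$ for every $j>1$.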

Consider the following generated ideal:

$$\begin{aligned} J:=\left\langle (x_i)_{i=1}^{M-1}, (a_i b_1 -a_i b_j)_{(i,j)=(1,2)}^{(M-1,N)} \right\rangle . \end{aligned}$$

We expand the square terms

$$\begin{aligned} \{x_i-(a_i b_1-a_i b_j)\}^2=x_i^2 + (a_i b_1-a_i b_j)^2 -2 x_i (a_i b_1 -a_i b_j) \end{aligned}$$

and $x_i (a_i b_1 - a_i b_j) \in J$ holds. Hence, owing to Corollary 12, we have

$$\begin{aligned}&\Vert AB-A_0 B_0 \Vert ^2 \sim \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{x_i-(a_i b_1-a_i b_j)\}^2 \right] \\&\sim \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} (a_i b_1-a_i b_j)^2 \right] \\&= \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} a_i^2(b_j-b_1)^2 \right] .\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_i, \\ b_1=b_1, \\ b_j=b_j-b_1, &{} (j>1) \\ x_i =x_i \end{array}\right. }, \end{aligned}$$

then we have

$$\begin{aligned} \Vert AB-A_0 B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} a_i^2(b_j-b_1)^2 \right\} \\&=\sum _{i=1}^{M-1} \left( x_i^2 + \sum _{j=2}^{N} a_i^2b_j^2 \right) \\&=\sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N a_i^2 b_j^2 \\&=\sum _{i=1}^{M-1} x_i^2 +\left( \sum _{i=1}^{M-1}a_i^2 \right) \left( \sum _{j=2}^N b_j^2 \right) . \end{aligned}$$

Since $a_i$, $b_j$, and $x_i$ are mutually independent variables, we consider blowing-ups of them and get

$$\begin{aligned} \lambda&= \frac{M-1}{2} + \min \left\{ \frac{M-1}{2}, \frac{N-1}{2} \right\} \\& = \min \left\{ M-1,\frac{M+N-2}{2} \right\} . \end{aligned}$$

Therefore,

$$\begin{aligned} \lambda ={\left\{ \begin{array}{ll} M-1 &{} (M \leqq N) \\ \frac{M+N-2}{2} &{} (M>N) \end{array}\right. }. \end{aligned}$$
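The two closed forms of $\lambda$ given by the minimum expressions can be compared with exact rational arithmetic. This is a small sketch with hypothetical function names; it only checks the arithmetic, not the underlying blow-up argument.

```python
from fractions import Fraction

def lemma15_rlct(M, N):
    """(M-1)/2 + min{(M-1)/2, (N-1)/2}, evaluated exactly."""
    return Fraction(M - 1, 2) + min(Fraction(M - 1, 2), Fraction(N - 1, 2))

def lemma15_rlct_min_form(M, N):
    """min{M-1, (M+N-2)/2}, evaluated exactly."""
    return min(Fraction(M - 1), Fraction(M + N - 2, 2))

# The two expressions agree on a small grid of sizes.
agree = all(
    lemma15_rlct(M, N) == lemma15_rlct_min_form(M, N)
    for M in range(2, 8)
    for N in range(2, 8)
)
print(agree)
```

Using `Fraction` avoids floating-point comparisons when the value is a half-integer.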

$\square$

Third, we prove Lemma 16.

Proof

(Lemma 16)

$$\begin{aligned}&AB-A_0B_0 \\&\quad =\left( \begin{matrix} a_{11} &{} a_{12} \\ \vdots &{} \vdots \\ a_{(M-1)1} &{} a_{(M-1)2} \\ a_{M1} &{} a_{M2} \end{matrix} \right) \left( \begin{matrix} b_1 &{} \ldots &{} b_N \\ 1-b_1 &{} \ldots &{} 1-b_N \end{matrix} \right) \\&\qquad -\left( \begin{matrix} a^0_{11} &{} a^0_{12} \\ \vdots &{} \vdots \\ a^0_{(M-1)1} &{} a^0_{(M-1)2} \\ a^0_{M1} &{} a^0_{M2} \end{matrix} \right) \left( \begin{matrix} b^0_1 &{} \ldots &{} b^0_N \\ 1-b^0_1 &{} \ldots &{} 1-b^0_N \end{matrix} \right) \\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j-(a^0_{11}-a^0_{12})b^0_j+a_{12}-a^0_{12} \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j -(a^0_{(M-1)1}-a^0_{(M-1)2})b^0_j+a_{(M-1)2}-a^0_{(M-1)2} \\ (a_{M1}-a_{M2})b_j-(a^0_{M1}-a^0_{M2})b^0_j+a_{M2}-a^0_{M2} \\ \end{matrix} \right) _{j=1}^N\\&\quad =\left( \begin{matrix} (a_{11}-a_{12})b_j-(a^0_{11}-a^0_{12})b^0_j+a_{12}-a^0_{12} \\ \vdots \\ (a_{(M-1)1}-a_{(M-1)2})b_j -(a^0_{(M-1)1}-a^0_{(M-1)2})b^0_j+a_{(M-1)2}-a^0_{(M-1)2} \\ -\sum _{i=1}^{M-1}\{(a_{i1}-a_{i2})b_j-(a^0_{i1}-a^0_{i2})b^0_j\}-\sum _{i=1}^{M-1}(a_{i2}-a^0_{i2}) \\ \end{matrix} \right) _{j=1}^N. \end{aligned}$$

Then we have

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&= \sum _{j=1}^N \left( \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\}^2 \right. \\&\quad \left. + \left[ \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j +a_{i2}-a^0_{i2}\} \right] ^2 \right) . \end{aligned}$$

Put $I=\langle \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\}_{i=1}^{M-1} \rangle$. Because of Corollary 11 and

$$\begin{aligned} \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j +a_{i2}-a^0_{i2}\} \in I, \end{aligned}$$

we get

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} \{(a_{i1}-a_{i2})b_j \\&\quad -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\}^2.\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i1}-a_{i2}, \\ a_{i2} = a_{i2}, \\ b_j=b_j \end{array}\right. } \end{aligned}$$

and put $a^0_{i}=a^0_{i1}-a^0_{i2}$. Then we get

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ (a_{i1}-a_{i2})b_j -(a^0_{i1}-a^0_{i2})b^0_j+a_{i2}-a^0_{i2}\right\} ^2 \\&\quad =\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ a_{i}b_j -a^0_{i}b^0_j+a_{i2}-a^0_{i2}\right\} ^2. \end{aligned}$$

Moreover,

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ c_{i} = a_{i2}-a^0_{i2} \end{array}\right. } \end{aligned}$$

and

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ a_{i}b_j-a^0_{i}b^0_j +a_{i2}-a^0_{i2}\right\} ^2 \\&\quad =\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ a_{i}b_j -a^0_{i}b^0_j+c_{i}\right\} ^2 \end{aligned}$$

holds.

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_{i}, \\ b_j=b_j ,\\ x_{i} = a_{i}b_1-a^0_{i}b^0_1+c_i, \\ \end{array}\right. }. \end{aligned}$$

If $j>1$, then we have $a_i b_j -a^0_{i}b^0_j+c_i = x_i -a_i b_1 +a^0_i b^0_1+a_i b_j-a^0_{i}b^0_j$ and obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \{a_{i}b_j -a^0_i b^0_j+c_{i}\}^2\\&\quad =\sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N} \sum _{i=1}^{M-1}\left\{ x_i-(a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)\right\} ^2. \end{aligned}$$

Consider the following generated ideal:

$$\begin{aligned}&J:=\left\langle (x_i)_{i=1}^{M-1}, (a_i b_1 -a^0_i b^0_1\right. \\&\qquad \left. -a_i b_j +a^0_i b^0_j)_{(i,j)=(1,2)}^{(M-1,N)} \right\rangle . \end{aligned}$$

We expand the square terms

$$\begin{aligned}&\{x_i-(a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)\}^2\\&=x_i^2 + (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)^2 \\&\quad -2 x_i (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j) \end{aligned}$$

and $x_i (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j) \in J$ holds. Hence, owing to Corollary 12, we have

$$\begin{aligned}&\Vert AB-A_0 B_0 \Vert ^2\\&\quad \sim \sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N} \sum _{i=1}^{M-1}\{x_i-(a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)\}^2 \\&\quad \sim \sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} (a_i b_1 -a^0_i b^0_1-a_i b_j +a^0_i b^0_j)^2 \right\} \\&\quad = \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{a_i(b_j-b_1)-a^0_i(b^0_j-b^0_1)\}^2 \right] .\\&\text{ Let } {\left\{ \begin{array}{ll} a_{i}=a_i, \\ b_1=b_1, \\ b_j=b_j-b_1, &{} (j>1) \\ x_i =x_i \end{array}\right. } \end{aligned}$$

and put $b^0_j=b^0_j-b^0_1$, then we have

$$\begin{aligned}&\Vert AB-A_0 B_0 \Vert ^2 \\&\quad \sim \sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \{a_i(b_j-b_1)-a^0_i(b^0_j-b^0_1)\}^2 \right] \\&\quad =\sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} (a_i b_j - a^0_i b^0_j)^2 \right\} \\&\quad =\sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N (a_i b_j - a^0_i b^0_j)^2 .\end{aligned}$$

Let $f_{ij}:=a_i b_j - a^0_i b^0_j$. If $\Vert AB-A_0B_0 \Vert ^2 =0$, then $f_{ij}=0$ for all $i$ and $j$. Hence, $a_i \ne 0$ and $b_j \ne 0$. Owing to Proposition 13

$$\begin{aligned} \sum _{i=1}^{M-1} \sum _{j=2}^N f_{ij}^2 \sim \sum _{i=2}^{M-1} f_{i2}^2 + \sum _{j=3}^N f_{1j}^2 +f_{12}^2, \end{aligned}$$

we have

$$\begin{aligned} \Vert AB-A_0 B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N (a_i b_j - a^0_i b^0_j)^2 \nonumber \\&= \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^N f_{ij}^2 \nonumber \\&\sim \sum _{i=1}^{M-1} x_i^2 +\left( f_{12}^2+\sum _{i=2}^{M-1} f_{i2}^2 + \sum _{j=3}^N f_{1j}^2 \right) . \end{aligned}$$

(16)

Thus, it suffices to calculate the RLCT of the right-hand side. Considering blowing-ups, the RLCT $\lambda _1$ of the first term is $\lambda _1=(M-1)/2$. To derive the RLCT of the second term, we arbitrarily take and fix $i,j$ ($1 \leqq i \leqq M-1$, $2 \leqq j \leqq N$, $i,j \in {\mathbb {N}}$).

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_i = a_i, \\ f_{i2} = a_i b_2 - a^0_i b^0_2, \\ f_{1j} = a_1 b_j -a^0_1 b^0_j, \\ x_i = x_i \end{array}\right. } \end{aligned}$$

and we have that the Jacobi matrix of the above transformation is equal to

$$\begin{aligned} \frac{\partial (a_i ,f_{ij}, x_i)}{\partial (a_i,b_j,x_i)}= \left( \begin{array}{ccc} \frac{\partial a_i}{\partial a_i} &{} \frac{\partial f_{ij}}{\partial a_i} &{} \frac{\partial x_i}{\partial a_i}\\ \frac{\partial a_i}{\partial b_j} &{} \frac{\partial f_{ij}}{\partial b_j} &{} \frac{\partial x_i}{\partial b_j}\\ \frac{\partial a_i}{\partial x_i} &{} \frac{\partial f_{ij}}{\partial x_i} &{} \frac{\partial x_i}{\partial x_i}\\ \end{array} \right) = \left( \begin{array}{ccc} 1 &{} b_j &{} 0 \\ 0 &{} a_i &{} 0 \\ 0 &{} 0 &{} 1 \\ \end{array} \right) . \end{aligned}$$

Because of

$$\begin{aligned} \Biggl | \frac{\partial (a_i ,f_{ij}, x_i)}{\partial (a_i,b_j,x_i)} \Biggr |=a_i \ne 0, \end{aligned}$$

the above transformation is an analytic isomorphism. Thus, the RLCT $\lambda _2$ of the second term in Eq. (16) is equal to

$$\begin{aligned} \lambda _2 = \frac{M+N-3}{2}. \end{aligned}$$
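The value of $\lambda _2$ can be read off by counting the squares in the second term of Eq. (16). As a brief sketch of that count: once the $f_{i2}$ and $f_{1j}$ are independent coordinates, the second term is a sum of independent squares, and a sum of $d$ independent squares has RLCT $d/2$, as in the proof of Lemma 14.

$$\begin{aligned} \underbrace{1}_{f_{12}} + \underbrace{(M-2)}_{f_{i2},\; 2 \leqq i \leqq M-1} + \underbrace{(N-2)}_{f_{1j},\; 3 \leqq j \leqq N} = M+N-3, \qquad \text{hence } \lambda _2 = \frac{M+N-3}{2}. \end{aligned}$$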

Let $\lambda$ be the RLCT of $\Vert AB-A_0B_0 \Vert ^2$. From the above,

$$\begin{aligned} \lambda =\lambda _1+\lambda _2 = \frac{2M+N-4}{2}. \end{aligned}$$

$\square$

Lastly, we derive the inequality of Lemma 17.

Proof

(Lemma 17) We expand the objective function and obtain

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \nonumber \\&\quad = \sum _{i=1}^M \sum _{j=1}^N (a_{i1}b_{1j} + \cdots + a_{iH}b_{Hj} - a^0_{i1}b^0_{1j} - \cdots - a^0_{iH}b^0_{Hj})^2 \nonumber \\&\quad = \sum _{j=1}^N \sum _{i=1}^M \Biggl \{ \sum _{k=1}^{H} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) \Biggr \}^2 \nonumber \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) \Biggr \}^2 \nonumber \\&\qquad + \sum _{j=1}^N \Biggl \{ \sum _{k=1}^{H} (a_{Mk}b_{kj} - a^0_{Mk}b^0_{kj}) \Biggr \}^2. \end{aligned}$$

(17)

Expand the second term in Eq. (17) using $a_{Mk}=1-\sum _{i=1}^{M-1}a_{ik}$, $b_{Hj}=1-\sum _{k=1}^{H-1} b_{kj}$, $a^0_{Mk}=1-\sum _{i=1}^{M-1}a^0_{ik}$, and $b^0_{Hj}=1-\sum _{k=1}^{H-1} b^0_{kj}$, then we have

$$\begin{aligned}&\sum _{j=1}^N \Biggl \{ \sum _{k=1}^{H} (a_{Mk}b_{kj} - a^0_{Mk}b^0_{kj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \Biggl \{ \sum _{k=1}^{H-1} (a_{Mk}b_{kj} - a^0_{Mk}b^0_{kj}) + (a_{MH}b_{Hj} - a^0_{MH}b^0_{Hj}) \Biggr \}^2 \\&\quad =\sum _{j=1}^N \Biggl (- \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} a_{ik}b_{kj} + \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} a^0_{ik}b^0_{kj}\\&\qquad -\sum _{i=1}^{M-1}a_{iH} +\sum _{i=1}^{M-1}\sum _{k=1}^{H-1} a_{iH}b_{kj}+ \sum _{i=1}^{M-1}a^0_{iH} \\&\qquad -\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} a^0_{iH}b^0_{kj} \Biggr )^2 {=}\!:\! \varPhi _2. \end{aligned}$$

Rearranging the terms, we have

$$\begin{aligned} \varPhi _2&=\sum _{j=1}^N \Bigg \{- \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a_{ik}-a_{iH})b_{kj} \\&\quad + \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} (a^0_{ik}-a^0_{iH})b^0_{kj} -\sum _{i=1}^{M-1}(a_{iH}-a^0_{iH}) \Bigg \}^2 \\&=\sum _{j=1}^N \Biggl [\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} \{(a_{ik}{-}a_{iH})b_{kj} \\&\qquad {-} (a^0_{ik}{-}a^0_{iH})b^0_{kj}\}{+}\sum _{i=1}^{M-1}(a_{iH}{-}a^0_{iH}) \Biggr ]^2. \end{aligned}$$

On the other hand, the first term of Eq. (17) is equal to

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) + (a_{iH}b_{Hj} - a^0_{iH}b^0_{Hj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) + a_{iH}\\&\qquad -\sum _{k=1}^{H-1} a_{iH}b_{kj} - a^0_{iH} + \sum _{k=1}^{H-1} a^0_{iH} b^0_{kj} \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl \{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} - a^0_{ik}b^0_{kj}) + (a_{iH} - a^0_{iH})\\&\qquad -\sum _{k=1}^{H-1} (a_{iH}b_{kj} - a^0_{iH} b^0_{kj}) \Biggr \}^2 \\&\quad = \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH})+\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} \\&\qquad - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2 . \end{aligned}$$

Consider the following ideal:

$$\begin{aligned} I=\, & \left\langle (a_{iH}-a^0_{iH})_{i=1}^{M-1} , \{(a_{ik}-a_{iH})b_{kj} \right. \\&\left. - (a^0_{ik}-a^0_{iH}) b^0_{kj}\}_{(i,j,k)=(1,1,1)}^{(M-1,N,H-1)} \right\rangle . \end{aligned}$$

Since we have

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&= \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH})\\&\quad +\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2 \\&\quad + \sum _{j=1}^N \Biggl [\sum _{i=1}^{M-1} \sum _{k=1}^{H-1} \{(a_{ik}{-}a_{iH})b_{kj} \\&\quad {-} (a^0_{ik}{-}a^0_{iH})b^0_{kj}\}{+}\sum _{i=1}^{M-1}(a_{iH}{-}a^0_{iH}) \Biggr ]^2 \end{aligned}$$

and

$$\begin{aligned}&\forall j, \sum _{i=1}^{M-1} \sum _{k=1}^{H-1} \{(a_{ik}{-}a_{iH})b_{kj} \\&\quad {-} (a^0_{ik}{-}a^0_{iH})b^0_{kj}\}{+}\sum _{i=1}^{M-1}(a_{iH}{-}a^0_{iH}) \in I, \end{aligned}$$

Corollary 12 yields

$$\begin{aligned}&\Vert AB-A_0B_0 \Vert ^2 \sim \sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH})\\&\quad +\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2. \end{aligned}$$
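The split of $\Vert AB-A_0B_0 \Vert ^2$ into the first term and $\varPhi _2$ can be checked numerically. The following is a minimal sketch, assuming column-stochastic factors so that $a_{Mk}$ and $b_{Hj}$ are determined by the remaining entries; the helper names (`random_stochastic`, `residual`, `split_terms`) are hypothetical and not taken from the paper. It also checks the elementary Cauchy–Schwarz bound $\varPhi _2 \leqq (M-1)\times$(first term), which is one concrete way to see why dropping $\varPhi _2$ changes the function only up to a constant factor; it is an illustration, not a restatement of Corollary 12.

```python
import random

def random_stochastic(rows, cols, rng):
    """Return a rows x cols matrix (list of rows) whose columns each sum to one."""
    columns = []
    for _ in range(cols):
        raw = [rng.random() + 0.1 for _ in range(rows)]
        total = sum(raw)
        columns.append([v / total for v in raw])
    return [[columns[c][r] for c in range(cols)] for r in range(rows)]

def frobenius_gap(A, B, A0, B0):
    """Squared Frobenius norm of AB - A0 B0, using all M rows and all H columns."""
    M, H, N = len(A), len(B), len(B[0])
    total = 0.0
    for i in range(M):
        for j in range(N):
            d = sum(A[i][k] * B[k][j] - A0[i][k] * B0[k][j] for k in range(H))
            total += d * d
    return total

def residual(A, B, A0, B0, i, j):
    """Bracket of the first term: (a_iH - a0_iH) + sum_{k<H} {(a_ik - a_iH) b_kj - (a0_ik - a0_iH) b0_kj}."""
    last = len(B) - 1  # zero-based index of column H
    s = A[i][last] - A0[i][last]
    for k in range(last):
        s += (A[i][k] - A[i][last]) * B[k][j] - (A0[i][k] - A0[i][last]) * B0[k][j]
    return s

def split_terms(A, B, A0, B0):
    """Return (first term, Phi_2) built only from rows i < M and inner indices k < H."""
    M, N = len(A), len(B[0])
    R = [[residual(A, B, A0, B0, i, j) for j in range(N)] for i in range(M - 1)]
    first = sum(v * v for row in R for v in row)
    phi2 = sum(sum(R[i][j] for i in range(M - 1)) ** 2 for j in range(N))
    return first, phi2

rng = random.Random(0)
M, H, N = 4, 3, 5  # illustrative sizes, chosen arbitrarily
A, A0 = random_stochastic(M, H, rng), random_stochastic(M, H, rng)
B, B0 = random_stochastic(H, N, rng), random_stochastic(H, N, rng)
first, phi2 = split_terms(A, B, A0, B0)
total = frobenius_gap(A, B, A0, B0)
print(abs(total - (first + phi2)) < 1e-12)   # decomposition holds up to rounding
print(phi2 <= (M - 1) * first + 1e-12)       # Cauchy-Schwarz bound on Phi_2
```

Under these assumptions the first term and the full squared norm bound each other within a factor of $M$, which is the sense in which the two are treated as equivalent here.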

To resolve the singularity of the above polynomial, we change coordinates as in the proof of Lemma 16.

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik}-a_{iH}, &{} (k<H) \\ a_{iH}=a_{iH}, \\ b_{kj}=b_{kj}, \\ \end{array}\right. } \end{aligned}$$

and put $a^0_{ik}=a^0_{ik}-a^0_{iH}$, reusing the same symbols for the new coordinates. Then

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH}) +\sum _{k=1}^{H-1} \{(a_{ik}-a_{iH})b_{kj} - (a^0_{ik}-a^0_{iH}) b^0_{kj}\} \Biggr ]^2 \\&=\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH}) +\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2.\\&\text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik},\\ b_{kj}=b_{kj}, \\ c_i=a_{iH}-a^0_{iH} \end{array}\right. }. \end{aligned}$$

Then we obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [(a_{iH} - a^0_{iH}) +\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2 \\&=\sum _{j=1}^N \sum _{i=1}^{M-1} \Biggl [c_i+\sum _{k=1}^{H-1}(a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2 \\&=\sum _{i=1}^{M-1} \sum _{j=1}^{N} \Biggl [c_i+\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}) \Biggr ]^2 \\&=\sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}b_{k1} {-} a^0_{ik}b^0_{k1}){+}c_i \right\} ^2 \\&\quad + \sum _{j=2}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}){+}c_i \right\} ^2 .\end{aligned}$$

In addition,

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik},\\ b_{kj}=b_{kj}, \\ x_i=\sum _{k=1}^{H-1} (a_{ik}b_{k1} - a^0_{ik}b^0_{k1}) + c_i \end{array}\right. }. \end{aligned}$$

If $j>1$, then we have

$$\begin{aligned}&\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}){+}c_i \\&= x_i -\sum _{k=1}^{H-1} (a_{ik}b_{k1} - a^0_{ik}b^0_{k1})\\&\quad +\sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj})\\&=x_i +\sum _{k=1}^{H-1}\{ (a_{ik}b_{kj} \\&\quad {-} a^0_{ik}b^0_{kj})- (a_{ik}b_{k1} - a^0_{ik}b^0_{k1})\} \end{aligned}$$

and obtain

$$\begin{aligned}&\sum _{j=1}^N \sum _{i=1}^{M-1} \left\{ \sum _{k=1}^{H-1} (a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}){+}c_i \right\} ^2 \\&=\sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N}\sum _{i=1}^{M-1} \left[ x_i +\sum _{k=1}^{H-1}\{ (a_{ik}b_{kj} \right. \\&\quad \left. {-} a^0_{ik}b^0_{kj})- (a_{ik}b_{k1} - a^0_{ik}b^0_{k1})\} \vphantom{\sum _{k=1}^{H-1}}\right] ^2. \end{aligned}$$

Put

$$\begin{aligned} g_{ij}:=\sum _{k=1}^{H-1}\left\{ \left( a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}\right) - \left( a_{ik}b_{k1} - a^0_{ik}b^0_{k1}\right) \right\} . \end{aligned}$$
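The rewriting in terms of $x_i$ and $g_{ij}$ is a pure change of variables, so it can be sanity-checked on random numbers. The sketch below is an illustration under the assumption that `a`, `b`, `a0`, `b0` hold the already shifted coordinates (sizes $(M-1)\times (H-1)$ and $(H-1)\times N$) and `c` holds $c_i$; all function names are hypothetical. Column index `0` in the code corresponds to $j=1$ in the text.

```python
import random

def bracket(a, b, a0, b0, c, i, j):
    """sum_k (a_ik b_kj - a0_ik b0_kj) + c_i for one pair (i, j)."""
    return sum(a[i][k] * b[k][j] - a0[i][k] * b0[k][j] for k in range(len(b))) + c[i]

def original_form(a, b, a0, b0, c):
    """Sum over all i and all j of the squared bracket."""
    return sum(bracket(a, b, a0, b0, c, i, j) ** 2
               for i in range(len(a)) for j in range(len(b[0])))

def g(a, b, a0, b0, i, j):
    """g_ij: difference between the column-j and column-1 sums (no c_i involved)."""
    return sum((a[i][k] * b[k][j] - a0[i][k] * b0[k][j])
               - (a[i][k] * b[k][0] - a0[i][k] * b0[k][0]) for k in range(len(b)))

def rewritten_form(a, b, a0, b0, c):
    """sum_i x_i^2 + sum_{j>=2} sum_i (x_i + g_ij)^2 with x_i the column-1 bracket."""
    rows, cols = len(a), len(b[0])
    xs = [bracket(a, b, a0, b0, c, i, 0) for i in range(rows)]
    total = sum(x * x for x in xs)
    for j in range(1, cols):
        for i in range(rows):
            total += (xs[i] + g(a, b, a0, b0, i, j)) ** 2
    return total

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(1)
rows, inner, cols = 3, 2, 4  # stand-ins for M-1, H-1, N
a, a0 = random_matrix(rows, inner, rng), random_matrix(rows, inner, rng)
b, b0 = random_matrix(inner, cols, rng), random_matrix(inner, cols, rng)
c = [rng.uniform(-1.0, 1.0) for _ in range(rows)]
print(abs(original_form(a, b, a0, b0, c) - rewritten_form(a, b, a0, b0, c)) < 1e-9)
```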

Consider the following ideal:

$$\begin{aligned} J:=\left\langle (x_i)_{i=1}^{M-1}, (g_{ij})_{(i,j)=(1,2)}^{(M-1,N)} \right\rangle . \end{aligned}$$

Expanding the squared terms gives

$$\begin{aligned} (x_i+g_{ij})^2=x_i^2 + (g_{ij})^2 +2x_i g_{ij}, \end{aligned}$$

where $x_i g_{ij} \in J$. Hence, owing to Corollary 12, we get

$$\begin{aligned} &\Vert AB-A_0 B_0 \Vert ^2\\&\quad\sim \sum _{i=1}^{M-1} x_i^2 + \sum _{j=2}^{N} \sum _{i=1}^{M-1} \{x_i+g_{ij}\}^2 \sim \sum _{i=1}^{M-1} \left\{ x_i^2 + \sum _{j=2}^{N} (g_{ij})^2 \right\} \\&\quad= \sum _{i=1}^{M-1} \left( x_i^2 + \sum _{j=2}^{N} \left[ \sum _{k=1}^{H-1}\left\{ \left(a_{ik}b_{kj} {-} a^0_{ik}b^0_{kj}\right) -\left(a_{ik}b_{k1} - a^0_{ik}b^0_{k1}\right)\right\}\right] ^2 \right) \\&\quad= \sum _{i=1}^{M-1} \left(x_i^2 + \sum _{j=2}^{N} \left[ \sum _{k=1}^{H-1}\{ a_{ik}(b_{kj} {-} b_{k1}) -a^0_{ik}(b^0_{kj}- b^0_{k1})\}\right] ^2 \right) . \end{aligned}$$
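The second equivalence above, which drops the cross terms $2x_i g_{ij}$, can be made concrete with explicit constants. The sketch below treats $x_i$ and $g_{ij}$ as free real numbers (an assumption made purely for illustration) and checks that the two expressions bound each other within the factor $2(N-1)+1$, which follows from $(u+v)^2 \leqq 2u^2+2v^2$. The function names are hypothetical, and the constant is one convenient choice rather than the sharpest one.

```python
import random

def with_cross_terms(x, g):
    """sum_i x_i^2 + sum_i sum_j (x_i + g_ij)^2, where g[i] lists g_ij for j = 2..N."""
    return sum(xi * xi for xi in x) + sum(
        (x[i] + gij) ** 2 for i in range(len(x)) for gij in g[i])

def without_cross_terms(x, g):
    """sum_i x_i^2 + sum_i sum_j g_ij^2."""
    return sum(xi * xi for xi in x) + sum(gij * gij for row in g for gij in row)

rng = random.Random(2)
rows, n_minus_1 = 3, 4  # stand-ins for M-1 and N-1
x = [rng.uniform(-1.0, 1.0) for _ in range(rows)]
g = [[rng.uniform(-1.0, 1.0) for _ in range(n_minus_1)] for _ in range(rows)]

constant = 2 * n_minus_1 + 1
lhs, rhs = with_cross_terms(x, g), without_cross_terms(x, g)
print(lhs <= constant * rhs + 1e-12)
print(rhs <= constant * lhs + 1e-12)
```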

$$\begin{aligned} \text{ Let } {\left\{ \begin{array}{ll} a_{ik}=a_{ik}, \\ b_{k1}=b_{k1}, \\ b_{kj}=b_{kj}-b_{k1}, &{} (j>1) \\ x_i =x_i \end{array}\right. } \end{aligned}$$

and put $b^0_{kj}=b^0_{kj}- b^0_{k1}$. Then we have

$$\begin{aligned} \Vert AB-A_0 B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} \left( x_i^2 + \sum _{j=2}^{N} \left[ \sum _{k=1}^{H-1}\{ a_{ik}(b_{kj} \right. \right. \\&\quad \left. \left. {-} b_{k1}) -a^0_{ik}(b^0_{kj}- b^0_{k1})\}\right] ^2 \right) \\&=\sum _{i=1}^{M-1} \left[ x_i^2 + \sum _{j=2}^{N} \left\{ \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})\right\} ^2 \right] \\&=\sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^{N} \left\{ \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})\right\} ^2. \end{aligned}$$

There exists a positive constant $C>0$ (for example, $C=H-1$ by the Cauchy–Schwarz inequality) such that

$$\begin{aligned} \Vert AB-A_0B_0 \Vert ^2&\sim \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^{N} \left\{ \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})\right\} ^2 \\&\leqq \sum _{i=1}^{M-1} x_i^2 +C\sum _{i=1}^{M-1} \sum _{j=2}^{N} \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})^2 \\&\sim \sum _{i=1}^{M-1} x_i^2 +\sum _{i=1}^{M-1} \sum _{j=2}^{N} \sum _{k=1}^{H-1}(a_{ik}b_{kj} -a^0_{ik}b^0_{kj})^2 \\&= \sum _{i=1}^{M-1} x_i^2 + \sum _{k=1}^{H-1} \sum _{i=1}^{M-1} \sum _{j=2}^N (a_{ik} b_{kj} -a^0_{ik} b^0_{kj})^2. \end{aligned}$$
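The inequality step replaces the square of a sum over $k$ by a constant times the sum of squares. A minimal sketch of that bound, using the Cauchy–Schwarz constant equal to the number of summands (here $H-1$), is given below; the function names are hypothetical, and integer inputs are used so that the comparison is exact.

```python
def square_of_sum(values):
    """(sum_k y_k)^2"""
    return sum(values) ** 2

def sum_of_squares(values):
    """sum_k y_k^2"""
    return sum(v * v for v in values)

def cauchy_schwarz_holds(values):
    """Check (sum_k y_k)^2 <= n * sum_k y_k^2, where n is the number of summands."""
    return square_of_sum(values) <= len(values) * sum_of_squares(values)

# The constant n cannot be improved in general: equal entries give equality.
print(cauchy_schwarz_holds([3, -1, 4]))
print(square_of_sum([1, 1, 1]) == 3 * sum_of_squares([1, 1, 1]))
```

The inequality only goes one way, which is why this step produces an upper bound on the RLCT rather than its exact value.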

As in the proof of Lemma 16, we blow up the coordinates to resolve the singularity of

$$\begin{aligned} \sum _{i=1}^{M-1} x_i^2 + \sum _{k=1}^{H-1} \sum _{i=1}^{M-1} \sum _{j=2}^N (a_{ik} b_{kj} -a^0_{ik} b^0_{kj})^2. \end{aligned}$$

Let ${\bar{\lambda }}_1$ be the RLCT of the first term and ${\bar{\lambda }}_2$ be the RLCT of the second term. Since the first term is a sum of $M-1$ squares of independent coordinates, ${\bar{\lambda }}_1$ is equal to $(M-1)/2$. To derive ${\bar{\lambda }}_2$, we use the result of Lemma 16: for each fixed $k$, the RLCT of $\sum _{i=1}^{M-1} \sum _{j=2}^N (a_{ik} b_{kj} -a^0_{ik} b^0_{kj})^2$ is equal to $(M+N-3)/2$. The $H-1$ summands over $k$ involve disjoint sets of variables, so their RLCTs add. Thus we have

$$\begin{aligned} {\bar{\lambda }}_2 = (H-1) \frac{M+N-3}{2}. \end{aligned}$$
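One way to see why a sum of $d$ squares of independent coordinates has RLCT $d/2$ is the standard volume-scaling reading of the RLCT: when there is no logarithmic factor, the volume of $\{w : K(w) < t\}$ behaves like a constant times $t^{\lambda }$ as $t \rightarrow 0$. The text above does not spell this characterization out, so the sketch below is only an illustration, assuming a uniform prior near the origin and using the closed-form volume of a Euclidean ball; the function names are hypothetical.

```python
import math

def ball_volume(dim, radius):
    """Volume of a dim-dimensional Euclidean ball of the given radius."""
    return math.pi ** (dim / 2) / math.gamma(dim / 2 + 1) * radius ** dim

def sublevel_volume(dim, t):
    """Volume of {x in R^dim : x_1^2 + ... + x_dim^2 < t}, a ball of radius sqrt(t)."""
    return ball_volume(dim, math.sqrt(t))

def scaling_exponent(dim, t_small=1e-4, t_large=1e-2):
    """Slope of log(volume) against log(t) between two small thresholds."""
    rise = math.log(sublevel_volume(dim, t_large)) - math.log(sublevel_volume(dim, t_small))
    run = math.log(t_large) - math.log(t_small)
    return rise / run

M = 5  # illustrative size; the first term then has M - 1 = 4 squares
print(scaling_exponent(M - 1), (M - 1) / 2)
```

The slope agrees with $(M-1)/2$ up to floating-point error, matching the value of ${\bar{\lambda }}_1$ used above.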

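Let $\lambda$ be the RLCT of $\Vert AB-A_0B_0\Vert ^2$. In general, the RLCT is order isomorphic, that is, a function bounded above by a constant multiple of another has an RLCT no larger than that of the other. Since the two terms above share no variables, their RLCTs add, and therefore we get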

$$\begin{aligned} \lambda&\leqq {\bar{\lambda }}_1 + {\bar{\lambda }}_2 \\&= \frac{M-1}{2} + (H-1) \frac{M+N-3}{2} \\&= \frac{M-1+(H-1)(M+N-3)}{2}. \end{aligned}$$

$\square$
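The closing bound can be evaluated exactly for concrete sizes. The sketch below is a hedged illustration: `rlct_upper_bound` transcribes the final expression of the proof, while `half_dimension` rests on the assumption that the free parameters are the $(M-1)H$ entries of $A$ and the $(H-1)N$ entries of $B$ left after the column-sum constraints (the count suggested by the eliminations of $a_{Mk}$ and $b_{Hj}$ above). Both function names are hypothetical.

```python
from fractions import Fraction

def rlct_upper_bound(M, N, H):
    """Upper bound from the proof: (M - 1 + (H - 1)(M + N - 3)) / 2."""
    return Fraction(M - 1 + (H - 1) * (M + N - 3), 2)

def half_dimension(M, N, H):
    """Half the parameter dimension, ASSUMING (M-1)*H + (H-1)*N free parameters."""
    return Fraction((M - 1) * H + (H - 1) * N, 2)

for M, N, H in [(4, 5, 3), (10, 20, 4), (3, 3, 2)]:
    bound = rlct_upper_bound(M, N, H)
    half_d = half_dimension(M, N, H)
    print(M, N, H, bound, half_d, half_d - bound)
```

Under that parameter count the gap between half the dimension and the bound simplifies algebraically to $H-1$, so the bound sits strictly below $d/2$ whenever $H \geqq 2$.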

Cite this article

Hayashi, N., Watanabe, S. Asymptotic Bayesian Generalization Error in Latent Dirichlet Allocation and Stochastic Matrix Factorization. SN COMPUT. SCI. 1, 69 (2020). https://doi.org/10.1007/s42979-020-0071-3

Received: 26 August 2019
Accepted: 30 January 2020
Published: 20 February 2020
DOI: https://doi.org/10.1007/s42979-020-0071-3
