Identifiability and parameter estimation of the overlapped stochastic co-block model

Abstract

The stochastic block model (SBM) has been extensively studied for undirected network data with community structure, yet its extension to directed networks, the stochastic co-block model (ScBM), has only been proposed recently. The key feature of the ScBM is the introduction of out- and in-communities to capture different sending and receiving patterns among nodes. In this paper, we further extend the ScBM so that each node may belong to multiple out- or in-communities. In particular, we formulate the ScBM as a generative model, where the unknown community assignment is modeled as either exclusive or overlapped. We also establish the identifiability of the generative ScBM and estimate its parameters via an efficient variational EM algorithm. The advantage of the generative ScBM is demonstrated on a variety of simulated networks and a real political blog network.

Acknowledgements

This research is supported in part by HK RGC grants GRF-11303918, GRF-11300919 and GRF-11304520. The authors are grateful to the co-ordinating editor and two anonymous referees for their insightful comments and constructive suggestions, which have improved the manuscript significantly.

Author information

Corresponding author

Correspondence to Jingnan Zhang.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix

Proof of Proposition 1. Given \(P_s({\varvec{A}};{\varvec{\theta }})\) with \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},{\varvec{W}})\in {\varvec{\Theta }}^{\text {s}}{\setminus }{\varvec{\Theta }}_0^{\text {s}}\), we first show that \({\varvec{\alpha }}\) can be uniquely determined up to a permutation. For any in-node \(j\ne 1\), note that

$$\begin{aligned} P(a_{1j}=1|T_{1k}=1)&=\sum _{l=1}^LP(a_{1j}=1|T_{1k}=1,S_{jl}=1)\\&\quad P(S_{jl}=1)=\sum _{l=1}^Lw_{1kl}\beta _l={\varvec{W}}_{1k\cdot }{\varvec{\beta }}, \end{aligned}$$

which does not depend on \(j\), where \({\varvec{W}}_{1k\cdot }\) is the \(k\)-th row of \({\varvec{W}}_1\). Thus, we define \({\varvec{\rho }}=(\rho _1,\ldots ,\rho _K)^T\) with \(\rho _k=P(a_{1j}=1|T_{1k}=1)\) for \(1\le k\le K\). Furthermore, for any \(m\) distinct in-nodes \(j_1,\ldots ,j_m \in \{2,\ldots ,n\}\), we have

$$\begin{aligned}&P(a_{1j_1}=\cdots =a_{1j_m}=1|T_{1k}=1)\\&\quad =\sum _{l_1,\ldots ,l_m=1}^LP(a_{1j_1}=\cdots =a_{1j_m}=1|T_{1k}=1,S_{j_1l_1}=\cdots =S_{j_ml_m}=1)\prod _{i=1}^m\beta _{l_i}\\&\quad =\sum _{l_1,\ldots ,l_m=1}^L\prod _{i=1}^mw_{1kl_i}\beta _{l_i}=\prod _{i=1}^m\sum _{l_i=1}^Lw_{1kl_i}\beta _{l_i}=\rho _k^m. \end{aligned}$$

Since \(n\ge 2K\), define \(q_1=1\) and \(q_{m+1}=P(a_{12}=1,\ldots ,a_{1,m+1}=1)\) for \(1\le m\le 2K-1\). Then

$$\begin{aligned} q_{m+1}&=\sum _{k=1}^KP(a_{12}=1,\ldots ,a_{1,m+1}=1|T_{1k}=1)P(T_{1k}=1)\\&=\sum _{k=1}^K\alpha _k\rho _k^m. \end{aligned}$$

Note that \(\{q_m\}_{m=1}^{2K}\) can be fully determined by \(P_s({\varvec{A}};{\varvec{\theta }})\), whereas \(\alpha _k\) and \(\rho _k\) are not due to their dependence on the unknown \({\varvec{T}}\).

Let \({\varvec{B}}=(b_{ij})_{i,j}\) be a \((K+1)\times K\) matrix with \(b_{ij}=q_{i+j-1}\), and let \({\varvec{B}}_{-i}\) denote the square matrix obtained by deleting the \(i\)-th row of \({\varvec{B}}\). Since \(b_{ij}=\sum _{k=1}^K\rho _k^{i-1}\alpha _k\rho _k^{j-1}\), we have \({\varvec{B}}_{-(K+1)}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\tilde{{\varvec{\rho }}}}^T\), where \(\tilde{{\varvec{\rho }}}=({\tilde{\rho }}_{ik})_{i,k=1}^K\) with \({\tilde{\rho }}_{ik}=\rho _k^{i-1}\) is an invertible Vandermonde matrix, and \({\varvec{A}}_0=\text {diag}({\varvec{\alpha }})\). Let \(D_i=\text {det}({\varvec{B}}_{-i})\) and \(f(x)=\sum _{i=1}^{K+1}(-1)^{i+K+1}D_ix^{i-1}\). Note that the \(j\)-th column of \({\varvec{B}}\) can be rewritten as \({\varvec{B}}_{\cdot j}=\sum _{k=1}^K \alpha _k \rho _k^{j-1} {\varvec{x}}_k\) with \({\varvec{x}}_k=(1,\rho _k,\ldots ,\rho _k^K)^T\), and that, by cofactor expansion along the last column, \(f(\rho _k)\) is the determinant of the square matrix \(({\varvec{B}}, {\varvec{x}}_k)\). Since all \(K+1\) columns of \(({\varvec{B}}, {\varvec{x}}_k)\) are linear combinations of the \(K\) vectors \({\varvec{x}}_1,\ldots ,{\varvec{x}}_K\), this matrix is singular and \(f(\rho _k)=0\) for any \(k\). As \(f\) has degree \(K\) with leading coefficient \(D_{K+1}\ne 0\), it follows immediately that

$$\begin{aligned} f(x)=D_{K+1}\prod _{i=1}^K(x-\rho _i). \end{aligned}$$
(16)

Since \(\{q_m\}_{m=1}^{2K}\) is fully determined by \(P_s({\varvec{A}};{\varvec{\theta }})\), so are \({\varvec{B}}\) and \(f(x)\). Therefore, it follows from (16) that \({\varvec{\rho }}\) can be determined up to a permutation of \(\{1,\ldots ,K\}\). Solving \({\varvec{B}}_{-(K+1)}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\tilde{{\varvec{\rho }}}}^T\) then gives \({\varvec{A}}_0=\tilde{{\varvec{\rho }}}^{-1}{\varvec{B}}_{-(K+1)}(\tilde{{\varvec{\rho }}}^T)^{-1}\), whose diagonal elements \(\alpha _k\) are likewise determined up to a permutation of \(\{1,\ldots ,K\}\).
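The recovery of \({\varvec{\rho }}\) and \({\varvec{\alpha }}\) described above is constructive, and a small numerical sketch may help make it concrete. The snippet below is not part of the original paper: the function name `recover_alpha_rho` and the toy values are illustrative assumptions, and it presumes exact (noise-free) moments \(q_m=\sum _k\alpha _k\rho _k^{m-1}\) with distinct \(\rho _k\). It builds \({\varvec{B}}\), forms \(f\) from the determinants \(D_i\), takes its roots as \({\varvec{\rho }}\), and then solves for \({\varvec{A}}_0\).

```python
import numpy as np

def recover_alpha_rho(q, K):
    """Recover (alpha, rho) from q[m-1] = q_m = sum_k alpha_k * rho_k**(m-1), m = 1..2K."""
    q = np.asarray(q, dtype=float)
    # (K+1) x K matrix B with b_ij = q_{i+j-1}; with 0-based i, j this is q[i + j].
    B = np.array([[q[i + j] for j in range(K)] for i in range(K + 1)])
    # Coefficient of x^{i-1} in f(x) is (-1)^{i+K+1} * D_i, where D_i = det(B without row i).
    coeffs = [
        (-1) ** (i + K + 1) * np.linalg.det(np.delete(B, i - 1, axis=0))
        for i in range(1, K + 2)
    ]
    # np.roots expects the highest-degree coefficient first; the roots of f are the rho_k.
    rho = np.sort(np.roots(coeffs[::-1]).real)
    # Vandermonde matrix with entries rho_tilde[i, k] = rho_k**i (0-based i).
    rho_tilde = np.vander(rho, K, increasing=True).T
    # A_0 = rho_tilde^{-1} B_{-(K+1)} (rho_tilde^T)^{-1}, computed with two linear solves.
    X = np.linalg.solve(rho_tilde, B[:K])
    A0 = np.linalg.solve(rho_tilde, X.T).T
    return np.diag(A0), rho

# Toy check with K = 2 (values chosen only for illustration).
alpha_true = np.array([0.3, 0.7])
rho_true = np.array([0.2, 0.6])
q = [np.sum(alpha_true * rho_true ** m) for m in range(4)]  # q_1, ..., q_4
alpha_hat, rho_hat = recover_alpha_rho(q, K=2)
```

Because the roots are sorted, the output fixes one particular labeling; any other ordering of the communities would correspond to a permutation of \(\{1,\ldots ,K\}\), as in the statement of the proposition.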

For \({\varvec{\beta }}\), note that for any out-node \(i\ne 2\),

$$\begin{aligned} P(a_{i2}=1|S_{2l}=1)&=\sum _{k=1}^KP(a_{i2}=1|T_{ik}=1,S_{2l}=1)\\&\quad P(T_{ik}=1)=\sum _{k=1}^Kw_{1kl}\alpha _k ={\varvec{W}}_{1l\cdot }^T{\varvec{\alpha }}, \end{aligned}$$

which does not depend on \(i\), and \({\varvec{W}}_{1l\cdot }^T\) is the \(l\)-th row of \({\varvec{W}}_1^T\). Thus, we can define \({\varvec{\rho }}'=(\rho _1',\ldots ,\rho _L')^T\) with \(\rho _l'=P(a_{i2}=1|S_{2l}=1)\) for \(1\le l\le L\). Let \(\tilde{{\varvec{\rho }}}'=({\tilde{\rho }}'_{jl})_{j,l=1}^L\) with \({\tilde{\rho }}'_{jl}=(\rho _l')^{j-1}\) and \({\varvec{B}}_0=\text {diag}({\varvec{\beta }})\). Then as \(n\ge 2L\), \({\varvec{\rho }}'\) and \({\varvec{\beta }}\) can also be determined up to a permutation of \(\{1,\ldots ,L\}\) following a similar treatment as for \({\varvec{\alpha }}\).

For \({\varvec{W}}\), let \({\varvec{H}}=(h_{ij})_{i,j}\) with \(h_{ij}=P(a_{12}=\cdots =a_{1,i+1}=1, a_{32}=\cdots =a_{j+1,2}=1)\) for \(1\le i\le K\) and \(1\le j\le L\). Note that \(h_{i1}=P(a_{12}=\cdots =a_{1,i+1}=1)\). Then we have

$$\begin{aligned} h_{ij}&=\sum _{k,l}P(a_{12}=\cdots =a_{1,i+1}=1, a_{32}=\cdots =a_{j+1,2}=1|T_{1k}=1, S_{2l}=1)\\&\quad P(T_{1k}=1, S_{2l}=1)\\&=\sum _{k,l}P(a_{13}=\cdots =a_{1, i+1}=1|T_{1k}=1)\\&\quad P(a_{32}=\cdots =a_{j+1, 2}=1|S_{2l}=1)w_{1kl}\alpha _k\beta _l\\&=\sum _{k,l}\rho _k^{i-1}\alpha _kw_{1kl}\beta _l(\rho _l')^{j-1}, \end{aligned}$$

and thus \({\varvec{H}}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\varvec{W}}_1{\varvec{B}}_0(\tilde{{\varvec{\rho }}}')^T\). As \({\varvec{H}}\) can be fully determined by \(P_s({\varvec{A}};{\varvec{\theta }})\), it immediately follows that \({\varvec{W}}_1={\varvec{A}}_0^{-1}\tilde{{\varvec{\rho }}}^{-1}{\varvec{H}}\big ((\tilde{{\varvec{\rho }}}')^T\big )^{-1}{\varvec{B}}_0^{-1}\) can be fully determined up to permutations of its rows and columns. As \(w_{kl}=\text {logit}(w_{1kl})\), this completes the proof of Proposition 1.
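As a hedged numerical illustration of the last step (not taken from the paper), the sketch below builds \({\varvec{H}}=\tilde{{\varvec{\rho }}}{\varvec{A}}_0{\varvec{W}}_1{\varvec{B}}_0(\tilde{{\varvec{\rho }}}')^T\) from toy parameters and inverts the factorization to get \({\varvec{W}}_1\) back. The function name `recover_W1` and all numerical values are assumptions made for the example; \({\varvec{\rho }}={\varvec{W}}_1{\varvec{\beta }}\) and \({\varvec{\rho }}'={\varvec{W}}_1^T{\varvec{\alpha }}\) follow the definitions in the proof.

```python
import numpy as np

def recover_W1(H, alpha, rho, beta, rho_prime):
    """Invert H = rho_tilde A_0 W_1 B_0 rho_tilde_prime^T for W_1."""
    K, L = len(alpha), len(beta)
    rho_tilde = np.vander(rho, K, increasing=True).T          # entries rho_k**(i-1)
    rho_tilde_p = np.vander(rho_prime, L, increasing=True).T  # entries rho'_l**(j-1)
    left = np.linalg.solve(rho_tilde, H)                      # rho_tilde^{-1} H
    core = np.linalg.solve(rho_tilde_p, left.T).T             # ... times (rho_tilde_prime^T)^{-1}
    return core / np.outer(alpha, beta)                       # strip A_0 and B_0

# Toy parameters (illustrative only); rho and rho' have distinct entries here.
W1 = np.array([[0.1, 0.5, 0.9],
               [0.7, 0.2, 0.4]])
alpha = np.array([0.4, 0.6])
beta = np.array([0.2, 0.3, 0.5])
rho = W1 @ beta          # rho_k  = W_{1k.} beta
rho_prime = W1.T @ alpha # rho'_l = sum_k w_{1kl} alpha_k

rho_tilde = np.vander(rho, 2, increasing=True).T
rho_tilde_p = np.vander(rho_prime, 3, increasing=True).T
H = rho_tilde @ np.diag(alpha) @ W1 @ np.diag(beta) @ rho_tilde_p.T

W1_hat = recover_W1(H, alpha, rho, beta, rho_prime)
```

In this sketch the Vandermonde systems are solved rather than explicitly inverted, which is a numerical convenience and not something the proof depends on.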

Proof of Theorem 1. Suppose there exist \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}})\ne {\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')\in {\varvec{\Theta }}^{\text {g}}{\setminus }{\varvec{\Theta }}_0^{\text {g}} \) such that \(P_g({\varvec{A}};{\varvec{\theta }})=P_g({\varvec{A}};{\varvec{\theta }}')\). It then suffices to show that \({\varvec{\theta }}'\) and \({\varvec{\theta }}\) are identical up to a permutation of the community labels. Note that in (2),

$$\begin{aligned} \gamma _{ij}&=\text {logit}(p_{ij})={\varvec{T}}_i^T{\varvec{W}}{\varvec{S}}_j+{\varvec{T}}_i^T{\varvec{U}}+{\varvec{V}}^T{\varvec{S}} _j+w_0\\&={\varvec{T}}_i^T\big ({\varvec{W}}+{\varvec{U}}{\varvec{1}}_L^T+{\varvec{1}}_K{\varvec{V}}^T+w_0{\varvec{1}}_K{\varvec{1}}_L^T\big ){\varvec{S}}_j, \end{aligned}$$

where the last equality follows from the exclusive constraints that \({\varvec{T}}_i^T{\varvec{1}}_K={\varvec{S}}_j^T{\varvec{1}}_L=1\). Thus, Proposition 1 immediately implies that there exist permutations \(\sigma _1\) and \(\sigma _2\) on \(\{1,\ldots ,K\}\) and \(\{1,\ldots ,L\}\), such that \(\alpha _k'=\alpha _{\sigma _1(k)}\), \(\beta _l'=\beta _{\sigma _2(l)}\) and \(w_{2kl}'=w_{2\sigma _1(k)\sigma _2(l)}\) for each \(1\le k\le K\) and \(1\le l\le L\), where \(w_{2kl}=\text {logistic}(w_{kl}+u_k+v_l+w_0)\).

It then remains to show \({\tilde{w}}_{kl}'={\tilde{w}}_{\sigma _1(k)\sigma _2(l)}\) for any k and l. In fact, it follows from the definition of \({\varvec{W}}_2\) that

$$\begin{aligned} w_{kl}'+u_k'+v_l'+w_0'=w_{\sigma _1(k)\sigma _2(l)}+u_{\sigma _1(k)}+v_{\sigma _2(l)}+w_0. \end{aligned}$$
(17)

Taking summation over \(k\) and \(l\) on both sides of (17) implies that \(w_0'=w_0\). With \(w_0'\) and \(w_0\) canceled, taking summation over \(k\) or \(l\) respectively implies that \(u_k'=u_{\sigma _1(k)}\) or \(v_l'=v_{\sigma _2(l)}\), which also leads to \(w_{kl}'=w_{\sigma _1(k)\sigma _2(l)}\). Thus, \({\tilde{w}}_{kl}'={\tilde{w}}_{\sigma _1(k)\sigma _2(l)}\) by setting \(\sigma _1(K+1)=K+1\) and \(\sigma _2(L+1)=L+1\). The desired result then follows immediately.
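The summation argument can be sketched numerically. The snippet below is an illustration only, and it rests on an assumption that is not restated in this appendix: that the parameters satisfy zero-sum constraints (\(\sum _ku_k=0\), \(\sum _lv_l=0\), and every row and column of \({\varvec{W}}\) summing to zero), which is what the summation step appears to rely on. Under that assumption, the matrix with entries \(w_{kl}+u_k+v_l+w_0\) splits uniquely into its four parts via row, column and grand means. The function name `split_effects` is hypothetical.

```python
import numpy as np

def split_effects(M):
    """Split M[k, l] = w_kl + u_k + v_l + w0, ASSUMING zero-sum constraints on u, v and W."""
    w0 = M.mean()                         # grand mean isolates w0
    u = M.mean(axis=1) - w0               # row means isolate u_k
    v = M.mean(axis=0) - w0               # column means isolate v_l
    W = M - u[:, None] - v[None, :] - w0  # what is left is the interaction w_kl
    return W, u, v, w0

# Toy parameters satisfying the assumed constraints (illustrative values).
W = np.array([[0.2, -0.5, 0.3],
              [-0.2, 0.5, -0.3]])
u = np.array([1.0, -1.0])
v = np.array([0.5, -0.2, -0.3])
w0 = -1.0
M = W + u[:, None] + v[None, :] + w0

W_hat, u_hat, v_hat, w0_hat = split_effects(M)
```

Since the decomposition is unique under these constraints, equality of the combined matrices on both sides of (17) forces equality of each component, up to the permutations \(\sigma _1\) and \(\sigma _2\).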

Proof of Lemma 1. We first show that \(\psi \) is an injective mapping. Suppose there exist \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}}), {\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')\in {\varvec{\Theta }}^\text {o}\), such that \(\psi ({\varvec{\theta }})=({\varvec{\eta }},{\varvec{\zeta }},{\varvec{\Pi }})=({\varvec{\eta }}',{\varvec{\zeta }}',{\varvec{\Pi }}')=\psi ({\varvec{\theta }}')\). By the definition of \(\psi \), we have,

$$\begin{aligned}&\prod _{k=1}^K \alpha _k^{b_k}(1-\alpha _k)^{(1-b_k)}=\prod _{k=1}^K (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)},\\&\prod _{l=1}^L\beta _l^{c_l}(1-\beta _l)^{(1-c_l)}=\prod _{l=1}^L (\beta _l')^{c_l}(1-\beta _l')^{(1-c_l)},\\&({\varvec{b}}^T,1)\widetilde{{\varvec{W}}}({\varvec{c}}^T,1)^T=({\varvec{b}}^T,1) \widetilde{{\varvec{W}}}'({\varvec{c}}^T,1)^T \end{aligned}$$

for any \({\varvec{b}}\in \{0,1\}^K\) and \({\varvec{c}}\in \{0,1\}^L\). Particularly, setting \({\varvec{b}}={\varvec{1}}_K\) leads to \(\prod _{k=1}^K \alpha _k=\prod _{k=1}^K\alpha _k'\), and setting \({\varvec{b}}=({\varvec{1}}_{K-1}^T,0)^T\) leads to \((1-\alpha _K)\prod _{k=1}^{K-1} \alpha _k=(1-\alpha _K')\prod _{k=1}^{K-1}\alpha _k'\). Thus, \(\alpha _K=\alpha _K'\), and then \(\prod _{k=1}^{K-1} \alpha _k^{b_k}(1-\alpha _k)^{(1-b_k)}=\prod _{k=1}^{K-1} (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)}\). Repeating the same treatment for \(K-1,\ldots ,1\) yields that \({\varvec{\alpha }}={\varvec{\alpha }}'\), and similarly \({\varvec{\beta }}={\varvec{\beta }}'\). For \(\widetilde{{\varvec{W}}}\), setting \({\varvec{b}}={\varvec{c}}={\varvec{0}}\) leads to \(w_0=w_0'\), and setting \({\varvec{c}}={\varvec{0}}\) and \({\varvec{b}}\) with only \(b_k=1\) leads to \(u_k=u_k'\). Similarly, setting \({\varvec{b}}={\varvec{0}}\) and \({\varvec{c}}\) with only \(c_l=1\) leads to \(v_l=v_l'\), and setting \({\varvec{b}}\) and \({\varvec{c}}\) with only \(b_k=c_l=1\) leads to \(w_{kl}'=w_{kl}\). Therefore, \(\widetilde{{\varvec{W}}}=\widetilde{{\varvec{W}}}'\) and \({\varvec{\theta }}={\varvec{\theta }}'\), and thus \(\psi \) is an injective mapping.

For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\), we have \({\varvec{\omega }}=\psi ({\varvec{\theta }})\in {\varvec{\Omega }}\), and then

$$\begin{aligned} P_s({\varvec{A}};{\varvec{\omega }})&=\sum _{\widetilde{{\varvec{T}}},\widetilde{{\varvec{S}}}}\Big (\prod _{1\le i,j\le n}\frac{\exp (\gamma _{ij}a_{ij})}{1+\exp (\gamma _{ij})}\Big )\Big (\prod _{i,s}\eta _{s}^{\widetilde{T}_{is}}\Big )\Big (\prod _{j,t}\zeta _{t}^{\widetilde{S}_{jt}}\Big )\\&=\sum _{{\varvec{T}},{\varvec{S}}}\Big (\prod _{1\le i,j\le n}\frac{\exp (\gamma _{ij}a_{ij})}{1+\exp (\gamma _{ij})}\Big )\Big (\prod _i\eta _{s_i}\Big )\Big (\prod _j\zeta _{t_j}\Big )\\&=\sum _{{\varvec{T}},{\varvec{S}}}\Big (\prod _{1\le i,j\le n}\frac{\exp (a_{ij}\gamma _{ij})}{1+\exp (\gamma _{ij})}\Big )\Big (\prod _{i,k}\alpha _{k}^{T_{ik}}(1-\alpha _k)^{1-T_{ik}}\Big )\\&\quad \Big (\prod _{j,l}\beta _{l}^{S_{jl}}(1-\beta _l)^{1-S_{jl}}\Big )=P_o({\varvec{A}};{\varvec{\theta }}). \end{aligned}$$

This completes the proof of Lemma 1.
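To make the mapping \(\psi \) more tangible, here is a minimal sketch of it in Python, written from the three displayed identities in the proof (products of \(\alpha _k\) and \(1-\alpha _k\) for \({\varvec{\eta }}\), likewise for \({\varvec{\zeta }}\), and \(({\varvec{b}}^T,1)\widetilde{{\varvec{W}}}({\varvec{c}}^T,1)^T\) for \({\varvec{\Pi }}\)). The function name `psi`, the 0-based indexing and the block layout assumed for \(\widetilde{{\varvec{W}}}\) (interaction block, then the \(u\) column, the \(v\) row and \(w_0\) in the corner) are assumptions of this example rather than statements from the paper.

```python
import itertools
import numpy as np

def psi(alpha, beta, W_tilde):
    """Map overlapped parameters to exclusive ones over 2^K x 2^L community combinations."""
    alpha = np.asarray(alpha, dtype=float)
    beta = np.asarray(beta, dtype=float)
    W_tilde = np.asarray(W_tilde, dtype=float)  # shape (K+1, L+1)
    K, L = len(alpha), len(beta)

    def index(bits):
        # 0-based version of s = sum_k b_k 2^{k-1} + 1.
        return sum(int(bit) << k for k, bit in enumerate(bits))

    eta = np.zeros(2 ** K)
    zeta = np.zeros(2 ** L)
    Pi = np.zeros((2 ** K, 2 ** L))
    for b in itertools.product([0, 1], repeat=K):
        b_arr = np.array(b)
        eta[index(b)] = np.prod(alpha ** b_arr * (1 - alpha) ** (1 - b_arr))
        for c in itertools.product([0, 1], repeat=L):
            c_arr = np.array(c)
            zeta[index(c)] = np.prod(beta ** c_arr * (1 - beta) ** (1 - c_arr))
            # pi_st = (b^T, 1) W_tilde (c^T, 1)^T
            Pi[index(b), index(c)] = np.append(b_arr, 1) @ W_tilde @ np.append(c_arr, 1)
    return eta, zeta, Pi

# Toy example with K = 2 and L = 1 (values are illustrative).
W_tilde = np.array([[0.5, 1.0],
                    [-0.5, -1.0],
                    [0.3, -2.0]])
eta, zeta, Pi = psi([0.2, 0.3], [0.4], W_tilde)
```

Each of `eta` and `zeta` sums to one, so they can play the role of exclusive membership probabilities over the \(2^K\) and \(2^L\) membership patterns.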

Lemma 2

For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\), let \({\varvec{\theta }}'=g_{{\varvec{yz}}}({\varvec{\theta }})\) be defined as in Section 2.2. Then there exists \(\sigma =(\sigma _1,\sigma _2)\) with permutations \(\sigma _1\) and \(\sigma _2\) on \(\{1,\ldots ,2^K\}\) and \(\{1,\ldots ,2^L\}\), such that \(\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )\).

Proof of Lemma 2. Let \({\varvec{\theta }}=({\varvec{\alpha }},{\varvec{\beta }},\widetilde{{\varvec{W}}})\), and \({\varvec{\theta }}'=({\varvec{\alpha }}',{\varvec{\beta }}',\widetilde{{\varvec{W}}}')=g_{{\varvec{yz}}}({\varvec{\theta }})\) for some \({\varvec{y}}\in \{0,1\}^K\) and \({\varvec{z}}\in \{0,1\}^L\), with \(g_{{\varvec{yz}}}({\varvec{\theta }})\) defined as in (10). Correspondingly, let \({\varvec{\omega }}=({\varvec{\eta }}, {\varvec{\zeta }}, {\varvec{\Pi }})=\psi ({\varvec{\theta }})\) and \({\varvec{\omega }}'=({\varvec{\eta }}', {\varvec{\zeta }}', {\varvec{\Pi }}')=\psi ({\varvec{\theta }}')\).

For any \(s\in \{1,\ldots ,2^K\}\) and \(t\in \{1,\ldots ,2^L\}\), there exist \({\varvec{b}}\in \{0,1\}^K\) and \({\varvec{c}}\in \{0,1\}^L\) such that \(s=\sum _{k=1}^Kb_k2^{k-1}+1\) and \(t=\sum _{l=1}^Lc_l2^{l-1}+1\). Then we define \(\sigma _1(s)=\sum _{k=1}^Kb_k'2^{k-1}+1\) and \(\sigma _2(t)=\sum _{l=1}^Lc_l'2^{l-1}+1\), where \({\varvec{b}}'\) and \({\varvec{c}}'\) are constructed as

$$\begin{aligned} b_{k}'={\left\{ \begin{array}{ll}1-b_{k}, &{} \text {if} \quad y_k=1; \\ b_k, &{} \text {if}\quad y_k=0, \end{array}\right. } \quad \text{ and } \quad c_l'={\left\{ \begin{array}{ll}1-c_l, &{} \text {if} \quad z_l=1; \\ c_l, &{} \text {if}\quad z_l=0. \end{array}\right. } \end{aligned}$$

It can be verified that

$$\begin{aligned} \eta _{s}'&=\prod _{k=1}^K (\alpha _k')^{b_k}(1-\alpha _k')^{(1-b_k)}=\prod _{k=1}^K \alpha _k^{b_k'}(1-\alpha _k)^{(1-b_k')}\\&=\eta _{\sigma _1(s)}, \\ \zeta _{t}'&=\prod _{l=1}^L(\beta _l')^{c_l}(1-\beta _l')^{(1-c_l)}=\prod _{l=1}^L \beta _l^{c_l'}(1-\beta _l)^{(1-c_l')}=\zeta _{\sigma _2(t)}, \\ \pi _{st}'&=({\varvec{b}}^T,1)\widetilde{{\varvec{W}}}'({\varvec{c}}^T,1)^T =({\varvec{b}}^T,1){\varvec{E}}_{{\varvec{y}}}^T\widetilde{{\varvec{W}}}{\varvec{E}}_{{\varvec{z}}}({\varvec{c}}^T,1)^T\\&=\big (({\varvec{b}}')^T,1\big )\widetilde{{\varvec{W}}}\big (({\varvec{c}}')^T,1\big )^T=\pi _{\sigma _1(s)\sigma _2(t)}, \end{aligned}$$

where the third equality follows from the equalities that

$$\begin{aligned} {\varvec{E}}_{{\varvec{y}}}\big ({\varvec{b}}^T,1\big )^T&=\Big (\big ({\varvec{b}}-2\text {diag}({\varvec{y}}){\varvec{b}}+{\varvec{y}}\big )^T,1\Big )^T=\big (({\varvec{b}}')^T, 1\big )^T,\\ {\varvec{E}}_{{\varvec{z}}}\big ({\varvec{c}}^T,1\big )^T&=\Big (\big ({\varvec{c}}-2\text {diag}({\varvec{z}}){\varvec{c}}+{\varvec{z}}\big )^T,1\Big )^T=\big (({\varvec{c}}')^T, 1\big )^T. \end{aligned}$$

Thus, \(\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )\), and the proof of Lemma 2 is completed.
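The two identities used in this proof can be checked numerically on a small case. The sketch below is illustrative only: the helper names `E`, `flip` and `eta_entry` are hypothetical, the block form \({\varvec{E}}_{{\varvec{y}}}=\begin{pmatrix}{\varvec{I}}-2\text {diag}({\varvec{y}})&{\varvec{y}}\\ {\varvec{0}}^T&1\end{pmatrix}\) is the one appearing in the proof of Lemma 3, and the flipped probabilities \(\alpha _k'=1-\alpha _k\) for \(y_k=1\) are an assumption about \(g_{{\varvec{yz}}}\) inferred from the displayed identity for \(\eta _s'\), since (10) itself is not reproduced here.

```python
import itertools
import numpy as np

def E(y):
    """Block matrix [[I - 2 diag(y), y], [0^T, 1]]."""
    y = np.asarray(y, dtype=float)
    K = len(y)
    M = np.eye(K + 1)
    M[:K, :K] = np.eye(K) - 2 * np.diag(y)
    M[:K, K] = y
    return M

def flip(b, y):
    """b'_k = 1 - b_k if y_k = 1, and b_k otherwise."""
    b, y = np.asarray(b), np.asarray(y)
    return np.where(y == 1, 1 - b, b)

def eta_entry(alpha, b):
    """prod_k alpha_k^{b_k} (1 - alpha_k)^{1 - b_k}."""
    alpha, b = np.asarray(alpha, dtype=float), np.asarray(b)
    return np.prod(alpha ** b * (1 - alpha) ** (1 - b))

y = np.array([1, 0, 1])
alpha = np.array([0.2, 0.35, 0.45])
alpha_flipped = np.where(y == 1, 1 - alpha, alpha)  # assumed action of g_yz on alpha

checks = []
for b in itertools.product([0, 1], repeat=3):
    b = np.array(b)
    # E_y (b^T, 1)^T = ((b')^T, 1)^T
    checks.append(np.allclose(E(y) @ np.append(b, 1), np.append(flip(b, y), 1)))
    # eta'_s computed from alpha' equals eta_{sigma_1(s)} computed from alpha
    checks.append(np.isclose(eta_entry(alpha_flipped, b), eta_entry(alpha, flip(b, y))))
all_hold = all(checks)
```

Because flipping twice returns the original pattern, the induced maps \(\sigma _1\) and \(\sigma _2\) are involutions, which is one way to see that they are indeed permutations.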

Lemma 3

For any \({\varvec{\theta }}, {\varvec{\theta }}' \in {\varvec{\Theta }}^{\text {o}}\), if there exist \(\nu =(\nu _1,\nu _2)\), with \(\nu _1\) and \(\nu _2\) being permutations on \(\{1,\ldots ,K\}\) and \(\{1,\ldots ,L\}\), together with \({\varvec{y}}\in \{0,1\}^K\) and \({\varvec{z}}\in \{0,1\}^L\), such that \({\varvec{\theta }}'=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\), then we have \({\varvec{\theta }}'\sim {\varvec{\theta }}\) and \(P_o({\varvec{A}};{\varvec{\theta }}')=P_o({\varvec{A}};{\varvec{\theta }})\), where “\(\sim \)” is an equivalence relation, that is, it satisfies the reflexive, symmetric and transitive properties.

Proof of Lemma 3. For any \({\varvec{\theta }}\in {\varvec{\Theta }}^{\text {o}}\), setting \(\nu =(\nu _1,\nu _2)\) with \(\nu _1(k)=k\) and \(\nu _2(l)=l\) and \({\varvec{y}}={\varvec{z}}={\varvec{0}}\) leads to \({\varvec{\theta }}=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\), which gives the reflexive property. As for the symmetric property, it can be verified that \(g_{\nu ({\varvec{yz}})}\big (h_{\nu }({\varvec{\theta }})\big )=h_{\nu }\big (g_{{\varvec{yz}}}({\varvec{\theta }})\big )\), where \(\nu ({\varvec{yz}})=\nu _1({\varvec{y}})\nu _2({\varvec{z}})\) with \(\nu _1({\varvec{y}})_k=y_{\nu _1(k)}\) and \(\nu _2({\varvec{z}})_l=z_{\nu _2(l)}\). Therefore, if \({\varvec{\theta }}'=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\), then \({\varvec{\theta }}=h_{\nu ^{-1}}\big (g_{{\varvec{yz}}}({\varvec{\theta }}')\big )=g_{\nu ^{-1}({\varvec{yz}})}\big (h_{\nu ^{-1}}({\varvec{\theta }}')\big )\).

Finally, let \({\varvec{\theta }}_1=g_{{\varvec{y}}{\varvec{z}}}\big (h_{\nu }({\varvec{\theta }}_0)\big )\) and \({\varvec{\theta }}_2=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}({\varvec{\theta }}_1)\big )\), then \({\varvec{\theta }}_1=h_{\nu }\big (g_{\nu ^{-1}({\varvec{y}}{\varvec{z}})}({\varvec{\theta }}_0)\big )\) and

$$\begin{aligned} {\varvec{\theta }}_2&=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}\big (h_{\nu }\big (g_{\nu ^{-1}({\varvec{y}}{\varvec{z}})}({\varvec{\theta }}_0)\big )\big )\big )=g_{{\varvec{y}}'{\varvec{z}}'}\big (g_{\nu '({\varvec{yz}})}\big (h_{\nu '\nu }({\varvec{\theta }}_0)\big )\big )\\&=g_{{\varvec{y}}''{\varvec{z}}''}\big (h_{\nu ''}({\varvec{\theta }}_0)\big ), \end{aligned}$$

where \({\varvec{y}}''=(y_k'')_{k=1}^K\) with \(y_k''=|y_k'-y_{\nu '_1(k)}|\), \({\varvec{z}}''=(z_l'')_{l=1}^L\) with \(z_l''=|z_l'-z_{\nu '_2(l)}|\), \(\nu ''=(\nu _1'',\nu _2'')\) with \(\nu _1''(k)=\nu _1'\big (\nu _1(k)\big )\) and \(\nu _2''(l)=\nu _2'\big (\nu _2(l)\big )\), and the last equality follows from

$$\begin{aligned} {\varvec{E}}_{\nu _1'({\varvec{y}})}{\varvec{E}}_{{\varvec{y}}'}&=\begin{pmatrix} \Big ({\varvec{I}}-2\text {diag}\big (\nu _1'({\varvec{y}})\big )\Big )\big ({\varvec{I}}-2\text {diag}({\varvec{y}}')\big ) &{} \Big ({\varvec{I}}-2\text {diag}\big (\nu _1'({\varvec{y}})\big )\Big ){\varvec{y}}'+\nu _1'({\varvec{y}}) \\ {\varvec{0}}^T &{} 1\end{pmatrix}\\&=\begin{pmatrix} {\varvec{I}}-2\text {diag}({\varvec{y}}'') &{} {\varvec{y}}'' \\ {\varvec{0}}^T &{} 1 \end{pmatrix}={\varvec{E}}_{{\varvec{y}}''}, \end{aligned}$$

and similarly \({\varvec{E}}_{\nu _2'({\varvec{z}})}{\varvec{E}}_{{\varvec{z}}'}={\varvec{E}}_{{\varvec{z}}''}\). It then follows that \({\varvec{\theta }}_2 \sim {\varvec{\theta }}_0\), which gives the transitive property.
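The simplification of the block product rests on an elementary fact about binary vectors, which can be spelled out as follows. This is a sketch of the intermediate step, using the shorthand \({\varvec{a}}=\nu _1'({\varvec{y}})\) (so that \(a_k=y_{\nu _1'(k)}\)) introduced only for this display. Since \(a_k,y_k'\in \{0,1\}\), we have \(a_k+y_k'-2a_ky_k'=|y_k'-a_k|=y_k''\), and therefore

$$\begin{aligned} \big ({\varvec{I}}-2\text {diag}({\varvec{a}})\big )\big ({\varvec{I}}-2\text {diag}({\varvec{y}}')\big )&={\varvec{I}}-2\,\text {diag}\big ({\varvec{a}}+{\varvec{y}}'-2\,\text {diag}({\varvec{a}}){\varvec{y}}'\big )={\varvec{I}}-2\text {diag}({\varvec{y}}''),\\ \big ({\varvec{I}}-2\text {diag}({\varvec{a}})\big ){\varvec{y}}'+{\varvec{a}}&={\varvec{a}}+{\varvec{y}}'-2\,\text {diag}({\varvec{a}}){\varvec{y}}'={\varvec{y}}''. \end{aligned}$$

In words, applying two flips in succession amounts to flipping exactly those coordinates selected by one of the two indicator vectors but not both.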

Furthermore, if \({\varvec{\theta }}'\sim {\varvec{\theta }}\), there exists \(\sigma =(\sigma _1,\sigma _2)\) with permutations \(\sigma _1\) and \(\sigma _2\) on \(\{1,\ldots ,2^K\}\) and \(\{1,\ldots ,2^L\}\), such that \(\psi ({\varvec{\theta }}')=\sigma \big (\psi ({\varvec{\theta }})\big )\), where \(\psi \) is defined as in (9). It then follows from Lemma 1 and Proposition 1 that \(P_o({\varvec{A}};{\varvec{\theta }}')=P_s\big ({\varvec{A}};\psi ({\varvec{\theta }}')\big )=P_s\big ({\varvec{A}};\psi ({\varvec{\theta }})\big )=P_o({\varvec{A}};{\varvec{\theta }})\). This completes the proof of Lemma 3.

Proof of Theorem 2. For any \({\varvec{\theta }}, {\varvec{\theta }}'\in {\varvec{\Theta }}^{\text {o}}\setminus {\varvec{\Theta }}_0^{\text {o}}\), it is clear that there exist \(\nu \), \({\varvec{y}}\) and \({\varvec{z}}\) such that \(\tilde{{\varvec{\theta }}}=g_{{\varvec{yz}}}\big (h_{\nu }({\varvec{\theta }})\big )\) with \({\tilde{\alpha }}_1<\cdots<{\tilde{\alpha }}_K<\frac{1}{2}\) and \({\tilde{\beta }}_1<\cdots<{\tilde{\beta }}_L<\frac{1}{2}\). Similarly, we have \(\tilde{{\varvec{\theta }}}'=g_{{\varvec{y}}'{\varvec{z}}'}\big (h_{\nu '}({\varvec{\theta }}')\big )\) with \({\tilde{\alpha }}_1'<\cdots<{\tilde{\alpha }}_K'<\frac{1}{2}\) and \({\tilde{\beta }}_1'<\cdots<{\tilde{\beta }}_L'<\frac{1}{2}\) for some \(\nu '\), \({\varvec{y}}'\) and \({\varvec{z}}'\). It then follows from Lemma 3 that \(\tilde{{\varvec{\theta }}} \sim {\varvec{\theta }}\) and \(\tilde{{\varvec{\theta }}}' \sim {\varvec{\theta }}'\), and \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};{\varvec{\theta }})\) and \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')=P_o({\varvec{A}};{\varvec{\theta }}')\). Thus, it suffices to prove that \(\tilde{{\varvec{\theta }}}=\tilde{{\varvec{\theta }}}'\) given \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')\).

It follows from Lemma 1 that \(P_o({\varvec{A}};\tilde{{\varvec{\theta }}})=P_o({\varvec{A}};\tilde{{\varvec{\theta }}}')\) if and only if \(P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}})\big )=P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}}')\big )\). Furthermore, as \(\tilde{{\varvec{\theta }}}\) and \(\tilde{{\varvec{\theta }}}'\notin {\varvec{\Theta }}^{\text {o}}_0\), it can be verified that \(\psi (\tilde{{\varvec{\theta }}}),\psi (\tilde{{\varvec{\theta }}}')\notin {\varvec{\Omega }}_0\). With the assumption that \(n\ge \max \{2^{K+1},2^{L+1}\}\), it follows from Proposition 1 that \(P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}})\big )=P_s\big ({\varvec{A}};\psi (\tilde{{\varvec{\theta }}}')\big )\) if and only if there exists \(\sigma =(\sigma _1,\sigma _2)\) with \(\sigma _1\) and \(\sigma _2\) being permutations on \(\{1,\ldots ,2^K\}\) and \(\{1,\ldots ,2^L\}\), such that \(\psi (\tilde{{\varvec{\theta }}}')=\sigma \big (\psi (\tilde{{\varvec{\theta }}})\big )\). Then, we have

$$\begin{aligned}&\Big \{\prod _k{\tilde{\alpha }}_k^{b_k}(1-{\tilde{\alpha }}_k)^{1-b_k}; {\varvec{b}}\in \{0,1\}^K\Big \}\nonumber \\&\quad =\Big \{\prod _k({\tilde{\alpha }}_k')^{b_k}(1-{\tilde{\alpha }}_k')^{1-b_k}; {\varvec{b}}\in \{0,1\}^K\Big \}, \end{aligned}$$
(18)

and

$$\begin{aligned}&\Big \{\prod _l{\tilde{\beta }}_l^{c_l}(1-{\tilde{\beta }}_l)^{1-c_l}; {\varvec{c}}\in \{0,1\}^L\Big \}\nonumber \\&\quad =\Big \{\prod _l({\tilde{\beta }}_l')^{c_l}(1-{\tilde{\beta }}_l')^{1-c_l}; {\varvec{c}}\in \{0,1\}^L\Big \}. \end{aligned}$$
(19)

Thus, \(\prod _k{\tilde{\alpha }}_k=\prod _k{\tilde{\alpha }}_k'\) as they are the smallest elements of each set in (18), and further

$$\begin{aligned} (1-{\tilde{\alpha }}_K)\prod _{k=1}^{K-1}{\tilde{\alpha }}_k=(1-{\tilde{\alpha }}_K')\prod _{k=1}^{K-1}{\tilde{\alpha }}_k', \end{aligned}$$

as they are the second smallest elements of each set in (18). Then, we have \(\frac{{\tilde{\alpha }}_K}{1-{\tilde{\alpha }}_K}=\frac{{\tilde{\alpha }}_K'}{1-{\tilde{\alpha }}_K'}\), and thus \({\tilde{\alpha }}_K={\tilde{\alpha }}_K'\). Removing \({\tilde{\alpha }}_K^{b_K}(1-{\tilde{\alpha }}_K)^{1-b_K}\) from all items in (18), it follows from a similar argument that \({\tilde{\alpha }}_{K-1}={\tilde{\alpha }}_{K-1}'\), and repeating this treatment \(K\) times finally leads to \(\tilde{{\varvec{\alpha }}}=\tilde{{\varvec{\alpha }}}'\). Similarly, we also have \(\tilde{{\varvec{\beta }}}=\tilde{{\varvec{\beta }}}'\).
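The peeling argument above can be read as a small algorithm, sketched here in Python under stated assumptions. The function names `all_products` and `peel_alphas` are hypothetical, the input is the exact collection of \(2^K\) products in (18) for some \({\tilde{\alpha }}_1<\cdots <{\tilde{\alpha }}_K<\frac{1}{2}\), and the nearest-value matching used to remove a factor is a floating-point convenience that presumes the products are well separated; it is not part of the proof.

```python
import itertools
import math

def all_products(alphas):
    """All 2^K products prod_k alpha_k^{b_k} (1 - alpha_k)^{1 - b_k}."""
    return [
        math.prod(a if bit else 1 - a for a, bit in zip(alphas, bits))
        for bits in itertools.product([0, 1], repeat=len(alphas))
    ]

def peel_alphas(products):
    """Recover alpha_1 < ... < alpha_K < 1/2 from the set of products, largest alpha first."""
    S = sorted(products)
    recovered = []
    while len(S) > 1:
        # Smallest / second smallest = alpha_K / (1 - alpha_K) for the current largest alpha.
        odds = S[0] / S[1]
        a = odds / (1 + odds)
        recovered.append(a)
        # Remove the factor of alpha_K: the smallest remaining element is a * x for some
        # reduced product x, and its partner (1 - a) * x is dropped as well.
        remaining, reduced = list(S), []
        while remaining:
            x = remaining.pop(0) / a
            partner = (1 - a) * x
            j = min(range(len(remaining)), key=lambda i: abs(remaining[i] - partner))
            remaining.pop(j)
            reduced.append(x)
        S = sorted(reduced)
    return recovered[::-1]

# Toy example with K = 3 (illustrative values).
alphas_true = [0.1, 0.25, 0.4]
alphas_hat = peel_alphas(all_products(alphas_true))
```

Each pass halves the collection, so the loop runs exactly \(K\) times, mirroring the phrase "repeating this treatment \(K\) times" in the proof.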

Next, for any \(s\in \{1,\ldots ,2^K\}\) and \(t\in \{1,\ldots ,2^L\}\), let \(s=\sum _{k=1}^Kb_k2^{k-1}+1\) and \(t=\sum _{l=1}^Lc_l2^{l-1}+1\) for some \({\varvec{b}}\in \{0,1\}^K\) and \({\varvec{c}}\in \{0,1\}^L\), and \(\sigma _1(s)=\sum _{k=1}^Kb_k'2^{k-1}+1\) and \(\sigma _2(t)=\sum _{l=1}^Lc_l'2^{l-1}+1\) for some \({\varvec{b}}'\in \{0,1\}^K\) and \({\varvec{c}}'\in \{0,1\}^L\). It then follows from \(\psi (\tilde{{\varvec{\theta }}}')=\sigma \big (\psi (\tilde{{\varvec{\theta }}})\big )\) that

$$\begin{aligned} \eta _s'=\prod _k{\tilde{\alpha }}_k^{b_k}(1-{\tilde{\alpha }}_k)^{1-b_k}=\prod _k{\tilde{\alpha }}_k^{b_k'}(1-{\tilde{\alpha }}_k)^{1-b_k'}=\eta _{\sigma _1(s)}, \end{aligned}$$

which leads to

$$\begin{aligned} \sum _kb_k\log \Big (\frac{{\tilde{\alpha }}_k}{1-{\tilde{\alpha }}_k}\Big )=\sum _kb_k'\log \Big (\frac{{\tilde{\alpha }}_k}{1-{\tilde{\alpha }}_k}\Big ). \end{aligned}$$

It then follows from the definition of \({\varvec{\Theta }}_0^{\text {o}}\) immediately that \({\varvec{b}}'={\varvec{b}}\), and thus \(\sigma _1(s)=s\). Similarly, \(\sigma _2(t)=t\), and thus \(\psi (\tilde{{\varvec{\theta }}}')=\psi (\tilde{{\varvec{\theta }}})\). The desired result then follows from Lemma 1 immediately.

Cite this article

Zhang, J., Wang, J. Identifiability and parameter estimation of the overlapped stochastic co-block model. Stat Comput 32, 57 (2022). https://doi.org/10.1007/s11222-022-10114-1

  • DOI: https://doi.org/10.1007/s11222-022-10114-1

Keywords

  • Community detection
  • Directed network
  • Identifiability
  • Stochastic co-block model
  • Variational EM