Abstract
We consider the problem of estimating overlapping community memberships in a network, where each node can belong to multiple communities. More than a few communities per node are difficult to both estimate and interpret, so we focus on sparse node membership vectors. Our algorithm is based on sparse principal subspace estimation with iterative thresholding. The method is computationally efficient, with computational cost equivalent to estimating the leading eigenvectors of the adjacency matrix, and does not require an additional clustering step, unlike spectral clustering methods. We show that a fixed point of the algorithm corresponds to correct node memberships under a version of the stochastic block model. The methods are evaluated empirically on simulated and real-world networks, showing good statistical performance and computational efficiency.
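To make the procedure concrete, here is a minimal sketch in Python of the multiply–threshold–normalize iteration described above. The thresholding rule used here (zeroing entries smaller than a fraction λ of their row maximum) and the column normalization are illustrative assumptions, not the paper's exact operator.

```python
import numpy as np

def threshold_rows(T, lam):
    """Zero entries smaller in magnitude than lam times their row maximum
    (an illustrative thresholding rule, assumed for this sketch)."""
    row_max = np.abs(T).max(axis=1, keepdims=True)
    return np.where(np.abs(T) >= lam * row_max, T, 0.0)

def sparse_spectral(A, V0, lam=0.5, n_iter=20):
    """Sketch of the iteration: multiply by the adjacency matrix,
    threshold row-wise, renormalize columns."""
    V = V0.astype(float)
    for _ in range(n_iter):
        U = threshold_rows(A @ V, lam)
        norms = np.linalg.norm(U, axis=0)
        norms[norms == 0] = 1.0  # avoid dividing by all-zero columns
        V = U / norms
    return V

# toy example: expected adjacency of a two-block model, initialized at
# the true memberships; the support of V_hat should match that of Z
Z = np.zeros((20, 2)); Z[:10, 0] = 1.0; Z[10:, 1] = 1.0
B = np.array([[0.6, 0.1], [0.1, 0.6]])
V_hat = sparse_spectral(Z @ B @ Z.T, Z, lam=0.5, n_iter=5)
```

Each iteration costs one multiplication by the (typically sparse) adjacency matrix, which is why the overall cost is comparable to computing leading eigenvectors.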
Acknowledgements
This research was supported in part by NSF grants DMS-1521551 and DMS-1916222. The authors would like to thank Yuan Zhang for helpful discussions, and Advanced Research Computing at the University of Michigan for computational resources and services.
Appendix
Proof of Proposition 1.
Because V and \(\widetilde{\mathbf{V}}\) are two bases of the column space of P, and rank(P) = K, we have \(\mathbf{P}=\mathbf{V}\mathbf{U}^{\top}=\widetilde{\mathbf{V}}\widetilde{\mathbf{U}}^{\top}\) for some full-rank matrices \(\mathbf{U},\widetilde{\mathbf{U}}\in \mathbb{R}^{n\times K}\), and therefore
Let \((\widetilde {\textup {\textbf {U}}}^{\top }\textup {\textbf {U}})({\textup {\textbf {U}}}^{\top }\textup {\textbf {U}})^{-1}=\boldsymbol {\Lambda }\). We will show that Λ = QD for a permutation matrix Q ∈{0,1}K×K and a diagonal matrix \(\textup {\textbf {D}}\in \mathbb {R}^{K\times K}\), or in other words, this is a generalized permutation matrix.
Let \(\boldsymbol{\theta},\widetilde{\boldsymbol{\theta}}\in \mathbb{R}^{n}\) and \(\mathbf{Z},\widetilde{\mathbf{Z}}\in \mathbb{R}^{n\times K}\) be such that \(\boldsymbol{\theta}_{i} = \left({\sum}_{k=1}^{K}\mathbf{V}_{ik}^{2}\right)^{1/2}\), \(\widetilde{\boldsymbol{\theta}}_{i} = \left({\sum}_{k=1}^{K}\widetilde{\mathbf{V}}_{ik}^{2}\right)^{1/2}\), and Zik = Vik/𝜃i if 𝜃i > 0, and Zik = 0 otherwise (and similarly for \(\widetilde{\mathbf{Z}}\)). Denote by \(\mathcal{S}_{1}=(i_{1},\ldots,i_{K})\) the vector of row indices satisfying \(\mathbf{V}_{i_{j}j}> 0\) and \(\mathbf{V}_{i_{j}j'}=0\) for j′≠j, for j = 1,…,K (these indices exist by assumption). In the same way, define \(\mathcal{S}_{2}=(i'_{1},\ldots,i'_{K})\) such that \(\widetilde{\mathbf{V}}_{i'_{j}j}> 0\) and \(\widetilde{\mathbf{V}}_{i'_{j}j'}=0\) for j′≠j, j = 1,…,K. Denote by \(\mathbf{Z}_{\mathcal{S}}\) the K × K matrix formed by the rows indexed by \(\mathcal{S}\). Therefore
Write \(\boldsymbol{\Theta} = \text{diag}(\boldsymbol{\theta})\in \mathbb{R}^{n\times n}\) and \(\widetilde{\boldsymbol{\Theta}} = \text{diag}(\widetilde{\boldsymbol{\theta}})\in \mathbb{R}^{n\times n}\). From Eq. A.1 we have
where \({\Theta }_{\mathcal {S}, \mathcal {S}}\) is the submatrix of Θ formed by the rows and columns indexed by \(\mathcal {S}\). Thus,
which implies that Λ is a non-negative matrix. Applying the same to the equation \((\boldsymbol {\Theta }\textbf {Z})_{\mathcal {S}_{1}}\boldsymbol {\Lambda }^{-1}= (\widetilde {\boldsymbol {\Theta }}\widetilde {\textup {\textbf {Z}}})_{\mathcal {S}_{1}}\), we have
Hence, both Λ and Λ− 1 are non-negative matrices, which implies that Λ is a positive generalized permutation matrix, so Λ = QD for some permutation matrix Q and a diagonal matrix D with diag(D) > 0. □
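The key fact used in this last step can be checked numerically: a generalized permutation matrix and its inverse are both entrywise non-negative, whereas the inverse of a generic non-negative matrix typically has negative entries. The matrices below are illustrative.

```python
import numpy as np

# Lambda = Q D: a permutation matrix times a positive diagonal matrix
Q = np.array([[0., 1., 0.], [0., 0., 1.], [1., 0., 0.]])  # permutation
D = np.diag([2.0, 0.5, 3.0])                              # positive diagonal
Lam = Q @ D
Lam_inv = np.linalg.inv(Lam)                              # equals D^{-1} Q^T
both_nonneg = bool((Lam >= 0).all() and (Lam_inv >= -1e-12).all())

# a generic non-negative (non-monomial) matrix: its inverse has
# negative entries, so it cannot play the role of Lambda in the proof
M = np.array([[2.0, 1.0], [1.0, 2.0]])
generic_has_negative = bool((np.linalg.inv(M) < 0).any())
```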
Proof of Proposition 2.
Let \(\boldsymbol{\theta}\in \mathbb{R}^{n}\) be a vector such that \({\boldsymbol{\theta}_{i}^{2}}={\sum}_{k=1}^{K}\mathbf{V}_{ik}^{2}\), and define \(\mathbf{Z}\in \mathbb{R}^{n\times K}\) such that \(\mathbf{Z}_{ik}=\frac{1}{\theta_{i}}\mathbf{V}_{ik}\) for each i ∈ [n], k ∈ [K]. Let B = (V⊤V)−1V⊤U. To show that B is symmetric, observe that VU⊤ = P = P⊤ = UV⊤. Multiplying both sides by V and V⊤,
and observing that (V⊤V)− 1 exists since V is full rank, we have
which implies that B⊤ = B. To obtain the equivalent representation for P, form a diagonal matrix Θ = diag(𝜃). Then ΘZ = V, and
Finally, under the conditions of Proposition 1, V uniquely determines the pattern of zeros of any non-negative eigenbasis of P, and therefore supp(V) = supp(ΘZQ) = supp(ZQ) for some permutation Q. □
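The symmetry of B = (V⊤V)−1V⊤U when P = VU⊤ is symmetric can be verified numerically; the construction of U below (from an assumed symmetric core matrix S) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
V = np.abs(rng.standard_normal((8, 3)))   # non-negative, full-rank basis
S = rng.standard_normal((3, 3))
S = (S + S.T) / 2                         # symmetric core matrix
U = V @ S                                 # then P = V U^T = V S V^T is symmetric
P = V @ U.T

# B = (V^T V)^{-1} V^T U, computed via a linear solve
B = np.linalg.solve(V.T @ V, V.T @ U)
```

Here B recovers the symmetric core S, and P = VBV⊤, matching the equivalent representation in the proposition.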
Proof of Proposition 3.
Suppose that P = VU⊤ for some non-negative matrix V that satisfies the assumptions of Proposition 1. Let \(\mathbf{d}\in \mathbb{R}^{K}\) be such that \(\mathbf{d}_{k} = \|\mathbf{V}_{\cdot k}\|_{2}\), and set D = diag(d). Then \(\mathbf{P} = \widetilde{\mathbf{V}}\mathbf{D}\mathbf{U}^{\top}\). Let \(\mathbf{V}^{(0)} = \widetilde{\mathbf{V}}\) be the initial value of Algorithm 1. Then, observe that
Suppose that λ ∈ [0,v∗). Then \(\lambda \max_{j\in [K]}|\widetilde{\mathbf{V}}_{ij}| <\widetilde{\mathbf{V}}_{ik}\) for all i ∈ [n], k ∈ [K] such that Vik > 0, and hence \(\mathbf{U}^{(1)}=\mathcal{S}(\widetilde{\mathbf{V}}, \lambda) = \widetilde{\mathbf{V}}\). Finally, since \(\|\widetilde{\mathbf{V}}_{\cdot,k}\|_{2}=1\) for all k ∈ [K], we have \(\mathbf{V}^{(1)}=\widetilde{\mathbf{V}}\). □
Proof of Theorem 1.
The proof consists of a one-step fixed-point analysis of Algorithm 2. We will show that if Z(t) = Z, then Z(t+ 1) = Z with high probability. Let T = T(t+ 1) = AZ be the value after the multiplication step. Define \(\mathbf{C}\in \mathbb{R}^{K\times K}\) to be the diagonal matrix with community sizes on the diagonal, Ckk = nk = ∥Z⋅,k∥1. Then \(\widetilde{\mathbf{T}}=\widetilde{\mathbf{T}}^{(t+1)}= \mathbf{T}\mathbf{C}^{-1}\). In order for the threshold to set the correct set of entries to zero, it suffices that in each row i the largest element of \(\widetilde{\mathbf{T}}_{i,\cdot}\) corresponds to the correct community. Define \(\mathcal{C}_{k}\subset [n]\) as the set of nodes in community k. Then,
Therefore \(\widetilde{\mathbf{T}}_{ik}\) is a scaled sum of independent and identically distributed Bernoulli random variables. Moreover, for each k1 and k2 in [K], \(\widetilde{\mathbf{T}}_{ik_{1}}\) and \(\widetilde{\mathbf{T}}_{ik_{2}}\) are independent of each other, since they depend on disjoint sets of edges.
Given a value of λ ∈ (0,1), let
be the event that the largest entry of \(\widetilde{\mathbf{T}}_{i\cdot}\) is the one in position ki, corresponding to the community of node i, and that all other entries in that row are smaller in magnitude than \(\lambda |\widetilde{\mathbf{T}}_{ik_{i}}|\). Let \(\mathbf{U} = \mathbf{U}^{(t+1)}=\mathcal{S}(\widetilde{\mathbf{T}}^{(t+1)}, \lambda)\) be the matrix obtained after the thresholding step. Under the event \(\mathcal{E}(\lambda)=\bigcap_{i=1}^{n} \mathcal{E}_{i}(\lambda)\), we have \(\|\mathbf{U}_{i,\cdot}\|_{\infty} = \mathbf{U}_{ik_{i}}\) for each i ∈ [n], and hence
Therefore, under the event \(\mathcal {E}(\lambda )\), the thresholding step recovers the correct support, so Z(t+ 1) = Z.
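This one-step fixed-point argument can be illustrated numerically: starting from the true membership matrix Z of a two-block stochastic block model, a single multiplication, normalization, and thresholding step returns the same support. The parameters below are illustrative, chosen so that the event \(\mathcal{E}(\lambda)\) holds with overwhelming probability.

```python
import numpy as np

rng = np.random.default_rng(42)
n_k, K, p, q, lam = 100, 2, 0.8, 0.05, 0.5
n = n_k * K

# true membership matrix Z: first block in column 0, second in column 1
Z = np.zeros((n, K))
Z[:n_k, 0] = 1.0
Z[n_k:, 1] = 1.0

# two-block SBM: within-community probability p, between-community q
P = np.where(Z @ Z.T > 0, p, q)
upper = np.triu((rng.random((n, n)) < P).astype(float), 1)
A = upper + upper.T                       # symmetric adjacency, no self-loops

# one iteration: multiply, normalize by community sizes, threshold
C_inv = np.diag(1.0 / Z.sum(axis=0))      # C^{-1}, inverse community sizes
T_tilde = A @ Z @ C_inv                   # average connectivity to each community
row_max = np.abs(T_tilde).max(axis=1, keepdims=True)
Z_new = (np.abs(T_tilde) >= lam * row_max).astype(float)
```

Each row of T_tilde concentrates around (p, q) or (q, p), so thresholding at λ times the row maximum zeroes exactly the entries outside the true community.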
Now we verify that under the conditions of Theorem 1, the event \(\mathcal{E}(\lambda)\) occurs with high probability. By a union bound,
For j≠ki, \(\widetilde {\textup {\textbf {T}}}_{ij}-\lambda \widetilde {\textup {\textbf {T}}}_{ik_{i}}\) is a sum of independent random variables with expectation
By Hoeffding’s inequality, we have that for any \(\tau \in \mathbb {R}\),
where \(n_{\min \limits } = \min \limits _{k\in [K]}n_{k}\). Setting
and using Eqs. A.3 and 3.6, we obtain that for n sufficiently large,
Combining with the bound (A.2), the probability of event \(\mathcal {E}(\lambda )\) (which implies that Z(t+ 1) = Z) is bounded from below as
Therefore, with high probability Z is a fixed point of Algorithm 2 for any λ ∈ (λ∗,1). □
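For reference, the concentration step in the proof uses Hoeffding's inequality, whose standard form for independent bounded random variables is:

```latex
% Hoeffding's inequality: for independent X_1,\dots,X_m with X_i \in [a_i, b_i],
\mathbb{P}\left( \sum_{i=1}^{m} \left( X_i - \mathbb{E}[X_i] \right) \geq \tau \right)
\;\leq\; \exp\left( -\frac{2\tau^2}{\sum_{i=1}^{m} (b_i - a_i)^2} \right),
\qquad \tau > 0.
```

Applied to the Bernoulli sums above, with $m$ of order $n_{\min}$, this yields the exponential bound in the proof.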
Proof of Proposition 4.
Observe that
where C is a constant that does not depend on B. Therefore \(\widehat {\textup {\textbf {B}}}\)
Suppose that \(\widehat{\boldsymbol{V}} = \widehat{\boldsymbol{Q}}\widehat{\boldsymbol{R}}\) for some matrix \(\widehat{\boldsymbol{Q}}\) with orthonormal columns of size n × K. Then \(\widehat{\boldsymbol{R}}\) is a full-rank matrix, and therefore
Using this equation, we obtain the desired result. □
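The identity used here, that for a full-rank \(\widehat{\mathbf{V}} = \widehat{\mathbf{Q}}\widehat{\mathbf{R}}\) the matrix \((\widehat{\mathbf{V}}^{\top}\widehat{\mathbf{V}})^{-1}\widehat{\mathbf{V}}^{\top}\) equals \(\widehat{\mathbf{R}}^{-1}\widehat{\mathbf{Q}}^{\top}\), can be verified directly; the random matrix below is a stand-in for the estimated basis.

```python
import numpy as np

rng = np.random.default_rng(1)
V_hat = rng.standard_normal((10, 3))      # stand-in for the estimated basis
Q_hat, R_hat = np.linalg.qr(V_hat)        # reduced QR: V_hat = Q_hat R_hat

# (V^T V)^{-1} V^T computed two equivalent ways
pinv_direct = np.linalg.solve(V_hat.T @ V_hat, V_hat.T)
pinv_qr = np.linalg.solve(R_hat, Q_hat.T) # equals R^{-1} Q^T
```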
Cite this article
Arroyo, J., Levina, E. Overlapping Community Detection in Networks via Sparse Spectral Decomposition. Sankhya A 84, 1–35 (2022). https://doi.org/10.1007/s13171-021-00245-4