
Almost Exact Recovery in Label Spreading

Part of the Lecture Notes in Computer Science book series (LNTCS, volume 11631)

Abstract

In the semi-supervised graph clustering setting, an expert provides the cluster membership of a few nodes. This small amount of side information makes it possible to achieve high-accuracy clustering with efficient computational procedures. Our main goal is to provide a theoretical justification of why graph-based semi-supervised learning works so well. Specifically, for the Stochastic Block Model in the moderately sparse regime, we prove that popular semi-supervised clustering methods such as Label Spreading achieve asymptotically almost exact recovery, provided that the fraction of labeled nodes does not go to zero and the average degree goes to infinity.
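
As a concrete illustration of this setting (a sketch, not the authors' experiments), the following snippet samples a two-block Stochastic Block Model, reveals the labels of a fraction r of the nodes, and runs the Label Spreading iteration of Zhou et al. [14]; all parameter values (n, p, q, r, alpha) are arbitrary choices made for the illustration.

```python
import numpy as np

# Minimal illustration (a sketch, not the paper's experiments): two-block SBM,
# a fraction r of labeled nodes, and the Label Spreading iteration
# F <- alpha * S F + (1 - alpha) * Y with S = D^{-1/2} A D^{-1/2} (Zhou et al. [14]).
rng = np.random.default_rng(0)
n, p, q, r, alpha = 1000, 0.10, 0.02, 0.05, 0.9   # arbitrary parameter choices
z = np.repeat([0, 1], n // 2)                     # ground-truth communities

# Sample a symmetric SBM adjacency matrix.
probs = np.where(z[:, None] == z[None, :], p, q)
upper = np.triu(rng.random((n, n)) < probs, k=1)
A = (upper | upper.T).astype(float)

# Reveal the labels of a fraction r of the nodes, chosen uniformly at random.
labeled = rng.random(n) < r
Y = np.zeros((n, 2))
Y[labeled, z[labeled]] = 1.0

# Label Spreading iteration.
d = np.maximum(A.sum(axis=1), 1e-12)
S = A / np.sqrt(np.outer(d, d))
F = Y.copy()
for _ in range(200):
    F = alpha * S @ F + (1 - alpha) * Y

accuracy = (F.argmax(axis=1) == z).mean()
print(f"fraction of correctly recovered labels: {accuracy:.3f}")
```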

Keywords

  • Semi-supervised clustering
  • Community detection
  • Label spreading
  • Random graphs
  • Stochastic Block Model

References

  1. Abbe, E.: Community detection and stochastic block models. Found. Trends Commun. Inf. Theory 14(1–2), 1–162 (2018)

  2. Avrachenkov, K., Gonçalves, P., Mishenin, A., Sokol, M.: Generalized optimization framework for graph-based semi-supervised learning. In: SIAM International Conference on Data Mining (SDM 2012) (2012)

  3. Avrachenkov, K., Kadavankandy, A., Litvak, N.: Mean field analysis of personalized PageRank with implications for local graph clustering. J. Stat. Phys. 173(3–4), 895–916 (2018)

  4. Avrachenkov, K.E., Filar, J.A., Howlett, P.G.: Analytic Perturbation Theory and Its Applications, vol. 135. SIAM, Philadelphia (2013)

  5. Chapelle, O., Schölkopf, B., Zien, A.: Semi-supervised Learning. Adaptive Computation and Machine Learning. MIT Press, Cambridge (2006)

  6. Condon, A., Karp, R.M.: Algorithms for graph partitioning on the planted partition model. In: Hochbaum, D.S., Jansen, K., Rolim, J.D.P., Sinclair, A. (eds.) APPROX/RANDOM 1999. LNCS, vol. 1671, pp. 221–232. Springer, Heidelberg (1999). https://doi.org/10.1007/978-3-540-48413-4_23

  7. Erdős, P., Rényi, A.: On random graphs. Publ. Math. (Debr.) 6, 290–297 (1959)

  8. Gilbert, E.N.: Random graphs. Ann. Math. Statist. 30(4), 1141–1144 (1959)

  9. Holland, P.W., Laskey, K.B., Leinhardt, S.: Stochastic blockmodels: first steps. Soc. Netw. 5(2), 109–137 (1983)

  10. Horn, R.A., Johnson, C.R.: Matrix Analysis, 2nd edn. Cambridge University Press, Cambridge (2012)

  11. Johnson, R., Zhang, T.: On the effectiveness of Laplacian normalization for graph semi-supervised learning. J. Mach. Learn. Res. 8, 1489–1517 (2007)

  12. Le, C.M., Levina, E., Vershynin, R.: Concentration and regularization of random graphs. Random Struct. Algorithms 51(3), 538–561 (2017)

  13. Mai, X., Couillet, R.: A random matrix analysis and improvement of semi-supervised learning for large dimensional data. J. Mach. Learn. Res. 19(1), 3074–3100 (2018)

  14. Zhou, D., Bousquet, O., Lal, T.N., Weston, J., Schölkopf, B.: Learning with local and global consistency. In: Advances in Neural Information Processing Systems, pp. 321–328 (2004)

  15. Zhu, X.: Semi-supervised learning literature survey. Technical report, Computer Science Department, University of Wisconsin-Madison (2006)

  16. Zhu, X., Ghahramani, Z., Lafferty, J.D.: Semi-supervised learning using Gaussian fields and harmonic functions. In: ICML (2003)

Acknowledgements

This work was carried out within the Inria – Nokia Bell Labs project “Distributed Learning and Control for Network Analysis”.

Author information

Correspondence to Maximilien Dreveton.

Appendices

A Background Results on Matrix Analysis

A.1 Inversion of the Identity Matrix Minus a Rank 2 Matrix

Lemma 1

(Sherman-Morrison-Woodbury formula). Let A be an invertible \(n \times n\) matrix, and let B, C, D be matrices of compatible sizes. Then: \(\Big ( A + BCD \Big )^{-1} = A^{-1} -A^{-1} B \Big (I + CDA^{-1} B \Big )^{-1} CDA^{-1}\). In particular, if u, v are two column vectors of size \(n\times 1\), we have: \(\Big ( A + u v^T \Big )^{-1} = A^{-1} - \dfrac{A^{-1}u v^T A^{-1} }{1 + v^T A^{-1} u}\).

Proof

See for example [10], section 0.7.4.   \(\square \)
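
Lemma 1 is easy to sanity-check numerically. The sketch below draws random matrices of compatible (arbitrarily chosen) sizes and compares both sides of the two identities.

```python
import numpy as np

# Numerical sanity check of Lemma 1 (Sherman-Morrison-Woodbury) on random data.
rng = np.random.default_rng(1)
n, k = 6, 2
A = rng.normal(size=(n, n)) + n * np.eye(n)   # well-conditioned invertible matrix
B = rng.normal(size=(n, k))
C = rng.normal(size=(k, k))
D = rng.normal(size=(k, n))
Ainv = np.linalg.inv(A)

lhs = np.linalg.inv(A + B @ C @ D)
rhs = Ainv - Ainv @ B @ np.linalg.inv(np.eye(k) + C @ D @ Ainv @ B) @ C @ D @ Ainv
print(np.allclose(lhs, rhs))   # True

# Rank-one special case (Sherman-Morrison).
u, v = rng.normal(size=(n, 1)), rng.normal(size=(n, 1))
lhs = np.linalg.inv(A + u @ v.T)
rhs = Ainv - (Ainv @ u @ v.T @ Ainv) / (1 + v.T @ Ainv @ u)
print(np.allclose(lhs, rhs))   # True
```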

Lemma 2

Let \(M = \begin{pmatrix} aJ_{n_1} & b J_{n_1 n_2} \\ c J_{n_2 n_1} & d J_{n_2} \end{pmatrix}\) for some scalars a, b, c, d. Let \(n = n_1+n_2\). If \(I_n-M\) is invertible, we have:

$$\begin{aligned} (I-M)^{-1} = I_n - \dfrac{1}{K} \begin{pmatrix} \big (-a + n_2(ad-bc) \big ) J_{n_1} & -b J_{n_1 n_2} \\ -c J_{n_2 n_1} & \big ( -d + n_1(ad-bc) \big ) J_{n_2} \end{pmatrix} \end{aligned}$$

where \(K = (1-n_1a)(1-n_2d) - n_1n_2 bc\).

Proof

We will use the Sherman-Morrison-Woodbury identity (Lemma 1) with \(A = I_n\), \(D = \begin{pmatrix} 1 \cdots 1 & 0 \cdots 0 \\ 0 \cdots 0 & 1 \cdots 1 \end{pmatrix}\) (in the first row, there are \(n_1\) ones followed by \(n_2\) zeros), \(B = D^T\) and \(C = \begin{pmatrix} -a & -b \\ -c & -d \end{pmatrix}\). One easily verifies that \(BCD = -M\).

$$\begin{aligned} (I-M)^{-1}&= I_n - B(I + CDB)^{-1} CD \\&= I_n - B \begin{pmatrix} 1-n_1a & -n_2 b \\ -n_1c & 1-n_2d \end{pmatrix}^{-1} CD \\&= I_n - B \dfrac{1}{(1-n_1a)(1-n_2d) - n_1n_2 bc }\begin{pmatrix} 1-n_2d & n_2 b \\ n_1c & 1-n_1a \end{pmatrix} CD \\&= I_n - \dfrac{1}{K} B \begin{pmatrix} -a + n_2(ad-bc) & -b \\ -c & -d + n_1(ad-bc) \end{pmatrix} D \\&= I_n - \dfrac{1}{K} \begin{pmatrix} \big (-a + n_2(ad-bc) \big ) J_{n_1} & -b J_{n_1 n_2} \\ -c J_{n_2 n_1} & \big ( -d + n_1(ad-bc) \big ) J_{n_2} \end{pmatrix}. \end{aligned}$$

   \(\square \)
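
A quick numerical check of Lemma 2, with arbitrarily chosen block sizes and coefficients:

```python
import numpy as np

# Numerical check of Lemma 2: explicit inverse of I - M for a two-block matrix of all-ones blocks.
n1, n2 = 4, 7
a, b, c, d = 0.03, 0.01, 0.02, 0.05   # arbitrary coefficients
J = lambda s, t: np.ones((s, t))
M = np.block([[a * J(n1, n1), b * J(n1, n2)],
              [c * J(n2, n1), d * J(n2, n2)]])
n = n1 + n2

K = (1 - n1 * a) * (1 - n2 * d) - n1 * n2 * b * c
claimed = np.eye(n) - (1 / K) * np.block(
    [[(-a + n2 * (a * d - b * c)) * J(n1, n1), -b * J(n1, n2)],
     [-c * J(n2, n1), (-d + n1 * (a * d - b * c)) * J(n2, n2)]])
print(np.allclose(np.linalg.inv(np.eye(n) - M), claimed))   # True
```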

A.2 Spectral Study of a Rank 2 Matrix

Lemma 3

(Schur’s determinant identity, [10]). Let A, D and \(\begin{pmatrix} A & B \\ C & D \end{pmatrix}\) be square matrices. If A is invertible, we have:

$$\det \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \det ( A ) \det (D - CA^{-1}B).$$

Proof

Follows from the formula \(\begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A & 0 \\ C & I_q \end{pmatrix} \begin{pmatrix} I_p & A^{-1}B \\ 0 & D-CA^{-1}B \end{pmatrix}\).    \(\square \)

Lemma 4

(Matrix determinant lemma, [10]). For an invertible matrix A and two column vectors u and v, we have \(\det (A + uv^T)= (1+v^T A^{-1} u ) \det (A)\).
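
Both determinant identities (Lemmas 3 and 4) can be verified numerically on random matrices of arbitrary size, as in the following sketch.

```python
import numpy as np

rng = np.random.default_rng(2)
p_, q_ = 4, 3
A = rng.normal(size=(p_, p_)) + p_ * np.eye(p_)   # invertible top-left block
B = rng.normal(size=(p_, q_))
C = rng.normal(size=(q_, p_))
D = rng.normal(size=(q_, q_))

# Lemma 3 (Schur's determinant identity).
lhs = np.linalg.det(np.block([[A, B], [C, D]]))
rhs = np.linalg.det(A) * np.linalg.det(D - C @ np.linalg.inv(A) @ B)
print(np.isclose(lhs, rhs))   # True

# Lemma 4 (matrix determinant lemma).
u, v = rng.normal(size=(p_, 1)), rng.normal(size=(p_, 1))
lhs = np.linalg.det(A + u @ v.T)
rhs = (1 + v.T @ np.linalg.inv(A) @ u).item() * np.linalg.det(A)
print(np.isclose(lhs, rhs))   # True
```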

Lemma 5

Let \(\alpha \) and \(\beta \) be two constants, and let \(M = \alpha I_n + \beta J\), where J is the \(n \times n\) matrix with all entries equal to one. Then \(\det M = \alpha ^{n-1}(\alpha + \beta n)\).

Proof

Suppose that \(\alpha \not =0\). Then, with \(v = (1,\dots , 1)^T\) and \(u = \beta (1,\dots ,1)^T\) column vectors of size \(n\times 1\), Lemma 4 gives us

$$\begin{aligned} \det M&= \det (\alpha I_n) \Big (1 + v^T (\alpha I_n)^{-1} u \Big ) \\&= \alpha ^n \Big (1+ \dfrac{\beta n}{\alpha } \Big ) \\&= \alpha ^{n-1}(\alpha + \beta n), \end{aligned}$$

which proves the lemma for \(\alpha \not =0\). To treat the case \(\alpha =0\), note that the function \(\alpha \in \mathbf {R}\mapsto \det (\alpha I_n + \beta J)\) is continuous (even analytic) [4]; thus, by continuity at \(\alpha =0\), the expression \(\alpha ^{n-1}(\alpha + \beta n)\) holds for any \(\alpha \in \mathbf {R}\).   \(\square \)
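
A small numerical check of Lemma 5, with arbitrary values of \(\alpha \), \(\beta \) and n:

```python
import numpy as np

# Numerical check of Lemma 5: det(alpha*I + beta*J) = alpha^(n-1) * (alpha + beta*n).
n, alpha, beta = 6, 0.7, -0.2          # arbitrary values
M = alpha * np.eye(n) + beta * np.ones((n, n))
print(np.isclose(np.linalg.det(M), alpha ** (n - 1) * (alpha + beta * n)))   # True

# The formula extends by continuity to alpha = 0, where both sides vanish (for n >= 2).
print(np.isclose(np.linalg.det(beta * np.ones((n, n))), 0.0))                # True
```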

Proposition 2

Let \(M = \begin{pmatrix} aJ_{n_1} & b J_{n_1 n_2} \\ c J_{n_2 n_1} & d J_{n_2} \end{pmatrix}\) for some scalars a, b, c, d. The eigenvalues of M are:

  • 0 with multiplicity \(n_1 + n_2-2\);

  • \(\lambda _\pm = \dfrac{1}{2} \big (n_1a + n_2d \pm \sqrt{\varDelta } \big ) \) where \(\varDelta = (n_1a-n_2d)^2 + 4n_1 n_2 bc \).

Proof

The matrix M has rank at most two (and exactly two except in degenerate cases), so 0 is an eigenvalue of multiplicity at least \(n_1+n_2-2\). The two remaining eigenvalues are obtained by an explicit computation of the characteristic polynomial of M, as the roots of a polynomial of degree 2.

Let \(\lambda \in \mathbf {R}\) and \(A := \lambda I_{n_1} - aJ_{n_1}\). If \(\lambda \not \in \{0; a n_1 \}\), then A is invertible, and by Schur’s determinant identity (Lemma 3) we have

$$\begin{aligned} \det (\lambda I_n - M)&= \det A \, \det \Big ( \lambda I_{n_2} - dJ_{n_2} - cJ_{n_2 n_1} A^{-1} b J_{n_1 n_2} \Big ) \\&= \det A \, \det B. \end{aligned}$$

From Lemma 5, it follows that \(\det A = \lambda ^{n_1-1} \big (\lambda - n_1a \big )\).

Let us now compute \(\det B\). First, we show that \(A^{-1} = \dfrac{1}{\lambda } \Big (I_{n_1} + \dfrac{a}{\lambda - a n_1} J_{n_1} \Big )\). Indeed, from the Sherman-Morrison-Woodbury formula (Lemma 1) with \(u=-a 1_{n_1}\) and \(v= 1_{n_1}\), it follows that

$$\begin{aligned} \Big (\lambda I_{n_1} - aJ_{n_1} \Big )^{-1}&= \dfrac{1}{\lambda } I_{n_1} - \dfrac{1}{\lambda ^2} \dfrac{ -aJ_{n_1} }{ 1 + \dfrac{-a n_1 }{\lambda }} \\&= \dfrac{1}{\lambda } I_{n_1} + \dfrac{1}{\lambda } \, \dfrac{a}{\lambda -an_1} J_{n_1}, \end{aligned}$$

which gives the desired expression. Thus,

$$\begin{aligned} B&= \lambda I_{n_2} - dJ_{n_2} - \dfrac{bc}{\lambda } J_{n_2 n_1} \big ( I_{n_1} + \dfrac{a}{\lambda - a n_1} J_{n_1} \big ) J_{n_1 n_2} \\&= \lambda I_{n_2} - dJ_{n_2} - \dfrac{bc}{\lambda } \big ( n_1 + \dfrac{a \, n_1^2}{\lambda - an_1} \big )J_{n_2} \\&= \lambda I_{n_2} + \Big ( - d - \dfrac{bc n_1}{\lambda - an_1} \Big ) J_{n_2}. \end{aligned}$$

Again, this matrix is of the form \(\lambda I_{n_2} + \beta J_{n_2}\), with \(\beta = -d - \dfrac{bc \, n_1}{\lambda - an_1}\), and we can use Lemma 5 to show that

$$\begin{aligned} \det B = \lambda ^{n_2-1} \Big ( \lambda + n_2 \beta \Big ). \end{aligned}$$

Now we can finish the computation of \(\det (\lambda I_n - M)\)

$$\begin{aligned} \det (\lambda I_n - M)&= \lambda ^{n_1+n_2-2} \big ( \lambda - n_1 a \big ) \Big ( \lambda - n_2 d - \dfrac{bc n_1 n_2}{ \lambda - a n_1} \Big ) \\&= \lambda ^{n_1+n_2-2} \Big ( \lambda ^2 + \lambda (-n_1 a - n_2d) + n_1 n_2 (ad-bc) \Big ). \end{aligned}$$

The discriminant of this degree-two polynomial is given by

$$\begin{aligned} \varDelta&= (n_1 a + n_2d)^2 - 4 n_1 n_2 (ad-bc) \\&= (n_1a-n_2d)^2 + 4 n_1 n_2 bc. \end{aligned}$$

Since \(bc \ge 0\) in our applications (where \(b = c\)), we have \(\varDelta \ge 0\), and the two remaining eigenvalues are real and given by

$$ \lambda _\pm = \dfrac{1}{2} \big (n_1a + n_2d \pm \sqrt{\varDelta } \big ). $$

   \(\square \)
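
The eigenvalue formulas of Proposition 2 can be checked numerically; the sketch below uses arbitrary block sizes and coefficients with \(bc \ge 0\), so that the two non-zero eigenvalues are real.

```python
import numpy as np

# Numerical check of Proposition 2 for arbitrary block sizes and coefficients (with b*c >= 0).
n1, n2 = 5, 8
a, b, c, d = 0.04, 0.01, 0.03, 0.02
J = lambda s, t: np.ones((s, t))
M = np.block([[a * J(n1, n1), b * J(n1, n2)],
              [c * J(n2, n1), d * J(n2, n2)]])

delta = (n1 * a - n2 * d) ** 2 + 4 * n1 * n2 * b * c
lam_plus = 0.5 * (n1 * a + n2 * d + np.sqrt(delta))
lam_minus = 0.5 * (n1 * a + n2 * d - np.sqrt(delta))

eig = np.sort(np.linalg.eigvals(M).real)
print(np.allclose(eig[:-2], 0.0))                    # 0 has multiplicity n1 + n2 - 2
print(np.allclose(eig[-2:], [lam_minus, lam_plus]))  # the two remaining eigenvalues
```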

A.3 Spectral Study of \(\mathbf {E}\mathcal L\)

Proposition 3

(Eigenvalues of \(\mathbf {E}\mathcal L_{uu}\), symmetric case). Assume two communities of equal size, with \(p_1 = p_2 (= p)\). The two smallest eigenvalues of \(\mathbf {E}\mathcal L_{uu}\) are:

$$\begin{aligned} \lambda _1 = r \, \dfrac{p - q}{p+q} \quad \text {and} \quad \lambda _2 = r . \end{aligned}$$

Note that the other eigenvalue of \(\mathbf {E}\mathcal L_{uu}\) is one (with multiplicity \(\lfloor (1-r)n \rfloor -2\)).

Proof

The matrix \(\mathbf {E}\mathcal L_{uu}\) can be written as \(I-M\), where \(M = D^{-1/2}AD^{-1/2}\) has the block form of Proposition 2, with coefficients \(a = \dfrac{p_1}{d_1}\), \(b = c = \dfrac{q}{\sqrt{d_1 d_2}}\) and \(d = \dfrac{p_2}{d_2} \). Note that the block sizes are now \(\lfloor (1-r)n_i \rfloor \) and not \(n_i\). Under the symmetry assumption, we have \(d_1 = d_2 = \dfrac{n}{2}(p+q)\).

Moreover, \(\lambda _M\) is an eigenvalue of M if and only if \(1 - \lambda _M\) is an eigenvalue of \(\mathbf {E}\mathcal L_{uu}\). Using the notation of Proposition 2, we have \(\varDelta = 4 (1-r)^2 \dfrac{q^2}{(p+q)^2}\), and the two non-zero eigenvalues of M are given by:

$$\begin{aligned} \lambda _\pm&= \dfrac{1}{2}\Big ( 2 (1-r) \dfrac{p}{p+q} \pm 2 (1-r) \dfrac{q}{p+q} \Big ) \\&= 1 - r \dfrac{p \pm q}{p+q}. \end{aligned}$$

   \(\square \)
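
The mean-field matrix used in this proof can be built explicitly. The sketch below follows the construction above (block sizes \(\lfloor (1-r)n/2 \rfloor \), expected degree \(d_1 = d_2 = \frac{n}{2}(p+q)\)) for arbitrary parameter values and simply lists the smallest eigenvalues of \(I - M\).

```python
import numpy as np

# Build the mean-field matrix M = D^{-1/2} A D^{-1/2} restricted to the unlabeled nodes,
# following the construction in the proof: two blocks of size floor((1-r) n/2) and
# expected degree d1 = d2 = n (p+q) / 2.  All parameter values below are arbitrary.
n, p, q, r = 400, 0.10, 0.02, 0.1
m = int((1 - r) * n / 2)          # unlabeled nodes per community
d1 = n * (p + q) / 2
a, b = p / d1, q / d1
J = np.ones((m, m))
M = np.block([[a * J, b * J],
              [b * J, a * J]])

L_uu = np.eye(2 * m) - M          # mean-field E[L]_uu
eig = np.sort(np.linalg.eigvalsh(L_uu))
print("two smallest eigenvalues of E[L]_uu:", eig[:2])
```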

B Spectral Norm of an Extracted Matrix

Proposition 4

Let A be a matrix and let B be a submatrix extracted from A (not necessarily square: rows and columns with different indices may be removed, and possibly more rows than columns, or vice versa). Then \( ||B||_2 \le ||A||_2 \).

Proof

For two subsets I and J of \(\{1,\dots ,n\}\), let \(B = A_{IJ}\) be the matrix obtained from A by keeping only the rows (resp. columns) in I (resp. in J). Then \(B = M_1 A M_2\), where \(M_1\) and \(M_2\) are selection matrices, obtained by keeping the rows of the identity indexed by I and the columns of the identity indexed by J, respectively. Their spectral norm is equal to one, and the result \(||B||_2 \le ||A||_2\) follows from the inequality \( ||B||_2 \le ||M_1||_2 ||A||_2 ||M_2||_2\).   \(\square \)
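
A numerical illustration of Proposition 4, with a random matrix and a random choice of kept rows and columns:

```python
import numpy as np

# Check that the spectral norm of an extracted submatrix never exceeds that of the full matrix.
rng = np.random.default_rng(3)
A = rng.normal(size=(10, 12))
rows = rng.choice(10, size=6, replace=False)   # kept rows (the set I)
cols = rng.choice(12, size=4, replace=False)   # kept columns (the set J)
B = A[np.ix_(rows, cols)]
print(np.linalg.norm(B, 2) <= np.linalg.norm(A, 2))   # True
```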

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Avrachenkov, K., Dreveton, M. (2019). Almost Exact Recovery in Label Spreading. In: Avrachenkov, K., Prałat, P., Ye, N. (eds) Algorithms and Models for the Web Graph. WAW 2019. Lecture Notes in Computer Science, vol 11631. Springer, Cham. https://doi.org/10.1007/978-3-030-25070-6_3

  • DOI: https://doi.org/10.1007/978-3-030-25070-6_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-25069-0

  • Online ISBN: 978-3-030-25070-6
