Abstract
The study of random graphs and networks has undergone explosive development over the last couple of decades. Meanwhile, techniques for the statistical analysis of sequences of networks have remained less developed. In this paper, we focus on network sequences with a fixed number of labeled nodes and study some statistical problems in a nonparametric framework. We introduce natural notions of center and a depth function for networks that evolve in time. We develop several statistical techniques including testing, supervised and unsupervised classification, and some notions of principal component sets in the space of networks. Some examples and asymptotic results are given, as well as two real data examples.
References
Adar E, Zhang L, Adamic LA, Lukose RM (2004) Implicit structure and the dynamics of blogspace. Implicit Struct Dyn Blogspace 13:16989–16995
Ahmed S, Li D, Rosalsky A, Volodin A (2001) Almost sure lim sup behavior of bootstrapped means with applications to pairwise i.i.d. sequences and stationary ergodic sequences. J Stat Plan Infer 98:126–137
Aparicio D, Fraiman D (2015) Banking networks and leverage dependence in emerging countries. Adv Complex Syst 18:1550022
Arcones MA, Cui H, Zuo Y (2006) Empirical depth processes. TEST 15:151–177
Auer J (1995) An elementary proof of the invertibility of distance matrices. Linear Multilinear A 40:119–124
Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286(5439):509–512
Bickel P, Chen A (2009) A nonparametric view of network models and Newman–Girvan and other modularities. Proc Natl Acad Sci USA 106:21068–21073
Bollobás B, Janson S, Riordan O (2007) The phase transition in inhomogeneous random graphs. Random Struct Algor 31(1):3–122
Bonanno G, Caldarelli G, Lillo F, Micciche S, Vandewalle N, Mantegna R (2004) Networks of equities in financial markets. Eur Phys J B 34:363–371
Breiman L (1968) Probability. Classics in applied mathematics, SIAM
Brown B (1983) Statistical uses of the spatial median. J R Stat Soc B 45:25–30
Bullmore E, Sporns O (2009) Complex brain networks: graph theoretical analysis of structural and functional systems. Nat Rev Neurosci 10:186–196
Chatterjee S, Diaconis P (2013) Estimating and understanding exponential random graph models. Ann Statist 41:2428–2461
Cuesta-Albertos J, Nieto-Reyes A (2008) The Tukey and the random Tukey depths characterize discrete distributions. J Multivar Anal 99:2304–2311
Cuesta-Albertos J, Fraiman R, Ransford T (2006) Random projections and goodness-of-fit tests in infinite-dimensional spaces. Bull Braz Math Soc 37:1–25
Cuesta-Albertos J, Fraiman R, Ransford T (2007) A sharp form of the Cramer–Wold theorem. J Theor Probab 20:201–209
Dehling H, Wendler M (2010) Central limit theorem and the bootstrap for u-statistics of strongly mixing data. J Multivar Anal 101:126–137
Devroye L, Fraiman N (2014) Connectivity of inhomogeneous random graphs. Random Struct Algor 45(3):408–420
Devroye L, Györfi L, Lugosi G (1996) A probabilistic theory of pattern recognition. Springer, New York
Donges J, Petrova I, Loew A, Marwan N, Kurths J (2015) How complex climate networks complement eigen techniques for the statistical analysis of climatological data. Clim Dyn 45(9):2407–2424
Doukhan P, Neumann MH (2008) The notion of \(\psi \)-weak dependence and its applications to bootstrapping time series. Probab Surv 5:146–168
Fraiman D (2008) Growing directed networks: stationary in-degree probability for arbitrary out-degree one. Eur Phys J B 61:377–388
Fraiman D, Balenzuela P, Foss J, Chialvo D (2009) Ising-like dynamics in large-scale functional brain networks. Phys Rev E 79:61922
Fraiman D, Saunier G, Martins E, Vargas C (2014) Biological motion coding in the brain: analysis of visually driven EEG functional networks. PloS ONE 9:e84612
Gao X, Xiao B, Tao D, Li X (2010) A survey of graph edit distance. Pattern Anal Appl 13:113–129
Gauzere B, Brun L, Villemin D (2012) Two new graphs kernels in chemoinformatics. Pattern Recogn Lett 33:2038–2047
Gozolchiani A, Yamasaki K, Gazit O, Havlin S (2008) Pattern of climate network blinking links follows El Niño events. Europhys Lett 83:28005
Greco L, Farcomeni A (2016) A plug-in approach to sparse and robust principal component analysis. TEST 25:449–481
Guigoures R, Boulle M, Rossi F (2015) Advances in data analysis and classification. Springer, New York
Holme P (2015) Modern temporal network theory: a colloquium. Eur Phys J B 88:1–30
Jiang X, Münger A, Bunke H (2001) On median graphs: properties, algorithms, and applications. IEEE T Pattern Anal 23:1144–1151
Jo HH, Karsai M, Kertész J, Kaski K (2012) Circadian pattern and burstiness in mobile phone communication. New J Phys 14:013055
Karrer B, Newman M (2011) Spectral methods for network community detection and network partitioning. Phys Rev E 83:016107
Kolar M, Song L, Ahmed A, Xing E (2010) Estimating time-varying networks. Ann Appl Stat 4:94–123
Kumar G, Garland M (2006) Visual exploration of complex time-varying graphs. IEEE T Vis Comput Gr 12:805–812
Kumar R, Novak J, Raghavan P, Tomkins A (2005) On the bursty evolution of blogspace. World Wide Web 8:159–178
Liu R (1988) On a notion of simplicial depth. Proc Natl Acad Sci USA 85:1732–1734
Livi L, Rizzi A (2013) The graph matching problem. Pattern Anal Appl 16:253–283
Mahalanobis P (1936) On the generalized distance in statistics. Proc Natl Inst Sci 2:49–55
Mastrandrea R, Fournet J, Barrat A (2015) Contact patterns in a high school: a comparison between data collected using wearable sensors, contact diaries and friendship surveys. PloS ONE 10:e0136497
Micchelli C (1986) Interpolation of scattered data: distance matrices and conditionally positive definite functions. Constr Approx 2:11–22
Mikosch T, Sorensen M (2002) Empirical process techniques for dependent data. Springer, New York
Newman M (2006) Modularity and community structure in networks. Proc Natl Acad Sci USA 103(23):8577–8582
Newman M, Girvan M (2004) Finding and evaluating community structure in networks. Phys Rev E 69:026113
Peixoto TP (2015) Inferring the mesoscale structure of layered, edge-valued, and time-varying networks. Phys Rev E 92:042807
Peligrad M (1986) Recent advances in the central limit theorem and its weak invariance principle for mixing sequences of random variables (a survey). Dependence in probability and statistics, pp 193–224
Pignolet Y, Roy M, Schmid S, Tredan G (2015) Exploring the graph of graphs: network evolution and centrality distances. arXiv:1506.01565
Pollard D (1981) Strong consistency of k-means clustering. Ann Stat 9:135–140
Robins G, Pattison P, Kalish Y, Lusher D (2007) An introduction to exponential random graph (p*) models for social networks. Soc Netw 29:173–191
Rohe K, Chatterjee S, Yu B (2011) Spectral clustering and the high-dimensional stochastic blockmodel. Ann Stat 39(4):1878–1915
Rosenblatt M (1956) A central limit theorem and a strong mixing condition. Proc Natl Acad Sci USA 42:43–47
Small C (1990) A survey of multidimensional medians. Int Stat Rev 58:263–277
Tang J, Scellato S, Musolesi M, Mascolo C, Latora V (2010) Small-world behavior in time-varying graphs. Phys Rev E 81:055101
Tsonis A, Swanson K (2008) Topology and predictability of El Niño and La Niña Networks. Phys Rev Lett 100:228502
Tukey J (1975) Mathematics and the picturing of data. In: Proceedings of the international congress of mathematicians, Vancouver, pp 523–531
Vardi Y, Zhang C (2000) The multivariate \(l_1\)-median and associated data depth. Proc Natl Acad Sci USA 97:1423–1426
Vishwanathan SVN, Schraudolph NN, Kondor R, Borgwardt KM (2010) Graph kernels. J Mach Learn Res 11:1201–1242
Watts D, Strogatz S (1998) Collective dynamics of ’small-world’ networks. Nature 393:440–442
Xing E, Fu W, Song L (2010) A state-space mixed membership blockmodel for dynamic network tomography. Ann Appl Stat 4:535–566
Xu K, Hero AO (2013) Dynamic stochastic block models: statistical models for time-evolving networks. In: International conference on social computing, behavioral-cultural modeling, and prediction, vol 1, pp 201–210
Yang T, Chi Y, Zhu S, Gong Y, Jin R (2011) Detecting communities and their evolutions in dynamic social networks—a Bayesian approach. Mach Learn 82:157–189
Zhao Y, Levina E, Zhu J (2011) Community extraction for social networks. Proc Natl Acad Sci USA 108(18):7321–7326
Zhao Y, Levina E, Zhu J (2012) Consistency of community detection in networks under degree-corrected stochastic block models. Ann Stat 40(4):2266–2292
Zuo Y, Serfling R (2000) General notions of statistical depth function. Ann Stat 28(2):461–482
Acknowledgements
The authors would like to thank two anonymous reviewers for helpful comments and criticism on earlier versions of the paper.
This article was produced as part of the activities of FAPESP Center for Neuromathematics (Grant#2013/07699-0, S.Paulo Research Foundation).
Appendix: Proofs
1.1 Characterization of the central set
Proof of Proposition 1
We have that the expected distance from a network H to a random network \({\mathbf {G}}\) is
Let A(G) be the adjacency matrix of the network G and \({{\mathbf {A}}}\) the adjacency matrix of the random network \({{\mathbf {G}}}\). Then expression (8) can be written as
which is minimized by any network H with adjacency matrix \(A(H)_{ij}=1\) if and only if \({{\mathbb {P}}}\left( {{\mathbf {A}}}_{ij}=1 \right) \ge 1/2\). Moreover, if for all i, j
there is a unique network S that minimizes expression (8) and the corresponding adjacency matrix satisfies \(A(S)_{ij}=1\) if and only if \({{\mathbb {P}}}\left( {{\mathbf {A}}}_{ij} =1 \right) > 1/2\).
On the other hand, if condition (9) does not hold, there are multiple solutions. The maximal center L is the network whose adjacency matrix satisfies \(A(L)_{ij} =1\) whenever \({{\mathbb {P}}}\left( {{\mathbf {A}}}_{ij}=1 \right) \ge 1/2\), and the set \({\mathscr {C}}\) consists exactly of the subnetworks of L that contain S as a subnetwork.
The proof for the empirical version is completely analogous. \(\square \)
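The edgewise-majority characterization above is directly computable. The following sketch (an illustration, not the paper's code; the helper name `sample_center` is hypothetical) builds the empirical maximal center from a sample of adjacency matrices by thresholding the empirical edge probabilities at \(1/2\):

```python
import numpy as np

def sample_center(adjacency_sample):
    """Edgewise-majority center of a sample of labeled networks.

    adjacency_sample: sequence of (m x m) 0/1 adjacency matrices.
    Returns the adjacency matrix with entry 1 wherever the empirical
    edge probability is >= 1/2 (the maximal center L when ties occur).
    """
    freq = np.asarray(adjacency_sample).mean(axis=0)  # empirical P(A_ij = 1)
    return (freq >= 0.5).astype(int)

# Tiny example: three networks on 3 labeled nodes.
G1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
G2 = np.array([[0, 1, 1], [1, 0, 0], [1, 0, 0]])
G3 = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
center = sample_center([G1, G2, G3])  # only edge (0, 1) has majority support
```

Edge (0, 1) appears in all three networks, while the other edges appear only once each, so the empirical center keeps only that edge.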
1.2 Depth determines measure
Proof of Proposition 2
Let \(\mu = (\mu _1, \ldots , \mu _K)\) and \(\nu = (\nu _1, \ldots , \nu _K)\) be two distributions on \({{\mathscr {G}}}\), where \(K=2^m\) is the cardinality of the space of networks. For any \(H \in {{\mathscr {G}}}\), let \(d(H) = (d(H_1,H), \ldots , d(H_{K},H))\). Then, the population depth is
Therefore, the depth determines the measure if and only if
Denote by F the matrix with rows \(d(H_1), \ldots , d(H_{K})\). Then expression (10) is equivalent to \(F (\mu -\nu ) = 0\) having a unique solution, so the result follows from the invertibility of distance matrices. This invertibility was first proved by Micchelli (1986); Auer (1995) later gave an elementary proof that only uses the triangle inequality. In particular, our metric is just the \(L^1\) distance between the adjacency matrices, so the result holds. For the sake of completeness, we now state the result of Auer (1995) for distance matrices.
Theorem 4
Let \(P_1, \ldots , P_n\) be distinct points in \({\mathbb {R}}^k\), and \(d_{i,j} = \Vert P_i - P_j \Vert \). If \(F_n\) is the distance matrix with entries \(d_{i,j}\), then
(a) The determinant \(\det F_n\) is positive if n is odd and negative if n is even; in particular, \(F_n\) is invertible.

(b) The matrix \(F_n\) has one positive and \(n-1\) negative eigenvalues.
Applying Theorem 4 to our setup, we get that the matrix F is invertible. \(\square \)
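Theorem 4 is easy to check numerically. The sketch below (illustrative only; the random points are arbitrary) builds the Euclidean distance matrix of \(n=6\) distinct points in \({\mathbb {R}}^3\) and verifies the eigenvalue signature and the sign of the determinant:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((6, 3))           # n = 6 distinct points in R^3
diff = P[:, None, :] - P[None, :, :]
F = np.linalg.norm(diff, axis=2)          # distance matrix d_ij = ||P_i - P_j||
eig = np.linalg.eigvalsh(F)               # F is symmetric

# Theorem 4(b): exactly one positive and n - 1 negative eigenvalues.
assert np.sum(eig > 0) == 1 and np.sum(eig < 0) == 5
# Theorem 4(a): with n = 6 (even), the determinant is negative.
assert np.sign(np.linalg.det(F)) == -1.0
```

In particular \(F\) has no zero eigenvalue, which is the invertibility used in the proof of Proposition 2.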
1.3 Convergence of empirical depth
Proof of Theorem 2
From the ergodic theorem, we have that \({\hat{D}}_\ell (H) \rightarrow D(H)\) almost surely as \(\ell \rightarrow \infty \) for each \(H \in {\mathscr {G}}\). Since \({\mathscr {G}}\) is finite, we get uniform convergence. \(\square \)
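The paper's depth function is defined in the main text, not reproduced here; for illustration only, take the distance-based functional \(D(H)={{\mathbb {E}}}\,d({{\mathbf {G}}},H)\), with d the \(L^1\) distance between adjacency matrices used in the proof of Proposition 2 (an outlyingness-type stand-in; the uniform-convergence argument is the same for any fixed finite family of averages). The sketch below checks the uniform convergence over the finite space \({\mathscr {G}}\) under i.i.d. Erdős–Rényi sampling, a special case of ergodicity:

```python
import itertools
import numpy as np

rng = np.random.default_rng(1)
p, n_edges = 0.3, 3                 # ER edge probability; 3 possible edges on 3 nodes

# All 2^3 = 8 labeled graphs on 3 nodes, coded as edge-indicator vectors.
graphs = list(itertools.product([0, 1], repeat=n_edges))

def pop_depth(h):
    # D(H) = E d(G, H) for G ~ ER(p), with d the L1 distance on edge indicators.
    return sum(p * (1 - e) + (1 - p) * e for e in h)

def emp_depth(h, sample):
    # Empirical counterpart: average L1 distance to the sampled graphs.
    return np.mean(np.abs(sample - np.array(h)).sum(axis=1))

sample = (rng.random((10000, n_edges)) < p).astype(int)
err = max(abs(emp_depth(h, sample) - pop_depth(h)) for h in graphs)
# The maximum over the (finite) space of graphs is small for large ell.
```

Because \({\mathscr {G}}\) is finite, the maximum over H of the pointwise errors is itself a maximum of finitely many almost-surely vanishing terms, which is exactly the step in the proof above.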
Proof of Theorem 3
Recall that a sequence of random elements \({\mathbf {X}} = ({\mathbf {X}}_t, t \ge 1)\) is a strong mixing sequence if it fulfills the following condition. For \( 1 \le j < \ell \le \infty \), let \({\mathscr {F}}_j^\ell \) denote the \( \sigma \)-field of events generated by the random elements \({\mathbf {X}}_k,\ j \le k \le \ell \ (k \in \mathbf{N})\). For any two \( \sigma \)-fields \( {\mathscr {A}}\) and \( {\mathscr {B}}\), define
For the given random sequence \({\mathbf {X}}\), for any positive integer n, define the dependence coefficient
The random sequence \({\mathbf {X}}\) is said to be “strongly mixing,” or “\( \alpha \)-mixing,” if \( \alpha (n) \rightarrow 0\) as \( n \rightarrow \infty \). This condition was introduced by Rosenblatt (1956). By assumption, we have that the sequence of random networks \(\{{{\mathbf {G}}}_t: t\ge 1 \}\) is a strongly mixing sequence. In order to prove the theorem, we use the following result (see, for instance, Peligrad 1986).
Theorem 5
Let \(\{{\mathbf {X}}_t: t \ge 1\}\) be a strictly stationary centered \(\alpha \)-mixing sequence, and let \({\mathbf {S}}_\ell = \sum _{t=1}^\ell {\mathbf {X}}_t\). Assume that for some \(C>0\)
Then,
is absolutely summable. If in addition \(\sigma ^2 >0\), then \({\mathbf {S}}_\ell /(\sigma \sqrt{\ell })\) converges weakly to a standard normal distribution.
First observe that
where \(\{{\mathbf {W}}_t: t \ge 1 \}\) is a strictly stationary, bounded, centered \(\alpha \)-mixing sequence, fulfilling \(\sum _{n=1}^\infty \alpha (n) < \infty \). On the other hand, we have that
and the result follows from Theorem 5. \(\square \)
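Theorem 5 concerns general strictly stationary \(\alpha \)-mixing sequences. As a toy sanity check (unrelated to the paper's network data), the sketch below simulates a stationary two-state Markov chain, which is geometrically \(\alpha \)-mixing, and compares the variance of the standardized sums \({\mathbf {S}}_\ell /\sqrt{\ell }\) with the long-run variance \(\sigma ^2\):

```python
import numpy as np

rng = np.random.default_rng(7)
lam = 0.4                      # second eigenvalue; flip probability = (1 - lam)/2 = 0.3
ell, reps = 500, 2000

def chain(length):
    # Stationary two-state Markov chain on {0, 1}: the state flips w.p. 0.3.
    flips = rng.random(length - 1) < 0.3
    x = np.empty(length, dtype=int)
    x[0] = rng.integers(2)                      # stationary start: uniform on {0, 1}
    x[1:] = x[0] ^ (np.cumsum(flips) % 2)       # state = start XOR parity of flips
    return x

# Centered standardized sums S_ell / sqrt(ell) over independent replications.
sums = np.array([np.sum(chain(ell) - 0.5) / np.sqrt(ell) for _ in range(reps)])

# Long-run variance: var(X_0) (1 + 2 sum_k lam^k) = 0.25 (1 + 2 lam/(1 - lam)).
sigma2 = 0.25 * (1 + 2 * lam / (1 - lam))       # about 0.583
```

The empirical variance of `sums` is close to `sigma2`, and their histogram is approximately Gaussian, as Theorem 5 predicts.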
1.4 Characterization of principal components
Proof of Proposition 3
Note that
We first consider the case when the network Q has only one link \((k,\ell )\), and find within this family the one that maximizes the objective function. Next we prove that for any other network Q the objective function is bounded by the maximum restricted to the former family. Finally, we show that the principal component space is generated by the one link networks.
Let \(Q_1\) be such that \(A(Q_1)_{k_1\ell _1}=1\), and 0 otherwise. Also, let \({\mathscr {G}}_{Q_1}^+ = \{G\in {\mathscr {G}}: A(G)_{k_1\ell _1} = 1\}\) and \({\mathscr {G}}_{Q_1}^- = \{G\in {\mathscr {G}}: A(G)_{k_1\ell _1} = 0 \}\). When we search within the one link networks, the objective function reduces to
and the solution is the one link graph for which \({{\mathbb {P}}}\left( {{\mathbf {A}}}_{k_1\ell _1}=1 \right) \) is closest to \(1/2\).
In the general case, we want to find Q that maximizes
where \(w_{ij}= c_{ij} / \sum _{(p,q)\in Q} c_{pq}\). Now, since the weights \(w_{ij}\) add to one, we have
which corresponds to the one link optimum.
If there exists a unique one link graph \(Q_1\) for which \({{\mathbb {P}}}\left( {{\mathbf {A}}}_{k_1\ell _1}=1 \right) \) is closest to \(1/2\), then the principal component space is generated just by \(Q_1\), i.e., \({{\mathscr {S}}}_1 ={\mathscr {G}}_{Q_1}^+\). If there exist multiple one link graphs \(Q_1, Q_2, \ldots , Q_p\) that minimize \(|{{\mathbb {P}}}\left( {{\mathbf {A}}}_{k\ell }=1 \right) -1/2|\), then the principal component space is \({{\mathscr {S}}}_1 = \cup _{i=1}^p {\mathscr {G}}_{Q_i}^+\). The second principal component space is characterized in the same way, except that the variance is maximized over \(\{G\in {\mathscr {G}}: G \notin {{\mathscr {S}}}_1 \}\). The remaining components are analogous; for example, the kth principal component space maximizes the variance over \(\{G\in {\mathscr {G}}: G \notin {{\mathscr {S}}}_1, G \notin {{\mathscr {S}}}_2, \ldots , G \notin {{\mathscr {S}}}_{k-1} \}.\) \(\square \)
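The characterization in Proposition 3 reduces the first principal component to a search over single links. The sketch below (illustrative; the function name `first_component_links` is hypothetical) finds, from a sample of adjacency matrices, the link(s) whose empirical edge probability is closest to \(1/2\):

```python
import numpy as np

def first_component_links(adjacency_sample, tol=1e-12):
    """One-link graphs generating the first principal component set.

    Returns the link(s) (i, j), i < j, whose empirical edge probability
    is closest to 1/2, following the characterization in Proposition 3.
    Ties (within tol) are all returned, giving the union of half-spaces.
    """
    freq = np.asarray(adjacency_sample, dtype=float).mean(axis=0)
    m = freq.shape[0]
    links = [(i, j) for i in range(m) for j in range(i + 1, m)]
    gaps = {lk: abs(freq[lk] - 0.5) for lk in links}
    best = min(gaps.values())
    return [lk for lk in links if gaps[lk] <= best + tol]

# Example: edge (0, 1) appears half the time, edge (1, 2) always.
sample = [np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]]),
          np.array([[0, 0, 0], [0, 0, 1], [0, 1, 0]])]
links = first_component_links(sample)   # edge (0, 1) is the most balanced
```

Edge (0, 1) has empirical probability exactly \(1/2\), so it alone generates \({{\mathscr {S}}}_1\) in this example.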
1.5 Consistency of principal components
Proof of Proposition 4
For each \(Q \in {\mathscr {G}}\), the ergodic theorem gives the following.
(a) \(\displaystyle \varLambda _{\ell } (Q) = \frac{1}{\ell } \sum _{k=1}^{\ell } \vert G_k \wedge Q \vert \rightarrow {{\mathbb {E}}}\left( \vert {{\mathbf {G}}}\wedge Q\vert \right) \), almost surely as \(\ell \rightarrow \infty \).
(b) The empirical variance converges almost surely to
$$\begin{aligned} {{\mathbb {E}}}\left( \frac{\vert {{\mathbf {G}}}\wedge Q \vert ^2}{\vert Q \vert ^2} \right) + {{\mathbb {E}}}\left( \frac{\vert {{\mathbf {G}}}\wedge Q \vert }{\vert Q \vert } \right) ^2 - 2\, {{\mathbb {E}}}\left( \frac{\vert {{\mathbf {G}}}\wedge Q \vert }{\vert Q \vert } \right) ^2 = {\text {var}}\left( \frac{\vert {{\mathbf {G}}}\wedge Q \vert }{\vert Q \vert }\right) . \end{aligned}$$(11)
(c) Since the space \({\mathscr {G}}\) is finite, expression (11) entails that \(\hat{{\mathscr {Q}}}_1 \rightarrow {\mathscr {Q}}_1\) almost surely, i.e., \( \hat{{\mathscr {Q}}}_1 = {\mathscr {Q}}_1\) for \(\ell \) large enough almost surely, which entails that the principal components converge because the geodesics eventually coincide.
For the next principal component, the proof is analogous. \(\square \)
Fraiman, D., Fraiman, N. & Fraiman, R. Nonparametric statistics of dynamic networks with distinguishable nodes. TEST 26, 546–573 (2017). https://doi.org/10.1007/s11749-017-0524-8