Coverage centralities for temporal networks

The structure of real networked systems, such as social relationships, can be modeled as temporal networks in which each edge appears only at a prescribed time. Understanding the structure of temporal networks requires quantifying the importance of a temporal vertex, which is a pair of a vertex index and a time. In this paper, we define two centrality measures of a temporal vertex based on the fastest temporal paths that use the temporal vertex. The definition is free from parameters and robust against changes in the time scale on which we focus. In addition, we can efficiently compute these centrality values for all temporal vertices. Using the two centrality measures, we reveal that the distributions of these centrality values in real-world temporal networks are heterogeneous. For various datasets, we also demonstrate that a majority of the highly central temporal vertices are located within a narrow time window around a particular time. In other words, there is a bottleneck time at which most information sent in the temporal network passes through a small number of temporal vertices, which suggests an important role of these temporal vertices in spreading phenomena.


Introduction
Complex networks such as social networks, information networks, and biological networks have been intensively studied in the past decade to understand their behavior under certain dynamics and develop efficient algorithms for them. See [1][2][3][4] for extensive surveys.
However, many real-world networks are actually temporal networks [5,6], in which a vertex communicates with another vertex at a specific time over a finite duration. For example, social interaction between individuals, passenger flow between cities, and synaptic transmission between neurons can be represented as temporal networks. When we assume that the focal dynamical processes on networks, such as information propagation, occur on a time scale comparable to the change in network structure, a temporal-network representation gives us a precise way to capture the processes. We can describe the advantage of working with a temporal network using the example shown in Fig. 1. This temporal network consists of four vertices and eight edges, each of which is labeled with the time at which it appears. Let us assume that it takes unit time to send the information from the tail to the head of an edge. For example, suppose that the information starts to propagate from $v_1$ at time 1. Then, it reaches $v_2$ at time 2 through edge $(v_1, v_2)$, waits at $v_2$ till time 3, then reaches $v_3$ at time 4 through edge $(v_2, v_3)$. The information never reaches $v_4$ because the only edge incoming to $v_4$ is $(v_2, v_4)$, which appears at time 1, and $v_2$ does not have the information at that time. However, if we ignore the temporal information and regard the network as a static directed network, we mistakenly reach the conclusion that information in $v_1$ at time 1 can reach $v_4$ because there is a directed path from $v_1$ to $v_4$. Therefore, we cannot dismiss temporal information to properly understand the structure of temporal networks.

a All the authors contributed equally to the work. b e-mail: t takaguchi@nii.ac.jp
An important notion studied to understand the structure of (static) networks is vertex centrality, which measures the importance of a vertex. The following reasons motivate the study of centralities. First, we can use centralities to find important vertices in several applications such as suppressing epidemics [7,8] or maximizing the spread of influence [9]. Second, we can use them to understand the structure of real-world networks by examining the difference between the distributions of the centrality values in such networks and in the randomized networks (e.g., [10,11]). Third, we can examine the validity of generative network models by investigating the distribution of centralities of the generated network (e.g., [12,13]).
Hence, it is natural to study centralities for temporal networks. Since the most fundamental difference between a static network and a temporal network is that the latter involves time, we define the centrality of a vertex at a specific time. To distinguish it from a vertex, we call the pair of a vertex and a time a temporal vertex. In the literature, multiple centrality notions of temporal vertices based on time-respecting paths [5] have been proposed. Examples include the generalizations of the centrality notions to temporal networks, such as betweenness [15][16][17][18], closeness [16,17,19], communicability [20][21][22], efficiency [14], random-walk centrality [23], and win-lose score [24] (see Ref. [25] for a review of some of them). However, each previous centrality notion suffers from at least one of the following two issues:
1. We need to carefully set parameter values and (or) the time interval within which we consider time-respecting paths.
2. It is inefficient to compute the centrality.
For the first issue, the time interval length especially requires careful tuning; if the time interval is too wide, then the centrality of a temporal vertex $\mathbf{v}$ becomes negligible because most of the paths finish before or start after $\mathbf{v}$ appears. By contrast, if the time interval is too narrow, again the centrality of $\mathbf{v}$ becomes negligible because paths can pass by only a tiny fraction of vertices in the time interval. For the second issue, even if we settle for an approximation, computing the approximate centrality value of a single temporal vertex requires computational time at least linear in the network size [26].
In this paper, we propose two novel centrality notions for temporal networks that resolve these issues. The first one, called temporal coverage centrality (TCC), measures the fraction of pairs of (normal) vertices that can use the temporal vertex when sending information as quickly as possible. The second one, called temporal boundary coverage centrality (TBCC), measures the fraction of pairs of vertices that should use the temporal vertex when sending information as quickly as possible.
Our centrality notions address the two issues described above in the following way. For the first issue, TCC and TBCC do not require setting any parameters or time interval. To calculate the TCC or TBCC value of a temporal vertex $\mathbf{v} = (v, \tau)$, we only have to run over all pairs of vertices $(u, w)$. Namely, we consider temporal vertices $\mathbf{u} = (u, \tau_u)$ and $\mathbf{w} = (w, \tau_w)$, where $\tau_u$ is the latest time at which we can send information from $u$ so that it reaches $v$ at time $\tau$, and $\tau_w$ is the earliest time at which we can receive information at $w$ that is sent from $v$ at time $\tau$. It should be noted that, if we fix the focal temporal vertex $\mathbf{v}$, $\tau_u$ and $\tau_w$ are uniquely determined by $u$ and $w$, respectively, and that we thus do not have to care about the time interval around $\mathbf{v}$. Then, we check whether the information sent from $\mathbf{u} = (u, \tau_u)$ to $\mathbf{w} = (w, \tau_w)$ can or should drop by $\mathbf{v}$.
For the second issue, although the definitions of TCC and TBCC might look complicated and hard to compute, this is not the case. Indeed, computing TCC and TBCC can be reduced to the problem of deciding whether or not there is a directed path between queried vertices in an associated directed network (see Section 2.2 for details). The latter problem is well studied in the database community [38][39][40][41][42], and it can be solved by constructing an index of the directed network, which computes the reachability between any pair of nodes by using information of the reachability between a fraction of node pairs. If it suffices to use approximations to the TCC and TBCC values, we only need to query the index at most $O(\log^2 N)$ times, where $N$ is the total number of vertices in the network (see Appendix A). Since we can efficiently process queries to the index in practice, this method is advantageous compared to the $O(N)$ time for approximating previous centrality notions.
With the aid of our centrality notions, we are able to compute the centrality of all temporal vertices in a temporal network and analyze the statistics of the whole network. Using TBCC, we demonstrate that real-world temporal networks have a small number of temporal vertices without which information propagates more slowly. Surprisingly, we reveal that the temporal vertices of large centrality values form a narrow time region, and this time region seemingly corresponds to the beginning or the end of a time interval in which temporal edges occur in a bursty manner. In addition, by using TCC, we show that the remaining part of the temporal network is highly redundant in the sense that there are many ways to send information as quickly as possible. Although these properties are recognized in the network science community [28][29][30], we quantitatively confirm them for the first time using our centrality notions. We also demonstrate that removing temporal vertices according to their TBCC values is effective for hindering the propagation of information, both by delaying it and by stopping it.
The paper is organized as follows. In Section 2, we introduce basic notions of temporal networks and the directed network associated with a temporal network. Section 3 introduces our centrality notions for temporal vertices, and Section 4 explains detailed methods of computing our centrality notions. Section 5 is dedicated to demonstrating our experimental results. We give the conclusion in Section 6.

Preliminaries about temporal networks

Basic notions
We introduce the terminology and symbols to describe temporal network structure, which basically follow those used in Ref. [31].
For integer $k$, let $[k]$ denote the set $\{1, 2, \ldots, k\}$. We define $\mathbb{R}_+$ as the set of non-negative real numbers.
Let $V$ be the set of vertices. A temporal edge is represented by a quadruplet $e = (u, v, \tau, \lambda)$, where $u, v \in V$, $\tau \in \mathbb{R}$, and $\lambda \in \mathbb{R}_+$. For temporal edge $e = (u, v, \tau, \lambda)$, we refer to $\tau$, $\lambda$, and $\tau + \lambda$ as the starting time, the duration, and the ending time of $e$, respectively. A temporal network $G = (V, E)$ is a pair consisting of a set of vertices $V$ and a set of temporal edges $E$.
When we study temporal networks, a vertex at a certain time is of interest. Therefore, we define a temporal vertex as a pair of a vertex $v \in V$ and a time $\tau \in \mathbb{R}$. In the following, we always use bold symbols such as $\mathbf{v}$ to denote temporal vertices. For temporal vertex $\mathbf{v} = (v, \tau)$, we denote the time $\tau$ by $\tau(\mathbf{v})$.
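A temporal path $P$ in a temporal network $G = (V, E)$ is defined as an alternating sequence of temporal vertices and edges $P = (\mathbf{v}_1, e_1, \mathbf{v}_2, \ldots, e_{k-1}, \mathbf{v}_k)$ such that, writing $\mathbf{v}_i = (v_i, \tau_i)$, the $i$-th temporal edge $e_i$ is of the form $e_i = (v_i, v_{i+1}, \tau, \lambda)$ with $\tau_i \le \tau$ and $\tau + \lambda \le \tau_{i+1}$. We define the starting time, the duration, and the ending time of $P$ as $\tau_1$, $\tau_k - \tau_1$, and $\tau_k$, respectively. For two temporal vertices $\mathbf{u}$ and $\mathbf{v}$, the relationship $\mathbf{u} \leadsto \mathbf{v}$ indicates that there is a temporal path from $\mathbf{u}$ to $\mathbf{v}$.
We define the earliest arrival time at vertex $w$ when departing from temporal vertex $\mathbf{v}$ as the smallest $\tau \in \mathbb{R}$ such that $\mathbf{v} \leadsto (w, \tau)$, and we denote it by $\tau_{\mathrm{eat}}(\mathbf{v}, w)$. If there is no such $\tau$, we define $\tau_{\mathrm{eat}}(\mathbf{v}, w) = \infty$. Similarly, we define the latest departure time from a vertex $u$ for arriving at $\mathbf{v}$ as the largest $\tau \in \mathbb{R}$ such that $(u, \tau) \leadsto \mathbf{v}$, and we denote it by $\tau_{\mathrm{ldt}}(\mathbf{v}, u)$. If there is no such $\tau$, we define $\tau_{\mathrm{ldt}}(\mathbf{v}, u) = -\infty$. A temporal shortest path from temporal vertex $\mathbf{v}$ to vertex $w$ is a temporal path from $\mathbf{v}$ to $(w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$, and a temporal shortest path from a vertex $u$ to a temporal vertex $\mathbf{v}$ is a temporal path from $(u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$ to $\mathbf{v}$.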
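To make the two quantities concrete, the following is a minimal sketch of how the earliest arrival time and the latest departure time could be computed directly from a list of temporal edges. It is not the method used in this paper (which relies on a reachability oracle, see Section 4); it is a plain Dijkstra-style search, and the function names `earliest_arrival` and `latest_departure` are hypothetical. The example uses only the three edges of Fig. 1 that are spelled out in the Introduction; the other five edges of the figure are not reproduced.

```python
import heapq
from collections import defaultdict


def earliest_arrival(edges, source, t0):
    """Return {w: tau_eat((source, t0), w)} for every reachable vertex w.

    edges: list of temporal edges (u, v, tau, lam) with lam >= 0.
    Unreachable vertices are absent (their earliest arrival time is infinity).
    """
    out = defaultdict(list)
    for u, v, tau, lam in edges:
        out[u].append((tau, lam, v))
    best = {source: t0}
    heap = [(t0, source)]
    while heap:
        t, u = heapq.heappop(heap)
        if t > best.get(u, float("inf")):
            continue  # stale heap entry
        for tau, lam, v in out[u]:
            # The edge can be used only if it starts after we are at u.
            if tau >= t and tau + lam < best.get(v, float("inf")):
                best[v] = tau + lam
                heapq.heappush(heap, (best[v], v))
    return best


def latest_departure(edges, target, t0):
    """Return {u: tau_ldt((target, t0), u)} for every vertex u that can reach it."""
    inc = defaultdict(list)
    for u, v, tau, lam in edges:
        inc[v].append((tau, lam, u))
    best = {target: t0}
    heap = [(-t0, target)]  # max-heap via negated times
    while heap:
        neg_t, v = heapq.heappop(heap)
        t = -neg_t
        if t < best.get(v, float("-inf")):
            continue  # stale heap entry
        for tau, lam, u in inc[v]:
            # The edge must end no later than the time we need to be at v.
            if tau + lam <= t and tau > best.get(u, float("-inf")):
                best[u] = tau
                heapq.heappush(heap, (-tau, u))
    return best


# Three edges described in the Introduction, each with unit duration.
edges = [("v1", "v2", 1, 1), ("v2", "v3", 3, 1), ("v2", "v4", 1, 1)]
print(earliest_arrival(edges, "v1", 1))   # v4 is missing: it is never reached
print(latest_departure(edges, "v3", 4))
```

Starting from $(v_1, 1)$, the sketch reaches $v_2$ at time 2 and $v_3$ at time 4 but never $v_4$, in line with the discussion of Fig. 1.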

Directed acyclic graph representation
A directed acyclic graph (DAG) is a directed network with no directed cycle. In this section, we describe the DAG representation of a temporal network, which is useful when solving problems related to temporal paths and describing the centrality notions we will introduce in Section 3. This DAG representation and its variants have been considered in the analysis of temporal networks [17,[32][33][34][35][36].
For temporal network $G = (V, E)$, the DAG representation of $G$, denoted by $\tilde{G} = (\tilde{V}, \tilde{E})$, is constructed as follows. A vertex in $\tilde{G}$ represents a temporal vertex in $G$. For each $v \in V$, we first add to $\tilde{V}$ two vertices corresponding to the temporal vertices $(v, -\infty)$ and $(v, \infty)$. For each temporal edge $(u, v, \tau, \lambda) \in E$, we add to $\tilde{V}$ two vertices corresponding to temporal vertices $\mathbf{u} = (u, \tau)$ and $\mathbf{v} = (v, \tau + \lambda)$ (if they do not exist in $\tilde{V}$) and add edge $(\mathbf{u}, \mathbf{v})$ to $\tilde{E}$. Finally, for each pair of temporal vertices $\mathbf{u} = (u, \tau)$ and $\mathbf{u}' = (u, \tau')$ with $\tau < \tau'$ sharing the same vertex $u$, we add edge $(\mathbf{u}, \mathbf{u}')$ to $\tilde{E}$ if there is no temporal vertex of the form $(u, \tau'')$ in $\tilde{V}$ such that $\tau < \tau'' < \tau'$. Figure 2 illustrates the DAG representation $\tilde{G}$ of the temporal network $G$ shown in Fig. 1. The vertex in the $i$-th row and the $j$-th column corresponds to the temporal vertex $(v_i, j)$. For example, since there is temporal edge $(v_1, v_2, 1, 1)$ in $G$, we have an edge from $(v_1, 1)$ to $(v_2, 2)$ in $\tilde{G}$. For the $i$-th row, the leftmost and rightmost vertices correspond to the temporal vertices $(v_i, -\infty)$ and $(v_i, \infty)$, respectively.
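From the construction of the DAG representation, we have the following useful properties:
Lemma 1 Let $G$ be a temporal network. Then, its DAG representation $\tilde{G}$ is a directed acyclic graph.
Proof This is clear as we only add edges of the form $((u, \tau), (v, \tau'))$, where $\tau < \tau'$.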
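Lemma 2 Let $G$ be a temporal network. Suppose that temporal vertices $\mathbf{u}$ and $\mathbf{v}$ have corresponding vertices in $\tilde{G}$. Then, there is a temporal path from $\mathbf{u}$ to $\mathbf{v}$ in $G$ if and only if there is a directed path from $\mathbf{u}$ to $\mathbf{v}$ in $\tilde{G}$.
Proof Suppose that there is a temporal path $P = (\mathbf{v}_1, e_1, \ldots, e_{k-1}, \mathbf{v}_k)$ from $\mathbf{u}$ to $\mathbf{v}$ in $G$. Without loss of generality, we assume that the time of $\mathbf{v}_i$ is equal to the starting time of $e_i$ or the ending time of $e_{i-1}$. Then, every temporal vertex of $P$ has a corresponding vertex in $\tilde{G}$, each temporal edge of $P$ is an edge of $\tilde{G}$, and each waiting period at a vertex corresponds to a chain of edges of $\tilde{G}$ between temporal vertices of that vertex, which gives a directed path from $\mathbf{u}$ to $\mathbf{v}$ in $\tilde{G}$. The converse easily follows from the correspondence explained above.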
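The construction of the DAG representation can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the implementation used in this paper: the function names `dag_representation` and `has_directed_path` are hypothetical, durations are assumed to be positive, and reachability is checked by a plain depth-first search instead of an index. The example again uses only the three edges of Fig. 1 mentioned in the Introduction.

```python
import math


def dag_representation(vertices, edges):
    """Build the DAG representation of a temporal network.

    vertices: iterable of vertex labels.
    edges: iterable of temporal edges (u, v, tau, lam), assuming lam > 0.
    Returns (nodes, arcs); each node is a temporal vertex (vertex, time).
    """
    # Every vertex gets the two sentinel temporal vertices at -inf and +inf.
    times = {v: {-math.inf, math.inf} for v in vertices}
    arcs = set()
    for u, v, tau, lam in edges:
        times[u].add(tau)
        times[v].add(tau + lam)
        arcs.add(((u, tau), (v, tau + lam)))  # the temporal edge itself
    for v, ts in times.items():
        ordered = sorted(ts)
        for a, b in zip(ordered, ordered[1:]):
            arcs.add(((v, a), (v, b)))  # waiting at v between consecutive times
    nodes = {(v, t) for v, ts in times.items() for t in ts}
    return nodes, arcs


def has_directed_path(arcs, src, dst):
    """Depth-first search for a directed path from src to dst."""
    succ = {}
    for a, b in arcs:
        succ.setdefault(a, []).append(b)
    stack, seen = [src], {src}
    while stack:
        x = stack.pop()
        if x == dst:
            return True
        for y in succ.get(x, []):
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False


vertices = ["v1", "v2", "v3", "v4"]
edges = [("v1", "v2", 1, 1), ("v2", "v3", 3, 1), ("v2", "v4", 1, 1)]
nodes, arcs = dag_representation(vertices, edges)
print(has_directed_path(arcs, ("v1", 1), ("v3", 4)))
print(has_directed_path(arcs, ("v1", 1), ("v4", 2)))
```

In line with Lemma 2, the directed path from $(v_1, 1)$ to $(v_3, 4)$ exists, whereas $(v_4, 2)$ cannot be reached from $(v_1, 1)$.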

Temporal coverage centralities
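In this section, we introduce the temporal coverage centrality and the temporal boundary coverage centrality.

Algorithm 1 (The TCC value of $\mathbf{v}$)
1: $r \leftarrow 0$.
2: for each $(u, w) \in V \times V$ do
3: $\mathbf{u} \leftarrow (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$.
4: $\mathbf{w} \leftarrow (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$.
5: if $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau(\mathbf{w})$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau(\mathbf{u})$ then
6: $r \leftarrow r + 1$.
7: return $r / |V|^2$.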

Temporal coverage centrality
Before defining TCC, we define the notion of coverage in temporal networks by generalizing its original version in static networks [37] as follows. Let $\mathbf{v}$ be a temporal vertex and $u, w$ be vertices. Let $\mathbf{u} = (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$ and $\mathbf{w} = (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$. Then, we say that $\mathbf{v}$ covers node pair $(u, w)$ if the following two conditions hold:
1. $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau(\mathbf{w})$, and
2. $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau(\mathbf{u})$.
In words, the earliest arrival time at $w$ when departing from $\mathbf{u}$ does not change even if we drop by $\mathbf{v}$ (condition 1), and the latest departure time from $u$ for arriving at $\mathbf{w}$ does not change even if we drop by $\mathbf{v}$ (condition 2). Figure 3 explains condition 1. Let us focus on $\mathbf{v} = (v_1, 7)$. Then, temporal vertices $\mathbf{u} = (v_4, \tau_{\mathrm{ldt}}(\mathbf{v}, v_4)) = (v_4, 4)$ and $\mathbf{w} = (v_2, \tau_{\mathrm{eat}}(\mathbf{v}, v_2)) = (v_2, 9)$ are determined as shown in the figure. We observe that, if we depart from $\mathbf{u}$ and are not forced to drop by $\mathbf{v}$, we can arrive at $\mathbf{w}' = (v_2, 8)$, which is earlier than $\mathbf{w}$. Hence, node pair $(u, w)$ is not covered by $\mathbf{v}$ but by $\mathbf{w}'$.
On the basis of this notion of coverage, the TCC value of $\mathbf{v}$ is defined as the fraction of pairs $(u, w) \in V \times V$ that are covered by $\mathbf{v}$. By definition, the TCC value of a temporal vertex takes a real number in $[0, 1]$. If the TCC value is close to unity, the temporal vertex is said to be central in the sense that it covers many pairs of nodes. The formal definition is given in Algorithm 1 in an algorithmic manner.

Temporal boundary coverage centrality
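Algorithm 2 (The TBCC value of $\mathbf{v}$)
1: $r \leftarrow 0$.
2: for each $(u, w) \in V \times V$ do
3: $\mathbf{u} \leftarrow (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$.
4: $\mathbf{w} \leftarrow (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$.
5: if $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau(\mathbf{w})$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau(\mathbf{u})$ then
6: if $\tau_{\mathrm{eat}}(\mathbf{u}, v) = \tau$ or $\tau_{\mathrm{ldt}}(\mathbf{w}, v) = \tau$ then
7: $r \leftarrow r + 1$.
8: return $r / |V|^2$.

Let $\mathbf{v} = (v, \tau)$ be a temporal vertex and $u, w$ be vertices. Let $\mathbf{u} = (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$ and $\mathbf{w} = (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$. Even if the TCC value of $\mathbf{v}$ is large, it does not always imply that removing the temporal edges involving $\mathbf{v}$ makes $\tau_{\mathrm{eat}}(\mathbf{u}, w)$ larger or $\tau_{\mathrm{ldt}}(\mathbf{w}, u)$ smaller. One particular reason for this is that sometimes we can reach $v$ from $\mathbf{u}$ earlier than $\tau$ and can leave $v$ later than $\tau$ to reach $\mathbf{w}$ (see temporal vertices $\mathbf{v}_2$ and $\mathbf{v}_3$ in Fig. 4). In some applications, we may want to regard such $\mathbf{v}$ as unimportant.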
To address this issue, we define TBCC by imposing an additional criterion on the notion of coverage as follows. Note that, if the focal temporal vertex $\mathbf{v}$ is an example of the situation stated in the previous paragraph, then $\tau_{\mathrm{eat}}(\mathbf{u}, v) < \tau$ or $\tau_{\mathrm{ldt}}(\mathbf{w}, v) > \tau$ should hold. Hence, we define that a pair $(u, w)$ of vertices is covered at a boundary by temporal vertex $\mathbf{v}$ if the following hold:
1. $(u, w)$ is covered by $\mathbf{v}$, and
2. $\tau_{\mathrm{eat}}(\mathbf{u}, v) = \tau$ or $\tau_{\mathrm{ldt}}(\mathbf{w}, v) = \tau$.
We explain this definition using the example shown in Fig. 4. Let $\mathbf{v}_i = (v, \tau_i)$ for $i \in [4]$. Note that $\mathbf{u} = (u, \tau_{\mathrm{ldt}}(\mathbf{v}_i, u))$ and $\mathbf{w} = (w, \tau_{\mathrm{eat}}(\mathbf{v}_i, w))$ coincide for all $i \in [4]$. In addition, note that all $\mathbf{v}_i$ cover $(u, w)$. We can see that $\mathbf{v}_1$ and $\mathbf{v}_4$ cover $(u, w)$ at the boundary because $\tau_{\mathrm{eat}}(\mathbf{u}, v) = \tau_1$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, v) = \tau_4$. By contrast, $\mathbf{v}_2$ and $\mathbf{v}_3$ do not cover $(u, w)$ at the boundary.
On the basis of this notion of coverage at the boundary, the TBCC value of $\mathbf{v}$ is defined as the fraction of pairs $(u, w)$ that are covered at the boundary by $\mathbf{v}$. Similar to TCC, the TBCC value of a temporal vertex takes a real number in $[0, 1]$ by definition. The formal definition is given in Algorithm 2 in an algorithmic manner.
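The two definitions of this section can be checked by brute force on a toy network. The sketch below is only an illustration: the helper names (`eat`, `ldt`, `coverage`, `tcc_and_tbcc`) are hypothetical, the earliest arrival and latest departure times are obtained by naive repeated relaxation rather than by the reachability oracle of Section 4, and, as an assumption made here, pairs $(u, w)$ for which $u$ cannot reach $\mathbf{v}$ or $\mathbf{v}$ cannot reach $w$ are counted as not covered. The toy network is a path $a \to b \to c$ whose first edge appears at time 1 and whose second edge appears at time 5, which mimics the situation of Fig. 4 with a single interior time.

```python
import math


def eat(edges, src, t0):
    """Earliest arrival times from temporal vertex (src, t0), by relaxation."""
    best = {src: t0}
    changed = True
    while changed:
        changed = False
        for u, v, tau, lam in edges:
            if u in best and tau >= best[u] and tau + lam < best.get(v, math.inf):
                best[v] = tau + lam
                changed = True
    return best


def ldt(edges, dst, t0):
    """Latest departure times for arriving at temporal vertex (dst, t0)."""
    best = {dst: t0}
    changed = True
    while changed:
        changed = False
        for u, v, tau, lam in edges:
            if v in best and tau + lam <= best[v] and tau > best.get(u, -math.inf):
                best[u] = tau
                changed = True
    return best


def coverage(edges, v, tau, u, w):
    """Return (covered, covered_at_boundary) for pair (u, w) and (v, tau)."""
    dep = ldt(edges, v, tau).get(u)  # tau_ldt(v, u); None if it is -infinity
    arr = eat(edges, v, tau).get(w)  # tau_eat(v, w); None if it is +infinity
    if dep is None or arr is None:
        return False, False          # assumption: such pairs are not counted
    from_u = eat(edges, u, dep)      # earliest arrivals from u = (u, dep)
    to_w = ldt(edges, w, arr)        # latest departures towards w = (w, arr)
    covered = from_u.get(w) == arr and to_w.get(u) == dep
    boundary = covered and (from_u.get(v) == tau or to_w.get(v) == tau)
    return covered, boundary


def tcc_and_tbcc(edges, vertices, v, tau):
    """TCC and TBCC values of temporal vertex (v, tau); vertices is a list."""
    pairs = [coverage(edges, v, tau, u, w) for u in vertices for w in vertices]
    n2 = len(vertices) ** 2
    return sum(c for c, _ in pairs) / n2, sum(b for _, b in pairs) / n2


edges = [("a", "b", 1, 1), ("b", "c", 5, 1)]
vertices = ["a", "b", "c"]
for tau in (2, 3, 5):
    print(tau, tcc_and_tbcc(edges, vertices, "b", tau))
```

In this toy network, the interior temporal vertex $(b, 3)$ still covers the pair $(a, c)$ but does not cover it at the boundary, so its TBCC value is smaller than its TCC value, whereas $(b, 2)$ and $(b, 5)$ have equal TCC and TBCC values.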

Computing temporal coverage centralities
We can straightforwardly calculate TCC and TBCC according to Algorithms 1 and 2. In this section, to manage large temporal networks, we give efficient methods for computing TCC and TBCC on the basis of a graph-indexing technique developed recently in the database community [27]. The key idea is in how to speed up the computation of $\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$ in Algorithms 1 and 2. We describe the exact computation of TCC and TBCC in this section, and we also give algorithms to approximate the TCC and TBCC values whose running time is polylogarithmic in the total number of vertices in $G$ (see Appendix A).
In a directed network, we say that a vertex $v_t$ is reachable from $v_s$ if there is a directed path from $v_s$ to $v_t$. In view of Lemma 2, to count the pairs $(u, w)$ covered by $\mathbf{v}$ (at the boundary, if needed), we want to efficiently answer reachability queries in the DAG representation $\tilde{G}$ of the given temporal network $G$. To this end, it is beneficial to construct an index of $\tilde{G}$ that computes the reachability between any pair of nodes on the basis of information of the reachability between a fraction of node pairs. Such an index is often called a reachability oracle in the database community [38][39][40][41][42].
The basic idea of the construction of a reachability oracle for the present problem is the following. Naively, we want to compute a large table that stores the reachability of every pair of temporal vertices. If this were possible, we could answer reachability just by looking at that table. Unfortunately, however, filling in this table requires $O(|\tilde{V}|^2)$ computation time and $O(|\tilde{V}|^2)$ space, which could be prohibitively slow and large. The reachability oracle overcomes this problem by carefully storing partial information of the network. Based on this information, it efficiently computes the reachability for the whole network.
For example, the method proposed in Ref. [42], which we will use for the numerical experiments in Section 5, computes a small table for each temporal vertex that stores reachability from (and to) a small number of other certain temporal vertices. Then, we can answer the reachability from a temporal vertex $\mathbf{u}$ to a temporal vertex $\mathbf{v}$ by checking whether there is another temporal vertex $\mathbf{w}$ such that we can confirm the reachability from $\mathbf{u}$ to $\mathbf{w}$ and from $\mathbf{w}$ to $\mathbf{v}$ using the small tables of $\mathbf{u}$ and $\mathbf{v}$. If there is such $\mathbf{w}$, we indeed have a directed path from $\mathbf{u}$ to $\mathbf{v}$. The challenging part of the construction lies in guaranteeing the other direction; if there is a directed path from $\mathbf{u}$ to $\mathbf{v}$, then there is always such $\mathbf{w}$. In addition, we need to be able to compute the small table for each vertex efficiently. This method resolves these issues, so that it can handle directed networks of millions of edges with a query time of less than a microsecond on average (see Ref. [42] for further technical details).
With the aid of the reachability oracle, we can efficiently compute $\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$:
Lemma 3 Let $G$ be a temporal network and $\tilde{G}$ be its DAG representation. We can compute $\tau_{\mathrm{eat}}$ and $\tau_{\mathrm{ldt}}$ with $O(\log |E|)$ queries to the reachability oracle of $\tilde{G}$.
Proof We only consider $\tau_{\mathrm{eat}}$, as $\tau_{\mathrm{ldt}}$ can be computed similarly. Given temporal vertex $\mathbf{v}$ and vertex $w$, $\tau_{\mathrm{eat}}(\mathbf{v}, w)$ is the minimum $\tau \in \mathbb{R}$ such that there is a temporal path from $\mathbf{v}$ to $(w, \tau)$. To find such $\tau$, we perform a binary search over the candidate times, querying the reachability oracle at each step.

Table 1. Basic statistics of the datasets. Variables $n$, $m$, $\tilde{n}$, and $\tau_{\max}$ are the total number of vertices and temporal edges in $G$, the total number of vertices in $\tilde{G}$, and the maximum ending time of a temporal edge, respectively. The datasets are arranged in increasing order of $m$.
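The binary search in the proof of Lemma 3 can be sketched as follows. The search works because reachability is monotone along the timeline of $w$ in the DAG representation: once $(w, \tau)$ is reachable, every later temporal vertex of $w$ is reachable too. In this sketch, a plain depth-first search stands in for the reachability oracle (it is not the index of Ref. [42]), only finite-time nodes are built, and all function names are hypothetical.

```python
import math
from collections import defaultdict


def build_dag(edges):
    """Adjacency sets of the DAG representation, finite-time nodes only."""
    times = defaultdict(set)
    adj = defaultdict(set)
    for u, v, tau, lam in edges:
        times[u].add(tau)
        times[v].add(tau + lam)
        adj[(u, tau)].add((v, tau + lam))
    for v, ts in times.items():
        ordered = sorted(ts)
        for a, b in zip(ordered, ordered[1:]):
            adj[(v, a)].add((v, b))  # waiting edges along the timeline of v
    return adj, {v: sorted(ts) for v, ts in times.items()}


def make_oracle(adj):
    """Stand-in reachability oracle: plain DFS that also counts its queries."""
    calls = {"n": 0}

    def reach(src, dst):
        calls["n"] += 1
        stack, seen = [src], {src}
        while stack:
            x = stack.pop()
            if x == dst:
                return True
            for y in adj.get(x, ()):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        return False

    return reach, calls


def eat_by_binary_search(reach, v, w, times_of_w):
    """Smallest time t in times_of_w (sorted) with reach(v, (w, t)), else inf."""
    if not times_of_w or not reach(v, (w, times_of_w[-1])):
        return math.inf
    lo, hi = 0, len(times_of_w) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if reach(v, (w, times_of_w[mid])):
            hi = mid       # reachable: the answer is at mid or earlier
        else:
            lo = mid + 1   # not reachable: the answer is strictly later
    return times_of_w[lo]


edges = [("v1", "v2", 1, 1), ("v2", "v3", 3, 1), ("v2", "v4", 1, 1)]
adj, times = build_dag(edges)
reach, calls = make_oracle(adj)
print(eat_by_binary_search(reach, ("v1", 1), "v2", times["v2"]))
print(calls["n"])  # number of oracle queries used for this lookup
```

With a genuine reachability index in place of the depth-first search, each call to `reach` would be answered from the index, and only the logarithmic number of calls would matter.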

Results
The basic statistics of the datasets we use are summarized in Table 1. It should be noted that we do not use the actual time stamps in the datasets but define $\tau$ by the order of unique values of the time stamps. For example, if the dataset consists of two time stamps $t = 1, 4$, we translate them into $\tau = 1, 2$. Although interactions in Irvine and Email are directed (i.e., from sender to receiver(s) of messages), we regard them as undirected.
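The translation of raw time stamps into their ranks among the unique values can be written in a few lines. This is a minimal sketch; the function name `rank_timestamps` is hypothetical.

```python
def rank_timestamps(stamps):
    """Replace each time stamp by its 1-based rank among the unique values."""
    order = {t: i + 1 for i, t in enumerate(sorted(set(stamps)))}
    return [order[t] for t in stamps]


print(rank_timestamps([1, 4]))          # the example from the text
print(rank_timestamps([10, 3, 10, 7]))  # repeated stamps share a rank
```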

Statistics of TCC and TBCC
Figure 5 depicts the rank plots of the TCC and TBCC values of temporal vertices in decreasing order. In all the datasets, at least 10% of temporal vertices have TCC values larger than 0.1 (Fig. 5(a)). This fact implies the redundancy of temporal networks in the sense that, when information flows between temporal vertices, it can drop by different vertices without increasing the total duration of the temporal paths. However, there are a smaller number of temporal vertices with large TBCC values (Fig. 5(b)). This fact also implies the redundancy of temporal networks in a different sense such that, when information flows between temporal vertices, it is not forced to exist at a certain vertex at a certain time.
To see the impact of the structural peculiarity of temporal networks on these distributions, we computed the centrality values of temporal vertices in randomized temporal networks. We randomize an original temporal network by replacing the two ends of each temporal edge by vertices chosen uniformly at random (similar to the procedure called randomized edges with randomly permuted times in Ref. [5]). The resultant centrality values are shown in Fig. 6. We notice that more temporal vertices have sufficiently large centrality values (e.g., larger than 0.1) in real-world temporal networks (Fig. 5) than in randomized temporal networks (Fig. 6). The maximum centrality values are larger in the randomized than in the original networks for HT09 and Hospital, and vice versa for Infectious and Email. This fact implies that the way the flow concentrates upon temporal vertices depends on each dataset.

Next, we examine how the centrality values change over time owing to the structural transformation of the temporal networks. Figure 7 depicts the change in the maximum TCC and TBCC values over the temporal vertices present at each time, together with the number of temporal vertices present at each time, for Infectious and Hospital. In both datasets shown in Fig. 7, we can see some periodic patterns in the number of temporal vertices. However, the maximum centrality values are not much affected by the patterns, which implies that these values are determined not by the mere activity level in the networks but by the structure of the temporal network. In addition, the fact that the maximum centrality values vary considerably throughout the observation periods suggests that we should carefully incorporate temporal structure to assess the importance of vertices. Generally, the maximum TCC values are larger than the maximum TBCC values, which makes sense according to their definitions (i.e., TBCC only counts the coverage of the temporal paths at the boundary, whereas TCC does not impose this boundary criterion).
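The randomization used for Fig. 6 earlier in this subsection (replacing both ends of each temporal edge by vertices chosen uniformly at random while keeping its time) can be sketched as follows. The function name `randomize_endpoints` is hypothetical, and drawing two distinct endpoints per edge is an assumption made here; the text does not say whether self-loops are excluded.

```python
import random


def randomize_endpoints(edges, vertices, seed=None):
    """Rewire every temporal edge to random endpoints, keeping its times.

    edges: list of temporal edges (u, v, tau, lam).
    vertices: collection of at least two vertex labels.
    """
    rng = random.Random(seed)
    verts = list(vertices)
    randomized = []
    for _, _, tau, lam in edges:
        # Assumption: the two endpoints are distinct (no self-loops).
        u, v = rng.sample(verts, 2)
        randomized.append((u, v, tau, lam))
    return randomized


edges = [("a", "b", 1, 1), ("b", "c", 5, 1), ("a", "c", 7, 1)]
print(randomize_endpoints(edges, ["a", "b", "c", "d"], seed=0))
```

The starting time and duration of every temporal edge are preserved, so the overall activity pattern over time is unchanged while the pairing of vertices is destroyed.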
When we focus on a particular vertex, its two centrality values also vary over time in different manners. Figure 8 depicts the change in the TCC and TBCC values of the vertex that is involved in the largest number of temporal edges in each of the two datasets, Infectious and Hospital. The TCC value of the vertex increases with time in Infectious (Fig. 8(a)), simply because the number of present temporal vertices increases and thus the focal vertex can reach these vertices in this period (also see Fig. 7(a)). By contrast, the TBCC value does not exhibit such an increasing trend. This fact supports our original purpose of introducing TBCC, i.e., to discount the centrality values of the temporal vertices of the dispensable temporal paths. In addition, the plot of TBCC unveils that even the vertex with the largest number of temporal edges does not always bridge effective temporal paths. In Hospital (Fig. 8(b)), we can observe that the temporal edges associated with the focal vertex are partitioned into five time intervals, in each of which temporal edges occur in a bursty manner, and the centrality values of the vertex become larger at the beginning and the end of each of these time intervals. This observation makes sense because, at the endpoints of a time interval, a vertex tends to play the role of a gateway for information flowing into or out of the time interval.
The computational efficiency of the two centralities enables us to draw a map of the centrality values of all the temporal vertices over time. This map reveals the existence of bottleneck time regions in the empirical temporal networks. Figures 9(a) and 9(b) depict the TCC values of temporal vertices as a heat map for Infectious and Hospital, respectively. In both datasets, most temporal vertices have non-negligible TCC values, and these results support the notion of redundancy of temporal networks (see Fig. 5(a)) such that all the vertices can belong to redundant temporal paths. In addition, the temporal vertices with the largest centrality values appear in the middle of the observation period, and the temporal vertices at the same time tend to have similar TCC values. We found the same phenomenon in all the datasets (see Electronic Supplementary Materials for the plots of the other datasets), and the existence of this bottleneck time period seems to be a common property of empirical temporal networks.
If we are interested in when these bottleneck time periods begin and end, we can look at the heat map of the TBCC values. As an example, Fig. 9(c) magnifies a bottleneck time period in Infectious (Fig. 9(a)) in which we observe many temporal vertices with the largest TCC values. However, the boundary of the bottleneck period is not clear in the figure. Figure 9(d) shows the heat map of the TBCC values in the same area as shown in Fig. 9(c). As we observe, the TBCC values indicate the boundaries at $\tau \approx 660$, $680$, and $750$. This boundary information should be meaningful, for example, when we narrow the candidates of the vertices to be vaccinated for epidemic spreading on temporal networks [47][48][49]. We finally stress again that it becomes possible to compute these statistics and analyze the structure of temporal networks in such detail because of the efficient computation of TCC and TBCC using the reachability oracle.

Delay caused by removing a central temporal vertex
In closing this section, to verify the relevance of the proposed centrality notions at the microscopic level, we briefly report that removing a temporal vertex with large TCC and TBCC values is effective in delaying the propagation of information.
Let $G = (V, E)$ be a temporal network, where $V = \{v_1, v_2, \ldots, v_n\}$. For a temporal vertex $\mathbf{v} = (v, \tau)$, let $\mathbf{v}_i = (v_i, \tau_{\mathrm{eat}}(\mathbf{v}, v_i))$ for each $i \in [n]$ and $\tau'$ be the (unique) time such that $\mathbf{v}$ has an edge to $\mathbf{v}' = (v, \tau')$. We say that $\mathbf{v}_i$ gets prolonged by removing $\mathbf{v}$ if $\tau_{\mathrm{eat}}(\mathbf{v}, v_i)$ becomes larger by removing edges incident to $\mathbf{v}$ (and we keep edge $(\mathbf{v}, \mathbf{v}')$). In a similar manner, we say that $\mathbf{v}_i$ becomes disconnected by removing $\mathbf{v}$ if we cannot reach $\mathbf{v}_i$ from $\mathbf{v}$ after removing edges incident to $\mathbf{v}$ (where, again, we keep edge $(\mathbf{v}, \mathbf{v}')$).
We investigate the fraction of prolonged or disconnected temporal vertices among $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ when removing one of the top 100 temporal vertices with respect to the TCC or TBCC values. It should be noted that the fraction of temporal vertices becoming prolonged or disconnected is nontrivial because the definitions of TCC and TBCC take into account temporal paths both before and after the focal temporal vertex. As a baseline for comparison, we also conduct the same test by removing a temporal vertex chosen randomly. For the random case, we randomly choose 100 temporal vertices without replacement and take the average of the fraction of prolonged or disconnected temporal vertices over these 100 trials.
The results of the removal test of temporal vertices are summarized in Table 2 for the five datasets. As we expected, the removals according to the largest centrality values make more temporal vertices prolonged or disconnected than the random removals. The removals according to the largest TCC values tend to prolong a certain fraction of temporal vertices for all the datasets considered. However, they make few temporal vertices disconnected. These outcomes make sense because the number of other temporal paths running alongside the temporal path going through the focal temporal vertex is not considered in TCC (also see Section 3.1). By contrast, the removals according to the largest TBCC values make a considerable fraction of temporal vertices prolonged and disconnected. Remarkably, 50.8% of the temporal vertices, on average, become disconnected from a removed temporal vertex in Irvine. There is no clear distinction between the results of the offline (i.e., Infectious, HT09, and Hospital) and online (i.e., Irvine and Email) networks.

Conclusions
We introduced two centrality notions for temporal networks, temporal coverage centrality and temporal boundary coverage centrality, to represent the importance of a temporal vertex by the fraction of vertex pairs that can or should use the temporal vertex when sending information as quickly as possible. Compared to centrality notions proposed in previous work, TCC and TBCC have two advantages: (i) parameters or time windows do not need to be set, and (ii) the computation time is reasonable. Applying TCC and TBCC to multiple datasets of empirical temporal networks, we revealed that there tend to be particular bottleneck time periods that play a crucial role in propagating information quickly and that the rest of each network is redundant in the sense that there are many temporal paths to send information with the same duration. Although such structural redundancy in temporal networks was suggested in some previous studies [28][29][30], our centrality notions enable us to clearly quantify and visualize this property. We believe that the centrality notions we proposed are useful for further studying the structure of temporal networks and verifying generative models of temporal networks.

Datasets used in the numerical experiments, Infectious, HT09, and Hospital, were originally collected and published by the SocioPatterns collaboration (http://www.sociopatterns.org/). Datasets HT09 and Hospital were downloaded from the SocioPatterns website. Datasets Infectious, Irvine, and Email were downloaded from the Koblenz Network Collection (http://konect.uni-koblenz.de/). The authors thank Dr. James Cheng for valuable discussions. Yuichi Yoshida is supported by JSPS Grant-in-Aid for Young Scientists (B) (No. 26730009), MEXT Grant-in-Aid for Scientific Research on Innovative Areas (24106001), and JST, ERATO, Kawarabayashi Large Graph Project.

A Approximate computation of temporal coverage centralities
By Lemma 4 (see Section 4), the number of queries to the reachability oracle for computing the TCC and TBCC values is (almost) quadratic in the number of vertices of a temporal network. However, in some applications, we may want to compute these centralities faster. Here, we introduce a standard technique that enables us to approximate these centrality values with a sublinear number of queries.
We only explain the case of TCC; the case of TBCC is performed in a similar way.
Algorithm 3 is an approximate method for computing the centrality value. The difference from Algorithm 1 is that, instead of enumerating all pairs $(u, w)$, we only sample $O(1/\epsilon^2)$ pairs of vertices and take the average over them, where $\epsilon$ is the parameter controlling the possible error in the approximation.
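The sampling scheme can be sketched as follows. The sample size $k = \frac{1}{2\epsilon^2}\log(2|V|^2)$ is the one stated in Algorithm 3; by Hoeffding's inequality, it makes the estimate deviate from the true fraction by more than $\epsilon$ with probability at most $1/|V|^2$. Everything else is an assumption made for illustration: the function names are hypothetical, and the coverage test is passed in as a callback `is_covered(u, w)` so that any implementation of the coverage conditions can be plugged in.

```python
import math
import random


def sample_size(num_vertices, eps):
    """k = (1 / (2 eps^2)) * log(2 |V|^2), rounded up to an integer."""
    return math.ceil(math.log(2 * num_vertices ** 2) / (2 * eps ** 2))


def approximate_centrality(vertices, is_covered, eps, seed=None):
    """Estimate the fraction of covered pairs from k uniformly sampled pairs.

    is_covered(u, w) is a user-supplied predicate (hypothetical here) that
    decides whether the focal temporal vertex covers the pair (u, w).
    """
    rng = random.Random(seed)
    verts = list(vertices)
    k = sample_size(len(verts), eps)
    r = 0
    for _ in range(k):
        u, w = rng.choice(verts), rng.choice(verts)  # uniform over V x V
        if is_covered(u, w):
            r += 1
    return r / k


# Toy predicate standing in for a real coverage test: the pair is "covered"
# when u < w, so the exact fraction over 20 vertices is 190 / 400 = 0.475.
print(sample_size(20, 0.05))
print(approximate_centrality(range(20), lambda u, w: u < w, eps=0.05, seed=1))
```

The number of sampled pairs grows only logarithmically with $|V|$, which is what makes the number of oracle queries sublinear.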
Recalling that the query time of the reachability oracle is tiny, we find that the running time of Algorithm 3 can be seen as polylogarithmic in the input size. This is the great advantage of TCC and TBCC over other centrality notions.

Fig. 1. Schematic of an example of a temporal network. The number associated with each edge represents the time at which the edge appears.

Fig. 7. Change in the maximum TCC and TBCC values over the temporal vertices present at each time in (a) Infectious and (b) Hospital. For readability, we smoothed the curves by taking the average over a sliding window with a length of 100 units of time.

Fig. 8. Change in the TCC and TBCC values of the vertex with the largest number of temporal edges. (a) Vertex with label 195 in Infectious and (b) vertex with label 1115 in Hospital.

Fig. 9. Heat maps of the TCC values for (a) Infectious and (b) Hospital. (c) Heat map magnifying the area with $650 \le \tau \le 800$ and $100 \le \mathrm{ID} \le 220$ in (a). (d) Heat map of the TBCC values in the same area as shown in (c).

Fig. S1. Average (solid line) and 10-90% values (shaded areas) of TCC at each time for (a) Infectious, (b) HT09, (c) Hospital, (d) Irvine, and (e) Email. We consider only the temporal vertices involved in temporal edges with other vertices to calculate the statistics. For (d) and (e), we smoothed the curves by taking the average over a sliding window with a length of 100 units of time, because the time resolutions of the observations are so high that there is not a sufficient number of temporal vertices to take the average at most of the time points.
The binary search in the proof of Lemma 3 is carried out using the reachability oracle. Since the number of possible values for $\tau$ is $O(|E|)$, the number of queries is $O(\log |E|)$.
Lemma 4 Let $G$ be a temporal network and $\tilde{G}$ be its DAG representation. For any temporal vertex $\mathbf{v}$, we can compute the TCC and TBCC values of $\mathbf{v}$ with $O(|V|^2 \log |E|)$ queries to the reachability oracle of $\tilde{G}$.

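Table 2. Results of the removal of temporal vertices. The number in each cell represents the average fraction of disconnected (or prolonged) temporal vertices over the 100 trials of the removal based on the given procedure (i.e., according to the largest TCC and TBCC values or random pick).

Algorithm 3 (Approximation to the TCC value of $\mathbf{v}$)
1: $r \leftarrow 0$.
2: for $i = 1$ to $k := \frac{1}{2\epsilon^2} \log(2|V|^2)$ do
3: Sample vertices $u, w \in V$ uniformly.
4: $\mathbf{u} \leftarrow (u, \tau_{\mathrm{ldt}}(\mathbf{v}, u))$.
5: $\mathbf{w} \leftarrow (w, \tau_{\mathrm{eat}}(\mathbf{v}, w))$.
6: if $\tau_{\mathrm{eat}}(\mathbf{u}, w) = \tau(\mathbf{w})$ and $\tau_{\mathrm{ldt}}(\mathbf{w}, u) = \tau(\mathbf{u})$ then
7: $r \leftarrow r + 1$.
8: return $r / k$.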