Computing top-k temporal closeness in temporal networks

The closeness centrality of a vertex in a classical static graph is the reciprocal of the sum of the distances to all other vertices. However, networks are often dynamic and change over time. Temporal distances take these dynamics into account. In this work, we consider the harmonic temporal closeness with respect to the shortest duration distance. We introduce an efficient algorithm for computing the exact top-k temporal closeness values and the corresponding vertices. The algorithm can be generalized to the task of computing all closeness values. Furthermore, we derive heuristic modifications that perform well on real-world data sets and drastically reduce the running times. For the case that edge traversal takes an equal amount of time for all edges, we lift two approximation algorithms to the temporal domain. The algorithms approximate the transitive closure of a temporal graph (which is an essential ingredient for the top-k algorithm) and the temporal closeness for all vertices, respectively, with high probability. We experimentally evaluate all our new approaches on real-world data sets and show that they lead to drastically reduced running times while keeping high quality in many cases. Moreover, we demonstrate that the top-k temporal and static closeness vertex sets differ quite largely in the considered temporal networks.


Introduction
Centrality measures are a cornerstone of social network analyses. One of the most popular and well-researched centrality measures is closeness, first introduced by Bavelas [1]. In a static and undirected graph, the closeness of a vertex is the inverse of the sum of the smallest distances to the other vertices of the network. However, many real-world networks are temporal, e.g., in a social network, persons only interact at specific points in time. Recently, the analyses of dynamic networks, or temporal graphs, have gained increasing attention [5,18,26,32,40]. Here, a temporal network consists of a set of vertices and a set of temporal edges. Each temporal edge is only available at a specific discrete point in time (called the availability time), and edge traversal costs a strictly positive amount of time (called the transition time).
Important examples are human and social contact networks, communication networks, and transportation networks. The transition times in transportation networks correspond to the times needed for spatial movements, e.g., the duration of flights or the travel time of buses between the bus stops. For communication networks, the transition times depend on the type of communication, e.g., email communication uses unit transition times. In a temporal network modeling phone usage, the transition time of an edge may be the duration of a phone call between two users. In the case of unit transition times, a temporal graph can be seen as a sequence of static graphs over a fixed set of vertices and edges that evolve over time. Figure 1a shows a temporal graph, where the edges are labeled with the time stamps of their existence. Figure 1c shows the representation as a sequence of static graphs over time.
The temporal properties direct any possible flow of information in the network. For example, a dissemination process on the network, such as the spread of rumors, fake news, or diseases, has to respect the forward flow of time. Hence, a meaningful adaptation of a conventional path is the time-respecting temporal path. In a temporal path, at each vertex, the time stamp of the next edge of the path cannot be earlier than the arrival time at the vertex [7,18,52]. Modifications of static closeness to temporal closeness using temporal paths and temporal distances have been suggested [11,28,41,45,49,52]. One of these temporal distances is the shortest duration, i.e., the duration of a fastest path between two nodes. Here, we consider the harmonic temporal closeness of a vertex. It is defined as the sum of the reciprocals of the durations of the fastest paths to all other vertices. Furthermore, we define the harmonic temporal in-closeness as the sum of the reciprocals of the durations of the fastest paths arriving at a vertex. Harmonic closeness for non-connected static graphs was introduced in [31]. The reason for using the harmonic variant for temporal closeness and in-closeness is that reachability between vertices, even in an undirected temporal graph, is restricted.
It has been shown that in networks modeling dissemination processes, like the spread of viruses or fake news, a vertex with a high temporal closeness can be expected to be of high importance for the transportation or dissemination process [48,49]. In general, high temporal closeness vertices in temporal networks can differ from high static closeness and high degree vertices. For example, in the temporal graph shown in Fig. 1a, traversing an edge takes one time unit for all edges. Figure 1b shows the aggregated static graph, in which static edges replace the temporal edges. In this static graph, the most central vertex in terms of static closeness is a. It also has the most outgoing edges. However, in the temporal graph, the vertex with the highest temporal closeness is vertex e. Notice that vertex a can only reach its direct neighbors in the temporal graph, while vertex e can reach all other vertices.
Computing the exact temporal closeness values of all vertices can be costly because it demands finding a fastest temporal path between each pair of vertices. However, often knowing the top-k important vertices is sufficient. Our contributions are: 1. We propose an algorithm for computing the top-k harmonic temporal closeness values and the corresponding vertices in a temporal network. This algorithm can be simplified for the task of computing the closeness values of all vertices. The algorithms are based on a new minimum duration path algorithm on temporal graphs.

Fig. 1 Example of (a) an undirected temporal graph, (b) its aggregated static graph, and (c) its interpretation as a sequence of static graphs

[...] temporal graphs. The transitive closure is essential for the top-k algorithm. This approximation speeds up the running time of the top-k algorithm substantially. Moreover, we adapt a stochastic sampling algorithm for approximating the temporal closeness of all vertices directly. 4. We comprehensively evaluate our algorithms using real-world temporal networks. As a baseline, we use a temporal closeness algorithm based on the edge stream fastest path algorithm introduced by Wu et al. [52]. Our approaches decrease the running times for all data sets significantly.

Differences to the conference version
The main differences and additions to the conference version [38] are the following. We added the definition of temporal in-closeness. Our new results show that with the temporal transpose, our algorithms can efficiently compute the temporal in-closeness. Furthermore, we introduce two new heuristics with detailed descriptions and analyses. Our experimental evaluation for the two new heuristics shows better results for both of them compared to the one in the conference version. Finally, we extended our experiments section with a comparison between temporal closeness, static closeness, degree centrality, and reachability.
[...] the k most influential vertices has been widely studied, see, e.g., [3,53,54]. Bergamini et al. [3] present algorithms for computing the top-k closeness in unweighted static graphs. They start a breadth-first search (BFS) from each vertex in order to find the shortest distance to all other vertices. In each BFS run, they calculate an upper bound for the closeness of the current vertex. If the current vertex cannot have a top-k closeness value, their algorithm stops the computation early. We adapt this strategy for temporal closeness in Sect. 4. Our algorithm differs by computing the fastest temporal paths, allowing variable transition times, and using corresponding upper bounds. A simple adaptation of the static closeness algorithm from [3] would be impossible because it does not take the temporal features, like temporal edges with transition times or waiting times at vertices, into account. In [4], Bisenius et al. extend the same framework to dynamic graphs in which edges can be added and removed. It allows efficient updates of the static closeness after edge insertions or deletions. However, their algorithm works for static closeness using unweighted static shortest paths without considering edge availability or transition times. Eppstein and Wang [13] proposed a randomized approximation algorithm for closeness in weighted undirected graphs. It approximates the closeness of all vertices with a small additive error with high probability. Their algorithm is also not directly applicable to the temporal case. In Sect. 5, we introduce a transformation of temporal graphs to an inverse representation such that the algorithm can be used to approximate temporal closeness. Based on [13], Okamoto et al. [39] introduced an algorithm for approximating the top-k closeness. First, they use an approximation to find a candidate set of vertices. Next, they rank the top-k vertices with high probability. Cohen et al. [9] combined the sampling approach and a pivoting technique for approximating the closeness of all vertices.
An early overview of different dynamic graph models is given by Harary and Gupta [15]. More recent and comprehensive introductions to temporal graphs, including overviews of different temporal centrality measures, are provided in, e.g., [18,28,45]. Michail [33] gives an algorithmically oriented introduction to temporal graphs. The early work on fastest paths by Cooke and Halsey [10] suggests iteratively computing T distance matrices for each vertex, where T is a time-dependent parameter chosen beforehand. Kempe et al. [21] discuss time-respecting paths and related connectivity problems. Xuan et al. [7] introduce algorithms for finding the shortest, fastest, and earliest arrival paths, which are generalizations of Dijkstra's shortest paths algorithm. Their fastest path algorithm only supports unit traversal times for all edges and enumerates all the earliest arrival paths for each possible starting time. In [19], the authors define variants of temporal graph traversals. A depth-first search (DFS) variant can be used to construct a DFS tree starting at a vertex v, containing information about the fastest paths from v to the other vertices. The authors of [36] consider bicriteria temporal paths in weighted temporal graphs, where each edge has an additional cost value. They propose an algorithm that enumerates all efficient paths with polynomial-time delay.
Wu et al. [52] introduce streaming algorithms operating on the temporal edge stream in which the edges arrive with non-decreasing time stamps. They discuss finding fastest, shortest, latest departure, and earliest arrival paths and suggest using their algorithms to compute temporal closeness. We use their state-of-the-art streaming algorithms for the fastest paths and temporal closeness in the baseline in our experiments. In Sect. 4.5, we further discuss the differences to our algorithms.
For temporal graphs with unit traversal times, Crescenzi et al. [11] introduce a temporal closeness variant based on the durations of earliest arrival paths by integrating over all starting times in a given time interval. For the exact closeness computation, they use a variant of the edge streaming algorithm. For finding the top-k closeness vertices, the authors define a backward version of their algorithm and provide a variant of Eppstein and Wang's algorithm [13] for closeness approximation. After approximating the closeness of all vertices, the vertices are ranked according to the approximated closeness. Finally, the exact closeness is computed for the K > k highest-ranked vertices in order to obtain a final top-k ranking. In [50], the authors introduce shortest-fastest paths as a combination of the conventional distance and shortest duration. They extend Brandes' algorithm [6] for computing betweenness centrality in temporal graphs. There are works on temporal walk-based centrality measures, e.g., the authors of [2] adapt the walk-based Katz centrality to temporal graphs, and Rozenshtein and Gionis [44] give a temporal PageRank variant. Nicosia et al. [37] and Kim and Anderson [25] examine a wide range of properties of temporal graphs, including various temporal centrality measures. In [48] and [49], the authors compare temporal distance metrics and temporal centrality measures to their static counterparts. They reveal that the temporal versions for analyzing temporal graphs have advantages over static approaches on the aggregated graphs.

Preliminaries
A directed temporal graph $G = (V, E)$ consists of a finite set of vertices $V$ and a finite set $E$ of directed temporal edges $e = (u, v, t, \lambda)$ with $u$ and $v$ in $V$, $u \neq v$, availability time (or time stamp) $t \in \mathbb{N}$ and transition time (or traversal time) $\lambda \in \mathbb{N}$. The transition time of an edge denotes the time required to traverse the edge. We only consider directed temporal graphs; undirectedness can be modeled by using a forward- and a backward-directed edge with equal time stamps and traversal times for each undirected edge. Notice that in a temporal graph, the number of edges is not polynomially bounded by the number of vertices. Given a temporal graph $G$, removing all time stamps and traversal times, and merging resulting multi-edges, we obtain the aggregated graph $A(G) = (V, E_s)$ with $E_s = \{(u, v) \mid (u, v, t, \lambda) \in E\}$. For a directed temporal graph $G = (V, E)$ and a time interval $I = [\alpha, \beta]$ with $\alpha, \beta \in \mathbb{N}$ and $\alpha \le \beta$, we define the temporal subgraph $G^I = (V, E^I)$ with [...], the largest numbers of different starting or arrival times at any $v \in V$. Finally, let $T(G)$ be the set of all availability times in $G$. Figure 2a shows an example of a temporal graph $G$ with $T(G) = \{1, 2, 5, 6, 7, 8\}$. At each edge the availability and transition time is shown. Notice that $G = G^I$ for $I = [1, 12]$, $\tau^I_+ = \max\{1, 2, 3\} = 3$ and $\tau^I_- = \max\{0, 1, 2\} = 2$. Restricting the temporal graph to the interval $J = [5, 9]$ leads to the temporal subgraph shown in Figure 2b. For $G^J$ it is $T(G^J) = \{5, 6, 7\}$, $\tau^J_+ = 1$ and $\tau^J_- = 1$.
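The notions of this section can be made concrete with a small sketch. The names `TemporalEdge`, `undirected_edge`, `aggregated_graph`, `availability_times`, and `temporal_subgraph` are illustrative and not taken from the paper. The interval rule in `temporal_subgraph` (keep an edge if it starts at or after α and its traversal ends at or before β) is an assumption, chosen because it is consistent with the statement that edges must leave and arrive during the interval.

```python
from typing import List, NamedTuple, Set, Tuple


class TemporalEdge(NamedTuple):
    u: str    # tail vertex
    v: str    # head vertex
    t: int    # availability time (time stamp)
    lam: int  # transition time, strictly positive


def undirected_edge(u: str, v: str, t: int, lam: int) -> List[TemporalEdge]:
    """Model an undirected edge by a forward and a backward directed edge."""
    return [TemporalEdge(u, v, t, lam), TemporalEdge(v, u, t, lam)]


def aggregated_graph(edges: List[TemporalEdge]) -> Set[Tuple[str, str]]:
    """Drop time stamps and transition times and merge multi-edges."""
    return {(e.u, e.v) for e in edges}


def availability_times(edges: List[TemporalEdge]) -> Set[int]:
    """The set T(G) of all availability times."""
    return {e.t for e in edges}


def temporal_subgraph(edges: List[TemporalEdge], alpha: int, beta: int) -> List[TemporalEdge]:
    """Edges of G restricted to [alpha, beta].

    ASSUMPTION: an edge is kept if alpha <= t and t + lam <= beta.
    """
    return [e for e in edges if alpha <= e.t and e.t + e.lam <= beta]


# Hypothetical toy graph, not the graph of Fig. 2.
edges = undirected_edge("a", "b", 1, 1) + [
    TemporalEdge("b", "c", 5, 2),
    TemporalEdge("b", "c", 8, 3),
]
```

In this toy graph, the two temporal edges from b to c collapse into a single static edge in the aggregated graph, which illustrates why the number of temporal edges is not bounded by the number of vertex pairs.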

Temporal paths
A temporal path $P$ between vertices $v_1, v_{\ell+1} \in V$ of length $\ell$ is an alternating sequence of vertices and temporal edges $(v_1, e_1 = (v_1, v_2, t_1, \lambda_1), v_2, \dots, e_\ell = (v_\ell, v_{\ell+1}, t_\ell, \lambda_\ell), v_{\ell+1})$ such that $e_i \in E$, each vertex in $P$ is visited exactly once, and $t_i + \lambda_i \le t_{i+1}$ for $1 \le i < \ell$.
For notational convenience, we sometimes omit edges or vertices. The starting time of $P$ is $s(P) = t_1$, the arrival time is $a(P) = t_\ell + \lambda_\ell$, and the duration is $d(P) = a(P) - s(P)$.
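The definition can be checked mechanically. The following is a minimal sketch, assuming a path is given as a list of edge tuples `(u, v, t, lam)`; the function names are illustrative only.

```python
def is_temporal_path(path):
    """Check the time-respecting condition t_i + lam_i <= t_{i+1},
    that consecutive edges are chained, and that no vertex repeats."""
    if not path:
        return False
    visited = [path[0][0]]
    for i, (u, v, t, lam) in enumerate(path):
        if lam <= 0:
            return False          # transition times are strictly positive
        if u != visited[-1]:
            return False          # edge must start where the previous one ended
        if v in visited:
            return False          # each vertex is visited exactly once
        if i > 0:
            _, _, prev_t, prev_lam = path[i - 1]
            if prev_t + prev_lam > t:
                return False      # next edge is available before we arrive
        visited.append(v)
    return True


def start_time(path):
    return path[0][2]


def arrival_time(path):
    return path[-1][2] + path[-1][3]


def duration(path):
    return arrival_time(path) - start_time(path)
```

For example, the path a to b at time 1 with transition time 2, followed by b to c at time 3 with transition time 1, is time-respecting and has duration 3; if the second edge were available at time 2 instead, the sequence would not be a temporal path.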
A fastest path is a path with minimal duration. We say vertex $v$ is reachable from vertex $u$ if there exists a temporal $(u, v)$-path. In contrast to classical static graphs, even undirected [...]

Fig. 2 (a) Example for a temporal graph $G$. At each edge the availability and transition time is given as pair $(t, \lambda)$. (b) The temporal subgraph $G^{[5,9]}$ of $G$ that is restricted to the time interval $[5, 9]$

[...] and has a duration of one. Furthermore, notice the restricted reachability due to the temporal constraints. For example, vertex b is able to reach vertex d, and vertex d can reach vertex c. However, the path from b to d has an arrival time of nine and is too late to be continued with one of the outgoing edges at d, which start at six and eight, respectively.

Harmonic temporal closeness
For a temporal graph $G = (V, E)$ and a time interval $I$, let $\mathcal{P}^I_{uv}$ be the set of all temporal paths between $u, v \in V$ in $G^I$. We define the shortest duration between $u$ and $v$ as $d^I(u, v) = \min_{P \in \mathcal{P}^I_{uv}} d(P)$. If $v$ is not reachable from vertex $u$ during the interval $I$, we set $d^I(u, v) = \infty$, and we define $\frac{1}{\infty} = 0$. Due to the restricted reachability in temporal graphs, we use the harmonic variant of temporal closeness. Marchiori and Latora [31] introduced harmonic closeness in static graphs.
Notice that the temporal closeness can be defined using either the outgoing or incoming fastest paths, i.e., using $d^I(u, v)$ or $d^I(v, u)$. In the first case, the temporal closeness describes how well vertex $u$ can reach the other vertices in terms of duration. The second case describes how well the other vertices can reach vertex $u$. Therefore, analogous to the harmonic temporal closeness, we define the harmonic temporal in-closeness.
Definition 3 (Harmonic Temporal in-Closeness) Let $G = (V, E)$ be a temporal graph and $I$ a time interval. We define the harmonic temporal in-closeness for $u \in V$ with respect to $I$ as $c^I_{in}(u) = \sum_{v \in V \setminus \{u\}} \frac{1}{d^I(v, u)}$.
In Sect. 5, we introduce the temporal transpose of a temporal graph that enables our temporal closeness algorithms to calculate the harmonic temporal in-closeness in case that all edges have equal transition times.
In the following, we often drop the word harmonic and call the problem temporal closeness.
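As a small illustration of both variants, the sketch below evaluates the sum of reciprocal shortest durations from a given table of durations. The table `d` and the function name `harmonic_closeness` are hypothetical; pairs missing from the table are treated as unreachable, so they contribute 1/∞ = 0.

```python
import math


def harmonic_closeness(vertices, d, u, incoming=False):
    """Sum of reciprocal shortest durations from u (or to u if incoming=True).

    d maps ordered pairs (x, y) to the shortest duration from x to y.
    """
    total = 0.0
    for v in vertices:
        if v == u:
            continue
        key = (v, u) if incoming else (u, v)
        dist = d.get(key, math.inf)
        if not math.isinf(dist):
            total += 1.0 / dist   # unreachable pairs add nothing
    return total


# Hypothetical shortest durations on three vertices.
vertices = ["a", "b", "c"]
d = {("a", "b"): 1, ("a", "c"): 2, ("b", "c"): 4}
```

Here vertex a has closeness 1/1 + 1/2 = 1.5, vertex c has closeness 0 because it reaches no other vertex, and the in-closeness of c is 1/2 + 1/4 = 0.75.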

Algorithms for temporal closeness
First, we present a new fastest path algorithm, which we then use for the top-k temporal closeness computation.The new algorithm is tailored to be part of our top-k algorithm and operates on the adjacency list representation of the temporal graph, i.e., for each vertex, all out-going edges are in a list.

A label setting fastest path algorithm
Recall that, in general, a sub-path of a fastest path is not necessarily a fastest path. We deal with this problem by using a label setting algorithm that finds the fastest paths from a start vertex u to all other vertices during a time interval I. The algorithm uses labels, where each label l = (v, s, a) represents a (u, v)-path that starts at time s at vertex u and arrives at time a at vertex v. For each vertex v ∈ V, the algorithm keeps all such labels in a list [v] and uses a dominance check when a new label is created to remove labels that cannot lead to optimal paths. We use the dominance relation, which is also used in [52].
A non-dominated label does not necessarily represent a fastest path. However, it might represent a prefix-path of a fastest path. On the other hand, a dominated label cannot represent a fastest path or a prefix-path of a fastest path. Therefore, all dominated labels can be deleted. Using a dominance check, Algorithm 1 only keeps labels that may lead to a fastest path. In the case of two equal labels, we only need to keep one. Besides the label lists [v], we use a priority queue Q containing all labels which still need to be processed. In each iteration of the while loop, we get the label (v, s, a) with the smallest duration a − s from the priority queue (line 5). At this point, if the algorithm discovers v for the first time, the shortest duration [...]

Algorithm 1 [...]
[...] for each outgoing edge e = (v, w, t, λ) from v do
10: [...] else l′ = (w, l.s, t + λ)
13: remove dominated labels from [w] and Q
14: if l′ is not dominated then
15: Q.insert(l′) and [w].add(l′)
16: return d

Lemma 1 Let $(l_1 = (w_1, s_1, a_1), \dots, l_p = (w_p, s_p, a_p))$ be the sequence of labels returned by the extractMin call of the priority queue. Then the durations are non-decreasing, i.e., $a_i - s_i \le a_h - s_h$ for $1 \le i < h \le p$.
Proof The duration is strictly increasing in the length of a path because we have strictly positive transition times, i.e., using an edge takes at least one time step. We use induction over the iteration $i$ of the while loop (line 4). The first initial label has duration $a - s = 0$. Now assume the hypothesis holds after the $i$-th iteration, for an $i \ge 1$. In the $(i + 1)$-th iteration, first the label $l_{i+1}$ with the currently shortest duration is returned from the priority queue. The inner for loop (line 9) iterates over all outgoing edges at $w$ and a new label might be added to the priority queue for each possible extension of the $(u, w)$-path. The duration of all newly inserted labels is larger than the one of $l_{i+1}$ and therefore also larger than the duration of any label that was already in the queue when $l_{i+1}$ was returned. However, labels newly inserted during one iteration of the while loop may have equal duration.
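The label setting idea described above can be sketched as follows. This is an illustrative reading of the description, not the paper's Algorithm 1: it uses a binary heap with lazy deletion instead of explicit removal from the queue, assumes the edge list is already restricted to the interval, and uses the hypothetical names `fastest_durations` and `example_edges`. A label `(s2, a2)` is taken to dominate `(s, a)` at the same vertex if it starts no earlier and arrives no later.

```python
import heapq
from collections import defaultdict


def fastest_durations(edges, source):
    """Shortest durations from source; edges are tuples (u, v, t, lam)."""
    out = defaultdict(list)
    for (x, y, t, lam) in edges:
        out[x].append((y, t, lam))
    labels = defaultdict(set)        # labels[v]: non-dominated (start, arrival) pairs
    labels[source].add((0, 0))
    queue = [(0, source, 0, 0)]      # entries (duration, vertex, start, arrival)
    d = {}
    while queue:
        dur, v, s, a = heapq.heappop(queue)
        if (s, a) not in labels[v]:
            continue                 # label was dominated after insertion
        if v not in d:
            d[v] = dur               # first label processed at v is a fastest one
        for (w, t, lam) in out[v]:
            if w == source:
                continue
            if v == source:
                ns, na = t, t + lam  # a path starts when its first edge is taken
            elif t >= a:
                ns, na = s, t + lam  # waiting at v until time t is allowed
            else:
                continue             # edge is no longer available on arrival
            if any(s2 >= ns and a2 <= na for (s2, a2) in labels[w]):
                continue             # new label is dominated or a duplicate
            labels[w] = {(s2, a2) for (s2, a2) in labels[w]
                         if not (ns >= s2 and na <= a2)}
            labels[w].add((ns, na))
            heapq.heappush(queue, (na - ns, w, ns, na))
    return d


# The fastest path to b uses the later, slower edge to a.
example_edges = [("u", "a", 1, 1), ("u", "a", 5, 2), ("a", "b", 7, 1)]
```

On `example_edges`, the fastest path to a has duration 1, while the fastest path to b (duration 3) passes through a with a prefix of duration 2. Both labels at a have to be kept, which is why a single label per vertex is not sufficient in general.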

Lemma 2 If during the iterations of the while loop a vertex v is inserted in F (line 6 ff.), the duration d[v] = a − s is equal to the duration of a fastest (u, v)-path.
Proof Vertex v is added to F when label l = (v, s, a) is returned from the priority queue and if v ∉ F. After a vertex has been added to F, it remains in F. Therefore, l is the first label that is processed for vertex v. Due to Lemma 1, it follows that the durations of the processed labels are non-decreasing. Consequently, a − s is the duration of a fastest path from u to v.
The following theorem states the correctness and asymptotic running time of Algorithm 1.

Theorem 1 Let G = (V , E) be a temporal graph, u ∈ V , and I a time interval. And, let δ
Proof Let $w \in R^I(u)$ be a temporally reachable vertex. There has to be at least one, or the temporal closeness of $u$ is 0. Because $w$ is reachable, we know that there exists a fastest temporal $(u, w)$-path $P = (v_1 = u, e_1, \dots, e_{\ell-1}, v_\ell = w)$. By induction over the length $\ell$, we show that for each prefix-path a corresponding label is generated. For the prefix-path of length $h = 0$, we have the initial label at vertex $u$. We assume that for some $h \ge 0$ the hypothesis holds. Now, for the case $h + 1$, we have the prefix-path $P_{h+1} = (v_1 = u, e_1, \dots, e_h, v_{h+1})$ which consists of $P_h = (v_1 = u, e_1, \dots, v_h)$ and edge $e_h$ arriving at vertex $v_{h+1}$. By applying the induction hypothesis, a label $(v_h, s, a)$ for $P_h$ was created and added to $Q$.
The only way the label could get deleted without being processed is by being dominated by a label $(v_h, s', a')$. Without loss of generality, we can assume that not both $s = s'$ and $a \ge a'$ hold (in this case we consider the path represented by $(v_h, s', a')$). Assume that $s < s'$ and $a \ge a'$; then $P$ would not be a fastest temporal path. Therefore, the label does not get deleted from $Q$ due to dominance checks.
Eventually, it gets processed in some iteration, and a label for $P_{h+1}$ is added to the priority queue. The label representing $P_{h+1}$ cannot be dominated for similar reasons as for the label for $P_h$. Hence, at some point, vertex $w$ is reached for the first time and added to the set $F$. With Lemma 2, it follows that a fastest temporal path has been found, and $d[w]$ is set to the shortest duration.
The total number of labels is upper bounded by | and Q will eventually be empty and the algorithm terminates.
For the running time, we have the following considerations. Initialization is done in $O(|V|)$. The while loop (line 4) is called for every label $(v, s, a)$, and the inner for loop (line 9) is called for each outgoing edge at vertex $v$ in $G$. During the inner for loop, a new label is created and compared to all labels at vertex $w$ for the dominance check. Due to the dominance check, for each possible arrival time, we may only keep the label with the latest starting time and no equal labels. And, for two labels with equal starting times, we only keep the one with the earlier arrival time. Therefore, the number of labels at a vertex $w$ is less than or equal to the maximum of the number of different arrival times at any $v \in V$ and different starting times at $u$, i.e., the size of [w] is at most $\pi$. The domination checks for all outgoing edges (line 9 ff.) can be done in $\pi\delta^+_m$. The total number of labels generated is less than or equal to $|E^I| \cdot |\tau^I_+(u)|$. However, at any time there are at most $|E^I|$ labels in the priority queue, because we only keep the label with the latest starting time per edge. Therefore, the cost for extracting and deleting a label from the priority queue is amortized $O(\log(|E^I|))$ using a Fibonacci heap, and inserting is possible in constant time. Because we have to check for each edge $e \in E$ if $e$ is in the time interval $I$, the total running time is in [...]

Computing Top-k temporal closeness
Based on the fastest path algorithm presented in the previous section, we now introduce the top-k temporal closeness algorithm. Given a temporal graph $G$ and a time interval $I$, Algorithm 2 computes the exact top-k temporal closeness values and the corresponding vertices in $G^I$. If vertices share a top-k temporal closeness value, the algorithm finds all of these vertices. We adapt the pruning framework introduced by Bergamini et al. [3] for the temporal closeness case. In contrast to their work, we compute the fastest paths in temporal graphs instead of BFS in static graphs and introduce upper bounds that take temporal aspects like transition and waiting times into account. We iteratively run Algorithm 1 for each $u \in V$. During the computations, we determine upper bounds of the closeness and stop the closeness computation of $u$ if we are sure that the closeness of $u$ cannot be in the set of the top-k values.
For calculating the upper bounds, we use two disjoint subsets of the vertices that get updated every iteration, i.e., $F_i(u) \cup T_i(u) \subseteq V$, with
1. $F_i(u)$ contains $u$ and all vertices for which we already computed the exact temporal distance from $u$, and
2. $T_i(u)$ contains all vertices $w$ for which there is an edge $e = (v, w, t, \lambda) \in E$ with $v \in F_i(u)$ and $w \notin F_i(u)$.
Together with the set R I (u) of reachable vertices, which does not change during the algorithm, we obtain the upper bound cI i (u) of the temporal closeness of vertex u in iteration i as 1 lower bound for d I (u, v) .
In the following, we introduce upper bounds for the temporal closeness, such that we can use Algorithm 1 to compute the closeness $c^I(u)$ exactly or stop the computation if the vertex cannot achieve a top-k temporal closeness value. Algorithm 2 has as input a temporal graph $G$, time interval $I$, and the number of reachable vertices $r^I(u)$ for all $u \in V$. Algorithms for computing the number of reachable vertices are discussed in Sect. 4.4. Algorithm 2 first removes all edges that are not leaving or arriving during the time interval $I$ and proceeds on $G^I$. Next, the algorithm orders the vertices by decreasing number of outgoing edges and starts the computation of the closeness for each vertex $v_j$ in the determined order. The intuition behind processing the vertices by decreasing out-degree is that a vertex with many outgoing edges can reach many other vertices fast. The processing order of the vertices is crucial in order to stop the computations for as many vertices as soon as possible. After ordering the vertices, for each $v_j$, the shortest durations from $v_j$ to the reachable vertices are computed using the adapted version of Algorithm 1. In each iteration of the while loop, the algorithm extracts the label $(v, s, a)$ with the smallest duration $a - s$ from the priority queue, and if $v$ is still in $T_{i-1}(v_j)$, then it found the shortest duration $d^I(v_j, v)$. Let $(l_1 = (w_1, s_1, a_1), \dots, l_p = (w_p, s_p, a_p))$ be the sequence of labels returned by the extractMin call of the priority queue. Then the durations are non-decreasing, i.e., $a_i - s_i \le a_h - s_h$ for $1 \le i < h \le p$. And, in iteration $i$, when reaching line 27, the label $l_{i+1}$ that is returned from the priority queue in iteration $i + 1$ is determined. It holds that $d_{i+1} = a_{i+1} - s_{i+1}$ is the next possible duration to any reachable vertex that is not yet in $F_i(v_j)$. Now, let [...] be the minimal temporal waiting time at a vertex $w$ over all vertices, i.e., [...] $= \min_{w \in V} \{t_b - (t_a + \lambda_a) \mid (x, w, t_a, \lambda_a), (w, y, t_b, \lambda_b) \in E^I \text{ and } t_a + \lambda_a \le t_b\}$. Furthermore, let $\lambda_{min}$ be the smallest transition time in $E^I$.
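The pruning principle can be illustrated with a deliberately coarse bound. Since durations are returned in non-decreasing order, every reachable vertex that is not yet finalized has a duration of at least the next duration, so the partial sum plus (number of remaining reachable vertices) divided by the next duration is an upper bound of the closeness. This is weaker than the bounds of this section, which also use the set T, waiting times, and the smallest transition time. In the sketch, the durations are passed in as a precomputed list purely to keep it short, and ties at the k-th value are not collected; all names are hypothetical.

```python
import heapq


def closeness_upper_bound(partial_sum, num_reachable, num_found, next_duration):
    """Coarse bound: all remaining reachable vertices have duration >= next_duration."""
    remaining = num_reachable - num_found
    if remaining <= 0:
        return partial_sum
    return partial_sum + remaining / next_duration


def closeness_or_prune(sorted_durations, num_reachable, threshold):
    """Accumulate 1/d in non-decreasing order of d; return None once the
    upper bound drops strictly below the current k-th best closeness."""
    partial = 0.0
    for found, dur in enumerate(sorted_durations):
        if closeness_upper_bound(partial, num_reachable, found, dur) < threshold:
            return None
        partial += 1.0 / dur
    return partial


def top_k_closeness(durations_per_vertex, k):
    """Keep the k largest closeness values in a min-heap of (closeness, vertex)."""
    best = []
    for u, durs in durations_per_vertex.items():
        threshold = best[0][0] if len(best) == k else 0.0
        c = closeness_or_prune(sorted(durs), len(durs), threshold)
        if c is None:
            continue                      # u cannot be among the top k
        if len(best) < k:
            heapq.heappush(best, (c, u))
        elif c > best[0][0]:
            heapq.heapreplace(best, (c, u))
    return sorted(best, reverse=True)
```

With durations [1, 2, 4] and a threshold of 2.5, the bound after the first finalized vertex is 1 + 2/2 = 2, so the computation stops before the remaining durations are needed.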

Lemma 3 In Algorithm 2, during the inner while loop in line 27, for each vertex [...]

[...] Then its duration is not found yet. The next label $l_{i+1}$ provides the next possible duration any reachable vertex can have with $d_{i+1} = a_{i+1} - s_{i+1}$. Moreover, for any reachable vertex $z$ that is not in $F_i(v_j) \cup T_i(v_j)$, we have to account for at least the additional waiting time at a vertex $y \in T_i(v_j)$ plus the transition time from $y$ to $z$.

Algorithm 2 [...]
[...] Initialize label l = (v_j, 0, 0) at vertex v_j
6: Initialize PQ Q and insert l
7: Initialize [...]
[...] Initialize empty lists [v] for all v ∈ V
10: while Q not empty and [...]
18: [...]

Finally, we discuss the running time of the algorithm. [...]
We can easily adapt this algorithm to compute the closeness for all vertices. In this case, we do not need the number of reachable vertices, and we do not need to keep track of the partitioning of the vertex set or the upper bound $c(v_i)$.

Heuristic modifications
We propose two different modifications for our fastest path algorithm and obtain heuristic versions of our temporal closeness algorithms.
Heuristic 1: The idea is to limit the size of the label lists at the vertices. Let $h \in \mathbb{N}$ be the maximal size of each label list at $v \in V$. At each vertex $v \in V$, we then have at most $h$ labels at any point during the computation of the temporal closeness, i.e., the list [v] never holds more than $h$ labels. We achieve this by adding a check before adding a new label to the list in line 24. If the size of [v] is already $h$ at this point, we discard the new label. The difference for the running time happens during the inner for loop when a label is created and compared to all labels at vertex $w$ for the dominance check. The number of labels at a vertex $w$ is now less than or equal to the size of [w], i.e., $h$, and the domination checks for all outgoing edges (line 9 ff.) can be done in $h\delta^+_m$. Because we call the adapted fastest paths algorithm for each $v \in V$, the result follows.
Heuristic 2: We can further reduce the running time in the case of $h = 1$ by ignoring vertices after they are visited for the first time. We only need to store a single label at each vertex and do not need the label lists at the vertices. Even though this further reduces the possibilities to find the shortest duration path, our experimental evaluation shows that this approach performs well on real-world data sets. During the computation of the closeness for a vertex $v_j$, after the duration $d(v_j, v)$ is set for a vertex $v$ and $v$ is added to the set $F_i$, the heuristic algorithms do not further consider or generate labels for $v$. Therefore, vertex $v$ is assigned the duration $a - s$ when label $(v, s, a)$ is processed, and new labels are only generated for edges $(v, w, t, \lambda)$ with $a \le t$. All further labels $(v, s', a')$ that are returned from the priority queue in the following iterations are ignored, even if $a' < a$. This means that we only have to store the label with the minimum duration found so far at each vertex $v \in V$ because this is the label that will determine the final duration $d(u, v)$. Consequently, this approach leads to a Dijkstra-like algorithm. As we argued already in Sect. 3.1, we cannot guarantee to find the fastest paths because, in general, fastest paths do not consist of fastest sub-paths.
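A minimal sketch of the second heuristic, read as a Dijkstra-like search in which every vertex is settled by the first label that reaches it, could look as follows. The function name `heuristic_durations` and the example edges are hypothetical, and the edge list is assumed to be already restricted to the interval.

```python
import heapq
from collections import defaultdict


def heuristic_durations(edges, source):
    """Dijkstra-like heuristic: only the first label at each vertex is used.
    Returned durations are upper bounds of the shortest durations."""
    out = defaultdict(list)
    for (x, y, t, lam) in edges:
        out[x].append((y, t, lam))
    d = {}
    queue = [(0, source, 0, 0)]          # entries (duration, vertex, start, arrival)
    while queue:
        dur, v, s, a = heapq.heappop(queue)
        if v in d:
            continue                     # later labels at v are ignored
        d[v] = dur
        for (w, t, lam) in out[v]:
            if w in d:
                continue
            if v == source:
                heapq.heappush(queue, (lam, w, t, t + lam))
            elif t >= a:
                heapq.heappush(queue, (t + lam - s, w, s, t + lam))
    return d


example_edges = [("u", "a", 1, 1), ("u", "a", 5, 2), ("a", "b", 7, 1)]
```

On `example_edges`, vertex a is settled by the label with start time 1 and arrival time 2, so b is reached only with duration 7, although a path of duration 3 exists via the later edge to a. This is the kind of error the heuristic accepts in exchange for storing a single label per vertex.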

Theorem 5 The running time of temporal top-k closeness algorithm using the second heuristic is in
Proof The fastest path algorithm using the second heuristic has a running time in $O(|V| + |E| \cdot \log(|E^I|))$. The dominance checks are unnecessary, and the total number of labels generated is less than or equal to $|E^I|$. Because we do not visit any vertex more than once, for each edge at most one label is added to the priority queue. We call the adapted fastest paths algorithm for each $v \in V$, and therefore, the result follows.

The number of reachable vertices
The top-k temporal closeness algorithms need as one of their inputs the number $r^I(u)$ of reachable vertices in $G^I$ for all $u \in V$. There are different possibilities to obtain the number of reachable vertices exactly. We can determine all $r^I(u)$ using $|V|$ times a temporal DFS or BFS, cf. [19]. If the adjacent edges are stored in chronological order at each vertex, it takes $O(|V|^2 + |V||E|)$ total running time. Alternatively, we can use a one-pass streaming algorithm similar to the earliest-arrival path algorithm in [52] with $O(|V|^2 + |V||E|)$ running time. In Sect. 5.1, we present an approximation for speeding up the computation of the number of reachable vertices.
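The one-pass streaming variant can be sketched as an earliest-arrival scan over the edges in order of their time stamps. This is an illustrative version in the spirit of the approach mentioned above, not the implementation of [52]; the function name `count_reachable` is hypothetical, the count excludes the source itself, and the sort is included only so that the sketch accepts unsorted input.

```python
def count_reachable(edges, source):
    """Number of vertices reachable from source by a temporal path.

    edges are tuples (u, v, t, lam) with lam > 0.
    """
    earliest = {source: float("-inf")}   # the source can depart at any time
    for (x, y, t, lam) in sorted(edges, key=lambda e: e[2]):
        # The edge is usable if x has been reached no later than time t.
        if x in earliest and earliest[x] <= t:
            if t + lam < earliest.get(y, float("inf")):
                earliest[y] = t + lam
    return len(earliest) - 1
```

Because transition times are strictly positive, an edge taken at time t can never enable another edge with the same time stamp, so a single pass in non-decreasing time order suffices.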

Comparison to the baseline algorithm
For a temporal graph $G = (V, E)$ and a time interval $I$, the baseline algorithm first removes all edges that are not in $I$. Next, it runs the fastest path edge streaming algorithm from Wu et al. [52] for each $u \in V$ to determine the closeness $c^I(u)$. Their fastest path algorithm uses a single pass over the edges, which need to be sorted by increasing time stamps. One crucial difference is that their algorithm may update the minimal duration of a vertex after it was visited the first time. This is necessary if, e.g., the last edge of the edge stream is the fastest $(u, v)$-path consisting of one edge. However, for our top-k algorithms, the fastest duration between two vertices $u$ and $v$ must be determined when $v$ is discovered for the first time. [...], respectively. Notice that this is faster than the worst-case running time of our algorithm. However, the streaming algorithm always has to scan all edges. Even for a vertex $v$ with no outgoing edge, the edge stream algorithm has to traverse all edges because in each iteration it only knows the edges up to the point in time of the current edge and cannot rule out possible future edges incident to $v$. Our algorithm, however, stops in this case after one iteration. Therefore, our algorithm performs well on real-world data sets.

Approximations for equal transition times
We lift two approximations from the static to the temporal domain. The first one is for the number of reachable vertices and is based on the estimation framework introduced by Cohen et al. [8]. The second one approximates the temporal closeness for all vertices and adapts the undirected static closeness approximation from Wang et al. [13]. Both algorithms are applicable if the temporal graph has equal transition times for all edges. This is often the case for social and contact networks. First, we introduce the concept of the temporal transpose of a temporal graph, which corresponds to reversing the edges in a conventional static graph and allows the search for incoming temporal paths.

Definition 5 For a temporal graph
Figure 3 shows an example of a temporal graph G and its temporal transpose I(G). The following lemma is the basis for the approximations.

Lemma 4 Let G = (V , E) be a temporal graph in which all edges have equal transition times, I(G) its temporal transpose, and u, v ∈ V . For each (u, v)-path P uv with duration d(P uv ) = d P in G there exists a (v, u)-path P vu with duration d(P vu ) = d P in I(G) and vice versa.
Proof We use $t(e)$ to denote the availability time of edge $e \in E$. First notice that every (non-temporal) $(u, v)$-path $P$ in $G$ corresponds to a (non-temporal) $(v, u)$-path in $I(G)$ that visits the vertices in reverse order of $P$. Therefore, given a temporal path in $G$ (or $I(G)$), there exists a sequence of vertices and edges in $I(G)$ (or $G$, respectively) for which we have to show that the time stamps at the edges allow for a temporal path. Let $P = (v_1, e_1, v_2, \ldots, e_\ell, v_{\ell+1})$ be a temporal path in $G$. Now, consider the sequence $Q = (v_{\ell+1}, e'_\ell, v_\ell, \ldots, e'_1, v_1)$ in $I(G)$, with $e'_i = (v_{i+1}, v_i, t_{max} - t, \lambda)$ for each $e_i = (v_i, v_{i+1}, t, \lambda)$ and $i \in \{1, \ldots, \ell\}$. From $t(e_i) + \lambda \le t(e_{i+1})$ it follows that $t_{max} - t(e_i) \ge t_{max} - t(e_{i+1}) + \lambda$. Therefore, the sequence $Q$ is a temporal path in $I(G)$. The other direction is symmetric. Furthermore, we have $d(Q) = (t_{max} - t(e_1)) + \lambda - (t_{max} - t(e_\ell)) = t(e_\ell) + \lambda - t(e_1) = d(P)$.

Lemma 4 allows us to apply our temporal closeness algorithms to the transposed temporal graph in order to consider the shortest durations of incoming instead of outgoing temporal paths. This way, we can compute the temporal in-closeness.
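The statement of Lemma 4 can be checked on a small instance. The sketch below is illustrative only and rests on assumptions: the transposed edge (v, u, t_max - t, lam) is taken from the construction used in the proof, t_max is assumed to be the largest arrival time t + lam (as suggested by the caption of Fig. 3), and fastest durations are computed by a simple brute-force routine rather than by the label setting algorithm of this paper. The names temporal_transpose and fastest_durations are hypothetical.

```python
from math import inf


def temporal_transpose(edges):
    """Reverse every edge and mirror its availability time at t_max (assumed: max of t + lam)."""
    t_max = max(t + lam for _, _, t, lam in edges)
    return [(v, u, t_max - t, lam) for u, v, t, lam in edges]


def fastest_durations(edges, source):
    """Brute force: minimal duration (arrival minus departure) from `source` to each reachable vertex."""
    stream = sorted(edges, key=lambda e: e[2])
    # A fastest path always departs at the availability time of some edge leaving the source.
    departures = sorted({t for u, _, t, _ in stream if u == source})
    best = {}
    for t0 in departures:
        arrival = {source: t0}
        for u, v, t, lam in stream:  # earliest-arrival pass; valid because lam > 0
            if arrival.get(u, inf) <= t and t + lam < arrival.get(v, inf):
                arrival[v] = t + lam
        for v, arr in arrival.items():
            if v != source:
                best[v] = min(best.get(v, inf), arr - t0)
    return best


edges = [("a", "b", 1, 1), ("b", "c", 2, 1), ("a", "c", 5, 1)]
transposed = temporal_transpose(edges)

# Durations of (u, v)-paths in G and of (v, u)-paths in the transpose.
in_g = {(u, v): d for u in "abc" for v, d in fastest_durations(edges, u).items()}
in_t = {(u, v): d for v in "abc" for u, d in fastest_durations(transposed, v).items()}
print(in_g == in_t)
```

Since all transition times are equal in the example, every fastest duration in G reappears for the reversed vertex pair in the transpose.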

Theorem 6 Let G be a temporal graph in which all edges have equal transition times, I(G) its temporal transpose. The temporal closeness c(u) of a vertex u ∈ V in I(G) is equal to the temporal in-closeness c in (u) in G.
Furthermore, Lemma 4 shows that the set of vertices R(u) reachable by u in the transposed graph I(G) is equal to the set of vertices in G that can reach u.

Approximation for the number of reachable vertices
We propose an approximation algorithm for the number of reachable vertices based on the estimation framework introduced by Cohen et al. [8]. The non-weighted version proceeds in the following way. For two sets $X$ and $Y$ let $S : Y \to 2^X$. Furthermore, let $L$ be an oracle that, when presented with a random permutation $r : X \to \{1, \ldots, |X|\}$, returns a mapping $l : Y \to X$ such that for all $y \in Y$, $l(y) \in S(y)$ and $r(l(y)) = \min_{x \in S(y)} r(x)$. In each of $h$ rounds, a value chosen uniformly at random from the interval $[0, 1]$ is assigned to each $x \in X$, i.e., the rank $R_i(x)$ in round $i$. In each round, the sorted ranking induces a permutation on $X$ that is presented to the oracle $L$, which returns the mapping $l_i : Y \to X$. For an estimator $\hat{s}(y) = \frac{h}{\sum_{i=1}^{h} R_i(l_i(y))} - 1$ the following holds.
Theorem 7 (Cohen et al. [8])

For a temporal graph $G$ and a time interval $I$, Algorithm 3 computes the estimates $\hat{r}_I(u)$ of the number of reachable vertices $r_I(u)$, for all $u \in V$. Given $G$, $I$ and $h \in \mathbb{N}$, our algorithm first computes the temporal transpose $I(G_I)$. Notice that we only need to add edges $(u, v, t, \lambda) \in E$ that start or arrive during $I$ to the temporal transpose $I(G_I)$. The algorithm then proceeds with the following steps on $I(G_I)$. In each of $h$ rounds, we first assign a value chosen uniformly at random from the interval $[0, 1]$ to each vertex, i.e., the rank $R(v)$ of the vertex $v$. Let $v_1, \ldots, v_n$ be the vertices ordered by increasing rank. The algorithm then determines for each $v_i$ the set of vertices reachable from $v_i$ and, in round $j$, sets the mapping $l_j(v)$ of all newly found reachable vertices $v$ to the rank $R(v_i)$.
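The behavior of the min-rank estimator can be illustrated independently of the temporal traversal. The following sketch is not Algorithm 3: it replaces the oracle by a brute-force minimum over explicitly given sets, which shows only why h divided by the sum of the minimal ranks, minus one, approaches the set size. The function name estimate_sizes and the example sets are hypothetical.

```python
import random


def estimate_sizes(sets, h, rng):
    """Estimate |S(y)| for every y from h rounds of uniform random ranks.

    sets: dict mapping y to a non-empty set S(y) of elements.
    The oracle is replaced by a brute-force minimum (illustration only).
    """
    universe = sorted(set().union(*sets.values()))
    rank_sums = {y: 0.0 for y in sets}
    for _ in range(h):
        # Fresh uniform ranks from [0, 1) for all elements in this round.
        rank = {x: rng.random() for x in universe}
        for y, members in sets.items():
            rank_sums[y] += min(rank[x] for x in members)
    # The minimum of n uniform ranks has expectation 1 / (n + 1).
    return {y: h / rank_sums[y] - 1 for y in sets}


sets = {"y1": set(range(50)), "y2": set(range(200))}
estimates = estimate_sizes(sets, h=2000, rng=random.Random(0))
print(estimates)
```

With more rounds the estimates concentrate around the true sizes 50 and 200. Algorithm 3 obtains the minimal ranks without enumerating the reachable sets by processing the vertices in order of increasing rank.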

Algorithm 3
sort $v_1, \ldots, v_n$ by increasing rank

Theorem 8 Let $G$ be a temporal graph, $I$ a time interval and $h \in \mathbb{N}$. Algorithm 3 approximates the reachability $r_I(u)$ for each $u \in V$ and $0 < \varepsilon < 1$ such that

Proof After assigning $l_j(v) = R(v_i)$, we do not need to reach $v$ again using the edges that we took on the path from $v_i$ to $v$. All possible extensions outgoing from $v$ will be explored. Then, the result follows from Lemma 4 and Theorem 7. Computing however, each edge is at most processed once in each of the $h$ rounds.

Theorem 9
The running time of Algorithm 3 is in

After approximating the reachability $r_I(u)$ for all $u \in V$, we can use the estimates for the top-k temporal closeness algorithms. However, notice that underestimating the number of reachable vertices may stop the computation early for a vertex with a top-k temporal closeness value.

Approximation of the temporal closeness
We lift the randomized algorithm for undirected, static graphs from Wang et al. [13] to the temporal domain. Algorithm 4 computes an approximation of the normalized temporal closeness for each vertex in a directed temporal graph with equal transition times. Given a temporal graph $G$, a time interval $I$ and sampling size $h \in \mathbb{N}$, the algorithm first computes the temporal transpose $I(G_I)$, and then samples $h$ vertices $v_1, \ldots, v_h$. For each vertex $v_i$ it determines the shortest durations to all vertices $w \in V \setminus \{v_i\}$ in $I(G_I)$. Due to Lemma 4, we can estimate the closeness of a vertex $u$ in $G_I$ by averaging the distances $d(v, u)$ found in $I(G_I)$.
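The sampling scheme can be sketched as follows. This is an illustration under assumptions, not a reproduction of Algorithm 4: the edges are assumed to be already restricted to the interval I, the estimate averages the reciprocal durations over the sampled vertices (the exact scaling used by Algorithm 4 may differ), and fastest durations are computed by brute force. All function names are hypothetical, and the transpose follows the construction from the proof of Lemma 4.

```python
import random
from math import inf


def temporal_transpose(edges):
    """Reverse each edge and mirror its time at t_max (assumed: max of t + lam)."""
    t_max = max(t + lam for _, _, t, lam in edges)
    return [(v, u, t_max - t, lam) for u, v, t, lam in edges]


def fastest_durations(edges, source):
    """Brute-force minimal durations from `source`; requires lam > 0."""
    stream = sorted(edges, key=lambda e: e[2])
    best = {}
    for t0 in sorted({t for u, _, t, _ in stream if u == source}):
        arrival = {source: t0}
        for u, v, t, lam in stream:
            if arrival.get(u, inf) <= t and t + lam < arrival.get(v, inf):
                arrival[v] = t + lam
        for v, arr in arrival.items():
            if v != source:
                best[v] = min(best.get(v, inf), arr - t0)
    return best


def approx_normalized_closeness(edges, vertices, samples):
    """Average 1/d(u, v) over sampled targets v, searching from v in the transpose."""
    transposed = temporal_transpose(edges)
    total = {u: 0.0 for u in vertices}
    for v in samples:
        # Durations from v in the transpose equal durations towards v in G (Lemma 4).
        for u, d in fastest_durations(transposed, v).items():
            total[u] += 1.0 / d
    return {u: s / len(samples) for u, s in total.items()}


edges = [("a", "b", 1, 1), ("b", "c", 2, 1), ("a", "c", 5, 1)]
vertices = ["a", "b", "c"]
samples = random.Random(1).choices(vertices, k=2)  # h = 2 sampled vertices
print(approx_normalized_closeness(edges, vertices, samples))
exact = approx_normalized_closeness(edges, vertices, vertices)
```

If every vertex is used exactly once as a sample, the estimate coincides with the harmonic closeness divided by |V|, which is why the last line can serve as a reference value for small graphs.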

Theorem 10 Let G = (V , E) be a temporal graph, I a time interval and h ∈ N. Algorithm 4 approximates the normalized temporal closeness ĉI
Proof Due to Lemma 4, we know that there is a one-to-one mapping between the temporal paths in $G_I$ and $I(G_I)$ in which corresponding paths have the same duration. Next, we use Hoeffding's inequality [16]. Let $x_1, \ldots, x_h$ be independent random variables that are bounded by $0 \le x_i \le 1$ for $i \in \{1, \ldots, h\}$. Furthermore, let $S = \sum_{i=1}^{h} x_i$ and $\mu = E[S/h]$ the expected mean. Then the following inequality holds: $P(|S/h - \mu| \ge \varepsilon) \le 2 e^{-2 h \varepsilon^2}$.
Then with $h = \log |V| \cdot \varepsilon^{-2}$ it follows that $P(|S/h - \mu| \ge \varepsilon) \le 2 e^{-2 \log |V|} = 2 / |V|^2$.
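The sample size implied by the choice of h in the proof can be evaluated directly. The short sketch below assumes the natural logarithm, which is consistent with the sample sizes reported in the discussion of Q4. The function name sample_size is hypothetical.

```python
import math


def sample_size(num_vertices, eps):
    """Sample size h = log|V| * eps^-2 (natural logarithm assumed)."""
    return math.log(num_vertices) / eps ** 2


for n in (10 ** 6, 10 ** 8, 10 ** 100):
    print(n, sample_size(n, 0.001))
```

For an error of 0.001 the sample size exceeds the number of vertices when |V| = 10^6, so sampling only pays off for much larger graphs or for larger tolerated errors.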

Experiments
We address the following questions:
• Q1 How do the running times of the algorithms for the top-k temporal closeness, the temporal closeness for all vertices, and the baseline compare to each other?
• Q2 How much does the reachability estimation speed up the computation of the top-k temporal closeness, and how good are the results?
• Q3 How does using the heuristics for computing the fastest paths affect the top-k and the exact algorithms in terms of running time and solution quality?
• Q4 How well does the sampling algorithm perform in terms of running time and approximation quality?
• Q5 How does temporal closeness compare to static closeness, degree centrality, and reachability? How do the temporal closeness and the temporal in-closeness compare to each other?

Data sets
We used the following real-world temporal graph data sets:
• Infectious A data set from the SocioPatterns project. The Infectious graph represents face-to-face contacts between visitors of the exhibition Infectious: Stay Away [20].
• Arxiv An author collaboration network from the arXiv's High Energy Physics - Phenomenology (hep-ph) section [30]. Vertices represent authors and edges collaborations. The time stamp of an edge is the publication date.
• Facebook This graph is a subset of the activity of a Facebook community over three months and contains interactions in the form of wall posts [51].
• Prosper A network based on a personal loan website. Vertices represent persons, and each edge a loan from one person to another person [42].
• WikipediaSG The network is based on the Wikipedia network. Each vertex represents a Wikipedia page, and each edge a hyperlink between two pages [35].
• WikiTalk Vertices represent users and edges edits of a user's talk page by another user [29,47].
• Digg and FlickrSG Digg and Flickr are social networks in which vertices represent persons and edges friendships. The times of the edges indicate when the friendship was formed [17,34].
For WikipediaSG and FlickrSG, we chose a random time interval with a size of ten percent of the total time span. For the other data sets, the time interval spans all edges. Table 1 gives an overview of the statistics of the networks (for WikipediaSG and FlickrSG the statistics are for the subgraph $G_I$). The transition times are one for all edges in all data sets.

Experimental protocol and algorithms
All experiments were conducted on a workstation with an AMD EPYC 7402P 24-Core Processor with 3.35 GHz and 256 GB of RAM running Ubuntu 18.04.3 LTS. We used GCC 9.3.0 with the flag -O2 and implemented the following algorithms in C++:
• TC-Top-k is our top-k temporal closeness algorithm.
• TC-All is our temporal closeness algorithm for computing the exact values for all vertices.
• TC-Approx is our temporal closeness approximation.
• EdgeStr is the temporal closeness algorithm based on the edge stream algorithm for equal transition times [52].
For TC-Top-k and TC-All, we also implemented the variants using the heuristics Heuristic1 and Heuristic2 described in Sect. 4.3. Our source code and the data sets are available at https://gitlab.com/tgpublic/tgcloseness.

Results and discussion
Q1 Table 2 shows the running times of the exact temporal closeness algorithms. We computed the numbers of reachable vertices exactly and set k ∈ {1, 10, 100, 1000} for the top-k algorithms. The reported running times include the time spent computing the reachability. For all data sets, the top-k algorithms need significantly less running time than EdgeStr and TC-All. For the Infectious data set, the running time is reduced by a factor between 10 (k = 1000) and 17 (k = 1) compared to EdgeStr. Figure 4 shows the reduction of the running time in percent compared to EdgeStr. For Prosper and WikipediaSG, the running times compared to EdgeStr are decreased by more than 85% and 72%, respectively, for each k ∈ {1, 10, 100, 1000}. In case of Facebook, the decrease is between 62% (k = 1000) and 72% (k = 1). Also, for the other data sets, considerable improvements are reached for k = 1 and k = 10. For the Infectious data set, our exact algorithm TC-All is more than ten times faster than the baseline, and for the Prosper data set it is more than four times faster. In case of the WikipediaSG data set, TC-All is more than twice as fast, and it decreases the running time for Facebook by 45%. For the Arxiv, Digg, WikiTalk and FlickrSG data sets, EdgeStr is faster than TC-All. The reason is a higher number of labels per vertex generated during the iterations of Algorithm 2, which leads to a worse performance of TC-All for these data sets.

Q2 Table 3 shows the average running times and standard deviations for our top-k temporal closeness algorithms using the reachability approximation over ten repetitions. We approximate r(u) for all u ∈ V using Algorithm 3 with parameter h = 5. For the relatively small Infectious network, the approximation cannot improve the running time. For all other data sets, the reachability approximation reduces the running time. Notice the large improvements for the Facebook, Digg, and FlickrSG data sets of at least 45%, 36%, and 68%, respectively. Even though there are approximation guarantees for the reachability approximation (Theorem 8), we are not able to give a guarantee for finding the top-k sets. The reason is that underestimating the number of reachable vertices may stop the computation for a vertex that has a top-k temporal closeness value. To evaluate the solution quality, we measure the Jaccard similarity between the top-k vertex sets computed using the algorithms with the reachability approximation and the exact top-k sets. The Jaccard similarity between two sets A and B is defined as $J(A, B) = \frac{|A \cap B|}{|A \cup B|}$. Values closer to one mean a higher similarity of the sets. For all data sets and all k ∈ {10, 100, 1000}, we obtained top-k vertex sets with similarities of one, i.e., the algorithm found the exact top-k sets in all ten rounds.

Q3 Table 4 shows the running times of the algorithms using the first heuristic (Heuristic1) with h = 2.
They are lower than those of our exact algorithms for all data sets. Figure 5a shows the decrease in running time compared to the exact algorithms. For Infectious, the running time is decreased by around 50% for all algorithms. In case of Arxiv, the heuristic decreases the running time by between 61% (TC-All) and 71% (TC-Top-100). Similar results are achieved for the Prosper data set with decreases between 61% (TC-All) and 66% (TC-Top-1, TC-Top-10 and TC-Top-100). For WikipediaSG and WikiTalk, the decrease in running time is between 25% and 57%, and for Digg between 13% and 35%. The lowest decreases in running time are achieved for Facebook and FlickrSG with values between 11% and 21% and between 5% and 33%, respectively. We measured the average relative deviation from the exact closeness values and the Jaccard similarity between the computed top-k sets and the exact solutions. For Arxiv, Prosper, Digg and FlickrSG, the exact top-k sets are computed for all k ∈ {1, 10, 100, 1000}, i.e., the Jaccard similarity is one. Furthermore, for Arxiv and Prosper the average relative deviation from the exact closeness is less than $2 \cdot 10^{-6}$. For Digg and FlickrSG, the deviations are less than 0.0003 and $7 \cdot 10^{-5}$, respectively. The Jaccard similarities for all remaining data sets are one in case of k = 1 and k = 10. For Facebook, WikipediaSG and WikiTalk, the Jaccard similarity is one for k = 100 and 0.99 for k = 1000. The similarities for Infectious are 0.96 for both k = 100 and k = 1000. The average relative deviation from the exact closeness is less than $3 \cdot 10^{-5}$ for Facebook, WikipediaSG and WikiTalk. For the Infectious data set, the deviation is less than 0.007.

Next, we ran the temporal closeness algorithms using the second heuristic (Heuristic2). Table 5 shows the running times. As expected, the running times are lower than those of our exact algorithms for all data sets and lower than the running times of the algorithms using the first heuristic in almost all cases. Figure 5b shows the decrease in running time compared to the exact algorithms. For Infectious, Arxiv and Prosper, the second heuristic leads to large gains in performance for all algorithms of at least 55%, 66%, and 68%, respectively. In case of WikipediaSG and WikiTalk, the decrease is at least 27% and 32%, respectively. Notice that for TC-All the decrease is at least 52% (Facebook) and up to 83% (Arxiv). All data sets but Infectious have a Jaccard similarity of one for k ∈ {1, 10, 100}. Arxiv, Facebook and Prosper have a Jaccard similarity of one for k = 1000, and their average relative deviation from the exact closeness is less than $2 \cdot 10^{-6}$. The Infectious data set has a Jaccard similarity of 0.96 for both k = 100 and k = 1000, and its average relative deviation is 0.008. The remaining data sets have a Jaccard similarity of 0.99 for k = 1000 and an average relative deviation of 0.003 (FlickrSG), $5 \cdot 10^{-5}$ (WikiTalk), $3 \cdot 10^{-5}$ (Digg), and $1 \cdot 10^{-6}$ (WikipediaSG). The heuristics provide very good results with small errors in the computed closeness values. For small k, the correct top-k sets are found. For larger k, the Jaccard similarity is still very high.

Q4 For a fixed error $\varepsilon$, the sample size h grows logarithmically in |V|, and the approximation is most suitable for data sets with a high number of vertices. For example, if we want to achieve an error of $\varepsilon \le 0.001$ (with high probability) for a data set with $|V| = 10^6$ vertices, it follows that $h = \log 10^6 \cdot 0.001^{-2} \approx 1.38 \cdot 10^7$. In this case, the needed sample size exceeds the number of vertices by a factor greater than ten, and the exact computation of the normalized temporal closeness will be faster. However, for a data set with $|V| = 10^8$ vertices, we have $h \approx 1.84 \cdot 10^7$, and more drastically for $|V| = 10^{100}$ it follows that $h \approx 2.3 \cdot 10^8$. Choosing a large enough h to achieve a low error for data sets with few vertices may lead to too high running times. Hence, we ran Algorithm 4 ten times for each data set with the sampling sizes $h = p \cdot |V|$ with p ∈ {0.1, 0.2, 0.5}. Table 6 reports the average running times and the standard deviations. Theorem 10 shows that TC-Approx with high probability computes closeness values with a small additive error. For p = 0.5, the running times for Arxiv, WikiTalk and Digg exceed the running times using the exact algorithms. As expected, with increased sampling sizes, the approximation error is reduced. Table 6 shows the approximation errors for the data sets. For p = 0.1, an approximation error of only 8% is reached for WikiTalk while using less than half the running time of EdgeStr. The standard deviation of the mean approximation error over the ten rounds is below 0.01 for all data sets.

Q5 We measured the Jaccard similarity between the temporal and static top-k closeness vertex sets. Table 7 shows that the similarity is very low in most cases, and for Facebook and Digg even zero. In case of Arxiv and k = 10, the similarity is highest with 0.81. Furthermore, Table 7 shows the Jaccard similarity between the temporal top-k closeness vertices and the vertices with the largest number of outgoing temporal edges, i.e., the out-degree centrality. For all chosen k, the Jaccard similarity for Infectious is low. For Facebook, WikipediaSG and FlickrSG, the sets of the top-10 temporal closeness vertices and degree centrality vertices are equal. However, the similarity declines fast for each of these data sets and is very low for k = 1000. We also compared the Jaccard similarity between the top-k temporal closeness vertices and the top-k vertex sets of vertices with the highest reachability values, i.e., vertices u for which r(u) is highest (Table 7). The values are small for all data sets, with 0.23 being the highest similarity for Arxiv and k = 1000. For k = 10, all similarities are zero with the exception of the Facebook and FlickrSG data sets with similarities of 0.17 and 0.05, respectively. The Jaccard similarities between the top-k temporal closeness and the temporal in-closeness vertices are very low for all data sets but WikiTalk and FlickrSG. These low Jaccard similarities are expected due to the missing symmetries of the directed temporal edges and the temporal restrictions. The vertices with a high temporal closeness are different from the vertices that have a high temporal in-closeness.
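The Jaccard similarity used for comparing top-k vertex sets can be computed as in the following minimal sketch. Returning one for two empty sets is a convention chosen here for illustration and is not taken from the evaluation. The function name jaccard and the example sets are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections of vertices."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention for two empty sets (assumption)
    return len(a & b) / len(a | b)


top_temporal = ["v1", "v2", "v3"]
top_static = ["v2", "v3", "v4"]
print(jaccard(top_temporal, top_static))
```

Two top-k sets that share k - 1 of their k vertices already have a similarity below one, which explains values such as 0.99 for k = 1000.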
To further investigate the relationships between the vertex rankings obtained using temporal closeness, degree centrality, and reachability, we measured the correlations of the rankings using the Kendall rank correlation coefficient [23]. Because vertices in a graph can have equal centrality values and rank, we use the version accounting for possible ties in the rankings (Kendall's $\tau_b$ coefficient) [24]. The Kendall rank correlation coefficient is commonly used for analyzing and comparing vertex rankings given by centrality measures [14]. Let $X = (x_1, \ldots, x_n)$ and $Y = (y_1, \ldots, y_n)$ be two sequences, i.e., the ranked vertices. A pair of $(x_i, y_i)$ and $(x_j, y_j)$ with $1 \le i < j \le n$ is concordant if $x_i < x_j$ and $y_i < y_j$, or $x_i > x_j$ and $y_i > y_j$. The pair is discordant if $x_i > x_j$ and $y_i < y_j$, or $x_i < x_j$ and $y_i > y_j$. The Kendall rank correlation is defined as the number of concordant pairs minus the number of discordant pairs, normalized by the number of all possible pairs (accounting for ties) in the rankings. Figure 6 shows the correlation matrices for all data sets. The correlation between the rankings by degree and temporal closeness is very strong for most data sets. For the Infectious and WikiTalk data sets, it is 0.53 and 0.78, respectively, and for the other data sets, it is between 0.94 and one. The reason for the high correlation is that, in the case of unit traversal times, each direct neighbor adds a value of one to the temporal closeness. For the reachability, the correlation varies between 0.56 for the Infectious and 0.97 for the WikiTalk data set, with an average value of 0.82. There is only little correlation between temporal and static closeness for most data sets.

Finally, we compare different centrality measures as a heuristic for selecting initial vertices for distributing information in a temporal network. The problem of choosing optimal initial vertices is an instance of the influence maximization problem in temporal networks. Kempe et al. [22] have shown that selecting the most influential vertices is NP-hard in static networks. Here, we simulated an information diffusion process in the Infectious network. This information could be, e.g., (fake) news, a viral campaign, or some infectious disease. We simulated the dissemination over the time spanned by the temporal graph. At the start of the simulation, only a set of k = 100 seed vertices possesses the information. We chose the seed vertices with the top-k temporal closeness, the top-k highest degree, the top-k static closeness, and k randomly chosen vertices, respectively. The static closeness was computed on the aggregated graph A(G). Next, we evaluated the spread of the information over time. Each vertex, i.e., person, that has the information propagates it with a probability of 10%. Hence, each time a temporal edge (u, v, t, 1) is available at time t, the vertex u gives the information to vertex v with a probability of 10%. The information is then available to vertex v for further distribution at time t + 1. We repeated the experiment ten times and measured the mean number of persons who obtained the information after processing the last edge. Figure 7 shows the results. When using the top-k temporal closeness vertices as seeds, the information is distributed to the highest number of people. Using the vertices with the highest degree as seeds leads to the second-best results. The static closeness results cannot compete because the temporal restrictions are not respected during the computation of the top-k static closeness vertices. The randomly chosen vertices perform better than static closeness. In conclusion, temporal closeness can be a helpful heuristic for selecting influential vertices in temporal graphs.

Fig. 7 Comparing different centrality measures for choosing the seed of a dissemination process
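The dissemination process described above can be sketched in a few lines. This is a simplified illustration under assumptions and not the code used for Fig. 7: edges are tuples (u, v, t, lam), the seeds can forward the information from the beginning, a vertex that already has the information is not informed again, and the returned count includes the seeds. The function name simulate_spread is hypothetical.

```python
import random


def simulate_spread(edges, seeds, prob=0.1, rng=None):
    """Simulate one dissemination run and return the number of informed vertices."""
    rng = rng or random.Random()
    # Time from which each informed vertex can pass the information on.
    informed_from = {s: float("-inf") for s in seeds}
    for u, v, t, lam in sorted(edges, key=lambda e: e[2]):
        sender_ready = u in informed_from and informed_from[u] <= t
        if sender_ready and v not in informed_from and rng.random() < prob:
            informed_from[v] = t + lam  # available to v after the transition time
    return len(informed_from)


edges = [("a", "b", 1, 1), ("b", "c", 2, 1), ("c", "d", 1, 1)]
runs = [simulate_spread(edges, {"a"}, prob=0.1, rng=random.Random(i)) for i in range(10)]
print(sum(runs) / len(runs))  # mean number of informed vertices over ten runs
```

Because the edges are processed in chronological order, a vertex can only forward the information along edges that become available after it was informed, which is exactly the temporal restriction that static closeness ignores.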

Conclusion
We introduced algorithms for computing temporal closeness in temporal graphs. The basis for our algorithms is a new fastest path algorithm that uses a label setting strategy and might be of interest on its own. Our top-k temporal closeness algorithms improved the running times for all real-world data sets. We introduced simple yet strong heuristic modifications that significantly reduced the running times and led to only small errors for all data sets. Furthermore, we adapted two randomized algorithms, one for estimating the number of reachable vertices and one for approximating the closeness for all vertices. The reachability approximation is also interesting in itself. Both randomized approaches lead to significant speed-ups, and the sampling algorithm for temporal closeness has only a small additive error with high probability. Our approaches allow us to efficiently find the most relevant vertices and their temporal closeness in temporal graphs.

Fig. 2
Fig. 2 Example of a temporal graph and a time-restricted subgraph

Definition 1 (Harmonic Temporal Closeness)
Let $G = (V, E)$ be a temporal graph and $I$ a time interval. We define the harmonic temporal closeness for $u \in V$ with respect to $I$ as $c_I(u) = \sum_{v \in V \setminus \{u\}} \frac{1}{d_I(u, v)}$. We call $c^n_I(u) = \frac{c_I(u)}{|V|}$ the normalized harmonic temporal closeness. Now we define the top-k temporal closeness problem.

Definition 2
For a temporal graph $G = (V, E)$ and $k \in \mathbb{N}$, the Top-k Harmonic Temporal Closeness Problem asks for the k largest values of the harmonic temporal closeness and the set of all vertices in V with these values.
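The two definitions can be made concrete with a brute-force sketch. It is meant only to illustrate the quantities involved and is not the top-k algorithm of this paper. It assumes edges of the form (u, v, t, lam) with lam > 0, that an edge belongs to I = [a, b] if it starts and arrives within I, and that unreachable vertices contribute zero to the sum. The function names are hypothetical.

```python
from math import inf


def fastest_durations(edges, source):
    """Minimal duration (arrival minus departure) from `source` to each reachable vertex."""
    stream = sorted(edges, key=lambda e: e[2])
    best = {}
    # Try every departure time at the source and keep the shortest duration found.
    for t0 in sorted({t for u, _, t, _ in stream if u == source}):
        arrival = {source: t0}
        for u, v, t, lam in stream:
            if arrival.get(u, inf) <= t and t + lam < arrival.get(v, inf):
                arrival[v] = t + lam
        for v, arr in arrival.items():
            if v != source:
                best[v] = min(best.get(v, inf), arr - t0)
    return best


def harmonic_closeness(edges, vertices, interval):
    """Harmonic temporal closeness of every vertex with respect to `interval`."""
    a, b = interval
    inside = [e for e in edges if a <= e[2] and e[2] + e[3] <= b]
    return {u: sum(1.0 / d for d in fastest_durations(inside, u).values())
            for u in vertices}


def top_k(closeness, k):
    """The k largest closeness values, each with all vertices attaining it."""
    values = sorted(set(closeness.values()), reverse=True)[:k]
    return {c: sorted(v for v in closeness if closeness[v] == c) for c in values}


edges = [("a", "b", 1, 1), ("b", "c", 2, 1), ("a", "c", 5, 1)]
closeness = harmonic_closeness(edges, ["a", "b", "c"], (0, 10))
print(closeness)
print(top_k(closeness, 1))
```

In the example, the fastest path from a to c is the single late edge (a, c, 5, 1) with duration one, not the earlier two-edge path with duration two. Restricting the interval to (0, 4) removes that edge and lowers the closeness of a from 2 to 1.5.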

Theorem 3
Let $\delta^+_m$ be the maximal out-degree in $G_I$, $\pi = \min\{\tau^I_-, \tau^I_+\}$, and furthermore $\xi = \max\{\pi \delta^+_m, \log(|E_I|)\}$. The running time of Algorithm 2 is in $O(|V|^2 + |V||E| \cdot \tau^I_+ \cdot \xi)$.
Proof Ordering the vertices by decreasing out-degree takes $O(|V| \log |V|)$ time. Computing $\lambda_{min}$ is done during the removal of vertices that are not in the time interval $I$, i.e., this together takes $O(|E|)$ time. We can compute by checking for each edge $(u, v, t, \lambda) \in E_I$ the time stamps of the outgoing edges at $v$ in $O(|E_I| \delta^+_m)$. Initialization is done in $O(|E|)$. The outer for loop (line 4) is called for each vertex, and the inner loop is an adaptation of Algorithm 1. The updating of the upper bound can be done in constant time. Therefore, we have $|V|$ times the running time of Algorithm 1. Keeping track of the top-k results is possible in $O(|V| \log k)$.

Theorem 4
Let $\delta^+_m$ be the maximal out-degree and $\xi = \max\{h \delta^+_m, \log(|E_I|)\}$. The running time of the temporal top-k closeness algorithm using Heuristic 1 is in $O(|V|^2 + |V||E| \cdot \tau^I_+ \cdot \xi)$.
Proof Let $G = (V, E)$ be a temporal graph, $u \in V$, and $I$ a time interval. The fastest path algorithm using the first heuristic returns estimations of the durations of the fastest $(u, v)$-paths for all $v \in V$ in $G_I$ in running time $O(|V| + |E| \cdot |\tau^I_+(u)| \cdot \xi)$.

For a temporal graph $G = (V, E)$ spanning time interval $I$, let $\delta^-_m$ be the maximal in-degree in $G$, and let $\pi = \min\{\delta^-_m, |\tau^I_+(u)|\}$. The running time of the edge stream algorithm for computing fastest paths is in $O(|V| + |E| \log \pi)$, and in graphs with equal transition times for all edges in $O(|V| + |E|)$. For $\pi = \min\{\delta^-_m, \tau^I_+\}$, starting their algorithm from each vertex leads to temporal closeness algorithms with running times of $O(|V|^2 + |V||E| \log \pi)$ or $O(|V|^2 + |V||E|)$, respectively.

Fig. 3
Fig. 3 Example of a temporal graph G and its temporal transpose I(G). All edges of G and I(G) have unit transition times, and $t_{max} = 9$ due to edge (w, z, 8, 1)

Fig. 4
Fig. 4 Decrease in running time of the top-k algorithms in percent compared to baseline EdgeStr

Fig. 5
Fig. 5 Decrease of the running times in percent of the algorithms using the heuristics compared to the exact algorithms

pairs, normalized by the number of all possible pairs (accounting for ties) in the rankings. The correlation coefficient takes on values between −1 and 1, where values close to one indicate similar rankings, close to zero no correlation, and close to minus one a strong negative correlation.

Table 1
Statistics and properties of the data sets, with $\delta^-_m$ (resp. $\delta^+_m$) being the maximal in-degree (resp. out-degree), and $|E_s|$ the number of edges in the aggregated graph

Table 2
Running times in seconds for the exact temporal closeness algorithms

Table 3
Running times in seconds for the algorithms using the reachability approximation with h = 5. The running times are the averages and standard deviations over ten repetitions

Table 4
Running times in seconds for the first heuristic with h = 2

With increasing k, the running time also increases because the upper bounds calculated during the run of TC-Top-k converge slower at the lower end of the top-k values. Our exact algorithm TC-All is faster than EdgeStr for the Infectious, Facebook, Prosper and WikipediaSG data sets. For the Infectious data set our exact algorithm is more than ten

Table 5
Running times in seconds for the second heuristic

Table 6
The average running times and standard deviations in seconds and the mean approximation error for TC-Approx

Table 7
Jaccard similarity between the top-k temporal closeness vertices and the temporal in-closeness vertices, the static closeness vertices in the aggregated graph, the vertices with the highest number of outgoing temporal edges, and the vertices with the highest reachability