Finding events in temporal networks: Segmentation meets densest-subgraph discovery

In this paper we study the problem of discovering a timeline of events in a temporal network. We model events as dense subgraphs that occur within intervals of network activity. We formulate the event-discovery task as an optimization problem, where we search for a partition of the network timeline into k non-overlapping intervals, such that the intervals span subgraphs with maximum total density. The output is a sequence of dense subgraphs along with corresponding time intervals, capturing the most interesting events during the network lifetime. A naive solution to our optimization problem has polynomial but prohibitively high running time. We adapt recent work on dynamic densest-subgraph discovery and approximate dynamic programming to design a fast approximation algorithm. Next, to ensure richer structure, we adjust the problem formulation to encourage coverage of a larger set of nodes. This problem is NP-hard even for static graphs. However, on static graphs a simple greedy algorithm yields an approximate solution due to submodularity. We extend this greedy approach to temporal networks. Although the approximation guarantee does not carry over to the temporal case, our experiments show that the algorithm finds good-quality solutions.


I. INTRODUCTION
Real-world networks are highly dynamic in nature, with new relations (edges) being continuously established among entities (nodes), and old relations being broken. Analyzing the temporal dimension of networks can provide valuable insights about their structure and function. For instance, it can reveal temporal patterns, concept drift, periodicity, and temporal events. In this paper we focus on the problem of finding dense subgraphs, a fundamental graph-mining primitive. Applications include community detection in social networks [1]-[3], gene expression and drug-interaction analysis in bioinformatics [4], [5], graph compression and summarization [6]-[8], spam and security-threat detection [9], [10], and more.
When working with temporal networks one first has to define how to deal with the temporal dimension, i.e., how to identify the temporal intervals in which dense structures should be sought. Instead of defining those intervals a priori, in this paper we study the problem of automatically identifying the intervals that provide the most interesting structures. We consider a subgraph interesting if it has high density. As a result, we are able to discover a sequence of dense subgraphs in the temporal network, capturing the evolution of interesting events that occur during the network lifetime. As a concrete example, consider the problem of story identification in online social media [11], [12]: the main goal is to automatically discover emerging stories by finding dense subgraphs induced by some entities, such as Twitter hashtags, co-occurring in a social media stream. In our case, we are additionally interested in understanding how the stories evolve over time. For instance, as one story wanes and another one emerges, one dense subgraph among entities dissipates and another one appears. Thus, by segmenting the timeline of the temporal network into intervals, and identifying dense subgraphs in each interval, we can capture the evolution and progression of the main stories over time.
As another example, consider a collaboration network, where a sequence of dense subgraphs in the network can reveal information about the main trends and topics over time, along with the corresponding time intervals.
Challenges and contributions. The problem of finding the k densest subgraphs in a static graph has been considered in the literature from different perspectives. One natural idea is to iteratively (and greedily) find and remove the densest subgraphs [13]. More recent works consider finding k densest graphs with limited overlap [14], [15]. However, these approaches do not generalize to temporal networks.
For temporal networks, to our knowledge, there are only a few papers that consider the task of finding temporally coherent densest subgraphs. The work most similar to ours aims at finding a heavy subgraph present in all, or k, snapshots [16]. Another related work focuses on finding a dense subgraph covered by k scattered intervals in a temporal network [17]. Both of these methods, however, find a single densest subgraph.
In this paper, instead, we aim at producing a segmentation of the temporal network that (i) captures dense structures in the network; (ii) exhibits temporal cohesion; (iii) spans the whole history of the network; and (iv) is amenable to direct inspection and temporal interpretation. Towards this goal we formulate the problem of k-DENSEST-EPISODES, which asks for a partition of the temporal domain into k non-overlapping intervals, such that the intervals span subgraphs with maximum total density. The output is a sequence of dense subgraphs along with corresponding time intervals, capturing the most interesting events during the network lifetime.
A naïve solution to this problem has polynomial but prohibitively high running-time complexity. Thus, we adapt recent work on dynamic densest subgraph [18] and approximate dynamic programming [19] to design a fast approximation algorithm.
Next we shift our attention to encouraging coverage of a larger set of nodes, so as to produce richer, more interesting structures. The resulting new problem formulation turns out to be NP-hard even for the case of static graphs. However, on static graphs a simple greedy algorithm yields an approximate solution thanks to the submodularity of the objective function. Following this observation, we extend this greedy approach to temporal networks. Although the approximation guarantee does not carry over to the temporal case, our experimental evaluation indicates that the method produces solutions of very high quality.
The contributions of this paper are summarised as follows:
• We introduce (Section II) the k-DENSEST-EPISODES problem and show that it has a polynomial-time exact algorithm, whose running time is however too high to be practical.
• By leveraging recent work on dynamic densest subgraph and approximate dynamic programming we obtain a fast algorithm with approximation guarantees (Section III).
• We then (Section IV) extend the problem formulation to encourage coverage of a larger set of nodes. We show that the resulting problem is NP-hard even for static graphs. However, we show that on static graphs a simple greedy algorithm yields an approximate solution due to submodularity; we then extend this greedy approach to temporal networks.
• Experiments on synthetic and real-world datasets (Section V), and a case study on Twitter data (Section VI), confirm that our methods are efficient and produce meaningful and high-quality results.

II. PROBLEM FORMULATION
We are given a temporal graph $G = (V, T, \tau)$, where $V$ denotes the set of nodes, $T = [0, 1, \ldots, t_{max}] \subset \mathbb{N}$ is a discrete time domain, and $\tau : V \times V \times T \to \{0, 1\}$ is a function defining for each pair of nodes $u, v \in V$ and each timestamp $t \in T$ whether edge $(u, v)$ exists in $t$. Given a temporal interval $I \subseteq T$, we write $G[I]$ for the graph induced by the edges that exist at some timestamp in $I$.

Definition 1 (Episode). Given a temporal graph $G = (V, T, \tau)$ we define an episode as a pair $(I, H)$ where $I \subseteq T$ is a temporal interval and $H$ is a subgraph of $G[I]$.
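To make the notation concrete, the following minimal Python sketch stores a temporal graph as a list of timestamped interactions and extracts the graph spanned by a closed interval. The names `Interaction` and `induced_graph` are illustrative, and treating edges as undirected, deduplicated, and free of self-loops is an assumption of this sketch rather than something the definition above fixes.

```python
from collections import namedtuple

# One interaction (u, v) observed at integer timestamp t.
Interaction = namedtuple("Interaction", ["u", "v", "t"])


def induced_graph(interactions, start, end):
    """Return (nodes, edges) of the graph spanned by the closed interval [start, end].

    Assumption of this sketch: edges are undirected and deduplicated,
    and self-loops are ignored.
    """
    edges = set()
    for u, v, t in interactions:
        if start <= t <= end and u != v:
            edges.add(frozenset((u, v)))
    nodes = set()
    for edge in edges:
        nodes |= edge
    return nodes, edges


interactions = [
    Interaction("a", "b", 0),
    Interaction("b", "c", 1),
    Interaction("a", "b", 2),
    Interaction("c", "d", 5),
]
nodes, edges = induced_graph(interactions, 0, 2)
```

An episode is then simply a pair of an interval and any subgraph of the graph returned for that interval.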
Our goal is to find a set of interesting episodes along the lifetime of the temporal graph. In particular, our measure of interestingness is the density of the subgraph in each episode. We adopt the widely used average-degree notion of density of a subgraph $H = (V(H), E(H))$, i.e., $d(H) = |E(H)| / |V(H)|$. This definition is not the only possible choice, but it enjoys the following nice properties: it can be optimized exactly [20] and approximated efficiently [21], and a densest subgraph can be computed in real-world graphs containing up to tens of billions of edges [22].

Problem 1 (k-DENSEST-EPISODES). Given a temporal graph $G = (V, T, \tau)$ and an integer $k \in \mathbb{N}$, find a set of $k$ episodes $S = \{(I_\ell, H_\ell)\}$, for $\ell = 1, \ldots, k$, such that the $\{I_\ell\}$ are disjoint intervals and $\sum_{\ell=1}^{k} d(H_\ell)$ is maximized.

A solution for Problem 1 can be computed in polynomial time. To see this, let $S^*$ be an optimum solution and let $I(S^*) = \{I_\ell, \ell = 1, \ldots, k\}$ and $G(S^*) = \{H_\ell, \ell = 1, \ldots, k\}$. We can assume without loss of generality that the union of the intervals in $I(S^*)$ is the set of timestamps $T$, that is, $I(S^*)$ is a $k$-segmentation of $T$. Moreover, each graph $H_\ell \in G(S^*)$ is the densest subgraph of $G[I_\ell]$, and can be found in $O(nm \log n)$ time [20], [23] or in $O(nm \log(n^2/m))$ time [24] (where $n$ and $m$ denote the number of nodes and edges in $G[I_\ell]$, respectively). The optimal segmentation can be found with a standard dynamic-programming approach, requiring $O(km^2)$ steps [25]. This brings the total running time to $O(km^3 n \log n)$ or $O(km^3 n \log(n^2/m))$.
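The exact procedure described above can be sketched in Python for toy inputs. This is only an illustration under stated assumptions: the densest subgraph is found by brute-force enumeration of node subsets (standing in for the flow-based exact algorithms of [20], [23], [24]), timestamps are the integers 0..t_max, nodes are sortable values such as integers, and the function names are hypothetical.

```python
from itertools import combinations


def density(nodes, edges):
    """d(H) = |E(H)| / |V(H)| for the subgraph induced by `nodes`."""
    inside = [e for e in edges if e[0] in nodes and e[1] in nodes]
    return len(inside) / len(nodes)


def densest_brute_force(edges):
    """Exact densest-subgraph density by enumerating node subsets (tiny graphs only)."""
    nodes = sorted({x for e in edges for x in e})
    best = 0.0
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            best = max(best, density(set(subset), edges))
    return best


def k_densest_episodes(interactions, t_max, k):
    """Exact DP over timestamps 0..t_max; returns (total density, list of intervals)."""
    r = t_max + 1

    def d_star(a, b):
        # Densest-subgraph density of the graph spanned by the interval [a, b].
        edges = {(min(u, v), max(u, v))
                 for u, v, t in interactions if a <= t <= b and u != v}
        return densest_brute_force(edges)

    neg = float("-inf")
    # best[l][i]: best profit of splitting the first i timestamps into l intervals.
    best = [[neg] * (r + 1) for _ in range(k + 1)]
    back = [[None] * (r + 1) for _ in range(k + 1)]
    best[0][0] = 0.0
    for l in range(1, k + 1):
        for i in range(l, r + 1):
            for j in range(l, i + 1):  # last interval covers timestamps j-1 .. i-1
                if best[l - 1][j - 1] == neg:
                    continue
                value = best[l - 1][j - 1] + d_star(j - 1, i - 1)
                if value > best[l][i]:
                    best[l][i], back[l][i] = value, j
    # Walk the back-pointers to recover the intervals.
    intervals, i = [], r
    for l in range(k, 0, -1):
        j = back[l][i]
        intervals.append((j - 1, i - 1))
        i = j - 1
    return best[k][r], intervals[::-1]


# Two triangles that appear one after the other.
toy = [(1, 2, 0), (2, 3, 0), (1, 3, 1), (4, 5, 2), (5, 6, 3), (4, 6, 3)]
total, intervals = k_densest_episodes(toy, t_max=3, k=2)
```

On this toy input the segmentation separates the two triangles, each contributing density 1. The sketch recomputes the densest subgraph for every candidate interval, which is exactly the cost that makes the naive approach impractical on real data.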

III. APPROXIMATE DYNAMIC PROGRAMMING
The simple algorithm discussed in the previous section has a running time that is prohibitively expensive for large graphs. In this section we develop a fast algorithm with approximation guarantees.
The derivations below closely follow the ones in [19], which improves on [26]. However, we cannot use those results directly: both papers work with minimization problems, while leveraging the fact that the profit of an interval is not less than the profit of its subintervals (monotone non-decreasing). In contrast, our problem can be viewed as a minimization problem with a monotone non-increasing profit function.
Given a time interval $T = [t_1, t_2]$, let us write $d^*(T) = \max_{H \subseteq G[T]} d(H)$. For simplicity, we define $d^*([t_1, t_2]) = 0$ if $t_2 < t_1$. Problem 1 is now a classic k-segmentation problem of $T$, maximizing the total sum of scores $d^*(T)$ over the individual time intervals. For notational simplicity, we assume that all timestamps in $T$ are enumerated by integers from 1 to $r$.
Let $o[i, \ell]$ be the profit of the optimal $\ell$-segmentation using only the first $i$ timestamps. It satisfies the standard segmentation recurrence
$$o[i, \ell] = \max_{j \le i} \left( o[j - 1, \ell - 1] + d^*([j, i]) \right),$$
so $o[i, k]$ can be computed recursively. Denote the approximate profit of the optimal $\ell$-segmentation by $s[i, \ell]$. The main idea behind the speed-up is not to test all possible values of $j$. Instead, we keep a small set of candidates, denoted by $A$, and only test those values. The challenge is to keep $A$ small while still guaranteeing the approximation ratio. The pseudo-code achieving this balance is given in Algorithm 1, while a subroutine that keeps the candidate list short is given in Algorithm 2.

Algorithm 1 executes a standard dynamic-programming search: it assumes that the partition of the first $i' < i$ data points into $\ell - 1$ intervals is already computed, and finds the best last interval $[a, i]$ for partitioning the first $i$ points into $\ell$ intervals. However, it does not consider all possible candidates $[a, i]$, but only a sparsified list that preserves the quality guarantee. The sparsified list is built, for a fixed number of intervals $\ell$, starting from an empty list. Intuitively, it keeps only candidates $A = [a_j]$ with a significant difference in $s[a_j, \ell - 1]$. The significance of a difference depends on the current best profit $s[i, \ell]$: the larger the value of the solution found, the less cautious we need to be about lost candidates, and the coarser $A$ becomes. Thus, we need to refine $A$ with Algorithm 2 after each processed $i$.
Let us first prove that ApproxDP yields an approximation guarantee, assuming that $d^*(\cdot)$ is calculated exactly.
To prove the final result, let us first fix $\ell$ and let $A_i$ be the set of candidates at the beginning of round $i$. Let $\delta_i$ be the value of $\delta$ in Algorithm 2, called on iteration $i$.

Proof. We say that a list of numbers
for every $a_j \in A$ with $j < |A|$. We first prove by induction over $i$ that $A_i$ is $i$-dense.

Assume that
and SPRS does not create gaps larger than Let $a_j$ be the largest element in $A_i$ such that $a_j \le b$. Then either $a_j \le b < a_{j+1}$ or $b = a_{|A_i|}$ and $j = a_{|A_i|}$.
This concludes the proof.
We can now complete the proof.
Proof of Proposition 1. We prove the result by induction over $\ell$. Let $\alpha = 1 + \frac{\epsilon}{k}(\ell - 1)$. Let $b$ be the starting point of the last interval of the optimal solution $o[i, \ell]$, and let $a_j$ be as given by Lemma 1. Upper bound As a result,

Let us now address the computational complexity.

Proof. Fix $i$ and $\ell$, and let $c_j = s[a_j, \ell]$, where $a_j \in A_i$. Then $c_j$ is a monotonically increasing sequence, upper bounded by $s[i - 1, \ell]$, whose consecutive elements are at least $\delta_{i-1}$ apart. Counting conservatively, this leads to Since we have $kr$ cells in $s$, the result follows.
Since computing $d^*$ requires $O(nm \log n)$ time, this gives us a total running time of $O(nmrk^2)$. We further speed up our algorithm by approximating the value $d^*$ by means of one of the approaches developed in [18]. In particular, we employ the algorithm that maintains a $2(1 + \epsilon)$-approximate solution for the incremental densest-subgraph problem (i.e., edge insertions only), at a poly-logarithmic amortized cost. We shall refer to this algorithm as ApprDens.
ApprDens allows us to efficiently maintain the approximate density of the densest subgraph $d^*([a, i])$ for each $a$ in $A_i$ in ApproxDP, as larger values of $i$ are processed and edges are added. Whenever we remove an item $a$ from $A_i$ in SPRS we also drop the corresponding instance of ApprDens.
From the fact that an approximate densest subgraph can be maintained with poly-logarithmic amortized cost, it follows that our algorithm has quasi-linear running time.

Proof. Let $m_j$ be the number of edges added to the graph corresponding to $a_j$ before it is deleted. The same argument as in the proof of Proposition 2 states that $\sum_j m_j \in O\left(\frac{k^2}{\epsilon_1} m\right)$. Theorem 4 in [18] states that maintaining the graph with $m_j$ edges requires $O(m_j \epsilon_2^{-2} \log^2 n)$ time. Combining these two results proves the proposition.
When combining ApproxDP with ApprDens, we wish to maintain the same approximation guarantee of ApprDens. Recall that ApproxDP leverages the fact that the profit function is monotone non-increasing. Unfortunately, ApprDens does not necessarily yield a monotone score function, as the density of the computed subgraph might decrease when a new edge is inserted. This can be easily circumvented by keeping track of the best solution, i.e. the subgraph with highest density. The following proposition holds.
Proof. Let $d^*_a(T)$ be the density of the graph returned by ApprDens for a time interval $T$. Let $O$ be the optimal $k$-segmentation, and let Let $q_3$ be the score of the optimal $k$-segmentation using $d^*_a$, and let $q_4$ be the score of the segmentation produced by ApproxDP. Then, completing the proof.
We will refer to this combination of ApproxDP with ApprDens as Algorithm KGAPPROX.
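The remedy mentioned above, keeping track of the best solution so that the score function stays monotone, can be illustrated with a few lines of Python. This is a minimal sketch: `BestSoFar` is a hypothetical wrapper, and the raw estimates fed to it are made-up numbers standing in for the output of an incremental approximate densest-subgraph routine.

```python
class BestSoFar:
    """Wrap a stream of (density, subgraph) estimates so the reported score never decreases."""

    def __init__(self):
        self.density = 0.0
        self.subgraph = frozenset()

    def update(self, density, subgraph):
        # Keep the new estimate only if it beats the best one seen so far.
        if density > self.density:
            self.density = density
            self.subgraph = frozenset(subgraph)
        return self.density


tracker = BestSoFar()
# Hypothetical raw estimates after four successive edge insertions.
raw = [(1.0, {1, 2}), (1.5, {1, 2, 3}), (1.2, {1, 2, 3, 4}), (2.0, {1, 2, 3, 4, 5})]
reported = [tracker.update(d, h) for d, h in raw]
```

Because edges are only inserted, a node set remembered from an earlier step can only gain edges later, so the stored density remains a valid lower bound for that node set.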

IV. ENCOURAGE COVERAGE
Problem 1 focuses on maximizing total density, so its solution can contain graphs that are dense but whose node sets together cover only a small part of the network. Such a segmentation is useful when we are interested in the densest temporally coherent subgraphs, which can be understood as tight cores of temporal clusters. However, segmentations with larger but less dense subgraphs, covering a larger fraction of nodes, can be useful to obtain a high-level explanation of the whole temporal network. To allow for such segmentations we extend Problem 1 to take node coverage into account. Let $x_v(\mathcal{G})$ denote the number of subgraphs in a set of subgraphs $\mathcal{G}$ that contain node $v$. Here we consider generalized cover functions of the form $\mathrm{cover}(\mathcal{G}) = \sum_{v \in V} w(x_v(\mathcal{G}))$, where $w$ is a non-negative non-decreasing concave function of $x_v(\mathcal{G})$. When $w(x_v(\mathcal{G}))$ is a 0-1 indicator function, $\mathrm{cover}(\mathcal{G})$ is the standard cover, which is intuitive and easy to optimize with a greedy algorithm. Another instance of the generalized cover function, inspired by text-summarization research [27], is $w(x_v(\mathcal{G})) = \sqrt{x_v(\mathcal{G})}$. It ensures that the marginal gain of a node decreases with the number of times the node is covered.
Proposition 5. There is no polynomial-time algorithm for Problem 2 unless P = NP.

Proposition 6. Function $\mathrm{cover}(\mathcal{G})$ is a non-negative non-decreasing submodular function of subgraphs.
Proof. For a fixed $v \in V$, function $x_v(\mathcal{G})$ is non-decreasing modular (and submodular): for any set of subgraphs $X$ and a new subgraph $x$ holds that belongs to $x$ and does not belong to any subgraph in $X$. Otherwise 0. By a property of submodular functions, the composition of a concave non-decreasing function and a submodular non-decreasing function is non-decreasing submodular. Function $\mathrm{cover}(\mathcal{G})$ is submodular and non-decreasing as a non-negative linear combination of such functions. Non-negativity follows from the non-negativity of $w$.
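The diminishing-returns behaviour behind the submodularity claim can be checked numerically on a small example. The sketch below is illustrative only: it assumes that the per-node quantity is the number of selected subgraphs containing the node, that the weight function is zero for uncovered nodes, and it uses a square root as one possible concave weight. The helper names `cover` and `indicator` are hypothetical.

```python
import math
from collections import Counter


def cover(subgraphs, w):
    """Sum of w(number of selected subgraphs containing v) over all covered nodes v."""
    counts = Counter(v for nodes in subgraphs for v in set(nodes))
    return sum(w(c) for c in counts.values())


def indicator(x):
    """0-1 weight: a node counts once, however many times it is covered."""
    return 1.0 if x >= 1 else 0.0


selected = [{1, 2}, {2, 3}]
new = {2, 3, 4}

# Marginal gain of `new` on top of a smaller and a larger selection.
gain_small = cover([selected[0], new], math.sqrt) - cover([selected[0]], math.sqrt)
gain_large = cover(selected + [new], math.sqrt) - cover(selected, math.sqrt)
```

Adding the same subgraph to the larger selection yields the smaller gain, which is the defining property of a submodular function.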
A. K static densest subgraphs and generalized average degree
Before going into the temporal segmentation, we briefly consider the static case (Problem 3). To solve this problem we can search greedily over subgraphs. Let $\mathcal{H}_{i-1} = \{H_1, \ldots, H_{i-1}\}$, and denote the marginal node gain, given weight function $w$, by $\delta_v(H_i, \mathcal{H}_{i-1} \mid w)$. Then denote the marginal gain of subgraph $H_i$, given the already selected graphs $\mathcal{H}_{i-1}$, by $\chi(H_i, \mathcal{H}_{i-1})$. The greedy algorithm for Problem 3 iteratively builds the set $\mathcal{H}$ by adding the subgraph $H_i$ that maximizes the gain $\chi(H_i, \mathcal{H}_{i-1})$. If we can find $H_i$ optimally, this greedy algorithm gives a $1 - 1/e$ approximation, by the classic result on submodular maximization under a cardinality constraint (see [28]; $e \approx 2.71828$ is Euler's number).
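The greedy selection scheme can be sketched as follows. This is a simplified illustration, not the algorithm of this section: it picks from a fixed, explicitly given pool of candidate node sets and scores them by coverage gain only, whereas the method above searches over all subgraphs. The function name `greedy_cover` and the square-root weight are assumptions of the sketch.

```python
import math
from collections import Counter


def greedy_cover(candidates, k, w=math.sqrt):
    """Pick up to k node sets from `candidates`, each time maximizing the coverage gain."""
    counts = Counter()          # how many chosen sets contain each node
    chosen = []
    remaining = list(candidates)
    for _ in range(min(k, len(remaining))):

        def gain(nodes):
            # Increase of sum_v w(count_v) if `nodes` were added now.
            return sum(w(counts[v] + 1) - w(counts[v]) for v in nodes)

        best = max(remaining, key=gain)
        chosen.append(best)
        remaining.remove(best)
        counts.update(best)
    return chosen


pool = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({1, 2})]
picked = greedy_cover(pool, k=2)
```

After the first pick covers nodes 1, 2, and 3, the set containing the uncovered node 4 has the larger gain and is selected next.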

Problem 4. Given a static graph
Before we proceed, we define a more general and simpler version of Problem 4. First, we note that the preselected subgraphs $\mathcal{H}_{i-1}$ contribute only to $\delta_v(H_i, \mathcal{H}_{i-1} \mid w)$, and this term does not change through iterations. Thus, once the term $\delta_v(H_i, \mathcal{H}_{i-1} \mid w)$ is recalculated we can exclude $\mathcal{H}_{i-1}$ from consideration.
Next, we define a generalized degree as a function of nodes then Problem 5 is equivalent to Problem 4. Note that in the latter case a depends on the number of nodes in graph H.
We will continue analysis with Problem 5.

Proposition 7. There is no polynomial-time algorithm for Problem 5 unless P = NP.
To solve Problem 5 efficiently we can modify Charikar's algorithm for densest subgraphs [21] and obtain a 1/2-approximation guarantee.
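For reference, Charikar's greedy peeling for the plain density $|E(H)| / |V(H)|$ can be sketched in Python as below. The modification for the generalized degree is not reproduced here; the function name `charikar_peel` is illustrative, and node labels are assumed to be mutually comparable (for example, all integers) because they are used as tie-breakers in the heap.

```python
import heapq
from collections import defaultdict


def charikar_peel(edges):
    """Greedy peeling: repeatedly remove a minimum-degree node, keep the densest prefix."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    nodes = set(adj)
    degree = {v: len(adj[v]) for v in nodes}
    m = sum(degree.values()) // 2
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)

    alive = set(nodes)
    removed = []
    best_density = m / len(nodes) if nodes else 0.0
    best_removed = 0
    while len(alive) > 1:
        d, v = heapq.heappop(heap)
        if v not in alive or d != degree[v]:
            continue  # stale heap entry
        alive.remove(v)
        removed.append(v)
        m -= degree[v]
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                heapq.heappush(heap, (degree[u], u))
        current = m / len(alive)
        if current > best_density:
            best_density, best_removed = current, len(removed)
    return nodes - set(removed[:best_removed]), best_density


# A 4-clique with a pendant path 4-5-6 attached.
graph = [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4), (4, 5), (5, 6)]
dense_nodes, dense_value = charikar_peel(graph)
```

On this example the peeling removes the pendant path first and returns the 4-clique, whose density is 6/4 = 1.5.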

B. Incremental case
Here we consider the setting of incremental updates for Problem 5, which may not be interesting by itself, but which we will use as a subroutine for the temporal case.
Given a stream of incremental edge updates to graph $H$, we would like to find and keep up to date a subgraph $H_i$ that maximizes $d_a(H_i)$ for some generalized degree function $\deg_a(u, v \mid H_i)$.
To keep $H_i$ updated we can use the data structure and update procedure designed for the densest subgraph by Epasto et al. [18]. In the full version of this paper we describe the approach of Epasto et al. and the modifications necessary to handle the generalized degree. We will refer to this extension as ApprGenDens. The algorithm provides a $2(1 + \epsilon)$-approximate densest subgraph with respect to generalized density under edge insertions.

C. Greedy dynamic programming
Similarly to Problem 1, we will use dynamic programming for Problem 2. However, as the problem is hard, we have to rely on greedy choices of the subgraphs. Thus, the obtained solution does not have any quality guarantee.
Let $M[\ell, i]$ be the profit of segmenting the first $i$ points into $\ell$ intervals, and let $C[\ell, i]$ be the set of subgraphs $\mathcal{G} = \{G_1, \ldots, G_\ell\}$ selected on these intervals, for $1 \le \ell \le k$ and $0 \le i \le m$.
Define marginal gain interval $[j, i]$, given that $j - 1$ are already segmented into $\ell - 1$ interval, Dynamic programming recurrence: After filling this table, $M[k, m]$ contains the profit of the $k$-segmentation with subgraph overlaps. $C[k, m]$ contains the selected subgraphs; the intervals and subgraphs can be reconstructed if we keep track of the starting points of the selected last intervals. Note that the profit $M[k, m]$ is not optimal, because the choice of subgraph $G_i$ depends on the interval and on the previous choices, and there is a fixed order in which we explore intervals.
We perform the dynamic programming with the approximation algorithm ApproxDP, and the densest subgraph for each candidate interval is retrieved by ApprGenDens. We refer to the resulting algorithm as KGCVR. To keep track of the values $x_v$ when we construct $\mathcal{G}$ we need to keep the frequency of each node. To avoid extensive memory costs, in the experiments we use Count-Min sketches.
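A sketch-based frequency counter of the kind mentioned above can be illustrated with a small Count-Min sketch. This is a generic textbook version and not the implementation used in the experiments; the class name, the table dimensions, and the use of salted BLAKE2 hashes from the standard library are assumptions of the sketch.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: estimates never undercount, and use fixed memory."""

    def __init__(self, width=512, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _column(self, item, row):
        # One independent-looking hash per row, obtained by salting with the row index.
        salt = row.to_bytes(8, "little")
        digest = hashlib.blake2b(repr(item).encode("utf-8"),
                                 digest_size=8, salt=salt).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        for row in range(self.depth):
            self.table[row][self._column(item, row)] += count

    def estimate(self, item):
        # The minimum over rows is the estimate least inflated by collisions.
        return min(self.table[row][self._column(item, row)]
                   for row in range(self.depth))


# Count how many selected subgraphs contain each node.
sketch = CountMinSketch()
for nodes in [{"a", "b"}, {"b", "c"}, {"b"}]:
    for v in nodes:
        sketch.add(v)
```

The estimate for a node is an upper bound on its true frequency, so coverage counts derived from the sketch can only be overestimated, never underestimated.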

V. EXPERIMENTS
We evaluate the performance of the proposed algorithms on synthetic graphs and real-world social networks. The datasets are described below. Unless otherwise specified, we post-process the output of all algorithms and report the optimal densest subgraphs in the output intervals. Our datasets and implementations are publicly available.

A. Synthetic data.
We generate a temporal network with k planted communities and a background network. All graphs are Erdős-Rényi. The planted communities have the same density, disjoint sets of nodes, and are planted in non-overlapping intervals. The background network includes the nodes of all planted communities. The edges of the background network are generated uniformly on the timeline. In the typical setup the length of the whole time interval is 1000 time units, while the edges of each planted community are generated in an interval of length 100 time units. The densities of the communities and the background network vary. The number of nodes in the background network is set to 100.
We test the ability of our algorithms to discover planted communities in two settings. In the first setting (dataset family Synthetic1) we vary the average degree of the background network from 1 to 6 and fix the density of the planted 5-cliques to 4. Synthetic1 allows us to test the robustness against background noise. In the second setting (dataset family Synthetic2) we vary the density of the planted 8-node graphs from 2 to 7, while the average degree of the background network is fixed to 2.
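A generator in the spirit of this setup can be sketched as follows. It is a hedged illustration, not the generator used in the experiments: the function name, the parameter names, the way community intervals are placed (one per equal-sized slot of the timeline), and the mapping from average degree to edge probability are all assumptions.

```python
import random
from itertools import combinations


def planted_temporal_network(k=3, community_size=5, p_community=1.0,
                             n_background=100, background_avg_degree=2.0,
                             horizon=1000, community_span=100, seed=0):
    """Return (interactions, truth): timestamped edges and the planted (nodes, interval) pairs."""
    rng = random.Random(seed)
    nodes = list(range(n_background))
    interactions = []

    # Background Erdos-Renyi graph, edges spread uniformly over the timeline.
    p_bg = background_avg_degree / (n_background - 1)
    for u, v in combinations(nodes, 2):
        if rng.random() < p_bg:
            interactions.append((u, v, rng.randrange(horizon)))

    # Planted communities: disjoint node sets, one interval per timeline slot.
    slot = horizon // k
    truth = []
    for c in range(k):
        members = nodes[c * community_size:(c + 1) * community_size]
        start = c * slot + rng.randrange(slot - community_span + 1)
        for u, v in combinations(members, 2):
            if rng.random() < p_community:
                interactions.append((u, v, start + rng.randrange(community_span)))
        truth.append((set(members), (start, start + community_span - 1)))

    interactions.sort(key=lambda e: e[2])
    return interactions, truth


interactions, truth = planted_temporal_network(seed=1)
```

With the default `p_community=1.0` each planted community is a 5-clique whose ten edges all fall inside its own interval, which makes the ground truth easy to verify.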
B. Real-world data.
We use the following real-world datasets. Facebook [29] is a subset of Facebook activity in the New Orleans regional community. Interactions are posts of users on each other's walls. The data covers the time period from 9.05.06 to 20.08.06. The Twitter dataset tracks the activity of Twitter users in Helsinki in 2013. As interactions we consider tweets that contain mentions of other users. The Students dataset logs activity in a student online network at the University of California, Irvine. Nodes represent students and edges represent messages, with directions ignored. Enron is a popular dataset that contains email communication of senior management in a large company and spans several years.
For a case study we create a hashtag network from the Twitter dataset (the same tweets from users in Helsinki in 2013): nodes represent hashtags, and there is an interaction between two hashtags if they occur in the same tweet. The timestamp of the interaction corresponds to the timestamp of the tweet. We denote this dataset as Twitter#.
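The construction of the hashtag network can be sketched in a few lines of Python. The tweet texts below are invented for illustration, and lower-casing hashtags and extracting them with a simple regular expression are assumptions of the sketch, not documented preprocessing steps.

```python
import re
from itertools import combinations

HASHTAG = re.compile(r"#(\w+)")


def hashtag_interactions(tweets):
    """Turn (timestamp, text) tweets into (tag_u, tag_v, timestamp) interactions."""
    interactions = []
    for timestamp, text in tweets:
        # Distinct hashtags of this tweet, lower-cased and sorted for a stable order.
        tags = sorted({tag.lower() for tag in HASHTAG.findall(text)})
        for u, v in combinations(tags, 2):
            interactions.append((u, v, timestamp))
    return interactions


tweets = [
    (10, "Great talks at #Slush in #Helsinki"),
    (11, "#liiga tonight"),
    (12, "#movember #Halloween #movember"),
]
network = hashtag_interactions(tweets)
```

A tweet with a single hashtag produces no interaction, and a tweet with several distinct hashtags produces one interaction per pair.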

C. Optimal baseline
A natural baseline for KGAPPROX is OPTIMAL, which combines exact dynamic programming with finding the optimal densest subgraph for each candidate interval. Due to the high time complexity of OPTIMAL we generate a very small dataset with 60 timestamps, where each timestamp contains a random graph with 3-6 nodes and random density. We vary the number of intervals k and report the value of the solution (without any post-processing) and the running time in Figure 1. On this toy dataset KGAPPROX is able to find near-optimal solution, while it is significantly faster than OPTIMAL.

D. Results on synthetic datasets
Next, we evaluate the performance of KGAPPROX on the synthetic datasets Synthetic1 and Synthetic2 by assessing how well the algorithm finds the planted subgraphs. We report mean precision, recall, and F-measure, calculated with respect to the ground-truth subgraphs. All results are averaged over 100 independent runs. First, Figure 2(a) shows the quality of the solution of KGAPPROX as a function of the average degree in the background network. Precision degrades as the density of the background network increases, as it then becomes cost-beneficial to add more nodes to the discovered densest subgraphs. Second, Figure 2(b) shows the quality of the solution of KGAPPROX as a function of the density of the planted subgraphs. In Synthetic2 the density of the background is 2.
Similarly to the previous results, the quality of the solution, especially recall, degrades significantly only when the densities of the planted subgraphs and the background network become similar.
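The node-level quality measures used here can be computed as in the following sketch. It scores one discovered node set against one ground-truth node set; how discovered and planted subgraphs are matched and how scores are aggregated across intervals is not specified above, so that part is left out. The function name is illustrative.

```python
def node_set_scores(found, truth):
    """Precision, recall, and F-measure of a discovered node set against the planted one."""
    found, truth = set(found), set(truth)
    hits = len(found & truth)
    precision = hits / len(found) if found else 0.0
    recall = hits / len(truth) if truth else 0.0
    if hits == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure


precision, recall, f_measure = node_set_scores({1, 2, 3, 4}, {3, 4, 5})
```

In this example two of the four discovered nodes are correct and two of the three planted nodes are found, so precision is 1/2 and recall is 2/3.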

E. Results on real-world datasets
As the optimal partition algorithm OPTIMAL is not scalable for real datasets, we present comparative results of KGAPPROX with the baselines KGOPTDP and KGOPTDS. The KGOPTDP algorithm performs exact dynamic programming, but uses an approximate incremental algorithm for the densest-subgraph search (the incremental framework by Epasto et al. [18]). Conversely, KGOPTDS performs approximate dynamic programming while calculating the densest subgraph optimally for each candidate interval (by Goldberg's algorithm [20]). Note that KGOPTDP has a $2(1 + \epsilon_{DS})^2$ approximation guarantee and KGOPTDS has a $(1 + \epsilon_{DP})$ approximation guarantee. However, even these non-optimal baselines are quite slow in practice, and we use a subset of 1 000 interactions of the Students and Enron datasets for comparative reporting.
To ensure fairness, we report the total density of the optimal densest subgraphs in the intervals returned by the algorithms.
In Table I we report the density of the solutions found by KGAPPROX, KGOPTDP, and KGOPTDS, as well as their running time. We experiment with different parameters for the approximate densest-subgraph search ($\epsilon_{DS}$) and for approximate dynamic programming ($\epsilon_{DP}$). For both datasets the best solution was found by KGOPTDS. This is expected, as this algorithm has the best approximation factor. The solution value decreases as $\epsilon_{DP}$ increases. On the other hand, KGOPTDS has the largest running time, which decreases with increasing $\epsilon_{DP}$, but even with the largest parameter value ($\epsilon_{DP} = 2$) KGOPTDS takes about an hour.
The KGOPTDP algorithm typically finds the second-best solution; however, it only marginally outperforms KGAPPROX (e.g., for $\epsilon_{DS} = 0.1$), while requiring up to several orders of magnitude more computation time. Naturally, the quality of the solution degrades with increasing $\epsilon_{DS}$.
The solution quality degrades with increasing approximation parameters for all algorithms. However, the degradation is not as dramatic as the worst-case bound suggests, and using such an approximation parameter offers a significant speed-up. KGAPPROX provides the fastest estimates of good quality for a wide range of approximation parameters. Note that KGAPPROX is more sensitive to changes in the quality of the densest-subgraph search, regulated by $\epsilon_{DS}$. Figure 3 shows the running time of KGAPPROX as a function of the approximation parameters $\epsilon_{DS}$ and $\epsilon_{DP}$. The figure confirms the theory, that is, $\epsilon_{DS}$ has a significant impact on the running time, while the algorithm scales very well with $\epsilon_{DP}$.

F. Running time and scalability
We demonstrate scalability in Figure 4, plotting the running time for an increasing number of interactions, for the Facebook and Twitter datasets. Recall that the theoretical running time is $O(k^2 m \log n)$, where $n$ is the number of nodes and $m$ the number of interactions. In practice, the running time grows fast for the first thousand interactions and then saturates to a linear dependence. This happens because at the beginning of the network history the number of nodes grows fast. In addition, new subgraphs that are denser than those previously seen are more likely to occur. Thus, the approximate densest-subgraph subroutine has to be computed more often. Furthermore, the number of intervals k contributes to the running time as expected.

G. Subgraphs with larger node coverage -static graphs
Next we evaluate STATICGREEDY. To measure coverage, we simply count the number of distinct nodes in the output subgraphs. We use the 10K first interactions of Students dataset, set k = 20, and test different values of λ. Figure 5 shows the density and the pairwise Jaccard similarity of the node sets of the retrieved subgraphs. The subgraphs are shown in the order they are discovered. Smaller values of λ give larger density, and larger values of λ give more cover. We observe that, for all values of λ, in the beginning STATICGREEDY returns diverse and dense subgraphs, but soon after it returns identical graphs. We speculate that the algorithm finds all dense subgraphs that exist in the dataset. Regarding setting λ, we observe that λ = 0.002 offers a good trade-off in finding subgraphs of high density and moderate overlap.
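The two diagnostics used in this evaluation, the number of distinct covered nodes and the pairwise Jaccard similarity of node sets, can be computed as in the sketch below. The function names are illustrative, and treating the Jaccard similarity of two empty sets as 1 is a convention chosen for this sketch.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two node sets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(node_sets):
    """Average Jaccard similarity over all unordered pairs of node sets."""
    pairs = list(combinations(node_sets, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


def distinct_nodes_covered(node_sets):
    """Number of distinct nodes appearing in at least one node set."""
    return len(set().union(*node_sets))


subgraphs = [{1, 2}, {2, 3}, {1, 2}]
similarity = mean_pairwise_jaccard(subgraphs)
coverage = distinct_nodes_covered(subgraphs)
```

A similarity close to 1 signals that the algorithm keeps returning nearly identical subgraphs, which is the behaviour observed above once the distinct dense subgraphs are exhausted.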

H. Subgraphs with larger node coverage -dynamic graphs
Finally we evaluate the performance of the KGCVR algorithm. We vary the parameter λ and compare different characteristics of the solution with those of the solution returned by KGAPPROX. For different values of λ, Table II shows the average density, the total number of covered nodes, the average size of the subgraphs, and the average pairwise Jaccard similarity. Although KGCVR does not have an approximation guarantee, for small values of λ it finds subgraphs whose density is close to that of KGAPPROX. Similarly to the static case, λ provides an effective trade-off between density and coverage.

VI. CASE STUDY
We present a case study using graphs of co-occurring hashtags from Twitter messages in the Helsinki region. We create two subsets of the Twitter# dataset: one covering all tweets in November 2013 and another covering December 2013. Figure 6 shows the dense subgraphs discovered by the KGAPPROX algorithm on these datasets, with $k = 4$ and $\epsilon_{DS} = \epsilon_{DP} = 0.1$.
For the November dataset, KGAPPROX creates a small 1-day interval in the beginning and then splits the rest of the time almost evenly. This first interval includes the nodes movember, liiga, halloween, and digiexpo, which cover a broad range of global events (e.g., movember and Halloween) and local events (e.g., the game-industry event DigiExpo and the Finnish ice-hockey league). The next interval is represented by a large variety of well-connected tags related to mtv and media, corresponding to the MTV Europe Music Awards'13 on November 10. There are also other ice-hockey-related tags, e.g., leijonat, and Father's Day tags, e.g., isänpäivä, which was on November 13. The third interval is mostly represented by Slush-related tags; Slush is the annual large startup and tech event in Helsinki. The last interval is completely dedicated to ice hockey, with many team names.
There are three major public holidays in December: Finland's Independence Day on December 6, Christmas on December 25, and New Year's Eve on December 31. KGAPPROX allocates one interval for Christmas and New Year, from December 21 to 31. Ice hockey is also represented in this interval, as well as in the third interval. Remarkably, the Independence Day holiday is split into 2 intervals. The first one is from December 1 to December 6, 3:30pm, and the corresponding graph has two clusters: the first one contains general holiday-related tags and the second one is focused on the Independence Day President's reception. This is a large event that starts on December 6, 6pm, is broadcast live, and is discussed in the media for the following days. The second interval, for December 6-9, is a truthful representation of this event.

VII. RELATED WORK
Partitioning a graph into dense subgraphs is a well-established problem. Many of the existing works adopt the average-degree notion as the density definition [30]-[33]. The densest subgraph, under this definition, can be found in polynomial time [20]. Moreover, there is a 2-approximation greedy algorithm by Charikar [21] and Asahiro [34], which runs in time linear in the graph size. Many recent works develop methods to maintain the average-degree densest subgraph in a streaming scenario [18], [35]-[38]. Alternative density definitions, such as variants of quasi-clique, are often hard to approximate or to solve with efficient heuristics due to connections to the NP-complete Maximum Clique problem [13], [39], [40].
A line of work focuses on dynamic graphs, which model node/edge additions/deletions. Different aspects of network evolution, including the evolution of dense groups, were studied in this setting [41]-[44]. However, here we use the interaction-network model, which differs from dynamic graphs, as it captures the instantaneous interactions between nodes.
Another classic approach to model temporal graphs is to consider graph snapshots, find structures in each snapshot separately (or by incorporating information from previous snapshots), and then summarize historical behavior of the discovered structures [45]- [49]. These approaches usually focus on the temporal coherence of the dense structures discovered in the snapshots and assume that the snapshots are given. In this work we aggregate instantaneous interaction into timeline partitions of arbitrary lengths.
To the best of our knowledge, the following works are most closely aligned with our approach. The work of Rozenshtein et al. [17] considers the problem of finding the densest subgraph in a temporal network. However, first, they do not aim at creating a temporal partitioning. Second, they are interested in finding a single dense subgraph whose edges occur in k short time intervals. In contrast, in this work we search for an interval partitioning and consider only graphs that span continuous intervals. Other closely related works are by Jethava and Beerenwinkel [50] and Semertzidis et al. [16]. However, these works consider a set of snapshots and search for a single heavy subgraph induced by one or several intervals. The work of Semertzidis et al. [16] explores different formulations for the persistent heavy subgraph problem, including maximum average density, while Jethava and Beerenwinkel [50] focus solely on maximum average density.

VIII. CONCLUSIONS
In this work we consider the problem of finding a sequence of dense subgraphs in a temporal network. We search for a partition of the network timeline into k non-overlapping intervals, such that the intervals span subgraphs with maximum total density. To provide a fast solution for this problem we adapt recent work on dynamic densest subgraph and approximate dynamic programming. In order to ensure that the episodes we discover consist of a diverse set of nodes, we adjust the problem formulation to encourage coverage of a larger set of nodes. While the modified problem is NP-hard, we provide a greedy heuristic, which performs well on empirical tests.
The problems of temporal event detection and timeline segmentation can be formulated in various ways depending on the type of structures that are considered to be interesting. Here we propose segmentation with respect to maximizing subgraph density. The intuition is that those dense subgraphs provide a sequence of interesting events that occur in the lifetime of the temporal network. However, other notions of interesting structures, such as frequency of the subgraphs, or statistical non-randomness of the subgraphs, can be considered for future work. In addition, it could be meaningful to allow more than one structure per interval. Another possible extension is to consider overlapping intervals instead of a segmentation.