1 Introduction

Cyber attacks pose a constant threat to IT infrastructures. As a consequence, Intrusion Detection Systems (IDSs) have been developed that monitor a wide range of activities within systems, analyze system events and interactions for suspicious and possibly malicious behavior, and generate alerts that are subsequently reported to administrators or Security Information and Event Management (SIEM) systems. The main advantage of IDSs is that they are capable of processing massive amounts of data in largely autonomous operation. They are usually deployed as network-based IDSs that analyze network traffic or host-based IDSs that additionally analyze system logs.

One of the main issues with IDSs is that they often produce large amounts of alerts that easily become overwhelming for analysts, a situation commonly referred to as alert flooding [5]. The number of produced alerts depends on the deployed IDSs as well as the type of attack; for example, attacks that result in many alerts include denial-of-service attacks that access machines with high intensity, brute-force attacks that repeatedly attempt to log into accounts with random passwords, and automatic scripts that search for vulnerabilities [4]. These attacks produce high loads on the network and consequently cause the generation of many events in the monitored logs, a large part of which is reported by signature-based IDSs that search for patterns corresponding to such common attacks. On the other hand, anomaly-based IDSs that learn a baseline of normal system behavior and report alerts for statistical deviations are known to suffer from high false positive rates, i.e., they frequently report alerts during normal operation. Independent of their origin, alerts that occur with high frequency are problematic, because they are difficult to categorize and may cause analysts to overlook other relevant alerts that occur less frequently [2, 5]. To alleviate this issue, alerts should be filtered or aggregated before being presented to human analysts.

Alert aggregation techniques usually rely on automatic correlation or manual linking of alert attributes [11]. However, organizations frequently deploy heterogeneous IDSs to enable broad and comprehensive protection against a wide variety of threats, which causes the generated alerts to have different formats and thus to require normalization [13]. Most commonly, attributes of alerts are thereby reduced to timestamps, source and destination IP addresses and ports, and IDS-specific classifications, which are considered the most relevant features of alerts [1]. Unfortunately, alerts from host-based IDSs do not necessarily contain network information and alerts from anomaly-based IDSs do not involve alert types, which renders them unsuitable for existing aggregation techniques. In their survey, Navarro et al. [11] therefore recommend developing alert aggregation techniques that operate on general events rather than well-formatted alerts to avoid loss of context information. The authors also found that most existing approaches rely on predefined knowledge for linking alerts, which impedes detection of unknown attack scenarios. In addition, modern infrastructures consist of decentralized networks and container-based virtualization that prevent IP-based correlation [4]. Hence, there is a need for an automatic and domain-independent alert aggregation technique that operates on arbitrarily formatted alerts and is capable of generating representative attack patterns independently of any pre-existing knowledge about attack scenarios.

IDSs generate streams of individual alerts. Aggregating these alerts means grouping them so that all alerts in each group are related to the same root cause, i.e., a specific malicious action or attack. Unfortunately, finding such a mapping between alerts and attacks is difficult for a number of reasons. First, attack executions usually trigger the generation of multiple alerts [12], because IDSs are set up to monitor various parts of a system and any malicious activity frequently affects multiple monitored services at the same time. This implies that it is necessary to map a set of alerts to a specific attack execution, not just a single alert instance. Second, it is possible that the same or similar alerts are generated as part of multiple different attacks, which implies that there is no unique mapping from alerts to attacks. This is caused by the fact that IDSs are usually configured for very broad detection and do not only consist of precise rules that are specific to particular attacks. Third, repeated executions of the same attack do not necessarily manifest themselves in the same way, but rather involve different numbers of alerts and changed alert attributes. This effect is even more drastic when parameters of the attack are varied, executions take place in different system environments, or alerts are obtained from differently configured IDSs. Fourth, randomly occurring false positives that make up a considerable part of all alerts [5] as well as interleaving attacks complicate a correct separation of alerts that relate to the same root cause.

In addition, alert sequences should be aggregated to higher-level alert patterns to enable the classification of other alerts relating to the same root cause. In the following, we refer to these patterns as meta-alerts. The aforementioned problems are insufficiently solved by existing approaches, which usually rely on models built on pre-existing domain knowledge, manually crafted links between alerts, and exploitation of well-structured alert formats.

The development of alert aggregation techniques is usually motivated by specific problems at hand. Accordingly, existing approaches are based on different assumptions regarding available data, tolerated manual effort, etc. With respect to the issues outlined previously, we derive the following list of requirements for domain-independent alert aggregation techniques:

  1. Automatic. Manually crafting attack scenarios is time-consuming and subject to human errors [11]. Therefore, unsupervised methods should be employed that enable the generation of patterns and meta-alerts relating to unknown attacks without manual intervention.

  2. Grouping. Attacks should be represented by more than a single alert. This grouping is usually based on timing (T), common attributes (A), or a combination of both (C).

  3. Format-independent. Alerts occur in diverse formats [11]. Methods should utilize all available information and not require specific attributes, such as IP addresses.

  4. Incremental. IDSs generate alerts in streams. Alert aggregation methods should therefore be designed to derive attack scenarios and classify alerts in incremental operation.

  5. Meta-alerts. Aggregated alerts should be expressed by human-understandable meta-alerts that also enable automatic detection [2]. Generated patterns are usually based on single events (E), sequences (S), or a combination (C) of both.

This chapter thus presents a framework for automatic and domain-independent alert aggregation that meets the requirements listed above. The approach consists of an algorithm that groups alerts by their occurrence times, clusters these groups by similarity, and extracts commonalities to model meta-alerts. This is achieved without merging all considered alerts into a single common format. Our implementations as well as the data used for evaluation are available online (see Footnotes 1 and 2). We summarize our contributions as follows:

  • An approach for the incremental generation of meta-alerts from heterogeneous IDS alerts.

  • Similarity metrics for semi-structured alerts and groups of such alerts.

  • Aggregation mechanisms for semi-structured alerts and groups of such alerts.

  • A dashboard that visualizes alert aggregation results.

  • An application example that demonstrates the alert aggregation approach.

The remainder of the chapter is structured as follows: Section 2 outlines important concepts of our approach, including alerts, alert groups, and meta-alerts. In Sect. 3 we provide an application example to demonstrate the alert aggregation approach and introduce a dashboard to visualize and access actionable CTI. Finally, Sect. 4 concludes the chapter.

2 Entities and Operations

This section presents relevant concepts of our alert aggregation approach. We first provide an overview of the entities and their relationships. We then discuss our notion of alerts, outline how alerts are clustered into groups, and introduce a meta-alert model based on aggregated alert groups. A detailed description of the implementation as well as evaluation results are provided in [8]. The implementation is available as open source via GitHub (see Footnote 1).

2.1 Overview

Our approach transforms alerts generated by IDSs into higher-level meta-alerts that represent specific attack patterns. Figure 1 shows an overview of the involved concepts. The top of the figure represents alerts occurring as sequences of events on two timelines, which represent different IDSs deployed in the same network infrastructure or even separate system environments. Another possibility is that events are retrieved from historic alert logs and used for forensic attack analysis.

Fig. 1. Overview of the relationships between concepts. Alerts (top) occurring on timelines (t) are grouped by temporal proximity (center) and then aggregated to meta-alerts by similarity (bottom).

Alert occurrences are marked with symbols and colors that represent their types. Two alerts may be considered of the same type if they share the same structure, were generated by the same rule in the IDS, or have coinciding classifications. We differentiate between square \(\left( \square \right) \), triangle \(\left( \triangle , \triangledown , \triangleleft , \triangleright \right) \), circle \(\left( \circ \right) \), and dash \(\left( - \right) \) symbols, which are marked blue, red, green, and yellow, respectively. For the examples presented throughout this chapter, we consider alerts represented by one of \(\left\{ \triangle , \triangledown , \triangleleft , \triangleright \right\} \) as variations of the same alert type, i.e., these alerts have sufficiently many commonalities, such as matching attributes, and are thus similar to each other. In general, each alert represents a unique event that occurs only at one specific point in time. However, alerts of the same type, e.g., alerts that are generated by the same violation of a predefined rule or alerts reported by the same IDS, may occur multiple times. We mark these alerts accordingly with the same color.

As outlined in Sect. 1, automatic mapping of alerts to higher-level meta-alerts is non-trivial. In the simple example shown in Fig. 1, it is easy to see that the alert sequence \(\left( \square , \triangle , \circ \right) \) and the similar sequence \(\left( \square , \triangledown , \circ \right) \) occur a total of three times, and that the pattern \(\left( \circ , \circ , \triangledown \right) \) occurs twice across the two timelines. This is intuitively visible, because these alerts occur close together. Accordingly, it is reasonable to allocate alerts to groups that reflect this characteristic.

The center part of the figure shows groups of alerts based on their respective positions on the timelines. Note that grouping by alert type instead of temporal proximity would result in a loss of information, because alerts would be allocated to groups independent of their contexts, i.e., other alerts that are generated by the same root cause. For example, grouping all alerts of type \(\circ \) would have neglected the fact that this type actually occurs in the patterns \(\left( \square , \triangle , \circ \right) \) as well as \(\left( \circ , \circ , \triangledown \right) \) and may thus not be a good indicator for a particular attack execution on its own.

Computing similarities between groups means measuring the differences of orders, frequencies, and attributes of their contained alerts. Alert groups that yield a high similarity are likely related to the same root cause and should thus be aggregated into a condensed form that reflects a typical instance of that group, i.e., a meta-alert. The bottom of the figure shows the generation of meta-alerts from similar groups. Thereby, orders, frequencies, and attributes of meta-alerts are created in a way to represent all allocated alert groups as accurately as possible. The figure shows that this is accomplished by merging the second alert in the patterns \(\left( \square , \triangle , \circ \right) \) and \(\left( \square , \triangledown , \circ \right) \) into a merged alert, which combines attributes and values of \(\triangle \) and \(\triangledown \) so that both are adequately represented. In practice, this could mean that two different values of the same attribute in both alerts are combined into a set.

The second meta-alert with alert sequence \(\left( \circ , \circ , \triangledown \right) \) is formed from two identical groups and thus does not involve changes to merged alerts. If meta-alert generation was based on similarity of alerts rather than groups, all occurrences of the similar alerts \(\triangle \) and \(\triangledown \) would be replaced with the merged alert, thereby decreasing the specificity of the second meta-alert. This suggests that forming groups of logically related alerts is an essential step for meta-alert generation. Finally, the third meta-alert contains a single alert that only occurred once and is the only alert in its group. Since alerts form the basis of the presented approach, the following section will discuss their compositions in more detail.

2.2 Alerts

IDSs are designed to transmit as much useful information as possible to the person or system that receives, interprets, and acts upon the generated alerts. This includes data derived from the event that triggered the alert, e.g., IP addresses present in the monitored data, as well as information on the context of detection, e.g., detection rule identifiers. As outlined in [8], most approaches omit a lot of this information and only focus on specific predefined attributes. Our approach, however, utilizes all available data to generate meta-alerts without imposing any domain-specific restrictions.

To organize all data conveyed with each alert in an intuitive form, alerts are frequently represented as semi-structured objects, e.g., XML-formatted alerts as defined by the IDMEFFootnote 3 or JSON-formatted alerts generated by WazuhFootnote 4 IDS. Even though such standards exist, different IDSs produce alerts with data fields specific to their detection techniques. For example, a signature-based detection approach usually provides information on the rule that triggered the alert, while anomaly-based IDSs only indicate suspicious event occurrences without any semantic interpretation of the observed activity. In addition, some IDSs do not provide all attributes required by standards such as IDMEF, e.g., host-based IDSs analyze system logs that do not necessarily contain network and IP information.

Figure 2 shows such an alert that was caused by a failed user login attempt and generated by Wazuh. Note that it does not support IP-based correlation, since the only available network attribute, “srcip”, points to localhost. The alert contains semi-structured elements, i.e., key-value pairs (e.g., “timestamp”), lists (e.g., “groups”), and nested objects (e.g., “rule”). In alignment with this observation, we model alerts as abstract objects with arbitrary numbers of attributes. Formally, given a set of alerts \(\mathcal {A}\), an alert \(a \in \mathcal {A}\) holds one or more attributes \(\kappa _a\), where each attribute a.k is defined as in Eq. 1.

$$\begin{aligned} a.k = v_1, v_2, ..., v_n&\quad \forall k \in \kappa _a, n \in \mathbb {N} \end{aligned}$$

Note that Eq. 1 also holds for nested attributes, i.e., \(a.k.j, \forall j \in \kappa _{a.k}\), and that \(v_i\) is an arbitrary value, such as a number or character sequence. In the following we assume that the timestamp of the alert is stored in key \(t \in \kappa _a, \forall a \in \mathcal {A}\), e.g., \(a.t = 1\) for alert a that occurs at time step 1. These alert attributes are suitable to compare alerts and measure their similarities, e.g., alerts that share a high number of keys and additionally have many coinciding values for each common key should yield a high similarity, because they are likely related to the same suspicious event. This also means that values such as IPs are not ignored, but matched by common keys like all other attributes. We define a function \(alert\_sim\) in Eq. 2 that computes the similarity of alerts \(a, b \in \mathcal {A}\).

$$\begin{aligned} alert\_sim : a, b \in \mathcal {A} \rightarrow \left[ 0, 1 \right] \end{aligned}$$

Thereby, the similarity between any non-empty alert and itself is 1 and the similarity to an empty object is 0. Furthermore, the function is symmetric, which is intuitively reasonable when comparing alerts on the same level of abstraction. On the other hand, the function implicitly computes how well one alert is represented by another more abstract alert. We summarize the properties of the function in Eqs. 3–5.

$$\begin{aligned} alert\_sim(a, a)&= 1 \end{aligned}$$
$$\begin{aligned} alert\_sim(a, \emptyset )&= 0, \quad a \ne \emptyset \end{aligned}$$
$$\begin{aligned} alert\_sim(a, b)&= alert\_sim(b, a) \end{aligned}$$

As mentioned, we do not make any restrictions on the attributes of alerts and only consider the timestamp a.t of alert a as mandatory, which is not a limitation since the time of detection is always known by the IDS or can be extracted from the monitored data. In the next section, this timestamp will be used to allocate alerts that occur in close temporal proximity to groups.
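To make the preceding definitions concrete, the following Python sketch illustrates one possible realization of \(alert\_sim\) that satisfies the properties in Eqs. 3–5. It is a simplified illustration rather than the metric from [8]: it assumes alerts are nested dictionaries as in Fig. 2, flattens nested attributes into key paths, and scores shared keys and coinciding values in equal parts.

```python
def flatten(alert, prefix=""):
    """Flatten a nested alert object into {key path: value} pairs,
    e.g., {"rule": {"id": 5503}} becomes {"rule.id": 5503}."""
    items = {}
    for key, value in alert.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            items.update(flatten(value, path))
        else:
            items[path] = value
    return items


def alert_sim(a, b):
    """Similarity in [0, 1]: shared keys contribute half of the score,
    coinciding values for shared keys contribute the other half."""
    fa, fb = flatten(a), flatten(b)
    if not fa or not fb:
        return 0.0  # similarity to an empty object is 0 (Eq. 4)
    common = set(fa) & set(fb)
    total = set(fa) | set(fb)
    key_score = len(common) / len(total)
    value_score = sum(1 for k in common if fa[k] == fb[k]) / len(total)
    return 0.5 * key_score + 0.5 * value_score
```

Note that the function is symmetric by construction and yields 1 for identical alerts; a production metric would additionally exclude volatile attributes such as timestamps from the value comparison.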

Fig. 2. Simplified sample alert documenting a failed user login.

2.3 Alert Groups

Alerts generated by an arbitrary number of deployed IDSs result in a sequence of heterogeneous events. Since attacks typically manifest themselves in multiple mutually dependent alerts rather than singular events, it is beneficial to find groups of alerts that were generated by the same root cause as shown in Sect. 2.1. In the following, we describe our strategies for formation and representation of alert groups that enable group similarity computation.

Formation. Depending on the type of IDS, alerts may already contain some kind of classification provided by their detection rules. For example, the message “PAM: User login failed.” contained in the alert shown in Fig. 2 could be used to classify and group every event caused by invalid logins. While existing approaches commonly perform clustering on such pre-classifications of IDSs, single alerts are usually not sufficient to differentiate between specific types of attacks or accurately filter out false positives (cf. Sect. 1). To alleviate this problem, we identify multiple alerts that are generated in close temporal proximity and whose combined occurrence is a better indicator for a specific attack execution. For example, a large number of alerts classified as failed user login attempts that occur in a short period of time and in combination with a suspicious user agent could be an indicator for a brute-force password guessing attack executed through a particular tool. Such reasoning would not be possible if all alerts were analyzed individually, because single failed logins may be false positives and the specific user agent could also be part of other attack scenarios.

The problem of insufficient classification is even more drastic when alerts are received from anomaly-based IDSs, because they mainly disclose unknown attacks. Accordingly, an approach that relies on clustering by alert classification attributes would require human analysts who interpret the root causes and assign a classifier to each alert. Temporal grouping, on the other hand, is always possible for sequentially incoming alerts and does not rely on the presence of alert attributes.

Our strategy for alert group formation is based on the interval times between alerts. In particular, two alerts \(a, b \in \mathcal {A}\) that occur at times a.t and b.t have an interval time \(\left| a.t - b.t \right| \) and are allocated to the same group if \(\left| a.t - b.t \right| \le \delta \), where \(\delta \in \mathbb {R}^+\). This is achieved through single-linkage clustering [3]. In particular, all alerts are initially contained in their own sets, i.e., \(s_{\delta , i} = \left\{ a_i \right\} , \forall a_i \in \mathcal {A}\). Then, clusters are iteratively formed by repeatedly merging the two sets with the shortest interval time \(d = \min \left( \left| a_i.t - a_j.t \right| \right) , \forall a_i \in s_{\delta , i}, \forall a_j \in s_{\delta , j}\). This agglomerative clustering procedure is stopped when \(d > \delta \), which results in a number of sets \(s_{\delta , i}\). Each set is transformed into a group \(g_{\delta , i}\) that holds all alerts of set \(s_{\delta , i}\) as a sequence sorted by their occurrence time stamps as in Eq. 6.

$$\begin{aligned} g_{\delta , i}&= \left\{ \left( a_1, a_2, \dots , a_n \right) , \forall a_i \in s_{\delta , i} : a_1.t \le a_2.t \le \dots \le a_n.t \right\} \end{aligned}$$

Equation 7 defines the set of all groups for a specific \(\delta \) as their union.

$$\begin{aligned} \mathcal {G}_\delta&= \bigcup _{i \in \mathbb {N}} g_{\delta , i} \end{aligned}$$

This group formation strategy is exemplarily visualized in Fig. 3. The figure shows alert occurrences of types \(\left\{ \square , \triangle , \circ , - \right\} \) in specific patterns duplicated over four timelines with different \(\delta \). The sequence \(\left( \square , \triangle , \circ \right) \) at the beginning of the timelines occurs with short alert interval times, while a similar sequence \(\left( \square , \triangledown , \circ \right) \) occurs at the end, but involves \(\triangledown \) instead of its variant \(\triangle \) and has an increased interval time between \(\triangledown \) and \(\circ \). Nevertheless, due to the similar compositions of these two alert sequences, it is reasonable to assume that they are two manifestations of the same root cause.

In this example, each tick in the figure marks a time span of 1 unit. In timeline (d), all alerts end up in separate groups, because no two alerts yield an interval time lower than \(\delta =0.5\), i.e., \(\mathcal {G}_{0.5} = \left\{ \left( \square \right) , \left( \triangle \right) , \left( \circ \right) , \left( - \right) , \left( \square \right) , \left( \triangledown \right) , \left( \circ \right) \right\} \). In timeline (c) where alerts are grouped using \(\delta =1.5\), two groups that contain more than a single alert are formed, because the grouped alerts occur within sufficiently close temporal proximity, i.e., \(\mathcal {G}_{1.5} = \left\{ \left( \square , \triangle , \circ \right) , \left( - \right) , \left( \square , \triangledown \right) , \left( \circ \right) \right\} \). Considering the results for \(\mathcal {G}_{2.5} = \left\{ \left( \square , \triangle , \circ \right) , \left( - \right) , \left( \square , \triangledown , \circ \right) \right\} \) in timeline (b) shows that the aforementioned repeating pattern \(\left( \square , \triangle , \circ \right) \) and its variant end up in two distinct groups. This is the optimal case, since subsequent steps for group analysis could determine that both groups are similar and thus merge them into a meta-alert as shown in Sect. 2.1. A larger value for \(\delta \), e.g., \(\delta =3.5\) that yields \(\mathcal {G}_{3.5} = \left\{ \left( \square , \triangle , \circ \right) , \left( - , \square , \triangledown , \circ \right) \right\} \) in timeline (a), adds the alert of type − to form group \(\left( -, \square , \triangledown , \circ \right) \), which is not desirable since this decreases its similarity to group \(\left( \square , \triangle , \circ \right) \). This example thus shows the importance of an appropriate selection of the interval threshold for subsequent analyses.

Fig. 3. Alert occurrences duplicated over four parallel timelines show the formation of alert groups based on alert interval times. Larger intervals (top) allow more elapsed time between alerts and thus lead to fewer and larger groups compared to smaller intervals (bottom).

Note that this strategy for temporal grouping has several advantages over sliding time windows. First, instead of time window size and step width, only a single parameter that specifies the maximum delta time between alerts is required, which reduces complexity of parameter selection. Second, it ensures that alerts with close temporal proximity remain in the same group given any delta larger than their interval times, while intervals of sliding time windows possibly break up groups by chance. Third, related sequences with variable delays result in complete groups as long as there is no gap between any two alerts that exceeds \(\delta \), e.g., two groups with similar alerts but varying delays are found for \(\delta =2.5\) in Fig. 3. However, time window sizes must exceed the duration of the longest sequence to yield complete groups, which is more difficult to specify in general.

Despite these benefits, pure time-based grouping suffers from some drawbacks compared to knowledge-based clustering methods, e.g., grouping by classification messages. As seen in the example from Fig. 3, the quality of the resulting grouping is highly dependent on a selection of the parameter \(\delta \) that fits the typical time interval of the events to be grouped. Another issue is that randomly occurring alerts, e.g., false positives, are incorrectly allocated to groups if they occur in close proximity to one of the grouped alerts. Even worse, these alerts could connect two or more groups into a single large group if they happen to occur in between and in sufficiently high amount or close proximity to both groups. As we will outline in the following sections, our approach mitigates these problems by finding groups using several values for \(\delta \) in parallel.

Similarity Computation. Unlike clustering based on predefined alert types, time-based grouping only acts as a preparatory step for subsequent analyses. In particular, a similarity measure for alert groups is required that makes it possible to determine which groups of alerts are likely generated from the same root cause. Only then is it possible to cluster groups by their similarities and in turn generate meta-alerts by merging alert groups that end up in the same clusters. We therefore define function \(group\_sim\) in Eq. 8 that computes the similarity of any two groups \(g, h \in \mathcal {G}_\delta \).

$$\begin{aligned} group\_sim : g, h \in \mathcal {G}_\delta \rightarrow \left[ 0, 1 \right] \end{aligned}$$

Analogous to alert similarity computation (cf. Sect. 2.2), the similarity between any non-empty group \(g \in \mathcal {G}_\delta \) and itself is 1 and the similarity to an empty object is 0. However, we do not impose symmetry on the function, since it can be of interest to measure whether one group is contained in another possibly more abstract group, such as a meta-alert. Details on such a similarity function are discussed in [8]. In the following section, we first explain the representation of meta-alerts and then introduce matching strategies for similarity computations between groups.
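One simple, deliberately naive way to realize such a function, shown here only for illustration and not the metric from [8], is to score how well each alert in \(g\) is covered by its best match in \(h\); averaging over \(g\) makes the measure asymmetric by design, which captures whether \(g\) is contained in a possibly more abstract \(h\):

```python
def group_sim(g, h, alert_sim):
    """Mean best-match similarity of alerts in g against alerts in h.
    Asymmetric: group_sim(g, h) measures how well h covers g, so a
    group fully contained in a larger group scores 1."""
    if not g or not h:
        return 0.0  # similarity involving an empty group is 0
    return sum(max(alert_sim(a, b) for b in h) for a in g) / len(g)
```

Note that this sketch ignores alert order and frequency, both of which the full metric takes into account.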

2.4 Meta-alerts

We generate meta-alerts by merging groups, which relies on merging alerts within these groups. In the following, we first introduce features that support the representation of merged alerts and then outline group merging strategies for similarity computations and meta-alert generation.

Alert Merges. As outlined in Sect. 2.2, alerts are semi-structured objects, i.e., data structures that contain key-value pairs, and are suitable for similarity computation. However, aggregating similar alerts into a merged object that is representative for all allocated alerts is non-trivial, because single alert objects may have different keys or values that need to be taken into account.

For example, the failed login alert in Fig. 2 contains the attribute “srcuser” with value “daryl” in the “data” object. Since a large number of users may trigger such alerts, this event type occurs with many different values for attribute “srcuser” over time. An aggregated alert optimally abstracts over such attributes to represent a general failed login alert that does not contain any information specific to a particular event. The computed similarity between such an aggregated alert and any specific alert instance is independent of attributes that are known to vary, i.e., only the presence of the attribute “srcuser” contributes to similarity computations, but not its value. Note that this assumes that keys across alerts have the same semantic meaning or that keys with different names are correctly mapped if alert formats are inconsistent, e.g., keys “src_user” and “srcuser”.

We incorporate merging of alerts by introducing two new types of values. First, a wildcard value type that indicates that the specific value of the corresponding key is not expressive for that type of alert, i.e., any value of that field will yield a perfect match just like two coinciding values. Typical candidates for values replaced by wildcards are user names, domain names, IP addresses, counts, and timestamps. Second, a mergelist value type that comprises a finite set of values observed in several alerts that are all regarded as valid values, i.e., a single matching value from the mergelist is sufficient to yield a perfect match for this attribute present in two compared alerts. The mergelist type is useful for discrete values that occur in variations, e.g., commands or parameters derived from events. Deciding whether an attribute should be represented as a wildcard or mergelist is therefore based on the total number of unique values observed for that attribute.

We define that each attribute key \(k \in \kappa _a\) of an aggregated alert a that is the result of a merge of alerts \(A \subseteq \mathcal {A}\) is represented as either a wildcard or mergelist as in Eq. 9.

$$\begin{aligned} a.k = {\left\{ \begin{array}{ll} wildcard\left( \right) \\ mergelist\left( \bigcup _{b \in A} b.k \right) \end{array}\right. } \end{aligned}$$

Note that Eq. 9 also applies for nested keys, i.e., values within nested objects stored in the alerts. Since our approach is independent of any domain-specific reasoning, a manual selection of attributes for the replacement with wildcards and mergelists is infeasible. The function \(alert\_merge\) thus automatically counts the number of unique values for each attribute from alerts \(A \subseteq \mathcal {A}\) passed as a parameter, selects and replaces them with the appropriate representations, and returns a new alert object a that represents a merged alert that is added to all alerts \(\mathcal {A}\) as shown in Eqs. 10–11.

$$\begin{aligned} a&= alert\_merge(A), \quad A \subseteq \mathcal {A} \end{aligned}$$
$$\begin{aligned} \mathcal {A}&\Leftarrow a \end{aligned}$$

Note that we use the operation \(\Leftarrow \) to indicate set extensions, i.e., \(\mathcal {A} \Leftarrow a \iff \mathcal {A}^\prime = \mathcal {A} \cup \left\{ a \right\} \). We drop the prime of sets like \(\mathcal {A}^\prime \) in the following for simplicity and assume that after extension only the new sets will be used. The extension of \(\mathcal {A}\) implies that merged alerts are also suitable for similarity computation and merging with other alerts or merged alerts. Details of the alert merging procedure are elaborated in [8]. The next section will outline the role of alert merging when groups are merged for meta-alert generation.
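A minimal sketch of \(alert\_merge\) for flat alerts is shown below. The threshold `max_values` that decides between mergelist and wildcard is a hypothetical parameter introduced for illustration, and nested keys are omitted; the actual selection logic is described in [8].

```python
WILDCARD = object()  # sentinel: any value matches this attribute


def alert_merge(alerts, max_values=3):
    """Merge flat alerts into one representative alert (Eq. 9):
    attributes with few distinct observed values become mergelists,
    attributes with many distinct values become wildcards."""
    merged = {}
    keys = set().union(*(a.keys() for a in alerts))
    for k in keys:
        values = {a[k] for a in alerts if k in a}
        merged[k] = frozenset(values) if len(values) <= max_values else WILDCARD
    return merged
```

For example, merging failed-login alerts from many distinct users would turn the “srcuser” attribute into a wildcard, while a rule identifier shared by all alerts would be kept as a one-element mergelist.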

Group Merges. Similar to the merging of alerts discussed in the previous section, a merged group should represent a condensed abstraction of all groups used for its generation. Since each group should ideally comprise a similar sequence of alerts, it may be tempting to merge groups by forming a sequence of merged alerts, where the first alert is merged from the first alerts in all groups, the second alert is merged from the second alerts in all groups, and so on. Unfortunately, this is infeasible in practice, because alert sequences are not necessarily ordered, involve optional alerts, or are affected by false positives that shift alert positions within the sequences. To alleviate this issue, it is necessary to find matches between the alerts of all groups to be merged. In the following, we describe three matching strategies used in our approach that are suitable for group similarity computation as well as meta-alert generation.

Exact Matching. This strategy finds for each alert in one group the most similar alert in another group and uses these pairs to determine which alerts to merge. The idea of finding these matches is depicted on the left side of Fig. 4, where lines across groups \(g_1, g_2, g_3\) indicate which alerts were identified as the most similar. As expected, alerts of the same type are matched, because they share several common attributes and values that are specific to their respective types. The figure also shows that the correct alerts are matched even though the second and third alert in \(g_2\) are in a different order than in \(g_1\) and \(g_3\). In addition, note that the alert of type \(\triangledown \) in \(g_1\) is correctly matched to the related alert type \(\triangle \) in \(g_2\) and that the merged group thus contains the merged alert type at that position. Moreover, there is a missing alert of type \(\circ \) in \(g_3\) that leads to an incomplete match. Nevertheless, the alert of type \(\circ \) ends up in the merged group, because it occurs in the majority of all merged groups and is therefore considered representative for this root cause manifestation.

Fig. 4.
figure 4

Merging strategies for alert groups. Left: Finding exact matches between alert pairs. Center: Matching representatives using a bag-of-alerts model. Right: Matching using alert sequence alignment.

When only two groups are considered, this matching method is also suitable for measuring their similarity. In particular, this is achieved by computing the average similarity of all matched alerts, where non-matching alerts count as total mismatches. The similarity score is further enhanced by incorporating an edit distance [10] that measures the number of insertions, deletions, and substitutions of alerts, i.e., misalignments such as the occurrence of \(\left( \circ , \triangle \right) \) instead of \(\left( \triangle , \circ \right) \) in \(g_2\).
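The averaging step can be sketched as follows, assuming a pairwise function `alert_sim` in \([0, 1]\); the greedy one-to-one assignment is a simplification (the text does not prescribe an assignment algorithm) and the edit-distance term is omitted:

```python
# Sketch of exact-matching group similarity: average the best pairwise
# alert similarities; unmatched alerts count as total mismatches (0).
def exact_group_sim(g, h, alert_sim):
    if not g or not h:
        return 0.0
    # greedily match each alert of the smaller group to its best free partner
    small, large = (g, h) if len(g) <= len(h) else (h, g)
    used, total = set(), 0.0
    for a in small:
        best, best_j = 0.0, None
        for j, b in enumerate(large):
            if j not in used:
                s = alert_sim(a, b)
                if s > best:
                    best, best_j = s, j
        if best_j is not None:
            used.add(best_j)
        total += best
    # alerts of the larger group without a match contribute 0 to the average
    return total / max(len(g), len(h))
```

With a binary similarity, two groups containing the same alerts in different order score 1.0, while a group missing one alert scores proportionally lower.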

While the exact matching strategy yields accurate group similarities, it is rather inefficient for large groups. The reason for this is that computing the pairwise similarities of all alerts requires quadratic runtime with respect to the group sizes. We therefore only use this strategy when the number of required comparisons for groups \(g, h\) does not exceed a limit \(l_{bag} \in \mathbb {N}\), i.e., \(\left| g \right| \cdot \left| h \right| \le l_{bag}\), where \(\left| g\right| \) denotes the size of group g. In the following, we outline an alternative strategy for larger groups.
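The size check amounts to a one-line dispatch; a sketch with the assumed parameter name `l_bag`:

```python
# Sketch of the strategy selection: exact matching is only used while the
# number of required pairwise comparisons |g|*|h| stays within l_bag.
def choose_strategy(g, h, l_bag=10000):
    return "exact" if len(g) * len(h) <= l_bag else "bag-of-alerts"
```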

Bag-of-Alerts Matching. For this strategy, we transform the alert sequences of all groups into a bag-of-alerts model following the well-known bag-of-words model [9]. This is accomplished by incrementally clustering [15] the alerts within each group using a certain similarity threshold \(\theta _{alert} \in \left[ 0, 1 \right] \). Thereby, each alert a that is sufficiently similar to one of the cluster representatives in the initially empty set R, i.e., \(\exists r \in R: alert\_sim(r, a) \ge \theta _{alert}\), is added to the list \(C_r\) that stores all alerts of that cluster, i.e., \(C_r \Leftarrow a\); otherwise, it forms a new cluster with itself as the representative, i.e., \(R \Leftarrow a\). Once all alerts of a group are processed, the bag-of-alerts model for that group is generated by merging all alerts in each cluster, i.e., \(alert\_merge(C_r), \forall r \in R\).
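The incremental clustering step can be sketched as follows, with `alert_sim`, the merge function, and `theta_alert` supplied by the caller; this is an illustrative simplification, not the authors' implementation:

```python
# Sketch of the incremental clustering that builds a bag-of-alerts model:
# each alert joins the first sufficiently similar cluster or starts a new one.
def bag_of_alerts(group, alert_sim, merge, theta_alert=0.5):
    reps, clusters = [], {}              # representatives R and clusters C_r
    for a in group:
        for i, r in enumerate(reps):
            if alert_sim(r, a) >= theta_alert:
                clusters[i].append(a)    # join existing cluster
                break
        else:
            reps.append(a)               # a becomes a new representative
            clusters[len(reps) - 1] = [a]
    # merge each cluster into one alert and keep the cluster size |C_r|
    return [(merge(c), len(c)) for c in clusters.values()]
```

The returned pairs of merged alerts and cluster sizes are exactly what the matching procedure below compares across groups.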

The matching procedure then finds the pairs of these merged alerts that yield the highest similarities across groups and aggregates them by identifying lower and upper limits of their corresponding cluster sizes \(\left| C_r \right| \) in each group. The advantage in comparison to the exact matching strategy is that the number of necessary similarity computations is reduced to the product of the number of clusters per group, which is controllable through \(\theta _{alert}\). Note that the speedup stems from the fact that the computation of the bag-of-alerts model only has to be carried out once for each group, but then enables fast matching with all other groups.

The center part of Fig. 4 shows bag-of-alerts models for sample groups, where alerts of types \(\triangle \) and \(\triangledown \) in \(g_1\) are merged into a common representative, which is then matched to \(\triangle \) in \(g_2\) and \(g_3\) before they are once again merged for the generation of the meta-alert. Since alert type \(\circ \) occurs twice in \(g_1\) and \(g_2\), but only once in \(g_3\), the meta-alert uses a range with minimum limit \(l_{min} = 1\) and maximum limit \(l_{max} = 2\) to describe the occurrence frequency of this alert type.

This strategy also supports measuring the similarity of two groups \(g, h\) by averaging the relative differences of occurrence counts, which yields the highest possible similarity of 1 if the respective counts coincide or their intervals overlap, and \(min(l_{max, g}, l_{max, h}) / max(l_{min, g}, l_{min, h})\) otherwise. Alerts without a match are considered total mismatches and contribute the lowest possible similarity score of 0 to the average. We favored this similarity metric over existing measures such as the cosine similarity [9], because it allows a more intuitive representation of lower and upper occurrence limits, which supports human interpretation of meta-alerts.
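The per-alert-type count similarity under these interval semantics could be implemented as follows (a sketch of the formula above, not the authors' code):

```python
# Sketch of the occurrence-count similarity for one matched alert type,
# using the lower/upper limits from the two bag-of-alerts models.
def count_sim(min_g, max_g, min_h, max_h):
    if max_g >= min_h and max_h >= min_g:   # counts coincide or intervals overlap
        return 1.0
    return min(max_g, max_h) / max(min_g, min_h)
```

For instance, occurrence intervals \([1, 2]\) and \([2, 3]\) overlap and score 1, whereas \([1, 2]\) against \([4, 5]\) scores \(2/4 = 0.5\).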

The downside of the bag-of-alerts strategy is that information on the order of the alerts is lost. However, it is possible to resolve this issue by combining the original alert sequence with the bag-of-alerts model. In the following, we outline this addition to the bag-of-alerts matching.

Alignment-Based Matching. To incorporate alignment information for large clusters that are not suited for the exact matching strategy, it is necessary to store the original sequence position of all clustered alerts during generation of the bag-of-alerts model of each group. This information makes it possible to generate a sequence of cluster representatives. For example, the right side of Fig. 4 shows the sequence of group \(g_1\), in which the occurrences of \(\triangle \) and \(\triangledown \) have been replaced by the cluster representative that was generated in the bag-of-alerts model. Note that this strategy is much faster for large groups than the exact matching strategy, because it reuses the matching information of representative alerts from the bag-of-alerts model instead of finding matches between all alerts. Since the corresponding sequence elements across groups are known, it is simple to apply sequence alignment algorithms for merging and similarity computation.

We decided to merge the sequences using the longest common subsequence (LCS) [10], because it retrieves the common alert pattern present in all groups and thereby omits randomly occurring false positive alerts [6]. The example in Fig. 4 shows that this results in a sequence of representatives that occurs in the same order in all groups. Using the LCS also enables computing the sequence similarity of two groups \(g, h\) by \(\left| LCS(g, h) \right| / min(\left| g \right| , \left| h \right| )\), which we use to improve the bag-of-alerts similarity by incorporating it as a weighted term after averaging.
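The sequence similarity \(\left| LCS(g, h) \right| / min(\left| g \right| , \left| h \right| )\) can be computed with the classic dynamic program; a sketch over sequences of representative labels (any hashable items):

```python
# Classic dynamic-programming longest common subsequence (LCS) length.
def lcs_len(g, h):
    dp = [[0] * (len(h) + 1) for _ in range(len(g) + 1)]
    for i, x in enumerate(g, 1):
        for j, y in enumerate(h, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[-1][-1]

# Alignment-based group similarity: |LCS(g, h)| / min(|g|, |h|).
def sequence_sim(g, h):
    return lcs_len(g, h) / min(len(g), len(h)) if g and h else 0.0
```

Normalizing by the shorter sequence means a group that fully contains the pattern of a smaller group still scores 1, which matches the intent of tolerating optional and false positive alerts.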

Equation 12 defines a function that takes a set of groups \(G \subseteq \mathcal {G}_\delta \) and automatically performs all aforementioned merging strategies to generate a new group g.

$$\begin{aligned} g&= group\_merge(G), \quad G \subseteq \mathcal {G}_\delta \end{aligned}$$
$$\begin{aligned} \mathcal {G}_\delta&\Leftarrow g \end{aligned}$$

Analogous to merges of single alerts, Eq. 13 indicates that merges of alert groups have the same properties as normal groups and therefore support similarity computation and merging. In the previous sections, we defined several functions required for meta-alert generation. The following section will embed all aforementioned concepts in an overall procedure.

3 Application Example

This section provides an application example for the proposed alert aggregation approach and introduces a Kibana-based Cyber Threat Intelligence (CTI) dashboard that visualizes alerts, alert groups, and meta-alerts, as well as their interdependencies in the form of actionable CTI. The section first provides an overview of the example’s process flow, then describes the test data, and finally depicts the results of the example and the CTI dashboard. A detailed evaluation of the alert aggregation approach is available in [8].

3.1 CTI Process Flow

Figure 5 depicts the process flow for the demonstrated application example. All tools and algorithms we use in the course of the example are either open source or freely available for non-commercial use. The first step employs anomaly detection to generate alerts. In this example, we process log data that contains a baseline of normal system behavior as well as traces of attacks; the next sections provide detailed information about the considered log data. We apply the log-based anomaly detection system AMiner [14, 16], which generates alerts in JSON format (see Fig. 9). The AMiner forwards the alerts via a Kafka message queue to a central Elasticsearch database that stores the alerts. The alert aggregation continuously polls the alerts from the database, builds alert groups, and generates meta-alerts. It forwards both alert groups and meta-alerts to the Elasticsearch database. Finally, we developed a Kibana CTI dashboard that visualizes the alerts, the alert groups, the meta-alerts, as well as their interdependencies. Section 3.3 describes the dashboard in detail.

Fig. 5.
figure 5

The process flow used in the application example includes the log-based anomaly detection system AMiner, the message queue Kafka, a Kibana-based CTI dashboard developed by the authors, and the implementation of the alert aggregation.

3.2 Data Generation

For the demonstration of the alert aggregation and the CTI dashboard, a representative use-case was prepared and implemented. This was accomplished in three steps applying the Kyoushi testbed approach [7]: First, a testbed for log data generation was deployed and simulations of normal behavior were started. After several days of running the simulation, a sequence of attack steps was launched on the testbed. This was repeated four times to generate repeated manifestations of the same attack types that should then be aggregated to meta-alerts using the proposed alert aggregation approach. We took care to vary the attack parameters so that the attack manifestations are not identical, but represent slight variations of the attacks that should then be reflected in the meta-alerts. Second, the log dataset was collected from the attacked webserver and labelled using a predefined dictionary of attack traces. Third, the AMiner was used to analyze the dataset and generate alerts to be aggregated. In the following, we describe each of these steps in more detail.

The data generation approach is described in detail in [7]. In a nutshell, for the example at hand the virtual testbed environment consists of an Apache web server that runs an Exim mail server and Horde Webmail, as well as 16 virtual users that perform common tasks on the infrastructure, including sending emails, changing calendar entries, and taking notes. The most relevant part for alert aggregation concerns the injection of attacks, since variations of attack parameters are reflected in the meta-alerts. For our use-case, we prepared a multi-step intrusion that involves several tools commonly used by adversaries and exploits of two well-known vulnerabilities to gain root access on the mail server. The first two steps involve scans for open ports using the Nmap scanner and the Nikto vulnerability scanner. Then, the attacker uses the smtp-user-enum tool to discover Horde Webmail accounts using a list of common names and the hydra tool to brute-force the login to one of the accounts using a list of common passwords. The attack proceeds with an exploit in Horde Webmail 5.2.22 that allows uploading a webshell (CVE-2019-9858) and enables remote command execution. We simulate the attacker examining the web server for further vulnerabilities by executing several commands, such as printing out system info. In our scenario, the intruder discovers a vulnerable version of the Exim package and uploads an exploit (CVE-2019-10149) to obtain root privileges through another reverse connection. To realize the attacks, we use a sequence of predefined commands in a script, but do not specify values that are only known after instantiating the testbed, such as the IP addresses of the web server and user hosts, as well as parameters that are varied in each simulation, such as port numbers, evasion strategies, or commands executed after gaining remote access.
This attack was purposefully designed as a multi-step attack with variable parameters to evaluate the ability of IDSs to disclose and extract individual attack steps and their connections, and recognize the learned patterns in different environments despite variations.

Once the testbed setup is completed, we run the simulation for five days to capture a baseline of normal system behavior. Afterwards, we run the attacks and collect one more day of log data to ensure that all attack consequences are completed (e.g., events corresponding to timeouts related to the attack may only be generated some time after the attack is finished) and to allow the network and system behavior to stabilize back to normal activity. At this point, we collect all the logs and label all events. This relates to the second step of the overall scenario and data generation procedure and is described in detail in [7]. Labeling data is essential for appropriately evaluating and comparing the detection capabilities of IDSs and alert aggregation approaches. However, generating labels is difficult for several reasons: (i) log data is generated in large volumes and manually labeling all lines is usually infeasible, (ii) single actions may manifest themselves in multiple log sources in different ways, (iii) processes are frequently interleaving and thus log lines corresponding to malicious actions are interrupted by normal log messages, (iv) execution of malicious commands may cause manifestations in logs at a much later time due to delays or dependencies on other events, and (v) it is non-trivial to assign labels to missing events, i.e., log messages suppressed by the attack.

We attempt to alleviate most of these problems by automatically labeling logs on two levels. First, we assign time-based labels to all collected logs. For this, we make use of an attack execution log that is generated as part of executing the attack scripts. We implemented a script that processes all logs, parses their time stamps, and labels them if their occurrence time lies within the time period of an attack stage. Assuming that attack consequences and manifestations are not delayed, it is then simple to check whether anomalies reported by IDSs lie within the expected attack time phases. Since exact times of malicious command executions are known, it is even possible to count correctly reported missing events as true positives.
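The time-based labeling step can be sketched as follows, assuming the attack execution log has been parsed into (start, end, label) tuples and the monitored logs into (timestamp, line) pairs; these data shapes are illustrative assumptions:

```python
# Sketch of time-based labeling: a log event receives the label of the
# attack phase whose time window contains its timestamp, else None.
def label_by_time(events, phases):
    labeled = []
    for ts, line in events:
        label = next((name for start, end, name in phases
                      if start <= ts <= end), None)
        labeled.append((ts, line, label))
    return labeled
```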

While time-based labeling is simple and effective, it cannot differentiate between interleaved malicious and normal processes and does not correctly label delayed log manifestations that occur after the attack time frame. Therefore, our second labeling mechanism is based on lines that are known to occur when executing malicious commands. For this, we carry out the attack steps in an idle system, i.e., without simulating normal user behavior, and gather all generated logs. We observed that most attack steps either generate short event sequences of particular orders (e.g., webshell upload) or large amounts of repeating events (e.g., scans). We assign the logs to their corresponding attack steps and use the resulting dictionary for labeling new data. For the short-ordered sequences, we pursue exact matching, i.e., we compute a similarity metric based on a combination of string similarity and timing difference between the expected and observed logs and label the event sequence that achieves the highest similarity. For logs that occur in large unordered sequences, we first reduce the logs in the dictionary to a set of only few representative events, e.g., through similarity-based clustering. Our algorithm then labels each newly observed log line that occurs within the expected time frame and achieves a sufficiently high similarity with one of the representative lines. These strategies enable correct labeling of logs that occur with a temporal offset or are interrupted by other events, but obviously suffer from misclassifications when malicious and normal lines are similar enough to be grouped together during clustering.
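The dictionary-based labeling for large unordered sequences could be sketched as follows; `difflib.SequenceMatcher` stands in for the combined string-similarity metric, and the window/threshold parameter names are assumptions for illustration:

```python
# Sketch of dictionary-based labeling: a line is labeled if it falls in the
# expected time window AND is sufficiently similar to a representative line.
from difflib import SequenceMatcher

def label_by_dictionary(events, representatives, window, threshold=0.8):
    start, end, label = window
    out = []
    for ts, line in events:
        hit = (start <= ts <= end and any(
            SequenceMatcher(None, line, r).ratio() >= threshold
            for r in representatives))
        out.append((ts, line, label if hit else None))
    return out
```

As noted above, such a scheme tolerates temporal offsets and interleaved normal events, but mislabels lines whenever malicious and normal messages are nearly identical.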

In the final step of the overall scenario and data generation procedure, we use the anomaly-based detection engine AMiner. The AMiner learns a baseline of normal behavior and raises alerts for all deviations from this baseline in the testing phase. We add several detectors to the AMiner analysis pipeline: a detector for unparsed events that do not correspond to the parser model, a detector for new events that fit the parser but never occurred before (i.e., detection of new paths), detectors for new values that do not occur in the training phase in events of several log files (e.g., IP addresses in Exim log files or user agent strings in Apache log files), and detectors for new value combinations that do not occur in the training phase (e.g., file info in Suricata event logs or user authentication info in Exim logs) [14, 16].

3.3 CTI Dashboard

This section describes the CTI dashboard in detail. It presents the visualizations the dashboard provides as well as how it depicts the data entities provided by the alert aggregation, such as alert groups and meta-alerts. Finally, it discusses how meta-alerts can be used to recognize reoccurring attack patterns. The CTI dashboard is implemented as a Kibana dashboard and uses an Elasticsearch database; the proposed alert aggregation approach runs in the backend.

Visualization. Figure 6 shows the overview of the CTI dashboard. The general CTI dashboard shows three tables. The first one shows all currently existing meta-alerts and provides the timestamp when the meta-alert was last observed, the meta-alert ID, and the number of alerts the meta-alert consists of. The second table ‘groups’ shows the alert groups, the timestamp when a group was generated, which meta-alerts a group is assigned to, and the number of alerts a group consists of. The third table shows the alerts, the timestamp an alert was generated, the ‘Componentname’, i.e., the name of the detector that triggered the alert, the message of the alert, which groups it is assigned to, and the related meta-alerts.

The dashboard overview allows filtering for specific meta-alerts, alert groups, and alerts. For example, in Fig. 6, the user filters for the meta-alert with ID 0 by clicking the box next to it. Hence, the second table shows all the groups that match meta-alert 0, and the third table provides all the alerts that relate to meta-alert 0.

Fig. 6.
figure 6

Dashboard overview.

Additionally, the overview provides a button ‘Open graph’. Once a user has filtered specific entities, it is possible to visualize their relations by clicking the ‘Open graph’ button. Figure 7 shows the visualization when filtering for meta-alert 0. The figure shows that groups 0 (G0), 7 (G7), and 15 (G15) match meta-alert 0 (MA0). In this scenario, groups G0 and G7 have triggered the generation of MA0 and G15 matches the meta-alert. Moreover, the visualization shows that four alerts relate to each alert group. Detailed information on the single meta-alerts, alert groups, and alerts is provided by the ‘Discovery’ dashboard.

Fig. 7.
figure 7

Visualization of relations between meta-alerts, alert groups, and alerts.

Alerts and Alert Groups. The dashboard also provides a detailed view on alerts, alert groups, and meta-alerts that complements the graphical overview as a tree. In the following, we review the alert groups and the alerts they contain for the attack step visualized in the previous section. Figures 8, 9, and 10 depict the information of two alert groups provided by the CTI dashboard.

Figure 8 shows alert groups with ID 0 (left) and 7 (right) that both correspond to executions of the Nmap scan and are therefore similar. Note that these groups correspond to nodes G0 and G7 visible in the visualization of the alerts, alert groups, and meta-alert relationships in the previous section (see Fig. 7). As visible in Fig. 8, both alert groups are part of the same meta-alert MA0 with ID 0. Note that the timestamp visible in the figure describes the point in time when the groups were generated, not when the alerts occurred, which is instead visible for each alert separately as we will show in the following screenshots. The delta time, i.e., the interval time that was used to build the groups by adding alerts to the same group if their interarrival time is lower than the threshold, is also visible in the JSON data. In particular, we used a delta time of 5 seconds to generate the alert groups. The field ‘alerts’ then depicts all alerts that are part of the alert group. Figure 9 provides a more detailed view on one of these alerts.

Fig. 8.
figure 8

Two similar alert groups.

Figure 9 shows alerts A1 and A463. A1 represents the second alert of group G0 (left) and alert A463 the second alert of G7 (right). We picked these alerts as a representative example for all other alerts that are merged as part of the meta-alert generation procedure. Comparing these alerts shows that they have many similarities: First of all, they share the same fields. This makes sense, since the alerts are reported by the same detector from the same IDS, in particular, the NewMatchPathValueDetector from the AMiner. This detector detects new values in specific positions of events that did not occur as part of the training phase. Second, the alerts share common values for most of the attributes. This includes, for example, the type of the detector (‘AnalysisComponentType’), the name of the detector (‘AnalysisComponentName’), as well as parts of the parsed log line such as the message ‘no host name found for IP address’ in the parser path ‘/parser/model/fm/no_host_found/no_host_found_str’. However, there are also some differences between the two alerts that are especially interesting for the purpose of alert aggregation. Most importantly, the value that triggered the detection, which is stated in the field ‘AffectedLogAtomValues’, is different: 3232238098 in A1 and 3232238161 in A463. Note that these are IP addresses in decimal format and correspond to 192.168.10.18 and 192.168.10.81, respectively. The IP addresses can also be seen in the raw log data that is depicted in the parser path ‘/parser/model’. These IP addresses are the addresses of the attacker host in the two attack executions of the scenario from which the alerts were collected. Accordingly, their detection as new values is correct. Besides the IP addresses, also the timestamp of the raw log atom is different, because the two attack executions have been launched at two different points in time. The timestamps depicted in Unix time are visible in path ‘/parser/model/time’ and in raw format in path ‘/parser/model’.
We expect that these values are independent of the attack itself, because in every system environment and for every attack execution in real-world scenarios the IP address of the attacker host and the time of execution can differ. Consequently, in the meta-alert these values are either represented as a list of possible options or as wildcards.
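For reference, such decimal-encoded addresses can be converted to dotted notation with Python's standard library:

```python
# Convert a decimal-encoded IPv4 address to dotted notation.
import ipaddress

def decimal_to_ip(value):
    return str(ipaddress.IPv4Address(value))
```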

We do not show the remaining alerts of the groups, since they are interpreted similarly. For completeness, we only state their types and important properties: Two of the alerts are almost identical, because the anomalous value occurs in two identical events. The third alert affects the following event that occurs during normal operation: \(\texttt {Feb 29 06:40:49 mail-2 dovecot: imap-login: Disconnected}\) \(\texttt {(auth failed, 2 attempts in 12 secs):}\) \(\texttt {user=<violet>, method=PLAIN, rip}\) \(\texttt {=, lip=, secured,}\) \(\texttt {session=<3XdWO7GfOoN/AAAB>}\). However, during the attack execution the event appears as follows: \(\texttt {Mar 4 13:51:}\) \(\texttt {48 mail dovecot: imap-login: Disconnected}\) \(\texttt {(disconnected before auth}\) \(\texttt {was ready, waited 0 secs):}\) \(\texttt {user=<>, rip=, lip=192.168.}\)\(\texttt {10.21, session=<+KO9uAeg4sPAqAoS>}\). Note that in this case, the message in brackets states ‘disconnected before auth was ready’ instead of ‘auth failed’, which never occurred before and is thus detected as a new path by the NewMatchPathDetector. The same event also triggers the final alert in the alert group, which is raised by the NewMatchPathValueComboDetector that monitors combinations of values. In particular, this detector monitors the user name as well as the ‘secured’ and ‘method’ parameters in the event, which are all missing in the anomalous event. Since this combination of missing values did not occur in the training phase, the event is raised as an anomaly. We did not pick these alerts as examples, because they are less expressive: the only difference is the timestamp and there is no IP address that differs across the attack executions. Other than that, they are treated and interpreted in the same way.

Fig. 9.
figure 9

Alerts of two similar alert groups

Figure 10 shows the meta-information of the alert group. In particular, it shows the attribute alert_count, which states that four alerts occur in each of the two groups. Moreover, the IDs of the alerts are explicitly mentioned. For the first group, these IDs are 0, 1, 2, 3 corresponding to alerts A0, A1, A2, A3. In the second group, the IDs correspond to A462, A463, A464, A465.

Fig. 10.
figure 10

Alert identifiers stated in alert groups.

Meta-alerts. The following section describes how the ‘Discovery’ dashboard depicts meta-alerts, using meta-alert 0 as an example. Figure 11 shows a part of the JSON that describes MA0, specifically the section that describes the second alert of the attack pattern defined by meta-alert MA0. The second alert of MA0 relates to the alerts A1 and A463, the second alerts of groups G0 and G7 visible in Fig. 9. Figure 11 shows that some properties of the meta-alert have the same values in all groups that are related to MA0. For example, the ‘AnalysisComponentType’, the ‘AnalysisComponentName’, and the ‘AffectedLogAtomPaths’ are equal for all alert groups assigned to MA0. Therefore, the values of these keys are lists with a single entry. Other properties, such as ‘AffectedLogAtomValues’, can take different values and are thus represented by lists of values that occurred at these locations. Since the alert aggregation continuously learns new meta-alerts and adapts existing meta-alerts, the values of the properties can change over time. For example, meta-alert MA0 was generated by G0 and G7. Hence, first a list of two values was assigned to ‘AffectedLogAtomValues’. Afterwards, group G15 also matched MA0 and triggered an adaptation of MA0; in this case a third entry was added to the list. Eventually, the alert aggregation will replace specific values with wildcards if the lists become too long. However, at some point a meta-alert reaches a certain level of stability and the alert aggregation stops modifying it.

Fig. 11.
figure 11

Meta-alert example.

Figure 12 shows that each meta-alert also stores the relations between alert groups and alerts that are assigned to the meta-alert. This information is used to generate the visualization shown in Fig. 7.

Fig. 12.
figure 12

Meta-alert structure.

4 Conclusion

In this chapter we introduced a novel approach for meta-alert generation based on automatic alert aggregation. Our method is designed for arbitrarily formatted alerts and does not require manually crafted attack scenarios. This makes it possible to process alerts from anomaly-based and host-based IDSs that involve heterogeneous alert formats and lack IP information, which is hardly possible using state-of-the-art methods. We presented a similarity metric for semi-structured alerts and three different strategies for similarity computation of alert groups: exact matching, bag-of-alerts matching, and alignment-based matching. Moreover, we proposed techniques for merging multiple alerts into a single representative alert and multiple alert groups into a meta-alert. We outlined an incremental procedure for continuous generation of meta-alerts using the aforementioned metrics and techniques that also enables the classification of incoming alerts in online settings.

We presented an application example that demonstrates the alert aggregation approach. Additionally, we introduced a CTI dashboard that enables visualization of alert aggregation results, including alerts, alert groups, and meta-alerts. Finally, the dashboard allows to filter different entities and visualize their interdependencies.

We foresee a number of extensions for future work. We plan to develop a metric that measures how well a group is represented by a meta-alert; this could reduce the problem of noisy alerts within groups and could even be used to separate alerts of overlapping attack executions into distinct groups. On the other hand, determining how well meta-alerts are represented by groups could allow automatic recognition and improvement of incorrectly formed meta-alerts. Furthermore, we explained that group formation with different \(\delta \) values enables generation of diverse meta-alerts. However, we do not make use of the fact that this also yields a hierarchical structure of groups. It could be interesting to transfer these relationships between groups to meta-alerts in order to improve their precision. Finally, we plan to develop metrics that support interpretation of meta-alert quality, for example, to measure their entropy.