The CyberSANE system is the implementation of the aforementioned approach: an innovative, knowledge-based, collaborative and dynamic security and response system that incorporates all phases of the cyber incident handling lifecycle, from incident detection and analysis activities to post-incident knowledge harvesting.
The main goal of the CyberSANE system is to increase the agility of the investigators and encourage continuous learning throughout the incident life cycle. In particular, the proposed system aims to improve, intensify and coordinate the overall security efforts for the effective and efficient identification, investigation, mitigation and reporting of realistic multi-dimensional attacks and security events within the interconnected web of the cyber assets of the CIIs. To realize this objective, the CyberSANE system is empowered with the newest techniques in prevention, detection and mitigation of cyber-threats, including the understanding of synthetic cyberspace through the use of Advanced Visualization Techniques (immersive interfaces, cyber 3D models, etc.). These visualization approaches will help CII operators to better comprehend the situation and to notice traces and details that allow them to perform in-depth incident analysis and thus to detect a potential threat or attack.
In order to capture all the CyberSANE system requirements, the requirements elicitation process of Gürses et al. (2005) is followed. Moreover, the elicitation, analysis and documentation of requirements focus on capturing the perspectives of technical professionals, such as software developers and system engineers, who cooperate with the system’s end users to discover the problems that have to be solved. The requirements elicitation process covers, among other things, what the proposed system should provide, the expected services, the required characteristics and the software constraints. To this end, targeted questionnaires were developed to collect feedback from various CII operators. These questionnaires mainly serve to corroborate the elicited requirements, but they are also used as a means of communication and feedback collection. In this respect, approximately 30 questionnaire replies were gathered from organizations in various critical sectors (banking, maritime, transportation, healthcare and energy).
From a technical perspective, the system is composed of five core components: (1) the Live Security Monitoring and Analysis (LiveNet) component, which monitors, analyzes and visualizes an organization’s internal live network traffic in real time; (2) the Deep and Dark Web mining and intelligence (DarkNet) component, which monitors the Dark and Deep Web as a means to grasp and analyse the big picture of global malware cybersecurity activities; (3) the Data Fusion, Risk Evaluation and Event Management (HybridNet) component, which receives security-related information on potential cyber threats from both LiveNet and DarkNet in order to analyze and evaluate the security situation inside an organization; (4) the Intelligence and Information Sharing and Dissemination (ShareNet) component, which disseminates and shares useful incident-related information with relevant parties; and (5) the Privacy and Data Protection (PrivacyNet) component, which undertakes the responsibility to collect, compile, process and fuse all the incident-related information in a way that preserves its integrity and validity. Figures 1 and 2 depict the CyberSANE Incident Handling approach and the CyberSANE system along with its main parts, respectively. The following sections provide a detailed description of the five CyberSANE system core components.
Live security monitoring and analysis (LiveNet) component
LiveNet is an advanced and scalable Live Security Monitoring and Analysis component capable of preventing and detecting threats and, in case of a declared attack, of mitigating the effects of an infection or intrusion. The main objective of this component is to implement the Identification, Extraction, Transformation and Load process for collecting and preparing all the relevant information, serving as the interface between the underlying CIIs and the CyberSANE system. It includes cyber security monitoring sensors with network-based Intrusion Detection Systems (IDS), innovative anomaly detection modules and endpoint protection solutions for accessing and extracting information in real time, in order to detect complex and large-scale attacks (e.g. Advanced Persistent Threats). The incident-related information that resides in different and heterogeneous cyber systems may include various types of data, such as: active (unpatched) vulnerabilities in the technological infrastructure; misuse detection in the network or in the systems, including both host-based and network-based IDS deployment and integration; anomaly detection in the network or in the systems; system availability signals; network usage and bandwidth monitoring; industry proprietary protocol anomalies; SCADA vulnerabilities, etc.
LiveNet incorporates appropriate data management and reasoning capabilities for: (1) near real-time identification of anomalies, threats, risks and faults and the appropriate reactions; (2) proactive reaction to threats and attacks; and (3) dynamic decision making at micro, macro and global levels according to the end user’s needs and the identified incidents and threats. These capabilities are empowered with existing innovative algorithms, based on techniques such as machine learning, deep learning and AI, that identify previously unknown attacks. This component provides an abstraction of the collected information to the Data Fusion, Risk Evaluation and Event Management (HybridNet) component of the CyberSANE system. Moreover, all incident-related information captured by LiveNet will be parsed, filtered, harmonized and enriched to ensure that only the data necessary for the multivariate and multidimensional analysis are available to the other components (e.g. HybridNet). Thus, LiveNet contributes as follows: (1) it prevents a flood of irrelevant or repeated information from cluttering the HybridNet processing component; and (2) it consolidates the different data contents and formats into a uniform representation, giving the upper components a unified and convenient way to handle the information.
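The parsing, filtering and harmonization described above can be illustrated with a minimal sketch. It is not the CyberSANE implementation: the raw record layouts, the field names and the CATEGORY_MAP table are hypothetical, and enrichment is left out for brevity. The sketch maps two assumed sensor formats onto one uniform record, drops records that are unknown or irrelevant, and suppresses repeats.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class NormalizedEvent:
    """Uniform record handed to the upper components (assumed schema)."""
    source: str    # which kind of sensor produced the event
    src_ip: str
    dst_ip: str
    category: str  # harmonized category label
    severity: int  # assumed scale: 0 (informational) to 10 (critical)


# Hypothetical mapping from sensor-specific labels to shared categories.
CATEGORY_MAP = {
    "ET SCAN": "reconnaissance",
    "portscan": "reconnaissance",
    "malware_detected": "malware",
}


def harmonize(raw: dict) -> Optional[NormalizedEvent]:
    """Map one raw sensor record onto the uniform schema, or drop it."""
    if raw.get("sensor") == "ids":
        label, src, dst, sev = raw["sig"], raw["src"], raw["dst"], raw["prio"]
    elif raw.get("sensor") == "endpoint":
        label, src, dst, sev = raw["event"], raw["remote"], raw["host"], raw["level"]
    else:
        return None  # unknown format: filtered out
    category = CATEGORY_MAP.get(label)
    if category is None:
        return None  # not relevant to the downstream analysis
    return NormalizedEvent(raw["sensor"], src, dst, category, int(sev))


def prepare(raw_events: Iterable[dict]) -> List[NormalizedEvent]:
    """Harmonize events and suppress repeats (same endpoints and category)."""
    seen = set()
    prepared = []
    for raw in raw_events:
        event = harmonize(raw)
        if event is None:
            continue
        key = (event.src_ip, event.dst_ip, event.category)
        if key in seen:
            continue  # repeated information is not forwarded again
        seen.add(key)
        prepared.append(event)
    return prepared
```

Keying the repeat check on endpoints and category is a design choice of this sketch; a real deployment would more likely also bound it by a time window.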
Deep and dark web mining and intelligence (DarkNet) component
The Deep and Dark Web mining and intelligence (DarkNet) component provides the Social Information Mining capabilities that allow the exploitation and analysis of security-, risk- and threat-related information embedded in user-generated content (UGC). This is achieved via the analysis of both the textual and meta-data content available from such streams. Textual information is processed to extract data from otherwise disparate and distributed sources that may offer unique insights on possible cyber threats. Examples include the identification of situations that can become a threat for the CIIs with significant legal, regulatory and technical considerations. Such situations are: organization of hacktivist activities in underground forums or IRC channels; external situations that can become a potential threat to the CIIs (e.g. relevant geopolitical changes); disclosure of zero-day vulnerabilities; sockpuppets impersonating real profiles in social networks, etc. Entities (e.g. events, places) and security-related information will be uniquely extracted from textual content using advanced Natural Language Processing (NLP) techniques, such as sentiment analysis.
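As a rough illustration of extracting security-related information from user-generated text, the sketch below uses plain pattern matching. This is a deliberately simple stand-in for the advanced NLP techniques mentioned above, not a description of them; the THREAT_TERMS watch list and the function name are hypothetical.

```python
import re

# Matches identifiers of the form CVE-YYYY-NNNN (four or more digits).
CVE_PATTERN = re.compile(r"\bCVE-\d{4}-\d{4,}\b", re.IGNORECASE)

# Hypothetical watch list of terms that hint at threat-related content.
THREAT_TERMS = ("zero-day", "0day", "exploit", "ddos", "ransomware")


def extract_indicators(post: str) -> dict:
    """Return the CVE identifiers and watch-list terms found in one post."""
    text = post.lower()
    # De-duplicate and normalize the identifiers to upper case.
    cves = sorted({match.upper() for match in CVE_PATTERN.findall(post)})
    # Count occurrences of each watch-list term that appears at all.
    terms = {term: text.count(term) for term in THREAT_TERMS if term in text}
    return {"cves": cves, "terms": terms}
```

A post that mentions a vulnerability identifier together with terms such as "exploit" could then be forwarded for closer analysis, while posts with no matches are ignored.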
Data fusion, risk evaluation and event management (HybridNet) component
The Data Fusion, Risk Evaluation and Event Management (HybridNet) component provides the intelligence needed to perform effective and efficient analysis of a security event based on: (1) information derived and acquired by the LiveNet and DarkNet components; and (2) information and data produced and extracted by this component itself. In particular, the HybridNet component retrieves incident-related data from the underlying CIIs via the LiveNet component, together with data from unstructured and structured sources (e.g. from the Deep and Dark Web). These data are consolidated in a unified longitudinal view, where they are linked, analysed and correlated in order to derive semantic meaning and provide a more comprehensive and detailed view of the incident. In CyberSANE, a formal and uniform representation of digital evidence and its relationships is used to encapsulate all concepts of the forensic field and to provide the CIIs’ operators and the forensic investigators with a common understanding of the structure of all information linked to the evidence. The main goal of the analysis process is to continuously carry out assessment (e.g. identification of on-going attacks and related information, such as the stage of the attack and the location of the attacker) and prediction (i.e. identification of possible scenarios of future attacks through forecasting models). HybridNet incorporates fusion models based on existing mathematical models (e.g. data mining, AI, deep learning, machine learning and visualization techniques). These models will provide reasoning capabilities for the near real-time identification of anomalies, threats and attacks, assessing possible malicious actions in the cyber assets, such as abnormal behaviors or malicious connections, to identify unusual activities that match the structural patterns of possible intrusions.
In order to meet its objectives, HybridNet will consist of three elements, which are further described below: the Anomaly Detection Engine; Incident Analysis & Respond; and Decision-Making, Warning and Notification.
The Anomaly Detection Engine will undertake the responsibility to process the large amount of data delivered from the abovementioned components. The objective of this engine is to analyse the received data in order to further evaluate and correlate attack-related patterns associated with specific malicious or anomalous activities in the CIIs. Thus, when the engine identifies unusual activities that match the structural patterns of possible intrusions, it generates alerts to indicate that these activities require more intensive analysis.
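The alerting idea can be sketched with a basic statistical detector. The text does not specify the engine's algorithms, so the z-score test below is only an assumed, minimal example; the function name, the threshold of 3.0 and the sample metric (connections per minute) are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def find_anomalies(baseline: Sequence[float],
                   observed: Sequence[float],
                   threshold: float = 3.0) -> List[int]:
    """Return indexes of observations that deviate strongly from the baseline.

    An observation is flagged when its distance from the baseline mean
    exceeds `threshold` baseline standard deviations (a z-score test).
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is unusual.
        return [i for i, x in enumerate(observed) if x != mu]
    return [i for i, x in enumerate(observed) if abs(x - mu) / sigma > threshold]


# Usage: connections per minute for a host, learned versus newly observed.
normal_traffic = [100, 102, 98, 101, 99]
new_traffic = [100, 103, 250, 97]
alerts = find_anomalies(normal_traffic, new_traffic)  # flags index 2 only
```

Flagged indexes would correspond to the alerts that mark an activity for more intensive analysis; the learning-based detectors mentioned in the text would replace this single-metric test.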
Once an event has been classified by the Anomaly Detection Engine as a real security incident, the Incident Analysis & Respond element is responsible for investigating it further. The analysis is performed based on data and information produced by the following subsystems: (1) the Collaborative Evidence-based Risk Assessment subsystem, which implements the main steps required for the identification, evaluation and mitigation of all vulnerabilities, threats and risks associated with the CIIs, in a graphical way using visualization tools, simulation processes, automated routines and structured content; (2) the Collective Intelligence and Big-Data Analytics subsystem, which implements a wide array of reasoning, data mining and big data analytics techniques that incorporate and leverage a variety of data sources and data types in order to enhance and optimize the investigation and analysis of, and the response to, a security incident; and (3) the Model Processing Management subsystem, which includes a variety of modeling tools and methods in order to easily visualize the CIIs and identify all types of interdependencies (at physical, system, technology and business levels).
Finally, the Decision-Making, Warning and Notification element is responsible for orchestrating and facilitating the analysis, which includes the scrutiny of the attacker’s actions, the identification of the means employed by the attacker and an overall understanding of how the attack originated and evolved. This element takes into account the incident-related information processed by the Anomaly Detection Engine and the Incident Analysis & Respond element in order to design and execute the necessary simulation experiments through the Security Incident/Attack Simulation Environment and the Behavior Simulation Environment.
The Security Incident/Attack Simulation Environment comprises a set of novel mathematical instruments, including mathematical models for simulating, analysing, optimizing, validating and monitoring simulation data and for optimizing the security incident handling process. Specifically, these instruments include: (1) a bundle of novel process/attack analysis and simulation techniques for designing, executing, analyzing and optimizing threat and attack simulation experiments that will produce appropriate evidence and information to facilitate the identification, assessment and mitigation of the CII-related risks; (2) graph theory to implement attack graph generation, to perform security incident analysis and to strengthen the prognosis of future malefactor steps; (3) pioneering mathematical techniques for analyzing, compiling and combining information and evidence about security incidents and attack/threat patterns and paths, in order to find relationships between the recovered forensic artefacts and to piece the evidential data together into a useful chain of evidence (linked evidence) associated with a specific incident; (4) innovative simulation techniques which will optimize the automatic analysis of diverse data; and (5) innovative techniques for linking optimization and simulation. In this context, the simulation environment is fed with information about an incident and proceeds to calculate and generate a number of possible attack graphs (routes of possible attacks) and graphs of linked evidence (chains of evidence), and to compute probabilities for a sequence of events on top of these graphs. The resulting probabilistic estimate for the compromised CIIs’ assets will be used to identify, model and represent the course of an attack as it propagates across the CIIs.
It should be noted that the HybridNet component continuously updates the simulation engine with newly collected pieces of information, thereby making it possible both to understand which assets might already have been compromised and to obtain more accurate estimates of the likelihood that other assets might be compromised in the future.
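Computing probabilities for sequences of events on an attack graph can be sketched as follows. This is an assumed, simplified model rather than the CyberSANE one: each edge carries a hypothetical probability that an attacker who controls one asset goes on to compromise the next, steps are treated as independent, and the probability of a route is the product of its edge probabilities.

```python
from typing import Dict, List, Tuple

# Hypothetical attack graph: graph[a][b] is the assumed probability that an
# attacker who controls asset a successfully compromises asset b.
AttackGraph = Dict[str, Dict[str, float]]


def attack_paths(graph: AttackGraph,
                 source: str,
                 target: str) -> List[Tuple[List[str], float]]:
    """Enumerate simple routes from source to target with their probability.

    Steps are assumed independent, so a route's probability is the product
    of its edge probabilities. Routes are returned most likely first.
    """
    results: List[Tuple[List[str], float]] = []

    def walk(node: str, path: List[str], prob: float) -> None:
        if node == target:
            results.append((path, prob))
            return
        for nxt, p in graph.get(node, {}).items():
            if nxt not in path:  # keep routes simple: no asset visited twice
                walk(nxt, path + [nxt], prob * p)

    walk(source, [source], 1.0)
    return sorted(results, key=lambda item: item[1], reverse=True)


# Usage with made-up assets and probabilities.
example_graph: AttackGraph = {
    "internet": {"web": 0.8, "vpn": 0.3},
    "web": {"db": 0.5},
    "vpn": {"db": 0.9},
}
routes = attack_paths(example_graph, "internet", "db")
most_likely_route, most_likely_prob = routes[0]
```

Updating the edge probabilities as new evidence arrives and re-running the enumeration gives a simple picture of how the estimates can be refined over time. Exhaustive enumeration is only practical for small graphs.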
The second simulation environment is the Behavior Simulation Environment, which aims to simulate the behavior of CIIs’ operators and Threat Actors, taking into consideration their cyber interactions and interdependencies, in order to measure the cascading effects of various cyber-attack patterns and security incidents within the digital ecosystem. Based on the estimated effects, the environment is able to formulate extensive plans to mitigate the effects of such incidents.
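One simple way to picture cascading effects across interdependent assets is a breadth-first traversal of a dependency graph. The sketch below is an assumption made for illustration and does not model operator or threat actor behavior; the asset names and the `dependents` structure are hypothetical.

```python
from collections import deque
from typing import Dict, List


def cascade(dependents: Dict[str, List[str]], compromised: str) -> Dict[str, int]:
    """Return every asset reachable from the compromised one.

    dependents[a] lists the assets that rely on a. The value for each
    affected asset is the number of dependency hops from the initial
    compromise, which can serve as a crude measure of cascade depth.
    """
    depth = {compromised: 0}
    queue = deque([compromised])
    while queue:
        asset = queue.popleft()
        for dep in dependents.get(asset, []):
            if dep not in depth:  # first visit gives the shortest hop count
                depth[dep] = depth[asset] + 1
                queue.append(dep)
    return depth


# Usage with made-up assets: portal and billing rely on auth, and so on.
example_dependents = {
    "auth": ["portal", "billing"],
    "portal": ["mobile"],
    "billing": ["portal"],
}
affected = cascade(example_dependents, "auth")
```

Assets at a small hop count would be the natural first candidates for mitigation planning.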
Intelligence and information sharing and dissemination (ShareNet) component
The ShareNet component provides the necessary threat intelligence and information sharing capabilities within the CIIs and with relevant parties (e.g. industry cooperation groups, CSIRTs). It is responsible for the instantiation of the adopted intelligence model; in particular, ShareNet undertakes the identification and dissemination of the right, sanitized information that has to be shared, in a usable format and in a timely manner. This environment produces and circulates notifications containing critical information, enhancing the perception of the current situation and improving the projection into the future. It should be noted that all potential evidence from the systems suspected to be part of the infrastructure being investigated is forensically captured, stored and exchanged in a way that maintains its integrity, using the security and data protection methods of the PrivacyNet Orchestrator.
To this end, ShareNet follows a trusted and distributed intelligence and incident sharing approach to facilitate and promote collaboration and secure, privacy-aware information sharing between the CIIs’ operators and relevant parties (e.g. industry cooperation groups, CSIRTs). Incident-related risk information is exchanged through specific standards and/or formats (STIX) (OASIS 2017a, b, c, d, e), improving overall cyber risk understanding and reduction. Privacy preservation is another important issue considered at every phase of sharing, by applying methods such as anonymization, pseudonymization and encryption techniques incorporated in and made available by the PrivacyNet Orchestrator. This brings forward a mixture of several cryptographic techniques that hold certain security guarantees.
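The combination of a standard sharing format with pseudonymization can be sketched as below. The dictionary is only loosely shaped like a STIX indicator object and is not validated against the specification; the `x_victim_pseudonym` property, the function names and the sample values are hypothetical, and the keyed hash is one possible pseudonymization method chosen for this sketch.

```python
import hashlib
import hmac
import json
import uuid
from datetime import datetime, timezone


def pseudonymize(value: str, key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256, truncated).

    The same value and key always give the same pseudonym, so reports can
    still be correlated without revealing the original identifier.
    """
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def build_indicator(malicious_ip: str, victim_host: str, key: bytes) -> dict:
    """Build a shareable, STIX-like indicator with the victim pseudonymized."""
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "valid_from": now,
        "labels": ["malicious-activity"],
        "pattern": f"[ipv4-addr:value = '{malicious_ip}']",
        # Hypothetical custom property: the internal host name never leaves
        # the organization in clear text.
        "x_victim_pseudonym": pseudonymize(victim_host, key),
    }


# Usage with documentation-range addresses and a made-up host name.
shared = build_indicator("198.51.100.7", "scada-01.plant.example", b"demo-key")
payload = json.dumps(shared, indent=2)
```

In practice the key would be managed by the organization and never shared, since anyone holding it could test candidate host names against the pseudonym.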
Privacy and data protection (PrivacyNet) Orchestrator
Through the specific “Privacy and Data Protection Orchestrator” (PrivacyNet), it is possible to coordinate the abovementioned components of the CyberSANE system in order to ensure the desired levels of data protection for sensitive incident-related information, so that such protection can be applied in all phases of the cyber security incident handling flow. The main purpose of PrivacyNet is to manage and orchestrate the application of innovative privacy mechanisms and to maximize achievable levels of confidentiality and data protection, towards compliance with the highly demanding provisions of the GDPR in the context of protecting sensitive incident-related information within and outside the CIIs. To this end, PrivacyNet sets up the security and data “protection configurations”, allowing security experts and members of the incident response team to specify all the protection steps that have to be performed and the conditions required to execute them, which can refer to GDPR-based rules (and to other guidance for its application by the European Data Protection Board, formerly the Art. 29 Data Protection Working Party).
In addition, the orchestration approach of CyberSANE allows the most appropriate security and data protection methods to be applied depending on the user’s privacy requirements. These methods cover a wide range of techniques, including anonymization, location privacy, obfuscation, pseudonymization, searchable encryption, multi-party computation and verifiable computation, in order to meet the highly demanding regulatory compliance obligations, for example in relation to accountability towards data protection supervisory authorities and the adequate management of informed consent. For this reason, novel techniques and processes for enhancing the secure distribution and storage of all forensic artifacts, in order to protect them from unauthorized deletion, tampering, revision and sharing (e.g. Attribute-based Encryption (ABE) and blockchain technologies), have been combined.
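The goal of protecting forensic artifacts from unauthorized tampering can be illustrated with a simple hash chain. This sketch shows only the tamper-evidence idea that underlies blockchain-style storage; it is neither a distributed ledger nor Attribute-based Encryption, and the record fields and function names are hypothetical.

```python
import hashlib
import json
from typing import List

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash a record together with the hash of the entry before it."""
    body = json.dumps(record, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("ascii") + body).hexdigest()


def append(chain: List[dict], record: dict) -> None:
    """Add an artifact record, linking it to the current end of the chain."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"record": record, "prev": prev, "hash": _digest(prev, record)})


def verify(chain: List[dict]) -> bool:
    """Recompute every link; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != _digest(prev, entry["record"]):
            return False
        prev = entry["hash"]
    return True


# Usage with made-up artifact records.
evidence_log: List[dict] = []
append(evidence_log, {"artifact": "disk.img", "collected_by": "analyst-1"})
append(evidence_log, {"artifact": "memory.dump", "collected_by": "analyst-2"})
intact = verify(evidence_log)
```

Because each entry's hash depends on the one before it, changing an earlier record invalidates every later link, which is what makes silent revision detectable.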