This section presents a proof of concept for the HFC methodology based on a manual representation of cybersecurity profiles. The story behind the use case and the actors in it are not real. The incident (a data leak) was prepared in a controlled environment and is used in practice to teach digital forensics to our students. We apply the methodology to this use case to check whether the results agree with the logical interpretation obtained after analysing the digital evidence.
In what follows, the methodology is used to describe the characteristics of a set of potential suspects during a digital investigation. To do so, we define an initial set of dependencies between the parameters in Table 1. These dependencies will be modified based on the specific information in the users' profiles, which are simplified for the sake of clarity.
As detailed in the previous sections, this information can be completed using API tools to acquire public information about the suspects. However, in this case we prefer to focus on the dynamic adaptation of the context for the interpretation of the digital investigation based on the profiles, since this aspect cannot be implemented using the APIs. The integration of new information from the APIs is left for future work because it requires the design of new correlation models that are beyond the scope of this paper.
Set-up of basic profiles based on HFC characteristics
Table 1 defines characteristics for three users of a system that has suffered a data leak. The (fictitious) incident is as follows.
The incident occurred in a family business called La Abuela Cantora. None of the users seems to be guilty; Clark is the administrator of the network, but his technical skills (and security expertise) are quite limited. A USB Rubber Ducky [33] has been found camouflaged as a normal USB device in a set of USBs shared between the three suspects. The device is attributed to Clark, who has acknowledged that he received it through a contact but did not know what it was.
The three participants hold conflicting views: Denise thinks that Bob is responsible because in the past he betrayed the company by leaking secrets to the reporter Alice Protocolo. Bob, on the other hand, distrusts Clark because he considers him a hacker. Clark believes the attacker is an outsider.
Table 1 Example: Three actors defined based on HFC characteristics
A Rubber Ducky is a special USB device. Once connected to the victim’s machine, it installs keyboard drivers and does not announce itself as a storage device. Instead, it uses its emulated keyboard and its mini processor to execute commands on the victim’s computer. In this example, the Rubber Ducky has been prepared to open a backdoor in the victim’s computer, which was Denise’s computer. The attacker then used Meterpreter to copy documents (some traces are also observed in memory). This can be observed once the digital investigator starts to analyse the devices of the three suspects.
It is important to try to deduce information from the characteristics shown in Table 1 and to understand whether these should be completed and how. This can help improve the timeline of the digital investigation.
To test our approach, the CPRM model [26] is used to represent the information shown in Table 1. To that end, the characteristics have been expressed as layers (e.g., habits, devices), and the specific, common descriptors inside the characteristics (e.g., hobbies, apps, role) have been defined as general parameters inside the layers. Relationships between the parameters have been defined to express the dependencies between them. The tool [26], implemented in Matlab [34], uses Graphviz [35] to generate graphs such as the one shown in Fig. 6. The graphs can be quite complex, which is one of the reasons why the analysis is based on the results obtained after operating on these graphs.
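The layered representation described above can be illustrated with a minimal sketch. It is not the CPRM tool [26], which is implemented in Matlab; it is a Python illustration under stated assumptions. The layer names, parameter names, dependency signs and the helper `to_dot` are hypothetical and chosen only to mirror the examples in the text. The sketch groups parameters into layers and emits a Graphviz DOT description of the dependencies between them.

```python
# Hypothetical layers and parameters, loosely following the examples in the text.
LAYERS = {
    "Habits": ["Hobbies"],
    "Devices": ["Apps", "OperatingSystem"],
    "Actors": ["Offender", "Victim"],
}

# Hypothetical dependencies: (source, target, sign). A sign of +1 means that an
# increase in the source increases the target; -1 means that it decreases it.
DEPENDENCIES = [
    ("Hobbies", "Offender", +1),
    ("Apps", "Victim", -1),
    ("OperatingSystem", "Victim", +1),
]


def to_dot(layers, dependencies):
    """Return a Graphviz DOT description with one cluster per layer."""
    lines = ["digraph profile {"]
    for index, (layer, params) in enumerate(layers.items()):
        lines.append(f"  subgraph cluster_{index} {{")
        lines.append(f'    label="{layer}";')
        for param in params:
            lines.append(f'    "{param}";')
        lines.append("  }")
    for source, target, sign in dependencies:
        label = "+" if sign > 0 else "-"
        lines.append(f'  "{source}" -> "{target}" [label="{label}"];')
    lines.append("}")
    return "\n".join(lines)


dot_text = to_dot(LAYERS, DEPENDENCIES)
print(dot_text)
```

The resulting text can be saved to a file and rendered with the Graphviz `dot` command. Even this small example suggests how quickly the graph grows as parameters are added.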
Analysis: preliminary results
The specific values for the different actors are expressed as particular contexts. Based on the initial information, Clark’s profile has the highest probability of being considered the offender. This is because of his role in the system, because he has specific tools that could have been used to commit the attack (e.g., the Rubber Ducky), and because he has shown interest in hacking pages. Similarly, Bob is more likely than the other participants to be the victim. The reason is that his computer does not have security tools enabled beyond the Windows 7 firewall. Furthermore, the code programmed for the Rubber Ducky is intended for Windows systems; Denise also uses Windows, but Windows Defender will stop this specific threat.
To make the analysis feasible, the model is trained using a basic set of parameters and relationships. When the parameters or the relationships change, the expected behaviour also changes. The objective in this case is to evaluate whether the model can determine if the suspects (Denise, Bob or Clark) are potential victims, offenders, guardians or witnesses. Therefore, the targets of our requests to the tool are the parameters labelled as “Actors”: Victim, Offender, Guardian, Witness. The model, without being instantiated, shows the expected behaviour in Fig. 7. This means that, as it stands, many more parameters ultimately influence the parameter “Offence”. This is neither good nor bad; it is merely the way in which the parameters and relationships have been defined for this use case.
What really changes the results is the particular context (PC) defined for each participant in the experiment. In this case, there are three PCs, one per physical actor: Denise, Bob and Clark. The PCs have been defined based on the columns in Table 1. In this experiment, each PC is combined, separately, with the previous context (Fig. 7). More specifically, Figs. 8, 9 and 10 show the results after combining each profile (Denise, Bob and Clark) with the basic behaviour defined previously. This is done using the rules defined in the CPRM model. Using these rules, the dependencies between the parameters can be expressed, and the parameters considered general can be broken down into more specific parameters in a process called instantiation, which in turn has its own definition of conditions (cf. the mathematical formulation and description of the rules of a CPRM model [26]).
In this example, the model is used to interpret the type of profile of Denise, Bob and Clark by defining the parameter Actor and instantiating it with the parameters “Victim”, “Guardian”, “Witness” and “Offender”. Therefore, the results for the users depend on the specific values of the parameters in Table 1 (e.g., Technical skills for Denise, Bob and Clark). This language is highly dependent on and sensitive to context, which can be a limitation in a production environment with a large number of parameters, but it is useful for showing a proof of concept of the methodology with a problem limited to three actors. In addition, note that when the number of parameters increases, the visualisation of the results becomes very complex. For clarity, we focus on those parameters that directly affect the interpretation implemented in the model.
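The combination of a base context with a particular context, and the instantiation of the general parameter Actor, can be sketched as follows. This is a simplified illustration and not the CPRM rules, whose formal definition is given in [26]. The functions `instantiate`, `combine` and `influence`, the signed weights and the numeric values are all assumptions made for this example and do not come from Table 1.

```python
ROLES = ["Victim", "Offender", "Guardian", "Witness"]


def instantiate(edges, general, instances):
    """Replace a general parameter by its instances, copying each dependency."""
    result = {}
    for (source, target), weight in edges.items():
        sources = instances if source == general else [source]
        targets = instances if target == general else [target]
        for s in sources:
            for t in targets:
                result[(s, t)] = weight
    return result


def combine(base, particular):
    """Overlay a particular context on a base context (the particular one wins)."""
    merged = dict(base)
    merged.update(particular)
    return merged


def influence(edges, values, target):
    """Sum the direct, signed contributions of the known values on a target."""
    return sum(
        weight * values.get(source, 0)
        for (source, t), weight in edges.items()
        if t == target
    )


# Hypothetical base context: dependencies on the general parameter "Actor".
base = {("Salary", "Actor"): -1, ("TechnicalSkills", "Actor"): 1}
instantiated = instantiate(base, "Actor", ROLES)

# Hypothetical particular context for one user: technical skills are assumed
# to reduce, rather than increase, the likelihood of being a victim.
particular = {("TechnicalSkills", "Victim"): -1}
merged = combine(instantiated, particular)

# Hypothetical profile values for that user.
values = {"Salary": 2, "TechnicalSkills": 1}
offender = influence(merged, values, "Offender")
victim = influence(merged, values, "Victim")
print(offender, victim)
```

In this sketch, the particular context changes only one dependency, yet the result for that role differs from the result for the other roles. This is the kind of context sensitivity discussed above.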
The results in Fig. 8 show that, according to the model, Denise is probably not the Offender. Moreover, these results show that, even when increases and decreases of the parameters are combined, the values for being “Offensive” remain negative. Denise could be a victim or a guardian/witness. The reason for these results is that, during the modelling, Denise’s operating system was considered to be more secure given the threat. Also, Denise’s relationship with the organisation (i.e., being a member of the family) decreases her motivational values. Moreover, Denise has a good salary, which decreases her Economic motivation to commit an attack. All these features affect the “Offender” parameter, which is minimised in the case of Denise.
In the case of Bob (Fig. 9), the results are more interesting. His Windows 7 operating system makes him vulnerable to the specific threat considered in this use case, which stems from a particular USB device that apparently belongs to Clark. His relationship with Alice Protocolo, who is known for being an activist at “La Gaceta del RAT”, suggests a possible “ideological” motivation. Thus, various features make Bob’s system vulnerable and can therefore make him a potentially desirable victim. However, in this case the results are not completely conclusive. Bob’s capacity to commit this attack is not clear, as he does not have proven technical skills. Nevertheless, this indecision could indicate that his system is being used as both victim and attacker. This is indeed the case, since the leak occurred on Bob’s device after the Rubber Ducky was connected. Therefore, the results are in line with reality.
At this point one would think that Clark is entirely guilty. The results for Clark are shown in Fig. 10. Clark has clearly been flagged as Offender, with a certain probability of also being a victim. This is motivated by several factors. For example, Clark has access to the entire system because he is the administrator. His salary is also very low, which is a motivation. His hobbies (hacking and security) do not help his case either, although these could be understandable given his role in the system (e.g., if he is interested in improving security). Clark has the motivation, the opportunity and the weapon (the Rubber Ducky).
Moreover, Clark has two people whom he considers to be his friends. One is Bob Protocolo and the other is Sue Picious. Initially, Sue is not considered a relevant actor, until the investigation team reveals that Sue exchanged several hacking-related emails with Clark and also sent him the Rubber Ducky. After this new information, the relationship between the parameters “Friends” and “Challenge” grows for Clark. This means that the values for “Offender” also grow. These results show that, in this chain of facts, Clark’s culpability could be higher than that of the other actors.
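The effect of strengthening one relationship on a downstream parameter can be illustrated with a small sketch. It assumes, purely for illustration, an acyclic chain “Friends” → “Challenge” → “Offender” with hypothetical numeric weights. The function `propagate` is not part of the CPRM tool [26], and the values are not taken from the experiment.

```python
def propagate(edges, inputs, target):
    """Accumulate signed influence along every path ending in target.

    Assumes the dependency graph is acyclic (illustrative simplification).
    """
    total = inputs.get(target, 0.0)
    for (source, t), weight in edges.items():
        if t == target:
            total += weight * propagate(edges, inputs, source)
    return total


# Hypothetical weights before the new evidence about Sue is known.
edges = {("Friends", "Challenge"): 1.0, ("Challenge", "Offender"): 1.0}
inputs = {"Friends": 1.0}
before = propagate(edges, inputs, "Offender")

# The new evidence is modelled as a stronger "Friends" -> "Challenge" link.
updated = dict(edges)
updated[("Friends", "Challenge")] = 2.0
after = propagate(updated, inputs, "Offender")

print(before, after)
```

Under these assumptions, increasing the weight of the first link raises the accumulated value for “Offender”. This mirrors, in simplified form, the growth described above.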