The static detection of malware has celebrated successes over the years, but obfuscation techniques have deprived static methods of many of their advantages. The Achilles heel of obfuscated code is that, however difficult to read and understand, it has to display its actions when executed. Dynamic methods for malware detection exploit this fact. They execute the code and study its behaviour.

In this chapter, we give an overview of dynamic malware detection. We discuss the different methods used and their strengths and weaknesses compared to static analysis. Again, the goal of the chapter is to shed light on the extent to which dynamic malware detection techniques can help us in detecting the actions of a dishonest equipment provider.

8.1 Dynamic Properties

When we want to observe the behaviour of a system, we first need to decide what aspects of the execution we are going to monitor. Egele et al. [6] provide a good overview of the different alternatives, the most important of which we will review here.

Tracing the function calls of a system under inspection is a natural approach to dynamic malware detection [9]. In programming, functions usually represent higher-level abstractions related to the task that the code should perform. They are also the building blocks of the application programming interface (API) and constitute the system calls that an operating system provides to the programmer to make use of system resources. The semantics of the functions and the sequence in which they are called are therefore very suitable for analysis of what a program actually does. The interception of function calls is carried out by inserting hook functions into the code under inspection or by exploiting the functionality of a debugger. These hook functions are invoked whenever a function subject to analysis is called.
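The hook-based interception described above can be illustrated with a minimal sketch in Python. The wrapper below stands in for a hook function; the intercepted routine, its name, and the logging mechanism are all hypothetical stand-ins for real API hooking machinery.

```python
import functools

def hook(func, log):
    """Wrap func so that every call is recorded with its arguments."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log.append((func.__name__, args, kwargs))
        return func(*args, **kwargs)
    return wrapper

# Hypothetical routine standing in for an API function under analysis.
def open_file(path, mode="r"):
    return f"handle:{path}"

calls = []
open_file = hook(open_file, calls)  # insert the hook

open_file("/etc/passwd")
open_file("/tmp/x", mode="w")
# `calls` now holds the call sequence, including parameter values.
```

Note that the hook captures not only the call sequence but also the concrete argument values, which, as discussed below, are hard to recover statically.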

In addition to obtaining the actual sequence of function calls carried out by a program, the dynamic collection of function calls also reveals the parameters used in each call. This information is inherently difficult to obtain through static code analysis, since the value of each variable passed on as a parameter could be the result of arbitrarily complex computation.

Another useful feature that can be extracted through dynamic analysis is how the system processes data. In its simplest form, this consists of recording how the contents of given memory locations influence what is later stored in other memory locations. As an example, we could assume that one memory location is initially tainted. Whenever an assignment statement is executed, the taint progresses to the memory location that is assigned a new value, if and only if the new value is computed based on the content of the already tainted memory location. Using this information, we would, for example, be able to detect cases in which tainted information is leaked over the network.
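The taint-propagation rule just described can be sketched for a program reduced to straight-line assignment statements. The statement encoding and variable names are illustrative only.

```python
# Minimal taint propagation over straight-line assignments.
# Each statement is (target, [sources]); taint flows to the target
# if and only if some source is already tainted.

def propagate(statements, tainted):
    tainted = set(tainted)
    for target, sources in statements:
        if any(s in tainted for s in sources):
            tainted.add(target)
        else:
            tainted.discard(target)  # overwritten with clean data
    return tainted

program = [
    ("b", ["a"]),        # b = f(a)      -> b tainted if a is
    ("c", ["b", "d"]),   # c = g(b, d)   -> c tainted via b
    ("a", ["d"]),        # a = d         -> a overwritten, now clean
    ("net_out", ["c"]),  # tainted data reaches the network buffer
]

result = propagate(program, {"a"})
```

Here the appearance of `net_out` in the final taint set is the signal that initially tainted information would be leaked over the network.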

The lowest level of properties that can be captured by dynamic analysis is the sequence of machine instructions performed by the system. For any given program, such a trace will easily turn out to be immense and therefore extremely costly to analyse in full. Still, such traces will contain information that may not be represented at the higher abstraction levels of API, function, and system calls.

8.2 Unrestricted Execution

Capturing the dynamic properties of suspicious code by running it in an unrestricted environment has some clear advantages. First, the test environment is easy to set up and, second, it is hard for the malware to detect that it is being observed and thus choose to suppress its malicious actions. API calls and system calls can be collected by inserting hooks into the called functions. The hooks are inserted in the same manner as in rootkits [10].

This approach has two significant weaknesses. First, since the code under inspection runs directly on hardware, it is complicated to collect instruction traces and internal function calls. Second—and this is the most serious drawback—the malicious actions of the malware will be executed in an unrestricted way. This means that the harm intended by the malware will actually be committed. The second drawback can, to some extent, be mitigated by recording the actions of the code under inspection and taking snapshots of the system state. This would allow the tester to restore the system to a non-damaged state. Unfortunately, it is not always possible to roll back the effects of an execution. Information leaked out of a system cannot be recalled, and attacks on other systems can cause harm that cannot easily be undone in the test environment alone.
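The snapshot-and-restore idea can be sketched as follows, with the system state reduced to a plain dictionary for illustration. As noted above, this only undoes local modifications; leaked information cannot be recovered this way.

```python
import copy

# System state modelled as a plain dict (files, registry) for illustration.
state = {"files": {"/etc/hosts": "127.0.0.1 localhost"}, "registry": {}}

# Snapshot taken before the suspicious sample is run.
snapshot = copy.deepcopy(state)

# The sample tampers with the system state.
state["files"]["/etc/hosts"] = "10.0.0.1 evil"
state["registry"]["run"] = "malware.exe"

# After analysis, roll the system back to the non-damaged state.
state = copy.deepcopy(snapshot)
```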

8.3 Emulator-Based Analysis

Running the code without allowing it to cause harm requires the use of a controlled execution environment. An emulator—often referred to as a sandbox in the context of malware analysis—is a software system that replicates the actions of hardware. Using an emulator for a given hardware architecture, one can load an operating system on top of it. In this operating system, the code under inspection can be started and the consequences of its actions confined to the sandbox.

Emulator-based analysis has an additional advantage. Since the hardware is emulated by software, it is easy to insert hooks to extract features such as instruction traces, which are usually only observable at the hardware level. On the other hand, such analysis introduces a semantic gap in comparison with unrestricted execution with API hooks [6]. A hardware emulator cannot immediately identify API or system calls, since these will appear to the emulator as a sequence of instructions that together implement the call in question. These sequences will need to be identified and recognized before the higher-level semantic information of API calls can be obtained.
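A toy fetch-decode-execute loop illustrates both points: the instruction trace falls out of the emulator's main loop for free, while a system call appears as just another instruction, leaving the semantic gap to be bridged by later pattern matching. The instruction set here is invented for illustration.

```python
# Toy emulator: the trace is recorded as a side effect of the main loop.
def run(program, regs):
    trace = []
    pc = 0
    while pc < len(program):
        op, *args = program[pc]
        trace.append(program[pc])  # hook point: every instruction observed
        if op == "mov":
            regs[args[0]] = args[1]
        elif op == "add":
            regs[args[0]] += regs[args[1]]
        elif op == "syscall":
            pass  # to the emulator, just another instruction: the semantic gap
        pc += 1
    return trace, regs

prog = [("mov", "r0", 2), ("mov", "r1", 3), ("add", "r0", "r1"), ("syscall",)]
trace, regs = run(prog, {})
```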

The most important limitation of sandboxes is, however, their speed of operation. Hardware emulated in software inevitably represents a slowdown in itself. More significantly, however, every resource used for analysis represents the removal of resources for running the code under inspection. The detailed insight provided by the sandbox comes at a high cost of overhead. This approach is therefore usually applied only to small suspicious portions of the code.

8.4 Virtual Machines

In an emulator, all the actions of the hardware are simulated by means of software. Therefore, the emulator itself can, in principle, run on hardware that is different from the hardware it emulates. This level of indirection contributes to the significant slowdown induced by emulation-based analysis in sandboxes.

Virtual machines are similar to emulators, in that they encapsulate execution in a software environment to strongly isolate the execution under inspection. They are, however, different, in that they run on hardware that is similar to the hardware they pretend to be. This means that non-privileged instruction sequences in the code under inspection can be run directly on hardware, a feature that could speed up the execution considerably.

The strength of virtual machines in relation to emulators is that they run faster. On the other hand, it is hard to generate instruction traces on non-privileged sequences of instructions that run directly on hardware. The semantic gap between the sequences of instructions and API and system calls is similar between the two approaches.

8.5 Evasion Techniques

A malware programmer’s first duty is to prevent the malware from being detected. When facing dynamic detection methods, malware should therefore be expected to use the same strategy as a human wrongdoer, namely, to behave well as long as it is being observed. Code whose intention is to hide its malicious capability will therefore try to detect whether it is under observation and will only carry out its malicious actions when it has reason to believe that it can do so without being detected [13].

Several different ways of detecting observation have been found in malware. We have seen samples that will not exhibit their malicious behaviour when run in debug mode, as well as malware that behaves well whenever it detects specific patterns of users or simultaneously running processes. More generally, there is an arms race between the makers of emulators and virtual machines and malware developers when it comes to detection and evasion techniques. One particularly interesting aspect of this race is the suggestion that making your system appear as if it were observing the executing code reduces the likelihood of the malware in your system revealing its malicious effects [4]. This will not prevent one from becoming infected but it will, in some cases, make one immune to the consequences.
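The following sketch illustrates the kind of checks an evasive sample might perform, using Python-level stand-ins: `sys.gettrace()` reveals whether a tracer or debugger is installed, and a timed loop approximates the timing checks used to spot emulation. The loop size and threshold are arbitrary assumptions for illustration.

```python
import sys
import time

def timed_loop(n=100_000):
    """Time a tight loop; emulation typically inflates this figure."""
    start = time.perf_counter()
    s = 0
    for i in range(n):
        s += i
    return time.perf_counter() - start

def being_observed():
    """Heuristic observation checks an evasive sample might make."""
    checks = {
        # A Python-level debugger or tracer installs a trace function.
        "debugger": sys.gettrace() is not None,
        # Threshold is an arbitrary assumption; real samples calibrate it.
        "slow_host": timed_loop() > 5.0,
    }
    return any(checks.values()), checks

observed, details = being_observed()
# An evasive sample would trigger its payload only when `observed` is False.
```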

The bad news is that it is very hard to completely hide emulators or virtualization techniques from the code that is running on top of them. This is particularly the case if the code under inspection has access to the Internet and can use this to query a remote time source [8]. The good news regarding evasion techniques is that the emulators and virtual machines that some malware try so hard to detect are rapidly becoming the common platforms upon which most software, including operating systems, runs. The virtualization of resources is nowadays ubiquitous in computing and, thus, malware that exhibits its malicious behaviour only when run on non-virtualized systems is losing its strength. In our case, however, the problem is of a different nature. We are not concerned with only a piece of code. A typical question of interest to us is whether an entire system, consisting of hardware, virtualization support from a virtual machine monitor (VMM), operating systems, and application code, has malicious intent coded into it. It matters little whether the malware hides its behaviour from the VMM if it is the VMM itself we do not trust.

8.6 Analysis

In the sections above, we discussed the problem of collecting information from the execution of some code under inspection. Analysis of this information to conclude whether malicious behaviour is observed is, however, a challenge of its own.

The analysis of execution traces is associated with many of the same problems and solutions that we discussed for heuristic static analysis in Sect. 7.6. This means that many of the methods we described in the previous chapter also apply to dynamic analysis. N-grams have been used to analyse dynamic instruction traces, as well as the static instructions found in the code [2]. Although there are differences between instruction traces captured dynamically and those extracted statically, the methods for their analysis are closely similar.
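As a sketch of how n-gram analysis applies to a dynamically captured trace (the mnemonics below are illustrative):

```python
from collections import Counter

def ngrams(trace, n=3):
    """Count sliding-window n-grams over a sequence of mnemonics."""
    return Counter(tuple(trace[i:i + n]) for i in range(len(trace) - n + 1))

# Illustrative instruction trace captured during execution.
trace = ["mov", "xor", "mov", "xor", "mov", "call"]
profile = ngrams(trace, n=2)
# The resulting counts form a feature vector for downstream classifiers.
```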

Other features, such as the sequences of system and API calls, are more easily captured through dynamic than static detection. Still, the problem of making sense of such sequences bears similarities with the problem of making sense of the control flow information that is used in static analysis (see Sect. 7.6). Therefore, the problem is approached in much the same way and with techniques from machine learning and artificial intelligence [7, 11].
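One simple instance of such sequence analysis is to reduce system call sequences to n-gram count vectors and compare them against known profiles by cosine similarity. The call names and profiles below are hypothetical, and production systems use far richer models; this is only a sketch of the underlying idea.

```python
import math
from collections import Counter

def bigram_vector(seq):
    """Represent a call sequence by its bigram counts."""
    return Counter(zip(seq, seq[1:]))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical profiles: a known-malicious call pattern and a benign one.
malicious = bigram_vector(["open", "read", "connect", "send"])
benign    = bigram_vector(["open", "read", "close", "open", "read"])

sample = bigram_vector(["open", "read", "connect", "send", "close"])
score_mal = cosine(sample, malicious)
score_ben = cosine(sample, benign)
# The sample resembles the malicious profile more than the benign one.
```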

Our question is whether the dynamic analysis of software for malware identification can help us identify malicious intent in integrated products offered by a vendor. The weakness of static analysis in our case is that the same malicious behaviour can manifest itself in countless different ways in static code. During execution, however, similar malicious actions will express themselves similarly. We have therefore seen dynamic methods for detecting malware gain momentum, and few modern malware detection products are still based on static detection alone [5]. There are still two weaknesses that static and dynamic analysis based on machine learning and artificial intelligence have in common: First, they must have access to a learning set that is free of malicious actions. In our case, such sets are hard to define accurately. Second, when the methods for detecting malicious actions become publicly available, they are generally easy to evade.

8.7 Hardware

The literature on malicious software has focused almost exclusively on scenarios where the attacker is a third party. The software user and manufacturer are implicitly assumed to be in collaboration in deterring unknown wrongdoers from inserting malicious code into a piece of equipment. The situation is different for hardware. A wrongdoer is not expected to be able to change the hardware in a finished product. Therefore, hardware security has consisted of techniques to ensure that the design does not inadvertently build security holes into a system.

The thought that a wrongdoer could insert malicious logic gate structures into an integrated circuit was first taken seriously when integrated circuit design and manufacturing practices started increasingly relying on intellectual property cores supplied by third-party vendors. In addition to this development, modern integrated circuits are currently manufactured in third-party facilities and are designed using third-party software for design automation [3]. Distrust in third parties in the hardware domain therefore points the finger at the production phase rather than the post-deployment phase.

Discussions on hardware Trojans have taken particular foothold in the military sector. In September 2007, Israeli jets bombed what was suspected to be a nuclear installation in Syria. During the attack, the Syrian radar system apparently failed to warn the Syrian army of the incoming attack. Intense speculation and an alleged leak from a US defence contractor point to a European chipmaker that built a kill switch into its chips. The speculation is therefore that the radars were remotely disabled just before the strike [1].

There are two classes of dynamic approaches for finding such Trojans in integrated circuits. One is the activation of the Trojan by executing test vectors and comparing the responses with expected responses. If the Trojan is designed to reveal itself only in exceptionally rare cases, such as after a specifically defined sequence of a thousand instructions, such an approach is unfortunately practically impossible, due to the combinatoric explosion in the number of possible stimuli [16].
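A back-of-the-envelope calculation shows why exhaustive stimulation is hopeless even for modest trigger sizes; the trigger width and test rate below are assumptions chosen purely for illustration.

```python
# Assume a 64-bit trigger input and a (generous) tester applying
# one billion test vectors per second.
trigger_bits = 64
vectors_per_second = 10**9

total_vectors = 2 ** trigger_bits
seconds = total_vectors / vectors_per_second
years = seconds / (365 * 24 * 3600)
# Even this modest trigger takes centuries to enumerate exhaustively.
```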

The other method is side-channel analysis, which measures irregularities in power consumption, electromagnetic emission, and timing. This is feasible when there are chips without Trojans with which the measurements can be compared [14]. When the Trojans are deliberately introduced by the manufacturer, however, there may not be any Trojan-free chips for comparison. In our case, side-channel analysis can therefore easily be duped by the perpetrator.
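A minimal sketch of such golden-model comparison, with hypothetical power traces and an arbitrary detection threshold (in practice, the threshold must account for normal process variation between chips):

```python
def mean_abs_deviation(trace_a, trace_b):
    """Average pointwise difference between two power traces."""
    return sum(abs(a - b) for a, b in zip(trace_a, trace_b)) / len(trace_a)

# Hypothetical measurements: a golden (Trojan-free) reference trace and
# a chip under test whose extra gates draw slightly more current.
golden  = [1.00, 1.02, 0.98, 1.01, 0.99]
suspect = [1.00, 1.12, 0.98, 1.11, 0.99]

THRESHOLD = 0.02  # arbitrary; derived from process variation in practice
flagged = mean_abs_deviation(golden, suspect) > THRESHOLD
```

Note that the scheme presupposes a trustworthy golden trace, which is exactly what is missing when the manufacturer itself is distrusted.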

With hardware, unlike software, a great deal of effort goes into handling the problem of untrusted producers of code that goes into the product. It is therefore disconcerting that wrongdoers still seem to operate without substantial risk of being caught. There is yet no ‘silver bullet’ available that can be applied to detect all classes of hardware Trojans with high confidence [15]. This observation should, however, not lead to criticism of the researchers in the area; rather, it should make us realize the complexity of the problem at hand. Continued research is the only real hope we have to be able to address it.

8.8 Discussion

Now that obfuscation techniques have rendered many previously successful static approaches useless, the industry has turned, to an increasing extent, towards dynamic techniques for detecting malware. The reasoning behind this development is sound. Whereas it is possible to camouflage malicious behaviour in code, it is far more difficult to hide malicious behaviour in the actions that are performed.

Dynamic detection has two significant weaknesses. One is that, in any run through the code, only one execution path is followed. Consequently, a dishonest vendor can make sure that the malicious actions of the code are not executed in the time available to the analyser. One way to do this is to start the malicious behaviour only after the system has run for a given amount of time. As discussed above, observing code behaviour will, in most cases, mean that the code runs quite slowly and, as a result, there is a high probability that a given delay threshold will never be reached during analysis. Still, the safest way for a malicious equipment producer to steer the execution path away from the malicious code is to make it dependent on a predefined external stimulus. Encoding this external stimulus in only 512 bits would yield \(2^{512} \approx 1.34 \times 10^{154}\) combinations. For comparison, the universe has existed for approximately \(4\times 10^{17}\) seconds. If the strongest computer on Earth had started computing at the beginning of the universe, it would still not have made any visible progress on the problem of testing these combinations [1]. The other weakness of dynamic detection is that one has to choose between execution in a real environment, where the malicious actions will actually take place, and execution in a controlled environment, where the system can detect that it is being observed.
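The arithmetic behind the 512-bit figure is easy to check; the assumed rate of \(10^{18}\) tests per second is deliberately generous.

```python
# A 512-bit trigger admits 2**512 distinct values.
combinations = 2 ** 512

# Even at 10**18 tests per second since the start of the universe
# (about 4e17 seconds ago), the fraction explored is negligible.
tested = 10**18 * 4 * 10**17
fraction = tested / combinations
```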

Modern malware detection products combine static and dynamic detection in an attempt to cancel out the respective weaknesses of the two approaches [7]. Unfortunately, this is not likely to deter a dishonest equipment vendor. In software, the combination of obfuscating the malicious code in the product and letting it hide its functionality until a complex external trigger occurs suffices to reduce the chance of detection below any reasonable threshold. Symbolic execution has made some inroads in that area [17], but the state of the art is still far from changing the game. In hardware, the same trigger approach can be used to avoid dynamic detection, and malicious actions can be hidden in a few hundred gates within a chip containing several billion of them [12], so that static analysis would be prohibitively hard, even if destructive analysis of the chip resulted in a perfect map of the chip’s logic. In addition, a vendor that produces both the hardware and the software can make the malicious acts a consequence of their interaction. The malicious acts would thus be invisible in either the hardware or the software when studied separately.

When we expect a malicious equipment vendor to be able to put malicious functionality into any part of a technology stack, it is hard to extract traces of activities that we can be reasonably sure have not been tampered with. Still, insight from the field of dynamic analysis is of interest to us. In particular, there is reason to hope that malicious functionality can be detected through observation of external communication channels. Unfortunately, with kill switches and, to some extent, the leakage of information, the damage will already have been done when the activity is detected.