Static analysis for discovering IoT vulnerabilities

The Open Web Application Security Project (OWASP) released the “OWASP Top 10 Internet of Things 2018” list of the high-priority security vulnerabilities for IoT systems. The diversity of these vulnerabilities poses a great challenge to the development of a robust solution for their detection and mitigation. In this paper, we discuss the relationship between these vulnerabilities and the ones listed by OWASP Top 10 (focused on Web applications rather than IoT systems), how these vulnerabilities can actually be exploited, and in which cases static analysis can help in preventing them. Then, we present an extension of an industrial analyzer (Julia) that already covers five out of the top seven vulnerabilities of OWASP Top 10, and we discuss which IoT Top 10 vulnerabilities might be detected by the existing analyses or their extension. The experimental results present the application of some of Julia's existing analyses and their extension to IoT systems, showing the effectiveness of the analysis on some representative case studies.

from vulnerable computers to IoT devices. The ubiquitous nature of IoT ecosystems goes beyond the boundaries of traditional network security, and it widens the attack surface, as interconnected devices operate from different physical locations and network layers. In such scenarios, attackers may use automation tools to simulate authorized operations on legitimate devices to create a springboard effect where they may exploit minor vulnerabilities. IoT systems usually comprise at least three major components: devices, cloud, and companion applications [20]. Each of these components may contain security vulnerabilities, and when combined together such issues might increase their severity exponentially because of various computational and network features of IoT ecosystems.
In general, a "Thing" in IoT (a.k.a. a device) executes (embedded) software on microcontrollers (MCUs) with a small memory footprint, where autonomy, reconfiguration, safety, and fault tolerance are highly sought to meet functional safety requirements. Moreover, IoT devices rely on cloud services (e.g., to communicate and store data). This leads to communications between devices physically located at different places through different communication mediums supported by distinct protocols. This diversity may seriously compromise the integrity of device data that could be sensitive or under the IoT user's control. Tracking such data across diversified communication mediums (e.g., WiFi, Bluetooth, and NFC) is difficult and error prone, as it may propagate through multiple layers and devices. Furthermore, tracking malicious flows can be overwhelming, especially when there are millions of devices communicating through the same channels. Finally, the cloud environment facilitates the development of sophisticated programs allowing the access and control of devices from remote locations. Such software may come in the form of Web or mobile applications, or even enterprise programs. This software, supported by ubiquitous connectivity, may become a suitable attack surface for intruders. Ideally, applications based on cloud services should use identity management, access management, identity governance, and authentication services to enforce security. However, due to the computational constraints of IoT devices, a full-fledged implementation of these services may not be possible. This increases the risk of exploitation of weak points in the companion applications to gain significant remote control of IoT systems.
Several standards and certifications have been proposed over the years in order to prevent software vulnerabilities. The Open Web Application Security Project (OWASP) is probably the most notable and popular effort in this context. Among the many projects carried out by this foundation, the OWASP Top 10 project lists the most dangerous security vulnerabilities in Web applications. Similarly, the OWASP Internet of Things Top 10 project focuses on the 10 most critical risks for the IoT ecosystem. OWASP refers to the Top 10 as an "awareness document" which may be adopted by industries to improve their product development processes in order to minimize and/or mitigate the most critical security risks. The vulnerabilities listed by OWASP IoT Top 10 in 2018 include, among others, weak and hardcoded passwords, insecure network interfaces, lack of update mechanisms, and insecure ecosystem interfaces. The diversity of these vulnerabilities poses a critical challenge to the adoption of a robust solution for their detection and mitigation.
The first version of OWASP Top 10 dates back to 2003. During the last 15 years, this list has widely impacted the processes to enforce cybersecurity in Web applications driving the development and adoption of various tools to prevent software vulnerabilities. Static analysis focuses on their detection at compile time without executing the program. Since this approach does not need to have concrete values to expose different execution paths, it can navigate the code more pervasively at the price of introducing some forms of approximation. Different kinds of static analyzers have been implemented and commercialized, ranging from syntactic analyzers (mostly considering small portions of the code in isolation) to tools based on formal methods (thus building up a complete semantic model of the programs and approximating how different software components might interact). OWASP Top 10 pushed industrial static analyzers to detect the software vulnerabilities listed among its categories. Therefore, various types of analyses have been developed in order to detect injection and XSS vulnerabilities, leakages of sensitive data, hardcoded passwords, as well as usage of weak cryptographic algorithms.
Can we therefore conclude that the application of static analysis to detect IoT security vulnerabilities is straightforward? Unfortunately, no. In particular, the IoT ecosystem comprises quite diversified types of software, like "Web, backend API, cloud or mobile interfaces," and embedded software as well. Each software component is potentially written in a different programming language (e.g., C for embedded software, and Java for Web and mobile applications), executes independently, and interacts over various communication channels. Existing static analyzers are mostly focused on individual programs. If independent programs interact with each other, then the analysis considers each program "in isolation," missing some potential vulnerabilities or producing too many false alarms. In addition, existing static analyzers do not possess any knowledge of, or interface to specify, the physical world of an IoT system.
In addition, the IoT ecosystem poses new challenges. For this reason, a few years ago OWASP opened the IoT project, which released the IoT Top 10 list. Like OWASP Top 10, this list is aimed at impacting how enterprises develop and debug their software in order to prevent vulnerabilities. However, this scenario is more recent and quickly evolving.
In this paper, we discuss each category of the OWASP IoT Top 10 list, and if and how such vulnerabilities can be prevented by means of static analysis. In particular, we compare them with the OWASP Top 10 categories, and we discuss how the static analyzers developed for those vulnerabilities can be applied to IoT software as well. In addition, we extend an existing industrial static analyzer (Julia's Injection checker in particular) to properly address the novel challenges arising from IoT systems. We present some preliminary experimental results showing that our extension can precisely discover security vulnerabilities specific to IoT systems.
The rest of the paper is structured as follows. Section 2 discusses related literature, while Sect. 3 introduces how static analysis has been applied to address some of the categories of the OWASP Top 10 list. Section 4 presents in detail the OWASP IoT Top 10 list, and for each category it discusses if and how static analysis can help to prevent the particular type of software vulnerabilities. Section 5 informally introduces the extension of the Julia static analyzer we developed to address some of the IoT Top 10 issues not yet covered by existing analyses, while Sect. 6 discusses some preliminary experimental results of this implementation. Finally, Sect. 7 concludes.

Related work
The diversity of IoT devices continues to grow, and nowadays it ranges from simple sensing and actuating devices to complex systems like connected vehicles, smart televisions, and cameras. This diversification is closely related to what happened in other computing paradigms such as wireless sensor networks (similar core concepts) and edge and cloud computing (similar adoption of existing technology). On the one hand, IoT takes advantage of the most sophisticated technologies. On the other hand, it also inherits and amplifies the security concerns of these technologies. To make matters worse, in addition to the existing vulnerabilities, it also introduces some unique security threats due to its specific system architecture.
In this context, Assiri et al. [4] reviewed the security and privacy issues associated with the IoT ecosystem. Das et al. [18] carried out a comparative study of different security protocols of IoT. In this process, they presented a taxonomy for security protocols used in IoT which includes device authentication, access control, privacy preservation, etc. Similarly, Frustaci et al. [29] and Neshenko et al. [52] presented a taxonomy of the IoT security issues based on the perception, transportation, and application layers. Tweneboah-Koduah et al. [67] analyzed the taxonomy of various security issues, whereas Ge et al. [30] devised an IoT security model consisting of five phases (namely data processing, security model generation, security visualization, security analysis, and model updates). This model is capable of analyzing IoT security strategies based on well-defined security metrics. Mavropoulos et al. [50] proposed a class-based notation for architectural modeling languages and a corresponding mechanism for the transition between different models. Khattak et al. [44] discussed the various components of an IoT architecture in the context of perception-layer security. Here, they categorized and classified possible attacks at different layers in their architecture. Yoon et al. [71] proposed an architecture for remote security management of IoT devices to prevent various threats in advance. Urien [68] devised an architecture where a TLS server running on a secure chip is used to secure the communication among various devices. This server facilitates strong mutual authentication with the clients and serves as an identity module. However, most of these security frameworks only outline the IoT system vulnerabilities, and they do not provide any analysis of the major security issues listed in OWASP IoT Top 10.
From another perspective, the limited computational resources prevent IoT devices from implementing advanced authentication mechanisms. Thus, device authentication is a weak point in the IoT security landscape. The Mirai malware exploited such devices to launch a DDoS attack in the larger network. El-Hajj et al. [21] provided an analysis of the different authentication mechanisms based on a multi-criteria classification. They compared and analyzed the existing authentication protocols to evaluate their relative advantages and disadvantages. Hao et al. [36] devised a secure device authentication mechanism by integrating physical security with asymmetric cryptography. The cryptographic key is generated by estimating device features such as intermediate nodes and radio frequency. The experimental results demonstrate a more effective protection against various common attacks. Bhawiyuga et al. [5] proposed a token-based authentication mechanism on the MQTT protocol for resource-constrained devices. The devised mechanism comprises four components (namely publisher, subscriber, MQTT broker, and token authentication server). Here, the publisher/subscriber first submits its credentials to the authentication server to get the token. After obtaining a valid token, it can store and use it for further authentication. Shah et al. [60] presented mutual authentication mechanisms based on multiple keys. A secure vault is used for sharing the keys among many IoT devices. Initially, the keys of the secure vault are provided, and they change after each successful communication session. Similarly, a signature-based authentication mechanism for IoT devices was introduced by Challa et al. [9]. The mechanism is simulated using NS2 and later analyzed using Burrows-Abadi-Needham logic. Finally, Alizai et al. [3] proposed a mechanism where devices are allowed into a network only if they pass a multi-factor authentication process. Such an approach helps in mitigating common attacks like replay and man-in-the-middle by using nonces and timestamps.
A full-fledged authentication process is often considered too costly a mechanism for ensuring device security. Thus, in order to counter this limitation, one could secure the network by enforcing various layers of defense, segregating devices into separate networks using firewalls, etc. [35]. In this context, Zaidan et al. [72] explored the security challenges of existing communication components for IoT-based smart homes. Sahay et al. [23] proposed a framework called CyberShip-IoT, which mitigates network traffic attacks by leveraging the software defined network (SDN) paradigm. Chze et al. [10] devised a multi-hop routing protocol for secure communication among IoT devices. This approach authenticates a device through multilayer parameters before forming a new network, thus enhancing the security of the communication. Farris et al. [24] analyzed security issues of network functions virtualization (NFV) and SDN from the perspective of IoT. They also described the critical security challenges that need to be addressed when SDN- and NFV-based security mechanisms are adopted by IoT systems. Kim et al. [45] devised a trustworthy networking system based on self-certifying ID (SCID), whereas Shin et al. [61] used trust between the Proxy Mobile IPv6 (PMIPv6) domain and IoT systems to devise a new protocol for addressing various security issues. The devised protocol supports features like handover management, mutual authentication, key exchange, etc. Giuliano et al. [32] analyzed IoT capillary networks for various IP and non-IP IoT devices to propose an algorithm based on a secure key renewal mechanism for better network security. Although these approaches enhance the security of IoT systems, they usually incur a substantial implementation complexity and/or impose a significant runtime overhead, which may hinder performance as data is transferred over these secure network channels.
Typically, in an IoT system, the values measured by sensors often control the physical behavior of other devices. Thus, the security of transmitted data in the IoT ecosystem is very important for preventing attacks [46]. However, traditional cryptographic algorithms are not suitable for resource-constrained IoT devices. For this purpose, Kim et al. [46] devised a mechanism that implements proxy re-encryption for transmitting data with minimal encryption overhead. Sahay et al. [59] prevented the flow of malicious data in the system by devising a mechanism for detecting the malicious nodes which are vulnerable to version number attacks, whereas Hou et al. [38] combined the concepts of IoT architectures and data life cycles to devise a three-dimensional approach for exploring IoT security. Singh et al. [62] presented an overview of blockchain technology and its use for enhancing the security of IoT, whereas Jeon et al. [42] used MySQL's Mobius configuration to devise a novel IoT server platform supported by a blockchain for secure storage and retrieval of sensor data. Further, Sollins et al. [64] addressed the issue of conflicts in the collection, usage, and management of large volumes of data from the perspective of IoT security and privacy requirements. However, like many other security enforcing mechanisms, these approaches consider the data flow from IoT end-point devices through the Internet to a cloud layer (or vice versa) at runtime to ensure secure usage of IoT data, but they do not statically track how this data flows through different software layers.
In this context, static program analysis [14] can be very useful for determining the taintedness of data (e.g., sensitive or user-controlled data) propagating across different IoT layers. In particular, taint analysis [53,65,66] tracks whether something from a source (e.g., methods retrieving user input or sensitive data) flows into a sink (e.g., methods sending data to the Internet or executing SQL queries) without being sanitized (e.g., encrypted or escaped). This approach has been widely applied to the detection of SQL injections in Web applications [66], leakages of sensitive data [26,27], etc. A first attempt to apply this approach to a scenario similar to IoT was performed by Mandal et al. [47] and Panarotto et al. [58], who used it to detect leakages and injection vulnerabilities in Android automotive apps [49]. Huuck [40] discussed the use of static code analysis to detect some of these types of issues. Similarly, Celik et al. [8] identified security and privacy issues of five IoT platforms, and applied existing static analyzers to detect these issues. These approaches pointed out that "a suite of analysis tools and algorithms targeted at diverse IoT platforms is at this time largely absent." Further, taint analysis should be performed over multiple programs, as IoT systems are composed of multiple interactive components executing independently. Therefore, the current IoT security landscape demands a mechanism for analyzing the security vulnerabilities of IoT systems that accounts for cross-interface data propagation. In this regard, the existing taint analysis techniques can be very useful, but they can only analyze a program in isolation. Therefore, taint analysis should be enhanced to support the analysis of multiple interactive programs running independently.

OWASP Top 10
OWASP Top 10 [57] is one of the flagship and most popular OWASP projects.
The OWASP Top 10 is a powerful awareness document for Web application security. It represents a broad consensus about the most critical security risks to Web applications. Project members include a variety of security experts from around the world who have shared their expertise to produce this list. We urge all companies to adopt this awareness document within their organization and start the process of ensuring that their Web applications minimize these risks. Adopting the OWASP Top 10 is perhaps the most effective first step toward changing the software development culture within your organization into one that produces secure code.
This project lists 10 categories of security vulnerabilities of Web applications in order of relevance. The first version of this classification was released in 2003, and it has been updated several times. The first two columns of Table 1 report the 2017 version. Over the years, OWASP Top 10 kept pace with the changes of the continuously evolving cybersecurity world, where new vulnerabilities are discovered and exploited as soon as the previous ones are detected and fixed. OWASP Top 10 heavily impacted the focus of security

Static analyzers
Static analysis detects bugs at compile time without executing the code. While dynamic analysis (e.g., testing) needs specific execution states in order to expose different execution paths, static analysis can abstractly reason about all the different paths of execution. However, it needs to introduce some form of approximation in order to represent the executions of a program and prove properties of them. In particular, static analysis tools build a semantic model of a piece of software at compile time without executing it, and then check various properties on that model. Nowadays, several standards and regulations (e.g., MISRA, DO-178C, IEC 61508, ISO 26262, and IEC 62304) require the application of such tools. This pushed the development and commercialization of various industrial static analyzers like ASTREE [17] and GrammaTech CodeSonar [34]. Similarly, OWASP Top 10 pushed static analyzers toward the detection of security vulnerabilities on the back-end of Web servers. In this paper, we will focus on the Julia static analyzer [43]. Note that, while other commercial analyzers like SonarQube [1] exist, they all cover the same types of vulnerabilities. Therefore, we chose Julia as a representative example, since most of its analyses have been formalized and published.
Julia implements an abstract interpretation-based engine for the analysis and verification of Java bytecode and CIL. It contains a call graph builder as well as several denotational and constraint-based analyses that rely on an intermediate representation of bytecode. Julia currently features about 50 checkers, ranging from a sound taint analysis engine [22,65] to superficial analyses to detect a large set of typical errors in software, such as null-pointer accesses, nontermination, and wrong synchronization.

OWASP Top 10 coverage by static analyzers
The last column of Table 1 reports the coverage of the various OWASP Top 10 categories by static analysis.
In particular, categories A1 (Injection), A3 (Sensitive Data Exposure), A4 (XML External Entities), and A7 (Cross-Site Scripting) can be detected through taint analysis [66], that is, an analysis that tries to detect if a value coming from a source (e.g., methods retrieving some user input) flows into a sink (e.g., methods executing SQL queries) without being sanitized (e.g., properly escaped). While the set of sources and sinks is different for these analyses, the same taint analysis can be applied to detect security vulnerabilities such as SQL injections and XSS [6] (Julia's Injection checker) as well as to the detection of leakages of sensitive data [26,27] (GDPR checker).
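The source/sink/sanitizer scheme described above can be illustrated with a minimal sketch. It is not Julia's analysis (which works on bytecode through abstract interpretation): it only propagates a single taint bit over a hypothetical straight-line program, and the names `TaintSketch`, `Stmt`, and `Kind` are assumptions introduced here for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy taint propagation over a straight-line program (illustrative only).
final class TaintSketch {
    enum Kind { SOURCE, ASSIGN, SANITIZE, SINK }

    // One statement: 'target' is defined from 'operand'.
    // SOURCE ignores 'operand'; SINK ignores 'target'.
    static final class Stmt {
        final Kind kind;
        final String target;
        final String operand;

        Stmt(Kind kind, String target, String operand) {
            this.kind = kind;
            this.target = target;
            this.operand = operand;
        }
    }

    // Returns the variables that reach a sink while still tainted.
    static List<String> findTaintedSinks(List<Stmt> program) {
        Set<String> tainted = new HashSet<>();
        List<String> alarms = new ArrayList<>();
        for (Stmt s : program) {
            switch (s.kind) {
                case SOURCE:   // e.g., a method retrieving user input
                    tainted.add(s.target);
                    break;
                case ASSIGN:   // taint is copied from operand to target
                    if (tainted.contains(s.operand)) tainted.add(s.target);
                    else tainted.remove(s.target);
                    break;
                case SANITIZE: // e.g., escaping: the result is clean
                    tainted.remove(s.target);
                    break;
                case SINK:     // e.g., a method executing an SQL query
                    if (tainted.contains(s.operand)) alarms.add(s.operand);
                    break;
            }
        }
        return alarms;
    }
}
```

Changing the set of statements marked as SOURCE and SINK is all that is needed to move from an injection check to a leakage check, which mirrors the remark above that the same engine serves both purposes.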
Category A2 comprises a wide range of different checks. Some of them, like hardcoded passwords (checker Passwords) and weak cryptographic algorithms (checker Cryptography), can be (partially) detected by means of static analysis, while others, like the prevention of automated attacks, require runtime analyses and/or monitoring.
Instead, A9 (Using Components with Known Vulnerabilities) can be fully detected by specific tools such as the OWASP Dependency Check. Julia embeds such detection in the VulnerableComponents checker.

Security vulnerabilities of IoT software
In March 2018, OWASP released the "2018 Internet of Things Top 10" [56] list of the high-priority security vulnerabilities for the IoT ecosystem.
The OWASP Internet of Things Project is designed to help manufacturers, developers, and consumers better understand the security issues associated with the Internet of Things, and to enable users in any context to make better security decisions when building, deploying, or assessing IoT technologies. The primary theme for the 2018 OWASP IoT Top 10 is simplicity. Rather than having separate lists for risks vs. threats vs. vulnerabilities (or for developers vs. enterprises vs. consumers), the project team elected to have a single, unified list that captures the top things to avoid when dealing with IoT Security.
Various organizations released detailed guidelines for IoT security targeting different industries. Instead, the OWASP IoT Top 10 provides a generic vulnerability classification by creating a list of highly critical issues relevant for manufacturers, enterprises, and consumers at the same time. This section recaps the security vulnerabilities of the OWASP IoT Top 10, reinforcing them with more context and clarification. Furthermore, it also analyzes the feasibility of detecting such vulnerabilities by means of static analysis. In the rest of this section, for each category of the OWASP IoT Top 10 list we recall its description, we introduce a code snippet explaining it, and we discuss if and how static analysis techniques can be applied to discover and/or prevent this kind of vulnerability.

I1: Weak, guessable, or hardcoded passwords
Use of easily brute forced, publicly available, or unchangeable credentials, including backdoors in firmware or client software that grants unauthorized access to deployed systems.
Keeping a default password during system development is (unfortunately) a common practice, and sometimes such a password might have been hardcoded within the program. However, critical problems may occur when sensitive information (such as credentials, encryption keys, and certificates) is hardcoded. To make this issue even worse, the IoT developer/manufacturer keeps such information exactly the same for all the instances of the device/application. Intruders can easily exploit this to gain unauthorized access to the entire product line by using simple brute force or a reverse engineering approach.
Similarly, keeping cryptographic keys in the source code can cause serious security issues, as source code can be widely shared in an enterprise environment and the keys are easily detectable. The code snippet in Listing 2 depicts such an example.

    SecretKeySpec spec = new SecretKeySpec(key, "AES");
    Cipher aes = Cipher.getInstance("AES");
    aes.init(Cipher.ENCRYPT_MODE, spec);
    return aes.doFinal(secretData);
Instead, weak (e.g., 123456) and guessable (that is, already disclosed and publicly known) passwords are not hardcoded inside the program, but they are stored in property files or a database, and often they are provided by the user.
Static analysis. Hardcoded passwords within the program can be detected by means of static analysis. For instance, Julia provides the Passwords checker, which detects hardcoded passwords (CWE 259) and passwords that are retrieved from property files (CWE 522). However, this covers I1 only partially, since weak and guessable passwords can be detected only (i) through dynamic analysis (e.g., penetration testing) on a deployed system, or (ii) through dynamic monitoring of the passwords selected by users, checking that they conform to some standards (e.g., at least 8 characters, special characters, etc.) and that they have not been previously disclosed in some public databases (e.g., https://haveibeenpwned.com/). Thus, category I1 can be partially detected using static analysis.
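As a rough illustration of what detecting a hardcoded credential involves, the following sketch scans source text for string literals assigned to variables whose names suggest a secret. This is a purely syntactic heuristic, far weaker than a semantic checker such as Julia's Passwords checker; the class name `HardcodedSecretScan` and the list of suspicious name fragments are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Naive syntactic scan for hardcoded secrets (illustrative heuristic only).
final class HardcodedSecretScan {
    // Matches a non-empty string literal assigned to a variable whose name
    // contains one of the (assumed) fragments, e.g.  dbPassword = "admin123"
    private static final Pattern ASSIGNMENT = Pattern.compile(
        "(?i)\\b(\\w*(?:password|passwd|pwd|secret|key)\\w*)\\s*=\\s*\"([^\"]+)\"");

    // Returns the names of the variables that receive a literal secret.
    static List<String> suspiciousVariables(String source) {
        List<String> hits = new ArrayList<>();
        Matcher m = ASSIGNMENT.matcher(source);
        while (m.find()) {
            hits.add(m.group(1));
        }
        return hits;
    }
}
```

A value obtained at runtime (for instance from an environment variable) is not reported, because no literal is assigned directly. The scan says nothing about whether such a runtime value is weak or already disclosed, which matches the limitation noted above.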

I2: Insecure network services
Unneeded or insecure network services running on the device itself, especially those exposed to the internet, that compromise the confidentiality, integrity/authenticity, or availability of information or allow unauthorized remote control.
On the one hand, a common approach to secure network services is to monitor an IT system through firewalls and intrusion detection systems in order to prevent, recognize, and block external attacks. An IoT environment requires similar network security measures, since things communicate over the very same Internet. On the other hand, software running in the system should use secure services (e.g., communications through https rather than http). Moreover, to ease the setup of a new IoT connection, devices often use a technology called UPnP, which allows them to open certain ports in the router and let traffic through. However, researchers [2] uncovered serious security issues with UPnP: if an IoT device in the network is exploited, UPnP can give intruders remote control, allowing them to steal sensitive information and access the other devices connected to the network. Further, the compromised devices can be used to build botnets that launch distributed denial of service (DDoS) campaigns. The code snippet in Listing 4 shows that port 8123 is opened to public traffic.
Static analysis. Static analysis can detect when a program relies on insecure network communications. For instance, a standard string prefix analysis [12,13] approximates the prefixes of a string, and it could detect which communication protocols are used when building up URLs. In addition, static analysis can automatically track and report which ports are used by a program. However, Julia does not currently implement such analyses.
Instead, securing the network services involves monitoring the overall system (and not only analyzing a piece of software), and thus it requires runtime analysis. Therefore, static analysis can only partially cover category I2.
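To give an idea of how a string prefix analysis could classify the protocol of a URL, here is a minimal sketch of a prefix abstract domain. It is an assumption-laden illustration, not an existing Julia analysis (the text states that none is implemented): an abstract value is a known prefix, concatenation keeps the left prefix, and the join at a control-flow merge is the longest common prefix. The class and method names are hypothetical.

```java
// Minimal prefix abstract domain for URL schemes (illustrative sketch).
final class UrlPrefix {
    enum Verdict { SECURE, INSECURE, UNKNOWN }

    // Every concrete string represented by this value starts with 'prefix'.
    final String prefix;

    UrlPrefix(String prefix) {
        this.prefix = prefix;
    }

    // Abstract concatenation: whatever is appended, the known prefix survives.
    UrlPrefix concat(UrlPrefix appended) {
        return this;
    }

    // Join at a control-flow merge: only the longest common prefix is certain.
    UrlPrefix join(UrlPrefix other) {
        int n = Math.min(prefix.length(), other.prefix.length());
        int i = 0;
        while (i < n && prefix.charAt(i) == other.prefix.charAt(i)) {
            i++;
        }
        return new UrlPrefix(prefix.substring(0, i));
    }

    // Classifies the scheme when the known prefix is long enough to decide.
    Verdict scheme() {
        if (prefix.startsWith("https://")) return Verdict.SECURE;
        if (prefix.startsWith("http://")) return Verdict.INSECURE;
        return Verdict.UNKNOWN;
    }
}
```

Joining `http://a.example` with `https://b.example` leaves only the prefix `http`, so the verdict is UNKNOWN: the approximation loses precision but never reports a wrong protocol.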

I3: Insecure ecosystem interfaces
Insecure Web interface, supporting APIs, and mobile interfaces of the IoT ecosystem increases the attack surface of the device or its related components. Common issues include a lack of authentication/authorization, lacking or weak encryption, and a lack of input and output filtering.
This category encompasses various OWASP Top 10 categories, since these were designed to improve the security of a specific layer of the IoT ecosystem (that is, Web applications). In particular, the lack of authentication/authorization and weak encryption are part of A2 (Broken Authentication), while the lack of input and output filtering is part of A1 (Injection) and A7 (XSS).

Again, when we consider the companion Android application, it is clear that the tainted data received from the servlet is displayed to the user. Listings 7 and 8 report the code of the Android application. The background worker retrieves (line 17 of Listing 8) and returns (line 22 of Listing 8) the data exposed by the servlet, while the main activity displays such data (line 19 of Listing 7) to the user. Therefore, if the device, cloud storage, or companion application communicate correctly, they may propagate the sensitive data through the entire network and leak it.
Static analysis. As discussed in Sect. 3, categories A1, A2, and A7 are already covered by static analysis. In particular, Julia's Injection checker covers A1 and A7, while A2 is partially covered by the Passwords and Cryptography checkers. However, the interaction between different layers in an IoT system poses new challenges to static analysis: existing techniques (like taint analysis) can only analyze each layer of the example above in isolation. Therefore, they cannot track how user input flows between different layers, and they can cover injection vulnerabilities in IoT software only partially or with very limited precision. Further investigation and solutions [48] are needed in order to address this novel scenario.
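One conceivable way to reason across layers is to combine the flows found in each program in isolation with explicit links between the sink of one program and the source of another. The sketch below is a hypothetical illustration of that idea, not the extension presented in this paper; the class `CrossLayerFlows` and all endpoint labels are assumptions.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Links per-program taint flows through communication channels (sketch).
final class CrossLayerFlows {
    private final Map<String, Set<String>> edges = new HashMap<>();

    // A source-to-sink flow found inside one program analyzed in isolation.
    void addLocalFlow(String source, String sink) {
        link(source, sink);
    }

    // A channel connecting a sink of one program to a source of another,
    // e.g., an HTTP response written by a servlet and read by an app.
    void addChannel(String sink, String remoteSource) {
        link(sink, remoteSource);
    }

    private void link(String from, String to) {
        edges.computeIfAbsent(from, k -> new HashSet<>()).add(to);
    }

    // Depth-first reachability: can data from 'source' end up in 'sink'?
    boolean reaches(String source, String sink) {
        Deque<String> work = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        work.push(source);
        while (!work.isEmpty()) {
            String node = work.pop();
            if (node.equals(sink)) return true;
            if (!seen.add(node)) continue; // already explored
            for (String next : edges.getOrDefault(node, new HashSet<>())) {
                work.push(next);
            }
        }
        return false;
    }
}
```

Without the channel edge, the servlet and the Android application each look harmless on their own; the end-to-end flow only appears once the two summaries are connected.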

I4: Lack of secure update mechanism
Lack of ability to securely update the device which includes lack of firmware validation on device, lack of secure delivery (un-encrypted in transit), lack of anti-rollback mechanisms, and lack of notifications of security changes due to updates.
IoT systems might be exposed to malware and other attack techniques that exploit vulnerable components installed in the devices when the system was deployed. Therefore, it is important to update the software on a regular basis to prevent such attacks. A commonly used technique for this purpose is a public key infrastructure (PKI) based system, which is capable of providing the security required by update mechanisms. However, the majority of IoT devices lack the computational power to execute PKI efficiently. Moreover, the update process of an IoT system is at risk of being hijacked if the key is not secure and can be extracted from the device, if the update is encrypted with a single symmetric key for all the device instances, or if the encryption keys are transferred along with the update of the device firmware. Therefore, the lack of a secure update mechanism also concerns how the firmware is managed, and not specifically vulnerable components in the firmware.
Static analysis. This type of vulnerability can be detected only after the IoT system is deployed, as it involves the overall setup of the different devices, and it cannot be detected by analyzing the software of the various IoT layers.
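To make two ingredients of a secure update concrete (firmware validation and anti-rollback, both named in the category description), the following sketch verifies a signature over a firmware image and rejects versions that are not newer than the installed one. It assumes a device able to verify RSA signatures through the standard `java.security` API, which may not hold for constrained hardware; the class `FirmwareUpdateCheck` and its methods are hypothetical.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Signature check plus anti-rollback for a firmware image (sketch).
final class FirmwareUpdateCheck {
    private static final String ALGORITHM = "SHA256withRSA";

    // Vendor side: sign the firmware image with the vendor's private key.
    static byte[] sign(byte[] image, PrivateKey vendorKey) {
        try {
            Signature signer = Signature.getInstance(ALGORITHM);
            signer.initSign(vendorKey);
            signer.update(image);
            return signer.sign();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Device side: accept only a correctly signed, strictly newer image.
    static boolean accept(byte[] image, byte[] signature, PublicKey vendorKey,
                          int newVersion, int installedVersion) {
        if (newVersion <= installedVersion) {
            return false; // anti-rollback: refuse old or equal versions
        }
        try {
            Signature verifier = Signature.getInstance(ALGORITHM);
            verifier.initVerify(vendorKey);
            verifier.update(image);
            return verifier.verify(signature);
        } catch (GeneralSecurityException e) {
            return false; // malformed signature or key: reject the update
        }
    }

    // Generates a throwaway key pair for demonstration purposes only.
    static KeyPair demoKeyPair() {
        try {
            KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
            generator.initialize(2048);
            return generator.generateKeyPair();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In this simplified form the version number travels outside the signed data; a realistic scheme would include it in the signed payload so that it cannot be altered independently of the image.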

I5: Use of insecure or outdated components
Use of deprecated or insecure software components/libraries that could allow the device to be compromised. This includes insecure customization of operating system platforms, and the use of third-party software or hardware components from a compromised supply chain.
Studies [70] showed that 80% of the code in today's applications comes from libraries and frameworks, but the vulnerabilities associated with these components have been largely underestimated. An outdated vulnerable library may allow an intruder to exploit the full privileges of the application, which may include accessing sensitive data, executing transactions, etc. In this regard, the National Vulnerability Database [54] lists the majority of the outdated vulnerable libraries. Therefore, IoT applications should not use third-party libraries that contain well-known vulnerabilities published in the National Vulnerability Database.
Static analysis. This type of vulnerability is equivalent to A9 (Using Components with Known Vulnerabilities) of the OWASP Top 10 list, and it can be easily detected statically using tools such as OWASP Dependency-Check, as discussed in Sect. 3.2.
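The check described above can be illustrated with a minimal sketch, assuming that the declared dependencies and an advisory list are already available as in-memory data. The class name, the input format, and the library names are hypothetical; the sketch matches exact versions only and does not reproduce how OWASP Dependency-Check or the National Vulnerability Database identify components.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: flags declared dependencies that appear in an advisory list.
class KnownVulnerableDependencyCheck {

    /** A declared dependency, e.g. as read from a build file. */
    record Dependency(String name, String version) {}

    /**
     * Returns the dependencies whose exact version is listed as vulnerable.
     * The advisory map (library name -> vulnerable versions) is an assumed input format.
     */
    static List<Dependency> findVulnerable(List<Dependency> declared,
                                           Map<String, Set<String>> advisories) {
        List<Dependency> hits = new ArrayList<>();
        for (Dependency d : declared) {
            Set<String> vulnerableVersions = advisories.getOrDefault(d.name(), Set.of());
            if (vulnerableVersions.contains(d.version())) {
                hits.add(d);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Made-up library names and versions, for illustration only.
        List<Dependency> declared = List.of(
                new Dependency("example-http-client", "1.2.0"),
                new Dependency("example-json", "2.0.1"));
        Map<String, Set<String>> advisories = Map.of(
                "example-http-client", Set.of("1.1.0", "1.2.0"));
        System.out.println(findVulnerable(declared, advisories));
    }
}
```

A real tool would additionally have to resolve transitive dependencies and version ranges, which this sketch deliberately leaves out.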

I6: Insufficient privacy protection
User's personal information stored on the device or in the ecosystem that is used insecurely, improperly, or without permission.
During the last few years, identity thefts and leakages of sensitive data have been on the rise, with an increasing number of devices exposed to the Internet. Assessing the type of sensitive data handled in the various IoT layers, and protecting it, is nowadays a critical topic. For instance, should the apps installed in the infotainment system communicate or store sensitive information like location, speed, etc. of the vehicle all the time? Thus, from the privacy perspective one should look for unnecessary communication and storage of sensitive personally identifiable information, encryption of all such data, and anonymization whenever feasible to protect the privacy of the user. This category is similar to A3 (Sensitive Data Exposure) of the OWASP Top 10 list.
Examples. To illustrate privacy-breaking IoT applications, we considered the Rain Monitor app. It uses OpenXC [55] to collect sensitive data (such as location, windshield status, and speed) of a car, and transmits it to a remote Web service, where it is collected and used to inform drivers of possible showers in their area. Clearly, this app leaks sensitive data to the Internet, which can be seen as a privacy issue. Listing 9 reports the snippet of the OpenXC application that reads the car location and windshield data, and sends it to the Internet, without encryption or authentication. The status of the HTTP request and of the windshield is also logged. These are instances of injections: flows of sensitive data into dangerous operations. In this case, the operations divulge sensitive information, violating privacy.
Listing 10 Another code snippet from the Rain Monitor app. Sensitive car data is logged and used to build a URL address.
The code snippet in Listing 10 reports another fragment of the source code from the same app. In this case, the application reads the car position from the CAN and logs it. Hence, anybody having access to the logs can reconstruct the movements of the vehicle, a clear privacy issue. At the end, this code builds a URL by using latitude and longitude. This is a URL injection (sensitive data flowing into an Internet address), possibly inherent to the task performed by this app.
Static analysis. As discussed in Sect. 3.2, taint analysis can be applied to detect leakages of sensitive data. Such an approach has been implemented in Julia's GDPR Checker. Therefore, this category can be covered by static analysis.

I7: Insecure data transfer and storage
Lack of encryption or access control of sensitive data anywhere within the ecosystem, including at rest, in transit, or during processing.
Consider the scenario discussed in I6: Insufficient Privacy Protection, where an application sends the location of a car to the cloud, and also logs some sensitive information coming from the CAN in the local log file, as depicted in the code snippets of Listing 9 (lines 31-32) and Listing 10 (line 15). With regard to this category, the problem is that the sensitive data is passed and stored in clear text. Therefore, anyone with access to the network or the log can intercept and understand it. To protect this sensitive data, messages must be properly encrypted (e.g., with keys that are not directly accessible). These keys are usually stored in a keystore and protected with a password.

Listing 11
Malicious usage of a keystore.
 1 KeyStore ks = KeyStore.getInstance("JKS");
 2 char[] password = getPassword();
 3 try (FileInputStream fis = new FileInputStream("keyStoreName")) {
 4     ks.load(fis, password);
 5 }
 6 // get private key
 7 KeyStore.ProtectionParameter protParam = new KeyStore.PasswordProtection(password);
 8 KeyStore.PrivateKeyEntry pkEntry = (KeyStore.PrivateKeyEntry) ks.getEntry("privateKeyAlias", protParam);
 9 PrivateKey myPrivateKey = pkEntry.getPrivateKey();
10 // save secret key
11 javax.crypto.SecretKey mySecretKey = ...;
12 KeyStore.SecretKeyEntry skEntry = new KeyStore.SecretKeyEntry(mySecretKey);
13 ks.setEntry("secretKeyAlias", skEntry, protParam);
14 // store in the keystore
15 try (FileOutputStream fos = new FileOutputStream("newKeyStoreName")) {
16     ks.store(fos, password);
17 }
Example. The code snippet in Listing 11 uses such a keystore. If the parameters passed to load() (line 4) and to the constructor of PasswordProtection (line 7) contain input under the user's control, then the security of the system could be compromised by an attacker. Moreover, many companion applications rely on weak cryptographic algorithms such as SHA1PRNG in SecureRandom. A common but potentially harmful use of this algorithm is the creation of encryption keys by using a password as a seed [31]. Due to some implementation issues, the key could become deterministic if the seed is generated with an unsafe algorithm. These issues are prominent in many IoT devices and their companion applications.
Static analysis. Like category A2 of the OWASP Top 10 list, this category can be partially covered. In particular, the use of unsafe cryptographic algorithms can be detected by simple analyses that check whether some APIs are called with some specific values. Such vulnerabilities are already detected by standard tools, like Julia's Cryptography checker. Instead, taint analysis can be applied to detect whether user input flows into cryptographic keys that are stored in a keystore. However, to the best of our knowledge, state-of-the-art industrial analyzers do not yet cover this scenario.
Finally, the lack of access control to sensitive data can be only discovered when the system is deployed (e.g., through some forms of dynamic analysis). Therefore, this category can be covered only partially by static analysis means.
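The risk of seeding SHA1PRNG with a password can be illustrated with a minimal sketch. The class and method names are hypothetical, and the behavior shown assumes the default SHA1PRNG implementation shipped with OpenJDK, where a seed supplied before the first request for random bytes replaces the internal self-seeding; other providers may behave differently.

```java
import java.nio.charset.StandardCharsets;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Arrays;

// Hypothetical demo of the unsafe "password as PRNG seed" pattern.
class SeededPrngKeyDemo {

    /** Derives 16 "key" bytes from a password by seeding SHA1PRNG (unsafe pattern). */
    static byte[] deriveKeyUnsafely(String password) {
        try {
            SecureRandom prng = SecureRandom.getInstance("SHA1PRNG");
            // Seeding before the first nextBytes() call: with the assumed provider,
            // the generator state then depends on the password only.
            prng.setSeed(password.getBytes(StandardCharsets.UTF_8));
            byte[] key = new byte[16];
            prng.nextBytes(key);
            return key;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA1PRNG is not available", e);
        }
    }

    public static void main(String[] args) {
        byte[] first = deriveKeyUnsafely("example-password");
        byte[] second = deriveKeyUnsafely("example-password");
        // Under the stated assumption, both derivations yield the same bytes.
        System.out.println("same key twice: " + Arrays.equals(first, second));
    }
}
```

A checker of the kind mentioned above would only need to recognize the constant "SHA1PRNG" passed to SecureRandom.getInstance, whereas detecting that the seed originates from user input calls for taint analysis.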

I8: Lack of device management
Lack of security support on devices deployed in production, including asset management, update management, secure decommissioning, systems monitoring, and response capabilities.
This may lead to unauthorized access to the device or the data. Due to poor configurations, devices may have debug ports open for interaction with the system. An intruder may communicate through these pin-outs to interact with the entire system. The level of vulnerable interaction and privilege exploitation depends on the type of communication protocol. In the configuration file there may be pin-outs for the UART interface, which enable intruders to access the command shell, logger output, etc. Again, depending on the device configuration, an intruder may also get low-level access to the microcontroller using protocols such as JTAG and SWD, which can be used to read/write the internal flash, read/write register values, and debug the OS/base firmware code.
Static analysis. This category requires analyzing a whole deployed system (comprising firmware and runtime settings), and it goes far beyond the IoT software, involving some forms of runtime monitoring. Therefore, it cannot be detected by static analysis.

I9: Insecure default settings
Devices or systems shipped with insecure default settings or lack the ability to make the system more secure by restricting operators from modifying configurations.
Malware like Mirai scans IoT devices trying to get control by using the default username and password (e.g., admin/admin). If malware is able to get root access to the device, it may exploit it for coordinating botnet attacks. Therefore, when configuring a device, it is critical to require the administrator to follow strict security regulations.
Static analysis. In general, these default settings are stored in properties files or databases, and thus, this category cannot be detected using static analysis.

I10: Lack of physical hardening
Lack of physical hardening measures, allowing potential attackers to gain sensitive information that can help in a future remote attack or take local control of the device.
One important aspect of IoT devices is that they are used regularly by multiple users over time. On top of the device usage, there is also the aspect of how a device is accessible, and what level of device access is really needed. Physical security weaknesses are present, for instance, when an attacker can disassemble a device to easily access the storage medium and any data stored on it. Weaknesses are also present when USB ports or other external ports can be used to access the device using features intended for configuration or maintenance. This could lead to unauthorized access to the device. An attacker could then steal confidential data from the device's memory and launch a spoofing attack.
Static analysis. The physical presence of the intruders is required to carry out these kinds of attacks, and therefore, they can be prevented only by physical surveillance and access control of the devices. Thus, static analysis cannot help to detect this category of vulnerabilities.

Summary
The in-depth exploration of OWASP IoT Top 10 categories suggests that IoT security vulnerabilities can be broadly classified into three categories: software, system, and device hardware. Software vulnerabilities refer to security issues associated with the applications running on the IoT system at different layers. Instead, system vulnerabilities refer to the security issues related to the firmware or operating systems of the devices, as well as to the configuration of the deployed system. Finally, device hardware vulnerabilities are associated with the hardware components and the physical environment they are operating in.
Again, static analysis detects program vulnerabilities without executing the code. Therefore, this approach is suitable for detecting application vulnerabilities, such as reading sensor data and sending it to the cloud or storing it locally. Instead, discovering vulnerabilities in the overall system often involves firmware and the RTOS, and thus, it cannot be pursued by static analysis, as it usually demands execution of the program to monitor runtime behaviors and device configuration (e.g., adjusting the duty cycle, sending signals to the microcontroller's pins, communication protocols, managing credentials from the configuration files, etc.). Finally, the vulnerabilities associated with the device hardware and physical operational environment cannot be detected using static analysis, as they have nothing to do with the application software. Table 2 summarizes the categories of the OWASP IoT Top 10 2018 vulnerabilities, and their coverage using static analysis.
Discussion. Six out of the top seven vulnerabilities of OWASP IoT Top 10 can be addressed by static analysis. Existing industrial solutions were developed mainly to address OWASP Top 10 vulnerabilities in Web applications, and in fact the four IoT categories that have a corresponding category in OWASP Top 10 are already covered by Julia. While the coverage of IoT categories might be more or less pervasive (e.g., for I1 this approach can detect only very specific cases), in two cases (I5 and I6) static analysis could provide deep coverage. Therefore, the next section will briefly introduce the extension of Julia's analyses we developed, while Sect. 6 will present the experimental results of these analyses when applied to some IoT applications publicly available on GitHub.

Extending static analyzers to IoT systems
In this section, we describe how we extended Julia's taint analysis engine [7,22,65] in order to detect IoT security vulnerabilities such as leakages of sensitive data and interface interaction issues.

IoT privacy checker
An IoT device consists of multiple sensors, which provide their own APIs to access (potentially sensitive) sensor data.
Thus, it is very difficult to provide a single solution applicable to all IoT devices and companion apps. For this purpose, we applied Julia's taint analysis to detect privacy issues related to sensitive sensor data. Figure 1 depicts the working mechanism of the IoT privacy checker, which relies on a dictionary of sources and sinks specific to the APIs of the IoT devices under analysis. Here, sources refer to methods retrieving sensitive information about the device, whereas sinks include methods that potentially leak data (e.g., logging, database, or network manipulation). The analyzer tags as tainted (i.e., with a Boolean flag set to true) all the values retrieved from some sources, and then it propagates such values throughout the whole program following the program's semantics. Then, the analyzer checks the flag associated with the values passed to the sink. If it is false, then a flow of tainted data into that sink is not possible; otherwise, it generates a warning, reporting a potential leakage of private data. We instantiated our approach to a specific library used by some of the examples we will discuss in Sect. 6.
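The source/sink mechanism just described can be sketched on a toy straight-line program in which every statement has the form target = method(args). This is a deliberately simplified illustration, not Julia's analysis, which works on bytecode by abstract interpretation; the class name and all method names in the dictionaries are placeholders.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of Boolean taint tracking with a source/sink dictionary.
class TaintSketch {

    /** One statement of a toy straight-line program: target = method(args...). */
    record Call(String target, String method, List<String> args) {}

    /** Returns one warning per sink call that may receive tainted data. */
    static List<String> analyze(List<Call> program, Set<String> sources, Set<String> sinks) {
        Map<String, Boolean> tainted = new HashMap<>(); // variable -> Boolean taint flag
        List<String> warnings = new ArrayList<>();
        for (Call c : program) {
            boolean anyTaintedArg = false;
            for (String arg : c.args()) {
                anyTaintedArg |= tainted.getOrDefault(arg, false);
            }
            // Check the flag of the values passed to a sink.
            if (sinks.contains(c.method()) && anyTaintedArg) {
                warnings.add("possible leak of sensitive data into " + c.method());
            }
            // A value is tainted if it comes from a source or is computed from tainted values.
            tainted.put(c.target(), sources.contains(c.method()) || anyTaintedArg);
        }
        return warnings;
    }

    public static void main(String[] args) {
        // Placeholder method names, not the API of any real sensor library.
        Set<String> sources = Set.of("Gps.readLatitude");
        Set<String> sinks = Set.of("Log.d", "Http.post");
        List<Call> program = List.of(
                new Call("lat", "Gps.readLatitude", List.of()),
                new Call("msg", "String.concat", List.of("prefix", "lat")),
                new Call("r1", "Log.d", List.of("msg")),
                new Call("r2", "Http.post", List.of("prefix")));
        System.out.println(analyze(program, sources, sinks));
    }
}
```

On this input the sketch reports a single warning, for Log.d: the value logged there is computed from the latitude, while the value passed to Http.post never depends on a source.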

Insecure ecosystem checker
Julia's taint analysis can provide an exhaustive report tracking how tainted data flows through the program up to a sink [25]. Such an approach has been instantiated for the GDPR analysis [26,27], and it allows the user to specify sources and sinks through an Excel spreadsheet. This spreadsheet contains all the potential API calls that could retrieve or leak sensitive data. The user can then tag these with the category of sensitive data they retrieve or with the leakage points they disclose information to. The GDPR checker then applies Julia's taint analysis engine with the specification provided through the annotated Excel file, and it returns an exhaustive report with all possible data flow graphs representing potential leakages. We extended the GDPR checker to work with multiple programs. To this end, we added intermediate sources and sinks at the boundaries of the different interfaces, derived from the communication channel, and tracked the propagation of tainted data across the interfaces. Figure 2 depicts the working principle. The analyzer first generates the possible set of sources and sinks from the given programs. The users may provide the set of external sources (primary origin) and sinks (final destination) for various types of sensitive data. Usually, we have a program (e.g., embedded software in an IoT device) that accesses the sensitive data and transmits it through some communication channel. In this case, the analysis considers as sources the ones defined by the user (that is, external sources), while the sinks are determined from the communication channel (termed intermediate sinks). Then, we usually have another application that reads data from the communication channel and exposes it to other applications. Here, both sources and sinks are the ones specified for some specific communication channels. Finally, we could have a program that reads data from the communication channel, and leaks it to some external sinks. In this latter case, the sources are the ones defined for communication channels, while the sinks are the ones defined by the user (external sinks). A complete IoT system comprises several applications that are all analyzed by our approach, and then the results are combined into a final report.
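The combination of per-program results into a final report can be sketched as follows, assuming that each single-program analysis has already produced its source-to-sink flows. The data model (Kind, Endpoint, Flow), the class name, and the API names in the example are hypothetical; the sketch only joins a flow ending in an intermediate sink with flows starting from an intermediate source on the same channel.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical sketch: stitches per-program taint flows across communication channels.
class CrossProgramFlows {

    enum Kind { EXTERNAL_SOURCE, CHANNEL_SOURCE, CHANNEL_SINK, EXTERNAL_SINK }

    /** An endpoint of a flow: a tagged API and, for channel endpoints, the channel name. */
    record Endpoint(Kind kind, String api, String channel) {}

    /** A source-to-sink flow reported by the analysis of a single program. */
    record Flow(String program, Endpoint from, Endpoint to) {}

    /** Joins per-program flows into paths from an external source to an external sink. */
    static List<List<Flow>> endToEnd(List<Flow> flows) {
        List<List<Flow>> result = new ArrayList<>();
        for (Flow f : flows) {
            if (f.from().kind() == Kind.EXTERNAL_SOURCE) {
                extend(new ArrayList<>(List.of(f)), flows, result);
            }
        }
        return result;
    }

    private static void extend(List<Flow> path, List<Flow> flows, List<List<Flow>> result) {
        Flow last = path.get(path.size() - 1);
        if (last.to().kind() == Kind.EXTERNAL_SINK) {
            result.add(path); // reached a final destination
            return;
        }
        for (Flow next : flows) {
            boolean continuesOnSameChannel = next.from().kind() == Kind.CHANNEL_SOURCE
                    && last.to().kind() == Kind.CHANNEL_SINK
                    && Objects.equals(next.from().channel(), last.to().channel());
            if (continuesOnSameChannel && !path.contains(next)) {
                List<Flow> longer = new ArrayList<>(path);
                longer.add(next);
                extend(longer, flows, result);
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder API and channel names, for illustration only.
        Flow edge = new Flow("edge",
                new Endpoint(Kind.EXTERNAL_SOURCE, "Sensor.read", null),
                new Endpoint(Kind.CHANNEL_SINK, "Channel.send", "cloud"));
        Flow app = new Flow("app",
                new Endpoint(Kind.CHANNEL_SOURCE, "Channel.receive", "cloud"),
                new Endpoint(Kind.EXTERNAL_SINK, "Screen.show", null));
        System.out.println(endToEnd(List.of(edge, app)));
    }
}
```

A relay program, whose flows start at an intermediate source and end at an intermediate sink, is handled by the same join, so a path may span more than two programs.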

Experimental results and discussion
The IoT checkers have been implemented on top of the commercial Julia static analyzer [43] for Java and .NET bytecode, based on abstract interpretation [15,16]. As of version 2.7.0.2, Julia implements 48 different checkers, divided into two main groups. The basic checkers perform simple yet comprehensive semantic controls of software issues. Instead, the advanced checkers perform deep semantic controls that need a complete inspection of the call graph, a precise abstraction of the heap, as well as other supporting (e.g., flow [22]) analyses. The analyses have been executed on an r5.xlarge Amazon Web Services machine, which features a Xeon Platinum 8000 series (Skylake-SP) processor with a sustained all-core Turbo CPU clock speed of up to 3.1 GHz and 32 GB of RAM.

IoT privacy
Listing 12 Vulnerability warnings for the Rain Monitor app.
1 CheckWipersTask.java:111:XSS-injection into method "execute"
2 CheckWipersTask.java:114:Log forging into method "w"
3 CheckWipersTask.java:117:Log forging into method "d"
4 FetchAlertsTak.java:68:Log forging into method "d"
5 FetchAlertsTak.java:76:URL injection into method "<init>"
For the Rain Monitor app, which we introduced in the description of category I6 of the OWASP IoT Top 10 in Sect. 4, we instructed Julia's GDPR analysis with the sources of sensitive data about the car. Julia's taint analysis then issues the five warnings about potential injections reported in Listing 12. These correspond to the privacy issues informally discussed in the aforementioned section, and reported in Listing 9. The first warning reports that sensitive data about the vehicle flows into method execute, which creates an HTTP request. Moreover, the status of the HTTP request and the status of the windshield get logged into a file, as shown in the code snippet of Listing 9. The analysis catches these issues in the second and third warnings. Instead, in the code snippet of Listing 10, sensitive data (latitude and longitude) is read from the CAN, logged at line 68 (fourth warning), and later concatenated into a URL at line 76 (fifth warning). The latter points to a remote Web service that tracks the position of the car and the weather. Clearly, this is potentially a privacy breach. In conclusion, the privacy checker issues five injection warnings on Rain Monitor, and they are all true alarms, although inherent to the main functionality the app performs.

Insecure ecosystem
To design and demonstrate the capabilities of the Insecure Ecosystem checker, we scanned GitHub for repositories containing IoT systems made up of several interacting programs. We ended up selecting five repositories based on Android Things, where an edge program and an Android application communicate through some channels. Android Things supports cloud, Bluetooth, and Near Field Communication (NFC) connectivity between different IoT components. We selected the repositories that have at least an Android or Web application along with an Android Things application (that is, an edge program). This narrowed the available repositories, since the majority are functionally repetitive, and many did not compile because of missing resources or incorrect Gradle build files.
In particular, we selected IoT systems communicating through Google Firebase [33] (Doorbell [63] and Electricity Monitor [28]), Near Field Communication (Color Thing [37]), Bluetooth (Bluetooth Low Energy (BLE) fun [51]), and Internet (Robocar [73]). In the rest of this section, we discuss the results of the analysis when applied to these IoT programs.

Firebase: Doorbell and electricity monitor
Most IoT systems rely on different cloud services as the communication channel between the edge software and the mobile applications. A very popular choice is Google Firebase [33]. We have considered two different IoT systems that communicate through Firebase: Doorbell [63] and Electricity Monitor [28]. Android Things Doorbell [63] implements a smart doorbell that captures the image of the visitor who presses the bell button. The picture obtained from the camera is processed through Google's Cloud Vision API; the edge software then uploads it to a Firebase database, together with Cloud Vision annotations and metadata. The companion Android app accesses the database and presents data to the user. For the analysis of this program, we tagged as sink and source of the communication channel StorageReference.putBytes (called at line 181 of DoorbellActivity) and FirebaseStorage.getReferenceFromURL (called at line 88 of DoorbellEntryAdapter), respectively. Furthermore, we specified ImageReader.acquireLatestImage as external source (i.e., the Android API that retrieves a camera image, called at line 162 of DoorbellActivity) and GlideRequests.load as external sink (i.e., the method of the mobile app that displays an image, called at line 91 of DoorbellEntryAdapter). The programs, along with the Excel spreadsheet tagging these sources and sinks, are passed to Julia's GDPR checker. Figure 3 reports the flow graph produced by the taint analysis, where the results on the two programs (Thing and Android App) have been connected. In addition, the bold arrow represents a data flow between components. This result shows that the edge software retrieves the image and stores it in Firebase, where the mobile app retrieves it to show it to the user. Therefore, our approach detects the information flow from source (picture taken from a camera) to sink (image shown in a mobile app).
The second IoT system in this category is Electricity Monitor [28], which tracks the availability of electricity and notifies the user about blackouts in an Android app. It uses Firebase as the communication means between the edge software and the mobile app. Similarly to Doorbell, we add as source and sink of the communication channel the methods DataSnapshot.getValue (called at line 74 of OverviewPresenter) and DatabaseReference.setValue (called at line 41 of ElectricityMonitorActivity), respectively. The external source is the ElectricityLog instance in field ElectricityMonitorActivity.electricityLog (since it keeps track of the status of electricity). The external sink is setIsPowerOn of ElectricityViewModel, since the information contained there is shown to the user of the mobile app. Note that we might have chosen additional external sinks, since the view model contains several other fields, but we focused on a single specific value since the other cases would be identical. The analysis of these programs, with this configuration, generates the flow in Fig. 4. It shows that the edge software accesses the log of the status of electricity and immediately retransmits it to Firebase; moreover, the mobile app accesses this data and shows the electricity status to the user, by passing this data to the view model.

Near Field Communication: color thing
Near Field Communication (NFC) allows short-range wireless connectivity. The Color Thing program [37] relies on NFC to allow communication between an edge program and a mobile app that changes the colors of some LEDs. Figure 5 shows that the Android app reads the user input, elaborates an adequate payload transforming the user input into an RGB color, and transmits it through NFC; the edge software receives this input, processes the value, and transmits a coherent value to the hardware device to set the LEDs' color.

Bluetooth: BLE fun
Bluetooth is another way of communication between nearby devices. The Bluetooth Low Energy (BLE) fun-Android (Things) program [51] relies on Bluetooth Low Energy technology to communicate between an Android Things program and a mobile app. It simply sends a counter from the edge software to the Android app. For this communication channel, source and sink are the second parameter of the event listener BluetoothGattCallback.onCharacteristicRead, accessed at line 83 of GattClient, and the parameter of BluetoothGattServer.sendResponse, called at line 112 of GattServer, respectively. The initial value of the counter is read by calling SharedPreferences.getInt at line 19 of AwesomenessCounter and is consequently tagged as external source. The external sink is Button.setText, called at line 37 of InteractActivity, since it shows the value to the user of the mobile app. Figure 6 reports the analysis results: the edge software accesses a counter stored in shared preferences and transmits it, after some computation, through Bluetooth; the mobile app receives it and shows it to the user.

Internet: Robocar
As a last example, we considered an IoT system that allows one to drive a little, remote-controlled, autonomous car built with Android Things [73]. The system consists of a mobile app to control the car and of some edge software that sends driving instructions to the car. The two programs communicate through standard HTTP requests and responses. The repository provides specific interfaces to send and receive data. Hence, source and sink of the communication channel are the parameter of HTTPRequestListener.onSpeed(RobocarSpeed) and the value passed to RobocarClient.setSpeed(int,int), respectively. In addition, the external source is the second parameter of GameControllerActivity.handleJoystickButtonEvent(View,MotionEvent), since this activity manages the joystick of the mobile app. The external sink is LocahostDriver.changeSpeed, called at line 212 of MainActivity. Figure 7 reports the result of the analysis that, this time, did not find any explicit flow from the external source to the external sink. Although the edge software does receive data from the communication channel and sends it to the external sink (as the figure reports), the mobile app does not send the user input itself to the Internet: it just performs some checks of user input (retrieved through the external source) and sends distinct constant values based on such checks (see GameControllerActivity.handleJoystickButtonEvent). Hence, there is no explicit flow from the external source to the communication channel, while there is an implicit flow, not detected by taint analysis. From our point of view, this means that the program correctly translates the user input into sanitized (namely constant) values. In that way, the software prevents the user (and potentially an attacker) from sending arbitrary speed values that might damage the device.
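The difference between an explicit and an implicit flow in this case can be illustrated with a minimal sketch. The class name, the thresholds, and the speed constants are hypothetical and are not taken from the Robocar repository; the sketch only mirrors the pattern described above, where user input is checked and a constant is sent.

```java
// Hypothetical sketch: user input influences the result only through branching.
class JoystickSanitization {

    // Illustrative constants; the actual values used by Robocar are not given in the text.
    static final int STOP = 0;
    static final int SLOW = 40;
    static final int FAST = 100;

    /** Maps a raw joystick reading to one of three fixed speed values. */
    static int speedFor(float axis) {
        // The reading is only compared against thresholds: no value computed from it
        // is returned, so there is no explicit (data) flow from input to output.
        if (axis > 0.5f) {
            return FAST;
        }
        if (axis > 0.1f) {
            return SLOW;
        }
        return STOP;
    }

    public static void main(String[] args) {
        // Even an out-of-range reading can only select one of the constants.
        System.out.println(speedFor(1_000_000f));
        System.out.println(speedFor(-3f));
    }
}
```

A taint analysis that tracks only data dependencies treats the returned constants as untainted, while the choice among them still depends on the input through the branch conditions, which is the implicit flow mentioned above.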

Conclusion
In this paper, we discussed how static analysis can be adopted to prevent security vulnerabilities in IoT systems. In particular, we started by discussing how this type of tool is currently applied to prevent OWASP Top 10 vulnerabilities in Web applications. We then analyzed OWASP IoT Top 10, and discussed how these vulnerabilities are related to OWASP Top 10 and which existing static analyses can be reused to detect IoT vulnerabilities. We then introduced two extensions of an existing industrial static analyzer (Julia) and presented some preliminary experimental results.
Overall, six out of the first seven vulnerabilities listed by OWASP IoT Top 10 can be covered at least partially by static analysis means. Five of these types of vulnerabilities are already covered by Julia's checkers, but in some cases the coverage is partial, since the IoT scenario introduces novel complexity (in particular, communications between different software layers) that was not present in Web applications. Therefore, we extended these analyses (in particular, about leakages of sensitive data coming from sensors, and insecure ecosystem interfaces passing tainted data between different software layers) to address this scenario. The experimental results show that the proposed extensions are capable of fully detecting several issues related to these categories.
Funding Open access funding provided by Università Ca' Foscari Venezia within the CRUI-CARE Agreement.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.