Tracing security requirements in industrial control systems using graph databases

We must explicitly capture relationships and hierarchies between the multitude of system and security standards requirements. Current security requirements specification methods do not capture such structure effectively, making requirements management and traceability harder, consequently increasing costs and time to market for developing certified ICS. We propose a novel requirements repository model for ICS that uses labelled property graphs to structure and store system-specific and standards-based requirements using well-defined relationship types. Furthermore, we integrate the proposed requirements repository with design-time ICS tools to establish requirements traceability. A wind turbine case study illustrates the overall workflow in our framework. We demonstrate that a robust requirements traceability matrix is a natural consequence of using labelled property graphs. We also introduce a compatible requirements change management procedure that aids in adapting to changes in development and certification schemes.


Introduction
Industrial control systems (ICS) deployed in critical infrastructures rely heavily on security and safety features. Therefore, security requirements specification and management is a crucial part of the overall requirements engineering process of ICS. Critical components deployed in ICS must comply with security standards. ICS-specific security standards such as IEC 62443 [22] provide robust yet generic security requirements, and it can be difficult to determine how such requirements apply to a specific project. The certification process around security standards also requires multiple parties to concur on a set of security requirements specific to a particular product. Therefore, stakeholders must agree on a security requirements engineering approach in a security certification process that is feasible for all the parties involved: users, vendors and certifying authorities. Correct mapping of system requirements to security standards requirements is critical to ensure the security of the whole process [38].
Current requirements elicitation and specification approaches are not rigorous enough to scale to the development of certified large-scale ICS, which must evolve continuously as technology advances. Security requirements specification techniques such as SIREN [28,53] focus on reusing requirements with the help of repositories. However, they use textual and semi-formal approaches that are laborious to express and comprehend. Security standards are also written in natural language, and requirements specifications are abstract and often ambiguous [6,13,16]. For example, a requirement such as "The ICS shall ensure the security of its critical parameters through the use of cryptography." is abstract and vague since it does not expand terms such as "critical parameters" and "cryptography".
Security standard requirements must be mapped to product-specific system security requirements usually contained in a Cyber Security Requirements Specification (CSRS) [17] document. This document acts as a stepping stone to cater for secure-by-design practices by detailing the security requirements and the capability levels across the ICS zones. The interpretation of the standard against system security requirements also presents a unique challenge regarding the correct mapping of requirements [31]. The current ICS security requirements engineering practices do not offer formal and expressive specification techniques to map the system's requirements to ICS-specific security standards. An expressive formal requirement provides the ability to convey the detailed meaning of a security requirement and remove any ambiguity that is inherent in the natural language.
Furthermore, [3] discusses the extension of security requirements engineering methods to the ISO 27001 [8] security standard. However, ISO 27001 is a general information technology standard, not specific to ICS. A recent study [10] discusses the integration of security standards into the ICS development life cycle; however, it lacks detail on how security requirements are mapped to security standards. Moreover, [4,12] discuss compliance with security standard requirements, yet generally focus on security requirements verification of ICS end products. The lack of mappings between system and security standard requirements reduces the traceability of security requirements during the development of security-certified ICS components.
We addressed the problems stated above by exploring the following research questions: RQ1: What techniques can manage and express the relationship between system and security standard requirements? RQ2: How can security standard requirements be integrated with the design and implementation of ICS applications to achieve end-to-end traceability?
We followed an adapted case study method to address these questions. A safety-critical wind turbine system was modelled and implemented to study the various issues in managing and tracing security requirements in ICS. An overview of this solution appears in Sect. 2.
The primary contributions of this article are as follows: 1. We propose a novel repository model that stores CSRS and IEC 62443-4-2 security requirements as labelled property graphs (LPGs) in multiple partitions, while emphasising requirements structure and relationships. 2. We present the formal definition of the IEC 62443-4-2 extended requirements structure that helps select standard cryptographic primitives to guide the implementation of IEC 62443-4-2 requirements. 3. We propose and demonstrate a process to integrate the repository with ICS design tools to support end-to-end requirements traceability.
The rest of this article is organised as follows. Section 3 presents the background of the techniques and tools used in the solution, while Sect. 4 reviews the related literature and current works. Section 5 provides a formal definition of the extended IEC 62443-4-2 requirements structure, which relates to contribution 2 listed above. Section 6 presents the method to generate LPGs for IEC 62443-4-2 and the CSRS. Section 7 demonstrates the repository architecture and implementation, which relates to contribution 1. Section 8 demonstrates design-time ICS tool integration for requirements traceability purposes, which relates to contribution 3. Section 9 presents the results and discussion, while Sect. 10 discusses some of the limitations and validity threats of our proposed solution. Section 11 concludes the article and proposes future work. Figure 1 shows an overall process that integrates the proposed repository model with the design-time tools for ICS. Such integration allows the security requirements in the repository to be linked with the design and the implementation, enabling end-to-end requirements traceability.

Overview of the proposed solution
A safety-critical wind turbine ICS case study is adopted to demonstrate the repository usage and its associated traceability process. Multiple Programmable Logic Controllers (PLCs) are installed in a master-slave topology in the wind turbine system. A master PLC, usually placed at the wind turbine base, commands the slave PLCs in the nacelle. Moreover, data from the PLCs are transmitted to the control system and subsequently to Supervisory Control and Data Acquisition or Enterprise Systems. The communication between master and slave PLCs is critical because it essentially controls the physical processes of the wind turbine. The control device network can be compromised if an attacker can install a rogue device between master and slave PLCs to sabotage the operation of components in the nacelle and the pitch gears. Such an ICS demands trustworthy communication between all its components. Table 1 shows the CSRS extract of system security requirements of a wind turbine system derived for PLCs arranged in a master-slave topology. Column 1 of Table 1 shows the respective security goal to be achieved for the system. RID in column 2 represents the requirement and sub-requirement identifier in the CSRS, while "security level" in column 4 represents the intended security level with respect to IEC 62443-4-2. For example, CR is the overall requirement for confidentiality, which is broken down into CRa and CRb to state the precise requirements for communication and critical parameter storage. An analyst can choose a particular security level for each sub-requirement based on the system constraints.

Fig. 1 Process overview of creating an LPG security requirements repository and its association with secure-by-design and traceability tools. RId and SLId in stage 3 refer to requirement and secure link id, respectively, while TORUS [49] is a requirement traceability tool

Table 1 CSRS extract of the system security requirements of the wind turbine system
Security goal | RID | Requirement | Security level
... | ... | ... | SL-C 2
Confidentiality | CRb | Critical parameters shall not be persisted on the master and slave PLCs in order to ensure the confidentiality of data for devices discharged from the system. | SL-C 4
Authentication | AR | Any access to the PLC (master/slave) shall be provided after appropriate authentication based on role-based identification. |
Integrity | IR | The system shall ensure the integrity of ingress and egress data. |
Integrity | IRa | Communication between master PLC and external components shall use appropriate methods to ensure the integrity of the data. | SL-C 4
Integrity | IRb | Communication between master and slave PLCs shall support communication integrity checks. | SL-C 4
The first stage, requirements graph generation as shown in Fig. 1, involves creating LPGs of the CSRS for the wind turbine system and of the security standard. The wind turbine security requirements listed in Table 1 form the CSRS. An LPG is created based on these requirements, where each requirement or sub-requirement, such as CRa, CRb, AR, IRa and IRb, is a graph node. Structuring the security requirements from multiple specification documents in this way helps to highlight the relationships that exist between the requirements. The second stage in Fig. 1 creates the repository by storing the LPGs created in the previous step in a graph database tool. An LPG provides a structure that is useful for storing and managing the repository efficiently. A key advantage of using an LPG is that the relationships between data are computed and stored at the database creation stage [44,45]. The repository enables querying of complete requirements by specifying graph patterns such as tree, chain, chain-set and forest structures [43]. Furthermore, the result sets can be filtered based on requirement identifiers and properties associated with nodes and edges of an LPG [42,47]. In this article, we utilise the querying ability provided by the graph database tool. This enables us to obtain a holistic view of a security requirement's hierarchy and its dependent requirements. Furthermore, the multiple partitions maintained in the proposed repository are highly reusable by vendors and the ICS development community.
As shown in Fig. 1, the third stage, Requirements↔Design linkage, uses TORUS [49] for requirements traceability. TORUS is a requirement traceability tool that uses splices to link the requirements to their implementation. TORUS's splice metadata maintains the repository link using a requirement identifier (e.g. IRa from Table 1) from the CSRS of the wind turbine system. Used together with the repository, TORUS acts as a bridge that provides traceability between the requirements in the security repository and the application design.
The final stage of design tool integration uses secure links, a secure-by-design tool for ICS [52]. The work in [52] briefly introduces the idea of integrating secure links and TORUS with a requirements repository; we consolidate the concept by demonstrating its practical use with the LPG repository proposed in this article. This is achieved by storing the secure link and the wind turbine CSRS requirement identifiers (such as those listed under the RID column of Table 1) in a TORUS splice, enabling end-to-end requirements traceability. The application of the proposed repository model shows that it produces a transparent and detailed traceability matrix that is beneficial in the verification and validation of security requirements.
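The linkage between requirement identifiers and secure link identifiers can be pictured with a minimal sketch. The record layout below is an assumption made purely for illustration and is not the actual TORUS splice format; the secure link identifiers and design element names are hypothetical values, while the requirement identifiers are taken from Table 1.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpliceRecord:
    """Hypothetical splice metadata: one link from a requirement to the design."""
    rid: str             # CSRS requirement identifier, e.g. "IRa" from Table 1
    slid: str            # secure link identifier (illustrative value)
    design_element: str  # design artefact name (illustrative value)


def traceability_matrix(records):
    """Group the design-side links by requirement identifier."""
    matrix = defaultdict(list)
    for rec in records:
        matrix[rec.rid].append((rec.slid, rec.design_element))
    return dict(matrix)


def untraced(requirement_ids, matrix):
    """Return the requirement identifiers that have no link to the design."""
    return [rid for rid in requirement_ids if rid not in matrix]


records = [
    SpliceRecord("IRa", "SL-01", "MasterPLC.ExternalChannel"),
    SpliceRecord("IRb", "SL-02", "MasterPLC.SlaveChannel"),
    SpliceRecord("IRb", "SL-03", "SlavePLC.MasterChannel"),
]
matrix = traceability_matrix(records)
missing = untraced(["CRb", "AR", "IRa", "IRb"], matrix)
```

In this sketch, `matrix` plays the role of a traceability matrix keyed by requirement, and `missing` lists the requirements that still lack a design link.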
A detailed requirements change management process, illustrated in Sect. 9, supports adding, removing and modifying requirements in the repository. The results also show a reduction in the effort needed to extract requirements from the security standards. We also show that an LPG-based repository supports efficient requirements traversal and analytics using the Cypher [14] query language.

Background
The IEC 62443-4-2 standard, Technical requirements for ICS components, is the focus of this research. This standard describes component requirements (CRs) derived from the seven foundational requirements (FRs) defined in IEC 62443-1-1 [19]. The FRs are (1) identification and authentication control, (2) use control, (3) system integrity, (4) data confidentiality, (5) restricted data flow, (6) timely response to events and (7) resource availability. Each subsequent part of the IEC 62443 series expands FRs into sub-requirements within the context of various dimensions related to ICS security such as security management system, solution suppliers, risk assessment, system security and component security. Sub-requirements can be standalone or can be enhanced by Requirement Enhancements (REs) that determine the level of security for a particular requirement. REs are meant to provide additional security for a particular requirement implementation. Table 2 provides an excerpt of the IEC 62443-4-2 security requirements composition. For example, CR 4.2 (Information persistence), a sub-requirement of FR4 (Data confidentiality), provides two REs requiring erasure and erase verification of critical device data in the ICS devices. The standard requires implementation of these REs to achieve higher security levels.
The IEC 62443 standard defines four capability security levels (SL-C) that depend upon the severity of the attack and the attacker's skills, resources and motivation level. For example, SL-C 1 is described as protection against passive attacks, such as eavesdropping or casual exposure, and unintentional incidents. Higher SL-Cs are characterised by deliberate attempts to inflict harm on an ICS. Since an attacker's ability can vary depending upon the degree of resources, skills and motivation, SL-Cs assist in categorising the ability of an attacker to inflict harm. Therefore, SL-C 2, 3 and 4 relate to low, moderate and high ability, respectively.
The presence of one or more REs is usually associated with an enhanced SL-C. However, in some cases, implementing a CR with no REs may suffice for the maximum SL-C. For example, as shown in Table 2, CR 2.1 (Authorisation enforcement) is a sub-requirement of FR2 (Use control) of IEC 62443-4-2 and has four associated REs. Figure 2 shows the mapping of the CR 2.1 REs to SL-Cs. To achieve SL-C 1, CR 2.1 needs to be implemented in its basic form. RE 1 and RE 2 should be implemented along with the base CR 2.1 to achieve SL-C 2. Moreover, CR 2.1 with RE 1, 2 and 3 should be implemented for SL-C 3, and for SL-C 4, RE 4 should also be implemented.
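The CR 2.1 mapping described above can be written down as a small lookup. This is only a sketch of the mapping stated in the text; the function names are our own and are not part of the standard.

```python
# REs required on top of the base CR 2.1 for each capability security level,
# following the mapping described in the text (Fig. 2).
CR_2_1_REQUIRED_RES = {
    1: [],
    2: ["RE1", "RE2"],
    3: ["RE1", "RE2", "RE3"],
    4: ["RE1", "RE2", "RE3", "RE4"],
}


def required_elements(target_slc):
    """Return the base CR plus the REs needed to reach the target SL-C."""
    if target_slc not in CR_2_1_REQUIRED_RES:
        raise ValueError(f"unknown SL-C: {target_slc}")
    return ["CR 2.1"] + CR_2_1_REQUIRED_RES[target_slc]


def achieved_slc(implemented_res):
    """Return the highest SL-C whose required REs are all implemented.

    Assumes the base CR 2.1 itself is implemented, so SL-C 1 is always met.
    """
    implemented = set(implemented_res)
    return max(
        slc for slc, res in CR_2_1_REQUIRED_RES.items() if set(res) <= implemented
    )
```

For instance, `achieved_slc(["RE1", "RE2", "RE4"])` evaluates to 2, because RE3 is missing and both SL-C 3 and SL-C 4 depend on it.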
The Cyber Security Requirements Specification [17] contains the overall security requirements of a system, covering its security aspects. The security requirements cover multiple sub-systems across the different ICS layers. The CSRS can be developed as a separate artefact or integrated with the existing requirements specification of a system. Current literature lacks details about the composition of the CSRS. Its characteristics are briefly discussed in [17], which closely relates it to the system requirements specification. Nevertheless, requirements representation in the CSRS would face the same issue as generic requirements specification, i.e. the lack of formal methods. In [26], the authors note that industry practitioners still prefer natural language to specify safety requirements and that formal specification techniques are rarely used. There are methods such as [37] that use semi-formal techniques like UMLsec to verify the architecture and design. However, their empirical affinity with ICS design and implementation is yet to be seen. Integrating standards with important system security requirements, to improve requirements verification and requirements reuse, is therefore an important criterion for achieving security certifications for critical systems [27]. Moreover, IEC 62443-4-1 [1] recommends that security requirements specifications shall be traceable to the ICS design and implementation, which needs to be explored with regard to the CSRS.
A labelled property graph database uses a graph structure for storing and managing data, allowing the modelling of real-world entities as nodes and edges [46]. Nodes are used to store data, and relationships between data are stored as edges [42,44,47]. In the LPG database, nodes and edges have labels associated with them. Labels serve as a medium to associate nodes and edges with certain groups. An important characteristic of LPGs is that nodes and edges can also have associated properties [43-46]. Properties exist in the form of key-value pairs, which serve as an internal structure for storing data in an LPG database. We illustrate a comprehensive example of an LPG containing requirement and requirement-enhancement nodes along with their properties in Fig. 3, Sect. 5.
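As a rough illustration of these concepts, the following sketch models an LPG in memory with labelled nodes and edges that both carry key-value properties. The class names, the label strings and the edge label `HAS_ENHANCEMENT` are hypothetical; the CR 4.2 values follow the description given in this article.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                                      # groups the node, e.g. "Requirement"
    properties: dict = field(default_factory=dict)  # key-value pairs


@dataclass
class Edge:
    source: str
    target: str
    label: str                                      # relationship type
    properties: dict = field(default_factory=dict)  # edges carry properties too


class PropertyGraph:
    """Minimal in-memory labelled property graph."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source, target, label, **properties):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(source, target, label, properties))

    def neighbours(self, node_id, edge_label=None):
        """Follow outgoing edges, optionally restricted to one edge label."""
        return [
            self.nodes[e.target]
            for e in self.edges
            if e.source == node_id and (edge_label is None or e.label == edge_label)
        ]


graph = PropertyGraph()
graph.add_node("CR4.2", "Requirement", title="Information persistence")
graph.add_node("CR4.2-RE1", "RequirementEnhancement", title="Data erasure")
graph.add_edge("CR4.2", "CR4.2-RE1", "HAS_ENHANCEMENT", slc=2)
```

Because the relationship is stored as an edge, finding the enhancements of CR 4.2 is a direct lookup via `graph.neighbours("CR4.2", "HAS_ENHANCEMENT")` and needs no join-like computation.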

Related works
A literature review was conducted by selecting the works most relevant to our proposed approach. We examine the existing methods of storing requirements in a requirements repository in diverse database technologies. The relevance of these methods to our research is discussed below.
The concept of a requirements repository has been proposed for requirements reusability in various processes and models [28,53]. However, both methods use repositories or catalogues in textual or semi-formal forms [50]. Also, [34] finds that most requirements reuse techniques are based on textual copying and that there is a direct relationship between requirements reuse and the adopted technology. Moreover, methods that extensively adopt requirements repositories such as [20,40,53] use them to construct requirement specification documents. The usability of the repository decreases at the design and implementation stage because the requirements must be linked manually. SREP [28] is closely related to our research in that it integrates the Common Criteria (ISO/IEC 15408) security standard in the early stages of the software development process. One of the fundamental techniques used by SREP is a repository containing assets, threats and security requirements for reuse. A model to integrate safety and security requirements to aid the security certification process is presented in [7], but its integration with security/safety standards is not demonstrated empirically. Another method for early requirements elicitation is presented in [18]. It uses a heuristics tool for the elicitation of Common Criteria requirements and uses UMLsec for the design of security requirements, which traces back to the security requirements. It recommends using a requirements repository for storage, although the repository is not part of the model. Also, the method in [18] does not explore ICS design and its related development standards such as IEC 61499.
We explore using a security standard based requirement repository for the later phases of the ICS product development covering the whole spectrum of development life cycle contrary to SREP that uses the repository at the early stages of development.
After going through the literature that uses requirements repositories extensively, it can be observed that these approaches do not particularly emphasise how data are being stored inside the repositories. Most of the repositories store data in the textual form written in natural languages. Essentially this means that the repository does not capture links between requirements rendering repositories less reliable [55].
An approach to store security requirements in a schemaless XML database has been discussed in [29]. However, this approach is semi-automated and only supports the representation of a tree-structured graph data model. Furthermore, having an entirely schema-less graph database can increase the chances of data corruption [35,36].
A schema-based approach to organising information related to security standards has been discussed in [30]. The discussed schema uses a hierarchical structure to organise information related to components of security objectives. The authors have used a relational database to manage and store information, and they use Structured Query Language (SQL) to retrieve information stored in the database. A relational query language like SQL may result in a longer response time when data is highly interconnected. Graph databases are efficient in storing and managing highly interconnected data [43,44,47]. Furthermore, relationships between data are calculated at the database creation stage and are therefore not recalculated when data are retrieved from the database [43]. In a relational database, by contrast, relationships are computed at query time, which can become a problem when the requirement set is large and the relationships or hierarchies between the requirements are modified frequently.
The use of the Resource Description Framework (RDF) for requirements engineering and management is proposed in [2,11,39]. RDF is a W3C recommendation for data exchange over the web and provides the query language SPARQL. However, storing data as an RDF graph often results in a dense graph [9]. Furthermore, RDF graphs only support the storage of properties in nodes of the graph [43-47]. The use of LPG databases is advantageous in this context, as metadata related to nodes and edges can be stored as node and edge properties in an LPG database [43,45,46].
The current approaches to storing requirements using relational or XML databases do not concentrate on the relationships between requirements. However, ICS is evolving rapidly as a consequence of technological evolution. Due to their semantic rigidity, such methods also do not scale well to changing requirements in ICS development scenarios. Moreover, current literature does not focus on integrating system and ICS security standard requirements and their relationships, which is essential in ICS security certifications.
We address the lack of emphasis on robust requirements relationships by storing and integrating CSRS and IEC 62443-4-2 security requirements in the form of a repository. Integration of requirement sets requires a properly defined relationship between requirements and any obligatory sub-requirements, especially in safety-critical ICS projects. LPG databases help produce highly connected requirements since graph-based solutions emphasise entities and their relationships rather than individual entities and requirements.

Extending IEC 62443-4-2 requirements structure for standard implementations
This section discusses the IEC 62443-4-2 requirements structure and proposes extending it by providing directions for requirements implementation using industry-standard methods. Furthermore, a formal definition of the extension in terms of LPGs is specified, which relates to contribution 2 listed in Sect. 1. We extend the basic structure of the IEC 62443-4-2 requirements specification by adding guidelines regarding standard cryptographic methods and algorithms for ICS security certifications. IEC 62443-4-2 specifies CRs at an abstract level. The detailed implementation of such requirements depends on the capability of components. A CR may be implemented using different security methods based on the required level of security. Therefore, "standard" methods and algorithms are added as implementation guidelines for a possible satisfaction of a particular requirement. Moreover, a requirement may further direct the use of another security standard for selecting methods and algorithms, thus forming a chain of security standards based on which a particular requirement is realised [31].
The requirement structure is divided into two parts, i.e. the inherent requirements in the standard and the standard implementations of such requirements in the form of methods and associated algorithms. For example, the general practice uses the Message Authentication Code (MAC) method to enforce integrity. A MAC can in turn be implemented with various algorithms, for example as a keyed hash (HMAC) built on SHA1, SHA256 or SHA512. Some requirements may be implemented by external components such as anti-virus/malware tools or intrusion detection and prevention systems.
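The distinction between a method and its algorithms can be illustrated with Python's standard `hmac` and `hashlib` modules. This is a minimal sketch in which the MAC method is realised as HMAC and the hash algorithm is a selectable parameter; the key and message are illustrative values, and the helper names are our own.

```python
import hashlib
import hmac

# Algorithms available for the MAC method in this sketch.
SUPPORTED_HASHES = {
    "SHA1": hashlib.sha1,
    "SHA256": hashlib.sha256,
    "SHA512": hashlib.sha512,
}


def compute_mac(key: bytes, message: bytes, algorithm: str = "SHA256") -> bytes:
    """Compute an HMAC tag over the message with the chosen hash algorithm."""
    return hmac.new(key, message, SUPPORTED_HASHES[algorithm]).digest()


def verify_mac(key: bytes, message: bytes, tag: bytes, algorithm: str = "SHA256") -> bool:
    """Check a tag in constant time to avoid leaking timing information."""
    return hmac.compare_digest(compute_mac(key, message, algorithm), tag)


key = b"demo-key-not-for-production"
message = b"pitch_angle=12.5"
tag = compute_mac(key, message)
```

The same method (MAC) is kept while the algorithm changes with the `algorithm` argument, which mirrors the method and algorithm levels of the extended requirement structure.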
In this article, we use LPGs for formally defining the IEC 62443-4-2 requirements structure. As mentioned in Sect. 3, nodes and edges in an LPG have labels associated with them. Let $L_N$ be a set of node labels and $L_E$ be a set of edge labels such that $L_N \cap L_E = \emptyset$.
Nodes and edges in an LPG also have properties associated with them. Properties exist in the form of key-value pairs where property values are atomic entities. Let $K$ be a set of keys (e.g. id, isValid, etc.) and $V$ be a set of values (e.g. 1426, TRUE, etc.). We define a set of properties $P \subseteq (K \times V)$ [43,45].
- R is a label for requirements
- RE is a label for requirement enhancements
- M is a label for methods
- A is a label for algorithms
- $\xi : E \to L_E$ is an edge labelling function which maps all edges to labels in the set of edge labels $L_E$.
- $\rho : (N \cup E) \to 2^P$ is a property labelling function which maps each node or edge to a subset of the property set $P$.
- $E$ is restricted such that for any $n_1 \to n_2$ [...]
Figure 3 illustrates the formalism shown in Definition 1. The IEC 62443-4-2 requirement CR4.2 requires a component to erase all the information if it is being discharged from an ICS. It has two REs: RE1, simple data erasure, for SL-C 2, and a further RE, erasure verification, to achieve SL-C 4. Fig. 3 shows only an excerpt of CR4.2; for simplicity, only one RE appears in the graph. The standard NIST SP 800-88: Guidelines for media sanitisation [21] suggests clear, purge or destroy methods to achieve data erasure. To implement such methods, a variety of data wiping algorithms, such as the Schneier and Gutmann algorithms, are available. One algorithm can be chosen for the implementation based on what is appropriate for the target ICS context and environment.
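The label and property constraints above can be checked mechanically. The sketch below is an illustration under stated assumptions: the node labels R, RE, M and A come from the definition, whereas the edge label names are hypothetical, the pairing of the clear method with the Gutmann algorithm is chosen only as an example, and the restriction on $E$ is not encoded.

```python
NODE_LABELS = {"R", "RE", "M", "A"}  # L_N: labels named in the definition
EDGE_LABELS = {"ENHANCED_BY", "IMPLEMENTED_BY", "REALISED_BY"}  # L_E: hypothetical names
ATOMIC_TYPES = (str, int, float, bool)  # property values must be atomic

# node id -> (label, properties); excerpt of CR4.2 with one RE
nodes = {
    "CR4.2": ("R", {"title": "Information persistence"}),
    "RE1": ("RE", {"title": "Data erasure"}),
    "Clear": ("M", {"source": "NIST SP 800-88"}),
    "Gutmann": ("A", {}),
}
# (source, target, label, properties)
edges = [
    ("CR4.2", "RE1", "ENHANCED_BY", {"slc": 2}),
    ("RE1", "Clear", "IMPLEMENTED_BY", {}),
    ("Clear", "Gutmann", "REALISED_BY", {}),
]


def check_graph(nodes, edges):
    """Return a list of violations of the label and property constraints."""
    problems = []
    if not NODE_LABELS.isdisjoint(EDGE_LABELS):
        problems.append("node and edge label sets overlap")
    for node_id, (label, props) in nodes.items():
        if label not in NODE_LABELS:
            problems.append(f"node {node_id}: unknown label {label}")
        for key, value in props.items():
            if not isinstance(value, ATOMIC_TYPES):
                problems.append(f"node {node_id}: non-atomic value for {key}")
    for source, target, label, props in edges:
        if source not in nodes or target not in nodes:
            problems.append(f"edge {source}->{target}: missing endpoint")
        if label not in EDGE_LABELS:
            problems.append(f"edge {source}->{target}: unknown label {label}")
        for key, value in props.items():
            if not isinstance(value, ATOMIC_TYPES):
                problems.append(f"edge {source}->{target}: non-atomic value for {key}")
    return problems
```

Calling `check_graph(nodes, edges)` on the excerpt returns an empty list, while an edge with an unknown label or a dangling endpoint is reported as a violation.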
The slow-changing nature of the IEC 62443 standard makes its LPGs highly reusable for other projects that need to fulfil the standard requirements. Therefore, it is beneficial to store them in a repository for later reference.

Security requirements LPGs
This section relates to the requirements graph generation stage shown in Fig. 1. This stage captures the security requirements from a CSRS and the IEC 62443-4-2 security standard. It involves organising the requirements in a hierarchical structure, which is especially challenging for large documents such as IEC 62443-4-2. This structure enables the linking of security requirements, which in turn supports intuitive traceability of requirements.

IEC 62443-4-2 property graph
We demonstrate the IEC 62443-4-2 security requirements LPG in the Neo4j [24] graph database tool to show the practical implementation of the formalism presented in Sect. 5. Although there are seven FRs in the standard, we show an example LPG specification of FR3, which forms the basis for achieving the integrity goal in ICS. FR3 is chosen due to its applicability to the integrity requirements IRa and IRb for the wind turbine system, as listed in Table 1. IRa requires data integrity between the wind turbine PLCs and external components, while IRb requires data integrity between master and slave PLCs. Both requirements relate to FR3, the system integrity requirement specified in IEC 62443-4-2.
The property graph for FR3 is shown in Fig. 4. IEC 62443-4-2 breaks down an FR into more than one CR. Furthermore, the LPG also illustrates the IEC 62443 security standard's requirement structure extension as described in Sect. 5. The IEC 62443-4-2 node in the graph specifies three sub-requirements CR3.1, CR3.2 and CR3.3. Furthermore, CR3.1 and CR3.3 have specific REs that determine the capability security level of the desired requirement. We have extended the graph by adding the methods and algorithms required to implement these requirements and REs. Each method or algorithm is derived from one of the various recognised security standards. For example, RE1 of CR3.1 can be implemented using digital signatures. Therefore, the security capability level for CR3.1 can be achieved using digital signature algorithms as advocated by the FIPS 186-4 standard. Similarly, RE1 of CR3.1 also implies the use of hashing and MAC methods. ISO 19790 [23] offers such algorithms. Implementation of these algorithms is further discussed in FIPS 180-4 and FIPS 198-1. Edges between a CR and its RE also contain the SL-C (not depicted in Fig. 4). This feature helps create a path, or extract a subgraph, that shows the selection of methods and algorithms for the desired SL-C. Extraction is made possible with the help of queries. Neo4j supports the Cypher query language to generate LPGs and extract the data stored in LPGs.
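The following sketch illustrates how an SL-C stored on a CR-to-RE edge could drive the extraction of implementation paths. It is an in-memory illustration, not the Neo4j graph used in this article. The SL-C value of 4 on the CR3.1 to RE1 edge follows the text; the concrete algorithm names (ECDSA, SHA-256, HMAC) are examples from the cited FIPS standards chosen by us, and the filter rule (edge SL-C not above the target) is an assumption.

```python
# (CR, RE, edge properties); the SL-C is stored on the edge
CR_TO_RE = [("CR3.1", "CR3.1-RE1", {"slc": 4})]

# RE -> methods, and method -> (algorithm, defining standard)
RE_TO_METHOD = {"CR3.1-RE1": ["Digital signature", "Hashing", "MAC"]}
METHOD_TO_ALGORITHM = {
    "Digital signature": [("ECDSA", "FIPS 186-4")],
    "Hashing": [("SHA-256", "FIPS 180-4")],
    "MAC": [("HMAC", "FIPS 198-1")],
}


def implementation_paths(cr_id, target_slc):
    """List (CR, RE, method, algorithm, standard) paths usable at the target SL-C.

    Assumed rule: an RE is included when the SL-C on its edge does not
    exceed the target SL-C.
    """
    paths = []
    for source, re_id, props in CR_TO_RE:
        if source != cr_id or props["slc"] > target_slc:
            continue
        for method in RE_TO_METHOD.get(re_id, []):
            for algorithm, standard in METHOD_TO_ALGORITHM.get(method, []):
                paths.append((cr_id, re_id, method, algorithm, standard))
    return paths
```

With these data, `implementation_paths("CR3.1", 4)` yields three paths, one per method, whereas a target of SL-C 3 yields none because the only RE edge is labelled with SL-C 4.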

CSRS property graph
We use an LPG to organise and structure the CSRS security requirements. We use this data model because an LPG assists in embedding additional data related to nodes and edges as key-value pairs. One of the main advantages of using this technique is the compatibility with the formal syntax of the extended IEC 62443-4-2 requirements organisation proposed in Sect. 5. The structure of a CSRS incorporates system-level and component type requirements. A system-level requirement may be fulfilled with third-party systems, e.g. an obligation to install anti-virus software on the machines at a higher level of the ICS, or the provision of an intrusion detection and prevention system at ingress points of the system or at zone boundaries. On the other hand, the component type requirements deal with security provided by software or hardware components that in-house developers or third-party vendors may implement. An example of such a requirement is communication confidentiality between sensor nodes and the PLCs or between multiple PLCs in the wind turbine system. This article's primary focus is the component type requirements of the wind turbine system linked to IEC 62443-4-2 CRs.
A CSRS LPG of the wind turbine system is illustrated in Fig. 5. It primarily shows the component type requirements listed in Table 1. A CSRS document acts as the root node of a CSRS graph. The graph is divided into two branches of system-level and component type requirements. It is further sub-divided into confidentiality, authentication and integrity nodes to represent the security goals. Each goal has respective security requirement nodes. LPGs allow edges to be labelled. Therefore, the edge leading to each component type requirement in the CSRS is marked with a label for the SL-C of the individual requirement as shown in Table 1. Such an edge label acts as a filtering criterion for searching appropriate requirements in the IEC 62443-4-2 graph. The filter allows the extraction of requirements, and guidelines for their standard implementation, from IEC 62443-4-2, corresponding to the SL-C of the particular CSRS requirement.
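A simplified view of this hierarchy can be sketched as a parent map in which the SL-C sits on the edge leading to a requirement. Only the integrity branch is populated here, using the SL-C 4 values of IRa and IRb from Table 1; the placement of IRa and IRb under IR and the node names of the intermediate levels are assumptions made for illustration.

```python
# child -> (parent, SL-C label on the connecting edge, or None)
CSRS_PARENT = {
    "System-level": ("CSRS", None),
    "Component type": ("CSRS", None),
    "Confidentiality": ("Component type", None),
    "Authentication": ("Component type", None),
    "Integrity": ("Component type", None),
    "IR": ("Integrity", None),
    "IRa": ("IR", 4),
    "IRb": ("IR", 4),
}


def path_to_root(node):
    """Return the hierarchy of a requirement from the node up to the CSRS root."""
    path = [node]
    while node in CSRS_PARENT:
        node = CSRS_PARENT[node][0]
        path.append(node)
    return path


def slc_filter(requirement_id):
    """Return the SL-C on the edge into a requirement, used as a filter criterion."""
    return CSRS_PARENT[requirement_id][1]


def requirements_at(slc):
    """Return all requirements whose incoming edge carries the given SL-C."""
    return sorted(child for child, (_, level) in CSRS_PARENT.items() if level == slc)
```

Here `slc_filter("IRa")` gives 4, which is the value that would then be used to select matching requirements and implementation guidelines from the IEC 62443-4-2 graph.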

Security requirements repository
This section describes the proposed LPG security requirements repository (contribution 1 listed in Sect. 1). We define the requirements repository as the integration of CSRS and IEC 62443-4-2 property graphs. The conceptual architecture of the repository is discussed in terms of its logical and process views. The repository is also implemented with Cypher queries to realise the concept. Figure 6 shows the storage and integration process of CSRS and IEC 62443-4-2 graphs to promote their reusability for different ICS projects as discussed in Sect. 6. Individual graphs are created in isolation and stored in the repository. Cypher queries are used to extract the associated requirements from the IEC 62443-4-2 graph based on a CSRS requirement. This integration of graphs results in a unique requirements subgraph structure for each CSRS requirement. A graph database tool such as Neo4j is leveraged to store the graphs. Optionally, the resulting subgraphs can also be stored in the repository for later reference.

Repository architecture
Definition 2 (LPG repository) Given a CSRS graph $G_{csrs}$ and $n$ ($n \geq 1$) security standard requirements graphs $G_{s_1}, \ldots, G_{s_n}$, a requirements repository for a single system is defined as:
The combination of CSRS and IEC 62443-4-2 LPGs forms a repository. For a single system, an instance of a repository ($G_{rep}$) consists of a CSRS and one or more security standard graphs, as shown in Definition 2. In an overall context, the repository acts as a container of graphs that can also be used to collect trees from different projects. The graphs created from the security standard are static because the requirements of the standard are consolidated. The CSRS requirements graphs, however, are dynamic since the system requirements in a CSRS may change often. Therefore, the repository can be divided into two partitions, i.e. a static and a dynamic partition. Figure 7 shows an abstract view of the two logical partitions of the repository. Multiple CSRS property graph specifications belonging to different projects can be stored in the repository's dynamic partition, thereby increasing the size of the partition over time. Such a scheme allows one-to-many relationships between the security standard and CSRSs, reflecting common development practice. Therefore, different projects can take advantage of the static IEC 62443-4-2 graphs by linking them to multiple CSRSs, reducing the overall time and effort of requirements elicitation.
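The two logical partitions can be pictured with a small container class. This is only a sketch under stated assumptions: the `Repository` class, its method names and the edge-list representation are hypothetical and say nothing about how Neo4j stores the graphs internally.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    static: dict = field(default_factory=dict)   # standard graphs, keyed by name
    dynamic: dict = field(default_factory=dict)  # CSRS graphs, keyed by project

    def add_standard_graph(self, name, edges):
        # Stored as a tuple because the standard's requirements rarely change.
        self.static[name] = tuple(edges)

    def add_csrs(self, project, edges):
        # Stored as a list because system requirements may be revised often.
        self.dynamic[project] = list(edges)

    def instance(self, project):
        """One system's repository instance: its CSRS plus every standard graph."""
        if not self.static:
            raise ValueError("at least one standard graph is required (n >= 1)")
        return {"csrs": self.dynamic[project], "standards": dict(self.static)}


repo = Repository()
repo.add_standard_graph("IEC 62443-4-2 FR3", [("FR3", "CR3.1")])
repo.add_csrs("wind_turbine", [("CSRS", "IRa")])
repo.add_csrs("industrial_mixer", [("CSRS", "IR1")])  # hypothetical requirement id
```

Both projects obtain the same standard graphs from `instance`, which mirrors the one-to-many relationship between the static partition and the CSRSs in the dynamic partition.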

Repository implementation
For the wind turbine case study, Neo4j stores the CSRS and IEC 62443-4-2 property graphs in a single graph file that creates a logical repository. Both property graphs are created separately through the use of Cypher queries.
The Cypher query in Listing 1 is applied to create the wind turbine CSRS property graph described in Fig. 5. The IEC 62443-4-2 graph can be created using similar queries. Listing 2 shows an example query used to create the relationships between the IEC 62443-4-2 requirement nodes from Fig. 4. The SL-C labels on the edges between CRs and REs carry significant information. They are critical to filtering out the security requirements from the standard based on the capability security level specified in a CSRS requirement. For example, SL-C:4 on line 4 of Listing 2 specifies the label over the edge between CR3.1 and its RE1. It helps define a path to the associated cryptographic primitives required to achieve capability security level 4 for CR3.1. Listing 2 is a snippet of a larger query that is not included here due to space concerns. The built-in feature of Neo4j that executes and stores a set of queries in a designated database also complements the idea of our proposed repository. The results of an executed query become a part of the database, which allows registering different database views for later use. The wind turbine CSRS and IEC 62443-4-2 FR3 requirement graphs resulting from the queries in the listings are stored as separate entities in the same graph database. Therefore, both graphs can be merged, or the data can be extracted through subsequent Cypher queries. For example, the wind turbine's security requirement IRa is required to be implemented at SL-C 4 as specified in the CSRS. Listing 3 shows the Cypher queries to deduce the required standard

Design-time tool integration
This section discusses contribution three of this article listed in Sect. 1 that extends the use of the repository by integrating it with design-time and traceability tools to create practical and maintainable ICS security applications. We use a secure-by-design approach called "secure links" presented in [52] to link the CSRS LPG graph requirements to the design and implementation of IEC 61499 applications. Secure links development methodology proposes the abstractions for security requirements repository along with TORUS [49] for tracing requirements in ICS applications. We illustrate the detailed use of the LPG repository with the secure links for more practical purposes in Fig. 9.
The secure links approach creates linkages between security requirements and their realisations. Such linkages identify the function block implementation of a security requirement. This results in intuitive requirements tracing capabilities for designers and developers during system development and the requirement change process. Secure links and the repository complement each other: the security methods/algorithms on the leaf nodes of the IEC 62443 property graph in the repository and their function block implementation in the IEC 61499 ICS application act as anchors. Each secure link is identified by a unique identifier that references it. For example, a secure link SL(IRa, FBN_HMAC) specifies the requirement from the wind turbine CSRS, where IRa is an integrity requirement mentioned in Table 1. FBN_HMAC is the function block network implementing an integrity mechanism from the secure link security library, as shown in the design and implementation part of Fig. 9.
The secure links methodology inherently uses TORUS [49] to link secure links to requirements in the repository. TORUS is a tool that provides traceability for a requirement using its core feature called a splice. A splice contains a requirement identifier and its meta-data along with a secure link identifier for its implementation. In the current context, it uses the requirements nodes of the CSRS graph as security requirements. In this case, the meta-data may include abstract level information, e.g. requirement identifier, splice identifier, capability security level and type of security requirement (system or component), and precise information such as security method and algorithm. A design application can leverage a splice to pack the meta-data of a requirement as a link to the implementation. Therefore, the use of TORUS provides the ability to trace requirements from the requirements phase to the implementation phase. Figure 9 also illustrates the application of TORUS to bridge secure links with the repository to achieve end-to-end requirements traceability for IEC 61499 ICS applications. For example, the splice in TORUS stores the secure link identifier and the wind turbine requirement identifier IRa as the meta-data. Such a method can take full advantage of TORUS's requirement tracing capabilities. Cypher queries are used to obtain the recommended methods/algorithms from the repository based on the capability security level. Such queries extract the complete path from the root node of the CSRS down to the leaf nodes of the IEC 62443-4-2 graphs by joining both graphs. Subsequently, an appropriate method or algorithm can be selected to fulfil a particular security requirement. This can be achieved by writing queries such as the one shown in Listing 3 and extending them to filter the required algorithms/methods at the leaf nodes. The communication details between the repository and the secure links are left out of the scope of this article.
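A splice can be pictured as a small record that ties a requirement identifier to a secure link. The sketch below is an illustration only and does not reflect the actual TORUS data model or API. The class names, field names, splice identifiers and the SL-C assumed for CRb are hypothetical. The pairs (IRa, FBN_HMAC) and (CRb, FBN_ERASE) are taken from this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SecureLink:
    requirement_id: str  # CSRS requirement node, e.g. "IRa"
    fbn: str             # implementing function block network, e.g. "FBN_HMAC"

    @property
    def identifier(self):
        return f"SL({self.requirement_id}, {self.fbn})"


@dataclass(frozen=True)
class Splice:
    splice_id: str         # hypothetical identifier format
    secure_link: SecureLink
    sl_c: int              # capability security level from the CSRS
    requirement_type: str  # "system" or "component"


def trace_forward(splices, requirement_id):
    """Requirement -> function block networks that implement it."""
    return [s.secure_link.fbn for s in splices
            if s.secure_link.requirement_id == requirement_id]


def trace_backward(splices, fbn):
    """Function block network -> requirements it implements."""
    return [s.secure_link.requirement_id for s in splices
            if s.secure_link.fbn == fbn]


splices = [
    Splice("SP1", SecureLink("IRa", "FBN_HMAC"), sl_c=4, requirement_type="component"),
    # The SL-C of CRb is an assumption made for this example.
    Splice("SP2", SecureLink("CRb", "FBN_ERASE"), sl_c=4, requirement_type="component"),
]
```

Because each splice holds both identifiers, the same list supports a forward trace from requirement to implementation and a backward trace from implementation to requirement.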
Figure 10 shows a screen capture of our implementation of a proof-of-concept plug-in in the IEC 61499-based 4DIAC development environment [15]. The plug-in can list and identify potential secure links for inter-device mapped function blocks. Figure 10 shows two secure links for an IEC 61499 application, i.e. SL1 and SL2 highlighted in green. Users can select requirements from the repository and select an appropriate function block network based on the selected requirement using the drop-down lists in a secure links panel. The plug-in communicates with the repository using Cypher queries to obtain the recommended cryptographic algorithms for the requirement implementation. On the other hand, TORUS links the repository and the secure link plug-in semantically to enable requirements tracing.

Results and discussions

Requirements traceability for industrial control systems
Requirements traceability is a critical property for an end-to-end method of security application development in ICS. The complexity of such systems demands forward and backward tracing of requirements between the early phase of development and the implementation phase [5]. The use of secure links and TORUS with the repository allows security requirements to be mapped to IEC 61499 function blocks. Table 3 lists the TORUS splices we created based on the security requirements of the wind turbine system listed in Table 1. The last column shows the actual secure link containing the requirement identifier and the IEC 61499 Function Block Network (FBN) [54].
The current example shows the one-to-one relationship between a requirement and an FBN. However, two or more requirements may be implemented using a single FBN depending upon product requirements.
A requirements traceability matrix (RTM) essentially provides a map of relationships between requirements and product artefacts such as design, implementation and tests. Table 4 shows the resulting security RTM for the wind turbine system, which can be readily generated by expanding a TORUS splice from the requirement and secure link identifiers it contains. This can be done by following the steps of the proposed method in Sect. 7, i.e. generating the subgraph by combining the CSRS and IEC 62443-4-2 graphs through Cypher queries based on a requirement identifier. Each row of the table represents the result of a TORUS splice expansion. The first and last columns of the table contain the requirement and secure link identifier, respectively. Columns 2-8 show the requirements, enhancements, associated standards, methods and algorithms associated with the FBN implementation shown in column 9. For example, for the project requirement IRa, the RTM shows the appropriate selection of standard requirements such as CR3.1 and the further REs from the standard that must be implemented for a particular security level (SL-C 4 in this case). The associated standards give further guidance on implementing requirements in terms of recommended methods/algorithms. For example, ISO 19279-0 recommends using MAC for data integrity and authentication to fulfil CR3.1. Furthermore, FIPS 198-1 describes the standard implementation of HMAC, which requires hashing algorithms such as SHA1 and SHA256. The designer/developer may choose one of the available hash functions for the HMAC in the implementing IEC 61499 function block network, as shown in column 8.
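The splice expansion that produces one RTM row can be sketched as follows. The column names, the node classification in `NODE_KINDS` and both functions are assumptions made for illustration. They do not reproduce the exact column layout of Table 4. The node names in the usage example are those mentioned for IRa in the text.

```python
# Assumed classification of the nodes on the path extracted for IRa.
NODE_KINDS = {
    "CR3.1": "standard_requirement",
    "RE1": "requirement_enhancement",
    "MAC": "method",
    "FIPS 198-1": "associated_standard",
    "HMAC": "method",
    "SHA256": "algorithm",
}

# Hypothetical, simplified column layout (not the layout of Table 4).
COLUMNS = ["requirement", "standard_requirement", "requirement_enhancement",
           "associated_standard", "method", "algorithm", "secure_link"]


def expand_splice(requirement_id, fbn, path, node_kinds=NODE_KINDS):
    """Build one RTM row from a splice and the path extracted for its requirement."""
    row = {column: [] for column in COLUMNS}
    row["requirement"] = requirement_id
    row["secure_link"] = f"SL({requirement_id}, {fbn})"
    for node in path:
        row[node_kinds[node]].append(node)
    return row


def format_row(row):
    """Render a row as a pipe-separated line, joining multi-valued cells."""
    cells = [", ".join(v) if isinstance(v, list) else v for v in row.values()]
    return " | ".join(cells)


row = expand_splice("IRa", "FBN_HMAC",
                    ["CR3.1", "RE1", "MAC", "FIPS 198-1", "HMAC", "SHA256"])
print(format_row(row))
```

Since every row starts with the requirement identifier and ends with the secure link identifier, the resulting matrix can be read in either direction.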
Security methods (columns 5 and 7) can also be selected straightforwardly without any associated intermediary standards. For some requirements, such as CRa related to confidentiality, IEC 62443-4-2 does not require an RE to achieve the maximum capability security level. Table 4 shows the direct mapping of CR4.3 to the symmetric method of encryption, further linking to the IEC 18033-3 standard that lists the recommended block ciphers to be chosen for the implementation of FBN_ENC.
One of the main advantages of using such an RTM is producing document artefacts containing bidirectional traceability of security requirements. It helps develop ICS products that are meant to be certified against a security standard such as IEC 62443-4-2. The other main advantage is showing requirements to test-case traceability, e.g. linking a security requirement and a unit test of the implementing function block in an FBN. It provides a bird's eye view to the stakeholders, such as users, vendors and the certification bodies, that a product requirement has been fulfilled based on the particular requirement from the security standard. The associated standards also ensure that a robust set of guidance backs up each link in the requirements chain since they are approved by industry practitioners and government after rigorous technical reviews and security testing.

Requirements change management
Requirements change is an inevitable part of system development. Figure 11 shows the requirements change process that can be accomplished by the proposed method. Requirements may be changed in two ways such that:
1. A requirement may be added or deleted for the product needed to be implemented by a secure link. If a node is to be relocated under a different node within the graph, or if a security goal has been changed, it is deleted and added as a new node. For example, if the use case for wind turbine requirement AR shown in Fig. 5 is no longer desired or if there is a new use case of merging AR with one of the integrity requirements IRa or IRb, then the node AR is deleted and introduced as a new requirement under the integrity node.
2. The nature of a requirement may get changed, e.g. change in capability security level or metadata. In such cases, a new graph based on a modified CSRS needs to be generated. It can be achieved by updating and re-executing the Cypher queries and subsequently storing the new graph in the repository for later referencing. Regeneration of CSRS graph ensures the consistency and correct linking to security standard graph. If an existing requirement is changed, the node's metadata, such as node name and identifier, is updated. For example, if the SL-C of IRa in Fig. 5 is lowered to level 3, the edge label needs to be changed to reflect the new level in the CSRS graph. The modified SL-C label may produce an entirely different subgraph from the linking of CSRS and IEC 62443-4-2 requirements, containing a different set of associated standards and methods/algorithms.
For any of the above scenarios, the CSRS graph must be updated and stored in the repository. The added/updated requirement can consequently be selected for a secure link. A new subgraph is generated to link it to related FR from the security standard graph stored in the repository. This step generates associated standards and methods/algorithms based on the FRs in the standard. The associated secure link and the CSRS requirement identifiers are stored in the TORUS splice for traceability purposes.
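The two change scenarios can be sketched on a simplified CSRS representation in which each requirement maps to its security goal and SL-C. The dictionary layout and the three function names are hypothetical, and the SL-C assigned to AR is an assumed value. The sketch only shows how a regenerated CSRS can be compared with its predecessor to find the requirements whose subgraph and splice must be produced again.

```python
def change_level(csrs, requirement_id, new_sl_c):
    """Scenario 2: return a new CSRS mapping with one requirement's SL-C updated."""
    if requirement_id not in csrs:
        raise KeyError(requirement_id)
    updated = {rid: dict(attrs) for rid, attrs in csrs.items()}
    updated[requirement_id]["sl_c"] = new_sl_c
    return updated


def move_requirement(csrs, requirement_id, new_goal):
    """Scenario 1: delete the node and re-add it under a different security goal."""
    updated = {rid: dict(attrs) for rid, attrs in csrs.items()}
    attrs = updated.pop(requirement_id)
    updated[requirement_id] = {**attrs, "goal": new_goal}
    return updated


def needs_relinking(old, new):
    """Requirements whose standard subgraph and splice must be regenerated."""
    changed = {rid for rid in new if old.get(rid) != new[rid]}
    removed = set(old) - set(new)
    return sorted(changed | removed)


csrs_v1 = {
    "IRa": {"goal": "integrity", "sl_c": 4},
    "AR": {"goal": "authentication", "sl_c": 3},  # SL-C of AR is an assumption
}
csrs_v2 = change_level(csrs_v1, "IRa", 3)               # IRa lowered to SL-C 3
csrs_v3 = move_requirement(csrs_v2, "AR", "integrity")  # AR moved under integrity
```

Each function returns a fresh mapping instead of editing in place, which matches the idea of storing every regenerated CSRS graph in the repository for later referencing.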

Requirement traversal and graph analytics
Manual selection of requirements is a laborious task in the absence of automated tools, especially for large requirements set when requirements are interconnected and fulfilled by multiple sub-requirements, as shown in our graph model. LPG databases such as Neo4j can query the data stored in a graph database using the Cypher query language. By using Cypher, the requirement repository can be explored locally as well as globally [32]. Local graph search corresponds to searching smaller subgraphs in the graph database. This can be done by specifying the graph pattern in Cypher [42,44,46,47]. Cypher also enables the global search of the repository for efficient requirement traversal. It is facilitated by using the inbuilt graph algorithms such as shortest path, all shortest path, single-source shortest path, minimum spanning tree, centrality and community detection [32]. For example, Listing 4 shows a query written in Cypher that can be used to search for all shortest paths starting from the root node of a requirement repository. The MATCH and WHERE clauses in lines 1-5 are used to search for the root node. The CALL clause in line 6 is used to specify that to search all shortest paths starting from the root node, Dijkstra's algorithm should be used. Finally, as shown in lines 7-8, YIELD and RETURN clauses are used to output the result set.
Fig. 11 Requirements change process using security requirements repository and design-time tools
Moreover, using a graph-based approach to store requirements also means that advanced frameworks for graph analytics such as Spark can be utilised to perform large-scale data analytics.
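The global search described in this subsection can be approximated outside the database with a short single-source shortest path sketch. It is a plain Python stand-in for the built-in graph algorithms, not the Cypher query of Listing 4. The root is taken to be a node without incoming edges. The edge weights and the direct CSRS to IRa edge are hypothetical and are included only so that the algorithm has a choice to make.

```python
import heapq


def find_roots(edges):
    """Nodes with no incoming edge, mirroring the root search in the text."""
    sources = {s for s, _, _ in edges}
    targets = {t for _, t, _ in edges}
    return sorted(sources - targets)


def shortest_paths(edges, root):
    """Dijkstra's algorithm: cheapest path from `root` to every reachable node."""
    adjacency = {}
    for source, target, weight in edges:
        adjacency.setdefault(source, []).append((target, weight))

    best = {root: (0, [root])}
    heap = [(0, root, [root])]
    while heap:
        cost, node, path = heapq.heappop(heap)
        if cost > best[node][0]:
            continue  # stale queue entry, a cheaper path was already found
        for target, weight in adjacency.get(node, []):
            candidate = cost + weight
            if target not in best or candidate < best[target][0]:
                best[target] = (candidate, path + [target])
                heapq.heappush(heap, (candidate, target, path + [target]))
    return best


# (source, target, weight); all weights are made up for this example.
EDGES = [
    ("CSRS", "Integrity", 1),
    ("Integrity", "IRa", 1),
    ("CSRS", "IRa", 5),
    ("IRa", "FR3", 1),
    ("FR3", "CR3.1", 1),
]
paths = shortest_paths(EDGES, find_roots(EDGES)[0])
```

Note that this sketch returns one cheapest path per node. Enumerating all equally short paths, as the query described above does, would require keeping every tied predecessor.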

Requirements extraction
Requirement elicitation is an essential step at the inception of a project [48]. The process of collecting and analysing the requirements in large-scale projects is often an exhausting task [41]. The task becomes more complicated when the project requirements are tied up to security standards, as shown in this article. The analysts have to manually go through all system security requirements to link them with an extensive set of security standard documents.

AND NOT EXISTS {MATCH (root)<-()}
In the following subsections, we compare manual requirements extraction effort against the extraction based on security levels using LPGs from the IEC 62443-4-2 combined with CSRS of the systems such as wind turbine discussed in this article and an industrial mixer control system case study presented in [52].

Manual requirement extraction
The manual process requires analysts to go through all the requirements and evaluate the appropriate methods/ algorithms to implement the requirement based on a security level. Figure 12a shows the initial manual extraction effort for IEC 62443-4-2 FR3. It provides the total security requirements extracted from the CSRS and the security IEC 62443-4-2 standard document. For example, the total number of requirements nodes from wind turbine CSRS are 13. For FR3 (Integrity), the total requirements to be extracted are 65, i.e. CSRS + FR3 requirements. Also, the total number of manually extracted requirements comes to 57 for the industrial mixer control system. It can be observed that the increase in requirements depends on the number of requirements in the project, which is a dynamic component in the repository.

Requirement extraction using LPG
The secure link plug-in automates the process of security level based requirements extraction from the CSRS and IEC 62443-4-2 LPGs using Cypher queries. Figure 12b shows the number of nodes from FR3 when an individual requirement is selected from the CSRS using the secure links plug-in. It shows that designers and developers need to deal with a reduced number of requirements for each CSRS requirement selected from the plug-in. For example, the total number of requirements for FR3 is 52 (Fig. 12a). However, when IRa, which is a sub-requirement of FR3, is selected from the secure link plug-in, the number of extracted requirements comes down to 36 (Fig. 12b).
Fig. 12 a Manual requirements extraction b Requirements extraction based on security level using LPGs. A comparison of requirements extraction between manual and LPG method. y-axis represents number of requirements
From the above comparison, it can be seen that the total number of requirements is filtered down based on a security level when using LPGs and the secure link plug-in compared to the manual process. That is, 51 compared to 65 for the wind turbine, and in the case of the industrial mixer control system, 39 compared to 59. Although these numbers may not be significant, there will be a considerable improvement if a system has an increased number of CSRS requirements, which may result in implementing more FR sub-requirements.

Reusability of the repository
The graph storage feature of the repository renders it highly reusable for developers and testers to build and maintain IEC 62443-4-2 certified ICS applications. The ability to query the repository using Cypher ensures an easy requirements extraction due to the LPGs. The static and dynamic partitions of the repository illustrated in Fig. 7 provide distinct levels of reusability.
The static partition, i.e. IEC 62443-4-2 graphs, provides high reusability since the standard does not often change. A one-time effort has to be made to create the graph for each security FR of the standard as discussed in Sect. 9.4. Once created, the FR graphs can be linked with the relevant requirements in CSRS. It is beneficial for the certifiers who may have to deal with various products at the same time. Reusable IEC 62443-4-2 requirement graphs help quickly validate and verify a CR or log the correction if needed. For example, a certifier may validate whether a vendor has correctly chosen the methods/algorithms according to the capability security level. Moreover, in conjunction with the graph repository, secure links also help in the implementation verification of the requirement. An RTM discussed in Sect. 9.1 can help in verifying a security requirement by tracing it to the particular FBN implementation. For instance, a certifier can generate an RTM shown in Table 4 using secure links plug-in, thus verifying the implementation of CRb through FBN_ERASE.
A vendor may also use such IEC 62443-4-2 graphs as a guideline for designing and implementing security requirements against a particular security level. For example, a CSRS containing integrity requirement can be linked with the graph of IEC 62443-4-2 FR3, as illustrated in Fig. 8. Querying the graph based on the capability security level reveals the subgraph that conveniently guides the vendor to the required methods/algorithms implementations.
Another effective reusability strategy is to make the IEC 62443-4-2 static repository partition available to the broader community of ICS application developers in the form of a library of Cypher queries. Aspiring vendors of IEC 62443-4-2 certified ICS applications may import the graphs and reuse them readily in their projects, thus ensuring community collaboration. For example, a set of Cypher queries for the generation of IEC 62443-4-2 FR graphs can be stored in version control systems, and vendors can subsequently use these queries for the instant creation of the graphs. It also enables community contribution, which helps in maintaining and updating the repository when the standard is modified.
Similarly, vendors are also able to reuse the dynamic partition of the repository. Each CSRS graph for a particular project is an archive in the repository for future reuse, as illustrated in Fig. 7. For example, a project may reuse the graph from an earlier project with similar security requirements, thus reducing the time against requirement analysis and extraction.

Limitations and validity threats
In this section, we discuss some of the pertinent limitations and threats to the validity of our proposed repository model that uses TORUS and secure links, as illustrated in Fig. 9.

Reliance on IEC 61499
IEC 61499 has some open issues regarding its industry-wide adoption. A state-of-the-art literature review [25] focusing on the applicability of IEC 61499 identifies its major challenges in industrial practices. The key issue is the industry's reluctance to deviate from legacy ICS development approaches and rely on an older standard like IEC 61131-3. The causes for such hesitation in adopting IEC 61499 are the cost to upgrade the legacy systems, lack of proven redesign methods, practitioners' expertise, and the lack of educational aspects around IEC 61499 regarding course designs and industrial training.
Secure links provide a secure-by-design development approach for IEC 61499 ICS applications. Similarly, the TORUS splice in our approach requires an input consisting of a secure link identifier that refers to the underlying IEC 61499 function block network. Although TORUS can be considered a generalised framework for requirements traceability, its applicability in the current literature is demonstrated mainly through IEC 61499. Therefore, the reliance on IEC 61499-based tools for end-to-end traceability limits the backward compatibility with ICS using legacy standards such as IEC 61131-3.

Scope limitations
The proposed LPG security requirements repository provides implementation guidelines for the security requirements in IEC 62443-4-2 in the form of LPG nodes. These guidelines include standard security mechanisms and their associated cryptographic primitives. The current collection of these guidelines is not extensive, i.e. it specifies a limited number of standard cryptographic algorithms and methods for a limited set of security requirements in the IEC 62443-4-2 standard. The LPGs of the standard need to be comprehensive for the repository to be used in industrial-scale ICS projects.

Generalisability
The LPG repository currently maps to the IEC 62443 standard. It is designed to closely follow the structure of all other ISO standards. Our current IEC 62443 implementation can be considered as one particular instance of our proposed method. Modelling similar standards in the form of graph databases can enhance the applicability of our method to multiple domains. The LPG repository can be made more generalisable by partitioning it further to contain more than one standard. For example, for a more generalised LPG repository, the static partition in Fig. 7 can simply be renamed to "Standard partition" instead of "Security Standard partition" and the dynamic partition can be renamed to "SRS partition". Other ICS specific security standards such as NIST SP 800-82 [51], FIPS 140-2 [33], and ISO/IEC-15408 [28] can be explored for use with the repository by structuring these standards in the form of LPGs. Such a proposition would also allow industry-specific ICS, such as those in smart grids, manufacturing, and the oil and gas industries, to leverage our generalisable repository structure.

Scalability
The wind turbine case study presented in this article is an abstraction of a larger system containing complex requirements. We selected a set of key security requirements (listed in Table 1) of the wind turbine system in order to illustrate the traceability aspects of our solution. Therefore, the case study does not cater to all of IEC 62443-4-2 FRs and only the FRs related to the wind turbine requirements are selected for illustration purposes.
The limited scope and scale of the requirements may pose a threat to the generalisation of the results. Several different FRs may need to be accommodated in an industrial-scale real-world system, and their respective graphs need to be generated. However, the effort required to extract the requirements from the standard through the manual process is directly proportional to the number of FRs applied to a system. In contrast, LPGs provide the capability to filter the appropriate requirements from the standard based on the security level using Cypher queries, reducing the extraction effort as discussed in Sect. 9.4. Therefore, we believe that our proposed solution can scale to accept increasing numbers of FRs.

Error-prone requirements elicitation and extraction
The proposed approach assumes that the generation of security standard property graphs and the mapping of standard requirements to available mechanisms and algorithms are comprehensive. However, such practice requires significant expertise and extensive knowledge of ICS security standards and solutions. For example, the IEC 62443 property graph illustrated in Fig. 4 is a prototype created based on the authors' expertise and knowledge. The interpretation of a security standard requirement may vary between different requirements analysts, especially in the case of assigning derived standards and algorithms to the appropriate security levels described in the standard. Therefore, there is a risk of producing two different graphs of the same IEC 62443 functional requirement that may also induce difficulties in maintaining the property graphs, consequently providing inaccurate requirements traceability from requirements to code. Automatic security standard requirements extraction can help achieve consistency.

Conclusion and future works
ICS applications that conform to security standards need robust requirements structures to capture formally-defined requirements relationships to verify and validate conformance to standards. This article proposes a multi-partitioned LPG security requirements repository to store and integrate system (CSRS) and IEC 62443-4-2 standard requirements. A formal definition of the IEC 62443-4-2 extended requirement structure is proposed that guides the selection of standard cryptographic primitives required to implement a security requirement according to the desired security level. End-to-end security requirements traceability is achieved when the repository is paired with design patterns to capture communication security constraints using secure links [52] and a requirement traceability engine like TORUS [49]. We also present a requirements change process aided by the repository. All improvements are achieved through the ability to query the repository using graph-query languages. We also discuss implications regarding the utility of the repository in the security certification process.
Future directions include mapping as many IEC 62443-4-2 functional requirements as possible into graphs to test the scalability of the proposed solution, which can be achieved by using an extensive case study with additional security requirements. Moreover, utilising ICS security standards other than IEC 62443-4-2 is another exciting challenge. We also aim to explore the opportunities for advanced graph analytics that our approach enables through graph algorithms.
Funding Open Access funding enabled and organized by CAUL and its Member Institutions.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.