Modeling ecosystems of reference frameworks for assurance: a case on privacy impact assessment regulation and guidelines

To assure certain critical quality properties (e.g., safety, security, or privacy), supervisory authorities and industrial associations provide reference frameworks such as standards or guidelines, some of which are enforced (e.g., regulations). Given the pace at which both technical advancements and risks appear, the number of reference frameworks keeps increasing. As several frameworks might apply to the same system, overlaps appear (e.g., regulations for different countries where the system will operate, or generic standards in conjunction with more concrete standards for a given industrial sector or system type). We propose the use of modelling to alleviate the complexity of these ecosystems of reference frameworks, and we provide a tool-supported method to create such models for the benefit of different stakeholders. The case study is based on privacy and data protection, and more concretely on privacy impact assessment processes. The European GDPR regulates the movement and processing of personal data, and, contrary to available software engineering privacy guidelines, articles in legal texts are usually difficult to translate into the underlying processes, artefacts, and roles that they refer to. To facilitate the mutual comprehension of legal experts and engineers, in this work we investigate how mappings can be created between these two domains of expertise. Notably, we rely on modelling as a central point. We modelled the legal requirements of the GDPR on data protection impact assessments; we then selected ISO/IEC 29134, a mainstream engineering guideline for privacy impact assessment, and, taking a concrete sector as an example, the EU Smart Grid Data Protection Impact Assessment template. The OpenCert tool was used to provide technical support to both the modelling and the creation of the mapping models in a systematic way.
We provide a qualitative evaluation from legal experts and privacy engineering practitioners to report on the benefits and limitations of this approach.


Introduction
From an engineering perspective, system assurance is an engineering process that can provide support to comply with normative frameworks, where assurance is "the justified confidence that the system functions as intended and is free of exploitable vulnerabilities, either intentionally or unintentionally designed or inserted as part of the system at any time during the life cycle" [1]. In our case, the "exploitable vulnerabilities" will be related to privacy. The assurance process, in a model-based approach, starts with the modelling of the normative framework, also known as the reference framework in system assurance parlance. This reference framework model contains definitions of processes that may/shall be followed according to the regulation, as well as formal requirements. Once the reference framework model is created, it can be reused in any project dealing with compliance to that specific framework, at least until the law or its interpretation changes. During the execution of a project by an organization, the project will generate evidence that can be traced to the model of the reference framework, allowing compliance with it to be asserted through compliance argumentation.
It is usually the case that assurance projects need to cope with a wide range of reference frameworks, so we will be referring to "ecosystems" of reference frameworks. The objective in these ecosystems is to ensure that the legal entity (or entities) responsible for the functioning of the system has applied the best practices (procedural, technical, and legal) to provide a system without errors and vulnerabilities. For instance, there are technical standards, created in collaboration with developers and engineers, that deal with the privacy quality attribute, providing guidance on the product-based requirements and the process to follow during the design. An example of these "development standards" is ISO/IEC 29134 (Information technology-Security techniques-Guidelines for privacy impact assessment) [2]; another is the sector-specific guidelines created to support the activity, such as the EU Smart Grid Data Protection Impact Assessment (DPIA) template [3] for Smart Grid and Smart Metering. In contrast to those, the European Union General Data Protection Regulation (GDPR) [4] is a legal text. The main aim of the GDPR is to protect people's rights and freedoms through the protection of personal data, as well as the free movement of personal data. In force since May 2018, this Regulation sets out principles for data processing activities, defines rights for the data subjects, and imposes legal obligations on controllers and processors for the movement of personal data. Organizations and industrial companies dealing with personal data in their systems have to continuously assess their privacy protection practices to comply with the GDPR clauses, including those on the mentioned privacy impact assessment. In practice, system engineers are confronted with the problem of making their system comply with abstract general principles and legal requirements, such as those of the GDPR, while using familiar privacy standards that are more focused on engineers and developers.
In this process, it is challenging to conciliate and consolidate the assurance projects, especially when legal and technical sources of requirements are not explicitly mapped. In fact, when relations and similarities between two reference frameworks exist but are not formalized, only experts on both reference frameworks are able to identify them. These mappings, which we call "implicit mappings", make it challenging to address ecosystems of reference frameworks.
The modelling of reference frameworks is a research direction that can establish central common points where legal experts and engineers share a language, i.e., legal and technical requirements are established and understood together. We claim that reference framework models can make legal texts like the GDPR more "operational" from an engineering perspective, by operationalizing the legal text into engineering process activities and bridging them to those specified by technical standards. To support our claim, we take as a case study the GDPR Article 35 on DPIA and confront it with ISO/IEC 29134 and the EU Smart Grid DPIA template. Using the state-of-the-art CACM (Common Assurance and Certification Metamodel) [5] for modelling reference frameworks, we aim to answer the following research questions:

• RQ1: To what extent can information privacy law and privacy protection standards for conducting DPIAs be modelled from a development process perspective?
• RQ2: To what extent do the reference framework models of the GDPR DPIA match ISO/IEC 29134 and the EU Smart Grid DPIA template?
The contributions of this work are:

• A model of a relevant excerpt from the GDPR legal text, in particular the articles dealing with DPIA.
• A model of the Guidelines for Privacy Impact Assessment (PIA) specified by ISO/IEC 29134, and a model of the EU Smart Grid DPIA template.
• A model of a mapping from ISO/IEC 29134 to the GDPR's DPIA, and from the EU Smart Grid DPIA template to the GDPR's DPIA.
• A qualitative evaluation of the results.
The rest of this paper is structured as follows: Sect. 2 presents background information, and Sect. 3 presents related work. The methodology is described in Sect. 4, and the results are detailed in Sect. 5. Then, Sect. 6 presents a qualitative evaluation and the threats to validity. Finally, Sect. 7 concludes and outlines future work.

Background
Section 2.1 presents the scope of this work within the more global assurance engineering context. Then, Sect. 2.2 elaborates on background information on reference frameworks ecosystems and their modelling. Finally, Sect. 2.3 presents information on the case study context on privacy.

The role of reference framework ecosystems
This subsection aims to show how models of reference frameworks ecosystems are used in the assurance engineering process. Figure 1 shows three interrelated layers, namely reference frameworks ecosystems, assurance, and system assets. Reference frameworks ecosystems provide guidance for assurance engineering (e.g., assurance projects), and the objective of assurance is to comply with these reference frameworks and to provide evidence that they are respected. The assurance layer abstracts the assurance project activities and assets, such as models of argumentation and proof of goal achievement (e.g., assurance cases using the Goal Structuring Notation metamodel [6]) or evidence models for evidence management. Assurance takes the reference frameworks into account and helps provide guidance about the system and V&V (Verification and Validation) assets to create; in some cases, it predefines or guides the organizational processes to follow. Finally, those same assets are used in the assurance projects as evidence through traceability links from the assurance layer (e.g., an evidence item in an assurance case traces to a system component or to testing results).
The CACM metamodel [5] is a domain-specific language (DSL) proposed to model reference frameworks and, more generally, assurance management needs. Together with the creation of CACM, a tool was created providing a modelling framework covering the reference frameworks ecosystems layer, the assurance layer, and traceability means for any system development project. The Eclipse OpenCert [7] tool can be used to support the assurance activities, allowing the modelling of reference frameworks, equivalence mapping models, and many other functionalities for assurance project management that are out of the scope of this paper, such as the creation of assurance cases and evidence models, each of them with their own metamodel besides CACM. The ultimate aim of the modelling tool is to lower certification costs in the face of rapidly changing product features and market needs. Of all the OpenCert functionalities, its capability to model reference frameworks is the basis for all the assurance functionalities. The focus of this work is strictly on the ecosystems of reference frameworks part (the layer at the top of Fig. 1). Thus, we keep out of our scope the other, yet equally important, parts, such as project-specific assurance cases and concrete system assets (e.g., architecture, implementation, or V&V assets). The choice of Eclipse OpenCert for our work comes from the fact that, to the best of our knowledge, it is the only implementation of CACM, it supports the modelling of reference frameworks and mappings, the resulting models can later be reused seamlessly in further assurance projects through other tool functionalities, and the team had the skills and experience in using it. Our approach is thus reliant on this decision; other alternatives will be presented in Sect. 3.

Fig. 2: Example of a reference frameworks ecosystem for data privacy impact assessment, which includes three compliance targets.
From all the possible reference frameworks ecosystems, the case of this work is based on DPIAs, and we considered the three reference frameworks illustrated in Fig. 2. As mentioned in Sect. 1, the GDPR is a regulation; thus, given that it is enforced in the EU and on companies providing services in the EU, it has been established as the common baseline and reference for privacy considerations within the development, maintenance, and evolution of any kind of information system. Assurance is reflected in the GDPR accountability principle, as we discuss in Appendix A; however, the focus here is on the PIA part. ISO/IEC 29134 is a standard providing more actionable instructions for conducting DPIAs. Finally, as a selected example of specific guidelines for a concrete domain where privacy is highly relevant [8], we selected the EU Smart Grid DPIA template. Obviously, this is an excerpt of the ecosystem, as other countries have their own regulations, and other sectors have their own standards and guidelines. The focus will be on the modelling of these three reference frameworks and the relations of the latter two with the GDPR.

Modelling reference frameworks and equivalence maps
Compliance with legal and technical requirements may become troublesome to address during systems development.
In order to demonstrate that the development process complies with a given regulation, and to trace the features of the development outcomes to the regulation, we require an unambiguous model of the standard or reference framework whose compliance is pursued. One of the initial efforts to provide a reference metamodel to reduce the ambiguity behind regulations and standards compliance-related activities was the CCL (Common Certification Language) [9]. It proved useful to model safety standards such as EN 50128 for the railway domain, DO-178 for avionics [10], or ISO 26262 for automotive [11]. CCL evolved into the CACM (Common Assurance and Certification Metamodel) [5] to be more expressive in the connection with the actual system architecture [12]. CACM has been used [13] to model more safety standards, but also security standards such as IEC 62443.
The CACM metamodel [5] is quite extensive, but we present here its most relevant elements and relationships for modelling project-independent reference frameworks. For readability, we omit the prefix "Reference" from all the elements (e.g., Reference Activity, Reference Role, etc.).
• Activity: "Unit of behaviour that a reference assurance framework defines for the system lifecycle and that must be executed to demonstrate compliance." An Activity can be related to zero or more Preceding Activities, which must be executed before it. Besides, an Activity can be composed of any number of Sub-Activities, which provide a more fine-grained description. Precedence relationships may be defined (but are not required) between such Sub-Activities.
• Role: "A type of agent that participates in an Activity." A Role can be involved in several Activities and vice versa.
• Artefact: "Type of units of data that a reference assurance framework defines and that must be created and maintained during the system lifecycle to demonstrate compliance." An Artefact can be related to Activities as a Produced Artefact (generated or changed by the execution of the Activity) and/or as a Required Artefact (necessary for its execution).
• Requirement: "Condition or criterion that a reference assurance framework defines or prescribes to comply with it." On the one hand, an Activity may have Owned Requirements, meaning that they must have been met when the Activity finishes its execution. On the other hand, an Artefact may have Constraining Requirements, meaning that their satisfaction is demonstrated by the existence of such an Artefact.
It shall be noted that, beyond the explicit relations just presented, implicit relations may also exist, mediated through indirect dependencies, e.g., two Activities related by an Artefact which is Produced by one and Required by the other, or an Activity related to an Artefact through a Requirement. Figure 3 presents the graphical notation used to depict the different elements of a reference framework and the relations between them. Some element types (e.g., constraints and others that will be presented later in Sect. 4) do not have a graphical representation counterpart and might only be present in the underlying model. Color is just a convenient feature of this notation, whose semantics are not color-dependent at all: as we can observe in the legend of Fig. 3, the semantics of an arrow depend only on its source and destination elements. For example, a link between Activities will always be a "Next activity", a link from an Activity to an Artefact will always be "Activity produces Artefact", and so on; hence, the color is indeed not relevant.
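To make these relations concrete, the core elements can be sketched as a small data model. The following Python fragment is our own illustrative approximation, not the actual CACM metamodel (the class and attribute names are simplifications we introduce here); it shows how an implicit relation between two Activities can be derived from an Artefact that one Produces and the other Requires.

```python
from dataclasses import dataclass, field

@dataclass
class Artefact:
    name: str

@dataclass
class Activity:
    name: str
    preceding: list["Activity"] = field(default_factory=list)      # Preceding Activities
    sub_activities: list["Activity"] = field(default_factory=list) # Sub-Activities
    produced: list[Artefact] = field(default_factory=list)         # Produced Artefacts
    required: list[Artefact] = field(default_factory=list)         # Required Artefacts

def implicit_pairs(activities):
    """Derive implicit (producer, consumer) relations: two Activities are
    implicitly related when one Produces an Artefact the other Requires."""
    return [(a.name, b.name)
            for a in activities for b in activities
            if a is not b and any(x in b.required for x in a.produced)]

# Example: the risk assessment produces a report that the review requires.
report = Artefact("Risk assessment report")
assess = Activity("Assess privacy risks", produced=[report])
review = Activity("Review DPIA report", required=[report])
print(implicit_pairs([assess, review]))  # [('Assess privacy risks', 'Review DPIA report')]
```

An analogous traversal (from Activity through Requirement to Artefact) would surface the second kind of indirect dependency mentioned above.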
This work is, to the best of our knowledge, the first time that CACM and OpenCert have been used to model privacy-related reference frameworks. In addition, while several process- or product-oriented standards have been modelled, this is also the first time we use them for a legal text, which requires, as we will explain later, a higher degree of interpretation.
Apart from the reference framework models themselves, mappings between them aim to reduce assurance efforts thanks to the identification of overlapping assurance requirements and activities derived from the compliance targets (the laws/standards/policies in the selected ecosystem). We use another available meta-modelling package in CACM for the creation of equivalence mapping models between reference frameworks. Mapping models help to make explicit a 'matching' between elements included in the standards, such as activities, roles, artefacts, requirements, or other CACM elements. CACM equivalence maps were previously used to map the IEC 61508 (safety-related) and ISA 62443 (security-related) standards [14], or to map standards across domains between DO-178 (avionics) and EN 50128 (railway) [15]. Mappings in the model can be categorized using three types:

• Full match: The elements are identical. The characteristics of the element referred to by term A in its original context (its form, required content, objectives) fully satisfy those required of the element referred to by term B.
• Partial match: There is some similarity between the elements referred to by the two terms, but they are not identical. Differences may be significant or insignificant, and adding a textual justification in the mapping model element is recommended.
• No match: There is insufficient similarity between the elements to permit a match. This type of mapping is useful to avoid confusion and to make explicit that two elements that might initially seem similar are actually not.
The mappings can be freely grouped into categories, and OpenCert provides both a tree view and a table view for inspecting and editing these mapping models. As mentioned in Sect. 1, in this paper we present a mapping between the GDPR DPIA part and both ISO/IEC 29134 and the EU Smart Grid DPIA template.
The protocol to create the mappings involves manually comparing each element (or set of elements) of one reference framework to the elements of the other. The level of abstraction of the elements of both reference frameworks depends on how they were originally described in their respective documents; e.g., some parts can be high level while others are more concrete. Still, there can be mappings from one to the other, whether a "Full match" when the overlap is clear, or a "Partial match" otherwise, with a justification that can still be useful for the users of the mapping.
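Conceptually, the resulting mapping model is a set of typed links between elements of the two frameworks. The sketch below is a hypothetical, simplified rendering of such a model (the names MatchType, Mapping, and missing_justifications are ours, not OpenCert's API), illustrating the three match types and the recommendation that partial matches carry a textual justification.

```python
from dataclasses import dataclass
from enum import Enum

class MatchType(Enum):
    FULL = "full match"        # element A fully satisfies what is required of B
    PARTIAL = "partial match"  # similar but not identical; justify the gap
    NO_MATCH = "no match"      # explicit record that apparent overlap is absent

@dataclass
class Mapping:
    source: str            # element of, e.g., ISO/IEC 29134
    target: str            # element of the GDPR DPIA model
    match: MatchType
    justification: str = ""

def missing_justifications(mappings):
    """Return the partial matches lacking the recommended textual justification."""
    return [m for m in mappings
            if m.match is MatchType.PARTIAL and not m.justification]

mappings = [
    Mapping("Assess privacy risk", "Assessment of the risks (Art. 35(7)(c))",
            MatchType.PARTIAL,
            "ISO/IEC 29134 splits this into identification/analysis/evaluation"),
    Mapping("Set up the PIA team", "(none)", MatchType.NO_MATCH,
            "The preparation phase is out of the GDPR's scope"),
]
assert not missing_justifications(mappings)  # all partial matches are justified
```

A check such as missing_justifications mirrors the kind of systematic review we performed manually when iterating through the elements.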

ISO/IEC 29134, Smart Grid DPIA template, and preliminary comments on their alignment with the GDPR's DPIA

ISO/IEC 29134 (Information technology-Security techniques-Guidelines for privacy impact assessment) gives guidelines on both the process and the contents of a Privacy Impact Assessment (PIA). Considering the text of Art. 35 GDPR, beyond discussions on terminology (e.g., PIA vs DPIA (Data Protection Impact Assessment), personal data vs PII (Personally Identifiable Information), controls vs measures, etc.), there is a clear alignment with the contents of the GDPR DPIA framework, as ISO/IEC 29134 defines that personal data processing operations must be known; privacy and data protection risks shall be assessed, analysed, and then treated; reports must be generated and reviewed; etc.
Despite the mentioned alignment, there is a substantial difference between the two: as a legal text, the GDPR is binding and enforceable, but it only focuses on "what" shall be provided, leaving companies room to decide on the specific process; ISO/IEC 29134, as a technical standard, goes into further operational detail on the "how", but its adherence is voluntary, leaving aside the applicable law that each organization must stick to. For instance, while the GDPR mandates that a risk analysis be carried out, it does not prescribe the specific procedure to be applied. ISO/IEC 29134, on the contrary, breaks risk analysis down into separate activities for privacy risk identification, analysis, evaluation, and choice of treatment options. Further, ISO/IEC 29134 makes explicit considerations for activities related to the preparation phase of the PIA, such as setting up the PIA team and providing it with direction, establishing a PIA plan and the necessary resources for conducting it, describing what is being assessed, identifying stakeholders, establishing a consultation plan, and consulting with stakeholders. All of these (except part of the last step, about consultation) correspond to an earlier phase of the process, out of the scope of the GDPR, which only focuses on the execution of the DPIA and takes all these preparatory activities for granted: their outputs (PIA plan, risk criteria, descriptions of business processes and information systems to be assessed, etc.) are regarded by the GDPR as elsewhere generated, pre-existent resources (respectively referred to as methodology; origin, nature, particularity, and severity of risks; nature, scope, context, and purpose of processing operations; etc.). Likewise, while both ISO/IEC 29134 and the GDPR anticipated the need to consult with stakeholders (e.g., data subjects, experts), the former provides further details of the contents expected from the results of such consultation.
Yet another example comes from the concrete structure and content of a PIA report detailed by ISO/IEC 29134, while the GDPR only briefly lists what the assessment shall contain with a very high level description.
The mentioned difference is not a drawback; quite the opposite. It reflects that ISO/IEC 29134 can be more easily approached by technical staff while, at the same time, its results can be used as a contribution to achieving compliance with the respective GDPR sections. In this way, the two are complementary, not replacements for one another. It has been discussed whether technical guidelines such as ISO/IEC 29134 support or rather blur the GDPR's objective, as they are too focused on a risk-based approach [16]; however, an explicit mapping of the equivalences between these independent documents can support the comprehension of both legal experts and engineers, facilitating their communication and helping in assurance tasks.
Regarding the EU Smart Grid DPIA template, it targets data controllers dealing with data from Smart Grid and Smart Metering Systems who require guidance on how to conduct a DPIA with the peculiarities of their domain. Thus, it provides support to Smart Grid operators managing or initiating Smart Grids or Smart Metering Systems, or to companies evolving Smart Grid architecture platforms. The template explicitly acknowledges the influence of the GDPR in its preparation, references the GDPR several times, and clarifies that the meaning of the notions in the template is that of the GDPR, unless otherwise provided. The template provides a process-oriented view of conducting the DPIA, so in this sense it is also more focused on the "how" than the GDPR legal text. From the whole template, we have focused exclusively on the high-level DPIA process that it defines.

Related work
Similar and previous domain-independent efforts for reference frameworks modelling were presented in Sect. 2.2. In this section, we focus on the application domain of our case study on privacy. Important work has already been done in the field of GDPR modelling. We present a few papers that dwell on this subject, those we consider the most relevant and best aligned with the purposes, objectives, and ideas of our work.
Some of the research tackles the problem from a process-oriented point of view. For example, PrOnto (Privacy Ontology for legal compliance) [17] is a legal ontology that formalizes the GDPR's main conceptual cores: data types and documents, agents and roles, processing purposes and operations, legal bases, etc. Its goal is to support legal reasoning that can be applied to many legal and normative frameworks. The execution of actions by agents with certain roles supports the specification of temporal parameters and context for those actions. The inclusion of deontic logic is also relevant because it allows the inclusion of operators such as obligation, permission, and prohibition, which are fundamental in the modelling of legal rules. After presenting the fundamentals of the ontology, the authors provide a series of workflows focusing on risk analysis, violation detection, and others, as well as depictions of concepts such as data portability (also in [18]). The representation of the GDPR as a workflow proposed by PrOnto is clearly aligned with the result of our approach, aiming to represent the key concepts and legal axioms of the GDPR in a model that can later be used for compliance checking and risk analysis. Similarly, BPMN (Business Process Model and Notation) [19] has been proposed to capture the process patterns for data subject rights [20,21], but not for DPIA.
Another example aims to build a concise model of the GDPR using UML class diagrams [22] and additional invariants expressed in OCL (Object Constraint Language). Since a concrete implementation of the GDPR is affected by the national laws of the EU member states, among other factors, the authors propose a two-tier approach for their model: a generic tier that captures the concepts and principles that apply to all contexts, and a specialized tier that tailors the generic tier to specific contexts, including contextual validations that can affect the interpretation and application of the GDPR. Work on a tool for checking GDPR compliance based on this model is still in progress and is not detailed in the paper. The modelling methodology was iterative and incremental, aided by thorough validation with the help of legal experts who were previously trained to understand the UML class model. The most significant difference between this and our approach is that the described UML class model is specifically tailored to the GDPR, whereas the model proposed in this paper is open and can be used to develop new reference frameworks and new mappings, derived from different pieces of legal text.
There is also abundant research on this topic from a non-process-oriented perspective. For instance, one model provides a concise visual overview of the relations and associations between the entities and agents defined in the norm, and of its constraints [23]. The authors' objective was to assess the GDPR contents to study the relevance of their representation in the diagram. Entities (human or otherwise) have been identified, as well as actions and interactions among them, and other properties. Special care has been taken in the representation of personal data and data processing; of important actors such as the data processor and data controller; and of relevant concepts such as consent when performing processing operations and the conditions under which it is valid. The authors then present a few screenshots of the diagram and provide a discussion of how it can help provide assurance. However, they do not offer a toolbox for implementing it: no process can be extrapolated from the model, as it acts as a domain model rather than a workflow model.
The Open Digital Rights Language (ODRL) was also extended to represent legislative obligations and model the GDPR [24]. The central entity of this model is Policies, which are used to specify Rules, that are then used to represent Permissions, Prohibitions and Duties, which are granted over Actions. Parties are entities who participate in policy-related transactions, and Assets are objects which can be subject to a policy under consideration; constraints can also be defined for policies. With this approach, the GDPR is modeled as a set of rules, and said model does not result, again, in a workflow that can be followed to provide assurance.
Certification and assurance considerations could be included using well-established development process modelling approaches such as SPEM (Software Systems Process Engineering Metamodel) [25], or general business modelling languages such as the previously mentioned BPMN [19]. Enterprise modelling can also be used for this purpose, interrelating different concrete models to enable enterprise-level analysis. For instance, the Zachman Framework [26] is actually a taxonomy for classifying enterprise architectural artefacts, but it has the limitation that it does not describe processes. However, the TOGAF Architecture Framework [27] or the NATO Architecture Framework (NAF) [28] could be alternatives or complements to our approach, as they could enable enterprise-level analysis for privacy assurance. An example of this relation can be seen by connecting the Data architecture view from TOGAF, which describes how data are organized, stored, and accessed, with the GDPR requirements for data privacy. The use of the NAF architecture, more specifically its "A7 Architecture Compliance Viewpoint", for quality assurance could benefit from the information extracted in the mappings, which could potentially provide content for this viewpoint.
Another related approach is to enrich assurance cases with extra information. For instance, Workflow+ [29] proposes an approach to model the process and control flow, the dataflow (with input-output relationships), and the argument flow or constraint derivation. Although the authors do not mention traceability to specific standard requirements or privacy compliance, the approach can be translated to our context. Once our approach is applied at the project-independent level, assurance cases and Workflow+ can be used to ensure that the requirements are fulfilled.

Method
Section 4.1 presents an overview of the method, and Sect. 4.2 details the codebook used for annotating the GDPR DPIA regulation.

Modelling the ecosystem of reference frameworks
This work has been carried out by a focus group of experts on privacy assurance from the EU project PDP4E (Privacy and Data Protection for Engineers) [30]. A team of two researchers, both experts on privacy (including the GDPR and ISO/IEC 29134), was mainly responsible for creating the models. One of the researchers has more than 17 years of experience in research, having participated in privacy research projects for almost 10 of them; the other has been working on privacy-related projects for the last 2 years. They were supported by two other experts on reference frameworks and assurance modelling in OpenCert, with experience in modelling other standards and guidelines. One has a strong background in the evolution of functional safety standards and knowledge of cyber-security; they were involved in the inception of OpenCert and have led parts of its development. The other expert has long-standing experience in process improvement in domains related to critical systems, and knowledge of the OpenCert development technologies.
The initial drafts of the models were independently reviewed by four more experts, including legal experts, and thus iterated until the final version was agreed upon. The reviewers have between 6 and 35 years of experience in security, privacy, interoperability, and IoT. One of them is a standardization expert involved in ISO/IEC JTC1 AG8 (Meta RA), ISO/IEC JTC1/SC41 (IoT and digital twins), ISO/IEC JTC1/SC27 (Cybersecurity and data protection), ISO PC317 (privacy-by-design for consumer goods and services), and ISO TC22/SC32/WG11 (road vehicles cybersecurity engineering). The process was iterative: first, the scope was decided among all the individuals involved, and then the team responsible for creating the models started the process. For any inconvenience, limitation, or discussion on the modelling, the support team provided solutions as soon as possible. When the models were reasonably complete, they were presented to the review team to be agreed upon. Progress meetings were held between the different teams every two weeks, and the whole process took three months, carried out in parallel to other tasks. In case of disagreement within any of the teams, the meetings were used as a discussion opportunity and a final decision was agreed among their members. In some cases, changes to the CACM were needed, or new developments to OpenCert were requested. The opinion of the review team was taken into account for any proposed modification. Discussions were particularly fruitful from the legal perspective.
ISO/IEC 29134, being a more technical and procedural text, was quite straightforward to model, as we will discuss in Sect. 5. Similarly, the Smart Grid DPIA template model was also straightforward to create. However, the DPIA part of the GDPR required more attention, and the specific methodology used for it is explained in the next paragraphs. For the equivalence map modelling, the reference frameworks of ISO/IEC 29134 and the Smart Grid DPIA template were systematically analysed (iterating through all their elements), manually identifying the full and partial mappings to the GDPR DPIA model.
The complete model should not be created from scratch without an underlying refined procedure. We applied a textual analysis of the contents of the GDPR, an approach which is common in other disciplines such as requirements engineering or conceptual modelling, which start from a textual description of the system functions or concepts. Notice that our method does not use automatic NLP (Natural Language Processing) approaches, but a manual, yet systematic, approach based on Qualitative Content Analysis (QCA) [31]. Basically, QCA is a research method that takes texts as input and manually inspects their content, marking ('coding') what each part of the text means. In our modelling context, these codes correspond to model entities of the envisioned model.
Our analysis of the GDPR articles is supplemented with the analysis of the GDPR recitals related to each article, plus the contents of the guidance [32] on DPIA issued by the Working Party on the Protection of Individuals with regard to the Processing of Personal Data (a.k.a. WP29, an advisory body and predecessor to the European Data Protection Board), which provides interpretive details on how DPIAs should be addressed under the GDPR. We used this approach to capture the elements of the process model underlying the execution of a DPIA, from its description in the GDPR. It shall be noted that, contrary to other kinds of documents such as ISO standards, the GDPR, as a legal document, is not organized around a process description or a list of detailed requirements; thus, the extraction of the different process elements is not straightforward. The relation between the GDPR clauses and the process stages is not linear, and some elements are not described explicitly but are implied by the text. The different GDPR DPIA paragraphs were modelled and then consolidated into a complete model.
As discussed above, several engineers and legal experts already reviewed our results throughout their creation and iterative refinement. Notwithstanding, another group was established to qualitatively evaluate the final reference framework models and mapping models. This group consists of three practitioners of systems engineering (each with more than ten years of industry experience) plus two experts in data privacy law (a PhD candidate who holds a Master of Laws degree and has five years of experience in the field, and an engineer specialized in cybersecurity and privacy engineering regulations with more than fifteen years of experience). All the evaluators received an introductory explanation of our approach, and they were provided with access to the models and to a draft version of this paper (not yet including the evaluation part). Two iterations were carried out to clarify potential doubts, after which their qualitative feedback was received and processed by the authors.

Codebook for regulation annotation
Reference framework modelling starts by annotating the text of the regulation to be modelled with elements from the reference framework metamodel; these annotations are then manually interpreted, with expert judgement, to create the reference framework model. Our codebook provides instructions to annotate the different elements and relations, according to the reference framework metamodel:
• Activity which is discussed by the text of the given clause. It is usually a verb (or a verb phrase), typically introduced by a modal (e.g. shall or may) which establishes an obligation or a power.
• Input Artefact which is required as an input to the mentioned activity. It is usually a noun (or noun phrase), introduced as (a) the object (with the semantic role of stimulus) of phrases such as "take into account", or (b) the direct object of an activity defined by a communication or transference verb (communicate, inform, notify, convey, provide, etc.).
• Output Artefact which is produced as a result of the mentioned activity. It is typically introduced as (a) the object (theme or patient) of phrases such as "result into" or "contain", or (b) the direct object of activities defined using other verbs.
• Role which carries out the said activity. It may appear as (a) the agent of the activity (active verb subject or passive agent), (b) the indirect object of the activity, or (c) the provider of some "advice", "accordance", etc., which shall be given to the main activity.
• Reference to another activity which depends on the execution of the current activity, i.e., the current activity shall precede the referred activity. It is typically introduced by "prior to", "before", etc.
• Reference to another activity upon which the current activity depends, i.e., the current activity shall succeed the referred activity. It (1) is introduced by "already", etc., or (2) mentions the results of the referred activity.
• Condition usually identified by a conditional clause: (1) introduced by an interrogative adverb ("where", "when", "for which"), (2) a conditional expression ("in the case of", "subject to", "unless"), or (3) directly an adverb phrase that sets some constraint (e.g. "in the public interest"). Sometimes the conditions are quite generic ("where appropriate", "where applicable", "where necessary"), and sometimes a condition is split into several parts of the text. The conditions may refer to artefacts, activities, or even roles, which can be omitted when certain circumstances hold (e.g. the result of the DPO nomination may render that position unnecessary), while the negation of the premise established by the condition still validates the clause (e.g. when the DPO does not exist, their participation in an activity is no longer required).
• Applicability constraints under which a given activity, role or artefact is required. While the conditions mentioned above (e.g. 'processing is likely to produce a high risk') shall be evaluated during the execution of an assurance project, the applicability criteria (e.g. 'the organization is an SME', 'there is a specific Member State law') are already defined at the beginning of the assurance project.
• Traceability links which provide a reference to the structural part of the regulation (e.g. article, section, paragraph) that motivates the respective element. They have been added as annotations to the model, as they are not supported natively by CACM.
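To make the codebook concrete, the following Python sketch (with hypothetical type names; OpenCert and CACM use their own Eclipse-based representation) shows one way annotated clauses could be recorded before being interpreted into model elements. The example annotates a paraphrase of GDPR Art. 35(2), which requires the controller to seek the advice of the data protection officer:

```python
from dataclasses import dataclass, field

# Hypothetical code names mirroring the codebook entries above;
# the real CACM metamodel defines its own element types.
CODES = {
    "Activity", "InputArtefact", "OutputArtefact", "Role",
    "SuccessorActivity", "PredecessorActivity",
    "Condition", "ApplicabilityConstraint",
}

@dataclass
class Annotation:
    code: str   # one of CODES
    span: str   # the annotated text fragment
    trace: str  # traceability link, e.g. "Art. 35(2)"

    def __post_init__(self):
        if self.code not in CODES:
            raise ValueError(f"unknown code: {self.code}")

@dataclass
class AnnotatedClause:
    clause_id: str
    text: str
    annotations: list = field(default_factory=list)

# Paraphrase of GDPR Art. 35(2): "The controller shall seek the
# advice of the data protection officer ... when carrying out a DPIA."
clause = AnnotatedClause(
    clause_id="Art. 35(2)",
    text="The controller shall seek the advice of the DPO ...",
)
clause.annotations += [
    Annotation("Role", "controller", "Art. 35(2)"),
    Annotation("Activity", "seek the advice of the DPO", "Art. 35(2)"),
    Annotation("Role", "data protection officer", "Art. 35(2)"),
]
print([a.code for a in clause.annotations])
```

Each annotation keeps the traceability link required by the last codebook entry, so the resulting model elements remain traceable to the structural part of the regulation.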

Modelling results
Sections 5.1, 5.2 and 5.3 present the models of the three individual reference frameworks, and Sects. 5.4 and 5.5 present the two mapping models to the regulation.

GDPR DPIA
As mentioned above, in this work we focus on the contents of GDPR Article 35 (and related official guidance), which deals with the DPIA that data controllers shall carry out when a personal data processing activity may entail high risks to data subjects. Table 1 provides some examples of the modelling process and its results, presenting, for each included clause of GDPR Art. 35: the annotated text extracted from the GDPR, a list of the CACM concepts identified therein, and the graphical representation of the concepts and their relations. Figure 4 shows the graphical representation of the consolidated model of the GDPR DPIA as modelled in OpenCert. It includes 28 activities, 39 artefacts, 7 roles, 2 applicability conditions, and 2 requirements.
Although the graphical notation can make it easier to visually grasp the structure and navigate through the elements of reference frameworks, it may also pose challenges when complex reference frameworks are represented, as human consumers might find them difficult to read and interpret, as can be seen in the reference framework depicted in Fig. 4. To alleviate this, users interested in specific elements and their relations can use the tool features to query the model, and filter and rearrange its elements so as to create topic-specific views (i.e., models with a subset of the elements and relationships). For instance, we have created four examples of partial views (namely, Determine the need to carry out DPIA, Assess impact, Process personal data and consult with supervisory authority, and Generate report), which are included within our released GDPR model.

ISO/IEC 29134
ISO/IEC 29134 gives guidelines on both the process and the contents of a PIA. As a technical standard, ISO/IEC 29134 displays a clearly procedural orientation, which makes it much more straightforward to define the OpenCert reference framework from its contents. Indeed, the bulk of the document (clause 6, "Guidance on the process for conducting a PIA") is structured as a set of subclauses, each defined according to a template that comprises the following sections: Objective, Input, Expected output, Actions, and Implementation Guidance. From there, we can define the different elements of the reference framework model (as we did for the GDPR above). Figure 5 shows an excerpt of ISO/IEC 29134 modelled as a reference framework. As explained in the "Data availability statement" at the end of the paper, because of licensing conditions we do not show or make available the whole model. Our reference framework for ISO/IEC 29134 includes 38 activities and 35 artefacts.

Mapping between ISO/IEC 29134 and GDPR DPIA
The activities, artefacts, requirements, and roles prescribed by technical norms (e.g. ISO standards) can be mapped to the respective contents of legal regulations and, specifically, the GDPR. To define that mapping, both frameworks shall be described in compatible terms. Thanks to the creation of the GDPR DPIA and ISO/IEC 29134 reference frameworks using the CACM metamodel, we were able to tackle the creation of the equivalence mapping model. In our GDPR model, from the data controller's perspective, three groups of activities can be detected: those that are carried out before the DPIA proper (decide whether a DPIA is needed, consult with data subjects, experts, etc.), the production of the impact analysis itself (including risk elicitation, analysis, control measures, etc.), and the activities that use such impact analysis as an input (publication, review, consultation with supervisory authorities if needed, etc.). Likewise, in ISO/IEC 29134 we have four subclauses: 6.2 Determine whether a PIA is necessary (threshold analysis), 6.3 Preparation of the PIA, 6.4 Perform the PIA, and 6.5 Follow up the PIA. We can roughly map these to the GDPR groups mentioned above (including 6.2 and 6.3 in the same group). Table 2 shows an excerpt of the equivalences from ISO/IEC 29134 to the GDPR, where the different columns show: (a) the element(s) from ISO/IEC 29134, including the clause where each appears (since the alignment from elements to clauses is straightforward, as explained above); (b) the corresponding element(s) from the GDPR DPIA, extracted from GDPR Article 35 and the WP29 guidance [32] (in this case we do not refer to the GDPR clause itself but to the elements of our reference framework, as the correspondence is not so direct); and (c) the type of mapping: whether the mapping is full or partial, with a discussion of what is missing from the source (ISO/IEC 29134) in the target (GDPR).
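As an illustration of how such equivalence map entries could be represented in a tool-neutral way, the following Python sketch (with invented type names; the actual mapping models use the CACM/OpenCert Eclipse-based metamodel) records mappings as typed records paraphrasing two rows of Table 2:

```python
from dataclasses import dataclass
from enum import Enum

class MapType(Enum):
    FULL = "full"
    PARTIAL = "partial"

@dataclass(frozen=True)
class Mapping:
    source: str     # element of ISO/IEC 29134, with its clause
    target: str     # element of the GDPR DPIA model
    kind: MapType
    rationale: str  # what is missing from source in target, if partial

# Two entries paraphrasing Table 2 (identifiers are illustrative):
mappings = [
    Mapping("6.2 Mandate to prepare PIA",
            "Resolution to carry out a DPIA",
            MapType.FULL,
            "Each mandate covers the scope of GDPR's resolution."),
    Mapping("6.2 Determine whether a PIA is necessary",
            "Determine the need to carry out a DPIA",
            MapType.PARTIAL,
            "GDPR and the WP29 guidance are more specific."),
]

# Partial mappings are the ones that require extra legal attention.
partial = [m for m in mappings if m.kind is MapType.PARTIAL]
print(len(partial))
```

Keeping the rationale with each record preserves, in the mapping model itself, the discussion of what is missing between source and target.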
It should be noted that not all the activities can be mapped; this is explicitly discussed in some cases.
In total, 24 mappings were identified and categorized into three main groups: Prepare the PIA, Perform the PIA, and Follow up the PIA. For the first group, "Prepare the PIA", ten mappings were identified, of which five were full mappings and five were partial mappings. Regarding roles, there is no direct mapping between ISO/IEC 29134 and the GDPR, as the individual roles employed by the former (e.g. manager, assessor) represent a partition orthogonal to those defined by the GDPR, which refer to whole organizations (data controller, data processor).

Mapping between EU Smart Grid DPIA and GDPR DPIA
The resulting mapping model consists of 29 mappings in total, including both partial and full mappings. The mappings were separated into seven groups, namely Pre-Assessment, Analysis of Use Case, Threat identification, Data protection risk assessment, Management of risk and resolution, Documentation of DPIA report, and Reviewing and maintenance. Table 3 shows an excerpt of the mappings.

Evaluation and Discussion
The qualitative evaluation of the created models and their mappings has been performed by three industrial practitioners in systems engineering (Sect. 6.1) and two legal experts (Sect. 6.2). The first two subsections are a summary of their comments and observations, to which a selection of quotes is added.

Systems engineering perspective
This section presents the qualitative evaluation and discussion of the three systems engineering experts. From the engineering point of view, the elements needed to implement a regulation can be summarized in two main points. First, the availability of a systems engineering process which defines specific goals leading to deliverables/artefacts, and a structure of the complex work based on tasks or activities. This point is related to question RQ1 (feasibility of modelling privacy law and privacy protection standards). Through the specification of this process, an engineer will be able to understand the requirements, and execute and monitor the activities. Second, the assurance that this process is compatible with the elements of the regulation (modelled provisions, articles and clauses), which is related to question RQ2 (feasibility of matching the models to the GDPR model), because the degree of compatibility between the engineering process and the regulation shows the quality of the bridge created between the legal and engineering disciplines. This is also a first step towards understanding the feasibility of technically implementing the regulatory prescriptions.
To evaluate these aspects, the following questions will be considered: (1) whether the CACM metamodel can be easily translated to an engineering process, and (2) whether the mapping proposed between ISO/IEC 29134 and GDPR DPIA is representative of both documents, more specifically in terms of coverage and ordering of activities. Modelling and expressiveness Reference frameworks directly address the process part by allowing a standard or legal framework to be modelled in terms of the roles involved, activities, and artefacts. This description is similar to process modelling languages like BPMN [19], SPEM 2.0 [33], or ISO/IEC 24744 (SEMDM) [34]. CACM has strengths for modelling reference frameworks, but also some limitations. Its main strength is that its metamodel is expressive enough to model diverse types of reference frameworks. Among its limitations, for process-oriented frameworks, the concept of BPMN gateways is not present in CACM, and neither is the concept of temporal constraints between activities from SEMDM. However, CACM's level of detail can be considered sufficient for assurance modelling, as standards (in most cases) tend to avoid being too specific when defining a process, to account for the need to integrate with an organization's existing internal processes. This results in a description of the processes at a higher level of abstraction than the one the standards might otherwise establish. In these terms, and answering RQ1, it was possible to model the targeted information privacy law and privacy protection standards from a process-oriented perspective. Other model-driven engineering techniques, like model-to-model transformations, can also be applied to bridge the gap between CACM and more specialized process metamodels; for example, transformations between SPEM 2.0 and CACM have been defined [25]. On the other hand, despite the comprehensiveness of the CACM elements and the codebook presented in Sect. 4.2, the researchers noted the lack of Boolean process gateways and temporal constraints (as just mentioned), which prevented them from modelling complex logic clauses in the regulations.
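As a thought experiment on the limitation just noted, precedence information that CACM does not capture natively could be layered on top of a model as auxiliary annotations and checked separately. The following Python sketch (hypothetical, not part of OpenCert) verifies that an ordering of activities respects a set of "shall precede" constraints such as those identified with the codebook of Sect. 4.2:

```python
def respects_precedence(order, constraints):
    """order: list of activity names, in execution order;
    constraints: (before, after) pairs meaning 'before shall precede after'.
    Constraints mentioning activities absent from the order are ignored."""
    position = {name: i for i, name in enumerate(order)}
    return all(
        position[before] < position[after]
        for before, after in constraints
        if before in position and after in position
    )

# Illustrative constraints and ordering, using activity names from
# our GDPR DPIA model:
constraints = [
    ("Determine the need to carry out a DPIA", "Assess impact"),
    ("Assess impact", "Consult with supervisory authority"),
]
order = [
    "Determine the need to carry out a DPIA",
    "Assess impact",
    "Consult with supervisory authority",
]
print(respects_precedence(order, constraints))
```

A check of this kind could complement the traceability annotations already attached to the model, without requiring changes to the CACM metamodel itself.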
"The decision to use CACM instead of the well-established BPMN or SPEM is not easy to justify. But I understand that CACM might be more suitable for reference frameworks in general, and more integrated in the rest of the assurance process."

Mappings feasibility and their representativeness
Regarding the quality (coverage and consistency) of the mapping, taking as an example the elements of ISO/IEC 29134 shown in Fig. 5, which correspond to the mappings presented in Table 2, it can be observed that all artefacts except one (Scope of the PIA) have been fully, or at least partially, mapped to the GDPR model. Two activities (Establish scope and Document the scale) are not associated with any counterpart; however, their outputs are documented and their parent activity (Threshold analysis) is correctly mapped. Moreover, the input and output artefacts are consistent across the mapping: the information about the processing is an input in both models to determine whether a PIA is necessary, and the mandate to prepare the PIA and the analysis results are outputs of the mapped activities.
The following entries reproduce the excerpt of Table 2 (equivalences from ISO/IEC 29134 to the GDPR DPIA model) discussed here:
• Nature, scope, context, purpose of processing operations [Full]: In order to carry out a PIA/DPIA, it is always key to know the details of whichever operations process personal data. These details are a prerequisite to the DPIA from the GDPR perspective, while in ISO/IEC 29134 they combine external inputs with results of internal activities.
• 6.2 Determine whether a PIA is necessary (threshold analysis) → Determine the need to carry out a DPIA [Partial]: Although ISO/IEC 29134 suggests some rules on when to carry out a PIA, the GDPR and the WP29 guidance on DPIA are much more specific in this respect.
• Threshold analysis result → Risk appraisal [Partial]: The "risk appraisal" in our parlance represents an initial evaluation of whether the processing operations may imply a high risk to the rights and freedoms of the data subjects. This, as per the WP29 guidance, is much more detailed than ISO/IEC 29134's threshold.
• 6.2 Mandate to prepare PIA → Resolution to carry out a DPIA [Full]: As an artefact, each ISO/IEC 29134 mandate covers the scope of the GDPR's resolution. Note, however, that the corresponding activity in ISO/IEC 29134 is narrower than the GDPR's; hence there may be some processing operations missing from ISO/IEC 29134's mandates that would still merit a resolution from the GDPR's perspective.
• Terms of reference → Methodology [Partial]: ISO/IEC 29134's terms of reference, elicited during the preparation of the PIA, partially contribute to the methodology applied when it is carried out.
• 6.2 Scope of the PIA → None explicitly (see explanation) [No map]: ISO/IEC 29134 lists a set of activities to which the PIA will be applied, but from the GDPR perspective there is no correspondence for this artefact. The GDPR needs, of course, to know on which processing operations a DPIA is being performed; however, our GDPR reference framework model only considers activities on a one-by-one basis, so there is no such "list" as an explicit artefact.
• (part of 6.2) Decide if new or updated PIA is needed → Check for similar past DPIAs [Partial]: ISO/IEC 29134 considers whether a previous PIA can be reused and/or updated, and the GDPR also considers similar DPIAs. This is also related to the feedback loop that starts a new iteration of a DPIA if there are changes in the processing operations, but such loops are not explicitly shown (as the OpenCert reference framework metamodel does not consider this sequential perspective).
In the rest of the mapping, Tables 4 and 5 show how many elements in each model have a mapping to the other (independently of whether it is full or partial), or are contained within a mapped element. The final coverage is calculated by adding both of these numbers (mapped and contained) and computing the percentage with respect to the total number of elements in the model. As would be expected in answering RQ2, neither ISO/IEC 29134 nor the Smart Grid DPIA template is covered in its entirety (100%), as each has its own specificities. For example, ISO/IEC 29134 states the need for a review of the DPIA after its completion without offering any further clarification other than that the organization should establish a clear policy to follow, whereas the GDPR clearly specifies three different tasks for this purpose: monitoring the performance of the DPIA, reviewing the accordance of the processing operation, and consulting with the supervisory authority. The scope of the task at hand is therefore narrower, in contrast to the ISO/IEC 29134 definition, where the possibility of an audit by an external party is also considered. However, more than half of the elements are covered in both cases, which indicates a good level of overlap between them. This overlap, made explicit through the mapping model, can be leveraged to facilitate and reduce efforts during any privacy assurance project where the ecosystem of standards to study includes the GDPR and ISO/IEC 29134 or the EU Smart Grid DPIA template. The analysis of the final impact (e.g., development, effort, etc.) in application scenarios is out of the scope of this evaluation. However, it seems clear that each element with no mapping will require individual attention from the perspective of the reference framework containing it. "Even if the coverage is around 50%, the benefit is still relevant to potentially avoid rework."
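The coverage figure reported in Tables 4 and 5 is simple arithmetic; as a sketch (with made-up counts, not the actual numbers of the evaluation):

```python
def coverage(mapped, contained, total):
    """Percentage of a model's elements that are either mapped to the
    other framework or contained within a mapped element."""
    return 100.0 * (mapped + contained) / total

# Illustrative counts only; see Tables 4 and 5 for the real figures.
print(round(coverage(mapped=30, contained=10, total=73), 1))
```

Elements outside this percentage are exactly the ones that require individual attention during an assurance project.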

Mappings and ordering of activities
The general sequence of artefacts and activities is also preserved. For instance, it is possible to determine a correspondence between the main steps in the ISO/IEC 29134 model and specific areas of the GDPR model, as we illustrate in Fig. 7. This figure highlights three main groups of elements in each reference framework which correspond, in order, to the three main steps of Preparing the PIA, Performing the PIA, and Following up the PIA. In addition, these groups appear in the same order in both models, which suggests that they model compatible processes. Similarly, a straightforward analysis of the Smart Grid DPIA template model in Fig. 6 reveals analogous groups, from the initial assessment (Pre-Assessment) to the follow-up of the PIA (Reviewing and Maintenance). In a mapping between reference frameworks, there may be cases where the target standard imposes additional ordering constraints beyond what is already defined in the source standard. While this is not the case when mapping from ISO/IEC 29134 to the GDPR (as the former is, indeed, more explicit in this respect), in other cases that might increase the effort needed to conduct the assurance project for one or more individual reference frameworks in the ecosystem.
The focus of our case study is improving compliance with the target standard by resorting to compliance with a more detailed source standard, so this is not an issue here. As a final comment from the systems engineering perspective, any of the three reference frameworks seems actionable for conducting an assurance engineering project as illustrated in Sect. 2.1, especially when integrated in a modelling framework covering the remaining activities. The fact that a reference framework ecosystem model can be modelled once and reused for several projects makes the availability of these models highly desirable. In this sense, the initially proposed models are sound. "This approach might be useful if the reference framework models are complete and the mappings are correct. Initially they are sound but more industrial validation might enable its refinement." "This approach is a necessary step towards monitoring and visualizing the advancement in the assurance project and, in parallel, in the compliance to a set of related norms."

Legal perspective
This section presents the qualitative evaluation and discussion from a legal perspective. Under such legal lenses, it is possible to highlight three aspects for discussion, namely: justified confidence, best practices, and mapping. On "justified confidence" and "best practices" In Sect. 1, system assurance is defined as "justified confidence (...)". Translating justified confidence into the systems engineering scenario, it means that developers have to put all reasonable measures in place to make sure that, to the best of their knowledge, the system has no known vulnerabilities or flaws.
In strict legal terms, the word "justified" may have multiple meanings: "justified" may mean "technically legitimized" by the state-of-the-art literature, which may or may not agree on the measures being "sufficient" to accomplish the system functionality; or "justified" may also mean "demonstrating such compliance" through sufficiently sound means. Encompassing both meanings, the GDPR's principle of accountability asks the controller to, at once, comply with the GDPR requirements and be able to demonstrate such compliance. This work addresses only problems related to the second interpretation of "justified", leaving the qualitative assessment of the measures to further research. Notably, the reference frameworks will make it possible to provide traceability from the assurance and system assets layer (as explained in Sect. 2.1), but the fact that traceability exists to, for example, a specific security control does not mean that the related security control is the most appropriate one.
Also, in the introduction, the reference to "best practices" to provide a system without errors and vulnerabilities may be correct in engineering terms, but it does not exactly match the idea behind the principle of assurance. In fact, the principle of accountability of Art. 5.2 GDPR, as mentioned in Sect. 1, requires that: 1) the data controller is responsible for complying with the GDPR, and 2) the controller must be able to demonstrate compliance. The controller is not responsible for putting in place all best practices, but only those that are reasonably necessary to address the specific risks of the targeted processing. This logical interpretation of GDPR Art. 35 is more in line with the legislator's intention to build the GDPR with a risk-based approach.
"Modelling and traceability cannot be considered, per se, as a provider of justified confidence." Mapping GDPR to standards To the chagrin of legality-attentive process engineers Europe-wide, the GDPR is not a standard. Its rules are not organized around a process description or a list of detailed requirements, which makes it complicated for engineers and controllers alike to design systems for, and ensure compliance with, the GDPR. However, not only do many GDPR provisions explain rules for lawful processing in a standard-like fashion (e.g., Art. 35) [35], but authoritative scholars also understand data protection as an inherently procedural law [36]. Not unreasonably, data protection law is seen as a list of procedures to follow in order to "legalize" the processing of personal data, which would otherwise constitute a violation of the fundamental right to informational privacy. In these terms, the exercise of mapping GDPR provisions to standards finds its theoretical basis.
The exercise of representing GDPR rules through the reference framework elements presented in this work (RQ1) is a first step in closing the gap between normative provisions and technical specifications. Insofar as the model returned a coverage of around 60% of the activities of both the GDPR and ISO/IEC 29134, it determined a partial match (RQ2). This result is positive for two reasons. First, it confirms the utility of CACM in the modelling of different reference frameworks of a legal and technical nature (RQ1), thus constituting a common language for legal practitioners and engineers to communicate. Second, the manual mapping process, whose quality was examined by GDPR legal compliance experts, returned no alarming flags (RQ2). However, legal expertise is still required to cover the gaps that the partial matching leaves without guidance beyond a justification and description in the mapping model. In fact, partial compliance, in the eyes of the law, is still a form of non-compliance. In addition, a GDPR professional should still double-check the correctness of the full mappings.
"Models and mappings are valuable supporting assets but legal compliance experts are still needed."

General remarks
One relevant aspect considered by both groups of experts (engineering and legal) is that this approach of modelling the privacy-related ecosystem (reference frameworks and equivalence maps) is promising in improving the communication between both points of view. This GDPR interpretation method, and the bridges that can be created with other established technical guidelines, allow the GDPR to become more "operational" for systems engineers. Conversely, legal experts can observe technical evidence that attempts to instantiate legal considerations. This is aligned with the established belief that a shared understanding among different stakeholders is one of the most salient benefits delivered by the use of model-driven engineering.
"The modelling approach enables interdisciplinary work."

Threats to validity
The validity of our results is inherently limited by 1) the choice of a specific metamodel (CACM) to define reference frameworks, and 2) its application to a single information privacy law and two guidelines (GDPR, ISO/IEC 29134, and the EU Smart Grid DPIA template). The former impacts construct validity: we might have obtained different results if we had modelled our regulations using another metamodel defining other types of elements and relationships. We have tried to mitigate this by not restricting ourselves to the contents of CACM, but introducing as well other elements missing there but present in other process-oriented metamodels, when they emerged as salient in the analysis of the GDPR (namely, structural cross-references, as explained in Sect. 2.2). The latter impacts external validity: other information privacy regulations might not be as amenable to being modelled as processes. Nonetheless, these limitations are intrinsic to the goals of our work, that is, applying system assurance (for which CACM is often employed in methods and tools) to information privacy regulations (of which the GDPR is paramount).
Probably the most important threat to validity is internal, and comes from the interpretation of the regulations and standards to model them as reference frameworks, a manual task that might be error-prone and subject to many arguable modelling decisions. We tried to alleviate this by using a systematic approach for modelling the different parts followed by their consolidation, and mainly through continuous iterations of review and agreement with persons external to the modelling tasks, addressing both the engineering and the legal perspectives, as explained in Sect. 4. Despite this, it is likely that different independent teams would end up creating different models. This level of subjectivity is hard to assess, but most probably the essence of the process would be similar, perhaps with slight changes in the level of detail (e.g., the granularity of the activities). We consider this a concern, but it does not mean that the final result will not be useful. For the equivalence map, the concern is similar and we tried to alleviate it with the same mitigation strategies. In summary, we have described a reproducible method grounded on the textual analysis of the GDPR, ISO/IEC 29134, and the EU Smart Grid DPIA template; however, the results of following that method might not be the same because of the inherent subjectivity of interpretation and modelling. Reproducibility, in this case, is not based on being able to reach the same results, but on being able to apply the methodology and to obtain a result that is logically sound per se, i.e., one through which different actors can reach a shared understanding.
Also, legal doctrine is an "empirical-hermeneutical discipline" [37], at whose core lies human interpretation. As mentioned before, interpretation is inherently a subjective process, prone to subjective biases and informational asymmetries. In this work, interpretive biases and asymmetries are mitigated by the methodological expedient of choosing the most recent and authoritative interpretation of legal doctrine and practice, i.e. the one to which European DPAs and courts eventually adhere to build precedents. Although multiple legal experts may disagree on the interpretation of one specific disposition, they would nonetheless agree on which interpretation is the most widely used and accepted. Consequently, relying on minority schools of thought for modelling a reference framework would hinder its adoption. Our approach can still be used, but with the risk that our interpretation might not become the one accepted by the majority; in that case, the model should be adapted when a more established or valid interpretation becomes available.
The idea that the mentioned subjectivity in the interpretation of text or law could be alleviated by entirely removing humans from the loop is interesting from a software engineering perspective, but we consider that it is not yet possible with sufficient confidence. Finally, the validation itself is subject to threats to validity, given that only two experts provided qualitative feedback. However, the fact that each of them belongs to a different field of expertise (engineering and law) helps cover the two targeted perspectives.

Conclusion and future work
Systems' privacy assurance can benefit from models capturing the ecosystem of privacy reference frameworks that exist today. In this work, we use a metamodel and a tool (CACM and OpenCert, both proven useful for modelling safety and security standards and guidelines in previous works) to model the technical and process-oriented ISO/IEC 29134 PIA guidelines, the EU Smart Grid DPIA template, and the GDPR DPIA legal text, as well as the equivalence mappings between the first two and the GDPR regulation. The qualitative evaluation from the system engineers' and legal experts' points of view indicates that the approach is sound and promising for improving communication among stakeholders and for making the GDPR more "operational" from an engineering perspective.
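To make the notion of an equivalence mapping between reference frameworks concrete, the following is a minimal sketch, not the actual CACM metamodel or OpenCert API; all class names, element kinds, and clause labels below are illustrative assumptions. It shows how elements of two frameworks and the correspondences between them could be represented:

```python
# Illustrative sketch (not CACM): reference framework elements plus an
# equivalence map relating elements across two frameworks.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Element:
    framework: str   # e.g. "GDPR" or "ISO/IEC 29134"
    kind: str        # e.g. "Activity", "Artefact", "Role"
    ref: str         # article or clause identifier within the framework
    name: str

@dataclass
class EquivalenceMap:
    pairs: list = field(default_factory=list)

    def map(self, a: Element, b: Element, note: str = ""):
        """Declare that two elements from different frameworks correspond."""
        self.pairs.append((a, b, note))

    def counterparts(self, e: Element):
        """Look up an element's counterparts in the other framework."""
        return [b for (a, b, _) in self.pairs if a == e] + \
               [a for (a, b, _) in self.pairs if b == e]

# Usage: relate a GDPR obligation to the guideline activity realizing it.
dpia = Element("GDPR", "Activity", "Art. 35", "Data protection impact assessment")
pia = Element("ISO/IEC 29134", "Activity", "Clause 6", "Perform the PIA")
emap = EquivalenceMap()
emap.map(dpia, pia, "DPIA obligation realized by the PIA process")
print(emap.counterparts(dpia))  # the ISO/IEC 29134 counterpart of Art. 35
```

A real mapping model would of course carry richer relationship semantics (full vs. partial equivalence, refinement, etc.); the point of the sketch is only the navigable link between the legal and engineering views.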
As further work, we identify two directions.
Extended GDPR coverage In this paper, we have shown how GDPR can be modelled as an activity-oriented process model. Such a model acts as a reference framework, which thus becomes a keystone in the assurance process, as explained in Sect. 1. But the assurance process involves further steps that connect with the evidence generated by the system development process. We will investigate assurance patterns derived from our reference frameworks, and we will evaluate assurance projects in concrete complex and privacy-critical systems using the presented technology.
Application to further regulations Another potential avenue for future work consists in extending the reference models presented herein with those derived from regulations recently issued by other jurisdictions (Nevada Privacy Law, California Consumer Privacy Act, Brazil's General Data Protection Law, etc.), each with their own similarities and differences with regard to GDPR; we might also consider other EU laws dealing with information privacy (e.g., national-level privacy regulations, the ePrivacy Directive, the NIS Directive). Furthermore, the GDPR itself allows for other accompanying legal instruments that can be modelled as extensions to our reference framework, ranging from opening clauses that allow for national derogations in some specific aspects and adequacy decisions that recognize laws from third countries, to quasi-regulations that supplement GDPR with hermeneutics helping in its interpretation (rulings from the Court of Justice of the European Union and guidance from the European Data Protection Board), to coregulations that refine further stipulations, e.g. addressing concrete techniques (codes of conduct, codes of practice, voluntary certification schemes), and self-regulations regarding international transfers (binding corporate policies and standard contractual clauses).
Organizations may well want to add their own in-house conventions and best practices that establish internally enforceable obligations, which can also be modelled as extensions to the reference framework or as mapping models.
Acknowledgements Thanks also to Phu H. Nguyen, Erkuden Rios, and Iker Martinez de Soria for their feedback on this work.
Funding This work has been conducted in the scope of the project PDP4E (Methods and tools for GDPR compliance through Privacy and Data Protection Engineering). This project has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No 787034.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

Appendix A Assurance enforcement in the GDPR: the accountability principle
GDPR prescribes a set of data protection principles (defined in its Article 5 and later developed in the rest of its Chapter 2) that shall be met whenever personal data are processed, e.g. data transparency, accuracy, or minimization. One such principle is that of accountability, which states (Art. 5.2) that "The controller shall be responsible for, and be able to demonstrate compliance with, paragraph 1" (where other data protection principles are defined). That is, compliance with GDPR requires not only that organizations behave according to the law, but also that they are able to demonstrate it so as to be above suspicion.
As a practical application of this principle, GDPR requires that records of written consent from the data subjects be kept by the controller (Rec. 42, Art. 7.1) as evidence that data subjects did provide their consent, and that it was voluntary, informed, etc. Likewise, records of processing activities are required (Art. 30, except for small organizations) and can be requested by supervisory authorities.

System assurance can respond to that need and can hence become pivotal in supporting this accountability principle, as its activities are precisely aimed at providing the evidence and argumentation that support claims of compliance with a given intended specification. Therefore, the reference framework underlying the GDPR legal text (i.e., its prescribed activities, artefacts, etc.) is the basis for privacy-related system assurance. In this work, we focus on the modelling of reference frameworks and of the "bridges" between them; however, as the main motivation for this work, we have in mind leveraging the reference framework model in the context of its associated assurance project. The former is more generic, while the latter is more product- or company-specific.
To provide more background on assurance, it is also useful to note how it helps meet other requirements of GDPR. For instance, available evidence can be leveraged to provide transparent information to the data subjects under different circumstances (Art. 12, Rec. 60, Rec. 85) and hence support the transparency principle (Art. 5.1.a and Rec. 39, Rec. 58, Rec. 78), to claim adherence to codes of conduct (Art. 40), to provide proofs for certifications (Art. 42), and even to provide the notifications required after data breaches (Art. 32, Art. 33). Last, the fact that assurance supports a view of the system status with respect to compliance, and of how that status has been reached, is key for two data protection goals as defined by Hansen [38]: transparency ("all privacy-relevant data processing including the legal, technical, and organizational setting can be understood and reconstructed at any time") and intervenability ("intervention is possible concerning all ongoing or planned privacy-relevant data processing"), which can in turn be mapped to the traditional management disciplines of monitoring and control. Here, such disciplines are applied for the benefit of the data subject, whose rights and freedoms are the target of protection by the management actions.
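As a hedged illustration of how assurance evidence could be organized to serve the accountability principle, the sketch below indexes evidence artefacts by the GDPR provisions they help demonstrate. It is not an actual PDP4E or OpenCert API; the class, method, and file names are illustrative assumptions.

```python
# Illustrative sketch: an evidence registry supporting accountability
# (Art. 5.2) -- each record links an artefact to the GDPR provisions
# it helps demonstrate compliance with.
from collections import defaultdict

class EvidenceRegistry:
    def __init__(self):
        self._by_provision = defaultdict(list)

    def record(self, artefact: str, provisions: list[str]):
        """Register an artefact as evidence for one or more provisions."""
        for p in provisions:
            self._by_provision[p].append(artefact)

    def demonstrate(self, provision: str) -> list[str]:
        """Evidence a controller could present for a given provision."""
        return self._by_provision.get(provision, [])

# Usage: the consent records and processing records mentioned above.
reg = EvidenceRegistry()
reg.record("consent_log_2024.json", ["Art. 7.1", "Rec. 42"])
reg.record("processing_records.xlsx", ["Art. 30"])
print(reg.demonstrate("Art. 7.1"))  # -> ['consent_log_2024.json']
```

Such an index is precisely what makes compliance demonstrable on request by a supervisory authority: the question "what supports Art. 30?" becomes a lookup rather than an ad hoc search.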
Tommaso Crepax is an Early-Stage Researcher at Scuola Superiore Sant'Anna and a Ph.D. student of the National Doctorate in Artificial Intelligence. His background in regulation of and by technology has its academic roots at the Tilburg Institute for Law and Technology, where he graduated cum laude as a Master's student and where he subsequently worked as a lecturer and program coordinator. Before joining Liderlab, he was an associate researcher at the KU Leuven Center of IP and IT law in the privacy engineering cluster. His areas of expertise encompass privacy, cyber-security and data protection of information systems, which he has both studied in theory and applied in practice in EU-funded Horizon 2020 projects. In his research, he loves to work side-by-side with data scientists, computer scientists and information engineers, who over the years have granted him the title of legal engineer.