Integration of Formal Proof into Unified Assurance Cases with Isabelle/SACM

Assurance cases are often required to certify critical systems. The use of formal methods in assurance can improve automation, increase confidence, and overcome errant reasoning. However, assurance cases can never be fully formalised, as the use of formal methods is contingent on models that are validated by informal processes. Consequently, assurance techniques should support both formal and informal artifacts, with explicated inferential links between them. In this paper, we contribute a formal machine-checked interactive language, called Isabelle/SACM, supporting the computer-assisted construction of assurance cases compliant with the OMG Structured Assurance Case Meta-Model. The use of Isabelle/SACM guarantees well-formedness, consistency, and traceability of assurance cases, and allows a tight integration of formal and informal evidence of various provenance. In particular, Isabelle brings a diverse range of automated verification techniques that can provide evidence. To validate our approach, we present a substantial case study based on the Tokeneer secure entry system benchmark. We embed its functional specification into Isabelle, verify its security requirements, and form a modular security case in Isabelle/SACM that combines the heterogeneous artifacts. We thus show that Isabelle is a suitable platform for critical systems assurance.


Introduction
Assurance cases (ACs) are structured arguments, supported by evidence, intended to demonstrate that a system meets its requirements, such as safety or security, when applied in a particular operational context [WKD + 19,Kel98]. They are recommended by several international standards, such as ISO 26262 for automotive applications. An AC consists of a hierarchical decomposition of claims, through appropriate argumentation strategies, into further claims, and eventually supporting evidence. Several AC languages exist, including the Goal Structuring Notation (GSN) [Kel98], Claims, Arguments, and Evidence (CAE) [BB98], and the Structured Assurance Case Metamodel (SACM) [Gro20, WKD + 19], which unifies several notations.
AC creation can be supported by model-based design, which utilises architectural and behavioural models over which requirements can be formulated [DMW + 18]. However, manually constructed ACs remain prone to fallacies and inadequate evidence [GKHP06]. Moreover, although notations such as GSN or CAE have their merits in supporting the management of complex ACs via their hierarchical decomposition and modular representation [GC17], improving the comprehensibility and the maintenance of an assurance argument, these notations have been criticised for ambiguities arising from the existing guidance (e.g. [DP15, DMW + 18]). A proposed solution is formalisation in a machine-checked logic to enable verification of consistency and well-formedness [Rus13,Rus14]. As confirmed by avionics standard DO-178C, the evidence gathering process can also benefit from the rigour of Formal Methods (FMs) [J + 11]. However, it is also the case that (1) ACs are intended primarily for human consumption, and (2) that formal models must be validated informally [HK14]. Consequently, ACs usually combine informal and formal content, and so tools must support this. Moreover, there is a need to integrate several FMs [Pai97], potentially with differing computational paradigms and levels of abstraction [HH98]. So, for several reasons, it is paramount to maintain change impact traceability across these heterogeneous artifacts using FM integration [GFW19], going beyond the possibilities of shallow hyperlinking of engineering data as frequently seen in model-based engineering practice.

Figure 1. Overview of our approach to integrative model-based assurance cases
Vision Our vision, illustrated in Figure 1, is a unified framework for machine-checked ACs with heterogeneous artifacts and integrated FMs. We envisage an assurance backend for a variety of graphical assurance tools [DP18,WKD + 19] that utilise SACM [Gro20] as a unified interchange format, and an array of FM tools provided by our verification platform, Isabelle/UTP [FBC + 20,FZN + 19,FZW16]. Our framework aims to improve existing processes by harnessing formal verification to produce mathematically grounded ACs with guarantees of consistency and adequacy of the evidence. In the context of safety regulation in domains such as intelligent transportation, critical infrastructure, human-robot collaboration, or medical devices, our framework can aid AC evaluation through machine-checking and automated verification [BW19b,FNO + 20].
Contributions A first step in this direction is made by the contributions of this paper, which are: (1) Isabelle/SACM, an implementation of SACM in Isabelle [NPW02], (2) a front-end for Isabelle/SACM called interactive assurance language (IAL), which is an interactive DSL for the definition of machine-checked SACM models, (3) a novel formalisation of Tokeneer [BCJ + 06] in Isabelle/UTP, (4) the verification of the Tokeneer security requirements 1 , and (5) the definition of a modular assurance case capturing the lifecycle artifacts and the claims that Tokeneer meets its security requirements. Our Tokeneer assurance case demonstrates how one can integrate formal artifacts, resulting from the work with Isabelle/UTP (4), and informal artifacts, such as the Tokeneer documentation.
Isabelle provides a sophisticated executable document model for presenting a graph of hyperlinked formal artifacts, such as definitions, theorems, and proofs. It provides automatic and incremental consistency checking, where updates to artifacts trigger rechecking. Such capabilities can support efficient maintenance and evolution of model-based ACs [WKD + 19]. Moreover, the document model allows both formal and informal content [Wen18], and provides access to an array of automated proof tools [WW07,Wen18]. Additionally, Brucker et al. [BACW18,BW19a,BW19b] extend Isabelle with DOF, a framework with a textual language for the embedding of ontologies and meta-models into the Isabelle document model, which we harness to embed SACM. For these reasons, we believe that Isabelle is an ideal platform both for assurance cases and integration of formal methods.
Isabelle/UTP [FBC + 20] employs Unifying Theories of Programming [HH98] (UTP) to provide formal verification facilities for a variety of languages, with paradigms as diverse as concurrency [FCC + 19], realtime [FCWZ18], and hybrid computation [FTCW16,MSF20]. Moreover, verification techniques such as Hoare logic, weakest precondition calculus, and refinement calculus are all available through a variety of proof tactics. This makes Isabelle/UTP an obvious choice for modelling and verification of Tokeneer, and more generally as a platform for integrated FMs based on unifying formal semantics. We believe our novel mechanisation of the Tokeneer specification in a theorem prover is one of the most complete to date, with respect to the original benchmark [C + 08a]. The model includes 60 state variables, 38 top-level operations for user entry, admin, and enrolment procedures, 30 invariants, and several hundred discharged invariant proof obligations. Where possible, we are faithful to the benchmark, using the same names and structure for the system model. With our mechanisation we are able to formally verify three security properties that could only be argued semi-formally in the original documents [C + 08b]. Our work therefore demonstrates how automated proof tools have advanced over the past fifteen years. We also highlight a few invariants missing from the original formal specification, without which we could not verify Tokeneer.
This paper is an extension of a previous conference paper [FNGK19]. We develop a more elaborate modular assurance case for Tokeneer ( §3 and §6), further develop our IAL ( §4), and formalise the Admin operations in the Tokeneer formal model and verify two additional security properties ( §5). We also provide further implementation details and examples throughout, and in particular describe our strategy for converting the Tokeneer Z schemas into Isabelle/UTP. This article is organised as follows. In §2, we outline preliminaries: SACM, Isabelle, and DOF. In §3 we describe the Tokeneer system. In §4, we begin our contributions by describing Isabelle/SACM, which consists of the embedding of SACM into DOF ( §4.1), and IAL ( §4.2). In §5, we model and verify Tokeneer in Isabelle/UTP. In §6, we describe the mechanisation of the Tokeneer AC in the ACME graphical assurance case tool and Isabelle/SACM. In §7, we indicate relationships to previous research. After reflecting on our approach in §8, we conclude in §9.

Preliminaries
In this section, we provide background material on ACs, the SACM standard, the Isabelle components, and Isabelle/UTP, all required to follow our investigations in §4, §5, and §6.

Assurance Cases and SACM
Assurance cases are often presented using a graphical notation like GSN [Kel98] (Figure 2). In this notation, claims are rectangles, which are linked with "supported by" arrows, strategies are parallelograms, and the circles are evidence ("solutions"). The other shapes denote various types of context, which are linked to by the "in context of" arrows. An argument in GSN proceeds from the most abstract claim down, through argumentation strategies and further subgoals, until the claims can be directly supported by evidence. GSN also has a modular extension, where arguments can be encapsulated in modules, and certain elements can be marked as public, meaning they are available to other modules and can be cited using elements like "away goals" and "away context".
SACM is an OMG standard meta-model for ACs [WKD + 19]. It unifies, extends, and refines several predecessor notations, including GSN [Kel98] and CAE [BB98] (Claims, Arguments, and Evidence), and is intended as a definitive reference model. SACM models three crucial concepts: arguments, artifacts, and terminology. An argument is a set of claims, evidence citations, and inferential links between them. Artifacts represent evidence, such as system models, techniques, results, activities, participants, and traceability links. Terminology fixes formal terms for use in claims. Normally, claims are in natural language, but in SACM they can also contain structured expressions, which allows integration of formal languages. Arguments, artifacts, and terminology can all be grouped into a number of packages, which generalise GSN modules.
The argumentation meta-model of SACM is shown in Figure 3. The base class is ArgumentAsset, which groups the argument assets, such as Claims, ArtifactReferences, and AssertedRelationships (which are inferential links). Every asset may contain a MultiLangString that provides a description, potentially in multiple natural and formal languages, and corresponds to contents of the shapes in Figure 2.
AssertedRelationships represent relationships that exist between several assets. They can be of type AssertedContext, which uses an artifact to define context; AssertedEvidence, which evidences a claim; AssertedInference, which describes explicit reasoning from premises to conclusion(s); or AssertedArtifactContext, which documents a dependency between the claims of two artifacts.
Both Claims and AssertedRelationships inherit from Assertion, because in SACM both claims and inferential links are subject to argumentation and refutation. SACM allows six different classes of assertion, via the attribute assertionDeclaration, including axiomatic (needing no further support), assumed, and defeated, where a claim is refuted. An AssertedRelationship can also be flagged as isCounter, where counter evidence for a claim is presented.
For development of graphical assurance cases, we use an Eclipse-based tool called Assurance Case Management Environment (ACME), from which we captured Figure 2. ACME supports the creation and management of assurance cases using notations such as GSN (and in the future CAE), the abstract syntax of which is an extension of the Object Management Group's (OMG) Structured Assurance Case Metamodel (SACM); both are explained in detail in [WKD + 19]. ACME integrates a number of model management tools and frameworks, including Eclipse Epsilon [KPP06], Eclipse Hawk [BK13], and Xtext [Bet16], towards the management of fully model-based assurance cases. Using SACM's full potential, and with the help of model management frameworks, ACME currently supports: 1) fine-grained traceability from an assurance case to its referenced engineering models (defined in mainstream modelling technologies such as UML), down to the level of individual model elements; 2) traceability to formal notations in Isabelle; 3) automated means to validate and verify traced engineering artifacts; 4) use and execution of constrained natural language for model validation; and 5) automated change impact analysis for assurance cases and their engineering artifacts.

Isabelle, Isar, and DOF
Isabelle/HOL is an interactive theorem prover for higher order logic (HOL) [NPW02], based on the generic framework Isar [WW07]. The former provides a functional specification language, and an array of automated proof tools [BBN11]. The latter has an interactive, extensible, and executable document model [Wen18]. Figure 4 illustrates the document model. The first section, for context definition, describes imports of existing theories, and keywords which extend the concrete syntax. The second section is the body, enclosed between begin and end, which is a sequence of commands. The concrete syntax of commands consists of (1) a pre-declared keyword (in blue), such as the command ML, (2) a "semantics area" enclosed between <...>, and (3) optional subkeywords (in green). Commands generate document elements. For example, the command lemma creates a new theorem within the underlying theory context. When a document is edited by removal, addition, or alteration of elements, it is immediately executed and checked by Isabelle, with feedback provided to the frontend. This includes consistency checks for the context and well-formedness checks for the commands. Isabelle is therefore ideal for ACs, which have to be maintainable, well-formed, and consistent. In §4.2 we extend this document model with commands that define our assurance language.
Moreover, informal artifacts in Isabelle theories can be combined with formal artifacts using the command text <...>. It is a processor for markup strings containing a mixture of informal artifacts and hyperlinks to formal artifacts through antiquotations of the form @{aqname ...}. For example, text <The reflexivity theorem @{thm HOL.refl}> mixes natural language with a hyperlink to the formal artifact HOL.refl through the antiquotation @{thm HOL.refl}. This is important since antiquotations are also checked by Isabelle as follows: (1) whether the referenced artifact exists within the underlying theory context; (2) whether the type of the referenced artifact matches the antiquotation's type.
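The two antiquotation checks can be sketched as follows. This is an illustrative Python model of the checking discipline, not Isabelle's implementation; the context contents and function names are hypothetical.

```python
# A hypothetical theory context, mapping artifact names to their kinds.
theory_context = {
    "HOL.refl": "thm",    # a theorem
    "List.map": "const",  # a constant
}

def check_antiquotation(kind, name):
    """Mimic Isabelle's two checks on an antiquotation @{kind name}:
    (1) the referenced artifact must exist in the theory context;
    (2) its kind must match the antiquotation's type."""
    if name not in theory_context:
        return f"error: unknown artifact {name!r}"
    if theory_context[name] != kind:
        return f"error: {name!r} is a {theory_context[name]}, not a {kind}"
    return "ok"

assert check_antiquotation("thm", "HOL.refl") == "ok"          # check (1) and (2) pass
assert check_antiquotation("thm", "List.map").startswith("error")  # wrong kind
assert check_antiquotation("thm", "Nat.plus").startswith("error")  # unknown artifact
```

Any failed check is reported to the user interactively, in the same way as a failed proof.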
A major foundation for our work is Isabelle/DOF [BACW18,BW19a,BW19b], an ontology framework for Isabelle. DOF permits the description of ontologies using Isabelle Ontology Specification Language (IOSL), a language to model document classes, which extends the document model with new structures. We use the command doc_class from IOSL to add new document classes for each of the SACM classes. Instances of DOF classes are not embedded into the HOL logic as datatypes, but sit at the meta-logical level in the document model. This means they can refer to other objects like theorems and definitions, can themselves be referenced using antiquotations, and carry an enriched version of the corresponding Isabelle markup string. One of DOF's targets is formal development of certification documents, making use of Isabelle's proof facilities [BW19b]. In this work, we take their vision forward with our SACM-based assurance case framework.

Isabelle/UTP
Isabelle/UTP [FBC + 20] is a tool for developing formal semantics and verification tools based on Hoare and He's Unifying Theories of Programming [HH98]. Isabelle/UTP contains a number of theories for reasoning about programs built using different computational paradigms, such as concurrent and real-time programming. In this paper, we use the core relational programming model to verify the functional specification of Tokeneer.
Variable mutation in Isabelle/UTP is modelled algebraically using lenses [FGM + 07]. A variable of type V in a state space S is denoted by a lens x : V =⇒ S, with two functions get : S → V and put : S → V → S, that respectively query and update the value of the variable in a given state s : S. This allows us to treat variables as semantic objects, rather than syntactic objects. We can check whether two lenses, x and y, refer to disjoint regions of the state space using independence, written x ⋈ y. We can also check whether an expression e depends on a particular variable x using unrestriction, which is written x ♯ e, and is a semantic encoding of variable freshness. For example, if x ⋈ y (they are different variables), then x ♯ (y + 1), since the valuation of y + 1 does not depend on the value of x. Using these predicates, and the UTP relational program model [HH98], we can express laws about assignments, such as commutativity: (x := e ; y := f) = (y := f ; x := e), provided that x ⋈ y, x ♯ f, and y ♯ e. As we have previously shown [FBC + 20], lenses effectively allow us to semantically characterise variable sets as well (for example a = {x, y, z}) and thus framing properties. As a dual to unrestriction, we also have a used-by predicate on a and e, which states that e uses only those variables mentioned in a.
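The lens structure can be sketched in a few lines of Python. This is an illustration of the get/put signature and the commutativity law over dictionary states, under our own simplified definitions; it is not Isabelle/UTP's encoding, where lenses are far more general.

```python
# A minimal lens over dictionary states: each lens focuses on one field.
class Lens:
    def __init__(self, name):
        self.name = name
    def get(self, s):                 # get : S -> V
        return s[self.name]
    def put(self, s, v):              # put : S -> V -> S (functional update)
        t = dict(s)
        t[self.name] = v
        return t
    def independent(self, other):     # x |><| y: disjoint state regions
        return self.name != other.name

# Assignment x := e as a state transformer, where e is a function of the state.
def assign(lens, e):
    return lambda s: lens.put(s, e(s))

x, y = Lens("x"), Lens("y")
s = {"x": 1, "y": 2}

# Commutativity: (x := 3 ; y := 4) = (y := 4 ; x := 3) for independent x, y.
p1 = assign(y, lambda s: 4)(assign(x, lambda s: 3)(s))
p2 = assign(x, lambda s: 3)(assign(y, lambda s: 4)(s))
assert x.independent(y) and p1 == p2
```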
Rather than using the Z notation [Spi89], we use a variant of Dijkstra's Guarded Command Language (GCL) [Dij75], encoded in Isabelle/UTP, to specify the model's behaviour. Our GCL has the following syntax:

P ::= skip | abort | P ; P | E −→ P | P ⊓ P | V := E | V :[P]

Here, P is a program, E is an expression, and V is a variable. The language provides sequential composition, guarded commands, non-deterministic choice 2 , and assignment. We adopt a frame operator a :[P], which states that P changes only variables in the namespace a [FZW16, FZN + 19]. The namespace is modelled by a lens a : S1 =⇒ S2, which shows how to embed the inner state-space S1 into the outer state-space S2. This enables modular reasoning about the Tokeneer Identification Station (TIS) internal and real-world states, which is a further novelty of our work. We give both a weakest precondition (wp) and weakest liberal precondition (wlp) semantics to our GCL. Technically, each operator is denoted as a relational predicate in UTP, and the following laws are theorems of these definitions [HH98,CW06].
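As a sketch, consistent with the standard presentations [Dij75,HH98], the wp laws for these operators take the following form (the frame-operator laws, which carry side conditions, are discussed separately below):

```latex
\begin{align*}
  \mathbf{skip}\ \mathrm{wp}\ b &= b \\
  \mathbf{abort}\ \mathrm{wp}\ b &= \mathit{false} \\
  (P \mathbin{;} Q)\ \mathrm{wp}\ b &= P\ \mathrm{wp}\ (Q\ \mathrm{wp}\ b) \\
  (E \longrightarrow P)\ \mathrm{wp}\ b &= E \wedge (P\ \mathrm{wp}\ b) \\
  (P \sqcap Q)\ \mathrm{wp}\ b &= (P\ \mathrm{wp}\ b) \wedge (Q\ \mathrm{wp}\ b) \\
  (x := e)\ \mathrm{wp}\ b &= b[e/x]
\end{align*}
```

Non-deterministic choice is demonic: both branches must establish the postcondition.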
With these equations, we can calculate the weakest precondition of any program composed of these operators. The wlp semantics is almost the same for each operator, except for the following equations: abort wlp b = true, and (E −→ P) wlp b = (E ⇒ P wlp b). Most of the wp and wlp laws are standard [Dij75], the exception being the laws for the frame operator. These make use of state space coercions [FB20], P ↑a and P ↓a, which respectively grow and shrink the state space of P using a. This, for example, means that the types of variables and quantifiers in P are type coerced. The first frame law has the proviso a ♯ b, meaning that the postcondition b does not depend on any variables in the frame a. Consequently, the weakest precondition is essentially b, but we also need to conjoin the domain of P. Since P operates on the inner state space, we need to grow its state space using coercions. The second frame law, conversely, has that b depends only on the variables in a. Consequently, the weakest precondition is derived directly from P, but with suitable state space coercions applied. We can use the wlp calculus to verify Hoare triples, using the well-known theorem {b} P {c} ⟺ (b ⇒ P wlp c) [FBC + 20]. In Isabelle/UTP we have developed a tactic, hoare_wlp_auto, that utilises this theorem, calculates the precondition using Theorem 2.1, and applies the relational calculus tactic rel_auto [FBC + 20] to try to discharge the resulting verification condition.
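The calculational style can be illustrated with a small semantic model. The sketch below, under our own simplified assumptions (finite states, terminating programs, so wp and wlp coincide), represents programs as functions from a state to the set of possible successor states, and verifies a Hoare triple by checking the precondition entails the weakest precondition. It is not Isabelle/UTP's relational encoding.

```python
# Programs: functions from a state (a tuple of variable values) to the
# set of possible successor states.
def skip(s):
    return {s}

def assign(i, f):
    # v := f(s), where i indexes variable v in the state tuple
    return lambda s: {s[:i] + (f(s),) + s[i+1:]}

def seq(P, Q):                       # P ; Q
    return lambda s: {t for u in P(s) for t in Q(u)}

def choice(P, Q):                    # P |~| Q (demonic choice)
    return lambda s: P(s) | Q(s)

def wp(P, b):
    # wp(P, b) holds in s when every successor of s satisfies b
    return lambda s: all(b(t) for t in P(s))

def hoare(pre, P, post, states):
    # {pre} P {post} holds iff pre implies wp(P, post) at every state
    return all((not pre(s)) or wp(P, post)(s) for s in states)

# Verify {x < 5} x := x + 1 ; y := x {x < 6 and y < 6} over states (x, y).
states = [(x, y) for x in range(10) for y in range(10)]
prog = seq(assign(0, lambda s: s[0] + 1), assign(1, lambda s: s[0]))
assert hoare(lambda s: s[0] < 5, prog, lambda s: s[0] < 6 and s[1] < 6, states)
```

hoare_wlp_auto plays the role of the final assertion here, but symbolically: it computes the precondition as a predicate and discharges the entailment by proof.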

Case Study: Tokeneer
To demonstrate our approach, we use the Tokeneer Identification Station (TIS) 3 illustrated in Figure 5, a system that guards entry to a secure enclave. The pioneering work on the TIS assurance was carried out by Praxis High Integrity Systems and SPRE Inc. [BCJ + 06]. Barnes et al. performed security analysis, definition of a security target, formal functional specification using Z, refinement to a formal design, implementation in SPARK, and verification of the security properties against the Z specification ( Figure 5b). After independent assessment, Common Criteria (CC) Evaluation Assurance Level (EAL) 5 was achieved. Therefore, Tokeneer can be seen as a successful example of using FMs to assure a system against CC. Though now more than fifteen years old, it remains an important benchmark for FMs and other assurance techniques.
As indicated in Figure 5a, the physical infrastructure consists of a door, fingerprint reader, display, and card (token) reader. The main function is to check the credentials on a presented token, read a fingerprint if necessary, and then either unlatch the door, or deny entry. Entry is permitted when the token holds at least three data items: (1) a user identity (ID) certificate, (2) a privilege certificate, with a clearance level, and (3) an identification and authentication (I&A) certificate, which assigns a fingerprint template. When the user first presents their token, the three certificates are read and cross-checked. If the token is valid, then a fingerprint is taken, which, if validated against the I&A certificate, allows the door to be unlocked once the token is removed. An optional authorisation certificate is written upon successful authentication, which allows the fingerprint check to be skipped.
The TIS has a variety of other functions related to its administration. Before use, a TIS must be enrolled, meaning it is loaded with a public key chain and certificate, which are needed to check token certificates. Moreover, the TIS stores audit data which can be used to check previously occurred entries. The TIS therefore also has a keyboard, floppy drive, and screen to configure it. Administrators are granted access to these functions. The TIS also has an alarm which will sound if the door is left open for too long. The TIS must satisfy six Security Functional Requirements (SFRs):

SFR1 If the latch is unlocked, then the TIS must possess either a User token or an Admin token. The User token must either have a valid authorisation certificate, or valid ID, Privilege, and I&A certificates, together with a template that allowed the User's fingerprint to be successfully validated. Or, if the User token does not meet this, the Admin token must have a valid authorisation certificate, with the role "guard".

SFR2 If the latch is unlocked automatically by the TIS, then the current time must be close to being within the allowed entry period defined for the User requesting access.

SFR3 An alarm will be raised whenever the door/latch is insecure.

SFR4 No audit data is lost without an audit alarm being raised.

SFR5 The presence of an audit record of one type will always be preceded by certain other audit records.

SFR6 The configuration data will be changed, or information written to the floppy, only if there is an Admin person logged on to the TIS.
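SFR1 can be read as a disjunctive predicate over the tokens presented to the TIS. The following sketch makes that reading concrete; the field names are illustrative inventions, not identifiers from the Tokeneer specification.

```python
# Hypothetical encoding of SFR1's unlock conditions over token records.
def user_token_ok(tok, fingerprint_matches):
    """A User token permits entry if it has a valid authorisation
    certificate, or valid ID, Privilege, and I&A certificates together
    with a successfully validated fingerprint."""
    if tok is None:
        return False
    if tok.get("auth_cert_valid"):
        return True
    return (tok.get("id_cert_valid") and tok.get("priv_cert_valid")
            and tok.get("ia_cert_valid") and fingerprint_matches)

def admin_token_ok(tok):
    """An Admin token permits entry if it has a valid authorisation
    certificate with the role "guard"."""
    return (tok is not None and tok.get("auth_cert_valid")
            and tok.get("role") == "guard")

def sfr1_allows_unlock(user_tok, admin_tok, fingerprint_matches):
    return (user_token_ok(user_tok, fingerprint_matches)
            or admin_token_ok(admin_tok))

assert sfr1_allows_unlock({"auth_cert_valid": True}, None, False)
assert not sfr1_allows_unlock({"id_cert_valid": True}, None, False)
assert sfr1_allows_unlock(None, {"auth_cert_valid": True, "role": "guard"}, False)
```

In §5, the corresponding property is stated as an invariant over the formal TIS state and discharged by proof, rather than tested on examples.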
Our objective is to (i) construct a machine-checked assurance case that argues that the TIS fulfils the security properties SFR1, part of SFR2, SFR3, and SFR6, and (ii) integrate evidential artifacts from the mechanised model of the TIS behaviour in Isabelle/UTP into this assurance case. For these SFRs, our approach re-enacts the green parts in Figure 5b. In particular, we focus on verifying the functional formal specification against the security properties and on checking well-formedness of the functional specification. We envisage the modular assurance case [Kel98,DP15] for Tokeneer illustrated in Figure 6. Here, we have modelled the main documents produced during the development process as assurance case modules using modular GSN. The numbers correspond to the document codes given in the Tokeneer archive 4 . Each of the package symbols represents a collection of claims, arguments, and other lifecycle artifacts; for example, 40.4 Security Properties provides a formalisation of some of the six SFRs. Certain artifacts are marked public, meaning they can be used by other modules, and some are private. The arrows between the modules indicate dependencies; for example, the formal specification is developed both in the context of the system requirements and the security properties. In this paper, we focus on the formalisation of 41.2 Functional Specification, and the argument, in the module TIS SFRs, that the SFRs are satisfied. The assurance arguments and artifacts will be embedded into Isabelle/SACM, which we develop in the next section.

Isabelle/SACM
In this section we encode SACM as a DOF ontology ( §4.1), and use it to provide an interactive machine-checked AC language ( §4.2). Our embedding implements ACs as meta-logical entities in Isabelle, that is, elements of the document model, rather than as formal elements embedded in the HOL logic, as this would prevent the expression of informal reasoning and explanation. Therefore, antiquotations to formal artifacts can be freely mixed with natural language and other informal artifacts.

Modelling: Embedding SACM in Isabelle
We embed the SACM meta-model in Isabelle using IOSL, and we focus on modelling ArgumentAsset 5 and its child classes from Figure 3, as these are the most relevant classes for the TIS assurance argument that we develop in §6. The class ArgumentAsset has the following textual model:

doc_class ArgumentAsset = ArgumentationElement +
  content_assoc :: MultiLangString

Here, doc_class defines a new class, and automatically generates an antiquotation type, @{ArgumentAsset <...>}, which can be used to refer to entities of this type. ArgumentationElement is a class which ArgumentAsset inherits from, but is not discussed further. content_assoc models the content association in Figure 3. To model MultiLangString in Isabelle/SACM, we use DOF's markup string. Thus, the usage of antiquotations is allowed for artifacts with the type MultiLangString. ArgumentAsset has three subclasses: (1) Assertion, which is a unified type for claims and their relationships; (2) ArgumentReasoning, which is used to explicate the argumentation strategy being employed; and (3) ArtifactReference, which evidences a claim with an artifact. Since DOF extends the Isabelle/HOL document model, we can use the latter's types, such as sets and enumerations (algebraic datatypes), in modelling SACM classes, as shown below:

datatype assertionDeclarations_t =
  Asserted | Axiomatic | Defeated | Assumed | NeedsSupport | AsCited

doc_class Assertion = ArgumentAsset +
  assertionDeclaration :: assertionDeclarations_t

doc_class Claim = Assertion +
  metaClaim :: "Assertion set" <= "{}"

doc_class ArgumentReasoning = ArgumentAsset +
  structure_assoc :: "ArgumentPackage option"

doc_class ArtifactReference = ArgumentAsset +
  referencedArtifactElement_assoc :: "ArtifactElement set"

Here, datatype defines an algebraic datatype, assertionDeclarations_t is an enumeration, set is the set type, and option is the optional type. The attribute assertionDeclaration is of type assertionDeclarations_t, which specifies the status of instances of type Assertion.
Examples of Assertions in SACM are claims, justifications, and both kinds of arrows in Figure 2. A Claim is an assertion, extended with the metaClaim association. The attribute structure_assoc, in class ArgumentReasoning, is an association to the class ArgumentPackage, which is not discussed here. Finally, the attribute referencedArtifactElement_assoc, from class ArtifactReference, is an association to ArtifactElements from the ArtifactPackage, allowing instances of type ArgumentAsset to be supported by evidential artifacts. The class Claim in Figure 3 inherits from the class Assertion the attributes gid, content_assoc, and assertionDeclaration of type assertionDeclarations_t. The other child class of Assertion is AssertedRelationship, which models the relationships between instances of type ArgumentAsset, such as the "supported by" and "in context of" arrows of Figure 2. isCounter specifies whether the target of the relation is supported or refuted by the source, and reasoning_assoc is an association to ArgumentReasoning, which models GSN strategies in SACM. The attributes source and target, both of type ArgumentAsset, specify the source and target for the relation. Rather than placing them directly in AssertedRelationship we put them in the concrete subclasses, as this means they can be specialised to enforce OCL constraints in the reference meta-model [Gro20]. The various kinds of relationship classes, such as AssertedInference and AssertedEvidence, are then created as subclasses. An AssertedInference can only connect assertions, and an AssertedEvidence can only connect an evidential artifact to an assertion. These constraints are enforced by DOF when model instances are created.
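The type constraints on the relationship subclasses can be sketched as runtime checks over a toy class hierarchy. This hypothetical Python sketch mirrors the discipline DOF enforces; it is not the DOF implementation.

```python
# Toy mirror of the SACM class hierarchy from Figure 3.
class ArgumentAsset: ...
class Assertion(ArgumentAsset): ...
class Claim(Assertion): ...
class ArtifactReference(ArgumentAsset): ...

def make_inference(sources, targets):
    """AssertedInference may only connect Assertions."""
    if not all(isinstance(a, Assertion) for a in sources + targets):
        raise TypeError("AssertedInference may only connect Assertions")
    return ("inference", sources, targets)

def make_evidence(sources, targets):
    """AssertedEvidence connects an evidential artifact to an Assertion."""
    if not all(isinstance(a, ArtifactReference) for a in sources):
        raise TypeError("AssertedEvidence sources must be ArtifactReferences")
    if not all(isinstance(a, Assertion) for a in targets):
        raise TypeError("AssertedEvidence targets must be Assertions")
    return ("evidence", sources, targets)

c1, c2 = Claim(), Claim()
ev = ArtifactReference()
make_inference([c2], [c1])       # ok: a claim supports a claim
make_evidence([ev], [c2])        # ok: an artifact evidences a claim
try:
    make_inference([ev], [c1])   # rejected: an artifact is not an Assertion
    assert False, "ill-typed relationship was accepted"
except TypeError:
    pass
```

In Isabelle/SACM these checks happen statically, as the document is processed, rather than at run time.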

Interactive Assurance Language
Interactive Assurance Language (IAL) is our assurance language with a concrete syntax consisting of various Isabelle commands that extend the document model in Figure 4. Each command performs a number of checks: (1) standard Isabelle checks ( §2); (2) OCL-style constraints imposed on the attributes by SACM (provided by DOF); (3) well-formedness checks against the meta-model, e.g. that instances comply with the type restrictions imposed by the SACM datatypes. IAL instantiates doc_classes from §4.1 to create SACM model elements in Isabelle; for example, the command Claim creates a model element of the class Claim. Attributes and associations of a class have a concrete syntax represented by an Isabelle (green) subcommand. The grammar of the IAL commands for creating argumentation elements is shown below.

<AssertDecl>   := asserted | axiomatic | assumed | defeated | needsSupport
<ClaimComm>    := Claim <gid> isAbstract? isCitation? (metaClaims <gid>*)? <AssertDecl>? <Description>
<InferenceComm> := Inference <gid> <AssertDecl> (src <gid>*) (tgt <gid>*) <Description>
<ContextComm>  := Context <gid> <AssertDecl> (src <gid>*) (tgt <gid>*) <Description>
<EvidenceComm> := Evidence <gid> <AssertDecl> (src <gid>*) (tgt <gid>*) <Description>

Claim creates a model element of type Claim with an identifier (gid), and a description contained in a MultiLangString. The antiquotation @{Claim <<gid>>} can be used to reference the created model element. The subcommands isAbstract, isCitation, metaClaims, and <AssertDecl> are optional, with default values being False, False, {} and asserted, respectively. The metaClaims keyword allows us to link a claim to assertions about this claim, such as the level of confidence in it. Inference creates an inference between several model elements of type ArgumentAsset. It has subcommands src and tgt that are both lists of antiquotations pointing to ArgumentAssets.
The use of antiquotations to reference the instances ensures that Isabelle will perform the checks explained in §2. Context similarly asserts that an instance should be treated as context for another, and Evidence associates evidence with a claim. Model elements created by IAL are semi-formal, since they can contain both informal content and references to machine-checked formal content. With these commands, IAL can be used to represent a GSN diagram, as illustrated in Figure 7. The claims C1-C4 are encoded using the Claim command. Claim C1 is supported by C2 via the inference I1, which represents the "supported-by" arrow in the GSN diagram on the left, and uses antiquotations to refer to the two claims. An artifact called Hazard_Log is introduced as context for C2 using the Context command. A further evidence artifact FV1 is used to support claim C3, using the Evidence command. The final claim C4 is left undeveloped, indicated by the needsSupport keyword. Figure 8 shows the interactive nature of IAL, and some of the error types. In (8a), the Inference command expects a source followed by a target element, but the latter is missing, and so IAL raises the error message at the bottom. The exclamation marks to the left indicate where the error originates. In the jEdit interface, these errors are raised interactively whilst the user is typing. Moreover, this kind of check ensures that the model elements produced conform to the SACM reference meta-model.
In (8b), the target is specified (Claim_A), but it refers to a claim that does not exist (hence the blue colour), and so IAL again raises an error message. In (8c), an element called Claim_A exists, but it is of the wrong type. Claim_A is an artifact, which violates the OCL constraints of the SACM standard [Gro20], and so DOF raises an ontological error. Finally, (8d) shows the cascading effect of errors: Claim_B does not exist, the element Rel_A fails to process, and consequently any attempt to reference it will also fail. This kind of cascading can also be used to detect proof failures following an update to a model and failed verification.
In addition to argumentation commands, we have also implemented several commands for creating different kinds of artifacts. With the exception of Requirement, these artifact classes are adopted from the SACM standard [WKD + 19]. They allow us to model the various artifacts created during the development and assurance lifecycle, and the relationships between them, for the purposes of traceability. The Artifact command represents a unit of data produced during the lifecycle, such as a specification or verification results. It can be annotated with a version and a creation date. The Requirement command can be used to represent requirements, a specialised form of Artifact. The Resource command can be used to model a link to an external resource, such as a standard or code base, which is uniquely represented by a URI. The Activity command models an activity or process, with a start time and end time, and Event similarly represents a timed and dated event.
A Participant models an actor that takes part in the lifecycle, such as a developer, and Technique models a technique, such as a modelling language or formal method, that is applied in the creation of artifacts. Finally, ArtifactRelation allows us to relate two artifacts.
An example using the artifact commands is shown in Figure 9, which further elaborates the verification result in Figure 7. The formal verification result FV1 is an artifact, with version 1, that points to the Isabelle theorem vc1. The result was created during a verification activity, VACT1, as shown using the artifact relation AR1. Isabelle was used to perform the proof, which is modelled using a Resource that links to the Isabelle website. The verification activity was led by a proof engineer, Anne Other, who is modelled as a Participant, and linked to the verification activity by a further artifact relation. The specific technique used for the proof was the Isabelle simplifier, which is modelled as a Technique, and contains a link to the proof method simp.
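The provenance model of Figure 9 could be rendered roughly as follows (a sketch only: the subcommand names version, startTime, endTime, and uri, as well as the timestamps, are assumptions based on the prose):

```isabelle
(* Sketch only: subcommand names and timestamps are assumed, not the exact mechanisation. *)
Artifact FV1 version ‹1› ‹Verification result, pointing to theorem @{thm vc1}.›
Activity VACT1 startTime ‹...› endTime ‹...› ‹Formal verification of the TIS model.›
ArtifactRelation AR1 src @{Activity VACT1} tgt @{Artifact FV1} ‹FV1 was produced during VACT1.›
Resource Isabelle uri ‹https://isabelle.in.tum.de› ‹The Isabelle proof assistant.›
Participant Anne_Other ‹Proof engineer leading the verification activity.›
Technique Simplifier ‹The Isabelle simplifier, proof method @{method simp}.›
```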
We have now developed Isabelle/SACM and our IAL. In the next section, we consider the modelling and verification of the Tokeneer system.

Modelling and Verification of Tokeneer
In this section we present a novel mechanisation of Tokeneer in Isabelle/UTP [FZW16, FZN + 19] to provide evidence for the AC. This model encodes the formal functional specification (41 2) in the modular assurance case in Figure 6. In [C + 08b], the satisfaction of the SFRs is argued semi-formally using the functional specification, but here we provide a formal proof. We focus on the verification of three of the requirements: SFR1 (the most challenging of the six), SFR3, and SFR6, and describe the necessary model elements.

Modelling and Mechanisation
The TIS functional specification [C + 08a] describes an elaborate state space and a collection of relational operations. The state is bipartite, consisting of (1) the digital state of the TIS and (2) the monitored and controlled variables shared with the real world. The TIS monitors the time, enclave door, fingerprint reader, token reader, and several peripherals. It controls the door latch, an alarm, a display, and a screen.
The specification describes a complex state transition system, with around 50 operations for enrolling the station, performing various administrative operations, such as archiving log files and updating the configuration file, and the user entry operations. The main user entry operations are illustrated in Figure 10  , where each transition corresponds to an operation. Following enrolment, the TIS becomes quiescent (awaiting interaction). ReadUserToken triggers if the token is presented, and reads its contents. Assuming a valid token, the TIS determines whether a fingerprint is necessary, and then triggers either BioCheckRequired or BioCheckNotRequired. If required, the TIS then reads a fingerprint (ReadFingerOK), validates it (ValidateFingerOK), and finally writes an authorisation certificate to the token (WriteUserTokenOK). If the access credentials are available (waitingEntry), then a final check is performed (EntryOK), and once the user removes their token (waitingRemoveTokenSuccess), the door is unlocked (UnlockDoor).
We mechanise the TIS using hierarchical state space types, with invariants adapted from the Z specification [C + 08a]. We define the operations using GCL [Dij75] rather than the Z schemas directly, to enable syntax-directed reasoning. The syntax of Z [Spi89], though maximally flexible, does not easily lend itself to such reasoning, since every operation schema contains a set of conjoined predicates which must be considered in turn. In contrast, as illustrated in §2.3, it is straightforward to calculate the weakest precondition of a GCL program. Within these constraints, we have endeavoured to remain faithful to the functional specification by representing each of the state and operation schemas and using the same naming and overall structure. The use of GCL means that the model is also much closer to a program, and consequently refinement to code should be straightforward. Moreover, since GCL has a denotational semantics in UTP's relational calculus [HH98], it may be possible to prove equivalence with the corresponding Z operations.
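To recall why GCL admits syntax-directed reasoning, the standard weakest precondition laws can be stated as follows (given here for orientation; we assume the aborting interpretation of a violated guard, consistent with the treatment of guards later in this section):

```latex
\begin{align*}
  wp(x := e,\; Q)            &= Q[e/x] \\
  wp(P \mathbin{;} Q,\; R)   &= wp(P,\; wp(Q, R)) \\
  wp(b \longrightarrow P,\; Q) &= b \wedge wp(P, Q) \\
  wp(P \sqcap Q,\; R)        &= wp(P, R) \wedge wp(Q, R)
\end{align*}
```

Each law decomposes the precondition calculation over one syntactic construct, so a wp calculation for a whole operation proceeds mechanically over its structure.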

State Space
We first describe the state space of the TIS state machine. A collection of algebraic data types characterises the state of system elements, including the door, latch, and token. The type α option represents an optional value that can be either undefined, None, or defined, Some x for x : α. The function the : α option → α allows us to extract the value from a defined option. We define state types for the TIS state, controlled variables, monitored variables, real world, and the entire system, respectively. The controlled variables include the physical latch, the alarm, the display, and the screen. The monitored variables correspond to time (now), the door (door), the fingerprint reader (finger), the tokens, and the peripherals. RealWorld combines the physical variables, and SystemState composes the physical world (rw) and the TIS (tis).
Variable currentUserToken represents the last token presented to the TIS, and userTokenPresence indicates whether a token is currently present. The variable status is used to record the state the TIS is in, and can take the values indicated in the state bubbles of Figure 10. Variable issuerKey is a partial function representing the public key chain, which is needed to authorise user entry. Variables rolePresent, availableOps, and currentAdminOp are used to represent the presence of an Admin, the available operations for this Admin, and the current operation being executed.
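In Isabelle/UTP terms, the TIS state type can be pictured as an alphabet declaration of roughly the following shape (the field types here are plausible guesses for illustration, not the exact mechanisation):

```isabelle
alphabet IDStation =
  currentUserToken  :: "TOKENTRY"           (* last token presented to the TIS *)
  userTokenPresence :: "PRESENCE"           (* is a token currently present? *)
  status            :: "STATUS"             (* the state bubbles of Figure 10 *)
  issuerKey         :: "ISSUER ⇀ KEYPART"   (* partial function: public key chain *)
  rolePresent       :: "PRIVILEGE option"   (* logged-in Admin role, if any *)
  availableOps      :: "ADMINOP set"        (* operations available to that Admin *)
  currentAdminOp    :: "ADMINOP option"     (* Admin operation currently executing *)
```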
In addition to the state types, we also encode a number of predicates that represent the invariants of seven Z state schemas. These effectively encode low-level well-formedness constraints for the types; the higher level invariants are considered in §5.4. The predicate representing the invariants associated with the Admin variables is shown below.

Admin
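Based on the description that follows, the predicate has approximately the following shape (a sketch: the role names follow the Tokeneer documentation, while the per-role operation sets guardOps, auditOps, and officerOps are placeholder names):

```latex
\begin{align*}
Admin \;\triangleq\;\;
  & (rolePresent \neq None \Rightarrow the(rolePresent) \in \{guard, auditManager, securityOfficer\}) \\
  {}\wedge{} & (rolePresent = None \Rightarrow availableOps = \emptyset) \\
  {}\wedge{} & (the(rolePresent) = guard \Rightarrow availableOps = guardOps) \\
  {}\wedge{} & (the(rolePresent) = auditManager \Rightarrow availableOps = auditOps) \\
  {}\wedge{} & (the(rolePresent) = securityOfficer \Rightarrow availableOps = officerOps) \\
  {}\wedge{} & (currentAdminOp \neq None \Rightarrow the(currentAdminOp) \in availableOps \wedge rolePresent \neq None)
\end{align*}
```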
This predicate closely corresponds to the Admin schema in the functional specification [C + 08a, page 22]. It states, firstly, that if a role is present, it must be one of the three Admin roles. Conversely, if no roles are present then no Admin operations are available to be executed. The next three implications assign possible operations to the given Admin roles. The final predicate states that if an Admin operation is being executed, then it must be one of the available operations and there must be a role present. We collect the seven well-formedness predicates in TIS-wf, as defined below.

TIS-wf ≙ DoorLatchAlarm ∧ ⋯ (the conjunction of the seven well-formedness predicates; full definition elided)
The verification of the TIS SFRs depends on these state predicates being invariant for all the operations.

Operations
We now specify a selection of the operations over IDStation: Definition 5.4 (User Entry Operations). Each operation is guarded by execution conditions and consists of several assignments. BioCheckRequired requires that the current state is gotUserToken, the user token is present, and sufficient for entry (UserTokenOK), but there is no authorisation certificate (¬UserTokenWithOKAuthCert). The latter two predicates essentially require that (1) the three certificates can be verified against the public key store, and (2) additionally there is a valid authorisation certificate present. We give the definition of UserTokenOK below.

UserTokenOK
It requires that currentUserToken contains a token, which is current (CurrentToken), and has valid ID, privilege, and I&A certificates. The definitions of the omitted predicates can be found elsewhere [C + 08a]. Assuming these preconditions hold, operation BioCheckRequired updates the state to waitingFinger and the display with an instruction to provide a fingerprint. ReadFingerOK requires that the state is waitingFinger, and checks whether both finger and user token are present. If they are, then the state switches to gotFinger, and the display is updated to wait. UnlockDoorOK requires that the current state is waitingRemoveTokenSuccess, and the token has been removed. It unlocks the door, using the auxiliary operation UnlockDoor, returns the status to quiescent, and updates the display. UnlockDoor both unlocks the latch, and also updates two timeout variables, latchTimeout and alarmTimeout. The former is used to close the latch after a certain period, and the latter to sound an alarm if the door is left open.
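As an illustration of the GCL style described above, BioCheckRequired can be sketched as follows (the display variable name currentDisplay and the message constant insertFinger are assumptions for illustration):

```latex
\begin{align*}
BioCheckRequired \;\triangleq\;\;
  & (status = gotUserToken \wedge userTokenPresence = present \\
  & \quad {}\wedge UserTokenOK \wedge \neg UserTokenWithOKAuthCert) \\
  & \longrightarrow status := waitingFinger \mathbin{;} currentDisplay := insertFinger
\end{align*}
```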
These operations act only on the TIS state space. During their execution, monitored variables can also change, to reflect real-world updates. Mostly these changes are arbitrary, with the exception that time must increase monotonically. We therefore promote the operations to SystemState with the following schema. In Z, this functionality is provided by the schema UserEntryContext [C + 08a], from which we derive the name UEC. It promotes Op to act on tis, and composes this with a relational predicate that constrains the real-world variables (rw). The behaviour of all monitored variables other than now is arbitrary, and all controlled variables are unchanged. This separation enables modular reasoning, since we can promote invariants of the TIS to any real-world context using Theorem 2.1 and the following Hoare logic theorem. We have employed a pattern for conversion from Z to GCL. Every conditional predicate, for example status = quiescent, becomes a guard in Definition 5.4. Every primed variable equation, such as status′ = gotUserToken, becomes an assignment, and all of the resulting commands are sequentially composed. Nevertheless, we preserve the non-determinism of the original model, and these assignments are equivalent to primed variable equations, since in UTP an assignment x := e denotes the relation x′ = e ∧ u′ = u, where u stands for the remaining variables. Consequently, the operations could equally be expressed as relational expressions. Using UEC we promote each operation, for example TISReadToken ≙ UEC(ReadToken), to achieve the same effect as including UserEntryContext.
In Z, invariants of the state are imposed through the inclusion of state schemas, such as UserToken. Here, we do not impose these but we will prove that each operation preserves each invariant in Section 5.4. This sometimes requires that we add extra assignments to satisfy the invariant, as we illustrate below.
We next define some of the key admin operations, which are necessary to prove the security properties.

Definition 5.5 (Admin Operations).
OverrideDoorLockOK allows the door to be unlocked when an Admin has already logged in who can execute the overrideLock command, that is, an Admin with the role guard. If the enclave is awaiting an Admin command, an Admin token is present, and the Admin gives the overrideLock command, then the door is unlocked and the enclave returns to awaiting another Admin command. FinishUpdateConfigOK is the second part of a two-stage process for updating the configuration file. The first step checks whether a configuration file floppy has been inserted. In this second step, if the command updateConfigData has been selected and a valid floppy has been inserted, then the config is updated, displayed on the screen, and the enclave again returns to the main menu. Finally, ShutdownOK is used to shut down the TIS. If an Admin is logged in, selects the shutdownOp command, and the door is closed, then the operation locks the door, logs the Admin out, blanks the screen and display, and sets the status to shutdown. The auxiliary operation AdminLogout has one more assignment than the corresponding Z schema [C + 08a]. We omit several operations, though these have all been mechanised. In each iteration of the state machine, we non-deterministically select an enabled operation and execute it. We also update the controlled variables, which is done by composition with the following relational update operation.

Formal Verification
In this section, we verify three SFRs of the formal model using Isabelle/UTP. We first formalise the TIS state invariants necessary to prove the SFRs: Definition 5.6 (TIS State Invariants Selection). (Figure 11. Verification of Tokeneer Invariants in Isabelle/UTP.) Inv1 states that whenever the TIS is in a state beyond gotUserToken, then either a valid authorisation certificate is present, or else the user token is valid. It corresponds to the first invariant in the IDStation schema [C + 08a, page 26]. However, we need to add an extra state, updateTokenSuccess, and strengthen the consequent. The consequent originally only requires that there is a token with a valid authorisation certificate, which may not be the case if a fingerprint has not yet been taken. Inv2 states that whenever the TIS is in state waitingEntry or waitingRemoveTokenSuccess, then either an authorisation certificate or a valid fingerprint is present. Inv2 is not present at all in [C + 08a], but we found it necessary to satisfy SFR1, specifically to ensure that a valid fingerprint is present. That certain invariants are missing, or too weak, is acknowledged in the TIS Security Properties document [C + 08b, page 11], but this does not invalidate the functional specification; it just makes it tricky to formally verify the SFRs. Inv3 states that whenever an Admin role is present, a valid Admin token is also present (AdminTokenOK), so that it is not necessary to explicitly check this in each Admin operation. Similar to Inv1, it is equivalent to the second IDStation schema invariant [C + 08a, page 26], but we again needed to strengthen the consequent. Inv4 states that if the Admin operation shutdownOp or overrideLock is selected, then the TIS must have an assigned name (also present in the key store), and hence it must already be enrolled. Finally, Inv5 states that if an Admin token and Admin role are both present, then the role must match the one contained on the admin token.
This invariant does not seem to be present at all in [C + 08a], but we believe it is certainly necessary to prove SFR1. We elide the additional five invariants that deal with administrators, the alarm, and audit data [C + 08a].
As before, and differently to [C + 08a], which imposes the invariants by construction, we prove that each operation preserves the invariants using Hoare logic, similar to [RBC16]: Theorem 5.2 (TIS Operation Invariants).

• {TIS-inv} TISUserEntryOp {TIS-inv}
• {TIS-inv} TISAdminOp {TIS-inv}
This theorem shows that the user entry and admin operations never violate the well-formedness properties and ten state invariants. We can therefore assume that they hold to satisfy any requirements. The proof involves discharging verification conditions for a total of 32 operations in Isabelle/UTP, a process that is automated using our proof tactics hoare_auto [FBC + 20] and hoare_wlp_auto. We illustrate this in Figure 11 for two of the defined operations. We follow the mathematical notation for GCL as much as possible. Each proof first applies an introduction rule, IDStation_correct_intro, that splits the goal into the well-formedness and behavioural invariants. Then, hoare_wlp_auto is applied to each resulting goal. This high-level automation means that proofs can be adapted for small changes to the operations with minimal intervention.
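The invariant preservation proofs thus all follow one script shape, of roughly the following form (a sketch; the exact lemma and bracket syntax differs slightly in the mechanisation):

```isabelle
lemma TISUserEntryOp_inv: "⦃TIS-inv⦄ TISUserEntryOp ⦃TIS-inv⦄"
  apply (rule IDStation_correct_intro)  (* split into well-formedness and behavioural goals *)
  apply (hoare_wlp_auto)+               (* discharge the resulting verification conditions *)
  done
```

Because the script is uniform, a change to an operation usually requires no change to its proof script, only a re-run.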
We use this fact to assure SFR1, which is formalised by the formula FSFR1, characterising the conditions under which the latch will become unlocked having previously been locked. We can determine these states by application of the weakest precondition calculus [Dij75], which mirrors the (informal) Z schema domain calculations in [C + 08b, page 5]. Specifically, we characterise the weakest precondition under which execution of TISOp followed by TISUpdate leads to a state satisfying rw:ctrl:latch = unlocked. We formalise this in the definition below.
We first state the unlocking precondition for TISOp using wp calculus. Then, we conjoin the wp formula with tis:currentLatch = locked to capture behaviours when the latch was initially locked. The only operation that unlocks the door for users is UnlockDoorOK , and for admins it is OverrideDoorLockOK . As a result, we can calculate the following unlocking preconditions.
The first equation shows that precondition for a user unlock is that access is permitted and the token has been removed. The second equation shows that the precondition for an Admin unlock is that the TIS is waiting for an Admin command, an Admin token is present, and the selected command is overrideLock. From these equations we can calculate the unlocking precondition of TISOp itself, which is the disjunction of the two preconditions above. We can then conjoin this with TIS-Inv, since we know it holds in any state. We show that this composite precondition implies that either a valid user token and fingerprint were present (using Inv 2 ) or a valid authorisation certificate, or else an Admin is present (using Inv 5 ), and we can use the well-formedness invariant Admin to show that this Admin must have the guard role. Consequently, FSFR1 can indeed be verified.
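Paraphrasing the two equations (a sketch: the phrasing of the enclave condition and the use of `the` are assumptions; variable names follow the state space description):

```latex
\begin{align*}
  & UnlockDoorOK:\;\; status = waitingRemoveTokenSuccess \wedge userTokenPresence = absent \\
  & OverrideDoorLockOK:\;\; \text{enclave awaiting Admin command} \wedge adminTokenPresence = present \\
  & \qquad\qquad {}\wedge the(currentAdminOp) = overrideLock
\end{align*}
```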

Theorem 5.4 (FSFR1 is provable).
Proof. By application of weakest precondition and relational calculus.
Proof of SFR2 can likely be achieved in a similar way to SFR1, but more complex additional invariants are required that depend on time, which we have not been able to formalise for this case study (see §8). Next, we consider SFR3, which requires that an alarm is raised if the door is left open. This property can be proved more straightforwardly, since it is essentially a property of the well-formedness invariant in DoorLatchAlarm [C + 08a, page 23].

FSFR3
This states that if the invariants hold, the latch is locked, the door is open, and the time has advanced beyond the alarm timeout, then the door alarm is sounding. By Theorem 5.2, DoorLatchAlarm always holds and therefore FSFR3 can be verified.
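A sketch of the formula, following the prose (the names currentDoor, doorAlarm, and alarming are assumptions; currentLatch and alarmTimeout appear earlier in this section):

```latex
\[
FSFR3 \;\triangleq\; (DoorLatchAlarm \wedge currentLatch = locked \wedge currentDoor = open \wedge now \geq alarmTimeout)
  \;\Rightarrow\; doorAlarm = alarming
\]
```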
Finally, we consider SFR6, which requires that the configuration and floppy can change only when an admin is logged on. In order to verify this, we need to reason about the variables a given operation can modify. In Isabelle/UTP, we can answer such framing questions using lenses [FGM + 07, FBC + 20]. We define a novel modification predicate, P nmods a, which states that relation P does not modify any of the variables captured by a. This is equivalent to stating that P is a fixed point of the function M(X) ≙ (a′ = a ∧ X). We prove the following modification laws for this predicate.
x ∉ a  ⟹  a:[P] nmods x        P nmods x  ⟹  a:[P] nmods a:x

As expected, neither skip nor abort modifies any variable x. Sequential composition, P ⨟ Q, does not modify x provided neither P nor Q does, and similarly for internal choice. Assignment to y does not modify x provided that x is independent of y (x ⋈ y), which effectively means that y is not part of x. A guarded command b −→ P does not modify x provided that P also does not. For the frame operator, a:[P], we identify two cases, shown above. If a variable x is not in a, then clearly it is not modified. Conversely, if x is within the a namespace, then it is necessary to check whether P nmods x. Using these laws we can automatically verify that a program does not modify certain variables, and so formalise SFR6 as follows. If we assume that there is not an admin token, then TISOp cannot modify either config or floppy. For the verification, we can distribute the absence precondition throughout the operations using the law b −→ (P ⊓ Q) = (b −→ P) ⊓ (b −→ Q). One admin operation can modify config, namely FinishUpdateConfigOK in Definition 5.5. If we prefix this operation with adminTokenPresence = absent, we obtain the program abort, which does not modify config, since this violates the second guard. We can also prove deductively, using Theorem 5.6, that P nmods config for every other operation P. Consequently, we can prove FSFR6.
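Following the prose, the formalisation of SFR6 can be sketched as follows (a sketch; the precise bracketing in the mechanisation may differ):

```latex
\[
FSFR6 \;\triangleq\; \big((adminTokenPresence = absent) \longrightarrow TISOp\big)
  \;\;\mathbf{nmods}\;\; \{config,\, floppy\}
\]
```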
We have now formalised and verified three of the SFRs. In the next section we place these in the context of an assurance case.

Mechanising the Tokeneer Assurance Case
In this section, we use ACME and Isabelle/SACM to model the Tokeneer development process originally followed by Praxis, and illustrated in Figure 5b. A GSN diagram of the modular structure of the AC is shown in Figure 6. The modules have names that correspond to the modelled artifact and a brief description. Each module contains a mixture of lifecycle and certification artifacts, such as requirements and models, and GSN arguments. The former were developed originally by Praxis and evaluated to comply with CC EAL 5 in the context of the certification process of TIS. While the certification artifacts were the contribution of Praxis, the GSN AC argument modelling those artifacts is our contribution. Therefore, we complement the certification process of TIS with a GSN model translated to Isabelle/SACM, which aids the evaluation process of the certification artifacts by offering a machine-checked argument structure with full traceability. Our work also provides guidelines on the use of modular GSN to document the artifacts.
We focus on the module TIS SFRs, illustrated in Figure 12. It encapsulates an argument for a public claim that all SFRs are satisfied, which are defined in the module 40 2, by the TIS model, which is defined in 41 2. We reference these artifacts with the use of away context elements. For now, we focus on the SFRs that we have formally verified, namely SFR1, SFR3, and SFR6. For this, we use the Theorems 5.2, 5.4, 5.5, and 5.7 from §5 as evidential artifacts. Satisfaction of SFR1 is modelled by the claim SFR1 C1, which uses SFR1 as context. The claim is satisfied by formalisation, which is performed in another module called TIS SFR1. Satisfaction of the other SFRs can be represented using the same pattern.
The argument for SFR1 is shown in Figure 13 (Satisfaction of SFR1 by formalisation). It uses the "formalisation pattern" [DP18], which shows how results from a formal method can be used to provide evidence for claims that a requirement {R} is satisfied. In Figure 13, we adapt Denney's pattern [DP18] as follows. We begin with the claim SFR1 Formalisation, which references both SFR1, with its natural language description from module 40 1, and FSFR1, which is defined in this module in Definition 5.7 from §5. We then invoke an argumentation strategy, SFR1 S1, for formalisation. Instead of using a validation claim for the formalisation of the requirements, we use a justification element, FSFR1 V1, which should be an explanation of how FSFR1 formalises SFR1. This is to preserve the well-formedness of the AC: the "requirement validation" claims have a type different from the "requirement satisfaction" claims. An example of a "requirement satisfaction" claim is SFR1 Formalisation.
FSFR1 also assumes that all the operations of the TIS preserve the system invariants, and so we record a link to this proof in module 41 2, which corresponds to Theorem 5.2. The subclaim of SFR1 S1 is FSFR1 Verified, which is supported by the evidence FSFR1-Proof, which refers to Theorem 5.4. Figure 14 shows the IAL model of TIS SFR1 that was manually translated and elaborated from Figure 13. In our translation, each of the modules in Figure 6 is assigned an Isabelle theory with the corresponding artifacts. We represent both the artifacts and argumentation elements necessary to assure satisfaction of SFR1. Each command has an optional descriptive text, enclosed in quotes <...> that can integrate hyperlinks to both formal artifacts, such as theorems and proof, and structured assurance artifacts, such as model elements generated by IAL. Since the checks done by IAL are successful, no errors are issued in Figure 14, which in particular indicates that every referenced artifact exists and is correctly typed.
SACM provides several additional concepts for representing lifecycle artifacts, and we utilise them here. We record two activities, FSFR1 Def Act and FSFR1 Proof Act, which represent the activities in the development workflow for defining the formal requirement and discharging the proof obligations. Both have a startTime and endTime associated. FSFR1 is represented by the artifact FSFR1 A, which links to the IAL requirement SFR1, which contains the natural language description of the requirement SFR1 from the Tokeneer documentation, using the Requirement antiquotation, and the technique Weakest Precondition Calculus [Dij75]. FSFR1 A also contains a link to the corresponding formal Isabelle constant via the antiquotation @{const FSFR1}.
We record a link to the proof of FSFR1 in the artifact FSFR1 Proof, and the Isabelle theory where this resides in FSFR1 Proof Theory (Figure 14. Argument and Artifacts for the FSFR1 Argument). Finally, we create an artifact relation that gives the provenance for the proof of FSFR1. This proof was performed by the participant Simon Foster, during the activity FSFR1 Proof Act, using the theorem prover Isabelle2019. In this way we record precisely how and when a particular assurance artifact was created.
With the artifacts and their provenance defined, we move on to the argumentation. We first create the key claims using the Claim command, which variously reference the artifacts previously defined. Claim FSFR1 V1 is marked as assumed, since this is the validation claim that must be satisfied elsewhere by review. The strategy SFR1 S1 from Figure 13, is modelled by SFR1_S1 in Figure 14. SFR1_S1 is created using the command Inference, which uses antiquotations to refer to the premise claims SFR1_Formalisation, TISOp_Correct, and FSFR1_V1, that is, the source src, and the conclusion claim FSFR1_Verified, that is the target tgt.
We use the Context command to model the two contextual relations in Figure 13. SFR1 C1 presents the external claim TISOp Correct as context, which refers to the invariant proof (Theorem 5.2), and SFR1 C2 presents the assumed validation claim as context. Finally, we model the relationships from Figure 13 that link FSFR1 Verified to FSFR1 Proof. This is done in Figure 14 by FSFR1_E1, which is created using the command Evidence. It supports the claim FSFR1_Verified with the artifact FSFR1_Proof.
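Putting the pieces together, the argumentation core of the Figure 14 model has roughly this shape (a sketch: descriptions are abbreviated and paraphrase the prose):

```isabelle
Claim FSFR1_V1 assumed ‹FSFR1 adequately formalises SFR1 (to be validated by review).›
Claim TISOp_Correct ‹All TIS operations preserve the state invariants (Theorem 5.2).›
Claim SFR1_Formalisation ‹SFR1, formalised as @{const FSFR1}, is satisfied.›
Claim FSFR1_Verified ‹FSFR1 has been formally verified (Theorem 5.4).›
Inference SFR1_S1 asserted
  src @{Claim SFR1_Formalisation} @{Claim TISOp_Correct} @{Claim FSFR1_V1}
  tgt @{Claim FSFR1_Verified} ‹Argument by formalisation of SFR1.›
Context SFR1_C1 asserted src @{Claim TISOp_Correct} tgt @{Claim SFR1_Formalisation}
  ‹Invariant preservation proof as context.›
Evidence FSFR1_E1 asserted src @{Artifact FSFR1_Proof} tgt @{Claim FSFR1_Verified}
  ‹Machine-checked proof of FSFR1.›
```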
We have shown how Isabelle/SACM enables the integration of formal development with assurance argumentation, documenting how the collected evidence establishes the overall security claims. In the next two sections we survey related work and discuss the findings of our case study.

Related Work
In this section, we discuss previous efforts in the verification of Tokeneer as well as other approaches to the formalisation of assurance cases and the integration of formal methods with assurance cases.

Comparison with Previous Efforts in the Verification of Tokeneer
Woodcock et al. [WAC10] highlight defects of the Tokeneer SPARK implementation, indicate undischarged verification conditions, and perform robustness tests generated by the Alloy SAT solver [Jac00] from a corresponding Alloy model. Using De Bono's lateral thinking, these test cases go beyond the anticipated operational envelope and stimulate anomalous behaviours. By shortening the feedback cycle for verification and test engineers, theorem proving in the form of our proposed framework can help to apply this approach more intensively.
Abdelhalim et al. [ASST10] model part of the Tokeneer specification using UML activity diagrams, translated to CSP to be checked for deadlock freedom by the model checker FDR. Their formalisation assumes implementation on top of asynchronous communication, modelled in CSP in terms of buffers for each channel between UML components. While our abstraction from such communication aspects yields a simpler proof of the SFRs in Section 3, their deadlock checking at a lower level can be useful for checking the correctness of the communication in an implementation of the UML model, such as the SPARK implementation mentioned in Figure 5b. Their UML diagrams can lead to comparatively large specifications, whereas our formalisation stays compact thanks to the abstraction and reuse mechanisms in Z schemas and Isabelle/UTP. Rivera et al. [RBC16] present an Event-B model of the TIS, verify this model, generate Java code from it using the Rodin tool, and test this code with JUnit tests manually derived from the specification. The tests validate the model in addition to the Event-B invariants derived from the same specification, and aim to detect errors in the Event-B model caused by misunderstandings of the specification. Using Rodin, the authors state that they verify the SFRs (Section 3) using Hoare triples. Our work uses a similar abstract machine specification, but with the weakest precondition calculus as the main tool for verifying the SFRs. Beyond the replication of the Tokeneer case study, Rivera et al. [RBC16] deal with the relationship between the model and the code via testing, whereas we focus on the construction of certifiable assurance arguments from formal model-based specifications. Nevertheless, we believe Isabelle's code generation features could be applied in a similar way.

Previous Work on Formal Assurance and Formalised Assurance Cases
In concordance with Woodcock et al.'s [WAC10] observations, several researchers have investigated ways of introducing formality into assurance cases [CHOS13, Rus14, DP18, DMW + 18]. We highlight some of these approaches below.
AdvoCATE is a powerful graphical tool for the construction of GSN-based safety cases [DP18]. It uses a formal foundation called argument structures, which prescribe well-formedness checks for the syntactic structure of (i.e. the graph underlying) an AC, and allow instantiation of assurance case patterns. Our work likewise ensures well-formedness, but also allows the embedding of content with formal semantics. Denney and Pai's formalisation pattern [DP18] is an inspiration for our work. Our framework is to be used as an assurance backend, which complements AdvoCATE with a deep integration of modelling and specification formalisms.
Rushby [Rus14] illustrates how assurance arguments can be formalized with modern verification systems such as Isabelle or PVS to overcome some of the logical fallacies associated with informal ACs. Similarly, our framework allows reasoning using formal logic, but additionally supports the combination of formal and informal artifacts. We drew inspiration from the work on the Evidential Tool Bus [CHOS13], which enables the combination of evidence from several formal and semi-formal analysis tools. In a very similar way, Isabelle supports the integration of a variety of formal analysis tools [WW07].
Diskin et al. [DMW + 18] tackle the problem of hierarchical and modular assurance by using a formal model (in this case, a compositional data-flow model) of the system to be assured as the basis for generating evidence required for a particular assurance claim. Their framework is elaborate and practically relevant inasmuch as it integrates well with the practice of model-based development. The paradigm of our approach is similar to theirs, except that we use mechanised algebraic reasoning techniques, provide computer assistance for the proposed reasoning steps, and integrate informal assurance evidence.
Overall, we believe that our work is the first to put formal verification effort into the wider context of structured assurance argumentation, in our case, a machine-checked security case using Isabelle/SACM. We have also recently applied our techniques to collision avoidance for autonomous ground robots [GFN19] and an autonomous underwater vehicle [FNO + 20]; both of which are more recent benchmark examples.

Findings and Limitations of the Case Study
Below, we summarise several observations and findings from our investigation.
Evaluation of the Tokeneer Assurance Case Despite its age, we see Tokeneer as a highly relevant benchmark specification, particularly since it is one of the grand challenges of the "Verified Software Initiative" [Woo06]. As we have argued elsewhere [GFW19], such benchmarks allow us to conduct objective analyses of assurance techniques and thereby aid their transfer to other domains. The issues highlighted in [WAC10] are systematic design problems that could be fixed by changing the benchmark (e.g. by two-way biometric identification on both sides of the enclave entrance). However, this is out of the scope of our work and does not diminish Tokeneer's value as a benchmark.
During the translation from Z into Isabelle/UTP's GCL and the formalisation of the SFRs, we identified some deficiencies in the way that the security requirements were originally proven. In particular, as previously mentioned, the developers acknowledge that invariants necessary to support the proofs are missing: "We have not [added the invariants], as we believe it will add little to the assurance of correctness, and is very time consuming. At higher levels of the CC assurance we would be required to carry out more formal proofs, in which case these modifications would be done." [C + 08b, page 11] One reason we can now add them is that the automation of formal proof has vastly improved since the development of Tokeneer. Consequently, we can reach these higher assurance levels with our mechanisation.
A further issue is that we could not prove SFR2 while staying faithful to its proposed formalisation in the benchmark artifacts [C + 08b, page 6]. This property states that, at the point of unlocking the door, the time must be close to being within the permitted entry period. Like SFR1, it uses the operation TISOpThenUpdate as the target for the verification. However, this operation does not allow currentTime, which internally records the time, to advance. Internal time advances only when currentTime is polled, using TISPoll, from the corresponding monitored variable now, which can advance arbitrarily. Consequently, any invariant over time is trivially satisfied, because time is constant throughout the operation. A fix for the issue would require us to reason about TISPoll, and thus a more substantial proof.

Formal Design and Refinement As shown in Figure 5b on page 7, a complete assurance case for the TIS development would require coverage of all three refinement steps described in [C + 08c]: the functional or abstract formal specification, the more concrete formal design, and the SPARK implementation. The formal design is a data and operation refinement of the abstract types used in the formal specification, replacing sets and functions with data structures that have an operational semantics. Such refinement proofs would require formal reasoning about the memory models of the formal design and the SPARK implementation in Isabelle/UTP. This reasoning can be based on separation logic, as implemented, for example, in the Isabelle data refinement library [Lam17].
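The trivial satisfaction of the time invariant in SFR2 can be illustrated with a toy weakest-precondition calculation over assignments. The sketch below encodes the standard wp rules for assignment and sequencing in Python; the variable and operation names echo the TIS specification but the code is purely illustrative, not the Isabelle/UTP mechanisation. If an operation never writes currentTime, the weakest precondition of any predicate over currentTime is that predicate itself, so the "invariant" is vacuously preserved.

```python
# Toy weakest-precondition calculus over assignments on a state dict.
# Predicates are functions State -> bool; wp(x := e, Q) = Q[e/x].
# Names are illustrative, echoing but not reproducing the TIS spec.

def assign(var, expr):
    """An assignment statement var := expr(state)."""
    def wp(post):
        return lambda s: post({**s, var: expr(s)})
    return wp

def seq(*stmts):
    """Sequential composition: wp(S1; S2, Q) = wp(S1, wp(S2, Q))."""
    def wp(post):
        for stmt in reversed(stmts):
            post = stmt(post)
        return post
    return wp

# An operation in the style of TISOpThenUpdate: it updates the door
# latch but never writes currentTime.
unlock_door = seq(assign("latch", lambda s: "unlocked"))

# A time "invariant": currentTime lies in the permitted entry period.
in_entry_period = lambda s: 0 <= s["currentTime"] <= 10

pre = unlock_door(in_entry_period)
s = {"currentTime": 5, "latch": "locked"}
# Since unlock_door does not touch currentTime, the weakest
# precondition of the invariant is the invariant itself.
print(pre(s) == in_entry_period(s))   # True

# Only an operation like TISPoll, which writes currentTime from the
# monitored variable now, can invalidate the invariant:
poll = assign("currentTime", lambda s: s["now"])
s2 = {"currentTime": 5, "now": 99}
print(poll(in_entry_period)(s2))      # False: after polling, time is 99
```

This is why a faithful proof of SFR2 must bring TISPoll into the scope of the verified operation: only there does time actually move.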

Conclusions
We have presented Isabelle/SACM, a framework for integrating formal proof into a unified and standardised form of assurance cases and for their computer-assisted construction. We showed how SACM is embedded into Isabelle as an ontology, and provided an interactive assurance language that guides its user in generating valid instances of this ontology.
We applied this framework to part of the Tokeneer security case, including the verification of three of the security functional requirements, and embedded these results into a mechanised assurance argument. Isabelle/SACM enforces the usage of formal ontological links-a feature inherited from DOF-which establish and enrich traceability between the assurance arguments, evidence of different provenance, and the assurance claims. Isabelle/SACM combines features from Isabelle/HOL, DOF, and SACM in a way that allows integration of formal methods and assurance cases [GFN19]. In sum, our work allows us to intertwine a heterogeneous formal development in Isabelle with an assurance case that puts the formal results in context.
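The flavour of traceability that ontological links provide can be sketched as a simple consistency check: every piece of evidence must cite a claim that actually exists in the assurance case, and dangling references are rejected. The following minimal Python sketch uses illustrative names; Isabelle/SACM's actual link mechanism, inherited from DOF, is typed and considerably richer.

```python
# Toy model of traceability links between evidence and claims.
# Names are illustrative only; Isabelle/SACM's DOF-based links are
# typed ontological references, not plain strings.

claims = {"SFR1", "SFR3", "TIS_secure"}
evidence = {"wp_proof_SFR1": "SFR1",      # each item cites one claim
            "wp_proof_SFR3": "SFR3"}

def dangling_links(claims, evidence):
    """Return the evidence items that cite a non-existent claim.
    A well-formed case has none."""
    return {e for e, c in evidence.items() if c not in claims}

print(dangling_links(claims, evidence))            # set()
print(dangling_links(claims, {"stray": "SFR9"}))   # {'stray'}
```

In Isabelle/SACM such checks are enforced at document-processing time, so a dangling reference is reported immediately in the IDE rather than discovered during a later review.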
In future work, we will formalise the connection between ACME [WKD + 19] and Isabelle/SACM, which will make the platform more accessible to safety practitioners. We are currently working on a prototype model-to-text transformation from SACM to Isabelle to facilitate this, and an integration with Eclipse to allow feedback from Isabelle to be propagated back to the diagram editors. We will also consider the integration of AC pattern execution [DP18], to facilitate AC production. Moreover, to support more advanced safety analysis, we are exploring the use of DOF to develop an ontology for safety concepts, such as hazards, risks, and control measures, following work by Banham [Ban20] and the Safety of Autonomous Systems Working Group [oASWG20], and also ontologies for formal methods. Indeed, we envisage the development of a variety of ontologies that provide the necessary terminology to formulate requirements, document developments, and otherwise aid communication.
We also plan to complete the mechanisation of the TIS security case, including the overarching argument for how the formal evidence can satisfy the requirements of the CC [Com17]. This will involve mechanising the remaining operations, and tackling the last three security requirements, which will require the derivation and verification of additional invariants. We are also applying Isabelle/SACM to develop an assurance case for an autonomous underwater vehicle safety controller [FNO + 20], which is being developed under the regime of the DO-178C standard.
In parallel, we are developing our verification framework, Isabelle/UTP [FBC + 20, FZW16, FZN + 19], to support a variety of software engineering notations. We recently demonstrated formal verification facilities for a StateChart-like notation [FBC + 18, FCC + 19], and are also working towards tool support for hybrid dynamical languages [FTCW16, Fos19, MSF20] such as Modelica and MATLAB Simulink. Though these kinds of models seem quite different from Tokeneer, in the UTP they all have a relational semantics, so the proof facilities developed here also feed into those efforts.
Our long-term overarching goal is a comprehensive assurance framework supported by a variety of integrated FMs, in order to support complex certification tasks for cyber-physical systems such as autonomous robots [GFW19, GFN19, FNO + 20].