Keywords

1 Introduction

Modern software applications are increasingly built using libraries and code from multiple third parties. At the same time, protecting confidentiality of information manipulated by such applications is a growing, yet long-standing problem. Third-party libraries could in general have been written by anyone and they are usually run with the same privileges as the main application. While powerful, such privileges open up for abuse.

Traditionally, access control [7] and encryption have been the main means for preventing data dissemination and leakage, however, such mechanisms fall short when third-party code needs access to sensitive information to provide its functionality. The key observation is that these mechanisms only place restrictions on the access to information but not its propagation. Once information is accessed, the accessor is free to improperly transmit or leak the information in some form, either by intention or error.

Language-based Information-Flow Control [36] is a promising technique for enforcing information security. Traditional enforcement techniques analyze how information at different security levels flows within a program ensuring that information flows only to appropriate places, suppressing illegal flows. To achieve this, most information-flow control tools require the design of new languages, compilers, or interpreters (e.g. [12, 17, 22, 23, 26, 29, 39]). Despite a large, growing body of work on language-based information-flow security, there has been little adoption of the proposed techniques. For information-flow policies to be enforced in such systems, the whole system has to be written in new languages – an inherently expensive and time-consuming process for large software systems. Moreover, in practice, it might very well be that only small parts of an application are governed by information-flow policies.

Pure functional programming languages, like Haskell, have something to offer with respect to information security as they strictly separate side-effect free and side-effectful code. This makes it possible to enforce lightweight information-flow control through libraries [11, 20, 34, 35, 42] by constructing an embedded domain-specific security sub-language. Such libraries enforce a secure-by-construction programming model as any program written against the library interface is not capable of leaking secrets. This construction forces the programmer to write security-critical code in the sub-language but otherwise allows them to freely interact and integrate with non-security critical code written in the full language. In particular, static enforcement libraries like MAC [34] are appealing as no run-time checks are needed and code that exhibits illegal flows is rejected by the type checker at compile-time. Naturally, the expressiveness of Haskell’s type system sets the limitation on which programs can be deemed secure and which information flow policies can be guaranteed.

Dependent type theories [24, 31] are implemented in many programming languages such as Coq [13], Agda [32], Idris [8], and F\(^{*}\) [44]. Programming languages that implement such theories allow types to dependent on values. This enables programmers to give programs a very precise type and increased confidence in its correctness.

In this paper, we show that dependent types provide a direct and natural way of expressing precise data-dependent security policies. Dependent types can be used to represent rich security policies in environments like databases and data-centric web applications where, for example, new classes of users and new kinds of data are encountered at run-time and the security level depends on the manipulated data itself [23]. Such dependencies are not expressible in less expressive systems like MAC. Among other things, with dependent types, we can construct functions where the security level of the output depends on an argument:

figure a

Given a user name , retrieves the corresponding password and classifies it at the security level of . As such, we can express much more precise security policies that can depend on the manipulated data.

Idris is a general-purpose functional programming language with full-spectrum dependent types, that is, there is no restrictions on which values may appear in types. The language is strongly influenced by Haskell and has, among others, inherited its strict encapsulation of side-effects. Idris essentially asks the question: “What if Haskell had full dependent types?” [9]. This work, essentially, asks:

“What if MAC had full dependent types?”

We address this question using Idris because of its positioning as a general-purpose language rather than a proof assistant. All ideas should be portable to equally expressive systems with full dependent types and strict monadic encapsulation of side-effects.

In summary, the contributions of this paper are as follows.

  • We present DepSec, a MAC inspired statically enforced dependently typed information-flow control library for Idris.

  • We show how adding dependent types strictly increases the expressiveness of state-of-the-art static information-flow control libraries and how DepSec matches the expressiveness of a special-purpose dependent information-flow type system on a key example.

  • We show how DepSec enables and aids the construction of policy-parameterized functions that abstract over the security policy.

  • We show novel and powerful means to specify statically-ensured declassification using dependent types for a wide variety of policies.

  • We show progress-insensitive noninterference [1] for the core library in a sequential setting.

Outline. The rest of the paper proceeds through a presentation of the DepSec library (Sect. 2); a conference manager case study (Sect. 3) and the introduction of policy-parameterized functions (Sect. 4) both showcasing the expressiveness of DepSec; means to specify statically-ensured declassification policies (Sect. 5); soundness of the core library (Sect. 6); and related work (Sect. 7).

All code snippets presented in the following are extracts from the source code. All source code is implemented in Idris 1.3.1. and available at

1.1 Assumptions and Threat Model

In the rest of this paper, we require that code is divided up into trusted code, written by someone we trust, and untrusted code, written by a potential attacker. The trusted computing base (TCB) has no restrictions, but untrusted code does not have access to modules providing input/output behavior, the data constructors of the domain specific language and a few specific functions related to declassification. In Idris, this means that we specifically do not allow access to functions and . In DepSec, constructors and functions marked with a comment are inaccessible to untrusted code. Throughout the paper we will emphasize when this is the case.

We require that all definitions made by untrusted code are total, that is, defined for all possible inputs and are guaranteed to terminate. This is necessary if we want to trust proofs given by untrusted code. Otherwise, it could construct an element of the empty type from which it could prove anything:

figure c

In Idris, this can be checked using the --total compiler flag. Furthermore, we do not consider concurrency nor any internal or termination covert channels.

2 The DepSec Library

In information-flow control, labels are used to model the sensitivity of data. Such labels usually form a security lattice [14] where the induced partial ordering \(\sqsubseteq \) specifies allowed flows of information and hence the security policy. For example, \(\ell _{1} \sqsubseteq \ell _{2}\) specifies that data with label \(\ell _{1}\) is allowed to flow to entities with label \(\ell _{2}\). In DepSec, labels are represented by values that form a verified join semilattice implemented as Idris interfacesFootnote 1. That is, we require proofs of the lattice properties when defining an instance of .

figure d

Dependent function types (often referred to as \({\varPi }\) types) in Idris can express such requirements. If is a type and is a type indexed by a value of type then is the type of functions that map arguments of type to values of type .

A lattice induces a partial ordering, which gives a direct way to express flow constraints. We introduce a verified partial ordering together with an implementation of this for . That is, to define an instance of the interface we require a concrete instance of an associated data type as well as proofs of necessary algebraic properties of .

figure f

This definition allows for generic functions to impose as few restrictions as possible on the user while being able to exploit the algebraic structure in proofs, as will become evident in Sects. 3 and 4. For the sake of the following case studies, we also have a definition of a requiring a least element of an instance of and a proof of the element being the unit.

Fig. 1.
figure 1

Type signature of the core DepSec API.

The Core API. Figure 1 presents the type signature of DepSec’s core API. Notice that names beginning with a lower case letter that appear as a parameter or index in a type declaration will be automatically bound as an implicit argument in Idris, and the annotation on implicit arguments means that Idris will attempt to fill in the implicit argument by searching the calling context for an appropriate value.

Abstract data type denotes a value of type a with sensitivity level \(\ell \). We say that is indexed by and parameterized by . Abstract data type denotes a secure computation that handles values with sensitivity level and results in a value of type a. It is internally represented as a wrapper around the regular monad that, similar to the one in Haskell, can be thought of as a state monad where the state is the entire world. Notice that both data constructors and are not available to untrusted code as this would allow pattern matching and uncontrolled unwrapping of protected entities. As a consequence, we introduce functions and for labeling and unlabeling values. Like Rajani and Garg [33], but unlike MAC, the type signature of imposes no lattice constraints on the computation context. This does not leak information as, if \(l \sqsubseteq l'\) and a computation c has type for any type V, then there is no way for the labeled return value of c to escape the computation context with label \(l'\).

As in MAC, the API contains a function that safely integrates sensitive computations into less sensitive ones. This avoids the need for nested computations and label creep, that is, the raising of the current label to a point where the computation can no longer perform useful tasks [34, 47]. Finally, we also have functions and that are only available to trusted code for unwrapping of the monad and lifting of the monad into the monad.

Labeled Resources. Data type is used to denote a labeled Idris value with type a. This is an example of a labeled resource [34]. By itself, the core library does not allow untrusted code to perform any side effects but we can safely incorporate, for example, file access and mutable references as other labeled resources. Figure 2 presents type signatures for files indexed by security levels used for secure file handling while mutable references are available in the accompanying source code. Abstract data type  \(\ell \) denotes a secure file with sensitivity level \(\ell \). As for , the data constructor is not available to untrusted code.

The function takes as input a secure file and returns a computation with sensitivity level that returns a labeled value with sensitivity level Notice that the flow constraint is required to enforce the no read-up policy [7]. That is, the result of the computation returned by only involves data with sensitivity at most . The function takes as input a secure file and a labeled value of sensitivity level , and it returns a computation with sensitivity level that returns a labeled value with sensitivity level . Notice that both the and flow constraints are required, essentially enforcing the no write-down policy [7], that is, the file never receives data more sensitive than its sensitivity level.

Finally, notice that the standard library functions for reading and writing files in Idris used to implement the functions in Fig. 2 do not raise exceptions. Rather, both functions return an instance of the sum type . We stay consistent with Idris’ choice for this instead of adding exception handling as done in MAC.

Fig. 2.
figure 2

Type signatures for secure file handling.

3 Case Study: Conference Manager System

This case study showcases the expressiveness of DepSec by reimplementing a conference manager system with a fine-grained data-dependent security policy introduced by Lourenço and Caires [23]. Lourenço and Caires base their development on a minimal \(\lambda \)-calculus with references and collections and they show how secure operations on relevant scenarios can be modelled and analysed using dependent information flow types. Our reimplementation demonstrates how DepSec matches the expressiveness of such a special-purpose built dependent type system on a key example.

In this scenario, a user is either a regular user, an author user, or a program committee (PC) member. The conference manager contains information about the users, their submissions, and submission reviews. This data is stored in lists of references to records, and the goal is to statically ensure, by typing, the confidentiality of the data stored in the conference manager system. As such, the security policy is:

  • A registered user’s information is not observable by other users.

  • The content of a paper can be seen by its authors as well as its reviewers.

  • Comments to the PC of a submission’s review can only be seen by other members that are also reviewers of that submission.

  • The only authors that are allowed to see the grade and the review of the submission are those that authored that submission.

To achieve this security policy, Lourenço and Caires make use of indexed security labels [22]. The security level \( U \) is partitioned into a number of security compartments such that \( U ( uid ) \) represents the compartment of the registered user with id \( uid \). Similarly, the security level \( A \) is indexed such that \( A ( uid , sid ) \) stands for the compartment of data belonging to author \( uid \) and their submission \( sid \), and \( PC \) is indexed such that \( PC ( uid , sid ) \) stands for data belonging to the PC member with user id \( uid \) assigned to review the submission with id \( sid \). Furthermore, levels \(\top \) and \(\bot \) are introduced such that, for example, \( U ( \bot ) \sqsubseteq U ( uid ) \sqsubseteq U ( \top ) \). Now, the security lattice is defined using two equations:

$$\begin{aligned} \forall uid , sid .\ U ( {uid} )&\sqsubseteq A ( {uid} , {sid} ) \end{aligned}$$
(1)
$$\begin{aligned} \forall uid1 , uid2 , sid .\ A ( {uid1} , {sid} )&\sqsubseteq PC ( {uid2} , {sid} ) \end{aligned}$$
(2)

Lourenço and Caires are able to type a list of submissions with a dependent sum type that assigns the content of the paper the security level \( A ( uid , sid ) \), where \( uid \) and \( sid \) are fields of the record. For example, if a concrete submission with identifier 2 was made by the user with identifier 1, the content of the paper gets classified at security level \( A ( 1 , 2 ) \). In consequence, \( A ( 1 , 2 ) \sqsubseteq PC ( n , 2 ) \) for any \( uid \) n and the content of the paper is only observable by its assigned reviewers. Similar types are given for the list of user information and the list of submission reviews, enforcing the security policy described in the above.

To express this policy in DepSec, we introduce abstract data types and (cf. Fig. 3) followed by an implementation of the interface that satisfies Eqs. (1) and (2).

Fig. 3.
figure 3

Abstract data types for the conference manager sample security lattice.

Fig. 4.
figure 4

Conference manager types encoded with DepSec.

Using the above, the required dependent sum types can easily be encoded with DepSec in Idris as presented in Fig. 4. With these typings in place, implementing the examples from Lourenço and Caires [23] is straightforward. For example, the function takes as input a list of submissions and a user identifier from which it returns a computation that returns a list of submissions authored by the user with identifier . Notice that denotes the automatically generated record projection function that retrieves the field of the record, and that is notation for a dependent pair (often referred to as a \({\varSigma }\) type) where and are types and may depend on .

figure y

The operation is used by the PC members to add comments to the submissions. The function takes as input a list of reviews, a user identifier of a PC member, a submission identifier, and a comment with label . It returns a computation that updates the field in the review of the paper with identifier .

figure z

Notice that to implement this specific type signature, up-classification is necessary to assign the comment with type to a field with type . This can be achieved soundly with the primitive introduced by Vassena et al. [47] as . We include this primitive in the accompanying source code together with several other examples. The entire case study amounts to about 300 lines of code where half of the lines implement and verify the lattice.

4 Policy-Parameterized Functions

A consequence of using a dependently typed language, and the design of DepSec, is that functions can be defined such that they abstract over the security policy while retaining precise security levels. This makes it possible to reuse code across different applications and write other libraries on top of DepSec. We can exploit the existence of a lattice , the induced poset, and their verified algebraic properties to write such functions.

Fig. 5.
figure 5

Reading two files to a string labeled with the join of the labels of the files.

Figure 5 presents the function that is parameterized by a bounded join semilattice. It takes two secure files with labels and as input and returns a computation that concatenates the contents of the two files labeled with the join of and . To implement this, we make use of the and primitives from Figs. 1 and 2, respectively. This computation unlabels the contents of the files and returns the concatenation of the contents if no file error occurred. Notice that is the Idris function for monadic return, corresponding to the function in Haskell. Finally, this computation is plugged into the surrounding computation. Notice how the usage of and introduces several proof obligations, namely \(\bot \sqsubseteq \) , , and , . When working on a concrete lattice these obligations are usually fulfilled by Idris’ automatic proof search but, currently, such proofs need to be given manually in the general case. All obligations follow immediately from the algebraic properties of the bounded semilattice and are given in three auxiliary lemmas , , and available in the accompanying source code (amounting to 10 lines of code).

Writing functions operating on a fixed number of resources is limiting. However, the function in Fig. 5 can easily be generalized to a function working on an arbitrary data structure containing files with different labels from an arbitrary lattice. Similar to the approach taken by Buiras et al. [11] that hide the label of a labeled value using a data type definition, we hide the label of a secure file with a dependent pair

figure ah

that abstracts away the concrete sensitivity level of the file. Moreover, we introduce a specialized join function

figure ai

that folds the function over a list of file sensitivity labels. Now, it is possible to implement a function that takes as input a list of files, reads the files, and returns a computation that concatenates all their contents (if no file error occurred) where the return value is labeled with the join of all their sensitivity labels.

figure aj

When implementing this, one has to satisfy non-trivial proof obligations as, for example, that \(l \sqsubseteq \)  for all secure files where the label of f is l. While provable (in 40 lines of code in our development), if equality is decidable for elements of the concrete lattice we can postpone such proof obligations to a point in time where it can be solved by reflexivity of equality. By defining a decidable lattice order

figure al

we can get such a proof “for free” by inserting a dynamic check of whether the flow is allowed. With this, a function with the exact same functionality as the original function can be implemented with minimum effort. In the below, is the proof that the label of may flow to .

figure am

The downside of this is the introduction of a negative case, the -case, that needs handling even though it will never occur if is implemented correctly.

In combination with , can be used to implement several other interesting examples. For instance, a function that reads all files with a sensitivity label below a certain label to a string labeled with that label. The accompanying source code showcases multiple such examples that exploit decidable equality.

5 Declassification

Realistic applications often release some secret information as part of their intended behavior; this action is known as declassification.

In DepSec, trusted code may declassify secret information without adhering to any security policy as trusted code has access to both the and data constructors. However, only giving trusted code the power of declassification is limiting as we want to allow the use of third-party code as much as possible. The main challenge we address is how to grant untrusted code the right amount of power such that declassification is only possible in the intended way.

Sabelfeld and Sands [38] identify four dimensions of declassification: what, who, where, and when. In this section, we present novel and powerful means for static declassification with respect to three of the four dimensions and illustrate these with several examples. To statically enforce different declassification policies we take the approach of Sabelfeld and Myers [37] and use escape hatches, a special kind of functions. In particular, we introduce the notion of a hatch builder; a function that creates an escape hatch for a particular resource and which can only be used when a certain condition is met. Such an escape hatch can therefore be used freely by untrusted code.

5.1 The what Dimension

Declassification policies related to the what dimension place restrictions on exactly “what” and “how much” information is released. It is in general difficult to statically predict how data to be declassified is manipulated or changed by programs [35] but exploiting dependent types can get us one step closer.

To control what information is released, we introduce the notion of a predicate hatch builder only available to trusted code for producing hatches for untrusted code.

figure ap

Intuitively, the hatch builder takes as input a data structure of type followed by a predicate upon and something of type . It returns a dependent pair of the initial data structure and a declassification function from sensitivity level to . To actually declassify a labeled value of type one has to provide a proof that holds. Notice that this proof may be constructed in the context of the sensitivity level that we are declassifying from.

The reason for parameterizing the predicate by a data structure of type is to allow declassification to be restricted to a specific context or data structure. This is used in the following example of an auction system, in which only the highest bid of a specific list of bids can be declassified.

Example. Consider a two point lattice where , and an auction system where participants place bids secretly. All bids are labeled and are put into a data structure . In the end, we want only the winning bid to be released and hence declassified to label . To achieve this, we define a declassification predicate .

figure aq

Informally, given a log of labeled bids and a bid , the predicate states that the bid is in the log, , and that it is the maximum bid, . We apply to a log of bids and the predicate to obtain a specialized escape hatch of type that enforces the declassification policy defined by the predicate.

figure as

This hatch can be used freely by untrusted code when implementing the auction system. By constructing a function

figure at

untrusted code can plug the resulting computation into an context and declassify the result value using the argument function.

figure au

To show the predicate (which in our implementation comprises 40 lines of code), untrusted code will need a generalized function that establishes the relationship between and the output of . The only difference is its return type: a computation that returns a value and a proof that when labeling this value we will get back the initial input. This definition poses no risk to soundness as the proof is protected by the computation sensitivity level.

figure av

Limiting Hatch Usage. Notice how escape hatches, generally, can be used an indefinite number of times. The Control.ST library [10] provides facilities for creating, reading, writing, and destroying state in the type of Idris functions and, especially, allows tracking of state change in a function type. This allows us to limit the number of times a hatch can be used. Based on a concept of resources, a dependent type tracks how resources change when a function is invoked. Specifically, a value of type represents a sequence of actions that manipulate state where is an underlying computation context in which the actions will be executed, is the return type of the sequence, is the required list of resources available before executing the sequence, and is the list of resources available after executing the sequence.

To represent state transitions more directly, is a type level function that computes an appropriate type given a underlying computation context, a result type, and a list of actions, which describe transitions on resources. Actions can take multiple forms but the one we will make use of is of the form that expresses that the resource begins in state and ends in state . By instantiating with as the underlying computation context:

figure ay

and use it together with a resource , we can create a function that applies its first argument to its second argument with as its initial required state and as the output state.

figure ba

That is, we encode that the function consumes “an attempt.” With the function it is possible to create functions where users are forced, by typing, to specify how many times it is used.

As an example, consider a variant of an example by Russo et al. [35] where we construct a specialized hatch that declassifies the boolean comparison of a secret number with an arbitrary number.

figure bb

To use this hatch, untrusted code is forced to specify how many times it is used.

figure bc

5.2 The who and when Dimensions

To handle declassification policies related to who may declassify information and when declassification may happen we introduce the notion of a token hatch builder only available to trusted code for producing hatches for untrusted code to use.

figure bd

The hatch builder takes as input a predicate on something of type and returns a declassification function from sensitivity level to given that the user can prove the existence of some such that holds. As such, by limiting when and how untrusted can obtain a value that satisfy predicate , we can construct several interesting declassification policies.

The rest of this section discusses how predicate hatches can be used for time-based and authority-based control of declassification; the use of the latter is demonstrated on a case study.

Time-Based Hatches. To illustrate the idea of token hatches for the when dimension of declassification, consider the following example. Let be an abstract data type with a data constructor only available to trusted code and a function that returns the current system time wrapped in the data type such that this is the only way for untrusted code to construct anything of type . Notice that this does not expose an unrestricted timer API as untrusted code can not inspect the actual value.

Now, we instantiate the token hatch builder with a predicate that demands the existence of a token that is greater than some specific value.

figure be

As such, can only be used after a specific point in time has passed as only then untrusted code will be able to satisfy the predicate.

figure bf

Authority-Based Hatches. The Decentralized Labeling Model (DLM) [27] marks data with a set of principals who owns the information. While executing a program, the program is given authority, that is, it is authorized to act on behalf of some set of principals. Declassification simply makes a copy of the released data and marks it with the same set of principals but excludes the authorities.

Similarly to Russo et al. [35], we adapt this idea such that it works on a security lattice of , assign authorities with security levels from the lattice, and let authorities declassify information at that security level.

To model this, we define the abstract data type with a data constructor available only to trusted code so that having an instance of corresponds to having the authority of the principal . Notice how assignment of authorities to pieces of code consequently is a part of the trusted code. Now, we instantiate the token hatch builder with a predicate that demands the authority of to declassify information at that level.

figure bg

That is, makes it possible to declassify information at level given an instance of the data type.

Example. Consider the scenario of an online dating service that has the distinguishing feature of allowing its users to specify the visibility of their profiles at a fine-grained level. To achieve this, the service allows users to provide a discovery agent that controls their visibility. Consider a user, Bob, whose implementation of the discovery agent takes as input his own profile and the profile of another user, say Alice. The agent returns a possibly side-effectful computation that returns an option type indicating whether Bob wants to be discovered by Alice. If that is the case, a profile is returned by the computation with the information about Bob that he wants Alice to be able to see. When Alice searches for candidate matches, her profile is run against the discovery agents of all candidates and the result is added to her browsing queue.

To implement this dating service, we define the record type that contains personal information related to principal .

figure bh

The interesting part of the dating service is the implementation of discovery agents. Figure 6 presents a sample discovery agent that matches all profiles with the opposite gender and only releases information about the name and gender. The discovery agent demands the authority of and takes as input two profiles and . The resulting computation security level is so to incorporate information from into the result, declassification is needed. This is achieved by providing with the authority proof of . The discovery agent in Fig. 6 unlabels ’s gender, declassifies and unlabels ’s gender and name, and compares the two genders. If the genders match, a profile with type only containing the name and gender of is returned. Otherwise, is returned indicating that does not want to be discovered. Notice that is the constructor for the built-in equality type in Idris and it is used to construct the proof of equality between principals required by the hatch.

Fig. 6.
figure 6

A discovery agent that matches with all profiles of the opposite gender and only releases the name and gender.

6 Soundness

Recent works [46, 47] present a mechanically-verified model of MAC and show progress-insensitive noninterference (PINI) for a sequential calculus. We use this work as a starting point and discuss necessary modification in the following. Notice that this work does not consider any declassification mechanisms and neither do we; we leave this as future work.

The proof relies on the two-steps erasure technique, an extension of the term erasure [21] technique that ensures that the same public output is produced if secrets are erased before or after program execution. The technique relies on a type-driven erasure function \(\varepsilon _{\ell _{A}}\) on terms and configurations where \(\ell _{A}\) denotes the attacker security level. A configuration consists of an \(\ell \)-indexed compartmentalized store \({\varSigma }\) and a term t. A configuration \(\langle {\varSigma }, t \rangle \) is erased by erasing t and by erasing \({\varSigma }\) pointwise, i.e. \(\varepsilon _{\ell _{A}}({\varSigma }) = \lambda \ell . \varepsilon _{\ell _{A}}({\varSigma }(\ell ))\). On terms, the function essentially rewrites data and computations above \(\ell _{A}\) to a special \(\bullet \) value. The full definition of the erasure function is available in the full version of this paper [15]. From this definition, the definition of low-equivalence of configurations follows.

Definition 1

Let \(c_{1}\) and \(c_{2}\) be configurations. \(c_{1}\) and \(c_{2}\) are said to be \(\ell _{A}\)-equivalent, written \(c_{1} \approx _{\ell _{A}} c_{2} \), if and only if \(\varepsilon _{\ell _{A}}(c_{1}) \equiv \varepsilon _{\ell _{A}}(c_{2})\).

After defining the erasure function, the noninterference theorem follows from showing a single-step simulation relationship between the erasure function and a small-step reduction relation: erasing sensitive data from a configuration and then taking a step is the same as first taking a step and then erasing sensitive data. This is the content of the following proposition.

Proposition 1

If \(c_{1} \approx _{\ell _{A}} c_{2}\), \(c_{1} \rightarrow c_{1}'\), and \(c_{2} \rightarrow c_{2}'\) then \(c_{1}' \approx _{\ell _{A}} c_{2}'\).

The main theorem follows by repeated applications of Proposition 1.

Theorem 1 (PINI)

If \(c_{1} \approx _{\ell _{A}} c_{2}\), \(c_{1} \Downarrow c_{1}'\), and \(c_{2} \Downarrow c_{2}'\) then \(c_{1}' \approx _{\ell _{A}} c_{2}'\).

Both the statement and the proof of noninterference for DepSec are mostly similar to the ones for MAC and available in the full version of this paper [15]. Nevertheless, one has to be aware of a few subtleties.

First, one has to realize that even though dependent types in a language like Idris may depend on data, the data itself is not a part of a value of a dependent type. Recall the type of vectors of length with components of type and consider the following program.

figure bi

This example may lead one to believe that it is possible to extract data from a dependent type. This is not the case. Both and are implicit arguments to the function that the compiler is able to infer. The actual type is

figure bj

As a high-level dependently typed functional programming language, Idris is elaborated to a low-level core language based on dependent type theory [9]. In the elaboration process, such implicit arguments are made explicit when functions are defined and inferred when functions are invoked. This means that in the underlying core language, only explicit arguments are given. Our modeling given in the full version of this paper reflects this fact soundly.

Second, to model the extended expressiveness of DepSec, we extend both the semantics and the type system with compile-time pure-term reduction and higher-order dependent types. These definitions are standard (defined for Idris by Brady [9]) and available in the full version of our paper. Moreover, as types now become first-class terms, the definition of \(\varepsilon _{\ell _{A}}\) has to be extended to cover the new kinds of terms. As before, primitive types are unaffected by the erasure function, but dependent and indexed types, such as the type , have to be erased homomorphically, e.g., . The intuition of why this is sensible comes from the observation that indexed dependent types considered as terms may contain values that will have to be erased. This is purely a technicality of the proof. If defined otherwise, the erasure function would not commute with capture-avoiding substitution on terms, \(\varepsilon _{\ell _{A}}(t[v/x]) = \varepsilon _{\ell _{A}}(t)[\varepsilon _{\ell _{A}}(v)/x\)], which is vital for the remaining proof.

7 Related Work

Security Libraries. The pioneering and formative work by Li and Zdancewic [20] shows how arrows [18], a generalization of monads, can provide information-flow control without runtime checks as a library in Haskell. Tsai et al. [45] further extend this work to handle side-effects, concurrency, and heterogeneous labels. Russo et al. [35] eliminate the need for arrows and implement the security library SecLib in Haskell based solely on monads. Rather than labeled values, this work introduces a monad which statically label side-effect free values. Furthermore, it presents combinators to dynamically specify and enforce declassification policies that bear a resemblance to the policies that DepSec are able to enforce statically.

The security library LIO [41, 42] dynamically enforces information-flow control in both sequential and concurrent settings. Stefan et al. [40] extend the security guarantees of this work to also cover exceptions. Similar to this work, Stefan et al. [42] present a simple API for implementing secure conference reviewing systems in LIO with support for data-dependent security policies.

Inspired by the design of SecLib and LIO, Russo [34] introduces the security library MAC. The library statically enforces information-flow control in the presence of advanced features like exceptions, concurrency, and mutable data structures by exploiting Haskell’s type system to impose flow constraints. Vassena and Russo [46], Vassena et al. [47] show progress-insensitive noninterference for MAC in a sequential setting and progress-sensitive noninterference in a concurrent setting, both using the two-steps erasure technique.

The flow constraints enforcing confidentiality of read and write operations in DepSec are identical to those of MAC. This means that the examples from MAC that do not involve concurrency can be ported directly to DepSec. To the best of our knowledge, data-dependent security policies like the one presented in Sect. 3 cannot be expressed and enforced in MAC, unlike LIO that allows such policies to be enforced dynamically. DepSec allows for such security policies to be enforced statically. Moreover, Russo [34] does not consider declassification. To address the static limitations of MAC, HLIO [11] takes a hybrid approach by exploiting advanced features in Haskell’s type-system like singleton types and constraint polymorphism. Buiras et al. [11] are able to statically enforce information-flow control while allowing selected security checks to be deferred until run-time.

Dependent Types for Security. Several works have considered the use of dependent types to capture the nature of data-dependent security policies. Zheng and Myers [51, 52] proposed the first dependent security type system for dealing with dynamic changes to runtime security labels in the context of Jif [29], a full-fledged IFC-aware compiler for Java programs, where similar to our work, operations on labels are modeled at the level of types. Zhang et al. [50] use dependent types in a similar fashion for the design of a hardware description language for timing-sensitive information-flow security.

A number of functional languages have been developed with dependent type systems and used to encode value-dependent information flow properties, e.g. Fine [43]. These approaches require the adoption of entirely new languages and compilers where DepSec is embedded in an already existing language. Morgenstern and Licata [25] encode an authorization and IFC-aware programming language in Agda. However, their encoding does not consider side-effects. Nanevski et al. [30] use dependent types to verify information flow and access control policies in an interactive manner.

Lourenço and Caires [23] introduce the notion of dependent information-flow types and propose a fine-grained type system; every value and function have an associated security level. Their approach is different to the coarse-grained approach taken in our work where only some computations and values have associated security labels. Rajani and Garg [33] show that both approaches are equally expressive for static IFC techniques and Vassena et al. [48] show the same for dynamic IFC techniques.

Principles for Information Flow. Bastys et al. [6] put forward a set of informal principles for information flow security definitions and enforcement mechanisms: attacker-driven security, trust-aware enforcement, separation of policy annotations and code, language-independence, justified abstraction, and permissiveness.

DepSec follows the principle of trust-aware enforcement, as we make clear the boundary between the trusted and untrusted components in the program. Additionally, the design of our declassification mechanism follows the principle of separation of policy annotations and code. The use of dependent types increases the permissiveness of our enforcement as we discuss throughout the paper. While our approach is not fully language-independent, we posit that the approach may be ported to other programming languages with general-purpose dependent types.

Declassification Enforcement. Our hatch builders are reminiscent of downgrading policies of Li and Zdancewic [19]. For example, similar to them, DepSec’s declassification policies naturally express the idea of delimited release [36] that provides explicit characterization of the declassifying computation. Here, DepSec’s policies can express a broad range of policies that can be expressed through predicates, an improvement over simple expression-based enforcement mechanisms for delimited release [4, 5, 36].

An interesting point in the design of declassification policies is robust declassification [49] that demands that untrusted components must not affect information release. Qualified robustness [2, 28] generalizes this notion by giving untrusted code a limited ability to affect information release through the introduction of an explicit endorsement operation. Our approach is orthogonal to both notions of robustness as the intent is to let the untrusted components declassify information but only under very controlled circumstances while adhering to the security policy.

8 Conclusion and Future Work

In this paper, we have presented DepSec – a library for statically enforced information-flow control in Idris. Through several case studies, we have showcased how the DepSec primitives increase the expressiveness of state-of-the-art information-flow control libraries and how DepSec matches the expressiveness of a special-purpose dependent information-flow type system on a key example. Moreover, the library allows programmers to implement policy-parameterized functions that abstract over the security policy while retaining precise security levels.

By taking ideas from the literature and by exploiting dependent types, we have shown powerful means of specifying statically enforced declassification policies related to what, who, and when information is released. Specifically, we have introduced the notion of predicate hatch builders and token hatch builders that rely on the fulfillment of predicates and possession of tokens for declassification. We have also shown how the monad [10] can be used to limit hatch usage statically.

Finally, we have discussed the necessary means to show progress-insensitive noninterference in a sequential setting for a dependently typed information-flow control library like DepSec.

Future Work. There are several avenues for further work. Integrity is vital in many security policies and is not considered in MAC nor DepSec. It will be interesting to take integrity and the presence of concurrency into the dependently typed setting and consider internal and termination covert channels as well. It also remains to prove our declassification mechanisms sound. Here, attacker-centric epistemic security conditions [3, 16] that intuitively express many declassification policies may be a good starting point.