Password-authenticated searchable encryption

We introduce Password Authenticated Searchable Encryption (PASE), a novel searchable encryption scheme where a single human-memorizable password can be used to outsource (encrypted) data with associated keywords to a group of servers and later retrieve this data through the encrypted keyword search procedure. PASE ensures that only the legitimate user who knows the initially registered password can perform these operations. In particular, PASE guarantees that no single server can mount an offline attack on the user's password or learn any information about the encrypted keywords. PASE extends earlier notions of searchable encryption by removing the requirement on the client to store high-entropy keys, thus making the protocol device-agnostic on the user side. In this paper, we model the functionality of PASE along with two security requirements (indistinguishability against chosen keyword attacks and authentication) and propose an efficient direct construction in a two-server setting whose security we prove in the standard model under the Decisional Diffie–Hellman assumption. Our constructions support outsourcing and retrieval procedures based on multiple keywords and allow users to change their passwords without any need for the re-encryption of the outsourced data. Our theoretical efficiency comparisons and experimental performance and scalability measurements show that the proposed scheme is practical and offers high performance in relation to computations and communications on the user side. The practicality of our PASE scheme is further demonstrated through its implementation within a JavaScript-based web application that can readily be executed on any (mobile) browser and remains practical for commodity user devices such as laptops and smartphones.


Searchable encryption
Using protocols for Searchable Encryption [2,10,20,29], clients with limited computing and storage resources can outsource encrypted data to a server or a collection of servers, perform search over the encrypted data (typically using encrypted keywords) and eventually retrieve searched data while preserving its privacy against the servers. Existing searchable encryption schemes can be broadly split into those where the keyword search procedure requires high-entropy shared keys on the user side, as in Symmetric Searchable Encryption (SSE) schemes, and those where it requires a private-public key pair, as in Public Key Encryption with Keyword Search (PEKS) schemes. In practice, the requirement to maintain high-entropy keys on the user side results in less flexibility when it comes to the use of multiple, different devices for outsourcing and retrieval of data. The user is effectively prevented from using different devices unless the private key is made available to every such device.

Symmetric searchable encryption
Symmetric searchable encryption enables the user to encrypt the data, organizing it in an arbitrary way (before encryption), and to include additional data structures that allow for efficient access to relevant data. In this setting, the initial work for the user (i.e., for preprocessing the data) is at least as large as the data, but subsequent work (i.e., for accessing the data) is very small relative to the size of the data for both the user and the server. Ostrovsky demonstrated that symmetric searchable encryption can be achieved in its full generality and with optimal security using Oblivious RAM but with huge overhead [35]. Further works try to make the construction efficient with more rounds and a weaker security model to reduce the overhead. Song et al. [21] approached SSE using a new two-layered encryption, whose outer layer discloses whether a particular keyword is stored in an inner encryption using a trapdoor. Unfortunately, search requires computation linear in the size of each document and reveals statistical information about the distribution of the underlying plaintext. Both of these limitations were addressed by Goh [23] through associating secure indexes to each document in a collection. That work also introduced the notion of semantic security against chosen-keyword attacks (called IND-CKA), which is the first formal notion of security defined for searchable encryption.
In the context of complex search queries, the above schemes are restricted to single-keyword equality queries. Ballard et al. [5] provided a secure and efficient system to perform Boolean keyword searches using Shamir's secret sharing. Curtmola et al. [20] introduced two variants of SSE (adaptively secure and non-adaptively secure) with the use of lookup tables. Chase et al. [17] introduced the notion of structured encryption, where arbitrarily structured data are encrypted in such a way that they can be queried through the use of a query-specific token that can only be generated with knowledge of the secret key. The scheme improves over the non-adaptive variant of [20], achieving keyword search through generating dictionaries of each keyword which contain pointer-output for each document. Kamara et al. [29] further refined the model to a dynamic searchable encryption scheme based on the inverted indexes approach of [20].
Other variants of SSE include Message Lock Encryption by Bellare et al. [8], where the key under which encryption and decryption are performed is derived from the message itself, and search pattern obfuscation by Orencik et al. [34] using preprocessed term frequency-inverse document frequency (tf-idf) weights of keyword-document pairs.

Public key encryption with keyword search (PEKS)
The notion of Public Key Encryption with Keyword Search was introduced by Boneh et al. [10] using bilinear maps and trapdoor permutations. The mechanism provided an efficient way to check whether a keyword is associated with a given document without leaking anything else about the document. However, due to the computation cost of public key encryption, the constructions were applicable to searching on a small number of keywords rather than an entire file. Moving beyond just equality-based keyword search, Park et al. [36] and Boneh et al. [10] extended PEKS for conjunctive [10,36], subset and range [10] queries on encrypted data.
However, the PEKS construction does not allow the recipient to decrypt keywords, i.e., encryption is not invertible. This was addressed by Fuhr et al. [22], through introducing decryptable searchable encryption using an identity-based key encapsulation mechanism (ID-KEM). The concept also paved the way for management of encrypted data, since the decryption key and the trapdoor derivation key are generated independently from one another and hence data can be decrypted by one entity while trapdoors are generated by some other managing party. Abdalla et al. [2] defined the computational and statistical relaxations of the existing notion of perfect consistency, showing that [10] is computationally consistent, and providing a new scheme that is statistically consistent. Third party delegation was further studied by Ibraimi et al. [25], employing the notion of Public Key Encryption with Delegated Search (PKEDS), which enables a third party to search a document for a particular keyword encrypted by the user.
Other variants of public key encryption in the context of keyword search include Deterministic Searchable Encryption [6] and Plaintext-Checkable Encryption [15]. Bellare et al. [6] achieved deterministic searchable encryption using RSA-DOAEP, a length preserving deterministic encryption scheme. A plaintext-checkable encryption scheme is a probabilistic public-key encryption scheme with the additional functionality that anyone can test whether a ciphertext is the encryption of a given plaintext message under a public encryption key. Canard et al. [15] provided an efficient construction for plaintext checkable encryption using an ElGamal-based approach.

Password-authenticated searchable encryption (PASE)
The idea of basing searchable encryption solely on passwords, proposed in this paper, helps to avoid costly and risky key management on the user side and enables the whole process to be device-agnostic. This, however, comes with challenges considering that both passwords and keywords typically have low entropy. Amongst the core security properties of PASE, there is a need to guarantee that only the legitimate user, who knows the password, can outsource, search and retrieve data. Hence, basing security of searchable encryption schemes on passwords introduces the need for a distributed server environment where trust is spread across at least two non-colluding servers, as is also the case in many password-based protocols for authentication and secret sharing, e.g., [4,12-14,26-28,30,31,40]. The use of two servers provides the most practical scenario and the minimum requirement to achieve protection against offline dictionary attacks, while a more general secret sharing architecture with t-out-of-n servers would be applicable as well. Chen et al. [18] further demonstrated the resilience of the two-server model against keyword guessing attacks for (public key-based) PEKS schemes. Thus, the two-server model of PASE offers the best trade-off between performance and protection, guarding against both offline dictionary and keyword guessing attacks.
We model PASE as a searchable encryption scheme where users can register their passwords with the servers and then re-use these passwords for multiple sessions of the outsource and retrieval protocols. In each outsource session, the user can outsource encrypted keywords along with some (encrypted) document to both servers. The retrieval protocol realizes the search procedure based on the keyword that the user inputs to the protocol and provides the user with all documents associated with that keyword, allowing the user to also verify the integrity of the retrieved documents. We define security of the PASE scheme using BPR-like models [3,9] that have been widely used for password-based protocols. We define privacy of PASE keywords through indistinguishability against chosen keyword attacks (IND-CKA) while considering active adversaries, possibly in control of at most one server, who can also register their own passwords in the system. While IND-CKA security prevents an adversary who does not know the password from successfully retrieving outsourced data, we additionally require authentication to protect the outsourcing operation itself, thus preventing the adversary from outsourcing data on behalf of the user; this requirement must also hold even if the adversary controls one of the servers.
Our direct PASE construction conceptually follows a more general approach that combines ideas behind Password Authenticated Secret Sharing (PASS) [4,12-14,26,27,40] and SSE [5,20,34]. In the registration phase, the user picks a password π and a high-entropy symmetric key K that will be used to encrypt keywords, and secret-shares K, protected with π, across both servers. In order to outsource keywords, the user engages in the PASS reconstruction protocol to obtain K and then in the SSE outsource protocol to outsource the keywords. In order to search for keywords and retrieve data, the user again reconstructs K using PASS and performs the keyword search using SSE. We stress, however, that our construction is direct and does not use PASS and SSE as generic building blocks. A generic construction from these two primitives currently remains out of reach due to significant differences in the syntax, functionality and security amongst the existing PASS protocols. First, PASS protocols do not separate the registration from the secret sharing phase and therefore do not enforce user authentication upon secret sharing, which would be required for the outsourcing protocol in PASE. Second, existing PASS protocols were proven in different security models, e.g., BPR-like in [4,40] and UC-based in [12,14,27,28], and do not necessarily follow the same functionality and syntax, which makes it hard to use PASS as a generic building block in PASE without revising the syntax and security models of those PASS protocols. While we could update the syntax of PASS protocols to allow for a generic usage in PASE, such an update would introduce changes to the original PASS protocols and require new security proofs. Moreover, generic constructions often lead to less efficient instantiations than directly constructed schemes. For all the aforementioned reasons, we are not formally proposing a generic PASE construction in this paper and opt for a direct and efficient scheme (cf. Sect. 3) based on well-known assumptions in the standard model.

Paper organization
Section 2 formally models PASE functionality and defines its main security properties. Section 3 introduces our direct PASE construction. We recall the underlying cryptographic building blocks and present a high-level design rationale for the scheme. This section also compares the efficiency of the key reconstruction phase of the proposed PASE scheme with existing PASS protocols and highlights additional support for multi-keyword operations and password change. Section 4 contains formal security analysis of the proposed scheme. In Sect. 5, we present our browser-based demonstrator with complete implementation of the proposed PASE functionality. This section also contains experimental results on the evaluation of performance and scalability of our implementation on commodity user devices. Section 6 concludes this paper.

PASE model and definitions
In this section, we model the functionality of PASE and provide definitions of its security requirements.

Syntax of algorithms and protocols
In our PASE model, any user U can perform an initial registration procedure with any two servers S_0 and S_1 in the system and then use the registered password π (from some dictionary D) to outsource and retrieve data based on associated keywords w ∈ W. Each server S_d, d ∈ {0, 1}, maintains its own database where, for each user, it records the associated secret information info_d obtained during the registration procedure and the outsourced data (C, ix) obtained from multiple executions of the outsource protocol; C represents a ciphertext for the keywords, whereas the index ix stands for the outsourced (and possibly encrypted) document that is associated with the encrypted keywords. Similar to other searchable encryption schemes (e.g., [2]), we do not explicitly model the encryption of outsourced documents and use indices ix ∈ I as placeholders for these documents.
-Setup(1^κ) is an initialization algorithm that on input a security parameter κ ∈ N generates public parameters par of the scheme.
-Register is a registration protocol executed between some user U (running interactive algorithm RegisterU) and two servers S_0 and S_1 (running interactive algorithms RegisterS_d, d ∈ {0, 1}) according to the following specification:
  -RegisterU(par, π, S_0, S_1): on input par and some password π ← D, this algorithm interacts with RegisterS_d, d ∈ {0, 1}, and outputs a flag s ∈ {succ, fail}. If s = succ, the user remembers π and forgets all other information.
  -RegisterS_d(par, U, S_{1-d}): on input par, this algorithm interacts with RegisterU (and possibly RegisterS_{1-d}) and at the end of a successful interaction stores some secret information info_d associated with U at S_d.
-Outsource is an outsourcing protocol executed between some user U (running interactive algorithm OutsourceU) and two servers S_0 and S_1 (running interactive algorithms OutsourceS_d, d ∈ {0, 1}) according to the following specification:
  -OutsourceU(par, π, w, ix, S_0, S_1): on input π, a keyword w, and some index ix, this algorithm interacts with OutsourceS_d, d ∈ {0, 1}, and outputs a flag s ∈ {succ, fail}.
  -OutsourceS_d: this algorithm, upon successful interaction with OutsourceU (and possibly OutsourceS_{1-d}), stores a record (C, ix) in its database C_d.
-Retrieve is a retrieval protocol executed between some user U (running interactive algorithm RetrieveU) and two servers S_0 and S_1 (running interactive algorithms RetrieveS_d, d ∈ {0, 1}) according to the following specification:
  -RetrieveU(par, π, w, S_0, S_1): on input π and a keyword w, this algorithm, upon successful interaction with RetrieveS_d, d ∈ {0, 1}, outputs the set I containing all ix associated with w.
  -RetrieveS_d(par, U, info_d): this algorithm interacts with RetrieveU (and possibly RetrieveS_{1-d}) and outputs a flag s ∈ {succ, fail}.

PASE security model
The security of PASE is defined based on two main security goals: indistinguishability against chosen keyword attacks (IND-CKA) and authentication. We adopt a BPR-like modeling approach [9] for password-based cryptographic protocols and define security through experiments (cf. Fig. 1) where a PPT adversary A has full control over the communication channels and can interact with parties (controlled by a simulator) through the set of oracles defined in the following.

Adversarial model and oracles
For each user U, we allow A to take full control over at most one of the two servers S_0 and S_1 that were chosen by U during the registration phase, which captures the required distributed trust relationship. We mostly use S_d to denote the uncorrupted server and S_{1-d} to denote the server controlled by the adversary. The oracles allow A to invoke interactive algorithms for all protocols of PASE, which will be executed (honestly) by the simulator. A can interact with these algorithms and thereby participate in the protocol. In particular, we allow A to participate in outsourcing and retrieval protocols on behalf of some corrupted server and also as some (illegitimate) user who tries to guess the registered password during the execution of the protocol. Let τ be an initially empty array that will be populated with tuples of the form τ[j] ← (d, π, info_d) at the end of each successful j-th registration session such that π is the registered password and info_d is the secret data stored at the server S_d at the end of that session. We also use variables i* ∈ Z, ix* ∈ I and a set Set that are maintained by the experiments. The adversary A can access the following oracles.
-Registration oracle Reg(·): on input d ∈ {0, 1}, the experiment first initializes C_{d,j} ← ∅ as a database for session j. Then, it randomly picks a fresh password π ← D. The Register protocol is executed with A, where the oracle plays the roles of honest U and S_d executing algorithms RegisterU(par, π, S_0, S_1) and RegisterS_d(par, U, S_{1-d}), respectively, and A plays the role of corrupted S_{1-d}. After the interaction, the experiment records τ[j] ← (d, π, info_d), delivers j to the adversary and increases j ← j + 1.
-Outsource oracle Out(·, ·, ·): on input (i, w, ix), the oracle aborts if i ≥ j; otherwise, it obtains [...]
-Ret(·, ·): [...] The Retrieve protocol is then executed with A, where the oracle plays the roles of honest U and S_d executing algorithms RetrieveU(par, π, w, S_0, S_1) and RetrieveS_d(par, U, info_d), respectively, and A plays the role of corrupted S_{1-d}. In the IND-CKA experiment, if (i* = −1) the oracle additionally computes [...]
-RetS(·): [...] The Retrieve protocol is then executed with A, where the oracle plays the role of honest S_d executing algorithm RetrieveS_d(par, U, info_d) and A plays the roles of (illegitimate) U and corrupted S_{1-d}. Note that this oracle will be used to model IND-CKA security of PASE.

Indistinguishability against chosen keyword attacks (IND-CKA)
The IND-CKA property for PASE is defined through the experiment Exp^{IND-CKA-b}_{PASE,A}(κ) (cf. Fig. 1) and is closely related to [5] except that our setting is based on passwords. A is given the public parameters par and permitted to adaptively access oracles Ch_ind(b, ·, ·, ·, ·), Reg(·), Out(·, ·, ·), Ret(·, ·) and RetS(·) at most 1, q_r, q_o, q_t and q_s times, respectively. In particular, our IND-CKA experiment captures the following ways that A may try to retrieve data: (i) from interaction with an honest user U and the honest server S_d while playing the role of corrupted S_{1-d} (which is captured through the oracle Ret(·, ·)), or (ii) from interaction with the honest server S_d while playing the roles of an illegitimate user, e.g., trying to guess the registered password, and of the corrupted server S_{1-d} (which is captured through the oracle RetS(·)).
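For orientation, the IND-CKA requirement can be written in the usual BPR style as a bound on the distinguishing advantage of A. The display below is a sketch under that assumption: the absolute-difference form of the advantage is the customary one for such experiments and is assumed here, while the right-hand side follows the explanation of the terms q_s/|D| and ε(κ) given in the text.

```latex
\[
\mathsf{Adv}^{\text{IND-CKA}}_{\mathsf{PASE},\mathcal{A}}(\kappa)
  = \Bigl|\Pr\bigl[\mathsf{Exp}^{\text{IND-CKA-1}}_{\mathsf{PASE},\mathcal{A}}(\kappa) = 1\bigr]
        - \Pr\bigl[\mathsf{Exp}^{\text{IND-CKA-0}}_{\mathsf{PASE},\mathcal{A}}(\kappa) = 1\bigr]\Bigr|
  \;\le\; \frac{q_s}{|D|} + \varepsilon(\kappa),
\]
```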
where |D| is the dictionary size and ε(κ) is negligible in the security parameter κ. Note that the probability q_s/|D| relates to the use of oracle RetS(·), which models online dictionary attacks, and assumes a uniform distribution of passwords within D, as is also common in BPR-like models.

Authentication (Auth)
The property of authentication for PASE is defined using experiment Exp^{Auth}_{PASE,A}(κ) in Fig. 1. A is given the public parameters par and permitted to access oracles Reg(·), Out(·, ·, ·), OutS(·) and Ret(·, ·) at most q_r, q_o, q_s and q_t times, respectively. Our experiment effectively captures attacks where A tries to outsource some data ix* on behalf of some user U without knowing the registered password (via the OutS(·) oracle), possibly after having interacted with U and the honest server S_d. In its attack on authentication, A can play the role of a corrupted server S_{1-d} and also mount man-in-the-middle attacks on sessions of Outsource and Retrieve protocols involving user U.
A PASE scheme provides authentication if for all PPT A the probability Adv^{Auth}_{PASE,A}(κ) ≤ q_s/|D| + ε(κ). As in the IND-CKA case, we again need to account for the possibility of online guessing attacks via the oracle OutS(·).

Our direct PASE construction
In this section, we propose a direct and efficient construction of PASE. It follows our general idea of combining suitable password-authenticated secret sharing with symmetric searchable encryption techniques. In the introduction, we explained the difficulties behind an attempt to construct PASE generically using PASS and SSE schemes and motivated our choice for a direct construction.

Cryptographic building blocks
In our PASE construction, we rely on a number of well-known cryptographic primitives that we briefly introduce in the following.

Pedersen commitments [37]
Let g, h be two generators of a multiplicative cyclic group G of order q such that the discrete logarithm of h with respect to g is unknown. For a message m ∈ Z_q^*, the Pedersen commitment is computed as c ← g^r h^m, where r ←$ Z_q^*, and is opened by providing (r, m). We recall that Pedersen commitments offer computational binding based on the discrete logarithm problem, i.e., assuming Adv^{DL}_A(κ) is negligible, and provide perfect hiding.

Pseudorandom functions [24,33]
Let k ∈ K_PRF be a high min-entropy key in the PRF key space. A pseudorandom function PRF is called (t, q, ε(κ))-secure if for any PPT algorithm A running in time t with at most q oracle queries the probability Adv^{PRF}_A(κ) ≤ ε(κ) for distinguishing the outputs of PRF(k, m) from the outputs of a truly random function f of the same length, assuming that A has oracle access to O_PRF(·), which contains either PRF(k, ·) or f(·) and which cannot be queried on m.

Key derivation functions [32]
Let Σ be a source of key material. A key derivation function KDF is called (t, q, ε(κ))-secure with respect to Σ if for any PPT algorithm A running in time t with at most q oracle queries the probability Adv^{KDF}_A(κ) ≤ ε(κ) for distinguishing the output of KDF(k, c) from uniformly drawn random strings of the same length, assuming that (k, α) ← Σ, where k is the secret key material and α is some side information. It is assumed that A knows α, has control over the context information c and has oracle access to KDF(k, ·), which cannot be queried on c.

Message authentication codes [7]
A message authentication code (KGen, Tag, Vrfy) is comprised of a key generation algorithm KGen that outputs a key mk, a tagging algorithm Tag(mk, m) that outputs a tag μ for a message m, and a verification algorithm Vrfy(mk, m, μ) that outputs 1 if μ is valid for m and 0 otherwise. A MAC is secure if any PPT algorithm A without knowledge of mk has only negligible probability Adv^{MAC}_A(κ) to forge a tag μ* for some message m*. A has access to the tag oracle O_Tag(·), which returns μ ← Tag(mk, m) on input m. The only restriction is that m* is never queried to O_Tag(·).
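To make the first of these building blocks concrete, the following minimal Python sketch computes and opens a Pedersen commitment c = g^r h^m. All parameters are illustrative assumptions: the group is the order-q subgroup of quadratic residues modulo a toy safe prime p = 2q + 1, and g = 4, h = 9 are picked by hand, so the discrete logarithm of h to base g is trivially computable and the example offers no security. The function names commit and open_commitment are hypothetical.

```python
import secrets
from typing import Optional, Tuple

# Toy parameters (assumption, far too small for real use): p = 2q + 1 with q prime;
# the group G is the order-q subgroup of quadratic residues modulo p.
P = 2039
Q = 1019
G = 4  # 2^2 mod p, a generator of the order-q subgroup
H = 9  # 3^2 mod p; in a real setup nobody may know log_g(h)


def commit(m: int, r: Optional[int] = None) -> Tuple[int, int]:
    """Return (c, r) with c = g^r * h^m mod p for a message m in Z_q^*."""
    if not 1 <= m < Q:
        raise ValueError("message must lie in Z_q^*")
    if r is None:
        r = secrets.randbelow(Q - 1) + 1  # uniform in {1, ..., q-1}
    c = (pow(G, r, P) * pow(H, m, P)) % P
    return c, r


def open_commitment(c: int, r: int, m: int) -> bool:
    """Check an opening (r, m) by recomputing the commitment."""
    return c == (pow(G, r, P) * pow(H, m, P)) % P


if __name__ == "__main__":
    c, r = commit(42)
    print(open_commitment(c, r, 42))
```

The first protocol message A ← g^a h^π has exactly this shape, with the password π in the role of the message and a as the commitment randomness.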

High-level design rationale
Our PASE protocol is inspired by the techniques used in the recent password-authenticated secret sharing protocol from [40], which we modified to address the functionality and requirements of PASE and extended with a suitable mechanism for symmetric searchable encryption of keywords. In particular, we define a new registration protocol Register in which the user registers its password π, encrypted in C_π, with both servers and also picks a symmetric key K for which it computes appropriate shares K_0 and K_1 that are then sent to the corresponding servers. The reconstruction of K is protected by π, and MAC tags μ_d are used to ensure the validity of K upon its reconstruction. The protocols Outsource and Retrieve follow a similar pattern. First, the user reconstructs K using its password π after communication with both servers. Then, in the Outsource protocol, U uses K in combination with its keyword w to derive a trapdoor t ← KDF_2(K, w) and a fresh randomness e to derive a verifier v ← PRF(t, e). The pair (e, v) becomes part of the outsourced ciphertext C, which is bound to some data ix. During the Retrieve protocol, the user can recompute the trapdoor t for a given keyword w and then send it to the servers, who can then find all outsourced ciphertexts C for which v = PRF(t, e) holds and hence identify which data ix needs to be returned. In order to prevent servers from creating their own pairs (e, v) for a given t, the outsourced ciphertext C additionally includes a MAC tag μ_c which authenticates (e, v) and also ix and which can only be computed and verified using K. During the Retrieve protocol, the user ensures that its final search result contains only data that pass this integrity and authenticity check. In addition, both protocols make use of MACs to ensure authenticity of messages, where the MAC keys are derived from K on the user side.
We emphasize that our PASE construction is in the password-only setting, where servers are not required to possess any public keys for the security of the PASE scheme. However, if the registration protocol Register is performed remotely over a public network, then this protocol needs to be executed over server-authenticated secure channels (e.g., TLS). In order to enable reconstruction of K by the user and to protect this phase with the password, both servers communicate with each other as part of the Outsource and Retrieve protocols. While in practice this communication between the two servers will likely be protected using a secure channel (e.g., TLS), we stress that in our protocols this communication can take place over an insecure channel.

Detailed description
In the following, we provide a detailed description of all algorithms and protocols underlying our direct PASE scheme, along with Figs. 2 and 3 that illustrate the protocols Outsource and Retrieve, respectively.

Initialization procedure Setup(1^κ)
The algorithm generates public parameters par containing {G, q, g, h, KDF_1, KDF_2, PRF, MAC}, where (G, q, g, h) represents a multiplicative cyclic group G of prime order q with generators g, h ←$ G such that the discrete logarithm of h with respect to base g remains unknown.
Here, K_PRF and K_MAC denote the PRF and MAC key spaces, respectively. We assume that passwords from D are represented as elements of Z_q^*.

Registration protocol Register
In order to register, a user U picks r_1, [...] for d ∈ {0, 1} over a server-authenticated secure channel. Finally, U memorizes π.

Outsourcing protocol Outsource
The Outsource protocol between the user U and each server S d , d ∈ {0, 1} is illustrated in Fig. 2, and its steps are detailed in the following. Note that as part of the Outsource protocol both S 0 and S 1 communicate with each other, possibly over an insecure channel.
1. [...] {0, 1}^κ and sends A ← g^a h^π to both servers.
2. On input A, server S_d executes the following steps: [...]
3. Upon receiving (Y, Z_0, μ_0) and (Y, Z_1, μ_1) from both servers, user U executes the following steps: [...]

Retrieval protocol Retrieve
The Retrieve protocol between the user U and each server S d , d ∈ {0, 1} is illustrated in Fig. 3, and its steps are detailed in the following. Note that as part of the Retrieve protocol both S 0 and S 1 communicate with each other, possibly over an insecure channel.
1. User U randomly selects a ←$ Z_q^* and sends A ← g^a h^π to both servers.
2. On input A, server S_d executes the following steps: [...]
3. Upon receiving (Y, Z_0, μ_0) and (Y, Z_1, μ_1) from both servers, U executes the following steps: [...]
4. On input (t, μ_{sk_d}), server S_d executes the following steps:
   (a) If Vrfy(sk_d, t, μ_{sk_d}) = 0, then abort, else compute [...]
[...] (e, v, ix), μ_c) = 1. This step guarantees that only outsourced data for which the integrity check was performed successfully will be added to the output set I.

[Fig. 2: The Outsource protocol between user U and server S_d. The server-side algorithm includes communication between servers S_d and S_{1-d}.]

Correctness of our PASE scheme
In the following, we illustrate that if the initially registered password π is used by the user in the executions of the Outsource and Retrieve protocols, then computing Z_0 Z_1 Y^a in the key reconstruction phase results in the original key K:

\[
\begin{aligned}
Z_0 Z_1 Y^a &= K_0 (C_\pi A^{-1})^{y_0} (g^{r_1} R)^{-x_0} \cdot K_1 (C_\pi A^{-1})^{y_1} (g^{r_1} R)^{-x_1} \cdot g^{a(y_0+y_1)} \\
&= X^{r_1} K \, (X^{r_2} g^{-a})^{y_0+y_1} \, (g^{r_1} R)^{-(x_0+x_1)} \, g^{a(y_0+y_1)} \\
&= X^{r_1} K \, X^{r_2(y_0+y_1)} \, X^{-r_1} \, X^{-r_2(y_0+y_1)} = K
\end{aligned}
\]

[Fig. 3: The Retrieve protocol between U and S_d. The server-side algorithm includes communication between servers S_d and S_{1-d}.]
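The cancellation in this derivation can be checked numerically in a toy group. The Python sketch below rests on assumptions read off from the terms of the derivation itself: X = g^{x_0+x_1}, Y = g^{y_0+y_1}, C_π = X^{r_2} h^π, R = g^{r_2(y_0+y_1)}, K_0 K_1 = X^{r_1} K and Z_d = K_d (C_π A^{-1})^{y_d} (g^{r_1} R)^{-x_d}. It is not a description of who computes which value in the protocol; Figs. 2 and 3 remain authoritative for that. The group parameters are toy values and the function name reconstruct is hypothetical.

```python
import secrets

# Toy group (assumption): order-q subgroup of quadratic residues modulo p = 2q + 1.
P, Q = 2039, 1019
G, H = 4, 9


def rand_exp() -> int:
    """Uniform exponent in {1, ..., q-1}."""
    return secrets.randbelow(Q - 1) + 1


def inv(x: int) -> int:
    """Modular inverse via Fermat's little theorem (p is prime)."""
    return pow(x, P - 2, P)


def reconstruct(pi_registered: int, pi_used: int):
    """Return (K, Z_0 * Z_1 * Y^a) for fresh random exponents."""
    x0, x1, y0, y1, r1, r2, a = (rand_exp() for _ in range(7))
    K = pow(G, rand_exp(), P)                            # key encoded as a group element
    X = pow(G, (x0 + x1) % Q, P)
    Y = pow(G, (y0 + y1) % Q, P)
    C_pi = pow(X, r2, P) * pow(H, pi_registered, P) % P  # password encrypted at registration
    R = pow(G, r2 * (y0 + y1) % Q, P)
    K0 = pow(G, rand_exp(), P)                           # random share
    K1 = pow(X, r1, P) * K * inv(K0) % P                 # so that K0 * K1 = X^r1 * K
    A = pow(G, a, P) * pow(H, pi_used, P) % P            # user's first message
    base_pw = C_pi * inv(A) % P                          # C_pi * A^-1
    base_x = inv(pow(G, r1, P) * R % P)                  # (g^r1 * R)^-1
    Z0 = K0 * pow(base_pw, y0, P) * pow(base_x, x0, P) % P
    Z1 = K1 * pow(base_pw, y1, P) * pow(base_x, x1, P) % P
    return K, Z0 * Z1 * pow(Y, a, P) % P


if __name__ == "__main__":
    K, rec = reconstruct(pi_registered=123, pi_used=123)
    print(rec == K)
```

With a different password π' the same computation yields K · h^{(π − π')(y_0 + y_1)}, which differs from K unless the exponent vanishes modulo q.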

Efficiency comparison with existing PASS protocols
Given that our direct PASE construction follows the general idea of building PASE protocols based on the techniques used for password-authenticated secret sharing, we compare its performance with that of existing PASS protocols. Since our PASE scheme assumes the password-only setting (except for the registration), we restrict our comparison to password-only PASS schemes [4,26-28,40] and compare only the costs that arise from the sharing and retrieval of the symmetric key K. Note that in our PASE scheme sharing of K is performed as part of the Register protocol, whereas retrieval of K is part of both Outsource and Retrieve protocols and is accomplished in step 3(a) of these protocols. Since our PASE scheme adopts a two-server architecture, but the aforementioned PASS schemes were designed for a more general t-out-of-n threshold setting, we consider their costs for the special case of t = n = 2 to ease the comparison. The results of the comparison are presented in Table 1.
We compare computation costs through the number of modular exponentiations for the user and each of the servers during the sharing and retrieval phases of the symmetric key K. We also compare communication costs in the number of bits communicated in both phases, while considering user-server and server-server communications. For the lengths of elements in G and Z_q^*, we use |G| = q and |q| = κ bits, respectively. We also compare the number of rounds needed for the sharing and retrieval of K.
We observe that in terms of computation and communication costs the key sharing and reconstruction phases in our PASE scheme compare fairly well with those of existing PASS protocols. In particular, only the protocols in [27,28], which are the most computationally efficient PASS protocols to date, offer better overall computation and communication performance. We stress, however, that for PASE protocols the efficiency of the retrieval phase is of greater importance than that of the sharing phase. This is because in PASE sharing of K is performed only once as part of the registration procedure, but retrieval of K occurs each time the user wants to outsource data or search for keywords. Furthermore, due to the simplified key management (i.e., reliance on passwords only), PASE offers device-agnostic use of the functionality to the user and can possibly be executed on different client devices (ranging from desktops to smartphones). In this case, it becomes important to keep the costs associated with computations on the user side and the user-server communication low. Considering this, we observe that in comparison with [27,28] our PASE scheme achieves similar and even partly better performance for computations and communication involving the user device. As a result of our comparison, we conclude that our PASE scheme is sufficiently practical, since the additional costs arising from the encrypted keyword search functionality within our PASE protocols are negligible (due to the nature of computations involved) in comparison with the costly key sharing and retrieval steps.
In Table 1, the computation costs are measured in modular exponentiations and the communication costs are measured in bits. Both of these costs are provided separately for the user (u) and each server (s).

PASE with sublinear search complexities
Our PASE construction supports all CRUD operations required for a database, but its search complexity is O(D) for a database DB of size D, whilst state-of-the-art schemes achieve a better bound of O(log D). The search complexity of our PASE scheme can be decreased using the techniques from [16,19,38], yet at the cost of some security and/or functionality limitations. For instance, using the techniques from [16,19] would require limiting the PASE functionality to static databases, losing dynamic updates. The latter can be preserved with an ORAM but at a higher cost of O(D log D) for periodic oblivious sorting [38].
The state-of-the-art approach in [19] uses dynamic databases with limited updates. We adopt it here because it offers the best trade-off between efficiency and functionality. Currently, within each Outsource round we outsource one keyword w associated with some document ix. Using [19], within each Outsource round we can outsource a batch of documents by treating them as a static database DB. The optimization is achieved by constructing a look-up table T in the setup phase which holds pointers to the locations of the documents in DB such that the table inputs depend on the document keyword w. This restricts the functionality to dynamic databases that allow only the addition of documents but not their removal, as the latter would require an update of the look-up table resulting in worse than linear complexity.
In order to extend the Outsource protocol (cf. Fig. 2) [...] is the index of the file in DB and $|L_i| = n_i$. The protocol is then executed for each tuple $(w_i, \mathrm{ind}(ix_{i,j}), ix_{i,j})$ followed by the computation of an additional key $o_{i,j} \leftarrow \mathsf{KDF}_2(t_i, j)$ and the look-up table entry $T[o_{i,j}] := \mathrm{ind}(ix_{i,j})$. Once these values are calculated for all entries in DB, the entire encrypted database $\mathbf{C}_d$ is sent along with the look-up table $T$ to each server. Notice that $\mathbf{C}_d$ preserves the same order of elements from DB, and $\mathrm{ind}(\cdot)$ should give the same location for both $\mathbf{C}_d$ and DB. The Retrieve protocol is performed in the same way (cf. Fig. 3), except each server receives $(t, \mu_{sk_d}$ [...] from the look-up table $T$. To stop adversaries from trivially differentiating based on the list size, [19] extends DB to DB* with dummy documents, such that all lists have the same size $|L_i| = n$, for $n = \max_i \{n_i\}$. The resulting version of PASE would achieve the lower search bound of O(log D) but have the aforementioned limitation on the removal of documents. It intuitively satisfies the same security guarantees as the original version, based on the fact that each outsource operation can be seen as an outsource of a new independent static database.
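The look-up table idea can be illustrated with a short sketch. It is not the construction of [19] nor the PASE implementation: HMAC-SHA256 stands in for KDF 2, the helper names (kdf2, build_table, lookup) are hypothetical, and the convention that dummy documents are appended to DB after its last real entry is an assumption made only for illustration.

```python
import hashlib
import hmac


def kdf2(key: bytes, data: bytes) -> bytes:
    # Stand-in for KDF_2 (assumption: HMAC-SHA256, not the paper's PBKDF2).
    return hmac.new(key, data, hashlib.sha256).digest()


def build_table(K: bytes, lists: dict, db_size: int) -> dict:
    # lists maps each keyword w_i to its list L_i of document indices in DB.
    # Every list is padded to n = max_i n_i with indices of dummy documents,
    # assumed here to be appended to DB starting at position db_size.
    n = max(len(indices) for indices in lists.values())
    next_dummy = db_size
    table = {}
    for w, indices in lists.items():
        t = kdf2(K, w.encode())                  # trapdoor t_i = KDF_2(K, w_i)
        padded = list(indices)
        while len(padded) < n:
            padded.append(next_dummy)
            next_dummy += 1
        for j, ind in enumerate(padded):
            o = kdf2(t, j.to_bytes(4, "big"))    # o_ij = KDF_2(t_i, j)
            table[o] = ind                       # T[o_ij] := ind(ix_ij)
    return table


def lookup(table: dict, t: bytes) -> list:
    # Server side: walk j = 0, 1, ... until no entry exists for KDF_2(t, j).
    found = []
    j = 0
    while True:
        o = kdf2(t, j.to_bytes(4, "big"))
        if o not in table:
            return found
        found.append(table[o])
        j += 1
```

In this sketch the server returns all n positions for a trapdoor, including the dummy ones, so every keyword yields a result of the same length; the client would discard indices that are at least db_size.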

Extensions with multiple keywords
In the given specification of our PASE construction, users can use only one keyword w in each execution of the Outsource and Retrieve protocols. Often, users may want to be able to outsource or search for documents associated with multiple keywords. Our PASE scheme can be extended to provide efficient support for multiple keywords. Let $\mathbf{w} = (w_1, \dots, w_n)$ be a set of outsourced keywords for some document ix and let $\mathbf{w}' = (w'_1, \dots, w'_m)$ be a set of searched keywords. In the following, we show how to support (i) outsourcing of ix with $\mathbf{w}$ through a single session of the Outsource protocol and (ii) search for all suitable documents ix using $\mathbf{w}'$ through a single session of the Retrieve protocol, based on three different types of search queries [11]: conjunctive queries ($\mathbf{w}' = \mathbf{w}$), disjunctive queries ($|\mathbf{w} \cap \mathbf{w}'| > 0$), and those for a subset of keywords ($\mathbf{w}' \subseteq \mathbf{w}$).
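The three query types can be stated as plain set predicates. The sketch below works on plaintext keyword sets purely to illustrate the matching conditions; in PASE the servers never see plaintext keywords, and the function names are hypothetical. Here, stored denotes the outsourced keyword set of a document and searched denotes the keyword set of a query.

```python
def conjunctive(stored: set, searched: set) -> bool:
    # Match only if the searched keywords are exactly the stored keywords.
    return searched == stored


def disjunctive(stored: set, searched: set) -> bool:
    # Match if at least one searched keyword is among the stored keywords.
    return len(stored & searched) > 0


def subset(stored: set, searched: set) -> bool:
    # Match if every searched keyword is among the stored keywords.
    return searched <= stored
```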

Outsourcing documents with multiple keywords
In order to outsource some document ix associated with multiple keywords $\mathbf{w} = (w_1, \dots, w_n)$ [...] for $i = 1, \dots, n$, and $\mu_c \leftarrow \mathsf{Tag}(mk_u, (e, \mathbf{v}))$ as part of the same Outsource execution and outsource $C \leftarrow (e, \mathbf{v}, \mu_c)$ as the resulting ciphertext to both servers.
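A minimal sketch of multi-keyword outsourcing and a subset-style match is given below. It rests on several assumptions that are not confirmed by the text at this point: each component is taken to be v_i = PRF(KDF 2 (K, w_i), e), mirroring the single-keyword expressions t ← KDF 2 (K, w) and v ← PRF(t, e) used elsewhere in the paper; e is modeled as a random 16-byte value; HMAC-SHA256 with distinct labels stands in for KDF 2, PRF and MAC; and all function names are hypothetical.

```python
import hashlib
import hmac
import os


def _hmac(key: bytes, label: bytes, msg: bytes) -> bytes:
    # One primitive with distinct labels stands in for KDF_2, PRF and MAC.
    return hmac.new(key, label + msg, hashlib.sha256).digest()


def kdf2(K: bytes, w: str) -> bytes:
    return _hmac(K, b"kdf2|", w.encode())      # trapdoor t_i for keyword w_i


def prf(t: bytes, e: bytes) -> bytes:
    return _hmac(t, b"prf|", e)                # assumed v_i = PRF(t_i, e)


def tag(mk_u: bytes, e: bytes, v: list) -> bytes:
    return _hmac(mk_u, b"tag|", e + b"".join(v))   # mu_c = Tag(mk_u, (e, v))


def outsource(K: bytes, mk_u: bytes, keywords: list) -> tuple:
    e = os.urandom(16)                         # assumption: e is a fresh random value
    v = [prf(kdf2(K, w), e) for w in keywords]
    return e, v, tag(mk_u, e, v)               # ciphertext C = (e, v, mu_c)


def verify(mk_u: bytes, c: tuple) -> bool:
    e, v, mu_c = c
    return hmac.compare_digest(tag(mk_u, e, v), mu_c)


def subset_match(c: tuple, trapdoors: list) -> bool:
    # Server side: every searched trapdoor must reproduce some component of v.
    e, v, _ = c
    return all(prf(t, e) in v for t in trapdoors)
```

Under these assumptions a single ciphertext carries one component per keyword, so the server can evaluate a query from trapdoors alone without learning the keywords themselves.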

Search queries with multiple keywords
In order to search for documents using multiple keywords, i.e., $w'_1, \dots, w'_m$, $m \le n$, within a single execution of the Retrieve protocol, user U can send a set of authenticated trapdoors $t_i = \mathsf{KDF}_2(K, w'_i)$ for all searched keywords $w'_i$, $i = 1, \dots, m$, to both servers. Then, for all $(C, ix) = (e, \mathbf{v}, \mu_c, ix)$ stored in the database $\mathbf{C}_d$, server $S_d$ can initialize an empty output set [...]

Password change
Our PASE scheme allows users to change their passwords without changing the encryption key K. The latter requirement is crucial since otherwise all outsourced keywords would need to be re-encrypted. In the following, we describe how a user can change the current password π to a new password π* depending on whether the user still knows π or has forgotten it.

Changing known passwords
A new password π* can be registered with the knowledge of the current π as follows:
1. User U sends $A \leftarrow g^a h^{\pi}$ to both servers (as in Outsource and Retrieve).
[...]
Note that the current π is used implicitly to authenticate the user toward both servers.

Changing forgotten passwords
The above procedure cannot be executed if the user has forgotten her current password π. In this case, the user can no longer implicitly authenticate herself during the password change procedure. Since our PASE construction relies only on passwords, we naturally need to assume some alternative fallback authentication mechanism (e.g., similar to those used on the web) that would be able to distinguish legitimate users from potential impersonators. We assume that a fallback authentication mechanism is in place which allows the user to independently set up secure channels with each of the two servers $S_d$, $d \in \{0, 1\}$. The establishment of such channels still leaves us with the challenge of registering a new password π* for that user without changing the previously registered encryption key K. We observe that upon the initial registration the encryption key K satisfies the following equation [...] is known only to the corresponding $S_d$. Moreover, the current password π is encrypted in the ElGamal ciphertext $(g^{r_2}, C_{\pi} = X^{r_2} h^{\pi})$ stored on both servers. In the following password change protocol, this ElGamal ciphertext is replaced with $(g^{r_2^*}, C_{\pi^*} = X^{r_2^*} h^{\pi^*})$ for the new password π* such that the underlying base X remains unchanged:
1. Each server $S_d$, $d \in \{0, 1\}$, computes $X_d \leftarrow g^{x_d}$ using $x_d$ from $\mathsf{info}_d$ and sends $X_d$ to U over the previously established secure channel.
[...]
This password change protocol can be seen as a compressed version of the registration protocol. Jumping ahead to Sect. 4, we observe that the newly registered password π* remains protected against an adversary who can compromise at most one of the two servers under the same assumptions as the old password π.
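The replacement of the ElGamal ciphertext under an unchanged base X can be sketched in a toy group. The sketch makes several assumptions that go beyond what is stated above: X is taken to be the product X_0 · X_1 of the values sent by the two servers, passwords are modeled as integers modulo q, and the group parameters are tiny illustrative values rather than the elliptic curve group of the implementation. The function strip_mask uses both server secrets at once and exists only to check the algebra; no single party in the protocol is meant to hold x_0 + x_1.

```python
import secrets

P = 2039   # toy safe prime, P = 2*Q + 1 (far too small for real use)
Q = 1019   # prime order of the subgroup of squares modulo P
G = 4      # generator g of the order-Q subgroup
H = 9      # second generator h; log_g(h) must be unknown in a real deployment


def server_share() -> tuple:
    # Server S_d holds x_d and publishes X_d = g^{x_d}.
    x_d = secrets.randbelow(Q - 1) + 1
    return x_d, pow(G, x_d, P)


def encrypt_password(X: int, pi: int) -> tuple:
    # ElGamal-style ciphertext (g^r, X^r * h^pi) with a fresh exponent r.
    r = secrets.randbelow(Q - 1) + 1
    return pow(G, r, P), (pow(X, r, P) * pow(H, pi, P)) % P


def strip_mask(ciphertext: tuple, x: int) -> int:
    # Check only: removes X^r when x = log_g(X), leaving h^pi.
    c1, c2 = ciphertext
    return (c2 * pow(c1, Q - (x % Q), P)) % P
```

Because only the random exponent and the password change, data protected under K does not need to be touched when the ciphertext is replaced.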

Security analysis
In the following, we prove the security of our direct PASE scheme using our definitions from Sect. 2.2. In the proofs, we adopt the standard game-hopping technique. Let $\mathsf{succ}_n$ denote the event that the adversary wins in the experiment $n$. [...]$_{\mathsf{PASE},\mathcal{A}}(\kappa)$. The oracles $\mathsf{Ch}_{ind}(b, \cdot, \cdot, \cdot, \cdot)$, Reg(·), Out(·, ·, ·), Ret(·, ·) and RetS(·) are implemented as follows.

IND-CKA-security of our PASE scheme
- $\mathsf{Ch}_{ind}(b, \cdot, \cdot, \cdot, \cdot)$: on input $(i, w_0, w_1, ix^*)$, the oracle aborts if $(i^* \ge 0) \lor (i \ge j) \lor ((i, w_0) \in \mathsf{Set}) \lor ((i, w_1) \in \mathsf{Set})$; otherwise, it sets $i^* \leftarrow i$ and invokes oracle $\mathsf{Out}(i^*, w_b, ix^*)$.
- Reg(·): on input $d \in \{0, 1\}$, the simulator randomly selects fresh $\pi \xleftarrow{\$} \mathcal{D}$ and $K \xleftarrow{\$} G$ and initializes an empty database $\mathbf{C}_{d,j}$. The simulator and $\mathcal{A}$ complete the Register protocol, where the simulator plays the roles of $U$ and $S_d$, and $\mathcal{A}$ plays the role of $S_{1-d}$. The oracle sends $j$ to $\mathcal{A}$ as a session identifier. Finally, it [...], increments $j \leftarrow j + 1$, and stores $r_2$ and $x_{1-d}$ for later use in the proof.
- Out(·, ·, ·): on input $(i, w, ix)$, the simulator aborts if $(i \ge j)$; otherwise, it obtains $(d, \pi, \mathsf{info}_d,$ [...] Then, the simulator plays the roles of $U$ and $S_d$ and interacts with $\mathcal{A}$ who plays the role of $S_{1-d}$ in the Outsource protocol.
- Ret(·, ·): on input $(i, w)$, the simulator aborts if $(i \ge j) \lor ((i = i^*) \land (w \in \{w_0, w_1\}))$; otherwise, it obtains [...]
[...] except that the simulator aborts if some value for $y_d$ used on behalf of honest server $S_d$ appears in two different protocol sessions through oracles Out(·, ·, ·), Ret(·, ·) and RetS(·).

Lemma 2 Pr[succ
This experiment is similar to $\mathsf{Exp}^{\mathsf{IND}}_2$ except that the simulator aborts if some value for $Y$ appears in two different protocol sessions executed through oracles Out(·, ·, ·), Ret(·, ·) and RetS(·).

By the perfect hiding property of Pedersen commitments, the value $Y_{1-d}$ is guaranteed to be independent of $Y_d$ because the adversary learns nothing from $c_d$.
Due to the binding property of Pedersen commitments, which is based on the hardness of the DL problem, it is hard to open $c_{1-d}$ to a different $Y'_{1-d} \neq Y_{1-d}$.
Since $Y_{1-d}$ is guaranteed to be independent of $Y_d$ and $Y_d$ is fresh, it follows that $Y$ is fresh based on the hardness of the DL problem.
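The two commitment properties used in this argument can be made concrete with a textbook Pedersen commitment $c = g^m h^r$ in a toy group. This is a generic sketch rather than the exact way commitments are formed in the protocol: messages are exponents modulo q, the parameters are tiny illustrative values, and the trapdoor S = log_g(h) is exposed on purpose. The function equivocate shows that every commitment can be opened to any message by someone who knows S, which is why the commitment reveals nothing about m (hiding) and why binding holds only as long as computing S, a discrete logarithm, is hard.

```python
P, Q, G = 2039, 1019, 4    # toy group: order-Q subgroup of squares modulo P
S = 77                     # trapdoor log_g(h); unknown to the committer in practice
H = pow(G, S, P)


def commit(m: int, r: int) -> int:
    # Pedersen commitment c = g^m * h^r.
    return (pow(G, m % Q, P) * pow(H, r % Q, P)) % P


def opens(c: int, m: int, r: int) -> bool:
    # An opening (m, r) is accepted if it recomputes the commitment.
    return commit(m, r) == c


def equivocate(m: int, r: int, m_new: int) -> int:
    # With the trapdoor S: r_new = r + (m - m_new) / S mod Q opens to m_new.
    s_inv = pow(S, Q - 2, Q)   # inverse of S modulo the prime Q
    return (r + (m - m_new) * s_inv) % Q
```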
This experiment is similar to $\mathsf{Exp}^{\mathsf{IND}}_3$ except that in oracles Out(·, ·, ·), Ret(·, ·) and RetS(·), the message $(Z_d, \mu_d)$ from the honest server $S_d$ to the user is replaced with $(E, \mu_d)$ where $E \xleftarrow{\$} G$ and $\mu_d \leftarrow \mathsf{Tag}(mk_d, A, Y, E)$. We discuss the following two cases:
1. For the oracles Out(·, ·, ·) and Ret(·, ·), let $(g, g^{\alpha}, g^{\beta}, Q)$ be an instance of the DDH problem; the simulator aims to output 1 if $Q = g^{\alpha\beta}$, or 0 otherwise. The simulator sets [...] The hardness of the DDH problem implies the indistinguishability of $\mathsf{Exp}^{\mathsf{IND}}$ [...].
2. For oracle RetS(·), assume $\pi'$ is the password tried by $\mathcal{A}$; the key $K$ (in $\mathsf{Exp}^{\mathsf{IND}}_3$) is equal to $Z_0 Z_1 Y^{a} h^{(\pi - \pi')(y_0 + y_1)}$; under the DDH assumption, the adversary cannot distinguish $h^{(\pi - \pi')(y_0 + y_1)}$ (in $\mathsf{Exp}^{\mathsf{IND}}$ [...]) from a random element of $G$ (in $\mathsf{Exp}^{\mathsf{IND}}_3$) unless $\pi' = \pi$, which denotes a successful on-line dictionary attack. By the uniform distribution of passwords, its probability is estimated as [...]
[...] otherwise, the simulator randomly picks a fresh $mk \xleftarrow{\$} \mathcal{K}_{\mathsf{MAC}}$, stores $(i, id, k, mk)$ in $T_1$ and returns $mk$, where fresh means that no record of the form $(\cdot, \cdot, \cdot, mk) \in T_1$ exists so far. Since $\mathcal{A}$ only acquires $mk_{1-d}$, by the uniform distribution of $K$ and the security of $\mathsf{KDF}_1$, we obtain [...]
Experiment $\mathsf{Exp}^{\mathsf{IND}}_5$. This experiment is similar to $\mathsf{Exp}^{\mathsf{IND}}_5$ except that in each session $i$ of oracles Out(·, ·, ·) and Ret(·, ·), the value $t \leftarrow \mathsf{KDF}_2(K, w)$ is replaced with $t \leftarrow F_2(i, w)$. $T_2$ is initialized as an empty table in the beginning of $\mathsf{Exp}^{\mathsf{IND}}_5$. $F_2$ returns $t$ if $\exists (i, w, t) \in T_2$; otherwise, $F_2$ picks a fresh $t \xleftarrow{\$} \mathcal{K}_{\mathsf{PRF}}$, stores $(i, w, t)$ in $T_2$ and returns $t$, where fresh means that no record of the form $(\cdot, \cdot, t)$ exists in $T_2$. By the uniform distribution of $K$ and the security of $\mathsf{KDF}_2$, we have [...]
Experiment $\mathsf{Exp}^{\mathsf{IND}}_6$. This experiment is similar to $\mathsf{Exp}^{\mathsf{IND}}_6$ except for one of the following cases:
1. For the oracle Out(·, ·, ·), the adversary successfully forges $((C, ix), \mu_{sk_d})$ which satisfies $\mathsf{Vrfy}(sk_d, (C, ix), \mu_{sk_d}) = 1$.
2. For the oracles Ret(·, ·) or RetS(·), the adversary successfully forges $(t, \mu_{sk_d})$ which satisfies $\mathsf{Vrfy}(sk_d, t, \mu_{sk_d}) = 1$.
By the unforgeability of MAC, we have [...]
This experiment is similar to $\mathsf{Exp}^{\mathsf{IND}}_7$ except that in oracles Out(·, ·, ·) and Ret(·, ·), the value $v$ is set in a different way. Let $\mathcal{O}_{\mathsf{PRF}}(\cdot)$ be the oracle from the security experiment of the pseudorandom function PRF, and let $T_v$ be initialized as an empty table in the beginning of $\mathsf{Exp}^{\mathsf{IND}}_7$. When the simulator needs to compute $v \leftarrow \mathsf{PRF}(t, e)$ in session $i$, it obtains $v$ using table [...] Assuming the pseudorandomness of PRF, we have [...]
As a consequence, based on Lemmas 1 to 8 we can conclude that our proposed PASE construction is IND-CKA-secure assuming the intractability of the DDH problem and the security of $\mathsf{KDF}_1$, $\mathsf{KDF}_2$, PRF and MAC.

Authentication property of our PASE scheme
Theorem 2 Our proposed PASE construction provides authentication based on the hardness of the DDH problem and security of KDF 1 , KDF 2 and MAC.
- Reg(·): on input $d \in \{0, 1\}$, the simulator randomly selects a fresh $\pi \xleftarrow{\$} \mathcal{D}$ and $K \xleftarrow{\$} G$ and initializes an empty database $\mathbf{C}_{d,j}$. Then, the simulator and $\mathcal{A}$ execute the Register protocol, where the simulator plays the roles of $U$ and $S_d$, and $\mathcal{A}$ plays the role of $S_{1-d}$. The simulator then sends $j$ to $\mathcal{A}$ as a session identifier. Finally, the simulator records $\boldsymbol{\tau}[j] \leftarrow (d, \pi, \mathsf{info}_d,$ [...], increments $j \leftarrow j + 1$, and stores $r_2$ and $x_{1-d}$ for later use in the proof.
- Out(·, ·, ·): on input $(i, w, ix)$, the simulator aborts if [...]
[...] This experiment is similar to $\mathsf{Exp}^{\mathsf{Auth}}_0$ except that the value $y_d$ is ensured to be fresh in every session executed by the simulator through the oracles Out(·, ·, ·), OutS(·) and Ret(·, ·).

Lemma 10 Pr[succ Auth
[...] This experiment is similar to $\mathsf{Exp}^{\mathsf{Auth}}_1$ except that the simulator aborts if a value for $Y$ repeats in two different sessions of the protocol executed by the simulator through oracles Out(·, ·, ·), OutS(·), and Ret(·, ·).
Since $Y_{1-d}$ is guaranteed to be independent of $Y_d$ and $Y_d$ is fresh, the freshness of $Y$ is implied by the hardness of the DL problem.
[...] This experiment is similar to $\mathsf{Exp}^{\mathsf{Auth}}_2$ except that in oracles Out(·, ·, ·), Ret(·, ·) and OutS(·), the message $(Z_d, \mu_d)$ from the honest server $S_d$ to the user is replaced with $(E, \mu_d)$ where $E \xleftarrow{\$} G$ and $\mu_d \leftarrow \mathsf{Tag}(mk_d, A, Y, E)$. We consider the following two cases:
1. For oracles Out(·, ·, ·) and Ret(·, ·), let $(g, g^{\alpha}, g^{\beta}, Q)$ be an instance of the DDH problem; the simulator aims to output 1 if $Q = g^{\alpha\beta}$, or 0 otherwise. The simulator sets [...]
2. For the oracle OutS(·), assume $\pi'$ is a password used by the adversary; the key $K$ (in $\mathsf{Exp}^{\mathsf{Auth}}_2$) is equal to $Z_0 Z_1 Y^{a} h^{(\pi - \pi')(y_0 + y_1)}$; under the DDH assumption, the adversary cannot distinguish $h^{(\pi - \pi')(y_0 + y_1)}$ (in $\mathsf{Exp}^{\mathsf{Auth}}_2$) from a random element of $G$ (in $\mathsf{Exp}^{\mathsf{Auth}}_3$) unless $\pi' = \pi$, which denotes a successful on-line dictionary attack. By the uniform distribution of passwords, its probability is estimated as $q_s \cdot \mathsf{Adv}^{\mathsf{DDH}}_{\mathcal{A}}(\kappa) + \frac{q_s}{|\mathcal{D}|}$.
[...] $k)$ then return $mk$; otherwise, the simulator randomly picks a fresh $mk \xleftarrow{\$} \mathcal{K}_{\mathsf{MAC}}$, stores $(i, id, k, mk)$ in $T_1$ and returns $mk \leftarrow F_1(i, id, k)$, where fresh means that no record of the form $(\cdot, \cdot, \cdot, mk) \in T_1$ exists so far. Since the adversary only acquires $mk_{1-d}$, by the uniform distribution of $K$ as well as the security of $\mathsf{KDF}_1$, we obtain [...]
[...] This experiment is similar to $\mathsf{Exp}^{\mathsf{Auth}}_4$ except that in each session $i$ for the oracles Out(·, ·, ·) and Ret(·, ·), the value $t \leftarrow \mathsf{KDF}_2(K, w)$ is replaced with $t \leftarrow F_2(i, w)$. $T_2$ is initialized as an empty table in the beginning of $\mathsf{Exp}^{\mathsf{Auth}}_5$. Function $F_2$ returns $t$ if $\exists (i, w, t) \in T_2$; otherwise, the simulator randomly picks a fresh $t \xleftarrow{\$} \mathcal{K}_{\mathsf{PRF}}$, stores $(i, w, t)$ in table $T_2$ and returns $t$, where fresh means that no record of the form $(\cdot, \cdot, t)$ exists so far in $T_2$. By the uniform distribution of $K$ and the security of $\mathsf{KDF}_2$, we obtain [...] We observe that $\mathsf{Exp}^{\mathsf{Auth}}_5$ is simulated independently of the key $K$.
The only probability of winning $\mathsf{Exp}^{\mathsf{Auth}}_5$ comes from the adversary successfully forging $\mu_c$ for $(e, v, ix)$ such that $\mathsf{Vrfy}(mk_u, (e, v, ix), \mu_c) = 1$. Assuming that MAC is unforgeable, we obtain [...]
To sum up, by Lemmas 9 to 15, we can conclude that our direct PASE scheme provides authentication based on the hardness of the DDH problem and the security of $\mathsf{KDF}_1$, $\mathsf{KDF}_2$ and MAC.

PASE in practice: browser-based implementation and performance evaluation
In order to demonstrate the functionality of our PASE scheme, we implemented a stateful web application that can be accessed from any web or mobile browser. Our PASE demonstrator implements the client and server sides of the protocol and comes with a single portal (cf. Fig. 4) through which users can register, outsource/retrieve files based on multiple keywords and change their passwords. The source code is available from https://github.com/Spockuto/surreypaks.
The entire PASE implementation is written in JavaScript, with the client side backed by the browser's V8 engine and the server side backed by a NodeJS server. By choosing JavaScript, we could use the Stanford JavaScript Crypto Library in the implementation of both sides (client and server), thereby reusing some parts of the code. An alternative would be to use libsodium or OpenSSL with a wrapper based on PHP. Since modern applications heavily adopt JavaScript, our implementation can in turn be used as a library to provide support for other applications that wish to use the functionality of PASE.
In the following, we provide a more detailed description of our PASE demonstrator and evaluate performance of its functionality.

Cryptographic implementation choices
The following choices of cryptographic parameters and algorithms have been made for our implementation. For the cyclic group G of prime order q and its generator g, we use the parameters of the NIST P-384 elliptic curve group. The additional generator h is chosen at random. For the hash function H, we adopt SHA256 (256 bits). Both key derivation functions KDF 1 and KDF 2 are implemented as PBKDF2 (256 bits). Although PBKDF2 might not be the most efficient on mobile devices, better alternatives such as Argon2 have not been adopted yet in major cryptographic libraries. Our pseudorandom function PRF uses AES256 in GCM mode with the output truncated to 256 bits. For the message authentication code MAC, we adopt the standard HMAC construction.
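For readers who want to experiment with comparable primitives, the sketch below derives 256-bit keys with PBKDF2 and computes HMAC tags using only the Python standard library. It is not the demonstrator's code, which is written in JavaScript; the iteration count, the use of the second argument as the PBKDF2 salt, and the function names are assumptions chosen for illustration.

```python
import hashlib
import hmac


def kdf(secret: bytes, context: bytes, iterations: int = 10_000) -> bytes:
    # PBKDF2-HMAC-SHA256 with a 256-bit output; the iteration count and the
    # use of `context` as the salt are illustrative assumptions.
    return hashlib.pbkdf2_hmac("sha256", secret, context, iterations, dklen=32)


def mac_tag(key: bytes, message: bytes) -> bytes:
    # HMAC-SHA256 tag over the message.
    return hmac.new(key, message, hashlib.sha256).digest()


def mac_verify(key: bytes, message: bytes, tag: bytes) -> bool:
    # Constant-time comparison avoids leaking how many tag bytes matched.
    return hmac.compare_digest(mac_tag(key, message), tag)
```

A trapdoor for a keyword could then be derived as kdf(K, keyword) and authenticated with mac_tag, with the caveat that the iteration count directly trades client-side latency against the cost of guessing.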

PASE client and servers
The JavaScript code running on the client includes the main.js file (about 800 LoC) to manage the requests and formatting and the crypto.js file (300 LoC) to execute the protocols on the client side. The code running on each server is split into multiple files, with protocol.js (400 LoC) occupying the major part of the implementation. The demonstrator requires a NodeJS environment to run and can be deployed instantly.
In our implementation, one server acts as a primary server in that it serves the PASE website and is also used to store all outsourced files. It is helped by the secondary server during the registration, outsourcing and retrieval protocols. The database adopted in our PASE implementation is MongoDB, which is particularly suited for storing and retrieving files. MongoDB is also used to store all user-related information from the registration process. Each server runs its own instance of the database.
Our browser-based demonstrator offers a registration interface where a user can provide its username (e.g., email address) and a chosen password to execute the registration protocol with both servers. Once registered, the user can outsource files and associate them with multiple keywords. Similarly, the user can retrieve outsourced files based on the keywords entered into the corresponding form. Note that for outsourcing and retrieval, the login form must contain the registered username and password. Our demonstrator supports outsourcing and retrieval using multiple keywords, which can be entered into the corresponding box separated by commas. If multiple keywords are used in the retrieval protocol, then the output produced currently will be based on the logic used to define subset queries (cf. Sect. 3.5).

Communication
In our PASE protocols, there are two types of communication, which are implemented using different techniques as discussed in the following:
- The client-server communication, which is present in the registration, outsourcing and retrieval protocols, is realized in our implementation using AJAX queries which are executed asynchronously to provide better functionality. This is only possible if the server accepts Cross-Origin Resource Sharing (CORS), which can be easily set up through the NodeJS core library.
- The server-server communication in the outsourcing and retrieval protocols requires both servers to maintain short-lived state information for the two communication rounds of the protocol session (cf. Fig. 2). This is realized in our implementation using the NodeCache functionality, which provides simple and fast internal caching for NodeJS servers.

Encryption of outsourced files
In our demonstrator, we expand the implemented PASE functionality with the encryption of outsourced files, in addition to encrypted keywords. For this purpose, the client could use the secret-shared key K which it reconstructs on the client side during the execution of the outsourcing and retrieval protocols. More precisely, the client could use K to derive another key and use it with some standard symmetric encryption scheme, e.g., AES, to encrypt outsourced files and decrypt them upon retrieval. In addition, to minimize information leakage and distribute the encrypted data among the two servers, we XOR the encrypted data ENC with a random stream of data RND generated to the length of the encrypted data using the Fortuna PRNG. The resulting files $F_0 \leftarrow \mathrm{ENC} \oplus \mathrm{RND}$ and $F_1 \leftarrow \mathrm{RND}$ are sent to the respective servers $S_0$ and $S_1$. The data ix (cf. Fig. 2), in this case, would be a concatenation of the encrypted file name with the random IV generated for symmetric encryption ($ix \leftarrow \mathsf{Enc}(\mathrm{Name}) \,\|\, \mathrm{IV}$). The encrypted file name acts as the encrypted file identifier on the server for querying. During Retrieve, the file name is decrypted from ix using the reconstructed key K and the IV. The encrypted file name is used to retrieve files $F_0$ and $F_1$ from servers $S_0$ and $S_1$, respectively. The file is recovered as $\mathsf{Dec}(F_0 \oplus F_1)$ and made available to the user.
[...] RAM (server and client instances) and OnePlus 5 smartphone with Qualcomm Snapdragon 835 octa-core 2.45GHz and 8GB RAM (client instance).
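The XOR-based distribution of an encrypted file across the two servers can be sketched as follows. The sketch treats ENC as an opaque byte string that has already been encrypted, and it draws RND from the operating system's random source via os.urandom rather than from Fortuna, which is a substitution made here for brevity; the function names are hypothetical.

```python
import os


def split(enc: bytes) -> tuple:
    # F_0 = ENC xor RND goes to S_0 and F_1 = RND goes to S_1.
    rnd = os.urandom(len(enc))   # stand-in for the Fortuna PRNG stream
    f0 = bytes(a ^ b for a, b in zip(enc, rnd))
    return f0, rnd


def combine(f0: bytes, f1: bytes) -> bytes:
    # ENC = F_0 xor F_1; decryption with the key derived from K follows.
    if len(f0) != len(f1):
        raise ValueError("shares must have equal length")
    return bytes(a ^ b for a, b in zip(f0, f1))
```

Each share on its own is uniformly random, so a single server learns nothing about ENC beyond its length, at the price of doubling the stored data.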

Performance of PASE
The results of our measurements are summarized in Table 2 with separate timings provided for the client and server side computations. For the client side, the table contains measurements performed on both the laptop and smartphone. The registration procedure includes all steps of the Register protocol. In the table, the time needed to reconstruct the symmetric key K, which is accomplished in step 3a of our PASE Outsource and Retrieve protocols (cf. Sect. 3), is measured separately from the time needed to outsource keywords (steps 3b and 3c of Outsource) and retrieve files (steps 3b and 5 of Retrieve). We observe that key reconstruction on the client side is more than twice as fast as on the server side (when measured on the same device). Note that the key reconstruction procedure is identical in both protocols and its time is independent of the used keywords. In contrast, the measurements provided for outsourcing and retrieval procedures in Table 2 cover only keyword-dependent steps. Table 2 provides average timings based on one keyword, which are computed from multiple executions of the protocol involving a set of 100 randomly generated keywords, with each keyword being between 5 and 10 characters long. For each execution, a random keyword was chosen from the set and the resulting average was computed over 1000 executions.
Based on the measurements, we can highlight that our PASE registration procedure remains well under 1s on both the laptop and the smartphone. The time for outsourcing and retrieval is clearly dominated by the time needed to reconstruct K , which also remains well under 1s. The keyword-dependent computations in both protocols are very efficient, taking less than 100ms per keyword. On the client side, the outsourcing procedure is slightly more efficient than the retrieval procedure due to the additional integrity checks performed in step 5 of the Retrieve protocol.

Performance of file encryption
We evaluated the performance of our file encryption scheme by using test files of size [...] Based on our measurements, we can highlight that an encryption scheme (e.g., AES) can be practically adapted into our protocol with little computation overhead. The total time taken for encryption and decryption, at large file sizes, is clearly dominated by random stream generation. Hence, we propose a configuration setting where the random stream generation is available as an optional security enhancement for the user to choose. With this configuration, users can leverage their computational flexibility to securely encrypt and distribute files of their choice.

Scalability of PASE
In addition to the measurements involving one keyword per execution, we are interested in the scalability of our PASE implementation. For this purpose, we have extended our measurements to calculate an average time for keyword-dependent outsourcing and retrieval computations with up to 30 keywords (which is, for example, the maximum number of hashtags allowed per image on Instagram). In our experiments, for each execution multiple keywords were randomly chosen from the same set of 100 keywords that were used in the experiments behind Table 2. A linear regression model was then applied to the average discrete timings to derive a linear approximation.
Fig. 5 Scalability of keyword-dependent outsourcing and retrieval operations on the client side using MacBook Pro laptop and OnePlus 5 smartphone
Our experimental results for client-side keyword-dependent computations are plotted in Fig. 5. These timings suggest that our implementation remains scalable on commodity user devices such as laptops and smartphones. For example, client-side processing of 10 keywords in the outsourcing phase requires about 256 ms (laptop) and 455 ms (smartphone), whereas computations associated with a subset query of 10 keywords during the retrieval phase require about 289 ms (laptop) and 523 ms (smartphone). If we add constant key reconstruction costs from Table 2, then the overall time for client-side processing of 10 keywords would be about 356 ms (laptop) and 879 ms (smartphone) in outsourcing and about 389 ms (laptop) and 947 ms (smartphone) in retrieval phases.
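The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only the eight timings stated in the text and computes the constant key reconstruction cost that they imply per device; the function and variable names are hypothetical, and the derived constants are inferences from these numbers rather than values read from Table 2.

```python
# (keyword-dependent time, overall time) in ms for 10 keywords, as quoted above.
timings_ms = {
    "laptop": {"outsource": (256, 356), "retrieve": (289, 389)},
    "smartphone": {"outsource": (455, 879), "retrieve": (523, 947)},
}


def implied_key_reconstruction(device: str) -> int:
    # The overall time minus the keyword-dependent time should give the same
    # constant for outsourcing and retrieval on a given device.
    diffs = {total - keyword for keyword, total in timings_ms[device].values()}
    if len(diffs) != 1:
        raise ValueError("quoted timings are not consistent for " + device)
    return diffs.pop()


laptop_ms = implied_key_reconstruction("laptop")          # 100
smartphone_ms = implied_key_reconstruction("smartphone")  # 424
```

The quoted totals are thus internally consistent, with an implied constant of about 100 ms on the laptop and about 424 ms on the smartphone.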

Strengthening password-based authentication
The PASE protocol is proven secure by rigorous mathematical analysis, but the use of a password for authentication indirectly inherits several issues associated with passwords, and the password acts as a single point of failure for the entire architecture. Moving beyond brute force and online attacks, passwords are vulnerable to reuse, leakage and social engineering attacks. A study [39] on password usage states that 38% of users reused the same password for two different online services, and 21% of them slightly modified an old one to sign up for a new service. Have I Been Pwned (HIBP) 14, a popular website which reports data breaches, provides records of over 500 million unique passwords leaked from various data breaches through a variety of attacks including credential stuffing and phishing. The study also shows that users with more passwords are more likely to reuse them or to use variations. The 2020 Verizon Data Breach Investigations Report (DBIR) [1] reports that over 80% of breaches within hacking involve brute force or the use of lost or stolen credentials. To protect against such password weaknesses, the PASE protocol can be extended modularly with 2FA, a secondary authentication mechanism which provides a one-time password (OTP) or code generated or received by an authenticator (e.g., a security token or smartphone) that only the user possesses, to complement the primary password used for authentication. The PASE scheme allows the inclusion of an additional complementary authentication scheme without compromising the integrity of the internal PASE protocol, which relies on high-entropy keys generated from the primary password.
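As one possible shape for such a second factor, the sketch below implements the HMAC-based one-time password algorithm standardized in RFC 4226. It is offered as an illustration of what an OTP authenticator computes and is not part of the PASE construction or its demonstrator; how the shared secret would be provisioned and how the code would be checked alongside the PASE protocols is left open.

```python
import hashlib
import hmac
import struct


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    # RFC 4226: HMAC-SHA1 over the 8-byte big-endian counter.
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select an offset.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def hotp_verify(secret: bytes, counter: int, candidate: str) -> bool:
    # Constant-time comparison of the expected and the submitted code.
    return hmac.compare_digest(hotp(secret, counter), candidate)
```

A time-based variant would use the number of elapsed 30-second intervals as the counter, so that the authenticator and the verifier do not need to keep a counter in sync.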

Conclusion
Password-Authenticated Searchable Encryption (PASE) introduced in this paper is a new concept for searchable encryption where the search over encrypted keywords can be performed solely with the help of a human-memorizable password. The main advantage over previous concepts is a simplified key management which removes the need for storing and managing high-entropy keys on the user side and makes the whole process device-agnostic. Basing searchable encryption on passwords introduces major design challenges; in particular, creating the need for a distributed server architecture to achieve security against offline dictionary attacks.
We modeled the functionality and security properties of PASE, including IND-CKA-security for keyword privacy and authentication for the outsourcing and search procedures, and proposed a direct PASE construction whose security and privacy have been proven under standard assumptions. Our direct PASE construction is an optimized version of a more general concept for building PASE protocols based on techniques underlying password-authenticated secret sharing and symmetric searchable encryption.
14 https://haveibeenpwned.com/Passwords.
We evaluated the practicality of our PASE scheme through implementation of a JavaScript-based web application that can readily be executed on any (mobile) browser. The conducted performance and scalability evaluation of our implementation shows that the proposed PASE approach remains practical on commodity user devices such as laptops and smartphones.

Compliance with ethical standards
Conflict of interest All authors declare that they have no conflict of interest.
Ethical approval This article does not contain any studies with human participants performed by any of the authors.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.