Private Function Evaluation with Cards

Card-based protocols allow one to evaluate an arbitrary fixed Boolean function $f$ on a hidden input to obtain a hidden output, without the executer learning anything about either of the two (e.g., [12]). We explore the case where $f$ implements a universal function, i.e., $f$ is given the encoding $\langle P \rangle$ of a program $P$ and an input $x$ and computes $f(\langle P \rangle, x) = P(x)$. More concretely, we consider universal circuits, Turing machines, RAM machines, and branching programs, giving secure and conceptually simple card-based protocols in each case. We argue that card-based cryptography can be performed in a setting that is only very weakly interactive, which we call the “surveillance” model. Here, when Alice executes a protocol on the cards, the only task of Bob is to watch that Alice does not illegitimately turn over cards and that she shuffles in a way that nobody knows anything about the total permutation applied to the cards. We believe that because of this very limited interaction, our results can be called program obfuscation. As a tool, we develop a useful sub-protocol $\mathsf{sort}_{\varPi}\,X{\uparrow}Y$ that couples the two equal-length sequences $X, Y$ and jointly and obliviously permutes them with the permutation $\pi \in \varPi$ that lexicographically minimizes $\pi(X)$. We argue that this generalizes ideas present in many existing card-based protocols. In fact, AND, XOR, bit copy [37], coupled rotation shuffles [30] and the “permutation division” protocol of [22] can all be expressed as “coupled sort protocols”.


Introduction
Secure multiparty computation (MPC) allows multiple players to jointly compute a function, without giving away anything about their inputs, except what can be deduced from the output. An important special case is when the function to be evaluated constitutes an input itself and should remain hidden, called Private Function Evaluation (PFE). This has been considered in the standard cryptographic setting, e.g., using universal circuits [45] in [5,19,32,38].
Secure multiparty computation, and hence also PFE (by choosing a universal function to be executed), can also be done with a deck of physical cards, as first shown in [8,12,41]. In this area of card-based cryptography, one designs tangible protocols using a deck of cards with information-theoretic privacy features. There is already a wealth of literature on how to jointly and securely compute an arbitrary (fixed) circuit on the players' inputs, see, e.g., [12,37,41]. Moreover, similar but different physical assumptions have been exploited in other settings, in particular in the cryptographic voting community, cf. Scantegrity, PunchScan, and Oblivious voting [2,10,11,43] (see [28] for a survey on physical assumptions in cryptography).
Motivation. Card-based protocols are often used in educational and recreational settings. For an illustration of PFE, we stretch the usual motivation for card-based AND protocols a bit, namely the dating problem where players want to find out whether there is mutual love.
We assume a predefined set of binary attributes $A$ such as $A = \{\text{LikesCats}, \text{HasPhD}, \text{IsGeeky}, \ldots\}$. Alice implicitly specifies (by providing a circuit or program) which combinations $P \subseteq 2^A$ of attributes she likes and Bob specifies which attributes $B \subseteq A$ he has. The task is to determine whether Bob's secret attributes satisfy Alice's secret preferences, i.e., whether $B \in P$. Here, we want to ensure that both Alice's and Bob's inputs remain hidden, i.e., nothing about the inputs is revealed, except what can be deduced from the output of the protocol.
In the same vein, PFE is useful for the game Skipjack [16], where a game master invents a rule and the other players take turns querying whether a chosen code word satisfies the rule or not, in order to deduce or guess the rule in the process. Applying our PFE protocol would prevent the game master from cheating by changing the rule mid-game, and would even make it possible to play the game in the absence of a game master, assuming an encoding of a rule is available or can be obtained at random. (Moreover, as PFE even hides the code words that a player is testing, we can derive a competitive multi-player mode where the questions of one player do not help the others.)
Look and Feel of Our Protocols. Imagine a room with a table, where Alice puts an encoding of a function $f$ in a sequence on the table, each bit of the description as two face-down cards encoding 0 via ♣♡ and 1 via ♡♣. Next to Alice's cards, Bob will put his input $x$ as a bit string using the same encoding. The game then proceeds according to a protocol (described in more detail later) that may prescribe to (i) shuffle the cards in certain controlled ways and (ii) turn over cards (the observed symbols may affect the future course of the protocol). The protocol terminates with output $f(x)$ encoded as face-down cards. The output can then be revealed to both players or used obliviously in further computations.
The Sort Sub-protocol. The protocols proposed in this paper (and actually a large subset of the protocols from the literature) can be regarded as a sequence of sub-protocols with basically the same functionality, which we capture under the name "sort protocol". We believe this observation is of independent interest. We also show that, under weak assumptions, protocols obtained as compositions of sort protocols are secure. This elegantly re-proves the security of existing protocols and greatly simplifies the security proofs of our own protocols. (As we are in a simpler and fully information-theoretic setting, this is much easier than in the common universal composability framework [9].)
On Interaction in Card-Based Protocols. We point out that card-based cryptography can be assumed secure in a rather non-interactive physical model: it suffices to have one protocol executer, who is under surveillance by the other players. For example, when the protocol description specifies that a certain shuffle is to be performed, this step can be implemented by this one player, the executer, who uses envelopes (or helping cards) and completely random shuffles or uniform random cuts in a manner that ensures that not even he himself can keep track of the concrete permutation applied to the cards. (We could also use shuffling machines, such as the wheel-of-fortune-esque device in [46].) Note that in this surveillance model, where players watch that the protocol is done correctly, many protocols can be argued secure with almost no interaction. For example, ([21], Protocol 3) is a nice physical zero-knowledge proof system for proving that there is a solution to a Sudoku puzzle, where the verifier chooses one of three cards in each cell of the Sudoku to be assigned to piles for rows, columns and subgrids, in order to later verify that all numbers are present. In our model, we can plausibly argue that the randomness chosen by the verifier can also be directly generated by the prover himself on an additional deck of helping cards. If he is watched to perform the shuffle in a way that generates high entropy not under his control, he can use this generated randomness to assign the cards to the piles. This is actually a general observation regarding protocols using public coins, where this shuffling produces an output that can be interpreted like the Random Oracle output in the Fiat-Shamir heuristic. The possibility of secure shuffling in this way is a common assumption that people make when playing card games with others.
Using the PFE protocols introduced in this paper, this immediately leads to a direct way to obtain cryptographic obfuscation in this card-based surveillance model: assuming that the encoded protocol is lying on the table using cards, the executer can add cards encoding the inputs and then execute a universal protocol, such as the ones proposed in this paper, with the only interaction being guards that watch out for publicly observable deviations from the protocol.
However, note that because of the very different setting, there are no implications for the usual non-physical (strictly non-interactive) cryptographic world, where general (virtual black-box) obfuscation is impossible, cf. [7].
Universal Protocols and Their Qualities. We implement four different universal card-based protocols with varying degrees of abstraction, based on branching programs, circuits, Turing machines and RAM machines. Our primary focus is on simplicity and elegance of the protocols, but we also consider efficiency in terms of runtime and required cards.
The benefit of providing several solutions is that depending on the nature of the task, a certain computational model may be particularly suitable. For example, in the generalized dating game described above, using universal circuits is a natural option, while a rule in Skipjack might most naturally be described as a program using loops and thus benefit from the possibilities available in Turing machines and RAM machines. For didactic settings, all options are interesting in themselves, as they demonstrate the computational models and the implemented privacy properties in a palpable way.
Contribution.
- We show how to encode and execute circuits, Turing machines, RAM machines and branching programs with cards and specify protocols for executing these on hidden inputs so that nothing about the machine description (except the length, etc.) or the inputs is leaked. We achieve this using envelopes and only very natural shuffle operations, namely random cuts and $S_n$-shuffles (i.e., ordinary shuffling, where all card reorderings are equally likely).
- Given the weakly interactive nature of card-based cryptography in the "surveillance model" (see above), we thereby obtain what may be called cryptographic obfuscation in a card-based setting.
- We identify and generalize a primitive that is the basis for many protocols and operations in card-based cryptography, namely coupled sorting, cf. Sect. 3.
Related Work. Regarding our branching program construction, let us mention that there are several card-based protocols to randomly generate a permutation with specific, prescribed properties. For example, the secret santa game asks for random permutations on the player indices (encoding who gives a present to whom) that are fixed-point free to ensure that nobody receives their own present, and has been implemented with cards in [12,23]. These works also give protocols for generating permutations with cycles of a certain minimal length. Moreover, Hashimoto et al. [22] give a protocol for generating permutations with a prespecified cycle structure, and show how to obliviously execute the inverse of a permutation encoded with cards on another card sequence, which is a special case of our sorting operations. In general, we make use of card decks that do not only feature hearts and clubs cards, a line of research that was pursued in, e.g., [29,35,42]. Note that cryptographic obfuscation has been performed in other models. For example, Goyal et al. [18] make use of tamper-proof hardware tokens (such as smart cards) introduced by Katz [24]. Moreover, [36] shows how to execute many cryptographic primitives (albeit not obfuscation) using scratch-off cards. They have a slightly weaker setting, as they do not gather players around a table, but use sealed (tamper-evident) envelopes that are sent between the players via mail, getting out of sight of the other players.
Physical computation is also described in [13] (as "Physical GMW protocol") to achieve security in the framework of Universal Composability with Local Adversaries (LUC). However, they make very strong assumptions on available "machines", which we do not need.
Crépeau and Kilian [12] also discuss playing games against a card-encoded (probabilistic) circuit opponent. However, they do not aim to hide this circuit from the player, as it is given by the player himself.
Recently, Dvořák and Koucký [14] formulated a similar mechanism to execute Turing machines and branching programs using cards, in order to classify a certain class of card-based protocols that compute functions specified by their complexity. This constitutes independent and concurrent work.
Outline. Section 2 gives the necessary preliminaries, including the computational model used in card-based cryptography. Section 3 introduces sorting protocols as a main and versatile building block in card-based cryptography and interprets many results in the field as a single application of such a protocol. We describe concrete protocols for executing universal circuits (Sect. 4), Turing machines (Sect. 5), (word-)RAM machines (Sect. 6) and branching programs (Sect. 7).
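Notation (Permutations). For distinct elements $x_1, \ldots, x_k \in X$, the cycle $(x_1\; x_2\; \ldots\; x_k)$ denotes the permutation $\pi$ on $X$ with $\pi(x_i) = x_{i+1}$ for $1 \le i < k$, $\pi(x_k) = x_1$, and $\pi(x) = x$ for all $x \in X$ not occurring in the cycle. For multiple cycles on pairwise disjoint sets, we write them next to one another to denote their composition, e.g., $(1\;3)(2\;4) = (1\;3) \circ (2\;4)$.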

Computational Model of Card-Based Cryptography
Card-based protocols operate on a deck of cards, which is specified by a multiset D of symbols, e.g., from { ♡, ♣ } or from numbered cards {1, … , n} . It uses four operations, namely i) turning over cards to reveal their hidden symbols, ii) deterministically permuting the cards, iii) shuffling the cards in some controlled way to introduce randomness, and iv) terminating and outputting a list of card positions encoding the protocol output. The formal model is given in [39].
While many protocols in the literature only use { ♡, ♣ } as a deck alphabet, Niemi and Renvall [42] and Mizuki [35] introduce card-based protocols using the (multi-)set {1, … , n} and an encoding rule where a bit given by two face-down cards is 0 if the former card has the smaller value, and 1 otherwise.
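More formally, a protocol $P$ is a quadruple $(D, U, Q, A)$, where $D$ is a deck, $U$ is a set of input sequences over $D$, and $Q$ is a set of states with $q_0 \in Q$ and $q_{\mathrm{fin}} \in Q$ being the initial and the final state. Moreover, we have an action function $A\colon (Q \setminus \{q_{\mathrm{fin}}\}) \times \mathsf{Vis}^{D} \to Q \times \mathsf{Action}$, depending on the current state and visible sequence (i.e., the sequence of the card symbols, with face-down cards shown as a special back symbol '?' and face-up cards showing their symbol; the set of visible sequences on deck $D$ is denoted by $\mathsf{Vis}^{D}$), which specifies the next state and an operation on the sequence. These actions, constituting the set $\mathsf{Action}$, are as follows, performed on a sequence $\Gamma = (\Gamma[1], \ldots, \Gamma[n])$:
i) $(\mathsf{turn}, T)$, for a set $T \subseteq \{1, \ldots, n\}$, flips the cards at positions specified by the turn set $T$. Formally, for a card $c = \frac{a}{b}$ we define $\mathsf{swap}(c) := \frac{b}{a}$ and transform $\Gamma$ into the sequence that has $\mathsf{swap}(\Gamma[i])$ at each position $i \in T$ and $\Gamma[i]$ at each position $i \notin T$.
ii) $(\mathsf{perm}, \pi)$, for a permutation $\pi \in S_n$, permutes $\Gamma$ according to $\pi$, i.e., it yields the sequence $\pi(\Gamma) = (\Gamma[\pi^{-1}(1)], \ldots, \Gamma[\pi^{-1}(n)])$.
iii) $(\mathsf{shuffle}, \varPi)$, for a permutation set $\varPi \subseteq S_n$, draws a permutation $\pi \in \varPi$ uniformly at random and obliviously applies it to $\Gamma$.
iv) $(\mathsf{result}, p_1, \ldots, p_r)$ terminates the protocol and outputs the list $p_1, \ldots, p_r$ of card positions encoding the protocol output.
See [26,39] for more details. A sequence trace of a finite protocol run is a list $(\Gamma_0, \Gamma_1, \ldots, \Gamma_t)$ of sequences such that $\Gamma_0 \in U$ and $\Gamma_{i+1}$ arises from $\Gamma_i$ by the specified action. The trace that records not the cards themselves but only what is visible of them is called the corresponding visible sequence trace.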
Card-based protocols are secure if input and output are perfectly hidden, i.e., from the outside the execution of a protocol has the same distribution, regardless of what input and output are.
Boolean Circuits. A Boolean circuit with $l$ input variables $v_1, \ldots, v_l$ is a directed acyclic graph $C = (V, E)$. The nodes are called gates and are labeled with ∨, ∧, ¬, an input variable, or one of the constants 1 or 0. In the cases of ∨, ∧, ¬, the in-degree must be 2, 2 or 1, respectively, otherwise it is 0. The output node is the unique node with out-degree 0. The depth of $C$ is the maximum number of ∧ and ∨ gates on a path in $C$.
The value $C(I)$ of $C$ on an input $I \in \{0, 1\}^l$ is defined in the natural way. For this paper, it is convenient to transform all ∨-gates into ∧-gates using de Morgan's rule $(x \vee y) = \neg(\neg x \wedge \neg y)$. Note that this transformation does not affect the depth of the circuit.
Group Actions. In Sect. 3, we make use of group actions and their orbits, which can be found, e.g., in ([15], Sect. 1.3). For a definition, let $X$ be a nonempty set, $G$ a group, and $\varphi\colon G \times X \to X$ a function implicit in the notation $g(x) := \varphi(g, x)$ for $g \in G$, $x \in X$. $G$ acts on $X$, or $\varphi$ is a group action on $X$, if
- $e(x) = x$ for all $x \in X$, where $e$ denotes the neutral element in $G$, and
- $(g \circ h)(x) = g(h(x))$ for all $x \in X$ and all $g, h \in G$.
Let $G$ be a group acting on a set $X$. Then, the orbit of an $x \in X$ is $G(x) := \{g(x) : g \in G\}$, i.e., all elements in $X$ that are reachable from $x$ via some $g \in G$. Note that orbits $G(x), G(y)$ of $x, y \in X$ are either disjoint or equal. Hence the orbits form a partition of $X$, called the orbit partition of $X$ through $G$. For an application of this to proving lower bounds on the number of cards in card protocols, see [25]. In our setting, $G = \varPi \subseteq S_n$ is a permutation subgroup used in a shuffle and $X$ is the set of sequences over a deck $D$. Then, $\varPi$ acts on $X$ by permuting the cards.

The Coupled Sorting Sub-protocol
In this section, we introduce our main, versatile building block, namely "sorting protocols", and later show how to interpret many protocols from the literature as such a protocol. We use the term "coupled" to indicate that the same permutation is applied to multiple card subsequences by forming piles (e.g., to be placed in envelopes) and then permuting them, cf. Fig. 2.
Notation. Let $\pi \in S_n$, $A = (a_1, \ldots, a_n)$ a sequence of distinct natural numbers and $B$ a sequence of length $n$. We define the lift $\pi{\uparrow}A$ of $\pi$ to $A$ via
$$(\pi{\uparrow}A)(m) := \begin{cases} a_{\pi(i)}, & \text{if } m = a_i \text{ for some } i, \\ m, & \text{otherwise,} \end{cases}$$
for $m$ with $1 \le m \le \max\{a_1, \ldots, a_n\}$. For instance, the permutation $\pi = (1\;3)(2\;4) \in S_4$ lifted to the sequence $A = (5, 2, 7, 8)$ yields the permutation $\pi{\uparrow}A = (5\;7)(2\;8)$. We define the lift of a permutation to a sequence $B = ((b_1^1, \ldots, b_1^k), \ldots, (b_n^1, \ldots, b_n^k))$ of same-length sequences as
$$\pi{\uparrow}B := \pi{\uparrow}(b_1^1, \ldots, b_n^1) \circ \cdots \circ \pi{\uparrow}(b_1^k, \ldots, b_n^k).$$
Note that for each $i \in \{1, \ldots, k\}$, the $(b_1^i, \ldots, b_n^i)$ are again assumed to be distinct. We permit that the $b_j^i$ are sequences again. In this sense, this definition is recursive. Figure 1 illustrates the simple intuition behind these more complex lifts.
Fig. 1 Illustration of the lift $(1\;2\;3){\uparrow}((1,2,3,4), (5,6,7,8), (12,11,10,9))$ when applied to a sequence $(1, \ldots, 12)$ of cards. The idea is to permute the three card sequences in positions (1, 2, 3, 4), (5, 6, 7, 8) and (12, 11, 10, 9) (all of the same length) cyclically (as in $(1\;2\;3)$), taking the groups of four cards "as a whole". To illustrate that the sequences may be given in another order, we reversed the third sequence, with the effect that when (5, 6, 7, 8) is "mapped" to (12, 11, 10, 9), the card at the 5th position is mapped to the 12th position, and so on (as displayed in the figure).
Fig. 2 Steps of a sort protocol: random shuffle; turn the red cards, which reveals $\tau(A)$; sort the piles w.r.t. the turned cards; split the piles. Overall, the permutation (… 3 4 2) is applied to $A$ and $B$, leaving the red cards sorted and the pairs of blue cards permuted by (3, 1, 4, 2) as shown. Note that the encoding of the permutation through card sequences is as in Sect. 3.2, and that the revealed sequence (4, 3, 1, 2) is independent of the input sequences and the output sequence. (The different back colors are for illustration and to avoid errors in handling the cards, but are not necessary in theory.)
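The lift notation can be checked mechanically. The following is a minimal Python sketch, not part of the paper: the helper names `lift` and `lift_nested` and the dictionary representation of permutations are assumptions made for illustration. It reproduces the example $\pi = (1\;3)(2\;4)$ lifted to $A = (5, 2, 7, 8)$ and the lift shown in Fig. 1.

```python
def lift(pi, A):
    """Lift pi (a dict on {1..n}) to the positions A = (a_1..a_n).

    The lifted permutation maps a_i to a_{pi(i)} and fixes every other
    position m with 1 <= m <= max(A).
    """
    lifted = {m: m for m in range(1, max(A) + 1)}
    for i, a in enumerate(A, start=1):
        lifted[a] = A[pi[i] - 1]
    return lifted


def lift_nested(pi, B):
    """Lift pi to a sequence B of n same-length sequences.

    The i-th "column" (b_1^i, ..., b_n^i) is permuted by pi; since the
    columns use pairwise distinct positions, the composition of the
    column lifts is simply the union of their mappings.
    Only the moved positions are returned (all others are fixed).
    """
    lifted = {}
    for column in zip(*B):
        for j, b in enumerate(column, start=1):
            lifted[b] = column[pi[j] - 1]
    return lifted


# pi = (1 3)(2 4) lifted to A = (5, 2, 7, 8) gives (5 7)(2 8).
pi = {1: 3, 3: 1, 2: 4, 4: 2}
lifted_pi = lift(pi, (5, 2, 7, 8))

# The lift from Fig. 1: (1 2 3) lifted to three groups of four positions.
rot3 = {1: 2, 2: 3, 3: 1}
fig1 = lift_nested(rot3, ((1, 2, 3, 4), (5, 6, 7, 8), (12, 11, 10, 9)))
```

In the Fig. 1 example, `fig1[5]` is 12, matching the caption's remark that the card at the 5th position is mapped to the 12th position.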
We naturally extend this definition to permutation sets $\varPi \subseteq S_n$ and, for convenience, a lift to two sequences $A, B$ as $\varPi{\uparrow}(A, B) := \{\pi{\uparrow}A \circ \pi{\uparrow}B : \pi \in \varPi\}$.
The Family of Sort (Sub-)protocols. For each combination of a group of permutations $\varPi \subseteq S_n$, a sequence of (card) positions $A = (a_1, \ldots, a_n)$ and another sequence $B = (b_1, \ldots, b_n)$, we will define a "protocol" $\mathsf{sort}_{\varPi}\,A{\uparrow}B$. However, to avoid a larger and unnecessary technical exposition of sequential compositions of card-based protocols, we will use the symbol $\mathsf{sort}_{\varPi}\,A{\uparrow}B$ just as a shorthand or syntactic sugar for the sequence of four actions as stated in Protocol 1 (which is explained below). As this behaves like an inlined function in programming languages, we chose to call it "(sub-)protocol" in the following.
Note that $\varPi$, $A$ and $B$ are a public part of the action specification, not inputs. To describe the intended behavior of the shorthand, assume it is executed on a sequence $\Gamma$ of cards. Let $\mathbb{A} := \Gamma[A] := (\Gamma[a_1], \ldots, \Gamma[a_n])$ be the sequence of cards in positions $A$, and $\mathbb{B} := \Gamma[B]$ the sequence of cards in positions $B$. We assume that these card (symbol) sequences $\mathbb{A}$ and $\mathbb{B}$ are secret.

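Let $\pi \in \varPi$ be the permutation that sorts $\mathbb{A}$, i.e., $\pi(\mathbb{A})$ is the lexicographical minimum of $\{\pi'(\mathbb{A}) \mid \pi' \in \varPi\}$ w.r.t. a given order on the deck symbols. The overall effect of $\mathsf{sort}_{\varPi}\,A{\uparrow}B$ should be that $\pi$ is applied to both $\mathbb{A}$ and $\mathbb{B}$, yielding a sequence $\Gamma'$ with $\Gamma'[A] = \pi(\mathbb{A})$, $\Gamma'[B] = \pi(\mathbb{B})$ and $\Gamma'$ equal to $\Gamma$ everywhere else. We permit $B$, and correspondingly $\mathbb{B}$, to be a sequence of $k$-element sequences.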

Implementation of Sort Protocols
An example for a practical implementation is given in Fig. 2 and a formal specification in Protocol 1. The first step applies a randomly chosen permutation $\tau \in \varPi$ to $A$ and $B$. Then, the cards in positions $A$ are turned over, revealing $\tau(\mathbb{A})$, where $\mathbb{A}$ is the sequence of cards that was previously in positions $A$. This allows us to recognize which permutation $\sigma \in \varPi$ would sort $\tau(\mathbb{A})$ and to apply it to the sequences in positions $A$ and $B$. Clearly, the overall effect is that $\mathbb{A}$ and $\mathbb{B}$ have both been permuted by the same permutation $\sigma \circ \tau$. Moreover, this permutation sorted the cards in positions $A$ as desired.
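The steps just described can be simulated on card contents. The sketch below is an illustration under assumptions, not the paper's Protocol 1: the names `apply` and `coupled_sort`, the 0-based tuple representation of permutations, and the use of Python's pseudo-random generator in place of a physical shuffle are all hypothetical choices.

```python
import random
from itertools import permutations


def apply(perm, seq):
    """Move the card at position i to position perm[i] (0-based)."""
    out = [None] * len(seq)
    for i, card in enumerate(seq):
        out[perm[i]] = card
    return tuple(out)


def coupled_sort(group, a_cards, b_cards, rng=random):
    """Simulate the coupled sort on the contents of positions A and B.

    Returns (new_a, new_b, revealed), where `revealed` is what an
    observer sees when the cards in positions A are turned over.
    """
    # Step 1: shuffle both sequences with the same random tau from the group.
    tau = rng.choice(group)
    a_shuffled = apply(tau, a_cards)
    b_shuffled = apply(tau, b_cards)
    # Step 2: turn over the cards in positions A.
    revealed = a_shuffled
    # Step 3: publicly pick the group element sigma that sorts the revealed
    # cards and apply it to both sequences (then turn A face down again).
    sigma = min(group, key=lambda g: apply(g, revealed))
    return apply(sigma, a_shuffled), apply(sigma, b_shuffled), revealed


S4 = list(permutations(range(4)))
new_a, new_b, revealed = coupled_sort(S4, (3, 1, 4, 2), ("w", "x", "y", "z"))
```

Whatever `tau` is drawn, `new_a` ends up as `(1, 2, 3, 4)` and `new_b` as `("x", "z", "w", "y")`, while `revealed` varies from run to run.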
If we only want to reset the sequence in $A$ to a sorted one, i.e., without applying the permutation to cards at positions $B$ (as in Protocols 11 and 12), we write $\mathsf{sort}_{\varPi}\,A$.
The rationale behind this definition is that if all possible sequences $\mathbb{A}$ lie in a single orbit $O$ of $\varPi$, then shuffling with $\varPi$ destroys all information that is held in the sequence prior to turning it. Thus, no information is leaked. The condition $|O| = |\varPi|$ ensures that the permutation $\pi \in \varPi$ that sorts $\mathbb{A}$ is uniquely defined. Note that this slightly involved criterion is necessary to ensure security in the case that the permutation is chosen at random from a proper subset of $S_n$ (on all $n$ cards of the deck). An important example for this is a random cut, which we later use to apply a rotation encoded in a sequence.
Assume for instance $\varPi = \langle (1\;2\;3) \rangle$ and $\pi \in \varPi$ uniformly random. Moreover, let $X$ be the six-element set of permutations of (♡,♣,♠), and $s \in X$ be arbitrary. Revealing $\pi(s)$ to be, say, $\pi(s) =$ (♣,♡,♠) reveals, e.g., that $s$ is not (♡,♣,♠). The reason is that $\varPi$ has two orbits when acting on sequences of length 3 with symbols ♡,♣,♠, and we learn in which orbit we have been, excluding all sequences of the other orbit. This criterion is also suitable for achieving security, as shown by the following lemma.
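The two-orbit example can be verified by enumeration. This is a small Python sketch under assumed names (`rotate`, `orbit`) and with the card symbols spelled out as strings; it is meant only to make the orbit argument concrete.

```python
from itertools import permutations


def rotate(seq, k):
    """Cyclically rotate the tuple seq by k positions to the right."""
    k %= len(seq)
    return seq[-k:] + seq[:-k] if k else seq


def orbit(seq):
    """Orbit of seq under the group generated by the cyclic rotation."""
    return frozenset(rotate(seq, k) for k in range(len(seq)))


# All six orderings of three distinct symbols.
X = list(permutations(("heart", "club", "spade")))
orbits = {orbit(s) for s in X}

# The revealed sequence and the excluded sequence from the text.
revealed = ("club", "heart", "spade")
excluded = ("heart", "club", "spade")
same_orbit = orbit(revealed) == orbit(excluded)

# A sequence with repeated symbols: its orbit under rotations of four
# positions is smaller than the group, so the sorting rotation is not unique.
small_orbit = orbit(("club", "heart", "club", "heart"))
```

Here `orbits` contains two orbits of three sequences each and `same_orbit` is false, so turning the cards after a random cut alone would leak which orbit the hidden sequence was in. The last line shows a case where $|O| = |\varPi|$ fails.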

Lemma 3.1 If a shorthand/sub-protocol $\mathsf{sort}_{\varPi}\,A{\uparrow}B$ at step $i$ is valid in a protocol, then the sequence $\mathbb{A}'$ revealed in the sub-protocol's turn step is independent of the random variable $\Gamma$ denoting the card sequence before step $i$, and the random variable $\Gamma'$ denoting the sequence directly after the sub-protocol.
Proof By the definition of validity, the set of card sequences that can occur in positions $A$ at step $i$ is a subset of an orbit $O$ of $\varPi$ with $|O| = |\varPi|$. Whatever the distribution of $\mathbb{A}$ is, if $\tau \in \varPi$ is chosen uniformly at random, then the sequence $\mathbb{A}' = \tau(\mathbb{A})$ revealed in the turn step is uniformly distributed on $O$.
It is thus independent of $\Gamma$. Since $\Gamma'$ is a function of $\Gamma$, we conclude that $\mathbb{A}'$ is independent of $(\Gamma, \Gamma')$. ◻

Corollary 3.1 If a protocol P contains no turn operations outside of valid instances
of sort sub-protocols, then P is secure.
Useful Specializations. Two subclasses of sort protocols will be particularly useful. The first, with $\varPi = S_n$, will be useful, e.g., to apply an encoded permutation to another sequence of cards; the second, with $\varPi = \langle (1\;2\;\ldots\;n) \rangle$, serves to rotate a sequence by a specified offset. In the second case, the effect is that the rotation encoded in $A$ is applied to $\Gamma[B]$, and we also write $\mathsf{rot}\,A{\uparrow}B$ for $\mathsf{sort}_{\varPi}\,A{\uparrow}B$. (Note that this is similar to a part of the coupled rotation protocols given in [30].) Note that for $n = 2$, the two cases are the same.
Non-destructive Variant $\mathsf{sort}^*$. We define a variation $\mathsf{sort}^*$ of $\mathsf{sort}$ that differs only in so far as it should make no net change to the cards in positions $A$. For this, a sequence of helping cards is assumed to be available in (otherwise unused) positions $H = (h_1, \ldots, h_n)$. We implement $\mathsf{sort}^*$ in Protocol 2 by two applications of $\mathsf{sort}$, where the latter restores $\mathbb{A}$ from the helping "register". We say an application of $\mathsf{sort}^*$ is valid whenever an application of $\mathsf{sort}$ would be valid and $\mathbb{H} := \Gamma[H] = (1, \ldots, n)$ is guaranteed, i.e., $H$ contains cards with numbers in ascending order. Note that $\mathsf{sort}^*$ is defined as a shorthand or syntactic sugar via Protocol 2 in the same way as $\mathsf{sort}$. It is easy to see that under these conditions, if $\pi$ is applied to the cards in positions $A$ and $H$ in the first sorting step, then $\pi^{-1}$ is applied to the cards in positions $A$ and $H$ in the second sorting step, as this is the unique permutation that sorts the cards in positions $H$. Thus, one complete valid application of $\mathsf{sort}^*$ makes no net changes to $A$ and $H$. It is also easy to check that both applications of $\mathsf{sort}$ are valid in the original sense; therefore, Lemma 3.1 and Corollary 3.1 extend naturally to $\mathsf{sort}^*$. We use $\mathsf{rot}^*$ for the variant using cyclic rotations.
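The restore-from-register idea can be sketched as follows. Protocol 2 itself is not reproduced in this section, so the concrete sequence of calls below (first sort A while dragging B and H along, then sort H while dragging A along) is an assumed reading of the description above; the function names and the tuple representation are hypothetical.

```python
import random
from itertools import permutations


def apply(perm, seq):
    """Move the card at position i to position perm[i] (0-based)."""
    out = [None] * len(seq)
    for i, card in enumerate(seq):
        out[perm[i]] = card
    return tuple(out)


def sort_coupled(group, key_cards, *coupled, rng=random):
    """Sort key_cards within `group`, permuting all coupled sequences alike."""
    tau = rng.choice(group)                      # oblivious random shuffle
    piles = [apply(tau, s) for s in (key_cards, *coupled)]
    sigma = min(group, key=lambda g: apply(g, piles[0]))  # sorts the keys
    return [apply(sigma, s) for s in piles]


def sort_star(group, a_cards, b_cards, helper):
    """Non-destructive variant: B is permuted, A and the helper are restored."""
    # First sort: A becomes sorted; B and the helper register follow along.
    a1, b1, h1 = sort_coupled(group, a_cards, b_cards, helper)
    # Second sort: sorting the helper register undoes the permutation on A.
    h2, a2 = sort_coupled(group, h1, a1)
    return a2, b1, h2


S4 = list(permutations(range(4)))
a_out, b_out, h_out = sort_star(S4, (3, 1, 4, 2), ("w", "x", "y", "z"), (1, 2, 3, 4))
```

After the call, `a_out` equals the original `(3, 1, 4, 2)`, `h_out` is again `(1, 2, 3, 4)`, and only `b_out` has changed.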

Stating Classical Protocols in Terms of sort
The standard and, or, xor and copy protocols due to Mizuki and Sone [37] can all be stated as a single application of our sort sub-protocol, as shown in Protocols 3 to 6 in Fig. 3. We also provide a permutation application protocol that takes the encoding of a permutation and a sequence as input and outputs the permuted sequence. This is in essence the permutation division protocol by Hashimoto et al. [22] (the only change being that we encode the inverse permutation). It has been suggested to us that more complex protocols, such as zero-knowledge protocols for Sudoku [44] and Makaro [6], as well as for the Millionaire's problem [34], can be interpreted to implicitly utilize our sort protocol. Moreover, the eight-card AND protocol for standard decks (where all card symbols are distinct) from [35] and the eight-card 3-bit majority protocol of [40] can be implemented using two sorts; the latter is given in Protocol 8.
Fig. 3 The classical protocols and, or, xor and copy as well as a permutation application protocol, all stated as sort protocols
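To make the claim concrete for one case, the following Python sketch shows how an XOR could be phrased as a single coupled sort on two-card bit encodings. Protocols 3 to 6 are given in Fig. 3 and not reproduced here, so this is an illustrative reading rather than the paper's exact protocol; the names `encode`, `decode`, `sort2` and `xor_via_sort` and the numeric symbol order (club before heart) are assumptions.

```python
import random

CLUB, HEART = 0, 1  # assumed symbol order: club sorts before heart


def encode(bit):
    """Two-card encoding: 0 is (club, heart), 1 is (heart, club)."""
    return (HEART, CLUB) if bit else (CLUB, HEART)


def decode(cards):
    return 0 if cards == (CLUB, HEART) else 1


def sort2(a_cards, b_cards, rng=random):
    """Coupled sort for n = 2: sort the pair A, swapping B in lockstep."""
    # Shuffle: with probability 1/2, swap both pairs simultaneously.
    if rng.random() < 0.5:
        a_cards, b_cards = a_cards[::-1], b_cards[::-1]
    revealed = a_cards  # turn over the cards in positions A
    # Sort: if A shows (heart, club), swap both pairs once more.
    if revealed != (CLUB, HEART):
        a_cards, b_cards = a_cards[::-1], b_cards[::-1]
    return a_cards, b_cards, revealed


def xor_via_sort(x, y):
    """B ends up swapped exactly when x = 1, so it encodes x XOR y."""
    _, out, _ = sort2(encode(x), encode(y))
    return decode(out)
```

The revealed pair is (club, heart) or (heart, club) with probability 1/2 each regardless of `x`, which is the independence property stated in Lemma 3.1.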

Securely Evaluating a Universal Circuit
Let us start with the most direct case, namely implementing PFE using universal circuits, first constructed by Valiant [45]. We do not want to go into the details of the construction and just import facts about the general structure of the circuit and how it is used. In our examples, Alice provides her private function, here as a circuit $C$, and Bob his private input to the function, and it should hold that neither party learns anything about the other's respective secrets. The universal circuit $U_n$ for circuits of size $n$ takes as input an encoding $\langle C \rangle$ of $C$, where $C$ has size $n$, and an input $I \in \{0, 1\}^l$ of length $l$. We assume $C$ to have fan-out and fan-in at most 2, i.e., each gate has at most two inputs and at most two outputs.
In the constructions by Valiant, $U_n$ is described via a directed acyclic graph with $O(n \log n)$ vertices, where each vertex represents a logic gate taking values on its incoming edges as well as certain "configuration" (or programming) bits as input and computes outputs emitted to its outgoing edges. More concretely, $U_n$ contains the following types of nodes:
- universal gates, which take Alice's configuration bits $c_1, \ldots, c_4$ and two inputs $x, y$ and emit $z = c_1 \bar{x}\bar{y} + c_2 \bar{x}y + c_3 x\bar{y} + c_4 xy$ on both outgoing wires,
- X-switches, where Alice's configuration bit $c$ decides whether the two inputs are swapped or forwarded unchanged,
- Y-switches, where Alice's configuration bit $c$ decides which of the two inputs is forwarded as the output,
- $O(n)$ forks, where the signal on one wire is forwarded to both outgoing wires, i.e., $a \mapsto (a, a)$,
- $l$ input nodes with out-degree 1 and in-degree 0, and one output node with in-degree 1 and out-degree 0, with their natural interpretation.
The universal gates correspond to the gates of Alice's circuit, with the configuration bits determining what kind of gate it is, and the configuration of X- and Y-switches ensures that the intermediate results are routed correctly to the relevant gates. For us, it suffices that there is an (efficient) way to obtain $\langle C \rangle$ from $C$, which Alice applies beforehand. Valiant [45] describes such a general mapping from circuits $C$ to a string of $O(n \log n)$ configuration bits for $U_n$, such that $U_n$ configured with $\langle C \rangle$ (in canonical order) implements $C$. We describe in Protocol 9 and Theorem 4.1 how, given $U_n$ and encodings of $\langle C \rangle$ and Bob's input $I$ in sequences of cards, we can compute $C(I)$ securely.
Proof $P$ is given as Protocol 9. All nodes of $U_n$ are considered in some topological order $s_1, \ldots, s_N$, allowing us to compute the bits "flowing" along each edge of $U_n$ in a systematic way. The message at an edge $e$ is stored in positions $V_e = (V_e[0], V_e[1])$. Note that the bit on each edge is only used in one subsequent computation: after processing $s_i$, only the bits on the edges crossing the cut $(\{s_1, \ldots, s_i\}, \{s_{i+1}, \ldots, s_N\})$ are needed in future computations. When processing $s_{i+1}$ we may, therefore, when storing the bits for the outgoing edges of $s_{i+1}$, reuse the now freed up cards that stored the bits on the incoming edges of $s_{i+1}$. In Protocol 9, this is reflected by identifying $V_e$ and $V_{e'}$ for some pairs $(e, e')$ of edges. We only need a new pair of cards in the case of a fork. To verify correctness, let us interpret the main sort commands in the protocol.
1. In the X-switch case, C v ↑(V e, V f) swaps the positions encoding the incoming input values at edges e and f if the configuration bit of the X-switch equals 1, and leaves them unchanged if it equals 0. This is exactly what we wanted.
2. In the Y-switch case, the command is exactly the same, with the difference that afterwards only the output bit that ends up in the first position (V e) is used.
3. In the fork case, we (non-destructively, i.e., with restoring) copy the bit to another position, used as an additional output wire value.
4. The universal gate case is the most interesting. Recall that we want to map (c 1, c 2, c 3, c 4, x, y) to (z, z) with z = c 1 x̄ȳ + c 2 x̄y + c 3 xȳ + c 4 xy. For this, first observe that exactly one of the terms x̄ȳ, x̄y, xȳ, xy equals one. Essentially, the values of x and y select which configuration bit constitutes the output. If x = 0, then only c 1 and c 2 are relevant. If x = 1, only c 3 and c 4 are. Therefore, in the first sorting step, we obliviously swap (C 1, C 2) for (C 3, C 4) if x = 1 and leave things as they are if x = 0. The two relevant configuration bits end up in positions C 1, C 2, without us knowing which they are. Now, we do the same with C 1, C 2, based on the value of y, so that the only relevant configuration bit is now in C 1. In the last step, we write this value in both V g and V h (recall the fan-out two requirement).
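The selection logic of the universal gate case can be checked on plain bits. The following is a minimal sketch that ignores cards and shuffles entirely; the helper names `cond_swap`, `universal_gate` and `reference` are hypothetical and only serve this illustration. It compares the two conditional swaps against the formula for z on all 64 inputs.

```python
from itertools import product

def cond_swap(bit, a, b):
    """Return (b, a) if bit is 1, else (a, b); a stand-in for one oblivious swap."""
    return (b, a) if bit else (a, b)

def universal_gate(c1, c2, c3, c4, x, y):
    # Step 1: if x = 1, exchange the pair (c1, c2) with the pair (c3, c4).
    (c1, c2), (c3, c4) = cond_swap(x, (c1, c2), (c3, c4))
    # Step 2: if y = 1, exchange c1 with c2; the selected bit is now first.
    c1, c2 = cond_swap(y, c1, c2)
    # Step 3: the selected bit is written to both output wires (fan-out two).
    return (c1, c1)

def reference(c1, c2, c3, c4, x, y):
    """z = c1*not(x)*not(y) + c2*not(x)*y + c3*x*not(y) + c4*x*y."""
    nx, ny = 1 - x, 1 - y
    return c1 * nx * ny + c2 * nx * y + c3 * x * ny + c4 * x * y

def check_all_inputs():
    """True if the two swaps agree with the formula on every input."""
    return all(
        universal_gate(*bits) == (reference(*bits), reference(*bits))
        for bits in product((0, 1), repeat=6)
    )
```

In the card protocol both swaps are executed obliviously, so the executer never sees x, y or the selected configuration bit; the visible `if` in `cond_swap` is a simplification of this sketch.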
To see that P is secure, we use Corollary 3.1 and the fact that no turn operations are performed outside of sorting steps. ◻
Depending on the topological ordering used in Protocol 9, the helping deck we use to implement forks is not fully required. Instead of using a "fresh" pair of cards to store a copy of the incoming value whenever a fork is encountered, we can reuse cards that have already served their function and will not be used in the remainder of the protocol. This includes, for instance, the cards that encoded configuration bits of universal gates or X-switches that have already been executed.

Remark 4.1 (Reusability of the Circuit)
If we would like to execute the circuit multiple times, the programming bits of Alice's program must not be destroyed during the execution. Here, we have to take a little care to ensure that the relevant bits are written back and that conditionally swapped cards are "unswapped" again. For this variant of our algorithm, we replace all sort operations in Protocol 9 by their starred variants. If v is a universal gate, we need to take extra care: in the penultimate line of the case, instead of reusing V e and V f (which are now in temporary use to swap back the relative positions of the cards containing the configuration bits), we set V g and V h as the positions of two new cards, containing ♣♡ as in the fork case. To undo the swaps, we perform V f ↑(C 1, C 2) and then V e ↑((C 1, C 2), (C 3, C 4)) at the very end of the procedure in the universal gate case. Afterwards, the cards in V e and V f may be reused again. Hence, this variant uses 2x n + 2y n + 2f n + 8u n shuffles, where x n is the number of X-switches, y n is the number of Y-switches, f n is the number of forks and u n is the number of universal gates in U n.

Securely Simulating a Turing Machine
Assume we wish to execute a Turing machine (TM) with a secret encoding provided by one player, Alice, on a secret input provided by another player, Bob. As any secure card protocol uses a fixed number of cards and has a runtime which is independent of the input, there must be known bounds on certain parameters of the Turing machine. Let M be a bound on the number of states, N a bound on the number of accessed tape cells and t a bound on the execution time. For simplicity, assume Alice's TM has precisely M states (it can be padded with dummy states), runs t steps ("halting" can be achieved by staying in one state, writing the current tape symbol and not moving) and think of the tape as a cycle of length N (which makes no difference for a TM only ever accessing N memory cells).
All cards (and names for them occurring in the following description) used for our protocol, with the exception of a few helping cards used for the starred operations, are given in Fig. 4. The encoding of a Turing machine consists of the encoding of its M states. The encoding of each state q ∈ {0, … , M − 1} consists of the encoding of two transitions, one for each of the two tape symbols ♡♣ and ♣♡. Take for instance the positions W 0, SHIFT 0 = (L, N, R) and Q′ 0 encoding the transition from state q = 0 if the tape symbol is ♣♡. The two cards in positions W 0 contain the tape symbol to be written. The three cards in positions SHIFT 0 specify the movement of the Turing machine head: ♣♡♡ for "left", ♡♡♣ for "right", ♡♣♡ for "no movement" / "halt". Lastly, the M cards in positions Q′ 0 contain a unary encoding of q − q′ (mod M), where q′ ∈ {0, … , M − 1} is the index of the state to be entered next (♣♡ … ♡ encodes 0, ♡♣♡ … ♡ encodes 1, etc.).
The input to the TM, provided by Bob, is encoded in the first l bits of the tape. When executing the Turing machine, the current tape cell will always be in position TAPE[0] and the current state in position Q[0]. Instead of having an explicit moving head we simply rotate the entire tape. Moreover, instead of having an explicit value encoding the current state, we rotate the sequence of states. This is also the reason we encode state index differences in the state transitions instead of absolute indices. The protocol is given as Protocol 10 and consists of a loop that does t times the following:
- "read" the tape symbol in position TAPE[0] by conditionally swapping the two transitions in state Q[0] such that the transition that should be done is available […]
- […], ROT[1] or ROT[N−1] depending on whether the ♣-card among SHIFT 0 is in position N, R or L, respectively. Then, the TAPE and ROT cards are rotated together such that the tape cell whose corresponding ROT card is ♣ comes to rest in position TAPE[0] (and such that one does not learn which rotation has been performed).
- The same idea is used to first copy the information about the next state into […]
Using this protocol idea, we obtain the following theorem.
Proof The protocol is given in Protocol 10 and Fig. 4. For security, observe that the protocol consists only of sort sub-protocols; we can thus use Corollary 3.1.
For the cards needed, we just count the number of cards depicted in Fig. 4. In a bit more detail, for the helping cards needed, note that we need N − l pairs of ♣♡ for the empty tape cells, which are placed next to Bob's input string. We have one ♣ for each of the registers ROT, SAV and NEXT, and N − 1, 1 and M − 1 ♡s, respectively. The second part of the union scales with the size of the largest register to be used in starred commands, which is either SHIFT 0 or Q′ 0. ◻
[…] However, given that the charm of Turing machines is their simplicity rather than their efficiency, we felt that we should reserve this trick for later. For simplicity, we also chose to describe how to implement TMs with band alphabet {0, 1}, excluding the special blank symbol ␣. While one can generically map this to the standard case by using an encoding 1 = 11, 0 = 10, and ␣ = 00, let us briefly discuss how one can easily upgrade our implementation to a TM supporting an additional blank symbol. For this, we encode tape cells with three cards via ♣♡♣ = 0, ♡♣♣ = 1 and ♣♣♡ = ␣. In this way, the first two cards encode the value as previously, unless they are ♣♣, which would be a blank. We then need to add W 2, SHIFT 2 and Q′ 2 to each of the Q s, specifying the operation in the case that a blank symbol is read. (Note that the W i contain the symbol to be written in reversed order, to ensure the right action is done to the tape cards.) This approach has the advantage of allowing us to learn the length of the output after the computation (if it is not to be protected), by just turning over the third card in each of the tape cells and outputting (the first two cards of) those cells which do not show a ♡, i.e., which are not blank.
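The three-card cell encoding and the output rule just described can be illustrated with a few lines of Python. This is only a sketch on face-up symbols; the function name `read_output` and the representation of a cell as a tuple of three strings are assumptions made for the example.

```python
def read_output(tape_cells):
    """Decode three-card tape cells: ♣♡♣ = 0, ♡♣♣ = 1, ♣♣♡ = blank.

    Only the third card of every cell is inspected first; cells whose
    third card is ♡ are blank and are skipped without looking at the rest.
    """
    output = []
    for first, second, third in tape_cells:
        if third == "♡":
            continue  # blank cell, contributes nothing to the output
        # The first two cards encode the bit as before: ♣♡ = 0, ♡♣ = 1.
        output.append("1" if (first, second) == ("♡", "♣") else "0")
    return "".join(output)
```

For example, a tape holding 1, 0 and then one blank cell decodes to the two-symbol output "10".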

Remark 5.2 (Reusability of the TM)
First note that we never destroy any of the state description entries of the TM's code, as in normal execution it is always possible to enter the state again. Hence, to be able to run a TM multiple times, we only need to ensure that after the execution the first state is again in Q[0]. As we cannot trust Alice to provide a program that guarantees this behavior, we can introduce an additional register START[0 … M − 1], which is a copy of NEXT and is rotated together with Q. It can then be used to rotate Q back into its initial configuration by executing START↑Q after the loop in Protocol 10. Hence, this variant uses 10t + 1 shuffles. (Resetting all tape cells to 0 and placing the new input is excluded here, but can be easily appended.)
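The rotation idea of this section, including the reset via a START marker, can be sketched on plain values. The sketch below is not Protocol 10: it uses no cards or shuffles, and the names `encode` and `run` as well as the program format are hypothetical. It assumes that each transition stores the offset (q − q′) mod M and that rotating the state sequence to the right by that offset brings state q′ to the front.

```python
from collections import deque

def encode(program):
    """program[q][s] = (write, move, q_next); store (q - q_next) mod M instead."""
    M = len(program)
    return [
        tuple((w, mv, (q - q_next) % M) for (w, mv, q_next) in program[q])
        for q in range(M)
    ]

def run(encoded, tape, t):
    """Run t steps; the current cell is always tape[0], the current state states[0]."""
    states, tape = deque(encoded), deque(tape)
    start = deque([1] + [0] * (len(states) - 1))  # marks the initial state
    for _ in range(t):
        write, move, offset = states[0][tape[0]]
        tape[0] = write
        # Rotate the tape instead of moving a head.
        tape.rotate({"L": 1, "N": 0, "R": -1}[move])
        # Rotate the states (and the marker) instead of storing a state index.
        states.rotate(offset)
        start.rotate(offset)
    # Reset: rotate the states back until the marked entry is at the front.
    states.rotate(-start.index(1))
    return list(states), list(tape)
```

As a usage example, take a two-state machine that flips the bit in state 0, leaves it unchanged in state 1, always moves right and alternates between the two states: `program = [[(1, "R", 1), (0, "R", 1)], [(0, "R", 0), (1, "R", 0)]]`. Running it for four steps on the cyclic tape `[1, 1, 1, 1]` flips every other cell, and the state sequence is returned in its initial order regardless of t.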

Securely Simulating a Random Access Machine (RAM)
We now describe a simple bounded Random Access Machine model. The goal is to execute a RAM machine with a secret encoding of the machine specified by one player, Alice, on a secret input provided by another player, Bob.

A Simple RAM Model
We assume fixed constants N = 2^n (memory words), M = 2^m (instruction groups), l ≤ N (input size) and t < ∞ (time limit). The machine has access to N binary words RAM[0], … , RAM[N − 1] of length n each, the first l of which contain the input and the remaining N − l contain zero. The following types of instructions are available, where x, y are n-bit words and p is an m-bit word: […]
To simplify the implementation step later, we assume that a program is a sequence I[0], … , I[M] of groups of instructions. Each group of instructions contains precisely one instruction of each of the above types, in canonical order. Note that this fixed instruction order does not affect the strength of the model. Indeed, if we assume that without loss of generality the cell RAM[0] is never used in any "real" instruction, we may choose x = y = 0 to turn any instruction into a dummy instruction that has no effect. By turning all but one desired instruction in each instruction group into such a dummy instruction, we can implement programs without having to worry about the fixed instruction order at the expense of increasing the number of instructions by a constant factor.
Here, the RAM[x] p ("jump if not zero") instruction means that if RAM[x] contains zero, the execution should continue with the next instruction group. Otherwise, p is to be interpreted as the relative offset to the next instruction group that should be executed, i.e., if the current instruction group has index j, then the instruction group with index (j + p) mod M should be executed next.
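The jump semantics can be stated in two lines of code. The sketch below also shows an equivalent view in which the current instruction group always sits at index 0 and the whole sequence is rotated, which is the view taken by the card implementation later in this section. The function names are hypothetical, and wrapping around modulo M in the "next group" case is an assumption of this sketch.

```python
from collections import deque

def next_group(j, p, value, M):
    """Index of the next instruction group after a jump-if-not-zero in group j."""
    return (j + 1) % M if value == 0 else (j + p) % M

def rotate_program(groups, p, value):
    """Same effect with the current group kept at index 0 of a deque."""
    groups.rotate(-1 if value == 0 else -p)

def views_agree(M):
    """Check that both views select the same group for all j, p and both outcomes."""
    for j in range(M):
        for p in range(M):
            for value in (0, 1):
                groups = deque(range(M))
                groups.rotate(-j)  # group j is the current one
                rotate_program(groups, p, value)
                if groups[0] != next_group(j, p, value, M):
                    return False
    return True
```

With p = 0 and a non-zero value, the rotation leaves the sequence unchanged, so the same group is executed again.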

Implementation with Cards
Assume we want a secure implementation of the RAM model with parameters N = 2^n, M = 2^m, l, t using playing cards. We may imagine that one player, Alice, provides the sequence of instructions, and the other player, Bob, provides the input in RAM[0 … l−1] of l⋅n bits. As usual, each bit is encoded with a pair of cards and a word of n or m bits is a sequence of n or m such pairs. In addition to the inputs, we have an encoding of RAM[l … N−1] (initially zero) and two additional n-bit […], which will be used for the conditional jumps. An overview is given in Fig. 5.
We say a few words about the implementation of the instructions, starting with a general description of how words can be loaded from and stored to arbitrary addresses.
Loading a Word. Assume that an address is available as an n-bit word x = (x 1, … , x n), each bit x i encoded as a pair of face-down cards in positions X i = (X i [0], X i [1]), and that the word RAM[x] should be loaded into the accumulator. We give an implementation as Protocol 11. The first loop uses n conditional swaps of RAM ranges to transport the content of RAM[x] to RAM[0]. The second for-loop copies the content of RAM[0] to the accumulator. Since the copy protocol can copy information only onto card pairs that are in a known state, we must securely reset the accumulator bits before each copy operation. The third for-loop undoes all swaps of the first loop, in reverse order. In total, this uses 7n shuffles.
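The three loops of the load operation can be mimicked on a plain Python list. This is a sketch of the idea only, not Protocol 11: the choice of which ranges are swapped for address bit i (cells k and k + 2^i for all k whose i-th bit is 0) is one natural option and an assumption here, as are the names `cond_swap_ranges` and `load` and the convention that `x_bits[i]` has weight 2^i.

```python
def cond_swap_ranges(ram, bit, i):
    """If bit is 1, swap RAM[k] and RAM[k + 2**i] for every k with i-th bit 0."""
    if bit:
        step = 1 << i
        for k in range(len(ram)):
            if not k & step:
                ram[k], ram[k + step] = ram[k + step], ram[k]

def load(ram, x_bits):
    """Return RAM[x] and leave ram unchanged; len(ram) must be 2**len(x_bits)."""
    # First loop: after all n conditional swaps, RAM[x] sits in RAM[0].
    for i, bit in enumerate(x_bits):
        cond_swap_ranges(ram, bit, i)
    # Second loop (collapsed to one step here): copy RAM[0] to the accumulator.
    accumulator = ram[0]
    # Third loop: undo the swaps in reverse order.
    for i, bit in reversed(list(enumerate(x_bits))):
        cond_swap_ranges(ram, bit, i)
    return accumulator
```

Each conditional swap exchanges cell k with cell k XOR 2^i, so the composition of all swaps maps address x to address 0. In the card protocol the swaps are performed obliviously, so the executer does not learn any bit of x; the visible `if bit` is a simplification.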
Storing a word. Storing is very similar to loading; we give an implementation in Protocol 12. Here, instead of copying the RAM content to the accumulator in the second line of the second for-loop, we copy the value of the accumulator into the RAM. As above, this uses 7n shuffles.
Move operations. The operations previously dubbed copy, indirect read and indirect write are easy to implement using the load and store algorithms. For temporary storage, the accumulator A′ is used. For instance, the indirect write operation RAM[RAM[x]]←RAM[y] with the words x and y encoded in positions X and Y can be implemented using […] just swaps the two card sequences. As each load or store operation uses 7n shuffles, we use 14n shuffles for copy, and 21n shuffles for indirect read and indirect write.
Loading Constants. Copying a value given directly in the instruction is simply done by copying each of the n bits one by one. This uses 7n shuffles.
Addition and Subtraction. Secure half and full adders have been described in [33]. If n ≥ 2, the accumulator A′ is sufficient to store carry bits temporarily. Note that both protocols use 5n shuffles (more precisely, random bisection cuts), as subtraction uses the full adder with the carry bit set to 1 and all bits of the second number inverted (via a simple perm operation). We omit the details.
Conditional Jump. While it would be possible to have an instruction pointer that is affected by jump operations, we opt for an approach that seems slightly more elegant. We always execute instruction group I[0], and when executing the last instruction RAM[x] p of that group, we rotate the sequence of all instructions such that either IP[1] or IP[p] becomes IP[0], depending on the value of RAM[x]. See below for the exact description. Counting the shuffles in the relevant part of Protocol 13 yields 8n + 2m + 2 shuffles, as the n-bit OR operation uses n shuffles. (Note that here p might even be 0, meaning that if RAM[x] ≠ 0 the same instruction group is repeated again. Due to the time limit t, this cannot result in a real infinite loop and hence would not exhibit unusual detectable behavior that, e.g., Alice could use to learn information on Bob's input.) The overall execution of the RAM program is given in Protocol 13.
We assume the addresses x and p are available in positions X and P, respectively. To carry out the conditional jump, first load x into the accumulator and form the Boolean OR of all its bits. Assuming RAM[0] is not zero, then the bit a 1 is set to true by this OR operation and the single ♡-card is swapped into IP * before the for-loop and is put into position IP[1] afterwards. If, however, RAM […]
Proof For the correctness, we refer to the above explanation of all the relevant commands. For security we again use Corollary 3.1 and the fact that we do not turn over any cards outside sort or rot operations. For this, note that the OR operation in line 5 of Protocol 13 can be framed as a sort operation, cf. Protocol 4. The number of shuffles is derived by counting the numbers of shuffles in each instruction type as specified above. This yields (7n + 14n + 21n + 21n + 5n + 5n + 8n + 2m + 4)t = (85n + 2m + 4)t shuffles.

Securely Evaluating a Branching Program
Branching Programs [4] are commonly used for constructing program obfuscation, e.g., in [17,20,47], which inspired this section.
Branching Program. A branching program B of length N and width w for l variables is a sequence ((j^(i), π_0^(i), π_1^(i)))_{1≤i≤N} […] In other words, in the i-th step, the value of the j^(i)-th variable determines which of the two permutations of the i-th instruction is used. For σ ∈ S_w, we say B σ-computes a Boolean circuit C, if for any v ∈ {0, 1}^l […]
Now let S_w act on a set of states via some group action *. Executing B on v starting from some start state q_0 means computing states (q_i)_{1≤i≤N} iteratively as q_{i+1} = π^(i)_{v_{j^(i)}} * q_i. Of course, we end with q_N = π^(N)_{v_{j^(N)}} * … * π^(1)_{v_{j^(1)}} * q_0. In this paper, the states are card sequences of length w, and π * q yields the card sequence q permuted by π.
A Peculiar Subset of S_5. Barrington's Theorem makes heavy use of the fact that S_5 is not a solvable group. In particular, there are permutations σ, τ ∈ S_5 such that the commutator [σ, τ] := σ ∘ τ ∘ σ^{−1} ∘ τ^{−1} is not the identity permutation. There is some freedom when choosing permutations for the construction that follows. To be more specific, we define the five permutations σ_0, … , σ_4 as […]
Barrington's Theorem. We now state a central theorem due to Barrington, which we specialize to permutations from the set F defined above. For self-containedness and illustration, we give the elegant and constructive proof in full. Recall from Sect. 2 that the depth of a circuit C is the maximum number of ∧ and ∨ gates on a path in C. […]
Proof The proof works by induction on the length d′ of the longest path in C. If d′ = 0, then we also have d = 0 and the output node is labeled with a constant 0, a constant 1 or the index j of a variable. In these cases, the trivial branching programs with a single instruction of the form (_, id, id), (_, σ, σ) or (j, id, σ), respectively, σ-compute C (here, _ is a placeholder for an arbitrary variable index). Now assume d′ > 0. If the output node is labeled "¬", then the value at its unique predecessor is computed by a circuit C′ with longest path of length d′ − 1. Therefore, there is a branching program B′ that σ^{−1}-computes C′ with at most 4^d instructions. Let (j, π, π′) be the last instruction of B′. Replacing it with (j, σ ∘ π, σ ∘ π′) yields a branching program B that σ-computes C since we have B(v) = σ ⇔ B′(v) = id ⇔ C′(v) = 0 ⇔ C(v) = 1, and for similar reasons B(v) = id ⇔ C(v) = 0.
If the output node is labeled ∧, then values at its two predecessors are computed by two circuits C′ and C″ with longest path of length at most d′ − 1 and depth at most d − 1. We previously observed that we can write σ = [σ′, σ″] for two permutations σ′, σ″ ∈ F. Let B′_{σ′} and B′_{σ′^{−1}} be two branching programs that σ′-compute and σ′^{−1}-compute C′, respectively, and similarly B″_{σ″} and B″_{σ″^{−1}} be two branching programs that σ″-compute and σ″^{−1}-compute C″, respectively.
We obtain B as the concatenation of these four branching programs. Depending on the values r′ = C′(v 1, … , v l) and r″ = C″(v 1, … , v l) we get the following behavior of B:
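The effect of the commutator construction for an ∧ gate can be checked mechanically. The sketch below does not use the paper's concrete permutations; it searches S_5 for two 5-cycles whose commutator is again a 5-cycle and then evaluates all four combinations of r′ and r″. Permutations are tuples p with p[i] the image of i, `compose(a, b)` means "apply b first, then a", and all function names are hypothetical.

```python
from itertools import permutations

ID = tuple(range(5))

def compose(a, b):
    """Composition a ∘ b: apply b first, then a."""
    return tuple(a[b[i]] for i in range(5))

def inverse(a):
    inv = [0] * 5
    for i, image in enumerate(a):
        inv[image] = i
    return tuple(inv)

def commutator(a, b):
    """[a, b] = a ∘ b ∘ a^{-1} ∘ b^{-1}."""
    return compose(compose(a, b), compose(inverse(a), inverse(b)))

def is_five_cycle(p):
    """True if the orbit of 0 under p has length 5."""
    i, length = p[0], 1
    while i != 0:
        i, length = p[i], length + 1
    return length == 5

def find_pair():
    """Brute-force search for two 5-cycles whose commutator is a 5-cycle."""
    cycles = [p for p in permutations(range(5)) if is_five_cycle(p)]
    for a in cycles:
        for b in cycles:
            if is_five_cycle(commutator(a, b)):
                return a, b
    return None

def and_gadget(r1, r2, a, b):
    """Overall permutation when the sub-programs yield a (or id) and b (or id)."""
    return commutator(a if r1 else ID, b if r2 else ID)
```

If either value is 0, the corresponding permutation and its inverse are both replaced by the identity, the remaining pair cancels, and the result is the identity. Only for r′ = r″ = 1 does the full commutator survive. The order in which the four sub-programs are composed is an assumption of this sketch.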

Implementing Branching Programs with Cards
We first describe how the encoding P = P(C) is obtained from C, as the format of P already contributes to hiding details about C, especially the pattern in which variables are used. Firstly, by Barrington's Theorem (Theorem 7.1) there is a branching program B = B(C) that σ_0^{−1}-computes C with N ≤ 4^d instructions. We now transform B into a normalized branching program B′ by preceding each instruction (j, π_0, π_1) of B with the j − 1 dummy instructions (1, id, id), … , (j − 1, id, id) and appending to it the l − j dummy instructions (j + 1, id, id), … , (l, id, id). This means that B′ accesses all variables periodically in canonical order. Note that B′ contains lN ≤ l⋅4^d instructions. (In addition, we may choose to pad B′ to a longer program B″ of length lN′ if we wish to hide the length of B′ and thus of B.) Clearly, B′ exhibits the same behavior as B. The sequence P is now simply obtained by concatenating the lN sequences encoding the permutations occurring in the description of B′.
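The normalization step can be sketched as follows. Instructions are triples (j, p0, p1) with 1 ≤ j ≤ l, permutations are tuples as image lists, and the names `evaluate` and `normalize` are hypothetical; the order of composition (later instructions applied on the left) is an assumption of this sketch.

```python
from itertools import product

def compose(a, b):
    """Composition a ∘ b: apply b first, then a."""
    return tuple(a[b[i]] for i in range(len(a)))

def evaluate(program, v):
    """Overall permutation of a branching program on the input bits v[0..l-1]."""
    result = tuple(range(len(program[0][1])))
    for j, p0, p1 in program:
        result = compose(p1 if v[j - 1] else p0, result)
    return result

def normalize(program, l):
    """Pad every instruction with dummies so variables are read as 1, ..., l, 1, ..., l, ..."""
    ident = tuple(range(len(program[0][1])))
    padded = []
    for j, p0, p1 in program:
        for k in range(1, l + 1):
            padded.append((j, p0, p1) if k == j else (k, ident, ident))
    return padded

def same_behavior(program, l):
    """True if the normalized program agrees with the original on all inputs."""
    padded = normalize(program, l)
    return all(
        evaluate(program, v) == evaluate(padded, v)
        for v in product((0, 1), repeat=l)
    )
```

The padded program has exactly l times as many instructions as the original, and its access pattern no longer depends on which variables the original program reads.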
(ii) The output is two cards encoding B(v 1 , … , v l ).
(iii) In addition to the cards encoding the inputs, the helping deck [ 2⋅♡,5⋅ ♣] is used. (iv) Each execution of the protocol performs 3lN shuffle actions.
Proof The protocol is described in Protocol 14. We denote by capital letters the sets of positions on which the corresponding parts of the input (denoted by lower case letters) are present at the start of the protocol. Additionally, there are helping cards present in positions Q that initially contain the sequences ♣♡♣♣♣ as well as two cards to support the * -operation (not shown in Fig. 6).
Consider an iteration of the inner loop with k = li + j. First, the encodings of the two permutations π_0^(k) and π_1^(k) are swapped if v j (in position V j) is 1 and left as is otherwise. Hence, an encoding of π^(k)_{v_j} ends up in the positions that initially held π_0^(k), from where it is obliviously applied to the sequence in Q. For correctness, note that by assumption the normalized branching program σ_0^{−1}-computes C, i.e., if the output is 0, in total we perform id on the cards in Q, which results in a 0 being encoded in Q R. If C outputs 1, then σ_0^{−1} is applied to the cards of Q, resulting in ♡♣♣♣♣, as σ_0^{−1} maps 2 ↦ 1, yielding an encoded 1 in Q R. Security of P follows again from the fact that the protocol is composed only of valid sort operations and Corollary 3.1. ◻

Remark 7.1 (Reusability of the Program)
To allow for reusing the branching program after its execution, we would need to write the executed permutation of each step back into its register and to undo any conditional swaps. In more formal terms, we replace the sort command in the second line of the inner loop of Protocol 14 with its starred variant. To undo the swap, we repeat the first line of the inner loop after the second line. Moreover, we reset the register Q. Hence, this variant of the protocol uses 6lN + 1 shuffles.
A Note Regarding Active Security. Note that a malicious Alice might learn something about the input passed to the program by choosing the permutations of the program in such a way that the output (the first two cards in Q after the protocol run) is not ♣♡ or ♡♣, but ♣♣. If we want to avoid this, we can initialize q 0 with ♣♡♣♡♣ (replacing the penultimate ♣ with a ♡), and instead of opening just the first two cards at the end, we have to ensure that the content of the register gets mapped to a single bit, without revealing anything else. For this, note that after a protocol run of a legal program, Q contains one of two configurations, namely ♣♡♣♡♣ if id was applied, and ♡♣♣♣♡ if σ_0^{−1} was applied. Importantly, in the first case the ♡s have distance 1 and in the second case distance 0, which is invariant under random cuts and represents the two possible configuration classes (orbits w.r.t. random cuts) in the five-card trick [8]. However, we cannot use the five-card trick directly, as its output is not in committed format. To overcome this, we can make use of the five-card AND protocol of [3], which starts with a situation as above and then outputs a bit commitment to the AND value in a (restart-free) Las Vegas fashion. (Note that this protocol is shown to be optimal/card-minimal in a strong sense in [27].) This change would add seven shuffles (five random cuts and two random bisection cuts) in expectation.
Moreover, for active security in all the protocols in this paper, one should additionally implement the shuffle operation with active security as in [30]. For ease of implementing the coupled shuffles, we recommend using envelopes to avoid additional helping cards, as in Fig. 2.

Conclusion
We give four card-efficient and conceptually simple protocols for executing a universal machine model in a secure multiparty computation protocol, hence achieving Private Function Evaluation. These are for circuits, Turing machines, word-RAM machines and branching programs, giving the user a palette of options from which they can choose the most suitable one. As an interesting building block, which also largely simplifies security proofs, we introduce sort protocols, which we believe to be of independent interest, as many protocols from the literature can be restated in these terms. We give the concrete numbers of necessary cards for each of the models, carefully reusing helping cards where possible. We additionally discuss several adaptations, e.g., how to execute these protocols in a non-destructive way that lets us reuse the program multiple times.
Our results can also be interpreted as a straightforward instantiation of Oblivious RAM (ORAM), making heavy use of the fact that we can physically and obliviously move around "RAM cells", which is not possible in the usual cryptographic ORAM model. Stating classical cryptographic problems, such as constructing ORAM or program obfuscation, in the language of card-based cryptography might not only be of didactic use in explaining them to students, but also provide some insight into the constructions in the classical cryptographic realm.