A cryptographic study of tokenization systems

Díaz-Santiago, Sandra; Rodríguez-Henríquez, Lil María; Chakraborty, Debrup

doi:10.1007/s10207-015-0313-x

Regular Contribution
Published: 22 January 2016

Volume 15, pages 413–432, (2016)
International Journal of Information Security

Sandra Díaz-Santiago¹,
Lil María Rodríguez-Henríquez² &
Debrup Chakraborty³

Abstract

Payments through cards have become very popular in today’s world. All businesses now have options to receive payments through this instrument; moreover, most organizations store the card information of their customers in some way to enable easy payments in the future. Credit card data are very sensitive information, and theft of these data is a serious threat to any company. Any organization that stores credit card data needs to achieve payment card industry (PCI) compliance, which is an intricate process in which the organization needs to demonstrate that the data it stores are safe. Recently, there has been a paradigm shift in the treatment of the problem of storing payment card information. In this new paradigm, a token is stored instead of the real credit card data; this process is called “tokenization.” The token “looks like” the credit/debit card number, but ideally has no relation to the credit card number that it represents. This solution relieves the merchant from the burden of PCI compliance in several ways. Though tokenization systems are heavily in use, to our knowledge, a formal cryptographic study of this problem has not yet been done. In this paper, we initiate a study in this direction. We formally define the syntax of a tokenization system and several notions of security for such systems. Finally, we provide some constructions of tokenizers and analyze their security in light of our definitions.

Notes

In our view, irrespective of other possible identifiers, the associated data should contain an identifier of the merchant. Thus, if $d,d'\in {\mathcal {D}}$ are two associated data related to two different merchants, it should be that $d \ne d'$. Our notion of correctness requires this property of the associated data.
According to [6], the total number of credit cards in 2012 from the four primary credit card networks (i.e., VISA, MasterCard, American Express, and Discover) was 546 million ($\approx 2^{30}$). This can be considered a reasonable upper bound for q. Assuming credit card numbers to have 16 decimal digits, $\#{\mathcal {T}}= 10^{16} \approx 2^{53}$. These numbers lead to a collision probability of $1/2^{23}$, which is insignificant.
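The arithmetic in the second note can be checked with a few lines of Python. This is only a sketch: it reads the quoted collision probability as the ratio $q/\#{\mathcal {T}}$, which is an assumption about how the figure $1/2^{23}$ was obtained, and the variable names are illustrative.

```python
import math

q = 546_000_000          # credit cards in 2012, the figure quoted from [6]
token_space = 10 ** 16   # number of 16-digit decimal strings, #T

log2_q = math.log2(q)                  # just above 29, so 2^30 is a safe upper bound
log2_tokens = math.log2(token_space)   # just above 53
log2_ratio = log2_q - log2_tokens      # exponent of q / #T with the unrounded figures

# Rounded figures used in the note: 2^30 / 2^53 = 2^-23.
rounded_ratio = 2.0 ** 30 / 2.0 ** 53

print(f"log2(q)    = {log2_q:.2f}")
print(f"log2(#T)   = {log2_tokens:.2f}")
print(f"log2(q/#T) = {log2_ratio:.2f}")
print(f"rounded ratio equals 2^-23: {rounded_ratio == 2.0 ** -23}")
```

With the unrounded figures the ratio is somewhat smaller than $2^{-23}$, so the rounded value in the note errs on the conservative side.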

References

Bellare, M., Ristenpart, T., Rogaway, P., Stegers, T.: Format-preserving encryption. In: Jacobson Jr., M.J., Rijmen, V., Safavi-Naini, R. (eds.) Selected Areas in Cryptography, volume 5867 of Lecture Notes in Computer Science, pp. 295–312. Springer (2009)
Bellare, M., Rogaway, P., Spies, T.: The FFX Mode of Operation for Format-Preserving Encryption. NIST submission (2010). http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/ffx/ffx-spec
Berbain, C., Gilbert, H.: On the security of IV dependent stream ciphers. In: Biryukov, A. (ed.) FSE, volume 4593 of Lecture Notes in Computer Science, pp. 254–273. Springer (2007)
Black, J., Rogaway, P.: Ciphers with arbitrary finite domains. In: Preneel, B. (ed.) CT-RSA, volume 2271 of Lecture Notes in Computer Science, pp. 114–130. Springer (2002)
Brier, E., Peyrin, T., Stern, J.: BPS: A Format-Preserving Encryption Proposal. NIST submission (2010). http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/proposedmodes/bps/bps-spec
CardHub: Number of Credit Cards and Credit Card Holders (2012). http://www.cardhub.com/edu/number-of-credit-cards/
EMV: Payment Tokenisation Specification. Technical Framework (2014). https://www.emvco.com/specifications.aspx?id=263
Hoang, V.T., Morris, B., Rogaway, P.: An enciphering scheme based on a card shuffle. In: Safavi-Naini, R., Canetti, R. (eds.) CRYPTO, volume 7417 of Lecture Notes in Computer Science, pp. 1–13. Springer (2012)
ISO/IEC 7812–1: Identification Cards-Identification of Issuers-Part 1: Numbering System (2006)
Liskov, M., Rivest, R.L., Wagner, D.: Tweakable block ciphers. In: Yung, M. (ed.) CRYPTO, volume 2442 of Lecture Notes in Computer Science, pp. 31–46. Springer (2002)
Morris, B., Rogaway, P., Stegers, T.: How to encipher messages on a small domain. In: Halevi, S. (ed.) CRYPTO, volume 5677 of Lecture Notes in Computer Science, pp. 286–302. Springer (2009)
PCI Security Standards Council: Payment Card Industry Data Security Standard Version 1.2 (2008). https://www.pcisecuritystandards.org/security_standards/pci_dss.shtml
PCI Security Standards Council: Information Supplement: PCI DSS Tokenization Guidelines (2011). https://www.pcisecuritystandards.org/documents/Tokenization_Guidelines_Info_Supplement
PCI Security Standards Council: Tokenization Product Security Guidelines-Irreversible and Reversible Tokens (2015). https://www.pcisecuritystandards.org/documents/Tokenization_Product_Security_Guidelines
Ristenpart, T., Yilek, S.: The mix-and-cut shuffle: Small-domain encryption secure against n queries. In: Canetti, R., Garay, J.A. (eds.) CRYPTO (1), volume 8042 of Lecture Notes in Computer Science, pp. 392–409. Springer (2013)
Robshaw, M.J.B., Billet, O. (eds.): New Stream Cipher Designs-The eSTREAM Finalists, volume 4986 of Lecture Notes in Computer Science. Springer (2008)
RSA White Paper: Tokenization: What Next After PCI (2012). http://www.emc.com/collateral/white-papers/h11918-wp-tokenization-rsa-dpm
Securosis White Paper: Tokenization Guidance: How to Reduce PCI Compliance Costs (2011). http://gateway.elavon.com/documents/Tokenization_Guidelines_White_Paper
Securosis White Paper: Tokenization vs. Encryption: Options for Compliance (2011). https://securosis.com/research/publication/tokenization-vs.-encryption-options-for-compliance
Stefanov, E., Shi, E.: Fastprp: fast pseudo-random permutations for small domains. IACR Cryptol. ePrint Arch. 2012, 254 (2012)
Voltage Security White Paper: Payment Security Solution—Processor Edition (2012). http://www.voltage.com/wp-content/uploads/Voltage_White_Paper_SecureData_PaymentsProcessorEdition

Acknowledgments

The authors thank the reviewers for their comments and suggestions. Debrup Chakraborty acknowledges the support from Consejo Nacional de Ciencia y Tecnología (CONACyT), Mexico, through the grant 166763.

Author information

Authors and Affiliations

Escuela Superior de Cómputo, IPN, Av. Juan de Dios Bátiz, Lindavista, 07738, Mexico City, Mexico
Sandra Díaz-Santiago
Centro de Investigación en Computación, IPN, Av. Juan de Dios Bátiz, Col. Nueva Industrial Vallejo, 07738, Mexico City, Mexico
Lil María Rodríguez-Henríquez
Department of Computer Science, CINVESTAV-IPN, Av. IPN 2508 San Pedro Zacatenco, 07360, Mexico City, Mexico
Debrup Chakraborty

Corresponding author

Correspondence to Debrup Chakraborty.

Additional information

This is a substantially extended version of the paper: Sandra Díaz-Santiago, Lil María Rodríguez-Henríquez and Debrup Chakraborty, A Cryptographic Study of Tokenization Systems, International Conference on Security and Cryptography (SECRYPT 2014), pp. 393–398.

Appendix: Deferred Proofs

1.1 Proof of Theorem 1

We only prove the first claim in the theorem; as discussed earlier, the second claim follows directly from the first one. We construct a $\widetilde{\text{ prp }}$ adversary ${\mathcal {B}}$ which runs an arbitrary adversary ${\mathcal {A}}$ who attacks TKR1. ${\mathcal {B}}$, being a $\widetilde{\text{ prp }}$ adversary, has access to an oracle ${\mathcal {O}}(.,.)$, which is either the real tweakable permutation $\mathsf{FP}_k(.,.)$ for a randomly chosen key k, or a permutation chosen uniformly at random from the set of all tweak-indexed permutations from ${\mathcal {T}}$ to ${\mathcal {T}}$. ${\mathcal {B}}$, with its oracle, provides the environment to ${\mathcal {A}}$ and simulates the experiment EXP-IND-TKR$^{\mathcal {A}}_\mathsf{TKR1}$ as shown in Fig. 9.

We assume without loss of generality that ${\mathcal {A}}$ does not repeat queries, as ${\mathcal {A}}$ knows that $\mathsf{TKR1}$ is a deterministic scheme; hence, it does not gain anything by repeating a query.

It is easy to see that if the oracle ${\mathcal {O}}(.,.)$ of ${\mathcal {B}}$ is $\mathsf{FP}_k(.,.)$, then ${\mathcal {B}}^{\mathcal {O}}$ provides the perfect environment for ${\mathcal {A}}$ as in EXP-IND-TKR$^{\mathcal {A}}_\mathsf{TKR1}$. Hence,

$$\begin{aligned}&\Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}}^{\mathsf{FP}_k(.,.)} \Rightarrow 1] \nonumber \\&\quad = \Pr [\text{ EXP-IND-TKR }^{\mathcal {A}}_\mathsf{TKR1} \Rightarrow 1]. \end{aligned}$$

(3)

Also,

$$\begin{aligned} \Pr [\pi \mathop {\leftarrow }\limits ^{\$}\mathsf{Perm}^{\mathcal {D}}({\mathcal {T}}): {\mathcal {B}}^{\pi (.,.)} \Rightarrow 1] \le \frac{1}{2}, \end{aligned}$$

(4)

as, when ${\mathcal {O}}(.,.)$ is a uniform random tweakable permutation on ${\mathcal {T}}$, for each of its queries ${\mathcal {A}}$ gets uniform random elements in ${\mathcal {T}}$, thus $b'$ which ${\mathcal {A}}$ outputs is independent of b which is selected by ${\mathcal {B}}$.

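Hence, from Eqs. (3) and (4), we have

$$\begin{aligned}&\Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}}^{\mathsf{FP}_k(.,.)} \Rightarrow 1] - \Pr [\pi \mathop {\leftarrow }\limits ^{\$}\mathsf{Perm}^{\mathcal {D}}({\mathcal {T}}): {\mathcal {B}}^{\pi (.,.)} \Rightarrow 1] \nonumber \\&\quad \ge \Pr [\text{ EXP-IND-TKR }^{\mathcal {A}}_\mathsf{TKR1} \Rightarrow 1] - \frac{1}{2}, \end{aligned}$$

and hence, as the left-hand side is the $\widetilde{\text{ prp }}$ advantage of ${\mathcal {B}}$ against $\mathsf{FP}$, the amount by which the success probability of ${\mathcal {A}}$ exceeds $\frac{1}{2}$ is bounded by that advantage,

as desired. $\square $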
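The reduction can be illustrated with a small simulation. The sketch below is not the procedure of Fig. 9; it assumes that TKR1 computes a token by calling the tweakable permutation with the associated data as the tweak, and the class names, the toy adversary, and the domain size are all hypothetical. It instantiates the oracle with a lazily sampled random tweakable permutation, the second case considered in the proof.

```python
import random


class RandomTweakablePermutation:
    """Lazily sampled uniform permutation of range(size), one per tweak."""

    def __init__(self, size, rng):
        self.size = size
        self.rng = rng
        self.tables = {}  # tweak -> {input: output}

    def __call__(self, tweak, x):
        table = self.tables.setdefault(tweak, {})
        if x not in table:
            used = set(table.values())
            y = self.rng.randrange(self.size)
            while y in used:  # resample until the output is unused for this tweak
                y = self.rng.randrange(self.size)
            table[x] = y
        return table[x]


class ToyAdversary:
    """Hypothetical adversary A: three queries, then a guess from the token's parity."""

    def choose(self, tokenize):
        for x in range(3):
            tokenize(x, "merchant-A")
        return (10, "merchant-A"), (11, "merchant-A")  # fresh challenge pair

    def guess(self, token):
        return token % 2


def reduction_b(oracle, adversary, rng):
    """Adversary B: simulate the game for A and output 1 iff A guesses b."""

    def tokenize(x, d):
        return oracle(d, x)  # assumed shape of TKR1: associated data as the tweak

    challenge_pair = adversary.choose(tokenize)
    b = rng.randrange(2)
    x_b, d_b = challenge_pair[b]
    b_guess = adversary.guess(tokenize(x_b, d_b))
    return 1 if b_guess == b else 0


rng = random.Random(1)
trials = 2000
wins = sum(
    reduction_b(RandomTweakablePermutation(10**4, rng), ToyAdversary(), rng)
    for _ in range(trials)
)
print(f"B outputs 1 in {wins / trials:.3f} of the runs")
```

Against the random oracle the challenge token carries no information about b, so the observed frequency should stay close to one half, in line with Eq. (4).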

1.2 Proof of Proposition 2

To prove this proposition, we construct a PRF adversary $\mathcal {B}$ (shown in Fig. 10) which runs an arbitrary adversary $\mathcal {A}$ who attacks the encryption scheme $\mathbf E1$ in the DET-CPA sense. ${\mathcal {B}}$, being a PRF adversary, has access to an oracle ${\mathcal {O}}$ which can be either the block cipher $E_k$ or a function $\rho $, chosen uniformly at random from $\text{ Func }(n)$.

We can easily see that if the oracle of $\mathcal {B}$ is the block cipher $E_k$ then

$$\begin{aligned} \Pr [k \mathop {\leftarrow }\limits ^{\$}{\mathbb K}:{\mathcal {B}}^{E_k(\cdot )} \Rightarrow 1] = \Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}:{\mathcal {A}}^{\mathbf{E1}(\cdot ,\cdot )} \Rightarrow 1]. \end{aligned}$$

(5)

As ${\mathcal {A}}$ never repeats a query, if the oracle of ${\mathcal {B}}$ is a random function $\rho $, then for each query ${\mathcal {A}}$ gets a uniform random n-bit string as a response. Thus,

$$\begin{aligned} \Pr [\rho \mathop {\leftarrow }\limits ^{\$}\mathsf{Func}(n): {\mathcal {B}}^{\rho (\cdot )} \Rightarrow 1] = \Pr [ {\mathcal {A}}^{\$ (\cdot ,\cdot )} \Rightarrow 1] \end{aligned}$$

(6)

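Thus, from Eqs. (5) and (6), and the definition of the DET-CPA advantage of ${\mathcal {A}}$ and the PRF advantage of ${\mathcal {B}}$, we obtain

$$\begin{aligned}&\Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}:{\mathcal {A}}^{\mathbf{E1}(\cdot ,\cdot )} \Rightarrow 1] - \Pr [ {\mathcal {A}}^{\$ (\cdot ,\cdot )} \Rightarrow 1] \nonumber \\&\quad = \Pr [k \mathop {\leftarrow }\limits ^{\$}{\mathbb K}:{\mathcal {B}}^{E_k(\cdot )} \Rightarrow 1] - \Pr [\rho \mathop {\leftarrow }\limits ^{\$}\mathsf{Func}(n): {\mathcal {B}}^{\rho (\cdot )} \Rightarrow 1], \end{aligned}$$

that is, the DET-CPA advantage of ${\mathcal {A}}$ against $\mathbf{E1}$ equals the PRF advantage of ${\mathcal {B}}$ against $E$. $\square $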

1.3 Proof of Proposition 3

As in the proof of Proposition 2, we construct a PRF adversary $\mathcal {B}$ (shown in Fig. 11) which runs an arbitrary adversary $\mathcal {A}$ who attacks the encryption scheme $\mathbf E2$. Adversary $\mathcal {B}$ has access to an oracle $\mathcal {O}$, which can be either a secure block cipher $E_k$ or a random function $\rho $, chosen uniformly at random from $\text{ Func }(n)$.

We can easily see that if the oracle of $\mathcal {B}$ is the block cipher $E_k$ then

$$\begin{aligned} \Pr [k \mathop {\leftarrow }\limits ^{\$}{\mathbb K}:{\mathcal {B}}^{E_k(\cdot )} \Rightarrow 1] = \Pr [k \mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}:{\mathcal {A}}^{\mathbf{E2}(\cdot ,\cdot )} \Rightarrow 1] \end{aligned}$$

(7)

To analyze the situation when the oracle of ${\mathcal {B}}$ is a random function, we consider the game G0 shown in Fig. 12. The game $\mathbf{G0}$ describes a function $\mathbf{Choose}\text{- }\rho ()$, which acts as a random function. It returns uniform random strings in $\{0,1\}^n$ when it is invoked, but it returns the same string if invoked twice on the same input. It does this by maintaining a table $\rho $ of outputs that it has already returned. Additionally in the set $\mathsf{Dom}$, it maintains the points on which it has been queried. The function sets the bad flag to true if it is queried twice on the same input.
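The behavior described for $\mathbf{Choose}\text{- }\rho $ is the usual lazy sampling of a random function. A minimal Python sketch follows; Fig. 12 is not reproduced here, so the class name and the `boxed` switch are illustrative assumptions, with `boxed=True` playing the role of the boxed entry that repeats an earlier answer.

```python
import secrets


class ChooseRho:
    """Lazily sampled function on n-bit outputs with a 'bad' flag for repeated inputs."""

    def __init__(self, n_bits, boxed=True):
        self.n_bits = n_bits
        self.boxed = boxed   # True: answer repeated inputs consistently
        self.rho = {}        # table of outputs already returned
        self.dom = set()     # Dom: the points queried so far
        self.bad = False

    def __call__(self, x):
        y = secrets.randbits(self.n_bits)  # fresh uniform n-bit string
        if x in self.dom:
            self.bad = True                # queried twice on the same input
            if self.boxed:
                y = self.rho[x]            # boxed step: return the earlier answer
        self.rho[x] = y
        self.dom.add(x)
        return y


g0 = ChooseRho(n_bits=128)
first = g0(5)
second = g0(5)
print(first == second, g0.bad)
```

With `boxed=False` the same object returns a fresh string on every call while still setting the flag, so the two variants behave identically until the flag is set.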

As $\mathbf{Choose}\text{- }\rho $ acts like a random function, it is immediate that

$$\begin{aligned} \Pr [\rho \mathop {\leftarrow }\limits ^{\$}\mathsf{Func}(n):{\mathcal {B}}^{\rho (\cdot )} \Rightarrow 1] = \Pr [{\mathcal {A}}^{G0} \Rightarrow 1] \end{aligned}$$

(8)

Now, we make a small change in game $\mathbf{G0}$: we remove the boxed entry in the function $\mathbf{Choose}\text{- }\rho $, and we call the changed game $\mathbf{G1}$. Notice that games $\mathbf{G1}$ and $\mathbf{G0}$ are identical until the flag bad is set to true; hence, we have

$$\begin{aligned} \Pr [{\mathcal {A}}^{G0} \Rightarrow 1] \!- \!\Pr [{\mathcal {A}}^{G1} \Rightarrow 1] \!\le \!\Pr [{\mathcal {A}}^{G1} \text{ sets } \text{ bad }] \end{aligned}$$

(9)

Also, in game $\mathbf{G1}$, the function $\mathbf{Choose}\text{- }\rho $ returns random strings for any input it gets; thus, when ${\mathcal {A}}$ interacts with $\mathbf{G1}$, it gets random strings in $\{0,1\}^n$ in response to its queries. Hence,

$$\begin{aligned} \Pr [ {\mathcal {A}}^{\$ (\cdot ,\cdot )} \Rightarrow 1]=\Pr [{\mathcal {A}}^{G1} \Rightarrow 1]. \end{aligned}$$

(10)

Now, we make some small syntactic changes in the game $\mathbf{G1}$ to obtain the game $\mathbf{G2}$, also shown in Fig. 12. Game $\mathbf{G2}$ is only syntactically different from $\mathbf{G1}$. In $\mathbf{G2}$, random strings are returned immediately as a response to a query of ${\mathcal {A}}$, and later, in the finalization phase, the appropriate values are inserted in the multiset $\mathsf{Dom}$. Note that, as $\mathsf{Dom}$ is a multiset, several instances of the same element can be present in it.

As there is no way that ${\mathcal {A}}$ can distinguish between $\mathbf{G1}$ and $\mathbf{G2}$,

$$\begin{aligned} \Pr [{\mathcal {A}}^{G1} \Rightarrow 1]= \Pr [{\mathcal {A}}^{G2} \Rightarrow 1], \end{aligned}$$

(11)

also

$$\begin{aligned} \Pr [{\mathcal {A}}^{G1} \text{ sets } \text{ bad }]=\Pr [{\mathcal {A}}^{G2} \text{ sets } \text{ bad }]. \end{aligned}$$

(12)

Thus, using Eqs. (8), (9), (10), (11) and (12) we get

$$\begin{aligned}&\Pr [\rho \mathop {\leftarrow }\limits ^{\$}\mathsf{Func}(n):{\mathcal {B}}^{\rho (\cdot )} \Rightarrow 1]\nonumber \\&\quad = \Pr [{\mathcal {A}}^{G0} \Rightarrow 1]\nonumber \\&\quad \le \Pr [{\mathcal {A}}^{G1} \Rightarrow 1]+\Pr [{\mathcal {A}}^{G1} \text{ sets } \text{ bad }]\nonumber \\&\quad \le \Pr [{\mathcal {A}}^{G2} \Rightarrow 1]+\Pr [{\mathcal {A}}^{G2} \text{ sets } \text{ bad }]\nonumber \\&\quad \le \Pr [ {\mathcal {A}}^{\$ (\cdot ,\cdot )} \Rightarrow 1] + \Pr [{\mathcal {A}}^{G2} \text{ sets } \text{ bad }] \end{aligned}$$

(13)

Let $\mathsf{COLLD}$ be the event that there is a collision in the multiset $\mathsf{Dom}$ in game $\mathbf{G2}$, then from the description of game $\mathbf{G2}$, we have

$$\begin{aligned} \Pr [{\mathcal {A}}^{G2} \text{ sets } \text{ bad }] = \Pr [\mathsf{COLLD}] \end{aligned}$$

Now we concentrate on finding an upper bound for $\Pr [\mathsf{COLLD}]$. The elements present in $\mathsf{Dom}$ are d’s and $\lambda $’s. Let $\mathsf{Dom} = Q_d \cup Q_\lambda $, where $Q_d \subseteq \{ d^{(i)}:1 \le i \le q\}$, and $Q_\lambda =\{\lambda ^{(i)}= z^{(i)}\oplus \mu ^{(i)}| 1\le i \le q \}$.

Note that the way the game $\mathbf{G2}$ is designed, all elements in $Q_d$ are distinct; thus, there can be no collision among two elements in $Q_d$. Additionally we claim the following

Claim 1

For $1\le i,j\le q$, $i\ne j$, $\Pr [\lambda ^{(i)} = \lambda ^{(j)}]\le 1/2^n$.

Proof

We have two cases to consider:

Case 1 If $d^{(i)}= d^{(j)}$, then $x^{(i)} \ne x^{(j)}$, as ${\mathcal {A}}$ does not repeat any query. This makes $z^{(i)} \ne z^{(j)}$. According to the game $\mathbf{G2}$, if $ d^{(i)}= d^{(j)}$, then $\mu ^{(i)} = \mu ^{(j)}$. Thus we have $\lambda ^{(i)} \ne \lambda ^{(j)}$, making $\Pr [\lambda ^{(i)} = \lambda ^{(j)}] =0$.

Case 2 If $d^{(i)}\ne d^{(j)}$, then $\mu ^{(i)}$ and $\mu ^{(j)}$ are uniform and independent random elements in $\{0,1\}^n$, thus making

$$\begin{aligned} \Pr [\lambda ^{(i)} = \lambda ^{(j)}] = \Pr [z^{(i)}_1 \oplus \mu ^{(i)}= z_1^{(j)} \oplus \mu ^{(j)}]=\frac{1}{2^n}. \end{aligned}$$
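The probability in Case 2 can be confirmed by exhaustive enumeration for a small block size. The following sketch takes n = 4 (an arbitrary choice made so that the enumeration is fast) and counts, for every fixed pair of z values, the pairs of independent masks for which the two λ values coincide. The function and variable names are illustrative.

```python
from fractions import Fraction
from itertools import product


def collision_probability(n_bits, z_i, z_j):
    """Pr[z_i XOR mu_i == z_j XOR mu_j] over independent uniform n-bit mu_i, mu_j."""
    size = 1 << n_bits
    hits = sum(
        1
        for mu_i, mu_j in product(range(size), repeat=2)
        if z_i ^ mu_i == z_j ^ mu_j
    )
    return Fraction(hits, size * size)


n = 4
results = {
    collision_probability(n, z_i, z_j)
    for z_i, z_j in product(range(1 << n), repeat=2)
}
print(results)
```

For each choice of the first mask exactly one value of the second mask produces a collision, which is why the set contains the single value $1/2^n$ regardless of the z values.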

Claim 2

For any $d\in Q_d$ and any $\lambda \in Q_\lambda $, $\Pr [\lambda = d]\le 1/2^n $.

Proof

Any $\lambda \in Q_\lambda $ is a uniform random string in $\{0,1\}^n$, and is independent of any $d\in Q_d$.

Now, as $\#Q_d \le q$ and $\#Q_\lambda =q$, using Claims 1, 2 and the union bound, we have

$$\begin{aligned} \Pr [\mathsf{COLLD}] \le \frac{1}{2^n}{q\atopwithdelims ()2} + \frac{q^2}{2^n} < \frac{2q^2}{2^n}. \end{aligned}$$
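The two terms of this bound and the final simplification can be checked with exact rational arithmetic. In the sketch below, the block size n = 128 and the sample values of q are assumptions chosen for illustration only.

```python
from fractions import Fraction
from math import comb


def colld_bound(q, n_bits):
    """Union bound from Claims 1 and 2 on a collision in Dom."""
    size = 2 ** n_bits
    lambda_pairs = Fraction(comb(q, 2), size)  # Claim 1 over all pairs in Q_lambda
    cross_pairs = Fraction(q * q, size)        # Claim 2 over at most q*q pairs (d, lambda)
    return lambda_pairs + cross_pairs


def simplified_bound(q, n_bits):
    """The simplified bound 2q^2 / 2^n."""
    return Fraction(2 * q * q, 2 ** n_bits)


for q in (1, 10, 2 ** 20, 2 ** 30):
    assert colld_bound(q, 128) < simplified_bound(q, 128)

print(float(colld_bound(2 ** 30, 128)))
```

The strict inequality holds for every q of at least 1 because the sum of the two numerators is $(3q^2-q)/2$, which is smaller than $2q^2$.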

Now, using the definition of DET-CPA advantage of ${\mathcal {A}}$ and Eqs. (7) and (13), we have the proposition. $\square $

1.4 Proof of Theorem 3

Note that the token generation algorithm is the same for both TKR2 and TKR2a; the only difference between the two procedures is the structure and content of the card-vault. Hence, the proof of security in the IND-TKR sense is the same for both TKR2 and TKR2a, as in the case of IND-TKR security the adversary does not have access to the contents of the card-vault.

The structure of the proof is the same as that of the proof of Theorem 1. We assume an arbitrary adversary ${\mathcal {A}}$ which attacks $\mathsf{TKR2}$ in the IND-TKR sense, and we construct an RND adversary ${\mathcal {B}}$ which attacks $\mathsf{RN}^{\mathcal {T}}[k]$ using ${\mathcal {A}}$.

${\mathcal {B}}$ has an oracle ${\mathcal {O}}$, which is either $\mathsf{RN}^{\mathcal {T}}[k]$ for a random key, or $\$^{\mathcal {T}}()$, which on each invocation returns a random element in ${\mathcal {T}}$.

${\mathcal {B}}$ responds to the queries of ${\mathcal {A}}$ as follows. First, ${\mathcal {B}}$ initializes an empty card-vault and then performs the query phase, which is in fact the procedure $\mathsf{TKR2}_k$ in Fig. 5, except that whenever a call to $\mathsf{RN}^{\mathcal {T}}[k]()$ is required, it is replaced by a call to the oracle ${\mathcal {O}}$. After ${\mathcal {A}}$ stops querying and outputs the challenge pair $(m_0,d_0),(m_1,d_1)$, ${\mathcal {B}}$ selects a bit b uniformly at random from $\{0,1\}$ and provides ${\mathcal {A}}$ with t computed by following $\mathsf{TKR2}_k()$ (with the call to $\mathsf{RN}^{\mathcal {T}}[k]()$ replaced by a call to ${\mathcal {O}}$). Finally, ${\mathcal {A}}$ outputs a bit $b'$; if $b=b'$, then ${\mathcal {B}}$ outputs 1, otherwise it outputs 0. Note that the challenge pair $(m_0,d_0),(m_1,d_1)$ is different from any previous query of ${\mathcal {A}}$.
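The way ${\mathcal {B}}$ answers queries can be sketched as a vault-backed tokenizer whose source of tokens is supplied by the caller. Fig. 5 is not reproduced here, so the lookup and retry logic below is an assumption about the general shape of such a procedure, and the class name, field names, and sample card number are hypothetical.

```python
import random


class VaultTokenizer:
    """Tokenizer with a card-vault; tokens come from a caller-supplied source."""

    def __init__(self, draw_token):
        self.draw_token = draw_token  # stands in for the generator or for the oracle O
        self.vault = {}               # (x, d) -> token
        self.issued = set()           # (token, d) pairs already in use

    def tokenize(self, x, d):
        if (x, d) in self.vault:      # repeated input: return the stored token
            return self.vault[(x, d)]
        t = self.draw_token()
        while (t, d) in self.issued:  # redraw until the token is new for this d
            t = self.draw_token()
        self.vault[(x, d)] = t
        self.issued.add((t, d))
        return t


rng = random.Random(7)
tokenizer = VaultTokenizer(lambda: rng.randrange(10 ** 16))
t1 = tokenizer.tokenize("4111111111111111", "merchant-A")
t2 = tokenizer.tokenize("4111111111111111", "merchant-A")
print(t1 == t2)
```

Passing the token source as a parameter mirrors the reduction: the same code runs whether the source is the keyed generator or a procedure returning uniform elements of the token space.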

From the above description, it is clear that if the oracle ${\mathcal {O}}$ of ${\mathcal {B}}$ is $\mathsf{RN}^{\mathcal {T}}[k]()$, then ${\mathcal {B}}$ is performing experiment EXP-IND-TKR$^{\mathcal {A}}_\mathsf{TKR2}$. Hence,

$$\begin{aligned} \Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}}^{\mathsf{RN}^{\mathcal {T}}[k]()} \!\Rightarrow \!1] \!=\! \Pr [\text{ EXP-IND-TKR }^{\mathcal {A}}_\mathsf{TKR2} \!\Rightarrow \!1].\nonumber \\ \end{aligned}$$

(14)

Otherwise, i.e., if the oracle ${\mathcal {O}}$ of ${\mathcal {B}}$ is $\$^{\mathcal {T}}()$, then

$$\begin{aligned} \Pr [ {\mathcal {B}}^{{\$^{\mathcal {T}}()}} \Rightarrow 1] \le \frac{1}{2}. \end{aligned}$$

(15)

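as in this case the output that ${\mathcal {B}}$ provides to ${\mathcal {A}}$ is independent of $(m_0,d_0),(m_1,d_1)$.

From Eqs. (14) and (15), we have

$$\begin{aligned}&\Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}}^{\mathsf{RN}^{\mathcal {T}}[k]()} \Rightarrow 1] - \Pr [ {\mathcal {B}}^{{\$^{\mathcal {T}}()}} \Rightarrow 1] \nonumber \\&\quad \ge \Pr [\text{ EXP-IND-TKR }^{\mathcal {A}}_\mathsf{TKR2} \Rightarrow 1] - \frac{1}{2}, \end{aligned}$$

and from the definition of the IND-TKR advantage of ${\mathcal {A}}$ it follows that the amount by which the success probability of ${\mathcal {A}}$ exceeds $\frac{1}{2}$ is bounded by the RND advantage of ${\mathcal {B}}$, which is the left-hand side above.

$\square $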

1.5 Proof of Theorem 4

For this proof, we use a sequence of games. The three games $\mathbf{EXP}_0^{\mathcal {A}}$, $\mathbf{EXP}_1^{\mathcal {A}}$ and $\mathbf{EXP}_2^{\mathcal {A}}$ are described in Fig. 13. Each game depicts the interaction of an IND-TKR-CV adversary with a tokenization procedure. In all three games, we assume that the adversary ${\mathcal {A}}$ does not repeat a query in the query phase, and that the queries presented in the challenge phase are distinct from the queries made in the query phase. Also, to keep the notation simple, we assume without loss of generality that the ciphertext space ${\mathbb C}$ of the encryption algorithm $\mathbf{E}$ contains strings of length s. The proof can be made to work without this restriction. We briefly describe the three games next:

1. In game $\mathbf{EXP}_0^{\mathcal {A}}$, ${\mathcal {A}}$ interacts with $\mathsf{TKR2a}$, instantiated by $\mathsf{RN}^{\mathcal {T}}[k_2]()$ and $\mathbf{E}_{k_1}(\cdot ,\cdot )$, where $k_1$ and $k_2$ are chosen uniformly at random from the respective key spaces ${\mathcal {K}}_1$ and ${\mathcal {K}}_2$. The game is designed with the assumption that ${\mathcal {A}}$ does not repeat a query.
2. Game $\mathbf{EXP}_1^{\mathcal {A}}$ is almost the same as the game $\mathbf{EXP}_0^{\mathcal {A}}$. The differences are as follows:
- Here the encryption scheme $\mathbf{E}_{k_1}(\cdot ,\cdot )$ is no longer used. Instead, each call to $\mathbf{E}_{k_1}(\cdot ,\cdot )$ is answered with a random string from ${\mathbb C}$. To preserve the behavior of $\mathbf{E}_{k_1}$, a set $\mathsf{Ran}_1$ is maintained to keep track of the values already returned as output, and it is ensured that the same value is not returned for two different inputs.
- In the game $\mathbf{EXP}_0^{\mathcal {A}}$, in lines 11 to 14 and 53 to 56, it is ensured that a distinct token t is returned for each distinct (x, d). This is done by a search in the card-vault (see lines 14 and 56), as the card-vault contains the encryption of the token t with associated data d. As a real encryption scheme is not used in the game $\mathbf{EXP}_1^{\mathcal {A}}$, this search is not possible. Hence, a set $\mathsf{Tok}$ is maintained which contains pairs of tokens and associated data (t, d), and the uniqueness of tokens is ensured using this set $\mathsf{Tok}$.
3. Game $\mathbf{EXP}_2^{\mathcal {A}}$ is obtained from game $\mathbf{EXP}_1^{\mathcal {A}}$ by replacing $\mathsf{RN}^{\mathcal {T}}[k_2]()$ by a procedure which on each invocation returns a random element in ${\mathcal {T}}$. This game also uses the sets $\mathsf{Ran}_1$ and $\mathsf{Tok}$ to ensure injectivity and the uniqueness of the tokens.
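The role of the set $\mathsf{Ran}_1$ in the second and third games can be sketched as follows. Fig. 13 is not reproduced here, so this is only an illustration of the idea of answering every encryption call with a random s-bit string that has not been returned before; the class and method names are hypothetical.

```python
import secrets


class InjectiveRandomResponder:
    """Answers each call with a uniform s-bit value not returned before."""

    def __init__(self, s_bits):
        self.s_bits = s_bits
        self.ran1 = set()  # Ran_1: values already returned

    def respond(self):
        c = secrets.randbits(self.s_bits)
        while c in self.ran1:  # resample so two inputs never share an output
            c = secrets.randbits(self.s_bits)
        self.ran1.add(c)
        return c


responder = InjectiveRandomResponder(s_bits=128)
ciphertexts = [responder.respond() for _ in range(1000)]
print(len(set(ciphertexts)) == len(ciphertexts))
```

Since the adversary never repeats a query, each call corresponds to a new input, so keeping the outputs distinct is enough to imitate the injectivity of the real encryption scheme.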

It is easy to see that $\mathbf{EXP}_0^{\mathcal {A}}$ is a restatement of the experiment $\text{ Exp-IND-TKR-CV }^{\mathcal {A}}$ in Fig. 2. Hence,

$$\begin{aligned} \Pr [\text{ Exp-IND-TKR-CV }^{\mathcal {A}} \Rightarrow 1] = \Pr [\mathbf{EXP}_0^{\mathcal {A}} \Rightarrow 1]. \end{aligned}$$

(16)

Also, we make the following claims:

Claim 3

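There exists a DET-CPA adversary ${\mathcal {B}}$ for $\mathbf{E}$ such that

$$\begin{aligned}&\Pr [\mathbf{EXP}_0^{\mathcal {A}} \Rightarrow 1] - \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 1] \nonumber \\&\quad \le \left( \Pr [k_1 \mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}_1: {\mathcal {B}}^{\mathbf{E}_{k_1}(\cdot ,\cdot )} \Rightarrow 1] - \Pr [{\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 1] \right) + \frac{(q+1)^2}{2^{s+1}}, \end{aligned}$$

where the bracketed difference is the DET-CPA advantage of ${\mathcal {B}}$ and q is the number of queries ${\mathcal {A}}$ makes in the query phase.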

Proof

To prove this claim, we construct a DET-CPA adversary ${\mathcal {B}}$ which has access to an oracle ${\mathcal {O}}$. This oracle is either the encryption scheme $\mathbf{E}_{k_1}$ for a random key $k_1$, or $\$(\cdot ,\cdot )$, which on input (x, d) returns a random string of length s. ${\mathcal {B}}$ has the objective of distinguishing between these two scenarios. ${\mathcal {B}}$ runs ${\mathcal {A}}$ in the following way. First, ${\mathcal {B}}$ initializes an empty card-vault, selects a random key $k_2$ from ${\mathcal {K}}_2$, and initializes a multiset $\mathsf{Dom}$ to empty. Then, it answers the queries of ${\mathcal {A}}$ according to the procedure TKR2a (shown in Fig. 6), except that whenever a call to the encryption scheme $\mathbf{E}_{k_1}$ is required, it is replaced by a call to the oracle ${\mathcal {O}}$. ${\mathcal {B}}$ also stores each output it gets from ${\mathcal {O}}$ in $\mathsf{Dom}$. Note that, as ${\mathcal {A}}$ does not repeat any query, all queries made by ${\mathcal {B}}$ to its oracle are distinct. After ${\mathcal {A}}$ stops querying and outputs a challenge pair $(x_0, d_0), (x_1,d_1)$, ${\mathcal {B}}$ selects a bit b uniformly at random from $\{0,1 \}$ and provides ${\mathcal {A}}$ with the pair (t, c). To respond to ${\mathcal {A}}$’s challenge, ${\mathcal {B}}$ makes another call to ${\mathcal {O}}$, and the output of this call is also inserted in $\mathsf{Dom}$. Finally, ${\mathcal {A}}$ outputs a bit $b'$. Now, ${\mathcal {B}}$ checks if there is a collision in $\mathsf{Dom}$, i.e., if ${\mathcal {O}}$ ever returned the same value for two distinct queries. If there is a collision in $\mathsf{Dom}$, then ${\mathcal {B}}$ outputs 0. If there is no collision in $\mathsf{Dom}$ and $b=b'$, then ${\mathcal {B}}$ outputs 1; otherwise, it outputs 0.

From the description above, we can easily see that if the oracle of $\mathcal {B}$ is the encryption scheme $\mathbf{E}_{k_1}(\cdot ,\cdot )$, then there is never a collision in $\mathsf{Dom}$, as $\mathbf{E}_{k_1}(\cdot ,\cdot )$ is injective, and in this scenario $\mathcal {B}$ provides the exact environment of the game $\mathbf{EXP}_0^{\mathcal {A}}$, i.e.,

$$\begin{aligned} \Pr [k_1 \mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}_1: {\mathcal {B}}^{\mathbf{E}_{k_1}(\cdot ,\cdot )} \Rightarrow 1] = \Pr [\mathbf{EXP}_0^{\mathcal {A}} \Rightarrow 1]. \end{aligned}$$

(17)

On the other hand, if the oracle of ${\mathcal {B}}$ is $\$(\cdot ,\cdot )$, then $\mathcal {B}$ provides the environment of $\mathbf{EXP}_1^{\mathcal {A}}$, given that there is no collision in $\mathsf{Dom}$. Let $\mathsf{COLL}$ be the event that there is a collision in $\mathsf{Dom}$; then we have

$$\begin{aligned}&\Pr [{\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 0 ] \\&\quad = \Pr [({\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 0 )\wedge (\mathsf{COLL} \vee \overline{\mathsf{COLL}})] \\&\quad = \Pr [({\mathcal {B}}^{\$(\cdot ,\cdot )} \!\Rightarrow \!0) \wedge \mathsf{COLL}] \!+\! \Pr [({\mathcal {B}}^{\$(\cdot ,\cdot )} \!\Rightarrow \!0) \!\wedge \!\overline{\mathsf{COLL}}] \\&\quad = \Pr [({\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 0)| \mathsf{COLL}]\Pr [\mathsf{COLL}] \\&\qquad + \Pr [({\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 0) | \overline{\mathsf{COLL}}]\Pr [\overline{\mathsf{COLL}}]\\&\quad \ge \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 0] ( 1- \Pr [\mathsf{COLL}]). \end{aligned}$$

Thus,

$$\begin{aligned} \Pr [{\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 1 ]\le & {} \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 1]\nonumber \\&+ \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 0] \Pr [\mathsf{COLL}]\nonumber \\\le & {} \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 1] + \Pr [\mathsf{COLL}] \end{aligned}$$

(18)

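Now from Eqs. (17) and (18), and the definition of the DET-CPA advantage of ${\mathcal {B}}$, we have

$$\begin{aligned}&\Pr [\mathbf{EXP}_0^{\mathcal {A}} \Rightarrow 1] - \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 1] \nonumber \\&\quad \le \left( \Pr [k_1 \mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}_1: {\mathcal {B}}^{\mathbf{E}_{k_1}(\cdot ,\cdot )} \Rightarrow 1] - \Pr [{\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 1] \right) + \Pr [\mathsf{COLL}], \end{aligned}$$

where the bracketed difference is the DET-CPA advantage of ${\mathcal {B}}$.

As ${\mathcal {A}}$ asks q queries in the query phase, $\mathsf{Dom}$ has $q+1$ elements in it; each element is a uniform random element in ${\mathbb C}$, and each element in ${\mathbb C}$ is s bits long. Hence, by the union bound over all pairs of elements,

$$\begin{aligned} \Pr [\mathsf{COLL}] \le {q+1 \atopwithdelims ()2}\frac{1}{2^s} \le \frac{(q+1)^2}{2^{s+1}}. \end{aligned}$$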
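As a sanity check, the exact probability that $q+1$ independent uniform s-bit strings contain a repeated value can be compared with the pair count ${q+1 \atopwithdelims ()2}/2^s$, which the union bound gives as an upper bound on it, and with the simplified expression. The sketch below uses exact rational arithmetic; the small values of s and q are chosen only so that the exact product is cheap to compute, and the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def exact_collision_probability(m, s_bits):
    """Probability that m independent uniform s-bit strings are not all distinct."""
    size = 2 ** s_bits
    all_distinct = Fraction(1)
    for i in range(m):
        all_distinct *= Fraction(size - i, size)
    return 1 - all_distinct


def pair_bound(m, s_bits):
    """Union bound over all pairs: C(m, 2) / 2^s."""
    return Fraction(comb(m, 2), 2 ** s_bits)


def simplified_bound(m, s_bits):
    """The simplified expression m^2 / 2^(s+1)."""
    return Fraction(m * m, 2 ** (s_bits + 1))


for s in (8, 16):
    for q in (0, 1, 5, 50):
        m = q + 1  # q query-phase outputs plus one challenge output
        assert exact_collision_probability(m, s) <= pair_bound(m, s) <= simplified_bound(m, s)

print(float(exact_collision_probability(51, 16)), float(pair_bound(51, 16)))
```

For the parameter ranges relevant to the proof the pair count is a very tight estimate of the exact value, so little is lost by working with it.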

This completes the proof of the claim. $\square $

Claim 4

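There exists an RND adversary ${\mathcal {B}}'$ such that

$$\begin{aligned}&\Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 1] - \Pr [\mathbf{EXP}_2^{\mathcal {A}} \Rightarrow 1] \nonumber \\&\quad \le \Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}'}^{\mathsf{RN}^{\mathcal {T}}[k]()} \Rightarrow 1] - \Pr [ {\mathcal {B}'}^{{\$^{\mathcal {T}}()}} \Rightarrow 1], \end{aligned}$$

where the right-hand side is the RND advantage of ${\mathcal {B}}'$.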

Proof

The proof of this claim is an easy reduction. Again we have an adversary $\mathcal {A}$ attacking TKR2a, and we must construct an RND adversary ${\mathcal {B}}'$ which runs ${\mathcal {A}}$. $\mathcal {B}'$ has access to an oracle $\mathcal {O}$, which is either $\mathsf{RN}^{\mathcal {T}}[k_2]()$ or $\$^{\mathcal {T}}$, which on each invocation returns a random element in ${\mathcal {T}}$. As in Claim 3, adversary $\mathcal {B}'$ performs an initialization and a query phase, but now whenever a call to $\mathsf{RN}^{\mathcal {T}}[k]()$ is required, it is substituted by a call to the oracle $\mathcal {O}$. Now we can see that

$$\begin{aligned} \Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}'}^{\mathsf{RN}^{\mathcal {T}}[k]()} \Rightarrow 1] = \Pr [\mathbf{EXP}_1^{\mathcal {A}} \Rightarrow 1] \end{aligned}$$

(19)

in the case that the oracle of ${\mathcal {B}}'$ is $\mathsf{RN}^{\mathcal {T}}[k]()$; otherwise, i.e., if $\mathcal {O}$ is $\$^{\mathcal {T}}$, then

$$\begin{aligned} \Pr [ {\mathcal {B}'}^{{\$^{\mathcal {T}}()}} \Rightarrow 1] \le \Pr [\mathbf{EXP}_2^{\mathcal {A}} \Rightarrow 1] \end{aligned}$$

(20)

Again from Eqs. (19) and (20), the claim follows. $\square $

Claim 5

For any arbitrary adversary ${\mathcal {A}}$

$$\begin{aligned} \Pr [\mathbf{EXP}_2^{\mathcal {A}} \Rightarrow 1] = \frac{1}{2} \end{aligned}$$

Proof

In game $\mathbf{EXP}_2^{\mathcal {A}}$, in the query phase ${\mathcal {A}}$ receives q tuples (t, c), where t and c are distinct random elements in ${\mathcal {T}}$ and ${\mathbb C}$, respectively. Finally, in the challenge phase it receives (t, c), which is independent of $(x_0,d_0), (x_1,d_1)$. Hence, ${\mathcal {A}}$ cannot guess the bit b with probability more than $\frac{1}{2}$.

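Thus, from Claims 3 and 4,

$$\begin{aligned}&\Pr [\mathbf{EXP}_0^{\mathcal {A}} \Rightarrow 1] - \Pr [\mathbf{EXP}_2^{\mathcal {A}} \Rightarrow 1] \nonumber \\&\quad \le \left( \Pr [k_1 \mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}_1: {\mathcal {B}}^{\mathbf{E}_{k_1}(\cdot ,\cdot )} \Rightarrow 1] - \Pr [{\mathcal {B}}^{\$(\cdot ,\cdot )} \Rightarrow 1] \right) \nonumber \\&\qquad + \left( \Pr [k\mathop {\leftarrow }\limits ^{\$}{\mathcal {K}}: {\mathcal {B}'}^{\mathsf{RN}^{\mathcal {T}}[k]()} \Rightarrow 1] - \Pr [ {\mathcal {B}'}^{{\$^{\mathcal {T}}()}} \Rightarrow 1] \right) + \frac{(q+1)^2}{2^{s+1}}. \end{aligned}$$

(21)

Using Eq. (16) and Claim 5,

$$\begin{aligned} \Pr [\text{ Exp-IND-TKR-CV }^{\mathcal {A}} \Rightarrow 1] - \frac{1}{2} = \Pr [\mathbf{EXP}_0^{\mathcal {A}} \Rightarrow 1] - \Pr [\mathbf{EXP}_2^{\mathcal {A}} \Rightarrow 1]. \end{aligned}$$

(22)

Finally, combining Eqs. (21) and (22), we have that the amount by which the success probability of ${\mathcal {A}}$ in $\text{ Exp-IND-TKR-CV }^{\mathcal {A}}$ exceeds $\frac{1}{2}$ is at most the sum of the DET-CPA advantage of ${\mathcal {B}}$, the RND advantage of ${\mathcal {B}}'$, and $\frac{(q+1)^2}{2^{s+1}}$,

as desired. $\square $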

About this article

Cite this article

Díaz-Santiago, S., Rodríguez-Henríquez, L.M. & Chakraborty, D. A cryptographic study of tokenization systems. Int. J. Inf. Secur. 15, 413–432 (2016). https://doi.org/10.1007/s10207-015-0313-x

Issue Date: August 2016
DOI: https://doi.org/10.1007/s10207-015-0313-x
