Zero Knowledge Proofs Towards Verifiable Decentralized AI Pipelines

Part of the Lecture Notes in Computer Science book series (LNCS, volume 13411)


We are witnessing the emergence of decentralized AI pipelines wherein different organisations are involved in the different steps of the pipeline. In this paper, we introduce a comprehensive framework for verifiable provenance for decentralized AI pipelines with support for confidentiality concerns of the owners of data and model assets. Although some of the past works address different aspects of provenance, verifiability, and confidentiality, none of them addresses all the aspects under one uniform framework. We present an efficient and scalable approach for verifiable provenance for decentralized AI pipelines with support for confidentiality based on zero-knowledge proofs (ZKPs). Our work is of independent interest to the fields of verifiable computation (VC) and verifiable model inference. We present methods for basic computation primitives like read-only memory access and operations on datasets that are an order of magnitude better than the state of the art. In the case of verifiable model inference, we again improve the state of the art for decision tree inference by an order of magnitude. We present an extensive experimental evaluation of our system.

  • DOI: 10.1007/978-3-031-18283-9_12

  1. Provenance of the model training step is not considered in this paper.

  2. This introduces no ambiguity if 0 is legitimately part of the vector, as s specifies the content of the vector.


  1. Albrecht, M., Grassi, L., Rechberger, C., Roy, A., Tiessen, T.: MiMC: efficient encryption and cryptographic hashing with minimal multiplicative complexity. In: Cheon, J.H., Takagi, T. (eds.) ASIACRYPT 2016. LNCS, vol. 10031, pp. 191–219. Springer, Heidelberg (2016).

  2. Ames, S., Hazay, C., Ishai, Y., Venkitasubramaniam, M.: Ligero: lightweight sublinear arguments without a trusted setup. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 2087–2104 (2017)

  3. Ben-Sasson, E., Chiesa, A., Genkin, D., Tromer, E., Virza, M.: SNARKs for C: verifying program executions succinctly and in zero knowledge. In: Canetti, R., Garay, J.A. (eds.) CRYPTO 2013. LNCS, vol. 8043, pp. 90–108. Springer, Heidelberg (2013).

  4. Ben-Sasson, E., Chiesa, A., Riabzev, M., Spooner, N., Virza, M., Ward, N.P.: Aurora: transparent succinct arguments for R1CS. In: Ishai, Y., Rijmen, V. (eds.) EUROCRYPT 2019. LNCS, vol. 11476, pp. 103–128. Springer, Cham (2019).

  5. Ben-Sasson, E., Chiesa, A., Tromer, E., Virza, M.: Succinct non-interactive zero knowledge for a von Neumann architecture. In: Proceedings of the 23rd USENIX Security Symposium, pp. 781–796 (2014)

  6. Beneš, V.: Mathematical Theory of Connecting Networks and Telephone Traffic. Elsevier Science (1965)

  7. Bünz, B., Bootle, J., Boneh, D., Poelstra, A., Wuille, P., Maxwell, G.: Bulletproofs: short proofs for confidential transactions and more. In: Proceedings of the IEEE Symposium on Security and Privacy (SP), pp. 315–334 (2018)

  8. Campanelli, M., Fiore, D., Querol, A.: LegoSNARK: modular design and composition of succinct zero-knowledge proofs. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 2075–2092 (2019)

  9. Chiesa, A., Ojha, D., Spooner, N.: Fractal: post-quantum and transparent recursive proofs from holography. In: Canteaut, A., Ishai, Y. (eds.) EUROCRYPT 2020. LNCS, vol. 12105, pp. 769–793. Springer, Cham (2020).

  10. Eberhardt, J., Tai, S.: Zokrates - scalable privacy-preserving off-chain computations. In: Proceedings of the IEEE International Conference on Internet of Things (iThings), pp. 1084–1091 (2018)

  11. Feng, B., Qin, L., Zhang, Z., Ding, Y., Chu, S.: ZEN: efficient zero-knowledge proofs for neural networks. IACR Cryptol. ePrint Arch. 2021, 87 (2021)

  12. FFIEC: Home Mortgage Disclosure Act. Accessed 14 Sept 2021

  13. Gennaro, R., Gentry, C., Parno, B., Raykova, M.: Quadratic span programs and succinct NIZKs without PCPs. In: Johansson, T., Nguyen, P.Q. (eds.) EUROCRYPT 2013. LNCS, vol. 7881, pp. 626–645. Springer, Heidelberg (2013).

  14. Ghodsi, Z., Gu, T., Garg, S.: Safetynets: verifiable execution of deep neural networks on an untrusted cloud. In: Proceedings of the Annual Conference on Neural Information Processing Systems (NeurIPS), pp. 4672–4681 (2017)

  15. Kilbertus, N., Gascón, A., Kusner, M.J., Veale, M., Gummadi, K.P., Weller, A.: Blind justice: Fairness with encrypted sensitive attributes. In: Proceedings of the 35th International Conference on Machine Learning (ICML), pp. 2635–2644 (2018)

  16. Kosba, A.E., Papamanthou, C., Shi, E.: xjsnark: a framework for efficient verifiable computation. In: Proceedings of the IEEE Symposium on Security and Privacy (SP), pp. 944–961 (2018)

  17. SCIPR Lab: libsnark: a C++ library for zkSNARK proofs. Accessed 14 Sept 2021

  18. Lee, S., Ko, H., Kim, J., Oh, H.: vcnn: verifiable convolutional neural network. IACR Cryptol. ePrint Arch. 2020, 584 (2020)

  19. Lüthi, P., Gagnaux, T., Gygli, M.: Distributed ledger for provenance tracking of artificial intelligence assets. CoRR, abs/2002.11000 (2020)

  20. Parno, B., Howell, J., Gentry, C., Raykova, M.: Pinocchio: nearly practical verifiable computation. In: Proceedings of the IEEE Symposium on Security and Privacy (SP), pp. 238–252 (2013)

  21. Sarpatwar, K.K., et al.: Towards enabling trusted artificial intelligence via blockchain. In: Extended papers from the Second International Workshop on Policy-based Autonomic Data Governance, vol. 11550, pp. 137–153 (2018)

  22. Segal, S., Adi, Y., Pinkas, B., Baum, C., Ganesh, C., Keshet, J.: Fairness in the eyes of the data: certifying machine-learning models. In: Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society (AIES), pp. 926–935 (2021)

  23. Tramèr, F., Boneh, D.: Slalom: fast, verifiable and private execution of neural networks in trusted hardware. In: Proceedings of the 7th International Conference on Learning Representations (ICLR) (2019)

  24. Veeningen, M.: Pinocchio-based adaptive zk-SNARKs and secure/correct adaptive function evaluation. In: Joye, M., Nitaj, A. (eds.) AFRICACRYPT 2017. LNCS, vol. 10239, pp. 21–39. Springer, Cham (2017).

  25. Wahby, R.S., Setty, S.T.V., Ren, Z., Blumberg, A.J., Walfish, M.: Efficient RAM and control flow in verifiable outsourced computation. In: Proceedings of the 22nd Annual Network and Distributed System Security Symposium (NDSS) (2015)

  26. Waksman, A.: A permutation network. J. ACM 15(1), 159–163 (1968)

  27. Weng, C., Yang, K., Xie, X., Katz, J., Wang, X.: Mystique: efficient conversions for zero-knowledge proofs with applications to machine learning. In: 30th USENIX Security Symposium (USENIX Security 2021), pp. 501–518 (2021)

  28. Zhang, J., Fang, Z., Zhang, Y., Song, D.: Zero knowledge proofs for decision tree predictions and accuracy. In: Proceedings of the ACM SIGSAC Conference on Computer and Communications Security (CCS), pp. 2039–2053 (2020)

  29. Zhang, Y., Genkin, D., Katz, J., Papadopoulos, D., Papamanthou, C.: VSQL: verifying arbitrary SQL queries over dynamic outsourced databases. In: Proceedings of the IEEE Symposium on Security and Privacy (SP), pp. 863–880 (2017)

  30. Zhang, Y., Genkin, D., Katz, J., Papadopoulos, D., Papamanthou, C.: A zero-knowledge version of vsql. IACR Cryptol. ePrint Arch. 2017, 1146 (2017)

  31. Zhang, Y., Genkin, D., Katz, J., Papadopoulos, D., Papamanthou, C.: vram: Faster verifiable RAM with program-independent preprocessing. In: 2018 IEEE Symposium on Security and Privacy, SP 2018, Proceedings, San Francisco, California, USA, 21–23 May 2018, pp. 908–925 (2018)

Author information

Corresponding author

Correspondence to Nitin Singh.

A Preliminaries

We briefly summarise some key cryptographic notions that we use throughout the paper. For more details on the notions discussed below, we refer the reader to [8, Section 2].

A.1 Commitment Scheme

Definition 1

A commitment scheme \(\textsf{Com}=\) \((\textsf{Setup}, \textsf{Commit},\textsf{VerCommit})\) is a tuple of algorithms with message space \(\mathcal {D}\), commitment space \(\mathcal {C}\) and opening space \(\mathcal {O}\) which satisfies correctness, hiding and binding as described below:

  • \(\textsf{Setup}(1^\lambda )\rightarrow \textsf{ck}\) takes security parameter \(\lambda \) and outputs commitment key \(\textsf{ck}\).

  • \(\textsf{Commit}(\textsf{ck},u)\rightarrow (c,o)\) takes commitment key \(\textsf{ck}\) and \(u\in \mathcal {D}\) and outputs commitment \(c\in \mathcal {C}\) and opening \(o\in \mathcal {O}\).

  • \(\textsf{VerCommit}(\textsf{ck},c,u,o)\rightarrow b\) takes commitment key \(\textsf{ck}\), commitment c, message u and opening o and outputs \(b\in \{0,1\}\).

Correctness: A valid commitment always verifies correctly, i.e. for \(\textsf{ck}\leftarrow \textsf{Setup}(1^\lambda )\), \((c,o)\leftarrow \textsf{Commit}(\textsf{ck},u)\), with probability 1, we have \(\textsf{VerCommit}(\textsf{ck},c,u,o)=1\).

Binding: It is infeasible for a polynomial-time adversary to open the same commitment to two distinct messages.

Hiding: Commitments to any two messages are indistinguishable.
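To make the three algorithms concrete, the following is a minimal sketch of a hash-based commitment scheme. It is an illustration under stated assumptions, not the scheme used in the paper: the names `setup`, `commit` and `ver_commit` are hypothetical, hiding and binding hold only heuristically (modelling SHA-256 as a random oracle), and the construction is not homomorphic, which the protocols in Appendix C assume.

```python
import hashlib
import hmac
import secrets


def setup(security_bytes: int = 32) -> bytes:
    """Setup(1^lambda) -> ck. Here ck is just a random fixed-length key (assumption)."""
    return secrets.token_bytes(security_bytes)


def commit(ck: bytes, u: bytes) -> tuple[bytes, bytes]:
    """Commit(ck, u) -> (c, o). The opening o is fresh randomness hashed with u."""
    o = secrets.token_bytes(32)
    c = hashlib.sha256(ck + o + u).digest()
    return c, o


def ver_commit(ck: bytes, c: bytes, u: bytes, o: bytes) -> bool:
    """VerCommit(ck, c, u, o) -> b. Recompute the hash and compare in constant time."""
    return hmac.compare_digest(c, hashlib.sha256(ck + o + u).digest())
```

Correctness is immediate because `ver_commit` recomputes exactly what `commit` computed. Opening the same `c` to a different message would require a SHA-256 collision, which is the heuristic basis for binding here.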

A.2 Zero Knowledge Arguments

We define the notion of pre-processing zero-knowledge Succinct Non-interactive Arguments of Knowledge (zkSNARKs).

Definition 2

A zkSNARK for a family of \(\textsf{NP}\) relations \(\{\mathcal {R}_\lambda \}_{\lambda \in \mathbb {N}}\) is a tuple of algorithms \((\textsf{G},\textsf{P},\textsf{V})\) where:

  • \(\textsf{G}(1^\lambda ,R)\rightarrow (\textsf{pp},\textsf{td})\) takes security parameter and the relation \(R\in \mathcal {R}_\lambda \) and outputs public parameters \(\textsf{pp}=(\textsf{pk},\textsf{vk})\) and a trapdoor \(\textsf{td}\). In the above \(\textsf{pk}\) is called the evaluation key and \(\textsf{vk}\) is called the verification key.

  • \(\textsf{P}(\textsf{pk},\boldsymbol{x},\boldsymbol{w})\rightarrow \pi \) takes the evaluation key, public input vector \(\boldsymbol{x}\), witness vector \(\boldsymbol{w}\) and outputs a proof \(\pi \).

  • \(\textsf{V}(\textsf{vk}, \boldsymbol{x}, \pi )\rightarrow b\) takes the verification key, public input vector \(\boldsymbol{x}\), a proof \(\pi \) and outputs \(b=1\) (accept) or \(b=0\) (reject).

A zkSNARK \(\mathcal {S}=(\textsf{G},\textsf{P},\textsf{V})\) satisfies the following properties:

Completeness: For all \((R,\boldsymbol{x},\boldsymbol{w})\) such that \(R\in \mathcal {R}_\lambda \) and \(R(\boldsymbol{x},\boldsymbol{w})=1\), the following probability is 1.

$$\begin{aligned} \textrm{Pr}[\pi \leftarrow \textsf{P}(\textsf{pk},\boldsymbol{x},\boldsymbol{w}); \textsf{V}(\textsf{vk},\boldsymbol{x},\pi )=1] \end{aligned}$$

Knowledge Soundness: Let \(\mathcal{R}\mathcal{G}\) denote a relation generator and \(\mathcal {Z}\) denote a (benign) auxiliary input generator. Then the zkSNARK \(\mathcal {S}\) is called knowledge sound for \((\mathcal{R}\mathcal{G},\mathcal {Z})\) if for all efficient provers \(P^*\), there exists an extractor \(E^{P^*}\) such that the following probability is negligible:

$$\begin{aligned} \textrm{Pr}\left[ \begin{array}{c|c} (R,aux_R)\leftarrow \mathcal{R}\mathcal{G}, \textsf{pp}\leftarrow \textsf{G}(1^\lambda ,R) &{} \\ Z\leftarrow \mathcal {Z}(\textsf{pp},R,aux_R) &{} \textsf{V}(\textsf{pp},\boldsymbol{x},\pi ) \wedge \\ (\boldsymbol{x},\pi )\leftarrow P^*(R,aux_R,\textsf{pp},Z) &{} \lnot R(\boldsymbol{x},\boldsymbol{w}) \\ \boldsymbol{w}\leftarrow E^{P^*}(R,aux_R,\textsf{pp},Z) &{} \end{array} \right] \end{aligned}$$

Zero Knowledge: We say that \(\mathcal {S}\) satisfies zero-knowledge for relation generator \(\mathcal{R}\mathcal{G}\) if there exists simulator \(S=(S_1, S_2)\) such that the following hold:

  • Key Indistinguishability: For all efficient adversaries \(\mathcal {A}\) we have:

    $$\begin{aligned}&\textrm{Pr}\left[ \begin{array}{l|l} (R,aux_R)\leftarrow \mathcal{R}\mathcal{G}(1^\lambda ), \textsf{pp}\leftarrow G(1^\lambda ,R)&\mathcal {A}(R,aux_R,\textsf{pp})=1 \end{array} \right] \\&\approx \textrm{Pr}\left[ \begin{array}{l|l} (R,aux_R)\leftarrow \mathcal{R}\mathcal{G}(1^\lambda ), &{} \mathcal {A}(R, aux_R, \textsf{pp})=1 \\ (\textsf{pp},\textsf{td})\leftarrow S_1(R,aux_R) \end{array} \right] \end{aligned}$$
  • Proof Indistinguishability: For all efficient adversaries \(\mathcal {A}\) and all \(R\in \mathcal {R}_\lambda \), \((\boldsymbol{x},\boldsymbol{w})\) such that \(R(\boldsymbol{x},\boldsymbol{w})=1\) we have:

    $$\begin{aligned}&\textrm{Pr}\left[ \begin{array}{l|l} (R,aux_R)\leftarrow \mathcal{R}\mathcal{G}(1^\lambda ), &{} \\ \textsf{pp}\leftarrow \textsf{G}(1^\lambda ,R), &{} \mathcal {A}(\textsf{pp},aux_R,\pi )=1 \\ \pi \leftarrow \textsf{P}(\textsf{pp},\boldsymbol{x},\boldsymbol{w}) &{} \end{array}\right] \\&\approx \textrm{Pr}\left[ \begin{array}{l|l} (R,aux_R)\leftarrow \mathcal{R}\mathcal{G}(1^\lambda ), &{} \\ (\textsf{pp},\textsf{td})\leftarrow S_1(R,aux_R), &{} \mathcal {A}(\textsf{pp},aux_R,\pi )=1 \\ \pi \leftarrow S_2(\textsf{pp},\boldsymbol{x},\textsf{td}) &{} \end{array} \right] \end{aligned}$$

A.3 Commit and Prove SNARKs

Informally, a commit and prove SNARK (CP-SNARK) is a SNARK that can prove knowledge of a witness, where part of the witness opens a commitment c. In other words, a CP-SNARK for relation R allows one to prove knowledge of \(\boldsymbol{w}=(\boldsymbol{u},\boldsymbol{z})\) such that \(R(\boldsymbol{x},\boldsymbol{w})=1\) and c is a commitment for \(\boldsymbol{u}\). The commitments can be used in several proofs to prove composite statements. We summarise the formal notion of CP-SNARKs as defined in [8].

Definition 3 (CP-SNARK)

Let \(\textsf{Com}\) be a commitment scheme with input space \(\mathcal {D}\), opening space \(\mathcal {O}\) and commitment space \(\mathcal {C}\). Let \(\{R_\lambda \}_{\lambda \in \mathbb {N}}\) be a family of relations R over \(\mathcal {D}_x\times \mathcal {D}_u\times \mathcal {D}_w\) where \(\mathcal {D}_u\) splits as \(\mathcal {D}_1\times \cdots \times \mathcal {D}_\ell \) for some \(\ell \ge 1\) such that \(\mathcal {D}_i\subseteq \mathcal {D}\) for \(i=1,\ldots ,\ell \). A commit and prove zkSNARK (\(\textsf{CP}\)) for \(\textsf{Com}\) and \(\{R_\lambda \}_{\lambda \in \mathbb {N}}\) is a zkSNARK for family of relations \(\{R^{\textsf{Com}}_\lambda \}_{\lambda \in \mathbb {N}}\) where:

  • every \(\boldsymbol{R}\in R^{\textsf{Com}}\) is represented by \((\textsf{ck}, R)\) where \(\textsf{ck}\in \textsf{Setup}(1^\lambda )\) and \(R\in R_\lambda \).

  • \(\boldsymbol{R}\) is over the pairs \((\boldsymbol{x},\boldsymbol{w})\) where \(\boldsymbol{x}=(x, (c_j)_{j\in [\ell ]})\) \(\in \mathcal {D}_x\times \mathcal {C}^\ell \) is the statement and \(\boldsymbol{w}=((u_j)_{j\in [\ell ]}, (o_j)_{j\in [\ell ]}, \omega )\) \(\in \mathcal {D}_1\times \cdots \times \mathcal {D}_\ell \times \mathcal {O}^\ell \times \mathcal {D}_\omega \) is the witness. The relation \(\boldsymbol{R}\) holds iff:

    $$\begin{aligned} \bigwedge _{j\in [\ell ]} \textsf{VerCommit}(\textsf{ck}, c_j, u_j, o_j) = 1 \wedge R(x, (u_j)_{j\in [\ell ]}, \omega ) = 1 \end{aligned}$$

Further, we say that \(\textsf{CP}\) is knowledge sound for relation generator \(\mathcal{R}\mathcal{G}\) and auxiliary input generator \(\mathcal {Z}\) if it satisfies knowledge soundness for \((\mathcal{R}\mathcal{G}^{\textsf{Com}}, \mathcal {Z})\) where \(\mathcal{R}\mathcal{G}^{\textsf{Com}}\) denotes the relation generator which samples \((\textsf{ck}, R, aux)\) as \(\mathcal{R}\mathcal{G}(1^\lambda )\rightarrow (R,aux)\) and \(\textsf{Setup}(1^\lambda )\rightarrow \textsf{ck}\).

We elaborate slightly on the intuition behind the above definition. Typically a zkSNARK for relation \(R\subseteq \mathcal {D}_x\times \mathcal {D}_\omega \) proves knowledge of \(\boldsymbol{w}\in \mathcal {D}_\omega \) for a given statement \(\boldsymbol{x}\in \mathcal {D}_x\) such that \(R(\boldsymbol{x},\boldsymbol{w})=1\). With a CP-SNARK, we additionally wish to prove that part of the witness \(\boldsymbol{w}\) opens a commitment c, i.e. \(\boldsymbol{w}=(\boldsymbol{u},z)\) where c is a commitment for \(\boldsymbol{u}\). Generalizing this further, we can decompose the committed part of the witness \(\boldsymbol{u}\) into \(\ell \) slots, where witness corresponding to each slot opens a specified commitment.
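The relation \(\boldsymbol{R}\) of Definition 3 can be read as a plain predicate: every slot must open its commitment, and the underlying relation R must hold on the opened values. The sketch below evaluates that predicate in the clear, purely to illustrate its structure. It assumes a hypothetical hash-based `ver_commit` standing in for \(\textsf{VerCommit}\); the function name `cp_relation_holds` and the byte encoding of messages are likewise assumptions, and nothing here is succinct or zero-knowledge.

```python
import hashlib
from typing import Any, Callable, Sequence


def ver_commit(ck: bytes, c: bytes, u: bytes, o: bytes) -> bool:
    """Hypothetical stand-in for Com.VerCommit: a hash-based opening check."""
    return c == hashlib.sha256(ck + o + u).digest()


def cp_relation_holds(
    ck: bytes,
    x: Any,                      # public statement part x
    cs: Sequence[bytes],         # commitments c_1..c_l (public)
    us: Sequence[bytes],         # committed witness slots u_1..u_l
    os_: Sequence[bytes],        # openings o_1..o_l
    omega: Any,                  # free (uncommitted) witness
    R: Callable[[Any, Sequence[bytes], Any], bool],
) -> bool:
    """Evaluate: AND_j VerCommit(ck, c_j, u_j, o_j) = 1  and  R(x, (u_j)_j, omega) = 1."""
    if not (len(cs) == len(us) == len(os_)):
        return False
    openings_ok = all(ver_commit(ck, c, u, o) for c, u, o in zip(cs, us, os_))
    return openings_ok and bool(R(x, us, omega))
```

For instance, with R defined as "the committed integers sum to x", the predicate accepts only when each `u_j` genuinely opens `c_j` and the sum matches. Because the same commitments `c_j` can appear in several statements, separate proofs about the same committed data can be composed.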

B Security Analysis

We describe our protocols as interactive protocols with (semi-)honest verifiers. One can obtain non-interactive arguments of knowledge (SNARKs) in the random oracle model from them via the Fiat-Shamir heuristic. We first define a secure protocol for proving a relation R under commitments using the commitment scheme \(\textsf{Com}\). We will write a relation R as \(R(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})\) where \(\boldsymbol{x}\) denotes the public input (plain-text), \(\boldsymbol{u}\) denotes the committed witness while \(\boldsymbol{w}\) denotes the “free” (uncommitted) witness. The vector \(\boldsymbol{u}\) purportedly opens a public commitment c.

Definition 4 (Secure Protocol)

A secure protocol for a relation R and commitment scheme \(\textsf{Com}\) consists of a triple \(\varPi =(\mathcal {G},\mathcal {P},\mathcal {V})\) comprising a generator algorithm \(\mathcal {G}\), a \(\textsf{PPT}\) prover \(\mathcal {P}\) and a \(\textsf{PPT}\) verifier \(\mathcal {V}\), which work as follows:

  1. \(\mathcal {G}(\textsf{ck},R,1^\lambda )\longrightarrow \textsf{pp}\): Given a commitment key \(\textsf{ck}\leftarrow \textsf{Com}.\textsf{Setup}(1^\lambda )\) and R, \(\mathcal {G}\) outputs public parameters \(\textsf{pp}\).

  2. Given public parameters \(\textsf{pp}\) for relation R and a pair \((\boldsymbol{x},c)\) consisting of statement \(\boldsymbol{x}\) and a public commitment c, \(\mathcal {P}\) and \(\mathcal {V}\) interact via an alternating sequence of messages, at the end of which \(\mathcal {V}\) outputs \(0\, (\texttt{Reject})\) or \(1\, (\texttt{Accept})\).

Further, a secure protocol \(\varPi \) satisfies completeness, soundness and zero-knowledge which we define shortly.

Let \(\varPi (\textsf{pp},\boldsymbol{x},c;\boldsymbol{u},\boldsymbol{w},o)\) denote the output (0/1) of the interaction between \(\mathcal {P}\) and \(\mathcal {V}\) on common input \((\boldsymbol{x},c)\) and \(\mathcal {P}\)’s private inputs \(\boldsymbol{u},\boldsymbol{w},o\). Similarly, let \(\varPi .\textsf{Vw}(\textsf{pp},\boldsymbol{x},c; \boldsymbol{u},\boldsymbol{w},o)\) denote \(\mathcal {V}\)’s view in the interaction. We use \(\varPi _\mathcal {A}(\textsf{pp},\boldsymbol{x},c)\) to denote the output of the interaction between an adversarial prover \(\mathcal {A}\) and \(\mathcal {V}\) on common input \((\boldsymbol{x},c)\). Next, we define the security properties satisfied by a secure protocol \(\varPi \).

Completeness: We call \(\varPi \) complete if for all \(\textsf{ck}\in \textsf{Com}.\textsf{Setup}(1^\lambda )\) and \((\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})\in R\) we have:

$$\begin{aligned} \textrm{Pr}\left[ \textsf{pp}\leftarrow \mathcal {G}(\textsf{ck},R,1^\lambda ), c = \textsf{Com}.\textsf{Commit}(\textsf{ck}, \boldsymbol{u}, o), \varPi (\textsf{pp},\boldsymbol{x},c;\boldsymbol{u},\boldsymbol{w},o) = 1\right] = 1 \end{aligned}$$


Soundness: We say that \(\varPi \) is sound if for all \(\textsf{PPT}\) adversaries \(\mathcal {A}\), there exists an efficient extractor \(\mathcal {E}\) such that the following probability is negligible:

$$\begin{aligned} \textrm{Pr}\left[ \begin{array}{l} \textsf{ck}\leftarrow \textsf{Com}.\textsf{Setup}(1^\lambda ), \textsf{pp}\leftarrow \mathcal {G}(\textsf{ck},R,1^\lambda ), \\ (\boldsymbol{x},c)\leftarrow \mathcal {A}(\textsf{pp},z), (\boldsymbol{u},\boldsymbol{w},o)\leftarrow \mathcal {E}^\mathcal {A}(\textsf{pp},z) \end{array} \,\left| \, \begin{array}{c} \varPi _\mathcal {A}(\textsf{pp},\boldsymbol{x},c)=1 \\ \wedge \lnot \widetilde{R}(\boldsymbol{x},c,\boldsymbol{u},\boldsymbol{w},o) \end{array}\right. \right] \end{aligned}$$

Here \(\widetilde{R}(\boldsymbol{x},c,\boldsymbol{u},\boldsymbol{w},o)\equiv R(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})\wedge \textsf{Com}.\textsf{VerCommit}(\textsf{ck},c,\boldsymbol{u},o)\).

Zero Knowledge: We say that \(\varPi \) is zero-knowledge if there exists efficient simulator \(\mathcal {S}=(\mathcal {S}_1,\mathcal {S}_2)\) such that for all \(\textsf{ck}\in \textsf{Com}.\textsf{Setup}(1^\lambda )\), \((\boldsymbol{x},c,\boldsymbol{u},\boldsymbol{w},o)\) such that \((\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})\in R\) and \(c=\textsf{Com}.\textsf{Commit}(\textsf{ck},\boldsymbol{u},o)\), the following are statistically indistinguishable:

$$\begin{aligned}&\left[ \textsf{pp}\leftarrow \mathcal {G}(\textsf{ck},R)\,\vert \,\big (\textsf{pp},\varPi .\textsf{Vw}(\textsf{pp},\boldsymbol{x},c; \boldsymbol{u},\boldsymbol{w},o)\big )\right] \\&\quad \approx \left[ (\textsf{pp},\textsf{td})\leftarrow \mathcal {S}_1(1^\lambda ,R)\,|\, \big (\textsf{pp},\mathcal {S}_2(\textsf{td},\textsf{pp},\textsf{ck},\boldsymbol{x},c)\big )\right] \end{aligned}$$

First, we exhibit a trivial secure protocol that can be obtained from a CP-SNARK for a relation.

Lemma 1

Let \(\textsf{CP}=(\textsf{G},\textsf{P},\textsf{V})\) be a CP-SNARK for relation R and commitment scheme \(\textsf{Com}\). Then \(\varPi =(\mathcal {G},\mathcal {P},\mathcal {V})\) as described below is a secure protocol for relation R and commitment scheme \(\textsf{Com}\).

  • \(\mathcal {G}(\textsf{ck},R,1^\lambda )\longrightarrow \textsf{pp}\) where \(\textsf{pp}\leftarrow \textsf{G}(\textsf{ck},R,1^\lambda )\).

  • On common input \((\boldsymbol{x},c)\) and \(\mathcal {P}\)’s input \((\boldsymbol{u},\boldsymbol{w},o)\), \(\mathcal {P}\) and \(\mathcal {V}\) interact as follows:

    1. \(\mathcal {P}\) computes: \(\pi \leftarrow \textsf{P}(\textsf{pp},\boldsymbol{x},\boldsymbol{u},\boldsymbol{w},o)\).

    2. \(\mathcal {P}\rightarrow \mathcal {V}\): \(\mathcal {P}\) sends \(\pi \) to \(\mathcal {V}\).

    3. \(\mathcal {V}\) outputs \(\textsf{V}(\textsf{pp},\boldsymbol{x},c,\pi )\).

The proof of the above is trivial and follows directly from the properties of the CP-SNARK \(\textsf{CP}\). We now formally define the probabilistic relation decomposition and provide a secure protocol for a decomposed relation by gluing together the secure protocols for the constituent relations.

Definition 5 (Probabilistic Relation Decomposition)

Let \(R(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})\) be a relation. We say that relations \((R_1,R_2)\) are a probabilistic decomposition of R if there exists a canonical partitioning of \(\boldsymbol{w}\) as \(\boldsymbol{w}_0||\boldsymbol{w}_1||\boldsymbol{w}_2\) and a challenge space \(\mathcal {C}\) such that for \(\alpha \leftarrow \mathcal {C}\):

$$\begin{aligned} \textrm{Pr}\left[ R_1(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_1)\wedge R_2(\alpha ,\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_2)=1 \,\vert \, R(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})=1\right]&= 1 \\ \textrm{Pr}\left[ R_1(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_1)\wedge R_2(\alpha ,\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_2)=1 \,\vert \, R(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})=0\right]&= \textsf{negl} \end{aligned}$$

Lemma 2 (Glueing Lemma)

Let \((R_1,R_2)\) be a probabilistic relation decomposition of the relation R and let \(\varPi _1\) and \(\varPi _2\) be secure protocols for \((R_1,\textsf{Com})\) and \((R_2,\textsf{Com})\) respectively, where \(\textsf{Com}\) is a commitment scheme. Then the protocol \(\varPi =(\mathcal {G},\mathcal {P},\mathcal {V})\) as described below is a secure protocol for \((R,\textsf{Com})\).

  • \(\mathcal {G}(\textsf{ck},R,1^\lambda )\longrightarrow \textsf{pp}\): The algorithm \(\mathcal {G}\) invokes the generator algorithms for the constituent relations as \(\textsf{pp}_1\leftarrow \varPi _1.\mathcal {G}(\textsf{ck}, R_1, 1^\lambda )\), \(\textsf{pp}_2\leftarrow \varPi _2.\mathcal {G}(\textsf{ck}, R_2, 1^\lambda )\) and returns \(\textsf{pp}=(\textsf{pp}_1,\textsf{pp}_2)\).

  • On common input \((\boldsymbol{x},c)\) and private prover inputs \((\boldsymbol{u},\boldsymbol{w},o)\), \(\mathcal {P}\) and \(\mathcal {V}\) interact as follows:

    1. \(\mathcal {P}\) computes: \(\mathcal {P}\) partitions \(\boldsymbol{w}\) as \(\boldsymbol{w}_0||\boldsymbol{w}_1||\boldsymbol{w}_2\). Next \(\mathcal {P}\) samples \(o_w\leftarrow \mathcal {O}\) and computes \(c_w = \textsf{Com}.\textsf{Commit}(\textsf{ck},\boldsymbol{w}_0,o_w)\).

    2. \(\mathcal {P}\rightarrow \mathcal {V}\): \(\mathcal {P}\) sends \(c_w\) to \(\mathcal {V}\).

    3. \(\mathcal {P}\) and \(\mathcal {V}\) execute the secure protocol \(\varPi _1\) with common input \((\boldsymbol{x},(c,c_w))\) and prover’s (\(\varPi _1.\mathcal {P}\)) inputs as \(((\boldsymbol{u},\boldsymbol{w}_0),\boldsymbol{w}_1,(o,o_w))\). Let \(b_1\) denote the output of the protocol \(\varPi _1\).

    4. \(\mathcal {V}\rightarrow \mathcal {P}\): \(\mathcal {V}\) samples \(\alpha \leftarrow \mathcal {C}\) and sends \(\alpha \) to \(\mathcal {P}\).

    5. \(\mathcal {P}\) and \(\mathcal {V}\) execute the secure protocol \(\varPi _2\) with common input \(((\alpha ,\boldsymbol{x}),(c,c_w))\) and prover’s (\(\varPi _2.\mathcal {P}\)) inputs as \(((\boldsymbol{u},\boldsymbol{w}_0),\boldsymbol{w}_2,(o,o_w))\). Let \(b_2\) denote the output of the protocol \(\varPi _2\).

    6. \(\mathcal {V}\) outputs \(b_1\wedge b_2\).
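The message flow of the glued protocol can be sketched as follows. This is a transparent mock under loud assumptions, not the construction of the paper. The sub-protocols \(\varPi _1\) and \(\varPi _2\) are replaced by checks performed in the clear, so nothing is hidden and nothing is succinct. The commitment is a hypothetical hash-based one. The example relation ("\(\boldsymbol{w}_0\) is the sorted rearrangement of the committed vector \(\boldsymbol{u}\)") and all function names are chosen only for illustration. The point it shows is the ordering: \(c_w\) is fixed before \(\alpha \) is sampled.

```python
import hashlib
import secrets

P = 2**61 - 1  # prime modulus of an illustrative field F (assumption)


def commit(ck: bytes, vec: list[int], o: bytes) -> bytes:
    """Hypothetical hash-based commitment to a vector of field elements."""
    data = b"".join(v.to_bytes(8, "big") for v in vec)
    return hashlib.sha256(ck + o + data).digest()


def r1(u: list[int], w0: list[int]) -> bool:
    """R_1 (deterministic part): w0 is in non-decreasing order."""
    return all(a <= b for a, b in zip(w0, w0[1:]))


def r2(alpha: int, u: list[int], w0: list[int]) -> bool:
    """R_2 (randomized part): grand-product check at the challenge alpha."""
    pu = pv = 1
    for a in u:
        pu = pu * (alpha - a) % P
    for b in w0:
        pv = pv * (alpha - b) % P
    return len(u) == len(w0) and pu == pv


def run_glued_protocol(ck, c, u, o, w0, sample_alpha=lambda: secrets.randbelow(P)):
    # Steps 1-2: the prover commits to w0 and sends c_w before seeing alpha.
    o_w = secrets.token_bytes(32)
    c_w = commit(ck, w0, o_w)
    openings_ok = commit(ck, u, o) == c and commit(ck, w0, o_w) == c_w
    # Step 3: mock of Pi_1 on common input (x, (c, c_w)); checked in the clear.
    b1 = openings_ok and r1(u, w0)
    # Step 4: the verifier samples the challenge only now.
    alpha = sample_alpha()
    # Step 5: mock of Pi_2 on common input ((alpha, x), (c, c_w)).
    b2 = openings_ok and r2(alpha, u, w0)
    # Step 6: the verifier accepts iff both sub-protocols accept.
    return b1 and b2
```

If the prover could choose \(\boldsymbol{w}_0\) after learning \(\alpha \), it could pick a vector that is not a rearrangement of \(\boldsymbol{u}\) yet agrees with it on the product at that particular \(\alpha \). Committing first is what rules this out.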


We skip the proof of completeness of protocol \(\varPi \), as it is straightforward to verify. To show soundness, let \(\mathcal {A}\) be a \(\textsf{PPT}\) adversary such that \(\varPi _\mathcal {A}(\textsf{pp},\boldsymbol{x},c)=1\). Let \(c_w\) be the first message (commitment) sent by \(\mathcal {A}\) to \(\mathcal {V}\). From the protocol description of \(\varPi \), we have:

$$\begin{aligned} \varPi _\mathcal {A}(\textsf{pp},\boldsymbol{x},c)=\varPi _{1,\mathcal {A}}(\textsf{pp}_1,\boldsymbol{x}, (c, c_w))\wedge \varPi _{2,\mathcal {A}}(\textsf{pp}_2,(\alpha ,\boldsymbol{x}),(c,c_w)). \end{aligned}$$

Thus \(\mathcal {A}\) is also an adversary for secure protocols \(\varPi _1\) and \(\varPi _2\). Soundness of \(\varPi _1\) and \(\varPi _2\) implies existence of extractors \(\mathcal {E}_1\) and \(\mathcal {E}_2\) such that \(((\boldsymbol{u},\boldsymbol{w}_0),\boldsymbol{w}_1,(o,o_w)) \leftarrow \mathcal {E}_1^\mathcal {A}(\textsf{pp}_1,z)\) and \(((\boldsymbol{u}',\boldsymbol{w}'_0),\boldsymbol{w}_2,(o',o'_w)) \leftarrow \mathcal {E}_2^\mathcal {A}(\textsf{pp}_2,z)\). We define extractor \(\mathcal {E}\) which invokes the above extractors and outputs \((\boldsymbol{u},\boldsymbol{w},o)\) for \(\boldsymbol{w}=\boldsymbol{w}_0||\boldsymbol{w}_1||\boldsymbol{w}_2\). With overwhelming probability we have

$$\begin{aligned} R_1(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_1)&\wedge \textsf{Com}.\textsf{VerCommit}(\textsf{ck},(c,c_w),(\boldsymbol{u},\boldsymbol{w}_0),(o,o_w)) \\ R_2(\alpha ,\boldsymbol{x},\boldsymbol{u}',\boldsymbol{w}'_0,\boldsymbol{w}_2)&\wedge \textsf{Com}.\textsf{VerCommit}(\textsf{ck},(c,c_w),(\boldsymbol{u}',\boldsymbol{w}'_0),(o',o'_w)) \end{aligned}$$

By the binding property of \(\textsf{Com}\), we also have \(\boldsymbol{u}'=\boldsymbol{u}\), \(\boldsymbol{w}'_0=\boldsymbol{w}_0\), \(o'=o\) and \(o'_w=o_w\) and \(\textsf{Com}.\textsf{VerCommit}(\textsf{ck},(c,c_w),(\boldsymbol{u},\boldsymbol{w}_0),(o,o_w))=1\) with overwhelming probability. Finally, since \(R_1(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_1)\wedge R_2(\alpha ,\boldsymbol{x},\boldsymbol{u},\boldsymbol{w}_0,\boldsymbol{w}_2)=1\), we must have \(R(\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})=1\) for \(\boldsymbol{w}=\boldsymbol{w}_0||\boldsymbol{w}_1||\boldsymbol{w}_2\) with probability negligibly close to 1. This proves that \(\mathcal {E}\) extracts a valid witness with overwhelming probability.
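One way to make the phrase "overwhelming probability" explicit is a union bound over the four ways extraction can fail. The symbols below are introduced only for this sketch and do not appear in the original text: \(\varepsilon _1\) and \(\varepsilon _2\) denote the soundness errors of \(\varPi _1\) and \(\varPi _2\), \(\varepsilon _{\textsf{bind}}\) the probability of breaking the binding of \(\textsf{Com}\), and \(\varepsilon _{\textsf{dec}}\) the negligible quantity in the second condition of Definition 5. Under these assumptions one would expect:

$$\begin{aligned} \textrm{Pr}\left[ \varPi _\mathcal {A}(\textsf{pp},\boldsymbol{x},c)=1 \wedge \lnot \widetilde{R}(\boldsymbol{x},c,\boldsymbol{u},\boldsymbol{w},o)\right] \le \varepsilon _1 + \varepsilon _2 + \varepsilon _{\textsf{bind}} + \varepsilon _{\textsf{dec}} \end{aligned}$$

The last term can be applied because \(c_w\) is sent before \(\alpha \) is sampled, so the extracted \(\boldsymbol{w}_0\) is fixed independently of the challenge.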

We now show that \(\varPi \) is zero-knowledge. Let \(\textsf{ck}\leftarrow \textsf{Com}.\textsf{Setup}(1^\lambda )\) and let \((\boldsymbol{x},c,\boldsymbol{u},\boldsymbol{w},o)\) be such that \((\boldsymbol{x},\boldsymbol{u},\boldsymbol{w})\in R\) and \(c=\textsf{Com}.\textsf{Commit}(\textsf{ck},\boldsymbol{u},o)\). We show the existence of simulator \(\mathcal {S}=(\mathcal {S}_1,\mathcal {S}_2)\) such that:

$$\begin{aligned}&\left[ \textsf{pp}\leftarrow \mathcal {G}(\textsf{ck},R)\,\vert \,\big (\textsf{pp}, \varPi .\textsf{Vw}(\textsf{pp},\boldsymbol{x},c; \boldsymbol{u},\boldsymbol{w},o)\big )\right] \\&\quad \approx \left[ (\textsf{pp},\textsf{td})\leftarrow \mathcal {S}_1(1^\lambda ,R)\,|\, \big (\textsf{pp}, \mathcal {S}_2(\textsf{td},\textsf{pp},\textsf{ck},\boldsymbol{x},c)\big )\right] \end{aligned}$$

Let \(\widetilde{\mathcal {S}}=(\widetilde{\mathcal {S}}_1,\widetilde{\mathcal {S}}_2)\) and \(\widehat{\mathcal {S}}=(\widehat{\mathcal {S}}_1,\widehat{\mathcal {S}}_2)\) be the simulators for secure protocols \(\varPi _1\) and \(\varPi _2\) respectively. The simulator \(\mathcal {S}\) works as follows:

  • \(\mathcal {S}_1(1^\lambda ,R)\longrightarrow (\textsf{pp}',\textsf{td}')\): On input R and security parameter, \(\mathcal {S}_1\) invokes simulators for \(R_1\), \(R_2\) to obtain \((\textsf{pp}'_1,\textsf{td}'_1)\leftarrow \widetilde{\mathcal {S}}_1(1^\lambda ,R_1)\), \((\textsf{pp}'_2,\textsf{td}'_2)\leftarrow \widehat{\mathcal {S}}_1(1^\lambda ,R_2)\) respectively. It sets \(\textsf{pp}'=(\textsf{pp}'_1,\textsf{pp}'_2)\) and \(\textsf{td}'=(\textsf{td}'_1,\textsf{td}'_2)\).

  • \(\mathcal {S}_2\) works as follows: It samples \(\alpha \leftarrow \mathcal {C}\), \(\tilde{o}\leftarrow \mathcal {O}_\lambda \) and computes \(\tilde{c}_w=\textsf{Com}.\textsf{Commit}(\textsf{ck},\boldsymbol{0},\tilde{o})\). Then it invokes simulators \(\widetilde{\mathcal {S}}_2\) and \(\widehat{\mathcal {S}}_2\) as:

    • \(V'_1\leftarrow \widetilde{\mathcal {S}}_2(\textsf{td}'_1,\textsf{pp}'_1,\boldsymbol{x},(c,\tilde{c}_w))\),

    • \(V'_2\leftarrow \widehat{\mathcal {S}}_2(\textsf{td}'_2,\textsf{pp}'_2,(\alpha ,\boldsymbol{x}),(c,\tilde{c}_w))\).

  • Finally it outputs \((\alpha ,\tilde{c}_w,V'_1,V'_2)\).

The required indistinguishability follows via the hybrids shown below. For ease of notation let \(V_1\) denote the view \(\varPi _1.\textsf{Vw}(\textsf{pp}_1,\boldsymbol{x},(c,c_w);(\boldsymbol{u},\boldsymbol{w}_0),\boldsymbol{w}_1,(o,o_w))\) and \(V_2\) denote the view \(\varPi _2.\textsf{Vw}(\textsf{pp}_2,(\alpha ,\boldsymbol{x}),(c,c_w);(\boldsymbol{u},\boldsymbol{w}_0),\boldsymbol{w}_2,(o,o_w))\). Then we have:

$$\begin{aligned}&\langle \textsf{pp},\varPi .\textsf{Vw}(\textsf{pp},\boldsymbol{x},c;\boldsymbol{u},\boldsymbol{w},o)\rangle \end{aligned} \tag{1}$$
$$\begin{aligned}&= \langle \textsf{pp}_1,\textsf{pp}_2,\alpha ,c_w,V_1,V_2 \rangle \end{aligned} \tag{2}$$
$$\begin{aligned}&\approx \langle \textsf{pp}'_1,\textsf{pp}_2,\alpha ,c_w,\widetilde{\mathcal {S}}_2(\textsf{td}'_1,\textsf{pp}'_1,\boldsymbol{x},(c,c_w)),V_2\rangle \end{aligned} \tag{3}$$
$$\begin{aligned}&\approx \langle \textsf{pp}'_1,\textsf{pp}'_2,\alpha ,c_w, \widetilde{\mathcal {S}}_2(\textsf{td}'_1,\textsf{pp}'_1,\boldsymbol{x},(c,c_w)), \widehat{\mathcal {S}}_2(\textsf{td}'_2,\textsf{pp}'_2,(\alpha ,\boldsymbol{x}),(c,c_w))\rangle \end{aligned} \tag{4}$$
$$\begin{aligned}&\approx \langle \textsf{pp}'_1,\textsf{pp}'_2,\alpha ,\tilde{c}_w,V'_1,V'_2\rangle \end{aligned} \tag{5}$$

In the above, the indistinguishability of (2) and (3) follows from the zero-knowledge property of \(\varPi _1\). Similarly, zero knowledge of \(\varPi _2\) implies indistinguishability of (3) and (4). Finally, the indistinguishability of (4) and (5) follows from the hiding property of \(\textsf{Com}\). This completes the proof.

C Secure Protocols

In this section, we give secure protocols for the different relations discussed in this paper such as simultaneous permutation, consistent memory access, various dataset operations and decision tree inference.

C.1 Simultaneous Permutation

For a fixed N, recall that k-tuples \((\boldsymbol{u}_1,\ldots ,\boldsymbol{u}_k)\) and \((\boldsymbol{v}_1,\ldots ,\boldsymbol{v}_k)\) of vectors in \(\mathbb {F}^N\) satisfy the simultaneous permutation relation if there exists a permutation \(\sigma \) of [N] such that \(\sigma (\boldsymbol{u}_i)=\boldsymbol{v}_i\) for all \(i\in [k]\). Let \(R_\sigma \) denote the relation over \((\alpha ,\boldsymbol{u},\boldsymbol{v})\) with \(\alpha \in \mathbb {F}\) and \(\boldsymbol{u},\boldsymbol{v}\in \mathbb {F}^N\) such that \(\prod _{i=1}^N (\alpha - \boldsymbol{u}[\,i\,])\) \(=\) \(\prod _{i=1}^N (\alpha - \boldsymbol{v}[\,i\,])\). Let \(\varPi _\sigma \) denote the trivial secure protocol obtained from a CP-SNARK for \((R_\sigma ,\textsf{Com})\) (using Lemma 1), where we also assume \(\textsf{Com}\) is homomorphic.
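The homomorphism assumed of \(\textsf{Com}\) can be illustrated with a toy Pedersen-style vector commitment. This is only a sketch: the paper does not fix a concrete scheme here, and the tiny group parameters (`P_MOD`, `Q`), the generators `GENS` and `H`, and the function names are all assumptions chosen for readability, not secure values.

```python
# Toy Pedersen-style vector commitment (illustrative assumption, NOT secure:
# the group is tiny and the generators have known discrete logs).
P_MOD = 2039          # safe prime, P_MOD = 2 * Q + 1
Q = 1019              # prime order of the subgroup of squares; plays the role of |F|
GENS = [4, 9, 25, 49] # squares mod P_MOD, one generator per vector entry
H = 121               # generator used for the opening (randomness)


def commit(vec, opening):
    """Return H^opening * prod_i GENS[i]^vec[i] mod P_MOD."""
    c = pow(H, opening % Q, P_MOD)
    for g, x in zip(GENS, vec):
        c = c * pow(g, x % Q, P_MOD) % P_MOD
    return c


def combine_commitments(coms, betas):
    """Return prod_i coms[i]^betas[i], the commitment-side linear combination."""
    c = 1
    for com, b in zip(coms, betas):
        c = c * pow(com, b % Q, P_MOD) % P_MOD
    return c


def combine_vectors(vecs, betas):
    """Return sum_i betas[i] * vecs[i], entrywise mod Q."""
    return [sum(b * v[j] for b, v in zip(betas, vecs)) % Q
            for j in range(len(vecs[0]))]
```

In this sketch, combining the commitments with exponents \(\beta _i\) yields a commitment to \(\sum _i \beta _i\boldsymbol{u}_i\) under opening \(\sum _i \beta _i o_i\), which is the property the soundness argument for \(\varPi _{\textrm{perm}}\) relies on.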

Lemma 3

The protocol \(\varPi _{\textrm{perm}}=(\mathcal {G},\mathcal {P},\mathcal {V})\) in Fig. 5 is a secure protocol for simultaneous permutation relation and commitment scheme \(\textsf{Com}\).


By a standard rewinding technique, the extractor \(\mathcal {E}\) can, with overwhelming probability, extract from an accepting adversarial prover \(\mathcal {A}\) vectors \(\{\boldsymbol{u}_i,\boldsymbol{v}_i\}_{i=1}^k\) such that \(\boldsymbol{u}_i\) opens commitment \(\textsf{cu}_i\) and \(\boldsymbol{v}_i\) opens commitment \(\textsf{cv}_i\) for all \(i\in [k]\). This is accomplished by running the subprotocol \(\varPi _\sigma \) for k different linear combinations of commitments given by the challenges \((\beta _1,\ldots ,\beta _k)\), and using the extractor for \(\varPi _\sigma \) to obtain openings for the respective linear combinations of vectors. Since the challenges are linearly independent with overwhelming probability, we can solve the resulting system of equations to obtain openings for the individual commitments \(\textsf{cu}_i\) and \(\textsf{cv}_i\) for all \(i\in [k]\). By homomorphism of \(\textsf{Com}\), the vectors \(\boldsymbol{u}=\sum _{i=1}^k \beta _i\boldsymbol{u}_i\) and \(\boldsymbol{v}=\sum _{i=1}^k \beta _i\boldsymbol{v}_i\) open commitments \(\textsf{cu}\) and \(\textsf{cv}\) respectively. Again, soundness of \(\varPi _\sigma \) implies that, with overwhelming probability, \((\alpha ,\boldsymbol{u},\boldsymbol{v})\in R_\sigma \). Since \(\alpha \) was drawn uniformly at random, we conclude that there is a permutation \(\pi \) such that \(\pi (\boldsymbol{u})=\boldsymbol{v}\) with probability almost 1. Finally, since \(\beta _1,\ldots ,\beta _k\) were drawn uniformly at random and \(\pi (\sum _{i=1}^k \beta _i\boldsymbol{u}_i)\) \(=\) \(\sum _{i=1}^k \beta _i\boldsymbol{v}_i\), with overwhelming probability we must have \(\pi (\boldsymbol{u}_i)=\boldsymbol{v}_i\) for all \(i\in [k]\). This shows the soundness of \(\varPi _{\textrm{perm}}\). We skip the proof of zero-knowledge for \(\varPi _{\textrm{perm}}\) as it follows from the same property for \(\varPi _\sigma \).

Fig. 5. Protocol \(\varPi _{\textrm{perm}}\) for simultaneous permutation
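The randomized check behind the soundness argument can be sketched in the clear, without commitments or zero knowledge. The sketch below assumes a Mersenne prime `P` as a stand-in for \(\mathbb {F}\), and the function names are hypothetical. It only mirrors the two steps used in the proof (a random \(\beta \)-combination of the columns, followed by the product test of \(R_\sigma \) at a random \(\alpha \)) and is not a transcription of Fig. 5.

```python
import random

P = 2**61 - 1  # Mersenne prime standing in for the field F (assumption)


def grand_product(alpha, vec):
    """Return prod_i (alpha - vec[i]) mod P, one side of the relation R_sigma."""
    acc = 1
    for x in vec:
        acc = acc * ((alpha - x) % P) % P
    return acc


def combine(columns, betas):
    """Return sum_i betas[i] * columns[i], computed entrywise mod P."""
    n = len(columns[0])
    return [sum(b * col[j] for b, col in zip(betas, columns)) % P
            for j in range(n)]


def simultaneous_permutation_check(us, vs, rng):
    """Probabilistic test that one permutation maps every us[i] to vs[i]."""
    betas = [rng.randrange(P) for _ in us]  # verifier challenge beta_1..beta_k
    alpha = rng.randrange(P)                # verifier challenge alpha
    u = combine(us, betas)
    v = combine(vs, betas)
    return grand_product(alpha, u) == grand_product(alpha, v)
```

A pair of column tuples related by a single permutation always passes. Tuples whose columns are permuted by different permutations are rejected except with probability on the order of \(N/|\mathbb {F}|\) over the choice of challenges.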

C.2 Consistent Memory Access

In this section, we formalize the secure protocol for the consistent memory access relation discussed in Sect. 3.4.

Lemma 4

There exists a secure protocol \(\varPi _\textrm{cma}\) for consistent memory access relation defined in Sect. 3.4.


We consider the relation \(R_\textrm{cma}\) explained in Sect. 3.4 for consistent memory access as:

$$R_\textrm{cma}(\cdot ,\llbracket \boldsymbol{L},\boldsymbol{U},\boldsymbol{V}\rrbracket , \llbracket \boldsymbol{u},\boldsymbol{v},\tilde{\boldsymbol{u}},\tilde{\boldsymbol{v}},\boldsymbol{w}_1,\boldsymbol{w}_2\rrbracket ) $$

In the above, there are no public inputs. The committed witness consists of \(\boldsymbol{L},\boldsymbol{U}\) and \(\boldsymbol{V}\), which denote the read-only memory, access pattern and values respectively. The uncommitted witness consists of auxiliary inputs (\(\boldsymbol{u},\boldsymbol{v},\tilde{\boldsymbol{u}},\tilde{\boldsymbol{v}}\)) and the other witnesses \(\boldsymbol{w}_1\) and \(\boldsymbol{w}_2\) required to prove the relation. The description in Sect. 3.4 partitions the above as:

$$\begin{aligned} \textsf{C}_{\textsf{ROM}, {m},{n}}(\cdot ,\llbracket \boldsymbol{L},\boldsymbol{U},\boldsymbol{V},\boldsymbol{w}_0\rrbracket , \boldsymbol{w}_1)\wedge R_\sigma (\cdot ,\boldsymbol{w}_0,\boldsymbol{w}_2) \end{aligned}$$

where \(\boldsymbol{w}_0=\llbracket \boldsymbol{u},\boldsymbol{v},\tilde{\boldsymbol{u}},\tilde{\boldsymbol{v}} \rrbracket \). The secure protocol \(\varPi _\textrm{ROM}\) can be obtained using a CP-SNARK for the circuit \(\textsf{C}_{\textsf{ROM}, {m},{n}}\) via Lemma 1. Invoking the Glueing Lemma (Lemma 2) with \(\varPi _\textrm{ROM}\) and the protocol \(\varPi _{\textrm{perm}}\) for the simultaneous permutation relation, we obtain the secure protocol \(\varPi _\textrm{cma}\).
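As a rough illustration of why a permutation argument helps here, the sketch below checks read-only memory consistency (that is, \(\boldsymbol{V}[i]=\boldsymbol{L}[\boldsymbol{U}[i]]\) for every access) by sorting address/value pairs and comparing neighbours. This is a common sort-and-compare approach and is an assumption on our part: Sect. 3.4 is not reproduced in this appendix, so the exact constraints of \(\textsf{C}_{\textsf{ROM}, {m},{n}}\) may differ, and the function name and 0-based addressing are hypothetical.

```python
def rom_access_consistent(L, U, V):
    """Check V[i] == L[U[i]] for all i using a sort-and-compare strategy.

    Assumed reading of the witness: (u, v) lists the memory cells followed by
    the claimed reads, and (u~, v~) is the same list sorted by address.
    """
    m = len(L)
    if m == 0 or len(U) != len(V):
        return False
    # (address, value) pairs: the memory itself, then the claimed accesses.
    pairs = [(j, L[j]) for j in range(m)] + list(zip(U, V))
    # An honest prover would supply this sorted copy as a witness and show,
    # via the simultaneous permutation protocol, that it permutes `pairs`.
    sorted_pairs = sorted(pairs)
    addrs = [a for a, _ in sorted_pairs]
    vals = [x for _, x in sorted_pairs]
    # Addresses must run from 0 to m-1 in steps of 0 or 1, so no access can
    # fall outside the memory.
    if addrs[0] != 0 or addrs[-1] != m - 1:
        return False
    for i in range(len(sorted_pairs) - 1):
        step = addrs[i + 1] - addrs[i]
        if step not in (0, 1):
            return False
        # Equal neighbouring addresses must carry equal values.
        if step == 0 and vals[i + 1] != vals[i]:
            return False
    return True
```

Because every address in the range appears at least once through the memory cells themselves, each group of equal addresses contains the true memory value, so the neighbour test forces every claimed read to agree with it.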

C.3 Aggregation Operation

We now provide a secure protocol for showing correctness of the aggregation operation on datasets as described in Sect. 4. In Sect. 4 we described a protocol for checking correct concatenation of vectors under commitments, and then reduced the verification of dataset aggregation to that of verifying concatenation of vectors (obtained via linear combination of the columns of the dataset). We also justify the aforementioned reduction. We assume \(\varPi _\textrm{concat}\) is a secure protocol for checking concatenation of vectors, which we assume is described by the relation \(R_\textrm{concat}\). The secure protocol \(\varPi _\textrm{agg}=(\mathcal {G},\mathcal {P},\mathcal {V})\) for verifying aggregation of datasets appears in Fig. 6. Let \(D_x,D_y\) and \(D_z\) be datasets with columns given by \((\boldsymbol{x}_i)_{i=1}^k\), \((\boldsymbol{y}_i)_{i=1}^k\) and \((\boldsymbol{z}_i)_{i=1}^k\) respectively. Similarly, let \((\textsf{cx}_i)_{i=1}^k\), \((\textsf{cy}_i)_{i=1}^k\) and \((\textsf{cz}_i)_{i=1}^k\) denote public commitments to the columns of \(D_x\), \(D_y\) and \(D_z\) respectively. As in Sect. 4, let N denote the upper bound on the sizes of datasets and vectors.

Fig. 6. Protocol \(\varPi _\textrm{agg}\) for dataset aggregation

Lemma 5

The protocol \(\varPi _\textrm{agg}\) in Fig. 6 is a secure protocol for aggregation relation on datasets and commitment scheme \(\textsf{Com}\).


The completeness and zero-knowledge properties of the protocol are proved in a manner similar to earlier protocols. Here we prove the soundness of the probabilistic reduction from aggregation relation on datasets to concatenation relation on vectors, which implies soundness of the overall protocol. With overwhelming probability, a successful adversary \(\mathcal {A}\) knows vectors \((\boldsymbol{x}_i)_{i=1}^k\), \((\boldsymbol{y}_i)_{i=1}^k\) and \((\boldsymbol{z}_i)_{i=1}^k\) such that their respective \(\beta \)-linear combinations \(\boldsymbol{x},\boldsymbol{y}\) and \(\boldsymbol{z}\) satisfy the concatenation relation. As in Sect. 4, we write \(\boldsymbol{x}_i=\llbracket s_i, \boldsymbol{X}_i\rrbracket \), \(\boldsymbol{y}_i=\llbracket t_i, \boldsymbol{Y}_i\rrbracket \) and \(\boldsymbol{z}_i=\llbracket w_i, \boldsymbol{Z}_i\rrbracket \) for \(i\in [k]\). Similarly, let \(\boldsymbol{x}=\llbracket s, \boldsymbol{X}\rrbracket \), \(\boldsymbol{y}=\llbracket t, \boldsymbol{Y}\rrbracket \) and \(\boldsymbol{z}=\llbracket w, \boldsymbol{Z}\rrbracket \). Note that we must have:

$$\begin{aligned} s = \sum _{i=1}^k\beta _i s_i, \quad t = \sum _{i=1}^k\beta _i t_i, \quad w = \sum _{i=1}^k\beta _i w_i \\ \boldsymbol{X} = \sum _{i=1}^k\beta _i \boldsymbol{X}_i, \quad \boldsymbol{Y} = \sum _{i=1}^k\beta _i \boldsymbol{Y}_i, \quad \boldsymbol{Z} = \sum _{i=1}^k\beta _i \boldsymbol{Z}_i \end{aligned}$$

Now, from the description in Sect. 4, the vectors \(\boldsymbol{x},\boldsymbol{y}\) and \(\boldsymbol{z}\) satisfy the concatenation relation if there exists a permutation of [2N], which we denote by the permutation matrix \(\Lambda \), such that \(\Lambda \cdot \llbracket \boldsymbol{\rho }_s, \boldsymbol{\rho }_t\rrbracket =\llbracket \boldsymbol{\rho }_w, \boldsymbol{0}\rrbracket \) and \(\Lambda \cdot \llbracket \boldsymbol{X}, \boldsymbol{Y}\rrbracket = \llbracket \boldsymbol{Z}, \boldsymbol{0}\rrbracket \), where the vectors \(\boldsymbol{\rho }_s,\boldsymbol{\rho }_t\) and \(\boldsymbol{\rho }_w\) are in \(\{0,1\}^N\) such that \(\boldsymbol{\rho }_s\) is 1 in precisely the first s positions, \(\boldsymbol{\rho }_t\) is 1 in precisely the first t positions and \(\boldsymbol{\rho }_w\) is 1 in precisely the first w positions, and where further \(w=s+t\). The relation thus also implicitly requires that \(s,t,w\in [N]\). We now claim that \(s_i=s\), \(t_i=t\) and \(w_i=w\) for all \(i\in [k]\). Otherwise, it is easily seen that s is distributed uniformly in \(\mathbb {F}\) (and likewise for t and w) for uniformly sampled \(\beta _1,\ldots ,\beta _k\) (subject to their sum being 1), and thus \(s\in [N]\) with negligible probability \(N/|\mathbb {F}|\). Similar reasoning also implies that with overwhelming probability we have \(\Lambda \cdot \llbracket \boldsymbol{X}_i, \boldsymbol{Y}_i\rrbracket =\llbracket \boldsymbol{Z}_i, \boldsymbol{0}\rrbracket \) for all \(i\in [k]\). Combined with the fact that \(\Lambda \cdot \llbracket \boldsymbol{\rho }_s, \boldsymbol{\rho }_t\rrbracket =\llbracket \boldsymbol{\rho }_w, \boldsymbol{0}\rrbracket \), it implies that the same permutation \(\Lambda \) maps the first s entries of column \(\boldsymbol{x}_i\) and the first t entries of column \(\boldsymbol{y}_i\) to the first \(w=s+t\) entries of the column \(\boldsymbol{z}_i\) for all \(i\in [k]\). Thus \(D_z\) corresponds to the aggregation of datasets \(D_x\) and \(D_y\).
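The two permutation equations of the concatenation relation can be checked on a small example. The sketch below is illustrative only. It assumes that columns are padded with zeros up to length N, and it tests the canonical order-preserving permutation an honest prover would use rather than searching over all permutations of [2N]. The function names are hypothetical.

```python
def rho(count, n):
    """Indicator vector in {0,1}^n that is 1 in precisely the first `count` positions."""
    return [1] * count + [0] * (n - count)


def canonical_permutation(s, t, n):
    """Source indices into a length-2n vector: live entries first, padding last."""
    live = list(range(s)) + list(range(n, n + t))
    pad = list(range(s, n)) + list(range(n + t, 2 * n))
    return live + pad


def apply_permutation(perm, vec):
    """Return Lambda . vec, where output position j reads vec[perm[j]]."""
    return [vec[i] for i in perm]


def check_concat(s, X, t, Y, w, Z):
    """Check both equations of the concatenation relation for one column."""
    n = len(X)
    if len(Y) != n or len(Z) != n or w != s + t or w > n:
        return False
    perm = canonical_permutation(s, t, n)
    sizes_ok = apply_permutation(perm, rho(s, n) + rho(t, n)) == rho(w, n) + [0] * n
    values_ok = apply_permutation(perm, X + Y) == Z + [0] * n
    return sizes_ok and values_ok
```

Since the permutation depends only on s, t and N, the same `perm` applies to every column of the datasets, which matches the conclusion of the argument above that a single \(\Lambda \) works for all \(i\in [k]\).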

Protocols and Proofs for Other Operations: We have provided circuit descriptions for other operations such as filter, order-by and inner-join, and also for ML operations such as inference and accuracy from decision trees. These circuits can be used with CP-SNARKs to yield secure protocols for those operations using techniques similar to the presented protocols (essentially using Lemmas 1 and 2), along with the reduction technique when applicable.

Copyright information

© 2022 International Financial Cryptography Association

About this paper

Cite this paper

Singh, N., Dayama, P., Pandit, V. (2022). Zero Knowledge Proofs Towards Verifiable Decentralized AI Pipelines. In: Eyal, I., Garay, J. (eds) Financial Cryptography and Data Security. FC 2022. Lecture Notes in Computer Science, vol 13411. Springer, Cham.

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-18282-2

  • Online ISBN: 978-3-031-18283-9

  • eBook Packages: Computer Science, Computer Science (R0)