Encyclopedia of Wireless Networks

Living Edition
| Editors: Xuemin (Sherman) Shen, Xiaodong Lin, Kuan Zhang

Verifiable Cloud Computing

  • Cheng Xu
  • Ce Zhang
  • Jianliang Xu
Living reference work entry
DOI: https://doi.org/10.1007/978-3-319-32903-1_299-1

Definition

Verifiable cloud computing is a way to provide cloud computing services that outsource computing to untrusted third parties while maintaining the integrity of the computation results.

Historical Background

In cloud computing, the data owner outsources data storage and query services to a cloud service provider in order to scale up the services at low cost. However, such an outsourcing model raises serious concerns about computation integrity. As the service provider is not the real owner of the data, it might return incomplete or incorrect results, intentionally or unintentionally. Thus, to ensure computation integrity, the client needs to authenticate the soundness (every result originates from the data owner’s database), the completeness (no valid result is missing), and the freshness (the result is up-to-date) of the computation results.

One early solution to this problem is verification by replication (Haeberlen et al., 2007). The client outsources the same computing task to multiple workers run by different service providers. If too few results are returned within a reasonable time, or the results do not agree, the client sends the task to additional workers. Once a minimum number of workers agree on the same computation results, the client assumes that those results are correct. Although this solution is simple and effective, it assumes that workers fail independently, an assumption that does not hold in the presence of malicious service providers.

To provide strong guarantees on the integrity of the computation results against untrusted or even malicious service providers, cryptographically enforced verifiable cloud computing has been proposed and studied in a large body of literature. The basic idea is that the service provider should return not only the computation results but also a cryptographic proof, which the client can use to establish the soundness, completeness, and freshness of those results.
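
The verification-by-replication idea described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: workers are modeled as plain Python callables, results are assumed to be hashable, and the names `verify_by_replication`, `honest`, and `faulty` are hypothetical rather than taken from any cited system.

```python
from collections import Counter


def verify_by_replication(task, workers, quorum):
    """Query workers one at a time until `quorum` of them agree on a result.

    Returns the agreed result, or None if no result reaches the quorum.
    """
    votes = Counter()
    for worker in workers:
        try:
            result = worker(task)
        except Exception:
            # A crashed or unreachable worker simply contributes no vote.
            continue
        votes[result] += 1
        if votes[result] >= quorum:
            return result
    return None


def honest(numbers):
    """Hypothetical worker that computes the task (a sum) correctly."""
    return sum(numbers)


def faulty(numbers):
    """Hypothetical worker that returns a wrong answer."""
    return sum(numbers) + 1
```

With `workers=[honest, faulty, honest]` and `quorum=2`, the call accepts the honest sum. With `workers=[faulty, faulty, honest]`, the same call accepts the wrong answer, which is exactly the failure-independence weakness noted above.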

Verifiable cloud computing protocols are evaluated by several metrics: (i) preprocessing time, which is the time for the data owner to generate auxiliary data such as an authenticated data structure; (ii) proving time, which is the time for the service provider to compute the proof; (iii) verification time, which is the time for the client to verify the proof; and (iv) proof size. Ideally, the verification time and proof size should be proportional to the size of the computation results and independent of the size of the whole database. This is particularly important when the client is a mobile user connecting to the cloud through a wireless network.

Foundations

In general, there are two fundamental approaches to supporting verifiable cloud computing, each with its own advantages and disadvantages. On the one hand, one can design a verification scheme tailored to the specific computation task. This is often achieved by letting the data owner sign a well-designed authenticated data structure (ADS), based on which the service provider can construct the corresponding proofs for the outsourced computation. This approach yields low overhead but supports only a limited set of computation tasks. On the other hand, a verification scheme may model the computation task as a general Turing machine. Such schemes can support arbitrary tasks, at the expense of high and sometimes impractical overhead.

For the ADS-based verification schemes, there are two basic techniques: signature chaining (Fig. 1) and the Merkle hash tree (Fig. 2). The former uses a public-key signature scheme with which the data owner generates a digital signature for each data item using a private key. Consequently, the client can verify the authenticity of a data item using the public key. To establish the completeness of the query results, chained signatures are generated to capture the correlation between each item and its adjacent items (Pang and Tan, 2004). Merkle hash trees (MHTs), on the other hand, are built on index trees (Merkle, 1989). Each entry in a leaf node is assigned a digest based on its hashed value, and each entry in an internal node is assigned a digest derived from its child nodes. The data owner signs the root digest of the MHT, which can be used to verify any subset of data items. For example, in Fig. 2, a range query Q will return {o2, N2, N3}, based on which the client can reconstruct the root digest N0 = h(h(N3|h(o2))|N2) for verification. The MHT has been widely adapted to various index structures. Typical examples include the Merkle B-tree for relational data (Li et al., 2006), the Merkle R-tree for spatial data (Yang et al., 2009b; Yiu et al., 2011), the authenticated inverted index for text data (Pang and Mouratidis, 2008), the graph metric tree for subgraph similarity search (Peng et al., 2015), and the authenticated prefix tree for multisource data (Chen et al., 2015). It has also been adopted to support authenticated join queries (Yang et al., 2009a), verifiable privacy-preserving location-based services (Hu et al., 2012, 2013; Chen et al., 2013), and queries with fine-grained access control (Xu et al., 2018b).

More recently, there have been studies of ADS schemes for set-valued data (Papamanthou et al., 2011; Canetti et al., 2014; Zhang et al., 2017b). They utilize a cryptographic set accumulator, which represents a (multi-)set with a constant-size digest and supports a variety of verifiable (multi-)set operations such as subset, set disjointness, set sum, set intersection, and set union. Based on that, one can achieve verifiable query processing over high-dimensional data (Papadopoulos et al., 2014), verifiable SQL processing (Zhang et al., 2015), or verifiable analytic aggregation queries (Xu et al., 2018a).
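
As a rough illustration of how a set accumulator can yield a constant-size digest, the following sketches the bilinear-map construction commonly used in this line of work. The notation (generator g, secret trapdoor s, pairing e, witness W_S) is introduced here for illustration only and does not appear in this entry; details differ between the cited schemes.

```latex
% Assumed setup: a pairing e : G x G -> G_T, a generator g of G, and a
% secret trapdoor s; the public key contains g, g^{s}, g^{s^2}, ..., g^{s^q}.
% The digest of a set X is a single group element, whatever the size of X:
\[
  \mathrm{acc}(X) \;=\; g^{\prod_{x \in X} (x + s)}
\]
% Subset proof for S \subseteq X: the witness accumulates the elements of
% X that are not in S, and the client checks one pairing equation.
\[
  W_{S} \;=\; g^{\prod_{x \in X \setminus S} (x + s)},
  \qquad
  e\bigl(W_{S},\; g^{\prod_{x \in S} (x + s)}\bigr)
  \;=\;
  e\bigl(\mathrm{acc}(X),\; g\bigr)
\]
```

The check holds because the two exponents on the left multiply to the product over all of X, which is the exponent of acc(X).
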
Fig. 1 Signature chaining
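
The signature-chaining idea can be sketched as follows. This is a simplified variant written for illustration, not the exact scheme of Pang and Tan (2004): each key is signed together with its two neighbours, and a range answer includes one boundary key on each side. The signature is textbook RSA over two small Mersenne primes, which is insecure and used only to keep the example self-contained. All function names are hypothetical, keys are assumed to be distinct, and the query is assumed to have at least one result.

```python
import hashlib

# Toy textbook RSA key (INSECURE, illustration only).
P, Q = 2**61 - 1, 2**89 - 1           # two known Mersenne primes
N = P * Q                             # public modulus
E = 65537                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent (Python 3.8+)

NEG_INF, POS_INF = float("-inf"), float("inf")


def digest(prev, cur, nxt):
    """Hash a key together with its left and right neighbours."""
    data = f"{prev}|{cur}|{nxt}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign_chain(keys):
    """Data owner: one chained signature per key, using the private key D."""
    padded = [NEG_INF] + sorted(keys) + [POS_INF]
    return {
        padded[i]: pow(digest(padded[i - 1], padded[i], padded[i + 1]), D, N)
        for i in range(1, len(padded) - 1)
    }


def answer_range(keys, sigs, lo, hi):
    """Service provider: results plus one boundary key on each side."""
    padded = [NEG_INF] + sorted(keys) + [POS_INF]
    hits = [i for i in range(1, len(padded) - 1) if lo <= padded[i] <= hi]
    window = padded[hits[0] - 1 : hits[-1] + 2]
    return window, [sigs[k] for k in window[1:-1]]


def verify_range(window, signatures, lo, hi):
    """Client: check soundness and completeness using only (N, E)."""
    left, results, right = window[0], window[1:-1], window[-1]
    if not results or len(signatures) != len(results):
        return False
    # Boundary keys must lie strictly outside the query range.
    if not (left < lo and right > hi):
        return False
    if any(not (lo <= k <= hi) for k in results):
        return False
    # Every result must carry a valid signature over (prev, key, next).
    for i, sig in enumerate(signatures, start=1):
        if pow(sig, E, N) != digest(window[i - 1], window[i], window[i + 1]):
            return False
    return True
```

If the service provider drops a result from the middle of the window, the neighbours recorded in the adjacent signatures no longer match the returned sequence, so verification fails.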

Fig. 2 Merkle hash tree
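
A minimal sketch of a binary Merkle hash tree is given below. It proves membership of a single item rather than answering a range query as in Fig. 2. The root digest is assumed to be signed by the data owner and obtained by the client through a trusted channel; that step is omitted here. The function names, the `leaf:`/`node:` prefixes, and the rule of pairing an unpaired node with itself are choices made for this illustration.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(items):
    """Data owner: return all tree levels, from leaf digests up to the root."""
    levels = [[h(b"leaf:" + item) for item in items]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur = cur + [cur[-1]]     # an unpaired node is paired with itself
        levels.append(
            [h(b"node:" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)]
        )
    return levels


def prove(levels, index):
    """Service provider: sibling digests on the path from a leaf to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index           # matches the self-pairing rule above
        # Record the digest and whether it sits to the left of our node.
        proof.append((level[sibling], sibling < index))
        index //= 2
    return proof


def verify(item, proof, root):
    """Client: recompute the root digest from the item and the proof."""
    node = h(b"leaf:" + item)
    for sibling, sibling_is_left in proof:
        pair = sibling + node if sibling_is_left else node + sibling
        node = h(b"node:" + pair)
    return node == root
```

The proof contains one digest per tree level, so its size grows logarithmically with the number of items, in line with the goal of keeping verification cost independent of the database size.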

In comparison, general-purpose verifiable cloud computing schemes do not assume any specific properties of the computation task. Instead, the computation task is represented as a Boolean or arithmetic circuit, which can express any bounded computation and thus can be used for arbitrary cloud computation. As the Boolean or arithmetic circuit can be viewed as a series of constraints on the internal computation state as well as the final results, it can be transformed into a so-called quadratic span program. By utilizing certain cryptographic primitives such as pairings, the equivalent program can be verified using a technique known as zk-SNARKs (Parno et al., 2013). In addition to being able to authenticate arbitrary programs, zk-SNARKs achieve extremely low verification overhead: the verification time and proof size are both constant. Further, the proof leaks no information beyond the computation result, which is crucial for applications with privacy and confidentiality concerns. However, as the cost of this generality, zk-SNARKs incur high overhead in preprocessing time and proving time, and are therefore considered impractical for many real-world cloud computing problems.

Nevertheless, many studies have been carried out to reduce this overhead. Ben-Sasson et al. (2014) propose a zk-SNARK variant that avoids hard-coding the computation program into its verification key and thus reduces the preprocessing cost. Zhang et al. (2017a) propose an interactive protocol for general-purpose SQL queries, whose cost is substantially lower than that of the original zk-SNARKs scheme. More recently, it has been proposed to model the computation task as a random access machine (RAM) program as opposed to a circuit (Braun et al., 2013; Ben-Sasson et al., 2013; Zhang et al., 2018). This can yield significant performance improvements for some computation tasks. For example, the size of a circuit implementing binary search on a sorted array is linear in the length of the array, whereas the complexity of a RAM program for binary search is only logarithmic.
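
The gap between the two models can be made concrete by counting memory reads. The sketch below is an analogy only, with hypothetical function names. `ram_binary_search` reads just the cells it needs. `circuit_style_search` mimics a naive circuit, whose wiring is fixed before the input is known and which therefore has to touch every array position.

```python
def ram_binary_search(arr, target):
    """RAM-style search: returns (index or -1, number of cells read)."""
    lo, hi, reads = 0, len(arr) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        reads += 1
        value = arr[mid]
        if value == target:
            return mid, reads
        if value < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1, reads


def circuit_style_search(arr, target):
    """Data-independent search: every cell is read, whatever the input."""
    found, reads = -1, 0
    for i, value in enumerate(arr):
        reads += 1
        # Select without branching on control flow, as a circuit gate would.
        found = i if value == target else found
    return found, reads
```

On a sorted array of 1024 elements, the RAM-style search reads at most 11 cells, while the circuit-style search always reads all 1024.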

Key Applications

Verifiable cloud computing is essential in any cloud computing environment with security and trust concerns. For example, it is imperative in scenarios where business intelligence executives make critical, million-dollar decisions, such as investing in new businesses, based on OLAP queries run in the cloud. Similar requirements also exist in many other applications such as scientific research and government policy making.

References

  1. Ben-Sasson E, Chiesa A, Genkin D, Tromer E, Virza M (2013) SNARKs for C: verifying program executions succinctly and in zero knowledge. In: Canetti R, Garay JA (eds) Advances in cryptology – CRYPTO 2013, pp 90–108
  2. Ben-Sasson E, Chiesa A, Tromer E, Virza M (2014) Succinct non-interactive zero knowledge for a von Neumann architecture. In: Proceedings of the 23rd USENIX conference on security symposium, pp 781–796
  3. Braun B, Feldman AJ, Ren Z, Setty S, Blumberg AJ, Walfish M (2013) Verifying computations with state. In: Proceedings of the twenty-fourth ACM symposium on operating systems principles, pp 341–357
  4. Canetti R, Paneth O, Papadopoulos D, Triandopoulos N (2014) Verifiable set operations over outsourced databases. In: Public-key cryptography – PKC, pp 113–130
  5. Chen Q, Hu H, Xu J (2013) Authenticating top-k queries in location-based services with confidentiality. Proc VLDB Endowment 7(1):49–60
  6. Chen Q, Hu H, Xu J (2015) Authenticated online data integration services. In: Proceedings of the 2015 ACM SIGMOD international conference on management of data, pp 167–181
  7. Haeberlen A, Kouznetsov P, Druschel P (2007) PeerReview: practical accountability for distributed systems. ACM SIGOPS Oper Syst Rev 41(6):175–188
  8. Hu H, Xu J, Chen Q, Yang Z (2012) Authenticating location-based services without compromising location privacy. In: Proceedings of the 2012 ACM SIGMOD international conference on management of data, pp 301–312
  9. Hu H, Chen Q, Xu J (2013) VERDICT: privacy-preserving authentication of range queries in location-based services. In: 2013 IEEE 29th international conference on data engineering, pp 1312–1315
  10. Li F, Hadjieleftheriou M, Kollios G, Reyzin L (2006) Dynamic authenticated index structures for outsourced databases. In: Proceedings of the 2006 ACM SIGMOD international conference on management of data, pp 121–132
  11. Merkle RC (1989) A certified digital signature. In: Advances in cryptology – CRYPTO, pp 218–238
  12. Pang H, Mouratidis K (2008) Authenticating the query results of text search engines. In: Proceedings of the VLDB endowment, pp 126–137
  13. Pang H, Tan KL (2004) Authenticating query results in edge computing. In: Proceedings of the 20th international conference on data engineering, pp 560–571
  14. Papadopoulos D, Papadopoulos S, Triandopoulos N (2014) Taking authenticated range queries to arbitrary dimensions. In: Proceedings of the 2014 ACM SIGSAC conference on computer and communications security, pp 819–830
  15. Papamanthou C, Tamassia R, Triandopoulos N (2011) Optimal verification of operations on dynamic sets. In: Advances in cryptology – CRYPTO, pp 91–110
  16. Parno B, Howell J, Gentry C, Raykova M (2013) Pinocchio: nearly practical verifiable computation. In: 2013 IEEE symposium on security and privacy (SP), pp 238–252
  17. Peng Y, Fan Z, Choi B, Xu J, Bhowmick SS (2015) Authenticated subgraph similarity search in outsourced graph databases. IEEE Trans Knowl Data Eng 27(7):1838–1860
  18. Xu C, Chen Q, Hu H, Xu J, Hei X (2018a) Authenticating aggregate queries over set-valued data with confidentiality. IEEE Trans Knowl Data Eng 30:630–644
  19. Xu C, Xu J, Hu H, Au MH (2018b) When query authentication meets fine-grained access control: a zero-knowledge approach. In: Proceedings of the 2018 ACM SIGMOD international conference on management of data, pp 147–162
  20. Yang Y, Papadias D, Papadopoulos S, Kalnis P (2009a) Authenticated join processing in outsourced databases. In: Proceedings of the 2009 ACM SIGMOD international conference on management of data, pp 5–18
  21. Yang Y, Papadopoulos S, Papadias D, Kollios G (2009b) Authenticated indexing for outsourced spatial databases. VLDB J 18(3):631–648
  22. Yiu ML, Lo E, Yung D (2011) Authentication of moving kNN queries. In: Proceedings of the 27th IEEE international conference on data engineering, pp 565–576
  23. Zhang Y, Katz J, Papamanthou C (2015) IntegriDB: verifiable SQL for outsourced databases. In: Proceedings of the 22nd ACM SIGSAC conference on computer and communications security, pp 1480–1491
  24. Zhang Y, Genkin D, Katz J, Papadopoulos D, Papamanthou C (2017a) vSQL: verifying arbitrary SQL queries over dynamic outsourced databases. In: IEEE symposium on security and privacy, pp 863–880
  25. Zhang Y, Katz J, Papamanthou C (2017b) An expressive (zero-knowledge) set accumulator. In: IEEE European symposium on security and privacy (EuroS&P), pp 158–173
  26. Zhang Y, Genkin D, Katz J, Papadopoulos D, Papamanthou C (2018) vRAM: faster verifiable RAM with program-independent preprocessing. In: IEEE symposium on security and privacy

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Hong Kong Baptist University, Kowloon Tong, Hong Kong

Section editors and affiliations

  • Kui Ren
