LineageChain: a fine-grained, secure and efficient data provenance system for blockchains

  • Special Issue Paper
  • The VLDB Journal

Abstract

The success of Bitcoin and other cryptocurrencies is drawing significant interest to blockchains. A blockchain system implements a tamper-evident ledger for recording transactions that modify some global states. The system captures the entire evolution history of the states. The management of that history, also known as data provenance or lineage, has been studied extensively in database systems. However, querying data history in existing blockchains can only be done by replaying all transactions. This approach is applicable to large-scale, offline analysis, but is not suitable for online transaction processing. In this paper, we identify a new class of blockchain applications whose execution logic depends on provenance information at runtime. We first motivate the need for adding native provenance support to blockchains. We then present LineageChain, a fine-grained, secure and efficient provenance system for blockchains. LineageChain exposes lineage information to the smart contract runtime via simple interfaces that efficiently and securely support provenance-dependent contracts. LineageChain captures provenance during contract execution and stores it in a Merkle tree. LineageChain provides a novel skip list index designed for efficient provenance queries. We have implemented LineageChain on top of Fabric and a blockchain-optimized storage system called ForkBase. Our extensive evaluation of LineageChain demonstrates its benefits to the new class of blockchain applications, its high query performance and its small storage overhead.
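To make the contrast between replaying transactions and querying an index concrete, the following is a minimal illustrative sketch, not LineageChain's actual design, which this abstract does not specify. It assumes a single state key whose versions are written at strictly increasing block numbers, and it uses skip-list-style back pointers at power-of-two distances as a stand-in for the index. All names (`Version`, `VersionHistory`, `value_at`, `replay_value_at`) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional, Tuple


@dataclass
class Version:
    block: int                 # block in which this version was written
    value: Any
    # back[i] is the index of the version 2**i positions earlier (assumed layout)
    back: List[int] = field(default_factory=list)


class VersionHistory:
    """Versions of one state key, oldest first (hypothetical structure)."""

    def __init__(self) -> None:
        self.versions: List[Version] = []

    def append(self, block: int, value: Any) -> None:
        if self.versions and block <= self.versions[-1].block:
            raise ValueError("block numbers must be strictly increasing")
        idx = len(self.versions)
        back, step = [], 1
        while idx - step >= 0:
            back.append(idx - step)
            step *= 2
        self.versions.append(Version(block, value, back))

    def value_at(self, block: int) -> Optional[Any]:
        """Return the value visible at `block`, or None if the key did not exist yet."""
        if not self.versions:
            return None
        cur = len(self.versions) - 1
        while self.versions[cur].block > block:
            nxt = None
            # Take the longest jump that still lands on a too-new version.
            for target in reversed(self.versions[cur].back):
                if self.versions[target].block > block:
                    nxt = target
                    break
            if nxt is None:
                # No older pointer is too new, so the nearest one is the answer.
                back = self.versions[cur].back
                return self.versions[back[0]].value if back else None
            cur = nxt
        return self.versions[cur].value


def replay_value_at(log: List[Tuple[int, str, Any]], key: str, block: int) -> Optional[Any]:
    """Baseline: scan the block-ordered transaction log up to `block`."""
    value = None
    for blk, k, v in log:
        if blk > block:
            break
        if k == key:
            value = v
    return value
```

In this sketch the replay baseline touches every logged write up to the queried block, while the indexed lookup follows a number of pointers that grows logarithmically with the number of versions of the key. The sketch also omits the Merkle tree that would make the history tamper-evident.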

Acknowledgements

This research is supported by Singapore Ministry of Education Academic Research Fund Tier 3 under MOE’s official grant number MOE2017-T3-1-007.

Author information

Corresponding author

Correspondence to Pingcheng Ruan.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ruan, P., Dinh, T.T.A., Lin, Q. et al. LineageChain: a fine-grained, secure and efficient data provenance system for blockchains. The VLDB Journal 30, 3–24 (2021). https://doi.org/10.1007/s00778-020-00646-1

  • DOI: https://doi.org/10.1007/s00778-020-00646-1
