Privacy Centric Collaborative Machine Learning Model Training via Blockchain

  • Aman Ladia
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1010)


This paper tackles the issue of data siloing, where organisations are unable to share data with each other because of privacy concerns. Machine learning models, which could benefit greatly from larger data sets shared between organisations, suffer in this era of data isolation. To solve this problem, a blockchain-based implementation is proposed that allows machine learning models to be trained in a privacy-compliant way. Instead of using blockchain in a typical database-style manner, the proposed solution uses blockchain as a means to handle joint ownership and joint control over a computer system known as the Training Machine. The Training Machine, set up jointly by consortium members, serves as a secure, independent container that accepts data sets and an untrained model as inputs from different entities, trains the model internally, and outputs the trained model without revealing any data to other entities. The data is then deleted automatically. Blockchain ensures that this machine is not under the control of any one entity but is instead controlled transparently by all data-sharing parties. By placing sensitive information in an isolated system and establishing blockchain-based access control, the solution ensures that data is not accessible to any party other than its owner. The paper also shares use cases of this technology, along with a risk analysis and a proof of concept.
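The flow described above (members submit data, all members jointly approve a run, the machine trains internally, releases only the model, and deletes the data) can be illustrated with a minimal Python sketch. This is not the paper's implementation: the class `TrainingMachine`, its method names, and the member names are hypothetical, a hash-chained in-memory list stands in for the blockchain, and a one-feature least-squares fit stands in for the untrained model that the real system would accept as an input.

```python
import hashlib
import json


class TrainingMachine:
    """Toy stand-in for the jointly controlled Training Machine (hypothetical API)."""

    def __init__(self, members):
        self.members = set(members)
        self._data = {}          # member -> private (x, y) pairs, never returned
        self._approvals = set()  # members who approved the next training run
        self.ledger = []         # hash-chained log standing in for the blockchain

    def _record(self, member, action):
        # Each entry commits to the previous entry's hash, so edits are detectable.
        prev = self.ledger[-1]["hash"] if self.ledger else "0" * 64
        body = {"member": member, "action": action, "prev": prev}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.ledger.append({**body, "hash": digest})

    def _check_member(self, member):
        if member not in self.members:
            raise PermissionError(f"{member} is not a consortium member")

    def submit_data(self, member, pairs):
        self._check_member(member)
        self._data[member] = list(pairs)
        self._record(member, "submit_data")

    def approve_training(self, member):
        self._check_member(member)
        self._approvals.add(member)
        self._record(member, "approve_training")

    def train(self):
        # Joint control: every member must approve before training starts.
        if self._approvals != self.members:
            raise PermissionError("training requires approval from all members")
        pairs = [p for rows in self._data.values() for p in rows]
        if not pairs:
            raise ValueError("no data submitted")
        n = len(pairs)
        mean_x = sum(x for x, _ in pairs) / n
        mean_y = sum(y for _, y in pairs) / n
        sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
        sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
        slope = sxy / sxx
        intercept = mean_y - slope * mean_x
        # Delete the raw data and reset approvals once the model is trained.
        self._data.clear()
        self._approvals.clear()
        self._record("machine", "train_and_delete")
        # Only the fitted parameters leave the machine.
        return {"slope": slope, "intercept": intercept}


# Usage: two hypothetical members pool data without seeing each other's rows.
machine = TrainingMachine(["bank_a", "bank_b"])
machine.submit_data("bank_a", [(0.0, 1.0), (1.0, 3.0)])
machine.submit_data("bank_b", [(2.0, 5.0), (3.0, 7.0)])
machine.approve_training("bank_a")
machine.approve_training("bank_b")
model = machine.train()
print(model)
```

In this sketch the unanimous-approval check plays the role of blockchain-based access control, and the ledger gives every member the same auditable record of who did what. A real deployment would need an isolated execution environment and a distributed ledger, neither of which an in-process Python object provides.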


Private data sharing, Shared model training, Blockchain access control, Consortium data exchange, Deep learning training



Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Liquid Protocol, Mumbai, India
