Abstract
Machine learning has recently enabled large advances in artificial intelligence, but these results can be highly centralized. The large datasets required are generally proprietary; predictions are often sold on a per-query basis; and published models can quickly become out of date without effort to acquire more data and maintain them. Published proposals to provide models and data for free for certain tasks include Microsoft Research’s Decentralized and Collaborative AI on Blockchain. The framework allows participants to collaboratively build a dataset and use smart contracts to share a continuously updated model on a public blockchain. The initial proposal gave an overview of the framework but omitted many details of the models used and of how the incentive mechanisms behave in real-world scenarios. For example, with the Self-Assessment incentive mechanism proposed in that work, participants could lose deposits and the model could become inaccurate over time if the parameters are not set properly when the framework is configured. In this work, we evaluate several models and configurations in order to propose best practices for the Self-Assessment incentive mechanism, so that models remain accurate and well-intentioned participants who submit correct data have the chance to profit. We analyze simulations of three models (Perceptron, Naïve Bayes, and a Nearest Centroid Classifier) on three datasets: predicting a sport from user activity recorded by Endomondo, sentiment analysis of movie reviews from IMDB, and determining whether a news article is fake. For each dataset, we compare several factors when models are hosted in smart contracts on a public blockchain: accuracy over time, the balances of a good and a bad user, and transaction costs (or gas) for deploying, updating, collecting refunds, and collecting rewards.
A free and open source implementation for the Ethereum blockchain of these models is provided at https://github.com/microsoft/0xDeCA10B.
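As an illustration of why a model such as the Nearest Centroid Classifier suits continuous updating, the following minimal Python sketch shows a classifier that learns one sample at a time by maintaining a running mean per class. It is not taken from the 0xDeCA10B repository: the class name `NearestCentroid`, its methods, and the use of squared Euclidean distance are assumptions made for this example, and the sketch ignores deposits, refunds, and gas entirely.

```python
from typing import Dict, List


class NearestCentroid:
    """Toy nearest centroid classifier updated one sample at a time.

    Hypothetical illustration only; not the implementation used in the paper.
    """

    def __init__(self) -> None:
        self.centroids: Dict[int, List[float]] = {}  # class label -> centroid
        self.counts: Dict[int, int] = {}             # class label -> samples seen

    def update(self, x: List[float], label: int) -> None:
        """Fold one labeled sample into the centroid of its class."""
        if label not in self.centroids:
            # First sample of a class becomes its centroid.
            self.centroids[label] = list(x)
            self.counts[label] = 1
            return
        self.counts[label] += 1
        n = self.counts[label]
        centroid = self.centroids[label]
        # Running mean: c <- c + (x - c) / n, so no past samples are stored.
        for i, value in enumerate(x):
            centroid[i] += (value - centroid[i]) / n

    def predict(self, x: List[float]) -> int:
        """Return the label whose centroid is closest to x."""
        def squared_distance(centroid: List[float]) -> float:
            return sum((a - b) ** 2 for a, b in zip(x, centroid))

        return min(self.centroids,
                   key=lambda label: squared_distance(self.centroids[label]))
```

Each update touches only one centroid and one counter, so the work per submitted sample grows with the number of features rather than with the size of the dataset. That property is one plausible reason such a model is a candidate for hosting in a smart contract, although actual gas costs depend on the contract implementation.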
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Harris, J.D. (2020). Analysis of Models for Decentralized and Collaborative AI on Blockchain. In: Chen, Z., Cui, L., Palanisamy, B., Zhang, LJ. (eds) Blockchain – ICBC 2020. ICBC 2020. Lecture Notes in Computer Science, vol 12404. Springer, Cham. https://doi.org/10.1007/978-3-030-59638-5_10
DOI: https://doi.org/10.1007/978-3-030-59638-5_10
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59637-8
Online ISBN: 978-3-030-59638-5
eBook Packages: Computer Science, Computer Science (R0)