A Dynamic Pricing Mechanism in IoT for DaaS: A Reinforcement Learning Approach
With the rapid development of the Internet of Things (IoT), a large amount of data has been accumulated, and how to make full use of these data has become a new problem. In this article, we focus on how to exploit data resources using the smart data pricing (SDP) approach. We establish a B2B data marketplace for integrating, storing, and analyzing business data, and we simulate the interactions between service providers and enterprises in the marketplace. Since the service provider's decision process has the Markov property, the Q-learning algorithm is adopted to solve the model. Experimental results show that the Q-learning algorithm allows every participant in the market to obtain the optimal profit.
Keywords: Dynamic pricing, Markov decision process, Q-learning
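To make the abstract's approach more concrete, the following is a minimal sketch of tabular Q-learning applied to a toy pricing problem. It is not the model from the article: the price levels, the two demand states, the acceptance probability, and the names `PRICES`, `STATES`, `step`, `q_learning`, and `greedy_prices` are all assumptions introduced here for illustration only.

```python
import random

# Hypothetical setup: three price levels (actions) and two demand states.
PRICES = [1.0, 2.0, 3.0]
STATES = [0, 1]  # 0 = low demand, 1 = high demand


def step(state, action, rng):
    """Toy market transition (illustrative assumption, not the paper's model)."""
    price = PRICES[action]
    # The buyer accepts less often at higher prices and more often in high demand.
    accept_prob = max(0.0, min(1.0, 0.9 - 0.25 * price + 0.3 * state))
    reward = price if rng.random() < accept_prob else 0.0
    # Demand follows a simple two-state Markov chain that tends to stay put.
    next_state = state if rng.random() < 0.7 else 1 - state
    return reward, next_state


def q_learning(steps=20000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Learn a Q-table with epsilon-greedy exploration over one long trajectory."""
    rng = random.Random(seed)
    q = [[0.0 for _ in PRICES] for _ in STATES]
    state = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(PRICES))  # explore
        else:
            action = max(range(len(PRICES)), key=lambda a: q[state][a])  # exploit
        reward, next_state = step(state, action, rng)
        # Standard Q-learning update toward the one-step bootstrapped target.
        td_target = reward + gamma * max(q[next_state])
        q[state][action] += alpha * (td_target - q[state][action])
        state = next_state
    return q


def greedy_prices(q):
    """Return the price with the highest learned Q-value in each demand state."""
    return [PRICES[max(range(len(PRICES)), key=lambda a: q[s][a])] for s in STATES]


if __name__ == "__main__":
    table = q_learning()
    print("Greedy price per demand state:", greedy_prices(table))
```

In this sketch the provider is the only learning agent and the buyer is reduced to a fixed acceptance probability; a marketplace with several interacting participants, as described in the abstract, would need a richer environment than this.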
This work is supported by the National Key Research and Development Program of China (2016YFB1001100).