A Dynamic Pricing Mechanism in IoT for DaaS: A Reinforcement Learning Approach

  • Binpeng Song
  • Jinze Song
  • Jian Ye
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1075)

Abstract

With the rapid development of the Internet of Things, a large amount of data has accumulated, but how to make full use of these data remains an open problem. In this article, we focus on developing data resources through a smart data pricing (SDP) approach. We establish a B2B data marketplace for integrating, storing, and analyzing business data, and simulate the interactions between service providers and enterprises in this marketplace. Since the service provider's pricing process has the Markov property, a Q-learning algorithm is adopted to solve the model. Experimental results show that the Q-learning algorithm enables every participant in the market to obtain the optimal profit.
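The pricing approach described above can be illustrated with a minimal Q-learning sketch. The discrete price set, the linear demand function, and the single-state formulation below are simplifying assumptions for illustration only, not the paper's actual market model; they show only the mechanics of epsilon-greedy action selection and the Q-value update.

```python
import random

# Illustrative single-state Q-learning pricing sketch.
# PRICES, the demand model, and all constants are assumptions.
PRICES = [1.0, 2.0, 3.0]            # candidate unit prices (actions)
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount, exploration rate

def demand(price):
    """Toy linear demand: a higher price sells fewer units."""
    return max(0.0, 10.0 - 3.0 * price)

def profit(price):
    """Reward signal: revenue under the toy demand curve."""
    return price * demand(price)

def train(episodes=5000, seed=0):
    rng = random.Random(seed)
    q = {a: 0.0 for a in range(len(PRICES))}  # Q-table (one state)
    for _ in range(episodes):
        # epsilon-greedy: explore with probability EPS, else exploit
        if rng.random() < EPS:
            a = rng.randrange(len(PRICES))
        else:
            a = max(q, key=q.get)
        r = profit(PRICES[a])
        # Q-learning update; with one state, the next-state max is max(q)
        q[a] += ALPHA * (r + GAMMA * max(q.values()) - q[a])
    return q

q = train()
best_price = PRICES[max(q, key=q.get)]
```

Under this toy demand curve the per-step profits are 7, 8, and 3 for the three prices, so the learned policy settles on the middle price. A multi-agent market as in the paper would maintain one such Q-table per provider, with the state encoding the competitors' prices.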

Keywords

Dynamic pricing · Markov decision process · Q-learning

Notes

Acknowledgment

This work is supported by the National Key Research and Development Program of China (2016YFB1001100).

Copyright information

© Springer Nature Switzerland AG 2020

Authors and Affiliations

  1. Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China
  2. University of Chinese Academy of Sciences, Beijing, China
  3. Northeast University, Shenyang, China
