Differentially Private Federated Combinatorial Bandits with Constraints

  • Conference paper
  • First Online:
Machine Learning and Knowledge Discovery in Databases (ECML PKDD 2022)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13716)

Abstract

There is rapid growth of the cooperative learning paradigm in online learning settings, i.e., federated learning (FL). Unlike most FL settings, agents in many situations are competitive. Each agent would like to learn from others, but the information it shares for others to learn from could be sensitive; thus, it desires privacy. This work investigates a group of agents working concurrently to solve similar combinatorial bandit problems while maintaining quality constraints. Can these agents collectively learn while keeping their sensitive information confidential by employing differential privacy? We observe that communicating can reduce regret. However, differential privacy techniques for protecting sensitive information make the data noisy and may degrade rather than improve regret. Hence, it is essential to decide when to communicate and what shared data to learn from in order to strike a functional balance between regret and privacy. For such a federated combinatorial MAB setting, we propose a Privacy-preserving Federated Combinatorial Bandit algorithm, P-FCB. We illustrate the efficacy of P-FCB through simulations and further show that it improves regret while upholding the quality threshold and meaningful privacy guarantees.
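The abstract describes agents noising the data they share so that communication is differentially private. A standard way to achieve this is the Laplace mechanism applied to per-arm counts before sharing; the sketch below illustrates that idea. The function name, the clipping step, and the unit-sensitivity argument are illustrative assumptions, not the paper's P-FCB implementation.

```python
import numpy as np

def privatize_estimates(successes, pulls, epsilon, rng=None):
    """Add Laplace noise to per-arm success counts before sharing them.

    Changing one observation changes a count by at most 1, so noise of
    scale 1/epsilon gives epsilon-differential privacy per shared update.
    """
    rng = rng or np.random.default_rng()
    noisy = successes + rng.laplace(scale=1.0 / epsilon, size=len(successes))
    # Clip so downstream mean-quality estimates stay in a feasible range.
    return np.clip(noisy, 0.0, pulls)

# Hypothetical agent state: 3 arms, 50 pulls each.
successes = np.array([40.0, 25.0, 10.0])
pulls = np.array([50.0, 50.0, 50.0])
noisy = privatize_estimates(successes, pulls, epsilon=1.0)
noisy_means = noisy / pulls  # what other agents would learn from
```

A smaller epsilon means stronger privacy but noisier shared estimates, which is the regret-versus-privacy tension the abstract highlights.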


Notes

  1. Regret is the deviation of the utility gained while engaging in learning from the utility gained if the mean qualities were known.
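As a small numeric illustration of this definition (the arm means and pull counts below are assumed for illustration, not taken from the paper): an oracle that knows the mean qualities always plays the best arm, and regret is the gap between its utility and the learner's expected utility.

```python
# Hypothetical illustration of the regret definition above.
mean_qualities = [0.9, 0.6, 0.5]   # assumed true mean qualities per arm
pulls = [60, 25, 15]               # how often a learner chose each arm

oracle_utility = max(mean_qualities) * sum(pulls)            # always best arm
expected_utility = sum(m * n for m, n in zip(mean_qualities, pulls))
regret = oracle_utility - expected_utility
# 0.9 * 100 - (54 + 15 + 7.5) = 90 - 76.5 = 13.5
```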


Author information


Correspondence to Sambhav Solanki.


Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Solanki, S., Kanaparthy, S., Damle, S., Gujar, S. (2023). Differentially Private Federated Combinatorial Bandits with Constraints. In: Amini, MR., Canu, S., Fischer, A., Guns, T., Kralj Novak, P., Tsoumakas, G. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2022. Lecture Notes in Computer Science(), vol 13716. Springer, Cham. https://doi.org/10.1007/978-3-031-26412-2_38


  • DOI: https://doi.org/10.1007/978-3-031-26412-2_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-26411-5

  • Online ISBN: 978-3-031-26412-2

  • eBook Packages: Computer Science (R0)
