
Research on Selective Combination of Distributed Machine Learning Models

Abstract

We have previously proposed a distributed platform for machine learning that does not require data accumulation. The method constructs feature models, called fog models, from distributed data and combines them to achieve the same level of performance as conventional methods based on accumulated data. In this paper, we propose and compare three methods for selecting which combinable fog models to merge: sequential model selection, a conventional method that selects fog models in order of node ID; adaptive selection, a method that selects only models that improve task performance; and similar model selection, a method that selects models similar to the current model. Each method prioritizes different criteria when selecting models. As an evaluation, the proposed methods are compared with the method of our previous research by simulation. Compared with the previous method, the proposed methods achieve greater performance improvements on the processing tasks with fewer combinations. They thus enable more efficient combination of fog models with fewer models than previous research, and users can adaptively select fog models either to improve overall performance or to prioritize the performance of specific features by selecting the target models.
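The three selection strategies described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the model representation, the `evaluate` scoring function, and the `similarity` measure are all hypothetical placeholders standing in for whatever task-performance metric and model-similarity measure a deployment would use.

```python
# Hedged sketch of the three fog-model selection strategies from the abstract.
# All names (node_id, evaluate, similarity) are illustrative assumptions.

def sequential_selection(models):
    """Baseline: combine fog models in ascending order of node ID."""
    return sorted(models, key=lambda m: m["node_id"])

def adaptive_selection(models, evaluate, base_score):
    """Keep only models whose addition improves task performance.

    `evaluate` scores a candidate combination; a model is accepted
    only if adding it raises the score over the current best.
    """
    selected, score = [], base_score
    for m in models:
        new_score = evaluate(selected + [m])
        if new_score > score:  # accept only improving models
            selected.append(m)
            score = new_score
    return selected

def similar_selection(models, current, similarity, k=3):
    """Pick the k candidate models most similar to the current model."""
    return sorted(models, key=lambda m: similarity(current, m), reverse=True)[:k]
```

Sequential selection ignores model quality entirely, adaptive selection trades extra evaluation calls for fewer combined models, and similar selection bounds the combination size at `k` without any task-performance evaluation.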


References

  1. Tsuchiya T, Mochizuki R, Hirose HY, Koyanagi TK, Quang TM. Distributed data platform for machine learning using the fog computing model. SN Comput Sci. 2020;1:164.


  2. Bonomi F, Milito R, Natarajan P, Zhu J. Fog computing: a platform for internet of things and analytics, big data and internet of things: a roadmap for smart environments. Berlin: Springer; 2014. p. 169–86.


  3. Laperdrix P, Bielova N, Baudry B, Avoine G. Browser fingerprinting: a survey. CoRR arXiv:1905.01051 (2019).

  4. WICG. Federated Learning of Cohorts (FLoC). https://github.com/WICG/floc.

  5. McMahan HB, Moore E, Ramage D, Hampson S, Agüera y Arcas B. Communication-efficient learning of deep networks from decentralized data. In: Proceedings of the 20th international conference on artificial intelligence and statistics, PMLR vol 54; 2017. p. 1273–82.

  6. Devlin J, Chang M-W, Lee K, Toutanova K. BERT: pre-training of deep bidirectional transformers for language understanding. CoRR arXiv:1810.04805 (2018).

  7. Bonomi F, Milito R, Natarajan P, Zhu J, Bessis N, Dobre C. Fog computing: a platform for internet of things and analytics, big data and internet of things: a roadmap for smart environments. Berlin: Springer International Publishing; 2014. p. 169–86.


  8. Zhou X, Huang S, Zheng Z. RPD: a distance function between word embeddings. In: Proceedings of the 58th annual meeting of the Association for Computational Linguistics; 2020. p. 42–50.

  9. Agirre E, Alfonseca E, Hall K, Kravalova J, Paşca M, Soroa A. A study on similarity and relatedness using distributional and wordnet-based approaches. In: Proceedings of human language technologies: the North American chapter of the association for computational linguistics, Boulder; 2009. p. 19–27.

  10. Wikimedia Downloads. https://dumps.wikimedia.org/. Accessed 25 Mar 2022.

  11. Mikolov T, Chen K, Corrado G, Dean J. Efficient estimation of word representations in vector space. CoRR arXiv:1301.3781 (2013).

  12. Node.js. https://nodejs.org/en/. Accessed 25 Mar 2022.

  13. Python. https://www.python.org/. Accessed 25 Mar 2022.

  14. Gensim. https://radimrehurek.com/gensim/. Accessed 25 Mar 2022.

  15. Erickson N, Mueller J, Shirkov A, Zhang H, Larroy P, Li M, Smola A. AutoGluon-tabular: robust and accurate AutoML for structured data. CoRR arXiv:2003.06505 (2020).


Acknowledgements

This research was partially supported by the Ministry of Education, Science, Sports and Culture, Grant-in-Aid for Scientific Research (C), 2021–2023 (21K11850), Takeshi Tsuchiya.


Corresponding author

Correspondence to Takeshi Tsuchiya.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the topical collection “Future Data and Security Engineering 2022” guest edited by Tran Khanh Dang.


About this article


Cite this article

Tsuchiya, T., Mochizuki, R., Hirose, H. et al. Research on Selective Combination of Distributed Machine Learning Models. SN COMPUT. SCI. 3, 438 (2022). https://doi.org/10.1007/s42979-022-01312-9


Keywords

  • Distributed machine learning
  • Learning model combination
  • Fog computing