Utilizing the Hive Mind – How to Manage Knowledge in Fully Distributed Environments

  • Conference paper
On the Move to Meaningful Internet Systems: OTM 2015 Conferences (OTM 2015)

Abstract

By 2020, the Internet of Things will consist of 26 billion connected devices. All of these devices will collect vast amounts of raw observations, for example GPS positions or communication patterns. To benefit from this enormous amount of information, machine learning algorithms are used to derive knowledge from the gathered observations. This benefit increases further if the devices can collaborate by sharing the knowledge they have gathered. In a massively distributed environment this is not an easy task, because the knowledge on each device can be very heterogeneous and based on a different number of observations made in diverse contexts. In this paper, we propose two strategies to route a query for specific knowledge to a device that can answer it with high confidence. To that end, we developed a confidence metric that takes the number and variance of a device's observations into account. Our routing strategies are based on local routing tables that can either be learned over time from previous queries or actively maintained by exchanging knowledge models. We evaluated both routing strategies on real-world and synthetic data. Our evaluations show that the knowledge retrieved by the presented approaches is up to 96.7% as accurate as the global optimum.
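
The abstract does not give the confidence metric or the routing rules, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a made-up score that grows with the number of observations and shrinks with their variance, and it illustrates only the actively maintained variant, in which each device records its direct neighbors' scores in a local routing table and forwards a query greedily. The names `Device`, `confidence`, `exchange_models`, and `route`, as well as the formula itself, are hypothetical.

```python
from dataclasses import dataclass, field
from statistics import pvariance


@dataclass
class Device:
    name: str
    observations: list = field(default_factory=list)
    neighbors: dict = field(default_factory=dict)      # neighbor name -> Device
    routing_table: dict = field(default_factory=dict)  # neighbor name -> known confidence


def confidence(observations):
    """Illustrative score (an assumption, not the paper's metric):
    rises with the number of observations, falls with their variance."""
    n = len(observations)
    if n < 2:
        return 0.0
    return (n / (n + 1.0)) / (1.0 + pvariance(observations))


def link(a, b):
    """Connect two devices as direct neighbors."""
    a.neighbors[b.name] = b
    b.neighbors[a.name] = a


def exchange_models(devices):
    """Actively maintained tables: each device stores its neighbors' scores."""
    for dev in devices:
        for name, neighbor in dev.neighbors.items():
            dev.routing_table[name] = confidence(neighbor.observations)


def route(start, max_hops=5):
    """Forward the query greedily to the neighbor with the highest known
    confidence until no neighbor beats the current device."""
    current = start
    for _ in range(max_hops):
        own = confidence(current.observations)
        known = {n: current.routing_table.get(n, 0.0) for n in current.neighbors}
        if not known:
            break
        best = max(known, key=known.get)
        if known[best] <= own:
            break
        current = current.neighbors[best]
    return current, confidence(current.observations)
```

In a line topology a, b, c where c holds the most observations with the least variance, a query issued at a would hop through b and be answered by c. A learned variant would instead fill `routing_table` from the confidence of answers to earlier queries; that is not shown here.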

Author information

Corresponding author

Correspondence to Thomas Bach.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Bach, T., Tariq, M.A., Mayer, C., Rothermel, K. (2015). Utilizing the Hive Mind – How to Manage Knowledge in Fully Distributed Environments. In: Debruyne, C., et al. On the Move to Meaningful Internet Systems: OTM 2015 Conferences. OTM 2015. Lecture Notes in Computer Science, vol 9415. Springer, Cham. https://doi.org/10.1007/978-3-319-26148-5_13

  • DOI: https://doi.org/10.1007/978-3-319-26148-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-26147-8

  • Online ISBN: 978-3-319-26148-5

  • eBook Packages: Computer Science, Computer Science (R0)
