
Utilizing the Hive Mind – How to Manage Knowledge in Fully Distributed Environments

  • Thomas Bach
  • Muhammad Adnan Tariq
  • Christian Mayer
  • Kurt Rothermel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9415)

Abstract

By 2020, the Internet of Things will consist of 26 billion connected devices. These devices will collect vast numbers of raw observations, for example, GPS positions or communication patterns. To benefit from this enormous amount of information, machine learning algorithms are used to derive knowledge from the gathered observations. This benefit can be increased further if the devices are enabled to collaborate by sharing the knowledge they have gathered. In a massively distributed environment, this is not an easy task, as the knowledge on each device can be very heterogeneous and based on a different number of observations in diverse contexts. In this paper, we propose two strategies to route a query for specific knowledge to a device that can answer it with high confidence. To that end, we developed a confidence metric that takes the number and variance of a device's observations into account. Our routing strategies are based on local routing tables that can either be learned from previous queries over time or actively maintained by exchanging knowledge models. We evaluated both routing strategies on real-world and synthetic data. Our evaluations show that the knowledge retrieved by the presented approaches is up to \(96.7\%\) as accurate as the global optimum.
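The abstract does not give the confidence metric or the routing-table layout, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a stand-in confidence score based on the standard error of the mean (more observations and lower variance yield higher confidence) and a hypothetical routing table that maps a query region to the best confidence seen so far for each neighbor. The names `confidence`, `learn`, and `route` are invented for this example.

```python
import math
from statistics import variance


def confidence(observations):
    """Stand-in confidence score in (0, 1]; NOT the metric from the paper.

    Uses the standard error of the mean, sqrt(s^2 / n), so the score rises
    with the number of observations and falls with their variance.
    """
    n = len(observations)
    if n < 2:
        return 0.0  # too few observations to estimate a variance
    std_err = math.sqrt(variance(observations) / n)
    return 1.0 / (1.0 + std_err)


def learn(routing_table, region, neighbor, observed_confidence):
    """Record the best confidence seen so far for a neighbor and region.

    Mimics a table learned from the answers to previous queries
    (hypothetical layout: region -> {neighbor: confidence}).
    """
    entry = routing_table.setdefault(region, {})
    entry[neighbor] = max(entry.get(neighbor, 0.0), observed_confidence)


def route(routing_table, region):
    """Return the neighbor with the highest recorded confidence, or None."""
    candidates = routing_table.get(region, {})
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Usage: two devices report answers for the same (hypothetical) region.
table = {}
learn(table, "cell_a", "device_1", confidence([5.0, 15.0]))
learn(table, "cell_a", "device_2", confidence([10.0, 10.1, 9.9, 10.0]))
next_hop = route(table, "cell_a")  # device_2: more observations, lower variance
```

In this sketch a device with many consistent observations outranks one with few, widely scattered observations, which is the behavior the abstract attributes to a confidence metric built on the number and variance of observations.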

Keywords

Knowledge retrieval; Distributed knowledge; Confidence-based indexing; Query routing


Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Thomas Bach (1)
  • Muhammad Adnan Tariq (1)
  • Christian Mayer (1)
  • Kurt Rothermel (1)
  1. Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany
