World Wide Web, Volume 18, Issue 6, pp 1717–1736

A formal method for rule analysis and validation in distributed data aggregation service

  • Vlad Serbanescu
  • Florin Pop
  • Valentin Cristea
  • Gabriel Antoniu


The usage of Cloud Services has increased rapidly in recent years. Data management systems, which sit behind any Cloud Service, are a major concern for scalability, flexibility and reliability because they are implemented in a distributed way. A Distributed Data Aggregation Service (DDAS) relying on a storage system meets these demands and serves as a repository back-end for complex analysis and automatic mining of any type of data. In this paper we continue our previous work on data management in Cloud storage. We present a formal approach to expressing retrieval and aggregation rules with a compact yet powerful tool called Rule Markup Language. Our extended solution proposes a standard form for schemes and uses the tool to match the rules to the XML form of the structured data in order to obtain the unstructured entries from the BlobSeer data storage system. This allows the DDAS to bypass several steps when processing a retrieval request. Our new architecture is more loosely coupled, with a separate module, the new tool, used for transforming the XML entries into standard XML files that represent the final result. We model the dynamic behavior of the system using this new standard to ensure a simpler and more efficient representation of the operations performed by the client while maintaining the constraints imposed by a distributed system running in the Cloud. Furthermore, we prove that this method correctly performs the translation between the storage model’s unstructured view of data and the client’s structured objects.


Keywords: Data aggregation, Data management, Cloud storage, Intelligent cloud services, Distributed services, Formal methods, Rule markup language



Copyright information

© Springer Science+Business Media New York 2015

Authors and Affiliations

  1. Formal Methods Department, Centrum Wiskunde and Informatica, Amsterdam, Netherlands
  2. Computer Science Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, Bucharest, Romania
  3. INRIA Rennes-Bretagne Atlantique, France
