Abstract
This paper describes the achievements of the H2020 project INDIGO-DataCloud. The project has provided e-infrastructures with tools, applications and cloud framework enhancements to manage the demanding requirements of scientific communities, either locally or through enhanced interfaces. The middleware developed makes it possible to federate hybrid resources and to easily write, port and run scientific applications on the cloud. In particular, we have extended existing PaaS (Platform as a Service) solutions, allowing public and private e-infrastructures, including those provided by EGI, EUDAT, and Helix Nebula, to integrate their existing services and make them available through AAI services compliant with GEANT interfederation policies, thus guaranteeing transparency and trust in the provisioning of such services. Our middleware facilitates the execution of applications using containers on Cloud and Grid based infrastructures, as well as on HPC clusters. Our developments are freely downloadable as open source components, and are already being integrated into many scientific applications.
References
García, A.L., Castillo, E.F.-d., Puel, M.: Identity federation with VOMS in cloud infrastructures. In: 2013 IEEE 5th International Conference on Cloud Computing Technology and Science, pp 42–48 (2013)
Chadwick, D.W., Siu, K., Lee, C., Fouillat, Y., Germonville, D.: Adding federated identity management to OpenStack. Journal of Grid Computing 12(1), 3–27 (2014)
Craig, A.L.: A design space review for general federation management using keystone. In: Proceedings of the 2014 IEEE/ACM 7th International Conference on Utility and Cloud Computing, pp 720–725. IEEE Computer Society (2014)
Pustchi, N., Krishnan, R., Sandhu, R.: Authorization federation in iaas multi cloud. In: Proceedings of the 3rd International Workshop on Security in Cloud Computing, pp 63–71. ACM (2015)
Lee, C.A., Desai, N., Brethorst, A.: A Keystone-Based Virtual Organization Management System. In: 2014 IEEE 6th International Conference on Cloud Computing Technology and Science (CloudCom), pp 727–730. IEEE (2014)
Castillo, E.F.-d., Scardaci, D., García, A.L.: The EGI Federated Cloud e-Infrastructure. Procedia Computer Science 68, 196–205 (2015)
AARC project: AARC Blueprint Architecture, see https://aarc-project.eu/architecture. Technical report (2016)
Oesterle, F., Ostermann, S., Prodan, R., Mayr, G.J.: Experiences with distributed computing for meteorological applications: grid computing and cloud computing. Geosci. Model Dev. 8(7), 2067–2078 (2015)
Plasencia, I.C., Castillo, E.F.-d., Heinemeyer, S., García, A.L., Pahlen, F., Borges, G.: Phenomenology tools on cloud infrastructures using OpenStack. The European Physical Journal C 73(4), 2375 (2013)
Boettiger, C.: An introduction to docker for reproducible research. ACM SIGOPS Operating Systems Review 49(1), 71–79 (2015)
Docker: http://www.docker.com (2013)
Gomes, J., Campos, I., Bagnaschi, E., David, M., Alves, L., Martins, J., Pina, J., Alvaro, L.-G., Orviz, P.: Enabling rootless Linux containers in multi-user environments: the udocker tool. Computer Physics Communications. https://doi.org/10.1016/j.cpc.2018.05.021 (2018)
Zhang, Z., Chuan, W., Cheung, D.W.L.: A survey on cloud interoperability taxonomies, standards, and practice. SIGMETRICS Perform. Eval. Rev. 40(4), 13–22 (2013)
Lorido-Botran, T., Miguel-Alonso, J., Lozano, J.A.: A Review of Auto-scaling Techniques for Elastic Applications in Cloud Environments. Journal of Grid Computing 12(4), 559–592 (2014)
Nyrén, R., Metsch, T., Edmonds, A., Papaspyrou, A.: Open Cloud Computing Interface–Core. Technical report, Open Grid Forum (2010)
Metsch, T., Edmonds, A.: Open Cloud Computing Interface-Infrastructure. Technical report, Open Grid Forum (2010)
Metsch, T., Edmonds, A.: Open Cloud Computing Interface-RESTful HTTP Rendering. Technical report, Open Grid Forum (2011)
(Ca Technologies) Lipton, P., (Ibm) Moser, S., (Vnomic) Palma, D., (Ibm) Spatzier, T.: Topology and Orchestration Specification for Cloud Applications. Technical report, OASIS Standard (2013)
Teckelmann, R., Reich, C., Sulistio, A.: Mapping of cloud standards to the taxonomy of interoperability in IaaS. In: Proceedings - 2011 3rd IEEE International Conference on Cloud Computing Technology and Science, CloudCom 2011, pp 522–526 (2011)
García, A.L., Castillo, E.F.-d., Fernández, P.O.: Standards for enabling heterogeneous IaaS cloud federations. Computer Standards & Interfaces 47, 19–23 (2016)
Caballer, M., Zala, S., García, A.L., Montó, G., Fernández, P.O., Velten, M.: Orchestrating complex application architectures in heterogeneous clouds. Journal of Grid Computing 16 (1), 3–18 (2018)
Hardt, M., Jejkal, T., Plasencia, I.C., Castillo, E.F.-d., Jackson, A., Weiland, M., Palak, B., Plociennik, M., Nielsson, D.: Transparent Access to Scientific and Commercial Clouds from the Kepler Workflow Engine. Computing and Informatics 31(1), 119 (2012)
Fakhfakh, F., Kacem, H.H., Kacem, A.H.: Workflow Scheduling in Cloud Computing: a Survey. In: IEEE 18th International Enterprise Distributed Object Computing Conference Workshops and Demonstrations (EDOCW), 2014, Vol. 71, pp. 372–378. Springer, New York (2014)
Stockton, D.B., Santamaria, F.: Automating NEURON simulation deployment in cloud resources. Neuroinformatics 15(1), 51–70 (2017)
Plóciennik, M., Fiore, S., Donvito, G., Owsiak, M., Fargetta, M., Barbera, R., Bruno, R., Giorgio, E., Williams, D.N., Aloisio, G.: Two-level Dynamic Workflow Orchestration in the INDIGO DataCloud for Large-scale, Climate Change Data Analytics Experiments. Procedia Computer Science 80, 722–733 (2016)
Moreno-Vozmediano, R., Montero, R.S., Llorente, I.M.: Multicloud deployment of computing clusters for loosely coupled mtc applications. IEEE transactions on parallel and distributed systems 22(6), 924–930 (2011)
Katsaros, G., Menzel, M., Lenk, A.: Cloud Service Orchestration with TOSCA, Chef and Openstack. In: Ic2e (2014)
Garcia, A.L., Zangrando, L., Sgaravatto, M., Llorens, V., Vallero, S., Zaccolo, V., Bagnasco, S., Taneja, S., Dal Pra, S., Salomoni, D., Donvito, G.: Improved Cloud resource allocation: how INDIGO-DataCloud is overcoming the current limitations in Cloud schedulers. J. Phys. Conf. Ser. 898(9), 92010 (2017)
Singh, S., Chana, I.: A survey on resource scheduling in cloud computing issues and challenges. Journal of Grid Computing, pp. 1–48 (2016)
García, A.L., Castillo, E.F.-d., Fernández, P.O., Plasencia, I.C., de Lucas, J.M.: Resource provisioning in Science Clouds: Requirements and challenges. Software: Practice and Experience 48(3), 486–498 (2018)
Chauhan, M.A., Babar, M.A., Benatallah, B.: Architecting cloud-enabled systems: a systematic survey of challenges and solutions. Software - Practice and Experience 47(4), 599–644 (2017)
Somasundaram, T.S., Govindarajan, K.: CLOUDRB A Framework for scheduling and managing High-Performance Computing (HPC) applications in science cloud. Futur. Gener. Comput. Syst. 34, 47–65 (2014)
Sotomayor, B., Keahey, K., Foster, I.: Overhead Matters: A Model for Virtual Resource Management. In: Proceedings of the 2nd International Workshop on Virtualization Technology in Distributed Computing SE - VTDC ’06, p 5. IEEE Computer Society, Washington (2006)
Manvi, S.S., Shyam, G.K.: Resource management for Infrastructure as a Service (IaaS) in cloud computing: A survey. J. Netw. Comput. Appl. 41, 424–440 (2014)
INDIGO-DataCloud consortium: Initial requirements from research communities - d2.1, see https://www.indigo-datacloud.eu/documents/initial-requirements-research-communities-d21. Technical report (2015)
European Open Science Cloud: https://ec.europa.eu/research/openscience (2015)
Proot: https://proot-me.github.io/ (2014)
Runc: https://github.com/opencontainers/runc (2016)
Fakechroot: https://github.com/dex4er/fakechroot (2015)
Pérez, A., Moltó, G., Caballer, M., Calatrava, A.: Serverless computing for container-based architectures. Future Generation Computer Systems (2018)
de Vries, K.J.: Global fits of supersymmetric models after LHC run 1. PhD thesis, Imperial College London (2015)
Openstack: https://www.openstack.org/ (2015)
See http://argus-documentation.readthedocs.io/en/stable/argus_introduction.html (2017)
See https://en.wikipedia.org/wiki/xacml (2013)
See http://www.simplecloud.info (2014)
Opennebula: http://opennebula.org/ (2018)
Redhat openshift: http://www.opencityplatform.eu (2011)
The cloud foundry foundation: https://www.cloudfoundry.org/ (2015)
Caballer, M., Blanquer, I., Moltó, G., de Alfonso, C.: Dynamic management of virtual infrastructures. Journal of Grid Computing 13(1), 53–70 (2015)
See http://www.infoq.com/articles/scaling-docker-with-kubernetes (2014)
Prisma project: http://www.ponsmartcities-prisma.it/ (2010)
Opencity platform: http://www.opencityplatform.eu (2014)
Onedata: https://onedata.org/ (2018)
Dynafed: http://lcgdm.web.cern.ch/dynafed-dynamic-federation-project (2011)
Fts3: https://svnweb.cern.ch/trac/fts3 (2011)
Fernández, P.O., García, A.L., Duma, D.C., Donvito, G., David, M., Gomes, J.: A set of common software quality assurance baseline criteria for research projects, see http://hdl.handle.net/10261/160086. Technical report
Hüttermann, M.: DevOps for developers. Apress (2012)
EOSC-Hub: "Integrating and managing services for the European Open Science Cloud". Funded by the H2020 research and innovation programme under grant agreement No. 777536. See http://eosc-hub.eu (2018)
Apache License: https://www.apache.org/licenses/LICENSE-2.0 (2004)
INDIGO Package Repo: http://repo.indigo-datacloud.eu/ (2017)
INDIGO DockerHub: https://hub.docker.com/u/indigodatacloud/ (2015)
INDIGO GitBook: https://indigo-dc.gitbooks.io/indigo-datacloud-releases (2017)
Van Zundert, G.C., Bonvin, A.M.: Disvis: quantifying and visualizing the accessible interaction space of distance restrained biomolecular complexes. Bioinformatics 31(19), 3222–3224 (2015)
Van Zundert, G.C., Bonvin, A.M.: Fast and sensitive rigid–body fitting into cryo–em density maps with powerfit. AIMS Biophys. 2(0273), 73–87 (2015)
Acknowledgments
INDIGO-DataCloud has been funded by the European Commission H2020 research and innovation program under grant agreement RIA 653549.
Appendices
Appendix A: Contribution to Open Source Software Projects
The following software, developed in the framework of INDIGO-DataCloud, has been contributed upstream to the Open Source community.
- OpenStack (https://www.openstack.org)
  - Changes/contributions already merged upstream:
    - Nova Docker
    - Heat Translator (INDIGO-DataCloud is 3rd overall contributor and core developer)
    - TOSCA parser (INDIGO-DataCloud is 2nd overall contributor and core developer)
    - OpenID Connect CLI support
    - OOI: OCCI implementation for OpenStack
  - Changes/contributions under discussion to be merged upstream:
    - OpenStack Preemptible Instances support (extensions)
- OpenNebula
  - Changes/contributions already merged upstream:
    - ONEDock
- Changes/contributions already merged upstream for:
  - Infrastructure Manager (http://www.grycap.upv.es/im/index.php)
  - Onedata (https://onedata.org)
  - Apache Libcloud (https://github.com/apache/libcloud)
  - Kepler Workflow Manager (https://kepler-project.org/)
  - TOSCA adaptor for JSAGA (http://software.in2p3.fr/jsaga/dev/)
  - CDMI and QoS extensions for dCache (https://www.dcache.org)
  - Workflow interface extensions for Ophidia (http://ophidia.cmcc.it)
  - OpenID Connect Java implementation for dCache (https://www.dcache.org)
  - MitreID (https://mitreid.org/) and OpenID Connect (http://openid.net/connect/) libraries
  - FutureGateway (https://www.catania-science-gateways.it/)
Appendix B: Tools and Services Involved in the Software Lifecycle
Figure 14 showcases the tools and services used for the development and distribution of the INDIGO-DataCloud software:
- Project management service using openproject.org: it provides tools such as an issue tracker, a wiki, a placeholder for documents and a project management timeline.
- Source code is publicly available and hosted externally in GitHub repositories, which increases its visibility and simplifies the path to exploitation beyond the project lifetime. The INDIGO-DataCloud software is released under the Apache 2.0 software license [59].
- Continuous Integration service using Jenkins: a service to automate building, testing and, where applicable, packaging. Testing includes style compliance checks and an estimation of the unit and functional test coverage of the software components.
- Artifact repositories for RedHat and Debian packages [60] and for virtual (Docker) images [61].
- Code review service using GitHub: source code review is an integral part of the SQA, as it is the last step in the change verification process. This service facilitates the code review process by recording the comments and allowing the reviewer to verify the candidate change before it is merged into the production version.
- Issue tracking using GitHub Issues: a service to track issues, new features and bugs of INDIGO-DataCloud software components.
- Release notes, installation and configuration guides, and user and development manuals are made available on GitBook [62].
- Code metrics service using Grimoire: to collect and visualize several metrics about the software components.
- Integration infrastructure: this infrastructure is composed of computing resources that directly support the CI service.
- Testing infrastructure: this infrastructure aims to provide a stable environment where users can preview the software and services developed by INDIGO-DataCloud prior to their public release.
- Preview infrastructure: where the released artifacts are deployed and made available for testing and validation by the use cases.
Appendix C: DevOps Adoption from User Communities
The DisVis [63] and PowerFit [64] applications were integrated into the CI/CD pipeline described above. As can be seen in Fig. 15, with this pipeline in place the application developers were provided with both a means to validate the source code before merging and the creation of a new versioned Docker image, automatically available in the INDIGO-DataCloud catalogue for applications, i.e. DockerHub's indigodatacloudapps repository.
Once the application has been packaged as a Docker image and uploaded to the indigodatacloudapps repository, it is instantiated in a new container to be validated. The application is then executed and the results are compared with a set of reference outputs. This pipeline implementation thus goes a step further by testing the application execution for the latest available Docker image in the catalogue.
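The comparison against reference outputs can be illustrated with a minimal sketch. It assumes that the containerized run has already written plain-text result files to a local directory and that the reference outputs are stored under the same file names; the function names (`tokens_match`, `files_match`, `validate_run`) and the tolerance are hypothetical and are not taken from the INDIGO-DataCloud pipeline. Numeric tokens are compared within a relative tolerance, since floating-point results may differ slightly between runs.

```python
import math
from pathlib import Path


def tokens_match(a: str, b: str, rel_tol: float) -> bool:
    """Compare two tokens, numerically when both parse as floats (hypothetical helper)."""
    try:
        return math.isclose(float(a), float(b), rel_tol=rel_tol, abs_tol=1e-12)
    except ValueError:
        # At least one token is not a number: fall back to exact string equality.
        return a == b


def files_match(produced: Path, reference: Path, rel_tol: float = 1e-6) -> bool:
    """Return True when both files hold the same tokens, line by line, within tolerance."""
    prod_lines = produced.read_text().splitlines()
    ref_lines = reference.read_text().splitlines()
    if len(prod_lines) != len(ref_lines):
        return False
    for p_line, r_line in zip(prod_lines, ref_lines):
        p_tok, r_tok = p_line.split(), r_line.split()
        if len(p_tok) != len(r_tok):
            return False
        if not all(tokens_match(a, b, rel_tol) for a, b in zip(p_tok, r_tok)):
            return False
    return True


def validate_run(output_dir: Path, reference_dir: Path, rel_tol: float = 1e-6) -> list[str]:
    """List the reference files that are missing from, or differ in, output_dir."""
    failures = []
    for ref in sorted(reference_dir.iterdir()):
        if not ref.is_file():
            continue
        out = output_dir / ref.name
        if not out.is_file() or not files_match(out, ref, rel_tol):
            failures.append(ref.name)
    return failures
```

In a CI job, a non-empty list returned by `validate_run` would mark the validation stage as failed, so that an image whose results drift from the reference outputs is flagged.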
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.
About this article
Cite this article
Salomoni, D., Campos, I., Gaido, L. et al. INDIGO-DataCloud: a Platform to Facilitate Seamless Access to E-Infrastructures. J Grid Computing 16, 381–408 (2018). https://doi.org/10.1007/s10723-018-9453-3
DOI: https://doi.org/10.1007/s10723-018-9453-3
Keywords
- Cloud computing
- Platform as a service
- Containers
- Software management
- Advanced user interfaces
- Authorization and authentication