Towards a European Civil Protection e-Infrastructure

In 2001 the EU Commission endorsed the GMES concept, a joint initiative between EC and ESA aiming to establish a European capacity for Global Monitoring of Environment and Security, to gather and use all available data and information in support of sustainable development policies. The challenge for GMES is to use these services to enable decision makers to better anticipate or mitigate crisis situations and management issues related to the environment and security. The European Civil Protection (CP) was recognized as one of the most important GMES service categories (GMES 2001–2003). Therefore the realization of a Europe-wide cooperative platform for supporting GMES and CP applications should be considered a mid-term objective to enhance the CP emergency management.

Many CP and GMES applications require tight integration with operational and research infrastructures providing resources and knowledge useful in the full cycle of emergency management (forecasting, warning, management, assessment). Typically, these activities involve many different actors (civil protection systems, public authorities, local administrators, research agencies, etc.) that need to share information and services in a coordinated and effective way. The Grid paradigm is a recent approach to the problem of providing the coordinated sharing of resources (computing, storage, communication) needed by the so-called Virtual Organizations (VOs), along with a security infrastructure. Therefore the adoption of a Grid-based infrastructure seems a natural choice to start building a cooperative platform for supporting GMES and CP applications. However, these applications have specific needs. First of all, developing an advanced enabling infrastructure to facilitate the Earth system analysis required in complex CP applications implies moving from specific, monolithic (data-centric) systems towards independent, modular (service-oriented) information systems (Wolff and Nativi 2008). In fact, such an infrastructure must provide scientists (as knowledge providers) and decision makers with a persistent set of independent high-level services and information that can be integrated into a range of more complex analyses. Scientists and researchers interested in individual disciplines can focus on developing and publishing related services (e.g. models), while scientists, researchers and decision makers interested in multidisciplinary (system-level) phenomena can focus on integrating these services (Foster and Kesselman 2006).

Moreover, CP applications share most requirements with other ES-based applications (Dissemination and Exploitation of GRids in Earth sciencE 2009), and several of these requirements are not fully supported by current Grid infrastructures (CYCLOPS Technical Annex 2006):

  • Especially during emergency situations, these applications need access to research infrastructures to run models or search for information, but they need real-time (RT) or near-real-time (NRT) responses; often, they favour response time over accuracy.

  • During emergency situations, these applications benefit greatly from the possibility of controlling sensor networks and acquisition systems, and of modifying their acquisition strategy and the related processing chains.

  • Civil Protection and GMES need to share geo-spatial information that has specific characteristics. In particular, remotely sensed observations from new and advanced satellite sensors produce huge numbers of datasets that are frequently updated.

  • These systems often need to interact with military resources; accordingly, they have the strict data policy and security requirements typical of dual systems (civil/military).

  • These applications require image processing, from image acquisition by a sensor to the extraction of information for decision support. This distributed processing is likely to be performed on an interconnected set of platforms that may be located in different places.

Therefore, the existing Grid infrastructures need to be enhanced in order to fully support GMES and CP applications, providing them with the required resources in a complete and transparent way and allowing scientists and decision makers to concentrate on high-level service interconnection. In order to fulfil such an enhancement, advanced spatial information services play an important role. In fact, spatial information services can address the semantic mismatch in a permanent and modular way, providing the stable and independent functionalities required by GMES applications (e.g. value added processing and knowledge extraction services).

The CYCLOPS architecture

The FP6 Project CYCLOPS (CYberinfrastructure for CiviL protection Operative ProcedureS) is a Specific Support Action directed to the EGEE Project, aiming to bridge the existing gap between the GMES and Grid communities, with particular reference to Civil Protection applications. Besides the dissemination activities required to bring representatives of the Civil Protection and Grid communities into contact, one of the main objectives of CYCLOPS is the definition of research strategies and innovation guidelines towards the building of a future European Civil Protection e-Infrastructure. As a result of an analysis of Civil Protection systems and of Civil Protection procedure use cases, a general architecture for the European Civil Protection e-Infrastructure has been proposed.

Fig. 1 shows how an existing Grid platform (EGEE) can provide the coordinated sharing of the computing, storage and communication resources of the existing and enhanced Processing, Data Storage and Network Infrastructures of the Civil Protection agencies and research centers involved in emergency management procedures.

Fig. 1. The CYCLOPS interoperability framework

On top of the Grid platform, specific Spatial Data Infrastructures (SDI) can be implemented to build the infrastructure for Civil Protection applications. These are:

  1. Advanced Grid services (e.g. Quality of Service management, orchestration services, Knowledge Grid services, etc.).

  2. Geospatial Resource Services for geo-spatial information access and sharing.

These SDI services make up the so-called CYCLOPS Infrastructure, which provides domain-specific services suitable for developing Grid-enabled Civil Protection applications.

An Environmental Monitoring Resource Infrastructure provides services to interact, at different semantic levels, with sensors and acquisition systems, whether or not they are part of the Grid.

The applications are built using generic Business Logic (e.g. expert systems) and Presentation/Fruition (e.g. GIS, collaborative work environment) services according to a Service-Oriented Architecture (SOA) approach. Besides these services, an advanced Security Infrastructure provides the Security and Policy services required at each level for handling the complex data policies typical of the Civil Protection domain. These services make it possible to satisfy the strict requirements of integrity, confidentiality and data/services access control that Civil Protection applications impose on the enabling platform. Moreover, to complete their tasks, many applications should be able to interact with external infrastructures such as existing Grid platforms, security systems, e-Government infrastructures, and Spatial Data Infrastructures. Thus, an interoperability infrastructure completes the platform. Together, these high-level services and tools make up the CYCLOPS platform, a complete Grid-based platform supporting Civil Protection applications.

Basically, the proposed architecture integrates two main approaches: a Web Services (WS) based architecture at the upper layers, and a Grid-based architecture at the lower layers. The adoption of a WS approach at the upper layers makes it easy to integrate the CYCLOPS Platform with existing systems typically based on Web technologies (e.g. INSPIRE-compliant SDIs, e-Government and security infrastructures), while the Grid architecture at the lower layers makes possible the coordinated sharing of basic resources for heavy computation and the storage of large datasets.

Technological choices

Geospatial data sharing and heavy computational capabilities were identified as the main requirements for Civil Protection applications. Thus a deeper investigation of these topics has been performed to evaluate the state of the art in these fields and to define the details of the Grid Platform and the SDI service layer.

The Grid platform: gLite middleware

Europe’s flagship Research Infrastructure Grid project, EGEE (Enabling Grid for E-sciencE), has provided the world’s largest Grid infrastructure of its kind since 2004. It currently encompasses more than 90 partners from 45 countries, with roughly 240 resource centres providing a total of 50000 CPUs and 25 PetaBytes of data storage capability. More than 8000 active users of many scientific and industrial applications run on average about 100000 jobs/day, from several different disciplines like Astronomy, Astrophysics, Earth Sciences, Finance, Computational Chemistry, Engineering, Life Sciences, Condensed Matter and High Energy Physics.

The EGEE Grid infrastructure is based on a middleware stack called gLite, developed starting from two earlier pilot projects (EU-DataGrid and EU-DataTAG) and the LHC Computing Grid project (LCG). From the beginning, these projects focused on enabling grid computing for scientific “data intensive” applications, mostly in the fields of High Energy Physics, Earth Observations and Biomedicine. These applications have required the design of a grid platform able to process a large amount of distributed data, ranging from 0.1 to 10 PetaBytes per year, exploiting the sharing of network, storage and computing resources. The sharing of the resources is coordinated by adopting the concept of distributed, heterogeneous, dynamic, multi-institutional Virtual Organisations (VO), entities which span the traditional administrative and organisational domains. A VO is a set of individuals and/or institutions having access to computing resources for collaborative problem solving. Resource providers and consumers organised in VOs agree on policies that define clearly what is shared, who is allowed to share and the conditions under which sharing occurs. The gLite middleware is a set of software components providing services to discover, access, allocate and monitor shared resources, in a secure way and in accordance with well-defined policies. These services form an intermediate layer (middleware) between the physical resources and the applications.

The gLite middleware can be divided into high-level grid components, for scheduling and distributing the workload of computational jobs, executing complex workflows, and moving and replicating data; and foundation grid components, for accessing, controlling, discovering, monitoring and accounting computational and storage resources within a well-defined security model and infrastructure. The gLite architecture follows the Service Oriented Architecture (SOA) paradigm, simplifying interoperability among Grid services and allowing easier compliance with upcoming standards. The Grid services can in general be divided into four groups:

Security services (Grid Security Infrastructure, GSI) concern Authentication, Authorization and Auditing. An important role in the context of Authorization is played by the gLite Virtual Organisation Membership Service (VOMS), an attribute authority allowing fine-grained access control.

Job management services concern the execution and control of computational jobs for their whole lifetime throughout the Grid infrastructure. In the gLite terminology the Computing Element (CE) provides an interface to access and manage a computing resource, typically consisting of a batch queue of a cluster farm. The Workload Management System (WMS) provides a metascheduler which dispatches jobs to the available CEs best suited to run the user’s job according to its requirements and to well-defined VO-level and resource-level policies. Job status tracking during the job’s lifetime and after its end is performed by the Logging and Bookkeeping service (LB).

Data management services concern the access, transfer and cataloguing of data. The granularity of data in gLite is on the file level. The Storage Element (SE) provides an interface to a storage resource, ranging from simple disk servers to complex hierarchical tape storage systems. The gLite LCG File Catalog (LFC) service keeps track of the locations of the files, as well as the relevant metadata (e.g. checksum and filesizes), and of their replicas distributed in the grid.

Information and monitoring services provide mechanisms to collect and publish information about the state of grid services and resources, as well as to discover them. gLite adopted two Information Systems: the Berkeley DB Information Index (BDII), an evolution of the Globus Meta Directory System (MDS) based on the Lightweight Directory Access Protocol (LDAP); and the Relational Grid Monitoring Architecture (R-GMA), a relational implementation of the Grid Monitoring Architecture (GMA) standardized by the Open Grid Forum (OGF).

The implementation of geospatial grid services described in the following sections has been based extensively on many of the gLite services listed above, namely VOMS, WMS, LB, CE, SE, LFC and BDII.
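The metascheduling role of the WMS described above can be illustrated with a toy matchmaking routine. This is only a sketch under stated assumptions: the `ComputingElement` and `Job` classes, their attributes, the VO names and the ranking policy (most free slots first) are hypothetical and do not reproduce the gLite data model or its actual matchmaking algorithm.

```python
from dataclasses import dataclass


@dataclass
class ComputingElement:
    """Hypothetical, simplified view of a CE as seen by a metascheduler."""
    name: str
    free_slots: int
    max_wall_minutes: int
    supported_vos: frozenset


@dataclass
class Job:
    """Hypothetical job description: owning VO and required wall-clock time."""
    vo: str
    wall_minutes: int


def match(job, elements):
    """Return the CE chosen for the job, or None if no CE meets its requirements."""
    # Requirements: the CE must accept the job's VO, have a free slot,
    # and allow jobs at least as long as the one requested.
    candidates = [
        ce for ce in elements
        if job.vo in ce.supported_vos
        and ce.free_slots > 0
        and ce.max_wall_minutes >= job.wall_minutes
    ]
    # Rank: prefer the CE with the most free slots (an illustrative policy only).
    return max(candidates, key=lambda ce: ce.free_slots, default=None)


# Illustrative resources; names and numbers are invented for the example.
ces = [
    ComputingElement("ce-a", 0, 720, frozenset({"cyclops"})),
    ComputingElement("ce-b", 12, 120, frozenset({"cyclops", "esr"})),
    ComputingElement("ce-c", 40, 720, frozenset({"esr"})),
]
chosen = match(Job("cyclops", 60), ces)
```

In this example only `ce-b` both accepts the job's VO and has free slots, so it is selected; a job needing more wall-clock time than `ce-b` allows would remain unmatched.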

The SDI services layer: OpenGeospatial Web Services (OWS)

Civil Protection applications usually access and generate geospatial information, that is, information referred to a defined spatial and temporal context. Recent advances in geospatial science and technology provide the basis for building an advanced infrastructure for data sharing. In recent years many European and international projects and initiatives (e.g. INSPIRE, GEOSS, GMES) have aimed to define architectural frameworks for the realization of the so-called Spatial Data Infrastructures (SDIs). They handle issues concerning data formats, metadata and network services to enable a basic set of geospatial data sharing capabilities. They typically refer to the specifications issued by de jure or de facto standardization bodies (e.g. W3C, OASIS, OGC). From the CYCLOPS point of view the OGC Web Services are particularly important, since they could be the basic set of services on which to build the SDI service layer in the CYCLOPS architecture.

The Open Geospatial Consortium, Inc.® (OGC) is a non-profit, international, voluntary consensus standards organization for the development of standards for geospatial and location-based services. It has defined specifications for many different web-based geospatial network services: the OpenGeospatial Web Services (OWS). From the CYCLOPS viewpoint the following are the most interesting:

Web Coverage Service (WCS) The Web Coverage Service supports electronic interchange of geospatial data as “coverages”—that is, digital geospatial information representing space-varying phenomena. A WCS provides access to potentially detailed and rich sets of geospatial information, in forms that are useful for client-side rendering, multi-valued coverages, and input into scientific models and other clients (OpenGIS 2006a).

Web Map Service (WMS) The Web Map Service (WMS) provides operations in support of the creation and display of registered and superimposed map-like views of information that come simultaneously from multiple remote and heterogeneous sources (OpenGIS 2006b).

Web Processing Service (WPS) The Web Processing Service (WPS) defines a standardized interface that facilitates the publishing of geospatial processes, and the discovery of and binding to those processes by clients. “Processes” include any algorithm, calculation or model that operates on spatially referenced data. “Publishing” means making available machine-readable binding information as well as human readable metadata that allows service discovery and use (OpenGIS 2007).

Other OWS will be evaluated for integration in the CYCLOPS SDI layer, such as the Sensor Web Enablement framework for sensor integration and the Catalogue Service (CS-W).

A case study: the RISICO porting to CYCLOPS architecture

The proposed approach has been tested by porting a real operational CP application to the defined architectural framework.

RISICO, a wild fires risk assessment model

Starting from the experience of the Canadian Fire Weather Index (FWI), widely used in several countries, CIMA developed the RISICO system in 2003 to provide the Italian Civil Protection with dynamic information relevant to the potential fire danger over the whole Italian territory (Fiorucci et al. 2008). The RISICO system has been developed to consider all the available static and dynamic information. The system is composed of different modules, each representing a specific model. First, it is necessary to represent the dynamics relevant to the state variables associated with the fuel load over the considered territory, as well as those related to the fuel moisture. Such dynamics refer to different fuel typologies. Then, the potential fire spread model has to be considered in order to quantitatively describe the potential behavior of a wildfire front, in the absence of any extinguishing action.

Each model refers to a single cell of the grid into which the considered area is discretized. The models are discrete in both time and space and do not consider any relationship between different cells.

The conceptual scheme depicted in Fig. 2 is quite general and may constitute the basis for the development of different schemes for the assessment of potential fire danger index. Schemes can differ in the definition of fuel classes and characteristics, in the mathematical structure of the models appearing in the figure, and in the values of the parameters of such models.

Fig. 2. A schematic representation of the structure of the RISICO system

The RISICO system considers only two fuel classes, namely live and fine dead fuel. Fuel load is determined daily by a phenological model. The fine dead fuel moisture dynamics are provided by a model based on the FFMC module of FWI. The FFMC has been simplified in its structure and calibrated on the basis of the fuel stick sensor time series available on the Italian territory. The fine dead fuel moisture dynamics play the key role in fire ignition.

The information that feeds the various modules represented in Fig. 2 is partly static and partly dynamic. Static information is related to topography and land use/vegetation cover data, which can be obtained from a data set stored in a Geographical Information System (GIS). The dynamic information consists of meteorological data (provided by a network of ground sensors, such as rain gauges, anemometers, hygrometers and thermometers) and meteorological forecasts (provided by one or more limited area models) over a time horizon of suitable length (3–5 days).

The static information used in the present implementation of the RISICO system refers to topographic and vegetation cover data. As regards topography, a Digital Elevation Model (DEM) defined over a 100 m regular grid produced by the Italian SGN (Servizio Geologico Nazionale and Row, 1994) has been used in order to represent the aspect and the slope of the Italian territory. As regards vegetation cover, information is drawn from the CLC (CORINE Land Cover, 1994) map, which provides information on land cover at a scale of 1:100000. This map, available as a 100 m grid file (CLC90 released in December 2000), uses a database including 44 categories, in accordance with standard European nomenclature, organized into five large classes: artificial surfaces, agricultural areas, forest and semi-natural areas, wetlands, water bodies. For each seasonal period, and for those of the above 44 categories that can be affected by wildland fires, five parameters have been drawn from the literature (Anderson 1982; Corpo Forestale dello Stato 1985; Nunez-Regueira et al. 1999). These parameters are: a) for live fuel: load \(\rm \big[\frac{kg}{m^2}\big]\), HHV \(\rm \big[\frac{kJ}{kg}\big]\), and moisture [%]; b) for fine dead fuel: load and HHV. Thus, on the basis of the information provided by the CLC map, these parameters are specified for every cell.

A thorough exploitation of the dynamic information provided by a LAM is the main feature of the RISICO system. In particular, the system receives daily from the Agenzia Regionale Protezione Ambiente in Bologna the outputs of the 00:00 UTC deterministic run of a meteorological non-hydrostatic Limited Area Model (LAM), namely Lokal Modell (Doms and Schättler 1999). The information provided by the LAM consists of a set of data discretized in time steps of 3 h over a time horizon of 72 h, and defined over a rectangle made of a grid of 57200 regular cells of 0.05×0.05 degrees. The meteorological variables used are the 3-h cumulated rainfall, the air temperature, the dew point temperature, and the wind speed/direction. However, the validation phase highlighted the need to introduce ground truth in order to avoid errors driven by the uncertainty of the meteorological forecast. To this end, since 2007 the RISICO system has used meteorological observations to define the initial state variables relevant to the last 24 h. The spatial resolution of the operational version of the RISICO system is 1 km, whereas the time resolution is 3 h.

The RISICO implementation in the CYCLOPS infrastructure

In the CYCLOPS prototyping activity, the RISICO model has been integrated within the Grid framework making use of an additional layer of standard geospatial services according to the CYCLOPS platform architecture. In particular two grid-enabled geospatial services have been implemented: an OGC WCS for data access and an OGC WPS for processing (Lee and Percivall 2008).

Grid-enabled WCS

The Grid-enabled implementation of WCS makes use of the storage and processing capabilities of the Grid. When a request is received, a main module analyzes it and sends a job to the Grid to get the data and process them if required (interpolation, subsetting, resampling). If multiple requests occur, they are processed in parallel by different jobs. This ensures the “scalability” of the WCS data throughput. When the result dataset is ready, it is provided by the WCS as an “asynchronous response”. This last feature makes it possible to integrate the OGC Web Service architecture with the EGEE Grid architecture, which is inherently asynchronous. To optimize data access by other Grid-enabled components, a copy of the processed dataset is preserved on the Grid.
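The request handling just described (one job per request, parallel execution, asynchronous delivery) can be sketched as follows. This is a minimal illustration, not the prototype's code: a local thread pool stands in for the Grid, and the class, method names and ticket mechanism are assumptions introduced for the example.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class AsyncCoverageService:
    """Toy stand-in for a Grid-enabled WCS front end (all names hypothetical)."""

    def __init__(self, max_parallel_jobs=4):
        # The thread pool plays the role of the Grid: each request becomes one job.
        self._pool = ThreadPoolExecutor(max_workers=max_parallel_jobs)
        self._jobs = {}

    def get_coverage(self, coverage, bbox):
        """Accept a request, start a job, and return a ticket immediately."""
        ticket = str(uuid.uuid4())
        self._jobs[ticket] = self._pool.submit(self._run_job, coverage, bbox)
        return ticket

    def status(self, ticket):
        """Report whether the job behind a ticket has finished."""
        return "ready" if self._jobs[ticket].done() else "running"

    def result(self, ticket, timeout=None):
        """Wait until the dataset is ready (or the timeout expires) and return it."""
        return self._jobs[ticket].result(timeout=timeout)

    @staticmethod
    def _run_job(coverage, bbox):
        # Placeholder for data retrieval plus optional subsetting or resampling.
        return {"coverage": coverage, "bbox": bbox}


# Two requests are accepted at once and served by separate jobs.
service = AsyncCoverageService()
bbox = (6.0, 36.0, 19.0, 47.5)  # illustrative lon/lat box
tickets = [service.get_coverage("temperature", bbox),
           service.get_coverage("rainfall", bbox)]
datasets = [service.result(t, timeout=10) for t in tickets]
```

Returning a ticket instead of the data is what decouples the synchronous web request from the job lifetime; the client polls `status` or waits on `result` when it actually needs the dataset.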

Grid-enabled WPS

The Grid-enabled WPS server is the core component of the RISICO prototype running on the Grid. It wraps the RISICO application providing both a standard compliant access interface and distributed processing capabilities. Its activity workflow can be summarized as (Fig. 3):

  1. A WPS Request describing input data, parameters and output options is received.

  2. All the required input data are gathered from various Web Coverage Services (WCS) and stored in the Grid.

  3. The whole process is split into sub-processes.

  4. The sub-process jobs are submitted and executed in parallel in the Grid.

  5. The outputs are merged and published as coverages in the Grid-enabled WCS.

Fig. 3. A schematic activity diagram of the WPS

Each phase of the workflow is described in more detail below.

WPS request A RISICO run on the Grid is launched through a WPS client: a graphical user interface has been developed for this purpose.

In the Run Management panel (Fig. 4) it is possible to select all the parameters of the RISICO run:

  • Spatial resolution (100 to 1000 m).

  • Bounding Box of the geographic area.

  • Time interval and time resolution.

  • Number of parallel jobs to be spawned on the Grid.

Fig. 4. The user interface used to launch a RISICO run on the Grid

It is also possible to specify which Web Services the needed input coverages should be taken from. Once the run is launched, all the chosen parameters are encoded in a WPS request and are sent to the WPS Server.
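One possible encoding of such a request is sketched below, assuming the server accepts the key-value-pair (HTTP GET) form of the Execute operation defined in WPS 1.0.0; the prototype may equally have used the XML POST form. The endpoint URL, the process identifier `risico` and the input names are hypothetical.

```python
from urllib.parse import urlencode


def build_execute_url(endpoint, process_id, inputs):
    """Encode run parameters as a WPS 1.0.0 key-value-pair Execute request."""
    # DataInputs is a semicolon-separated list of name=value pairs.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "DataInputs": data_inputs,
        # Ask the server to store the response and report status, so that a
        # long-running process can be polled instead of blocking the client.
        "storeExecuteResponse": "true",
        "status": "true",
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process name and input names.
url = build_execute_url(
    "http://example.org/wps",
    "risico",
    {"resolution_m": 1000, "parallel_jobs": 8},
)
```

`urlencode` percent-encodes the `=` and `;` characters inside the `DataInputs` value, so the pairs survive transport as a single query parameter.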

Input data setup The WPS server is in charge of retrieving the required datasets from the data access web services and making them available on the Grid, where the RISICO model is going to be run. Since data access logic is decoupled from the business logic, RISICO could be run on demand with input data served by different data providers.

Model execution After all the input datasets have been correctly transferred to the Grid, the WPS server splits the initial spatial domain into several sub-domains: for each sub-domain, a separate RISICO process is submitted to the Grid. The job can be split simply by partitioning the initial set of cells, because the model processes the cells independently. Once it has landed on a Worker Node (the machine where the model physically runs), the job accesses the input data that were previously set up on the Grid by the WPS and executes the model.
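The reason a plain partition of the cells is sufficient can be shown with a small sketch. The `run_model` function is a placeholder, not the RISICO model; the only property it shares with it is the one stated in the text, namely that each cell's output depends on that cell alone.

```python
def split_cells(cell_ids, n_jobs):
    """Split a list of cell identifiers into n_jobs nearly equal contiguous chunks."""
    if n_jobs < 1:
        raise ValueError("n_jobs must be at least 1")
    base, extra = divmod(len(cell_ids), n_jobs)
    chunks, start = [], 0
    for i in range(n_jobs):
        # The first `extra` chunks receive one additional cell.
        size = base + (1 if i < extra else 0)
        chunks.append(cell_ids[start:start + size])
        start += size
    return chunks


def run_model(cells):
    """Placeholder per-cell model: each output depends only on its own cell."""
    return {cell: cell * 0.5 for cell in cells}


cells = list(range(10))
serial = run_model(cells)

parallel = {}
for chunk in split_cells(cells, 4):
    # Each call could run as a separate Grid job; no data is exchanged between chunks.
    parallel.update(run_model(chunk))
```

Because no cell reads its neighbours, the union of the per-chunk outputs is identical to the output of a single run over the whole domain, whatever the number of chunks.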

In a production environment the number of parallel jobs could be chosen dynamically, taking the run priority into consideration: in critical situations, when the execution time is a strong constraint, the WPS would increase the number of spawned jobs, reducing the size of each sub-domain to speed up the whole execution.
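A simple way to make such a choice is sketched below. It assumes a cost model that is not taken from the prototype: a run with n jobs takes a fixed overhead plus the total work divided evenly by n. Under that assumption, the smallest number of jobs that meets a given time limit follows directly; the function name and parameters are hypothetical.

```python
import math


def jobs_for_deadline(work_s, overhead_s, deadline_s, max_jobs):
    """Smallest n such that overhead_s + work_s / n <= deadline_s.

    Returns None when the deadline cannot be met with at most max_jobs jobs.
    """
    if deadline_s <= overhead_s:
        return None  # the fixed overhead alone already exceeds the deadline
    n = max(1, math.ceil(work_s / (deadline_s - overhead_s)))
    return n if n <= max_jobs else None


# Illustrative numbers: one hour of total work, five minutes of overhead.
relaxed = jobs_for_deadline(3600, 300, 1500, max_jobs=8)   # 20 min 00 s budget left
urgent = jobs_for_deadline(3600, 300, 900, max_jobs=8)     # 10 min 00 s budget left
```

Real Grid overheads (queueing, data staging) vary from run to run, so in practice such an estimate would only serve as a starting point.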

Output data merging When all the output datasets have been created, the WPS submits to the Grid a job that merges all the job outputs into one dataset. The aggregated output dataset is then published on a WCS server, where it is ready to be visualized by the user or accessed by another model (Fig. 5). If a time-of-response requirement has been specified, the WPS could manage the ongoing process so as to present a partial result within the time limit. This approach favours response time over accuracy, as required in most Civil Protection applications.

Fig. 5. The user interface showing one RISICO output
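The partial-result behaviour described for the merging phase can be sketched as follows. Again a local thread pool stands in for the Grid, and `merge_within`, `fake_job` and the sub-domain labels are hypothetical names used only for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def merge_within(futures, time_limit_s):
    """Merge the outputs of the jobs that finish within the time limit.

    `futures` maps a sub-domain label to the Future of its job. Returns the
    merged dataset and the labels of the sub-domains that were still running.
    """
    done, _ = wait(futures.values(), timeout=time_limit_s)
    merged, missing = {}, []
    for label, future in futures.items():
        if future in done:
            merged.update(future.result())
        else:
            missing.append(label)
    return merged, missing


def fake_job(cells, delay_s):
    """Placeholder sub-domain job: sleeps, then returns one value per cell."""
    time.sleep(delay_s)
    return {cell: 1.0 for cell in cells}


with ThreadPoolExecutor(max_workers=2) as pool:
    futures = {
        "north": pool.submit(fake_job, [0, 1], 0.0),
        "south": pool.submit(fake_job, [2, 3], 1.0),
    }
    partial, missing = merge_within(futures, time_limit_s=0.3)
```

Here the slow sub-domain is simply left out of the merged dataset and reported as missing, so that the user can see which part of the area is not yet covered.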

Results

Scalability To test the scalability of the proposed approach, several series of runs were executed, varying the input parameters, the input data and the number of parallel jobs involved. Fig. 6 shows how the model execution time behaves as the spatial resolution increases. The three colored curves identify the prototype running on 1, 4 or 8 parallel jobs on the Grid, respectively. At low resolution the parallelized runs are slower because of some overhead, but as the input data size increases the execution time can be kept almost constant by also increasing the number of jobs.

Fig. 6. A graphical comparison of the RISICO model run with 1, 4 or 8 parallel jobs

Similar results were achieved in tests that increased the geographic area of the run at a fixed resolution.

Multiple modeling The test phase confirmed that multiple instances of the RISICO model can be run on the Grid with no substantial performance loss, thanks to the highly distributed Grid infrastructure.

In addition, the possibility of accessing heterogeneous data sources, which comes from the WS approach, makes it easy to apply multi-modeling techniques, such as the execution of the same model on forecast data generated by different models/institutions.

Discussion

The prototype porting of the RISICO application to the Grid made it possible to better analyze several aspects of the proposed CYCLOPS platform.

The scalability of the architecture has been confirmed by the preliminary tests done on the RISICO prototype: this could allow modelling research to develop more complex and accurate models, taking into account the huge amount of computing power the Grid can provide.

Multi-modeling techniques could be an interesting choice for Civil Protection agencies but, to exploit these methodologies, heterogeneous data sources should be made available. In this direction, the introduced web services layer could give access to many data that are not physically stored on the Grid but will soon be accessible through standard web services adopted by international initiatives (e.g. INSPIRE, GEOSS).

Moreover, the large quantity of datasets produced by these models should be made available through standard access services, and it should be possible to organize them with catalogue services.

Conclusions

The distributed processing capabilities of the Grid showed that the scalability requirements of Civil Protection applications can be met. Moreover, the enhancement of the Grid middleware with a set of standard geospatial services opens Civil Protection applications to new data and standard service providers, making it possible to conceive more complex scenarios built as a workflow of basic applications.

Experiences from existing projects and initiatives (e.g. GIGAS, GENESI-DR (Ground European Network for Earth Science Interoperations 2009), GMES core and downstream services, etc.) could provide a significant contribution towards the detailed design of an e-Infrastructure for CP applications.