Data utilisation in multi-hazard early warning systems
Multi-hazard refers to the collection of major hazards that a country faces. Several hazardous events may occur simultaneously and be interrelated. Tropical storms, for example, are among the most common environmental hazards in the tropics and can trigger further hazards: heavy rainfall can in turn induce flash flooding, and heavy rain and flooding increase the moisture content of soil in mountainous areas, which may induce landslides. Minimising the loss of life and property damage from these interrelated hazards requires a comprehensive hazard management strategy. In general, such a strategy comprises four phases: (i) mitigation: actions that minimise the causes and impacts of hazards and prevent them from developing into full-blown disasters; (ii) preparedness: action plans and educational activities that prepare communities to confront unpreventable hazard events; (iii) response: emergency actions that protect people's lives and property during hazard or disaster events; and (iv) recovery: actions that restore damaged property and community infrastructure and treat people's illnesses. These four phases demand supporting tools and technologies to enhance the effectiveness of hazard management.
Several modern multi-hazard early warning systems take advantage of the data explosion on social media. One study proposes a Twitter data analysis framework for identifying tweets that are relevant to a particular type of disaster (e.g. earthquake, flood, and wildfire), and evaluates several techniques for identifying relevant tweets, including matching-based and learning-based approaches. Another study examines the potential of social media data for identifying peatland fires and haze events on Sumatra Island, Indonesia: a data classification algorithm analyses the tweets, and the results are verified against hotspot and air quality data from NASA satellite imagery. A data classification algorithm is also used elsewhere to automatically classify tweets and text messages (from the Ushahidi crowdsourcing application) generated during the 2010 Haiti earthquake, with the goal of providing an information infrastructure that delivers appropriately classified messages to the responsible departments in a timely manner. A further work proposes a decision support system that integrates crowdsourced information with Wireless Sensor Networks (WSN) to improve the coverage of the monitored area for flood risk management in Brazil. That work uses Open Geospatial Consortium (OGC) standards to facilitate data integration between crowdsourced information and WSN.
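To make the matching-based idea concrete, the following is a minimal sketch of keyword matching for tweet relevance. It is not the method of any of the works discussed here: the keyword lists, the `HAZARD_KEYWORDS` table and the function names are hypothetical, and a learning-based approach would replace the fixed lists with a trained classifier.

```python
import re

# Hypothetical keyword lists per hazard type (an assumption for illustration;
# a real system would curate or learn these from labelled tweets).
HAZARD_KEYWORDS = {
    "earthquake": {"earthquake", "quake", "aftershock", "tremor"},
    "flood": {"flood", "flooding", "flooded", "inundation"},
    "wildfire": {"wildfire", "bushfire", "haze", "smoke"},
}


def tokenize(text):
    """Lowercase the text and return its set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def match_hazards(tweet):
    """Return the sorted hazard types whose keywords occur in the tweet."""
    tokens = tokenize(tweet)
    return sorted(
        hazard
        for hazard, keywords in HAZARD_KEYWORDS.items()
        if tokens & keywords  # non-empty intersection means a keyword matched
    )


if __name__ == "__main__":
    tweets = [
        "Strong aftershock felt downtown just now",
        "River road is flooded, avoid the bridge",
        "Great coffee this morning",
    ]
    for tweet in tweets:
        print(match_hazards(tweet), "-", tweet)
```

Exact token matching of this kind is cheap and transparent, but it misses misspellings and figurative uses of hazard words, which is one reason matching-based and learning-based techniques are compared against each other.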
Semantic web technologies and high variety data management for multi-hazards
Earth Observation (EO) and urban data are provided by multiple data sources and can be accessed through different methods, ranging from direct download to various standard Web Service APIs (e.g. Web Map Service, Web Feature Service, Sensor Observation Service, RESTful APIs, and SOAP-based APIs). In addition, EO and urban data from different data sources are heterogeneous in three ways: (i) syntactic heterogeneity: differences in the data format or data model used to present datasets (e.g. plain text, CSV, Excel, XML, JSON, O&M, SensorML); (ii) structural heterogeneity: differences in the data schema used to describe the same type of dataset (e.g. describing soil moisture using different XML Schemas); and (iii) semantic heterogeneity: differences in the meaning or context of the content of datasets. These heterogeneities illustrate the challenges posed by the high variety of data available to multi-hazard applications. Semantic Web technologies have therefore played a significant role by providing languages and tools for modelling domains, including describing concepts and the relationships among data and hazardous events. According to the W3C definition [35, 36], the Semantic Web is a web of data that provides a common framework for data sharing and reuse across applications, enterprises, and communities.
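The three kinds of heterogeneity can be illustrated with a small sketch. The two feeds below are invented for illustration: they report the same soil-moisture property in different formats (CSV versus JSON, syntactic), with different field layouts (structural), and in different units (percent versus fraction, a simple semantic difference). The field names and the common record layout are assumptions, not a standard.

```python
import csv
import io
import json

# Two hypothetical feeds describing soil moisture (invented for illustration).
CSV_FEED = "station,time,soil_moisture_pct\nST01,2020-01-01T00:00:00Z,31.5\n"
JSON_FEED = (
    '{"sensor": {"id": "ST02"}, "observedAt": "2020-01-01T00:00:00Z", '
    '"soilMoisture": {"value": 0.25, "unit": "fraction"}}'
)


def from_csv(text):
    """Map flat CSV rows onto the common record layout."""
    rows = csv.DictReader(io.StringIO(text))
    return [
        {
            "source_id": row["station"],
            "time": row["time"],
            "property": "soil_moisture",
            "value_pct": float(row["soil_moisture_pct"]),
        }
        for row in rows
    ]


def from_json(text):
    """Map the nested JSON document onto the same layout, converting units."""
    doc = json.loads(text)
    measurement = doc["soilMoisture"]
    value = measurement["value"]
    if measurement["unit"] == "fraction":
        value *= 100.0  # reconcile the unit difference: fraction -> percent
    return [
        {
            "source_id": doc["sensor"]["id"],
            "time": doc["observedAt"],
            "property": "soil_moisture",
            "value_pct": value,
        }
    ]


records = from_csv(CSV_FEED) + from_json(JSON_FEED)

if __name__ == "__main__":
    for record in records:
        print(record)
```

Hand-written adapters like these must be repeated for every new source, which is the scaling problem that a shared semantic model is meant to address.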
An ontology, a key element of the Semantic Web, is a specification of a conceptual model that describes knowledge about a domain of interest. A basic statement in an ontology can be expressed as a Resource Description Framework (RDF) triple, which comprises a subject, a predicate and an object. Concepts described in RDF can be extended with the Web Ontology Language (OWL) to construct an ontology that represents rich and complex knowledge about things. In multi-hazard applications, an ontology can be used to: (i) represent domain knowledge through concepts, their attributes, and the relationships between data sources, data and hazards; and (ii) facilitate data integration across multiple data sources that exhibit the variety, velocity and volume characteristics of Big Data.
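As a rough illustration of the subject-predicate-object structure, the sketch below stores triples as plain Python tuples and answers a simple question: which data sources are relevant to a given hazard? It is not an RDF library and not an existing ontology. Every term under the `ex:` prefix, including `ex:observes` and `ex:indicates`, is a hypothetical name chosen for this example; only `rdf:type` is a real RDF term.

```python
# Each triple is (subject, predicate, object). All "ex:" terms are hypothetical.
TRIPLES = {
    ("ex:RainGauge01", "rdf:type", "ex:PhysicalSensor"),
    ("ex:RainGauge01", "ex:observes", "ex:Precipitation"),
    ("ex:SoilProbe07", "rdf:type", "ex:PhysicalSensor"),
    ("ex:SoilProbe07", "ex:observes", "ex:SoilMoisture"),
    ("ex:TwitterFeed", "rdf:type", "ex:HumanSensor"),
    ("ex:TwitterFeed", "ex:observes", "ex:FloodReport"),
    ("ex:Precipitation", "ex:indicates", "ex:Flood"),
    ("ex:Precipitation", "ex:indicates", "ex:Landslide"),
    ("ex:SoilMoisture", "ex:indicates", "ex:Landslide"),
    ("ex:FloodReport", "ex:indicates", "ex:Flood"),
}


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in TRIPLES
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def sources_for(hazard):
    """Find data sources that observe a property indicating the hazard."""
    properties = {subj for subj, _, _ in match(p="ex:indicates", o=hazard)}
    return sorted(
        subj for subj, _, obj in match(p="ex:observes") if obj in properties
    )


if __name__ == "__main__":
    print(sources_for("ex:Landslide"))
    print(sources_for("ex:Flood"))
```

The query works only because the relationship between data sources and hazard knowledge is stated explicitly as triples; in practice an RDF store and a query language such as SPARQL would take the place of the `match` helper.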
Ontologies are widely used in hazard management to model knowledge about hazards and to manage actual data derived from EO and urban sources. Hazard assessment and urbanisation analysis are two common application areas. The Semantic Sensor Network Ontology (SSN) and the Semantic Web for Earth and Environmental Terminology (SWEET) are two significant ontologies commonly applied in hazard management. One work reuses SWEET to conceptualise knowledge and expertise from several areas, such as buried assets (e.g. pipes and cables), soil, roads, the natural environment and human activities. That work also proposes the Ontology of Soil Properties and Processes (OSP) to describe soil properties (e.g. soil strength) and soil processes (e.g. soil compaction). OSP and the other concepts are used to express how they affect one another in asset maintenance activities. Furthermore, two works apply SSN to wind monitoring. The first uses SSN with the ontology for Quantity Kinds and Units (QU) to conceptualise wind properties (e.g. wind speed and direction), while the latter uses SSN and SWEET to model wind sensors and the data streams of wind observations. The Landslides ontology extends SSN to organise knowledge for the landslide domain, such as the concepts of landslides, earthquakes, geographical units, soil, precipitation and wind. Although these ontologies provide comprehensive concepts for sensor data and hazard events, together with a reusable and widely used semantic underpinning, they do not cover human sensors (e.g. social media data). Hence, additional processing is currently required when applying these ontologies to early warning systems (EWS) for multi-hazard applications.
The related literature on multi-hazard management can be classified from three perspectives: data sources, hazardous event analytics, and EO and urban time series data management. Effective multi-hazard management demands rich, high-quality data from a large number of data sources related to the hazard of interest. The data sources used by multi-hazard management applications can be any sensors and/or data services that provide EO and urban data. They include physical sensors (e.g. remote sensing, in situ sensors, wireless sensor networks) and human sensors (e.g. social media, blogs and crowdsourcing). Recent data analytics research for multi-hazard management has focused on hazardous event analysis, which follows three main directions: event identification, event verification, and event prediction. This research reveals challenging problems in EO and urban time series data management, especially the discovery of potential time series data sources given the complexity and high variety of such sources in multi-hazard management applications. Ontologies are a common means not only of modelling knowledge about hazards but also of managing EO and urban data. Recent work on ontology development in this domain can be classified as either standardising ontologies or reusing ontologies. This work shows that no standard ontology for data source discovery currently exists. In addition, existing applications of ontologies in this domain mostly investigate specific problems; in other words, these approaches are not generalised. They fail to model the relationship between data sources and domain knowledge, which is an important factor for efficient data integration and data source discovery.