Bridging the Information Gap: Mapping Data Sets on Information Needs in the Preparedness and Response Phase

During the preparedness and response phase of regularly recurring natural disasters, the responding and professional communities have to decide which actions to take in order to support affected communities. We investigated the information needs of and the disaster management data available to both national and local decision-makers during the floods that affected the riverine islands of the Sirajganj district in Bangladesh in 2014. We conducted 13 semi-structured interviews and three focus group discussions, collecting input from 51 people, and transcribed and coded them so that clusters of information needs emerged. Subsequently, we mapped the information needs on the available data sets and identified the needs that are not well covered, of which the need for timely and location-based information is the most important. We recommend identifying available data sources and mapping them on the information requirements as part of the preparedness phase. Data preparedness can to some extent solve the issue of data not being available quickly enough in the initial response phase. Our future research aims at further closing these information gaps by linking and integrating disparate data sets to cover more information needs and by equipping disaster management volunteers at community level with a mobile data collection app to collect data before, during, and after the floods.


Introduction
The poor face different levels of impact when exposed to natural hazards than the nonpoor (Hallegatte et al. 2016). In addition, data on the risks that the poor and vulnerable face and the actual impact they experience is often lacking. The digital divide plays an important role in this: developing countries, and within these countries the poor and vulnerable, have less access to digital technologies (Leidig and Teeuw 2015). Apart from technological reasons, there are also political and social reasons. For example, the poor often get their income through the informal economy, and their land and resource rights are usually not registered.
This lack of data affects the actions of different actors in the preparedness and response phase. Three main actors can be distinguished in the preparedness and response phase (van den Homberg and Neef 2015). The affected community consists of the people directly and indirectly adversely affected and in need of urgent (humanitarian) assistance. The responding community consists of local or outside community members that support relief or recovery but are not trained in crisis response. The responding professionals are part of the professional community in the field of disaster management, such as national and local governments, NGOs, and national crisis coordination centers. All three groups have to decide, based on the data and information available to them, which early actions are best to prevent loss of lives and to protect livelihoods in the period just before the disaster hits, and which response actions to take after the disaster has hit.
Decision-making in the preparedness and response phase should be based on factual data about the needs on the ground, but is in reality a highly political process. For example, damage and needs assessments are competitive and difficult processes as agencies continue to see information on the impact of a disaster as something to be "owned" in order to leverage resources and influence rather than as something to be shared (Walton-Ellery and Rashid 2012). Responders also face "high levels of uncertainty, extreme stress with significant consequences of actions, compressed timelines and significant lack of information available initially followed by extreme information overload" (Preece et al. 2013). In the case of Typhoon Haiyan, for example, those responding considered the multitude of different information sources and formats generally as an information overload (Comes et al. 2013). Responders will have different degrees of data literacy and cognitive abilities to deal with these circumstances and to make sense of the data and information that is available and accessible to them. The degree of collective sensemaking that is reached among responders is another important factor in decision-making between and within organizations (Wolbers and Boersma 2013). Important in sensemaking is having reliable, accurate, and timely data. Getting such data right after a sudden onset disaster is challenging given the chaotic and disrupted situation. In the case of floods in Bangladesh, the working group on Disaster Emergency Response (part of the Local Consultative Group that coordinates between the Government of Bangladesh and the different development partners) established a Joint Needs Assessment project in 2014. 
The Joint Needs Assessment project started a large survey to collect data in the field given the absence of consistent, comparable information across the affected area, whereby they had to trade off time, quality, granularity, and available resources (Wahed et al. 2014). Harmonizing and coordinating the different assessments that organizations carry out is a difficult task, and heterogeneity in the data sets that come out of the assessments is usually unavoidable. Given these data-related challenges, it is evident that responders face information gaps. Whether they have too little information at the very onset of a disaster or too much information later on, in both cases their information needs are not adequately covered. These gaps will be more pronounced in developing countries, which are often data poor and low tech, than in developed countries.

Research Questions and Methodology
Three research questions were defined: (1) What are the information needs of disaster responders so that they can take appropriate decisions? What are the associated timing constraints? (2) What are available and relevant data sources and when do they become available? (3) How do these data sources currently meet the information requirements? We conducted a case study to address these questions (Monné 2016). The case study was part of a community-managed disaster risk reduction (DRR) program that included setting up an innovative last mile early warning system using voice SMS and developing an app and dashboard to enable two-way information exchange between affected communities and responders. The pilot areas are riverine islands in northwest Bangladesh, so-called char-islands, which are part of the densely populated floodplains where many poor and vulnerable people live. The focus was on the most recent and severe river flood of recent years, namely, the floods of 2014 that affected almost two million poor and vulnerable people living in nine districts in northwest Bangladesh (Wahed et al. 2014). About 1 year after these floods, we performed 13 oral history semi-structured interviews, of which 11 took place in Dhaka (national NGOs (active in the JNA consortium) and Department of Disaster Management) and two in Sirajganj (one with a farmer and fisherman, and one with the director and his two co-directors of the local NGO, MMS). We held one focus group discussion with seven disaster responders of MMS, one focus group with 15 people living on the chars (imam, teachers, entrepreneurs, part of the volunteer disaster management committees), and one focus group with 13 local government officials [Upazila and Union Disaster Management Committee, civil defense organization (Ansar VDP)]. So, in total, we got input from 51 people. We arranged the first batch of interviewees based on our existing network and such that we would have a representative cross section. 
Subsequently, we used a snowballing approach to grow our sample, considering the availability of respondents and useful references. Although the focal point in these sessions was the flooding of 2014, we also allowed interviewees to draw from their earlier or more recent disaster management experiences. All interviews were transcribed. The focus group discussions were done with an interpreter, usually at an open noisy marketplace, and could not be literally transcribed. Instead, we used the notes taken. All interviews and notes were subsequently labeled using NVIVO 10 for Windows and coded based on three themes, i.e., Activity, Decision, and Information Need. We used inductive coding to have subthemes emerge from the data. For each of these themes, clustering was done based on experience emerging from the familiarization phase, domain knowledge, and literature study. In addition, we asked the interviewees to validate our transcribed interviews. We asked two domain experts to validate and expand on the list of needs. We also used the lists of Activities and Decisions to identify possible discrepancies. For the second research question, we used, in addition to the interviews, Internet search and literature study. In that way, we could make an inventory of the data sets that were available during the flooding of 2014. For the last research question, we singled out all the indicators per data file and manually determined the match with a subtheme information need. We scored the match as Yes, No, or Partly. Afterwards, we used constrained COUNT formulae to calculate the coverage per disaster data source of the subtheme information needs. We used approximately the phases as defined in the Multi-Sector Initial Rapid Assessment (MIRA) (MIRA 2015) to label both the data sets and the information needs. The phases consisted of before (1), the first 72 h (2), the first 2 weeks (3), and the first 2 months (4). 
Table 18.1 gives an example for three data sets and information needs. Data B covers 33% of the information needs if no time constraints are considered. With time constraints, none of the information needs are met, since the information was already needed in phase 1 but only became available in phase 4.
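The scoring logic described above can be sketched in a few lines of Python. This is a minimal illustration and not the spreadsheet that was actually used: the need names, the phases assigned to them, and the choice to count a Partly match as half a need are assumptions made for the example. The toy values are chosen so that they reproduce the Data B case (33% without and 0% with time constraints).

```python
from dataclasses import dataclass

# Assumption of this sketch: "Partly" counts as half a covered need.
WEIGHTS = {"Yes": 1.0, "Partly": 0.5, "No": 0.0}


@dataclass(frozen=True)
class Need:
    name: str
    needed_phase: int  # 1 = before, 2 = first 72 h, 3 = first 2 weeks, 4 = first 2 months


def coverage(matches, needs, available_phase=None):
    """Return the share of needs covered by one data set.

    matches maps a need name to "Yes", "Partly", or "No".
    If available_phase is given, a match only counts when the data set
    becomes available no later than the phase in which the need arises.
    """
    total = 0.0
    for need in needs:
        score = WEIGHTS[matches.get(need.name, "No")]
        if available_phase is not None and available_phase > need.needed_phase:
            score = 0.0  # the data arrives too late for this need
        total += score
    return total / len(needs)


# Hypothetical needs and matches, chosen to mirror the Data B example.
needs = [Need("Need 1", 1), Need("Need 2", 2), Need("Need 3", 1)]
data_b = {"Need 1": "Yes", "Need 2": "No", "Need 3": "No"}

without_time = coverage(data_b, needs)
with_time = coverage(data_b, needs, available_phase=4)
print(f"{without_time:.0%} without time constraints, {with_time:.0%} with")
```

Keeping the time check separate from the match score makes it easy to report both figures side by side, as is done in the results.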

(a) Information Needs
A small group of interviewees, especially at the local level, had difficulties expressing their information needs and identifying the type of decisions they had to take when directly asked for it. However, when interviewees were asked to describe their role in the flooding of 2014, it was possible for us to derive these. Information needs also varied from one responder to the other, which could usually be attributed to differences in the organization they were working for, their specific expertise, and level of education. Table 18.2 summarizes, in normal text, the needs as they emerged from the coding and clustering of the transcribed interviews. The list is not exhaustive given our limited sample size. In italic text, we have added the needs that two domain experts contributed. We decided not to aggregate the information needs too far, given that we want to map the information needs to the information in the available data sets, but also to reflect the needs as they were expressed. We defined seven clusters for in total 71 information needs. In Table 18.2, we have put the clusters that relate to the Crisis Impact on the left and those that relate to the operational environment on the right, in line with the MIRA Analytical Framework (MIRA 2015). The cluster Damage and needs scored highest both in terms of the number of times it was mentioned across all interviews and in terms of the number of interviews in which it was mentioned. This cluster of information needs also matched what the interviewees mentioned as the most difficult decisions for them to take, i.e., determining which beneficiaries to support where and with what kind of support. Next comes the need for information on Coordination, especially among government and NGOs. Specifically, it was mentioned in many interviews that it was important to have a gap analysis between the capacities available and the needs to be fulfilled. 
Capacity encompasses the response capacities of the responding communities and professionals and the coping capacity of the affected community. Knowing how to protect one's livelihood (such as agriculture, fishery, and hand looming) increases the coping capacity. Interviewees mentioned, for example, the importance of knowing when to harvest just before the flood arrived and which crop to cultivate when the flood started to recede. Similarly, it was important to know how well the local market was still functioning. Key is also a readily accessible and suitable emergency stock (IFRC 2015). Specifically, information about boat capacity was mentioned as a need in the interviews. The Baseline cluster focuses on the context of people before the disaster hit. Flood news groups the needs in relation to the arrival and duration of the flood. The Location Services cluster refers to locations for essential services such as water, health, food, and shelter, but also to places where there are opportunities for labor. We note however that also for many of the other information needs, a geospatial attribute adds value. Security and access refer to access for the responders to the affected community.
(b) Data sets and information products
Figure 18.1 depicts the different flood-related data sets and information products and the points in time at which they were collected and became available. The data sets and information products contain on the order of 40-60 indicators per source at a high level. For example, the Joint Needs Assessment consists of an extensive Excel file that compiles answers to 62 questions for several unions (the data set) and an information product (a PDF report discussing and describing the survey results). We did not yet include in our analysis data sets or information products that were available only in Bengali. However, to our current understanding, based on the interviews and our literature and Internet search, this seems to be a minor fraction. We also did not include social media data, given that there is nearly no social media penetration among the affected communities. In the whole of Bangladesh, the percentage of people using the Internet was 9.6% in 2014 (ITU 2015), and most of these are in dense urban areas. The data sets have different levels of collection and aggregation. The government works with the SOS and D form for damage and needs assessments. The union secretary or chairman has to fill in the 10 questions of the SOS form within 48 h after the disaster. The D form has 30 questions, which are filled in and submitted within three weeks. The system is still largely paper-based, whereby forms are manually summarized at each of the administrative levels before they are passed on to central level and digitized. This means that it is often not possible to go back to the data at ward level. We encountered this type of data granularity loss in other data sources as well. The downward arrow in Fig. 18.2 depicts this risk of data granularity loss at each step up in the government hierarchy. 
Important data providers are the Department of Disaster Management of the Government of Bangladesh and the Humanitarian Coordination Task Team (HCTT), consisting of UN, NGO, and government representatives. For each file, we singled out all the indicators and determined the data type (Excel sheets, relational databases, PDF, text, websites, and geographic information).
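To make the granularity loss described above concrete, the following sketch summarizes ward-level reports to union level, as happens when forms are manually summarized before being passed up. The union names, ward numbers, counts, and the field name households_affected are all hypothetical and are not taken from the SOS or D form.

```python
from collections import defaultdict

# Hypothetical ward-level reports; names and numbers are illustrative only.
ward_reports = [
    {"union": "Union A", "ward": 1, "households_affected": 120},
    {"union": "Union A", "ward": 2, "households_affected": 80},
    {"union": "Union B", "ward": 1, "households_affected": 50},
]


def summarize(reports, level):
    """Sum the affected households per value of the given administrative level."""
    totals = defaultdict(int)
    for report in reports:
        totals[report[level]] += report["households_affected"]
    return dict(totals)


union_totals = summarize(ward_reports, "union")
print(union_totals)

# Once only the union totals are passed on, the split of the Union A total
# over wards 1 and 2 can no longer be recovered from the summary alone.
```

Passing on the ward-level records together with the summary, for example through digital collection at the source, would avoid this loss.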

(c) Mapping data sets and information products on the information needs
Following the methodology described in Table 18.1, we mapped the 71 information needs on the 15 data sets and information products. We can draw the following conclusions per cluster level, where for now we do not look at time constraints. Service locations are hardly covered at all, with a cumulative coverage of 0.7%. One of the reasons for this might be that information is often collected by phoning people and asking them to give an overview for their ward or by conducting a paper-based survey. We did not come across local responders that use an app or GPS to map locations during the floods. Capacity was also not well covered, varying from relatively easy to monitor capacities such as the number of boats up to the more difficult to assess coping mechanisms of affected communities. Damage and needs were covered largely by only two out of the 15 data sets (JNA and D form). The following data sources match the information requirements best: JNA (38%), D form (34%), District Disaster Management Plan (20%), the (online) News (14%), and the Situation Report (13%). We note that the 13% is based on the first Situation Report that became available; later ones yield a higher coverage. These data sources overlap on some indicators (and hence also in their coverage of information needs), so the coverage percentages cannot be summed.
Overall, 75% of the information needs can be covered (49 information needs are fully covered and eight information needs (which were not fully covered) are partly covered by one or the other data set). We also determined which combination of data sources would yield the highest coverage, starting with the highest individual coverage and stepwise selecting the next data source based on its additional coverage. This resulted in a coverage of 68% for the following data sources: JNA, District Disaster Management Plan, and the FFWC website (44 of the 71 information needs are completely fulfilled). None of the data sources fulfills a critical mass of information needs to justify a 100% focus on one source. But an important conclusion is that a very good coverage of information needs can already be reached by the three most important data sets out of the total 15. However, we recall that the above results are without taking timing constraints into account, whereas our interviewees explained that they need certain response information within 48 or 72 h after the disaster strikes. If we do include the timing constraints, then it becomes apparent that most operationally related information is not available in time. In this case, 28.5 information needs are covered, or in other words 40%. Only 27% of all needs are covered in time, whereas 75% are covered if we do not take any timing constraints into account.
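The stepwise selection described above, starting with the highest individual coverage and then adding the source with the largest additional coverage, can be sketched as a greedy procedure. The source names, need identifiers, and scores below are invented for illustration and do not reproduce the actual mapping; counting a partly covered need as 0.5 is likewise an assumption of the sketch.

```python
def greedy_selection(scores, n_needs, max_sources):
    """Pick data sources one by one by their additional coverage.

    scores maps a source name to a dict of need -> score (1.0 full, 0.5 partly).
    Returns the chosen sources in order and the combined coverage share.
    """
    best = {}      # best score reached so far for each need
    chosen = []
    remaining = set(scores)

    def gain(source):
        # Additional coverage this source adds on top of what is already chosen.
        return sum(max(score - best.get(need, 0.0), 0.0)
                   for need, score in scores[source].items())

    while remaining and len(chosen) < max_sources:
        # sorted() makes ties deterministic (alphabetical order).
        source = max(sorted(remaining), key=gain)
        if gain(source) == 0:
            break  # nothing left to gain from any remaining source
        for need, score in scores[source].items():
            best[need] = max(best.get(need, 0.0), score)
        chosen.append(source)
        remaining.remove(source)

    return chosen, sum(best.values()) / n_needs


# Hypothetical example with six needs and three overlapping sources.
scores = {
    "Source A": {"n1": 1.0, "n2": 1.0, "n3": 1.0},
    "Source B": {"n2": 1.0, "n3": 1.0, "n4": 1.0},
    "Source C": {"n5": 1.0, "n6": 0.5},
}
chosen, combined = greedy_selection(scores, n_needs=6, max_sources=2)
print(chosen, f"{combined:.0%}")
```

In this toy example, Source B has a high individual coverage but is passed over in the second step because it largely overlaps with Source A, which is the same reason why individual coverage percentages cannot simply be summed.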

Discussion and Conclusions
We compared our framework of information needs with the one from Gralla et al. (2015). Their Context and scope, Coordination and Institutional Structures and Humanitarian Needs themes overlap with ours and are the most important factors in the earlier response. Several other information requirements, such as Looking forward and Relevant laws and policies as part of Coordination, are not mentioned in our interviews. The Gralla et al. framework emerged from consultation with mostly responders from the international humanitarian community, whereas our framework emerged from consultation with only national and local responders. Also, the type and scale of disasters looked at were different. We looked at small-scale disasters, whereas Gralla et al. focused on large-scale disasters, where international response is requested by the nation affected. Floods as in our case study have severe impacts on livelihoods but usually less in terms of loss of life. In many cases, there can also be a difference of opinion between the NGOs, on the one hand, and the government, on the other hand, as to whether to declare a flood an official disaster. One interviewee mentioned encountering in some cases political pressure not to help. Nevertheless, it is widely acknowledged that the role of national and local responders is of utmost importance also in large-scale disasters. Local responders have more local context knowledge and, in case of recurring disasters like annual floods, they usually also have more response experience than the international community. This leads to a different level of information needs regarding the Baseline theme between local, national, and international responders (van den Homberg et al. 2014). For international responders, the public and media perception turned out to be a separate theme. 
In our interviews, media perception did not come forward as an important issue, probably because national and local responders usually do not apply for funding directly themselves (but through their supporting international NGOs) and because the local communities affected do not have access to many media channels. We did not find many information needs in relation to Recovery. This might have to do with the relatively limited possibilities for the responders in our interview group to extend their activities beyond response. In sum, it is important for each type of context and hazard to develop a tailor-made information needs framework. We have developed one for a hydrological hazard in one of the poorest countries in the world. A comprehensive framework with a generic set of themes can be used as a starting point, and for each actor there will be differences as to which category is the most important to them given their organizational mandate, where, for example, some NGOs focus on women empowerment and others on disability. Such a comprehensive framework should include the local, national, and international perspectives.
Subsequent mapping of available data sources on the information needs in the framework is key for identifying the data gaps that currently exist. It is clear from the mapping we did that both the responding and the professional community lack information to effectively dimension and target their response. The governmental SOS form is rapidly available (within 48 h) but is very high level. The more detailed information from the D form is available within three weeks, but the lower government levels often do not have clear guidelines and resources for adequate data collection. NGOs that are part of the Local Consultative Group often do their own assessments, such as in 2014 via a Joint Needs Assessment, creating in fact a new process with different indicators that is only aligned with the government process to a very limited extent. Once the information is collected at central level, support is mobilized for the response, making the response largely a top-down mechanism. Neither the NGO nor the government information architecture is specifically geared toward coordination and action planning at Community, Union, and Upazila level, which forms a stumbling block for effective local response. To tackle the issues mentioned above, data preparedness activities should become an integral part of the preparedness phase.
First, we propose to organize regular multi-institutional mapping cycles of data sets on information requirements. These cycles should not only consist of keeping an up-to-date inventory of available data sources and providers, but also of regular consultations with responders as to what their information needs are. When the interviewees validated the information needs framework, this sparked their creativity. We got reactions like: "wow, if this is possible, we could also really benefit from X information". It is hence important to keep evolving the requirements and to use these requirements to shape the information products that providers are creating so that they meet the decision-makers' needs. These mapping cycles will also benefit from advances countries make in terms of open data. Open data can promote inclusion and empowerment, as it has the potential to remove power imbalances that result from asymmetric information, and can give marginalized groups a greater say in policy debates (Davies and Perini 2016).
Second, coordination needs to be improved. A Coordinated Data Scramble (Campbell 2016) can be a very effective way to reach a higher level of coordination in the data collection process, avoiding duplicates, increasing quality, and promoting coherence. It basically means having a multitude of organizations use collaborative platforms and closed digital communication groups for "bounded crowdsourcing" (Meier 2015). Also, specific platforms for managing and sharing the different data sets can be used. Geodash, making use of Geonode, is such a collaborative geospatial platform that was started up by the World Bank and is now being taken over by the Government of Bangladesh (Geodash 2017). UN OCHA deploys the Humanitarian Data Exchange (HDX), more specifically targeting humanitarian data (Keßler and Hendrix 2015).
Third, to facilitate the sharing and exchange of data, standards are being developed and used to varying degrees, ranging from P-codes for unique geographic identification up to the Humanitarian Exchange Language (HXL). Lastly, it will be key to develop the capacities of the different stakeholders in parallel to the above activities, enhancing their data literacy and access to digital technologies. Especially at the local level, many respondents were, for example, not aware of all the existing data sets, nor were they trained in data collection and analysis.
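As a rough illustration of how such standards ease data exchange, the sketch below reads two small tables that carry an HXL hashtag row below their human-readable headers and joins them on a shared P-code column. The hashtags #adm4+name, #adm4+code, #affected, and #population follow the HXL convention, but tagging unions as administrative level 4 is an assumption here, and the place names, codes, and figures are entirely made up.

```python
import csv
import io

# Hypothetical assessment and baseline tables with an HXL hashtag row.
ASSESSMENT = """Union,P-code,People affected
#adm4+name,#adm4+code,#affected
Example Union A,XX000001,1200
Example Union B,XX000002,800
"""

BASELINE = """Code,Union name,Total population
#adm4+code,#adm4+name,#population
XX000001,Example Union A,4000
XX000002,Example Union B,3200
"""


def read_hxl(text):
    """Return the data rows as dicts keyed by HXL hashtag instead of header text."""
    rows = list(csv.reader(io.StringIO(text)))
    for index, row in enumerate(rows):
        # The hashtag row is the first row whose non-empty cells all start with '#'.
        if any(row) and all(cell.startswith("#") for cell in row if cell):
            return [dict(zip(row, data)) for data in rows[index + 1:] if data]
    raise ValueError("no HXL hashtag row found")


def join_on_pcode(left, right, key="#adm4+code"):
    """Merge two lists of records that share the same P-code column."""
    merged = {record[key]: dict(record) for record in left}
    for record in right:
        merged.setdefault(record[key], {}).update(record)
    return merged


merged = join_on_pcode(read_hxl(ASSESSMENT), read_hxl(BASELINE))
for code, record in merged.items():
    share = int(record["#affected"]) / int(record["#population"])
    print(code, record["#adm4+name"], f"{share:.0%} affected")
```

Because both tables are read by hashtag and joined by code rather than by header text or place-name spelling, differences in column order and naming between providers do not affect the result.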

Future Research
Our research focused on the relation between available data and information needs. Although we inventoried Decisions, Activities, and Information needs, we did not investigate in depth the relationship between these three elements or the system dynamics between the different stakeholders, including the political and financial dimension. These dimensions played out, for example, in the still largely separate data collection processes between NGOs and government and in when a flood is declared an official disaster. Further research could address humanitarian decision-making in terms of "what may be influencing decisions, other than the needs on the ground" (Nissen 2015). A political analysis of the stakeholders and the financial flows might strengthen the information management research approach. Regarding the relation between available data and information needs, it will be worthwhile to determine the time dependency of the information needs in more detail and to do the mapping on the data products in a more automated fashion. For large organizations, it is possible to map through which information channels (email, mobile, fax, and chat) information consumers get information products from internal information producers. This kind of mapping does not, however, consider the degree to which information needs are covered. Furthermore, it is much more difficult to do this kind of mapping between organizations and even more so if certain workflows are still paper-based. It might be possible to log data file usage on the main websites that are used by responders and, for example, how the app and dashboard are used (Pachidi et al. 2014). In addition, an after-action review with the responders in a focus group setting could be used to have the responders categorize their needs according to the four phases. This refinement could lead to an enhanced understanding of the data gaps. We envision two avenues to further close these gaps. 
The first avenue consists of assessing how Artificial Intelligence for Disaster Response (AIDR), such as data and text mining, can be used to link and integrate disparate data sets and in this way reach a higher coverage of information needs (Spruit and Vlug 2015). It will not be necessary to integrate all disparate data sets; we showed that a very good coverage of information needs can already be reached by integrating the three most important data sets out of the total 15. One could set up so-called data spaces, which are loosely integrated sets of data sources where integration happens only when needed (Hristidis et al. 2010). This could become an essential extension to the earlier mentioned data exchange platforms so that these platforms offer, to a certain degree, sensemaking of all the data sets that are shared through them. The second avenue consists of tackling the lack of local and timely data. The Government of Bangladesh has started to develop an online process of collecting the SOS and D form data, tackling in this way the data granularity loss. We have co-created a smartphone application in Bengali that local disaster management professionals and volunteers can use to collect data just before and during the floods and that fulfills the information needs that are currently not covered. The functions and features of the app and dashboard reflect the different clusters of information needs that we identified. The data collected is fed back to the affected communities through a dashboard that is accessible on the very same smartphone. Ultimately, we aim at replicating the same approach in other flood-affected countries in Asia.