This section describes the methodological steps applied in this study: the study locations, the sample selection procedure, the data collection, and the analytical approach.
Study area
We applied two criteria when selecting study locations in Bangladesh. First, the area had to be located in a coastal region prone to hydrometeorological hazards; second, its transport network had to be neglected in terms of (re)construction, repair, and budget allocation. Following these criteria, we selected villages from five unions (see footnote 2) of Khulna and Satkhira districts as study locations, as presented in Fig. 2. Over the last one and a half decades, these study locations were battered by tropical cyclones ranging from category-2 to category-5 (EM-DAT 2020). Among the most devastating recent cyclones were Amphan (in 2020), Fani (in 2019), Mora (in 2017), Roanu (in 2016), Mahashen (in 2013), Aila (in 2009), and Sidr (in 2007) (Ahsan and Khatun 2020; Ahsan et al. 2020).
Data collection procedure and sampling method
We used primary data for the analyses conducted in this study. To collect these data, we performed a face-to-face household-level survey using a structured questionnaire. The questionnaire comprised five sections: the first section dealt with respondents’ personal information (i.e., mainly socioeconomic and sociodemographic information); the second section covered hazard- and damage-related information; the third section dealt with information on social attributes; the fourth section addressed migration-related questions; and the final section covered respondents’ access to natural resources. We finalized the draft questionnaire after reviewing the relevant literature, consulting two experienced Cyclone Preparedness Program (CPP) volunteers, and conducting five Focus Group Discussions (FGDs) with participants from different walks of life. We piloted this draft with 25 respondents in June 2020 and incorporated the necessary modifications into the final version of the questionnaire, which contained a total of 68 questions.
We applied a multi-stage sampling approach to select the survey respondents. In the first stage, we purposively chose two districts, Khulna and Satkhira, based on the criteria described in the “Study area” sub-section. In the second stage, we purposively selected from these two districts the five unions that had suffered most from hydrometeorological hazards over the last decade. In the third stage, we randomly selected 14 villages from these five unions using a lottery method.
Our study locations were hit by a category-5 tropical cyclone (Amphan) in May 2020, followed by river embankment breaches that inundated nearly 83% of all villages in our study locations. As a result of this inundation, many households from these villages moved to dry areas and to the unaffected segments of the embankment. Under these circumstances, randomly selecting sample respondents from an electoral list held by the local government administration office was not feasible. Therefore, we adopted systematic random sampling to select sample households. Specifically, we performed a second lottery by blindly drawing a piece of paper with one of five numbers on it (7, 12, 17, 22, or 27), where the number drawn set the sampling interval (i.e., every n-th household would be sampled). In this lottery, a paper with the number “12” was drawn. Based on this result, we selected every twelfth household on the river embankment in the case of inundated villages, while in the case of non-inundated villages, we chose every twelfth household from both sides of the main connecting road running from the central business point into the village (odd-numbered selections on the left side and even-numbered selections on the right side). After applying the condition described in the following sub-section, our sample consisted of a total of 556 household respondents from 14 villages in five unions, as shown in Fig. 2.
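As an illustration only, the systematic selection rule described above can be sketched in Python. The household identifiers, the route length, and the random seed are hypothetical; the sketch also assumes that counting starts at the first household on the route, so that the twelfth household is the first one selected, which the text does not state explicitly.

```python
import random

def draw_interval(candidates, rng):
    """Lottery: draw one sampling interval at random from the candidate slips."""
    return rng.choice(candidates)

def systematic_sample(households, interval):
    """Return every `interval`-th household along the route (12th, 24th, ...).

    Assumption: counting starts at the first household on the route.
    """
    return households[interval - 1::interval]

def assign_road_side(selection_number):
    """Odd-numbered selections fall on the left side, even-numbered on the right."""
    return "left" if selection_number % 2 == 1 else "right"

rng = random.Random(0)  # arbitrary seed, used only to make the sketch repeatable
drawn = draw_interval([7, 12, 17, 22, 27], rng)  # the study's own draw gave 12

# Hypothetical route of 60 households ordered along an embankment
route = [f"HH-{i:03d}" for i in range(1, 61)]
selected = systematic_sample(route, 12)

# For a non-inundated village, alternate road sides across successive selections
sides = [assign_road_side(k) for k in range(1, len(selected) + 1)]
```

With an interval of 12 and a route of 60 households, the rule yields five households, one per block of twelve.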
Analytical approach
As it is difficult to set a definitional paradigm of “migration/non-migration,” especially for climate-sensitive regions (due to their complex characteristics), we framed and applied a simple definition of non-migration in this study. In line with our study design, we considered a respondent a voluntary non-migrant if he/she, despite climatic extreme events, willingly stayed in the same and/or an adjacent locality for at least two decades along with all members of the concerned household (including its head, if the respondent was not the household head). The time span considered is 2000 to 2020, during which seven tropical cyclones battered our study locations (see Table C in Appendix). We used a mixed-methods analytical approach in this study, relying mostly on quantitative tools. For the qualitative analysis, we considered relevant statements made by household respondents during the face-to-face survey and by FGD participants. For the quantitative analysis, we used linear correlation together with parametric and non-parametric tests. We divided the quantitative analysis plan into three segments.
First, we differentiated between two groups of samples: voluntary non-migrants and involuntary non-migrants (i.e., those who willingly did not migrate and those who unwillingly did not migrate). Then, we applied relevant parametric and non-parametric tools (correlation, z-test, chi-squared test) to analyze the differences between the two groups and the associations among relevant covariates.
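To make the group comparison concrete, the following minimal Python sketch implements a pooled two-proportion z-test and a Pearson chi-squared test for a 2x2 table. It is not the SPSS procedure used in the study; the function names and the counts are hypothetical, and the chi-squared version shown applies no continuity correction.

```python
import math

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test for equal proportions, using the pooled estimate."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided standard normal tail
    return z, p_value

def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared test of independence for the table [[a, b], [c, d]].

    No continuity correction is applied; the test has 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # chi-squared upper tail, df = 1
    return chi2, p_value

# Hypothetical counts: rows are voluntary / involuntary non-migrants,
# columns are "has the attribute" / "does not have the attribute".
z, p_z = two_proportion_z_test(60, 100, 45, 100)
chi2, p_chi2 = chi_squared_2x2(60, 40, 45, 55)
```

For a 2x2 table, the uncorrected chi-squared statistic equals the square of the pooled z statistic, so the two tests return the same p-value.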
Second, we deployed a principal component analysis (PCA) to extract the major dimensions (composed of clusters of variables) of the respondents’ non-migration decision. We chose PCA because it is a statistical tool for data reduction that re-expresses multivariate data with fewer dimensions (i.e., components). In other words, by applying PCA, one can mathematically derive a relatively small number of variables, expressed in clusters, that convey as much of the information in the observed variables as possible (Field 2013). In this way, we re-oriented the data so that a multitude of original variables was summarized by relatively few “factors” or “components” that captured the maximum possible information (variation) from the larger set of original variables.
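The data-reduction idea behind PCA can be illustrated with a small Python sketch that extracts only the first principal component of a correlation matrix by power iteration. This is an assumption-laden toy example, not the SPSS extraction used in the study: the response data are hypothetical, no rotation is applied, and only the leading component is computed.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample variance."""
    n, m = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(m)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(m)]
    return [[(r[j] - means[j]) / sds[j] for j in range(m)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, m = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(m)]
            for i in range(m)]

def mat_vec(a, v):
    return [sum(a[i][j] * v[j] for j in range(len(v))) for i in range(len(a))]

def first_component(corr, iterations=500):
    """Power iteration: unit-length leading eigenvector and its eigenvalue."""
    v = [1.0 / (i + 1) for i in range(len(corr))]  # arbitrary asymmetric start
    for _ in range(iterations):
        w = mat_vec(corr, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(x * y for x, y in zip(v, mat_vec(corr, v)))
    return v, eigenvalue

# Hypothetical 5-point responses of six respondents to three survey items
data = [[5, 4, 1], [4, 4, 2], [2, 1, 3], [1, 2, 5], [3, 3, 2], [5, 5, 4]]

z = standardize(data)
corr = correlation_matrix(z)
weights, eigenvalue = first_component(corr)

# Share of total variance captured by the first component
explained_share = eigenvalue / len(corr)

# Component score of each respondent: projection onto the first component
scores = [sum(w * x for w, x in zip(weights, row)) for row in z]
```

Scores of this kind are what a subsequent regression can take as predictors in place of the full set of original variables.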
Finally, having identified the major components using PCA, we applied a binary logistic regression model, presented in Eq. 1, where the dependent variable (p) takes a value of 1 if the respondent is a voluntary non-migrant and 0 otherwise. This “otherwise” category included the respondents who replied either “no” or “not sure” to the question of whether they had willingly not moved to other areas. We performed this regression analysis to determine the driving factors behind the respondents’ non-migration decision. In this regression model, we incorporated the predicted values of the PCA components (Pk), socioeconomic and sociodemographic characteristics (Zk), and other relevant issues (Sk) as independent variables. In Eq. 1, subscript k denotes respondents, β is a constant term, parameters γ, δ, and φ are to be estimated, and ε is an idiosyncratic error term. All statistical tests in this study were performed using SPSS (version 22). Table A in the Appendix lists the variables used in this study, including their definitions, units of measurement, and adapted sources.
$$\ln{\left(\frac{p}{1-p}\right)}_{k}={\beta }_{k}+\gamma {P}_{k}+\delta {Z}_{k}+\varphi {S}_{k}+{\varepsilon }_{k} \tag{1}$$
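As a hedged illustration of how Eq. 1 is read, the Python sketch below codes the binary outcome from a survey answer and converts the systematic part of the model (without the error term) into a predicted probability. All coefficient and covariate values are hypothetical and are not estimates from this study; the function names are also assumptions made for this example.

```python
import math

def code_outcome(answer):
    """Return 1 for a voluntary non-migrant ("yes"), 0 for "no" or "not sure"."""
    return 1 if answer.strip().lower() == "yes" else 0

def dot(coefficients, values):
    return sum(c * x for c, x in zip(coefficients, values))

def linear_predictor(beta, gamma, delta, phi, P, Z, S):
    """Systematic part of Eq. 1: the log-odds ln(p / (1 - p)) for one respondent."""
    return beta + dot(gamma, P) + dot(delta, Z) + dot(phi, S)

def inverse_logit(log_odds):
    """Solve ln(p / (1 - p)) = log_odds for p."""
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical values: two PCA component scores (P), one socioeconomic
# covariate such as age (Z), and one other indicator (S)
eta = linear_predictor(
    beta=-0.5, gamma=[0.8, 0.3], delta=[0.02], phi=[-0.4],
    P=[1.2, -0.5], Z=[45], S=[1],
)
p = inverse_logit(eta)  # predicted probability of voluntary non-migration
```

A positive coefficient raises the log-odds and therefore the predicted probability of voluntary non-migration, while a log-odds of zero corresponds to a probability of 0.5.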
To compare and contrast the results of the PCA and the logistic regression with qualitative evidence, we used relevant statements made by household respondents during the face-to-face survey and by FGD participants.