1 The Data Demand and Challenge

Accurate data on the key economic variables affecting people who have been forcibly displaced, such as consumption and assets, is essential to understanding their situation and to developing evidence-based policies to support them. Poor information or data inaccuracies can lead to flawed diagnostics and impact assessments, resulting in inefficient use and a waste of limited resources. In the context of displacement, consumption data is particularly important because malnutrition is rife and mortality rates are high, and interventions using consumption data are needed to support the immediate basic needs of vulnerable populations.

In previous rounds of the High Frequency Survey (HFS), approximately 45% of Somali Internally Displaced Persons (IDP) households reported food consumption below subsistence levels, and 80% reported consumption below recommended levels. It is no surprise that IDP populations report lower consumption levels. IDPs face significant hardship that hinders their potential for generating adequate livelihoods, such as the loss of a breadwinner, the lack of productive assets, or having fallen victim to violence. Indeed, IDPs have much less control over their own livelihoods, employment opportunities are scarce within camps, and a large part of their consumption is provided through aid by NGOs and international organizations.

Yet there are also reasons to think that the low levels of consumption might be due, at least in part, to misreporting. First, very low levels of consumption are associated with high rates of mortality due to starvation. The observed mortality rates among IDPs, however, do not indicate that mortality due to starvation increased across the country at such a scale.Footnote 1 Second, non-IDP households that are statistically similar on observable characteristics report higher levels of consumption than IDP households. While IDPs and non-IDPs may have different opportunities to generate income, it is unlikely that IDPs would balance food and non-food consumption in a way that endangers their lives. The vulnerability of the population raises the stakes for getting the data right: for policymakers designing programs to support IDPs, spurious data is either unusable or biased.

The potential for surveys to generate information that is systematically biased is well documented. A large body of research focuses on improving the accuracy of self-reported information collected in household surveys.Footnote 2 In the context of IDPs, the possibility that respondents feel compelled to misreport is particularly relevant. Indeed, survey respondents in IDP camps may believe that their responses will influence the provision of humanitarian aid and may thus misreport consumption in an attempt to influence its distribution. If survey respondents are underreporting, the resulting inaccuracies in the data are highly problematic. At best, they make the data spurious and unusable. At worst, they could lead to misallocations of aid, from more vulnerable areas to less vulnerable areas, or from solutions emphasizing sustainability to immediate relief when immediate relief is unnecessary. Given this context, light-touch adaptations to the survey design that prime the idea of honesty have the potential to substantially improve the quality of the data and of the support provisions the data informs.Footnote 3

2 The Implementation

The experiment covered 4145 IDP and 781 non-IDP households across South Sudan and was rolled out in mid to late 2017. To investigate whether consumption might be underreported by IDP populations, households were randomly exposed to a bundle of ‘honesty primes.’ The treatment had three components, which were administered simultaneously in one treatment arm (Fig. 1). These included an emphasis on the importance of accurate answers at the beginning of the survey, a short fictional scenario that required passing judgment on the behavior of one of the characters, and additional questions to determine the household’s last meal, which asked respondents to report explicitly whether or not they had eaten in the last week.Footnote 4,Footnote 5 While the first two components target intentional misreporting, the third addresses classical measurement error.Footnote 6 The bundle of primes addresses different psychological mechanisms:

Fig. 1 Treatment components (Source: Authors’ visualization)

  1. Appeals to honesty: These are a standard tool in surveys to increase data accuracy. They rely on respondents’ preference for the social approval of the enumerator.Footnote 7

  2. Honesty primes: These bring the value of honesty to top of mind by asking the respondent to consider a fictional scenario in which honesty is relevant. If individuals feel they have a motivation to misreport, the honesty prime makes a competing motivation salient: to answer truthfully to sustain self-consistency. People make decisions on the basis of both external and internal reward systems: even when people have a material incentive to lie, their internal drive to protect their self-integrity may override it.Footnote 8,Footnote 9

  3. Investigative probing: This places a higher salience on the importance of getting the answer to the question right. By asking for broader categories first, subsequent sub-categories are put under more scrutiny. Self-consistency is reinforced by relating to a longer recall period of seven days.

It is important to note that the treatment is not designed to directly elicit increases in reported consumption. Rather, the intention is to bring the importance of honesty into focus during the interview. It is only through this mechanism—increases in honesty—that we should expect to indirectly see increases in consumption. Thus, ex-ante, we should not expect the treatment effects to be uniform across the consumption distribution.

Almost one-third of respondents (30.1%) reported a calorie intake below the daily subsistence level of 1200 kcal per day, and the median per capita consumption (1589 kcal per day) was below the recommended calorie intake. Conditioning on adult equivalents, the median shifted well above the recommended daily intake. However, a substantial part of the distribution, 16%, still reported being below the subsistence level and 40% reported being below the recommended daily intake.Footnote 10 As with the number of consumption items, the graph indicates that there was a slight shift in the reported consumption among the treated, with respect to very low consumption levels.

Different dependent variables are specified because they have different implications for the respondent’s scope of influence on their value. The impact of the ‘honesty primes’ on the total consumption value, both in terms of money and food intake, is of primary interest. Yet, they are second-order values that are calculated as a function of other variables, including consumption quantities and calories or prices that are in turn deflated. These variables are difficult for respondents to falsify because of the intense mental computation required. The consumption quantity in kilograms is a more direct measure of the quantity consumed as expressed by the respondent and may lead to more accurate estimation of the impact of the ‘honesty primes.’ Finally, counting the number of items may lead to an even more accurate measure, since the variable is not cleaned and is taken at face value. Furthermore, omitting an item is the easiest and quickest way for respondents to reduce the value of the household’s consumption.Footnote 11
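The relationship between these four outcome measures can be illustrated with a minimal sketch. It is not the authors' processing code: the field names, the seven-day consumption recall period, and the way the deflator is applied are all assumptions made for illustration. The point is only that the item count and the quantity come straight from the respondent, while calories and monetary value also depend on factors the respondent does not control.

```python
from dataclasses import dataclass


@dataclass
class ItemRecord:
    """One reported food item for a household (all field names are hypothetical)."""
    quantity_kg: float   # quantity reported by the respondent
    kcal_per_kg: float   # calorie conversion factor, chosen by the analyst
    price_per_kg: float  # unit price in SSP, not set by the respondent
    deflator: float      # price deflator, not set by the respondent


def household_outcomes(items, adult_equivalents, recall_days=7):
    """Return the four outcome measures for one household.

    The item count is taken at face value; the other three measures are
    expressed per adult equivalent and per day (recall_days is an assumption).
    """
    scale = adult_equivalents * recall_days
    total_kg = sum(i.quantity_kg for i in items)
    total_kcal = sum(i.quantity_kg * i.kcal_per_kg for i in items)
    total_value = sum(i.quantity_kg * i.price_per_kg / i.deflator for i in items)
    return {
        "n_items": len(items),
        "kg_per_ae_day": total_kg / scale,
        "kcal_per_ae_day": total_kcal / scale,
        "value_per_ae_day": total_value / scale,
    }


# Illustrative household with two items and two adult equivalents.
full_report = [
    ItemRecord(quantity_kg=7.0, kcal_per_kg=3500.0, price_per_kg=10.0, deflator=1.0),
    ItemRecord(quantity_kg=3.5, kcal_per_kg=1000.0, price_per_kg=20.0, deflator=2.0),
]
print(household_outcomes(full_report, adult_equivalents=2))
# Omitting one item lowers every measure at once.
print(household_outcomes(full_report[:1], adult_equivalents=2))
```

In this toy example, dropping the second item reduces the item count, the quantity, the calories, and the value in one step, which is why omission is the cheapest route to underreporting.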

3 Key Results

There is a small difference in reported consumption on average between the treatment and control groups. The consumption levels shown in Fig. 2 show a slight difference in consumption between IDP households in the treatment and control groups, though this is apparent only at lower levels of consumption, below SSP 400. In contrast, the distribution of consumption across the two groups matches much more closely for the non-IDP population. The distribution of the number of items displays a similar pattern, though the effect is also faint (Fig. 3). Again, a difference is not visible in the non-IDP population. The number of observations for the non-IDP population is much lower than for the IDP population, and the variance of the distribution is expected to be much greater.

Fig. 2 Consumption distribution by population and treatment (Source: Authors’ calculations using HFS 2017, IDPCSS 2017 and CRS 2017)

Fig. 3 Number of items consumed by population and treatment (Source: Authors’ calculations using HFS 2017, IDPCSS 2017 and CRS 2017)

If respondents are deliberately misreporting, those misreporters are likely to be doing so at low consumption levels (e.g., it is more likely that a small number of respondents are significantly underreporting than that a large number of people are underreporting by just a little bit). Given that the treatment is not designed to increase reported consumption levels per se, but rather to invoke honesty, it should affect only those people who are misreporting. Hence, heterogeneous treatment effects across different household consumption levels (quantiles) test the validity of ‘honesty primes.’Footnote 12 Figure 4 depicts priming effects across different consumption levels for the four outcome measures of interest.Footnote 13 The priming significantly increases reported consumption among lower consumption levels, but not for medium and higher consumption levels. Significant treatment effects mainly influence the reported number of consumption items and the quantities in kilograms. Monetary and caloric consumption measures are not as strongly affected. The latter might also be less susceptible to deliberate misreporting as they depend in part on variables over which the respondent has no control (calories per item; deflators).

Fig. 4 Treatment effects across quintiles (IDPs) (Source: Authors’ calculations using HFS 2017, IDPCSS 2017 and CRS 2017. All regressions use clustered robust standard errors [White 1980]. Confidence bands refer to the 95% confidence interval. Consumption quantities, values, and calories are used in per-adult equivalent terms. The regression framework is introduced in the appendix. No sampling weights are used as ‘honesty primes’ are expected to affect, specifically, the extremes of the distribution and the average treatment effect is not a priori of interest)
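The logic of looking for effects only in the lower part of the distribution can be sketched with a simple comparison of quantile cut points between treated and control households. This is a simplified stand-in under stated assumptions, not the regression framework from the appendix: it uses no covariates, no clustering, and no standard errors, and the function name and synthetic data are hypothetical.

```python
import statistics


def quantile_differences(treated, control, n=5):
    """Difference between treated and control values at each of the n-1
    quantile cut points (n=5 gives the 20%, 40%, 60% and 80% points).

    A positive entry means treated households report more at that point
    of the distribution than control households do.
    """
    q_treated = statistics.quantiles(treated, n=n, method="inclusive")
    q_control = statistics.quantiles(control, n=n, method="inclusive")
    return [t - c for t, c in zip(q_treated, q_control)]


# Synthetic illustration only: the control group reports 0..100, and the
# treated group is identical except that reports below 30 are 5 units higher.
control = list(range(101))
treated = [x + 5 if x < 30 else x for x in control]

print(quantile_differences(treated, control))
```

With this synthetic input, the difference is positive at the lowest cut point and zero at the higher ones, which is the pattern one would expect if a prime shifts only respondents who underreport at the bottom. A uniform shift across all cut points would instead point toward a general change in measurement.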

The priming has stronger effects among the more vulnerable IDPs. The non-IDP subsample is used to assess the robustness of our main results, as we would expect a less significant priming effect among non-IDPs. Results in Fig. 5 indicate less significant effects, consistent with the hypothesis that ‘honesty primes’ are more effective among more vulnerable IDPs.Footnote 14 This corresponds to adverse/perverse incentives in foreign assistance settings. Specifically, when IDPs are exposed more intensively to development aid, they may be more likely to signal their ‘neediness’ or provide socially desirable answers to signal their ‘worthiness’ for assistance.

Fig. 5 Treatment effects across quintiles (non-IDPs) (Source: Authors’ calculations using HFS 2017, IDPCSS 2017 and CRS 2017. All regressions use clustered robust standard errors [White 1980]. Confidence bands refer to the 95% confidence interval. Consumption quantities, values, and calories are used in per-adult equivalent terms. The regression framework is introduced in the appendix. No sampling weights are used as ‘honesty primes’ are expected to affect, specifically, the extremes of the distribution and the average treatment effect is not a priori of interest)

Four dichotomous indicators are used to assess whether the priming shifts a significant share of respondents above certain reporting thresholds. The first two indicators are equal to one if the respondent household surpasses (i) the caloric subsistence level of 1200 kcal or (ii) the recommended level of caloric intake of 2100 kcal. Two further dummies are created at (iii) 66.66% and (iv) 100% of a normalized poverty line, which is scaled to account for the fact that only core consumption items were assessed consistently across all surveys. Although the coefficients are mostly positive, only two coefficients turn significant, in columns (2) and (3) (Table 1). The results stress the positive effect of the primes, where seven percent more respondent households would have reported above the recommended daily calorie intake level. However, only certain population strata are affected.

Table 1 Results using poverty thresholds
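The construction of the threshold indicators can be sketched as follows. This is an illustration under assumptions rather than the specification behind Table 1: it compares raw shares between groups, whereas the chapter's regressions may include further controls, and the function names and sample values are hypothetical. For a binary outcome regressed on a treatment dummy alone, the raw difference in shares equals the OLS coefficient on the dummy.

```python
def share_above(values, threshold):
    """Share of households whose reported value strictly surpasses the threshold."""
    return sum(1 for v in values if v > threshold) / len(values)


def threshold_effects(treated, control, thresholds):
    """Difference in the share above each threshold, treated minus control."""
    return {
        t: share_above(treated, t) - share_above(control, t)
        for t in thresholds
    }


# Hypothetical per-adult-equivalent daily calorie reports (kcal).
control_kcal = [1000, 1100, 1500, 2200]
treated_kcal = [1300, 1100, 1500, 2200]

# 1200 kcal: subsistence level; 2100 kcal: recommended intake.
print(threshold_effects(treated_kcal, control_kcal, thresholds=(1200, 2100)))
```

The same function applies to the poverty-line dummies by passing monetary consumption values and the corresponding thresholds (66.66% and 100% of the normalized line).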

4 Lessons Learned and Next Steps

Most measures to increase the accuracy of surveys assume that respondents want to report as accurately as possible. In many cases, this assumption is incorrect. This research offers novel and suggestive evidence that increasing the salience of honesty may increase survey accuracy, even if incentives to misreport exist. We find significant treatment effects for respondents most likely to be underreporting (those at lower levels), but no significant effects for those at higher levels who are unlikely to be underreporting. We find that the effects are stronger for outcome measures that can easily be manipulated (the number of consumption items) than for those that cannot easily be manipulated (the monetary consumption quantities).

The study underlying this chapter has two main limitations. First, while the experimental set-up allows for identifying a clean treatment effect, it can only compare the control group against an estimate of the ‘true’ rates of consumption. Without more objective data, it is not possible to dismiss the possibility that the higher consumption levels reported in the treatment group are themselves inaccurate and reflect overreporting. The mortality rates among IDPs suggest that starvation is not occurring systematically across the country, but the precarious situation calls for further scrutiny.Footnote 15 Before adjusting poverty estimates, a thorough comparison with more ‘objective’ data from administrative, anthropometric, or observational sources is needed. Second, the intervention is bundled. For this reason, it is impossible to isolate the causal mechanism behind the observed changes in reporting. However, if classical measurement error were affected, treatment effects of the primes should be uniform. In contrast, heterogeneous effects across quantiles suggest that the targeting of intentional misreporting via the appeal to honesty and the moral prime is the driver of our results. More research that unbundles these primes into different treatment arms or combines them with other survey tools can contribute to developing more durable solutions for data collection. Because of their low costs in terms of both money and survey time, the ‘honesty primes’ constitute a valuable supplement for surveys in contexts where incentives for underreporting exist. Beyond fragile states, the primes could also be a possible survey extension where aid reliance is high (e.g., in Mali or Malawi), as indicated by our subsample analysis.