Studying Sensitive Topics in Fragile Contexts

This chapter discusses the challenges of studying sensitive attitudes and topics in fragility, conflict, and violence settings and summarizes the most common approaches to overcoming them. The first section reviews the challenges involved in studying sensitive attitudes and the factors that could introduce bias and affect the validity of such research. The second section discusses four techniques (endorsement experiment, list experiment, randomized response, and behavioral approaches) that have been developed by researchers to overcome these challenges. The chapter presents an overview of studies that have utilized these techniques and discusses their advantages and limitations.

may incur threats by state and non-state actors, stigmatization, and social ostracism. As a result, questions on issues that are perceived to be sensitive can introduce sensitivity bias, that is, respondents may either avoid answering sensitive questions altogether or provide untruthful responses.
Sensitivity biases generally originate from one of four sources: self-image, taboo (intrusive topics), risk of disclosure, and social desirability. 1 Self-image bias refers to untruthful replies based on misperceptions that individuals may have about themselves. According to self-affirmation theory in psychology, individuals tend to maintain a perception of global integrity and moral adequacy and will reinterpret their own experience until their self-image is restored. 2 Individuals may therefore provide untruthful answers to questions that relate to their integrity and morality because of their distorted self-image rather than out of an intent to deceive others. The second source of sensitivity bias is taboo or intrusive topics that respondents do not feel comfortable discussing with others. In such cases, non-response is more likely than untruthful answers as individuals try to avoid discussing the topic. 3 Risk of disclosure is the third source of sensitivity bias. Here, respondents are reluctant to reply at all, or to provide a truthful response, fearing that their response could be disclosed to the government, rebel groups, criminal groups, or local power holders. 4 Risk of disclosure, in the form of security threats by state and non-state actors or social sanctions by the community, is particularly relevant for research in an FCV context where the expression of views on sensitive topics could be very costly for individuals. 5 Finally, social scientists have long identified social desirability, the fourth source of bias, as a common threat to the validity of research findings. 6 Social desirability refers to 'the tendency on behalf of the subjects to deny socially undesirable traits and to claim socially desirable ones, and the tendency to say things which place the speaker in a favorable light.' 7 Social desirability usually reflects a respondent's concern about the favorable attitudes of a reference group.
The reference group could be peers, bystanders, family members or relatives present at the interview, or even broader groups such as one's community or other communities, institutions, or individuals that consume the research findings. 8 An important reference group whose presence could introduce social desirability bias includes researchers and surveyors. In this case, social desirability is sometimes referred to as the 'experimenter demand effect.' In a study of anti-American sentiment in Pakistan, social desirability bias (social image) was found to potentially lead to the underestimation or overestimation of attitudes toward sensitive issues, depending on whether those with extreme views conform to, and express views consistent with, moderate respondents, or vice versa. 9 Experimenter demand effects highlight that even if a survey or experiment is conducted in a private context where peer pressure is ruled out, the presence of a researcher alone could introduce bias and prevent respondents from expressing honest views and attitudes. 10 In a randomized experiment, it was demonstrated that participants who did not vote in an election were 20 percentage points less likely to answer the door to participate in a survey when they had been previously informed through a flyer about the survey, relative to those who had not received a flyer. 11 The experiment shows the strength of stigma and shame that respondents may feel upon revealing that they did not vote to a surveyor, a stranger with whom they may never interact again. 12 Social desirability bias may be even stronger in fragile contexts where social stigma could be costlier for individuals and where the association of surveys with aid and development projects could disincentivize truthful responses.
5 In a recent survey experiment in Somalia, reminders of local insecurity reduced response rates on sensitive topics more than on other topics. Denny, Elaine, and Jesse Driscoll (2018), "Calling Mogadishu: How Reminders of Anarchy Bias Survey Participation," The Journal of Experimental Political Science. For an early paper on these measurement challenges, see Bullock, Will, Kosuke Imai, and Jacob N. Shapiro (2011), "Statistical Analysis of Endorsement Experiments: Measuring Support for Militant Groups in Pakistan," Political Analysis 19: 363-384.
6 Nederhof, Anton J. (1985), "Methods of Coping with Social Desirability Bias: A Review," European Journal of Social Psychology 15: 263-280; Rosenthal, Robert (1963), "On the Social Psychology of the Psychological Experiment: The Experiment's Hypothesis as Unintended Determinant of Experimental Results," American Scientist 51: 268-283; and Rosenthal, Robert (1966), Experimenter Effects in Behavioral Research. New York: Appleton-Century-Crofts.
7 Nederhof (1985: 264).
8 Blair et al. (2018) and Tajfel, Henri, and John C. Turner (1979), "An Integrative Theory of Intergroup Conflict," The Social Psychology of Intergroup Relations 33 (47).
Regardless of the type, sensitivity bias can introduce two problems in surveys: item non-response and untruthful responses conditional on a response. In the case of item non-response, respondents take part in the survey but avoid answering sensitive questions, and their answers are recorded as 'Don't Know' or 'Refused to Answer.' Item non-response can lead to an underestimation of sensitive attitudes and behaviors and can bias estimates of treatment effects when sensitivity is correlated with treatment status. 13 Untruthful reply conditional on a response reflects cases where respondents do not avoid answering questions but provide deceitful replies. Both of these outcomes undermine research findings. Considering the importance of studying sensitive attitudes, researchers have invested in developing approaches to eliminate or reduce sensitivity biases. Below, we discuss these approaches and highlight whether they address item non-response, untruthful reply conditional on response, or both.
10 Rosenthal (1963, 1966).
11 DellaVigna et al. (2016).
12 DellaVigna, Stefano, John A. List, Ulrike Malmendier, and Gautam Rao (2016), "Voting to Tell Others," The Review of Economic Studies 84 (1): 143-181.
13 For example, when estimating the correlation between receiving aid and support for militant groups, one might worry that respondents in pro-militant communities are more reluctant to express support if they have received aid because they fear that future aid will be withheld. They therefore avoid the question at higher rates than those in other communities, leading one to erroneously conclude that receiving aid is negatively correlated with support for militants.
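The aid example in footnote 13 can be made concrete with a small piece of arithmetic. The sketch below is illustrative only: the function name and all rates are hypothetical, chosen so that true support is identical in both groups while supporters who received aid refuse to answer more often.

```python
def observed_support(true_support, refusal_rate_supporters, refusal_rate_others=0.0):
    """Share of supporters among respondents who actually answer the question."""
    answering_supporters = true_support * (1.0 - refusal_rate_supporters)
    answering_others = (1.0 - true_support) * (1.0 - refusal_rate_others)
    return answering_supporters / (answering_supporters + answering_others)


# Hypothetical values: true support is 50% in both groups, but supporters
# who received aid refuse at 40% versus 10% among those who did not.
aid = observed_support(0.5, refusal_rate_supporters=0.4)
no_aid = observed_support(0.5, refusal_rate_supporters=0.1)

print(f"observed support, aid recipients: {aid:.3f}")
print(f"observed support, non-recipients: {no_aid:.3f}")
```

Under these assumed rates, measured support is lower among aid recipients even though true support is the same, which is the spurious negative correlation the footnote warns about.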

Approaches
Researchers in the fields of psychology, economics, and political science have developed a range of approaches to studying sensitive attitudes, which can be very useful for conducting research and data collection in fragile contexts. Endorsement experiments, list experiments, and randomized response are the most commonly used techniques developed to mitigate sensitivity bias. Table 1 summarizes the three techniques, as well as direct questioning, with respect to their ability to mitigate different types of sensitivity biases. 14 The three techniques can clearly improve on direct questioning by reducing non-response and bias due to risk of disclosure and social desirability. However, they are costly in terms of sample size (because they rely on statistical inference about the difference between two groups rather than the mean in one group), require extensive pre-testing, and cannot address bias due to the intrusiveness of the topic (taboos) and self-image. In this section, we review the three approaches, their advantages, and limitations. 15 At the end of the section, we provide a brief overview of behavioral approaches to addressing sensitivity biases.
14 We thank Graeme Blair for excellent advice on how to frame these issues.
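The sample size cost mentioned above follows from comparing the standard error of a single-group proportion with that of a difference in means between two groups. The sketch below is a rough illustration under assumed values: the function names, the prevalence of 0.25, and the standard deviations of 1.0 are hypothetical, and the comparison ignores the bias that direct questions may carry.

```python
import math


def se_direct(p, n):
    """Standard error of a proportion estimated from one group of size n."""
    return math.sqrt(p * (1.0 - p) / n)


def se_difference_in_means(sd_treatment, sd_control, n_total):
    """Standard error of a difference in means with the sample split evenly."""
    n_group = n_total / 2.0
    return math.sqrt(sd_treatment ** 2 / n_group + sd_control ** 2 / n_group)


def sample_size_ratio(p, sd_treatment, sd_control):
    """Factor by which the two-group design must enlarge the total sample
    to match the standard error of the direct question."""
    return 2.0 * (sd_treatment ** 2 + sd_control ** 2) / (p * (1.0 - p))


# Hypothetical inputs: 25% prevalence; item counts with standard deviation 1.0.
print(round(se_direct(0.25, 1000), 4))
print(round(se_difference_in_means(1.0, 1.0, 1000), 4))
print(round(sample_size_ratio(0.25, 1.0, 1.0), 1))
```

With these assumed inputs the two-group design needs a sample roughly twenty times larger for the same precision; the actual factor depends on the variance of the responses in a given study.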

Endorsement Experiments
Endorsement experiments aim to mitigate non-response and biases due to social desirability and risk of disclosure by obfuscating the object of study. They were first used to study race relations in the US but were later used for studying support for states, international actors, and militant groups. 16 Since questions about support for the state or insurgent groups in fragile states could pose safety issues for enumerators as well as respondents, direct questions about the state or insurgents may not elicit honest answers and typically face high non-response rates. Endorsement experiments overcome both issues by obfuscating the object of evaluation. When applied to measuring support for particular political actors, endorsement experiments seek respondents' views about particular policies, instead of asking the respondents to express views about particular groups or individuals. Researchers solicit views of actors by dividing respondents at random into treatment and control groups. In the control group, respondents are simply asked whether or not they support a particular policy. In the treatment group, respondents are asked the same questions but are reminded that the policy is endorsed by the groups or individuals who are the subject of the study. This approach is based on extensive research in social psychology, which shows that individuals are more likely to favor policies that are endorsed by individuals from groups whom they like. 17 As endorsement experiments avoid direct questioning about sensitive topics, respondents feel more comfortable answering questions, reducing non-response rates. Because this method provides a reasonable degree of plausible deniability, respondents are more likely to provide truthful replies, reducing bias due to risk of disclosure and social desirability. This method can potentially mitigate bias due to taboo (intrusive topics) if researchers can phrase questions in such a way that respondents do not feel that intrusive words are being associated with them. It cannot, however, mitigate biases due to self-image because it does not deal with misperceptions that individuals have about themselves.
16 Sniderman, Paul M., and Thomas Piazza (1993), The Scar of Race. Cambridge, MA: Harvard University Press; Blair, Graeme, C. Christine Fair, Neil Malhotra, and Jacob N. Shapiro (2012), "Poverty and Support for Militant Politics: Evidence from Pakistan," American Journal of Political Science.
In a study on support for Islamist militant groups in Pakistan, researchers included questions about support for polio vaccination, among other policies. 18 Respondents in the control group received the following message: 'The World Health Organization recently announced a plan to introduce universal Polio vaccination across Pakistan. How much do you support such a policy?' Respondents in the treatment group were administered a slightly different statement and question, one which associated the policy with one of four militant groups active in the country at the time.
Compared to the direct questions about the militant groups in this study, the endorsement experiment questions received much lower non-response rates. For instance, while the non-response rate for direct questions ranged from 22% (questions about Al-Qaeda) to 6% (questions about the Kashmir Tanzeem), the non-response rate for endorsement experiments was much lower, ranging from 7.6% to 0.6%.
18 Blair et al. (2012).
19 Blair et al. (2012).
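As a first-pass illustration of how responses of this kind can be compared, the sketch below computes the difference in mean policy support between treatment and control for each policy and then averages across policies. It is a simplification, not the analysis used in the cited study, which the chapter notes requires a latent variable model. The 5-point scale, the second policy name, and all scores are hypothetical.

```python
import math
from statistics import mean, variance


def endorsement_effect(control, treatment):
    """Difference in mean policy support (treatment minus control) and its
    standard error, treating the two groups as independent samples."""
    diff = mean(treatment) - mean(control)
    se = math.sqrt(variance(treatment) / len(treatment)
                   + variance(control) / len(control))
    return diff, se


# Hypothetical 5-point support scores (1 = not at all, 5 = a great deal).
responses = {
    "polio_vaccination": {"control": [3, 4, 2, 5, 3, 4],
                          "treatment": [2, 3, 2, 4, 3, 2]},
    "curriculum_reform": {"control": [4, 4, 3, 5, 2, 3],
                          "treatment": [3, 4, 3, 4, 2, 2]},
}

# A negative effect means the endorsement lowered support for the policy.
effects = {policy: endorsement_effect(g["control"], g["treatment"])[0]
           for policy, g in responses.items()}
average_effect = mean(list(effects.values()))

for policy, effect in effects.items():
    print(f"{policy}: {effect:+.3f}")
print(f"average endorsement effect: {average_effect:+.3f}")
```

Averaging across several policies is one simple way to reduce the influence of any single policy; how to weight or model the policies jointly is a design choice this sketch does not address.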
In addition to measuring sensitive attitudes, endorsement experiments can be used to study sensitive political behaviors. One study used an endorsement experiment to study voting 'no' on a personhood referendum in Mississippi. 20 The authors administered two slightly different primes to the treatment and control groups, as in the following box. By obfuscating the researcher's intention and object of evaluation, endorsement experiments are useful in reducing non-response bias and recovering estimates of sensitive attitudes. A comparison with the official results of an anti-abortion referendum in Mississippi in 2011 showed that while direct questioning significantly underestimated the votes against the referendum (by close to 20% in most counties) and had significant non-response rates, the endorsement experiment and list experiment (discussed below) reduced item non-response and removed approximately half the underestimate of 'no' votes. In contrast, randomized response methods (also discussed below) almost completely recovered the known vote shares. 21 A number of studies have utilized endorsement experiments to study a range of sensitive topics, particularly support for the state and insurgents in fragile states. 22 A useful resource on this topic is a comprehensive guide to, and illustration of, questioning strategy, regression methods, and analysis tools (including a software package in R) for endorsement experiments. 23
The advantage of an endorsement experiment is that it obscures the object of the evaluation above and beyond concealing the respondent's answer to the sensitive question. The main disadvantage is that a latent variable model is needed to estimate sensitive behavior and attitudes. In addition, the endorsement effect does not have an obvious scale: it is unclear a priori how a given percentage change in support for a policy when it is associated with a group, versus when it is not, maps onto a standard Likert scale running from strongly supporting to strongly opposing the group. Its estimates are also statistically inefficient (in the sense of requiring a larger sample to achieve a given confidence interval) compared to the other indirect methods discussed below.

List Experiments
List experiments try to mitigate sensitivity biases by introducing uncertainty through aggregation. This method, also referred to as the 'item count technique,' has been extensively used to study racial attitudes and prejudice as well as voter turnout and vote buying. 25 As in the endorsement experiment, the sample is randomly divided into treatment and control groups. Both groups are asked to report the total number of items on a list that they view as favorable or unfavorable (or the number of actions they have taken), without identifying which specific items are favorable or unfavorable. The two groups receive the same list except that the response options for the treatment group include one additional item, the sensitive item that is the subject of the study.
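Because the two lists differ only in the sensitive item, the difference between the mean counts in the two groups estimates the share of respondents to whom the sensitive item applies. The following is a minimal sketch of that difference-in-means estimator with a normal-approximation confidence interval; the function name and the counts are hypothetical.

```python
import math
from statistics import mean, variance


def list_experiment_prevalence(control_counts, treatment_counts):
    """Estimated prevalence of the sensitive item: mean item count in the
    treatment group minus mean item count in the control group."""
    estimate = mean(treatment_counts) - mean(control_counts)
    se = math.sqrt(variance(treatment_counts) / len(treatment_counts)
                   + variance(control_counts) / len(control_counts))
    return estimate, se


# Hypothetical item counts: three control items, plus one sensitive item
# added to the list shown to the treatment group.
control_counts = [1, 2, 2, 3, 1, 2, 0, 1]
treatment_counts = [2, 2, 3, 3, 1, 2, 1, 2]

estimate, se = list_experiment_prevalence(control_counts, treatment_counts)
lower, upper = estimate - 1.96 * se, estimate + 1.96 * se
print(f"estimated prevalence: {estimate:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

With samples this small the interval is very wide, which illustrates why list experiments need large samples.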
As with endorsement experiments, list experiments can be used to study both sensitive attitudes and behavior. 26 A list experiment to study vote buying in Nicaragua found that almost one quarter of voters were offered gifts or services in exchange for votes while only 3% reported such activities when asked directly. 27 The following box shows the control and treatment statements used for assessing vote buying.
A regression analysis technique can be used to analyze list experiment data, and recent work illustrates the application of the method.
The advantage of list experiments is that respondents do not disclose whether the sensitive item applies to them. By concealing which items a respondent has favorable or unfavorable views about, the list experiment can reduce non-response rates and mitigate biases due to the risk of disclosure and social desirability. Since respondents do not actually reveal which items they agree or disagree with, this method could alleviate the respondents' fear of disclosing their views and their concerns about reference groups. By only expressing the number of favorable or unfavorable items, they can deny reference to the sensitive item. This method, however, cannot mitigate biases due to taboo since the intrusive words need to be mentioned either in the question or in the options. Nor can it reduce biases due to self-image.
The main drawback of this approach is the problem of floor and ceiling effects. In the example above, if the respondent has experienced all the control items, then an honest response would no longer be concealed, as it reveals that the respondent received a gift or favor in exchange for a vote, which is an example of the ceiling effect. 30 In a comprehensive meta-analysis of list experiments applied to political attitudes and behaviors, the list experiment performs well, both in terms of recovering estimates consistent with direct questions about non-sensitive behaviors and in terms of reducing bias. 31
25 Raghavarao, Damaraju, and Walter T. Federer (1979)
28 Imai, Kosuke (2011), "Multivariate Regression Analysis for the Item Count Technique," Journal of the American Statistical Association 106 (494): 407-417. The software package in R for analysis of list experiments can be obtained at http://list.sensitivequestions.org/.
29 Blair et al. (2018).
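Exposure to floor and ceiling effects can be gauged during pre-testing by looking at how many control-group respondents report none or all of the control items. The sketch below is one simple diagnostic under assumed data; the function name and the counts are hypothetical, and it is not a substitute for the formal methods in the cited literature.

```python
def floor_ceiling_shares(control_counts, n_control_items):
    """Share of control-group respondents reporting zero items (floor) and
    all control items (ceiling)."""
    n = len(control_counts)
    floor_share = sum(1 for count in control_counts if count == 0) / n
    ceiling_share = sum(1 for count in control_counts
                        if count == n_control_items) / n
    return floor_share, ceiling_share


# Hypothetical pre-test data for a list with four control items.
pretest_counts = [0, 1, 2, 4, 4, 3, 2, 1, 4, 2]
floor_share, ceiling_share = floor_ceiling_shares(pretest_counts, n_control_items=4)
print(f"floor: {floor_share:.0%}, ceiling: {ceiling_share:.0%}")
```

A large ceiling share suggests that many treated respondents would have to report every item, including the sensitive one, to answer honestly, so the control items may need to be revised.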

Randomized Response
The randomized response approach is useful for estimating population-level variables by obscuring respondents' truthful answers through introducing noise in the responses. 32 In this approach, respondents rely on a random outcome (such as flipping a coin) to add noise to the response. Because the researcher knows the distribution of this noise, it can later be removed from population-level summaries of the responses. Randomized response questions come in two variants. In the disguised response version, the respondent is given two questions (an innocuous question and a sensitive question) and asked to flip a coin or use another randomizing device out of sight of the surveyor. The coin flip determines which of the two questions the respondent answers. In the forced response version, the respondent is asked to answer the sensitive question, but the randomizing device can determine their answer, obfuscating each individual's answer. The following box provides an illustration of these techniques.
Although the randomized response approach has not been used as widely as the endorsement and list experiments because it is slightly harder to explain to respondents, it is an effective method for studying sensitive attitudes and behaviors in contexts where the population is familiar with some randomization device such as dice. 33 The randomized response technique has been used to study social connections and contacts with members of armed groups in Nigeria, a topic that was not only sensitive but could even pose security threats to the respondents and surveyors if inquired about directly. This method has been used for estimating a range of sensitive behaviors, from application faking to cheating and drug use. 34 In the study on Nigeria, the researchers used a multivariate regression technique and provided guidance on power analysis and robust design for randomized response, together with an illustration based on their study of contacts with armed groups and a software package in R for data analysis. 35,36
Validation studies of the randomized response approach have led to mixed results. A number of validation studies have found that the randomized response method leads to less biased estimates than direct questioning and reduces item non-response, although it is not always better than list experiments and endorsement experiments. In a validation of the Mississippi referendum on the 'Personhood Initiative', the authors found that randomized response outperformed other methods in terms of reducing bias. 37 Compared to the actual referendum results, the bias in the weighted estimate of support for the referendum was only 0.04 in the randomized response while it was 0.236 in the direct question, 0.149 in the list experiment, and 0.069 in the endorsement experiment. However, this method was not the best in reducing the non-response rate. Although the non-response rate in the randomized response (13%) was lower than in the direct question method (20%), it was much higher than the non-response rate in the list experiment (2%) and the endorsement experiment (0.003%).
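Because the randomizing device has known probabilities, the noise it adds can be removed algebraically from the observed share of 'yes' answers. The sketch below shows this for simple versions of the two variants described above. The function names, the die and coin designs, and the observed shares are hypothetical, and the disguised response estimator additionally assumes that the 'yes' rate for the innocuous question is known in advance.

```python
def forced_response_estimate(share_yes, p_forced_yes, p_forced_no):
    """Forced response: the device forces 'yes' or 'no' with known
    probabilities; otherwise the respondent answers truthfully."""
    p_truthful = 1.0 - p_forced_yes - p_forced_no
    if p_truthful <= 0:
        raise ValueError("some respondents must answer truthfully")
    return (share_yes - p_forced_yes) / p_truthful


def disguised_response_estimate(share_yes, p_sensitive, innocuous_yes_rate):
    """Disguised response: the device selects the sensitive question with
    probability p_sensitive, otherwise the innocuous question."""
    if not 0 < p_sensitive <= 1:
        raise ValueError("p_sensitive must be in (0, 1]")
    return (share_yes - (1.0 - p_sensitive) * innocuous_yes_rate) / p_sensitive


# Hypothetical die design: 1 forces 'yes', 6 forces 'no', 2-5 answer truthfully.
forced = forced_response_estimate(0.30, p_forced_yes=1 / 6, p_forced_no=1 / 6)

# Hypothetical coin design: heads selects the sensitive question; the
# innocuous question is assumed to have a known 50% 'yes' rate.
disguised = disguised_response_estimate(0.35, p_sensitive=0.5,
                                        innocuous_yes_rate=0.5)

print(f"forced response estimate: {forced:.3f}")
print(f"disguised response estimate: {disguised:.3f}")
```

Dividing by the probability of a truthful answer inflates the sampling variance, which is one reason these designs need larger samples than direct questions.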
The main disadvantage of the randomized response approach is that it requires respondents to administer the randomization themselves, which can lead to high rates of item non-response and even survey attrition. Furthermore, using randomizing devices or flipping coins may be culturally inappropriate in some contexts. A number of validation studies report higher rates of non-response and less valid estimates for the randomized response approach than for a list experiment, although other studies have found more favorable results and smaller non-response rates. 38
35 Blair, Graeme, Kosuke Imai, and Yang-Yang Zhou (2015), "Design and Analysis of the Randomized Response Technique," Journal of the American Statistical Association 110 (511): 1304-1319.
36 The software package in R can be obtained at http://rr.sensitivequestions.org/.
37 Rosenfeld et al. (2015).
38 For the discussion of advantages and disadvantages of randomized response, see Rosenfeld et al. (2015).

Behavioral Approaches
Behavioral approaches mitigate sensitivity bias through direct observation of behaviors that reveal preferences without direct inquiry about those preferences. Two common approaches to measuring behavior are dictator games (where participants are asked to decide whether they want to share money with another participant) and 'offer' experiments (where respondents decide whether or not to accept an amount of money). The strength of these approaches lies in their indirect measurement of sensitive attitudes and the high degree to which they obfuscate the objective of the research. Behavioral approaches have been used in studying a range of attitudes and behaviors, such as discrimination and xenophobia, altruism and prosocial behavior, religious beliefs, and anti-American attitudes. 39 For instance, one study uses financial costs to indirectly study anti-American identity in Pakistan. 40 Study participants were offered Pakistani Rupees (Rs.) 100 or 500, at a time when the daily wage of a manual laborer was between Rs. 400 and 500, merely for checking a box to thank the donor. As shown in the box below, in one version of the instrument, the donor was local (the Lahore University of Management Sciences) while in the second version it was foreign (the US government). The study found that when participants make their decision privately and the source of the funds is the US government, almost one quarter of them forgo the Rs. 100 payment. 41 However, when they expect their decision to be public, a significantly smaller proportion (around 10%) rejects the payment. The authors conclude that since participants expect the majority to accept the payment from the US government, a substantial number of them (15%) conform to the majority and accept the payment although they would not do so in private. When the payment is increased to Rs. 500, the rejection rate falls from 25%, but a significant proportion of participants (10%) still forgo the payment.

Practical Issues
In addition to being useful tools in recovering truthful responses, the indirect methods reviewed in this chapter have a number of practical advantages over direct questioning. First, they help reduce survey staff vulnerability, which might be particularly important in conflict settings. to be protected when local authorities do not allow sensitive questions to be asked, despite legal protection. There is also the added benefit that plausible deniability may protect individuals by not revealing their true response at the individual level in case the survey instruments are compromised. These issues typically do not arise in non-conflict settings but matter greatly where protecting individual responses is critical.
Although the indirect methods for studying sensitive topics outperform direct questioning in many settings, they also have limitations. First, the indirect methods add noise to the estimates, which means that for any given level of statistical power, much larger samples are required to measure group-level differences. 42 Although scholars have proposed ways to reduce noise and remedy the problem of large samples in some cases (such as using double lists or negatively correlated items in a list experiment), the requirement of a large sample remains an important drawback of these indirect methods. 43 Second, these methods require much more extensive pre-testing and preparation than direct questions, which increases the costs (in both financial and human resources) of studying the same topics and could affect the research timeline as well. Third, although these methods reduce sensitivity bias, they cannot overcome incentive compatibility issues. These methods may not provide incentives for respondents to reveal their true views and attitudes even if they are assured that their individual views will not be disclosed. In essence, these methods reduce the cost of expressing views as long as respondents are interested in expressing their views. If respondents see advantages in concealing their views and attitudes, these methods do not provide them with incentives to express their views. Some of the behavioral approaches overcome this problem by imposing costs on the subjects if they do not reveal their preferences, but the three indirect methods do not impose such costs. 44
The most important lesson learned from the studies that have utilized indirect methods, however, is the significance of pre-testing. Endorsement experiments require finding political issues on which the groups in question would plausibly take a stand and that all relate to the same latent policy dimension. Properly implementing list experiments requires choosing control items so that floor and ceiling effects are avoided for almost all respondents. And randomized response requires finding a culturally appropriate randomization device and choosing the appropriate type of question. In short, all indirect methods require much more pre-testing of questions and instruments than traditional direct questions do in order to ensure that they can recover the truthful replies in which researchers are interested.
42 Blair et al. (2018) show that most prior list experiments have been underpowered and recommend using direct questions for all but the most sensitive questions unless large samples can be obtained.
43 For discussion of how to address ceiling effects and reduce noise in list experiments, see Glynn (2013).
Given the cultural and contextual diversity of FCV contexts, some of these methods may work in some contexts but not in others. It is very important to select the appropriate method, taking into consideration the concerns and context in which the research is conducted. Finally, where feasible, researchers should consider validating the findings of indirect methods by comparing them with census data or social media data when these are available.
The opinions expressed in this chapter are those of the author(s) and do not necessarily reflect the views of the International Bank for Reconstruction and Development/The World Bank, its Board of Directors, or the countries they represent.
Open Access This chapter is licensed under the terms of the Creative Commons Attribution 3.0 IGO license (https://creativecommons.org/licenses/by/3.0/igo/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the International Bank for Reconstruction and Development/The World Bank, provide a link to the Creative Commons license and indicate if changes were made.