Measuring Menstruation-Related Absenteeism Among Adolescents in Low-Income Countries


 Benshaul-Tolonen et al. shine a light on two methodological issues impacting a research question that has received much attention recently: whether the provision of menstrual hygiene products reduces schoolgirls’ absenteeism in low-income countries. First, they identify bias in data sources, such as school records and recall data. Second, they show that limiting the focus to menstrual-related absenteeism obscures other threats that menstruation poses to educational attainment, health, and psychosocial well-being. To address these issues, the authors recommend the use of mixed methods, pre-analysis plans, and thoughtful consideration and validation of variables prior to study implementation. They also caution policymakers against overreliance on absenteeism as the sole outcome and overinterpreting results from existing studies that often lack scope and precision. They conclude with a call for more research on the links between menstruation and concentration, learning, self-esteem, and pain management.

hygiene interventions but point out considerable heterogeneity in study design and risk of bias in the underlying studies. Another systematic review and meta-analysis focusing on MHM in India found that absenteeism during periods is common, but when the analysis was adjusted for region, the relationship was not significant (van Eijk et al. 2016). MHM researchers have advised that greater attention be placed on improving the scope and robustness of research to reduce the risk of absenteeism being considered the sole or predominant indicator of a successful MHM intervention (Phillips-Howard et al. 2016b).
Impact evaluation techniques such as randomized controlled trials (RCTs), lauded as among the most reliable methods for understanding development policy impact, offer useful insights for allocating funding to interventions that provide the most positive impact per dollar spent. Determining cost-effectiveness (CE), when outcomes are correctly captured and measured, is financially and ethically prudent in the resource-constrained development policy world. However, correctly identifying and measuring CE is tricky because the researcher needs to determine a priori which outcomes to measure, how to measure them (variable definition), and, importantly, how to define the sample population and size.
This chapter discusses how school enrollment and absenteeism behavior while enrolled in school can be useful outcomes for MHM interventions, as well as how an overreliance on these outcome measures may limit MHM policy impact. While research priorities for MHM have previously been spelled out in Phillips-Howard et al. (2016b) and Sommer et al. (2016), with appropriate methodologies discussed, to date there have been few properly designed analytical studies that have focused on school absenteeism. This chapter does not intend to summarize the qualitative literature on menstruation and schooling. Instead it discusses how school enrollment and absenteeism behavior are consequences of lack of MHM. The chapter also discusses externalities, pre-analysis plans, and CE as they relate to impact evaluation that is relevant to MHM research.

Overview of Existing Studies
We explored the literature on MHM and school absenteeism and included studies that investigated absenteeism using recall data, diary data, school records, or spot check data. Studies have generally counted as absent any child who was not documented to be present at school at the time of study, which can include schoolchildren who have migrated, transferred, or dropped out as well as those temporarily absent at the time of study. Table 52.1 provides an overview of 11 quantitative studies and two systematic reviews that explore the link between menstruation and education, revealing strong heterogeneity in methods, samples, target groups, and findings, which has hindered our understanding of the effectiveness of menstrual-related policies aimed at increasing educational attainment. For brevity, only quantitative studies focusing on educational outcomes were included in the table, so it is incomplete; we refer readers to the systematic reviews for a more complete overview.
An overview of the cross-sectional literature exploring the link between MHM and absenteeism illustrates large differences across contexts and studies. Girls aged 11-17 in Bangladesh report high levels of absenteeism during periods (41%, Alam et al. 2017), and a systematic review of studies in India similarly finds high levels of absenteeism during periods (24%, van Eijk et al. 2016). In the latter study, differences in absence relating to pad use were no longer significant when taking region of India into account. In a recent study from three states in India, absenteeism rates were reportedly 6-11% among girls in grades 8-10 (Sivakami et al. 2019), while in neighboring Nepal, period-related absenteeism was almost non-existent among girls in grades 7-8 according to one study (0.19%, Oster and Thornton 2011).
Three studies from Malawi, Kenya, and Uganda find high overall levels of absenteeism among girls, especially due to transfers and mobility (for grades 5-8 in Benshaul-Tolonen et al. 2019; for grades 3-5 in Montgomery et al. 2016), but low menstrual-related absenteeism (2.4% of absent days, Grant, Lloyd, and Mensch 2013, for ages 14-16). Because menstruation is limited to 0-5 days per month, absence within these few days may be hard to isolate in a high-absenteeism context, such as those found in Malawi (Grant, Lloyd, and Mensch 2013) and western Kenya (Benshaul-Tolonen et al. 2019). In addition, two studies found no or weak gender differences in school absenteeism behavior (Benshaul-Tolonen et al. 2019; Grant, Lloyd, and Mensch 2013). However, the method used to collect such data may influence reporting. This is illustrated by Grant, Lloyd, and Mensch (2013), where one-third of girls reported having missed at least one day of school during their last period when answering using an audio-computer assisted survey instrument (ACASI) instead of reporting face-to-face. (The same question was not collected using face-to-face interviewing.)
Building on the cross-sectional evidence of absenteeism, a subset of studies evaluates policies aiming to reduce school absenteeism among girls. A pilot non-randomized intervention in Ghana (Montgomery et al. 2012) and a quasi-randomized intervention in Uganda that provided education and sanitary pads were associated with increased school attendance, but notably showed a similar rate of change in absence among girls provided education only. Given these study designs, the program effects should not be interpreted as causal. A larger cluster-randomized study from Kenya that provided sanitary pads or menstrual cups and followed 644 girls over an average of ten months found no or weak evidence of reductions in drop-out rates or absenteeism (Phillips-Howard et al. 2016a), although sanitary pads appear to have marginally reduced absenteeism (Benshaul-Tolonen et al. 2019).
Two studies focusing on latrine availability and quality are included. Latrine building and latrine improvements positively impact school enrollment and school attendance, especially for pubescent girls (grades 6-8 in Adukia 2017; grades 4-8 in Freeman et al. 2012). The gender differential could stem from menstruation-related absenteeism among girls; however, many factors differ between pubescent girls and boys, such as safety and privacy concerns while using latrines. The two studies (Adukia 2017; Freeman et al. 2012) did not specifically explore how latrine availability interacts with MHM needs, which limits our understanding of this potential channel. In fact, no study has been identified that evaluates the effect of latrine improvement programs specific to menstrual-related absenteeism.
Beyond school absenteeism, studies report several issues that girls face relating to MHM, such as lack of suitable disposal possibilities (Tegegne and Sisay 2014; van Eijk et al. 2016) and concentration issues (40-45%, Sivakami et al. 2019). In focus groups, girls reported that being teased and humiliated after leaking led some girls to drop out of school (Tegegne and Sisay 2014). Cramps and pain stand out as a common issue, as 31-38% of girls interviewed in three government schools in India reported suffering from abdominal pain during their period (Sivakami et al. 2019). In fact, pain may be one of the main reasons for missed schooling; in one study in Nepal, almost half of missed days during periods were due to cramps (Oster and Thornton 2011). No program evaluation studies have been published to date on tackling cramps and pain as a means to increase school attendance and participation, although testing of pain relief has been conducted in pilot schools in Uganda.

Choosing Outcomes
One of the most complex aspects of study design, especially in impact evaluations, is choosing the right measurable and objective outcome variables, and defining them in a transparent and intelligible way that will accurately reflect program effects. This is preferably done a priori (Head et al. 2015) and publicly registered 1 to avoid choosing definitions that yield statistically significant results, so-called cherry-picking and/or p-hacking. Journals tend to favor publishing studies that intentionally or unintentionally overreport statistically significant results (Head et al. 2015; Miguel et al. 2014; Brodeur 2016), leading to publication bias. To caution against this kind of bias in the published body of research, Miguel et al. (2014) show how small changes to variable definition for educational outcomes can yield different program effects. Preregistration of studies with a predefined statistical analysis plan thus emerges as a best practice to be adopted by all quantitative, menstrual-related intervention studies, especially, but not limited to, impact evaluations.

Thinking About the Margin: Extensive and Intensive Margins
The field of economics is interested in the extensive and intensive margins when discussing policy impact. In the context of schooling, one easy way to measure impact is to focus on school enrollment. Enrollment is an extensive margin metric that answers how many students are enrolled in school, or the likelihood that a given student is enrolled in school. While enrollment is fairly simple to measure, making it a good contender for evaluating a menstrual health intervention, it tells us little about students' actual learning, making it a fairly crude measure. Alternatively, the intensive margin measure, one that helps us understand how often pupils attend school or what share of school hours are missed by a given student, may provide a better option. Intensive margins often involve more decisions regarding variable definition. Here, researchers must specify how to measure school attendance (for example, share of days in a school week missed or share of hours in the school day missed). In the context of MHM-related absence, it also raises other questions. Should absence due to period cramps before the onset of a period be classified as menstrual-related absence? More sophisticated temporal analysis than the extant "Period=Yes or No" is needed to detect behaviors across the menstrual cycle.
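The distinction between the two margins can be made concrete with a minimal sketch. The record layout, field names, and numbers below are hypothetical and chosen only for illustration; an actual study would need to fix its own definition of absence (days, hours, or school weeks) in advance.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    student_id: str
    enrolled: bool
    days_scheduled: int  # school days in the observation window
    days_absent: int     # days recorded absent in that window


def enrollment_rate(records):
    """Extensive margin: share of students who are enrolled at all."""
    return sum(r.enrolled for r in records) / len(records)


def absence_share(records):
    """Intensive margin: share of scheduled school days missed,
    computed only among students who are enrolled."""
    enrolled = [r for r in records if r.enrolled and r.days_scheduled > 0]
    total_days = sum(r.days_scheduled for r in enrolled)
    return sum(r.days_absent for r in enrolled) / total_days


# Hypothetical data: three enrolled students and one who is not enrolled.
records = [
    StudentRecord("a", True, 20, 2),
    StudentRecord("b", True, 20, 0),
    StudentRecord("c", False, 0, 0),
    StudentRecord("d", True, 20, 6),
]

extensive = enrollment_rate(records)  # 3 of 4 students enrolled
intensive = absence_share(records)    # 8 of 60 scheduled days missed
```

The two numbers answer different questions: an intervention could leave enrollment unchanged while still changing the share of days missed, or the reverse.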

Hard vs Soft Metrics
In contrast to qualitative studies, quantitative studies rely on outcomes that are seemingly easy to measure and quantify. In the interest of precision, we propose making further distinctions between hard and soft metrics. Hard metrics, such as physical attendance in the classroom or exam scores, are more readily available and observable than soft metrics, such as concentration while in the classroom or absorption of knowledge. Test scores are often used in school-related impact evaluations, but they are an imperfect metric of learning, as even the most comprehensive and well-designed exam rarely reflects true knowledge. Impact evaluations in the MHM arena often limit their outcomes to "hard" metrics, a focus that may neglect positive impacts on equally important "soft" outcomes, such as concentration, participation, learning, self-esteem, enjoyment of learning, and comfort. While "soft" outcomes are not necessarily impossible to measure, doing so in a satisfactory way may require more complex, time-consuming, and expensive studies.

When Measuring "Hard" Outcomes Is Fraught
Even the most readily observable and measurable "hard" outcomes can be fraught. Consider the following example: The research team wants to understand if providing menstrual pads helps adolescent girls in rural areas in developing countries increase their educational attainment. They meet with principals of 40 schools in the study region, who agree to participate. The research team implements the program following the "ideal" setup, randomizing treatment at the school level because of potential externalities. Analysis of the baseline data shows that girls are just as likely to be absent from school on days when they have their period in the treatment schools as in the control schools.
Scenario 1: The administrative school records used contain many mistakes, so the program evaluation does not show any significant effects. In the presence of random measurement error in the dependent variable, the estimate of the program effect remains unbiased, but its variance will be larger. This could lead to a type II error, where we fail to reject the null hypothesis that there is no difference across groups. We may wrongly conclude that the program didn't work, and the policy will be under-utilized from a social desirability standpoint.
Scenario 2: The school record captures presence ("present"), but an absent student is either entered in the record as "absent" or given no entry at all ("missing record"). To further complicate matters, sometimes the head teacher gets interrupted and not all attendance calls are completed, leading to missing records in the attendance book for some students who were present. These data suffer from sample selection bias, and regression analysis using the absence data as the dependent variable will yield biased estimates.
Scenario 3: The principal is worried about the school's attendance data and whether it could be accessed by other parties, such as the regional government, which might have implications for future decisions about funding or permits. The principal decides to "clean" the data before sharing it. Because the principal does not apply the "cleaning" at random, the measurement error likely follows a pattern that biases any results based on these data. While the randomization was successful at baseline and independent of principal behavior, the treatment came to interact with the principal behavior ex post.
Scenario 4: Girls in the control group receive extra attention which encourages them to attend school in ways similar to the intervention group (that is, a Hawthorne effect).
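The statistical point in Scenario 1 can be illustrated with a small simulation. This is a sketch under simplifying assumptions that are not taken from the chapter: treatment is assigned at the individual level rather than the school level, the outcome is days absent per term, the true effect is one fewer day absent, and all means and standard deviations are invented for illustration.

```python
import random
import statistics


def simulate_estimates(error_sd, n_per_arm=200, n_sims=500,
                       true_effect=-1.0, seed=0):
    """Repeat a two-arm trial many times and return the mean and the
    standard deviation of the estimated treatment effect.

    error_sd is the standard deviation of random recording error added
    to the outcome (0.0 means the records are error-free).
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_sims):
        # Recorded outcome = baseline + true variation + recording error.
        control = [5.0 + rng.gauss(0.0, 2.0) + rng.gauss(0.0, error_sd)
                   for _ in range(n_per_arm)]
        treated = [5.0 + true_effect + rng.gauss(0.0, 2.0)
                   + rng.gauss(0.0, error_sd)
                   for _ in range(n_per_arm)]
        # Difference in means is the estimated program effect.
        estimates.append(statistics.fmean(treated) - statistics.fmean(control))
    return statistics.fmean(estimates), statistics.stdev(estimates)


clean_mean, clean_sd = simulate_estimates(error_sd=0.0)
noisy_mean, noisy_sd = simulate_estimates(error_sd=4.0)
```

Under these assumptions both `clean_mean` and `noisy_mean` stay close to the true effect, while `noisy_sd` is clearly larger than `clean_sd`: the noisy records do not bias the estimate, but they make it harder to distinguish from zero. Scenarios 2 and 3 differ precisely because their errors are not random.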

Self-Reporting Bias
Asking respondents directly may be the easiest way to understand certain behaviors. But relying on self-reported measures of school attendance may lead to biased estimates and wrong conclusions, due to recall bias and social desirability bias. Recall bias is a real threat to surveys that ask participants to recall absenteeism behavior during previous menstrual periods. Indeed, recall bias has been shown to be common within epidemiological and medical studies relying on recall data (Althubaiti 2016; Coughlin 1990). Solutions to recall bias include asking participants to keep a diary and reducing the length of the recall period (Althubaiti 2016), although the data may still suffer from self-report bias. Social desirability bias is a threat in surveys when the topic is associated with shame or stigma and anonymity cannot be guaranteed at the time of the data collection. It is most easily overcome with validation of the instrument before the intervention (Althubaiti 2016). In the context of MHM, Grant, Lloyd, and Mensch (2013) complemented face-to-face survey questions with a more "private" option using a computer system (ACASI) and found higher levels of reported menstruation-related absenteeism in the more "private" option. Some further attempts have been made to cross-validate absenteeism data. For example, Oster and Thornton (2011) cross-validated self-reported absenteeism from diaries with school records in Nepal and found high levels of agreement. Benshaul-Tolonen et al. (2019) cross-validated school records against spot check data in 30 schools in Kenya and found a fair amount of inconsistency across the two data sources. We argue that studies focusing on absenteeism during periods should consider how to elicit truthful reporting, be prospective or current rather than retrospective, and aim to cross-validate any metric of absenteeism.
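One simple way to cross-validate two absence measures for the same student-days is to compute raw agreement together with a chance-corrected statistic. The sketch below uses Cohen's kappa; this choice of statistic is an assumption made for illustration and is not necessarily what the cited studies used, and the data are invented.

```python
def agreement_and_kappa(source_a, source_b):
    """Return (share of matching entries, Cohen's kappa) for two binary
    absence indicators covering the same student-days (1 = absent)."""
    if len(source_a) != len(source_b) or not source_a:
        raise ValueError("sources must be non-empty and of equal length")
    n = len(source_a)
    observed = sum(a == b for a, b in zip(source_a, source_b)) / n
    p_a = sum(source_a) / n  # absence rate according to source A
    p_b = sum(source_b) / n  # absence rate according to source B
    # Agreement expected by chance if the two sources were independent.
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        return observed, float("nan")  # kappa undefined without variation
    return observed, (observed - expected) / (1 - expected)


# Hypothetical data: school register versus unannounced spot checks.
register = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
spot_check = [0, 1, 1, 0, 0, 0, 0, 1, 0, 1]

share_agree, kappa = agreement_and_kappa(register, spot_check)
```

In this invented example the two sources match on 7 of 10 student-days, but kappa is only about 0.35, because much of that agreement would be expected by chance when most students are present on most days. Reporting both figures avoids overstating how well two data sources line up.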

Externalities
The term externalities, as used in economics, refers to effects on third parties that are not actively involved in consumption or production choices. Externalities are important to consider in any impact evaluation; failure to do so can lead to an overestimation or underestimation of the program impact. Menstrual hygiene evaluations must consider a range of potential positive and negative externalities. For example, a large-scale, free, single-use menstrual pads program could put non-participants' health at risk if these used pads are disposed of in latrines, causing the latrines to malfunction and preventing their use among the wider population (negative externality). However, the same program may be effective in reducing transactional sex for pads among impoverished girls, in this way reducing the prevalence of sexually transmitted infections (STIs) and lowering the risk of STI transmission in the community (positive externality). Similarly, a menstrual health information program rolled out to 6th graders may have a positive impact on 4th graders, if many 6th graders have younger siblings to whom they pass on the information (positive externality). Therefore, a well-designed program will, firstly, randomize treatment at a level where no contamination of the control group is expected. Often with menstrual hygiene programs, randomization should be done at the school level with a sufficient number of schools included (more than 45 schools [so-called clusters] according to Angrist and Pischke 2008), ensuring balance on student characteristics at baseline. A well-designed program will measure externalities where appropriate and be explicit about when it does not. A study that does not measure externalities should make that clear, especially when conducting any cost-benefit (CB) or CE calculations.
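Randomizing at the school level, as described above, can be sketched as follows. The school identifiers, the seed, and the equal split into two arms are hypothetical; the sketch also does not check baseline balance on student characteristics, which a real design would need to do.

```python
import random
from collections import Counter


def randomize_schools(school_ids, seed):
    """Assign whole schools (clusters) to arms; returns {school_id: arm}."""
    rng = random.Random(seed)           # fixed seed makes the draw reproducible
    shuffled = sorted(set(school_ids))  # sort first so input order is irrelevant
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {
        school: ("treatment" if position < half else "control")
        for position, school in enumerate(shuffled)
    }


def arm_for_students(student_school, school_arm):
    """Each student inherits the arm of her school."""
    return {student: school_arm[school]
            for student, school in student_school.items()}


# Hypothetical example: 46 schools and three students listed for illustration.
schools = [f"school_{i:02d}" for i in range(46)]
school_arm = randomize_schools(schools, seed=2024)
student_school = {"s1": "school_00", "s2": "school_00", "s3": "school_17"}
student_arm = arm_for_students(student_school, school_arm)
arm_counts = Counter(school_arm.values())
```

Because assignment happens once per school, two classmates can never end up in different arms, which is what limits contamination of the control group through sharing of pads or information within a school.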

Cost-Benefit and Cost-Effectiveness Analysis
CB and CE analyses are two distinct tools that help measure policy success. CB analysis measures the benefit from a policy intervention in terms of monetary gains. In the health and education sphere, CB analysis can be complex because it necessitates making assumptions about the statistical value of life, expected longevity, and change in future and lifetime earnings in the study population. CE analysis, on the other hand, avoids many of these assumptions by simply comparing two or more policies against each other. For example, providing menstrual pads might reduce absenteeism by one day per ten USD spent on pads. Within the same or a different program, providing a menstrual cup may be found to reduce absenteeism by 1.2 days per ten USD spent. In this case, the latter policy is more cost-effective (note that these numbers are hypothetical). Additionally, CE analysis allows comparison of completely different policies as long as the studies have similarly defined outcome variables. This is well illustrated in the J-PAL policy bulletin (2012) which compares the CE of a deworming program to alternative policies with the same aim of raising attendance, for example providing free school uniforms or scholarships.
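The hypothetical comparison above amounts to a ratio of effect to cost. A minimal sketch, using only the illustrative numbers from the text (the function and variable names are our own):

```python
def days_averted_per_usd(days_reduced, cost_usd):
    """Cost-effectiveness ratio: absent days averted per USD spent."""
    if cost_usd <= 0:
        raise ValueError("cost must be positive")
    return days_reduced / cost_usd


# Hypothetical figures from the text: (absent days averted, USD spent).
policies = {
    "menstrual pads": (1.0, 10.0),
    "menstrual cup": (1.2, 10.0),
}

ratios = {name: days_averted_per_usd(days, cost)
          for name, (days, cost) in policies.items()}

# Equivalent view: USD needed to avert one absent day.
cost_per_day = {name: cost / days for name, (days, cost) in policies.items()}

most_cost_effective = max(ratios, key=ratios.get)
```

The ranking is only meaningful when both policies are evaluated on the same outcome definition, here absent days, which is why comparable variable definitions across studies matter for CE analysis.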
A main limitation with CB and CE analyses is that they are limited to measured outcomes and do not fully reflect the costs and benefits of aspects that were not monitored within the research program. For example, if a program reduces time spent on washing cloths and increases money spent on soap but these outcomes were not monitored within the program, the CE and CB analyses will not fully reflect these costs and benefits. Similarly, externalities (such as impacts on siblings and friends who are not study participants) may not be captured in the study design and, therefore, not reflected in CE or CB analysis. These issues are inherent to all CE and CB analyses, of course; they are not specific to MHM studies.

Pre-analysis Plans
Pre-analysis plans, where the researcher registers the research protocol and planned analysis in advance of collecting the data, are tools available for impact evaluations that can increase transparency in research. 2 The use of pre-analysis plans has been lauded within the social sciences for reducing cherry-picking of outcomes (Casey, Glennerster, and Miguel 2012; Miguel et al. 2014) and mitigating donor/funder pressure to publish only positive results. Research by Casey, Glennerster, and Miguel (2012) provided support for the use of pre-analysis plans by demonstrating how researchers could alter their choice of outcome and show that a program was either a failure or a success. The pre-analysis plan limited donor pressure to show a successful program.
However, pre-analysis plans are sometimes said to stifle researcher creativity and the adaptability of research programs in the wake of unexpected findings and events. Medical and clinical sciences, where RCTs and pre-analysis plans gained a foothold much earlier than in the social sciences, adopted pre-analysis plans to ensure quality in drug trials. In the case of MHM studies focusing on school absenteeism, it is important to register the research questions, the study design, the definition of absenteeism, and the collection method for absenteeism data. The use of pre-analysis plans is recommended for all impact evaluation analyses, not only those related to MHM.

The Benchmark Study
Grounded in this review of the state of research examining the relationship between educational attainment and menstruation, we propose the features of an ideal study. This study would:
• Allow for mixed methods to capture attitudes, beliefs, practices, and ideal scenarios at baseline and how these attitudes, beliefs, practices, and ideal scenarios differ at the endline.
• Differentiate between the extensive margin (drop-out rates) and the intensive margin (absenteeism).
• Take into account seasonal and cyclical temporal variations in outcomes, such as premenstrual absenteeism due to cramps and differential absenteeism rates per term during harvest seasons or near school exams. (For example: Before-after studies may need to evaluate outcomes at the same time in a calendar year.)
• Use a before-after study design with randomized allocation of treatment and predefined outcomes (see pre-analysis plans) that are objective and observable.
• Consider externalities (effects on third parties). For example, a student in the treatment group who received sanitary pads decides to share them with a friend who was randomized into the control group. If spillover effects are likely, randomization should be done at the cluster level, dividing schools into treatment and control schools, rather than at the individual level. Cluster randomized trials need a sufficient number of clusters to support valid statistical inference.
• Include a post-study roll-out of benefits to all participants for ethical reasons. For example, the control group should receive reusable menstrual pads or a menstrual cup upon the termination of the study.
• Measure changes in the external environment. A menstrual health program may change school-level administration or student use of the latrines. Therefore, a careful survey of the external environment is recommended.

Discussion
There is mixed evidence on whether menstruation leads to higher absenteeism rates in low- and middle-income countries, with some studies finding weak or nonexistent links between periods and absenteeism (Grant, Lloyd, and Mensch 2013; Oster and Thornton 2011; Phillips-Howard et al. 2016a). Other studies confirm high levels of menstruation-related absenteeism (for example, Tegegne and Sisay 2014), especially among girls using traditional materials (Mason et al. 2015). While menstrual-related absence remains an important and unanswered question, many factors hinder our understanding of how menstruation affects educational attainment and the psychosocial aspects of schoolgirls' lives. We call for the use of a broader set of outcomes in studies that explore the links between menstruation and education in low- and middle-income countries. Furthermore, stigma and taboos can make the measurement of menstrual-related absenteeism hard. In one study in Kenya, girls reported that other girls, but not themselves, miss school because of their periods (Mason et al. 2013). In Malawi, girls were much more likely to report being absent from school during their period if they reported in private to a computer instead of face-to-face to an enumerator (Grant, Lloyd, and Mensch 2013). Data collected on sensitive topics such as menstrual experiences and menstrual-related school absenteeism should therefore be cross-validated before use.
Because of the potential role of stigmas, taboos, and varying levels of poverty, we must refrain from overinterpreting the external validity of studies (the extent to which the results of a study generalize to other contexts, populations, and times). For example, a pilot study conducted in Nepal showed that providing menstrual products to schoolgirls did not improve school attendance, possibly because the girls did not report missing school because of their periods in the first place (Oster and Thornton 2011). While the study was well designed and implemented, it does not show that menstrual hygiene interventions cannot reduce school absenteeism. It merely shows that in a sample of 199 schoolgirls in Nepal, who do not miss school because of their periods, such interventions are ineffective in improving attendance. The external validity of these results to contexts with high menstrual-related absenteeism is likely low.
We assert that the majority of studies to date are informative but suffer from three main methodological limitations. First, the limited focus on menstrual-related absence hampers our understanding of the wider threats that menstruation poses to participation in school and the psychosocial aspects of schoolgirls' experiences. Second, most studies use data sources that may suffer from self-reporting and recall bias, and few studies validate their data using alternative methods. Third, the external validity of even the best menstrual-related studies must be considered, given the influence of stigma and taboos in determining behavior and experiences. Therefore, we caution policymakers against relying on school absence as the sole outcome variable. We also caution against the overinterpretation of results from existing studies that often lack both scope and precision.
Furthermore, we recommend that future studies explore the importance of pain management. School-age girls from several different contexts have reported abdominal pain as an issue. A majority of students in an Indian study reported that menstrual pains reduced participation (Sivakami et al. 2019), 66% of interviewed students in Nigeria reported abdominal pain and discomfort (Adinma and Adinma 2008), and pain was considered an issue by both students and teachers in Cambodia (Connolly and Sommer 2013). Other issues that future studies should explore include the impact of menstruation on concentration, test scores, and self-esteem. To date, qualitative studies have best demonstrated an understanding of the relevance of these margins, and future quantitative studies need to build on this body of work to support the design of evidence-based programs.
Lastly, studies find that absenteeism is common among both boys and girls in some contexts (Benshaul-Tolonen et al. 2019; Grant, Lloyd, and Mensch 2013) and that puberty education is lacking for both girls and boys (Sommer 2013). Further research is needed to understand the underlying causes of high rates of absenteeism among boys and girls and to identify cost-efficient policies to reduce absenteeism among all students.
In summary, questions of internal and external validity are important to consider when deciding whether to employ or reject a potential policy, especially since there may be differences in menstrual stigma and taboos across different populations. To avoid misinformed conclusions regarding program success or failure, we recommend employing mixed methods for MHM interventions, designing each program with utmost care, validating the survey instrument, and replicating the study design in multiple contexts.
Finally, more research is needed on the topic of menstrual-related educational attainment, especially research that is sensitive to the context and particular intervention, in line with recommendations made by Sommer (2010b). We encourage attempts to conduct streamlined studies that would allow for comparisons across contexts, as recommended by Hennegan and Montgomery (2016). Reproducible and homogeneous research designs with standardized metrics and definitions that are meaningful to educationalists across several distinct contexts (Phillips-Howard et al. 2016b) could provide a fruitful way forward in overcoming the shortcomings of the limited and heterogeneous body of evidence on this question of MHM in educational settings. 3

Notes