1 Introduction

Although the tax compliance literature has traditionally focused on deterrence and tax morale as key determinants of the decision to comply (Slemrod et al., 2001; Luttmer & Singhal, 2014; Pomeranz, 2015; Kleven et al., 2011), recent research shows that taxpayer confusion and compliance costs play a key role too. Taxpayers often have a very limited understanding of the tax system. As a result, they behave in ways that are inconsistent with economic theory and with the incentives set out by policymakers (Feldman et al., 2016). They also leave money on the table because they fail to take up provisions that would benefit them, such as the option to deduct expenses or to enroll in online tax-filing platforms (Benzarti, 2015; Okunogbe & Pouliquen, 2022). Taxpayers’ weak knowledge and confusion are consistent with a broader literature showing that people are often ill-informed about the taxes and transfers that would affect their economic choices (Chetty et al., 2009; Liebman & Luttmer, 2015).

In our context, compliance costs refer to the cognitive effort that taxpayers make in understanding complex tax systems, and to the administrative costs they incur in fulfilling their tax obligations. Recent evidence has consistently shown that these costs are both large and regressive, affecting particularly small taxpayers (Coolidge, 2012; Aghion et al., 2017). A recent study focusing on small firms in Finland shows that compliance costs produce larger behavioral responses than changes in the tax rate, which has been the subject of a much larger empirical and theoretical literature (Harju et al., 2019). These issues are even more acute in African countries, where tax knowledge is strikingly low—to the extent that the majority of citizens do not know what taxes they owe to the government or what tax payments are for (Isbell, 2017; Aiko & Logan, 2014). These facts make our analysis of taxpayer education both urgent and relevant, as well as being novel, given the near absence of other studies in this area.

While it is increasingly clear that taxpayer confusion and compliance costs are key constraints in the taxpaying process, there is almost no evidence on how such constraints can be alleviated. An obvious option is taxpayer education, which can be broadly defined as the in-depth and comprehensive provision of information about the tax system. Tax administrations across the world already engage in this area through a wide variety of initiatives, ranging from traditional training to tax edutainment. In Africa alone, we document the existence of radio programs on tax topics; tax-themed soap operas to sensitize the public about taxpaying; tax clubs in schools, where pupils learn about tax and then compete across schools; informative videos on social media, where celebrities explain to young people why it is important to get their tax affairs in order right from the start; and mobile tax units, which are essentially vans traveling to rural areas to support taxpayers.Footnote 1 Similar initiatives can be found in other regions of the world. These include, among others, programs to provide accounting and tax assistance to low-income taxpayers in the USA and Brazil; online tools and campaigns in Chile, Colombia, and Estonia; and various initiatives with children in schools in Jamaica, Mexico, Malaysia and Morocco.Footnote 2 However, programs like these often remain largely undocumented and, most importantly, none of these initiatives has been rigorously evaluated. As a result, we do not know whether they are effective and whether they constitute a good use of tax administrations’ scarce resources.

This paper aims to address this gap by answering the following question: can taxpayer education affect tax compliance behavior? To address this question, we evaluate a program run by the Rwanda Revenue Authority (RRA) that aims to educate new taxpayers about the basic elements of the tax system (more details in Sect. 2.2). It is specifically aimed at helping them to comply as they enter the tax system, with a focus on learning and setting good compliance habits right from the start. Our evaluation relies on a unique dataset combining administrative and survey data, which includes both information on real compliance behavior and a rich set of variables on taxpayer characteristics and perceptions (more in Sect. 3.1). Using this dataset, we show that the program positively affects three compliance outcomes: the probability of declaring, the probability of having zero-tax liability, and the tax amount (more details on these variables in Sect. 3.1).

Because of policy constraints, we could not randomize program attendance or invitations (see Sect. 2.3). Our identification strategy is therefore not bulletproof, but it is nonetheless rigorous and robust. Our key findings are confirmed with three empirical strategies. First, we use a simple probit/tobit model exploiting the fact that participants in the program are very similar to non-participants across a wide range of characteristics before the program takes place (see Sects. 3.2 and 4.1). Second, to address potential empirical challenges related to selection, we also use propensity score matching to improve comparability across the two groups (see Sect. 3.2). Finally, we test the robustness of our results by using an IV strategy that relies on the exogenous assignment of taxpayers to take part in our survey (more details in Sect. 4.3).

Our analysis confirms that the program affects compliance on both the extensive and intensive margins. On the extensive margin, attending the program results in a large increase in the probability that a new taxpayer will file a declaration in the first year (full results in Sect. 4.2). This finding is particularly important in the context of low-income countries, where declaration rates are low (less than 50% among new taxpayers in Rwanda) and tax administrations struggle to expand the tax base. On the intensive margin, attending the program results in a lower probability of filing zero and in a higher reported tax liability. These impacts remain statistically significant and economically large across specifications: the increase in the probability of declaring ranges from 29% in the probit estimation to 64% with our IV strategy, while the probability of filing zero decreases by at least 6% and the tax amount increases by at least 43%.

Furthermore, in Sect. 4.4, we investigate the possible mechanisms through which the program’s effect on tax compliance may come about, including factors that are typically explored in this literature such as deterrence and tax morale. We show that the main channels are an increase in taxpayer knowledge and a decrease in the perceived complexity of the tax system, both of which are directly related to the central themes of taxpayer confusion and compliance costs. Importantly, we show that the effect of taxpayer education persists 2 years after attendance (Sect. 4.6). We argue that the program helps to establish a habit of compliance that lasts over time, which links our results to recent evidence on the role of habit in tax compliance (Dunning et al., 2017). Our administrative data show that making a declaration in the first year of operation dramatically increases the probability of declaring in subsequent years, while taxpayers who fail to file a declaration in their first year are extremely unlikely to ever submit a declaration.

To the best of our knowledge, ours is the first rigorous evaluation of the impact of tax education on multiple outcomes related to compliance behavior, as well as taxpayer knowledge and perceptions. The only study we could find on this topic evaluates the effect of a two-minute explanation provided by tax preparers to taxpayers who are eligible for the Earned Income Tax Credit (EITC) in the USA (Chetty & Saez, 2013). These authors find no effect on the amount of income reported, which is the only outcome evaluated. While this paper represents a key reference point for our study, it focuses on a very quick (two minutes) and narrow (specifically on the EITC) intervention. In the words of the authors: “While our results suggest that knowledge about the tax code cannot be easily manipulated with simple information treatments, the spread of knowledge through peer networks or other sources that affect knowledge in more persistent ways could have larger impacts on behaviour” (Chetty & Saez, 2013, p. 4). We build on this work by exploring another source of increased knowledge: taxpayer education, understood as the in-depth and comprehensive provision of information to taxpayers.

Our research is related to the growing strand of experimental literature on informational nudges used by tax administrations to increase compliance. Recent reviews of this literature include Antinyan and Asatryan (2020), Mascagni (2018) and Hallsworth (2014). These field experiments evaluate the effectiveness of messages sent by the local tax administration to taxpayers around the time of declaration. Such messages provide information about specific aspects of the tax system, typically related to either deterrence (e.g., sanctions for non-compliance) or tax morale (e.g., public services funded by tax). While these nudges are generally effective in increasing compliance in the tax declaration that immediately follows receipt of the message, most studies do not look at longer-term effects. The limited existing evidence suggests that they do not generate any longer-term effect (Mascagni & Nell, 2021; Manoli & Turner, 2014). The literature is much scarcer for low-income countries, although two studies from Ethiopia and Rwanda largely confirm the results of the broader literature, including the lack of any effect beyond the first year (Shimeles et al., 2017; Mascagni & Nell, 2021).Footnote 3 The (very) short-lived effectiveness of behavioral nudges raises the question of which other interventions can be implemented to affect compliance in the longer term, through learning. Our results address this question by showing that taxpayer education can be an effective alternative that, contrary to simple messages, affects compliance more persistently.

Our results also speak to the literature linking taxation to state-building and accountability (Moore et al., 2018; Prichard, 2015; Brautigam et al., 2008). While weak tax knowledge has immediate implications for tax compliance, it also has other potentially serious implications. Confused taxpayers do not know with confidence how much they should pay, which potentially makes them more vulnerable to corrupt officials or to being coerced into making unofficial payments. They may also be more prone to seeing the tax system as unjust and extortionary, either because of corruption or because they might misperceive the benefits of paying tax (Ali et al., 2015). A recent study has shown that providing beneficiaries with information about the eligibility and amount of a subsidy increases the amount of subsidy they receive by about 26%, thanks to lower leakage (Banerjee et al., 2018). The recent evidence on informal taxation suggests that similar effects are also likely to occur in the area of tax (van den Boogaard et al., 2021). Ultimately, uninformed taxpayers are less likely to engage in a meaningful debate with the government about tax issues, thus limiting the potential of taxation to act as a catalyst for improved governance and accountability (van den Boogaard et al., 2020).

2 Conceptual framework and application to Rwanda

2.1 Taxpayer education and compliance: key mechanisms

As discussed above, one can reasonably expect tax education to improve compliance behavior, especially when tax knowledge is low and compliance costs are high, as is the case in many lower-income countries. However, there are multiple kinds of compliance costs and multiple channels linking tax education to compliance, including, but going beyond, the most obvious one related to compliance costs. This section unpacks the potential mechanisms at play in the relationship between tax education and compliance, and also serves as a conceptual framework for our empirical investigation of mechanisms in Sect. 4.4. We start from three elements within the compliance costs mechanism, which remains the main focus of this analysis. We then turn to other possible mechanisms linked to enforcement and tax morale.

The most obvious mechanism linking tax education with compliance relates to changes in compliance costs. To understand this relationship better, it is useful to distinguish three kinds of compliance costs that can be affected by taxpayer education, each of which has potentially different implications for compliance.

Firstly, taxpayers face cognitive costs related to understanding tax laws and regulations, which they might internalize or outsource to professionals. For example, taxpayers need to know what tax exemptions or benefits might be available to them, which expenses they can deduct in their tax declarations, and how to report taxes related to their employees in line with current laws. This kind of knowledge (or the lack thereof) might affect compliance in opposite ways. On the one hand, firms might leave money on the table if they fail to take up measures that would be beneficial to them, which has been documented in the literature (Benzarti, 2015; Okunogbe & Pouliquen, 2022). On the other hand, better tax knowledge might allow taxpayers to minimize their tax liability with aggressive planning, while remaining largely within the confines of the law.

Secondly, a separate set of costs relates to the practicality of preparing and submitting a tax return. For example, in Sect. 4.1 we show that many new taxpayers do not keep appropriate business records, although these are needed to prepare tax declarations and as proof for potential audits and checks. Once the declaration and supporting documents are ready, taxpayers need to become familiar with the process for submitting a tax return and paying taxes. In recent years, with the extensive digitization of tax administration, these processes typically rely on online platforms like the ones described for Rwanda below, which in turn imply cognitive and time costs to learn how to operate these systems, as well as administrative costs related to potential technical issues like the ones described, for example, in Mascagni et al. (2022). Taxpayer education has the potential to reduce these costs and thus to improve the probability of successfully and correctly filing a tax return, including reporting taxable income more comprehensively if the program is successful in sensitizing taxpayers about the importance of keeping appropriate records.

Thirdly, taxpaying typically involves stress costs. These are only amplified in contexts where taxpayers feel confused about the tax system because they do not understand the laws or the practicalities of taxpaying. In this case, taxpayer education can be a way to help taxpayers feel more comfortable with the taxpaying process, including interactions with the revenue administration, and to boost their confidence by familiarizing them with tax requirements and filing systems. This, in turn, could reduce feelings of fear and anxiety related to the possibility of making mistakes or being sanctioned. The potential implications for compliance are mixed: while taxpayers might become more likely to engage with the tax system (e.g., a greater probability that they will file declarations), they might also become more relaxed about tweaking their tax affairs to minimize tax payments.

These three kinds of compliance costs—legal, practical, and psychological—represent key mechanisms through which taxpayer education can affect compliance, in some cases in ambiguous ways. While reducing practical costs might more unambiguously facilitate compliance, decreasing legal and stress costs might, in principle, result in either better or worse compliance behavior.

Beyond compliance costs, taxpayer education can also affect compliance more indirectly through other mechanisms, such as those related to enforcement capacity or tax morale. For instance, education programs might signal to taxpayers that the revenue administration has the organizational capacity to reach out to them and to organize extensive campaigns throughout the country. They might also make sanctions for evasion and mis-reporting more salient. As a result, participants might perceive enforcement as stronger, potentially resulting in increased compliance.

Turning to factors under the umbrella concept of tax morale, taking part in education programs may provide an opportunity for taxpayers to experience the friendlier side of tax administration, as the key purpose of these sessions is to support them rather than to check their individual tax affairs. This might improve perceptions linked to fairness and trust in the tax administration, which the literature has linked to better tax compliance. Depending on how tax education programs are designed, they might also highlight broader issues like the importance of paying taxes for funding public services and development strategies, or as a tool to improve accountability and citizen engagement. These would be additional channels, related to fiscal exchange, that might affect compliance positively. Peer effects might also be at play, as attendees find themselves in the same room as other similar businesspeople who are also trying to navigate the tax system.

While we explore these possible mechanisms empirically in Sect. 4.4, the next section illustrates how they play out in the context of Rwanda and describes the education program that is at the center of this evaluation.

2.2 Taxation and compliance costs in Rwanda

Rwanda is a small landlocked country in East Africa, with a population of over 12 million people. After its history of civil war and genocide, in the past two decades the country has seen remarkable progress on social and economic development. Among other indicators, growth rates have averaged over 7% in the last decade and the primary enrollment rate is 95% for both boys and girls.Footnote 4

When it comes to tax, Rwanda is generally seen as a success story too. The tax to GDP ratio stands at about 14.5%, broadly in line with other sub-Saharan African countries.Footnote 5 More importantly, the RRA is widely perceived as a high-performing revenue authority that has fully embraced a modern approach to tax administration, where encouraging voluntary compliance plays a key role in revenue mobilization, along with traditional enforcement. For example, Rwanda was one of the first countries to organize a yearly National Taxpayer Appreciation Day, which normally entails events stretching well beyond a single day and aims to recognize taxpayers’ contributions to national development through their tax payments. Partly reflecting the recent history of the country’s relations with foreign donors, the government has adopted a rhetoric of self-reliance that has domestic revenue mobilization at its core. As shown in Sect. 4.1, all this is reflected in exceptionally strong perceptions about the importance of taxes and their contribution to national development.

Rwanda’s tax system includes many of the features that we would expect to find in a modern tax system. In terms of the main tax types, it applies the usual taxes on personal income (such as pay-as-you-earn, PAYE, taxes for employees) and on business income: personal income taxes (PIT) for the self-employed and sole-proprietorships, and corporate income taxes (CIT) for incorporated firms.Footnote 6 Firms above a threshold are also required to pay value added tax (VAT).Footnote 7 Tax rates are both in line with international standards (for example, the corporate tax rate is 30%) and organized in simple progressive schedules, where relevant (for example, 0%, 20%, and 30% for PIT and PAYE).

In terms of compliance costs, Rwanda seems to fare fairly well. Declarations for income tax are due once a year,Footnote 8 while VAT can be paid either monthly or quarterly. The latter option is particularly targeted at small businesses, to decrease their administrative burden. Similarly, the RRA has implemented a number of other administrative measures to facilitate compliance. Most notably, Rwandan taxpayers normally file online through the e-tax system or, for micro taxpayers, through a mobile phone platform (m-tax). To simplify things further for small taxpayers, Rwanda also offers two simplified regimes with minimal bookkeeping and reporting requirements (i.e., only turnover). As a result of these (and other) efforts, the country ranks very well in the World Bank’s Doing Business indicators. For example, in the sub-indicator related to “paying taxes” Rwanda’s score (84.6) is slightly higher than the OECD’s average (83.3) and far above the average for Sub-Saharan African countries (57.2).Footnote 9 These numbers suggest that the Rwanda Revenue Authority has already put in place the most obvious measures to reduce compliance costs and to simplify the tax system. This is relevant to our study, as the intervention we set out to evaluate (described in more detail in Sect. 2.3) does not vary any administrative parameter of compliance costs (e.g., simpler procedures, less strict requirements), but rather affects access to information about the tax system and the cognitive costs related to understanding the taxpaying process.

2.3 Rwanda’s taxpayer education program

In line with its modern approach to tax administration, RRA’s Taxpayer Services department (TPS) organizes a wide range of initiatives on taxpayer education. In addition to the National Taxpayer Appreciation Day events described above, RRA organizes, for example, tax clubs in schools and seminars for specific business sectors or types of taxpayers.Footnote 10 In line with the rhetoric of self-reliance, taxation is also often featured on radio or TV programs. Among these initiatives is the training program for newly registered taxpayers that is the focus of this study.

This program is the key initiative that RRA takes to educate and inform new taxpayers. It is particularly important in the context of the relatively large number of new businesses that register every year, partly pushed by the government’s encouragement for entrepreneurship and tax registration efforts (Mascagni et al., 2022). The program involves a half-day class led by tax officials.

The contents cover the basics of taxpaying, largely referring to the practical dimension of compliance costs described in Sect. 2.1 and some basic elements of the legal framework such as applicable rates and exemptions. For example, the training includes explanations of key taxes and duties, the relevant deadlines, what a Taxpayer Identification Number (TIN) is and why it is needed, procedures for tax declaration and payment, and services available for simplifying declarations and payments, such as online services. The training focuses particularly on income taxes and VAT, but also describes PAYE and local taxes and fees. Information on sanctions and audits is included in the training materials but does not feature prominently. Table 7 summarizes the structure of the training sessions. It confirms that the content is more focused on providing practical information on taxpaying than on going into the details of the legal framework or taking a broader approach based on accountability and citizen engagement (see Sect. 2.1). Based on our conceptual framework, we would expect this program to have a positive impact on compliance behavior. Indeed, increasing compliance through improved taxpayer knowledge is an explicit goal of this program.

Program sessions are organized throughout the country, with the aim of reaching all tax centers where there are substantial numbers of new taxpayers who might need the training. Typically, the first session of the year is organized in July, to capture taxpayers who registered from January onward, and other sessions continue until the end of March, which is the deadline for filing income tax declarations for the year ending on 31 December. All sessions are the same in terms of content, duration, and trainers, who are all from RRA’s Taxpayer Services Department (TPS). The program schedule for 2017/18 is reported in Table 8 in the Appendix.

All participants are new taxpayers: they have typically registered for income taxes (PIT or CIT) and obtained their TIN for the first time.Footnote 11 These are all new businesses, either individual non-incorporated ones (PIT) or corporations (CIT), and not taxpayers who only have employment incomes. Regardless of the business type, this is a relatively homogeneous group of small firms (see Sect. 4.1). Once they register, RRA invites all of them to the first available session in their district. Invitations normally happen through SMS, official letters posted at the office of the relevant tax center and the local branch of the Private Sector Federation, and some phone calls from RRA officials to taxpayers to remind them of the session. Although this invitation process is comprehensive in principle, practical difficulties and administrative constraints mean that in reality it does not reach all intended beneficiaries (more on this in Sect. 4.1). Nonetheless, the explicit intention of the RRA is to invite all new taxpayers and, potentially, keep the training accessible to any other taxpayer even if they were not specifically invited. In practice, the vast majority of attendees are new taxpayers from the relevant district. However, this policy intention means that we could not randomize invitations or attendance.

3 Empirical framework

3.1 Data

Our dataset includes information from three sources, which we describe in more detail below: administrative records from tax returns, a survey we conducted in 2017 in three districts, and individual-level data on program attendance. Importantly, we can connect these three sources of information thanks to unique identifiers. Our matched dataset allows us to reap the key benefits of administrative data, namely the possibility to observe real compliance behavior, while overcoming its main drawbacks, most importantly the fact that only very few taxpayer characteristics are typically observable in such data. By matching administrative data with our survey, we obtain a unique and rich dataset that includes firms’ characteristics, perceptions and attitudes, a direct measure of taxpayer knowledge, and information on real compliance behavior.

When we started working on this evaluation, there were no reliable data on attendance at the program. To solve this problem, in collaboration with RRA, we developed a new registration procedure to collect attendance data during the program’s sessions, at the individual level. These data are now available for all trainings held in 2017/18 (see list in Table 8).

We use administrative data primarily to observe real compliance behavior. These data are obtained from income tax declarations filed between January and March 2018, relating to fiscal year 2017. We focus on income tax because the program specifically targets new taxpayers who have just registered for PIT or CIT. Since the program focuses on new taxpayers, fiscal year 2017 is their first declaration period. This implies that we have no baseline administrative data for this group, as they have never filed a declaration before.Footnote 12

We use three main variables to capture compliance behavior, all of which are measured for the full sample. The first one takes the value of one if a taxpayer makes a declaration in the relevant period, and zero otherwise. This variable captures the extensive margin of tax compliance, which is particularly relevant in a context where over half of all new taxpayers fail to file a declaration in the first year of operation (see Sect. 4.1). The second variable captures taxpayers with zero-tax liability (= one, or zero otherwise). In addition to non-filing, this variable also captures nil-filing, both of which result in zero-tax liability. Nil-filing is a widespread phenomenon in Rwanda, as well as in other low-income countries (Mascagni et al., 2022; Santoro & Mdluli, 2019; Almunia et al., 2017; Mascagni & Mengistu, 2019). It refers to the behavior of taxpayers who file a declaration (thus making it distinct from non-filing) but report zero in all fields, such as turnover, taxable income, deductible expenses, and tax.Footnote 13 Recent evidence shows that “unproductive” taxpayers (i.e., non-filers and nil-filers that yield zero tax to the revenue administration) often represent the majority of registered taxpayers (Santoro & Mdluli, 2019; Moore, 2020). The third variable captures the tax amount, that is, the intensive margin. It takes the value of zero for all non-filers and nil-filers, while for those with some positive tax liability it captures the amount.Footnote 14
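As an illustration only, the construction of the three outcome variables described above can be sketched in Python. The record layout (`filed`, `tax_due`) and the function name are hypothetical and are not taken from the administrative data; in addition, the sketch approximates nil-filing by a reported tax of zero, whereas the definition in the text requires zeros in all fields of the declaration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaxRecord:
    """Hypothetical record for one registered taxpayer (illustrative layout)."""
    filed: bool                # a declaration was submitted in the filing window
    tax_due: Optional[float]   # reported tax liability; None if nothing was filed


def compliance_outcomes(record: TaxRecord) -> dict:
    """Build the three compliance outcomes for a single taxpayer."""
    # Non-filers report nothing, so their liability is set to zero.
    if record.filed and record.tax_due is not None:
        liability = record.tax_due
    else:
        liability = 0.0
    return {
        "declared": int(record.filed),    # extensive margin
        "zero_tax": int(liability == 0),  # non-filers and nil-filers
        "tax_amount": liability,          # intensive margin
    }
```

Under these assumptions, a non-filer and a nil-filer both receive `zero_tax = 1` and `tax_amount = 0`, and differ only in the `declared` indicator.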

In addition to these three compliance variables, we also use the few taxpayer characteristics available in the administrative dataset, namely whether the taxpayer registered as a corporation (CIT) or as an unincorporated business (PIT), the number of months since registration, the months since the training, and fixed effects for the tax center where they registered and where they were invited to attend the program. We refer to these variables as “administrative covariates.”

As far as our survey is concerned, we have two rounds of data: one before (i.e., baseline) and one after (i.e., post-intervention) the relevant session that taxpayers were invited to.Footnote 15 Each selected taxpayer received a request for a full interview in the month before the relevant training session (refer to Table 8) and for a shorter, follow-up interview focused on key outcomes in the 3 weeks after the training. Attrition between the two rounds is about 18%, but is not correlated with any of our variables, as shown by the normalized difference statistics \(\Delta\) (Imbens & Rubin, 2015) reported in Table 9.Footnote 16 Importantly, the first survey round is the only baseline information we have, since administrative data are only available after taxpayers have registered, and in our case they are all new.
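For reference, the normalized difference is commonly written as follows. We assume here that the statistic takes its standard form; the group labels are our own notation, with group 1 denoting taxpayers lost between the two rounds and group 0 those interviewed in both.

\[
\Delta = \frac{\bar{X}_1 - \bar{X}_0}{\sqrt{\left(S_1^2 + S_0^2\right)/2}}
\]

Here \(\bar{X}_g\) and \(S_g^2\) are the sample mean and sample variance of a given covariate in group \(g\). Unlike a t-statistic, \(\Delta\) does not grow mechanically with the sample size, which makes it a convenient scale-free measure of how far apart the two groups are.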

Due to budget constraints, we limited the sample to 1000 taxpayers and we conducted the survey by phone. We focused on the first three trainings in RRA’s training plan (see Table 8 in the Appendix): one in Kicukiro, which is a busy district of the capital Kigali, and the other two in the smaller towns of Musanze and Rubavu.Footnote 17 The selection of these three districts was largely dictated by logistical reasons: we wanted to have a large enough sample and some representation of both rural and urban areas. In total, over 2500 taxpayers registered in those three districts by July 2017, and a random sample was selected to take part in the survey. To allow for refusals to take part in the survey, we randomly selected 1400 taxpayers, aiming for a sample size of 1000. We discuss this process in more detail in Sect. 4.3 because allocation to the survey sample is the variable we use in our IV strategy.
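A random draw of this kind can be sketched as follows, purely for illustration. The sketch assumes simple random sampling without replacement from the pooled list of registered taxpayers; the identifiers, the seed, and the population size of exactly 2500 are hypothetical placeholders (the text only states that over 2500 taxpayers had registered), and the actual procedure is the one described in Sect. 4.3.

```python
import random


def draw_survey_sample(taxpayer_ids, k=1400, seed=2017):
    """Draw k taxpayers without replacement; the seed makes the draw reproducible."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return rng.sample(list(taxpayer_ids), k)


# Placeholder population: synthetic identifiers, not real TINs.
population = [f"TIN{i:05d}" for i in range(2500)]
selected = draw_survey_sample(population)

# Indicator of allocation to the survey sample for every registered taxpayer.
selected_set = set(selected)
in_survey_sample = {tin: int(tin in selected_set) for tin in population}
```

The resulting `in_survey_sample` indicator is the kind of variable that can serve as an instrument, since by construction it is unrelated to taxpayer characteristics.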

Our survey includes six modules with detailed questions on: (1) respondents’ demographics (e.g., age, gender), (2) characteristics of the business (e.g., number of employees, income), (3) reasons for registering for a TIN (e.g., obeying the law, access to finance), (4) an innovative quiz to measure tax knowledge (see Table 10), (5) attitudes and perceptions (e.g., on the complexity of the tax system, on tax as a social duty, see full list in Table 11), and (6) information on RRA’s program (e.g., in round one: intention to attend; in round two: feedback on the program or reasons for not attending).Footnote 18 Since the two rounds happened within a few weeks of each other, and given that the full survey took on average 42 minutes to complete, in the second round we only collected data on our main variables of interest: all those in the modules on knowledge, attitudes and perceptions, and the feedback on the program.

Importantly, our tax quiz allows us to measure tax knowledge directly and more precisely than any other study we are aware of. It includes nineteen questions on basic aspects of the tax system, which we developed in close collaboration with the Rwanda Revenue Authority based on relevance to the local context. The full set of questions included in the quiz is reported in Table 10.

For surveyed taxpayers, we have information from all three data sources described above (administrative, attendance, and survey data). Therefore, our core results refer to this sample (Sects. 4.2 and 4.4). Beyond the survey sample, however, we can still rely on attendance information and administrative data for the whole population of new taxpayers who registered in 2017. This broader population is the basis of the results presented in Sect. 4.5. Table 12 summarizes the key data sources by groups of taxpayers that we use in our analysis.

3.2 Methods

As explained in Sect. 2.3, we could not randomize invitations or attendance to the program, due to RRA’s explicit policy intention to keep it open to all new taxpayers who might find it useful. This policy constraint ruled out the possibility to use methods such as an RCT or an encouragement design. Nonetheless, we are still able to conduct a rigorous evaluation by exploiting two features of our data. First, as discussed in Sect. 3.1, we use a uniquely rich dataset containing both compliance behavior and detailed information on taxpayers’ characteristics, their businesses, and their perceptions. While of course there are still elements that we cannot observe, this dataset allows us to control for a very large set of relevant variables. Second, based on these rich data, we check for balance between those taxpayers who attend the program and those who do not, across the same large set of characteristics and perceptions. The results in Table 1, which we discuss in the next section, show that these two groups are largely comparable at baseline, before the program happens.

On this basis, we use two estimation strategies to identify the impact of the program: a simple probit/tobit model and propensity score matching (PSM). We describe these two strategies in more detail in this section, while in Sect. 4.3, we describe a third strategy that uses allocation to the survey as an instrumental variable for the potentially endogenous program attendance variable.

Our first strategy is a probit/tobit estimation, where we compare attendees and non-attendees in our survey sample, controlling for a large set of covariates.Footnote 19 The main empirical concern here is related to self-selection into program attendance. For example, taxpayers who attend the program might have better attitudes toward compliance and be less inclined to evade to start with. However, our data reveal virtually no statistically significant difference between attendees and non-attendees at baseline. The two groups are therefore largely comparable across a rich set of variables including evasion attitudes, and perceptions around government authority, enforcement, fairness and trust, as well as a wide set of taxpayer characteristics (see Sect. 4.1). While we do not argue that attendance is random, our data suggest that there is some element of randomness in the decision to attend. When we asked non-attendees why they did not attend, the vast majority responded with rather ad-hoc reasons (32% were busy at work, 30% were sick or had a sick family member to care for, 13% forgot), while very few gave answers that would make us think attendance is related to unobserved factors (e.g., did not think it would be useful, or prefer to learn from non-government sources).Footnote 20 Still, to account for possible differences between the two groups, we include a large set of covariates from our survey: owners’ characteristics (i.e., age, gender, level of education, whether the owner had a previous business, whether they previously had another tax training), characteristics of the business (i.e., size, months since registration, location, whether they use emails, whether they have a bank account, PIT/CIT), as well as knowledge and perceptions at baseline (see Tables 10, 11).Footnote 21

It is worth noting that the survey was not meant to be representative of the national population, as it only includes the three districts mentioned in Sect. 3.1. It is also over-representative of areas outside of Kigali, compared to the reference population of new taxpayers who registered in the relevant tax centers but did not take part in the survey.Footnote 22 We therefore use sampling weights throughout the analysis. We discuss external validity of our results, beyond our survey sample, in Sect. 4.5.

The second strategy is propensity score matching (PSM), which allows us to approximate a randomized trial and therefore provide a more rigorous causal analysis than the probit/tobit estimation. The propensity score represents the probability of being assigned to a treatment (in this case, attendance to the program), given a set of covariates. There are two assumptions for PSM to be valid. The first one is unconfoundedness, which requires that all confounding factors be included in the set of covariates used to calculate the propensity score. We cannot directly test this assumption, but we use a wide set of covariates that capture the most important potential sources of concern, including attitudes and knowledge at baseline, as well as taxpayer and business characteristics.Footnote 23 The second assumption is overlap (or common support), which requires that individuals with the same score can be either attendees or non-attendees, as if a randomized experiment was carried out. As shown in “Appendix” Fig. 2, this assumption holds in our case.Footnote 24 While kernel matching is our preferred matching algorithm,Footnote 25 we repeat our analysis using two additional matching options for robustness.Footnote 26 Our results tables also report two measures of the quality of matching, which confirm its validity.Footnote 27 In our PSM tables, we employ the weights obtained from the propensity score estimation to correct the initial probit/tobit estimation, using the same covariates.Footnote 28
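To make the kernel matching step concrete, the following is a minimal sketch of a kernel-matching estimate of the average effect on attendees, taking the propensity scores as already estimated. The Epanechnikov kernel, the bandwidth of 0.06, the function names, and the toy data are all assumptions made for illustration; the text does not state which kernel or bandwidth is used, and the sketch omits covariates, sampling weights, and fixed effects.

```python
def epanechnikov(u):
    """Epanechnikov kernel: positive weight only when |u| < 1."""
    return 0.75 * (1 - u * u) if abs(u) < 1 else 0.0


def kernel_matching_att(scores, treated, outcomes, bandwidth=0.06):
    """Compare each attendee with a kernel-weighted average of non-attendees.

    scores   -- estimated propensity scores (assumed given)
    treated  -- 1 for attendees, 0 for non-attendees
    outcomes -- outcome of interest (e.g., 1 if a declaration was filed)
    """
    controls = [(p, y) for p, t, y in zip(scores, treated, outcomes) if t == 0]
    gaps = []
    for p_i, t_i, y_i in zip(scores, treated, outcomes):
        if t_i != 1:
            continue
        weights = [epanechnikov((p_i - p_j) / bandwidth) for p_j, _ in controls]
        total = sum(weights)
        if total == 0:
            continue  # no non-attendee nearby: unit is off common support
        counterfactual = sum(w * y_j for w, (_, y_j) in zip(weights, controls)) / total
        gaps.append(y_i - counterfactual)
    if not gaps:
        raise ValueError("no attendee has a comparable non-attendee")
    return sum(gaps) / len(gaps)


# Hypothetical toy data: the attendee with score 0.90 has no close match
# and is dropped, which is what the overlap assumption guards against.
scores = [0.30, 0.32, 0.50, 0.52, 0.31, 0.51, 0.90]
treated = [1, 0, 1, 0, 0, 0, 1]
outcomes = [1, 0, 1, 0, 1, 0, 1]
att = kernel_matching_att(scores, treated, outcomes)
```

Non-attendees whose scores are closer to an attendee's score receive more weight, so the comparison is dominated by the most similar units.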

Both empirical strategies rely on the fact that the two groups of attendees and non-attendees are largely comparable at baseline, before the program took place. This is an important feature, but is only based on observable characteristics. We cannot rule out that the two groups might be different based on unobservable characteristics. For example, attendance might be related to preferences on whether and how much to comply (i.e., if one aims to evade, why attend?), or to opportunity costs that might in turn relate to business performance. Although we cannot address this concern in a fully satisfactory way just with our survey data, we would make two arguments in support of our results. First, to take part in the survey, respondents had to agree to take part in a 42 min long interview about tax education. That might have already ruled out taxpayers who have set intentions to evade, or who have high opportunity costs. Similarly, taxpayers were only considered for this study if they had registered within the past few months—which makes it less likely that they are set on fully evading tax laws or avoiding the tax system altogether. In short, this is a relatively more homogeneous group of (new) taxpayers than the full population. Second, we would expect the potential unobservables to be related to some of the variables we do capture in our survey. However, Table 1 shows no statistically detectable differences across a wide range of relevant variables, including attitudes to compliance, perceptions around fairness and trust, and business income (as an expected correlate of opportunity costs). This gives us confidence in the validity of our strategy, as any unobserved difference across the two groups would need to be orthogonal to all the taxpayer characteristics and perceptions we observe. That is, of course, possible, though perhaps unlikely.

Still, recognizing that we cannot fully tackle concerns on unobservables with survey data alone, we also test the robustness of our results using an IV strategy that relies on allocation to the survey as an exogenous variable to instrument attendance. In this case, however, we can only count on a limited set of administrative covariates, since we necessarily extend our analysis beyond the survey sample. We describe this strategy in more detail, along with the relevant results, in Sect. 4.3. Our key results are robust to all three estimation strategies, confirming the validity of our empirical approach.

Below, we report the equation we estimate in Sects. 4.2 and 4.4, where \(Y_{it}\) is the outcome of interest: the three measures of compliance described in Sect. 3.1 measured at the individual level at time t, after the program took place. T is attendance to the program and \(X_{i(t-1)}\) is the set of control variables measured at baseline (\(t-1\)). These controls are the survey covariates described above and, when available, baseline-level outcomes as well, in an ANCOVA estimation. When we extend our analysis beyond the survey sample, in Sects. 4.3 and 4.5, we use the alternative set of “administrative covariates” described in Sect. 3.1. Importantly, the inclusion of different sets of covariates does not affect our results in any substantial way, as shown in Sects. 4.2 and 4.5. As the trainings were delivered in specific districts, we also include training-level fixed effects \(\tau\) and cluster the standard errors \(\epsilon\) at the training session level.

$$\begin{aligned} Y_{i} = b_0 + b_1 T_{i} + b_2 X_{i(t-1)} + \tau + \epsilon _i \end{aligned}$$
(1)
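When Eq. (1) is estimated as a probit, the coefficient \(b_1\) is not itself a percentage-point effect; it has to be converted into a change in predicted probability. The sketch below illustrates that conversion for a binary attendance variable, with covariates held at their means. The function names and all coefficient values are hypothetical, and fixed effects and sampling weights are left out for brevity.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(x / sqrt(2)))


def attendance_effect_at_mean(b0, b1, b2, x_mean):
    """Change in predicted probability when T moves from 0 to 1.

    b0, b1 -- intercept and attendance coefficient from Eq. (1)
    b2     -- list of covariate coefficients
    x_mean -- list of covariate means (same order as b2)
    """
    index = b0 + sum(b * x for b, x in zip(b2, x_mean))
    return normal_cdf(index + b1) - normal_cdf(index)


# Hypothetical coefficients, chosen only to illustrate the calculation.
effect = attendance_effect_at_mean(b0=-0.4, b1=0.25, b2=[0.1], x_mean=[4.0])
# effect is roughly 0.099, i.e. about 10 percentage points
```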

After having estimated the program’s effect on compliance, we investigate possible mechanisms using both the probit and PSM estimations. Our analysis of mechanisms follows the same empirical design described above, but using as dependent variables the candidates for potential channels (e.g., knowledge, and perceptions about complexity or deterrence). Given that there is no statistically significant difference between attendees and non-attendees on knowledge and perceptions at baseline, we are confident that any difference post-intervention can indeed be attributed to the program (more details in Sect. 4.4).

4 Results

4.1 Anatomy of taxpayers at baseline, knowledge and program take-up

Since we have no baseline administrative data, in this section we rely mostly on the survey to provide an anatomy of new taxpayers before the program. Table 1 reports baseline values of our key variables. As expected, new taxpayers are a relatively homogeneous group of small businesses: over 90% have fewer than 5 employees and many have no employees. A large proportion do not use emails (75%), do not have a bank account (56%), or do not keep proper books of accounts (53%).

Turning to one of our key variables of interest, our unique tax quiz allows us to document strikingly low levels of taxpayer knowledge at baseline. On average, survey participants at baseline responded correctly only to about a third of the nineteen basic questions of our tax quiz, where we asked about Rwanda’s tax system (as reported in Table 10). No respondent gave the right answer to all questions. Table 10 reports the disaggregated scores for each of the nineteen questions in the tax quiz. Interestingly, 37% of the people we interviewed did not know what tax type they registered for, just a few months after registering. These figures clearly confirm the importance of taxpayer education in Rwanda, as well as the presence of large margins for improvement in this area. To the best of our knowledge, this is the first time that taxpayer knowledge is being measured directly and in such detail.

Low knowledge of the basic parameters of the tax system is also consistent with the fact that most new taxpayers in Rwanda fail to make a declaration in the first year after registration. Our administrative data reveal that, among taxpayers registered in 2017, only about 43% filed a declaration, while the majority failed to declare. These figures are consistent with those from previous years, showing that this is a persistent issue in Rwanda.Footnote 29

Contrary to tax knowledge, perceptions about tax are generally very good in the Rwandan context. Virtually all taxpayers agreed with the tax attitude statements we provided, for example on the government’s authority to make people pay tax (93%), fairness of the tax system (98%), and tax as a social duty (98%). These figures are fully in line with the Rwandan context and the government rhetoric of self-reliance (see Sect. 2.2).

Table 1 Summary statistics and mean differences by attendance

Table 1 reports the average of all key variables, disaggregated by attendance status, as well as t tests and the normalized difference test (\(\Delta\)) to check whether any variable is significantly different between these two groups (Imbens & Rubin, 2015). Although our evaluation was not set up as an RCT, the two groups we aim to compare (attendees and non-attendees) are fully comparable at baseline in relation to all available variables. None of the normalized differences (\(\Delta\)) is greater than 0.25, the benchmark that Imbens and Rubin (2015) suggest when testing for balance.Footnote 30 These results give us confidence in the validity of our probit/tobit estimation, although we nonetheless confirm our results with the more robust PSM strategy and the IV estimation of Sect. 4.3.
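For reference, the normalized difference of Imbens and Rubin (2015) divides the gap in group means by the square root of the average of the two group variances, so that it does not grow mechanically with sample size the way a t statistic does. The sketch below shows the calculation for one covariate; the function name and the age values are hypothetical and serve only as an illustration.

```python
from math import sqrt
from statistics import mean, variance


def normalized_difference(treated, control):
    """Normalized difference for one covariate across two groups."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Hypothetical baseline ages of attendees and non-attendees.
attendees = [31, 35, 40, 28, 45, 38]
non_attendees = [30, 36, 42, 27, 44, 35]

delta = normalized_difference(attendees, non_attendees)
balanced = abs(delta) <= 0.25  # benchmark used in the text
```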

Last but not least, Table 13 shows program attendance rates in the survey sample and in the population of new taxpayers who registered in 2017, based on the attendance data described in Sect. 3.1. In the survey sample, which is the basis of our main analysis, the attendance rate is 46%. All attendance figures are disaggregated by urban (i.e., Kigali, the capital) and more rural districts (although some areas outside of the capital are urban). The pattern across all groups is similar, in that rural areas tend to have higher attendance than the capital. Generally, however, attendance seems to be rather low—with many taxpayers failing to take up the program. We discuss these figures again in Sect. 4.3.

4.2 The effect of taxpayer education on compliance

We start our analysis by addressing our main research question: does taxpayer education affect compliance? Table 2 suggests that it does. More specifically, we report results on the three compliance outcomes described in Sect. 3.1, which we measure for the full sample, and using the two estimation methods described in Sect. 3.2. Our results remain fully in line with the ones we present here when we use alternative specifications, such as including no controls or using a linear probability model to estimate effects on our binary outcome variables.Footnote 31

The probit/tobit estimation (panel A) suggests that the program significantly improves compliance as measured by all three outcomes: it increases declaration rates by 10 percentage points (column 1), decreases the probability of having zero-tax liability by 7.7 percentage points (column 2), and increases the tax amount by 46.5% (column 3). The more robust PSM estimation yields largely comparable results, reported in panel B of Table 2. The program effects on the probability to declare and zero-tax status (columns 1 and 2) remain statistically significant and of comparable magnitude. The effect on the tax amount is now less precisely estimated, and only significant at the 10% level. The statistically weaker effect on the tax amount is in line with the results of the only other study on a similar topic, Chetty and Saez (2013).

Table 2 Program’s effect on compliance: probit/tobit and PSM estimation

The effects we document here are not only statistically significant, but also economically large. A 10 percentage point increase in the probability to submit a declaration (as estimated with probit in Panel A), relative to the control group average of 34%, represents an improvement of 29%. Similarly, a 7.7 percentage point decrease in zero-tax (as estimated in Panel A) represents an 8.6% improvement on the control group average of 89%. The coefficient on the tax amount, though less significant in the PSM estimation, is also large, suggesting that attendees report at least 43% more tax than non-attendees (Panel B of Table 2).
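The relative magnitudes quoted above follow from dividing each percentage-point effect by the corresponding control group mean. The short sketch below reproduces that arithmetic from the figures in the text; the helper name is hypothetical.

```python
def relative_effect(effect_pp, control_mean_pct):
    """Express a percentage-point effect as a percent of the control mean."""
    return 100 * effect_pp / control_mean_pct


declaration = relative_effect(10, 34)   # about 29.4, reported as 29%
zero_tax = relative_effect(7.7, 89)     # about 8.65, reported as 8.6%
```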

While these effects are large, they are in line with a literature showing that interventions to improve compliance can have large effects in low-income contexts, consistent with larger expected margins to increase compliance compared to higher-income countries. For example, behavioral studies find that nudging messages improved compliance by 55% and about 35%, respectively, in the context of Rwanda and Ethiopia (Mascagni & Nell, 2021; Shimeles et al., 2017).Footnote 32 Interventions related to more traditional enforcement, such as the introduction of technology for improved enforcement and data monitoring, have been shown to increase compliance by 12–48% in Ethiopia, depending on the tax type (Mascagni et al., 2022).Footnote 33 While our results are broadly in line with evidence from comparable countries, a caveat is in order: we focus particularly on new taxpayers, as opposed to the broader taxpaying population, thus any effect we find in this specific group might not be fully comparable with the effect of other interventions among more experienced and larger taxpayers. Larger effects might be expected in our sample both because of the kind of intervention (more in depth than nudges, as discussed in Sect. 1) and because new taxpayers might have a larger margin to improve compliance. It must also be noted that a large percentage increase in tax payable for very small taxpayers represents a more plausible increase in compliance, in monetary terms, than for larger ones.

Furthermore, we check for heterogeneous effects across a number of potentially relevant variables, using interaction terms between these variables and program attendance. We find that women and less educated taxpayers seem to experience larger increases in declaration rates (see Table 14). Interestingly, the interaction effect with gender is also significant when we consider the zero-tax and tax amount outcomes, with businesswomen experiencing larger compliance effects on both (see respectively Tables 15 and 16). This evidence is consistent with the fact that less educated and female business owners bear a larger compliance cost burden, as documented in Coolidge (2012) and van den Boogaard et al. (2021), which might be eased by the program.

Finally, a methodological note is in order. While the comparability of attendees and non-attendees across a wide set of variables gives us confidence on the validity of our analysis (see Sects. 3.2 and 4.1), we also acknowledge that we cannot fully rule out the presence of unobservable variables that might affect attendance. We therefore put our results to a further test in the next section, exploiting assignment to the survey as an exogenous instrumental variable.

4.3 Testing compliance effects with an IV strategy

The results presented in the previous section rely on the comparability of attendees and non-attendees, before the program takes place (as shown in Table 1). While this is an important feature, we cannot rule out that unobservable variables affect attendance and therefore bias our results through selection, as discussed in Sect. 3.2. It is therefore important to put our effects on compliance to a further test. As mentioned in Sect. 3.2, policy constraints meant that we could not randomize attendance or even invitations to the program. However, we can exploit the imperfect invitation process run by the RRA, which in practice failed to reach all the potential participants (see Sect. 2.3), and the fact that, instead, we know with certainty that taxpayers who took part in our survey were invited to the program, as that was part of our interview script.

Based on this feature, we use allocation to take part in the survey as an instrumental variable for actual attendance to the program—which allows us to test the robustness of our compliance results. As discussed in Sect. 3.1, we aimed to include 1000 taxpayers in our survey sample, but we allocated 1400 to take part, expecting that some of them would be unavailable or refuse to participate. Importantly, allocation to the survey sample can be considered an exogenous instrument for attendance, as we discuss in more detail below, more so than actual participation in the survey. For this IV strategy to be credible, we need to establish two elements: that the instrument is relevant, and that it satisfies the exclusion restriction. We discuss each of these, before turning to the results.

This IV is relevant if it is significantly associated with the (potentially) endogenous variable, program attendance. In our case, the first stage regression shows that being allocated to the survey sample significantly increases attendance by over 23 percentage points (Table 3, column 1). This result is consistent with attendance figures reported in Table 13, showing that take-up of the program is much higher, and significantly so, for those selected to take part in the survey (36% attended) compared to those who were not (15%). The main reason for higher attendance in the survey sample is related to administrative constraints in the RRA department in charge of invitations. Although everyone was supposed to receive an invitation, in practice many taxpayers were not invited and, thus, did not know about the program. On the other hand, taxpayers who took part in the first survey round were reminded about the date and location of the program session relevant to them at the beginning of the interview.Footnote 34

Table 3 Program’s effect on compliance: IV estimation

The exclusion restriction is harder to prove. However, we intentionally use allocation to the survey rather than actual participation in the survey, as the former is arguably more exogenous than the latter. As mentioned in Sect. 3.1, we randomly selected 1400 taxpayers that the survey company would attempt to contact to collect data, while we discarded the remaining 1100 (this is our IV). The company then started to call those 1400 taxpayers, in random order and until the desired sample size of 1000 was reached.Footnote 35 As a descriptive test, in Table 18, we estimate a regression where both the instrument and the instrumented variable are included to explain tax outcomes. Our IV is never significant in explaining tax outcomes, thus giving us confidence that the exclusion restriction is likely to be satisfied. Relatedly, Table 19 compares non-attendees who were assigned to the survey and those who were not, showing no evidence of any significant difference between the two groups on tax outcomes. This confirms that the simple fact of taking part in the survey did not directly affect tax outcomes, which are instead influenced by attendance to the program.

We therefore proceed with the IV estimation, based only on attendance and administrative data since we are necessarily extending the analysis beyond the survey sample (see Table 12). As such, we can only test the robustness of program’s effect on compliance outcomes (i.e. not mechanisms as in the next section) and using administrative covariates (see Sect. 3.1).
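The logic of instrumenting attendance with survey allocation can be illustrated with the simplest IV estimator for a binary instrument, the Wald ratio: the difference in mean outcomes by allocation status divided by the difference in attendance rates. The sketch below is this no-covariate special case only; it does not include the administrative covariates, fixed effects, or clustering described in the text, and the function name and toy data are hypothetical.

```python
from statistics import mean


def wald_iv(y, d, z):
    """Wald IV estimate with a binary instrument.

    y -- outcome (e.g., 1 if a declaration was filed)
    d -- attendance (1 = attended the program)
    z -- instrument (1 = allocated to the survey sample)
    Returns (iv_estimate, first_stage).
    """
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    first_stage = mean(d1) - mean(d0)    # effect of allocation on attendance
    reduced_form = mean(y1) - mean(y0)   # effect of allocation on the outcome
    return reduced_form / first_stage, first_stage


# Hypothetical toy data: allocation raises attendance from 25% to 75%.
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [1, 1, 1, 0, 1, 0, 0, 0]
y = [1, 1, 0, 0, 1, 0, 0, 0]
iv_estimate, first_stage = wald_iv(y, d, z)
```

A weak first stage would make the ratio unstable, which is why instrument relevance is checked before turning to the results.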

Table 3 reports results on all three compliance outcomes. Our previous results on all three are confirmed, and remain of comparable magnitude and significance. The program is confirmed to have a significant effect on the probability to declare, which is even larger than the estimates presented in Sect. 4.2: 24 percentage points, corresponding to a 64% improvement compared to the control group, as opposed to 10 percentage points in Table 2. This coefficient suggests that, if anything, our result on declaration rates may suffer from a downward bias, and therefore represent a lower bound. Results on the other two outcomes (zero-tax and log tax) remain significant and of comparable magnitude to those reported in Sect. 4.2. The probability to report zero-tax decreases by 6% compared to the control group mean, while the tax amount increases by 63%. Interestingly, the result on tax amount also becomes highly significant in this IV estimation, while the PSM results suggested a weaker impact on this outcome (see Table 2).

4.4 Evaluating alternative mechanisms

The previous sections established the effectiveness of the program on all our outcomes related to tax compliance. We now investigate the channel through which this effect comes about. In Sect. 2.1, we hypothesized mechanisms related to compliance costs (which can be further broken down into legal, practical, and psychological elements) as well as additional ones related to enforcement and tax morale. Our survey captures variables that are relevant to each one of these mechanisms, such as the knowledge index (as well as individual knowledge questions) and a comprehensive set of perceptions related to complexity, enforcement, evasion, fairness, trust in RRA, government authority and tax as a social duty.

More details on the construction of these perception indicators are reported in Table 11. The knowledge indicator is simply a score, scaled from 0 to 10, capturing how many of the 19 basic questions in our tax quiz each taxpayer got right (see Sect. 3.1 and Table 10). All these variables are measured both before and after the training takes place, thus allowing us to test the role of alternative mechanisms in explaining the impacts on tax compliance. We do so by estimating the program’s effects on variables related to potential mechanisms, using regressions based on the same probit and PSM estimation methods described in Sect. 3.2. Significant program effects on these variables suggest that the associated mechanism is indeed a valid channel for the effect on compliance that we document in Sects. 4.2 and 4.3. As for our main results, we are confident that we are capturing the program’s effect given that all knowledge and perceptions indicators are perfectly balanced at baseline (see Table 1). However, the usual caveat about allocation to the program not being random still applies (see Sect. 3.2).

Table 4 Mechanisms: program impacts on knowledge and complexity

Our analysis of mechanisms suggests that compliance costs play a major role in explaining the effect of taxpayer education on tax compliance, with some evidence on the program being effective in addressing all three elements: legal, practical, and psychological. Table 4 reports the program’s effects on the knowledge score and on perceptions of complexity, while Fig. 3 breaks the knowledge index down into specific questions. The knowledge score shows one of the largest and most significant effects. It increases by 1.3 points, which, considering the very low average score at baseline, implies a 38% improvement compared to the control group mean. To unpack this result further, Fig. 3 shows the impact of the training separately for each of our knowledge questions. Those that capture the practical aspects of tax declaration (e.g. declaration deadlines or e-filing) see significant improvements, while those on key policy parameters like the maximum PIT/CIT rate or the VAT rate do not display significant variation. This suggests that the legal aspects of the compliance cost mechanism are less relevant to explain the impact on compliance. Although the program affects questions on PAYE and VAT, these taxes are not particularly relevant in our sample since new taxpayers typically do not have employees and are not registered for VAT (see Sect. 4.1). To capture the psychological element of compliance costs we can use perceptions about complexity of the tax system. This variable allows us to identify an improvement in this area, as the perception of complexity decreases by over 9 percentage points (see Table 4). This suggests that attendees are at least 14% less likely than non-attendees to consider the tax system to be difficult to deal with. Importantly, these results are confirmed both with the probit and PSM estimation, which is fully expected, given baseline balance.

Of the other mechanisms hypothesized in Sect. 2.1, we find some weak evidence in favor of the role of enforcement and no evidence on the role of tax morale. Figure 1 shows that most of the other perception indicators, capturing these other two mechanisms, do not change significantly as a result of the program.Footnote 36

The only significant coefficient in Fig. 1, aside from complexity, is the enforcement indicator that shows a small but statistically significant improvement. However, other perceptions related to enforcement, such as those on government authority or the acceptability of evasion, either do not change after the training (i.e. evasion in Fig. 1) or show a weakly significant effect in the opposite direction (i.e. government authority). The evidence on enforcement as a key mechanism linking taxpayer education and compliance is therefore, at best, mixed. Indeed, the program’s curriculum does not feature sanctions and fines for non-compliance in any prominent way, as discussed in Sect. 2.3. Consistently, Fig. 3 indicates that there is no or little change in taxpayers’ knowledge about sanctions (we asked two questions on this aspect, related to fines and interest on non-filing).

Finally, the results on indicators related to tax morale, such as trust in the RRA, tax as a social duty, and fairness, are largely non-significant. However, we should note that these indicators were already very high at baseline, therefore leaving little margin for improvement. While we cannot fully rule out some role for enforcement and tax morale, the data available to us do not support them as major mechanisms at play in this case.

Fig. 1 PSM coefficients on perception indexes. Coefficients are marginal effects from probit regressions computed at the mean of the explanatory variables, strengthened by a PSM with kernel algorithm. Perceptions are indexes which range from 0 (disagree) to 1 (agree). All regressions are weighted by sampling weights. We cluster standard errors at the training session level and include training session fixed effects

As a further check for alternative mechanisms, we explore our results further through mediation analysis, which allows us to separate the direct effect of the program on compliance outcomes from the indirect effect channeled through a mediator (in this case, for example, the knowledge increase). This analysis aims to identify the average controlled direct effect (ACDE) of the program and proceeds in two steps (Vansteelandt, 2009; Acharya et al., 2016).Footnote 37 First, we regress the outcome (e.g. the declaration probability) on the mediator (e.g. the knowledge increase), the training dummy, a set of controls, and the interaction between the mediator and all other variables. We then derive the predicted value of the outcome fixing all mediators to zero. This is the demediated outcome, in the words of Acharya et al. (2016). In the second step, we regress the demediated outcome on the program dummy. The coefficients from this regression represent the ACDE estimate. Table 17 shows the role of the possible mechanisms discussed above, as mediators for the impact of the training on compliance. Confirming our previous results, by far the most important mediator is improved knowledge. Our mediation analysis shows that the proportion of the total program effect on declaration probability that is mediated by the increase in knowledge after the training is 33%, much larger than the proportion of the program effect mediated by changes in any other perception measure. These results confirm that knowledge is the main element behind the impact of the program on compliance, capturing the compliance cost mechanism (particularly the practical and legal aspects captured by the knowledge index). Other mechanisms, such as enforcement and tax morale, play more minor roles.
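The two-step procedure can be sketched in a stripped-down setting with one mediator and no controls. The code below is an illustration under those assumptions, not the specification behind Table 17: the helper names `ols` and `acde` are hypothetical, the demediated outcome is formed by subtracting the estimated mediator contribution from the outcome, and the toy data are built so that the direct effect is 2 and the total effect is 5.

```python
def ols(X, y):
    """Least-squares coefficients via the normal equations (small problems)."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y], solved by Gauss-Jordan elimination.
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def acde(y, t, m):
    """Average controlled direct effect of t on y, with mediator m."""
    # Step 1: outcome on treatment, mediator, and their interaction.
    step1 = [[1.0, ti, mi, ti * mi] for ti, mi in zip(t, m)]
    _, _, b_m, b_tm = ols(step1, y)
    # Demediated outcome: remove the estimated mediator contribution.
    y_tilde = [yi - b_m * mi - b_tm * ti * mi for yi, ti, mi in zip(y, t, m)]
    # Step 2: demediated outcome on the treatment dummy.
    return ols([[1.0, ti] for ti in t], y_tilde)[1]


# Hypothetical data generated as y = 1 + 2*t + 3*m, with m higher when t = 1.
t = [0, 0, 0, 1, 1, 1]
m = [0, 1, 2, 1, 2, 3]
y = [1, 4, 7, 6, 9, 12]

direct = acde(y, t, m)
total = ols([[1.0, ti] for ti in t], y)[1]
share_mediated = 1 - direct / total
```

In this toy example the mediator accounts for 60% of the total effect; the analogous quantity for knowledge in the text is 33%.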

These results, taken together, strongly suggest that the program’s effects on compliance largely occur through reductions in compliance costs, particularly through improved taxpayer knowledge and perceptions about complexity. These variables capture particularly the practical elements of compliance costs, but also legal and psychological ones. This mechanism is consistent both with the recent literature highlighting the importance of compliance costs in explaining taxpayer behavior (see Sect. 1), and with the nature of the program that aims to educate taxpayers about basic elements of the tax system (see Sect. 2.3). Importantly, these results show that RRA’s program is effective in improving taxpayer knowledge, which is very weak in Rwanda (see Sect. 4.1).

4.5 Population-wide effects

In this section we address the issue of external validity, at least with respect to the population of taxpayers in Rwanda beyond our survey sample. In other words, do our results hold when we look at the whole population of new taxpayers in Rwanda? This question is related to a potential concern that the survey sample, which refers to the population of taxpayers who registered in three districts in the first half of the year (see Sect. 3.1), might be a selected sample of the broader population of new taxpayers (see Sect. 3.2). Thanks to our data, we can observe both attendance and compliance behavior (from administrative data) for all taxpayers who registered in 2017, as well as a limited set of administrative covariates (see Sect. 3.1 and Table 12). We can therefore estimate the relationship between program attendance and our three compliance outcomes for the universe of new taxpayers. However, in this case we are unable to adopt any of the more rigorous estimation methods used elsewhere, such as PSM or our IV strategy, nor can we include a large set of controls from the survey, since we can only use administrative data. Despite this limitation, these correlations can still provide useful insights that we discuss below.

Table 5 reports these results for the whole population of new taxpayers who registered in 2017. We start by re-estimating the coefficient of interest in the survey sample (column 1) and in the survey reference population (column 2), using a probit estimation and including only the administrative covariates (see Sect. 3.1). The results in column (1), based on the same survey sample used for our main results (Sect. 4.2), are very similar to the ones reported in Table 2, which are estimated with more robust specifications. This gives us confidence on two fronts. First, it confirms that choices over our control variables do not affect our results in any substantial way: the coefficients of Table 5 (column 1) are not significantly different from those in Table 2. Second, and relatedly, it suggests that the results on the population of new taxpayers may not be overly biased, despite the more limited options we have in terms of methods and data.

Table 5 Program effect on tax outcomes: population of new taxpayers

Column 3 reports results for the full population of new taxpayers. Focusing on our first outcome, the probability to declare, panel A shows a much larger effect size of almost 30 percentage points in the broader population, compared to 12 and 15 in columns 1 and 2. Running separate regressions for rural and urban areas does not help us to explain this difference, as the relevant coefficients are always much larger in the broader population than in the other groups.Footnote 38 The most plausible explanation for this difference lies in the timing of sessions, which is reported in Table 8. While the three relevant sessions for both the survey sample and the survey reference population happened in August 2017, the other sessions were organized between late November 2017 and March 2018, with a gap of almost three months in between. The later timing likely made the program a lot more salient for taxpayers, who were approaching the end of the year (end of December) and the declaration period (January–March). By contrast, the August sessions were still quite far from the time when most taxpayers start wrapping up their accounts for the year and think more concretely about their tax affairs. Despite their limitations, these results suggest that, if anything, our main results are a lower bound for potentially much larger effects of this type of intervention.

Panels B and C of Table 5 estimate the program’s effect on the other two compliance outcomes, zero-tax and the tax amount, for the whole population. As for the declaration probability, the results of column 1 are very similar to the main results reported in Sect. 4.2. In line with the result on declaration probability, the results on zero-tax and on the tax amount for the whole population (column 3) are both highly significant and larger in magnitude than those presented in Sect. 4.2. While these results are far from conclusive, they suggest the possibility that the program has wider effects both at the extensive and intensive margin, which are especially large when the program is attended closer to the declaration period.

4.6 Longer-term effects

As discussed in Sect. 1, taxpayer education provides information in greater depth and in a more comprehensive way than other interventions, such as informational messages that have been used in the literature on behavioral nudges. As such, we would expect it to generate learning effects that extend beyond the first year post-intervention. To test this hypothesis, this section investigates whether the effects on compliance outcomes documented in Sects. 4.2, 4.3, and, though less rigorously, in Sect. 4.5 last over time. In addition to the data relative to fiscal year 2017, which we used for our main results, we also obtained data for declarations relative to fiscal year 2018, filed between January and March 2019. We use these data to track taxpayers in our survey sample and in the population of those who registered in 2017, and check whether the program effects last in the second year after registration.

Table 6 reports results on all three compliance outcomes, using the same estimation methods as in Sect. 4.2 but measuring the outcomes in year 2, that is, fiscal year 2018. These results suggest that the program continues to have an impact on all outcomes in year 2 in both panels A and B. However, they lose statistical significance when the outcome for year 1 is included as an explanatory variable, suggesting there is no additional effect in year 2: the impact from year 1 is simply sustained. Along similar lines, Table 20 shows the results on outcomes measured in year 2 for the whole population of new taxpayers, to be compared with results in Sect. 4.5. In this case too, we document a strong association between attendance in year 1 and positive compliance outcomes in year 2, which is confirmed for all three outcomes. However, this estimation is less precise than our main results and the caveats highlighted in Sect. 4.5 remain valid.

Table 6 Program impact on tax outcomes in year 2

The result on the longer-term impact on compliance outcomes, and particularly declaration rates, is consistent with the high persistence of declaration behavior that we can document using our administrative data. To this end, we focus particularly on taxpayers registered before our study, for whom we have a longer time series. A taxpayer who registered in 2015 and made a declaration in the first year (year 1) has a 55% probability of filing again the following year (year 2) and an 86% probability for the year after that (year 3), the latter conditional on having declared in year 2. By contrast, a taxpayer who failed to file in the first year has a negligible probability of ever making a declaration: 1% in year 2 and 0.2% in year 3. The same pattern can be shown for our cohort of new taxpayers who registered in 2017: those who declare in the first year are much more likely to keep declaring, while others are very unlikely to ever submit a declaration.
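As a back-of-the-envelope illustration of these persistence figures (derived here from the two probabilities just quoted, not a number reported in the analysis), the year-3 probability is conditional on filing in year 2, so the two can be chained to give the probability that a 2015 registrant who declared in year 1 also files in both subsequent years:

```latex
\Pr(\text{file in years 2 and 3} \mid \text{file in year 1}) = 0.55 \times 0.86 \approx 0.47
```

Roughly half of first-year filers therefore keep filing through year 3, against a 1% probability of filing in year 2 for those who did not file in year 1.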

We explore this result further through mediation analysis, following the same logic as in Sect. 4.4. Mediation analysis allows us to disentangle the direct effect of the program on outcomes in year 2 from the indirect effect channeled through filing behavior in year 1, the mediator. From this exercise we find that the proportion of the total program effect in year 2 that is mediated by declaration behavior in year 1 is high, at 59%.

These results strongly suggest that getting taxpayers into the habit of declaring is crucial to their future compliance behavior. A recent paper has shown that habit can be a powerful driver of compliance, a finding that is consistent with the separate, but related, literature on get-out-the-vote (Dunning et al., 2017). Just the act of regularly paying taxes, or not doing so, can significantly affect future compliance behavior.

5 Cost-benefit analysis

Against the positive impacts on compliance described above, we now investigate whether the tax education program is cost-effective. The costs of providing the training are calculated at about RWF 28,000 per taxpayer, or about USD 24.5. These costs include logistical costs of renting training venues and offering lunch and refreshments to attendees, while they do not include RRA staff’s time, as that is already captured in their standard salary and does not represent an additional cost. That unit cost is to be evaluated against revenue gains.

To compute revenue gains from the program, we use the results reported in column 3 of Panel A, Table 2, namely the Tobit marginal effects on log tax declared. We opt for this specification as it is more conservative than the IV and PSM specifications. The chosen Tobit estimate also allows us to consider gains related to both the intensive and extensive margins: those who increased tax due as a result of the program and those who started to declare thanks to attendance (with related revenue gains). The chosen coefficient suggests that the program causes a 54% increase in tax payable. We then multiply this effect by the average tax payable declared by the control group, of about RWF 19,450 (USD 17). The resulting extra revenue per taxpayer is about RWF 10,500 or USD 8.3. To this extra revenue in Year 1 we should add an equal amount for Year 2, as our analysis in Sect. 4.6 suggests that the revenue gains are sustained at least in the second year after the program. The total revenue gain therefore becomes USD 16.6 per taxpayer. This total revenue gain implies that the program would break even in year 3: assuming sustained gains and no growth (a conservative assumption), the cumulative revenue gains from the program would outweigh its costs in year 3.
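The break-even arithmetic in this paragraph can be reproduced with a short sketch. It uses only figures quoted in the text (a unit cost of USD 24.5, a yearly gain of USD 8.3, and a 54% effect on a control-group mean of RWF 19,450) and assumes, as the text does, constant yearly gains with no growth. The function name `break_even_year` and the variable names are hypothetical, chosen for illustration.

```python
# Figures taken from the text; names are illustrative only.
COST_USD = 24.5           # training cost per taxpayer
GAIN_USD_PER_YEAR = 8.3   # extra revenue per taxpayer per year
EFFECT = 0.54             # Tobit estimate: 54% increase in tax payable
CONTROL_MEAN_RWF = 19_450 # average tax payable in the control group

def break_even_year(cost, annual_gain, max_years=50):
    """Return the first year in which cumulative gains cover the cost.

    Assumes a constant annual gain (no growth, no discounting).
    Returns None if the cost is not covered within max_years.
    """
    cumulative = 0.0
    for year in range(1, max_years + 1):
        cumulative += annual_gain
        if cumulative >= cost:
            return year
    return None

# Extra revenue per taxpayer in local currency: effect times control mean.
extra_revenue_rwf = EFFECT * CONTROL_MEAN_RWF

# Cumulative gain after two years, as reported in the text.
two_year_gain_usd = 2 * GAIN_USD_PER_YEAR

print(f"Extra revenue per taxpayer: RWF {extra_revenue_rwf:,.0f}")
print(f"Two-year gain: USD {two_year_gain_usd:.1f}")
print(f"Break-even year: {break_even_year(COST_USD, GAIN_USD_PER_YEAR)}")
```

The sketch deliberately ignores discounting and the longer-term gains from new filers, so it corresponds only to the conservative scenario in this paragraph.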

This cost-benefit analysis should also be put into the perspective of the low probability to declare among new taxpayers. As discussed in Sect. 4.6, when new taxpayers do not declare in the first year, they are unlikely to ever do so in subsequent years. Based on this fact, we can project the revenue gains from taxpayers who start declaring as a result of program attendance in years 3 to 6 based on data from a previous cohort of new taxpayers for whom we have a longer time series.Footnote 39 The two cohorts have very similar amounts of average tax paid in the first two years since registration, thus making the previous one a reasonable basis to estimate revenue gains from the more recent one (the one we used in our main analysis). Section 4.6 shows that many new filers do not declare their taxes in the first year after registration and are very unlikely to ever do so in subsequent years. However, if they do file in year 1, they are likely to keep filing in the following years, in line with the role of habit formation around filing. As shown in Sect. 4.2, the program is effective in increasing the probability to submit a declaration, among other outcomes. The program pushes about 29% more taxpayers from the survey sample to declare. We therefore incorporate the extra revenue from these taxpayers in the longer term, years 3 to 6, assuming they follow a similar trend to the one we can observe in the previous cohort of new registrations, an approach justified by very similar tax amounts across the two cohorts upon registration. The revenue gain per taxpayer is USD 55 in year 3, which is the total tax paid in the previous cohort 3 years after registration.Footnote 40 This amount increases progressively in years 4, 5 and 6, again using the trend in the previous cohort as a reference point, for a total over years 3 to 6 of USD 302 for each taxpayer who was pushed to declare in our sample.

Taking this longer-term perspective, it is then clear that the program is cost-effective. While the costs are entirely covered by revenue gains by year 3, the net benefit in the longer term is substantial, with a cost-benefit ratio after 6 years of 1:12. These benefits would be even greater if we used the other estimates from IV or PSM specifications. Importantly, policymakers should consider not only the monetary benefits outlined here, but also those related to building a culture of compliance, including filing, which are harder to measure.

6 Conclusion

This paper aims to shed light on the effectiveness of taxpayer education to improve compliance in Rwanda. This topic has so far been largely ignored in the literature, despite the increasing evidence that taxpayer confusion and compliance costs are key determinants of tax compliance. More specifically, we evaluate a taxpayer education program run by the Rwanda Revenue Authority and offered to all new taxpayers. We rely on a unique dataset that matches information from a survey and from administrative tax records. Although we could not randomize attendance in the program, we are confident in the robustness of our results, based on three methods: probit/tobit estimation controlling for a wide variety of characteristics and perceptions, PSM, and an IV strategy to test the robustness of our results further (see Sects. 4.2 and 4.3).

Our results suggest a positive and significant impact of the taxpayer education program on three outcomes: the probability to declare, the probability of filing a zero-tax return, and the amount of tax due. Attendance in the program results in a 29% to 64% higher probability to submit a declaration, compared to those who do not attend the program (Sects. 4.2 and 4.3), while it also decreases the probability to submit a zero-tax return by at least 6% and increases the tax amount by at least 43% (PSM estimate in Table 2). We show that these effects may be even larger for taxpayers who attend the training closer to the declaration period (Sect. 4.5).

These results are particularly relevant in a context like Rwanda where increasing compliance is essential to fund the country’s ambitious development plans. On the extensive margin, the program improves declaration rates among new taxpayers, most of whom would otherwise fail to submit a declaration in the first year after registration. On the intensive margin, it improves reporting by decreasing the probability to report zero-tax and increasing the tax amount, thus addressing under-reporting that remains widespread in this context.

Furthermore, thanks to a rich set of survey variables, we test possible alternative mechanisms through which the compliance effect may come about. We argue that the major mechanism in the case of Rwanda is a reduction in compliance costs, related particularly to increased knowledge and improved perceptions about the complexity of the tax system (Sect. 4.4). These results represent a novel contribution to the emerging literature on compliance costs and taxpayer confusion, which has so far largely ignored the issue of taxpayer education. Moreover, and contrary to much of the literature on tax compliance, we report longer-term effects beyond the year of implementation (Sect. 4.6). We present suggestive evidence that the program’s effect on all compliance outcomes stretches beyond the first year—thus documenting a learning effect.

On a broader level, we document dramatically low levels of tax knowledge in Rwanda—highlighting the importance of taxpayer confusion and compliance costs, perhaps especially in low-income countries. The majority of new taxpayers have a very weak understanding of the basic parameters of the tax system, as shown in Sect. 4.1. While this is in line with previous evidence on taxpayer knowledge (see Sect. 1), we use a more accurate measure of knowledge, which is another original contribution to this literature.

In terms of policy, our results suggest that tax education programs can be highly effective, as the benefits outweigh the costs especially once one considers the longer-term implications. Although policymakers may not see immediate revenue gains from such programs, we would argue that bringing new taxpayers into the habit of compliance is crucial to improve future behavior. The evidence we present here supports the expansion of education programs particularly to support smaller taxpayers as they enter the tax system. Still, more could be done to increase the program’s reach, which is currently quite limited (Sect. 4.1). This might imply boosting staffing of the department in charge of taxpayer services and facilitation. Aside from education programs similar to the one we evaluate here, revenue authorities should also provide basic information to taxpayers at the point of registration, for example stating clearly what taxes they registered for and what the related obligations are. This would help tackle gaps in knowledge that we document here, along with other measures.

Last but not least, this study shows the importance of collecting data for all interventions implemented by revenue authorities, in view of evaluating them and using these results to inform policy. Carrying out such an evaluation with the data previously available would have been impossible, as we did not know which taxpayers attended and which ones did not (see Sect. 3.1). Since our results apply only to Rwanda, more research is needed to evaluate external validity in other contexts.